It's still just a 32-bit result (and the first sar is definitely dead: both edx and flags get clobbered by the next 2 insns).

Gerd Isenberg wrote: Good catch, looks similar to cdq, but makes the result 33-bit (instead of 32) sign extended to 64. I first thought the "dead" sar is related to the explicit sar 31 from the source, but it isn't.

wgarvin wrote: Just a wild guess: mov involving edx followed by sar edx,31 seems like an "optimized for speed" expansion of cdq. Maybe they model it like a cdq in intermediate code, so that they are able to generate cdq when optimizing for size.
I have no idea why that first sar wouldn't get marked as dead and removed after the expansion, though, because it's obviously dead. It does look pretty weird.
The first two sars are for the int to __int64 conversion. For both of them, it generated the mov / sar insns and used the eax:edx register pair (not noticing that one half of it was dead, in the first case at least). So internally the sign-extension was probably a cdq, and then a late peephole pass or something converted it to those two instructions (too late for DCE).
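To make the mov / sar equivalence concrete, here is a minimal C sketch (not the compiler's actual expansion). The names pair64, sign_extend_mov_sar, sign_extend_cast and check_sign_extend are made up for illustration, and it assumes >> on a negative int32_t is an arithmetic shift, which C leaves implementation-defined but which MSVC, GCC and Clang all do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the two 32-bit halves of a 64-bit value. */
typedef struct {
    uint32_t lo;  /* the half that would sit in eax */
    uint32_t hi;  /* the half that would sit in edx */
} pair64;

/* Emulates "mov edx, eax / sar edx, 31": the high half becomes all ones
   for negative x and zero otherwise, which is what cdq leaves in edx.
   ASSUMPTION: >> on a negative int32_t is an arithmetic shift. */
static pair64 sign_extend_mov_sar(int32_t x)
{
    pair64 r;
    r.lo = (uint32_t)x;
    r.hi = (uint32_t)(x >> 31);
    return r;
}

/* Reference version: let the compiler do the int -> 64-bit conversion
   and then split the result into halves. */
static pair64 sign_extend_cast(int32_t x)
{
    uint64_t wide = (uint64_t)(int64_t)x;
    pair64 r;
    r.lo = (uint32_t)wide;
    r.hi = (uint32_t)(wide >> 32);
    return r;
}

/* Usage: both versions should agree for any input. */
static void check_sign_extend(int32_t x)
{
    pair64 a = sign_extend_mov_sar(x);
    pair64 b = sign_extend_cast(x);
    assert(a.lo == b.lo && a.hi == b.hi);
}
```

If only the low half of the result is ever read, the shift that produces hi is dead, which is the situation with the first sar here.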
[Edit: or maybe it wasn't THAT late of a pass. The register allocator probably has to allocate 2 registers to the __int64, and liveness might be tracked for the entire variable rather than for each individual register. It's not really a surprise to see dumb generated code for 64-bit variables from a 32-bit compiler; it's the sort of functionality that most programs don't use much.]
[Edit 2: You could try the Int32x32To64 or UInt32x32To64 intrinsics. MS documentation claims that they produce a single multiply instruction.]
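For comparison, a portable sketch of what a 32x32 -> 64 multiply computes in plain C. This is not the Windows SDK definition of those intrinsics; mul_s32x32_64 and mul_u32x32_64 are hypothetical names, and whether a given compiler turns the casts into a single imul / mul is something you would have to check in the generated code.

```c
#include <stdint.h>

/* Signed 32x32 -> 64: widen both operands, then multiply.
   Only the low 32 bits of each input carry information, so a compiler
   is free to use one widening multiply instead of a full 64x64 one. */
static int64_t mul_s32x32_64(int32_t a, int32_t b)
{
    return (int64_t)a * (int64_t)b;
}

/* Unsigned 32x32 -> 64: same idea with zero extension. */
static uint64_t mul_u32x32_64(uint32_t a, uint32_t b)
{
    return (uint64_t)a * (uint64_t)b;
}
```

The product of two 32-bit values always fits in 64 bits, so neither function can overflow.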