abulmo wrote: Did you actually compile that code? I ask because int is 32 bits on most 64-bit platforms, including MSVC/Intel/GCC on x64 under either Windows or Linux...
Pointers are 64 bits. Vincent Diepeveen's example uses an array of ints and thus does pointer arithmetic using 64 bits. While 64-bit pointers tend to slow down a program, the 64-bit ABI, which uses registers instead of the stack to pass parameters, usually makes the program much faster. Linux has a new (and somewhat experimental) 32-bit ABI (x32), similar to the 64-bit ABI but with 32-bit pointers; perhaps the best of both worlds?
Okay, I had to download some docs from Intel and actually skim through them to find out how this really works on x64.

I don't think any "pointer math" actually occurs, but the address calculations are done using 64-bit registers, so Vincent was actually correct that sign-extending 8- or 16-bit values is not completely free here. Funny how this kind of info is not easy to find with Google; I guess not many programmers worry much about this kind of detail (and if they do, they probably have those manuals handy anyway).
I think there are basically two cases, both of which require the index (in this case a 32-bit int variable) to be sign-extended to 64 bits.
(1) if the base address of the array is at an immediate offset from some other 64-bit register such as RSP, RBP etc. (this is the case for any array on the stack) then the addressing would be something like [RSP+4*RAX+nnn]. Even for a global variable, if there is a suitable fixed address already in a register, (such as the address of some other global variable, or the address of the start of the module, or some global table... anything where the compiler knows the fixed offset between that thing and this global array variable) then method (1) can be used.
(2) otherwise, it has to form a 64-bit address for the array in a register first. It can load the 64-bit immediate with one instruction (this should never add latency, as it can be done anywhere earlier in the code). Then it is the same as method 1 but using this register as the base, i.e. something like [RDX+4*RAX].
In either case, the only thing you pay for (in latency) would be sign-extending a 32-bit "int" value into 64-bit RAX. Using "unsigned" instead of "int" would avoid that, since the zero-extension is actually free: any instruction that writes to EAX automatically clears the upper 32 bits of RAX, so something like MOVZX EAX,byte ptr [whatever] already leaves a valid 64-bit index in RAX.
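To make the signed/unsigned difference concrete, here is a minimal C sketch. The function names and demo_table are made up for illustration. What the compiler actually emits depends on the compiler, the optimization flags and the calling convention, so the comments describe the typical case rather than a guarantee.

```c
/* Hypothetical demo data, only here so the functions can be tried out. */
static const int demo_table[4] = {10, 20, 30, 40};

/* Signed 32-bit index: before `i` can be used as the index register in a
   64-bit address such as [RDX+4*RAX], it generally has to be sign-extended
   to 64 bits (typically with MOVSXD). A negative index is legal when
   `table` points into the middle of an array, so the sign matters. */
int load_signed(const int *table, int i)
{
    return table[i];
}

/* Unsigned 32-bit index: the index must be zero-extended instead, and a
   write to a 32-bit register already clears the upper half of the 64-bit
   register, so the extension often comes for free with whatever
   instruction produced the value. */
int load_unsigned(const int *table, unsigned i)
{
    return table[i];
}
```

For example, load_signed(demo_table + 2, -1) reads demo_table[1], which only works if the -1 is widened as a signed value. Comparing the output of your compiler with -S (or /FA on MSVC) for the two functions is the easiest way to see what it really does.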
So yeah. x86 can handle simple cases like this with its addressing modes (no actual instructions doing "pointer arithmetic") but a signed index does need to be sign-extended to 64 bits first, which is different from 32-bit x86 code. Bummer.
...On the other hand, if the compiler knows that accessing your array at negative indexes is "undefined behaviour", it might cleverly skip the sign-extension and effectively give you zero-extension of 32-bit to 64-bit instead. I'm not sure what array indexing situations are "undefined behaviour" but many of them are, and unfortunately modern compilers are starting to get more aggressive about taking advantage of stuff like that.
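One case I am sure about is signed overflow: overflowing an int is undefined behaviour in C, so a compiler is allowed to assume a loop counter never wraps. A rough sketch of what that means for indexing follows; the function names and demo_values are hypothetical, and whether a given compiler really drops the per-iteration extension is something to check in its assembly output.

```c
#include <stddef.h>

/* Hypothetical demo data for trying the functions out. */
static const int demo_values[4] = {10, 20, 30, 40};

/* int counter: because signed overflow is undefined, the compiler may keep
   `i` in a 64-bit register for the whole loop instead of sign-extending
   it on every iteration. It is permitted to do so, not required to. */
long sum_int_index(const int *a, int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* size_t counter: the index is already 64 bits wide on x64, so there is
   nothing to extend and no reliance on undefined behaviour at all. */
long sum_size_t_index(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```

Both functions return the same sum for the same input; the second form simply states the 64-bit index width in the source instead of leaving it to the optimizer.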
[Edit: to be clear, x64 has no addressing mode that mixes different-size operands in the address calculation; you can't add a 32-bit index register to a 64-bit base address, for example. If you're calculating a 64-bit address, you have to use 64-bit register(s) for base/index. Immediate displacements are at most 32 bits and are sign-extended to 64 bits for free, but registers used in the address calculation have to be sign-extended in advance unless the compiler can prove that using any negative value would be undefined behaviour. You are allowed to use different operand and address sizes; you can load a 64-bit value from a 32-bit address or a 32-bit value from a 64-bit address. You just can't mix sizes within the address calculation itself.]
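To show why the two kinds of extension are not interchangeable, here is a small C sketch that mimics the widening and the 4*index scaling in plain integer arithmetic. The helper names are invented for this example and the code models the arithmetic only; it does not inspect what the CPU or compiler does.

```c
#include <stdint.h>

/* Widen a signed 32-bit index the way a sign-extending move does:
   bit 31 is copied into bits 32..63. */
uint64_t widen_signed(int32_t i)
{
    return (uint64_t)(int64_t)i;
}

/* Widen the same 32 bits the way a plain 32-bit register write does:
   bits 32..63 are cleared. */
uint64_t widen_unsigned(uint32_t i)
{
    return (uint64_t)i;
}

/* Byte offset that an address like [base + 4*index] adds to the base
   for 4-byte elements, computed modulo 2^64 like the hardware does. */
uint64_t scaled_offset(uint64_t index64)
{
    return index64 * 4u;
}
```

With the bit pattern 0xFFFFFFFF, scaled_offset(widen_signed(-1)) is 0xFFFFFFFFFFFFFFFC, i.e. 4 bytes before the base, while scaled_offset(widen_unsigned(0xFFFFFFFFu)) is 0x3FFFFFFFC, about 16 GB past it. That is why the compiler has to pick the right extension for a signed index.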