Why is USE_32BIT_MULTIP faster?

Bas Hamstra · Post by **Bas Hamstra** » Mon Nov 03, 2014 2:31 am

Hi there,

I am checking out Tord's magic number generation code. I have a 64-bit Intel i3 running 64-bit Windows, and I downloaded Visual Studio Express and set the target to x64.

So I do not understand why defining USE_32_BIT_MULTIPLICATIONS is still WAY faster than leaving it undefined. I thought a 64-bit compiler would pay off here?

Bas Hamstra

Volker Annuss · Post by **Volker Annuss** » Mon Nov 03, 2014 7:01 am

Here is the code from the chessprogramming wiki.

int transform(uint64 b, uint64 magic, int bits) {
#if defined(USE_32_BIT_MULTIPLICATIONS)
  return
    (unsigned)((int)b*(int)magic ^ (int)(b>>32)*(int)(magic>>32)) >> (32-bits);
#else
  return (int)((b * magic) >> (64 - bits));
#endif
}

One explanation could be that the 32-bit version combines its two products with an xor, which produces no carry bits, while the 64-bit version has implicit carries inside the multiplication.
So it is more likely that each bit of the returned index depends on only a single bit of b, and therefore there are more 32-bit magics than 64-bit ones.

Have you checked whether your 32-bit magics really work? There are signed integer overflows in the 32-bit multiplications, so the compiler (if it is able to see them) can do anything it wants with that code, even make it WAY faster. ;-)
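To rule out the signed-overflow concern, the 32-bit branch could be rewritten with unsigned operands, which wrap modulo 2^32 by definition. The following is only a sketch: the name transform32_unsigned is made up for illustration and is not part of the wiki code, and it assumes a 32-bit unsigned int and 1 <= bits <= 31.

```c
#include <stdint.h>

/* Hypothetical helper, not from the wiki code: same shape as the 32-bit
 * branch of transform(), but every multiplication is done on uint32_t,
 * so the products wrap modulo 2^32 instead of overflowing a signed int.
 * Assumes unsigned int is 32 bits wide and 1 <= bits <= 31. */
static int transform32_unsigned(uint64_t b, uint64_t magic, int bits) {
    uint32_t lo = (uint32_t)b * (uint32_t)magic;                 /* low halves  */
    uint32_t hi = (uint32_t)(b >> 32) * (uint32_t)(magic >> 32); /* high halves */
    return (int)((lo ^ hi) >> (32 - bits));
}
```

If the magics found with the original 32-bit branch also pass with this variant, the speed difference is probably not a side effect of undefined behaviour.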

Gerd Isenberg · Post by **Gerd Isenberg** » Mon Nov 03, 2014 1:28 pm

Bas Hamstra wrote: Hi there,

I am checking out Tord's magic number generation code. I have a 64-bit Intel i3 running 64-bit Windows, and I downloaded Visual Studio Express and set the target to x64.

So I do not understand why defining USE_32_BIT_MULTIPLICATIONS is still WAY faster than leaving it undefined. I thought a 64-bit compiler would pay off here?

Bas Hamstra

Hi Bas,

Working on a magic Tao? Wow!

Do you mean the time for all of find_magics or the time for one "transform"? As Volker pointed out, the conditionally compiled transform routines are not equivalent due to xor instead of plus. Otherwise, 64-bit imul/shr should be faster on x86-64.

Cheers,
Gerd

Volker Annuss · Post by **Volker Annuss** » Mon Nov 03, 2014 7:14 pm

Gerd Isenberg wrote: As Volker pointed out, the conditionally compiled transform routines are not equivalent due to xor instead of plus.

Even with plus they are not equivalent. But you can replace xor with plus, and if I am right that carries from one index bit to another are a problem, it should take longer to find magics with plus.
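A small sketch may make the non-equivalence concrete. The upper bits of the 64-bit product are built from the cross terms (low half of b times high half of magic, and vice versa) plus the carry out of the low product, whereas the 32-bit branch only combines low*low with high*high. The function names and the use_plus switch below are made up for illustration, and unsigned arithmetic is used throughout to keep the behaviour defined.

```c
#include <stdint.h>

/* 64-bit variant, as in the #else branch of transform(). */
static int index_64(uint64_t b, uint64_t magic, int bits) {
    return (int)((b * magic) >> (64 - bits));
}

/* 32-bit variant with a hypothetical switch between xor and plus.
 * Assumes unsigned int is 32 bits wide and 1 <= bits <= 31. */
static int index_32(uint64_t b, uint64_t magic, int bits, int use_plus) {
    uint32_t lo = (uint32_t)b * (uint32_t)magic;
    uint32_t hi = (uint32_t)(b >> 32) * (uint32_t)(magic >> 32);
    uint32_t mix = use_plus ? lo + hi : lo ^ hi; /* plus carries, xor does not */
    return (int)(mix >> (32 - bits));
}
```

For example, with b = 0x0000000100000001, magic = 0x4000000040000000 and bits = 4, the xor variant gives 0 while the plus variant gives 8. With b = 1 and magic = 0x8000000000000000, the 64-bit variant gives 8 but both 32-bit variants give 0, so a magic that works for one routine cannot be assumed to work for another.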
