On a 32-bit Pentium 4 HT 2.4 GHz box, I get this:
1,000,000 hash keys took 0.531496 seconds
1,000,000 scores took 1.044043 seconds
Nothing special about the compile flags:
CFLAGS = -O3 -Wall -pedantic -std=c99
The binary is
ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs), for GNU/Linux 2.6.9, not stripped
running on
Linux 2.6.18-164.6.1.el5 #1 SMP Tue Nov 3 16:18:27 EST 2009 i686 i686 i386 GNU/Linux
Now I took the code over to a 64-bit x86_64 box, and recompiled it:
ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.4.0, dynamically linked (uses shared libs), not stripped
Slightly different version of the OS but pretty similar:
Linux 2.6.9-89.0.18.ELsmp #1 SMP Tue Dec 15 14:10:51 EST 2009 x86_64 x86_64 x86_64 GNU/Linux
The box is a 3.0 GHz Pentium 4 HT and is generally newer than the 32-bit box.
The code, however, runs roughly 2 to 2.7 times slower:
1,000,000 hash keys took 1.435741 seconds
1,000,000 scores took 2.142182 seconds
I'd expected the 64-bit processor to make a difference, just not in that direction.
The only code change I played with was the typedef for the bitboard (same typedef for the hash key):
On 32-bit:
typedef unsigned long long int BITBOARD;
On 64-bit, I tried both "unsigned long long" and "unsigned long", but the numbers did not change.
Now, I realize code doesn't magically run faster just because it's recompiled for 64 bits - I've certainly experienced that in my professional life.
But in the case of chess engines, I would think the benefits of 64-bit would be great: slinging 64-bit bitboards around natively, more CPU registers, etc.
So, I'm a little confused...
I can work up a publishable example if someone wants to see the specific code. It's all single-threaded at this point, btw.