64-bit intrinsic performance

Gerd Isenberg
Posts: 2250
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: 64-bit intrinsic performance

Post by Gerd Isenberg »

nthom wrote:
Gerd Isenberg wrote: Can you post the generated 64-bit assembly of a typical bitboard serialization loop? What is your processor?
What do you mean by a serialization loop? Can you post pseudocode?
Something like this, e.g. to traverse a target bitboard (x) during move generation:

Code:

if ( x ) do {
   int idx = bitScanForward(x); // square index from 0..63
   *list++ = foo(idx, ...);
} while (x &= x-1); // reset LS1B
nthom wrote: I'm testing on a Phenom 9550 quad core.
That should be fast as hell with 64-bit bsf.
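
For reference, a self-contained version of that loop might look like the sketch below. It assumes GCC or Clang, where __builtin_ctzll stands in for bitScanForward (with 64-bit MSVC the _BitScanForward64 intrinsic would play that role). serialize is a hypothetical stand-in for the movegen code: it just stores each square index instead of calling foo.

```c
#include <assert.h>
#include <stdint.h>

/* Index (0..63) of the least significant 1 bit.
   Assumption: GCC/Clang builtin; not the poster's actual bitScanForward. */
static int bitScanForward(uint64_t x) {
    assert(x != 0);               /* the result is undefined for x == 0 */
    return __builtin_ctzll(x);
}

/* Hypothetical helper: writes the square index of every set bit of x to
   list, lowest square first, and returns how many indices were written. */
static int serialize(uint64_t x, int *list) {
    int *start = list;
    if (x) do {
        *list++ = bitScanForward(x);
    } while (x &= x - 1);         /* reset LS1B */
    return (int)(list - start);
}
```

With x = 0x8000000000000001 the loop stores 0, then 63, and returns 2. The loop body runs once per set bit, so the cost of the bit scan is paid on every iteration.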
Gerd Isenberg
Posts: 2250
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: 64-bit intrinsic performance

Post by Gerd Isenberg »

Gerd Isenberg wrote:
nthom wrote: I'm testing on a Phenom 9550 quad core.
That should be fast as hell with 64-bit bsf.
Oops, it's a K10; I confused it with Core 2 or something.
Bsf/Bsr x86-64 Timings

Instruction     Decode type   Latency
BSF reg, reg    VectorPath    4
BSF reg, mem    VectorPath    7

So not that fast, since a VectorPath instruction also blocks the other units, but it should still be faster than two 32-bit bsf. So maybe on those AMD boxes a De Bruijn multiplication is faster...

Or even better, leading zero count...
LZCNT reg, reg  DirectPath Single  2, 1 (latency, reciprocal throughput)
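
Both alternatives can be sketched without bsf. This is only an illustration under assumptions: 0x03f79d71b4cb0a89 is one commonly used 64-bit De Bruijn constant (it is not taken from this thread), the lookup table is built on first use rather than hard-coded, and __builtin_clzll is the GCC/Clang builtin, which only becomes an LZCNT instruction when the compiler is told the target supports it. The function names are made up for the example. Whether either one actually beats bsf on a K10 would have to be measured.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constant: a 64-bit De Bruijn sequence, i.e. every 6-bit
   window of it (taken from the top after a left shift) is unique. */
static const uint64_t DEBRUIJN64 = 0x03f79d71b4cb0a89ULL;
static int debruijnIndex[64];
static int debruijnReady = 0;

/* Fill the table: shifting the constant left by i puts a unique
   6-bit pattern in the top bits, which maps back to i. */
static void initDeBruijn(void) {
    for (int i = 0; i < 64; ++i)
        debruijnIndex[(DEBRUIJN64 << i) >> 58] = i;
    debruijnReady = 1;
}

/* Bit scan forward by De Bruijn multiplication. */
static int bsfDeBruijn(uint64_t x) {
    assert(x != 0);
    if (!debruijnReady) initDeBruijn();
    uint64_t ls1b = x & (~x + 1);  /* isolate LS1B (x & -x) */
    /* Multiplying by a single bit 2^i equals shifting left by i. */
    return debruijnIndex[(ls1b * DEBRUIJN64) >> 58];
}

/* Bit scan forward by leading zero count on the isolated LS1B. */
static int bsfViaLzcnt(uint64_t x) {
    assert(x != 0);
    uint64_t ls1b = x & (~x + 1);
    return 63 - __builtin_clzll(ls1b);
}
```

Both versions need the extra step of isolating the LS1B first, which bsf does not, so the comparison is between one VectorPath instruction and a short chain of DirectPath ones.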
nthom
Posts: 112
Joined: Thu Mar 09, 2006 6:15 am
Location: Australia

Re: 64-bit intrinsic performance

Post by nthom »

FYI, I figured it out: it turned out to be a problem with my VS2008 optimization settings. I had Maximize Speed (/O2) on, but it wasn't getting through to the actual command line until I changed a heap of stuff and then changed it all back!

Now my 64-bit version is about 50% faster than the 32-bit one.
Greg Strong
Posts: 388
Joined: Sun Dec 21, 2008 6:57 pm
Location: Washington, DC

Re: 64-bit intrinsic performance

Post by Greg Strong »

nthom wrote: FYI, I figured it out: it turned out to be a problem with my VS2008 optimization settings. I had Maximize Speed (/O2) on, but it wasn't getting through to the actual command line until I changed a heap of stuff and then changed it all back!

Now my 64-bit version is about 50% faster than the 32-bit one.
How did you know it wasn't getting through to the command line? I might be having the same problem...
nthom
Posts: 112
Joined: Thu Mar 09, 2006 6:15 am
Location: Australia

Re: 64-bit intrinsic performance

Post by nthom »

In the property pages of your project, expand C/C++ and click on the Command Line item at the bottom of the list. It will have things like:

/O2 /Oi /GL /D "WIN32" /D "_WINDOWS" /D "NDEBUG" /D "_AFXDLL" /D "_MBCS" /FD /EHsc /MD /Gy /Yu"stdafx.h" /Fp"Release\LittleBlitzer.pch" /Fo"Release\\" /Fd"Release\vc90.pdb" /W3 /nologo /c /Zi /TP /errorReport:prompt

The /O2 switch is the one for Maximize Speed. If it is missing from that line, the setting isn't reaching the compiler.