Causes for inconsistent benchmark signatures

Discussion of chess software programming and technical issues.

Moderator: Ras

User avatar
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Causes for inconsistent benchmark signatures

Post by Evert »

I added a "benchmark" command to Jazz (searching a given set of positions to a given depth and reporting total nodes as well as time and NPS), which will hopefully be very useful. At the moment it's a bit of a pain, however.

The problem is I get different total node-counts in different environments.

OS X, 32 and 64 bit (GCC and Clang):
signature 3356838

Linux 64 bit
signature 3356838

Linux 32 bit
signature 3358082

Windows 64 bit (cross compiled and running under Wine)
signature 3340172

Windows 32 bit (cross compiled and running under Wine)
signature 3357543

What are some common-causes for inconsistencies like these? I've eliminated a random component to root move ordering (so Jazz doesn't always play the same moves early on; disabled for this) and calling libc's qsort() function (which is an unstable sort and will produce slightly different ordering on different platforms because of differences in the implementation).
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Causes for inconsistent benchmark signatures

Post by Michel »

Did you try valgrind?

A common cause of such inconsistencies is uninitialized variables.
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Causes for inconsistent benchmark signatures

Post by AlvaroBegue »

Are you using rand() anywhere? (e.g., in initializing Zobrist tables)
User avatar
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Causes for inconsistent benchmark signatures

Post by Evert »

Michel wrote:Did you try valgrind?

A common cause of such inconsistencies is uninitialized variables.
Good question. I hadn't tried valgrind recently (but I did run the code through the Clang static analyser). For some reason valgrind refuses to check my 32-bit compiles, but the 64 bit ones it does. However, it finds nothing...
User avatar
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Causes for inconsistent benchmark signatures

Post by Evert »

AlvaroBegue wrote:Are you using rand() anywhere? (e.g., in initializing Zobrist tables)
I use a random component in sorting the root moves at the beginning of the search, but that's disabled for the benchmark. Without it benchmark results are not reproducible on the same machine.

That's the only one (apart from opening books, but those are disabled as well).
User avatar
hgm
Posts: 28359
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Causes for inconsistent benchmark signatures

Post by hgm »

Well, if you really want to know, I guess you should look for the lowes search depth where you see a difference, and then do a search that reports the 'split node counts' for every move separately at 1, 2, 3... ply, and zoom in on the difference using that, to see which node makes the difference.

An alternative would be to print all hash keys on a file, one per line, and use the diff command to get the first difference. Then patch the engine to print the path when it encounters a position that hash key. If the number of nodes for which you get a difference is manageable, this might be the quickest method.
User avatar
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Causes for inconsistent benchmark signatures

Post by Evert »

hgm wrote:Well, if you really want to know, I guess you should look for the lowes search depth where you see a difference, and then do a search that reports the 'split node counts' for every move separately at 1, 2, 3... ply, and zoom in on the difference using that, to see which node makes the difference.
Yup, that's what I ended up doing (more or less): identify the search depth where the problem first appeared (which turned out to be 2) and then identify the exact position (the benchmark searches a couple) that causes the deviation, then print out the movelist that is searched (which isn't hard for a depth of 2).

This revealed the presence of a spurious en-passant capture in the movelist in 32-bit Linux that wasn't there in 64 bits or on OS X, which was generated because of a spurious en-passant target square being set from loading the FEN position (I don't mean an en-passant square being set while the capture is illegal, I mean a random square on the board was marked as "en passant" square). This turned out to be a harmless square in 64 bit mode, but generated a spurious pawn capture in 32 bit mode.

Tracking where the en-passant square came from, it turned out that the FEN position was incorrect and was missing the en-passant field (or the castling field, it's hard to tell the difference if one of them is missing and the other one is -).

So I corrected the FEN string and made my FEN parser more robust when dealing with incorrect FENs.

That corrected the problem and I now get consistent node-counts across different platforms, compilers and compiler options.

Phew. :lol:
User avatar
hgm
Posts: 28359
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Causes for inconsistent benchmark signatures

Post by hgm »

Life is so much easier at 2 ply! :lol:

In the new Fairy-Max derivative I am making (introducing move sorting in a real move list) I fixed a problem this morning that made random pieces disappear from the board, in a rather irreproducible way. Saving and restoring the piece on the to-square of e.p. captures seems to have made this problem go away. I guess that is what you get when you transfer the info whether a move is an e.p. capture through the move encoding in the hash table, in stead of basing it on the move generator's knowledge that it was a Pawn that was moving, and it went to the e.p. square.
Rein Halbersma
Posts: 749
Joined: Tue May 22, 2007 11:13 am

Re: Causes for inconsistent benchmark signatures

Post by Rein Halbersma »

Evert wrote: What are some common-causes for inconsistencies like these? I've eliminated a random component to root move ordering (so Jazz doesn't always play the same moves early on; disabled for this) and calling libc's qsort() function (which is an unstable sort and will produce slightly different ordering on different platforms because of differences in the implementation).
I would try using std::stable_sort if you can use C++/STL, or write your own insertion sort to rule out algorithm related inconsistencies. E.g. in Stockfish they do

Code: Select all

// Our insertion sort, guaranteed to be stable, as is needed
  void insertion_sort(MoveStack* begin, MoveStack* end)
  {
    MoveStack tmp, *p, *q;

    for (p = begin + 1; p < end; ++p)
    {
        tmp = *p;
        for (q = p; q != begin && *(q-1) < tmp; --q)
            *q = *(q-1);
        *q = tmp;
    }
  }
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Causes for inconsistent benchmark signatures

Post by bob »

Evert wrote:I added a "benchmark" command to Jazz (searching a given set of positions to a given depth and reporting total nodes as well as time and NPS), which will hopefully be very useful. At the moment it's a bit of a pain, however.

The problem is I get different total node-counts in different environments.

OS X, 32 and 64 bit (GCC and Clang):
signature 3356838

Linux 64 bit
signature 3356838

Linux 32 bit
signature 3358082

Windows 64 bit (cross compiled and running under Wine)
signature 3340172

Windows 32 bit (cross compiled and running under Wine)
signature 3357543

What are some common-causes for inconsistencies like these? I've eliminated a random component to root move ordering (so Jazz doesn't always play the same moves early on; disabled for this) and calling libc's qsort() function (which is an unstable sort and will produce slightly different ordering on different platforms because of differences in the implementation).
That is almost certainly a bug that needs fixing. NO reason for the node counts to vary, assuming NOTHING changes but the 32/64 bit compiler differences. IE same hash size, etc... Anything else indicates a bug.