Causes for inconsistent benchmark signatures

Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Causes for inconsistent benchmark signatures

Post by Evert »

Of course the fun started up again when I re-ran the benchmark test one ply deeper than before and I again ended up with slightly different node counts. :cry:

This time I knew the difference occurred at depth=9, because all was fine at depth=8. Fortunately I wrote some rather extensive logging functions a long time ago that print out the search tree along with search bounds, check state, position being searched, pruning decisions, almost everything, in short. So all I had to do was switch the logging on (through a pre-processor symbol), tweak it a bit (so it would only kick in after the 8-ply search) and run the benchmark on both systems. Diff (actually vimdiff) did the rest.
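A minimal sketch of this kind of switchable logging, for anyone wanting to try the same approach. The names `SEARCH_LOG`, `log_from_depth`, `log_active` and `TRACE` are hypothetical and not taken from Jazz; the only assumption is that the search knows its current iteration depth and ply.

```c
#include <stdio.h>

/* Hypothetical symbol; normally passed on the compiler command line
 * as -DSEARCH_LOG rather than defined in the source. */
#define SEARCH_LOG 1

/* Assumed threshold: stay silent until the first iteration that differs. */
static int log_from_depth = 9;

/* Returns 1 when the given iteration depth should be logged. */
static int log_active(int iteration_depth)
{
    return iteration_depth >= log_from_depth;
}

#ifdef SEARCH_LOG
/* Print one line to stderr, indented by two spaces per ply. */
#define TRACE(iteration_depth, ply, ...)                  \
    do {                                                  \
        if (log_active(iteration_depth)) {                \
            fprintf(stderr, "%*s", 2 * (ply), "");        \
            fprintf(stderr, __VA_ARGS__);                 \
            fputc('\n', stderr);                          \
        }                                                 \
    } while (0)
#else
/* Compiles to nothing when logging is switched off. */
#define TRACE(iteration_depth, ply, ...) ((void)0)
#endif
```

A call such as `TRACE(depth, ply, "alpha=%d beta=%d", alpha, beta);` at each node, with stderr redirected to a file on both systems, gives two logs that can be compared with diff.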

Turns out I had a typo in the code that verifies mate-scores in the quiescence search, which was accessing an out-of-bounds array index. Maybe that code is a bit too clever to keep anyway, but I fixed the bug rather than delete the code. All seems well again now.

Sadly this does mean I missed my chance to release a new version of Jazz at revision 720, which would have been neat. :(
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Causes for inconsistent benchmark signatures

Post by Michel »

Evert wrote: which was accessing an out-of-bounds array index.
This wasn't caught by Valgrind? That's possible since Valgrind does not
catch all memory errors.

Was this a global array or an array on the stack? If it was allocated
with malloc then the error might have been caught by linking with libefence
(Electric Fence), which only guards heap allocations.
A buffer overrun on the stack might have been caught by enabling the GCC stack protection options.
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Causes for inconsistent benchmark signatures

Post by Evert »

Michel wrote: This wasn't caught by Valgrind? That's possible since Valgrind does not
catch all memory errors.
No. But it's possible that I didn't run the benchmark to the same depth, or, come to think of it, it may be that it wasn't actually accessing an out-of-bounds element but just the wrong (almost random) element. The calculation for the index was wrong and I fixed it without checking what incorrect element it was probing. ;)
Michel wrote: Was this a global array or an array on the stack?
Well, neither, I guess - it's a dynamically allocated array that resides in a struct that is passed to the function.
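As an illustration only, one way to guard such an array is to route every access through a checked accessor. The struct and function names below (`score_table_t`, `table_init`, `index_ok`, `table_at`, `table_free`) are made up for this sketch and are not the Jazz code. Note that an assert like this only catches an index outside the allocation; an index that is in range but simply wrong passes the check, just as it would pass Valgrind.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a struct that owns a heap-allocated array. */
typedef struct {
    int   *score;  /* dynamically allocated, `size` elements */
    size_t size;
} score_table_t;

/* Allocate a zero-filled table; returns 1 on success, 0 on failure. */
static int table_init(score_table_t *t, size_t size)
{
    t->score = calloc(size, sizeof *t->score);
    t->size  = t->score ? size : 0;
    return t->score != NULL;
}

/* Returns 1 if idx addresses an element inside the allocation. */
static int index_ok(const score_table_t *t, long idx)
{
    return idx >= 0 && (size_t)idx < t->size;
}

/* Checked accessor: aborts in debug builds on an out-of-range index.
 * Compiling with -DNDEBUG removes the check. */
static int *table_at(score_table_t *t, long idx)
{
    assert(index_ok(t, idx));
    return &t->score[idx];
}

static void table_free(score_table_t *t)
{
    free(t->score);
    t->score = NULL;
    t->size  = 0;
}
```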
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Causes for inconsistent benchmark signatures

Post by Evert »

Rein Halbersma wrote: I would try using std::stable_sort if you can use C++/STL, or write your own insertion sort to rule out algorithm related inconsistencies.
Yes, that's what I did (use my own sort; the code is C, so no STL). In fact I had the code in there already, but disabled, possibly because it tested as slower on selected test positions back when I wrote it. It tests as faster now though, so it's not only a bug fix but actually an improvement. :D
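For reference, a sketch of what a hand-written stable sort for a move list could look like in C. The `scored_move_t` type and `sort_moves` name are assumptions for illustration, not the actual Jazz code. The relevant point is that the C standard does not specify how `qsort` orders elements that compare equal, so two C libraries may order equally scored moves differently, whereas an insertion sort with a strict comparison keeps them in their original order everywhere.

```c
#include <stddef.h>

/* Hypothetical move-list entry: an encoded move plus its ordering score. */
typedef struct {
    int move;
    int score;
} scored_move_t;

/* Stable insertion sort, highest score first.
 * The strict '<' means an element is never moved past another element
 * with the same score, so ties keep their generation order. */
static void sort_moves(scored_move_t *m, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        scored_move_t key = m[i];
        size_t j = i;
        while (j > 0 && m[j - 1].score < key.score) {
            m[j] = m[j - 1];
            j--;
        }
        m[j] = key;
    }
}
```

Insertion sort is quadratic in the worst case, but move lists are short and often nearly ordered already, which may be why it can come out faster than a general-purpose library sort.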