Trying to catch a difficult bug

Ras · Post by **Ras** » Wed Nov 09, 2016 5:59 pm

Henk wrote:In principal only functional programming is ok.

And still, every top engine is in C or C++. The topmost engine in a functional language is Barbarossa (AFAIK) which in written in Haskell. Place 285 with 2025 ELO.

Functional programming is good for chess in theory and problematic in practice. The reason is that the perfect "immutable state" in the search tree is only reached by copying around e.g. the whole board and lots of other things instead of making/unmaking moves.

Ras · Post by **Ras** » Wed Nov 09, 2016 6:09 pm

I don't think it is stack. The standard stack size under Windows is 1 MB and 8 MB under Linux. That's huge, and to get so deep in the recursion that you even might think of using it up, you'd need an extremely efficient search tree pruning, a very fast computer and a lot of time.

Just for the record, I'm allowing 20 plies of search depth plus 8 plies of QS, then the static eval, and the interrupt stacks on top of that - all with a total stack size of 28 kB.

But it is easy to verify this. Use GCC and pass the additional option "-fstack-usage". It will generate a bunch of .su files, one for each C source file. You will find the static stack size of each function, not counting the children functions. There are automated tools for evaluating the call tree, but they don't work with recursion.

Add up the path from main until the search algorithm starts, then add the stack usage of the search algorithm times the search depth, then add up QS and static eval. It will not even be close to 1 MB.

Ras · Post by **Ras** » Wed Nov 09, 2016 6:44 pm

I had something similar when I took over NG-Play. Rare and unmotivated crashes, every 20 games or so, but only after reducing the hash table size and converting it from heap allocation to static allocation. It took me a full weekend to fix it.

The issue went as follows: the hash tables were defined with a certain size (power of two). The bucket size was three. Now if, by accident, the bucket started close to the end AND the first entries were occupied, then the last write could happen at something like table[TABLESIZE] or so, which was out of bounds.

But that's not what made the program crash. Right after the hash table, there was another array. The array variable itself contained the address of the memory where the array content was, of course. This variable was overwritten by the out of bounds write.

Means, the array pointer of this array now was pointing into the dark. So the program crashed lateron when trying to do something with that array, which involved dereferencing the array pointer, at a totally unrelated point.

The reason why it had never been crashing in the original version was that it had been using dynamic allocation, and therefore this error had only involed a bit of heap trashing.

The solution was just to increase the table size by the bucket size.

How did I find it?

I compiled the program with -g -O0. Then I loaded it into the GNU debugger, GDB. Works under GNU/Linux as well as in Cygwin under Windows. I set a catch on the signal SIGSEGV, and GDB stopped at the line where the the crash was.

I restarted the whole thing and set both a catch on SIGSEGV and a write-catch on the corrupted array pointer variable in question. I let it run again until it stopped in the hash table function where the out of bounds access overwrote the array pointer. And then I had just to figure out why that happened.

Here are some useful links how to use GDB in command line mode:

http://www.unknownroad.com/rtfm/gdbtut/gdbsegfault.html
http://www.cprogramming.com/debugging/segfaults.html
https://sourceware.org/gdb/onlinedocs/g ... reakpoints

Another thing might be undefined behaviour, which can be really nasty. CppCheck and GCC with -Wall can detect some instances of that, but by far not all.

Solution: use the undefined behaviour sanitiser. Works under GNU/Linux only, not with Cygwin, but firing up a live distro via USB is easy enough. I used GNU/Linux Mint because it has a reasonably recent GCC.

Use the following options to compile the program:

-Wall -Wmaybe-uninitialized -Wstrict-aliasing -Wlogical-op -Wcast-align -g -fsanitize=address -fsanitize=bounds -fsanitize=object-size -fsanitize=alignment -fsanitize=null -fsanitize=undefined -fsanitize=shift -fsanitize=signed-integer-overflow -fsanitize=integer-divide-by-zero

The whole thing will be two or three times slower due to the code instrumentation. But it will print out useful feedback on the terminal, indicating what kind of problem there is in what line.

HTH.

brtzsnr · Post by **brtzsnr** » Thu Nov 10, 2016 12:36 am

For very obscure crashes due to wrongly accessing memory I use Address Sanitizer. It very nice that will find the first illegal memory access and print the stack trace. Undefined behaviour sanitizer is also cool. Give it a try.

https://gcc.gnu.org/onlinedocs/gcc/Inst ... tions.html

gcc -fsanitize=address -g source_macsourceface.c

Henk · Post by **Henk** » Thu Nov 10, 2016 9:35 am

Ras wrote:
Henk wrote:In principal only functional programming is ok.
And still, every top engine is in C or C++. The topmost engine in a functional language is Barbarossa (AFAIK) which in written in Haskell. Place 285 with 2025 ELO.

Functional programming is good for chess in theory and problematic in practice. The reason is that the perfect "immutable state" in the search tree is only reached by copying around e.g. the whole board and lots of other things instead of making/unmaking moves.

You probably only need to copy a few references of a tree. But still may be too slow. Although multi threading might give compensation.

Evert · Post by **Evert** » Thu Nov 10, 2016 11:49 am

Ras wrote:I don't think it is stack. The standard stack size under Windows is 1 MB and 8 MB under Linux. That's huge, and to get so deep in the recursion that you even might think of using it up, you'd need an extremely efficient search tree pruning, a very fast computer and a lot of time.

Correct, but if you allocate a data structure of a few MB on the stack, it may well fill up. I've had that happen to me a few times.
The solution, of course, is to not declare local data structures that are that big.

Ras · Post by **Ras** » Thu Nov 10, 2016 5:52 pm

Henk wrote:You probably only need to copy a few references of a tree.

Doesn't help. Yeah, you could of course hand over the move chain instead of a pointer to a modified board. But then in every node, you'd have to make all the moves of the chain on a lokal copy of the board.

But still may be too slow.

That's unavoidable, at least if it shall stay purely functional. I wonder how Barbarossa has implemented hash tables which are mutable state by definition.

Although multi threading might give compensation.

Indeed, it might - but we aren't speaking of multi-core here. Functional actually kicks in when the task is how to transparantly handle some thousand underlying servers - where you simply don't have any chance to implement that in C or C++ unless the implementation would be allowed to take forever and a day.

Ras · Post by **Ras** » Thu Nov 10, 2016 5:57 pm

Agree here, but if I take a look on what you'd put on the stack, that's mostly the move list. Say for 300 moves per node, and that is already overkill. Let's say one uint32_t per move (from, to, mvv_lva and flags). That is 1200 bytes per depth level. In order to get a stack overflow with the 1 MB Windows stack, you'd need more than 800 plies seach depth.

But compiling with -fstack-usage will easily show whether some functions throw half of London on the stack.

Henk · Post by **Henk** » Thu Nov 10, 2016 6:12 pm

Ras wrote:
Henk wrote:You probably only need to copy a few references of a tree.
Doesn't help. Yeah, you could of course hand over the move chain instead of a pointer to a modified board. But then in every node, you'd have to make all the moves of the chain on a lokal copy of the board.

But still may be too slow.
That's unavoidable, at least if it shall stay purely functional. I wonder how Barbarossa has implemented hash tables which are mutable state by definition.

Although multi threading might give compensation.
Indeed, it might - but we aren't speaking of multi-core here. Functional actually kicks in when the task is how to transparantly handle some thousand underlying servers - where you simply don't have any chance to implement that in C or C++ unless the implementation would be allowed to take forever and a day.

No you can represent a chess board by a tree instead of an array. So when performing a move you create a new tree out of an existing tree by reusing (referencing) whole branches.

Not that much difference between a tree and a sparse array.

Sven · Post by **Sven** » Thu Nov 10, 2016 7:38 pm

Ras wrote:Agree here, but if I take a look on what you'd put on the stack, that's mostly the move list. Say for 300 moves per node, and that is already overkill. Let's say one uint32_t per move (from, to, mvv_lva and flags). That is 1200 bytes per depth level. In order to get a stack overflow with the 1 MB Windows stack, you'd need more than 800 plies seach depth.

But compiling with -fstack-usage will easily show whether some functions throw half of London on the stack.

The problem of the OP is most probably caused by a bug. One possible bug could be an infinite recursion, e.g. forever doing nullmoves, forever doing check extensions, ... Such a case would break *all* kinds of reasonable stack size limits.

Trying to catch a difficult bug

Re: Trying to catch a difficult bug

Re: Trying to catch a difficult bug

Re: Trying to catch a difficult bug

Re: Trying to catch a difficult bug

Re: Trying to catch a difficult bug

Re: Trying to catch a difficult bug

Re: Trying to catch a difficult bug

Re: Trying to catch a difficult bug

Re: Trying to catch a difficult bug

Re: Trying to catch a difficult bug