Your favorite crash

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

What makes your engine crash most?

Due to a division by zero: 1 vote (4%)
Infinite loop: 3 votes (12%)
Illegal addressing, such as a pointer out of memory: 21 votes (84%)

Total votes: 25

Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Your favorite crash

Post by Rebel »

Over the years I have noticed that when my engine crashes during development, it's because of a division by zero, apparently my favorite sloppiness.

Yours is?

*edit - the 4th option (other, please elaborate) dropped off, sloppiness again or forum software bug, odd...
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Your favorite crash

Post by hgm »

For me segfaulting is by far the dominant cause. Stack overflow, which is not an option in this poll, has been a good second. (E.g. because of check extension and mutual checking.) Hanging due to infinite loops happens only rarely. During the development of my latest engine (for Tenjiku Shogi) this occurred somewhat more often than I am used to, because it has many bit-extraction loops, and sometimes I forgot to add the "todo -= bit;" to clear the bit that was just treated, or accidentally had put a conditional "continue;" before it.
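A minimal sketch of such a bit-extraction loop, assuming a 64-bit set of pending bits. The function name and the skip_mask filter are hypothetical and only illustrate the pattern. Clearing the bit before any conditional "continue;" is what keeps the loop from spinning forever on the same bit.

```c
#include <assert.h>
#include <stdint.h>

/* Walks the set bits of 'todo' from lowest to highest and counts the ones
   that are not in 'skip_mask' (a hypothetical filter for illustration). */
int count_handled_bits(uint64_t todo, uint64_t skip_mask)
{
    int handled = 0;
    while (todo) {
        uint64_t bit = todo & (0 - todo);  /* isolate the lowest set bit */
        assert(bit != 0);
        todo -= bit;        /* clear it FIRST, before any 'continue' */
        if (bit & skip_mask)
            continue;       /* safe: the bit is already removed from todo */
        handled++;          /* stand-in for the real per-bit work */
    }
    return handled;
}
```

If the "todo -= bit;" line is moved below the "continue;", any bit that matches skip_mask is never cleared and the loop hangs, which is the failure described above.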
D Sceviour
Posts: 570
Joined: Mon Jul 20, 2015 5:06 pm

Re: Your favorite crash

Post by D Sceviour »

hgm wrote:Stack overflow, which is not an option in this poll, has been a good second. (E.g. because of check extension and mutual checking.)
What would be the best method or code insertion for testing stack overflow?
mjlef
Posts: 1494
Joined: Thu Mar 30, 2006 2:08 pm

Re: Your favorite crash

Post by mjlef »

My old program NOW measured time since midnight. When I went to a tournament in another time zone, I did not bother to reset my computer to local time. So when midnight arrived at my home, I saw the elapsed time go negative! I figured it out pretty fast (seeing -86366 was a clue since a day has 60x60x24 = 86400 seconds). I explained the bug to the tournament director and my opponent, who were kind enough to allow me to force my program to move. Needless to say, that night I changed my time function to take days, months and years into account to keep this from ever happening again.
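The arithmetic behind that negative number can be sketched as follows. The function names are made up for illustration and this is not NOW's actual code. The example readings of 86380 and 14 seconds are assumptions chosen so that the naive difference comes out as -86366.

```c
#include <assert.h>
#include <time.h>

#define SECONDS_PER_DAY 86400L

/* Naive difference of two seconds-since-midnight readings:
   goes negative when midnight falls between them. */
long elapsed_naive(long start_sec, long now_sec)
{
    return now_sec - start_sec;
}

/* Wrap-aware difference; only valid for intervals shorter than one day. */
long elapsed_wrapped(long start_sec, long now_sec)
{
    assert(start_sec >= 0 && start_sec < SECONDS_PER_DAY);
    assert(now_sec >= 0 && now_sec < SECONDS_PER_DAY);
    long diff = now_sec - start_sec;
    if (diff < 0)
        diff += SECONDS_PER_DAY;   /* midnight passed in between */
    return diff;
}

/* Calendar-based alternative: time() counts seconds from a fixed epoch,
   so the date is included and midnight is no longer special. */
double elapsed_calendar(time_t start)
{
    return difftime(time(NULL), start);
}
```

With a start reading of 86380 and a later reading of 14, elapsed_naive gives -86366 while elapsed_wrapped gives 34. The calendar-based variant avoids the wrap entirely, although wall-clock time can still jump if the system clock is adjusted.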
D Sceviour
Posts: 570
Joined: Mon Jul 20, 2015 5:06 pm

Re: Your favorite crash

Post by D Sceviour »

One problem found was alignment with local variables. This might be caused by a recursive problem. Apparently, the system reserves a register scratch zone or "red zone" for recursion at the bottom of the variable data list. Another problem might be memory leaks when not initializing user defined types and structures. Other problems like over-stepping array bounds are usually picked up by normal debugging methods.

I do not know much about the red zone or even if it is a real problem, but there is a quotation from the AMD64 ABI:
The 128-byte area beyond the location pointed to by %rsp is considered to be reserved and shall not be modified by signal or interrupt handlers. Therefore, functions may use this area for temporary data that is not needed across function calls. In particular, leaf functions may use this area for their entire stack frame, rather than adjusting the stack pointer in the prologue and epilogue. This area is known as the red zone.
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Your favorite crash

Post by hgm »

D Sceviour wrote:What would be the best method or code insertion for testing stack overflow?
I usually keep a global counter 'ply' to hold the ply level, and an array path[MAXPLY] to hold the moves made at every ply (path[ply++] = move; in MakeMove()). At the top of Search() I then test for ply >= MAXPLY, and if it is, I print the entire path, and exit.
Last edited by hgm on Mon May 22, 2017 9:47 pm, edited 1 time in total.
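A rough sketch of the ply guard described above. MAXPLY = 64, the int move encoding and the toy search that always extends are assumptions for illustration. A real engine would call exit() after printing the path, whereas this sketch returns so it can be exercised safely.

```c
#include <assert.h>
#include <stdio.h>

#define MAXPLY 64            /* hypothetical depth limit */

typedef int Move;            /* placeholder move encoding */

int  ply;                    /* global ply counter */
Move path[MAXPLY];           /* move made at every ply */

void make_move(Move m)  { assert(ply < MAXPLY); path[ply++] = m; }
void unmake_move(void)  { assert(ply > 0); ply--; }

/* Guard for the top of Search(): prints the whole path when the limit is hit. */
int ply_limit_reached(void)
{
    if (ply < MAXPLY)
        return 0;
    fprintf(stderr, "ply overflow, path:");
    for (int i = 0; i < MAXPLY; i++)
        fprintf(stderr, " %d", path[i]);
    fputc('\n', stderr);
    return 1;
}

/* Toy search that always "extends", mimicking runaway check extensions.
   Returns the ply at which the guard stopped it. */
int search(void)
{
    if (ply_limit_reached())
        return ply;          /* a real engine might exit() here instead */
    make_move((Move)(1000 + ply));
    int result = search();
    unmake_move();
    return result;
}
```

The printed path shows which line kept extending, which is usually enough to spot the offending extension rule.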
D Sceviour
Posts: 570
Joined: Mon Jul 20, 2015 5:06 pm

Re: Your favorite crash

Post by D Sceviour »

hgm wrote:
D Sceviour wrote:What would be the best method or code insertion for testing stack overflow?
I usually keep a global counter 'ply' to hold the ply level, and an array path[MAXPLY] to hold the moves made at every ply (path[ply++] = move; in MakeMove()). At the top of Search() I then test for ply >= MAXPLY, and if it is, I print the entire path, and exit.
I thought you were referring to the system stack rather than ply overflow - although the MAXPLY might be a good crude indicator. In the case of a system stack overflow, does the system do a wrap around on the stack pointer, or does it try to reallocate more stack memory automatically?
Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Your favorite crash

Post by Dann Corbit »

On many systems, the program's stack has a fixed size.

It used to be easy to generate stack overflow with something like this:

Code:

#include <stdlib.h>
#include <stdio.h>
#include <math.h>

int stackSmasher(int counter)
{
    /* 131072 doubles = 1 MB of stack per call */
    double ObnoxiouslyLargeDoubleArray[131072];
    int i;
    for (i = 0; i < 131072; i++)
    {
        /* multiply as double to avoid signed integer overflow */
        ObnoxiouslyLargeDoubleArray[i] = i * (double)rand();
    }
    printf("counter=%d, sin(ObnoxiouslyLargeDoubleArray[131071]) = %g\n", counter, sin(ObnoxiouslyLargeDoubleArray[131071]));
    if (counter <= 0) return 0;
    /* counter - 1, not counter--: post-decrement would pass the unchanged value */
    return stackSmasher(counter - 1);
}
called by:

Code:

#include <stdio.h>
#include <math.h>

int stackSmasher(int counter);

int main(void)
{
    stackSmasher(256);
    return 0;
}
But now all the compilers will simply eliminate the tail recursion.

You can change the stack size of a program even after it is linked with Windows (not sure about Linux, because I never tried it).
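On the Linux side, a small sketch of how the limit can at least be inspected. It assumes a POSIX system, where the main thread's stack limit is a per-process resource limit readable with getrlimit(RLIMIT_STACK). The function name is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Returns the soft stack limit in bytes, 0 if it is unlimited,
   or -1 if the query fails. POSIX only. */
long long stack_soft_limit(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    assert(rl.rlim_cur <= rl.rlim_max);   /* soft limit never exceeds hard limit */
    if (rl.rlim_cur == RLIM_INFINITY)
        return 0;
    return (long long)rl.rlim_cur;
}

/* Prints the limit in a human-readable form. */
void print_stack_limit(void)
{
    long long lim = stack_soft_limit();
    if (lim < 0)
        printf("could not query stack limit\n");
    else if (lim == 0)
        printf("stack limit: unlimited\n");
    else
        printf("stack limit: %lld kB\n", lim / 1024);
}
```

The same limit is what the shell's "ulimit -s" reports and changes, so there it is set before the program starts rather than patched into the binary.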
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Your favorite crash

Post by hgm »

D Sceviour wrote:I thought you were referring to the system stack rather than ply overflow - although the MAXPLY might be a good crude indicator. In the case of a system stack overflow, does the system do a wrap around on the stack pointer, or does it try to reallocate more stack memory automatically?
Well, either of the two can happen, depending on how you organize the program. If the move list and PV are held in local variables, they are on the system stack. And like Dann says, this is usually limited to a few MB.
Ras
Posts: 2487
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: Your favorite crash

Post by Ras »

D Sceviour wrote:What would be the best method or code insertion for testing stack overflow?
Since my chess system is based on a micro-controller, stack overflow is much more of a threat than with PC systems. I'm using two methods:

1) I call GCC with the option -fstack-usage. That gives me the stack usage for each function. So I "just" have to limit the maximum depth, multiply that by the stack usage per level, add the QS usage multiplied by the number of QS levels, and add the stack usage for static eval.

Actually, I'm also adding up the stack usage for interrupt service handlers and exception handlers on top of that.

2) I have code instrumentation for stack usage, optionally enabled with a #define. Assuming a descending stack (x86 and ARM), you can just declare local variables and take their address, most valuable in static eval, of course. If said address is lower than the lowest recorded stack address so far, update the latter to the former. That's a bit hacky with regard to the C standard, but it works for confirming that the analysis (step 1) didn't overlook anything relevant.
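The instrumentation from method 2 could look roughly like this. It is a sketch under the same assumption of a downward-growing stack. All names (STACK_PROBE, stack_used, toy_eval) are invented for illustration and are not the actual code. Comparing addresses of unrelated locals is outside what the C standard guarantees, as noted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_PROBE_ENABLED 1      /* set to 0 to compile the probes out */

static uintptr_t stack_base;       /* address recorded at the reset point */
static uintptr_t stack_lowest;     /* lowest address seen so far */

/* Records 'addr' if it is below everything seen so far (descending stack). */
void stack_note(uintptr_t addr)
{
    if (addr < stack_lowest)
        stack_lowest = addr;
}

/* Bytes between the reset point and the deepest probe so far. */
size_t stack_used(void)
{
    assert(stack_lowest <= stack_base);
    return (size_t)(stack_base - stack_lowest);
}

#if STACK_PROBE_ENABLED
#define STACK_PROBE() \
    do { volatile char probe_ = 0; stack_note((uintptr_t)&probe_); } while (0)
#else
#define STACK_PROBE() ((void)0)
#endif

/* Stand-in for a deep call chain ending in static eval. */
int toy_eval(int depth)
{
    volatile char pad[256];        /* gives each frame a noticeable size */
    pad[0] = (char)depth;
    STACK_PROBE();
    if (depth == 0)
        return pad[0];
    return toy_eval(depth - 1) + pad[0];
}

/* Resets the markers, runs the toy chain, and reports the bytes used. */
size_t measure_toy(int depth)
{
    volatile char marker = 0;
    stack_base = stack_lowest = (uintptr_t)&marker;
    (void)toy_eval(depth);
    return stack_used();
}
```

The measured value is only a lower bound on real usage, since frames below the deepest probe are not seen. That is why it serves as a cross-check of the static analysis rather than a replacement for it.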

That said, I remember a default value of 1 MB for stack on a PC. I have 28 kB (!) allocated for the stack, and that works for 20 plies regular search, extended by 8 plies QS search, and then static eval. So I don't see how stack usage would ever be an issue on a PC.
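The budget calculation from method 1 can be written out as below. The per-frame byte counts in the usage note are invented placeholders, not measured values. Only the 20 plies, 8 QS plies and 28 kB come from the post. In practice the frame sizes would be read from the .su files that -fstack-usage produces.

```c
#include <assert.h>
#include <stddef.h>

/* Worst-case stack need in bytes: search frames for every ply, QS frames for
   every QS ply, one static eval frame, plus a reserve for interrupt and
   exception handlers. All frame sizes are inputs, e.g. from GCC's .su files. */
size_t worst_case_stack(size_t search_frame, int max_ply,
                        size_t qs_frame, int max_qs_ply,
                        size_t eval_frame, size_t irq_reserve)
{
    assert(max_ply >= 0 && max_qs_ply >= 0);
    return (size_t)max_ply * search_frame
         + (size_t)max_qs_ply * qs_frame
         + eval_frame
         + irq_reserve;
}
```

For example, with hypothetical frames of 800 bytes for search, 400 for QS, 512 for eval and a 1024-byte handler reserve, 20 plies plus 8 QS plies come to 20736 bytes, which fits within 28 kB (28672 bytes).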