How to find SMP bugs ?

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: How to find SMP bugs ?

Post by lucasart »

jdart wrote:
lucasart wrote:
Maybe other engines are directly playing the TT move, in an attempt to save time generating moves if it fails high. I don't do that.
Practically all modern engines do this. It is a considerable time-saver, because you skip the whole move generation step.
Not in my experience. I tried it, and there was no measurable gain. The problem is that you tax the general case to speed up the special case. You only get a speedup when you have a TT move, the node is a cut node, and the TT move is the best move. In all other cases, you add overhead in the move sorting machinery. It also makes the code more complicated, hence harder to maintain and less flexible. I also have a practical "proof in the pudding": my NPS is on par with Stockfish without it.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: How to find SMP bugs ?

Post by Sven »

lucasart wrote:
jdart wrote:
lucasart wrote:
Maybe other engines are directly playing the TT move, in an attempt to save time generating moves if it fails high. I don't do that.
Practically all modern engines do this. It is a considerable time-saver, because you skip the whole move generation step.
Not in my experience. I tried it, and there was no measurable gain. The problem is that you tax the general case to speed up the special case. You only get a speedup when you have a TT move, the node is a cut node, and the TT move is the best move. In all other cases, you add overhead in the move sorting machinery. It also makes the code more complicated, hence harder to maintain and less flexible. I also have a practical "proof in the pudding": my NPS is on par with Stockfish without it.
I agree with your reasoning as to why trying the TT move prior to normal move generation is not necessarily a considerable time-saver. A better wording, as always, would be "it can be a considerable time-saver, depending on other properties of the engine".

But I disagree with the reasoning in your last sentence: your NPS being on par with Stockfish can have many different reasons that are entirely unrelated to this "try TT move without movegen" point. For instance, your evaluation might be faster because it includes fewer eval features. The only way to get a "proof in the pudding" would be to compare "with" vs. "without" *in your engine*.
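
For readers following along, the "try the TT move before generating moves" idea can be sketched as a staged move picker. This is only a minimal illustration and not code from any engine in this thread: `StagedPicker`, the `Move` encoding and the `generations` counter are hypothetical names, and the TT move is assumed to have been validated already. The counter shows when the generator is skipped (the case that benefits), while the stage check on every call is the overhead paid by all other nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using Move = std::uint16_t;  // hypothetical encoding; 0 means "no move"

// Hypothetical staged picker: hands out the TT move first and only calls the
// (expensive) generator when the caller asks for a further move.
class StagedPicker {
public:
    StagedPicker(Move ttMove, std::function<std::vector<Move>()> generate)
        : ttMove_(ttMove), generate_(std::move(generate)) {
        assert(static_cast<bool>(generate_));  // a generator callback is required
    }

    // Returns the next move to search, or 0 when none are left.
    Move next() {
        if (stage_ == Stage::TT) {
            stage_ = Stage::Generate;
            if (ttMove_ != 0)
                return ttMove_;          // no move generation needed yet
        }
        if (stage_ == Stage::Generate) {
            moves_ = generate_();        // generation is deferred until here
            ++generations;
            stage_ = Stage::Rest;
        }
        while (index_ < moves_.size()) {
            Move m = moves_[index_++];
            if (m != ttMove_)            // the TT move was already handed out
                return m;
        }
        return 0;
    }

    int generations = 0;  // how many times the generator actually ran

private:
    enum class Stage { TT, Generate, Rest };
    Stage stage_ = Stage::TT;
    Move ttMove_;
    std::function<std::vector<Move>()> generate_;
    std::vector<Move> moves_;
    std::size_t index_ = 0;
};
```

If the search gets a beta cutoff on the first move returned, `generations` stays at 0 for that node. Whether that saving outweighs the extra branching is exactly what has to be measured per engine.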
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: How to find SMP bugs ?

Post by lucasart »

Sven Schüle wrote:The only way to get a "proof in the pudding" would be to compare "with" vs. "without" *in your engine*
I already said that I did that, and measured no gain. With vs. without, all else equal. I was just giving another "indicator" for those who refuse to believe it.
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: How to find SMP bugs ?

Post by lucasart »

cdani wrote:
mar wrote:First thing that comes to mind would be TT move validation.
So I'd start with a simple check that tt move is legal.
To solve this in one go, I tested the validation against randomly generated moves, including ones with nonsense squares and states. It should now be 100% reliable in Andscacs.
Good strategy.
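
A rough sketch of that kind of test, under heavy simplification: the `Board`, `Move` and `tt_move_is_sane` names below are hypothetical, and the check only covers square ranges and piece ownership. A real engine would also have to verify piece movement rules, castling, en passant and promotions. The point is just that every field is range-checked before it is used as an index, and that the validator is then hammered with random moves.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical, deliberately simplified types (not taken from any engine).
enum Piece : std::uint8_t { EMPTY, WHITE_PIECE, BLACK_PIECE };

struct Board {
    std::array<Piece, 64> squares{};   // all EMPTY by default
    Piece sideToMove = WHITE_PIECE;
};

struct Move { int from, to; };         // may hold garbage if read from a corrupted entry

// Range-check everything *before* touching the board, so garbage can never
// index out of bounds.
bool tt_move_is_sane(const Board& b, Move m) {
    if (m.from < 0 || m.from >= 64 || m.to < 0 || m.to >= 64) return false;
    if (m.from == m.to) return false;
    if (b.squares[m.from] != b.sideToMove) return false;  // must move an own piece
    if (b.squares[m.to] == b.sideToMove) return false;    // cannot capture an own piece
    return true;
}

// Feeds random moves, including nonsense squares, to the validator and
// returns how many were accepted.
int fuzz_validator(const Board& b, int iterations, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> anySquare(-64, 127);  // mostly off-board values
    int accepted = 0;
    for (int i = 0; i < iterations; ++i) {
        Move m{anySquare(rng), anySquare(rng)};
        if (tt_move_is_sane(b, m)) {
            // Anything accepted must at least be on the board.
            assert(m.from >= 0 && m.from < 64 && m.to >= 0 && m.to < 64);
            ++accepted;
        }
    }
    return accepted;
}
```

Running this under a sanitizer or a debug build with bounds checking would be the natural way to catch an out-of-range access that the validator lets through.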

I removed every kind of synchronization, and simplified the search to the bare minimum, and it still crashes. I then disabled the HT (hash table), and the crashes stopped. So it's indeed HT related. But it's not about the legality of the TT move. The problem is triggered by HT pruning with garbage data. If I disable this HT pruning code, the crashes disappear:

Code:

if (he.depth >= depth && ply > 0) {
    if (he.score <= alpha && he.bound >= EXACT)
        return he.score;
    else if (he.score >= beta && he.bound <= EXACT)
        return he.score;
    else if (alpha < he.score && he.score < beta && he.bound == EXACT)
        return he.score;
}
It's really not obvious why HT pruning with garbage data could crash, but now I have a single-threaded debug strategy: generate random garbage in the hash entry, and see what happens between the HT pruning and the crash.

For now, let's sleep on this...
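
A minimal sketch of such a single-threaded harness, with several assumptions stated up front: the `HashEntry` layout, the `MATE` limit, and the `LBOUND < EXACT < UBOUND` ordering are hypothetical (the ordering is only inferred from the comparisons in the pruning code above), and `ht_prune` is a stand-alone copy of that code rather than the real search. It fills an entry with random bytes and counts how often the pruning test lets through a score or bound outside the expected range. It does not establish that this is what causes the crash.

```cpp
#include <cstdint>
#include <cstring>
#include <random>

enum Bound : std::uint8_t { LBOUND, EXACT, UBOUND };  // assumed ordering
constexpr int MATE = 32000;                           // hypothetical score limit

struct HashEntry {            // hypothetical layout, 6 bytes without padding
    std::int16_t score;
    std::int8_t depth;
    std::uint8_t bound;
    std::uint16_t move;
};

// Stand-alone copy of the pruning test; returns true and sets `score` on a cutoff.
bool ht_prune(const HashEntry& he, int depth, int ply, int alpha, int beta, int& score) {
    if (he.depth >= depth && ply > 0) {
        if ((he.score <= alpha && he.bound >= EXACT) ||
            (he.score >= beta && he.bound <= EXACT) ||
            (alpha < he.score && he.score < beta && he.bound == EXACT)) {
            score = he.score;
            return true;
        }
    }
    return false;
}

struct FuzzReport { int pruned = 0; int badScore = 0; int badBound = 0; };

// Overwrites the entry with random bytes and records what the pruning test accepts.
FuzzReport fuzz_ht_pruning(int iterations, unsigned seed) {
    std::mt19937 rng(seed);
    FuzzReport r;
    for (int i = 0; i < iterations; ++i) {
        unsigned char raw[sizeof(HashEntry)];
        for (unsigned char& byte : raw)
            byte = static_cast<unsigned char>(rng() & 0xFF);
        HashEntry he;
        std::memcpy(&he, raw, sizeof he);  // simulate a torn or corrupted entry
        int score = 0;
        if (ht_prune(he, 5, 3, -100, 100, score)) {
            ++r.pruned;
            if (score < -MATE || score > MATE) ++r.badScore;  // outside the score range
            if (he.bound > UBOUND) ++r.badBound;              // not a valid bound type
        }
    }
    return r;
}
```

In a real engine the interesting part comes after this point: take one of the garbage entries that gets accepted, inject it into the actual search, and step through from the early return to wherever the bad value is consumed.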
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: How to find SMP bugs ?

Post by Sven »

lucasart wrote:If I disable this HT pruning code, crashes disappear:

Code:

if (he.depth >= depth && ply > 0) {
    if (he.score <= alpha && he.bound >= EXACT)
        return he.score;
    else if (he.score >= beta && he.bound <= EXACT)
        return he.score;
    else if (alpha < he.score && he.score < beta && he.bound == EXACT)
        return he.score;
}
I see no way for the code above to crash. But could accessing "he" somehow be related to using an illegal address? E.g.

Code:

he = *illegal_ptr;
prior to the code above? And if you disable the code above then that assignment gets optimized away so it does not crash there any longer?
Just a wild guess, of course ...