Perft search speed bottleneck

Discussion of chess software programming and technical issues.

Moderators: hgm, Dann Corbit, Harvey Williamson

User avatar
hgm
Posts: 27703
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Perft search speed bottleneck

Post by hgm »

lucasart wrote:I generate legal moves directly, and with nosignificant performance loss compared to a pseudo-legal generator, but with the huge gain of never having to check anything afterwards.
Note, however, that although this works wonders for a perft, it hardly does anything for an engine. Moves that leave your King in check (if it wasn't already before the move) are rare, and letting the engine simply play them and discover that it can capture a King in the reply node to abort there doesn't introduce much overhead.

For my engine the precalculation of pinned pieces is mainly useful because at the same time it also detects slider checks (in which case the dedicated evasion generator can be used), and whether you are exposed to discovered checks.
Chan Rasjid
Posts: 588
Joined: Thu Mar 09, 2006 4:47 pm
Location: Singapore

Re: Perft search speed bottleneck

Post by Chan Rasjid »

hgm wrote:
lucasart wrote:I generate legal moves directly, and with nosignificant performance loss compared to a pseudo-legal generator, but with the huge gain of never having to check anything afterwards.
Note, however, that although this works wonders for a perft, it hardly does anything for an engine. Moves that leave your King in check (if it wasn't already before the move) are rare, and letting the engine simply play them and discover that it can capture a King in the reply node to abort there doesn't introduce much overhead.

For my engine the precalculation of pinned pieces is mainly useful because at the same time it also detects slider checks (in which case the dedicated evasion generator can be used), and whether you are exposed to discovered checks.
Illegal moves that leave one's own king open to slider attack is not rare. I measured it. In late-middle or endgame, the percentage of illegal psuedo-legal moves may be 30/40%. The method of detecting capture of the king in the next node (once done in Crafty) is inefficient.

[EDIT] There could be good reason why we need to know that a move is valid
before we call search; eg futilty pruning, etc.

Rasjid.
JarvisXerxes
Posts: 20
Joined: Thu May 02, 2013 5:52 pm
Location: United States

Re: Perft search speed bottleneck

Post by JarvisXerxes »

Thanks Lucas! Yes, I do need to integrate the legality check into my moveGen. The profiler does complain that isLegalMove() is a hot path.

Stepping through another's source code has been a big headache for me. My program architecture is very different from other opensource engines. I did take a look at Stockfish's movegen, but got my mind blown out (or perhaps I'm just not experienced enough in programming).

That's why I post here and ask for high-level advice. :)
Life is a statistical distribution.
JarvisXerxes
Posts: 20
Joined: Thu May 02, 2013 5:52 pm
Location: United States

Re: Perft search speed bottleneck

Post by JarvisXerxes »

tpetzke wrote:Hi Jim,

welcome to the forum.

for (int i = currentPointer; i < nexPointer; i++)
Where are current pointer and nex pointer initialized ? Do you maintain them outside of your function ? I would recommend to change that.

Let your move generator return you a pointer to the list of generated moves.

What is your move representation? Most engines have it as an plain int so it is passed around rather quickly and easy to store in the transposition table later.

Where you do the legality check does not influence the speed. For perft a significant speedup however is to perform a legal check at depth 1 and return here the number of legal moves instead of executing them all and return 1. For later search performance this does not help much.

And make sure your are compiling your code in Visual Studio - Release Mode. I'm sure you are doing that, but I wasn't with my first release.

Thomas...
Hi Thomas,

My code snippet is sort of pseudo-C++. I omit a bit more details to outline my idea more clearly.
Thanks for the reminder - yes I'm indeed doing the recursion one level deeper, which isn't necessary and probably slows perft down significantly.

My move representation follows the common approach: a dozen bits for the from/to squares, a dozen bits for the captured/moving pieces and a few more for special flags (castling, EP capture).
When I pass a Move object (basically a 32-int) to make/unmake(), they simply call get() methods on the move object to obtain relevant info.
Life is a statistical distribution.
JarvisXerxes
Posts: 20
Joined: Thu May 02, 2013 5:52 pm
Location: United States

Re: Perft search speed bottleneck

Post by JarvisXerxes »

Chan Rasjid wrote:
hgm wrote:
lucasart wrote:I generate legal moves directly, and with nosignificant performance loss compared to a pseudo-legal generator, but with the huge gain of never having to check anything afterwards.
Note, however, that although this works wonders for a perft, it hardly does anything for an engine. Moves that leave your King in check (if it wasn't already before the move) are rare, and letting the engine simply play them and discover that it can capture a King in the reply node to abort there doesn't introduce much overhead.

For my engine the precalculation of pinned pieces is mainly useful because at the same time it also detects slider checks (in which case the dedicated evasion generator can be used), and whether you are exposed to discovered checks.
Illegal moves that leave one's own king open to slider attack is not rare. I measured it. In late-middle or endgame, the percentage of illegal psuedo-legal moves may be 30/40%. The method of detecting capture of the king in the next node (once done in Crafty) is inefficient.

[EDIT] There could be good reason why we need to know that a move is valid
before we call search; eg futilty pruning, etc.

Rasjid.
So if I'm getting this correctly... is it always prefered to build the legality verification into movegen()?
Life is a statistical distribution.
User avatar
hgm
Posts: 27703
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Perft search speed bottleneck

Post by hgm »

Are you sure you counted the right thing? When I count for the following position in Fairy-Max
[d]r5k1/pNpnrp1p/4b1pq/1B6/3P4/2P1bN2/PPQ3PP/R4R1K w - - 7 20
I get 2,463,023 King captures out of 71,981,845 nodes. That is only 3.4%. That sounds much more reasonable than the number you report.

And Fairy-Max is completely stupid about this: it has no separate move generator for check evasions. So I am pretty sure the bulk of the illegal moves is from positions where it already is in check, so that almost any move is illegal. Of course in a serious engine you would never do that.
JarvisXerxes
Posts: 20
Joined: Thu May 02, 2013 5:52 pm
Location: United States

Re: Perft search speed bottleneck

Post by JarvisXerxes »

hgm wrote:Are you sure you counted the right thing? When I count for the following position in Fairy-Max
[d]r5k1/pNpnrp1p/4b1pq/1B6/3P4/2P1bN2/PPQ3PP/R4R1K w - - 7 20
I get 2,463,023 King captures out of 71,981,845 nodes. That is only 3.4%. That sounds much more reasonable than the number you report.

And Fairy-Max is completely stupid about this: it has no separate move generator for check evasions. So I am pretty sure the bulk of the illegal moves is from positions where it already is in check, so that almost any move is illegal. Of course in a serious engine you would never do that.
So will the search speed drop if I generate strictly legal moves only? If not, I'd prefer to verify inside moveGen(). Then I don't need to test legality every time I make a move.
Life is a statistical distribution.
tpetzke
Posts: 686
Joined: Thu Mar 03, 2011 4:57 pm
Location: Germany

Re: Perft search speed bottleneck

Post by tpetzke »

The engine will later only spend a small percentage of their time in movegen, so making it a bit slower/faster does not make a huge difference.

If you implement an efficient legal check, you can do it in movegen. It is not so popular but I do it as well.

Thomas...
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Perft search speed bottleneck

Post by Sven »

JarvisXerxes wrote:
Chan Rasjid wrote: Or are you ruuning your debug version with assert() turned on?
I discovered the terrible blunder right before Chan posted. You're absolutely right. I built my project in visual studio under "Debug mode" instead of "Release", which definitely slows things down by an order of magnitude.

But even a correct build gives me only 20000 kn/s, still way below the standard.

Thanks a lot to HGM for your detailed explanation. This clarifies a lot of mysteries. I'm glad to listen to advice directly from qperft's author.
I'd try out the proposed optimizations and restructure my movegen.
Hi Jim,

there are a couple of threads in the Programming subforum where problems like legality checking, perft speed etc. are discussed. The most recent one I remember is only few days or weeks old. I suggest that you check that one as well, in addition to all the valuable replies you already received here.

My summary for you:

1) Don't worry too much about speed at all. Writing bug-free code, implementing a decent tree search and setting up a powerful testing framework are much more important than speed when starting a new engine.

2) Don't worry too much about perft speed as well. Perft is mainly an important testing method to verify correctness of your move generator, make/unmake and basic data structures. Running perft faster saves some seconds or maybe minutes of testing time only.

3) Even doubling the speed of your move generator and make/unmake code would only have a very small impact on the playing strength of a real engine since usually an engine uses only a small portion of its search time for these parts while most of the time is used for static evaluation and search. For instance if your movegen/make/unmake initially takes 20% of all search time and you really manage to double its speed (not for perft, of course!) then previous 100% become 90% so you get an overall 10% speedup only. In reality numbers will be even much smaller.

4) Measuring search speed is usually done by counting the number of nodes visited, including leaves as well as internal nodes. A fair comparison of perft speed between different programs (on the same HW platform!) would require to compare these "nodes per second" numbers while "leaves per second" can be misleading since it depends heavily on the algorithm to calculate that number (see next point).

5) There is mainly one significant speedup trick for perft and that is "bulk counting" (already mentioned by HGM) where make/unmake is avoided as much as possible at the last ply above the horizon since you only need the number of legal moves. Just in case you have not done that already: the trick is that all pseudo-legal moves except
- check evasions,
- king moves,
- moves of pinned pieces, and
- ep captures
are always legal so these can be counted without make/unmake. So at least for perft you can calculate all squares of pinned pieces, save them in a bitboard, and use that together with the other conditions above to decide whether you need the legality test. This will usually result in a boost of perft speed. Whether this method also improves overall speed for the normal search depends on a couple of things, please refer to the recent thread I mentioned above for a discussion about this topic.

Sven
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Perft search speed bottleneck

Post by Sven »

JarvisXerxes wrote:
Chan Rasjid wrote:
hgm wrote:
lucasart wrote:I generate legal moves directly, and with nosignificant performance loss compared to a pseudo-legal generator, but with the huge gain of never having to check anything afterwards.
Note, however, that although this works wonders for a perft, it hardly does anything for an engine. Moves that leave your King in check (if it wasn't already before the move) are rare, and letting the engine simply play them and discover that it can capture a King in the reply node to abort there doesn't introduce much overhead.

For my engine the precalculation of pinned pieces is mainly useful because at the same time it also detects slider checks (in which case the dedicated evasion generator can be used), and whether you are exposed to discovered checks.
Illegal moves that leave one's own king open to slider attack is not rare. I measured it. In late-middle or endgame, the percentage of illegal psuedo-legal moves may be 30/40%. The method of detecting capture of the king in the next node (once done in Crafty) is inefficient.

[EDIT] There could be good reason why we need to know that a move is valid
before we call search; eg futilty pruning, etc.

Rasjid.
So if I'm getting this correctly... is it always prefered to build the legality verification into movegen()?
I would say "no" in general. I would always prefer to decouple move generation and legality testing. One typical exception might be testing for attacks to F1/D1/F8/D8 in case of castling moves.

As I have pointed out in my longer reply, if you really want to improve perft speed then you need to implement a way to avoid most make/unmake operations for legality testing. In that case it does not matter much where the legality test is done.

Sven