Speed up Move Generation for Mailbox

ndbd · Post by **ndbd** » Tue Jan 03, 2023 11:01 am

I have created a small engine. One (of the many) bottlenecks is move generation which I would like to speed up.

My implementation is a typical 12x10 mailbox. During profiling I identified that I spend a vast amount of time in checking if a pseudo-legal move is a legal move.

For each pseudo-legal move I test legality by

a) applying the move, check if the king is in check afterwards, and undo the move
b) for castles, check if castling is legal (i.e. king must not be in check, doesn't castle into check, doesn't castle over a checked square)

The function testLegality(Move) is the main bottleneck.

Within testLegality 50 percent of the time of is spent on isKingInCheck().

The function isKingInCheck is implemented
a) checking if a pawn attacks by directly looking at the potential squares
b) loop over all squares. for each square compute the distance between the king and the current square, and use the distance to look up potential attackers in a pre-computed attack table. Then for each of these piece, generate all pseudo-legal moves, and test if there is a pseudo-legal move where the target square is the king location. If yes, we have an attack.

The performance right now is very slow. I would like to speed up this with minimal effort and without resorting to a complete rewrite, like bitboards.

Are there any straight forward ways to speed up isKingInCheck()? Any help is appreciated!

hgm · Post by **hgm** » Tue Jan 03, 2023 11:53 am

The greatest speedup is not doing it.

Testing whether moves are legal is mostly a waste of time. Because in practice most moves are legal, and for those there would be no difference whether you had checked it or not. And you should certainly not do the test during move generation: in about half the nodes you will search only a single move, which then gives you a beta cutoff. Any time you spent on generating the other moves and testing those for legality is then wasted, whether they were legal or not.

Furthermore, testing whether a move is legal by making it, and using a general IsKingInCheck() to test whether you are in check is an enormously inefficient way for doing it. Usually you would know whether you are in check, and if so, by which move. If you were not in check to start with, you can only be in check after the move when the moved piece checks you, (and when you are testing whether a move exposes your own King, this is of course never the case, as the moved piece is your own), or when it discovered the move of a slider that it blocked on its from-square. In other words, you don't have to test for attacks on your King from all directions, or test all pieces for whether they check the King; you only have to test for attacks in the directions of the from- and to-square of the previous move, and possibly in the direction from which an existing check was coming.

A fast simple mailbox implementation would defer checking of legality to the point where you actually needs to search the move. At that time you would have to make the move anyway if the move was legal (which is usually the case when you are not already in check before the move), so little is lost by making the move and testing for legality afterwards. Perhaps even in the daughter node, which you could make return a -INFINITY score if the move was illegal, so that the move will be ignored in the parent node.

The legality check at that point would test whether the moved piece checks you from its new location, or whether a Queen would check you from the old location of the moved piece. If the latter is the case you would have to test whether an enemy slider move was discovered.

A very fast way to do these things is through a lookup table that for any pair of squares tells you which pieces could potentially (i.e. on an empty board) move from one to the other. This could for instance be a 2d array of bytes aligned[origin][destination], where each bit in an element corresponds to one of the 6 piece types. You can then look up in aligned[toSqr][kingSqr] whether the moved piece could check you, and from aligned[fromSqr][kingSqr] whether a Queen could check you.

The point is that this is usually not the case, and then you are immediately done. If a Knight, King or Pawn is aligned to check you, you are also immediately done, this time knowing the move was illegal. If the moved piece was a slider, or for testing the discovered checks, having alignment does not guarantee a check, and you will still have to test whether there were obstaces in the path of the move. This you could do with the aid of another array unitStep[origin][destination], which would tell you with which step you have to scan the board for the nearest obstacle. If that is the moved piece, you are in check (i.e. the move was illegal), if it was a piece behind the evacuated square you would have to test its color and type to see if it could slide in your direction.

If you are smart you can even use different bits for distant and contact moves of sliders in the aligned[][] table. Then you could treat contact checks by sliders the same as you treat checks by K, N or P, and only continue with the ray scan for distant alignment. (To detect the discovered check you would always have to do the ray scan, because such a check must be distant.)

You could shrink the size of the tabled by making those 1-dimensional, indexed by the square difference: aligned[destination - source]. This works best when you would have a 12x16 board, because then the square difference is a unique signatue of the move. On 12x10 the same square difference can often be obtained by a genuine move and a move that crosses the 2-wide internal boundary of the board. This would produce false alignments. That would not be fatal, because the ray scan would run into the boundary guards, if not blocked already before that, but it would decrease efficiency by doing extra ray scans that could have been avoided.

If you are in check before the move, it might be faster to test whether the move stands a chance of resolving the check. Here you can again distinguish double checks from contact checks and distant checks, and King moves from other moves. On double checks you don't even have to try non-King moves. For a contact check the only non-King move that could help would be the capture of the checker, which is very cheap to test. For distant checks, you would also have to figure out whether a non-King move blocks that check. You could use the unitStep[][] array for that, by keeping the step of the checking move, and testing if the to-square of the move has the same step relative to King (i.e. unitStep[toSqr][kingSqr] == checkStep). If that is not the case you can discard the move right away. If it is, there still is no guarantee the move lands between checker and King. But you could test that cheaply from a table that holds the distance between (aligned) squares. Of course a move that resolves the existing check would still have to be tested for not exposing your King.

Finally, note that when a piece is pinned on its King, it usually makes a lot of its moves (perhaps even all) illegal, and it would be inefficient to test those each one by one. It could be more efficient to test even before move generation which pieces are pinned (by testing alignment of enemy sliders with your King, and if you find any, test if there is a single piece of your own in between). For such pieces you would then not generate all their moves, but only the moves along the pin ray. This guarantees none of the generated non-King moves will ever expose the King to capture, and you only have to test the generated King moves for being legal.

Henk · Post by **Henk** » Tue Jan 03, 2023 2:43 pm

If you like bugs then go do heavy optimizing on your movegenerator. Hi, hi hi. Find out if your perft tests are good enough.

[By the way I just discovered a bug in en passant in my code which slipped through. So good to be paranoid about this]

dangi12012 · Post by **dangi12012** » Tue Jan 03, 2023 3:30 pm

hgm wrote: ↑Tue Jan 03, 2023 11:53 amFinally, note that when a piece is pinned on its King, it usually makes a lot of its moves (perhaps even all) illegal, and it would be inefficient to test those each one by one. It could be more efficient to test even before move generation which pieces are pinned (by testing alignment of enemy sliders with your King, and if you find any, test if there is a single piece of your own in between). For such pieces you would then not generate all their moves, but only the moves along the pin ray. This guarantees none of the generated non-King moves will ever expose the King to capture, and you only have to test the generated King moves for being legal.

On that idea there are some subtlelties with pinned ep captures which may or may not be legal.
Also when a queen is pinned it can only move along the pinned axis (which is important when handling queens as (rook | bishops) so you need to remember the pinned fact)

https://www.codeproject.com/Articles/53 ... egenerator

You can also cache the pin masks (since they only change when something in the ray from pinner <-> pinned changes or the king moves.

Generally its very quick to check if pins could possibly exist (HVMask & EnemyRooks) and if yes create a accurate pinmask where all pieces that are a member of it can only move inside that mask.
Better to do most of your ideas with BB imho. Nothing beats the cheapness of (x & y)

Henk · Post by **Henk** » Tue Jan 03, 2023 4:50 pm

What is the maximum speed you can get currently with one thread on an avarage computer ?

Probably computing the keys in the lookup hastables per square limits the spead or not?

hgm · Post by **hgm** » Tue Jan 03, 2023 5:25 pm

Indeed, King moves and e.p. captures still require some extra attention. But e.p. captures require that anyway during MakeMove(), and are very rare in the tree. So you can do the required test in that code section with zero overhead for the other moves, and virtually no impact on speed no matter how slow the test is you use for it.

And as I said, for the pinned pieces you have to generate the moves along the pin ray (and only those). The way I did that in Joker was to generate their moves as soon as you detect the pin. (At which time you will be aware of the orientation of the pin ray.) And then temporarily remove the piece from the piece list (i.e. mark it as captured) during the normal move generation.

(x & y) might be fast (although perhaps not so fast when your board has more than 64 squares), but the effort is usually in obtaining x or y.

With mailbox you could also keep track of aligned enemy sliders incrementally. You would then only have to do the alignment test for all 5 enemy sliders when the King moves, and only redo one alignment test when one of these sliders moves. And bit extraction techniques are not exclusively useful for bitboards: you could indicate which sliders are aligned with 5 bits in a byte, and extract those that are in order to find out whether you would have to do test for pinned pieces on its ray connecting to King.

When you are using an incrementally updated attack map, as I did in the 'mailbox trials', you can use anothe primary filter for detecting pins. Namely whether the moving piece was attacked by an enemy slider. That information you find directly in the attack-map entry for that piece. Usually it would not be thus attacked, and you are done. If there are slider attacks on it, you have to find the direction of the attack by getting the location of the identified slider, and then look up the location of the obstacle in the opposit direction in the neighbor table to see whether that is your King.

Bulk detection of pins there is less useful, because the moves are generated per victim rather than per moving piece. And it is not all that common that a piece has more than one capture in the first place. This could very well be a general issue in QS, shifting the balance in favor of separate pin detection for each capture. Nevertheless, if you generate the captures per attacker, and you find the attacker was pinned (which would also tell you what the pinner was), you can immediately discard all captures of that piece that do not capture the pinner.

Chessnut1071 · Post by **Chessnut1071** » Wed Jan 04, 2023 3:08 am

hgm wrote: ↑Tue Jan 03, 2023 11:53 am The greatest speedup is not doing it.

Testing whether moves are legal is mostly a waste of time. Because in practice most moves are legal, and for those there would be no difference whether you had checked it or not. And you should certainly not do the test during move generation: in about half the nodes you will search only a single move, which then gives you a beta cutoff. Any time you spent on generating the other moves and testing those for legality is then wasted, whether they were legal or not.

Furthermore, testing whether a move is legal by making it, and using a general IsKingInCheck() to test whether you are in check is an enormously inefficient way for doing it. Usually you would know whether you are in check, and if so, by which move. If you were not in check to start with, you can only be in check after the move when the moved piece checks you, (and when you are testing whether a move exposes your own King, this is of course never the case, as the moved piece is your own), or when it discovered the move of a slider that it blocked on its from-square. In other words, you don't have to test for attacks on your King from all directions, or test all pieces for whether they check the King; you only have to test for attacks in the directions of the from- and to-square of the previous move, and possibly in the direction from which an existing check was coming.

A fast simple mailbox implementation would defer checking of legality to the point where you actually needs to search the move. At that time you would have to make the move anyway if the move was legal (which is usually the case when you are not already in check before the move), so little is lost by making the move and testing for legality afterwards. Perhaps even in the daughter node, which you could make return a -INFINITY score if the move was illegal, so that the move will be ignored in the parent node.

The legality check at that point would test whether the moved piece checks you from its new location, or whether a Queen would check you from the old location of the moved piece. If the latter is the case you would have to test whether an enemy slider move was discovered.

A very fast way to do these things is through a lookup table that for any pair of squares tells you which pieces could potentially (i.e. on an empty board) move from one to the other. This could for instance be a 2d array of bytes aligned[origin][destination], where each bit in an element corresponds to one of the 6 piece types. You can then look up in aligned[toSqr][kingSqr] whether the moved piece could check you, and from aligned[fromSqr][kingSqr] whether a Queen could check you.

The point is that this is usually not the case, and then you are immediately done. If a Knight, King or Pawn is aligned to check you, you are also immediately done, this time knowing the move was illegal. If the moved piece was a slider, or for testing the discovered checks, having alignment does not guarantee a check, and you will still have to test whether there were obstaces in the path of the move. This you could do with the aid of another array unitStep[origin][destination], which would tell you with which step you have to scan the board for the nearest obstacle. If that is the moved piece, you are in check (i.e. the move was illegal), if it was a piece behind the evacuated square you would have to test its color and type to see if it could slide in your direction.

If you are smart you can even use different bits for distant and contact moves of sliders in the aligned[][] table. Then you could treat contact checks by sliders the same as you treat checks by K, N or P, and only continue with the ray scan for distant alignment. (To detect the discovered check you would always have to do the ray scan, because such a check must be distant.)

You could shrink the size of the tabled by making those 1-dimensional, indexed by the square difference: aligned[destination - source]. This works best when you would have a 12x16 board, because then the square difference is a unique signatue of the move. On 12x10 the same square difference can often be obtained by a genuine move and a move that crosses the 2-wide internal boundary of the board. This would produce false alignments. That would not be fatal, because the ray scan would run into the boundary guards, if not blocked already before that, but it would decrease efficiency by doing extra ray scans that could have been avoided.

If you are in check before the move, it might be faster to test whether the move stands a chance of resolving the check. Here you can again distinguish double checks from contact checks and distant checks, and King moves from other moves. On double checks you don't even have to try non-King moves. For a contact check the only non-King move that could help would be the capture of the checker, which is very cheap to test. For distant checks, you would also have to figure out whether a non-King move blocks that check. You could use the unitStep[][] array for that, by keeping the step of the checking move, and testing if the to-square of the move has the same step relative to King (i.e. unitStep[toSqr][kingSqr] == checkStep). If that is not the case you can discard the move right away. If it is, there still is no guarantee the move lands between checker and King. But you could test that cheaply from a table that holds the distance between (aligned) squares. Of course a move that resolves the existing check would still have to be tested for not exposing your King.

Finally, note that when a piece is pinned on its King, it usually makes a lot of its moves (perhaps even all) illegal, and it would be inefficient to test those each one by one. It could be more efficient to test even before move generation which pieces are pinned (by testing alignment of enemy sliders with your King, and if you find any, test if there is a single piece of your own in between). For such pieces you would then not generate all their moves, but only the moves along the pin ray. This guarantees none of the generated non-King moves will ever expose the King to capture, and you only have to test the generated King moves for being legal.

I have a PEXT engine clocking about 800,000 node/sec on a 2.6 Ghz, 1 core, I7. I check for legal moves, so, how much speed do you estimate not checking the legality would speed things up? I'd need at least 50% before I'm going to invest the time and effort to rewrite that engine that took me 1 year to write. Right now I'm modifying a Blockchain algorithm for a failing bitcoin miner. The money's good, but I really miss chess.

JacquesRW · Post by **JacquesRW** » Wed Jan 04, 2023 3:56 am

Chessnut1071 wrote: ↑Wed Jan 04, 2023 3:08 am I have a PEXT engine clocking about 800,000 node/sec on a 2.6 Ghz, 1 core, I7. I check for legal moves, so, how much speed do you estimate not checking the legality would speed things up? I'd need at least 50% before I'm going to invest the time and effort to rewrite that engine that took me 1 year to write. Right now I'm modifying a Blockchain algorithm for a failing bitcoin miner. The money's good, but I really miss chess.

Is this 800,000 nodes/sec in perft or search? Either way, seems like there are other bottlenecks to worry about before complicating the legality checking.

Chessnut1071 · Post by **Chessnut1071** » Wed Jan 04, 2023 6:23 am

JacquesRW wrote: ↑Wed Jan 04, 2023 3:56 am
Chessnut1071 wrote: ↑Wed Jan 04, 2023 3:08 am I have a PEXT engine clocking about 800,000 node/sec on a 2.6 Ghz, 1 core, I7. I check for legal moves, so, how much speed do you estimate not checking the legality would speed things up? I'd need at least 50% before I'm going to invest the time and effort to rewrite that engine that took me 1 year to write. Right now I'm modifying a Blockchain algorithm for a failing bitcoin miner. The money's good, but I really miss chess.
Is this 800,000 nodes/sec in perft or search? Either way, seems like there are other bottlenecks to worry about before complicating the legality checking.

I don't use perft. Instead, I use solutions to 40 - 120 checkmate solutions from 4-ply - 24-ply. There's not much difference between the two give that many games, but I can't afford to miss even one checkmate solution. btw, what's a good node rate? I used to clock 6,500/nps before joining this board 1 1/2 years ago and I was thinking 800,000 was pretty good. I know I don't have the time to get to the 10,000,000 like Muller and many here can achieve, but I'd like to get into the millions on my old, 1-core i7.

lithander · Post by **lithander** » Wed Jan 04, 2023 10:28 am

Chessnut1071 wrote: ↑Wed Jan 04, 2023 6:23 am I don't use perft. Instead, I use solutions to 40 - 120 checkmate solutions from 4-ply - 24-ply. There's not much difference between the two give that many games, but I can't afford to miss even one checkmate solution. btw, what's a good node rate? I used to clock 6,500/nps before joining this board 1 1/2 years ago and I was thinking 800,000 was pretty good. I know I don't have the time to get to the 10,000,000 like Muller and many here can achieve, but I'd like to get into the millions on my old, 1-core i7.

Outside 'perft' comparing the nps becomes pretty meaningless. First of all people count differently. Should you count nodes that get pruned (not searched) after considerable work has been done on them like for example doing a static-exchange-eval? Should you count nodes that you would have searched but luckily you found them in the TT? There is no clear answer to these questions...

But more importantly the way to improve your engine is often to make it smarter about how it searches (the order of moves and how it prunes) and to improve the quality of it's evaluation. And making progress in any of the two domains usually costs raw performance. Leorik 1.0 was the fastest in terms of nps (7M nps on a Ryzen 3600) and it's become slower and slower since and now version 2.3.2 is at 3M nps but of course playing much stronger.

Speed up Move Generation for Mailbox

Speed up Move Generation for Mailbox

Re: Speed up Move Generation for Mailbox

Re: Speed up Move Generation for Mailbox

Re: Speed up Move Generation for Mailbox

Re: Speed up Move Generation for Mailbox

Re: Speed up Move Generation for Mailbox

Re: Speed up Move Generation for Mailbox

Re: Speed up Move Generation for Mailbox

Re: Speed up Move Generation for Mailbox

Re: Speed up Move Generation for Mailbox