bitboards like mailbox

hgm · Post by **hgm** » Fri Nov 26, 2021 9:22 am

gaard wrote: ↑Fri Nov 26, 2021 8:35 amCheating should be allowed. If you add a member to you board structure like uint8_t pawn_on_files[ColorCount][1 + 8 + 1], where pawn_on_files[0/1][0/9] are off-board, and assuming the lsb of pawn_on_files[0/1][1-8] are rank 1 and the msb is rank 8, you can combine the benefits of mailbox with pseudo-bitboards if you update it incrementally. With a hash table the difference is probably a wash though, since the hit rate will be above 90%.

Indeed, there is nothing against mixed or redundant representations. (Most bitboard engines also use a mailbox board, for fast identification of victim type.) I think that Fruit 2.1 detects passers the way you describe. But the incremental update is a pain, because you would have to do it on every move to make sure the info is available when you get a pawn-hash miss. It is bad enough you have to incrementally update the pawn hash key. But however you do it, evaluation of pawn structure is important in chess, and complex. So you will always gain a lot by using a pawn hash table. And once you do, it doesn't matter much how costly it is to process a miss, as you won't have many misses. But it would probably not be a good idea to maintain other pawn data incrementally on every move.

It always annoyed me having to update the pawn hash key. It can change both when the moving piece is a pawn and when a pawn gets captured, and if statements that test whether the piece is a pawn are poorly predictable. So I usually maintain the pawn key just like the main hash key, unconditionally applying the key changes due to mover and victim. Except that they use Zobrist tables where all non-pawns have a table of all zeroes. This doesn't require much extra space, as I usually implement the Zobrist tables as an array of pointers indexed by piece number (so basically part of the piece list), which point to a board-size table of keys. That way pieces of the same type can point to the same table, so you just need 2x6 tables for the piece types, plus a 'null table' for the empty square (which acts as 'victim' of a non-capture, so that you can unconditionally update the key for the victim contribution too). For the pawn has I have a similar array of pointers, where all the pointers for non-pawns point to the null table.

I often wondered if one could not combine both keys into one: use one 64-bit key, but use only N bits for the non-pawns, the remaining part of the key for those being zero. If Stockfish can get away with using 16 bits as hash signature for the main TT, you could probably also do that for the signature of the pawn hash: collisions there would not be ay more harmful than collisions in the TT. And with a 32-byte entry a 1MB pawn hash table would only have 2^15 entries, so only 15 bits are needed for the index. So only 32 bits of pawn key are needed, and the other 32 bits could then be used to distinguish constellations of the other pieces.

hgm · Post by **hgm** » Fri Nov 26, 2021 10:31 am

tcusr wrote: ↑Thu Nov 25, 2021 11:05 pm thanks very much
hope you don't mind if i adopt your piece encoding idea

btw just a question, is it better to use a legal generator for mailbox?
EDIT:
i also don't understand why your piece lists are one-dimensional? do you index them by piece type + color?

In mailbox it is relatively easy to suppress generation of illegal moves for pieces other than king (i.e. account for pinning). You can run through the slider part of the opponent piece list (nominally 5 pieces) to test which of the sliders is aligned with your king. (Usually none, in which case you are done.) For each aligned slider you would get the direction of the alignment. In the neighbor table you could then look up whether the neighbor of the king in that direction is the same as that of the slider in the opposit direction. If it is, the piece there is pinned if it is yours (and could deliver discovered check if it is an opponent). For such a piece you could use dedicated move generation code to generate only its moves along the pin ray, and exclude it from the normal move generation (e.g. by temporarily marking it as 'captured' in the piece list).

If you do this, the only illegal moves you can still generate are King moves and some e.p. captures. You will get some speedup, but it will not be dramatic. In most of my engines I just test whether the moved piece happened to be pinned, from the alignment of the from-square with its king, and in case there is such alignment, by what was behind that. And abort the node returning a -INF score. And when a King has stumbled into check, I just leave it for the move generation to discover it can capture the King, and abort similarly if that is the case. (One caveat: if you exclude illegal moves from pinned pieces, you could miss the fact that they could capture the king!)

This applies to positions where you were not in check to begin with. If you are in check, most moves will be illegal, and it would be beneficial to not rely on a full move generation in the daughter. Then it is better to apply some test on every move to see whether it solves the existing check. (By landing on the ray, to interpose or capture the checker, or by moving the king.) For double checks you only have to try king moves. For other moves, you would check the alignment as a first quick filter for rejecting most moves. (But see my remarks two postings earlier.)

As to the piece list:

The way I do it the color is always part of the piece encoding. E.g. white pieces are 16-31, black pieces are 32-47, edge guard = 48, empty square = 0. The '16' and '32' bit can then be used for testing the color. (And I use the values 16 and 32 to keep track of who is on move in the sideToMove variable.) One of the tricks for making it fast, though, is that you don't want to test for the color very often. Because testing means conditional branches, and branches can get mis-prodicted, and mis-predictions are costly. So it helps to treat all pieces the same, irrespective of their color. Which is easier when they are all in the same list (and empty squares and edge-guards are also in that list, as dummy sinks for data that you were not interested in). E.g. when generating captures any move can go into the attack map attackers[victim], whether the victim is friend, foe, edge guard or non-existent (= empty). No need to test for what it is. Those that go to attackers[EMPTY] or attackers[GUARD] would never be looked at. Those that hit foes will be used to extract the captures when you get to searching those. Those that hit friends might be useful for determining which of your pieces are protected or not.

When generating non-captures you just test whether squares are occupied or empty, and when they are occupied you are not interested in the color of the occupant (or whether it is an edge guard). When you generate moves you just run through the relevant part of the piece list, (i.e. from sideToMove to sideToMove+15), and already know the color of the pieces you are going to find there. When testing for discovered checks it is sometimes needed to determine whether the piece behind the evacuated square is friend or a foe; it cannot be an edge guard, because you do this test only when the piece is aligned, and checkFlags[GUARD] would be 0, so that an edge guard can never pass the alignment test.

gaard · Post by **gaard** » Fri Nov 26, 2021 12:21 pm

hgm wrote: ↑Fri Nov 26, 2021 9:22 am
gaard wrote: ↑Fri Nov 26, 2021 8:35 amCheating should be allowed. If you add a member to you board structure like uint8_t pawn_on_files[ColorCount][1 + 8 + 1], where pawn_on_files[0/1][0/9] are off-board, and assuming the lsb of pawn_on_files[0/1][1-8] are rank 1 and the msb is rank 8, you can combine the benefits of mailbox with pseudo-bitboards if you update it incrementally. With a hash table the difference is probably a wash though, since the hit rate will be above 90%.
I often wondered if one could not combine both keys into one: use one 64-bit key, but use only N bits for the non-pawns, the remaining part of the key for those being zero. If Stockfish can get away with using 16 bits as hash signature for the main TT, you could probably also do that for the signature of the pawn hash: collisions there would not be ay more harmful than collisions in the TT. And with a 32-byte entry a 1MB pawn hash table would only have 2^15 entries, so only 15 bits are needed for the index. So only 32 bits of pawn key are needed, and the other 32 bits could then be used to distinguish constellations of the other pieces.

You could but I'm not sure where the gains would be. I can go 3e+6 nodes without a pawn hash collision with a 14-bit key when the mask is already 16 bits, but disabling the incremental update altogether yields less than a 0.3% gain in perft.

hgm · Post by **hgm** » Fri Nov 26, 2021 1:08 pm

That is the update of the pawn key? That indeed isn't much.

tcusr · Post by **tcusr** » Fri Nov 26, 2021 7:18 pm

hgm wrote: ↑Fri Nov 26, 2021 8:19 am In the mailbox trials I was of course doing things incrementally, so that you would hardly ever generate moves for a piece from scratch

how do you generate moves incrementally?

also for the generation of captures of sliders is it okay to loop the eight neighbors and find enemy pieces? i plan to do the same for my 'isKingInCheck' but i don't know if it's optimal

hgm · Post by **hgm** » Fri Nov 26, 2021 8:36 pm

The details of this are discussed in the 'mailbox trials' discussion, including source code of some fully operational test programs. But in short:

If the captures are stored as an attack map, instead of building that attack map from scratch by generating moves for all pieces in every node, you copy the attack map from the parent, and then apply the modifications to it that were the consequence of the move that led to the new node. In particular, when that move was a capture:
* you generate all moves of the moved piece from itis from-square, and delete those from the attack map
* you generate all the moves of the moved piece (or the piece it promoted to) from the to-square, to add those to the attack map
* you take all sider attacks to the from-square (from the old attack map), and add those to the attack map of the next obstacle in the same direction
(Non-captures still require some extra work to account for blocking of sliders moves over the to-square, but only about 6% of the moves is a non-capture.) This gives you the new attack map from generating moves for only 2 pieces, rather than 16, plus a bit of work for discovering the slider attacks, which is very little as on the average there are very few slider attack on the evacuated square.

As to testing whether a king is in check:

There is another active thread in this forum section on exactly this subject.

How you can best do that depends on which king and in which situation. Testing whether king B is in check after a move of player A is intrinsically an incremental task, because you know that king B was not in check before the move. So you only have to test for checks created by the move, i.e. direct checks from the to-square, and discovered checks over the from square. No need to test in 8 directions, 2 directions will suffice.

If you test whether king A is in check after a move of player A (i.e. if the move was illegal), it is more tricky to do it incrementally, because you could already have been in check before the move, and trying to evade it. But it is likely that you would want to know anyway whether the side to move is in check (e.g. for awarding a check extension), and figured it out before the move through the method in the previous paragraph. And remembered some important info from that other than the bare fact that you were in check. Such as whether it was a double check, and if not, what was the location of the checker, and in which direction from the king it was located if it was a diistant slider.

The most difficult case is when you moved the king. Especially on a non-capture you would not have any information available on the square you are moving to. (The neighbor table only holds info on occupied squares.) It is best to run through the slider part of the enemy piece list to detect whether any of those is aligned with the new king location, and if it is test whether the neighbor of that slider in the king direction is not closer than this king. (Because there are only 5 sliders, while there are 8 directions, and most sliders would not be aligned, so the test on each of those is fast.) Likewise you would run through the king and knights part of the piece list to test their alignment. (Again, it is better to test 3 pieces than look on the board in 8 king and 8 knight directions.) Finally you can examine two board squares to see if there are enemy pawns there that attack you. (Here it is better to look on the board than in the piece list, because there can be 8 pawns, and only 2 directions they can attack from.) Fortunately only a small fraction of the moves are king moves.

Otherwise, (i.e. move of a non-king), if you were not in check before the move, you have to test whether the moved piece was pinned. The check can then only come from the to-square, and, in case it is aligned, you can test what the new neigbor of the king in that direction is, and if that slides in your direction. (Or if you test before the move, the neighbor of the neighbor.) And then if the to-square was not in that same direction too (move along the pin ray).

If you already were in check before the move, most moves would not get you out of check. You would have to test that in addition to the test for moving away pinned pieces. It is better to test it first, because if the check remains, you can already discard the move and would not have to do the pin test. On double checks you can discard the move of a non-royal right away, as only king moves can solve these. Otherwise, moves that capture the recorded checker solve the check (just a comparison between the squares). Finally, if the checker was a slider, recognized from the fact that you recorded a check direction, the check is solved if the to-square is in that same direction from king, and lands closer to the king than the checker.

tcusr · Post by **tcusr** » Fri Nov 26, 2021 11:49 pm

this is a lot of new information for me (thinking in terms of bitboards was far easier) but i'm still determined to give mailbox a try.
really helpful as always, thanks.

hgm · Post by **hgm** » Sat Nov 27, 2021 3:50 pm

It occurred to me last night that when you indicate that a piece is captured by setting its location to the invalid square number 239, and make the captureFlags[] table large enough, that you would not have to test for the piece actually being present on the board, in an IsSquareAttacked() test. Square numbers of a 0x88 board run from 0 to 0x77 = 119. So the difference between two on-board square numbers from -119 to +119. The difference between 239 and a number from 0 to 119 would run from 120 to 239.

So if you define an array captureFlags[359], you could test for slider attacks by

Code: Select all

for(i=3; i<=7; i++) { // run through 5 sliders
  int slider = location[stm + i]; // get the square they are on from piece list (239 = captured)
  int mask = captureMask[i];      // which flags apply to this piece type
  int flags = (captureFlags + 119)[slider - sqr];    // offset allows index from -119 to 239
  if(mask & flags) { // test for alignment
    int direction = (rayTable + 119)[slider - sqr];  // direction of slider relative to sqr
    int stopper = neighbor[slider][direction];       // nearest obstacle in that direction
    if(direction != (rayTable + 119)[stopper - sqr]) // stopper is not in the same direction relative to sqr
          return TRUE; // so slider can reach sqr
  }
}

The idea here is that most sliders would not be properly aligned with the king, so that the code in the if-statement is only rarely executed. Since the loop has a fixed number of iterations (namely 5: Q+2R+2B), the compiler would unroll it. So i would in practice be a constant with a different value in each iteration, and captureMask[ i] would not have to be loaded from a table, but is also a constant known at compile time. Each iteration then uses an (indexed) load to get 'slider', (assuming stm was put in a register before the loop), a double-indexed load (by 'slider' and -sqr) to get 'flags' (assuming -sqr was calculated in a register before the loop), an AND instruction to AND 'flags' with a constant, and a branch (predicted taken) to skip the conditional code section. That is just 4 instructions per slider.

The section of captureFlags above 239 would be all zeros, so a capture piece (slider = 239) would always result in flags = 0, and thus would never be considered aligned with the given 'sqr'. Only on-board sliders can pass the test for alignment.

Attacks by knights and kings would be done in a similar (unrolled) loop with 3 iterations (i = 0 to 2). Except that blocking is no issue there, so that we can return TRUE directly if mask & flags tests non-zero.

The efficiency of the code in the if-statement is less important, as it is usually skipped. And the IsSquareAttacked() is only rarely called in the first place, just for determining legality of king moves. There still is one problem when we would want to do this test before the move, though: it would allow the King to 'hide in its own shadow'. I.e. when the king is in check by a slider, a move away from that slider along the ray would of course keep it in check. But the to-square of that move is beyond the nearest neighbor of that slider, which is the king on its from-square. So the code as given would consider that attack blocked. This exception could be caught by putting an extra

Code: Select all

  if(stopper == fromSqr) return TRUE;

inside the if-statement.

tcusr · Post by **tcusr** » Sun Nov 28, 2021 9:27 pm

but you don't know where the sliders are right? i just add them from the fen string and also there are pawn promotions that mess up the order

btw i finished writing the move generator (i don't have a neighbor table yet so i just skip empty squares for captures) and it's really similar to the one in 'mailbox trials', is it optimal? this way the loop for leapers cannot be unrolled (i really like it though because it's so compact)

hgm · Post by **hgm** » Sun Nov 28, 2021 10:12 pm

Well, promotions are a complicating factor. In the mailbox trials I did not implement those, because they happen so rarely that they would not affect speed, and the trial was all about measuring speed. But in a real engine you must of course support promotions.

In Joker I use a piece list with a generous number of extra slots, with 32 entries per color (white = 32-63, black = 64-95), and the slider and leaper parts are treated as stacks that grow towards each other. E.g. at the start 32-34 are KNN, 43-47 are QRRBB (and Pawns are 48-55). If a Pawn promotes I mark it as 'captured', and create a new piece in the list (at 35 for a Knight, and at 42 for a slider), expanding the size of the corresponding section. And put the assigned number on the board to replace the Pawn. (Pawns are recognizable by having the '16' bit set, and sliders would be recognizable by having the '8' bit set, as long as you haven't more than 8 sliders.) Note that the total size of the piece list is not really an important issue; I never want to loop through the entire piece list. I just loop through the 3 sections (slider / leaper / pawn) separately, because the pieces in those need different treatment. So the sections must be compact, but space between those doesn't hurt any. In the root I squeeze captured pieces out of the list (to compactify the sections).

I guess having sections of variable length interferes with the loops over them having a fixed number of iterations. Now it is a matter of taste how much effort you want to spend on allowing more than 5 sliders or more than 2 knights. I am pretty sure that making the engine resign immediately when it gets in such a situation would not measurably decrease its Elo. Providing slightly less optimized routine for the various tasks wich can handle a variable number of iterations, and switching to these when such a 'never happens' situation requires it, would be an alternative. Or you could have some extra code under an if(extraPieces) clause to handle these pieces, which in normal situations would always skip that code, and be predicted correctly, so that it hardly causes any overhead.

Loops can always be unrolled, but if they have a variable number of iterations you cannot eliminate the test for termination of the loop by this. If a loop would only do 2 or 8 iterations (for Pawn and Knight/King captures), it could be unrolled with only a test for termination after the second iteration. The compiler would not be smart enough to do that, so you would have to do the unrolling in the source code to profit from this.

bitboards like mailbox

Re: bitboards like mailbox

Re: bitboards like mailbox

Re: bitboards like mailbox

Re: bitboards like mailbox

Re: bitboards like mailbox

Re: bitboards like mailbox

Re: bitboards like mailbox

Re: bitboards like mailbox

Re: bitboards like mailbox

Re: bitboards like mailbox