Canonical Position Representation

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
leanchess
Posts: 176
Joined: Sun Dec 08, 2019 8:16 pm
Full name: Dmitry Shechtman

Re: Canonical Position Representation

Post by leanchess »

hgm wrote: Fri Apr 10, 2020 5:24 pm There are certain end-game studies where the only winning move is an under-promotion to Rook or Bishop. The Saavedra study is a famous one. In such positions promoting to Queen doesn't work, because it would stalemate the opponent. And it could be the only chance at promotion that you get.
Thank you for enlightening me. I had seen a forum post (on chess.com IIRC) that led me to believe in the nonsense I had posted above. There goes the idea of having permanent indices for even/odd bishops.

I am not sure if using the entire 256-bit game state as key for TT, OB or EGT probing would ever be competitive. In any case it prompts the question why it wouldn't be much better to use a 192-bit game state for that.
Not as a key, but rather as the value, to prevent collisions (yes, I am familiar with the "Hash Collisions Don't Matter" article). In any case, I don't see how 192 bits cover STM and castling rights.

I am not sure that a LUT with function pointers was used before in perft, but it is a standard technique.
Well, I was hoping to only use perft to confirm that my ideas work (with mixed results so far) and move to engine implementation at a later stage.

Pointers are 8 bytes on x64 architecture, square numbers or step vectors only 1 byte, so the size of the LUT could also be an argument.
I believe 8 bytes are not strictly a requirement on x86-64, so an optimising compiler should be able to shrink those to at least half that. I guess I should try and see how it works.

Disjoint lists are often the best solution to allow for super-numerary promotions. IIRC Qperft also splits the list in a pawn, leaper and slider section. That makes it easy to add Knights or sliders; otherwise you would have to shuffle the pieces, or alter their type, so that you can no longer be sure which type you are dealing with just from the list index.
My idea is to limit the pawns to the latter 8 indices, but allow those to promote to queen or knight (that's actually the easiest update - only a single bit flip).

(And you would have to shuffle them around in an engine which needs them ordered by decreasing value, etc.)
Why would you want that particular order? Why not KRBNQP or KRBQNP?

In HaQiKi D I do use a contiguous list, because Xiangqi doesn't have promotions. So pieces are either there, or not, and in the latter case I use an invalid square number. (With 90 board squares you have plenty of those without needing extra bits.)
Yes, Shogi and Xiangqi seem to present a real challenge, as bitboards don't work for those (at least until 128-bit CPUs are commonplace).
User avatar
hgm
Posts: 27874
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Canonical Position Representation

Post by hgm »

Whether 192 bits is enough depends on what you require in terms of super-numerary pieces. One could for instance use 6 bits per piece, but 5 for the Bishops. (Although Bishop promotions can be important, I don't believe they would ever be important when there already are Bishops, and least of all when there already is a Bishop on the same shade. A second one on the same shade is often completely useless in the end-game.) That treats all pieces as distinguishable (like your scheme does), which does have the disadvantage that the same position can have multiple representations. (In games without piece drops this is hardly a problem, though.)

You can indicate a Pawn is captured by putting it on first rank, and that it has just made a double-push by putting it on last rank in the same file. You can indicate a piece is captured by putting it under its own King. As the Rooks are distinguishable, you can designate one to be the a-Rook and the other the h-Rook, and represent a Rook with castling rights with the same square number as the opponent King. You saved two bits on the Bishop locations, so 4 bits in total. One of those could indicate side-to-move. That leaves 3 bits, which you could use to indicate the following situations:

0 - nothing special
1 - first white Pawn is a Queen
2 - first black Pawn is a Queen
3 - first white Pawn is a Knight
4 - first black Pawn is a Knight
5 - first white and black Pawn are Queens
6 - first and second white Pawn are Queens, first black Pawn is a Queen
7 - first and second black Pawn are Queens, first white Pawn is a Queen

That would cover most practical cases with super-numerary pieces. If you want to cover more super-numerary cases, you can make two more bits available by compressing the Knight-pair constellation to 11 bits.
User avatar
leanchess
Posts: 176
Joined: Sun Dec 08, 2019 8:16 pm
Full name: Dmitry Shechtman

Re: Canonical Position Representation

Post by leanchess »

Wow. I see some very interesting ideas (e.g. using the opponent king's location for castling rights), as well as many similarities to my approach (namely, using own king's location for captures and last rank for e.p.), and should make canonicalisation somewhat easier. I didn't understand what two additional bits were saved on, however.

In any case, I still see advantages in my approach, i.e. the same representation for move generation with LUT. One advantage is the castling rights being encoded within byte 0 leads to a more straightforward implementation (e.g. just write the location when moving the king). The already mentioned trivial promotion to queen/knight is another one.

Suppose supernumerary pieces are not a concern. How significant are the space savings obtained from shrinking 256 bits to 192? Are those worth the extra processing required?
User avatar
hgm
Posts: 27874
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Canonical Position Representation

Post by hgm »

The 'saved bits' were because of the Bishops being encoded as 5 bits rather than 6, because they can only access half the board.

Wether it is worth it is difficult to say. I don't think there is significant extra processing required. But I am also not sure whether using the complete games state as hash key or signature compared to using a 64-bit Zobist signature has any advantage at all.
User avatar
leanchess
Posts: 176
Joined: Sun Dec 08, 2019 8:16 pm
Full name: Dmitry Shechtman

Re: Canonical Position Representation

Post by leanchess »

Those are the two bits I did understand. Where did another two come from? Edit: got it. Too little sleep lately :oops:

Again, I wasn't suggesting using the gamestate as the key. Speaking of Zobrist keys, was there any further progress on your dependency reduction effort?
User avatar
hgm
Posts: 27874
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Canonical Position Representation

Post by hgm »

I did not have time yet to continue that. It seems mostly of theoretical interest anyway, because in engines shortening the signature to 16-bits in order to save space appears to be worth it. And with 16 bits the key space is just too small to push colliding positions so far apart in terms of chess moves that a typical search tree would not bridge the gap. If the tree is larger than the key space, there is no place left to move 'nearby' positions to that is out of tree-reach.
User avatar
leanchess
Posts: 176
Joined: Sun Dec 08, 2019 8:16 pm
Full name: Dmitry Shechtman

Re: Canonical Position Representation

Post by leanchess »

I believe I have found a solution that reasonably addresses the supernumerary piece issue while keeping the encoding within 192 bits.

As already discussed, captured pieces should be placed under the king, and I see no reason (other than slowing down incremental updates) for the pawns not to behave similarly. That would free up eight illegal locations, i.e. three bits, with the LSB being reserved for STM/Q and the other two being interpreted as follows:

00 Knight
01 Rook
10 White-even bishop
11 White-odd bishop

The STM/Q bits could then be obtained by inspecting the pawns' locations.
  • If all are quasilegal, both bits are encoded within the bishops' LSBs.
  • If one or more are illegal, the first bit is encoded within the first illegal pawn's LSB.
  • If two or more are illegal, the second bit is encoded within the second illegal pawn's LSB.
This affords us with all possible combinations of up to 7 NRB pieces (with the queen being pushed into the pawns' territory should the need arise).

Incrementally updating a canonical representation would be rather non-trivial, however.

P.S. Bishops require a different capture indication, since both kings reside on arbitrary shades. Do you have a solution for that?
User avatar
hgm
Posts: 27874
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Canonical Position Representation

Post by hgm »

The reason I saw for the Pawns was that the King can be on the 1st or 8th rank in the same file, so that the e.p. rights and being captured would get the same code.

I also don't understand your super-numerary system. If one of my Pawns promotes to Knight, but I otherwise have a complete set of pieces. Which bits would indicate it is a Knight, and which bits would indicate where that Knight is located?

Oops, I did not think about the captured Bishops.
User avatar
leanchess
Posts: 176
Joined: Sun Dec 08, 2019 8:16 pm
Full name: Dmitry Shechtman

Re: Canonical Position Representation

Post by leanchess »

hgm wrote: Sat Apr 11, 2020 9:19 am The reason I saw for the Pawns was that the King can be on the 1st or 8th rank in the same file, so that the e.p. rights and being captured would get the same code.
Of course, I completely overlooked that. How about using a constant for capture and another for e.p. (column == (index % 8) in the latter case)?
I also don't understand your super-numerary system. If one of my Pawns promotes to Knight, but I otherwise have a complete set of pieces. Which bits would indicate it is a Knight, and which bits would indicate where that Knight is located?
Assuming there is at least a captured pawn, the queen would take its place, then the knight would take the queen's place (assuming the pieces are in RBNQ order).

No solution for captured bishops, I presume?

Edit: There's an endless recursion regarding the queen count. I need to think about it.
User avatar
leanchess
Posts: 176
Joined: Sun Dec 08, 2019 8:16 pm
Full name: Dmitry Shechtman

Re: Canonical Position Representation

Post by leanchess »

I suppose we could reserve indices 1 and 2 to the bishops, which would further limit our supernumerary options.

In any case, this is seemingly becoming less and less practical. Now, how about 208 bits? If we added just an extra octet per side, we'd get a much simpler and more practical encoding. How bad could that be (e.g. cache utilisation-wise)?