Canonical Position Representation

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
leanchess
Posts: 176
Joined: Sun Dec 08, 2019 8:16 pm
Full name: Dmitry Shechtman

Re: Canonical Position Representation

Post by leanchess »

As I've been trying to implement this I encountered a problem of a different kind.

C doesn't have 48-bit words (duh!). If I use 64 bits everywhere, I cannot use memcpy or memcmp, which seems to take the advantage of 192 over 256 bits away. I suppose I could pack the color and the Q-flag with the two pawn triplets into 32 bits and keep the rest in the leading 64 bits, but that wouldn't look very pretty.

P.S. Due to lack of a solution for triplet encoding, I'm settling for a LUT (32KiB will do, as the lower 11 bits can simply be treated as a pair).
User avatar
leanchess
Posts: 176
Joined: Sun Dec 08, 2019 8:16 pm
Full name: Dmitry Shechtman

Re: Canonical Position Representation

Post by leanchess »

I'm now considering incorporating Pio's "colour-agnostic" idea.

If I understand it correctly, that means STM is always white, and everything is flipped on every makemove. For disjoint pieces that means a simple XOR, right? Now, if we replace addition with XOR in pair and triplet encodings, flipping those should be straightforward as well.

We may then use two bits for castling rights, eliminating the need to update the rooks on king moves and allowing us to support Chess960. Three bits should still suffice for supernumerary pieces, since they could be interpreted differently depending on the values of the RBNQ flags.
User avatar
hgm
Posts: 27875
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Canonical Position Representation

Post by hgm »

The following (not directly related to the direction this topic is taking) occured to me:

The basic piece-list representation, which just uses 6 bits for the location of each piece, is not unique, because pieces of equal type can be permuted to give a different encoding, while the position remains the same. Especially for the Pawns this is a large effect: there are 8! = 40,320 permutations of positions with all Pawns present. That is equivalent to more than 15 bit of information!

Now we can to use smarter location encodings that are less wasteful, e.g. by encoding pairs or triples, so that we can do with fewer bits. And then use the bits that we free up this way for something else (such as indicating presence of super-numerary pieces). This has the additional advantage that it improves the uniqueness of the encoding, especially for pieces that are normally only present in pairs. For Pawns, however, it would still leave a lot of freedom for how to group the Pawns into pairs or triples, and thus does not solve the uniqueness problem completely.

An alternative approach is to stick to encoding individual Pawns, so that no bits are freed at all, but use the permutation of the Pawns to encode additional information. As mentioned above, there is a potential of 15 bits there, much more than what you could free up with the aid of pair or triplet grouping. This would also address the uniqueness problem: if we assign a unique permutation of the Pawns to each of the additional state variables we want to encode, this prescribes exactly how a certain constellation of Pawns should be encoded.

We could for instance require that for a position without any super-numerary pieces, the Pawns must occur in order of increasing square number. Such rules can make updating of the state a bit cumbersome, as changing location of one Pawn might require it to be sorted to a different Pawn 'slot' of the game state, all Pawns between the old and the new slot would have to be shifted one place. (The 'Pio encoding' suffers from this problem for all pieces.) For Pawns the impact of this can be strongly ameliorated by numbering the squares per file rather than per rank. Pawn non-captures can then never change the order, and only captures sometimes would require swapping the Pawn with the Pawn on the adjacent file. If capttured Pawns are encoded by a 1st-rank location, we have the choice from 8 such locations, and can almost always pick the 1st-rank square that preserves the order.

If we want to use this for encoding super-numerary pieces, we cannot use all 8! permutations, as one or more of the Pawn slots would be needed to encode the location of such pieces, and their location would then affect the permutation. If a single super-numerary piece would always go in the 8th Pawn slot, 5 of the 7! = 5040 permutations of the other 7 slots could be used to indicate whether the 8th slot is P, N, B, R or Q. These can all have the first 6 Pawns in the same order, e.g. purely increasing. If there are two super-numerary pieces, this would then have to be indicated by a different order of the first 6 slots. There are 10 combinations of two pieces, and we would need the possibility to indicate there are less than two, so this would require 11 of the 6! = 720 possible permutations. These cannot all have the same order of the first five, but require 2 of the 120 permutations of those. The other 118 permutations of the first 5 can be used to specify combinations of super-numerary pieces in the last 3 slots. There are 40 such combinations. Together with the two codes for less than three pieces, the 42 permutations of 5 could be assigned to have 9 different permutations of the first 4 (as each of those can then have 5 different schoices for the 5th.) There are 4! = 24 permutations of 4, so that still leaves 15 for indicating some selected combinations of four super-numerary pieces in the last 4 slots (like 4Q, or N+3Q).
User avatar
leanchess
Posts: 176
Joined: Sun Dec 08, 2019 8:16 pm
Full name: Dmitry Shechtman

Re: Canonical Position Representation

Post by leanchess »

Pio wrote: Mon Apr 13, 2020 11:28 pm Hi, I store my board in 192 (3*64) bits (see my post http://www.talkchess.com/forum3/viewto ... 93&t=49575 how I do it). The nice thing with my representation is that it is simple, compact, can represent all possible chess positions (and all impossible ones as long as there is at most 16 pieces/pawns each). Two other nice properties that are really nice is that my representation can 1) be made colour agnostic (is very nice if used in opening database) branchless and 2) my representation contains an occupancy bitboard of 64 bits so you can use that as one of your primary keys (but you will have to do a composite primary key with the rest of the board data) and thus immediately find all positions having the same squares occupied.

/Pio

It took me a little while, but I did finally implement it.

leanchess wrote: If I understand it correctly, that means STM is always white, and everything is flipped on every makemove.

I understood incorrectly. If colour is considered part of the type, we only need three extra types:
  1. white virgin rook or white pawn en passant (depending on rank),
  2. black counterpart of the above, and
  3. black king to move,
which actually leaves us with one unused type!

Edit: We need to encode white king to move if we want to stay colour-agnostic, as Pio puts it. Flipping would then only require swapping the nibbles while flipping the LSBs.
syzygy
Posts: 5569
Joined: Tue Feb 28, 2012 11:56 pm

Re: Canonical Position Representation

Post by syzygy »

leanchess wrote: Fri Apr 10, 2020 10:40 am The goal is maintaining a single uniform representation for move generation, hashing, tablebooks etc. (although I understand that keeping at least bitboards in addition is required for efficient check detection).
Just use Zobrist hashing.
syzygy
Posts: 5569
Joined: Tue Feb 28, 2012 11:56 pm

Re: Canonical Position Representation

Post by syzygy »

leanchess wrote: Mon Apr 13, 2020 8:36 pm I'm having trouble finding the formula for the third in a triplet. I thought it would be as simple as it was for pairs, but I was gravely mistaken.

Any ideas? Perhaps some combination of your method with mine?
(I realise I am responding to a post of a few months ago.)

If you are looking for a way to encode the positions of n identical pieces on 0 <= sq1 < sq2 < .... < sqn in one number, then use this:

Bin(sq1, 1) + Bin(sq2, 2) + ... + Bin(sqn, n)

where Bin(n,k) = n! / (k!(n-k)!).