Canonical Position Representation

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

MOBMAT
Posts: 390
Joined: Sat Feb 04, 2017 11:57 pm
Location: USA

Re: Canonical Position Representation

Post by MOBMAT »

Example of a position where under promotion to a Bishop wins, to a Queen draws, and to a Rook or knight loses.

[d]8/7P/8/3P4/1p6/1k1K4/p7/8 w - - 0 1
i7-6700K @ 4.00Ghz 32Gb, Win 10 Home, EGTBs on PCI SSD
Benchmark: Stockfish15.1 NNUE x64 bmi2 (nps): 1277K
User avatar
leanchess
Posts: 176
Joined: Sun Dec 08, 2019 8:16 pm
Full name: Dmitry Shechtman

Re: Canonical Position Representation

Post by leanchess »

hgm wrote: Sun Apr 12, 2020 9:46 pm It occurred to me that pairs can be encoded such that the individual locations are easy to derive by calculation rather than tabe lookup. You can encode them such that the 6 lowest bits are the square number of the first piece, while the second can be calculated from the other 5 bits as

sqr1 = pairCode & 63;
sqr2 = pairCode >> 5 & 62; // an even number
sqr2 += sqr1 & 1;
sqr2 += (sqr2 >= sqr1);

Great stuff. I believe this could be optmised by adjusting the last bit instead:

Code: Select all

sqr1 = pairCode & 63;
sqr2 = pairCode >> 6;
sqr2 -= sqr1 + 1;
sqr2 &= 63;

I don't see why this shouldn't work for triplets as well (with two subtractions instead of one for sqr3).

There are probably some cycles to be gained by performing SUB/ADD/AND on two/four pairs (or two triplets) at once.
User avatar
leanchess
Posts: 176
Joined: Sun Dec 08, 2019 8:16 pm
Full name: Dmitry Shechtman

Re: Canonical Position Representation

Post by leanchess »

I've been thinking about using this representation in conjunction with lazy promotions, which requires encoding pawns on the last rank, leading to 65 possible states.

If we store the pawns in two 6-bit slots and two 15-bit triplets per side in the following starting arrangement:

a bcd efg h

and keep them sorted by file, the h-pawn will never occupy the first slot or the first triplet. We therefore only need a1-g1 to encode en passant in the first slot/triplet, which frees up h1 for capture indication. The same applies to the a-pawn w.r.t. the second triplet / last slot, where capture would be indicated by a1 instead.

Does this make any sense?
User avatar
hgm
Posts: 27859
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Canonical Position Representation

Post by hgm »

What you propose doesn't seem to work at all. E.g. when sqr1 = 0, there is no way to get sqr2 = 40, as pairCode>>6 can only range from 0-31, so that after the modulo 64 sqr2 must be 0-30 or 63. And when sqr1 = 40, sqr2 can only be 23-54. So there is no way to encode (0, 40).

On a more general note, I am starting to wonder if one really can say that a scheme that allows 4 Bishops on each side is really better than one that at most allows 3 for one player. It is a nice academic exercise to study how far you can push it, but I don't think that with 192 bits you could push it to allowing all theoretical possibilities. And we have long since passed the point where we consider all paractical possibilities. Every attempt to to allow more unpractical possibilities should be considered a step back when it requires extra decoding or update effort.

So my guess is that your proposal of 3 Pawn pairs, Knight and Bishop pairs, and individual K, Q, R and two more P is actually the best compromise. The 4 spare bits each side has can be used as follows:

2 bits indicate whether the first single 'pawn' is P, B, R or Q.
1 bit indicates whether the second single 'pawn' is P or N
1 bit indicates whether the first pawn pair is PP or QQ.

This allows, for each player:
1 super-numerary piece of any kind.
2, 3 or 4 Queens (total).
2, 3 or 4 Queens (total) in combination with a super-numerary Knight
a super-numerary Knight in combination with a supernumerary Bishop or Rook
a super-numerary Bishop or Rook in combination with 3 Queens, or 2 Queens (total) if there are not more than 5 Pawns
some even more ridiculous combinations.

Only the first two stand any chance of ever occurring in practice in normal games. Of course one can be exposed to ridiculous combinations is a set-up position. Or even to illegal positions. E.g. Charge of the Light Brigade features 3 Queens vs 7 Knights in the presence of 8 Pawns for each player. This is not even reachable in FIDE chess, so it is a variant. But it could be a legitimate desire to be able to play that variant. But the logical way to achieve that would be to augment the variable game state with a fixed 'game-state extension', which tells you what piece types are actually represented by the 5 pairs and 6 singles the game state encodes. This game-state extension would be set upon loading a position, and for Charge of the Light brigade it would just redefine all normal piecse of the white player as Knights, and the Rooks of the black player as Queens, and you would not even have to touch the super-numerary stuff.
User avatar
leanchess
Posts: 176
Joined: Sun Dec 08, 2019 8:16 pm
Full name: Dmitry Shechtman

Re: Canonical Position Representation

Post by leanchess »

hgm wrote: Mon Apr 13, 2020 10:21 am What you propose doesn't seem to work at all. E.g. when sqr1 = 0, there is no way to get sqr2 = 40, as pairCode>>6 can only range from 0-31, so that after the modulo 64 sqr2 must be 0-30 or 63. And when sqr1 = 40, sqr2 can only be 23-54. So there is no way to encode (0, 40).
Right. I was trying to optimise the conditional away. The root of all evil, as they say...

I'd still like to explore the triplet option, since spare 7 bits would allow keeping updates to a minimum. Encoding 0-2 NRBs and 0-3 queens requires 3*3*3*4=108 codes, leaving us 20 supernumerary possibilities to get crazy with.
User avatar
hgm
Posts: 27859
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Canonical Position Representation

Post by hgm »

There isn't any advantage in 'optimizing away' conditional expressions (i.e. the C comparison operators ==, !=, >, <, >= and <=).

I don't grasp the significance of your calculation of the number of material combinations. We don't need any extra bits to describe how many 'ordinary' Knights (say) there are; the fields containing the locations already tell us that, by whether they coincide with the King location.

And information on super-numerary pieces will 'never' have to be updated in practice, as they will just not occur. And this will only have to be concluded on promotions, which are very rare in search trees to begin with. Common case would be that a move is not a promotion, and when it is, the common case would be that it promotes to a piece that was captured before, and you just resurrect that through its original encoding, changing the location from 'at King' to the promotion square, and changing the pawn location to 'not present'.
User avatar
leanchess
Posts: 176
Joined: Sun Dec 08, 2019 8:16 pm
Full name: Dmitry Shechtman

Re: Canonical Position Representation

Post by leanchess »

hgm wrote: Mon Apr 13, 2020 12:19 pm There isn't any advantage in 'optimizing away' conditional expressions (i.e. the C comparison operators ==, !=, >, <, >= and <=).
I'm unfamiliar with x86-64. If there is a single instruction that increments conditionally, that's already optimal, of course.
I don't grasp the significance of your calculation of the number of material combinations. We don't need any extra bits to describe how many 'ordinary' Knights (say) there are; the fields containing the locations already tell us that, by whether they coincide with the King location.
It is redundant. The problem with the 'at king' approach, however, is that it involves traversal of the entire piece list for every king move. If the captures are recorded separately and only the castling rights are encoded as 'at king', only the rooks are updated, and only during the first king move.

Edit: I just realised that simply marking a piece as captured would create multiple encodings for equal positions. I propose encoding captured pieces as a1 (or h1 for the second in each pair) in addition to marking them as captured.
User avatar
hgm
Posts: 27859
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Canonical Position Representation

Post by hgm »

leanchess wrote: Mon Apr 13, 2020 12:31 pmI'm unfamiliar with x86-64. If there is a single instruction that increments conditionally, that's already optimal, of course.
It is almost as good as that. Comparison sets 'condition codes' in the CPU status register depending on the outcome (as in most architectures). In a following instruction these condition codes can be used to branch on, but there also is an instructtion that copies a specified condition into the lowest bit of one of a register, zeroing allother bits, such that a 0 or 1 results. Even on architectures that do not have that there usually are branch-free tricks. E.g. they might have an 'add with carry' instruction for the purpose of implementing integer arithmetic for longer data types, and a comparison (basically a subtraction) can set the carry or not depending on the relative magnitude of the compared quantities. And then adding 0+0 with carry would create 0 or 1 depending on the magnitude. And to test for equallity you can subtract two quantities, and then add an all-1 constant. This produces a carry whenever the difference was not exactly zero, and a subsequent 0+0 with carry then produces 1 in that case, to implement the != operator. Optimizing compilers should know these tricks.
Edit: I just realised that simply marking a piece as captured would create multiple encodings for equal positions. I propose encoding captured pieces as a1 (or h1 for the second in each pair) in addition to marking them as captured.
We have that problem anyway. When two pieces of the same type are encoded as singles, exchanging their location alters the encoding, but not the position.
User avatar
leanchess
Posts: 176
Joined: Sun Dec 08, 2019 8:16 pm
Full name: Dmitry Shechtman

Re: Canonical Position Representation

Post by leanchess »

hgm wrote: Mon Apr 13, 2020 1:32 pm It is almost as good as that. Comparison sets 'condition codes' in the CPU status register depending on the outcome (as in most architectures). In a following instruction these condition codes can be used to branch on, but there also is an instructtion that copies a specified condition into the lowest bit of one of a register, zeroing allother bits, such that a 0 or 1 results. Even on architectures that do not have that there usually are branch-free tricks. E.g. they might have an 'add with carry' instruction for the purpose of implementing integer arithmetic for longer data types, and a comparison (basically a subtraction) can set the carry or not depending on the relative magnitude of the compared quantities. And then adding 0+0 with carry would create 0 or 1 depending on the magnitude. And to test for equallity you can subtract two quantities, and then add an all-1 constant. This produces a carry whenever the difference was not exactly zero, and a subsequent 0+0 with carry then produces 1 in that case, to implement the != operator. Optimizing compilers should know these tricks.
I forgot about ADC. Still, had my suggestion worked, the last three operations could have been performed on two pairs in parallel.

Isn't there a way to set those two bits at once? The following almost works:

Code: Select all

	sqr1 = pairCode & 63;
	sqr2 = pairCode >> 5 & 62;
	sqr2 -= (sqr1 & 30) - 1;
	sqr2 &= 63;
The only problem is with encoding (n, n + 2) where n is even.

P.S. Encoding piece counts separately would also allow encoding the rooks as a pair.
User avatar
leanchess
Posts: 176
Joined: Sun Dec 08, 2019 8:16 pm
Full name: Dmitry Shechtman

Re: Canonical Position Representation

Post by leanchess »

I think I just discovered the ultimate decoding algorithm:

Code: Select all

	sqr1 = pairCode & 63;
	sqr2 = pairCode >> 6;
	sqr2 += sqr1 + 1;
	sqr2 &= 63;
I suppose we've been overthinking it... :oops: