Canonical Position Representation

leanchess · Post by **leanchess** » Fri Apr 10, 2020 9:26 am

Gentlemen (regrettably, I am afraid there are no ladies present),

Recently I have been experimenting with move generation from dense piece lists, and noticed that an arbitrary position (assuming no RB-promotions) may be stored in 256 bits (128 per side) as a whole (STM, castling and e.p. included). This means 4 64-bit registers (or 2 128-bit registers, if we are really down to the metal).

The next logical step seems to be canonicalisation, which may come in handy for preventing hash collisions. However, maintaining a canonical list requires many conditionals and selective updates (16 octets in the worst case), so it might be more sensible to defer canonicalisation to the endgame.

In any case, something similar has been requested some time ago:

https://stackoverflow.com/questions/213 ... s-position

I haven't come across an existing solution thus far, so I am wondering whether this is worth pursuing.

As I am new to the field, I would greatly appreciate your feedback.

Thanks in advance!

hgm · Post by **hgm** » Fri Apr 10, 2020 10:18 am

I don't know what exactly your goal is. But for compactness one can do much better than 256 bits. E.g. Hufmann encoding would make use of the fact that not all piece types are equally abundant, and assign shorter codes to the most abundant (normally empty squares). The natural frequencies might be upset through promotions, but OTOH promotions are only possible after captures, which increases the number of empty squares, so the effect is not as large as one may fear. Also here it matters a lot whether one is interested in encoding all theoretically possible positions, or just those that could occur in an actual game. (E.g. would it really be necessary to encode positions where one player has 7 Knights?)

leanchess · Post by **leanchess** » Fri Apr 10, 2020 10:40 am

hgm wrote: ↑Fri Apr 10, 2020 10:18 am I don't know what exactly your goal is.

The goal is maintaining a single uniform representation for move generation, hashing, tablebooks etc. (although I understand that keeping at least bitboards in addition is required for efficient check detection).

Also here it matters a lot whether one is interested in encoding all theoretically possible positions, or just those that could occur in an actual game. (E.g. would it really be necessary to encode positions where one player has 7 Knights?)

Obviously, I'm only looking at realistic Chess[960] positions. To keep it simple, I limit the canonical representation to 2 rooks, 1 white-even bishop and 1 white-odd bishop (for lack of knowledge of existing nomenclature). This does leave us room for 11 knights, however

hgm · Post by **hgm** » Fri Apr 10, 2020 11:18 am

I think there is no justification for discarding positions with 3 Rooks, but not with 4 or more Knights.

As was mentioned in the stackoverflow discussion you referred to, piece-list encoding is more compact than board encoding. And the piece-list encodings discussed there are quite inefficient; they do not exploit the fact that some pieces are of the same type. To unambiguously encode the constellation of a pair of identical pieces one would only need 11 bits, not 12, as there are only 64*63/2 constellations even on an empty board. And for Pawns the reduction is even a factor 8! in addition to that they can only exist in 75% of the board, which could save another 18 bits.

leanchess · Post by **leanchess** » Fri Apr 10, 2020 12:20 pm

To clarify, I'm not looking for the smallest possible representation, but rather for one that could in addition be used as the source for move generation. This means (unless somebody has a better idea) 8 bits per piece, 6 of which are location (with a twist for e.p. pawns) and 2 are castling/type1/type2, depending on the index.

One advantage of such encoding is that the value may be used (either directly or with an additional bit from the index) as an index into an array of function pointers. For example, the queen portion (assuming contiguous location in bits 0-6) would look like this:

Code: Select all

	// ...
	queen_a1, queen_1, queen_1, queen_1, queen_1, queen_1, queen_1, queen_h1,
	queen_a , queen  , queen  , queen  , queen  , queen  , queen  , queen_h ,
	queen_a , queen  , queen  , queen  , queen  , queen  , queen  , queen_h ,
	queen_a , queen  , queen  , queen  , queen  , queen  , queen  , queen_h ,
	queen_a , queen  , queen  , queen  , queen  , queen  , queen  , queen_h ,
	queen_a , queen  , queen  , queen  , queen  , queen  , queen  , queen_h ,
	queen_a , queen  , queen  , queen  , queen  , queen  , queen  , queen_h ,
	queen_a8, queen_8, queen_8, queen_8, queen_8, queen_8, queen_8, queen_h8,
	// ...

hgm wrote: ↑Fri Apr 10, 2020 11:18 amI think there is no justification for discarding positions with 3 Rooks, but not with 4 or more Knights.

It is my understanding that no human (in their right mind) would ever promote to anything other than queen or knight, and I'm assuming that no programmer (again, in their right mind) would let their engine make such a move. Other than wrong perft counts for some unrealistic hand-crafted positions, what are the implications of the limitations I proposed above?

hgm · Post by **hgm** » Fri Apr 10, 2020 1:26 pm

Promotion to Rook or Bishop is sometimes the only winning move, but I don't know of any examples where this would happenw hen you already have two of those. So occurrence of 'super-numerary' Rooks or Bishops is indeed highly unlikely. There is an opening line (of the Kings gambit, IIRC) where one player promotes to Knight while he already has two. But having to do that twice is also quite inconceivable.

OK, so you are proposing a piece list after all. That you mentioned 256 bits put me on the wrong track, because that is what a board representation of 64 x 4 bits/piece would use. Why so large? The location only requires 6 bits, and specifying the 'castling type' on Knights and Bishops seems a bit wasteful. Representing castling rights can be done with only 4 bits in total. You want to make sure that all the entries are byte-aligned?

I am wondering about your use case. Normally one either doesn't care much about size of the representation, and makes every effort to organize it for fast use (engines). Or one doesn't care about speed, only for compactness (data bases). A compromise seems to be unsuitable for either of these applications.

Unpacking a format that uses 6 bits per piece location hardly involves any work: just shift them to the low bits, and AND with 63. To step through an array of bytes you would have to increment an index, and load the byte from memory with an indexed addressing mode. Not really faster. If you generate moves through an unrolled loop (so that you don't have to test what piece type you are dealing with, but you know it for every iteration in advance), there is also hardly any penalty in using pair encodings for the Rook; you can use an 11-bit location code as index in a 2048-entry table that contains two square numbers, and then generate the moves from those square numbers.

leanchess · Post by **leanchess** » Fri Apr 10, 2020 2:10 pm

hgm wrote: ↑Fri Apr 10, 2020 1:26 pm Promotion to Rook or Bishop is sometimes the only winning move, but I don't know of any examples where this would happenw hen you already have two of those. So occurrence of 'super-numerary' Rooks or Bishops is indeed highly unlikely. There is an opening line (of the Kings gambit, IIRC) where one player promotes to Knight while he already has two. But having to do that twice is also quite inconceivable.

I am even worse a chess player than a I am a chess programmer, so please forgive my ignorance. I am assuming that any promotion to rook/bishop could be safely substituted by a full promotion.

I am wondering about your use case. Normally one either doesn't care much about size of the representation, and makes every effort to organize it for fast use (engines). Or one doesn't care about speed, only for compactness (data bases). A compromise seems to be unsuitable for either of these applications.

I seem to be failing repeatedly at explaining my idea, so let me just try and show you the code.

This is an excerpt from my (as of yet not fully functional) perft function:

Code: Select all

	//...

	piece_t k = {0, ps->pieces[0], 1};
	uint8_t b = pt->pieces[0] & 0x3F;

	int64_t n = (*perft_king_lut[i])(ps, pt, ms, mt, &next, b, k);
	if (n < 0)
		return 0;
	nodes += n;

	piece_t p;
	for (p.i = 1, p.m = 2; p.i < 16; ++p.i, p.m <<= 1) {
		if (p.m & ms) {
			p.v = ps->pieces[p.i];
			uint16_t i = p.v + ((p.i & 8) << 5);
			int64_t n = (*perft_lut[i])(ps, pt, ms, mt, &next, b, p);
			if (n < 0)
				return 0;
			nodes += n;
		}
	}

	//...

Legend:

ps - Side to move
pt - The other side
ms - mask for ps's pieces (not really required, but seems to be faster)
mt - mask for pt's pieces (ditto)
k - ps's king + castling rights
b - pt's king
p - piece structure:
- p.i - index
- p.v - value
- p.m - mask (not really required, currently used for unmake)
next - next perft state

The most interesting part is perft_lut, an array (eventually) containing 512 function pointers as follows:

000-03F Knight moves by origin square
040-07F Rook moves
080-0BF Bishop moves
0C0-0FF Queen moves
100-13F Copy of 000-03F (for promotions to knights)
140-17F White pawn moves
180-1BF Black pawn moves
1C0-1FF Copy of 0C0-0FF (for promotions to queens)

As king+castling always resides at index 0, it makes sense to keep a separate perft_king_lut[256] as follows:

00-3F Normal king moves by origin square
40-7F Long castling king moves
80-BF Short castling king moves
C0-FF Full castling king moves

The latter three sections are copies of the first one, with the exceptions of relative offsets 04 (e1) and 74 (e8).

The only restriction so far is 7 bishops+rooks, but I'm thinking ahead w.r.t. canonicalised version, where the penalties for arbitrary piece counts would become significant.

hgm · Post by **hgm** » Fri Apr 10, 2020 4:18 pm

Well, yeah, this is how one can generate moves from a piece list. However, moves for a given piece depend on what else is on the board, and to know whether a given square is occupied so that sliders cannot reach beyond it, and by what color piece to know whether you can capture it, can only be done with the aid of some board representation. Otherwise you would have to run through the entire piece list to see if the square you contemplate moving to happens to be occupied by something.

And it still isn't clear to me why this would be a particularly good way to do it, or even what the criteria are by which this should be judged. Your first posting talks about "dense piece lists", but what wakes the list in your proposal "dense"? It seems an ordinary piece list, not very different from what for instance Qperft uses. And it needs 256 bits, while in principle you could do with 192 bits easily.

leanchess · Post by **leanchess** » Fri Apr 10, 2020 4:51 pm

hgm wrote: ↑Fri Apr 10, 2020 4:18 pm Well, yeah, this is how one can generate moves from a piece list.

So I understand the idea of function LUTs isn't new, just as I suspected.

However, moves for a given piece depend on what else is on the board, and to know whether a given square is occupied so that sliders cannot reach beyond it, and by what color piece to know whether you can capture it, can only be done with the aid of some board representation. Otherwise you would have to run through the entire piece list to see if the square you contemplate moving to happens to be occupied by something.

Yes, piece lists alone probably wouldn't cut it. As I already said, keeping at least a bitboard per side would seem advisable.

And it still isn't clear to me why this would be a particularly good way to do it, or even what the criteria are by which this should be judged.

To reiterate, the idea is to have the exact same bits for everything: move generation, TT (collision detection), OB, TB etc.

Your first posting talks about "dense piece lists", but what wakes the list in your proposal "dense"?

Yes, decorated ordinary piece lists, if you will, as opposed to Loop's disjoint lists as described in the wiki article.

Anyway, could you please confirm/dispel my assumption considering underpromotions?

hgm · Post by **hgm** » Fri Apr 10, 2020 5:24 pm

There are certain end-game studies where the only winning move is an under-promotion to Rook or Bishop. The Saavedra study is a famous one. In such positions promoting to Queen doesn't work, because it would stalemate the opponent. And it could be the only chance at promotion that you get.

I am not sure if using the entire 256-bit game state as key for TT, OB or EGT probing would ever be competitive. In any case it prompts the question why it wouldn't be much better to use a 192-bit game state for that.

I am not sure that a LUT with function pointers was used before in perft, but it is a standard technique. And probably not very fast for this application. One problem with function pointers from tables is that the branch is unpredictable, so that you will virtually always have a mis-predict. You would not have that if only data (like directions and ranges for the piece at that location) came from a LUT. In my engine HaQiKi D I use a table each piece type uses a board-size table that points to a zero-terminated list of board steps according to which the piece can move, so that I can omit directions that would already go off-board on the first step. (Especially a perft for the start position would benefit from that, as all pieces start at the edge. But for an engine it matters much less, as it will quickly centralize the pieces. I mainly did it because it was an easy way to prevent Advisors and Kings could leave the Palace; I just omitted the directions that would step out of it for squares on the Palace edge.) Pointers are 8 bytes on x64 architecture, square numbers or step vectors only 1 byte, so the size of the LUT could also be an argument.

Disjoint lists are often the best solution to allow for super-numerary promotions. IIRC Qperft also splits the list in a pawn, leaper and slider section. That makes it easy to add Knights or sliders; otherwise you would have to shuffle the pieces, or alter their type, so that you can no longer be sure which type you are dealing with just from the list index. (And you would have to shuffle them around in an engine which needs them ordered by decreasing value, etc.) In HaQiKi D I do use a contiguous list, because Xiangqi doesn't have promotions. So pieces are either there, or not, and in the latter case I use an invalid square number. (With 90 board squares you have plenty of those without needing extra bits.)

Canonical Position Representation

Canonical Position Representation

Re: Canonical Position Representation

Re: Canonical Position Representation

Re: Canonical Position Representation

Re: Canonical Position Representation

Re: Canonical Position Representation

Re: Canonical Position Representation

Re: Canonical Position Representation

Re: Canonical Position Representation

Re: Canonical Position Representation