Zobrist keys (again)

hgm · Post by **hgm** » Mon Jan 26, 2009 11:07 am

I suddenly got this idea for creating a better Zobrist set: The idea was triggered when I was thinking about Shogi. In Shogi every piece can promote, not just Pawns. So it seems necessary to have different sets of Zobrist keys (one for each square) for both normal and promoted form of each piece. This would double the number of required keys. (Well, nearly so, as a King and a Gold do not promote.)

But then I realized, as each piece only has a single promoted form, that you might as well have only a single extra set of Zobrist keys (one for each square), indicating that the piece on that square is promoted. Thus a normal piece on as quare would be included in the hash key as a single Zobrist random in the total hash key, but the promoted form by two: the hash key for the unpromoted form, and the promotion key for that square. This would save a lot of keys. (Only 1 extra set of 81 (the Shogi board is 9x9) in stead of 12 extra sets for black and white promoted Pawn, Lance, Knight, Silver, Bishop, Rook.)

Now how can I use this idea in Chess? Well, a direct equivalent would be to use a key of 64 keys to make the distinction black/white pieces. Then, in stead of 12 sets of 64 keys, you would only need 7 (6 for the piece type, one for the color).

At first sight this seems to be a trick to reduce the size of the Zobrist table, but the trick is not free, as you would have to XOR with multiple keys to add or remove a single piece (if it were, say, black). So for speed you would pre-compute the products of the color key and the piece-type key for each piece, and have a table of 12*64 keys anyway. So you would do the normal thing, except that you would generate the keys for the pieces not as independent random numbers, but in a structured way, as XORs of a color and a piece-type key.

As a result, the elementary keys you start from to do this construction are much smaller in number (7*64 in stead of 12*64). And that it would thus be easier to create such a set with a better independency of the underlying keys. That the constructed keys might have poor dependency does not hurt, as the construction method sees to it that dependent keys will not occur together. E.g. the keys for all black pieces on e4 will get more dependent, because they all contain the same black key for e4 as a component, but you don't care, because there can only be a single piece on e4 in normal Chess positions.

Pushing the strategy further: to specify the contents of a square only needs 4 bits in Chess. You could use 1 bit for the color, 3 bits for the piece type. The color bit we already discussed. But the other 3 bits could be treated the same way. You could make a set of 64 keys for each of the 4 bits, only 256 keys in total. And then construct keys for each of the 12 colored piece types from these 256 'basis keys'. A basis key would only be present if is corresponding bit was 1. The likelyhood of two positions being confused would then be related to the number of bits that differed in this representation, i.e. the hamming distance between the positions encoded as 64 nibbles.

Now I would think intuitively that the collision rate could be improved by taking into account the relative abundance of the pieces, in a Hufman-coding inspired way: As there are as many Pawns as other pieces, it sounds reasonable to use a single basis key for the Pawns (so they have a better chance of being independent). So the basis keys would consist of 64 'occupied' keys, 64 'black' keys, 64 'pawn' keys, and 3 more sets of 64 to distinguish the other pieces.

Aleks Peshkov · Post by **Aleks Peshkov** » Mon Jan 26, 2009 1:09 pm

I wonder why are you bothering with low-level tricks for a very simple and not performance crucial part.

You can limit yourself to 12 64-bit numbers for each piece type.
Each board square is computed by cyclic bit-rotation.

Almost any set of 12 de Bruijin 64-bit numbers is a simple choice because they are guaranteed to have decent Hamming distance for such rotation in all permutations.

It is possible to select a better then average "dozen-set" by brute force test of de Bruijin or plain random numbers.

hgm · Post by **hgm** » Mon Jan 26, 2009 1:28 pm

I am not sure why you are considering this 'low-level tricks'. This term usually applies to programming in a way to exploit a specific machine architecture. I am merely trying to develop mathematical theory that would optimally exploit known properties of real Chess positions (number of pieces of each type, limited number of pieces on each square) to reduce the number of collissions for a given set of basis keys.

This technique could be generally applied to any set of basis keys, no matter how they were constructed. So also to the DeBruin set you suggest.

On another note, I am not so sure that this would make good keys. Have you tested the dependency of the keys generated according to your scheme? For a key-length of 48 bits, how many 6th-order dependencies did you have within a set of 768 keys?

Aleks Peshkov · Post by **Aleks Peshkov** » Mon Jan 26, 2009 5:14 pm

For the set I posted not long ago I tested minimum and maximum (to avoid mirrored shadows) Hamming distances between any two and any three numbers. You can try to test those numbers deeper if you have the framework.

I do not understand a question about 48-bits. I tested 32-bit halves for minimum Hamming distance, they were decent, better then random set, search brute force in a hour period.

It is easier to find and test 12 de Brujin numbers (there are plenty of them, but they all generated in about a hour on slow notebook) then to find 768 random numbers. I fear that all de Brujin numbers have a sort of interdependency, but it is not obvious and I did not discover any.

hgm · Post by **hgm** » Mon Jan 26, 2009 6:40 pm

The point is that you tested hamming distances, which are about as significant for the quality of the keys as the number of French words they contain when interpreted as an ascii string. Who cares if the keys all have a hamming distance between 28 and 36 to each other, if they are all dependent and make almost all position collide with almost any other?

For this reason it would be relevant to know how independent the keys are. If the keys you get from your prescription are any better than average, there should not be any dependency between any 6 out of the 768-key set.

jwes · Post by **jwes** » Mon Jan 26, 2009 8:52 pm

hgm wrote:The point is that you tested hamming distances, which are about as significant for the quality of the keys as the number of French words they contain when interpreted as an ascii string. Who cares if the keys all have a hamming distance between 28 and 36 to each other, if they are all dependent and make almost all position collide with almost any other?

For this reason it would be relevant to know how independent the keys are. If the keys you get from your prescription are any better than average, there should not be any dependency between any 6 out of the 768-key set.

To carry your point further, what matters is how well a particular set of Zobrist keys work in game situations. Testing anything else would require a second step of proving that that thing is highly correlated with game performance.

diep · Post by **diep** » Mon Jan 26, 2009 11:12 pm

Aleks Peshkov wrote:I wonder why are you bothering with low-level tricks for a very simple and not performance crucial part.

You can limit yourself to 12 64-bit numbers for each piece type.
Each board square is computed by cyclic bit-rotation.

Almost any set of 12 de Bruijin 64-bit numbers is a simple choice because they are guaranteed to have decent Hamming distance for such rotation in all permutations.

It is possible to select a better then average "dozen-set" by brute force test of de Bruijin or plain random numbers.

It is interesting as you can also use it to form encryption schemes.

Note i already toyed with the HGM idea a while ago.

Quickest plan i figured out is to have 13 hashkeys.
One for every piece and 1 special key for en passant and castling.

If we do a lookup for each piece in a square:
hashlookup = rotateleft(hashkeys[piece],sq);

Note it is better to use rotateleft than rotateright (shifting right at P4 can be like 7 cycles or so, not sure about rotate though).

So we have 1 tricky hardware instruction namely rotate.

You just rotate the required key.

Downsides of this approach:

a) rotating simply eats an extra cpu cycle
b) loading the instruction from the L1i is more costly than the lookup of a table of 64 * 12 hashkey entries.
c) the extra instruction slows down the program for sure,
whereas the lookup isn't faster than it already was,
odds is 99.99% anyway that it already was in L1d.

My advice is to first profile your code with cachegrind or even better vtune.
That will give a lot of answers.

From encryption viewpoint seen however the idea is very interesting, because with just a few basic instructions we want to create more unique hashkeys from a small set of start values, which gives a bigger domain where we can choose from.

So the question to HGM is, find now a solution to generate an unique hashkey using just 1 constant hashlookup() define without doing a reference to L1d cache at all. So we need to create unique value from a constant value using 1 side and 6 different pieces. Or simply 1 variable,
in diep i have all 3 variables available always in some register (or rename register).

For chess, yeah, the idea sucks.

Thanks,
Vincent

hgm · Post by **hgm** » Tue Jan 27, 2009 9:31 am

I have thought some more about this, and the problem is not entirely equivalent to Hufman coding. The point is that a one bit in the board encoding is encoded in the total hash key by the presence of a basis key, and a zero bit by the absence. To minimize the average probability for collissions you need as few basis keys in the total hash key as possible. So the game is to minimize the number of one bits in the board encoding, not the total number of bits (as Hufman does).

This would suggest we have to encode the most abundant pieces by codes with only a few one bits, i.e. pawns. But this ignores the fact that Pawns are not very mobile, so positions with a Pawn in a very different place are much les likely to occur in the same search tree than positions with, say, a Queen in a very different place. So there is a lot to be said for assigning codes with only a few one bits to the most mobile pieces as well.

This philosophy brought me to the following scheme: the contents of each board square is encoded by 4 bits, as follows

0000 empty square
1000 P
0001 p
0100 Q
0010 q
1001 N
0110 n
1100 B
0011 b
1010 R
0101 r
1110 K
0111 k

Thus P and Q are each indicated by a single Zobrist basis key, N, B and R by 2, and K by 3. In other words, the Zobrist 'randoms' for pieces on the same square are intentionally made strongly dependent (3rd-order dependency) by setting

N = P^p
B = P^Q
R = P^q
K = P^Q^q
n = Q^q
b = p^q
r = p^Q
k = p^Q^q

So positions with N on, say, e4, would now collide with positions that had two Pawns on e4 (a white and a black one), but where otherwise identical. This is not very harmful.

By clustering the keys for one square as close together as possible (in terms of dependency), room is created in code space to put these key clusters further apart. This manifests itself by the fact that we have only to find a set of 256 (4x64) basis keys with good independency, in stead of a set of 768 (12x64).

Note that several codes for each square are still left unused, and those could be used for e.p. and castling keys. E.g. the e.p. square could be encoded by using the alternative code 1111 for an empty square to which e.p. capture is possible, which occupies no extra code space, as the corresponding key (P^p^Q^q) is already dependent on existing keys for that square, in a way that can never produce a collision. Similarly, this code could be used for a Rook with castling rights in a corner square. (Or watever starting square Rooks had in FRC.) This would mean the key for white Q-side castling would be constructed as R^Q^p from the other a1 keys.

This way of constructing Zobrist keys should strongly reduce collission rate, even if the basis keys are generated as plain random numbers, without any effort to crank up their independency. But you could of course still do the latter.

Vinvin · Post by **Vinvin** » Tue Jan 27, 2009 11:11 am

hgm wrote:...
This philosophy brought me to the following scheme: the contents of each board square is encoded by 4 bits, as follows

0000 empty square
1000 P
0001 p
0100 Q
0010 q
1001 N
0110 n
1100 B
0011 b
1010 R
0101 r
1110 K
0111 k
...

May be you can set a different value for rooks when they allow/don't allow the castle right, that would make 2 more values. If king moves for the 1st time in the game then 2 rooks values will be affected.

Zach Wegner · Post by **Zach Wegner** » Tue Jan 27, 2009 5:34 pm

hgm wrote:I have thought some more about this, and the problem is not entirely equivalent to Hufman coding. The point is that a one bit in the board encoding is encoded in the total hash key by the presence of a basis key, and a zero bit by the absence. To minimize the average probability for collissions you need as few basis keys in the total hash key as possible. So the game is to minimize the number of one bits in the board encoding, not the total number of bits (as Hufman does).

This would suggest we have to encode the most abundant pieces by codes with only a few one bits, i.e. pawns. But this ignores the fact that Pawns are not very mobile, so positions with a Pawn in a very different place are much les likely to occur in the same search tree than positions with, say, a Queen in a very different place. So there is a lot to be said for assigning codes with only a few one bits to the most mobile pieces as well.

This philosophy brought me to the following scheme: the contents of each board square is encoded by 4 bits, as follows

0000 empty square
1000 P
0001 p
0100 Q
0010 q
1001 N
0110 n
1100 B
0011 b
1010 R
0101 r
1110 K
0111 k

Thus P and Q are each indicated by a single Zobrist basis key, N, B and R by 2, and K by 3. In other words, the Zobrist 'randoms' for pieces on the same square are intentionally made strongly dependent (3rd-order dependency) by setting

N = P^p
B = P^Q
R = P^q
K = P^Q^q
n = Q^q
b = p^q
r = p^Q
k = p^Q^q

So positions with N on, say, e4, would now collide with positions that had two Pawns on e4 (a white and a black one), but where otherwise identical. This is not very harmful.

By clustering the keys for one square as close together as possible (in terms of dependency), room is created in code space to put these key clusters further apart. This manifests itself by the fact that we have only to find a set of 256 (4x64) basis keys with good independency, in stead of a set of 768 (12x64).

Note that several codes for each square are still left unused, and those could be used for e.p. and castling keys. E.g. the e.p. square could be encoded by using the alternative code 1111 for an empty square to which e.p. capture is possible, which occupies no extra code space, as the corresponding key (P^p^Q^q) is already dependent on existing keys for that square, in a way that can never produce a collision. Similarly, this code could be used for a Rook with castling rights in a corner square. (Or watever starting square Rooks had in FRC.) This would mean the key for white Q-side castling would be constructed as R^Q^p from the other a1 keys.

This way of constructing Zobrist keys should strongly reduce collission rate, even if the basis keys are generated as plain random numbers, without any effort to crank up their independency. But you could of course still do the latter.

Of course, you can combine this idea with my deBruijn one. You can use four different deBruijns for the piece encoding, as well as a special one that is shifted to either the e.p. square or the square of the rook involved in castling. Five different keys should be a much more manageable search space for finding these things. And also, since Vincent brought it up in another thread, my idea was not meant to save time, but rather to create a theoretically better set of hash keys. I would precompute all of them and stick them in my current array anyways...

Zobrist keys (again)

Zobrist keys (again)

Re: Zobrist keys (again)

Re: Zobrist keys (again)

Re: Zobrist keys (again)

Re: Zobrist keys (again)

Re: Zobrist keys (again)

Re: Zobrist keys (again)

Re: Zobrist keys (again)

Re: Zobrist keys (again)

Re: Zobrist keys (again)