Page 1 of 2

On Zobrist keys

Posted: Sun Jun 21, 2009 10:23 am
by Lasse Hansen
Hi!

In my program Sillycon I use 781, 64 bit Zobrist keys.
12x64 Piece squares
8 En-passant
4 Castling rights
1 Wtm

Then I played with the idea to reduce the number of keys by rotating them. So I did that by making them as:

Code: Select all

	// Zobrist Hash keys
	for &#40;pcs=BKing; pcs <= WKing; pcs++) &#123;
		if &#40;pcs == Neutral&#41; pcs++;
		HKey&#91;pcs&#93;&#91;H8&#93; = Random64&#40;);
		for &#40;sqr=H7; sqr<=A1; sqr++) HKey&#91;pcs&#93;&#91;sqr&#93; = rol64&#40;HKey&#91;pcs&#93;&#91;H8&#93;,sqr&#41;;
	&#125;
	HKeyEp&#91;0&#93; = Random64&#40;);
	for &#40;i=1; i<8; i++) HKeyEp&#91;i&#93; = rol64&#40;HKeyEp&#91;0&#93;,i&#41;; // H-A
	for &#40;i=0; i<4; i++) HKeyCastle&#91;i&#93; = Random64&#40;);
	HKeywtm = Random64&#40;);
by making my own rol64 function:

Code: Select all

inline U64 rol64&#40;U64 A, U8 n&#41;
&#123;
	asm volatile&#40;"push %%r11;"
				 "mov %0,%%r11;"
				 "mov %1,%%cl;"
				 "shl %%cl,%0;"
				 "mov $64, %%cl;"
				 "sub %1,%%cl;"
				 "shr %%cl,%%r11;"
				 "or  %%r11,%0;"
				 "pop %%r11;"
				 &#58;"=r"&#40;A&#41;
				 &#58;"r"&#40;n&#41;,"0"&#40;A&#41;
				 &#58;"%cl");					
	return A;	
&#125;
Then I ran a set of test using perft with and without hashing, using the rotated keys. I tried on the following positions to the depth:
depth 8: "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w QKqk -\n"
depth 6: "r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w QKqk- -\n"
depth 5: "4k3/qqqqqqqq/q7/8/8/7Q/QQQQQQQQ/3K4 w - - 0 1\n"

I was a bit surprised that there were no errors! And it should also be possible to use one common (rotated) key for wtm, en-passant and castling properties also. This implies that one only need 13 Zobrist keys=104 bytes, when having the possibility to rotate them in the fly. One drawback is of course that there are no 64 bit rol or ror functions in, at least, my processor (AMD64), but well, they inserted the popcnt in the newer ones.

Regards,
Lasse

Re: On Zobrist keys

Posted: Sun Jun 21, 2009 11:20 am
by Gerd Isenberg
Hi Lasse,
I think HGM mentioned a similar idea to reduce the Hashkey space 64 fold. BTW your processor has 64-bit ror/rol. VC and Intel-C have intrinsics. You can use them with GCC inline assembly as well, see for instance Jörg Arndt's excellent paper Matters Computational, ideas, algorithms, source code:

1.11 Bit-wise rotation of a word

Code: Select all

static inline ulong asm_ror&#40;ulong x, ulong r&#41;
&#123;
   asm ("rorq %%cl, %0" &#58; "=r" &#40;x&#41; &#58; "0" &#40;x&#41;, "c" &#40;r&#41;);
   return x;
&#125;
Gerd

Re: On Zobrist keys

Posted: Mon Jun 22, 2009 10:37 am
by steffan
I too have had success using rotated Zobrist keys. As an experiment (a long time ago!) I modified Crafty to use rotated keys and had no collisions in the small number of self-play games I tried.

Cheers,
Steffan

Re: On Zobrist keys

Posted: Mon Jun 22, 2009 8:00 pm
by Gerd Isenberg
steffan wrote:I too have had success using rotated Zobrist keys. As an experiment (a long time ago!) I modified Crafty to use rotated keys and had no collisions in the small number of self-play games I tried.

Cheers,
Steffan
Hi Steffan,
I wonder whether random base keys or De Bruin sequences work better. Two instead of more than 100 cachelines for a very frequent access will likely relax L1 a bit. Whether this is worth the rotate instruction, which is bound on cl? Likely.

Code: Select all

U64 CACHEALIGN basekey&#91;16&#93;; // 2 cachelines instead of > 100

ror&#40;basekey&#91;piece&#93;, square&#41;; // or rol
instead of

Code: Select all

U64 zobristkey&#91;16&#93;&#91;64&#93;;

zobristkey&#91;piece&#93;&#91;square&#93;; 
Cheers,
Gerd

Re: On Zobrist keys

Posted: Mon Jun 22, 2009 9:58 pm
by steffan
Gerd Isenberg wrote:I wonder whether random base keys or De Bruin sequences work better.
I think the ideal keys have on average half their bits set, and all bits are mutually independent. Random keys satisfy both criteria, but de Bruijn keys fail the independence criterion : For example, all de Bruijn sequences have exactly half their bits set.

Cheers,
Steffan

Re: On Zobrist keys

Posted: Mon Jun 22, 2009 10:01 pm
by Edmund
Gerd Isenberg wrote:
steffan wrote:I too have had success using rotated Zobrist keys. As an experiment (a long time ago!) I modified Crafty to use rotated keys and had no collisions in the small number of self-play games I tried.

Cheers,
Steffan
Hi Steffan,
I wonder whether random base keys or De Bruin sequences work better. Two instead of more than 100 cachelines for a very frequent access will likely relax L1 a bit. Whether this is worth the rotate instruction, which is bound on cl? Likely.

Code: Select all

U64 CACHEALIGN basekey&#91;16&#93;; // 2 cachelines instead of > 100

ror&#40;basekey&#91;piece&#93;, square&#41;; // or rol
instead of

Code: Select all

U64 zobristkey&#91;16&#93;&#91;64&#93;;

zobristkey&#91;piece&#93;&#91;square&#93;; 
Cheers,
Gerd
or you go for

Code: Select all

U8 zobristkey&#91;16*64+7&#93;
return *&#40;U64 *) &#40;zobristkey + piece*64 + square&#41;;
1031 bytes = 16.11 cachelines
and no additional ror/rol instruction

Re: On Zobrist keys

Posted: Mon Jun 22, 2009 11:02 pm
by Gerd Isenberg
Codeman wrote: or you go for

Code: Select all

U8 zobristkey&#91;16*64+7&#93;
return *&#40;U64 *) &#40;zobristkey + piece*64 + square&#41;;
1031 bytes = 16.11 cachelines
and no additional ror/rol instruction
Might be quite expensive on x86, specially if you cross cacheline boarders. May be with some SSE4 unaligned load instruction.

Re: On Zobrist keys

Posted: Tue Jun 23, 2009 12:26 am
by Aleks Peshkov
steffan wrote:
Gerd Isenberg wrote:I wonder whether random base keys or De Bruin sequences work better.
I think the ideal keys have on average half their bits set, and all bits are mutually independent. Random keys satisfy both criteria, but de Bruijn keys fail the independence criterion : For example, all de Bruijn sequences have exactly half their bits set.
Random keys have random dependence. Set of only 12 de Bruijn keys (instead of 768 pseudo-random) can be easier to test, as they are proved to be good enough for itself rotation.

Re: On Zobrist keys

Posted: Tue Jun 23, 2009 8:00 am
by wgarvin
Gerd Isenberg wrote:
Codeman wrote: or you go for

Code: Select all

U8 zobristkey&#91;16*64+7&#93;
return *&#40;U64 *) &#40;zobristkey + piece*64 + square&#41;;
1031 bytes = 16.11 cachelines
and no additional ror/rol instruction
Might be quite expensive on x86, specially if you cross cacheline boarders. May be with some SSE4 unaligned load instruction.
You could however, overlap 12 keys for each square at 2-byte intervals and fit them all within an aligned 32-byte block. That gives you a 2 KB table (32*64 bytes) that requires no rotate and never does a read that crosses cacheline boundaries. Its not as compact as the rotated keys, but its better than the usual 6 KB (12*8*64).

Can anyone can think of a way to use 1-byte intervals and make the table smaller without causing some accesses to cross a cache line boundary? I tried to figure out a way, but have come up with nothing so far.

Re: On Zobrist keys

Posted: Tue Jun 23, 2009 8:55 am
by Zach Wegner
You can do it if you muck with the address a bit to get the unused pieces (12-15) to cluster at the top of a cache line:

Code: Select all

unsigned char data&#91;64*16&#93;;
s1=square&3;
s2=square>>2;
hashkey = *&#40;u64*)&#40;data+s2*64+s1+piece*4&#41;;
...or more bit-twiddlingly:

Code: Select all

unsigned char data&#91;64*16&#93;;
hashkey = *&#40;u64*)&#40;data+&#40;square&~3&#41;*15+square+piece*4&#41;;