Alternative to power of two size hash tables

Discussion of chess software programming and technical issues.

Moderator: Ras

MOBMAT
Posts: 414
Joined: Sat Feb 04, 2017 11:57 pm
Location: USA

Alternative to power of two size hash tables

Post by MOBMAT »

Last winter, I was checking the Stockfish site for new development builds, and came across a release that had a different scheme for indexing into the hash table.

I think you can link to the info here:

http://abrok.eu/stockfish/?page=6

The actual text follows:
Author: Joost VandeVondele
Date: Mon Dec 18 16:32:21 2017 +0100
Timestamp: 1513611141

Allow for general transposition table sizes. (#1341)

For efficiency reasons, current master only allows transposition table sizes N = 2^k; the index computation (hash % N) can then be written efficiently as (hash & (2^k - 1)). On a typical computer (with 4, 8, ... GB of RAM), this implies roughly half the RAM is left unused in analysis.

This issue was mentioned on fishcooking by Mindbreaker:

http://tests.stockfishchess.org/tests/v ... ccbb8be04

Recently a neat trick was proposed to map a hash into the range [0,N[ more efficiently than (hash % N) for general N, nearly as efficiently as (hash % 2^k):

https://lemire.me/blog/2016/06/27/a-fas ... eduction/

namely computing (hash * N / 2^32) for 32 bit hashes. This patch implements this trick and now allows for general hash sizes. Note that for N = 2^k this just amounts to using a different subset of bits from the hash. Master will use the lower k bits, this trick will use the upper k bits (of the 32 bit hash).

There is no slowdown as measured with [-3, 1] test:

http://tests.stockfishchess.org/tests/v ... 0ccbb8be04

LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 128498 W: 23332 L: 23395 D: 81771

There are two (smaller) caveats:

1) the patch is implemented for a 32-bit hash (so that a 64-bit multiply can be used); this effectively limits the number of clusters to 2^32, or 128 GB of transposition table. That's a change in the maximum allowed TT size, which could bother those regularly using 256 GB or more.

2) Already in master, an excluded move is hashed into the position key in a rather simple way, essentially only affecting the lower 16 bits of the key. This is OK in master, since bits 0-15 end up in the index, but not in the new scheme, which picks the higher bits. This is 'fixed' by shifting the excluded move a few bits up. Eventually a better hashing scheme seems wise.

Despite these two caveats, I think this is a nice improvement in usability.

Bench: 5346341
As I was in the middle of a code rewrite, I decided to try this idea. So far, it is working fine. My implementation is as follows:

Initialize the hash with:

Code: Select all

U32 m_NumHashEntries = (HashSize * MEGABYTE) / sizeof(TTMAIN);
m_Hash = (TTMAIN *)calloc(m_NumHashEntries, sizeof(TTMAIN));
Note that m_NumHashEntries will probably NOT be a power of two!

TTMAIN is the hash entry structure
MEGABYTE = 1024 * 1024

To get the correct index from the key...

Code: Select all

U32 GetHashIndex(const U64 key) {
	return ((key & BITMASK32) * m_NumHashEntries) >> 32;  // from Stockfish
}
where:
U64 BITMASK32 = 0xFFFFFFFF;

The 32-bit right shift is a very fast divide, so the extra overhead of this method is small. The mask gives us the lower 32 bits of the 64-bit key, the multiply cannot overflow a 64-bit number, and shifting right by 32 leaves the upper portion of the product as our index.
i7-6700K @ 4.00Ghz 32Gb, Win 10 Home, EGTBs on PCI SSD
Benchmark: Stockfish15.1 NNUE x64 bmi2 (nps): 1277K
Volker Annuss
Posts: 181
Joined: Mon Sep 03, 2007 9:15 am

Re: Alternative to power of two size hash tables

Post by Volker Annuss »

That trick was already discussed here.
MOBMAT
Posts: 414
Joined: Sat Feb 04, 2017 11:57 pm
Location: USA

Re: Alternative to power of two size hash tables

Post by MOBMAT »

wow, the idea has been around a long time!

I'll read that thread for more info, thanks.
syzygy
Posts: 5951
Joined: Tue Feb 28, 2012 11:56 pm

Re: Alternative to power of two size hash tables

Post by syzygy »

I have tried to modify the trick to make it work for hash tables larger than 128 GB:
https://github.com/official-stockfish/S ... ssues/1349

Unfortunately all my tests failed, even though there is no clear reason why this would be measurably slower on x86-64:

Code: Select all

return &table[(key * (unsigned __int128)clusterCount) >> 64].entry[0];
Dann Corbit
Posts: 12870
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Alternative to power of two size hash tables

Post by Dann Corbit »

If you use a union of two 64-bit integers and a 128-bit integer, then you would not even have to shift. Just grab the top half.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
syzygy
Posts: 5951
Joined: Tue Feb 28, 2012 11:56 pm

Re: Alternative to power of two size hash tables

Post by syzygy »

Dann Corbit wrote:If you use a union of two 64-bit integers and a 128-bit integer, then you would not even have to shift. Just grab the top half.
That's what the compiler already does.
Dann Corbit
Posts: 12870
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Alternative to power of two size hash tables

Post by Dann Corbit »

Pretty strange that it is slow then.
Volker Annuss
Posts: 181
Joined: Mon Sep 03, 2007 9:15 am

Re: Alternative to power of two size hash tables

Post by Volker Annuss »

I have not measured how fast it is, but you might want to try

Code: Select all

return &table[(((key * cluster_count) & 0x3ffffffffull) >> 34].entry[0];
for 512GB if your cluster size is 32.
syzygy
Posts: 5951
Joined: Tue Feb 28, 2012 11:56 pm

Re: Alternative to power of two size hash tables

Post by syzygy »

Volker Annuss wrote:I have not measured how fast it is, but you might want to try

Code: Select all

return &table[(((key * cluster_count) & 0x3ffffffffull) >> 34].entry[0];
for 512GB if your cluster size is 32.
I have tried something similar to get to 1 TB:

Code: Select all

return &table[((key >> 20) * unitCount) >> 29].entry[0];
It also tested worse:
http://tests.stockfishchess.org/tests/v ... 0ccbb8c19e
Volker Annuss
Posts: 181
Joined: Mon Sep 03, 2007 9:15 am

Re: Alternative to power of two size hash tables

Post by Volker Annuss »

Much too late to edit. This one should be correct:

Code: Select all

return &table[((key & 0x3ffffffffull) * cluster_count) >> 34].entry[0];