Can someone explain what advantage there is to...

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: Can someone explain what advantage there is to...

Post by Zenmastur »

hgm wrote:None of my engines stores more than 32 bits of the hash key. Joker uses 9-byte hash entries, 7 per bucket.

5 entries of 12 bytes each is also a nice design, if you want to store dual bounds + depth (2x 2-byte score, 2x 1-byte depth, 2-byte move, 4-byte key). Then you have 4 bytes left in a 64-byte bucket, which can be used for 4 aging fields (while the 5th entry is always-replace, and so does not need an aging field).
Good answer!

Regards,

Forrest
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: When the whole key is stored

Post by Zenmastur »

sje wrote:
Zenmastur wrote:That I could understand if most chess programs were written in Lisp OR they had dynamically re-sizeable TT. Most don't.
"Most" ≠ "All"

The number of signature bits recoverable from a table entry given its contents and its address governs the false positive rate when all other factors are held equal.

40 bits will be enough for storing a moderately sized opening book...

What do you consider moderate in size for a book?
sje wrote:... 64 bits will not be enough for calculating perft(13) and higher...
Not sure why you would need a TT to calculate perft, but if you do need one, perft(8) has ~40% chance of having a collision with a 64-bit key. Perft(13) will have a collision rate of about 0.1. Having 10% of your entries over-written doesn't sound like a usable system to me! :lol:
sje wrote:It's not necessary that the length of a transposition table be a power of two. Another approach is to have the entry count of the table be some prime p and the address of an entry given by the value key mod p. For some years now, the cost of a division isn't much different than the cost of a mask and shift; using key mod p can be an improvement because it allows for better utilization of available memory. After all, there are many primes from which to choose.
How much of a difference is not much?


Regards,

Forrest
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: When the whole key is stored

Post by Zenmastur »

syzygy wrote:
sje wrote:It's not necessary that the length of a transposition table be a power of two. Another approach is to have the entry count of the table be some prime p and the address of an entry given by the value key mod p. For some years now, the cost of a division isn't much different than the cost of a mask and shift; using key mod p can be an improvement because it allows for better utilization of available memory. After all, there are many primes from which to choose.
No added benefit in requiring p to be prime.
Agreed!

But I would like to know how much of a difference there is between mod and shift/mask.
syzygy wrote:
Zenmastur wrote:So I'm still left wondering why they choose to waste memory?
The easy answer is that they do that because it's easy.

Hash entries have a fixed size indepedent of the size of the hash table (i.e. independent of the number of bits used for indexing). So it does not make sense to eliminate exactly those bits that are used for indexing, unless the size of your hash table is fixed. Also keep in mind that most want hash entries to be 16 bytes, so that 4 entries fit exactly in a cacheline. Hash entries of 12 bytes are less efficient to access.
To the first statement, that's one of things I was wondering about, to the second statement, I guess that depends on how you measure efficiency.
syzygy wrote: ...Storing 32 bits of the key in the hash entry and using the rest for indexing into the table seems to be the best compromise to me, also because nowadays you probably use most of the other 32 bits for indexing into the table anyway.
Agreed.

Regards,

Forrest
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: When the whole key is stored

Post by sje »

Symbolic's default book has 481,583 entries.

Collisions happen all the time. They aren't much of a problem. False positive matches can be a problem, and this problem is greatly mitigated by long signatures. Symbolic (and Oscar) use 128 bit signatures, and this drives the probability of a false positive to practically nothing.

The use of a prime is a tradition first employed with the idea that a relatively large prime is unlikely to accidentally be a dominating factor of the signature generation function.

http://stackoverflow.com/questions/1145 ... er-modulus
User avatar
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: When the whole key is stored

Post by Evert »

Zenmastur wrote: Not sure why you would need a TT to calculate perft,
Of course you don't. You don't need one to play chess either.
Sure helps to speed things up though.
but if you do need one, perft(8) has ~40% chance of having a collision with a 64-bit key. Perft(13) will have a collision rate of about 0.1.
How do you get those numbers?
Anyway, Steven is probably the guy who knows the most about efficiently doing perft calculations in the world. ;)
Having 10% of your entries over-written doesn't sound like a usable system to me! :lol:
Why?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: When the whole key is stored

Post by bob »

Zenmastur wrote:
syzygy wrote:
sje wrote:It's not necessary that the length of a transposition table be a power of two. Another approach is to have the entry count of the table be some prime p and the address of an entry given by the value key mod p. For some years now, the cost of a division isn't much different than the cost of a mask and shift; using key mod p can be an improvement because it allows for better utilization of available memory. After all, there are many primes from which to choose.
No added benefit in requiring p to be prime.
Agreed!

But I would like to know how much of a difference there is between mod and shift/mask.
syzygy wrote:
Zenmastur wrote:So I'm still left wondering why they choose to waste memory?
The easy answer is that they do that because it's easy.

Hash entries have a fixed size indepedent of the size of the hash table (i.e. independent of the number of bits used for indexing). So it does not make sense to eliminate exactly those bits that are used for indexing, unless the size of your hash table is fixed. Also keep in mind that most want hash entries to be 16 bytes, so that 4 entries fit exactly in a cacheline. Hash entries of 12 bytes are less efficient to access.
To the first statement, that's one of things I was wondering about, to the second statement, I guess that depends on how you measure efficiency.
syzygy wrote: ...Storing 32 bits of the key in the hash entry and using the rest for indexing into the table seems to be the best compromise to me, also because nowadays you probably use most of the other 32 bits for indexing into the table anyway.
Agreed.

Regards,

Forrest
why don't you simply test? replace the usual

htable = signature & mask;

by

htable = signature % mask;

then search to a fixed depth that takes maybe 5 minutes total. Run both and you can determine whether there is any cost or not. Note that you don't need to change anything else since the above two are equivalent in their final result. So a one line change and a benchmark run.
mvk
Posts: 589
Joined: Tue Jun 04, 2013 10:15 pm

Re: When the whole key is stored

Post by mvk »

bob wrote:why don't you simply test? replace the usual

htable = signature & mask;

by

htable = signature % mask;

then search to a fixed depth that takes maybe 5 minutes total. Run both and you can determine whether there is any cost or not. Note that you don't need to change anything else since the above two are equivalent in their final result. So a one line change and a benchmark run.
Better do:

htable = signature % (mask + 1)

So that the search tree is identical as well. Or precompute 'mask + 1' outside of the critical path.
[Account deleted]
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: When the whole key is stored

Post by bob »

mvk wrote:
bob wrote:why don't you simply test? replace the usual

htable = signature & mask;

by

htable = signature % mask;

then search to a fixed depth that takes maybe 5 minutes total. Run both and you can determine whether there is any cost or not. Note that you don't need to change anything else since the above two are equivalent in their final result. So a one line change and a benchmark run.


Better do:

htable = signature % (mask + 1)

So that the search tree is identical as well. Or precompute 'mask + 1' outside of the critical path.
Should have used size rather than mask when doing the mod, was not thinking..
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

key mod p

Post by sje »

For key mod p to work well, it's not necessary that p be prime. But p should be odd (and so coprime with 2^N) so that all N bits of the key get a chance to contribute to the entry address generation.
User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: When the whole key is stored

Post by hgm »

Divisions take some 30 clock cycles, compared to 1-3 for shifts and multiplies. And this is not only latency, but even the throughput is only 1 per 30 cycles, as the divide unit is not pipelined.

But I guess you could pre-calculate (1<<N)/size, so that you can do the modulo as key - ((((1<<N)/size)*key)>>N)*size with sufficiently large N to not lose precision.

I would be surprised if an optimizer would be smart enough to do that, though, if you just write key%size.