As I finally got to implementing TT, I decided to forego generating pseudorandom values for castling rights. I instead update a "dirty" bitboard, the relevant bits of which contribute to the lower 16 bits of the key. This guarantees that any two positions that only differ in their castling rights map to separate buckets:
I measured a ~5% performance gain in perft(7) from SP. I understand that there will be false negatives in more advanced positions, but those should seemingly be quite rare.
In micro-Max I use index = (boardKey + (stm + 128)*epSqr) & hashMask, where stm = 8 or 16, and epSqr = 128 if there are no e.p. rights. If the rest of the keys is random, this doesn't increase the chance for collisions. Micro-Max doesn't bother to hash the castling rights.
According to this, rotll() is compiled into an actual rotate instruction. The performance gain isn't as great as I'd hoped (and falsely advertised), but it is noticeable.
hgm wrote: ↑Sun Apr 26, 2020 5:09 pm
In micro-Max I use index = (boardKey + (stm + 128)*epSqr) & hashMask, where stm = 8 or 16, and epSqr = 128 if there are no e.p. rights. If the rest of the keys is random, this doesn't increase the chance for collisions. Micro-Max doesn't bother to hash the castling rights.
Thanks, it's good to know that my approach isn't completely off.
Now I need to somehow get rid of the ternary. I tried using ep_square = 0xF8 for none, adding 8, and then clearing the upper 4 bits, but that doesn't seem to work.
I use the Polyglot hash values for my TT. It saves some space since I don't need another dedicated array for the hash values and the current position's TT hash value can be used directly when looking up openings from a Polyglot book.
I maintain castling rights as 4 bits (low order in a byte), setting or clearing them as needed.
I pre-calculate (at program start) the 16 possible castling hash values (index 0 = zero) and use the castling rights bits as an index into the array of 16 hash values. No loops or if statements are required, just a memory fetch.
MOBMAT wrote: ↑Mon Apr 27, 2020 8:02 am
I use the Polyglot hash values for my TT. It saves some space since I don't need another dedicated array for the hash values and the current position's TT hash value can be used directly when looking up openings from a Polyglot book.
I maintain castling rights as 4 bits (low order in a byte), setting or clearing them as needed.
I pre-calculate (at program start) the 16 possible castling hash values (index 0 = zero) and use the castling rights bits as an index into the array of 16 hash values. No loops or if statements are required, just a memory fetch.
I couldn't think of a faster way.
If built-in opening book lookup is required, there is indeed no alternative. The speedup is due to the transparent dirty BB replacing the 4-bit castling rights.
I realise that using the macro as it is might be slower than lookup, but consider incremental updates, where a single value to be added to the key would be trivially calculated from the piece/colour/vector.