I have tried several implementations to read/write in my TT while doing a perft with multiple threads. I have correct results up to depth 11, but when going too deep, I get slightly off results.
Locks might work, I haven't tried to compute perft 12 with them, but the performance hit is too significant, so I want to avoid this (unless you have a fast approach?).
My TT entry is just 2 64bits ints, one for the zobrist key, and one for the nodes count. The suggested lockless implementation on the chess wiki does not work => key = zobrist ^ nodes
What is commonly used at lower depths, with hopefully good performance results?
TT false hits at lower depths
Moderator: Ras
-
- Posts: 36
- Joined: Fri Dec 29, 2023 4:47 pm
- Location: Belgium
- Full name: thomas albert
-
- Posts: 729
- Joined: Mon Apr 19, 2010 7:07 pm
- Location: Sweden
- Full name: Peter Osterlund
Re: TT false hits at lower depths
Lockless hashing works perfectly if you implement it correctly, for example using C++ memory_order_relaxed atomics. See for example: viewtopic.php?p=671589#p671589 Be warned though that there were quite a bit of confused posts in that thread.chessbit wrote: ↑Mon Sep 29, 2025 3:26 pm I have tried several implementations to read/write in my TT while doing a perft with multiple threads. I have correct results up to depth 11, but when going too deep, I get slightly off results.
Locks might work, I haven't tried to compute perft 12 with them, but the performance hit is too significant, so I want to avoid this (unless you have a fast approach?).
My TT entry is just 2 64bits ints, one for the zobrist key, and one for the nodes count. The suggested lockless implementation on the chess wiki does not work => key = zobrist ^ nodes
What is commonly used at lower depths, with hopefully good performance results?
I suspect the problem in your case is that you don't take the remaining depth into account. In my perft program I handle that by having separate hash tables for each depth. This way it is also easier to control how much hash memory to use for different depths. An alternative is to instead use a single hash table but also store the depth somehow in the hash entries.
-
- Posts: 36
- Joined: Fri Dec 29, 2023 4:47 pm
- Location: Belgium
- Full name: thomas albert
Re: TT false hits at lower depths
Great thanks for the reply, I'll adapt my implementation tomorrow with atomics.petero2 wrote: ↑Mon Sep 29, 2025 7:50 pmLockless hashing works perfectly if you implement it correctly, for example using C++ memory_order_relaxed atomics. See for example: viewtopic.php?p=671589#p671589 Be warned though that there were quite a bit of confused posts in that thread.chessbit wrote: ↑Mon Sep 29, 2025 3:26 pm I have tried several implementations to read/write in my TT while doing a perft with multiple threads. I have correct results up to depth 11, but when going too deep, I get slightly off results.
Locks might work, I haven't tried to compute perft 12 with them, but the performance hit is too significant, so I want to avoid this (unless you have a fast approach?).
My TT entry is just 2 64bits ints, one for the zobrist key, and one for the nodes count. The suggested lockless implementation on the chess wiki does not work => key = zobrist ^ nodes
What is commonly used at lower depths, with hopefully good performance results?
I suspect the problem in your case is that you don't take the remaining depth into account. In my perft program I handle that by having separate hash tables for each depth. This way it is also easier to control how much hash memory to use for different depths. An alternative is to instead use a single hash table but also store the depth somehow in the hash entries.
My TT is declared as a matrix [depth][2^28], but I haven't really played with memory yet, I was more focused on having it work first. I hope that with better memory allocation, I would be able to increase the size.
What do you mean with the remaining depth? Depth 1 is never stored in the TT because it's faster to count the moves than it is to use the TT.
-
- Posts: 5743
- Joined: Tue Feb 28, 2012 11:56 pm
Re: TT false hits at lower depths
If you do a perft 15 and you are in a node that is 10 ply from the root position, then remaining depth is 5. So for that position you still need to calculate perft 5.chessbit wrote: ↑Mon Sep 29, 2025 9:26 pmMy TT is declared as a matrix [depth][2^28], but I haven't really played with memory yet, I was more focused on having it work first. I hope that with better memory allocation, I would be able to increase the size.
What do you mean with the remaining depth? Depth 1 is never stored in the TT because it's faster to count the moves than it is to use the TT.
You might encounter the same position at different plies from the root. To avoid this, you could include either the remaining depth (or alternatively/equivalently the distance to the root) in the hash key. But then you cannot use the depth in your decision whether to replace a TT entry or not.
Since your TT is a two-dimensional array [depth][2^28], it seems you already have separate tables for different remaining depths. (In this case replace-always is probably the best replacement strategy.)
Note that with 2^28 entries and a 64-bit key, once the TT is full, there will be a false hit every 2^(64-28) = 2^36 = 68.7 billion probes into that part of the table which should not have hit. So 64 bits is not going to be enough for higher perfts.
-
- Posts: 36
- Joined: Fri Dec 29, 2023 4:47 pm
- Location: Belgium
- Full name: thomas albert
Re: TT false hits at lower depths
I see, indeed that makes sense. That might be my problem. Thanks for the clarification.syzygy wrote: ↑Mon Sep 29, 2025 10:14 pmIf you do a perft 15 and you are in a node that is 10 ply from the root position, then remaining depth is 5. So for that position you still need to calculate perft 5.chessbit wrote: ↑Mon Sep 29, 2025 9:26 pmMy TT is declared as a matrix [depth][2^28], but I haven't really played with memory yet, I was more focused on having it work first. I hope that with better memory allocation, I would be able to increase the size.
What do you mean with the remaining depth? Depth 1 is never stored in the TT because it's faster to count the moves than it is to use the TT.
You might encounter the same position at different plies from the root. To avoid this, you could include either the remaining depth (or alternatively/equivalently the distance to the root) in the hash key. But then you cannot use the depth in your decision whether to replace a TT entry or not.
Since your TT is a two-dimensional array [depth][2^28], it seems you already have separate tables for different remaining depths. (In this case replace-always is probably the best replacement strategy.)
Note that with 2^28 entries and a 64-bit key, once the TT is full, there will be a false hit every 2^(64-28) = 2^36 = 68.7 billion probes into that part of the table which should not have hit. So 64 bits is not going to be enough for higher perfts.
Is it common to use 128 bits keys for higher perfts?
-
- Posts: 5743
- Joined: Tue Feb 28, 2012 11:56 pm
Re: TT false hits at lower depths
Yes. For example, Steven Edwards used 128-bit keys in his attempt to calculate perft 14:chessbit wrote: ↑Tue Sep 30, 2025 6:16 pmI see, indeed that makes sense. That might be my problem. Thanks for the clarification.syzygy wrote: ↑Mon Sep 29, 2025 10:14 pmIf you do a perft 15 and you are in a node that is 10 ply from the root position, then remaining depth is 5. So for that position you still need to calculate perft 5.chessbit wrote: ↑Mon Sep 29, 2025 9:26 pmMy TT is declared as a matrix [depth][2^28], but I haven't really played with memory yet, I was more focused on having it work first. I hope that with better memory allocation, I would be able to increase the size.
What do you mean with the remaining depth? Depth 1 is never stored in the TT because it's faster to count the moves than it is to use the TT.
You might encounter the same position at different plies from the root. To avoid this, you could include either the remaining depth (or alternatively/equivalently the distance to the root) in the hash key. But then you cannot use the depth in your decision whether to replace a TT entry or not.
Since your TT is a two-dimensional array [depth][2^28], it seems you already have separate tables for different remaining depths. (In this case replace-always is probably the best replacement strategy.)
Note that with 2^28 entries and a 64-bit key, once the TT is full, there will be a false hit every 2^(64-28) = 2^36 = 68.7 billion probes into that part of the table which should not have hit. So 64 bits is not going to be enough for higher perfts.
Is it common to use 128 bits keys for higher perfts?
viewtopic.php?p=667349#p667349
-
- Posts: 36
- Joined: Fri Dec 29, 2023 4:47 pm
- Location: Belgium
- Full name: thomas albert
Re: TT false hits at lower depths
FYI I have tried with atomics but it didn't help. I have tried two different sets of zobrist keys, but I suspect my hashes may not be good enough. I guess you would run into the same issue at lower perfts? Maybe your zobrist keys are better and so still work at perft 12, maybe even 13? I'm a bit out of ideas otherwise, except trying 128 bits hashes, but that still doesn't solve the problem of collisions, which would yield potentially wrong results at even lower depths... which sucks.petero2 wrote: ↑Mon Sep 29, 2025 7:50 pmLockless hashing works perfectly if you implement it correctly, for example using C++ memory_order_relaxed atomics. See for example: viewtopic.php?p=671589#p671589 Be warned though that there were quite a bit of confused posts in that thread.chessbit wrote: ↑Mon Sep 29, 2025 3:26 pm I have tried several implementations to read/write in my TT while doing a perft with multiple threads. I have correct results up to depth 11, but when going too deep, I get slightly off results.
Locks might work, I haven't tried to compute perft 12 with them, but the performance hit is too significant, so I want to avoid this (unless you have a fast approach?).
My TT entry is just 2 64bits ints, one for the zobrist key, and one for the nodes count. The suggested lockless implementation on the chess wiki does not work => key = zobrist ^ nodes
What is commonly used at lower depths, with hopefully good performance results?
I suspect the problem in your case is that you don't take the remaining depth into account. In my perft program I handle that by having separate hash tables for each depth. This way it is also easier to control how much hash memory to use for different depths. An alternative is to instead use a single hash table but also store the depth somehow in the hash entries.
-
- Posts: 5743
- Joined: Tue Feb 28, 2012 11:56 pm
Re: TT false hits at lower depths
Unless your Zobrist keys have been generated with a really bad random generator, they are almost certainly good enough.chessbit wrote: ↑Wed Oct 01, 2025 8:39 pmFYI I have tried with atomics but it didn't help. I have tried two different sets of zobrist keys, but I suspect my hashes may not be good enough. I guess you would run into the same issue at lower perfts? Maybe your zobrist keys are better and so still work at perft 12, maybe even 13? I'm a bit out of ideas otherwise, except trying 128 bits hashes, but that still doesn't solve the problem of collisions, which would yield potentially wrong results at even lower depths... which sucks.
But if you're not sure, just take numbers from https://www.random.org/bytes/
If you have no errors when running single-threaded, then your problem is not lack of randomness but the interaction between threads.
You could try this. Assuming your key is 128 bits (16 bytes) and your perft value is 64 bits (8 bytes), arrange them as follows:
5 bytes key
3 bytes perft value
---
5 bytes key
3 bytes perft value
---
6 bytes key
2 bytes perft value
Read this entry with 3 64-bit reads, then check that the key bytes match the current position's key. If it does, you can use the perft value.
Writes work the same. Just prepare the values and write in one go.
Perhaps you want to use inline assembly to read/write, so the compiler does not spread around the read and write operations.
-
- Posts: 5743
- Joined: Tue Feb 28, 2012 11:56 pm
Re: TT false hits at lower depths
What should also work, without interleaving the key and they perft value bytes:syzygy wrote: ↑Wed Oct 01, 2025 9:14 pmUnless your Zobrist keys have been generated with a really bad random generator, they are almost certainly good enough.chessbit wrote: ↑Wed Oct 01, 2025 8:39 pmFYI I have tried with atomics but it didn't help. I have tried two different sets of zobrist keys, but I suspect my hashes may not be good enough. I guess you would run into the same issue at lower perfts? Maybe your zobrist keys are better and so still work at perft 12, maybe even 13? I'm a bit out of ideas otherwise, except trying 128 bits hashes, but that still doesn't solve the problem of collisions, which would yield potentially wrong results at even lower depths... which sucks.
But if you're not sure, just take numbers from https://www.random.org/bytes/
If you have no errors when running single-threaded, then your problem is not lack of randomness but the interaction between threads.
You could try this. Assuming your key is 128 bits (16 bytes) and your perft value is 64 bits (8 bytes), arrange them as follows:
5 bytes key
3 bytes perft value
---
5 bytes key
3 bytes perft value
---
6 bytes key
2 bytes perft value
Read this entry with 3 64-bit reads, then check that the key bytes match the current position's key. If it does, you can use the perft value.
Writes work the same. Just prepare the values and write in one go.
Perhaps you want to use inline assembly to read/write, so the compiler does not spread around the read and write operations.
To read:
- read key (2 x 64-bit reads)
- if a match, read perft value and then read the key again.
- if still a match, use the perft value.
To write:
- clear the key to 0 (2 x 64-bit writes) (I guess clearing half the key should be suffiicient)
- write the perft value, then write the key.
Do this with inline assembly to make sure memory locations are actually read and written in the intended error.
(This can be achieved with atomics, but I am too lazy to check what kind of atomic semantics is needed.)
If someone sees how this could go wrong, please let me know.
-
- Posts: 928
- Joined: Sun Nov 19, 2006 9:16 pm
- Location: Russia
- Full name: Aleks Peshkov
Re: TT false hits at lower depths
Your problem is not collisions. You should store depth in TT record. "position startpos moves g1f3 g8f6 f3g1 f6g8" is legal transposition and have different perft from startpos, because it has remaining depth-4.
'perft 2' and 'perft 6' are different perft numbers of the same position, but you store them in the same slot in TT.
'perft 2' and 'perft 6' are different perft numbers of the same position, but you store them in the same slot in TT.