Im doing first atempts to prepar my engine to multi thread
I asume that i ned:
list of moes (local for the thread)
list of killers local
hash table global
but I doubt it if history table and pawn hash / eval hash are beter local or global
Can you help me?
multi thread
Moderator: Ras
-
bob
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: multi thread
All of that should be global. But for any tree state stuff, such as piece locations, board position, killer moves, PV, move lists, etc, you have to maintain that in a separate memory region for each thread...MDiaz wrote:Im doing first atempts to prepar my engine to multi thread
I asume that i ned:
list of moes (local for the thread)
list of killers local
hash table global
but I doubt it if history table and pawn hash / eval hash are beter local or global
Can you help me?
-
Zach Wegner
- Posts: 1922
- Joined: Thu Mar 09, 2006 12:51 am
- Location: Earth
Re: multi thread
Really? I think for at least evaluation hash and pawn hash, it would be better to be local. Their only point is a linear time savings, it can't affect the search tree (which would be exponential). And a relatively small one at that. They are also both localized. So if you have many processors writing constantly to them, not only are they going to fill it with a bunch of junk that wouldn't be useful to the other processors, you ALSO reduce the cache hit rate substantially.bob wrote:All of that should be global. But for any tree state stuff, such as piece locations, board position, killer moves, PV, move lists, etc, you have to maintain that in a separate memory region for each thread...
For a qsearch hash, it's debatable. Mine is local, and should be small enough to fit entirely in cache. But I could see people getting use out of a global one. IIRC Vincent said his is global.
-
bob
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: multi thread
My pawn hash has a hit rate of over 99%. That says "global" and not "local". My regular hash is also global for the same reason, it is a "global" deal that is used throughout the tree.Zach Wegner wrote:Really? I think for at least evaluation hash and pawn hash, it would be better to be local. Their only point is a linear time savings, it can't affect the search tree (which would be exponential). And a relatively small one at that. They are also both localized. So if you have many processors writing constantly to them, not only are they going to fill it with a bunch of junk that wouldn't be useful to the other processors, you ALSO reduce the cache hit rate substantially.bob wrote:All of that should be global. But for any tree state stuff, such as piece locations, board position, killer moves, PV, move lists, etc, you have to maintain that in a separate memory region for each thread...
For a qsearch hash, it's debatable. Mine is local, and should be small enough to fit entirely in cache. But I could see people getting use out of a global one. IIRC Vincent said his is global.
-
Zach Wegner
- Posts: 1922
- Joined: Thu Mar 09, 2006 12:51 am
- Location: Earth
Re: multi thread
OK, but you're going to get the same hit rate with one processor. So the locality for pawn hash could probably go one way or the other, but the cache coherency still matters. My pawn hash is pretty small anyways, which makes both factors lean towards per-cpu (I would think at least).bob wrote:My pawn hash has a hit rate of over 99%. That says "global" and not "local". My regular hash is also global for the same reason, it is a "global" deal that is used throughout the tree.
The locality issue was primarily in reference to the evaluation hash. Hit rates there are 20-40%.
-
Gian-Carlo Pascutto
- Posts: 1260
- Joined: Sat Dec 13, 2008 7:00 pm
Re: multi thread
If you think about this more, it should be obvious that the case in which the cache coherency is going to penalize you, is exactly the case where you really want a shared table. (A miss followed by a store)So the locality for pawn hash could probably go one way or the other, but the cache coherency still matters.
However, I would be worried about lock contention. You can ignore locking in the ttable, but for a pawnhash, a collision would have a much higher chance of hurting you, exactly because of the high hitrates.
Does Crafty lock the pawnhash?
-
bob
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: multi thread
I don't get burned by the coherency issue much. First, either the entire entry gets overwritten, or nothing is modified. An entry is pretty big at 32 bytes, so I get 2 per cache block, which is minimal interference. Also, I am a bit tricky here as well, in that when I do a probe and get a hit, I move that entry to local memory and so long as the pawn positions don't change, I continue to use that local entry and never go back and probe again.Zach Wegner wrote:OK, but you're going to get the same hit rate with one processor. So the locality for pawn hash could probably go one way or the other, but the cache coherency still matters. My pawn hash is pretty small anyways, which makes both factors lean towards per-cpu (I would think at least).bob wrote:My pawn hash has a hit rate of over 99%. That says "global" and not "local". My regular hash is also global for the same reason, it is a "global" deal that is used throughout the tree.
The locality issue was primarily in reference to the evaluation hash. Hit rates there are 20-40%.
On an 8 core system, the hit rate would be 8x worse since each core would calculate things that might have already been calculated on other cores, which doesn't seem like a good idea to me. Even worse, on a nehalem, with one L3 cache, you have 8 copies of the same information on an 8-core system, since each thread would be using local data rather than shared. Again that influences cache footprint.
My thinking is to try to minimize sharding read/write data across all threads, at least where you have a single piece of read/write data in the same block with a bunch of read-only variables. But beyond that, I leave it to the cache controllers to work things out, while I try to minimize duplicated data. That's a reason I prefer threads over processes, as with EGTBs, eugene's code is threaded, but with processes you get egtb cache duplicated as well, which greatly increases the total I/O rate and memory usage instead of using his LRU cache scheme which tries to reduce that...
-
bob
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: multi thread
It uses the "lockless" (XOR) approach, yes...Gian-Carlo Pascutto wrote:If you think about this more, it should be obvious that the case in which the cache coherency is going to penalize you, is exactly the case where you really want a shared table. (A miss followed by a store)So the locality for pawn hash could probably go one way or the other, but the cache coherency still matters.
However, I would be worried about lock contention. You can ignore locking in the ttable, but for a pawnhash, a collision would have a much higher chance of hurting you, exactly because of the high hitrates.
Does Crafty lock the pawnhash?
-
frankp
- Posts: 233
- Joined: Sun Mar 12, 2006 3:11 pm
Re: multi thread
Is there a reference to this approach? Hash locks are killing me.bob wrote:
It uses the "lockless" (XOR) approach, yes...
-
bob
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: multi thread
Yes. Same idea used in Crafty's main hash table. I believe there is an electronic copy on my web page at www.cis.uab.edu/facultyfrankp wrote:Is there a reference to this approach? Hash locks are killing me.bob wrote:
It uses the "lockless" (XOR) approach, yes...