DTS and the call stack

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

jswaff

Re: DTS and the call stack

Post by jswaff »

bob wrote: Crafty uses the YBW type approach, which is still pretty good at choosing split points since we have to search one move at a node before doing any parallel searching. Most of the time if I get a fail high, it is on the first branch, meaning that the YBW approach does pretty well overall. But one thing the crafty approach does is that there is no requirement that all processors "stick together". A can help B, and if B finishes its part first, it can help A finish up its part, and so forth. So there is very little wait time, which is why the idle time is so very low in Crafty's search threads and why the NPS scales so well. NPS won't scale well if threads are not doing useful work...

So Crafty has a little of DTS, and a YBW flavor to boot...
Ok, by only attempting a split after the first move is tried you are already giving yourself a > 90% probability that the node is not a CUTOFF node. Makes sense.

However, there doesn't appear to be any decision making on the part of the idle processors. They are just "grabbed up" by the first busy processor that comes along. That seems to be very different than DTS, where an idle processor decides where to create a new split point (in the absence of an existing suitable SP). In DTS it is the idle processors that "drive things." They tell the busy processors when it's time to split. In YBW, the busy processors just try to split after every move searched.

Is that correct, or am I missing that decision logic in Crafty? If I am correct, why would you not allow the idle processor to choose where to create the split points based on information provided by the busy processors, as opposed to just getting snatched up by the first busy processor that comes along?

--
James
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: DTS and the call stack

Post by bob »

jswaff wrote:
bob wrote: Crafty uses the YBW type approach, which is still pretty good at choosing split points since we have to search one move at a node before doing any parallel searching. Most of the time if I get a fail high, it is on the first branch, meaning that the YBW approach does pretty well overall. But one thing the crafty approach does is that there is no requirement that all processors "stick together". A can help B, and if B finishes its part first, it can help A finish up its part, and so forth. So there is very little wait time, which is why the idle time is so very low in Crafty's search threads and why the NPS scales so well. NPS won't scale well if threads are not doing useful work...

So Crafty has a little of DTS, and a YBW flavor to boot...
Ok, by only attempting a split after the first move is tried you are already giving yourself a > 90% probability that the node is not a CUTOFF node. Makes sense.

However, there doesn't appear to be any decision making on the part of the idle processors. They are just "grabbed up" by the first busy processor that comes along. That seems to be very different than DTS, where an idle processor decides where to create a new split point (in the absence of an existing suitable SP). In DTS it is the idle processors that "drive things." They tell the busy processors when it's time to split. In YBW, the busy processors just try to split after every move searched.

Is that correct, or am I missing that decision logic in Crafty? If I am correct, why would you not allow the idle processor to choose where to create the split points based on information provided by the busy processors, as opposed to just getting snatched up by the first busy processor that comes along?

--
James
You are correct in how it works. Doing it any other way would be horribly complicated inside a recursive search. For example, you are at ply=15 out in the tree, and now you wish you could split at ply=5. How to update the call stack to make that work? Only solution I could come up with was to unwind the call stack by doing 10 returns to get me back to ply 5, then splitting there, and depending on the hash table to prevent this from being too expensive.

There's no easy way to allow an idle processor to choose where to split, because the busy process then has to somehow return to a different place when it gets back to that point, but the call stack is already set...