DTS and the call stack

Discussion of chess software programming and technical issues.

Moderators: bob, hgm, Harvey Williamson

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
bob
Posts: 20687
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: DTS and the call stack

Post by bob » Fri Jun 22, 2007 8:42 pm

Pradu wrote:
jswaff wrote:
bob wrote: As I mentioned, the parent jumps into ThreadWait() and then jumps right back into the search stuff. This lets the parent finish first, return to ThreadWait() and since the current split point has not been completed, the parent then can be used to help the other threads complete their sub-trees...
How do you know the parent will finish first? I can see why you would want it to finish first, but not how you know it will.

--
James
Sometimes it will sometimes it won't. It just turns into another helper thread (only to the other threads at the split point though) when the parent finishes first and has no more work at the split-point.
In crafty it won't just help other threads at that split point, it will help other threads that need help at _any_ split point...

jswaff

Re: DTS and the call stack

Post by jswaff » Fri Jun 22, 2007 8:44 pm

Pradu wrote:
jswaff wrote:
bob wrote: As I mentioned, the parent jumps into ThreadWait() and then jumps right back into the search stuff. This lets the parent finish first, return to ThreadWait() and since the current split point has not been completed, the parent then can be used to help the other threads complete their sub-trees...
How do you know the parent will finish first? I can see why you would want it to finish first, but not how you know it will.

--
James
Sometimes it will sometimes it won't. It just turns into another helper thread (only to the other threads at the split point though) when the parent finishes first and has no more work at the split-point.
Yes, that's probably what he meant. I must've misunderstood him when he said "this let's the parent finish first."

So, it seems the answer to my original question is:
- the parent must "return back up" since its call stack is the only one set up for it.
- if the parent finishes and one or more helpers are still searching, join in and help them out.

Like Hyatt said, that's not "true DTS," but it's close enough. :)

--
James

jswaff

Re: DTS and the call stack

Post by jswaff » Fri Jun 22, 2007 8:53 pm

bob wrote:
jswaff wrote:
bob wrote: As I mentioned, the parent jumps into ThreadWait() and then jumps right back into the search stuff. This lets the parent finish first, return to ThreadWait() and since the current split point has not been completed, the parent then can be used to help the other threads complete their sub-trees...
How do you know the parent will finish first? I can see why you would want it to finish first, but not how you know it will.

--
James
I don't know it will finish first. But I have to at least handle that case. take the following scenario...

I start a search with one thread active, three hung in ThreadWait() waiting for work. As soon as I find a good split point by the YBW criterion, I split the current search into four threads and give them each work to do. Three of the threads are already hung in ThreadWait() waiting on work so they charge off immediately. The original thread then calls ThreadWait() which then lets it charge off as the fourth thread doing the search. Now if the first thread finishes first, it still returns to ThreadWait() where other active threads can give it work. If it finishes last, it just cleans up and goes back to the original search TREE struct which had no threads helping it.

Hard to explain, not that hard to make work, but the data structures are somewhat complicated... This way no thread has to stop and wait, they are always in ThreadWait() which instantly gets them new assignments when they re-enter that code...
That clears it up, I just misunderstood your earlier post when you said "this let's the parent finish first." Makes perfect sense as you just explained it.

I like that approach. Tord is doing about the same thing in Viper (using YBW), so I think I will too. Much better than trying to rewrite my search to do "true DTS."

Thanks .. for the paper, Crafty, and for answering my questions. :)
--
James

Pradu
Posts: 287
Joined: Sat Mar 11, 2006 2:19 am
Location: Atlanta, GA
Contact:

Re: DTS and the call stack

Post by Pradu » Fri Jun 22, 2007 9:01 pm

bob wrote:
Pradu wrote:
jswaff wrote:
bob wrote: As I mentioned, the parent jumps into ThreadWait() and then jumps right back into the search stuff. This lets the parent finish first, return to ThreadWait() and since the current split point has not been completed, the parent then can be used to help the other threads complete their sub-trees...
How do you know the parent will finish first? I can see why you would want it to finish first, but not how you know it will.

--
James
Sometimes it will sometimes it won't. It just turns into another helper thread (only to the other threads at the split point though) when the parent finishes first and has no more work at the split-point.
In crafty it won't just help other threads at that split point, it will help other threads that need help at _any_ split point...
How does one handle that? What happens if the parent thread T1 picks up a split point S2 for a thread that isn't under its original split point S1 and the rest of the threads at S1 finish the search. Does further search (backing up) at S1 have to wait for the parent thread to finish it's search at S2? Wouldn't this waste time by not doing precious bounds updating for the parent(s) of S1 which might cutoff the search of S2 or improve bounds on S2?

jswaff

Re: DTS and the call stack

Post by jswaff » Fri Jun 22, 2007 9:45 pm

Pradu wrote:
bob wrote:
Pradu wrote:
jswaff wrote:
bob wrote: As I mentioned, the parent jumps into ThreadWait() and then jumps right back into the search stuff. This lets the parent finish first, return to ThreadWait() and since the current split point has not been completed, the parent then can be used to help the other threads complete their sub-trees...
How do you know the parent will finish first? I can see why you would want it to finish first, but not how you know it will.

--
James
Sometimes it will sometimes it won't. It just turns into another helper thread (only to the other threads at the split point though) when the parent finishes first and has no more work at the split-point.
In crafty it won't just help other threads at that split point, it will help other threads that need help at _any_ split point...
How does one handle that? What happens if the parent thread T1 picks up a split point S2 for a thread that isn't under its original split point S1 and the rest of the threads at S1 finish the search. Does further search (backing up) at S1 have to wait for the parent thread to finish it's search at S2? Wouldn't this waste time by not doing precious bounds updating for the parent(s) of S1 which might cutoff the search of S2 or improve bounds on S2?
Yeah, I was thinking the same thing. Seems you could really "hold things up" if the owner goes off helping everyone else out, meanwhile the search he started is complete and he needs to go return the value back up.

Haven't thought this through completely, but maybe you could signal to the owner that is off helping at another split point that all his helpers are done, so he can gracefully bow out and go clean things up at his own split point.
That way the owner can be productive, and still returns to finish up what he started.

This is getting interesting. :)
--
James

bob
Posts: 20687
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: DTS and the call stack

Post by bob » Sat Jun 23, 2007 12:42 am

Pradu wrote:
bob wrote:
Pradu wrote:
jswaff wrote:
bob wrote: As I mentioned, the parent jumps into ThreadWait() and then jumps right back into the search stuff. This lets the parent finish first, return to ThreadWait() and since the current split point has not been completed, the parent then can be used to help the other threads complete their sub-trees...
How do you know the parent will finish first? I can see why you would want it to finish first, but not how you know it will.

--
James
Sometimes it will sometimes it won't. It just turns into another helper thread (only to the other threads at the split point though) when the parent finishes first and has no more work at the split-point.
In crafty it won't just help other threads at that split point, it will help other threads that need help at _any_ split point...
How does one handle that? What happens if the parent thread T1 picks up a split point S2 for a thread that isn't under its original split point S1 and the rest of the threads at S1 finish the search. Does further search (backing up) at S1 have to wait for the parent thread to finish it's search at S2? Wouldn't this waste time by not doing precious bounds updating for the parent(s) of S1 which might cutoff the search of S2 or improve bounds on S2?
It's simply the way I designed things to work with the data structures. The point is that any thread can work anywhere in the tree, with no problems, since the data structures are replicated at each split point. For example, assuming 4 threads A, B, C and D,
A and B can split together, then C can help A and D can help B. If A finishes first, it can help B, C, or D no matter where they are in the tree...

Remember I am not doing a full DTS since I kept my recursive search for simplicity reasons. So once A splits with B, C and D, there is no way the tree can be split anywhere _above_ that split point, since nobody can search there until the split point has been finished. So everyone works below that split point, but there can be many split points below that one. In fact, I on occasion have as many as 60-100 active split points in some endings....

bob
Posts: 20687
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: DTS and the call stack

Post by bob » Sat Jun 23, 2007 12:43 am

jswaff wrote:
Pradu wrote:
bob wrote:
Pradu wrote:
jswaff wrote:
bob wrote: As I mentioned, the parent jumps into ThreadWait() and then jumps right back into the search stuff. This lets the parent finish first, return to ThreadWait() and since the current split point has not been completed, the parent then can be used to help the other threads complete their sub-trees...
How do you know the parent will finish first? I can see why you would want it to finish first, but not how you know it will.

--
James
Sometimes it will sometimes it won't. It just turns into another helper thread (only to the other threads at the split point though) when the parent finishes first and has no more work at the split-point.
In crafty it won't just help other threads at that split point, it will help other threads that need help at _any_ split point...
How does one handle that? What happens if the parent thread T1 picks up a split point S2 for a thread that isn't under its original split point S1 and the rest of the threads at S1 finish the search. Does further search (backing up) at S1 have to wait for the parent thread to finish it's search at S2? Wouldn't this waste time by not doing precious bounds updating for the parent(s) of S1 which might cutoff the search of S2 or improve bounds on S2?
Yeah, I was thinking the same thing. Seems you could really "hold things up" if the owner goes off helping everyone else out, meanwhile the search he started is complete and he needs to go return the value back up.

Haven't thought this through completely, but maybe you could signal to the owner that is off helping at another split point that all his helpers are done, so he can gracefully bow out and go clean things up at his own split point.
That way the owner can be productive, and still returns to finish up what he started.

This is getting interesting. :)
--
James
Remember, once a split is done at ply N, splits can only be done at ply N+1 and beyond. Which means any of 'em are required to complete before we can complete ply N. Think about it for a minute in the context of YBW and it will become clearer...

bob
Posts: 20687
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: DTS and the call stack

Post by bob » Sat Jun 23, 2007 12:46 am

jswaff wrote:
bob wrote:
jswaff wrote:
bob wrote: As I mentioned, the parent jumps into ThreadWait() and then jumps right back into the search stuff. This lets the parent finish first, return to ThreadWait() and since the current split point has not been completed, the parent then can be used to help the other threads complete their sub-trees...
How do you know the parent will finish first? I can see why you would want it to finish first, but not how you know it will.

--
James
I don't know it will finish first. But I have to at least handle that case. take the following scenario...

I start a search with one thread active, three hung in ThreadWait() waiting for work. As soon as I find a good split point by the YBW criterion, I split the current search into four threads and give them each work to do. Three of the threads are already hung in ThreadWait() waiting on work so they charge off immediately. The original thread then calls ThreadWait() which then lets it charge off as the fourth thread doing the search. Now if the first thread finishes first, it still returns to ThreadWait() where other active threads can give it work. If it finishes last, it just cleans up and goes back to the original search TREE struct which had no threads helping it.

Hard to explain, not that hard to make work, but the data structures are somewhat complicated... This way no thread has to stop and wait, they are always in ThreadWait() which instantly gets them new assignments when they re-enter that code...
That clears it up, I just misunderstood your earlier post when you said "this let's the parent finish first." Makes perfect sense as you just explained it.

I like that approach. Tord is doing about the same thing in Viper (using YBW), so I think I will too. Much better than trying to rewrite my search to do "true DTS."

Thanks .. for the paper, Crafty, and for answering my questions. :)
--
James
More than welcome. It's an interesting project to get this to work right where everyone helps everyone and it scales to a reasonable number of processors (at least 8, preferably beyond, I've run on 16 and 32 with good results)...

jswaff

Re: DTS and the call stack

Post by jswaff » Sat Jun 23, 2007 1:15 am

bob wrote: Remember I am not doing a full DTS since I kept my recursive search for simplicity reasons. So once A splits with B, C and D, there is no way the tree can be split anywhere _above_ that split point, since nobody can search there until the split point has been finished. So everyone works below that split point, but there can be many split points below that one. In fact, I on occasion have as many as 60-100 active split points in some endings....
Bob, I'm missing something. In your DTS paper you describe in great detail how important it is to choose a good split point. Manohararajah's thesis talks about a global split point list that is consulted by an idle thread before it sends the HELP message.

You don't appear to be doing that. You check a couple of simple conditions, then go right into Thread( ) after the first move is searched. In Thread( ) it looks like you immediately try to launch a parallel search.

So, what am I missing? You seem to be doing more of a YBW search. (I'm looking at Crafty 19.8.)

--
James

bob
Posts: 20687
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: DTS and the call stack

Post by bob » Sat Jun 23, 2007 3:47 am

jswaff wrote:
bob wrote: Remember I am not doing a full DTS since I kept my recursive search for simplicity reasons. So once A splits with B, C and D, there is no way the tree can be split anywhere _above_ that split point, since nobody can search there until the split point has been finished. So everyone works below that split point, but there can be many split points below that one. In fact, I on occasion have as many as 60-100 active split points in some endings....
Bob, I'm missing something. In your DTS paper you describe in great detail how important it is to choose a good split point. Manohararajah's thesis talks about a global split point list that is consulted by an idle thread before it sends the HELP message.

You don't appear to be doing that. You check a couple of simple conditions, then go right into Thread( ) after the first move is searched. In Thread( ) it looks like you immediately try to launch a parallel search.

So, what am I missing? You seem to be doing more of a YBW search. (I'm looking at Crafty 19.8.)

--
James
That's correct. But let's back up a bit. In the late 70's and early 80's the best parallel search algorithm was called "PVS" (for principle variation search). There all the processors always "stuck together" and worked at the same split point. The idea was to step down in the tree following the left-most (or PV) nodes until some satisfactory depth was reached, then the tree was split at that point. Once every branch there was searched in parallel (and as processors ran out of moves to search they just dropped into an idle state) then the entire "wad" of processors backed up one ply and split there. Until you landed at the root and split there.

The problem was all the accumulated idle time when everyone was waiting on the last processor to search the last move at a split point.

DTS broke with this tradition with the "ants at a picnic" type approach where anybody could help anybody anywhere. DTS went a lot further since Cray Blitz was non-recursive, which gave it a lot of flexibility in choosing a split point since whenever an idle processor wanted to help, it could look at the state of the tree for all busy processors and pick the best/most-promising place to split.

Crafty uses the YBW type approach, which is still pretty good at choosing split points since we have to search one move at a node before doing any parallel searching. Most of the time if I get a fail high, it is on the first branch, meaning that the YBW approach does pretty well overall. But one thing the crafty approach does is that there is no requirement that all processors "stick together". A can help B, and if B finishes its part first, it can help A finish up its part, and so forth. So there is very little wait time, which is why the idle time is so very low in Crafty's search threads and why the NPS scales so well. NPS won't scale well if threads are not doing useful work...

So Crafty has a little of DTS, and a YBW flavor to boot...

Post Reply