Shared hash table smp result

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Depth = 20 Results

Post by Daniel Shawul »

Actually, the implementation I am using is much simpler than you are describing. The *only* communication between threads is through the various hash tables and a flag to tell the threads when to stop.
I don't know what you are talking about now. Didn't you say you divide up the root move list??? The 1st processor searches the 1st,3rd,5th move...etc , and the second processor 2nd,4th,6th...etc. Those were your words not mine. Then I suggested that you use a lock to share the work as that will allow any processor to work on any move. Hardly rocket science. This is what you said
The 3rd and 4th threads are actually searching iteration depth+1, again with even/odd threads searching the even/odd moves at the root, which is why the average depth searched is a bit more than 12 in those cases. In principle I might speed the whole thing up further by having each of 3 threads search 1/3rd of the root moves and each of 4 threads searching 1/4th of the root moves, but that is more work to implement, and I haven't tried it yet.
So when I start a new iteration, I send the same alpha beta limits (+/- 15 from the previous iteration score) and same root move list to each thread. The threads each search independently using a normal PVS algorithm from that point onward until one of the threads either fails high or has searched all the root moves.
You are seriously confused sorry. I think you should back up read what you said before read what you are saying now and judge for yourself. You should really see by now how difficult it is without the YBW criteria. And please don't pretend I am doing some rocket science when I am not...
dchoman
Posts: 171
Joined: Wed Dec 28, 2011 8:44 pm
Location: United States

Re: Depth = 20 Results

Post by dchoman »

Daniel Shawul wrote:
You are seriously confused sorry. I think you should back up read what you said before read what you are saying now and judge for yourself. You should really see by now how difficult it is without the YBW criteria. And please don't pretend I am doing some rocket science when I am not...
No need to be impolite. Sometimes the written word alone can lead to surprising misunderstandings. I was just trying to say that I wanted to implement SMP without all the work of getting a full implementation working. This is not because I am scared of it, but because this is my hobby and I have limited time and I wanted something working that needed minimal debugging. It surprised me that the lazy approach worked as well as it did, which is what generated my first contribution to the lazy smp thread a couple of months ago. I thought the conventional wisdom was that shared hash is worth no gain... obviously it is worth something non-negligible.

Regarding inconsistency between my descriptions... again that may be a misunderstanding. Please read the rest of the thread you referenced. I posted that I had made a mistake in my implementation and alternating the root move as well in that fashion didn't work.

I went back to my previous implementation which is described in the previous lazy smp threads and above. Indeed, after the first move I do search the even/odd moves in even/odd threads, but each thread eventually does search all of the moves but that is hopefully much faster due to hash hits. I even said this again in the post above, so this is not inconsistent. The alternation of the moves after the first move is good for 5 - 10% improvement over just searching the all in the same order in each thread simultaneously. So it is a small optimization, probably because the other threads are going out of sync naturally, so alternating the moves after the first doesn't change that much.

In that last thread, I was trying to extend that alteration idea to the first move and failed, and I made the mistake of thinking it was a large improvement for a short while. I am very sorry to have dragged you into that. I did post a retraction relatively quickly in that thread.

- Dan
dchoman
Posts: 171
Joined: Wed Dec 28, 2011 8:44 pm
Location: United States

Re: Depth = 20 Results

Post by dchoman »

BTW, I absolutely agree that doing a division of labor inside the search of the root node, rather than just calling the root node search in all threads, should be even better. This is not my current implementation, although I am gaming it a bit by changing the move order in the even/odd threads.

To really do YBW at the root, I would need to divide up the work and wait for returns from the threads at that point, and my current implementation just doesn't do that. I could implement it, and I will certainly take a crack at it, but I can't test this aspect soon due to my work schedule (we are mid-semester here).

- Dan
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Depth = 20 Results

Post by Daniel Shawul »

No need to be impolite. Sometimes the written word alone can lead to surprising misunderstandings.
Sorry about that, I should have expressed myself better. I just felt frustrated at the fact we are not doing the same thing when you already said you search different moves and different depth to avoid redundancy. Doing that will increase idle time as you saw in my results, and also searching N moves with an open window is a killer! If you search all moves by all processors, the redundancy will be higher but it might actually be faster since you don't have too many open windows. ABDADA f.i shares work through hashtable communications by incrementing the number of processors working at a node. I am not keen on implementing that or any variation of it.
But if you do want to investigate this further, I will re-iterate what I have been saying before, following a simple incremental procedure. Otherwise it is too many variables to analyze at least for me.
a) Same depth + YBW
b) Same depth + No YBW
c) Different depth + YBW
d) Different depth + No YBW
For example you said in your previous post searching at same depth or not with 4 processors didn't matter much as both got 3.6sec. Is that really the case? Other questions can also be answered similarly.
dchoman
Posts: 171
Joined: Wed Dec 28, 2011 8:44 pm
Location: United States

Re: Depth = 20 Results

Post by dchoman »

I can understand how frustrating that was, and I am sorry for not explaining my approach more clearly. I'll definitely look at a sequence similar to what you suggest once I have a framework that can accomplish that more naturally.

- Dan