bob wrote: Let's pick a single topic? DTS is a good algorithm. Complex to program compared to what I (and most others) do today. My point about DTS, however, was that the trees searched back then were easier to search than those today. They were MUCH more uniform, with no forward pruning of any kind, and no reductions except for null-move. More uniform trees are easier to search in parallel. So DTS provides a strong upper bound on what can be done. But when discussing Houdini and such, DTS doesn't belong in the conversation and is completely irrelevant.
We were discussing potential benefits of HT in general. Some tests suggest that Houdini 3 benefits from it, at least under certain conditions. I have pointed out that a hypothetical Cray Blitz on x86 hardware would have benefitted as well (given that a 12% or so nps increase from HT is quite reasonable).
My reference, btw was NOT to the DTS paper. It was to a paper on "parallel search of alpha/beta game trees" published in the Journal of Parallel Computing. That was a straightforward mathematical analysis of the effect of move ordering on the size of alpha/beta trees, following up on the analysis Knuth/Moore did for serial alpha/beta searching.
I have the paper here. It is about PVS and EPVS, two algorithms that I don't think anybody is using anymore. The conclusion is that "[f]uture algorithms must allow processors to work at different nodes within the tree in order to be able to effectively use large number of processors". Indeed. This paper does not seem relevant for YBW (nor for DTS, for that matter).
I thought I was clear about the article I was referencing.
Yes, you were clear. But my point was and is that it is dangerous to rely on the "conventional wisdom" that is mostly based on those old papers from a time when algorithms were tested on 24 positions searched to 5 plies or so. In fact you are now arguing that the DTS paper has lost its relevance because today's search algorithms are no longer comparable.
The base idea is to use the traditional Knuth/Moore formula for the minimal alpha/beta tree, N = W^ceil(D/2) + W^floor(D/2) - 1, which for even D reduces to 2 * W^(D/2) - 1. If you analyze that recursively, the first branch at the root searches many more nodes than the remainder, and this can be spelled out mathematically using Knuth's formula. The idea is that when you screw up move ordering, so that the second move is better than the first, you almost double the size of the sub-tree for that branch...
If you screw up move ordering, you are screwed whether you search in parallel or not. When searching serially the size of the subtree for that branch doubles just as much as when searching in parallel.
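To put some rough numbers on this, here is a small Python sketch (my own toy calculation, not from the paper) of the Knuth/Moore counts. It assumes a uniform branching factor W and perfect ordering, and splits the minimal tree at the root into the first (PV) branch and the remaining CUT branches:

```python
from math import ceil, floor

def minimal_leaves(w, d):
    # Knuth/Moore: terminal positions examined in a perfectly ordered
    # alpha/beta tree of uniform width w and depth d.
    return w ** ceil(d / 2) + w ** floor(d / 2) - 1

def cut_leaves(w, d):
    # A non-first branch at the root is a CUT node: a single refutation
    # suffices, so it examines only w^floor(d/2) terminal positions.
    return w ** floor(d / 2)

w, d = 38, 10                        # made-up but plausible width and depth
total = minimal_leaves(w, d)
first = minimal_leaves(w, d - 1)     # the first (PV) branch is itself a minimal tree
other = cut_leaves(w, d - 1)         # each of the remaining w-1 branches

print(f"total leaves      : {total:,}")
print(f"first branch      : {first:,} ({first / total:.0%} of the tree)")
print(f"each other branch : {other:,} (the first branch is ~{first // other}x bigger)")
# If the refutation only comes on the second move tried at a CUT node, two
# subtrees get searched before the cutoff -- roughly the doubling described above.
```

With these (made-up) numbers the first branch alone is about half of the whole tree and, as the printout shows, many times the size of any single sibling, which is the lopsidedness being referred to above.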
BTW that paper had absolutely nothing to do with DTS and had no test results at all; it was a mathematical analysis of the issue only.
I agree it has nothing to do with DTS. It also has nothing to do with YBW. But if it "proves" the 30% parallel overhead (which I don't think it does) independent of the search algorithm used, can you explain why DTS on Cray Blitz had much lower parallel overhead? (And if it doesn't mean anything for DTS, then why should it mean anything for YBW?)
I do not agree it had no test results. In fact, it has test results for 24 Bratko-Kopec positions on 5-ply searches, both for PVS and for EPVS.
Let me summarise the paper:
Section 1:
Minimax searches a lot of nodes, alpha/beta far fewer. The Knuth formula and the node types are introduced.
Section 2:
PVS searches no extra nodes if the tree is perfectly ordered. However, for this to work in practice the search must be able to split at any depth without penalty, and (what I believe is only mentioned in section 3) the tree must be perfectly balanced, i.e. all subtrees of all moves after the first have the same size, because otherwise processors become idle.
Section 3:
It is mentioned that things are different with imperfect move ordering, and that imperfect move ordering cannot be avoided. This is a big problem for PVS, because it makes the tree very imbalanced and therefore leaves lots of processors idle. Enter enhanced PVS, or EPVS: if one processor becomes idle, all other searches are aborted and a new split point is made two plies higher in the tree. The transposition table is relied upon not to lose the work that has already been done. (There is a toy illustration of the idle-processor effect after this summary.)
Section 4:
PVS and EPVS were implemented in Cray Blitz, tests were run on a Sequent Balance 21000, and the results are shown. (It seems one processor searched about 63 nodes per second.)
Section 5:
Conclusion. We need a better algorithm.
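Here is the toy illustration of the idle-processor effect promised above (again my own sketch with made-up numbers, not from the paper): hand the non-first subtrees at a split point out to a few processors and measure how much of their time is wasted waiting, first with perfectly equal subtrees and then with one subtree blown up by a move-ordering error.

```python
def idle_fraction(subtree_sizes, num_procs):
    # Greedy assignment of subtrees to the least-loaded processor; everyone
    # has to wait for the slowest one before the split point can be resolved.
    loads = [0] * num_procs
    for size in sorted(subtree_sizes, reverse=True):
        loads[loads.index(min(loads))] += size
    makespan = max(loads)                      # wall-clock time at the split point
    return 1 - sum(loads) / (num_procs * makespan)

procs = 4
balanced   = [100] * 12             # perfect ordering: 12 equal sibling subtrees
imbalanced = [3600] + [100] * 11    # ordering error: one sibling blows up ~W-fold

print(f"balanced  : {idle_fraction(balanced, procs):.0%} of processor time idle")
print(f"imbalanced: {idle_fraction(imbalanced, procs):.0%} of processor time idle")
```

EPVS's answer to this, as summarised under section 3 above, is not to let the early finishers wait for the slowest subtree but to abort the outstanding searches and re-split two plies higher, relying on the transposition table to recover the work already done.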
I don't see anything in this paper on which one could base a conclusion that HT can't possibly be of benefit with YBW-based parallel searches.