Questions on SMP search

Discussion of chess software programming and technical issues.

Moderator: Ras

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions on SMP search

Post by bob »

Gian-Carlo Pascutto wrote:
bob wrote: With best first you have to store the entire tree searched so far, except for those few positions you can classify as won or lost.
With depth-first, you have to store the entire tree searched so far, except for those few positions you can classify as too good or too bad (and hence are reduced heavily = low depth).
However, with best first, you can not get rid of those. Suppose the "good" branches drop like a rock at some key depth because of a tactical issue that is not visible at shallow depths. If you "threw those other positions away" you are beyond screwed. Best first keeps them all "open" even if they are not being expanded, so that they can be expanded if things go wrong along the primary pathways.
Otherwise, how will you ever have the "best node" to search.
Otherwise, your move ordering sucks and you will search a much bigger tree than minimal.
Nothing you can do about that if you start off down the wrong path and discover after significant effort that it is no good.
At some depth, all the "best moves" can appear to suck, which leaves some very old and unexplored move as the next thing to search. Common terminology is an "open list" containing nodes that have not been completed so far.
At some depth, all the "best moves" can appear to suck, which leaves some very undeep and under-explored move as the next thing to search to full depth. Common terminology is to call this a "fail-low". Because the open-list, err, hashtable will not have a lot of information in this case, the branching factor tends to suck.

There is no "fail low" in best-first. The score for the previous "best move" simply drops enough so that it is no longer best, and another move is expanded instead...

The terminology for minimax/alpha-beta doesn't apply here. It is a completely different approach to searching. Yes, you could say that unexpanded nodes are pruned, but that is wrong, because they can still be expanded if they bubble to the top as others drop down. But that would again be mangling the usual definition of the word.
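
To make that concrete, here is a toy sketch of the open-list mechanism (not taken from any real engine; the node structure, scores, and scenario are all invented for illustration). Nothing is discarded: a position that looked bad earlier is still on the list and can become the "best node" later if its rivals' scores drop.

Code: Select all

/* Toy sketch of best-first selection from an "open list" -- invented for
 * illustration only.  Every frontier position stays on the list with its
 * current backed-up score; each step expands the most promising one.
 */
#include <stdio.h>

typedef struct {
    int id;        /* stand-in for a position       */
    double score;  /* current backed-up evaluation  */
} Node;

static Node open_list[1024];
static int  open_count = 0;

static void push(int id, double score)
{
    open_list[open_count].id = id;
    open_list[open_count].score = score;
    open_count++;
}

static int pop_best(void)   /* remove and return the currently best node */
{
    int best = 0;
    for (int i = 1; i < open_count; i++)
        if (open_list[i].score > open_list[best].score)
            best = i;
    int id = open_list[best].id;
    printf("expanding node %d (score %.2f)\n", id, open_list[best].score);
    open_list[best] = open_list[--open_count];  /* nothing is thrown away... */
    return id;                                  /* ...until it is expanded   */
}

int main(void)
{
    /* toy frontier: node 3 looks best at first */
    push(1, 0.10); push(2, 0.25); push(3, 0.60);

    pop_best();        /* node 3 gets expanded first...                    */
    push(4, -0.50);    /* ...but its children turn out badly,              */
    push(5, -0.40);    /* so the old nodes 1 and 2 are still on the list   */
    pop_best();        /* and bubble up to become the new "best".          */
    pop_best();
    return 0;
}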
As far as your latter comment, it is best to first pick up an AI book and look at the classic definition of "best-first search."
As for your latter comment, it is best to pick up AI research papers written after your AI book was, and note the proofs of equivalence.

"proof of equivalence" is nothing new. But it doesn't change the fact that the two search methodologies, while producing the same answer, do it in _completely_ different ways. And that only in a theoretical sense is there any true equivalence in final result. If you force a bunch of assumptions that don't apply in real chess positions...

There are plenty of new AI books that cover the topic. But the search paradigms are completely different.

Then you see why you don't actually "prune" anything, you just keep expanding the most promising nodes and let the others sit and collect dust, unless everything else slowly drops enough to look worse than some of those dust collectors...
Then you see you don't actually "prune" anything, you just keep searching good, non-reduced moves deeper and let the LMR-ed nodes sit and collect dust in the hashtable. Unless the score suddenly drops below that of the LMR-ed nodes.

If you throw things away, it can't be true best-first, because you can't know what will be best at some point in the future, so you just keep expanding the best nodes until you have to make a move.
If you throw some things out of your hashtable, alpha-beta cannot work well, because you can't know the best move to search first in each position and your branching factor starts to explode to pure minimax.
There is a world of difference between "can't work at all" (throwing things out of best-first) and "can't work well" (overwriting in the hash table). Again, more AI references: the benefit of depth-first is its reduced storage requirement compared to breadth-first or best-first. They can, optimally, search and find the same solution. They won't search the same tree in the same order, however. And they won't have the same storage requirement.
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Questions on SMP search

Post by Daniel Shawul »

That is _really_ mangling the use of EBF and doesn't match chess trees at all.


Definition of branching factor from the wiki:

In computing, tree data structures, and game theory, the branching factor is the number of children at each node. If this value is not uniform, an average branching factor can be calculated.

That definition I agree with. But note "children" is a specific concept. Not "children we plan on searching."
Let me give you a reason why I preferred to use that formula. It demonstrates the effect of LMR on the EBF clearly, and the EBF as defined by that formula goes to zero as time goes on...
For the fixed-depth iterative widening method, here is what I get using the formula I proposed.

Number of nodes is O(W^D) or W^(D+1) - 1 to be precise for a uniform tree where all nodes are expanded.
Hence

Code: Select all

EBF = (W1^(D+1) - 1) / (W2^(D+1) - 1)
    ~ (W1/W2)^(D+1)
Now let's plot two widening strategies: an LMR-like one where the width is increased by one every iteration, and another where the width is doubled. The first results in an exponential curve, while in the second the EBF remains constant.

[Image: EBF vs. iteration for the two widening strategies]

As you can see in the picture, for the EBF to remain constant we have to multiply the number of moves considered in every iteration. Otherwise the EBF either increases or decreases.
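
As an aside, a minimal C sketch (not from this thread's programs) that evaluates the EBF formula above for the two hypothetical widening schedules. D = 4 and the starting width of 2 are arbitrary choices; the starting width avoids the degenerate W^(D+1) - 1 = 0 case. It prints a decaying EBF for the W -> W+1 schedule and an essentially constant one for the doubling schedule.

Code: Select all

/* Sketch: EBF between successive iterations of "iterative widening" at a
 * fixed depth D, using the formula from the post:
 *     EBF = (W1^(D+1) - 1) / (W2^(D+1) - 1)
 * where W1 is the new width and W2 the previous one.  D = 4 and the
 * starting width of 2 are arbitrary choices for illustration.
 */
#include <stdio.h>
#include <math.h>

int main(void)
{
    const int D = 4;                 /* assumed fixed search depth             */
    double w_add = 2.0, w_dbl = 2.0; /* start at 2 to avoid W^(D+1) - 1 = 0    */

    printf("iter   EBF (W -> W+1)   EBF (W -> 2W)\n");
    for (int i = 1; i <= 8; i++) {
        double ebf_add = (pow(w_add + 1.0, D + 1) - 1.0) / (pow(w_add, D + 1) - 1.0);
        double ebf_dbl = (pow(2.0 * w_dbl, D + 1) - 1.0) / (pow(w_dbl, D + 1) - 1.0);
        printf("%4d   %14.2f   %13.2f\n", i, ebf_add, ebf_dbl);
        w_add += 1.0;   /* LMR-like: widen by one move per iteration   */
        w_dbl *= 2.0;   /* aggressive: double the width per iteration  */
    }
    return 0;
}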

P.S.: how do you display an image in this forum?
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Questions on SMP search

Post by Daniel Shawul »

That is _really_ mangling the use of EBF and doesn't match chess trees at all.
Definition of branching factor from the wiki:
In computing, tree data structures, and game theory, the branching factor is the number of children at each node. If this value is not uniform, an average branching factor can be calculated.
That definition I agree with. But note "children" is a specific concept. Not "children we plan on searching."
Let's in fact drop the above definition and use the text book definition. That simplifies the concept a lot more, as the previous definition is formulated with an iterative deepening search and no selectivity in mind. It happens to give the same result as the true definition I discuss below.
EBF = W^(D+1) / W^D = W

The text book definition, which does not mention iterative deepening at all, as in the book "Principles of Artificial Intelligence" by Nils J. Nilsson, calculates the effective branching factor from the total nodes searched. Look at the description on pages 91-93. It has plots at the end for BF and penetrance P as well. http://books.google.com/books?id=4JwN5D ... &q&f=false
It determines an EBF from the total nodes searched and the depth D of the tree. So it basically says EBF = W for a uniform tree of constant width W.

For my example above
iteration 1 -> EBF = 1
iteration 2 -> EBF = 3
iteration 3 -> EBF = 12
etc...

Note that now there is no comparison between iterations. And this definition of the branching factor of a tree can be used for any type of search (depth first, best first, etc.) since it calculates an effective virtual width (EBF) from the total nodes searched.
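
A minimal sketch of that "effective virtual width" idea (my own illustration, not code from the book or the thread): given a total node count N and a depth D, solve numerically for the W a uniform tree would need to produce the same count. The geometric-series convention (root not counted) and the sample node counts below are assumptions made just to have something to run.

Code: Select all

/* Sketch: recover an effective branching factor W from a total node count N
 * and a depth D, by solving  W + W^2 + ... + W^D = N  with bisection.
 * Which nodes are counted (root included or not) is a convention; this is
 * just one plausible choice.
 */
#include <stdio.h>

static double tree_nodes(double w, int d)
{
    double n = 0.0, t = 1.0;
    for (int i = 1; i <= d; i++) {
        t *= w;
        n += t;
    }
    return n;
}

static double effective_width(double n, int d)
{
    double lo = 1.0, hi = n;             /* the EBF lies somewhere in [1, N] */
    for (int i = 0; i < 100; i++) {      /* plain bisection                  */
        double mid = 0.5 * (lo + hi);
        if (tree_nodes(mid, d) < n)
            lo = mid;
        else
            hi = mid;
    }
    return 0.5 * (lo + hi);
}

int main(void)
{
    /* hypothetical node counts for a few iterations, for illustration only */
    double nodes[] = { 40.0, 1500.0, 60000.0 };
    for (int d = 1; d <= 3; d++)
        printf("depth %d, %8.0f nodes -> EBF %.2f\n",
               d, nodes[d - 1], effective_width(nodes[d - 1], d));
    return 0;
}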
For the record, I am not sure I would do the "widening" as you are doing it. I would likely do a "tapered widening" myself so that I don't look at the same number of moves at every depth; I would likely reduce the number as ply goes up.
The reason I am using the same width at all plies is to simplify the discussion, so let's stick with it.
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Questions on SMP search

Post by Daniel Shawul »

[Image: table]
D in the table is actually EBF.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions on SMP search

Post by bob »

Daniel Shawul wrote: [Image: table]
D in the table is actually EBF.
Explain to me where you intend for this to go. It seems pretty obvious that if you want to adjust W and D, you can always, for any value of W, find a D that produces the number of nodes you want. And vice-versa.

What does that have to do with the discussion about "widening" (whatever that means) in terms of today's chess programs that use normal alpha/beta with iterative deepening (not iterative widening)? We play with two parameters inside the tree. One is depth, where we reduce some branches and extend others. The other is W near the tips, where we prune (and discard) moves we think are futile to search. By pruning, we are reducing the EBF, since the width of the tree is being reduced. Even though it is only within the last 4 plies, that is the majority of the space we search. For depth, it is all over the place, with the supposed idea that we search important stuff more deeply by extending, we search less important stuff to a fixed depth, and we search bad stuff to a reduced depth. So the average depth doesn't change from a normal search, but we search key stuff deeper and lousy stuff shallower. A balancing act.

I don't see any "widening" in that description. More a narrowing due to the physical pruning of moves in the last 4 plies, which makes it look like W in those positions is smaller than it actually is...
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Questions on SMP search

Post by Daniel Shawul »

Here are the plots where both widening (width W) and deepening (depth D) are incorporated (a real chess engine situation, I suppose). The previous plots were just to show you the effect of widening without deepening.
You skipped the part where the definition of EBF as used in chess (i.e. with iterative deepening) is different from the one used in text books, because what we use is a relative factor that indicates how hard it is to go from iteration N to iteration N+1. Here are the plots with both the text book definition and the one we are using.
[Image: plots of the text book EBF (red) vs. the EBF used in chess (blue) for constant, decreasing, and increasing width]
The red lines are the text book EBF and blue is what we are using.
So constant width, decreasing width (more pruning as depth increases), and increasing width (less pruning as depth increases) are plotted above. For more pruning with depth, EBF decreases in a non-linear fashion according to the definition we use. So my argument is:

If one says "I prune more with iteration depth", then the effort required to go to iteration N+1 from N decreases and not in a linear fashion as you suggested in the other threads. Even with constant pruning methodology (same W at all iterations), the EBF as we use it in chess decreases non linearly as shown in the first plot.

Note: In the previous formula I forgot a factor in the equation for the total number of nodes. But it doesn't affect the result much, as it appears in a ratio and gets cancelled.

Code: Select all

T = (W ^ (D + 1)) / (W - 1)
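
To see the two definitions side by side, here is a small sketch (my own, with an invented schedule where the depth grows by one and the width shrinks by two per iteration, i.e. "more pruning as the iteration depth increases"). The "textbook" column stays near W, while the iteration-to-iteration ratio drops non-linearly.

Code: Select all

/* Sketch comparing the two notions of EBF discussed above, for an assumed
 * per-iteration schedule of width W and depth D (the numbers below are
 * invented just to have something to evaluate):
 *   - "textbook" EBF: width of a uniform tree with the same node count,
 *     approximated here as T^(1/D)
 *   - "chess" EBF: ratio of total nodes between iteration i and i-1
 */
#include <stdio.h>
#include <math.h>

static double total_nodes(double w, int d)   /* uniform tree: width w, depth d */
{
    return (pow(w, d + 1) - 1.0) / (w - 1.0);
}

int main(void)
{
    /* hypothetical schedule: depth grows by one, width shrinks by two per
       iteration ("more pruning as the iteration depth increases") */
    int    depth[] = {  6,  7,  8,  9, 10, 11 };
    double width[] = { 20, 18, 16, 14, 12, 10 };
    double prev = 0.0;

    printf("iter  D   W   textbook EBF   chess EBF (T_i / T_i-1)\n");
    for (int i = 0; i < 6; i++) {
        double t = total_nodes(width[i], depth[i]);
        printf("%4d %3d %4.0f %13.2f   ", i + 1, depth[i], width[i],
               pow(t, 1.0 / depth[i]));
        if (i == 0) printf("%22s\n", "-");
        else        printf("%22.2f\n", t / prev);
        prev = t;
    }
    return 0;
}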
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions on SMP search

Post by bob »

Daniel Shawul wrote: Here are the plots where both widening (width W) and deepening (depth D) are incorporated (a real chess engine situation, I suppose). The previous plots were just to show you the effect of widening without deepening.
You skipped the part where the definition of EBF as used in chess (i.e. with iterative deepening) is different from the one used in text books, because what we use is a relative factor that indicates how hard it is to go from iteration N to iteration N+1. Here are the plots with both the text book definition and the one we are using.
[Image: plots of the text book EBF (red) vs. the EBF used in chess (blue) for constant, decreasing, and increasing width]
The red lines are the text book EBF and blue is what we are using.
So constant width, decreasing width (more pruning as depth increases), and increasing width (less pruning as depth increases) are plotted above. For more pruning with depth, EBF decreases in a non-linear fashion according to the definition we use. So my argument is:

If one says "I prune more with iteration depth", then the effort required to go to iteration N+1 from N decreases and not in a linear fashion as you suggested in the other threads. Even with constant pruning methodology (same W at all iterations), the EBF as we use it in chess decreases non linearly as shown in the first plot.

Note: In the previous formula I forgot a factor in the equation for the total number of nodes. But it doesn't affect the result much, as it appears in a ratio and gets cancelled.

Code: Select all

T = (W ^ (D + 1)) / (W - 1)
First, I am not seeing anyone do "more pruning as depth increases." I have only seen pruning applied in the last N plies (4 for Crafty, I won't try to speak for others). So we are not adding more pruning, we are just adding another ply of normal search with the bottom 4 plies getting pruned regardless of the depth.

If anyone uses unbounded LMR reductions that grow larger and larger as the depth grows, then that might represent some modest increase in selectivity ply by ply. I'm not sure I would use unbounded R, however; you don't want to lose most of the advantage of going to iteration N+1 because you reduce more and overlook anything new you might find...
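
As an illustration of what a bounded R might look like, here is a tiny sketch; the formula and all constants are invented, not any particular engine's. The reduction grows with depth and move number but is capped so it can never reduce a move straight into the quiescence search.

Code: Select all

/* Sketch of a bounded LMR reduction formula -- invented for illustration,
 * not any particular engine's.  The reduction grows slowly with depth and
 * move number but is capped so that deeper iterations still see something
 * genuinely new at the horizon.
 */
#include <stdio.h>
#include <math.h>

static int lmr_reduction(int depth, int move_number)
{
    if (depth < 3 || move_number < 4)   /* don't reduce early or shallow moves */
        return 0;
    int r = (int)(0.5 + log(depth) * log(move_number) / 2.5);
    int max_r = depth - 2;              /* cap: never reduce into the qsearch  */
    return r < max_r ? r : max_r;
}

int main(void)
{
    for (int d = 4; d <= 20; d += 4) {
        printf("depth %2d: reductions for moves 4,8,16,32 = ", d);
        printf("%d %d %d %d\n",
               lmr_reduction(d, 4), lmr_reduction(d, 8),
               lmr_reduction(d, 16), lmr_reduction(d, 32));
    }
    return 0;
}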
abulmo
Posts: 151
Joined: Thu Nov 12, 2009 6:31 pm

Re: Questions on SMP search

Post by abulmo »

bhlangonijr wrote:
michiguel wrote:
Everything was really fine, until I tried to make a PGO compile of my "new" SMP engine, using GCC. It doesn't work. It appears that when you try PGO with a multi-threaded application it screws up the profile data. :(
Recent gcc compilers have a -fprofile-correction argument for compiling multi-threaded applications.
Richard
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions on SMP search

Post by bob »

abulmo wrote:
bhlangonijr wrote:
michiguel wrote:
Everything was really fine, until I tried to make a PGO compile of my "new" SMP engine, using GCC. It doesn't work. It appears that when you try PGO with a multi-threaded application it screws up the profile data. :(
Recent gcc compilers have a -fprofile-correction argument for compiling multi-threaded applications.

I tried that when it first showed up, but the thing still went belly up complaining that the PGO data file(s) were corrupted...
abulmo
Posts: 151
Joined: Thu Nov 12, 2009 6:31 pm

Re: Questions on SMP search

Post by abulmo »

bob wrote:
abulmo wrote:
michiguel wrote:
Everything was really fine, until I tried to make a PGO compile of my "new" SMP engine, using GCC. It doesn't work. It appears that when you try PGO with a multi-threaded application it screws up the profile data. :(
Recent gcc compilers have a -fprofile-correction argument for compiling multi-threaded applications.

I tried that when it first showed up, but the thing still went belly up complaining that the PGO data file(s) were corrupted...
Yes, the -fprofile-correction flag just applies some heuristics to correct or smooth incoherent data. But at least the compilation should produce a working executable.
Richard