Stockfish and optimizing options for SMP search

Discussion of chess software programming and technical issues.

Moderator: Ras

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Stockfish and optimizing options for SMP search

Post by bob »

Milos wrote:
Evert wrote:Surely that depends on how good your move ordering is?
If the move ordering is good, you'd normally expect the cutoff to be caused by one of the first few moves (probably the first, but possibly, say, one of the first three; it's easy to get that number for different engines). So the further down the list you go, the greater the probability that you will not get any fail-high at all. Occasionally it'll happen that a move way down at the bottom of the list fails high, but that should be sufficiently rare that you get a net gain by assuming that it won't and act accordingly.
Isn't that also the whole idea behind late move reductions?

Or do I misunderstand what you're saying?
I bolded the crucial sentence, which is wrong. Since we are talking about split points that cause overhead, we are talking about small remaining depths, and the predecessor node is an ALL node. There we know very little about ordering (these kinds of branches might even have been pruned in the previous iteration), and the move order is most probably wrong. So when you pass, let's say, the first 3 or 5 moves on the list, what you say (and probably Bob, though he has difficulty articulating it) is that from that point on the chance that there will be no cut-off at all is higher than the chance that there will be one. That is what I claim is wrong. And if there was a cut-off, splitting the work would only cause unnecessary overhead.
Of course the moveCount vs. depth curve needs to be nicely trimmed, but saying a priori that any raising of the split-depth condition on later moves is a bad idea is just a product of a famous professor's ego that always needs to be right, and in no way a product of facts!
Sorry, but BULLSHIT. Some of us have actually measured fail highs. In Crafty, 92% of fail highs happen on the _first_ move. Take it to the first two moves and the probability goes to just over 95%. By the time you get to the last move, the chance of it failing high is essentially zero. Where do you come up with this nonsense???
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Stockfish and optimizing options for SMP search

Post by bob »

Milos wrote:
Milos wrote:
Evert wrote:Surely that depends on how good your move ordering is?
If the move ordering is good, you'd normally expect the cutoff to be caused by one of the first few moves (probably the first, but possibly, say, one of the first three; it's easy to get that number for different engines). So the further down the list you go, the greater the probability that you will not get any fail-high at all. Occasionally it'll happen that a move way down at the bottom of the list fails high, but that should be sufficiently rare that you get a net gain by assuming that it won't and act accordingly.
Isn't that also the whole idea behind late move reductions?

Or do I misunderstand what you're saying?
I bolded the crucial sentence, which is wrong. Since we are talking about split points that cause overhead, we are talking about small remaining depths, and the predecessor node is an ALL node. There we know very little about ordering (these kinds of branches might even have been pruned in the previous iteration), and the move order is most probably wrong. So when you pass, let's say, the first 3 or 5 moves on the list, what you say (and probably Bob, though he has difficulty articulating it) is that from that point on the chance that there will be no cut-off at all is higher than the chance that there will be one. That is what I claim is wrong. And if there was a cut-off, splitting the work would only cause unnecessary overhead.
Of course the moveCount vs. depth curve needs to be nicely trimmed, but saying a priori that any raising of the split-depth condition on later moves is a bad idea is just a product of a famous professor's ego that always needs to be right, and in no way a product of facts!
Let me elaborate a little more, since people here (university professors included :)) are obviously bad with statistics.
Let's say you are in an ALL node and you have 20 CUT-node successors. And let's say the probability of a cut-off in one of these 20 CUT nodes is 95%.
You can assume this is similar to the Monty Hall problem.
So you search the first 3 CUT nodes and you don't find a cut-off. So what's the probability that there will be a cut-off when you search the fourth CUT node: smaller or bigger than before (or even bigger than the probability of a cut-off when you searched the first node)?
If you don't get it, try reading about the Monty Hall problem. As I've said, it's a common misconception that obviously not even university professors are immune to.
And it is a comment that has zero relevance in _this_ discussion... jeez...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Stockfish and optimizing options for SMP search

Post by bob »

jwes wrote:
bob wrote:
Milos wrote:
bob wrote:Actually, in any post you write here, "fact" is one thing nobody will find...
You seem to know what everybody else assumes as fact. I wonder when you developed that mind-reading skill???
I did _not_ say your idea was good. I said it was flawed, and explained why. If you don't get that...

To judge whether an idea is good or not, you first need to understand it. Your "explanation" was for a totally unrelated thing, so the only conclusion is that you didn't get it.
How hard is it to understand? Perhaps you are the one that doesn't understand the idea, which would explain why you don't get the point that the idea is not a good one. Not horrible, but certainly not good.

One simple flaw is that when you reach a position with remaining depth of N, you have _no idea_ what the tree below each of the remaining moves looks like. Might be quite simple and narrow. Might be quite deep and bushy. But one thing is for sure, the more moves that have been searched, the more likely it is that all moves need to be searched, which is exactly where we'd like to split the tree to avoid search overhead caused by an unexpected fail high.

But of course you don't understand any of that...
An interesting corollary to this is that the more sure you are that this is a good split point, the better it would be to reduce or prune and save even more work.
I'm not sure that is true. One question is "do I need to search every move at this ply?" (Is this an ALL/CUT node?) If so, this is an excellent point to split the tree. That is independent of the "should I further reduce/prune the late moves at this ply?" question. We are trying to split at points where we don't suddenly get a cutoff (fail high), which would mean that other things done in parallel here are wasted. I don't really care whether I am searching good moves or lousy moves in parallel; load balancing is not an issue in my current code. If one thread runs out of work, it will quickly jump in and help the remaining threads finish their stuff...
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Stockfish and optimizing options for SMP search

Post by Milos »

bob wrote:Sorry, but BULLSHIT. Some of us have actually measured fail highs. In Crafty, 92% of fail highs happen on the _first_ move. Take it to the first two moves and the probability goes to just over 95%. By the time you get to the last move, the chance of it failing high is essentially zero. Where do you come up with this nonsense???
You don't even understand the concept of a priori and a posteriori probabilities.
I know you are a professor of CS, but being so hollow in statistics is just shocking. Did you even have a statistics course in your life????
What you measure in Crafty is the a priori probability of a fail high at a certain move (most probably in cases close to the root; here we are talking about cases close to the tips, where ordering is much worse).
Having a 95% a priori chance of a cut-off at the first move doesn't mean the chance of a cut-off at the second and later moves, if the first one doesn't produce it, is only 5%!!! The probability substantially increases!!!
And this is what you obviously do not understand...
User avatar
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Stockfish and optimizing options for SMP search

Post by Evert »

Milos wrote: So you search the first 3 CUT nodes and you don't find a cut-off. So what's the probability that there will be a cut-off when you search the fourth CUT node: smaller or bigger than before (or even bigger than the probability of a cut-off when you searched the first node)?
If you don't get it, try reading about the Monty Hall problem. As I've said, it's a common misconception that obviously not even university professors are immune to.
Again though, that would depend on the move ordering, wouldn't it? If you have reason to believe that one of the moves in the list will cause a cut-off and you know it's not one of the first few (so move ordering is bad), then you know that it has to be one of the remaining moves that causes the cut-off.
But you don't know a priori that there is going to be one; it's possible that you need to search all moves without finding a clear "best move".
You can assume one or the other: move ordering is good, and therefore you can expect not to get a cut-off at all after the first few moves you try; or move ordering is horrible, and any move you try next could cause a cut-off. You make an assumption and act accordingly, and whether that helps you or not depends on how good your assumption was. If you're right more often than wrong, it helps you; otherwise it hurts.

So you're near the leaves, with little dynamic information to guide move ordering: no hash move, no killer move, maybe a little help from a history-type heuristic. Other than that, you're down to static rules for move ordering. On the other hand, the search out there will be very selective anyway, so you're likely looking for quick and obvious tactical moves to give you a cutoff. If you select those using static rules and sort them higher, don't you still expect, on average, to have better-than-random move ordering?

Isn't that right, or am I still missing something? It sounds like this is something that should be easily testable, but it probably does depend on the program you're testing.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Stockfish and optimizing options for SMP search

Post by Milos »

Evert wrote:Again though, that would depend on the move ordering, wouldn't it? If you have reason to believe that one of the moves in the list will cause a cut-off and you know it's not one of the first few (so move ordering is bad), then you know that it has to be one of the remaining moves that causes the cut-off.
But you don't know a priori that there is going to be one; it's possible that you need to search all moves without finding a clear "best move".
You can assume one or the other: move ordering is good, and therefore you can expect not to get a cut-off at all after the first few moves you try; or move ordering is horrible, and any move you try next could cause a cut-off. You make an assumption and act accordingly, and whether that helps you or not depends on how good your assumption was. If you're right more often than wrong, it helps you; otherwise it hurts.

So you're near the leaves, with little dynamic information to guide move ordering: no hash move, no killer move, maybe a little help from a history-type heuristic. Other than that, you're down to static rules for move ordering. On the other hand, the search out there will be very selective anyway, so you're likely looking for quick and obvious tactical moves to give you a cutoff. If you select those using static rules and sort them higher, don't you still expect, on average, to have better-than-random move ordering?
Yes, you will have better-than-random move ordering, but much worse than what Bob measures in Crafty close to the root. But that's not the point.
The point is, if you search the first 2-3 moves and don't find the cut-off, the chance of a cut-off in the remaining moves substantially increases, i.e. your a posteriori probability becomes much larger than the a priori one.
The reason is that if you don't find a cut-off in the first 2-3 moves, there is a much greater chance that your ordering is wrong than that you will not see a cut-off at all, since the probability of wrong ordering is much higher than the probability of not finding a cut-off. Hopefully this makes it clearer now.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Stockfish and optimizing options for SMP search

Post by bob »

Milos wrote:
bob wrote:Sorry, but BULLSHIT. Some of us have actually measured fail highs. In Crafty, 92% of fail highs happen on the _first_ move. Take it to the first two moves and the probability goes to just over 95%. By the time you get to the last move, the chance of it failing high is essentially zero. Where do you come up with this nonsense???
You don't even understand the concept of a priori and a posteriori probabilities.
I know you are a professor of CS, but being so hollow in statistics is just shocking. Did you even have a statistics course in your life????
What you measure in Crafty is the a priori probability of a fail high at a certain move (most probably in cases close to the root; here we are talking about cases close to the tips, where ordering is much worse).
Having a 95% a priori chance of a cut-off at the first move doesn't mean the chance of a cut-off at the second and later moves, if the first one doesn't produce it, is only 5%!!! The probability substantially increases!!!
And this is what you obviously do not understand...
Here's the simple problem with your ridiculous analysis.

1. we are splitting at what we hope to be an ALL (or PV) node. We want _every_ move to fail low after we have split. I do mean _every_ move.

2. Since that is the case, _every_ successor node is supposed to fail high. Your comment about the nodes below the ALL node getting a higher and higher probability of failing high is simply utter nonsense.

It shows a complete lack of understanding of what is going on. BTW, your 5% stuff is a crock. 95 out of every 100 fail highs happen on the first two moves of any CUT node. Therefore, there is a 5% probability that the fail high will happen after the first 2 moves. Probability is probability; 100% (1.00) is the highest probability you can have. So what you are rambling on about I have no idea, nor do I want to know, because it is wrong...

If you don't understand what I just wrote, then study some before posting utter nonsense... Any good paper on alpha/beta pruning will take you right to full understanding, without my wasting any further time on a lost cause.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Stockfish and optimizing options for SMP search

Post by bob »

Evert wrote:
Milos wrote: So you search the first 3 CUT nodes and you don't find a cut-off. So what's the probability that there will be a cut-off when you search the fourth CUT node: smaller or bigger than before (or even bigger than the probability of a cut-off when you searched the first node)?
If you don't get it, try reading about the Monty Hall problem. As I've said, it's a common misconception that obviously not even university professors are immune to.
Again though, that would depend on the move ordering, wouldn't it? If you have reason to believe that one of the moves in the list will cause a cut-off and you know it's not one of the first few (so move ordering is bad), then you know that it has to be one of the remaining moves that causes the cut-off.
But you don't know a priori that there is going to be one; it's possible that you need to search all moves without finding a clear "best move".
You can assume one or the other: move ordering is good, and therefore you can expect not to get a cut-off at all after the first few moves you try; or move ordering is horrible, and any move you try next could cause a cut-off. You make an assumption and act accordingly, and whether that helps you or not depends on how good your assumption was. If you're right more often than wrong, it helps you; otherwise it hurts.

So you're near the leaves, with little dynamic information to guide move ordering: no hash move, no killer move, maybe a little help from a history-type heuristic. Other than that, you're down to static rules for move ordering. On the other hand, the search out there will be very selective anyway, so you're likely looking for quick and obvious tactical moves to give you a cutoff. If you select those using static rules and sort them higher, don't you still expect, on average, to have better-than-random move ordering?

Isn't that right, or am I still missing something? It sounds like this is something that should be easily testable, but it probably does depend on the program you're testing.
Don't try to discuss this with him. He has no idea what he is talking about, and he is mangling terminology (probability of one CUT node failing high over another CUT node? BOTH should fail high; they are CUT nodes, after all). This entire discussion is simply technically irrelevant nonsense.
User avatar
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Stockfish and optimizing options for SMP search

Post by Evert »

Milos wrote: The reason is that if you don't find a cut-off in the first 2-3 moves, there is a much greater chance your ordering is wrong than that you will not see a cut-off at all, since the probability of wrong ordering is much higher than the probability of not finding a cut-off.
That is a testable statement that should be easy to get statistics for, which should end the discussion one way or the other (and if different programs show different things, then that's interesting and will teach us something).
I don't think it's a priori obvious that it would have to be true (in fact, my intuition says it's false, but my intuition could easily be off on this and I haven't done an intensive study). I do know that when I did a quick test with my own program, it was very rare to get a cut-off from very late moves and more likely that you'd end up searching all moves, but my program is fairly weak.
Either way, as I said, that should be easy to test one way or the other.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Stockfish and optimizing options for SMP search

Post by bob »

Evert wrote:
Milos wrote: The reason is that if you don't find a cut-off in the first 2-3 moves, there is a much greater chance your ordering is wrong than that you will not see a cut-off at all, since the probability of wrong ordering is much higher than the probability of not finding a cut-off.
That is a testable statement that should be easy to get statistics for, which should end the discussion one way or the other (and if different programs show different things, then that's interesting and will teach us something).
I don't think it's a priori obvious that it would have to be true (in fact, my intuition says it's false, but my intuition could easily be off on this and I haven't done an intensive study). I do know that when I did a quick test with my own program, it was very rare to get a cut-off from very late moves and more likely that you'd end up searching all moves, but my program is fairly weak.
Either way, as I said, that should be easy to test one way or the other.
It is simply a false statement, based on fantasy. We can certainly measure the percentage of cutoffs on the first move vs. the rest of the moves; I already do this and have been providing the number in every version of Crafty for many years. It averages 92% over a typical game. It is _much_ more likely that after you have searched 2 moves with no cutoff, when you expect one to happen at this node, you instead have a move-ordering problem earlier in the tree that has caused ALL nodes to flip to CUT nodes and vice versa, exactly what you don't want to happen when you decide to split at a particular ALL node. Cutoffs on late moves _are_ rare. Since cutoffs happen 92% of the time on the first move at any ply where you get a cutoff, and since this goes up to 95% for the first 2, there is little chance left that a cutoff will happen beyond that point. Only 5 in 100 times will you get a cutoff beyond the first 2 moves. Pretty simple math, which leaves me wondering what in the world this discussion is supposed to be about, because it is not about normal trees, alpha/beta, and normal probabilities...

It is more a case of trying to justify stupid comments by offering more stupid comments to support the original. By his logic, the last move is more likely than the 3rd move to cause a cutoff. That is easy enough to measure... I just made a quick run on one position, for 180 seconds, and got 0%. Since this is in units of 1/100ths, I ran it again to produce the actual count rather than the percentage (which was zero), and it happened 6 times in 3 minutes in a typical middlegame position. I failed to print out the total number of fail-high nodes, but the search was 5+ billion nodes total. My FH counter is not adjusted in the q-search, since there we quite often fail high on the static eval. I don't adjust any counters there at all, although it would be trivial to add, and it would likely inflate the 92% number since so many moves get refuted by a direct recapture...