It's a very elegant definition. I did a quick depth 15 measurement across some positions for SF, and 58% of the non-PV node fail highs from main search (those that make it past null-move, etc.) come in cut nodes, using this definition.

Don wrote: it sounds more complicated than it is, but the definition is simple, if you are an odd distance from the last PV node it's a cut node, otherwise it's an all node.
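For anyone who wants to see the parity rule spelled out, here is a minimal sketch of how that assignment falls out of a PVS search. It is not taken from Stockfish, Komodo or any other engine; Position, Move, generate_moves, do_move, undo_move and qsearch are placeholders for whatever the host engine provides.

Code: Select all
enum NodeType { PV_NODE, CUT_NODE, ALL_NODE };

int search(Position& pos, int alpha, int beta, int depth, NodeType nt) {
    if (depth == 0)
        return qsearch(pos, alpha, beta);

    int moveCount = 0;
    for (Move m : generate_moves(pos)) {
        ++moveCount;
        pos.do_move(m);

        int score;
        if (nt == PV_NODE && moveCount == 1)
            // The first move of a PV node is searched as a PV node.
            score = -search(pos, -beta, -alpha, depth - 1, PV_NODE);
        else {
            // Zero-window search.  The expected type simply alternates:
            // the child of an expected CUT node is an expected ALL node and
            // vice versa, and any zero-window child of a PV node is an
            // expected CUT node (odd distance from the last PV node).
            NodeType childType = (nt == CUT_NODE ? ALL_NODE : CUT_NODE);
            score = -search(pos, -alpha - 1, -alpha, depth - 1, childType);

            // A zero-window fail high at a PV node is re-searched with the
            // full window; the child is promoted to a PV node, and the
            // CUT/ALL labels below it swap accordingly.
            if (nt == PV_NODE && score > alpha && score < beta)
                score = -search(pos, -beta, -alpha, depth - 1, PV_NODE);
        }
        pos.undo_move(m);

        if (score >= beta)
            return score;        // fail high: this node behaved as a CUT node
        if (score > alpha)
            alpha = score;
    }
    return alpha;                // no move exceeded alpha: this node behaved as an ALL node
}

With this convention, the 58% figure above is simply the fraction of non-PV fail highs that occurred at nodes labelled CUT before any of their moves had been searched.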
ROC analysis of Late Move Reductions
-
- Posts: 568
- Joined: Tue Dec 12, 2006 10:10 am
- Full name: Gary Linscott
Re: ROC analysis of Late Move Reductions
-
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: ROC analysis of Late Move Reductions
Stockfish is one of the few modern programs that don't define CUT and ALL nodes unless that has changed recently. We found it to be a definite benefit to handle them differently. It's not HUGE, but it is a benefit.

gladius wrote: It's a very elegant definition. I did a quick depth 15 measurement across some positions for SF, and 58% of the non-PV node fail highs from main search (those that make it past null-move, etc.) come in cut nodes, using this definition.

Don wrote: it sounds more complicated than it is, but the definition is simple, if you are an odd distance from the last PV node it's a cut node, otherwise it's an all node.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
-
- Posts: 568
- Joined: Tue Dec 12, 2006 10:10 am
- Full name: Gary Linscott
Re: ROC analysis of Late Move Reductions
It's not in master, just testing. I tried a while back with some all/cut tweaks without any luck. Take #2 is here: http://tests.stockfishchess.org/tests/v ... 14ec4dd802

Don wrote: Stockfish is one of the few modern programs that don't define CUT and ALL nodes unless that has changed recently. We found it to be a definite benefit to handle them differently. It's not HUGE, but it is a benefit.

gladius wrote: It's a very elegant definition. I did a quick depth 15 measurement across some positions for SF, and 58% of the non-PV node fail highs from main search (those that make it past null-move, etc.) come in cut nodes, using this definition.

Don wrote: it sounds more complicated than it is, but the definition is simple, if you are an odd distance from the last PV node it's a cut node, otherwise it's an all node.
-
- Posts: 4186
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: ROC analysis of Late Move Reductions
However you define node types, reducing expected CUT nodes less, if anything, is the only thing that makes sense. You expect to fail high there (which is more than just updating alpha, as in PV nodes), so you should be careful. That is very intuitive, and probably one reason why PV nodes are reduced less or not at all.

gladius wrote: It's a very elegant definition. I did a quick depth 15 measurement across some positions for SF, and 58% of the non-PV node fail highs from main search (those that make it past null-move, etc.) come in cut nodes, using this definition.

Don wrote: it sounds more complicated than it is, but the definition is simple, if you are an odd distance from the last PV node it's a cut node, otherwise it's an all node.

As to the definition of node types, clearly the one that switches expected CUT nodes to ALL after searching, say, 3 moves is better than sticking with the original definition. Would you rather look at the evidence (i.e. a fail high within the first 3 moves has >95% likelihood) and adjust your node-type assignment, or stick with a static one? One could measure how many fail highs there are after searching 3 moves in a CUT node. Only if there were more fail highs after the 3rd move in a CUT node would not switching be a good idea. It is much safer to say the node is an ALL node after searching 3 moves. For decisions on IID or singular extensions, which you have to make before searching any move, you have no choice but to use the static assignment. But for LMR you mostly search at least 3 moves first, so by then it is most likely an ALL node, or your move ordering is screwed. It doesn't make sense to trust the static node assignment more than the evidence. It has been proven, mostly in the parallel-search literature, that dynamic assignment gives superior results, e.g. strong YBWC.

Anyway, even if we disagree on the node definition, I can't get around why you would reduce CUT nodes more. I read Don's reply but I still don't get it, and I suspect others who reduce CUT nodes more may give other justifications. For parallel search, for instance, you avoid splitting at CUT nodes however you define them; why should it be different (in fact the opposite) for LMR?
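To make the dynamic re-labelling concrete, here is a small sketch of the kind of rule Daniel describes. It is my illustration, not Scorpio's code; base_lmr stands for whatever depth/move-count reduction table an engine already has, and the switch threshold and offsets are placeholder numbers.

Code: Select all
#include <algorithm>

enum NodeType { PV_NODE, CUT_NODE, ALL_NODE };

// Placeholder for the engine's existing depth/move-count reduction table.
int base_lmr(int depth, int moveCount);

// Moves searched at an expected CUT node before we give up on the cutoff
// and re-label the node as ALL (Daniel suggests about 3).
constexpr int CutToAllSwitch = 3;

int lmr_reduction(NodeType expected, int depth, int moveCount) {
    if (expected == CUT_NODE && moveCount > CutToAllSwitch)
        expected = ALL_NODE;             // dynamic switch: no cutoff so far, treat it as an ALL node

    int r = base_lmr(depth, moveCount);
    if (expected == CUT_NODE) r -= 1;    // be more careful where a fail high is still expected
    if (expected == PV_NODE)  r -= 2;    // PV nodes reduced least of all
    return std::max(r, 0);
}

The point of contention in the rest of the thread is the sign of the CUT-node adjustment: Daniel argues for reducing expected CUT nodes less, as sketched here, while Komodo and Houdini report better results from reducing them more once the first move has failed to cut.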
-
- Posts: 1471
- Joined: Tue Mar 16, 2010 12:00 am
Re: ROC analysis of Late Move Reductions
Don's explanation is correct.

Daniel Shawul wrote: Anyway, even if we disagree on the node definition, I can't get around why you would reduce CUT nodes more. I read Don's reply but I still don't get it, and I suspect others who reduce CUT nodes more may give other justifications.
One can afford to reduce CUT nodes more after the initial move failed to produce a cut, because any error will simply cause a re-search - this time as an ALL node.
The approach requires some special management of the hash table in order to distinguish information coming from CUT and ALL nodes - in view of the different reduction levels the information is not interchangeable.
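Houdini doesn't say how that hash-table separation is implemented; purely as an illustration of the idea (my guess, not his code), one could tag each entry with the expected node type it was stored under and demand extra depth before letting a heavily reduced CUT-node result answer a probe at an ALL node.

Code: Select all
#include <cstdint>

enum NodeType { PV_NODE, CUT_NODE, ALL_NODE };

struct TTEntry {
    uint64_t key;
    int16_t  value;
    int8_t   depth;
    uint8_t  bound;      // lower / upper / exact
    uint8_t  nodeType;   // expected type of the node when the entry was stored
};

// A result obtained under the heavier CUT-node reductions stands for less
// search effort than an ALL-node result of the same nominal depth, so the
// two are not interchangeable.
bool tt_cutoff_allowed(const TTEntry& e, int probeDepth, NodeType probeType) {
    int required = probeDepth;
    if (e.nodeType == CUT_NODE && probeType == ALL_NODE)
        required += 2;                   // illustrative margin, not a tuned value
    return e.depth >= required;
}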
-
- Posts: 4186
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: ROC analysis of Late Move Reductions
Well then, what is the difference from a node that was originally assigned as an ALL node? The question is not whether you can reduce CUT nodes more, but whether you should reduce them more than ALL nodes. After searching 3 moves the node has become an ALL node with >95% probability, so you reduce it as much as you reduce any ALL node. Also, as I said, dynamic node assignment has been proven to be better than static node assignment for other applications, so what is the reason for not using it? I can muster up any excuse to justify an absurd reduction methodology by attaching a 'doesn't give much Elo, but if you do this and only this it works'. The question is whether you had the idea first and then did the implementation, or the reverse.

Houdini wrote: Don's explanation is correct.

Daniel Shawul wrote: Anyway, even if we disagree on the node definition, I can't get around why you would reduce CUT nodes more. I read Don's reply but I still don't get it, and I suspect others who reduce CUT nodes more may give other justifications.
Houdini wrote: One can afford to reduce CUT nodes more after the initial move failed to produce a cut, because any error will simply cause a re-search - this time as an ALL node.
Sounds like gibberish, at least for now..

Houdini wrote: The approach requires some special management of the hash table in order to distinguish information coming from CUT and ALL nodes - in view of the different reduction levels the information is not interchangeable.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: ROC analysis of Late Move Reductions
I saw it, in fact. It is an exact repeat of what I did in the DTS implementation of Cray Blitz. But the concept of "reducing a cut node" is a contradiction in terms if you add "but don't reduce the first move." A true CUT node would only have one move searched.

Don wrote: This is all in the context of how we define a cut node and an all node, which I made clear in my post, which you apparently missed seeing.

bob wrote:

Don wrote: We also reduce cut nodes more. It appears to be safer to do that, and I will explain our reasoning on this. I don't know for sure if this reasoning is correct, but many of our tests show that reducing them more has an unexpected effect which is the opposite of what you might expect. It can actually slow down the search slightly and improve the quality. It's even easy to reason about why this might be happening. Here is how it goes. Keep in mind all of this is from a PVS search framework:

Daniel Shawul wrote: In light of the LMR euphoria, I thought I would bump this thread up, which actually has some substance. I ran some tests regarding LMR and prunings some weeks ago. Reducing ALL nodes more is a loss for me after 10000 games, as you mentioned, but I did it only in the upper parts (depth > 6). If I did it everywhere it is a total loss. As I mentioned before, Rybka and co seem to reduce CUT nodes more, which is weird.
"reduce cut nodes more"? A normal cut node only searches one move, and you never reduce the first move searched. I'm not following here...
My original comment still stands. I believe one reduces moves on the basis of the move, not the position. It is not uncommon for a CUT node to become an ALL node, and vice-versa. If you make decisions on the expected node type, you introduce yet another instability since you would do things differently depending on whether it is really a CUT node or an ALL node..
The problem I saw when doing this in DTS is that once you fail to fail-high on the first move of a CUT node, EVERYTHING along the path is now "in doubt". Every node type could be flipping as you find a new best move, or you find the best move is worse than the aspiration bound.
I won't repeat it in full here but basically we estimate which node type it will be BEFORE it is even searched and we stick with that definition. And our definition has the merits of being much easier to reason about.
The classical definition is useless for a chess program if you actually intend to do anything useful with it.
Also, you don't know a node is a "CUT" node until you get a beta cutoff...
You can guess. You can also flip a coin I suppose??
If you are at a cut node the previous node was an all node. At all nodes you expect all moves to fail low and at cut nodes you expect the first move to fail high and exit. The insight is this: what happens if a cut node does not produce a cutoff? The answer is that the previous move (which is at an ALL node) will fail high and get re-searched. The re-search saves the day! If the heavy reduction at the cut node causes the search to FAIL to produce the expected cutoff (and it would have otherwise), there is NO damage done because of this.
There is no free lunch however. The more aggressive cut node reductions should save time, but that may not necessarily happen. Because it can trigger additional re-searches at the previous node it can actually cause the search to take MORE time. But strangely enough we have seen a slight quality improvement, probably due to these additional re-searches which with LMR will be done at greater depths (because these won't be reduced.)
The way you categorize cut and all nodes could change all of this however. We have a very anal definition because in reality you cannot know until a search is complete, so we decide BEFORE the first move is searched. It cannot change because it always toggles back and forth. If the previous node was a cut node, we consider this an ALL node. The only exception is for PV nodes, which are a special case of ALL nodes. In the PVS framework, the first zero-width window searched from a PV node is by our definition a CUT node, and if you have to do a re-search then it is suddenly promoted to a PV node (as per PVS search) and only then can the cut and all nodes swap positions. In other words, these internal search failures can force the status of every node in the subtree to swap if it propagates back to the last PV node.
it sounds more complicated than it is, but the definition is simple, if you are an odd distance from the last PV node it's a cut node, otherwise it's an all node.
Anyway, I set expected CUT nodes to ALL after a few moves are searched and no cut-offs occur, so most will be ALL nodes anyway. The other test I did was to reduce losing captures by a much smaller amount (1 ply), while other moves could be reduced by as much as 6 plies. That seemed to improve Scorpio a bit, so I have that enabled by default now. I suspect that the captures, checks and other moves that are not reduced form the soul of an LMR-heavy engine, so tampering with those even with small reduction values may be damaging.

Also, I did some tests with aggressive pruning, but this is a loss as always. Move-count pruning never worked for me above depth=2, and it probably never will. I am better off increasing the LMR reduction in the upper part of the tree by one more ply than worrying about pruning in the last plies.
I ended up doing this twice.
Once going down the tree, then again going back up the tree, so that the current CUT node is no longer considered CUT (since we didn't fail high on the first move) and the previous ALL node is now likely a CUT node since we are not failing high, and a fail-low here turns into a fail-high one ply back.
It was complicated, and had a lot of error associated with it.
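For readers trying to follow the "re-search saves the day" argument quoted above, this fragment of the move loop from the earlier PVS sketch shows where the safety net sits. It is only an illustration, using the same placeholder names as before, and the reduction policy shown is the one under discussion, not a statement of what Komodo actually does.

Code: Select all
int r = lmr_reduction(nt, depth, moveCount);   // nt = this node's expected type; larger at expected CUT nodes in the scheme being discussed
int score = -search(pos, -alpha - 1, -alpha, depth - 1 - r, childType);

if (r > 0 && score > alpha)
    // The reduced move unexpectedly beat alpha - typically because the
    // expected cutoff below did not happen - so verify at full depth
    // before believing it.
    score = -search(pos, -alpha - 1, -alpha, depth - 1, childType);

if (nt == PV_NODE && score > alpha && score < beta)
    // At a PV node the verified fail high triggers the usual PVS re-search
    // with the full window, at unreduced depth.
    score = -search(pos, -beta, -alpha, depth - 1, PV_NODE);

An over-aggressive reduction therefore tends to cost re-search time rather than locking in a wrong bound, which is Don's point about the search sometimes taking more time while the quality improves.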
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: ROC analysis of Late Move Reductions
My concern would be that reducing a CUT node more will likely hide the reason it is going to fail high with a normal search. Shallower != better in terms of results, just in terms of performance.

Daniel Shawul wrote: Well then, what is the difference from a node that was originally assigned as an ALL node? The question is not whether you can reduce CUT nodes more, but whether you should reduce them more than ALL nodes. After searching 3 moves the node has become an ALL node with >95% probability, so you reduce it as much as you reduce any ALL node. Also, as I said, dynamic node assignment has been proven to be better than static node assignment for other applications, so what is the reason for not using it? I can muster up any excuse to justify an absurd reduction methodology by attaching a 'doesn't give much Elo, but if you do this and only this it works'. The question is whether you had the idea first and then did the implementation, or the reverse.

Houdini wrote: Don's explanation is correct.

Daniel Shawul wrote: Anyway, even if we disagree on the node definition, I can't get around why you would reduce CUT nodes more. I read Don's reply but I still don't get it, and I suspect others who reduce CUT nodes more may give other justifications.
Houdini wrote: One can afford to reduce CUT nodes more after the initial move failed to produce a cut, because any error will simply cause a re-search - this time as an ALL node.

Sounds like gibberish, at least for now..

Houdini wrote: The approach requires some special management of the hash table in order to distinguish information coming from CUT and ALL nodes - in view of the different reduction levels the information is not interchangeable.
-
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: ROC analysis of Late Move Reductions
What if the second move gives the cutoff? Do you consider that an all node just because the moves were ordered wrong?

bob wrote: I saw it, in fact. It is an exact repeat of what I did in the DTS implementation of Cray Blitz. But the concept of "reducing a cut node" is a contradiction in terms if you add "but don't reduce the first move." A true CUT node would only have one move searched.

Don wrote: This is all in the context of how we define a cut node and an all node, which I made clear in my post, which you apparently missed seeing.

bob wrote:

Don wrote: We also reduce cut nodes more. It appears to be safer to do that, and I will explain our reasoning on this. I don't know for sure if this reasoning is correct, but many of our tests show that reducing them more has an unexpected effect which is the opposite of what you might expect. It can actually slow down the search slightly and improve the quality. It's even easy to reason about why this might be happening. Here is how it goes. Keep in mind all of this is from a PVS search framework:

Daniel Shawul wrote: In light of the LMR euphoria, I thought I would bump this thread up, which actually has some substance. I ran some tests regarding LMR and prunings some weeks ago. Reducing ALL nodes more is a loss for me after 10000 games, as you mentioned, but I did it only in the upper parts (depth > 6). If I did it everywhere it is a total loss. As I mentioned before, Rybka and co seem to reduce CUT nodes more, which is weird.
"reduce cut nodes more"? A normal cut node only searches one move, and you never reduce the first move searched. I'm not following here...
We feel differently. We don't do null pruning at PV nodes, for example, and repeated experiments have shown us that this hurts the program. I don't really understand why, I just know that the node type matters for Komodo. Komodo seems to be a relatively strong program despite all these blunders I am making.
My original comment still stands. I believe one reduces moves on the basis of the move, not the position. It is not uncommon for a CUT node to become an ALL node, and vice-versa. If you make decisions on the expected node type, you introduce yet another instability since you would do things differently depending on whether it is really a CUT node or an ALL node..
The problem I saw when doing this in DTS is that once you fail to fail-high on the first move of a CUT node, EVERYTHING along the path is now "in doubt". Every node type could be flipping as you find a new best move, or you find the best move is worse than the aspiration bound.
I won't repeat it in full here but basically we estimate which node type it will be BEFORE it is even searched and we stick with that definition. And our definition has the merits of being much easier to reason about.
The classical definition is useless for a chess program if you actually intend to do anything useful with it.
Also, you don't know a node is a "CUT" node until you get a beta cutoff...
You can guess. You can also flip a coin I suppose??
If you are at a cut node the previous node was an all node. At all nodes you expect all moves to fail low and at cut nodes you expect the first move to fail high and exit. The insight is this: what happens if a cut node does not produce a cutoff? The answer is that the previous move (which is at an ALL node) will fail high and get re-searched. The re-search saves the day! If the heavy reduction at the cut node causes the search to FAIL to produce the expected cutoff (and it would have otherwise), there is NO damage done because of this.
There is no free lunch however. The more aggressive cut node reductions should save time, but that may not necessarily happen. Because it can trigger additional re-searches at the previous node it can actually cause the search to take MORE time. But strangely enough we have seen a slight quality improvement, probably due to these additional re-searches which with LMR will be done at greater depths (because these won't be reduced.)
The way you categorize cut and all nodes could change all of this however. We have a very anal definition because in reality you cannot know until a search is complete, so we decide BEFORE the first move is searched. It cannot change because it always toggles back and forth. If the previous node was a cut node, we consider this an ALL node. The only exception is for PV nodes, which are a special case of ALL nodes. In the PVS framework, the first zero-width window searched from a PV node is by our definition a CUT node, and if you have to do a re-search then it is suddenly promoted to a PV node (as per PVS search) and only then can the cut and all nodes swap positions. In other words, these internal search failures can force the status of every node in the subtree to swap if it propagates back to the last PV node.
it sounds more complicated than it is, but the definition is simple, if you are an odd distance from the last PV node it's a cut node, otherwise it's an all node.
Anyway, I set expected CUT nodes to ALL after a few moves are searched and no cut-offs occur, so most will be ALL nodes anyway. The other test I did was to reduce losing captures by a much smaller amount (1 ply), while other moves could be reduced by as much as 6 plies. That seemed to improve Scorpio a bit, so I have that enabled by default now. I suspect that the captures, checks and other moves that are not reduced form the soul of an LMR-heavy engine, so tampering with those even with small reduction values may be damaging.

Also, I did some tests with aggressive pruning, but this is a loss as always. Move-count pruning never worked for me above depth=2, and it probably never will. I am better off increasing the LMR reduction in the upper part of the tree by one more ply than worrying about pruning in the last plies.
I ended up doing this twice.
Once going down the tree, then again going back up the tree, so that the current CUT node is no longer considered CUT (since we didn't fail high on the first move) and the previous ALL node is now likely a CUT node since we are not failing high, and a fail-low here turns into a fail-high one ply back.
It was complicated, and had a lot of error associated with it.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
-
- Posts: 9
- Joined: Tue Mar 12, 2013 6:00 pm
- Location: Netherlands
Re: ROC analysis of Late Move Reductions
It's a guess indeed, but an educated guess. If your guess is right most of the time it can still be very useful. My experience with ALL/CUT guesses is, however, not positive. They are wrong in about 30% of positions, so I do not use them to decide if or how to do LMR.

bob wrote: I don't even follow the concept of doing LMR at just ALL nodes. First, you don't know a node is an ALL node until all moves have been searched without a fail-high. Otherwise you can only guess. LMR applies to moves, not positions. Why would you not reduce a bad move whether it is in an ALL node, a CUT node or a PV node? A bad move is a bad move...

Daniel Shawul wrote: In light of the LMR euphoria, I thought I would bump this thread up, which actually has some substance. I ran some tests regarding LMR and prunings some weeks ago. Reducing ALL nodes more is a loss for me after 10000 games, as you mentioned, but I did it only in the upper parts (depth > 6). If I did it everywhere it is a total loss. As I mentioned before, Rybka and co seem to reduce CUT nodes more, which is weird. Anyway, I set expected CUT nodes to ALL after a few moves are searched and no cut-offs occur, so most will be ALL nodes anyway. The other test I did was to reduce losing captures by a much smaller amount (1 ply), while other moves could be reduced by as much as 6 plies. That seemed to improve Scorpio a bit, so I have that enabled by default now. I suspect that the captures, checks and other moves that are not reduced form the soul of an LMR-heavy engine, so tampering with those even with small reduction values may be damaging.

Also, I did some tests with aggressive pruning, but this is a loss as always. Move-count pruning never worked for me above depth=2, and it probably never will. I am better off increasing the LMR reduction in the upper part of the tree by one more ply than worrying about pruning in the last plies.

Move number is a much better predictor of whether a node is an ALL node. The move number two plies lower is also helpful: if the first move two plies lower did not give a cutoff, that increases the likelihood that this node is an ALL node.
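As an illustration of that last point (my own sketch, not the poster's code), an LMR rule driven purely by move counts might look like this, where grandParentMoveCount is the move number currently being searched two plies up and all thresholds are made-up numbers.

Code: Select all
// Drive LMR from move counts instead of a static CUT/ALL guess: the later we
// are in this node's move list, and the later the node two plies up is in its
// own move list, the more this node looks like an ALL node.
int lmr_reduction(int depth, int moveCount, int grandParentMoveCount) {
    int r = 0;
    if (moveCount > 3)              r += 1;  // this node has not cut off quickly
    if (grandParentMoveCount > 1)   r += 1;  // two plies up, the first move did not cut off either
    if (depth > 8 && moveCount > 8) r += 1;  // deep node, far down the move list: reduce more
    return r;                                // illustrative numbers only
}

With a rule like this, the roughly 30% error rate of the ALL/CUT guess mentioned above never enters the decision, because no node-type guess is consulted at all.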