Houdini 3 reducing the depth feature


bob

Re: Houdini 3 reducing the depth feature

Post by bob »

Don wrote:
bob wrote:
Don wrote:
bob wrote: The issue seemed to revolve around the definition of "PV node". For example, after making the first move, we reach ply=2. One way of looking at this is that this position is a PV node and NOTHING unusual happens to ANY move here. That seems wrong. The other way, not the way I was initially thinking, was that just the move itself that is searched with an open window is not reduced, or not reduced as much.

So I suppose the definition of "PV node" might need a slight modification. In normal discussion, a PV node, as defined in Knuth/Moore ("An Analysis of Alpha-Beta Pruning"), was a node on the left-hand side of the graph, one PV node per ply, so the "left-most" nodes. I "think", however, that we are talking about "pv moves" instead, which is a different thing.

Do you disagree with any of that? And do you agree that we "seem" to be talking about PV moves rather than the classic PV nodes? Because I certainly reduce lots of moves at PV nodes. Such as doing LMR at the root, which was better in my testing...

I don't think the last chapter of what to do on a fail low or a fail high has been written. And I am almost certain that they should not be treated exactly the same way, even though this is exactly what I do at the present...
The root position is a pv node. A node is a position of course, not a move.

From any pv node the first move played takes us to another pv node. The second move searched and beyond do not lead to pv nodes. If they are searched in the PVS framework with a zero-width window, then when one of those searches fails high, the move is searched again AS a pv node, i.e. it is treated as if it were the first move, leading to a new PV node.

Another way to look at this is that all pv nodes have some alpha and beta window, but the zero width window searches are always "scout" searches, designed to always fail. A scout search is never a pv node of course.

For clarity it is useful to have 2 searches. One is searchPV and the other is searchNonPV. searchNonPV could be called searchNullWindow if you wanted. searchPV often calls searchNonPV, but searchNonPV never calls searchPV.
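
In skeleton form it looks something like this - a bare C sketch with names of my own choosing (this is nobody's actual source); mate scores, quiescence, hashing, and all pruning are omitted:

Code:

typedef struct Position Position;
typedef int Move;                         /* 0 = no more moves */

/* Engine plumbing, assumed to exist elsewhere: */
extern Move next_move(Position *pos);
extern void make_move(Position *pos, Move m);
extern void unmake_move(Position *pos, Move m);
extern int  evaluate(Position *pos);

int search_nonpv(Position *pos, int beta, int depth);

/* A PV node: the first move gets the full window; the rest get
   zero-width scout searches and are promoted only on a fail high. */
int search_pv(Position *pos, int alpha, int beta, int depth) {
    if (depth == 0) return evaluate(pos);
    int first = 1;
    for (Move m = next_move(pos); m; m = next_move(pos)) {
        int score;
        make_move(pos, m);
        if (first) {
            score = -search_pv(pos, -beta, -alpha, depth - 1);
            first = 0;
        } else {
            score = -search_nonpv(pos, -alpha, depth - 1);  /* scout (alpha, alpha+1) */
            if (score > alpha && score < beta)              /* scout failed high: promote */
                score = -search_pv(pos, -beta, -alpha, depth - 1);
        }
        unmake_move(pos, m);
        if (score > alpha) {
            alpha = score;
            if (alpha >= beta) break;
        }
    }
    return alpha;
}

/* The zero-width search: window is (beta-1, beta).  It calls itself,
   never search_pv(). */
int search_nonpv(Position *pos, int beta, int depth) {
    if (depth == 0) return evaluate(pos);
    for (Move m = next_move(pos); m; m = next_move(pos)) {
        make_move(pos, m);
        int score = -search_nonpv(pos, 1 - beta, depth - 1);
        unmake_move(pos, m);
        if (score >= beta) return score;                    /* fail high */
    }
    return beta - 1;                                        /* fail low */
}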

I am not sure that conforms to Knuth/Moore without looking at it again. If you assume a perfectly ordered tree it would always be the left-most node, i.e. the first move played, but in practice a non-pv node sometimes has to be "promoted" to a pv node, which means it "should have" been the first move searched.

The confusion arises when you mix up the term "move" with "node", and this gave Larry a lot of trouble when we talked about it and I was trying to get him to understand. If you have 36 moves that can be played from a PV node, Larry made the mistake of thinking of them as "pv moves" because they were searched inside the PV search. But except for 1 (or any that got promoted), they were mostly moves that led to non-pv nodes.

In a program such as yours, which does not explicitly make the distinction, it's difficult to identify whether a given node is a pv node or not. If you use PVS search you could say that if the window is zero width it's probably not a pv node, but I think it could still be a pv node in some circumstances, although I believe that is a very temporary situation.
Nothing out of the way with what you wrote. The issue is: what do you do, or not do, at a PV "node"? The root position is ALWAYS a PV node. If you treat it as such, that could mean that at THAT node, you do everything the same way. Which means treating all moves equally. But after making the first move at the root, and going to another PV node, when you come back to the root and try a different move, you end up at a non-PV node unless move ordering is broken.

The issue is, then, are we talking about treating NODES differently (PV vs ALL/CUT) or are we talking about treating MOVES differently? That is where the confusion seems to come in.

What you describe (search PV vs search non-PV) is not about nodes. It is about moves. Because you are treating moves at a PV node differently depending on whether they are searched first, or later.

In Crafty, I treat everything equally, with the only exception being that I don't do null-move at a PV "node", because (a) it can't fail high and (b) I can't allow the result to become the "best move/score" here since a null-move is illegal. But everything else that I do is done uniformly across all moves...
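
In code that exception is literally one test on the window (a sketch; the names and the depth limit are illustrative, not the actual Crafty source):

Code:

/* With PVS a PV node is simply one searched with a wider-than-zero
   window, so the test is on the window itself. */
int null_move_allowed(int alpha, int beta, int depth, int in_check) {
    int pv_node = (beta > alpha + 1);    /* zero-width means non-PV */
    return !pv_node                      /* reasons (a) and (b) above */
        && !in_check                     /* null move in check is illegal anyway */
        && depth >= 2;                   /* illustrative depth limit */
}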

I don't think in ANY search you can accurately say "this is a PV node or this is not." Because we don't have perfect move ordering, and non-PV nodes morph into PV nodes occasionally, when move ordering is wrong. And, of course, PV nodes become non-PV nodes for the same reason.
The general answer is that the PV node is handled much more conservatively with respect to pruning and reductions. And you can be a bit more aggressive with respect to extensions in the PV nodes too.

In PV nodes we do not do null move, futility pruning or any of those sorts of things.
That seems wrong to me. Why? Why not prune bad moves? Because you want more accuracy in the PV? That makes no sense to me because what if the SECOND move you search is really the best? You prune it more heavily? So that it can't become the PV move?

This all seems to be theoretically unsound. If pruning is unsafe anywhere, it must be unsafe everywhere, so why do it?

If this idea works, it works for the wrong reason. There MUST be a way to prune correctly so that not pruning on PV nodes is not a stopgap measure that works due to poor pruning decisions...

That was my point in the initial discussion about treating PV vs non-PV moves/nodes differently.

I might not prune unless I am within N plies of a leaf, but I believe that I should prune ANYWHERE that meets my pruning criteria. Otherwise I am wasting even more time on the PV because my pruning avoids (theoretically, anyway) expending effort that is wasted...
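
To make "criteria" concrete, the kind of test I mean looks like this (a sketch; the margins and the 3-ply limit are invented for illustration):

Code:

/* Prune a quiet move near a leaf if an optimistic margin still cannot
   lift the static eval above alpha.  Applies anywhere in the tree that
   meets the criteria - PV or not. */
int futile_move(int static_eval, int alpha, int depth,
                int is_capture, int gives_check) {
    static const int margin[] = { 0, 125, 250, 300 };   /* invented numbers */
    if (depth > 3 || is_capture || gives_check)
        return 0;                 /* only quiet moves within 3 plies of a leaf */
    return static_eval + margin[depth] <= alpha;
}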




In some sense the conceptual model is that the PV search is the "real" search and the only one that matters.
I don't even remotely agree with that. If that is the case, why not just search the first move and quit and move on to the next iteration? If you search the rest of the moves, you must expect one of them to become the new PV on occasion. How can they if you whack the trees apart with aggressive pruning? That's a point I simply don't get...


That makes it ridiculously selective. Everything else can be viewed as a black box that determines whether a move should be included in this highly selective PV search. In other words, think of everything else as a candidate move generator for a highly selective program. As a candidate move generator one would expect it to "throw out" moves faster than they would normally be searched, and the non-pv search does this by means of null move pruning, futility pruning, highly aggressive LMR, and some surprisingly aggressive forward pruning. It does not select moves; it only decides whether a move is worthy of being included in the "real" search.

This also explains why the branching factor of modern chess programs is 2 or even less.
Not really. If you avoid pruning on the first move, how much of the tree do you get rid of? That tree becomes 90% of the total effort and you can't reduce that further without resorting to more pruning. Hence my disagreement with not pruning on PV nodes. That's the biggest part of the tree, and the part where the most savings are possible.


We are essentially just doing a PV search where we are looking (usually) at just 1 or 2 moves and throwing out everything else.

If you think of the non-pv nodes as part of the "real" search you are missing the point. Of course it's semantics, but I see these nodes as simply part of an elaborate recursive candidate move generator in a highly selective search.

How do they promote to "needs to be searched further"? That is the problem...
Don

Re: Houdini 3 reducing the depth feature

Post by Don »

bob wrote:

In some sense the conceptual model is that the PV search is the "real" search and the only one that matters.
I don't even remotely agree with that. If that is the case, why not just search the first move and quit and move on to the next iteration? ....
Because you have to test the other moves with your "candidate move generator" to see if they should be included. Like any selective search, you only want to include moves that are likely to be best. Sometimes only one move fits this criterion; other times many may.

So the candidate move generator works like this:

1. Always include the first move that would be generated, i.e. hash move, top capture, etc. In good chess programs these are extremely likely to be the best move.

2. Otherwise, do a "scout" search around alpha. Use some forward pruning and such in the zero width window search to speed it up.

3. If the zero width window search fails high, add it to the list of moves to be included in the "real" search.

This is just about the smartest candidate move generator you can build. It happens to do a search; it is scalable because it is called with a parameter that determines how good to make it (we call it depth), and it is also passed information it needs, such as the score of the best move found so far among the siblings.
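
Spelled out, the generator is nothing more than this (a sketch; search_nonpv() is the aggressively pruned zero-width search, and the names are again mine):

Code:

typedef struct Position Position;
typedef int Move;
extern void make_move(Position *pos, Move m);
extern void unmake_move(Position *pos, Move m);
extern int  search_nonpv(Position *pos, int beta, int depth);  /* window (beta-1, beta) */

/* Decide whether a move is worth including in the "real" (PV) search. */
int is_candidate(Position *pos, Move m, int move_number, int alpha, int depth) {
    if (move_number == 1)
        return 1;                        /* step 1: the first move is always in */
    make_move(pos, m);                   /* step 2: scout around alpha */
    int score = -search_nonpv(pos, -alpha, depth - 1);
    unmake_move(pos, m);
    return score > alpha;                /* step 3: fail high => candidate */
}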

... If you search the rest of the moves, you must expect one of them to become the new PV on occasion. How can they if you whack the trees apart with aggressive pruning? That's a point I simply don't get...
They WILL become the best move on occasion, otherwise this wouldn't work. I think the concept you are missing is that the "candidate move generator" or scout search should be viewed as a mechanism to suggest candidates, not to provide moves that are guaranteed to be best.

I think your "cognitive" model is that you do the scout search to quickly prove the move sucks, but if it doesn't you have "proved" a new best move and the full window search is only done to get the correct score and PV. But if you stop thinking of it that way you will understand. Just think of it as a gadget that is "pretty good" at "suggesting" moves to try.

And to answer your next objection: YES, it's not as reliable in the full-width sense as the anal and slow method of treating it like a brute force search, but that is why it's so selective.

Even though it seems wrong to you, Crafty is doing this anyway without your permission. The null move search, even if you don't explicitly code it this way, is mostly giving you big speedups in non-pv nodes. It's making the non-pv nodes come out different - faster and less safe than PV nodes. In other words your scout search sucks, just like mine and those of all good programs! Of course that is a relative term.

Probably another way to look at this is that you should leave most of the selectivity to the "candidate move generator" and not try too hard to also make the primary tree selective. You can be critical of this if you want, but this is what all the very best programs are doing.
That makes it ridiculously selective. Everything else can be viewed as black box to determine if a move should be included in this highly selective PV search. In other words, think of everything else as a candidate move generator for a highly selective program. To be a candidate move generator one would expect it to "throw out" moves faster than they would normally be searched and so the non-pv search does this by means of using null move pruning, futility pruning, highly aggressive LMR, and some surprisingly aggressive forward pruning. It does not select moves, it only decides whether a move is worthy of being included in the "real" search.

This also explains why the branching factor of modern chess programs is 2 or even less.
Not really. If you avoid pruning on the first move, how much of the tree do you get rid of? That tree becomes 90% of the total effort and you can't reduce that further without resorting to more pruning. Hence my disagreement with not pruning on PV nodes. that's the biggest part of the tree, and the part where the most savings are possible.
I think this is a matter of semantics and how you count. You alluded to this a few posts ago. The PV is 90% of the effort only because it has this really expensive but awesome candidate move generator. Your program and mine spend very little time in actual PV nodes. If you time the first move you may find that 90% of the time is spent there while the other 19 moves (from the opening position) fly by, but that's clearly not the same as saying most of the work is in the pv nodes. In Komodo the number of non-pv nodes is orders of magnitude larger than the number of PV nodes, and that is true of your program too.

I'm not saying you are wrong, it's just about how you look at it.

We are essentially just doing a PV search where we are looking (usually) at just 1 or 2 moves and throwing out everything else.

If you think of the non-pv nodes as part of the "real" search you are missing the point. Of course it's semantics, but I see these nodes as simply part of an elaborate recursive candidate move generator in a highly selective search.
How do they promote to "needs to be searched further"? That is the problem...
If the candidate move generator (scout search) returns with a score higher than alpha, it means the move is suspiciously good and should not be "thrown out" - it should be included in the proper tree (promoted to a PV move and searched for real).

The devil is in the details. Getting it balanced is the trick, and in Komodo we explicitly have 2 separate functions so that we never forget they are conceptually different. When we experiment with extensions we do them separately. Our LMR rules are totally different and tested separately, and the PV LMR is a lot more paranoid.
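
For instance, the two reduction amounts are different curves altogether, something like this in shape (the formula and constants are invented for illustration - they are not Komodo's actual numbers):

Code:

#include <math.h>

/* Separate LMR amounts for PV and non-PV nodes; the PV curve is
   deliberately far more paranoid. */
int lmr_reduction(int pv_node, int depth, int move_number) {
    if (depth < 3 || move_number < 4)
        return 0;                               /* too shallow or too early */
    double r = log((double)depth) * log((double)move_number) / 2.0;
    if (pv_node)
        r = r / 2.0 - 0.5;                      /* reduce less, and later */
    return r > 0.0 ? (int)r : 0;
}
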
bob

Re: Houdini 3 reducing the depth feature

Post by bob »

Rebel wrote:
bob wrote:
Rebel wrote:
bob wrote: I understand the idea, although I do not buy "all in" that it is correct. Why would one do things so differently in the PV, that you make it very hard for a non-PV move to fail high in the first place??? I don't personally assume that this is the way things have to be done, it's just the way some do it at the moment. From a theoretical perspective, it seems wrong, however.
I am not sure how an idea that produces elo can be wrong. About ten years ago I added (what I called) the PVS extension as described here. It's (still) good for 20-30 elo. When I would reduce the depth at a fail-high it could still produce the good score and mainline, since the hash table has all the best moves right, resulting in several to many extensions.
This is similar to the tt-singular idea in robolito and friends. I've tried several variations, but not your "4 tt moves in a row to trigger an extension." Won't work for me, period, because I use the speed optimization of storing the FIRST move I searched whenever I have no best move to store (at an ALL node, for example). Since I search the TT move without generating moves, it can occasionally save time. It is a performance enhancement rather than a search improvement (I search the same tree if I don't do this, but I have to generate moves first; occasionally an ALL node becomes a CUT node and this "pseudo-best-move" will cause a fail high with less effort, i.e. no move generation). But this means I would be extending way too much using your idea, since every hash entry has a "best move" even though it is often not a "best move" at all, just "a legal move".
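
The mechanics are trivial (a sketch of the idea, not the actual Crafty code):

Code:

typedef struct Position Position;
typedef int Move;                             /* 0 = no move */
extern Move tt_best_move(Position *pos);      /* move from the hash entry */
extern int  pseudo_legal(Position *pos, Move m);
extern int  search_move(Position *pos, Move m, int alpha, int beta, int depth);
extern Move *generate_moves(Position *pos, int *n);

/* Every hash entry stores SOME move (at an ALL node, the first move
   searched), so the hash move can always be tried before paying for
   move generation.  Occasionally it even produces a cheap fail high. */
int search_node(Position *pos, int alpha, int beta, int depth) {
    Move hm = tt_best_move(pos);
    if (hm && pseudo_legal(pos, hm)) {
        int score = search_move(pos, hm, alpha, beta, depth);
        if (score >= beta)
            return score;                     /* cutoff with no generation done */
        if (score > alpha)
            alpha = score;
    }
    int n;
    Move *list = generate_moves(pos, &n);     /* generate only when needed */
    (void)list;                               /* ... search list[0..n-1] as usual ... */
    return alpha;
}
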
The question was whether reducing the depth at a fail-high can produce the real score, and the answer is yes. I noticed Richard told you the same thing.
"yes" is not helpful. "How" would be. If you back up several iterations, how can you hope to get the same score you will get from several iterations deeper???

Until you get to the same depth, anyway...
bob

Re: Houdini 3 reducing the depth feature

Post by bob »

bob wrote:
Rebel wrote:
bob wrote:
Rebel wrote:
bob wrote: I understand the idea, although I do not buy "all in" that it is correct. Why would one do things so differently in the PV, that you make it very hard for a non-PV move to fail high in the first place??? I don't personally assume that this is the way things have to be done, it's just the way some do it at the moment. From a theoretical perspective, it seems wrong, however.
I am not sure how an idea that produces elo can be wrong. About ten years ago I added (what I called) the PVS extension as described here. It's (still) good for 20-30 elo. When I would reduce the depth at a fail-high it could still produce the good score and mainline, since the hash table has all the best moves right, resulting in several to many extensions.
This is similar to the tt-singular idea in robolito and friends. I've tried several variations, but not your "4 tt moves in a row to trigger an extension." Won't work for me, period, because I use the speed optimization of storing the FIRST move I searched whenever I have no best move to store (at an ALL node, for example). Since I search the TT move without generating moves, it can occasionally save time. It is a performance enhancement rather than a search improvement (I search the same tree if I don't do this, but I have to generate moves first; occasionally an ALL node becomes a CUT node and this "pseudo-best-move" will cause a fail high with less effort, i.e. no move generation). But this means I would be extending way too much using your idea, since every hash entry has a "best move" even though it is often not a "best move" at all, just "a legal move".
The question was whether reducing the depth at a fail-high can produce the real score, and the answer is yes. I noticed Richard told you the same thing.
"yes" is not helpful. "How" would be. If you back up several iterations, how can you hope to get the same score you will get from several iterations deeper???

Until you get to the same depth, anyway...
I get the point of backing up a bit, because move ordering is now broken, and backing up a few plies gives you easier searches to establish some move ordering for the new move. But "the score"? Can't see how it can be correct until you get to the depth where it fails high, otherwise it would have failed high at earlier depths, don't you think???

So maybe you get "a score" when you back up, and by the time you get back to the right depth you get "the score". That I would buy.
bob

Re: Houdini 3 reducing the depth feature

Post by bob »

Don wrote:
bob wrote:

In some sense the conceptual model is that the PV search is the "real" search and the only one that matters.
I don't even remotely agree with that. If that is the case, why not just search the first move and quit and move on to the next iteration? ....
Because you have to test the other moves with your "candidate move generator" to see if they should be included. Like any selective search, you only want to include moves that are likely to be best. Sometimes only one move fits this criterion; other times many may.
This is a circular argument. You are not really "testing" those candidates if you prune them far more aggressively than the PV was pruned. Hence my argument that pruning differently on the PV than on non-PV moves seems to be counter-productive and makes it HARDER for those moves to become "good enough". Or perhaps makes it easier for a move to become good enough, even when it really isn't. Depending on how the aggressive pruning shapes the tree.



So the candidate move generator works like this:

1. Always include the first move that would be generated, i.e. hash move, top capture, etc. In good chess programs these are extremely likely to be the best move.
Roughly a 5/6 probability, actually. About 1 out of 6 times, you will change your mind on the depth N+1 search.


2. Otherwise, do a "scout" search around alpha. Use some forward pruning and such in the zero width window search to speed it up.

3. If the zero width window search fails high, add it to the list of moves to be included in the "real" search.

This is just about the smartest candidate move generator you can build. It happens to do a search; it is scalable because it is called with a parameter that determines how good to make it (we call it depth), and it is also passed information it needs, such as the score of the best move found so far among the siblings.

... If you search the rest of the moves, you must expect one of them to become the new PV on occasion. How can they if you whack the trees apart with aggressive pruning? That's a point I simply don't get...
They WILL become the best move on occasion, otherwise this wouldn't work. I think the concept you are missing is that the "candidate move generator" or scout search should be viewed as a mechanism to suggest candidates, not to provide moves that are guaranteed to be best.

I think your "cognitive" model is that you do the scout search to quickly prove the move sucks, but if it doesn't you have "proved" a new best move and the full window search is only done to get the correct score and PV. But if you stop thinking of it that way you will understand. Just think of it as a gadget that is "pretty good" at "suggesting" moves to try.

And to answer your next objection: YES, it's not as reliable in the full-width sense as the anal and slow method of treating it like a brute force search, but that is why it's so selective.

Even though it seems wrong to you, Crafty is doing this anyway without your permission. The null move search, even if you don't explicitly code it this way, is mostly giving you big speedups in non-pv nodes. It's making the non-pv nodes come out different - faster and less safe than PV nodes. In other words your scout search sucks, just like mine and those of all good programs! Of course that is a relative term.

Probably another way to look at this is that you should leave most of the selectivity to the "candidate move generator" and not try too hard to also make the primary tree selective. You can be critical of this if you want, but this is what all the very best programs are doing.
That makes it ridiculously selective. Everything else can be viewed as a black box that determines whether a move should be included in this highly selective PV search. In other words, think of everything else as a candidate move generator for a highly selective program. As a candidate move generator one would expect it to "throw out" moves faster than they would normally be searched, and the non-pv search does this by means of null move pruning, futility pruning, highly aggressive LMR, and some surprisingly aggressive forward pruning. It does not select moves; it only decides whether a move is worthy of being included in the "real" search.

This also explains why the branching factor of modern chess programs is 2 or even less.
Not really. If you avoid pruning on the first move, how much of the tree do you get rid of? That tree becomes 90% of the total effort and you can't reduce that further without resorting to more pruning. Hence my disagreement with not pruning on PV nodes. That's the biggest part of the tree, and the part where the most savings are possible.
I think this is a matter of semantics and how you count. You alluded to this a few posts ago. The PV is 90% of the effort only because it has this really expensive but awesome candidate move generator. Your program and mine spend very little time in actual PV nodes. If you time the first move you may find that 90% of the time is spent there while the other 19 moves (from the opening position) fly by, but that's clearly not the same as saying most of the work is in the pv nodes. In Komodo the number of non-pv nodes is orders of magnitude larger than the number of PV nodes, and that is true of your program too.

I'm not saying you are wrong, it's just about how you look at it.

We are essentially just doing a PV search where we are looking (usually) at just 1 or 2 moves and throwing out everything else.

If you think of the non-pv nodes as part of the "real" search you are missing the point. Of course it's semantics, but I see these nodes as simply part of an elaborate recursive candidate move generator in a highly selective search.
How do they promote to "needs to be searched further"? That is the problem...
If the candidate move generator (scout search) returns with a score higher than alpha, it means the move is suspiciously good and should not be "thrown out" - it should be included in the proper tree (promoted to a PV move and searched for real).

The devil is in the details. Getting it balanced is the trick, and in Komodo we explicitly have 2 separate functions so that we never forget they are conceptually different. When we experiment with extensions we do them separately. Our LMR rules are totally different and tested separately, and the PV LMR is a lot more paranoid.
Here's the thing I do not like about the concept. I have found, during debugging, positions where move A is better. But if I force a different move, I get a better score for that move. Why? pruning and reductions. I always search at least one move from any node with no reductions, for safety (this from lots of testing by the way). But if any of the other moves are important, and they get reduced, then I won't see the benefit and overlook something. Such is the nature of reductions and pruning, obviously. I simply believe that moves ought to be reduced or pruned based on something other than whether they are part of a list of moves at a PV node or at a non-PV node...

I don't claim my vision of this is correct. I do claim it is logical, and I want to see something concrete to convince me otherwise, something beyond "it looks better in testing." The question to that last response is: "Better? Maybe. But is there something ELSE that is better still, that can be used to trigger or limit these decisions?"
syzygy

Re: Houdini 3 reducing the depth feature

Post by syzygy »

bob wrote:That seems wrong to me. Why? Why not prune bad moves? Because you want more accuracy in the PV? That makes no sense to me because what if the SECOND move you search is really the best? You prune it more heavily? So that it can't become the PV move?
But Crafty uses a form of LMR, right? So at a PV node, at least some of the moves that are searched after the first move (the "PV move") are searched with reduced depth. If one such move is "really the best", the search will only notice it if this reduced-depth search already fails high.

So Crafty is treating the "PV move" differently from non-PV moves. But I don't see that as a problem.
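
I.e. the usual pattern for a late move at a PV node, using the two-function convention from earlier in the thread (a sketch; pos is the position after the move has been made):

Code:

typedef struct Position Position;
extern int search_pv(Position *pos, int alpha, int beta, int depth);
extern int search_nonpv(Position *pos, int beta, int depth);  /* window (beta-1, beta) */

/* A late move is scouted at reduced depth first; only if that fails
   high is it re-searched at full depth, and only if it still beats
   alpha does it get the full PV window. */
int search_late_move(Position *pos, int alpha, int beta,
                     int depth, int reduction) {
    int score = -search_nonpv(pos, -alpha, depth - 1 - reduction);
    if (score > alpha && reduction > 0)       /* reduced scout failed high */
        score = -search_nonpv(pos, -alpha, depth - 1);
    if (score > alpha && score < beta)        /* promote to a PV move */
        score = -search_pv(pos, -beta, -alpha, depth - 1);
    return score;
}
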
This all seems to be theoretically unsound. If pruning is unsafe anywhere, it must be unsafe everywhere, so why do it?
It's only unsound in the sense that you won't get the same result as a full width search. Of course a full width search is better if you can still reach the same search depth, but without reductions and pruning you can't reach the same search depth.
bob

Re: Houdini 3 reducing the depth feature

Post by bob »

syzygy wrote:
bob wrote:That seems wrong to me. Why? Why not prune bad moves? Because you want more accuracy in the PV? That makes no sense to me because what if the SECOND move you search is really the best? You prune it more heavily? So that it can't become the PV move?
But Crafty uses a form of LMR, right? So at a PV node, at least some of the moves that are searched after the first move (the "PV move") are searched with reduced depth. If one such move is "really the best", the search will only notice it if this reduced-depth search already fails high.

So Crafty is treating the "PV move" differently from non-PV moves. But I don't see that as a problem.
This all seems to be theoretically unsound. If pruning is unsafe anywhere, it must be unsafe everywhere, so why do it?
It's only unsound in the sense that you won't get the same result as a full width search. Of course a full width search is better if you can still reach the same search depth, but without reductions and pruning you can't reach the same search depth.
As I said, treating "the PV move" differently is forced. You can't prune it, or it won't be a PV move choice. Etc. But at a PV "node" there are lots of moves to search, and I treat the remainder of those moves no differently than I treat moves at any other (non-PV) node. In fact, for consistency (after a lot of testing) I don't reduce the first move searched from any node, very much like the "PV move" concept. But the rest are subject to reductions or pruning depending on each specific move's characteristics...

So this is not exactly about doing things full-width or brute force, it is just about treating all moves equally, so that any have a fair chance of becoming the new best move... Unless they meet criteria to reduce, or prune, or whatever.
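
What I mean by per-move criteria is roughly this (a sketch; these are the kinds of tests I mean, not Crafty's exact rules):

Code:

typedef int Move;
extern int see(Move m);                   /* static exchange evaluation */
extern int is_killer(Move m);
extern int is_capture(Move m);
extern int gives_check(Move m);

/* A reduction decision based on the move itself, applied identically
   at PV and non-PV nodes. */
int may_reduce(Move m, int move_number, int in_check) {
    if (move_number == 1) return 0;       /* the first move is never reduced */
    if (in_check || gives_check(m)) return 0;
    if (is_killer(m)) return 0;
    if (is_capture(m) && see(m) >= 0) return 0;   /* SEE-good captures stay */
    return 1;                             /* everything else is fair game */
}
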
syzygy

Re: Houdini 3 reducing the depth feature

Post by syzygy »

bob wrote: As I said, treating "the PV move" differently is forced. You can't prune it, or it won't be a PV move choice. Etc.
I agree there's little choice here, but the point remains that PV moves are a bit different, and that when using LMR (and other techniques) you're "disadvantaging" later moves that would turn out to be better had you not reduced or pruned them.
But at a PV "node" there are lots of moves to search, and I treat the remainder of those moves no differently than I treat moves at any other (non-PV) node.
Ok, but the observation that started this thread does not imply that H3 treats moves differently. It might, but that seems to have little to do with the idea of first resolving a fail high to the reduced depth at which the failing-high move was originally searched. (After resolving it at reduced depth, H3 continues to resolve it at normal depth; at least that's what I understand from the H3 analysis that Robert has been giving here for various positions over the last few months.)
So this is not exactly about doing things full-width or brute force, it is just about treating all moves equally, so that any have a fair chance of becoming the new best move... Unless they meet criteria to reduce, or prune, or whatever.
Agreed, you give all other moves the same chance (depending on move criteria), but it is not the same chance as the first move is getting.
Rebel
Full name: Ed Schröder

Re: Houdini 3 reducing the depth feature

Post by Rebel »

bob wrote:
bob wrote:
Rebel wrote:
bob wrote:
Rebel wrote:
bob wrote: I understand the idea, although I do not buy "all in" that it is correct. Why would one do things so differently in the PV, that you make it very hard for a non-PV move to fail high in the first place??? I don't personally assume that this is the way things have to be done, it's just the way some do it at the moment. From a theoretical perspective, it seems wrong, however.
I am not sure how an idea that produces elo can be wrong. About ten years ago I added (what I called) the PVS extension as described here. It's (still) good for 20-30 elo. When I would reduce the depth at a fail-high it could still produce the good score and mainline, since the hash table has all the best moves right, resulting in several to many extensions.
This is similar to the tt-singular idea in robolito and friends. I've tried several variations, but not your "4 tt moves in a row to trigger an extension." Won't work for me, period, because I use the speed optimization of storing the FIRST move I searched whenever I have no best move to store (at an ALL node, for example). Since I search the TT move without generating moves, it can occasionally save time. It is a performance enhancement rather than a search improvement (I search the same tree if I don't do this, but I have to generate moves first; occasionally an ALL node becomes a CUT node and this "pseudo-best-move" will cause a fail high with less effort, i.e. no move generation). But this means I would be extending way too much using your idea, since every hash entry has a "best move" even though it is often not a "best move" at all, just "a legal move".
The question was whether reducing the depth at a fail-high can produce the real score, and the answer is yes. I noticed Richard told you the same thing.
"yes" is not helpful. "How" would be. If you back up several iterations, how can you hope to get the same score you will get from several iterations deeper???

Until you get to the same depth, anyway...
I get the point of backing up a bit, because move ordering is now broken, and backing up a few plies gives you easier searches to establish some move ordering for the new move. But "the score"? Can't see how it can be correct until you get to the depth where it fails high, otherwise it would have failed high at earlier depths, don't you think???

So maybe you get "a score" when you back up, and by the time you get back to the right depth you get "the score". That I would buy.
I think you are underestimating what others have inside:
FEN: 8/8/8/3k4/8/8/8/1NN1KNN1 w - -

Code:

ProDeo 1.81
 00:00:41  18.00  13.87  1.Nc3 Kd6 2.Nd3 Kc7 3.Ne3 Kb6 4.Ncd5 Kb5
 00:00:49  18.03  13.87  1.Nf3 
 00:00:50  18.03  14.02  1.Nf3 Kc5 2.Nd3 Kc4 3.Ke2 Kb5 4.Nc3 Kb6 
 00:01:04  19.00  14.07  1.Nf3 Kc5 2.Nd3 Kc4 3.Ke2 Kb5 4.Nc3 Kb6 
 00:02:26  20.00  M 17   1.Nf3 Kc5 2.Nd3 Kc4 3.Ke2 Kd5 4.Nc3 Kd6 5.Nb5 
And now I let it run from the well-ordered hash table by taking back 1.Nf3, and then the mate is found 2 plies earlier because of the PVS extension.

Code:

 00:00:05  16.00  13.77  1.Nc3 Kc4 2.Kd2 Kc5 3.Ne3 Kd6 4.Nd3 Kc6 5.Nf3 
 00:00:14  17.00  13.77  1.Nc3 Ke5 2.Nf3 Kf4 3.N1d2 Kg4 4.Ne5 Kf5 5.Nc6 
 00:00:39  18.00  M 17   1.Nc3 Ke5 2.Nf3 Kf4 3.Ke2 Kf5 4.Ne3 Ke6 5.Nd4 
bob

Re: Houdini 3 reducing the depth feature

Post by bob »

syzygy wrote:
bob wrote: As I said, treating "the PV move" differently is forced. You can't prune it, or it won't be a PV move choice. Etc.
I agree there's little choice here, but the point remains that PV moves are a bit different, and that when using LMR (and other techniques) you're "disadvantaging" later moves that would turn out to be better had you not reduced or pruned them.
I agree. However, I do apply some analysis to each move before it is subjected to reduction. I do not, as of right now, use history ordering, so I really don't do classic "LMR" at all. I do more of an "SMR" (Suspicious Move Reduction): that is, reducing moves that I don't think are very interesting compared to the ones already searched.

But the heart of this discussion still revolves around "node" vs "move". If we do a 20 ply search, with perfect move ordering, there are 20 (assuming no extensions) PV moves to be searched, or 21 PV positions (the root counts here as well). If we talk about the nodes being treated differently, I don't do that and don't like the idea. If we are talking about the "moves" being treated differently, then I agree, to a point. For example, I don't extend checks with a SEE < 0 whether it is on the PV or not...


But at a PV "node" there are lots of moves to search, and I treat the remainder of those moves no differently than I treat moves at any other (non-PV) node.
Ok, but the observation that started this thread does not imply that H3 treats moves differently. It might, but that seems to have little to do with the idea of first resolving a fail high to the reduced depth at which the failing-high move was originally searched. (After resolving it at reduced depth, H3 continues to resolve it at normal depth; at least that's what I understand from the H3 analysis that Robert has been giving here for various positions over the last few months.)
There are two things in the discussion, IMHO.

(1) How to quickly resolve a fail high or fail low so that the search can get past that point. FH and FL are really different animals that most of us currently handle the same way, something I think needs to be looked at quite a bit, in fact... Back in the late 90s, John Stanback and I had a long discussion about this issue on r.g.c.c. He proposed the idea that on a fail low you start back over at iteration one, because the fail high is missing a lot of move ordering information and the previous search of this move did not fail high, so any remembered ordering (HT moves) will likely be wrong. If you do that, the HT entry for the fail low has a really big draft, so starting back over at depth 1 should find some other move, and as you progress through the iterations you improve move ordering as in a normal search. Unless you lose that fail high. I've thought about this and played with it, but have not made a decision, and it sits on my "further testing/thinking is required" list.

(2) Producing a score / PV quickly to let the user know what is going on. If you search a FH move to a reduced depth to get a score quicker, that score is at best a rough approximation of the score that comes from a deeper search. Displaying it might be more confusing, as it is possible that score is actually worse than the scores for previous best moves you displayed, until you get deep enough.

And if you use IID, as some of us do, the benefit of this may well evaporate, since the idea of IID is to do shallow searches internal to the tree, when no best move is known and the search is for a non-null-window (therefore PV) position. For a fail high at the root, every move at ply=2 must have failed low, which leaves you with no clue about a best move. IID can resolve that. Or, if you back up several plies to start the search for this FH move, poor move ordering is not as damaging at the shallower depths. Which is, as I said, very similar to what IID does...

This may be one reason this did not work for me when I experimented with such an idea a while back, as I have had IID in Crafty for at least 15 years now...
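
For reference, IID itself is only a few lines (a sketch of the idea, not Crafty's exact code; the depth offset is illustrative):

Code:

typedef struct Position Position;
typedef int Move;                         /* 0 = no move */
extern Move tt_best_move(Position *pos);
extern int  search_pv(Position *pos, int alpha, int beta, int depth);

/* At a PV node with no hash move, run a shallower search first purely
   to put a best move into the hash table for ordering. */
Move iid_move(Position *pos, int alpha, int beta, int depth) {
    if (tt_best_move(pos) == 0 && depth >= 5)
        search_pv(pos, alpha, beta, depth - 4);   /* cheap ordering search */
    return tt_best_move(pos);                     /* now (usually) filled in */
}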




So this is not exactly about doing things full-width or brute force, it is just about treating all moves equally, so that any have a fair chance of becoming the new best move... Unless they meet criteria to reduce, or prune, or whatever.
Agreed, you give all other moves the same chance (depending on move criteria), but it is not the same chance as the first move is getting.
No disagreement there. I was somewhat surprised in my testing when I discovered that adding one more requirement for reducing, "this must not be the first move searched", was an improvement. I originally did not have it, because I never reduce SEE-good captures, nor killers. But I found a particular position where there was no SEE-good capture and no killer, so I ended up in the "search the rest of the moves" selection phase to choose the first move, which meant that all moves at that node could be reduced, so that none could spot the tactical issue that was important. I tried protecting the first, the first two, etc. Just the first was best in testing...