TT with Null Move Cuts A PV Node/Line

xmas79 · Post by **xmas79** » Tue Feb 25, 2014 11:19 pm

Your code is equivalent to:

int PVS_FailSoft_A(int alpha, int beta, ...) {
    ...
    ASSERT(alpha < beta);
    bool isZeroWindow = (beta == alpha+1); // not part of algorithm
    bool isFirstMove = true;
    int bestScore = -INF;
    for (all moves) {
        int score;
        make();
        if (isFirstMove) {
            isFirstMove = false;
            score = -PVS_FailSoft_A(-beta, -alpha, ...);
        } else {
            score = -PVS_FailSoft_A(-(alpha+1), -alpha, ...);
            if (score > alpha) {
                ASSERT(score > alpha);
                ASSERT(score > bestScore);
                //
                // can be either PV node or zero-window (non-PV) node
                //
                // Q: would it be correct to avoid a re-search for "score >= beta" in fail soft?
                // If so, in which of the two cases (PV or non-PV node)?
                //
                score = -PVS_FailSoft_A(-beta, -alpha, ...);
            }
        }
        unmake();
        if (score >= beta) {
            return score;
        }
        if (score > bestScore) {
            bestScore = score;
            alpha = Max(alpha, bestScore); 
        }
    }
    return bestScore;
}

where I moved the newAlpha line at the end and replaced newAlpha with alpha.

In my opinion you can trust the score, in both PV and non PV ndoes. As already said, non PV nodes are easy to figure out. In PV nodes, let's do with an example: if you have a window of say [alpha,beta]=[100,300], on the first move suppose you get a score of 200, then you collapse the window to [200,201] and do a search with [-201,-200], and get a fail high with a value > beta, say 400. If you research with a window of [-300,-200], you will end up with a score of 400 again, since you will have a TT probe with good draft, bound type (all-node) and value, as soon as you enter the node. That child node, in fact, will continue to fail low unless you search it in a way that would not cause the cut-off, eg with a bound like [-401, -200], and now we are like in the case of an exact score, where search can effectively search and return another score, so in this case I do not trust the score.

bob · Post by **bob** » Wed Feb 26, 2014 6:45 am

Evert wrote:
bob wrote:
Evert wrote:
bob wrote: No, because if ANY move exceeds alpha, it normally is >= beta since most searches have a null-window passed in. Only on PV-type nodes do you have a non-null-window, and only on those nodes can value be > alpha AND < beta, because on most nodes if it is > alpha it is = to beta thanks to the null-window.
I'm not sure how that's different from what I said?
This is what was not so clear to me:

With that addition, we're saying that we want an exact score if possible - but if the returned score suggests that the move will cause a cut-off, we don't bother and simply trust the score as given. In that sense, the <beta should be a simple speed optimisation that prevents an unnecessary research that is going to fail high. Is that the idea?
Ah, I see. My default mode of thinking is in terms of fail-soft.

In a fail-hard framework, the score > alpha && score < beta condition simply says to to a re-search on a PV node on any move that improves alpha (we now need to know if it fails high, and we need to update the search bound). However, it will never happen that the score > beta.

In fail soft, that can happen, and then that condition seems to say "don't trust the score, do a re-search with an open window, unless the returned score is a cut-off in which case, take the cut-off and skip the verification." This seems oddly in two minds: should you not either always accept the returned score (in which case you don't need the verification search), or never (in which case the < beta condition is not needed)?

I am not sure how fail-soft changes anything here.

In most nodes you search with window (alpha, alpha+1). You would never want to do a re-search there since the window is null when it is passed in. Doesn't matter whether your call to search returns alpha+1, or alpha+100. No research needed, and none would be done in this case.

For the small number of PV nodes, you are called with (alpha, beta) where beta is greater than alpha+1, i.e. not a null-window search. You search the first branch of that node with the full window (alpha,beta). But after the first move, you search the remainder with (alpha, alpha+1). If you use fail-hard, all you can get is either alpha or alpha+1 backed up. If you get alpha+1, which is less than beta, you want to re-search with the original (alpha,beta) window. Since score > alpha AND score < beta (it is actually alpha+1) you do the re-search here as we discussed. For fail-soft, you have two cases. If you get something > alpha+1, but < beta, you want to re-search with the original (alpha,beta) window. If you get something returned that is > beta on the (alpha,alpha+1) search, then I would not re-search a second time, because fail-soft said the score is >= beta, so I need to bail out and come back with a wider window if appropriate, or just bail out otherwise.

In short, it should work identically whether it is fail-hard or fail-soft, except that sometimes in fail-soft the re-search is not done because the returned value was above the temporary beta value (alpha+1) AND it was above the actual beta value as well...

bob · Post by **bob** » Wed Feb 26, 2014 6:53 am

mvk wrote:
bob wrote:
Edsel Apostol wrote:
bob wrote:
Edsel Apostol wrote:You can try the following:

-don't do null move in the PV nodes
-don't do cutoff from the hash table in the PV nodes, just use the hash move
-store the principal variation to the hash table after every iteration in case the PV positions stored in it has been overwritten already
Seemed to me he was describing a problem trying to construct the PV from the hash table. The above don't help that at all. Perhaps I misunderstood his question?
I think his problem is incomplete PV. He tried various ways to collect the PV (one of them is using the exact nodes from the main TT, others are triangular array, etc) but still fails to display the PV at it's full length from time to time. I'm suggesting ways on how it can be solved. That's what we've been doing in our engine and we don't have that truncated PV issue.
Yes, but your "fix" is not so good. You ignore any hash hit in the PV that would stop the search. Why throw out good information like that, it hurts efficiency?
The answer, based on measurements in my program, would be because 1. it is very simple and 2. the perceived efficiency loss (just a hand full of nodes, far fewer than 1 ms worth of them) doesn't translate into an observable strength loss. The extra nodes are just the immediate cut-offs in the siblings of PV nodes. Therefore 1 prevails. In my case I maintain the PV separate from the hash table, in a variable size array, not only in the transposition table. I believe, but I haven't measured, that that might be important, because you don't want to lose PV moves ever. (It is easy to test with Fruit if that is true: remove the is_pv condition for exact matches and compare elo.)

Test more thoroughly. There are positions where those cutoffs are exact scores are more than a few ms of nodes... In some searches this is the difference between seeing the right move and not seeing it. It only takes a little logical thinking to decide this is a "wrong" idea. You are throwing away useful information for no reason. Measuring how significant is a question I will try to answer when my cluster is free, but it is clearly an idea that loses SOMETHING that can be avoided.

A quick test is fine #70. I get the correct early winning score of +2 two plies quicker allowing hash hits EVERYWHERE. I tried some other positions, particularly endgame positions, and the difference was quite significant.

I don't buy this "I am going to break it because I can't fix it" approach. I can add up a bunch of these "suggestions" I have heard over the years: (a) no hash hits on pv nodes; (b) don't store mate scores at all, they are a problem; (c) don't store draw scores, they are a problem; All ideas that "take the easy way out" at a cost that is more significant than is thought.

There are solutions for ALL of the above issues that work flawlessly, and which don't throw out ANY useful information... Sometimes it takes a bit of effort to get it right, but what good idea doesn't?

The hash table has two effects on search. (1) it improves speed by letting us avoid repeating work already done; (2) it improves ACCURACY by taking scores from one part of the tree and grafting them on to other parts of the tree, giving us extra depth along some branches. Both are useful.

Edsel Apostol · Post by **Edsel Apostol** » Wed Feb 26, 2014 7:44 am

bob wrote:
Edsel Apostol wrote:
bob wrote:
Edsel Apostol wrote:You can try the following:

-don't do null move in the PV nodes
-don't do cutoff from the hash table in the PV nodes, just use the hash move
-store the principal variation to the hash table after every iteration in case the PV positions stored in it has been overwritten already
Seemed to me he was describing a problem trying to construct the PV from the hash table. The above don't help that at all. Perhaps I misunderstood his question?
I think his problem is incomplete PV. He tried various ways to collect the PV (one of them is using the exact nodes from the main TT, others are triangular array, etc) but still fails to display the PV at it's full length from time to time. I'm suggesting ways on how it can be solved. That's what we've been doing in our engine and we don't have that truncated PV issue.
Yes, but your "fix" is not so good. You ignore any hash hit in the PV that would stop the search. Why throw out good information like that, it hurts efficiency?

My solution uses all available information, and doesn't return short PVs either. There is no measurable overhead in my PV storing approach, and I don't give up any search efficiency by not honoring TT entries with sufficient depth, when at a PV node.

You can actually eliminate ALL search instability by throwing the TT out completely, or else just using it to store a best move, but no bound/score/draft, etc.

It's a design trade-off. A simple solution for a bit of inefficiency.

In my limited testing in our engine, returning exact score in PV nodes from hash table if it is outside the alpha and beta window makes it weaker. If I limit it to return exact score if it satisfies (score > alpha && score < beta), then it is at least at par with the one without the cut-offs. So performance-wise they are at par, with the disadvantage that the principal variation is broken at times.

I think the reason why it behaves that way is that with the new search window (previous iteration lets say is -20, 20, now lets say it's 0, 40) the search may stumble upon a better continuation, which wouldn't happen if it is being cut-off at PV nodes when an exact score from a different search window is encountered.

Of course, YMMV with Crafty. It may have more stable search than ours to take advantage of doing cut-offs in PV nodes.

Evert · Post by **Evert** » Wed Feb 26, 2014 11:05 am

bob wrote:For fail-soft, you have two cases. If you get something > alpha+1, but < beta, you want to re-search with the original (alpha,beta) window. If you get something returned that is > beta on the (alpha,alpha+1) search, then I would not re-search a second time, because fail-soft said the score is >= beta

Yes, but that comes back to my question: the only reason to do a verification search when score < beta is that you do not trust the score to be accurate. I understand that. But why can you trust the score (or rather, the bound) if it is > beta? Can you be sure that if the null-window search returned a score > beta, the open window search will too? This is what is not obvious to me.

My understanding of search instability (which I know it is not worth it to eliminate) is that this is not generally the case, but perhaps that's wrong? Or perhaps it doesn't really matter much either way?

Cheney · Post by **Cheney** » Wed Feb 26, 2014 1:14 pm

I've read some more and something clicked

First, the issue I have experienced with my PVTT is a bug with the MakeNullMove. Not only was I not properly restoring the board state when unmaking the move, but during the nullmove, I also had a bad hash. This appears to have lead to more collisions and writing the TT with an invalid move. Later, when I would walk the PVTT and play the moves out, the board became corrupt.

Robert, I read what you wrote in your PV example. All along I was under the impression that a beta cut was killing the triangular array when in fact, it is an EXACT hash:

You find the EXACT hash entry, and you would normally cut the PV off at that point.

I get it and this explains the Triangular Array being truncated. Thank you.

I will look at Crafty. I expect to see where you write the TT that there is something similar to an array attached to that TT entry.

All this review lead me to something else which did not make sense: the oscillation of cutting and not cutting across PLYs when using a zero window. But I think I get that now and will start another post.

mvk · Post by **mvk** » Wed Feb 26, 2014 5:57 pm

bob wrote:
mvk wrote: The answer, based on measurements in my program, would be because 1. it is very simple and 2. the perceived efficiency loss (just a hand full of nodes, far fewer than 1 ms worth of them) doesn't translate into an observable strength loss. The extra nodes are just the immediate cut-offs in the siblings of PV nodes.
Test more thoroughly. There are positions where those cutoffs are exact scores are more than a few ms of nodes... In some searches this is the difference between seeing the right move and not seeing it. It only takes a little logical thinking to decide this is a "wrong" idea. You are throwing away useful information for no reason. Measuring how significant is a question I will try to answer when my cluster is free, but it is clearly an idea that loses SOMETHING that can be avoided.

I'm looking forward to your measurement results when you have them.

bob · Post by **bob** » Wed Feb 26, 2014 6:30 pm

mvk wrote:
bob wrote:
mvk wrote: The answer, based on measurements in my program, would be because 1. it is very simple and 2. the perceived efficiency loss (just a hand full of nodes, far fewer than 1 ms worth of them) doesn't translate into an observable strength loss. The extra nodes are just the immediate cut-offs in the siblings of PV nodes.
Test more thoroughly. There are positions where those cutoffs are exact scores are more than a few ms of nodes... In some searches this is the difference between seeing the right move and not seeing it. It only takes a little logical thinking to decide this is a "wrong" idea. You are throwing away useful information for no reason. Measuring how significant is a question I will try to answer when my cluster is free, but it is clearly an idea that loses SOMETHING that can be avoided.
I'm looking forward to your measurement results when you have them.

Just started the run, cluster was idle. I don't expect anything significant. I just like the simpler code where hash hits are honored everywhere and no useful information is ignored. Will report back later today.

bob · Post by **bob** » Wed Feb 26, 2014 6:34 pm

Evert wrote:
bob wrote:For fail-soft, you have two cases. If you get something > alpha+1, but < beta, you want to re-search with the original (alpha,beta) window. If you get something returned that is > beta on the (alpha,alpha+1) search, then I would not re-search a second time, because fail-soft said the score is >= beta
Yes, but that comes back to my question: the only reason to do a verification search when score < beta is that you do not trust the score to be accurate. I understand that. But why can you trust the score (or rather, the bound) if it is > beta? Can you be sure that if the null-window search returned a score > beta, the open window search will too? This is what is not obvious to me.

My understanding of search instability (which I know it is not worth it to eliminate) is that this is not generally the case, but perhaps that's wrong? Or perhaps it doesn't really matter much either way?

When I search with the interval (X,Y), ANY score that comes back < X or > Y can't be trusted (I assume fail-soft here, obviously, since with fail-hard this can't happen). If the score is > Y, all you know is that SOME score in that subtree went beyond beta, but there might be better moves than that. In the case of PVS, we start with (alpha,beta) but after the first move we search with (alpha,alpha+1). Anything > alpha+1 is a good score in that it is better than the best so far, but the score is not accurate. That returned score > alpha+1 is simply a lower bound on the true score and I want that real number.

Hope that helps...

Remember, if it seems odd, you return on the first score > beta, not on the BEST score > beta. There could be even better moves when you search with a better bound.

bob · Post by **bob** » Wed Feb 26, 2014 6:41 pm

Edsel Apostol wrote:
bob wrote:
Edsel Apostol wrote:
bob wrote:
Edsel Apostol wrote:You can try the following:

-don't do null move in the PV nodes
-don't do cutoff from the hash table in the PV nodes, just use the hash move
-store the principal variation to the hash table after every iteration in case the PV positions stored in it has been overwritten already
Seemed to me he was describing a problem trying to construct the PV from the hash table. The above don't help that at all. Perhaps I misunderstood his question?
I think his problem is incomplete PV. He tried various ways to collect the PV (one of them is using the exact nodes from the main TT, others are triangular array, etc) but still fails to display the PV at it's full length from time to time. I'm suggesting ways on how it can be solved. That's what we've been doing in our engine and we don't have that truncated PV issue.
Yes, but your "fix" is not so good. You ignore any hash hit in the PV that would stop the search. Why throw out good information like that, it hurts efficiency?

My solution uses all available information, and doesn't return short PVs either. There is no measurable overhead in my PV storing approach, and I don't give up any search efficiency by not honoring TT entries with sufficient depth, when at a PV node.

You can actually eliminate ALL search instability by throwing the TT out completely, or else just using it to store a best move, but no bound/score/draft, etc.
It's a design trade-off. A simple solution for a bit of inefficiency.

In my limited testing in our engine, returning exact score in PV nodes from hash table if it is outside the alpha and beta window makes it weaker. If I limit it to return exact score if it satisfies (score > alpha && score < beta), then it is at least at par with the one without the cut-offs. So performance-wise they are at par, with the disadvantage that the principal variation is broken at times.

I think the reason why it behaves that way is that with the new search window (previous iteration lets say is -20, 20, now lets say it's 0, 40) the search may stumble upon a better continuation, which wouldn't happen if it is being cut-off at PV nodes when an exact score from a different search window is encountered.

Of course, YMMV with Crafty. It may have more stable search than ours to take advantage of doing cut-offs in PV nodes.

I hate to point this out, but it sounds like you have a bug if this weakens your program. What is a TT hit? The result from a previous search of this same position. I don't see how using that could hurt. I don't expect it to help a lot with PVS because so few nodes are searched with anything but a null-window, but I don't see any rational way it could hurt.

I don't see how your example can happen, however. A true score is a "constant" for a position. Alpha/Beta guarantees to return the same score that a pure minimax search would return. Hashing can sometimes let you find a better move due to searching deeper via tree grafting, but it should not make you choose a worse move, that would be defective.

TT with Null Move Cuts A PV Node/Line

Re: TT with Null Move Cuts A PV Node/Line

Re: TT with Null Move Cuts A PV Node/Line

Re: TT with Null Move Cuts A PV Node/Line

Re: TT with Null Move Cuts A PV Node/Line

Re: TT with Null Move Cuts A PV Node/Line

Re: TT with Null Move Cuts A PV Node/Line

Re: TT with Null Move Cuts A PV Node/Line

Re: TT with Null Move Cuts A PV Node/Line

Re: TT with Null Move Cuts A PV Node/Line

Re: TT with Null Move Cuts A PV Node/Line