evert823 wrote: ↑Sat Oct 05, 2024 9:43 am
The part I am still doubting is this:
Code: Select all
if (TransTable[p].used_alpha < TransTable[p].calculated_value &
current_alpha < TransTable[p].calculated_value &
TransTable[p].used_beta > TransTable[p].calculated_value &
current_beta > TransTable[p].calculated_value)
{
//if position also equal then accept the stored score
}
That doesn't seem quite right. This is how TT cutoffs work, assuming a fail-soft negamax search:
Failing High
First, we need to ask the following question: When a search fails high, what information can we extract from it?
Well, in order to fail high, our search must find a move that is
good enough to beat beta. Since we end the search as soon as such a move is found, it is easy to see that the rest of the possible legal moves that would've been searched later were not being searched at all. That is, when a search fails high, only a portion of the moves are explored.
This results in a degree of uncertainty to the "true value" of our node -- there is no way to know if any of our discarded moves are better than the move that caused the fail high. Therefore, the "true value" of our node is likely higher or equal to the best value we have gotten. We call our best value a
lower bound.
To illustrate this, here is an example:
our alpha is -inf, our beta is
30, and below are the best values of each of our nodes, had we done search on all of them:
The true value of this node is 100, the best possible outcome we can achieve. But take a look on how alpha/beta pruning resolves it.
We search the first two moves, and they do not trigger a fail high. Therefore for each of them, we raise alpha as necessary.
Now on the third move, we get 40, which exceeds beta. Therefore, we return best value of 40. Note that had we kept searching, we'll find a move that is worth 100. But since we failed high on 40, we stop early without searching the later moves. So 40 is not the truth score of the position, but rather a lower bound on what that true score actually is.
Failing Low
Now we explore the opposite question: When a search fails low, then what information can we extract from that?
(Failing low, in this case, means that none of the moves we searched beat the original alpha. Which means our best value is smaller than alpha.)
Since every move must not beat alpha, and since you negate the score from child in Negamax. We infer that
all of the child nodes beat their beta (-alpha from the parent's perspective). This means that our children failed high, and therefore returns a lowerbound. After the negation, those lowerbounds become upperbounds. And therefore every score we have received are upperbounds. Now since we still only select a "best" score in negamax. The best score we have found will also be an upperbound. That is, the true score of a fail-low node is lower than or equal to the best score we have found.
Exact Nodes
What if we neither failed high nor failed low. In this case the evaluation says within our window, meaning everything down to qsearch would not be affected by fail high upperbounds. Thus our score is exactly the true score we would get.
Putting Everything Together
To summarize, we discussed the following results:
- When failing high, the score we have is a lowerbound of the true score
- When failing low, the score we have is an upperbound of the true score
- When neither happens, the score we have is exactly the true score
Now suppose we searched a node in our search tree. The results were a lowerbound score of value V.
Suppose then we encounter the same position some time later, with possibly a different beta. What happens when beta <= V?
Well, we know from the discussion above that the true score S >= V. So beta <= V <= S. Our true score will always fail high.
This means that if we were to continue our search, it would certainly fail high. So we skip the search altogether and return V as our score.
In code, this means something like this:
Code: Select all
if (TransTable[p].score_type == LOWERBOUND &&
TransTable[p].score >= beta )
return TransTable[p].score;
By the same logic, if we have an UPPERBOUND node, and our TT score V <= alpha. Then we have our true value S <= V <= alpha. Meaning that we already know that the search will fail low. So we return V as a fail low score.
Code: Select all
if (TransTable[p].score_type == UPPERBOUND &&
TransTable[p].score <= alpha )
return TransTable[p].score;
Now if the score is exact, we can just return that score, since it will be the true value anyway
Code: Select all
if (TransTable[p].score_type == EXACT)
return TransTable[p].score;
But here's a problem: If we have a lower depth entry, but we are on some higher depth, then it is not safe to perform those cutoffs, as the true score found on a higher depth will likely be different and much more accurate. So we add an extra guard.
Code: Select all
if (TransTable[p].entry_depth >= depth)
{
// Perform above logic
}
The final code would look something like this:
Code: Select all
if (TransTable[p].entry_depth >= depth)
{
if (TransTable[p].score_type == EXACT)
return TransTable[p].score;
if (TransTable[p].score_type == UPPERBOUND &&
TransTable[p].score <= alpha )
return TransTable[p].score;
if (TransTable[p].score_type == LOWERBOUND &&
TransTable[p].score >= beta )
return TransTable[p].score;
}
If you have any further questions, feel free to ask me or in one of those discord servers, where most top engine devs are active in nowadays:
https://www.chessprogramming.org/Discord_Channels