I have been implementing an MTD-best approach in Stockfish. It is a hack, so I bend existing code wherever necessary, and from a clean and efficient code perspective it is ugly.
The basic paradigm of MTD-best is that it is not necessary to determine the exact score of the best move, but only to establish that there is no better move: lowerbound(bestmove) >= upperbound(nextmove). If this does not yet hold, I can improve the estimate of either bound with a new zero-window search. I have implemented this at the root level of the search. The current implementation is basic, and optimisations to reduce the number of iterations required should be possible.
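To make the stopping rule concrete, here is a minimal self-contained sketch of the idea. The probing strategy and all names are my own, not from my actual mod; the `probe` function is a toy oracle standing in for a real zero-window alpha-beta search with window (beta-1, beta):

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Sketch of the MTD-best stopping rule at the root. "probe" is a toy
// oracle standing in for a zero-window alpha-beta search; all names
// are hypothetical, not Stockfish's.
struct MtdBestRoot {
    std::vector<int> trueScore;     // hidden exact scores (toy stand-in)
    std::vector<int> lower, upper;  // per-root-move bound estimates

    explicit MtdBestRoot(std::vector<int> s)
        : trueScore(std::move(s)),
          lower(trueScore.size(), -1000),
          upper(trueScore.size(),  1000) {}

    // Zero-window test at beta: fail-high raises the move's lower
    // bound to beta, fail-low drops its upper bound to beta-1.
    void probe(std::size_t m, int beta) {
        if (trueScore[m] >= beta)
            lower[m] = std::max(lower[m], beta);
        else
            upper[m] = std::min(upper[m], beta - 1);
    }

    // Stop as soon as lowerbound(best) >= upperbound(m) for every
    // other move m: the best move's exact score is never required.
    std::size_t search() {
        for (;;) {
            std::size_t best = 0;
            for (std::size_t m = 1; m < lower.size(); ++m)
                if (lower[m] > lower[best]) best = m;

            std::size_t rival = lower.size();  // a move not yet ruled out
            for (std::size_t m = 0; m < upper.size(); ++m)
                if (m != best && upper[m] > lower[best]) { rival = m; break; }
            if (rival == lower.size())
                return best;                   // proven: no better move

            // Probe midway into the remaining gap; each probe strictly
            // tightens one bound, so the loop terminates.
            int gap = upper[rival] - lower[best];
            probe(rival, lower[best] + (gap + 1) / 2);
        }
    }
};
```

Note that the loop can return the best move while its exact score is still unknown; that is exactly the saving MTD-best aims for.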
Some other things need to change as well:
Stockfish uses a search stack with search information (current move, best move, killer moves, etc.). Since I may alternate between searching the best and the next move when improving bounds, I have made two search stacks: one for the best move and one for the next move. One implication is that I have killer moves focused on the PV (the term is used loosely, since it is the lower estimate and not the best line) and a separate set for the rest.
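To illustrate the bookkeeping, a minimal sketch of the two-stack idea (structure and field names are my own, not Stockfish's actual `Stack`):

```cpp
#include <array>
#include <cstdint>

// Hypothetical sketch of keeping two search stacks, so killer moves
// collected while verifying the best move do not mix with those used
// when searching the other root moves. Layout is illustrative only.
struct StackEntry {
    std::uint16_t currentMove = 0;
    std::uint16_t killers[2]  = {0, 0};
};

struct DualStacks {
    std::array<StackEntry, 64> bestLine{};  // searching the best move
    std::array<StackEntry, 64> rest{};      // searching the other moves

    // Pick the stack matching the line currently being refined.
    StackEntry& at(bool searchingBest, int ply) {
        return searchingBest ? bestLine[ply] : rest[ply];
    }
};
```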
The transposition table stores both an upper and a lower bound, as opposed to just one score. This means the table contains more information but fewer entries for equal table size.
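For illustration, a minimal entry storing both bounds might look like this; the field names, widths and the replacement rule are my own assumptions, not what my mod actually does:

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical TT entry holding both a lower and an upper bound
// instead of one score plus a bound-type flag. Layout is illustrative.
struct TTEntry {
    std::uint64_t key   = 0;
    std::int16_t  lower = -32000;  // best proven lower bound
    std::int16_t  upper =  32000;  // best proven upper bound
    std::int8_t   depth = -1;

    // Naive replacement: a deeper result replaces both bounds; a
    // result at the same depth can only tighten them. Bounds from
    // different depths are not merged, since they are not comparable.
    void store(std::int16_t lo, std::int16_t hi, std::int8_t d) {
        if (d > depth) { lower = lo; upper = hi; depth = d; }
        else if (d == depth) {
            lower = std::max(lower, lo);
            upper = std::min(upper, hi);
        }
    }
};
```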
Some logic originally in Stockfish may now be broken or altered as an unintended consequence. I have seen one, possibly two:
The allotted-search-time algorithms are broken. The code apparently plays out the allotted time without the usual extensions of search time. The result is that the mod saves time relative to the original code. My hack is to increase the default allotted time, although this does not have the same effect, of course.
Using zero-window searches exclusively means relatively more time is used at the root and the searches become more independent. What does this mean for threading and efficient use of multiple cores?
Does anybody see other candidates?
To validate my implementation I have played small tournaments against the original code. The mod is functional, but loses 30-70. Sometimes a blunder makes it to the root level. If I analyse the position with both the original and the mod code, the blunder is not reproducible. My working assumption is that it is a bug in the transposition table logic, but I am wide open for suggestions, as I am unable to locate it.
Analysis.
I have focused the first analysis on the size of the tree. For each move made in the tournaments I have calculated the effective branching factor for both the original code and the mod. Reminder: the comments above apply to the data below.
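For reference, one common definition of the effective branching factor is the b that solves b^d = N for N nodes searched to nominal depth d; I am stating this as an assumption, since the measurement method is not spelled out above:

```cpp
#include <cmath>

// Effective branching factor under the assumption b^depth = nodes,
// i.e. b = nodes^(1/depth). Other definitions exist, e.g. the ratio
// of node counts between successive iterative-deepening iterations.
double effectiveBranchingFactor(double nodes, int depth) {
    return std::pow(nodes, 1.0 / depth);
}
```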
The mod has a statistically significant lower branching factor
Code:
mod   N     Mean   StDev  SE Mean
MTD   4062  1,867  0,205  0,0032
Orig  4046  1,904  0,223  0,0035
Difference = mu (MTD) - mu (Orig)
Estimate for difference: -0,037364
95% CI for difference: (-0,046691; -0,028037)
T-Test of difference = 0 (vs not =): T-Value = -7,85 P-Value = 0,000 DF = 8038
The mod has a more consistent branching factor (statistically significant lower standard deviation)
Code:
mod   N     Lower     StDev     Upper
MTD   4062  0,199608  0,204576  0,209789
Orig  4046  0,217980  0,223416  0,229121
F-Test (normal distribution)
Test statistic = 0,84; p-value = 0,000
The cost can be seen in nodes per second. I was shocked by the size of the difference (246k vs. 371k) and cannot explain it at present. I made no effort to program efficiently, but did not expect to have done so badly. This should be easy to improve, since the fundamentals are not very complex. It is either the efficiency of the root-level code or that of the transposition table logic.
Code:
mod   N     Mean    StDev   SE Mean
MTD   4062  246639  141267  2217
Orig  4046  371264  149617  2352
Difference = mu (MTD) - mu (Orig)
Estimate for difference: -124625
95% CI for difference: (-130960; -118289)
T-Test of difference = 0 (vs not =): T-Value = -38,56 P-Value = 0,000 DF = 8075