"Bias and pathology in minimax search" (A. Sadikov, I. Bratko, I. Kononenko, 2005) uses KRK endings to investigate this.
Here's a simplified situation which I was able to analyse mathematically. Suppose the branching factor is 2, the true value of every node is 0, and the evaluation function gives noisy estimates of this: independent random variables, symmetric about 0, with cdf F. The median of the minimax evaluation at the root tends to F^-1((sqrt(5)-1)/2) ~= F^-1(0.618) as the depth tends to infinity. (To see where this comes from: at a max node over two independent children, the cdf value p = F(v) becomes p^2, and at a min node it becomes 1-(1-p)^2; the golden ratio p = (sqrt(5)-1)/2 is the attracting fixed point of the alternating composition, since p^2 = 1-p there.) It's funny to see the golden ratio appearing. If the noise follows a standard normal, the median root evaluations look like this.
Code: Select all
 Ply   Root eval
   0   0           (median of standard normal)
   1   0.545       (median of max(child evals))
   2   0.103
   3   0.46
   4   0.216
   5   0.369
   6   0.245
   7   0.345
   8   0.264
   9   0.33
  10   0.277
  11   0.32
  12   0.285
  13   0.313
  14   0.29
  15   0.309
  16   0.294
  17   0.306
  18   0.296
  19   0.304
  20   0.297
HUGE   0.3003214
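The limit is easy to compute. Here is a short script of my own (a reconstruction, not code from the thread) that tracks the quantile q = F(m) of the median root evaluation m as depth grows:

```python
# With two independent children, a max node squares the cdf and a min
# node squares its complement; starting from the symmetric point 1/2,
# stepping the median's quantile down one ply collapses to the single
# map q <- sqrt(1 - q), whose fixed point satisfies q^2 = 1 - q.
from math import sqrt
from statistics import NormalDist

q = 0.5                       # ply 0: the leaf median is at the 50% quantile
for ply in range(1, 200):     # iterate far past visible convergence
    q = sqrt(1.0 - q)

golden = (sqrt(5.0) - 1.0) / 2.0
print(q, golden)                      # both ~0.6180339887
print(NormalDist().inv_cdf(q))        # ~0.3003214, the "HUGE" row above
```

The map oscillates around the fixed point (the derivative there is about -0.81), which matches the even/odd alternation in the table.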
In a real game, the bias appears whenever there is more than one nearly optimal move. That is not necessarily a bad thing: positions with several nearly optimal moves are arguably superior to those with only one. I found this 2007 thread, viewtopic.php?f=7&t=13433&p=114597&hili ... ax#p114597, where H.G.Muller suggests adding extra noise to obtain this effect, apparently unaware of the bias that already exists.
For fixed-depth minimax, the bias does not affect the ordering of evaluations, since every move gets shifted in the same way. I guess there is some interaction with quiescence search and with the width of aspiration windows, but I doubt the bias is a big issue for traditional engines.
On the other hand, if the search goes to variable depths, as in PUCT, the even-odd oscillations seem problematic. I wonder if this is part of why DeepMind found mean() better than max() for backing up values in PUCT.
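To illustrate the contrast, here is a small Monte Carlo sketch of my own (same toy model as above: branching factor 2, independent N(0,1) noise at the leaves, true value 0 everywhere) comparing minimax backup with mean backup of the child values:

```python
# Estimate the median root value under two backup rules:
# 'minimax' alternates max/min down the tree; 'mean' averages the children.
import random
from statistics import median

def root_value(depth, backup, maximize=True):
    """Value of one random tree, backed up to the root."""
    if depth == 0:
        return random.gauss(0.0, 1.0)
    kids = [root_value(depth - 1, backup, not maximize) for _ in range(2)]
    if backup == "minimax":
        return max(kids) if maximize else min(kids)
    return sum(kids) / len(kids)

random.seed(1)
mm, mn = {}, {}
for depth in (1, 2, 3, 4):
    mm[depth] = median([root_value(depth, "minimax") for _ in range(20000)])
    mn[depth] = median([root_value(depth, "mean") for _ in range(20000)])
    print(depth, round(mm[depth], 3), round(mn[depth], 3))
```

With a max node at the root, the minimax medians flip with parity (roughly +0.54, -0.10, +0.46, -0.17 here), while the mean backup stays near the true value 0 at every depth.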