## Minimax, noisy evaluations, PUCT

Discussion of chess software programming and technical issues.

Moderators: bob, hgm, Harvey Williamson

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
grahamj
Posts: 41
Joined: Thu Oct 11, 2018 12:26 pm
Full name: Graham Jones

### Minimax, noisy evaluations, PUCT

Minimax is known to have a bias which does not diminish with depth. The paper
Bias and pathology in minimax search, 2005, A. Sadikov, I. Bratko, I. Kononenko
uses KRK endings to investigate this.

Here's a simplified situation which I was able to analyse mathematically. Suppose the branching factor is 2, the true value of all nodes is 0, and that the evaluation function gives noisy estimates of this, independent random variables which are symmetric about 0, with cdf F. The minimax evaluation at the root tends towards F^-1((sqrt(5)-1)/2) ~= F^-1(.618) as the depth tends to infinity. It's funny to see the golden ratio appearing. If the noise follows a standard normal, the median root evaluations look like this.

Code: Select all

``````Ply    root eval
0        0       median of standard normal
1        0.545   median of max(child evals)
2        0.103
3        0.46
4        0.216
5        0.369
6        0.245
7        0.345
8        0.264
9        0.33
10        0.277
11        0.32
12        0.285
13        0.313
14        0.29
15        0.309
16        0.294
17        0.306
18        0.296
19        0.304
20        0.297
HUGE      0.3003214
``````

In a real game, the bias will occur when there is more than one nearly optimal move. This is not necessarily a bad thing. Positions with several nearly optimal moves are arguably superior to ones with only one. I found this from 2007 viewtopic.php?f=7&t=13433&p=114597&hili ... ax#p114597, where H.G.Muller suggests adding extra noise to obtain the effect, apparently unaware of the bias that already exists.

For fixed-depth minimax, the bias does not affect the ordering of evaluations. I guess there is some interaction with quiescent search and the width of aspiration windows. But I guess the bias is not a big issue for traditional engines.

On the other hand, if the search is to variable depths, as in PUCT, the even-odd oscillations seem problematic. I wonder if this part of why DeepMind found mean() better than max() for PUCT.
Graham Jones, www.indriid.com

hgm
Posts: 24002
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller
Contact:

### Re: Minimax, noisy evaluations, PUCT

Note that the oscillation is the opposite from what seems reasonable: it puts the side to move in the leaves at a disadvantantage. In reality one would expect that side to have the advantage. A side-to-move bonus in the static eval could cure this.