gladius wrote:
Agreed, it's certainly worth investigating other approaches, and the 700 ELO is no joke. This quote from the paper is interesting though (even if it doesn't really have any justification):

"AlphaZero evaluates positions using non-linear function approximation based on a deep neural network, rather than the linear function approximation used in typical chess programs. This provides a much more powerful representation, but may also introduce spurious approximation errors. MCTS averages over these approximation errors, which therefore tend to cancel out when evaluating a large subtree. In contrast, alpha-beta search computes an explicit minimax, which propagates the biggest approximation errors to the root of the subtree. Using MCTS may allow AlphaZero to effectively combine its neural network representations with a powerful, domain-independent search."

Well, claims without justification can be rejected without justification.
To be honest, that paragraph to me reads as "hey, you know, we're different. Different isn't necessarily better or worse, but we should put some positive spin on it, so... ok, how's this: we're averaging over errors rather than propagating them, so let's say that's better and hope we can get it past the referee without doing more work."
Clearly you can do more with a non-linear function than with a linear one (and there is obviously no reason "typical chess programs" can't use non-linear evaluation terms; some do), and just as clearly non-linear functions come with more pitfalls. Calling these pitfalls "approximation errors" is a simplification, and claiming that they average out is not justified without proof. It is intuitively likely, but it depends on the nature of the deviations, and a word or two reassuring the reader that yes, in this case they do average out would not have been out of place. For that matter, you can't claim that backing up the largest score implies backing up the largest error without proof either.
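To make the intuition concrete (without proving anything), here is a toy simulation of the two backup rules the paper contrasts. It assumes every leaf has a true value of 0 and that evaluation noise is i.i.d. Gaussian, so any nonzero backed-up score is pure approximation error; it also applies only a single max step, whereas real alpha-beta alternates min and max, which partially cancels the effect. These assumptions are mine, not the paper's, which is exactly the gap in the argument.

```python
# Toy simulation (illustration only, not from the paper): compare an
# MCTS-style average of noisy leaf evaluations with a single max-backup
# step over the same leaves. True value of every leaf is 0, so the
# backed-up score IS the error.
import random
import statistics

random.seed(0)
N_LEAVES = 100   # leaves under one node
TRIALS = 2000    # independent repetitions

avg_errors, max_errors = [], []
for _ in range(TRIALS):
    noise = [random.gauss(0.0, 1.0) for _ in range(N_LEAVES)]
    avg_errors.append(sum(noise) / N_LEAVES)  # averaging: errors cancel
    max_errors.append(max(noise))             # max-backup: picks the
                                              # most optimistic error

print(f"averaging:  mean |error| = {statistics.mean(map(abs, avg_errors)):.3f}")
print(f"max-backup: mean error   = {statistics.mean(max_errors):.3f}")
```

Under these (strong) assumptions the averaged error shrinks roughly like 1/sqrt(N) while the max-backup systematically selects a large positive error. With alternating min and max levels, or with correlated or biased noise, the picture is much less clear cut, which is why the paper's paragraph would need an actual argument.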