## LCZero: Progress and Scaling. Relation to CCRL Elo

hgm
### Re: LCZero: Progress and Scaling. Relation to CCRL Elo

whereagles wrote:

Leela is black.. 2.36% chance to win on a lone king??
Close enough to 0 to make no difference even at 3500 Elo.

George Tsavdaris
### Re: LCZero: Progress and Scaling. Relation to CCRL Elo

whereagles wrote:

Leela is black.. 2.36% chance to win on a lone king??
How is this possible?? I mean how do the rollouts result in black wins in order black to have wins in its statistics??

Nay Lin Tun
### Re: LCZero: Progress and Scaling. Relation to CCRL Elo

@Kai, is it possible to share your opening test suite?

Daniel Shawul
### Re: LCZero: Progress and Scaling. Relation to CCRL Elo

She averages her tree like no other, and sucks because of that

What is happening is that white will sometimes give up its queen in the tree, and the score of that line becomes close to a draw. When that is averaged with the non-losing lines, you get this non-existent winning probability.
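The effect described above can be reproduced on a toy tree. This is only a sketch: the `Node` class, the visit counts (20 and 1) and the leaf values are made up for illustration and are not taken from Leela or Scorpio. Values are expected scores from Black's point of view (1 = Black wins, 0.5 = draw, 0 = Black loses).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node; not the data structure of any real engine."""
    black_to_move: bool
    value: Optional[float] = None   # leaf evaluation, None for inner nodes
    visits: int = 1                 # made-up visit count
    children: List["Node"] = field(default_factory=list)


def averaged(node: Node) -> float:
    """Visit-weighted average of the children (averaging backup)."""
    if not node.children:
        return node.value
    total = sum(c.visits for c in node.children)
    return sum(c.visits * averaged(c) for c in node.children) / total


def minimaxed(node: Node) -> float:
    """Best child for the side to move (minimax backup)."""
    if not node.children:
        return node.value
    values = [minimaxed(c) for c in node.children]
    # Black maximises its own expected score, White minimises it.
    return max(values) if node.black_to_move else min(values)


# White to move: 20 visits went to lines that win for White (Black scores 0),
# 1 visit went to a line where White gives up the queen and the game is drawn.
root = Node(
    black_to_move=False,
    children=[
        Node(black_to_move=True, value=0.0, visits=20),
        Node(black_to_move=True, value=0.5, visits=1),
    ],
)

avg_score = averaged(root)    # 0.5 / 21, a small but non-zero score for Black
mm_score = minimaxed(root)    # 0.0, White simply avoids the blunder

print(f"averaging backup: {avg_score:.4f}")
print(f"minimax backup:   {mm_score:.4f}")
```

With averaging, a single explored blunder is enough to leave a residual score of a couple of percent for the lone king; with minimax backups it disappears because White is assumed to pick its best reply.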

Tried it on scorpioMCTS with averaging and minimax backups

With averaging like leela's, a score of -627 is like 2% winning chance

```
15 -570 5 118404  Ka6-b6 Kd3-c4 Kb6-b7 Qc2-b2
15 -545 7 158627  Ka6-b5 Qc2-g2 Kb5-a4 Qg2-g4 Ka4-b5 Qg4-c4 Kb5-a5 Qc4-c2 Ka5-b4 Qc2-f2 Kb4-b3
16 -569 7 167639  Ka6-b5 Qc2-g2 Kb5-a4 Qg2-d5 Ka4-b4 Qd5-c4 Kb4-a5 Qc4-c2 Ka5-b4 Qc2-f2 Kb4-b3
16 -595 8 177007  Ka6-b5 Qc2-g2 Kb5-a4 Qg2-f2 Ka4-b5 Qf2-f5 Kb5-b4
16 -611 9 219766  Ka6-b5 Qc2-g2 Kb5-a4 Qg2-g7 Ka4-b3 Qg7-b7 Kb3-a4 Qb7-f3
17 -627 9 227292  Ka6-b5 Qc2-g2 Kb5-a4 Kd3-c3 Ka4-b5 Qg2-d5
```
With minimaxing a score of -2023 is a 0% winning chance

```
2 -2023 0 554  Ka6-b7 Qc2-b2 Kb7-c8
3 -2034 0 3588  Ka6-b5 Qc2-c4 Kb5-a5 Qc4-c5 Ka5-a4
4 -2044 0 12593  Ka6-b6 Qc2-a4 Kb6-c7 Qa4-f4 Kc7-d7
5 -2056 0 23814  Ka6-b6 Qc2-b2 Kb6-c5 Qb2-e5 Kc5-b4
6 -2063 2 47803  Ka6-b6 Kd3-c3 Kb6-a6 Qc2-a2 Ka6-b6
7 -2064 3 91732  Ka6-b6 Kd3-c3 Kb6-c6 Qc2-a2 Kc6-b6
```
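For readers wondering how a centipawn score such as -627 or -2023 relates to a percentage, here is a rough sketch. It assumes an Elo-style logistic curve with a scale of 400 centipawns; that scale and the function name `cp_to_expected_score` are assumptions for illustration, not Scorpio's confirmed conversion.

```python
def cp_to_expected_score(cp: float, scale: float = 400.0) -> float:
    """Map a centipawn score to an expected score in [0, 1].

    Assumption: Elo-style logistic with a 400 cp scale. The constant
    actually used by any given engine may differ.
    """
    return 1.0 / (1.0 + 10.0 ** (-cp / scale))


averaging_pct = 100.0 * cp_to_expected_score(-627)    # score from the averaging run
minimax_pct = 100.0 * cp_to_expected_score(-2023)     # score from the minimax run

print(f"-627 cp  -> {averaging_pct:.2f}%")
print(f"-2023 cp -> {minimax_pct:.4f}%")
```

Under this assumed curve, -627 lands between 2% and 3%, while -2023 is indistinguishable from 0%, which is in line with the figures quoted above.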

### Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Nay Lin Tun wrote: @Kai, is it possible to share your opening test suite?
Sure, the link in this post should work:
http://www.talkchess.com/forum/viewtopi ... 5&start=14

tmokonen
### Re: LCZero: Progress and Scaling. Relation to CCRL Elo

George Tsavdaris wrote: How is this possible?? I mean how do the rollouts result in black wins in order black to have wins in its statistics??
L0 takes both wins and draws into consideration, so there's a small residual score from rollouts that result in draws.

Uri Blass
### Re: LCZero: Progress and Scaling. Relation to CCRL Elo

George Tsavdaris wrote:
whereagles wrote:

Leela is black.. 2.36% chance to win on a lone king??
How is this possible?? I mean how do the rollouts result in black wins in order black to have wins in its statistics??
I think it means an expected score of 2.36%, which could be, for example, a 4.72% probability of a draw and a 95.28% probability of a loss.
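The arithmetic behind that reading, as a small sketch. It uses the usual scoring convention (win = 1, draw = 0.5, loss = 0); the function name `expected_score` is just for illustration.

```python
def expected_score(p_win: float, p_draw: float) -> float:
    """Expected score with win = 1, draw = 0.5, loss = 0."""
    return p_win + 0.5 * p_draw


# No wins at all, 4.72% draws, 95.28% losses.
draws_only = expected_score(p_win=0.0, p_draw=0.0472)

# The same expected score could also come from 2.36% wins and no draws,
# so the single number does not tell you which split produced it.
wins_only = expected_score(p_win=0.0236, p_draw=0.0)

print(f"draws only: {100 * draws_only:.2f}%")
print(f"wins only:  {100 * wins_only:.2f}%")
```

Both splits give the same 2.36%, which is why the number should not be read as a win probability.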

MonteCarlo
### Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Indeed. Leela's output is an expected score, not actually a win%.

The wording on the site has since been changed, it seems, although it now includes a "50%=draw" bit in the legend, which caused some debate on the Discord.

Probably should just say "50%=equal chances" or some such thing ("expected score" is pretty self-explanatory, so could probably do away with the legend altogether), but not a big deal.

Last net to pass is actually fairly reasonable now. It'll be interesting to see where we are a week from now (it's not even been a week since the last big bug was fixed).

### Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Large improvement from ID69 to ID83.

Now, in gauntlets of only 100 games against Zurichess Bern (2232 CCRL Elo) and BikJump v2.01 (2098 CCRL Elo), it performs at about the 2200 Elo level at 1s/move and at about the 2300 Elo level at 10s/move, running on a full 4-core i7 CPU.

On my positional opening test suite of 200 positions, it is firmly settled amongst strong engines (20s/position):

```
[Search parameters: MaxDepth=99   MaxTime=20.0   DepthDelta=2   MinDepth=7   MinTime=0.1]

Engine                         : Correct  TotalPos  Corr%  AveT(s)  MaxT(s)  TestFile

Komodo 10.2 64-bit             :     145       200   72.5      2.0     20.0  openings200beta07.epd
Houdini 5.01 Pro x64           :     144       200   72.0      2.4     20.0  openings200beta07.epd
Stockfish 8 64 BMI2            :     141       200   70.5      2.0     20.0  openings200beta07.epd
Houdini 5.01 Pro x64 Tactical  :     139       200   69.5      2.3     20.0  openings200beta07.epd
Deep Shredder 13 x64           :     128       200   64.0      2.7     20.0  openings200beta07.epd
Houdini 4 Pro x64              :     126       200   63.0      1.8     20.0  openings200beta07.epd
Andscacs 0.88n                 :     123       200   61.5      2.4     20.0  openings200beta07.epd
Houdini 4 Pro x64 Tactical     :     120       200   60.0      1.6     20.0  openings200beta07.epd
Nirvanachess 2.3               :     119       200   59.5      1.8     20.0  openings200beta07.epd
Fire 5 x64     (3341 CCRL)     :     110       200   55.0      3.0     20.0  openings200beta07.epd
Texel 1.06     (3162 CCRL)     :     110       200   55.0      1.6     20.0  openings200beta07.epd

LCZero  *************  ID83    :     109       200   54.5      1.1     20.0  openings200beta07.epd

Fritz 15       (3227 CCRL)     :     102       200   51.0      1.9     20.0  openings200beta07.epd

LCZero  *************  ID69    :      98       200   49.0      2.7     20.0  openings200beta07.epd

Fruit 2.1      (2685 CCRL)     :      91       200   45.5      1.5     20.0  openings200beta07.epd
Sjaak II 1.3.1 (2194 CCRL)     :      75       200   37.5      4.0     20.0  openings200beta07.epd
BikJump v2.01  (2098 CCRL)     :      74       200   37.0      1.6     20.0  openings200beta07.epd
```
It improved significantly positionally from ID69 (in only 3 days).

Tactically it seems very weak. On the ECM tactical middlegame suite of 879 positions, it performs very badly (1s/position):

```
Bik Jump 2.01    (2098 CCRL Elo)
score=574/879 [averages on correct positions: depth=4.6 time=0.19 nodes=467671]

Predateur 2.2.1  (1786 CCRL Elo)
score=486/879 [averages on correct positions: depth=6.1 time=0.13 nodes=409596]

LCZero (ID83)
score=173/879 [averages on correct positions: depth=13.6 time=0.24 nodes=312]

LCZero (ID69)
score=171/879 [averages on correct positions: depth=13.5 time=0.25 nodes=318]
```
And it doesn't seem to be improving at all. The estimated CCRL Elo level on this tactical suite is about that of Stockfish at depth=3, or maybe 1400 CCRL Elo points. Something has to be done with its MCTS search, maybe along the lines outlined by Daniel Shawul.

MonteCarlo
### Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Thanks for the update Kai!

On the one hand, it's quite possible that a fundamental change to its MCTS implementation will be required at some point if it wants to compete at the highest level, and the work Daniel Shawul has done with Scorpio could prove quite useful in that case (well, it's fantastic work in any case; it's just in this case that it would benefit LC0).

On the other hand, unless you subscribe to some form of conspiracy theory around the A0 results, we're nowhere near the limits of this sort of approach, so I wouldn't worry too much about that just yet.

Right now there are still a bunch of bugs being worked out, the network is still rather small, and the project is rather young (barely a month old, and it's barely been a week since the last major bug was discovered and fixed).

Some patience is required. It might turn out that switching to a new implementation of MCTS is required; it might also turn out that the NN at some level gives good enough prior probabilities for moves that even MCTS with averaging is good tactically.

We'll just have to give it some time.