LCZero: Progress and Scaling. Relation to CCRL Elo

hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by hgm »

whereagles wrote: [image]

Leela is black.. 2.36% chance to win on a lone king?? :wink:
Close enough to 0 to make no difference even at 3500 Elo.
George Tsavdaris
Posts: 1627
Joined: Thu Mar 09, 2006 12:35 pm

Post by George Tsavdaris »

whereagles wrote: [image]

Leela is black.. 2.36% chance to win on a lone king?? :wink:
How is this possible?? I mean, how do the rollouts result in black wins, so that black has wins in its statistics?? :?
Nay Lin Tun
Posts: 708
Joined: Mon Jan 16, 2012 6:34 am

Post by Nay Lin Tun »

@Kai, is it possible to share your opening test suite?
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Post by Daniel Shawul »

She averages her tree like no other, and sucks because of that :)

What is happening is that white will sometimes give up its queen in the tree, and the score for those lines becomes close to a draw. When that is averaged with the non-losing lines, you get this nonexistent winning probability.

I tried it on scorpioMCTS with both averaging and minimax backups.

With averaging like Leela's, a score of -627 corresponds to roughly a 2% winning chance:

Code:

15 -570 5 118404  Ka6-b6 Kd3-c4 Kb6-b7 Qc2-b2
15 -545 7 158627  Ka6-b5 Qc2-g2 Kb5-a4 Qg2-g4 Ka4-b5 Qg4-c4 Kb5-a5 Qc4-c2 Ka5-b4 Qc2-f2 Kb4-b3
16 -569 7 167639  Ka6-b5 Qc2-g2 Kb5-a4 Qg2-d5 Ka4-b4 Qd5-c4 Kb4-a5 Qc4-c2 Ka5-b4 Qc2-f2 Kb4-b3
16 -595 8 177007  Ka6-b5 Qc2-g2 Kb5-a4 Qg2-f2 Ka4-b5 Qf2-f5 Kb5-b4
16 -611 9 219766  Ka6-b5 Qc2-g2 Kb5-a4 Qg2-g7 Ka4-b3 Qg7-b7 Kb3-a4 Qb7-f3
17 -627 9 227292  Ka6-b5 Qc2-g2 Kb5-a4 Kd3-c3 Ka4-b5 Qg2-d5

With minimaxing, a score of -2023 is a 0% winning chance:

Code:

2 -2023 0 554  Ka6-b7 Qc2-b2 Kb7-c8
3 -2034 0 3588  Ka6-b5 Qc2-c4 Kb5-a5 Qc4-c5 Ka5-a4
4 -2044 0 12593  Ka6-b6 Qc2-a4 Kb6-c7 Qa4-f4 Kc7-d7
5 -2056 0 23814  Ka6-b6 Qc2-b2 Kb6-c5 Qb2-e5 Kc5-b4
6 -2063 2 47803  Ka6-b6 Kd3-c3 Kb6-a6 Qc2-a2 Ka6-b6
7 -2064 3 91732  Ka6-b6 Kd3-c3 Kb6-c6 Qc2-a2 Kc6-b6
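
To make the difference between the two backup rules concrete, here is a minimal sketch on a toy tree. It is not Scorpio's or Leela's actual code: the `Node` class, the visit counts and the leaf values are all made up for illustration. Values are black's expected score in [0, 1], and averaging is modelled as a visit-weighted mean over the children.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy search-tree node (hypothetical, for illustration only)."""
    black_to_move: bool = True
    value: float = 0.0   # leaf evaluation: black's expected score in [0, 1]
    visits: int = 1      # how many simulations went through this node
    children: List["Node"] = field(default_factory=list)


def averaged(node: Node) -> float:
    """Averaging backup: visit-weighted mean of the children's values."""
    if not node.children:
        return node.value
    total = sum(c.visits for c in node.children)
    return sum(c.visits * averaged(c) for c in node.children) / total


def minimaxed(node: Node) -> float:
    """Minimax backup: black picks the max, white picks the min."""
    if not node.children:
        return node.value
    values = [minimaxed(c) for c in node.children]
    return max(values) if node.black_to_move else min(values)


# White to move: 95 visits go to a line that keeps the queen (black scores 0.0),
# 5 visits go to a line that gives the queen away (drawn, black scores 0.5).
white_node = Node(
    black_to_move=False,
    visits=100,
    children=[Node(value=0.0, visits=95), Node(value=0.5, visits=5)],
)
root = Node(black_to_move=True, children=[white_node])

print(f"averaging: {averaged(root):.3f}")
print(f"minimax:   {minimaxed(root):.3f}")
```

With these made-up numbers, the few visits spent on the queen blunder leave a small positive score under averaging, while minimax ignores them because white would never choose that line.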
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Post by Laskos »

Nay Lin Tun wrote: @Kai, is it possible to share your opening test suite?
Sure, the link in this post should work:
http://www.talkchess.com/forum/viewtopi ... 5&start=14
tmokonen
Posts: 1296
Joined: Sun Mar 12, 2006 6:46 pm
Location: Kelowna
Full name: Tony Mokonen

Post by tmokonen »

George Tsavdaris wrote: How is this possible?? I mean, how do the rollouts result in black wins, so that black has wins in its statistics?? :?
L0 takes both wins and draws into consideration, so there's a small residual score from rollouts that result in draws.
Uri Blass
Posts: 10268
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Post by Uri Blass »

George Tsavdaris wrote:
whereagles wrote: [image]

Leela is black.. 2.36% chance to win on a lone king?? :wink:
How is this possible?? I mean, how do the rollouts result in black wins, so that black has wins in its statistics?? :?
I think it means an expected outcome of 2.36%, which could be, for example, a 4.72% probability of a draw and a 95.28% probability of a loss.
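
The arithmetic can be checked with a short sketch, assuming the usual scoring of 1 for a win, 0.5 for a draw and 0 for a loss. The function names are just for illustration.

```python
def expected_score(p_win: float, p_draw: float) -> float:
    """Expected score with win = 1, draw = 0.5, loss = 0."""
    return p_win + 0.5 * p_draw


def draw_prob_without_wins(score: float) -> float:
    """Draw probability that yields `score` when the win probability is zero."""
    return 2.0 * score


p_draw = draw_prob_without_wins(0.0236)   # 2.36% expected score, no wins at all
p_loss = 1.0 - p_draw

print(f"draw {p_draw:.2%}, loss {p_loss:.2%}, "
      f"expected score {expected_score(0.0, p_draw):.2%}")
```

So an expected score of 2.36% does not require a single black win in the statistics; draws alone are enough.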
MonteCarlo
Posts: 188
Joined: Sun Dec 25, 2016 4:59 pm

Post by MonteCarlo »

Indeed. Leela's output is an expected score, not actually a win%.

The wording on the site has since been changed, it seems, although now it includes this "50%=draw" bit in the legend, which caused some debate in the discord.

Probably should just say "50%=equal chances" or some such thing ("expected score" is pretty self-explanatory, so could probably do away with the legend altogether), but not a big deal. :)

Last net to pass is actually fairly reasonable now. It'll be interesting to see where we are a week from now (it's not even been a week since the last big bug was fixed).
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Post by Laskos »

Large improvement from ID69 to ID83.

Now, in gauntlets of only 100 games against Zurichess Bern (2232 Elo CCRL) and BikJump v2.01 (2098 Elo CCRL), it performs at about the 2200 Elo level at 1s/move and at about the 2300 Elo level at 10s/move, on a full 4-core i7 CPU.
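
For reference, a rough sketch of how a gauntlet score maps to a performance rating, assuming the standard logistic Elo model. The function names and the 50% example score are illustrative only; the actual gauntlet scores are not listed in this post.

```python
import math


def elo_diff(score: float) -> float:
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def performance(opponent_elo: float, score: float) -> float:
    """Performance rating against a single opponent of known rating."""
    return opponent_elo + elo_diff(score)


def score_stderr_upper(score: float, games: int) -> float:
    """Upper bound on the standard error of the score.

    Treats every game as a win or a loss; draws only make the error smaller.
    """
    return math.sqrt(score * (1.0 - score) / games)


# Hypothetical example: an even score against Zurichess Bern (2232 CCRL).
rating = performance(2232, 0.5)
margin = elo_diff(0.5 + score_stderr_upper(0.5, 100))

print(f"performance ~{rating:.0f}, one-sigma margin ~{margin:.0f} Elo")
```

With only 100 games the one-sigma margin is roughly 35 Elo at most, so the 2200 and 2300 figures should be read as approximate.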

On my positional opening test suite of 200 positions, it is firmly settled amongst strong engines (20s/position):

Code:

[Search parameters: MaxDepth=99   MaxTime=20.0   DepthDelta=2   MinDepth=7   MinTime=0.1] 

Engine                         : Correct  TotalPos  Corr%  AveT(s)  MaxT(s)  TestFile 
      
Komodo 10.2 64-bit             :     145       200   72.5      2.0     20.0  openings200beta07.epd 
Houdini 5.01 Pro x64           :     144       200   72.0      2.4     20.0  openings200beta07.epd    
Stockfish 8 64 BMI2            :     141       200   70.5      2.0     20.0  openings200beta07.epd 
Houdini 5.01 Pro x64 Tactical  :     139       200   69.5      2.3     20.0  openings200beta07.epd      
Deep Shredder 13 x64           :     128       200   64.0      2.7     20.0  openings200beta07.epd    
Houdini 4 Pro x64              :     126       200   63.0      1.8     20.0  openings200beta07.epd    
Andscacs 0.88n                 :     123       200   61.5      2.4     20.0  openings200beta07.epd 
Houdini 4 Pro x64 Tactical     :     120       200   60.0      1.6     20.0  openings200beta07.epd 
Nirvanachess 2.3               :     119       200   59.5      1.8     20.0  openings200beta07.epd 
Fire 5 x64     (3341 CCRL)     :     110       200   55.0      3.0     20.0  openings200beta07.epd    
Texel 1.06     (3162 CCRL)     :     110       200   55.0      1.6     20.0  openings200beta07.epd    

LCZero  *************  ID83    :     109       200   54.5      1.1     20.0  openings200beta07.epd

Fritz 15       (3227 CCRL)     :     102       200   51.0      1.9     20.0  openings200beta07.epd  

LCZero  *************  ID69    :      98       200   49.0      2.7     20.0  openings200beta07.epd 
  
Fruit 2.1      (2685 CCRL)     :      91       200   45.5      1.5     20.0  openings200beta07.epd  
Sjaak II 1.3.1 (2194 CCRL)     :      75       200   37.5      4.0     20.0  openings200beta07.epd    
BikJump v2.01  (2098 CCRL)     :      74       200   37.0      1.6     20.0  openings200beta07.epd

Positionally, it improved significantly over ID69 (in only 3 days).

Tactically it seems very weak. On the ECM tactical middlegame suite of 879 positions, it performs very badly (1s/position):

Code:

Bik Jump 2.01    (2098 CCRL Elo)
score=574/879 [averages on correct positions: depth=4.6 time=0.19 nodes=467671]

Predateur 2.2.1  (1786 CCRL Elo)
score=486/879 [averages on correct positions: depth=6.1 time=0.13 nodes=409596]

LCZero (ID83)
score=173/879 [averages on correct positions: depth=13.6 time=0.24 nodes=312]

LCZero (ID69)
score=171/879 [averages on correct positions: depth=13.5 time=0.25 nodes=318]

And it doesn't seem to improve at all. The estimated CCRL Elo level on this tactical suite is about that of Stockfish at depth=3, or maybe 1400 CCRL Elo points. Something has to be done about its MCTS search, maybe along the lines outlined by Daniel Shawul.
MonteCarlo
Posts: 188
Joined: Sun Dec 25, 2016 4:59 pm

Post by MonteCarlo »

Thanks for the update Kai!

On the one hand, it's quite possible that a fundamental change to its MCTS implementation will be required at some point if it wants to compete at the highest level, and the work Daniel Shawul has done with Scorpio could prove quite useful in that case (well, it's fantastic work in any case; it's just in this case that it would benefit LC0 :) ).

On the other hand, unless you subscribe to some form of conspiracy theory around the A0 results, we're nowhere near the limits of this sort of approach, so I wouldn't worry too much about that just yet.

Right now there are still a bunch of bugs being worked out, the network is still rather small, and the project is rather young (barely a month old, and it's barely been a week since the last major bug was discovered and fixed).

Some patience is required. It might turn out that switching to a new implementation of MCTS is required; it might also turn out that the NN at some level gives good enough prior probabilities for moves that even MCTS with averaging is good tactically.

We'll just have to give it some time :)