whereagles wrote:
Leela is black.. 2.36% chance to win on a lone king??
Close enough to 0 to make no difference even at 3500 Elo.
LCZero: Progress and Scaling. Relation to CCRL Elo
-
- Posts: 27793
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
-
- Posts: 1627
- Joined: Thu Mar 09, 2006 12:35 pm
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
whereagles wrote:
Leela is black.. 2.36% chance to win on a lone king??
How is this possible?? I mean how do the rollouts result in black wins in order black to have wins in its statistics??
After his son's birth they asked him:
"Is it a boy or girl?"
YES! He replied.....
-
- Posts: 708
- Joined: Mon Jan 16, 2012 6:34 am
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
@Kai, is it possible to share your opening test suite?
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
She averages her tree like no other, and sucks because of that
What is happening is that white will sometimes give up its queen in the tree, and the score becomes close to a draw. When that is averaged with the non-losing lines, you get this non-existent winning probability.
I tried it on scorpioMCTS with averaging and minimax backups.
With averaging like Leela's, a score of -627 is like a 2% winning chance:
Code: Select all
15 -570 5 118404 Ka6-b6 Kd3-c4 Kb6-b7 Qc2-b2
15 -545 7 158627 Ka6-b5 Qc2-g2 Kb5-a4 Qg2-g4 Ka4-b5 Qg4-c4 Kb5-a5 Qc4-c2 Ka5-b4 Qc2-f2 Kb4-b3
16 -569 7 167639 Ka6-b5 Qc2-g2 Kb5-a4 Qg2-d5 Ka4-b4 Qd5-c4 Kb4-a5 Qc4-c2 Ka5-b4 Qc2-f2 Kb4-b3
16 -595 8 177007 Ka6-b5 Qc2-g2 Kb5-a4 Qg2-f2 Ka4-b5 Qf2-f5 Kb5-b4
16 -611 9 219766 Ka6-b5 Qc2-g2 Kb5-a4 Qg2-g7 Ka4-b3 Qg7-b7 Kb3-a4 Qb7-f3
17 -627 9 227292 Ka6-b5 Qc2-g2 Kb5-a4 Kd3-c3 Ka4-b5 Qg2-d5
With minimaxing, a score of -2023 is a 0% winning chance:
Code: Select all
2 -2023 0 554 Ka6-b7 Qc2-b2 Kb7-c8
3 -2034 0 3588 Ka6-b5 Qc2-c4 Kb5-a5 Qc4-c5 Ka5-a4
4 -2044 0 12593 Ka6-b6 Qc2-a4 Kb6-c7 Qa4-f4 Kc7-d7
5 -2056 0 23814 Ka6-b6 Qc2-b2 Kb6-c5 Qb2-e5 Kc5-b4
6 -2063 2 47803 Ka6-b6 Kd3-c3 Kb6-a6 Qc2-a2 Ka6-b6
7 -2064 3 91732 Ka6-b6 Kd3-c3 Kb6-c6 Qc2-a2 Kc6-b6
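The difference between the two backup rules can be illustrated with a toy tree. This is only a sketch, not Scorpio's or Leela's actual code: the `Node` class, the visit counts, and the leaf values are all made up for illustration. Values are expected scores from black's point of view (0.0 = loss, 0.5 = draw).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    value: float = 0.0           # leaf evaluation: expected score for black
    visits: int = 1              # how often the search visited this node
    black_to_move: bool = True   # side to move at this node
    children: List["Node"] = field(default_factory=list)


def averaged(node: Node) -> float:
    """Visit-weighted mean of the children (averaging backup)."""
    if not node.children:
        return node.value
    total = sum(c.visits for c in node.children)
    return sum(c.visits * averaged(c) for c in node.children) / total


def minimaxed(node: Node) -> float:
    """Best child for the side to move (minimax backup)."""
    if not node.children:
        return node.value
    values = [minimaxed(c) for c in node.children]
    return max(values) if node.black_to_move else min(values)


def toy_position() -> Node:
    # Hypothetical numbers: white to move, 95 visits go to lines that keep
    # the queen (black loses), 5 visits go to a line that drops the queen
    # (draw).
    return Node(black_to_move=False, children=[
        Node(value=0.0, visits=95),
        Node(value=0.5, visits=5),
    ])


root = toy_position()
print(f"averaging: {averaged(root):.3f}")
print(f"minimax:   {minimaxed(root):.3f}")
```

With averaging, the few visits spent on the queen blunder leak into the root score; with minimax, white simply never plays that line, so black's score stays at zero.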
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
Nay Lin Tun wrote:
@Kai, is it possible to share your opening test suite?
Sure, the link in this post should work:
http://www.talkchess.com/forum/viewtopi ... 5&start=14
-
- Posts: 1296
- Joined: Sun Mar 12, 2006 6:46 pm
- Location: Kelowna
- Full name: Tony Mokonen
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
George Tsavdaris wrote:
How is this possible?? I mean how do the rollouts result in black wins in order black to have wins in its statistics??
L0 takes both wins and draws into consideration, so there's a small residual score from rollouts that result in draws.
-
- Posts: 10282
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
George Tsavdaris wrote:
whereagles wrote:
Leela is black.. 2.36% chance to win on a lone king??
How is this possible?? I mean how do the rollouts result in black wins in order black to have wins in its statistics??
I think it means an expected outcome of 2.36%, which may correspond to a probability of 4.72% for a draw and 95.28% for a loss.
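The arithmetic behind that reading is easy to check. A small sketch, assuming the usual scoring of 1 for a win, 0.5 for a draw and 0 for a loss (the function name is just for illustration):

```python
def expected_score(p_win: float, p_draw: float) -> float:
    """Expected game score: a win counts 1, a draw 0.5, a loss 0."""
    return p_win + 0.5 * p_draw


# No wins at all, 4.72% draws, 95.28% losses:
print(f"{expected_score(0.0, 0.0472):.2%}")
```

So a 2.36% figure does not need a single black win in the statistics; draws alone are enough.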
-
- Posts: 188
- Joined: Sun Dec 25, 2016 4:59 pm
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
Indeed. Leela's output is an expected score, not actually a win%.
The wording on the site has since been changed, it seems, although now it includes this "50%=draw" bit in the legend, which caused some debate in the discord.
Probably should just say "50%=equal chances" or some such thing ("expected score" is pretty self-explanatory, so could probably do away with the legend altogether), but not a big deal.
Last net to pass is actually fairly reasonable now. It'll be interesting to see where we are a week from now (it's not even been a week since the last big bug was fixed).
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
Large improvement from ID69 to ID83.
Now, in gauntlets of only 100 games against Zurichess Bern (2232 CCRL Elo) and BikJump v2.01 (2098 CCRL Elo), it performs at about the 2200 Elo level at 1s/move and at about the 2300 Elo level at 10s/move, on a full 4-core i7 CPU.
On my positional opening test suite of 200 positions, it is firmly settled among strong engines (20s/position):
Code: Select all
[Search parameters: MaxDepth=99 MaxTime=20.0 DepthDelta=2 MinDepth=7 MinTime=0.1]
Engine                        : Correct  TotalPos  Corr%  AveT(s)  MaxT(s)  TestFile
Komodo 10.2 64-bit            :     145       200   72.5      2.0     20.0  openings200beta07.epd
Houdini 5.01 Pro x64          :     144       200   72.0      2.4     20.0  openings200beta07.epd
Stockfish 8 64 BMI2           :     141       200   70.5      2.0     20.0  openings200beta07.epd
Houdini 5.01 Pro x64 Tactical :     139       200   69.5      2.3     20.0  openings200beta07.epd
Deep Shredder 13 x64          :     128       200   64.0      2.7     20.0  openings200beta07.epd
Houdini 4 Pro x64             :     126       200   63.0      1.8     20.0  openings200beta07.epd
Andscacs 0.88n                :     123       200   61.5      2.4     20.0  openings200beta07.epd
Houdini 4 Pro x64 Tactical    :     120       200   60.0      1.6     20.0  openings200beta07.epd
Nirvanachess 2.3              :     119       200   59.5      1.8     20.0  openings200beta07.epd
Fire 5 x64 (3341 CCRL)        :     110       200   55.0      3.0     20.0  openings200beta07.epd
Texel 1.06 (3162 CCRL)        :     110       200   55.0      1.6     20.0  openings200beta07.epd
LCZero ************* ID83     :     109       200   54.5      1.1     20.0  openings200beta07.epd
Fritz 15 (3227 CCRL)          :     102       200   51.0      1.9     20.0  openings200beta07.epd
LCZero ************* ID69     :      98       200   49.0      2.7     20.0  openings200beta07.epd
Fruit 2.1 (2685 CCRL)         :      91       200   45.5      1.5     20.0  openings200beta07.epd
Sjaak II 1.3.1 (2194 CCRL)    :      75       200   37.5      4.0     20.0  openings200beta07.epd
BikJump v2.01 (2098 CCRL)     :      74       200   37.0      1.6     20.0  openings200beta07.epd
It improved significantly positionally from ID69 (in only 3 days).
Tactically it seems very weak. On the ECM tactical middlegame suite of 879 positions, it performs very badly (1s/position):
Code: Select all
Bik Jump 2.01 (2098 CCRL Elo)
score=574/879 [averages on correct positions: depth=4.6 time=0.19 nodes=467671]
Predateur 2.2.1 (1786 CCRL Elo)
score=486/879 [averages on correct positions: depth=6.1 time=0.13 nodes=409596]
LCZero (ID83)
score=173/879 [averages on correct positions: depth=13.6 time=0.24 nodes=312]
LCZero (ID69)
score=171/879 [averages on correct positions: depth=13.5 time=0.25 nodes=318]
And it doesn't seem to improve at all. The estimated CCRL Elo level on this tactical suite is about that of Stockfish at depth=3, or maybe 1400 CCRL Elo points. Something has to be done with its MCTS search, maybe along the lines outlined by Daniel Shawul.
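For readers wondering how a gauntlet score maps to an Elo level: a rough sketch using the standard logistic Elo formula. The post does not give the actual game scores, so the 55% below is a made-up number for illustration only, and averaging the two opponents' ratings is a simplification, not necessarily how the figures above were derived.

```python
import math


def elo_diff(score: float) -> float:
    """Elo difference implied by a score fraction (0 < score < 1),
    using the standard logistic Elo model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


# Opponent ratings from the post (CCRL Elo).
opponents = [2232, 2098]
average_opponent = sum(opponents) / len(opponents)

# Hypothetical overall score fraction, NOT taken from the post.
hypothetical_score = 0.55

performance = average_opponent + elo_diff(hypothetical_score)
print(f"performance rating: {performance:.0f}")
```

With only 100 games per gauntlet the error margins on such an estimate are wide, so it should be read as a rough level rather than a precise rating.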
-
- Posts: 188
- Joined: Sun Dec 25, 2016 4:59 pm
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
Thanks for the update Kai!
On the one hand, it's quite possible that a fundamental change to its MCTS implementation will be required at some point if it wants to compete at the highest level, and the work Daniel Shawul has done with Scorpio could prove quite useful in that case (well, it's fantastic work in any case; it's just in this case that it would benefit LC0).
On the other hand, unless you subscribe to some form of conspiracy theory around the A0 results, we're nowhere near the limits of this sort of approach, so I wouldn't worry too much about that just yet.
Right now there are still a bunch of bugs being worked out, the network is still rather small, and the project is rather young (barely a month old, and it's barely been a week since the last major bug was discovered and fixed).
Some patience is required. It might turn out that switching to a new implementation of MCTS is required; it might also turn out that the NN at some level gives good enough prior probabilities for moves that even MCTS with averaging is good tactically.
We'll just have to give it some time