lczero rating
-
- Posts: 165
- Joined: Tue Dec 02, 2014 1:29 am
lczero rating
Hi, what is the current Elo rating (CCRL Elo) of LCZero? Also, is there any saturation so far?
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: lczero rating
About 2000 CCRL Elo points at blitz (40/4') on a good GPU or a 4-8 core i7 CPU. Some saturation already seems to be setting in, but this can still be corrected by changing some parameters (noise, temperature). After that, another network will be necessary, starting from zero.
Here is the plot of rating against the number of games:
http://lczero.org/
But it still might improve. From what I saw with my own eyes, LC0 seems to play terrible endgames (not in any way at 2000 Elo level); something should be done in this respect.
Also, positionally it can be strong in openings, much above 2000 Elo level. But tactically it is much below 2000 Elo level, and "much" can mean 1000 or so Elo points. Maybe some improvement to the MCTS rollouts can be improvised; Daniel Shawul posted some interesting results.
-
- Posts: 143
- Joined: Wed Jan 17, 2018 1:26 pm
Re: lczero rating
Laskos wrote: [...] Then, another network will be necessary, and starting from zero.
It will not be necessary to start from zero once the network stalls. Instead, a larger neural net can simply be trained from the existing self-play games; afterward, the net can continue to improve.
By the way, I don't find LC0's endgame terrible at all, at least not in match games. It's not optimised to win quickly, sure, but I rarely see it giving away a certain win. Taking too long to convert a won position might be unappealing to humans, but is no sign of weakness as long as the position IS won in the end.
-
- Posts: 550
- Joined: Thu Apr 24, 2008 9:31 am
- Location: Belgium
Re: lczero rating
LCZero vs Stockfish : https://docs.google.com/spreadsheets/d/ ... edit#gid=0
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: lczero rating
jkiliani wrote: By the way, I don't find LC0's endgame terrible at all, at least not in match games. [...]
Yes, endgames are usually converted, if in silly ways, but I also saw some elementary misses.
Compare opening positional versus middlegame tactical abilities:
On positional opening suite:
Openings200beta07 (200 positions, 20s/position)
[Search parameters: MaxDepth=99 MaxTime=20.0 DepthDelta=2 MinDepth=7 MinTime=0.1]
Engine                         : Correct  TotalPos  Corr%  AveT(s)  MaxT(s)  TestFile
Komodo 10.2 64-bit             :     145       200   72.5      2.0     20.0  openings200beta07.epd
Houdini 5.01 Pro x64           :     144       200   72.0      2.4     20.0  openings200beta07.epd
Stockfish 8 64 BMI2            :     141       200   70.5      2.0     20.0  openings200beta07.epd
Houdini 5.01 Pro x64 Tactical  :     139       200   69.5      2.3     20.0  openings200beta07.epd
Deep Shredder 13 x64           :     128       200   64.0      2.7     20.0  openings200beta07.epd
Houdini 4 Pro x64              :     126       200   63.0      1.8     20.0  openings200beta07.epd
Andscacs 0.88n                 :     123       200   61.5      2.4     20.0  openings200beta07.epd
Houdini 4 Pro x64 Tactical     :     120       200   60.0      1.6     20.0  openings200beta07.epd
Nirvanachess 2.3               :     119       200   59.5      1.8     20.0  openings200beta07.epd
Fire 5 x64                     :     110       200   55.0      3.0     20.0  openings200beta07.epd
Texel 1.06 64-bit              :     110       200   55.0      1.6     20.0  openings200beta07.epd
Fritz 15 (3227 CCRL)           :     102       200   51.0      1.9     20.0  openings200beta07.epd
LCZero ************* ID69      :      98       200   49.0      2.7     20.0  openings200beta07.epd
Fruit 2.1 (2685 CCRL)          :      91       200   45.5      1.5     20.0  openings200beta07.epd
Sjaak II 1.3.1 (2194 CCRL)     :      75       200   37.5      4.0     20.0  openings200beta07.epd
BikJump v2.01 (2098 CCRL)      :      74       200   37.0      1.6     20.0  openings200beta07.epd
On tactical middlegame suite:
ECM (879 positions, 1s/position)
BikJump v2.01 (2098 CCRL Elo)
score=574/879 [averages on correct positions: depth=4.6 time=0.19 nodes=467671]
Predateur 2.2.1 (1786 CCRL Elo)
score=486/879 [averages on correct positions: depth=6.1 time=0.13 nodes=409596]
LCZero (ID69)
score=171/879 [averages on correct positions: depth=13.5 time=0.25 nodes=318]
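To put the ECM numbers side by side, here is a rough Python sketch that turns the scores above into solve rates and an approximate nodes-per-second figure. The nodes/time ratio is only an approximation, because both values are separate averages over the solved positions; the `summarize` helper is a name made up for this illustration.

```python
# ECM results quoted above: (solved, total, avg seconds, avg nodes),
# where the averages are taken over the solved positions only.
results = {
    "BikJump v2.01": (574, 879, 0.19, 467671),
    "Predateur 2.2.1": (486, 879, 0.13, 409596),
    "LCZero (ID69)": (171, 879, 0.25, 318),
}


def summarize(solved, total, avg_time, avg_nodes):
    """Return (solve rate in percent, rough nodes per second)."""
    return 100.0 * solved / total, avg_nodes / avg_time


for name, row in results.items():
    rate, nps = summarize(*row)
    print(f"{name:16s} solved {rate:5.1f}%  ~{nps:,.0f} nodes/s")
```

The gap in nodes per second is expected: LCZero evaluates every node with the neural net, so it searches far fewer nodes than an alpha-beta engine in the same time.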
-
- Posts: 165
- Joined: Tue Dec 02, 2014 1:29 am
Re: lczero rating
Correct me if I'm wrong, but even Google's AlphaZero progress saturated after 700,000 steps: https://arxiv.org/pdf/1712.01815.pdf#page=4
I can't imagine LCZero matching the latest top engines.
The latest SF dev + Cerebellum book is already close to AlphaZero.
-
- Posts: 1142
- Joined: Thu Dec 28, 2017 4:06 pm
- Location: Argentina
Re: lczero rating
After Leela finally beat TSCP I had to get a newer, stronger opponent and this time it was Vice 1.1, which is around 300 elo stronger than TSCP.
I matched Leela ID 69 vs Vice 1.1 in a 40/40 10-game match (one of the games went on for more than 7 hours!!) and the end result was a surprising 5.5-4.5, in favor of Vice. I would have expected Vice to dominate much more, but looks like Leela is learning tricks really fast.
Here's the 10-game match PGN: http://www.mediafire.com/file/l1sltwy5k ... 20Vice.pgn
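For anyone who wants to turn a match score like this into an Elo estimate, here is a minimal Python sketch using the standard logistic Elo formula. The `elo_diff` helper and the error band are illustrative assumptions, not something taken from the match files; the band uses a crude upper bound on the per-game standard deviation (0.5), since the number of draws is not stated here.

```python
import math


def elo_diff(score_fraction: float) -> float:
    """Elo difference implied by a score fraction under the logistic Elo model."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# Match result quoted above: Vice 5.5, Leela 4.5 over 10 games.
vice_points, games = 5.5, 10
score = vice_points / games
print(f"Vice score: {score:.2f}, implied Elo edge: {elo_diff(score):+.0f}")

# Crude upper bound on the standard error of the score (per-game std <= 0.5).
stderr = 0.5 / math.sqrt(games)
low, high = score - 2 * stderr, score + 2 * stderr
print(f"Rough 2-sigma band: {elo_diff(low):+.0f} to {elo_diff(high):+.0f} Elo")
```

With only 10 games the band is several hundred Elo wide, so the result says little more than that the two engines are in the same general range.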
Follow my tournament and some Leela gauntlets live at http://twitch.tv/ccls
-
- Posts: 1627
- Joined: Thu Mar 09, 2006 12:35 pm
Re: lczero rating
stavros wrote: Correct me if I'm wrong, but even Google's AlphaZero progress saturated after 700,000 steps [...]
What is "steps"?
After his son's birth they've asked him:
"Is it a boy or girl?"
YES! He replied.....
-
- Posts: 1627
- Joined: Thu Mar 09, 2006 12:35 pm
Re: lczero rating
jkiliani wrote: It will not be necessary to start from zero once the network stalls. Instead, a larger neural net can simply be trained from existing self-play games, afterward the net can continue to improve.
What is the ratio of the time spent generating self-play games to the time spent training on them? If it is 10:1, for example, then creating a bigger NN and training it again does no harm once you have the self-play games.
BUT since these self-play games were played by a smaller (and weaker) NN, doesn't training a bigger NN on them make for a non-optimal procedure?
-
- Posts: 165
- Joined: Tue Dec 02, 2014 1:29 am
Re: lczero rating
George Tsavdaris wrote: What is "steps"?
From https://arxiv.org/pdf/1712.01815.pdf#page=4:
"We trained a separate instance of AlphaZero for each game. Training proceeded for 700,000 steps (mini-batches of size 4,096) starting from randomly initialised parameters, using 5,000 first-generation TPUs (15) to generate self-play games and 64 second-generation TPUs to train the neural networks. Further details of the training procedure are provided in the Methods."
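In other words, a "step" here is one network update on one mini-batch. A small Python sketch of the arithmetic, using only the two figures from the quote (the variable names are just for illustration):

```python
# Figures from the quoted paragraph of the AlphaZero paper.
steps = 700_000       # training steps (one mini-batch update each)
batch_size = 4_096    # positions per mini-batch

# Total positions fed to the network over the whole run.
# Positions may be sampled more than once, so this counts samples,
# not unique positions or games.
positions_sampled = steps * batch_size
print(f"{positions_sampled:,} positions sampled")
```

That comes to roughly 2.9 billion position samples over the whole training run.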