lczero rating
Posted: Mon Apr 02, 2018 7:31 pm
Hi, what is the current Elo rating (CCRL Elo) of LCZero? Also, is there any saturation so far?
stavros wrote:
Hi, what is the current Elo rating (CCRL Elo) of LCZero? Also, is there any saturation so far?

About 2000 CCRL Elo points at blitz (40/4') on a good GPU or a 4-8 core i7 CPU. Some saturation already seems to be appearing, but this can still be corrected by changing some parameters (noise, temperature). After that, another network will be necessary, starting from zero.
Laskos wrote:
About 2000 CCRL Elo points at blitz (40/4') on a good GPU or a 4-8 core i7 CPU. Some saturation already seems to be appearing, but this can still be corrected by changing some parameters (noise, temperature). After that, another network will be necessary, starting from zero.
Here is the plot as a function of the number of games:
http://lczero.org/
But it still might improve. LC0 seems to play terrible endgames (not in any way at 2000 Elo level) from what I saw with my own eyes; something should be done in this respect.
Also, positionally it can be strong in openings, much above 2000 Elo level. But tactically it is much below 2000 Elo points, and "much" can mean 1000 or so Elo points. Maybe some improvement to the MCTS rollouts can be improvised; Daniel Shawul posted some interesting results.

It will not be necessary to start from zero once the network stalls. Instead, a larger neural net can simply be trained on the existing self-play games, after which the net can continue to improve.
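For a sense of scale for the Elo gaps mentioned above (for instance a tactical deficit of "1000 or so Elo points"), here is a minimal sketch of the standard logistic Elo expected-score formula. The function name is made up for illustration, and the formula is the textbook one, not necessarily how CCRL computes its lists.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score (win = 1, draw = 0.5, loss = 0) for the side that is
    elo_diff points stronger, under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


if __name__ == "__main__":
    # Illustrative gaps only; 1000 is the gap mentioned in the thread.
    for gap in (0, 200, 1000):
        print(f"gap {gap:4d} Elo -> expected score {expected_score(gap):.3f}")
```

Under this model a 1000-point gap corresponds to the stronger side scoring about 99.7%, so a 1000-point tactical deficit would be very visible in practice.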
jkiliani wrote:
By the way, I don't find LC0's endgame terrible at all, at least not in match games. It's not optimised to win quickly, sure, but I rarely see it giving away a certain win. Taking too long to convert a won position might be unappealing to humans, but it is no sign of weakness as long as the position IS won in the end.

Yes, endgames are usually converted, if in silly ways, but I also saw some elementary misses.
Code:
[Search parameters: MaxDepth=99 MaxTime=20.0 DepthDelta=2 MinDepth=7 MinTime=0.1]
Engine                        : Correct  TotalPos  Corr%  AveT(s)  MaxT(s)  TestFile
Komodo 10.2 64-bit            :     145       200   72.5      2.0     20.0  openings200beta07.epd
Houdini 5.01 Pro x64          :     144       200   72.0      2.4     20.0  openings200beta07.epd
Stockfish 8 64 BMI2           :     141       200   70.5      2.0     20.0  openings200beta07.epd
Houdini 5.01 Pro x64 Tactical :     139       200   69.5      2.3     20.0  openings200beta07.epd
Deep Shredder 13 x64          :     128       200   64.0      2.7     20.0  openings200beta07.epd
Houdini 4 Pro x64             :     126       200   63.0      1.8     20.0  openings200beta07.epd
Andscacs 0.88n                :     123       200   61.5      2.4     20.0  openings200beta07.epd
Houdini 4 Pro x64 Tactical    :     120       200   60.0      1.6     20.0  openings200beta07.epd
Nirvanachess 2.3              :     119       200   59.5      1.8     20.0  openings200beta07.epd
Fire 5 x64                    :     110       200   55.0      3.0     20.0  openings200beta07.epd
Texel 1.06 64-bit             :     110       200   55.0      1.6     20.0  openings200beta07.epd
Fritz 15 (3227 CCRL)          :     102       200   51.0      1.9     20.0  openings200beta07.epd
LCZero ************* ID69     :      98       200   49.0      2.7     20.0  openings200beta07.epd
Fruit 2.1 (2685 CCRL)         :      91       200   45.5      1.5     20.0  openings200beta07.epd
Sjaak II 1.3.1 (2194 CCRL)    :      75       200   37.5      4.0     20.0  openings200beta07.epd
BikJump v2.01 (2098 CCRL)     :      74       200   37.0      1.6     20.0  openings200beta07.epd
Code:
BikJump v2.01 (2098 CCRL Elo)
  score=574/879 [averages on correct positions: depth=4.6 time=0.19 nodes=467671]
Predateur 2.2.1 (1786 CCRL Elo)
  score=486/879 [averages on correct positions: depth=6.1 time=0.13 nodes=409596]
LCZero (ID69)
  score=171/879 [averages on correct positions: depth=13.5 time=0.25 nodes=318]
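To compare the raw scores in the listing above more easily, they can be turned into solve percentages. This is only a small sketch: the numbers are copied from the listing, while the variable and function names are made up for illustration.

```python
# Solved positions out of 879, copied from the listing above.
scores = {
    "BikJump v2.01 (2098 CCRL Elo)": (574, 879),
    "Predateur 2.2.1 (1786 CCRL Elo)": (486, 879),
    "LCZero (ID69)": (171, 879),
}


def solve_rate(solved: int, total: int) -> float:
    """Percentage of test positions solved."""
    return 100.0 * solved / total


rates = {name: solve_rate(*pair) for name, pair in scores.items()}

if __name__ == "__main__":
    # Print engines from highest to lowest solve rate.
    for name, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:32s} {rate:5.1f}%")
```

On these figures LCZero ID69 solves a bit under 20% of the positions, against roughly 55% and 65% for the two conventional engines.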
stavros wrote:
Correct me if I am wrong, but even Google's AlphaZero's progress saturated after 700000 steps: https://arxiv.org/pdf/1712.01815.pdf#page=4
I can't imagine LCZero matching the latest top engines.
The latest SF dev + Cerebellum book is already close to AlphaZero.

What is "steps"?
jkiliani wrote:
It will not be necessary to start from zero once the network stalls. Instead, a larger neural net can simply be trained on the existing self-play games, after which the net can continue to improve.

What is the ratio of the time spent generating self-play games to the time spent training on them? If it is 10:1, for example, then creating a bigger NN and training it again does little harm, since you already have the self-play games.
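The 10:1 argument can be made concrete with a little arithmetic. The sketch below is hypothetical throughout: the 10:1 ratio is only the example figure from the post, not a measured value for LCZero, and `relative_training_cost` is an assumed knob for how much more expensive it is to train the bigger net.

```python
def retrain_overhead(selfplay_to_training_ratio: float,
                     relative_training_cost: float = 1.0) -> float:
    """Extra compute for one retraining pass on the existing games,
    as a fraction of the original total (self-play + training).

    selfplay_to_training_ratio: self-play time per unit of training time
        (10.0 means the 10:1 example from the post).
    relative_training_cost: assumed cost of training the new, larger net
        relative to training the old one (1.0 = same cost).
    """
    training = 1.0
    selfplay = selfplay_to_training_ratio * training
    return relative_training_cost * training / (selfplay + training)


if __name__ == "__main__":
    print(f"same-size net: {retrain_overhead(10.0):.1%} extra compute")
    print(f"net 2x as costly to train: {retrain_overhead(10.0, 2.0):.1%} extra compute")
```

With a 10:1 ratio, retraining from the stored games costs about 9% of what was already spent, or about 18% if the larger net is assumed to be twice as expensive to train.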
George Tsavdaris wrote:
What is "steps"?

From: https://arxiv.org/pdf/1712.01815.pdf#page=4