Lc0 51010

Posted: Fri Mar 29, 2019 5:21 am
by lkaufman
I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
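
For anyone wanting to check the 512 figure: a minimal sketch, assuming the standard logistic Elo model, in which a fractional score s corresponds to a rating difference of 400 * log10(s / (1 - s)). The function name `elo_diff_from_score` is made up for illustration. The 2750 rating for Fruit is the approximate CCRL figure quoted above.

```python
import math

def elo_diff_from_score(score: float) -> float:
    """Rating difference implied by a fractional score (logistic Elo model)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))

fruit_rating = 2750            # approximate CCRL rating quoted in the post
score = 9.5 / 10               # 9.5 points out of 10 games
margin = elo_diff_from_score(score)

print(round(margin))                 # 512
print(round(fruit_rating + margin))  # 3262, i.e. roughly the "3250 level" mentioned
```

With only ten games, the error bars on such an estimate are very wide, so the number is best read as a rough indication.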

Re: Lc0 51010

Posted: Fri Mar 29, 2019 5:36 am
by mwyoung
lkaufman wrote: Fri Mar 29, 2019 5:21 am I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
I will test the new NN. But your RTX 2080 is a beast. What hardware was Fruit running on in your match?

Re: Lc0 51010

Posted: Fri Mar 29, 2019 8:43 am
by mwyoung
Stockfish was not as impressed as Fruit. Lc0 51012 vs Stockfish
Results.jpg

Re: Lc0 51010

Posted: Fri Mar 29, 2019 10:50 am
by George Tsavdaris
lkaufman wrote: Fri Mar 29, 2019 5:21 am I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
Which GUI do you use to play with Leela?
Did you create a NEW engine for this 51010 net? If not, and you just altered the old settings, for example, it might still be using an older net (e.g. test40) that you possibly had in the folder.

Re: Lc0 51010

Posted: Fri Mar 29, 2019 12:38 pm
by Guenther
George Tsavdaris wrote: Fri Mar 29, 2019 10:50 am
lkaufman wrote: Fri Mar 29, 2019 5:21 am I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
Which GUI do you use to play with Leela?
Did you create a NEW engine for this 51010 net? If not, and you just altered the old settings, for example, it might still be using an older net (e.g. test40) that you possibly had in the folder.
I thought the same at first, but real blitz games told me otherwise ;-)
Even on my very slow GPU, NN 51010 is already very good, at least vs. humans under fair conditions.
(No thinking window, no evals to watch: no chance at 40/4 without preparation. I am comparable to slightly below 2000 FIDE now;
on lichess still 2050-2100 in 3+0 games.)

Re: Lc0 51010

Posted: Fri Mar 29, 2019 3:39 pm
by jhellis3
Now imagine if the training data wasn't filled with noise ;).

Re: Lc0 51010

Posted: Fri Mar 29, 2019 5:39 pm
by lkaufman
mwyoung wrote: Fri Mar 29, 2019 5:36 am
lkaufman wrote: Fri Mar 29, 2019 5:21 am I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
I will test the new NN. But your RTX 2080 is a beast. What hardware was Fruit running on in your match?
Fruit 2.2.1 was just a single-thread engine, but it was running on a very fast i7 (4.9 GHz). I also played five games against the much stronger Naum 4 running on 7 threads, which should be about 3100 CCRL, and Lc0 51010 won by 4 to 1, an even higher performance rating!
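
As a rough check of the "even higher performance rating" remark, here is a small sketch, again assuming the logistic Elo model and taking the approximate 3100 CCRL figure for Naum 4 at face value. The helper name `performance_rating` is hypothetical.

```python
import math

def performance_rating(opponent_rating: float, points: float, games: int) -> float:
    """Opponent rating plus the Elo difference implied by the match score."""
    score = points / games
    if not 0.0 < score < 1.0:
        raise ValueError("a 0% or 100% score gives no finite estimate")
    return opponent_rating + 400.0 * math.log10(score / (1.0 - score))

naum_perf = performance_rating(3100, 4, 5)   # 4 to 1 against Naum 4
print(round(naum_perf))                      # 3341
```

Five games is a tiny sample, so this only shows that the claim is consistent with the arithmetic, not that the rating is reliable.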

Re: Lc0 51010

Posted: Fri Mar 29, 2019 5:42 pm
by lkaufman
George Tsavdaris wrote: Fri Mar 29, 2019 10:50 am
lkaufman wrote: Fri Mar 29, 2019 5:21 am I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
Which GUI do you use to play with Leela?
Did you create a NEW engine for this 51010 net? If not, and you just altered the old settings, for example, it might still be using an older net (e.g. test40) that you possibly had in the folder.
Fritz GUI (technically Komodo 12 ChessBase GUI), and yes, new engine. It was obviously not playing like normal Lc0 versions. Evaluations of positions with material lost were drastically different.

Re: Lc0 51010

Posted: Fri Mar 29, 2019 5:51 pm
by lkaufman
mwyoung wrote: Fri Mar 29, 2019 8:43 am Stockfish was not as impressed as Fruit. Lc0 51012 vs Stockfish

Results.jpg
Well, I don't expect a 3250 engine to score much against SF on a powerful computer. The point is that if it is 3250 level with a self-play rating under 1500, what should it be with a 3000+ self-play rating in a few weeks? One point I'm not clear on: are the self-play ratings based on the 800 node games (if that is the current number), or are they based on games at 1' + 1", and if the latter, is it hardware-adjusted or just a mix of all hardware at that level?

Re: Lc0 51010

Posted: Fri Mar 29, 2019 7:40 pm
by Werewolf
lkaufman wrote: Fri Mar 29, 2019 5:51 pm
mwyoung wrote: Fri Mar 29, 2019 8:43 am Stockfish was not as impressed as Fruit. Lc0 51012 vs Stockfish

Results.jpg
Well I don't expect a 3250 engine to score much against SF on a powerful computer. The point is that if it is 3250 level with a self-play rating under 1500, what should it be with a 3000+ self-play rating in a few weeks?
Sadly not 3250 + 1500. The self-play ratings are... "not to scale", as an architect might put it.