Lc0 51010


lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Lc0 51010

Post by lkaufman »

I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
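For reference, the 512 Elo margin quoted above can be reproduced from the 9.5/10 score with the standard logistic Elo expectation formula. The following is only a minimal sketch; the function names are illustrative and not taken from any chess library or rating tool.

```python
import math

def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by a fractional score (0 < score < 1),
    using the standard logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)

def performance_rating(opponent_rating: float, points: float, games: int) -> float:
    """Opponent rating plus the Elo difference implied by the match score."""
    return opponent_rating + elo_diff_from_score(points / games)

# Figures from the post: 9.5 points out of 10 against an opponent rated about 2750.
margin = elo_diff_from_score(9.5 / 10)
perf = performance_rating(2750, 9.5, 10)
print(f"margin: {margin:.1f} Elo, performance: {perf:.1f}")
```

With only ten games this is a point estimate and says nothing about the uncertainty around it.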
Komodo rules!
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Lc0 51010

Post by mwyoung »

lkaufman wrote: Fri Mar 29, 2019 5:21 am I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
I will test the new NN. But your RTX 2080 is a beast. What was Fruit tested with in your match?
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Lc0 51010

Post by mwyoung »

Stockfish was not as impressed as Fruit. Lc0 51012 vs Stockfish
Results.jpg
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
George Tsavdaris
Posts: 1627
Joined: Thu Mar 09, 2006 12:35 pm

Re: Lc0 51010

Post by George Tsavdaris »

lkaufman wrote: Fri Mar 29, 2019 5:21 am I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
With what GUI do you play with Leela?
Did you create a NEW engine for this 51010 net? Because if not, and you just altered the old settings for example, it might still be using an older net (e.g. test40) that you possibly had in the folder.
After his son's birth they've asked him:
"Is it a boy or girl?"
YES! He replied.....
Guenther
Posts: 4605
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: Lc0 51010

Post by Guenther »

George Tsavdaris wrote: Fri Mar 29, 2019 10:50 am
lkaufman wrote: Fri Mar 29, 2019 5:21 am I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
With what GUI do you play with Leela?
Did you create a NEW engine for this 51010 net? Because if not, and you just altered the old settings for example, it might still be using an older net (e.g. test40) that you possibly had in the folder.
I thought the same at first, but real blitz games told me otherwise ;-)
Even on my very slow GPU, NN 51010 is already very good, at least against humans under fair conditions
(no thinking window, no evals to watch). I had no chance at 40/4 without preparation, and I am currently slightly below 2000 FIDE
(still 2050-2100 on lichess in 3+0 games).
https://rwbc-chess.de

trollwatch:
Chessqueen + chessica + AlexChess + Eduard + Sylwy
jhellis3
Posts: 546
Joined: Sat Aug 17, 2013 12:36 am

Re: Lc0 51010

Post by jhellis3 »

Now imagine if the training data wasn't filled with noise ;).
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Lc0 51010

Post by lkaufman »

mwyoung wrote: Fri Mar 29, 2019 5:36 am
lkaufman wrote: Fri Mar 29, 2019 5:21 am I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
I will test the new NN. But your RTX 2080 is a beast. What was Fruit tested with in your match?
Fruit 2.2.1 was just a single-thread engine, but it was running on a very fast i7 (4.9 GHz). I also played five games against the much stronger Naum 4 running on 7 threads, which should be about 3100 CCRL, and Lc0 51010 won by 4 to 1, an even higher performance rating!
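A five-game match leaves a lot of room for chance, so it may help to put a rough interval around the 4 to 1 result. The sketch below uses a Wilson score interval and treats each game as an independent win/loss trial, which is an assumption made here for illustration (it ignores draws and is not how CCRL or the Lc0 project compute error bars). The function names are hypothetical.

```python
import math

def elo_from_score(score: float) -> float:
    """Elo difference implied by a fractional score under the logistic model.
    Only defined for 0 < score < 1 (a perfect score gives an unbounded result)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def wilson_interval(points: float, games: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson interval for the expected score.
    Assumption: games are independent win/loss trials (draws are not modelled)."""
    p = points / games
    denom = 1.0 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return centre - half, centre + half

# Figures from the post: 4 points out of 5 against Naum 4.
low, high = wilson_interval(4, 5)
elo_low, elo_high = elo_from_score(low), elo_from_score(high)
print(f"score {low:.2f} to {high:.2f}, Elo difference {elo_low:+.0f} to {elo_high:+.0f}")
```

Under these assumptions the lower bound on the Elo difference comes out below zero, so five games alone cannot establish that the net is stronger than Naum 4; many more games would be needed to narrow the range.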
Komodo rules!
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Lc0 51010

Post by lkaufman »

George Tsavdaris wrote: Fri Mar 29, 2019 10:50 am
lkaufman wrote: Fri Mar 29, 2019 5:21 am I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
With what GUI do you play with Leela?
Did you create a NEW engine for this 51010 net? Because if not, and you just altered the old settings for example, it might still be using an older net (e.g. test40) that you possibly had in the folder.
Fritz GUI (technically Komodo 12 ChessBase GUI), and yes, new engine. It was obviously not playing like normal Lc0 versions. Evaluations of positions with material lost were drastically different.
Komodo rules!
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Lc0 51010

Post by lkaufman »

mwyoung wrote: Fri Mar 29, 2019 8:43 am Stockfish was not as impressed as Fruit. Lc0 51012 vs Stockfish

Results.jpg
Well I don't expect a 3250 engine to score much against SF on a powerful computer. The point is that if it is 3250 level with a self-play rating under 1500, what should it be with a 3000+ self-play rating in a few weeks? One point I'm not clear on: are the self-play ratings based on the 800 node games (if that is the current number), or are they based on games at 1' + 1", and if the latter, are they hardware-adjusted or just a mix of all hardware at that level?
Komodo rules!
Werewolf
Posts: 1795
Joined: Thu Sep 18, 2008 10:24 pm

Re: Lc0 51010

Post by Werewolf »

lkaufman wrote: Fri Mar 29, 2019 5:51 pm
mwyoung wrote: Fri Mar 29, 2019 8:43 am Stockfish was not as impressed as Fruit. Lc0 51012 vs Stockfish

Results.jpg
Well I don't expect a 3250 engine to score much against SF on a powerful computer. The point is that if it is 3250 level with a self-play rating under 1500, what should it be with a 3000+ self-play rating in a few weeks?
Sadly not 3250 + 1500. The self-play ratings are... "not to scale", as an architect might put it.