Lc0 51010


lkaufman
Posts: 3722
Joined: Sun Jan 10, 2010 5:15 am
Location: Maryland USA

Lc0 51010

Post by lkaufman » Fri Mar 29, 2019 4:21 am

I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
Komodo rules!
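
For reference, the 512 Elo figure quoted above is consistent with the standard logistic Elo model applied to a 9.5 out of 10 score. The following is only a minimal sketch under that assumption; the function name `elo_diff_from_score` is hypothetical, and a ten-game sample is far too small to pin down a rating precisely.

```python
import math


def elo_diff_from_score(score: float, games: int) -> float:
    """Elo difference implied by a match score, assuming the logistic Elo model."""
    p = score / games  # fraction of points scored
    if not 0.0 < p < 1.0:
        # A 0% or 100% score implies an unbounded rating difference.
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(p / (1.0 - p))


# Figures taken from the post: 9.5/10 against Fruit 2.2.1 (about 2750 CCRL).
diff = elo_diff_from_score(9.5, 10)
performance = 2750 + diff
print(f"Elo difference: {diff:.0f}, performance rating: {performance:.0f}")
```

Adding the difference to the opponent's rating gives the performance rating, which is how a 2750-rated opponent leads to an estimate in the region of 3250.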

mwyoung
Posts: 1641
Joined: Wed May 12, 2010 8:00 pm

Re: Lc0 51010

Post by mwyoung » Fri Mar 29, 2019 4:36 am

lkaufman wrote:
Fri Mar 29, 2019 4:21 am
I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
I will test the new NN. But your RTX 2080 is a beast. What was Fruit tested with in your match?
Professing themselves to be wise, they became fools,
Take on me. foes 0

mwyoung
Posts: 1641
Joined: Wed May 12, 2010 8:00 pm

Re: Lc0 51010

Post by mwyoung » Fri Mar 29, 2019 7:43 am

Stockfish was not as impressed as Fruit. Lc0 51012 vs Stockfish
[Attachment: Results.jpg, match results for Lc0 51012 vs Stockfish]
Professing themselves to be wise, they became fools,
Take on me. foes 0

George Tsavdaris
Posts: 1599
Joined: Thu Mar 09, 2006 11:35 am

Re: Lc0 51010

Post by George Tsavdaris » Fri Mar 29, 2019 9:50 am

lkaufman wrote:
Fri Mar 29, 2019 4:21 am
I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
With what GUI do you play with Leela?
Did you create a NEW engine for this 51010 net? Because if not, and you just altered the old settings for example, it might still be using an older net (e.g. test40) that you possibly had in the folder.
After his son's birth they've asked him:
"Is it a boy or girl?"
YES! He replied.....

Guenther
Posts: 3041
Joined: Wed Oct 01, 2008 4:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: Lc0 51010

Post by Guenther » Fri Mar 29, 2019 11:38 am

George Tsavdaris wrote:
Fri Mar 29, 2019 9:50 am
lkaufman wrote:
Fri Mar 29, 2019 4:21 am
I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
With what GUI do you play with Leela?
Did you create a NEW engine for this 51010 net? Because if not, and you just altered the old settings for example, it might still be using an older net (e.g. test40) that you possibly had in the folder.
I thought the same at first, but real blitz games told me otherwise ;-)
Even on my very slow GPU, NN 51010 is already very good, at least vs. humans in fair conditions
(no thinking window, no evals to watch). I had no chance at 40/4 without preparation. For comparison, I am now slightly below 2000 FIDE,
and still 2050-2100 on lichess in 3-0 games.
Current foe list count : [97]
http://rwbc-chess.de/chronology.htm

jhellis3
Posts: 399
Joined: Fri Aug 16, 2013 10:36 pm

Re: Lc0 51010

Post by jhellis3 » Fri Mar 29, 2019 2:39 pm

Now imagine if the training data wasn't filled with noise ;).

lkaufman
Posts: 3722
Joined: Sun Jan 10, 2010 5:15 am
Location: Maryland USA

Re: Lc0 51010

Post by lkaufman » Fri Mar 29, 2019 4:39 pm

mwyoung wrote:
Fri Mar 29, 2019 4:36 am
lkaufman wrote:
Fri Mar 29, 2019 4:21 am
I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
I will test the new NN. But your RTX 2080 is a beast. What was Fruit tested with in your match?
Fruit 2.2.1 was just a single-thread engine, but it was running on a very fast i7 (4.9 GHz). I also played five games against the much stronger Naum 4 running on 7 threads, which should be about 3100 CCRL, and Lc0 51010 won by 4 to 1, an even higher performance rating!
Komodo rules!

lkaufman
Posts: 3722
Joined: Sun Jan 10, 2010 5:15 am
Location: Maryland USA

Re: Lc0 51010

Post by lkaufman » Fri Mar 29, 2019 4:42 pm

George Tsavdaris wrote:
Fri Mar 29, 2019 9:50 am
lkaufman wrote:
Fri Mar 29, 2019 4:21 am
I noticed that Lc0 started a new network from scratch, and after just ten iterations it showed a rating around 1500 (now 1466). I have read that these self-play ratings are inflated, so I imagined that its true strength was lower than 1500. So I played it a couple games at 5' + 5" (I'm too old for pure 5' blitz on a computer) and was surprised that I lost without a chance both games. I don't play GM level blitz at age 71, but I'm still at least 2200 level in blitz so this seemed rather strange. It's true that I have a very good GPU (RTX 2080) so I suppose this is the explanation, but anyway to check this out I set it to play some 1' + 1" games against Fruit 2.2.1 (rated around 2750 CCRL), and I was shocked to see this 1500 rated new Lc0 version beating 2750 rated Fruit by 9.5 to 0.5, a 512 elo margin!! Can a network less than one day old really be playing at 3250 level already (on good GPU)? Have others had similar experiences with new networks in the past? What does this mean?
With what GUI do you play with Leela?
Did you create a NEW engine for this 51010 net? Because if not, and you just altered the old settings for example, it might still be using an older net (e.g. test40) that you possibly had in the folder.
Fritz GUI (technically Komodo 12 ChessBase GUI), and yes, new engine. It was obviously not playing like normal Lc0 versions. Evaluations of positions with material lost were drastically different.
Komodo rules!

lkaufman
Posts: 3722
Joined: Sun Jan 10, 2010 5:15 am
Location: Maryland USA

Re: Lc0 51010

Post by lkaufman » Fri Mar 29, 2019 4:51 pm

mwyoung wrote:
Fri Mar 29, 2019 7:43 am
Stockfish was not as impressed as Fruit. Lc0 51012 vs Stockfish

Results.jpg
Well, I don't expect a 3250 engine to score much against SF on a powerful computer. The point is that if it is 3250 level with a self-play rating under 1500, what should it be with a 3000+ self-play rating in a few weeks? One point I'm not clear on: are the self-play ratings based on the 800 node games (if that is the current number), or are they based on games at 1' + 1", and if the latter, is it hardware-adjusted or just a mix of all hardware at that level?
Komodo rules!

Werewolf
Posts: 1193
Joined: Thu Sep 18, 2008 8:24 pm

Re: Lc0 51010

Post by Werewolf » Fri Mar 29, 2019 6:40 pm

lkaufman wrote:
Fri Mar 29, 2019 4:51 pm
mwyoung wrote:
Fri Mar 29, 2019 7:43 am
Stockfish was not as impressed as Fruit. Lc0 51012 vs Stockfish

Results.jpg
Well I don't expect a 3250 engine to score much against SF on a powerful computer. The point is that if it is 3250 level with a self-play rating under 1500, what should it be with a 3000+ self-play rating in a few weeks?
Sadly not 3250 + 1500. The self-play ratings are... "not to scale", as an architect might put it.
