LCZero: Progress and Scaling. Relation to CCRL Elo

Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Laskos »

CMCanavessi wrote:Kai, how much nps are you getting running 4 CPU threads? Would it be possible to estimate how much you would get running around 43 like TCEC does, and the approx. strength?
NPS on 4 CPU threads ranges from 1500 to 5000 or more, depending on the position. It also increases with the allotted time and stabilizes after about 30 seconds on a position. I think this NPS is comparable to that of a good GPU. I think MCTS search parallelizes very well, so I would expect an improvement of 200-300 Elo points going from 4 to 43 cores, depending on time control. Also, in TCEC LTC conditions, as LCZero seems to improve with time control (it scales better than standard engines), I expect it to be at least at the 2300 Elo level (CCRL), probably more.
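
As a rough illustration of what the 200-300 Elo estimate implies, here is a minimal Python sketch. It assumes, purely for illustration, that every doubling of cores is worth the same Elo gain; the function name is hypothetical and nothing here is measured data.

```python
import math

def elo_per_doubling(cores_from: int, cores_to: int, elo_gain: float) -> float:
    """Elo per doubling of cores, assuming an equal gain for each doubling."""
    doublings = math.log2(cores_to / cores_from)
    return elo_gain / doublings

# Going from 4 to 43 cores is about 3.4 doublings.
low = elo_per_doubling(4, 43, 200)
high = elo_per_doubling(4, 43, 300)
print(round(low, 1), round(high, 1))
```

Under that assumption the estimate corresponds to somewhere between roughly 58 and 88 Elo per doubling of cores.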

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Laskos »

Laskos wrote:
It is interesting to compare LCZero with a standard engine of similar strength at 1s/move, Predateur 2.2.1.

The average length of the wins:
LCZero: 34.7 moves
Predateur: 51.9 moves

The paired histogram of the wins by each engine is the following:

[Image: paired histogram of the wins by each engine]

It seems LCZero plays the openings and early middlegames much better than the endgames, compared to a standard engine of similar strength. Endgames may remain a problem for a long time.
Confirmation that LCZero is much better in openings, at least positionally. With my Openings200beta07.epd suite, a fairly positional opening test suite, I got the following:

[Search parameters: MaxDepth=99   MaxTime=20.0   DepthDelta=2   MinDepth=7   MinTime=0.1] 

Engine                         : Correct  TotalPos  Corr%  AveT(s)  MaxT(s)  TestFile 
      
Komodo 10.2 64-bit             :     145       200   72.5      2.0     20.0  openings200beta07.epd 
Houdini 5.01 Pro x64           :     144       200   72.0      2.4     20.0  openings200beta07.epd    
Stockfish 8 64 BMI2            :     141       200   70.5      2.0     20.0  openings200beta07.epd 
Houdini 5.01 Pro x64 Tactical  :     139       200   69.5      2.3     20.0  openings200beta07.epd      
Deep Shredder 13 x64           :     128       200   64.0      2.7     20.0  openings200beta07.epd    
Houdini 4 Pro x64              :     126       200   63.0      1.8     20.0  openings200beta07.epd    
Andscacs 0.88n                 :     123       200   61.5      2.4     20.0  openings200beta07.epd 
Houdini 4 Pro x64 Tactical     :     120       200   60.0      1.6     20.0  openings200beta07.epd 
Nirvanachess 2.3               :     119       200   59.5      1.8     20.0  openings200beta07.epd 
Fire 5 x64                     :     110       200   55.0      3.0     20.0  openings200beta07.epd    
Texel 1.06 64-bit              :     110       200   55.0      1.6     20.0  openings200beta07.epd    
Fritz 15                       :     102       200   51.0      1.9     20.0  openings200beta07.epd    
Fruit 2.1      (2685)          :      91       200   45.5      1.5     20.0  openings200beta07.epd  

LCZero  *************          :      90       200   45.0      1.7     20.0  openings200beta07.epd
  
Sjaak II 1.3.1 (2194)          :      75       200   37.5      4.0     20.0  openings200beta07.epd    
BikJump v2.01  (2098)          :      74       200   37.0      1.6     20.0  openings200beta07.epd
The maximum time was 20 s per position. I added the CCRL 40/4 Elo ratings of the engines ranked near LCZero. It is remarkable that, positionally in the openings, LCZero is at the level of a roughly 2700 Elo engine like Fruit 2.1.
CheckersGuy
Posts: 273
Joined: Wed Aug 24, 2016 9:49 pm

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by CheckersGuy »

That's indeed a very impressive result, but that's probably what neural nets are good at. It's kind of interesting: weaker traditional alpha-beta engines are decent at tactics and suffer from bad positional play, while with Leela0 it's the other way around :lol:
Uri Blass
Posts: 10269
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Uri Blass »

Laskos wrote:
CMCanavessi wrote:Kai, how much nps are you getting running 4 CPU threads? Would it be possible to estimate how much you would get running around 43 like TCEC does, and the approx. strength?
NPS on 4 CPU threads ranges from 1500 to 5000 or more, depending on the position. It also increases with the allotted time and stabilizes after about 30 seconds on a position. I think this NPS is comparable to that of a good GPU. I think MCTS search parallelizes very well, so I would expect an improvement of 200-300 Elo points going from 4 to 43 cores, depending on time control. Also, in TCEC LTC conditions, as LCZero seems to improve with time control (it scales better than standard engines), I expect it to be at least at the 2300 Elo level (CCRL), probably more.
Weak engines do not scale well, based on my experience.

That means that if engine A at 10 seconds per move is at the same level as engine B at 1 second per move, then engine A probably needs more than 100 seconds per move to be at the same level as engine B at 10 seconds per move.

I think it may be interesting to test LCZero against Stockfish with a fixed number of nodes per move.

First find a number of nodes K such that LCZero at 1 second per move is at the same level as Stockfish at K nodes per move, and then test LCZero at 10 seconds per move against Stockfish at 10×K nodes per move to see which scales better.

I suggest K nodes per move for Stockfish because I am sure that today LCZero is too weak to beat Stockfish even at a 100:1 time handicap, and hopefully 10×K nodes per move takes about 10 times as long as K nodes per move.
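
The calibration step of this proposal (finding K) could be sketched as a bisection over node counts. This is only a sketch under assumptions: `score_vs_nodes` is a hypothetical callback that would play a match and return the fixed-node engine's score fraction, the score is assumed to increase with the node count, and the toy model used to exercise it (100 Elo per doubling, equal strength at 1000 nodes) is made up for illustration, not measured.

```python
import math
from typing import Callable

def find_equal_nodes(score_vs_nodes: Callable[[int], float],
                     lo: int, hi: int, tol: float = 0.02) -> int:
    """Bisect on a log scale for a node count K giving a score near 50%.

    score_vs_nodes(k) is a hypothetical callback: the score fraction (0..1)
    of the fixed-node engine at k nodes/move against the fixed-time engine.
    It is assumed to be monotonically increasing in k.
    """
    while hi - lo > 1:
        mid = int(round(math.sqrt(lo * hi)))   # geometric midpoint
        mid = min(max(mid, lo + 1), hi - 1)    # keep strictly inside (lo, hi)
        score = score_vs_nodes(mid)
        if abs(score - 0.5) <= tol:
            return mid
        if score > 0.5:
            hi = mid   # fixed-node engine too strong: use fewer nodes
        else:
            lo = mid   # fixed-node engine too weak: use more nodes
    return lo

def toy_score(nodes: int) -> float:
    """Made-up model: equal at 1000 nodes, 100 Elo per doubling of nodes."""
    elo = 100.0 * math.log2(nodes / 1000.0)
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

k = find_equal_nodes(toy_score, 100, 100_000)
print(k, round(toy_score(k), 3))
```

With K in hand, the second step is a single match: the fixed-time engine at 10 times the time against the other engine at 10×K nodes per move. In a real test each call to the callback is a full match with its own error bars, so the tolerance should not be tighter than the match length can resolve.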

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Laskos »

Well, it will be compared to weak engines, not to top ones. But I will check your theory (which seems valid to me).
I managed to roughly equalize the strength:

SF9 1000 nodes/move vs LCZero 0.25s/move:
55.5 : 44.5

Now I will test
SF9 8000 nodes/move vs LCZero 2.0s/move
?

That is 3 doublings in time (a factor of 8). It will take some time; I will post the result later.

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Laskos »

SF9 1000 nodes/move vs LCZero 0.25s/move:
55.5 : 44.5

SF9 8000 nodes/move vs LCZero 2.0s/move
83.5 : 16.5

So, SF9 indeed scales significantly better, as you thought. But keep in mind that the base time for SF9 is about 1 ms, some 250 times smaller than the base time of LCZero. It's a bit like comparing apples to oranges: doublings at a hugely shorter TC obviously give more Elo. Comparing the scaling with modern engines of similar strength, like Zurichess App. and Predateur, all at similar time controls, shows better scaling for LCZero. My guess is that if LCZero becomes comparable in strength to SF9 at the same time used, it will scale better than SF9. But let's see.
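
To put the two results above on an Elo scale, here is a minimal Python sketch using the standard logistic Elo formula. It assumes the two numbers in each result are score percentages (they sum to 100), and it ignores draws and error margins; the function name is just for illustration.

```python
import math

def elo_diff(score: float) -> float:
    """Elo difference implied by a score fraction (logistic Elo model)."""
    return 400.0 * math.log10(score / (1.0 - score))

elo_short = elo_diff(0.555)  # SF9 1000 nodes/move vs LCZero 0.25s/move
elo_long = elo_diff(0.835)   # SF9 8000 nodes/move vs LCZero 2.0s/move
print(round(elo_short), round(elo_long), round(elo_long - elo_short))
```

This gives roughly 38 Elo in favor of SF9 at the short setting and roughly 282 Elo at the long one, so over the 3 doublings SF9 gained about 240 Elo relative to LCZero, or around 80 Elo per doubling.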

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Uri Blass »

I think that for correspondence players it may be interesting if some weaker program scales better than Stockfish in this type of test, because in that case there is a reason to use it for analysis (hoping that giving that program 24 hours on some position is better, even if it is worse in the rating lists, because no rating list uses 24 hours per move).
peter
Posts: 3185
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by peter »

Hi Robin!
CheckersGuy wrote:That's indeed a very impressive result, but that's probably what neural nets are good at. It's kind of interesting: weaker traditional alpha-beta engines are decent at tactics and suffer from bad positional play, while with Leela0 it's the other way around :lol:
Well, I'd admit that the opening has become better, but I still wouldn't call that good positional play:
[pgn]
[Event "?"]
[Site "?"]
[Date "2018.03.31"]
[Round "?"]
[White "CuckooChess 1.13a9"]
[Black "play.lczero.org"]
[ECO "C50"]
[Result "1-0"]

1. e4 e5 2. Nf3 Nc6 3. Bc4 d6 4. O-O Be7 5. d4 Nxd4 6. Nxd4
exd4 7. Qh5 g6 8. Qd5 Be6 9. Qxb7 Nf6 10. Bxe6 fxe6 11. Rd1
e5 12. Qc6+ Kf7 13. c3 Rb8 14. cxd4 exd4 15. Rxd4 Rb6
16. Qc2 Rf8 17. e5 Nd7 18. e6+ Kxe6 19. Qc4+ Kf6 20. Bh6
Re8 21. Rf4+ Ke5 22. Qe4# 1-0
[/pgn]

CuckooChess was running on a Sony Xperia with 15" per game and Leela in slow mode. This was before the server was changed to an older computer over the weekend, but after the change to the latest NN version.
Peter.

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Laskos »

I am not sure I understood: the hardware was very weak, and LC0 improves greatly with time and hardware. By move 11. Rd1, LC0 had a better position, although the earlier moves were not nice (but not obviously wrong). This morning I tested the latest network, ID69, and compared to the 2 days older network ID56, it performs significantly better in my opening positional suite.

[Search parameters: MaxDepth=99   MaxTime=20.0   DepthDelta=2   MinDepth=7   MinTime=0.1] 

Engine                         : Correct  TotalPos  Corr%  AveT(s)  MaxT(s)  TestFile 
      
Komodo 10.2 64-bit             :     145       200   72.5      2.0     20.0  openings200beta07.epd 
Houdini 5.01 Pro x64           :     144       200   72.0      2.4     20.0  openings200beta07.epd    
Stockfish 8 64 BMI2            :     141       200   70.5      2.0     20.0  openings200beta07.epd 
Houdini 5.01 Pro x64 Tactical  :     139       200   69.5      2.3     20.0  openings200beta07.epd      
Deep Shredder 13 x64           :     128       200   64.0      2.7     20.0  openings200beta07.epd    
Houdini 4 Pro x64              :     126       200   63.0      1.8     20.0  openings200beta07.epd    
Andscacs 0.88n                 :     123       200   61.5      2.4     20.0  openings200beta07.epd 
Houdini 4 Pro x64 Tactical     :     120       200   60.0      1.6     20.0  openings200beta07.epd 
Nirvanachess 2.3               :     119       200   59.5      1.8     20.0  openings200beta07.epd 
Fire 5 x64                     :     110       200   55.0      3.0     20.0  openings200beta07.epd    
Texel 1.06 64-bit              :     110       200   55.0      1.6     20.0  openings200beta07.epd    
Fritz 15       (3227)          :     102       200   51.0      1.9     20.0  openings200beta07.epd  

LCZero  *************  ID69    :      98       200   49.0      2.7     20.0  openings200beta07.epd 
  
Fruit 2.1      (2685)          :      91       200   45.5      1.5     20.0  openings200beta07.epd  

LCZero  *************  ID56    :      90       200   45.0      1.7     20.0  openings200beta07.epd 
  
Sjaak II 1.3.1 (2194)          :      75       200   37.5      4.0     20.0  openings200beta07.epd    
BikJump v2.01  (2098)          :      74       200   37.0      1.6     20.0  openings200beta07.epd
Maximum time was 20s/position.
LC0 already seems close to very strong engines in this opening suite. At this pace of advancement in positional understanding, I am very curious how it will develop.
sovaz1997
Posts: 261
Joined: Sun Nov 13, 2016 10:37 am

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by sovaz1997 »

Hi! Where do I put the weights for the LCZero neural network? I can't run the UCI engine (output: "A network weights file is required to use the program"). Thanks!