LCZero: Progress and Scaling. Relation to CCRL Elo

Laskos · Post by **Laskos** » Sat Mar 31, 2018 7:00 pm

CMCanavessi wrote: Kai, how many nps are you getting running 4 CPU threads? Would it be possible to estimate how many you would get running around 43 like TCEC does, and the approximate strength?

NPS on 4 CPU threads ranges from 1500 to 5000 or more, depending on the position. It also increases with the allotted time and stabilizes after about 30 seconds on a position. I think this NPS is comparable to that of a good GPU. I think MCTS search parallelizes very well, so I would expect an improvement of 200-300 Elo points going from 4 to 43 cores, depending on time control. Also, in TCEC LTC conditions, since LCZero seems to improve with time control (it scales better than standard engines), I expect it to be at least at the 2300 Elo level (CCRL), probably more.
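As a rough back-of-the-envelope reading of that estimate (not part of the original post), the 4-to-43-core jump can be expressed as a number of doublings and an implied Elo gain per doubling. The sketch below assumes the gain is spread evenly over the doublings, which is a simplification; the function name is made up for illustration.

```python
import math

def elo_per_doubling(total_elo, cores_from, cores_to):
    """Implied Elo gain per doubling of cores, assuming a uniform gain per doubling."""
    doublings = math.log2(cores_to / cores_from)
    return total_elo / doublings

doublings = math.log2(43 / 4)          # about 3.4 doublings from 4 to 43 cores
low = elo_per_doubling(200, 4, 43)     # lower end of the 200-300 Elo estimate
high = elo_per_doubling(300, 4, 43)    # upper end of the 200-300 Elo estimate
print(f"{doublings:.2f} doublings, {low:.0f}-{high:.0f} Elo per doubling")
```

Under that assumption, the estimate corresponds to somewhere around 60 to 90 Elo per doubling of cores.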

Laskos · Post by **Laskos** » Sat Mar 31, 2018 7:18 pm

Laskos wrote:
Interesting to compare LCZero with a standard engine of similar strength at 1s/move, Predateur 2.2.1.

The average length of the Wins:
LCZero: 34.7 moves
Predateur: 51.9 moves

The paired histogram of the wins by each engine: [image not preserved]

It seems LCZero plays the openings and early middlegames much better than the endgames, compared to a standard engine of similar strength. Endgames may remain a problem for a long time.

Confirmation that LCZero is much better in openings, at least positionally. With my Openings200beta07.epd opening test suite, which is fairly positional, I got the following:

[Search parameters: MaxDepth=99   MaxTime=20.0   DepthDelta=2   MinDepth=7   MinTime=0.1]

Engine                         : Correct  TotalPos  Corr%  AveT(s)  MaxT(s)  TestFile

Komodo 10.2 64-bit             :     145       200   72.5      2.0     20.0  openings200beta07.epd
Houdini 5.01 Pro x64           :     144       200   72.0      2.4     20.0  openings200beta07.epd
Stockfish 8 64 BMI2            :     141       200   70.5      2.0     20.0  openings200beta07.epd
Houdini 5.01 Pro x64 Tactical  :     139       200   69.5      2.3     20.0  openings200beta07.epd
Deep Shredder 13 x64           :     128       200   64.0      2.7     20.0  openings200beta07.epd
Houdini 4 Pro x64              :     126       200   63.0      1.8     20.0  openings200beta07.epd
Andscacs 0.88n                 :     123       200   61.5      2.4     20.0  openings200beta07.epd
Houdini 4 Pro x64 Tactical     :     120       200   60.0      1.6     20.0  openings200beta07.epd
Nirvanachess 2.3               :     119       200   59.5      1.8     20.0  openings200beta07.epd
Fire 5 x64                     :     110       200   55.0      3.0     20.0  openings200beta07.epd
Texel 1.06 64-bit              :     110       200   55.0      1.6     20.0  openings200beta07.epd
Fritz 15                       :     102       200   51.0      1.9     20.0  openings200beta07.epd
Fruit 2.1      (2685)          :      91       200   45.5      1.5     20.0  openings200beta07.epd

LCZero  *************          :      90       200   45.0      1.7     20.0  openings200beta07.epd

Sjaak II 1.3.1 (2194)          :      75       200   37.5      4.0     20.0  openings200beta07.epd
BikJump v2.01  (2098)          :      74       200   37.0      1.6     20.0  openings200beta07.epd

The maximum time was 20s per position. I added the CCRL 40/4 Elo ratings of the engines ranked near LCZero; it is remarkable that, positionally in the openings, LCZero is at the level of a roughly 2700 Elo engine such as Fruit 2.1.

CheckersGuy · Post by **CheckersGuy** » Sat Mar 31, 2018 7:26 pm

That's indeed a very impressive result, but that's probably what neural nets are good at. It's kind of interesting: weaker traditional alpha-beta engines are decent at tactics and suffer from bad positional play, while with Leela0 it's the other way around.

Uri Blass · Post by **Uri Blass** » Sat Mar 31, 2018 8:06 pm

Laskos wrote:
CMCanavessi wrote: Kai, how many nps are you getting running 4 CPU threads? Would it be possible to estimate how many you would get running around 43 like TCEC does, and the approximate strength?
NPS on 4 CPU threads ranges from 1500 to 5000 or more, depending on the position. It also increases with the allotted time and stabilizes after about 30 seconds on a position. I think this NPS is comparable to that of a good GPU. I think MCTS search parallelizes very well, so I would expect an improvement of 200-300 Elo points going from 4 to 43 cores, depending on time control. Also, in TCEC LTC conditions, since LCZero seems to improve with time control (it scales better than standard engines), I expect it to be at least at the 2300 Elo level (CCRL), probably more.

Weak engines do not scale well, based on my experience.

It means that if engine A at 10 seconds per move is at the same level as engine B at 1 second per move, then engine A probably needs more than 100 seconds per move to be at the same level as engine B at 10 seconds per move.

I think it may be interesting to test LCZero against Stockfish with a fixed number of nodes per move.

First find a number of nodes K such that LCZero at 1 second per move is at the same level as Stockfish at K nodes per move, and then test LCZero at 10 seconds per move against Stockfish at 10×K nodes per move to see which scales better.

I suggest K nodes per move for Stockfish because I am sure that today LCZero is too weak to beat Stockfish even at a 100:1 time handicap, and hopefully 10×K nodes per move takes about 10 times as long as K nodes per move.

Laskos · Post by **Laskos** » Sat Mar 31, 2018 9:36 pm

Well, it will be compared to weak engines, not to top ones. But I will check your theory (which seems valid to me).
I managed to roughly equalize the strength:

SF9 1000 nodes/move vs LCZero 0.25s/move:
55.5 : 44.5

Now I will test
SF9 8000 nodes/move vs LCZero 2.0s/move
?

That is 3 doublings in time (a factor of 8). It will take some time; I will post the result later.

Laskos · Post by **Laskos** » Sun Apr 01, 2018 12:10 am

SF9 1000 nodes/move vs LCZero 0.25s/move:
55.5 : 44.5

SF9 8000 nodes/move vs LCZero 2.0s/move
83.5 : 16.5

So SF9 indeed scales significantly better, as you thought. But keep in mind that the base time for SF9 is about 1 ms, some 250 times shorter than the base time of LCZero. It is a bit like comparing apples to oranges: doublings at a hugely shorter TC obviously give more Elo. Comparing the scaling against modern engines of similar strength, such as Zurichess App. and Predateur, with similar time controls for all, gives better scaling for LCZero. My guess is that if LCZero becomes comparable in strength to SF9 at the same time used, it will scale better than SF9. But let's see.
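To put rough numbers on the two match results above, here is a minimal sketch (not from the original post) that converts the scores to Elo differences with the standard logistic Elo formula. It assumes the scores are percentages of points (so 55.5 means a score fraction of 0.555) and ignores error margins, since the number of games is not stated.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction in (0, 1), logistic Elo model."""
    return 400.0 * math.log10(score / (1.0 - score))

base = elo_diff(0.555)     # SF9 1000 nodes/move vs LCZero 0.25s/move
scaled = elo_diff(0.835)   # SF9 8000 nodes/move vs LCZero 2.0s/move

# Both sides received 3 doublings, so this is SF9's gain relative to LCZero.
gain_per_doubling = (scaled - base) / 3

print(f"base: {base:+.0f} Elo, scaled: {scaled:+.0f} Elo, "
      f"relative gain per doubling: {gain_per_doubling:.0f} Elo")
```

Under these assumptions the first match is roughly +38 Elo for SF9 and the second roughly +282 Elo, so SF9 gained on the order of 80 Elo per doubling relative to LCZero in this particular setup.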

Uri Blass · Post by **Uri Blass** » Sun Apr 01, 2018 2:29 am

I think that for correspondence players it may be interesting if some weaker program scales better than Stockfish in this type of test, because in that case there is a reason to use it for analysis (hoping that giving that program 24 hours on some position is better, even if it is worse in the rating lists, because no rating list uses 24 hours per move).

peter · Post by **peter** » Sun Apr 01, 2018 6:45 am

Hi Robin!

CheckersGuy wrote: That's indeed a very impressive result, but that's probably what neural nets are good at. It's kind of interesting: weaker traditional alpha-beta engines are decent at tactics and suffer from bad positional play, while with Leela0 it's the other way around.

Well, I'd admit that the opening play has become better, but I wouldn't yet call this good positional play:
[pgn]
[Event "?"]
[Site "?"]
[Date "2018.03.31"]
[Round "?"]
[White "CuckooChess 1.13a9"]
[Black "play.lczero.org"]
[ECO "C50"]
[Result "1-0"]

1. e4 e5 2. Nf3 Nc6 3. Bc4 d6 4. O-O Be7 5. d4 Nxd4 6. Nxd4
exd4 7. Qh5 g6 8. Qd5 Be6 9. Qxb7 Nf6 10. Bxe6 fxe6 11. Rd1
e5 12. Qc6+ Kf7 13. c3 Rb8 14. cxd4 exd4 15. Rxd4 Rb6
16. Qc2 Rf8 17. e5 Nd7 18. e6+ Kxe6 19. Qc4+ Kf6 20. Bh6
Re8 21. Rf4+ Ke5 22. Qe4# 1-0
[/pgn]

CuckooChess was running on a Sony Xperia with 15" per game and Leela in slow mode. This was before the server was moved to an older computer over the weekend, but after the change to the latest NN version.

Laskos · Post by **Laskos** » Sun Apr 01, 2018 11:37 am

I am not sure I understood, but it seems the hardware was very weak, and LC0 improves greatly with time and hardware. By move 11. Rd1, LC0 had the better position, although the earlier moves were not nice (but not obviously wrong). This morning I tested the latest network, ID69; compared to ID56, a network 2 days older, it performs significantly better in my positional opening suite.

[Search parameters: MaxDepth=99   MaxTime=20.0   DepthDelta=2   MinDepth=7   MinTime=0.1]

Engine                         : Correct  TotalPos  Corr%  AveT(s)  MaxT(s)  TestFile

Komodo 10.2 64-bit             :     145       200   72.5      2.0     20.0  openings200beta07.epd
Houdini 5.01 Pro x64           :     144       200   72.0      2.4     20.0  openings200beta07.epd
Stockfish 8 64 BMI2            :     141       200   70.5      2.0     20.0  openings200beta07.epd
Houdini 5.01 Pro x64 Tactical  :     139       200   69.5      2.3     20.0  openings200beta07.epd
Deep Shredder 13 x64           :     128       200   64.0      2.7     20.0  openings200beta07.epd
Houdini 4 Pro x64              :     126       200   63.0      1.8     20.0  openings200beta07.epd
Andscacs 0.88n                 :     123       200   61.5      2.4     20.0  openings200beta07.epd
Houdini 4 Pro x64 Tactical     :     120       200   60.0      1.6     20.0  openings200beta07.epd
Nirvanachess 2.3               :     119       200   59.5      1.8     20.0  openings200beta07.epd
Fire 5 x64                     :     110       200   55.0      3.0     20.0  openings200beta07.epd
Texel 1.06 64-bit              :     110       200   55.0      1.6     20.0  openings200beta07.epd
Fritz 15       (3227)          :     102       200   51.0      1.9     20.0  openings200beta07.epd

LCZero  *************  ID69    :      98       200   49.0      2.7     20.0  openings200beta07.epd

Fruit 2.1      (2685)          :      91       200   45.5      1.5     20.0  openings200beta07.epd

LCZero  *************  ID56    :      90       200   45.0      1.7     20.0  openings200beta07.epd

Sjaak II 1.3.1 (2194)          :      75       200   37.5      4.0     20.0  openings200beta07.epd
BikJump v2.01  (2098)          :      74       200   37.0      1.6     20.0  openings200beta07.epd

Maximum time was 20s/position.
LC0 seems already close to very strong engines in this opening suite. At this pace of advancement in positional understanding, I will be very curious how it develops.

sovaz1997 · Post by **sovaz1997** » Sun Apr 01, 2018 12:19 pm

Hi! Where do I put the weights for the lczero neural network? I can't run the UCI engine (output: "A network weights file is required to use the program"). Thanks!
