LCZero: Progress and Scaling. Relation to CCRL Elo

CMCanavessi
Posts: 1142
Joined: Thu Dec 28, 2017 4:06 pm
Location: Argentina

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by CMCanavessi »

Laskos wrote:Taking into account a 3-4% slowdown of the net, I guess that the improvement since ID227 is in the range of 25 Elo points, maybe even a bit more. Have yet to test at fixed time control.
In my gauntlet, the strongest 128-sized network that I tried was 195, which scored 108.5/200 for an Elo of 2667.3.
The last 2 networks I tested (already 195 in size) are 232 and 236, which scored 117.5/200 and 117/200 respectively, for Elos of 2700.8 and 2698.9.

That's already ~35 Elo more than 195 (with the same TC and the same hardware) and ~45 more than 227, and it looks like 239 is somewhat stronger, so I would guess it's a bit more than 25, more like 60 maybe? I'll test 240 later tonight (or 241, depending on external factors such as life :P) and we'll see where we stand right now.
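
The gauntlet scores quoted above can be cross-checked with the usual logistic Elo formula. This is only a minimal sketch: it assumes every network faced the same opponents under the same conditions, and the function name is made up for illustration. The rating tool actually used for the gauntlet is not stated, so the numbers need not match exactly.

```python
import math


def elo_diff_from_score(points: float, games: int) -> float:
    """Elo difference implied by a match score under the logistic model."""
    score = points / games
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


# Scores out of 200 games, as quoted in the post.
for net_id, points in [(195, 108.5), (232, 117.5), (236, 117.0)]:
    print(net_id, round(elo_diff_from_score(points, 200), 1))
```

Under those assumptions the gap between 232 and 195 comes out at a little over 30 Elo, which is in line with the ratings quoted.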

Follow my tournament and some Leela gauntlets live at http://twitch.tv/ccls
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by jp »

Laskos wrote: Reproducibility at least is not a problem here. That Leela in A0 conditions is 3300 and A0 is 3600 CCRL Elo level is not a problem to me.
Reproducibility by definition is not a problem for Leela.

It is a problem for A0, but Leela's existence solves the problem.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Laskos »

jp wrote:
Laskos wrote: Reproducibility at least is not a problem here. That Leela in A0 conditions is 3300 and A0 is 3600 CCRL Elo level is not a problem to me.
Reproducibility by definition is not a problem for Leela.

It is a problem for A0, but Leela's existence solves the problem.
I meant reproducing A0 results. They are reproduced, by and large. Now those doubting A0 results and even the authors' integrity should take a rest.
mhull
Posts: 13447
Joined: Wed Mar 08, 2006 9:02 pm
Location: Dallas, Texas
Full name: Matthew Hull

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by mhull »

Laskos wrote:
jp wrote:
Laskos wrote: Reproducibility at least is not a problem here. That Leela in A0 conditions is 3300 and A0 is 3600 CCRL Elo level is not a problem to me.
Reproducibility by definition is not a problem for Leela.

It is a problem for A0, but Leela's existence solves the problem.
I meant reproducing A0 results. They are reproduced, by and large. Now those doubting A0 results and even the authors' integrity should take a rest.
+1
Matthew Hull
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by jp »

No, Kai, I think we have to wait and see. When Leela is big enough and trained long enough, then we'll see how strong Leela is and be happy with whatever that is.
We can't accurately project forward from current Leela performance.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Laskos »

jp wrote:No, Kai, I think we have to wait and see. When Leela is big enough and trained long enough, then we'll see how strong Leela is and be happy with whatever that is.
We can't accurately project forward from current Leela performance.
Then wait and see. Some folks enjoy waiting per se.
Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Dann Corbit »

Laskos wrote:
jp wrote:No, Kai, I think we have to wait and see. When Leela is big enough and trained long enough, then we'll see how strong Leela is and be happy with whatever that is.
We can't accurately project forward from current Leela performance.
Then wait and see. Some folks enjoy waiting per se.
Not me,

In the words of Inigo Montoya, "I hate waiting."
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
mirek
Posts: 52
Joined: Sat Mar 24, 2018 4:18 pm

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by mirek »

Werewolf wrote:
Daniel Mehrmann wrote:I think 2950 CCRL Elo is way too high.

I tested Leela id 234 versus Naum 4.6, Deep Shredder 11 and ProDeo. According to my results at 10+10/game, she has a rating of around 2850 CCRL Elo.

Regards
Daniel
Depends on your graphics card
While he mentions the time control (which is necessary), the result also depends on his CPU speed and GPU speed (as you pointed out), and on which CCRL list we are talking about. I wonder why typically no one bothers to mention those things.

As far as I can tell the only reasonable way of comparing lc0 strength to CCRL is the following:

1.) select CCRL list - for example 40/4

2.) select CPU engine against which you will be comparing lc0. This engine needs to play at conditions equivalent to CCRL 40/4, so you will adjust the time control for the CPU engine depending on your CPU speed relative to the CCRL reference CPU, which is the Athlon 64 X2 4600+ (2.4 GHz)

3.) Select time control for lc0 - for example 40 moves in 2 minutes (it should be independent of the CPU engine time control calibrated in step 2.) It's at this time control that you will be comparing lc0's strength relative to CCRL 40/4.

4.) Run the match: lc0 at timecontrol decided in step 3. vs CPU engine at timecontrol calibrated in step 2.

5.) Calculate lc0's performance from the match given the known Elo of her opposition. Then you can say:
LC0's estimated CCRL 40/4 Elo at the time control decided in step 3. and on my GPU = calculated lc0's performance.

So for example:
"LC0 at 40 moves per 2 minutes on GTX 1060 has estimated CCRL 40/4 elo = 2900."

Without the time control at which LC0 was playing, without proper time control calibration for the CPU engine, or without the GPU speed, the number is quite meaningless, I am afraid.
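
Steps 2 and 5 above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are made up, the CPU speedup factor is something you would have to benchmark yourself, and the performance rating uses the common approximation of average opponent rating plus the logistic Elo offset of the score, not whatever tool a rating list actually runs.

```python
import math

CCRL_REFERENCE_PERIOD_S = 4 * 60  # CCRL 40/4: 40 moves in 4 minutes


def calibrated_period_seconds(cpu_speedup: float,
                              reference_period_s: float = CCRL_REFERENCE_PERIOD_S) -> float:
    """Step 2: scale the CPU engine's time per 40 moves by the local CPU speedup.

    cpu_speedup is how many times faster the local CPU runs the engine than
    the CCRL reference CPU (an assumed, self-measured value).
    """
    if cpu_speedup <= 0:
        raise ValueError("cpu_speedup must be positive")
    return reference_period_s / cpu_speedup


def performance_rating(opponent_ratings: list[float], points: float) -> float:
    """Step 5: average opponent rating plus the logistic Elo offset of the score.

    opponent_ratings holds one entry per game played.
    """
    games = len(opponent_ratings)
    score = points / games
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    average = sum(opponent_ratings) / games
    return average + 400.0 * math.log10(score / (1.0 - score))


# Hypothetical numbers, for illustration only.
cpu_period = calibrated_period_seconds(cpu_speedup=2.0)  # CPU twice as fast as the reference
opponents = [2800.0] * 100                               # 100 games vs one 2800-rated engine
estimate = performance_rating(opponents, points=55.0)
print(f"CPU engine plays 40 moves in {cpu_period:.0f} s")
print(f"Estimated CCRL 40/4 Elo for lc0: {estimate:.0f}")
```

The lc0 time control from step 3 does not appear in the code on purpose: it is chosen independently and simply reported alongside the result, together with the GPU.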

PS: you could make LC0 and the CPU engine run at the same time control.
Then you would skip step 2 and go directly to step 3. So you select the time control for LC0, and then you have to estimate the CPU engine's CCRL 40/4 rating based on how it scales at that time control on your CPU. This is less precise: different engines may scale differently, especially if the time control is many times longer or shorter than the calibrated time control for the CPU engine on your CPU.

In other words, the goal should be to determine how LC0 compares to CCRL 40/4 at various time controls on a given GPU (and independent of CPU speed). So if you know that, for example, at 40/2 on a GTX 1060 it's roughly 2900, you can then also say that on a GTX 1080 it will be roughly 2900 at 40/1, etc.

Also, for a given time control and a system with a given CPU and GPU, you can then estimate how LC0 would compare (on that system) to some CPU engine of known CCRL 40/4 strength.
mirek
Posts: 52
Joined: Sat Mar 24, 2018 4:18 pm

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by mirek »

mirek wrote: So for example:
"LC0 at 40 moves per 2 minutes on GTX 1060 has estimated CCRL 40/4 elo = 2900."
I forgot to mention: if you assume a GTX 1060 gives, e.g., 1,000 nps and the average thinking time at that time control is 3 s per move, you have 3,000 nodes searched per move on average.

So instead of time control and GPU you have just a single metric, e.g.: at 120,000 nodes per 40 moves, LC0 ~= 2900 CCRL 40/4

While more convenient, it may also be tricky because of how it reflects differences in time management. (Better time management at the same average nodes per 40 moves can make a difference of several tens of rating points either way.) This would, however, work OK for a fixed time per move.
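
The arithmetic behind the node-budget metric is short enough to write down. A minimal sketch, using the example figures from the post (1,000 nps, 40 moves in 2 minutes); the 2,000 nps figure for a faster GPU is a made-up value for illustration, not a measurement.

```python
def nodes_per_period(nps: float, period_seconds: float) -> float:
    """Total nodes searched over one time-control period (e.g. 40 moves)."""
    return nps * period_seconds


def equivalent_period_seconds(node_budget: float, nps: float) -> float:
    """Time a GPU running at `nps` needs to search the same node budget."""
    return node_budget / nps


budget = nodes_per_period(nps=1000, period_seconds=120)   # 40 moves in 2 minutes
per_move = budget / 40                                    # average nodes per move
faster_gpu_period = equivalent_period_seconds(budget, nps=2000)  # hypothetical faster GPU

print(budget, per_move, faster_gpu_period)
```

A GPU that is twice as fast reaches the same node budget in half the time, which is the reasoning behind equating 40/2 on one card with 40/1 on a faster one.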
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Laskos »

mirek wrote:
Werewolf wrote:
Daniel Mehrmann wrote:I think 2950 CCRL Elo is way too high.

I tested Leela id 234 versus Naum 4.6, Deep Shredder 11 and ProDeo. According to my results at 10+10/game, she has a rating of around 2850 CCRL Elo.

Regards
Daniel
Depends on your graphics card
While he mentions the time control (which is necessary), the result also depends on his CPU speed and GPU speed (as you pointed out), and on which CCRL list we are talking about. I wonder why typically no one bothers to mention those things.

As far as I can tell the only reasonable way of comparing lc0 strength to CCRL is the following:

1.) select CCRL list - for example 40/4

2.) select CPU engine against which you will be comparing lc0. This engine needs to play at conditions equivalent to CCRL 40/4, so you will adjust the time control for the CPU engine depending on your CPU speed relative to the CCRL reference CPU, which is the Athlon 64 X2 4600+ (2.4 GHz)

3.) Select time control for lc0 - for example 40 moves in 2 minutes (it should be independent of the CPU engine time control calibrated in step 2.) It's at this time control that you will be comparing lc0's strength relative to CCRL 40/4.

4.) Run the match: lc0 at timecontrol decided in step 3. vs CPU engine at timecontrol calibrated in step 2.

5.) Calculate lc0's performance from the match given the known Elo of her opposition. Then you can say:
LC0's estimated CCRL 40/4 Elo at the time control decided in step 3. and on my GPU = calculated lc0's performance.

So for example:
"LC0 at 40 moves per 2 minutes on GTX 1060 has estimated CCRL 40/4 elo = 2900."

Without the time control at which LC0 was playing, without proper time control calibration for the CPU engine, or without the GPU speed, the number is quite meaningless, I am afraid.

PS: you could make LC0 and the CPU engine run at the same time control.
Then you would skip step 2 and go directly to step 3. So you select the time control for LC0, and then you have to estimate the CPU engine's CCRL 40/4 rating based on how it scales at that time control on your CPU. This is less precise: different engines may scale differently, especially if the time control is many times longer or shorter than the calibrated time control for the CPU engine on your CPU.

In other words, the goal should be to determine how LC0 compares to CCRL 40/4 at various time controls on a given GPU (and independent of CPU speed). So if you know that, for example, at 40/2 on a GTX 1060 it's roughly 2900, you can then also say that on a GTX 1080 it will be roughly 2900 at 40/1, etc.

Also, for a given time control and a system with a given CPU and GPU, you can then estimate how LC0 would compare (on that system) to some CPU engine of known CCRL 40/4 strength.
Why are you complicating things so much? They are complicated indeed, given the different scaling and hardware used, but let's simplify; people don't spend much time reading tedious posts.

I repeatedly stated that my list is CCRL 40/4'. I also stated that the LC0 rating on 40/4' will be different from that on 40/40'. Time control is equal for both engines, LC0 and the standard engine (isn't that the practice of these rating lists?). I roughly equated an i5 or i7 core with an AMD64 core, and you can nitpick on that (there is a factor of 2 maybe, but often we are talking of factors of 10-100). The dependence of GPU speed on the CPU, for a typical i5 or i7, is no more than 20%, which is negligible.

All in all, the current LC0 will probably be in the 2850-3000 range on the CCRL 40/4' Elo rating list on a GTX 1060 GPU with an i5, i7 or i9 CPU (of any kind), in games played in CCRL 40/4' conditions. A bit easier now? If you still don't believe me, play games in those conditions and report back. 200-400 games, at least.
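
To see why a few hundred games is a sensible minimum, one can estimate the error bars on a match result. The sketch below uses a normal approximation to the mean score per game and converts the interval to Elo with the logistic formula; the function names and the win/draw/loss counts are hypothetical, chosen only to illustrate the calculation.

```python
import math


def elo_from_score(score: float) -> float:
    """Logistic Elo difference for an expected score strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_interval(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (low, estimate, high) Elo difference for a match result.

    Uses a normal approximation to the mean score per game; z = 1.96
    corresponds to a roughly 95% interval.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / games
    margin = z * math.sqrt(variance / games)
    # Clamp so the logarithm stays defined for lopsided results.
    low = min(max(score - margin, 1e-6), 1.0 - 1e-6)
    high = min(max(score + margin, 1e-6), 1.0 - 1e-6)
    return elo_from_score(low), elo_from_score(score), elo_from_score(high)


# Hypothetical results: the same score percentage over 200 and 400 games.
for w, d, l in [(80, 70, 50), (160, 140, 100)]:
    low, mid, high = elo_interval(w, d, l)
    print(f"{w + d + l} games: {mid:+.0f} Elo, interval {low:+.0f} to {high:+.0f}")
```

With these made-up results the 200-game interval is on the order of 40 Elo either side of the estimate, and doubling the number of games narrows it by a factor of about the square root of 2.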