The real elo of lczero is in its name


Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: First results

Post by Daniel Shawul »

CMCanavessi wrote:
Daniel Shawul wrote: It got better at 320+2, but it is still around -180 Elo.

                        scorpio-mcts-min : (+ 10 ,=  2 ,-  3)

 1.                        scorpio-mcts-min     91    183    183     15    73.3%   -90    13.3%
 2.                                  lczero    -90    183    183     15    26.7%    91    13.3%
I had to shut down my laptop, so this one has few games. Next one is 900+10 (15m+10) ...
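
The "around -180" figure can be sanity-checked from the raw score alone. This is only a minimal sketch, assuming the standard logistic Elo model; the tool that produced the table above may use a somewhat different model, so small differences are expected. The function name `elo_diff_from_score` is purely illustrative.

```python
import math

def elo_diff_from_score(wins: int, draws: int, losses: int) -> float:
    """Elo difference implied by a match score under the logistic Elo model."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games  # fraction of points scored
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))

# scorpio-mcts-min vs lczero: +10 =2 -3 over 15 games
diff = elo_diff_from_score(10, 2, 3)
print(f"implied Elo difference: {diff:+.0f}")
```

This gives roughly +176 for Scorpio, in the same range as the gap between the 91 and -90 values in the table (which appear to be the rating estimates). With only 15 games, the uncertainty is far larger than that small discrepancy.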


Is that the config I tested? How can we get such different results? What hardware config are you testing Leela on? How many threads for Scorpio?
Both engines are using 2 threads on an i7 laptop processor. L0 gets about 200 playouts/sec. scorpio-mcts-min could actually be 2700 Elo on TCEC hardware too, though one would have to test that. If that is the case, then all this NN work is a waste when you compare fairly.

Your setup is very unfair to the alpha-beta engines because L0 was getting 1 kN/s on your GPU. You should use 1 CPU core for it too if you want to be fair. In fact, AlphaZero should also have used the 64 cores used for Stockfish; the 4-TPU setup hugely benefits A0.
If Leela has to run on the GPU (though there is no reason for it, as exemplified by the TCEC results), you can measure how many nps it gets on 1 CPU core and calculate the number of cores to use for the alpha-beta engines from that. Say L0 gets about 100 n/s on one CPU core; then you have to use at least 1000/100 = 10 cores for the alpha-beta engines if Leela uses the GPU. This still gives some edge to L0 because alpha-beta engines do not scale that well, but it is a fairer setup than what you have. This is the reason you have been getting inflated Elos for L0.
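
The core-matching arithmetic can be written down in a few lines. This is just a sketch using the illustrative numbers from this post (1 kN/s on the GPU, 100 n/s on one CPU core); `equivalent_cores` is a made-up helper name, and it deliberately ignores the imperfect multi-core scaling of alpha-beta engines.

```python
import math

def equivalent_cores(gpu_nps: float, cpu_core_nps: float) -> int:
    """Cores to give the alpha-beta engine when the NN engine runs on a GPU.

    Uses the ratio of the NN engine's GPU speed to its single-core CPU
    speed, rounded up to a whole core.
    """
    if gpu_nps <= 0 or cpu_core_nps <= 0:
        raise ValueError("speeds must be positive")
    return max(1, math.ceil(gpu_nps / cpu_core_nps))

# L0 at 1000 n/s on the GPU vs. an assumed 100 n/s on one CPU core
print(equivalent_cores(1000, 100))  # 10
```

The rounding up is a choice made here so that the alpha-beta side is never given less than the measured ratio; the real single-core speed has to be measured on the machine in question.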

To say that L0 is not designed for the CPU is wrong; see my other post about the similar situation where speciality hardware is used to accelerate an evaluation at the same cost. What plays in L0's favour is the performance per dollar (same-price GPU and CPU) or per watt; GPUs are getting cheaper and are preferred for HPC for this reason.
CMCanavessi
Posts: 1142
Joined: Thu Dec 28, 2017 4:06 pm
Location: Argentina

Re: First results

Post by CMCanavessi »

Why don't you re-write Scorpio to use the GPU then? Oh, maybe it's because it's not optimal and runs better on CPUs... Well, Leela was _DESIGNED_ to be run on GPUs; it only runs on CPUs so that more people can contribute to the training process. Running it on CPU is not intended for playing games.

You're focusing only on the MCTS search, but you're "forgetting" that underneath it Leela uses a neural network; that's where the GPU helps, not in the search.

In my opinion, it is you who is setting up an unfair match by running Leela on CPU (which it was NOT designed for, even if you say it was), but we'll see what happens in the future. The project is still in diapers; it's super young and quite buggy/non-optimized.
Follow my tournament and some Leela gauntlets live at http://twitch.tv/ccls
CMCanavessi
Posts: 1142
Joined: Thu Dec 28, 2017 4:06 pm
Location: Argentina

Re: First results

Post by CMCanavessi »

This is what the promotion bug did to Leela, in a visual way (awesome graph by Uriopass btw):

Image

It's very easy to see how promotions to queen suddenly dropped for the black pieces.
mehmet karaman
Posts: 142
Joined: Tue Jan 28, 2014 8:37 am
Location: TURKEY

Re: First results

Post by mehmet karaman »

Neural Network AI Leela Zero Destroys an IM

https://www.youtube.com/watch?v=e-w06IVnht0

Lasse Ostebo Lovik's current rating is 2379

LCZero beat him with a 10-1 score (2779 Elo)
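
For reference, the 2779 figure is consistent with a performance rating under the usual logistic Elo model, assuming the 10-1 score means ten wins and one loss with no draws (the post does not say):

```latex
\[
R_{\text{perf}} = R_{\text{opp}} + 400 \log_{10}\!\frac{s}{1-s},
\qquad s = \frac{10}{11}
\;\Rightarrow\;
R_{\text{perf}} = 2379 + 400 \log_{10} 10 = 2779 .
\]
```

An 11-game sample is far too small to pin a rating down, so this is a rough estimate only.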