SPCC: Testrun of Lc0 66680 finished

pohl4711
Posts: 2923
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

SPCC: Testrun of Lc0 66680 finished

Post by pohl4711 »

NN-testrun of Lc0 0.26.3 66680 finished - impressive new highscore!

https://www.sp-cc.de

(You may need to clear your browser cache or reload the website.)
Pedro
Posts: 30
Joined: Mon Oct 26, 2020 3:05 pm
Full name: Pedro

Re: SPCC: Testrun of Lc0 66680 finished

Post by Pedro »

Great evolution of Leela! Based on its new rating, it is on par with the latest Stockfish dev.

Re: SPCC: Testrun of Lc0 66680 finished

Post by pohl4711 »

Pedro wrote: Sun Dec 20, 2020 5:49 pm
Great evolution of Leela! Based on its new rating, it is on par with the latest Stockfish dev.

No. As I wrote on my website about the Lc0-testruns:
"500 NBSC-Advanced-Armageddon games per testrun (a win for Black is 2 points for Black, and a draw is a 1-point win for Black) vs. Stockfish 200418 (SPCC-Elo: 3568, Contempt set to 0; around +14 Elo stronger than Stockfish 11, SPCC-Elo: 3554).
The error bar of each result is +/- 20 Elo. But note that using my NBSC-Armageddon openings spreads the Elo results around 2.25x wider than using classical openings for testing(!)."

This wider Elo spread is important for testing the Lc0 nets, because
a) the progress of the neural nets is often very small, and
b) only 500 games are played in each testrun, because only one Lc0 instance can run on one machine at a time. With so few games, spreading the results gives a more stable ranking.

But the wider Elo spread means that the Elo distance between Stockfish 200418 and Lc0 0.26.3 66680 would not be 156 Elo (the figure you see in the Lc0 NN-ratinglist: 3724-3568=156), but only about 156/2.25=69 Elo when using classical openings for testing and no Armageddon rescoring. So, for classical engine testing (my SPCC ratinglist on my main site, for example), Lc0 0.26.3 66680 should have about 3568+69=3637 Elo.
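
For anyone who wants to redo this conversion for another net, here is a minimal Python sketch of the arithmetic above. The function name and its parameters are made up for illustration; the only inputs taken from this post are the two ratings (3724 and 3568) and the approximate 2.25x spread factor, and treating that factor as a simple linear divisor is the same simplification used in the calculation above.

```python
# Hypothetical helper: rescale an Elo gap measured with NBSC-Armageddon
# openings to an approximate gap under classical openings.
# Assumption: the ~2.25x spread acts as a plain linear factor.

def classical_elo_estimate(armageddon_elo, reference_elo, spread_factor=2.25):
    """Return the estimated classical-openings Elo of the tested engine."""
    gap = armageddon_elo - reference_elo      # gap seen in the NN-ratinglist
    compressed_gap = gap / spread_factor      # gap expected with classical openings
    return reference_elo + compressed_gap

stockfish_200418 = 3568   # SPCC-Elo of the reference engine
lc0_66680 = 3724          # Lc0 0.26.3 66680 in the Lc0 NN-ratinglist

gap = lc0_66680 - stockfish_200418
estimate = classical_elo_estimate(lc0_66680, stockfish_200418)

print(gap)               # 156
print(round(gap / 2.25)) # 69
print(round(estimate))   # 3637
```

The result is only as precise as the 2.25 factor itself, which is given above as an approximate value.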