SPCC: Testrun of Ethereal 13 nnue finished

Modern Times
Posts: 3831
Joined: Thu Jun 07, 2012 11:02 pm

Re: SPCC: Testrun of Ethereal 13 nnue finished

Post by Modern Times »

AndrewGrant wrote: Wed Jun 16, 2021 9:41 pm Awesome results! #4 if you filter multiple Stockfishes and Komodos, #3 if you discount Fire (Stockfish) with a Stockfish Network.
Excited to see the victory over Komodo 14. Beating the old-guard of Alpha-Beta was the goal for the release. +75 :)

Thanks for taking the time to test things, and for using the AVX2 version.
The single-CPU 40/15 pure list has you at #4 as well, but given the error margins, #4 through #7 are too close to call. At longer time controls it is difficult to play the volume of games needed to bring those margins down. Advantage blitz in that respect.

http://ccrl.chessdom.com/ccrl/4040/rati ... e_cpu.html
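
To put a rough number on "the volume of games needed", here is a minimal sketch using the usual normal approximation. It is not CCRL's actual method; the function name `games_needed`, the assumed draw ratio, and the assumption of two roughly equal engines are all illustrative.

```python
import math

def games_needed(target_margin_elo, draw_ratio=0.5, z=1.96):
    """Estimate games required for a given 95% Elo error margin.

    Assumptions (not from the thread): the two engines are about equal
    in strength (expected score 0.5) and draw_ratio is a guess.
    """
    # Per-game standard deviation of the score when wins and losses are balanced.
    sd_per_game = 0.5 * math.sqrt(1.0 - draw_ratio)
    # Slope of the Elo curve at a score of 0.5: d(Elo)/d(score).
    elo_per_score = 400.0 / (math.log(10) * 0.25)
    # margin = z * sd / sqrt(N) * slope, solved for N.
    n = (z * sd_per_game * elo_per_score / target_margin_elo) ** 2
    return math.ceil(n)

for margin in (20, 10, 5):
    print(margin, games_needed(margin))
```

Halving the margin takes about four times as many games, which is why long time controls struggle to separate closely ranked engines.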
mehmet123
Posts: 697
Joined: Sun Jan 26, 2020 10:38 pm
Location: Turkey
Full name: Mehmet Karaman

Re: SPCC: Testrun of Ethereal 13 nnue finished

Post by mehmet123 »

kranium wrote: Thu Jun 17, 2021 12:02 am
mehmet123 wrote: Wed Jun 16, 2021 1:09 pm Great progress for a chess engine that doesn't use a Stockfish NNUE net.
Mehmet, have you seen these?
Sorry to disappoint but they're rather pathetic

CCRL Blitz
Ethereal 13.00 64-bit 8CPU 3570
Ethereal 12.75 64-bit 8CPU 3541

CCRL 40/15
Ethereal 13.00 64-bit 4CPU 3410
Ethereal 12.75 64-bit 4CPU 3392

but only $40.00, what a deal
it's easy to understand now why he's so busy 'spinning' things...

"These Networks are the second of their kind, boasting themselves as the only other high level NNUEs not derived from, trained on, nor duplicated from the works of the Stockfish team."
:wink:
For me, the improvement in the SPCC rating list is a very good result. According to the CCRL rating list, the difference between Stockfish 13 and Stockfish 11 (without NNUE) is only 69 Elo. It is clear that, because CCRL uses Bayeselo, these Elo differences between top engines come out much smaller than the actual values.
If we are going to comment only on the Elo score, there are much stronger unofficial Ethereal versions that I have made by adapting some Stockfish NNUE nets to Ethereal.
http://talkchess.com/forum3/viewtopic.p ... &start=190
Modern Times
Posts: 3831
Joined: Thu Jun 07, 2012 11:02 pm

Re: SPCC: Testrun of Ethereal 13 nnue finished

Post by Modern Times »

mehmet123 wrote: Thu Jun 17, 2021 5:00 am For me, the improvement in the SPCC rating list is a very good result. According to the CCRL rating list, the difference between Stockfish 13 and Stockfish 11 (without NNUE) is only 69 Elo. It is clear that, because CCRL uses Bayeselo, these Elo differences between top engines come out much smaller than the actual values.
"Actual values"? I'd say the opposite: I prefer Bayeselo over Ordo and the rating inflation that results from the latter. Neither is wrong; they are just different methods.
pohl4711
Posts: 2924
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: SPCC: Testrun of Ethereal 13 nnue finished

Post by pohl4711 »

AndrewGrant wrote: Wed Jun 16, 2021 9:41 pm Awesome results! #4 if you filter multiple Stockfishes and Komodos, #3 if you discount Fire (Stockfish) with a Stockfish Network.
Excited to see the victory over Komodo 14. Beating the old-guard of Alpha-Beta was the goal for the release. +75 :)

Thanks for taking the time to test things, and for using the AVX2 version.
Thanks for the free version for testing.
I think +75 Elo is a valid result, considering that your self-play progression test was:

ELO | 121.51 +- 5.35 (95%)
CONF | 10.0+0.1s Threads=1 Hash=8MB
Games | N: 10436 W: 4966 L: 1458 D: 4012

So, in my test runs, which use a longer thinking time (3'+1'') but slower hardware (a notebook, running the engines on 20 hyperthreading threads of a 12-core machine), I would have expected an Elo gain of around +70 to +80. And that's exactly what we see.
The +45 or so Elo gain seen by other testers seems pretty strange to me.
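
The self-play figures quoted above can be sanity-checked from the W/L/D counts alone. The following is a minimal sketch using the logistic Elo formula and a normal approximation for the 95% interval; the function names are illustrative, and the testing framework that produced the quoted line may compute its margin slightly differently.

```python
import math

def elo_from_score(score):
    """Convert an expected score (0..1, exclusive) to an Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, losses, draws, z=1.96):
    """Return (elo, margin) from match counts.

    The margin is half the width of the interval obtained by shifting
    the score by z standard errors (an assumption of this sketch).
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result around the mean score.
    variance = (wins * (1.0 - score) ** 2
                + losses * (0.0 - score) ** 2
                + draws * (0.5 - score) ** 2) / n
    std_err = math.sqrt(variance / n)
    low = elo_from_score(score - z * std_err)
    high = elo_from_score(score + z * std_err)
    return elo_from_score(score), (high - low) / 2.0

# Counts from the quoted self-play test: W 4966, L 1458, D 4012.
elo, margin = elo_with_margin(4966, 1458, 4012)
print(f"ELO | {elo:.2f} +- {margin:.2f} (95%)")
```

With these counts the result should land close to the quoted 121.51 +- 5.35.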