CEGT - 5´+3" and 3´+1" rating lists February 2022

Discussion of computer chess matches and engine tournaments.

Moderators: hgm, Dann Corbit, Harvey Williamson

Werner
Posts: 2862
Joined: Wed Mar 08, 2006 10:09 pm
Location: Germany
Full name: Werner Schüle

CEGT - 5´+3" and 3´+1" rating lists February 2022

Post by Werner »

Hi all,
our current rating lists are online and can be found at the links below!

5'+3'' pb=on
The last update was on February 19th with 7380 new games; the total is now 372780 games with 289 engines/versions.

NEW Engines
6 Ethereal 13.50NN x64 3484 +13 -13 1400 games (+ 20 to v. 13.25)
7 Berserk 8.5.1NN x64 3483 +13 -13 1400 games (+ 10 to v. 6.0)
82 Wasp 5.20NN x64 3225 +19 -19 900 games (+ 17 to v. 5.00)
86 Combusken 2.0.0NN x64 3217 +19 -19 900 games (-)
94 Rebel 14.1NN x64 3187 +19 -19 900 games (-)
96 Hiarcs 15 x64 3186 +19 -19 900 games (+ 316 to v. 14)
97 Velvet 3.2.0NN x64 3182 +18 -18 900 games (+ 120 to v. 3.1.0)
116 DanaSah 9.0NN (DN2) x64 3140 +19 -19 900 games (-)
126 Stash 32.0 x64 3116 +19 -19 900 games (-)

3'+1'' pb=on
The last update was on February 22nd:
9320 new games, total 247920 with 111 engines.

NEW Engines
1 Stockfish 14.1NN x64 3583 +16 -16 1540 games (-28 to v. 14.0)
2 Komodo Dragon 2.6 x64 3558 +16 -16 1540 games (+ 37 to v. 2.0)
3 Komodo Dragon 2.6 x64 (MCTS) 3501 +16 -16 1540 games (+ 57 to v. 2.0)
5 SlowChess Blitz 2.8NN x64 3419 +14 -14 1740 games (+ 27 to v. 2.7)
7 Revenge 2.0NN x64 3401 +14 -14 1740 games (+ 69 to v. 1.0)
4 Ethereal 13.50NN x64 3419 +14 -14 1800 games (+ 22 to v. 13.25)
6 Berserk 8.0NN x64 3409 +15 -15 1540 games (+ 176 to v. 4.5.1)
11 Seer 2.4.0NN x64 3309 +15 -15 1740 games (+ 52 to v. 2.3.1)

A big "Thank you" to all testers as usual!!

Links

40/20: http://www.cegt.net/rating.htm
Blitz: http://www.cegt.net/blitz.htm
40/120: http://www.cegt.net/rating120.htm
25+8: http://www.cegt.net/rating25plus8.htm
3+1 pb=on: http://www.cegt.net/rating3plus1pbon.htm
5+3 pb=on: http://www.cegt.net/rating5plus3pbon.htm
Tester: http://www.cegt.net/testers/testers.htm

Werner Schüle
CEGT-Team
Jouni
Posts: 3229
Joined: Wed Mar 08, 2006 8:15 pm

Re: CEGT - 5´+3" and 3´+1" rating lists February 2022

Post by Jouni »

1 Stockfish 14.1NN x64 3583 +16 -16 1540 games (-28 to v. 14.0) :!:
Jouni
Werner
Posts: 2862
Joined: Wed Mar 08, 2006 10:09 pm
Location: Germany
Full name: Werner Schüle

Re: CEGT - 5´+3" and 3´+1" rating lists February 2022

Post by Werner »

We noticed this result too.
Our 40/20 list shows no improvement either:
2 Stockfish 14.1NNUE x64 1CPU 3579 13 13 2463 63.1% 3481 71.9%
3 Stockfish 14.0NNUE x64 1CPU 3579 13 13 2323 64.4% 3471 69 %
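
As a rough cross-check of the two lines above, here is a minimal Python sketch that turns a score percentage and an average opponent rating into a performance rating with the standard logistic Elo formula. It assumes the columns are rating, +, -, games, score, average opponent rating and draw rate, which is not stated in the post, and it is not the method the list itself uses; the function names are made up for illustration.

```python
import math


def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


def performance_rating(avg_opponent: float, score: float) -> float:
    """Average opponent rating plus the Elo difference implied by the score."""
    return avg_opponent + elo_diff_from_score(score)


# Figures copied from the two 40/20 lines (column meaning is an assumption).
sf_14_1 = performance_rating(3481, 0.631)
sf_14_0 = performance_rating(3471, 0.644)
print(round(sf_14_1), round(sf_14_0))
```

Under these assumptions both versions land within about a point of each other, which fits the "no improvement" reading. The estimate treats all opponents as one opponent at the average rating, so it will not reproduce the listed ratings exactly.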
Werner
Wolfgang
Posts: 889
Joined: Sat May 13, 2006 1:08 am

Re: CEGT - 5´+3" and 3´+1" rating lists February 2022

Post by Wolfgang »

Jouni wrote: Tue Feb 22, 2022 8:22 pm
1 Stockfish 14.1NN x64 3583 +16 -16 1540 games (-28 to v. 14.0) :!:
I ran all of these recent tests myself, but I have no idea what happened here.
Stockfish still has to play vs. Nemorino and Igel (Houdini will be eliminated from the list) but these 400 games will probably not be a game changer.

At 40/4 we have +18 over SF 14.0 and at 5+3 ponder we have +3, both of which seem more realistic than -28 :oops:
Best
Wolfgang
CEGT-Team
www.cegt.net
www.cegt.forumieren.com
Jouni
Posts: 3229
Joined: Wed Mar 08, 2006 8:15 pm

Re: CEGT - 5´+3" and 3´+1" rating lists February 2022

Post by Jouni »

Everything is just inside the error bars. In reality there is NO progress, as fastgm confirms!

Stockfish 14.1 3497 9 3000 70.17 1219 1772 9 59.1 54
Stockfish 14 3497 9 3450 74.10 1668 1777 5 51.5 100
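
A small Python sketch of the "inside error bars" check, using the two fastgm lines above. It assumes the second number on each line is a symmetric error margin and that the two estimates are independent, so the margins can be combined in quadrature. Neither assumption is confirmed by the post, and engines on one list share opponents, so treat the result as a rough guide only. The helper names are hypothetical.

```python
import math


def difference_with_margin(rating_a, margin_a, rating_b, margin_b):
    """Return (rating difference, combined margin), assuming independent estimates."""
    diff = rating_a - rating_b
    combined = math.hypot(margin_a, margin_b)  # sqrt(margin_a**2 + margin_b**2)
    return diff, combined


def is_distinguishable(rating_a, margin_a, rating_b, margin_b):
    """True if the difference exceeds the combined margin."""
    diff, combined = difference_with_margin(rating_a, margin_a, rating_b, margin_b)
    return abs(diff) > combined


# fastgm figures quoted above: 3497 with an assumed margin of 9 for both versions.
diff, combined = difference_with_margin(3497, 9, 3497, 9)
print(diff, round(combined, 1), is_distinguishable(3497, 9, 3497, 9))
```

With identical ratings the difference is zero, so the two versions cannot be told apart at this number of games whatever the margins are.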

Does self-testing not improve the engine anymore?
Jouni
dkappe
Posts: 1620
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: CEGT - 5´+3" and 3´+1" rating lists February 2022

Post by dkappe »

1. You can only detect the difference with special unbalanced openings, like UHO as used now in fishtest and in the CCC and TCEC engine tournaments, where the book exits are evaluated at close to 100 centipawns.
2. The hypothesis is that UHO ratings correlate with "normal" or balanced opening ratings, though it looks like the rating gaps are twice as big, so +30 "standard" works out to +60 UHO.
3. You are much better off running many tens of thousands of ultrabullet and bullet games with uho rather than standard openings at what are termed “very long time control.” Clearly your results just need more games until they conform to fishtest expectations.
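
For a rough sense of how the number of games drives the error bar, here is a Python sketch based on a normal approximation for a match between two roughly equal engines. The draw rate, the 95% z-value and the function names are assumptions chosen for illustration; this is not how fishtest or CEGT compute their margins.

```python
import math

# Slope of the logistic Elo curve at a 50% score: Elo points per unit of score.
ELO_PER_SCORE_UNIT = 1600.0 / math.log(10)


def elo_margin(games: int, draw_rate: float, z: float = 1.96) -> float:
    """Approximate Elo error margin for a near-even match with the given draw rate."""
    # Per-game score variance when wins and losses are equally likely: (1 - d) / 4.
    standard_error = math.sqrt((1.0 - draw_rate) / 4.0 / games)
    return z * standard_error * ELO_PER_SCORE_UNIT


def games_needed(target_margin: float, draw_rate: float, z: float = 1.96) -> int:
    """Smallest number of games whose approximate margin is at most target_margin."""
    per_game = z * ELO_PER_SCORE_UNIT * math.sqrt((1.0 - draw_rate) / 4.0)
    return math.ceil((per_game / target_margin) ** 2)


# Hypothetical example: 70% draws, target margin of +/- 5 Elo.
print(games_needed(5.0, 0.70))
print(round(elo_margin(1540, 0.70), 1))
```

The margin shrinks with the square root of the number of games, so halving it takes about four times as many games.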
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
carldaman
Posts: 2277
Joined: Sat Jun 02, 2012 2:13 am

Re: CEGT - 5´+3" and 3´+1" rating lists February 2022

Post by carldaman »

If drawish (aka 'balanced') openings are fed into the testing, hardly any Elo improvement will be noticed, because SF-NNUE in its current form is too risk-averse.