CEGT - rating lists December 06th 2020


Werner
Posts: 2871
Joined: Wed Mar 08, 2006 10:09 pm
Location: Germany
Full name: Werner Schüle

CEGT - rating lists December 06th 2020

Post by Werner »

Hi all,
our current rating lists are online and can be found at the links below!

40 / 20:
New games: 3,305; 34 different engines
Total: 1,410,211

NEW Engines
1182 GreKo 2020.03 x64: 2576 - 1000 games (+20 to v. 2020.01)
1179 Pro Deo 3.0: 2581 - 1103 games (+44 to v. 2.8)
117 RubiChess 1.9NN-1107 x64 1CPU: 3231 - 1000 games (+22 to v. 1.9dev; +77 to v. 1.8)
1207 Devel 3.8.4 w32: 2563 - 100 games (starting rating)
I made a short test with Orion 0.8NN; the result was around 3000. We are waiting for an own (original) net before adding it to our lists.

UPDATES
1 Stockfish 12.0 x64-82215 4CPU: 3580 - 1052 games (+1)

40 / 4
Last update was November 29th.
We are testing:
LCZero 0.26.3 703810 (latest 70.... net)
LCZero 0.26.3 CUDNN J92-330 = approx. Elo 3559 out of only 300 games
Seer 1.2.1 NNUE Perf=2943
Beef 0.3.6 Perf=2975
Cheng 4.40 dev Perf=2859
Koivisto 4.0 Perf=2928

25'+8''
Last update was November 18th, with 1,250 new games.

5'+3'' pb=on
Last update was November 6th. We are testing:
SlowChess Blitz Classic 2.4 3284 out of 1900 games (+44 to v2.2)
KomodoDragon 1.0 x64 1CPU 3499 out of 1900 games

3'+1'' pb=on
Last update was June 16th.
We started a large tournament with the new engines: https://cegt.forumieren.com/t1401-tourn ... sions#2749
Crosstable so far (each row shows that engine's score against opponents 01-10 in column order, with xxxxx marking itself, followed by the total):

01 Stockfish 12.0 x64 NNUE xxxxx 106.5 145.0 165.0 155.0 163.5 162.5 176.5 172.5 175.0 1421.5
02 Komodo Dragon 1.0 x64 93.5 xxxxx 140.5 147.5 148.5 150.0 153.0 160.0 164.5 172.0 1329.5
03 Komodo 14.1 x64 55.0 59.5 xxxxx 107.5 109.5 114.5 114.0 124.0 142.0 141.0 967.0
04 Houdini 6.0 x64 35.0 52.5 92.5 xxxxx 109.0 111.0 120.0 122.0 137.0 139.5 918.5
05 Nemorino 6.00 x64 NNUE 45.0 51.5 90.5 91.0 xxxxx 97.5 109.0 108.0 115.5 128.0 836.0
06 Ethereal 12.75 x64 36.5 50.0 85.5 89.0 102.5 xxxxx 107.5 103.5 128.5 127.0 830.0
07 SlowChess Blitz Classic 37.5 47.0 86.0 80.0 91.0 92.5 xxxxx 99.0 124.0 118.5 775.5
08 Komodo 14.1 x64 MCTS 23.5 40.0 76.0 78.0 92.0 96.5 101.0 xxxxx 119.5 123.0 749.5
09 Igel 2.8.0 x64 NNUE 27.5 35.5 58.0 63.0 84.5 71.5 76.0 80.5 xxxxx 105.5 602.0
10 RubiChess 1.9.0 x64 NNUE 25.0 28.0 59.0 60.5 72.0 73.0 81.5 77.0 94.5 xxxxx 570.5

A big "Thank you" to all testers, as usual!

Links

40/20: http://www.cegt.net/rating.htm
Blitz: http://www.cegt.net/blitz.htm
40/120: http://www.cegt.net/rating120.htm
25+8: http://www.cegt.net/rating25plus8.htm
3+1 pb=on: http://www.cegt.net/rating3plus1pbon.htm
5+3 pb=on: http://www.cegt.net/rating5plus3pbon.htm
Tester: http://www.cegt.net/testers/testers.htm
Games of the week: http://www.cegt.net/40_40%20Rating%20Li ... on/gow.jpg

Werner Schüle
CEGT-Team
David Carteau
Posts: 121
Joined: Sat May 24, 2014 9:09 am
Location: France
Full name: David Carteau

Re: CEGT - rating lists December 06th 2020

Post by David Carteau »

Werner wrote: Sun Dec 06, 2020 2:38 pm (...)
I made a short test with Orion 0.8NN; the result was around 3000. We are waiting for an own (original) net before adding it to our lists.
(...)
Hi CEGT team,

First of all, as usual, let me thank you for your time and the help you give to us engine authors by testing our engines. It is very useful to have a good idea of the progress (or lack of it!) being made.

I would like to add some details about Orion's new neural network (NN):

Orion v0.8 comes with its own neural network, built with a home-made trainer and inference code (see https://orionchess.pagesperso-orange.fr/).

The network architecture is based on the NNUE concept implemented in Stockfish, but it is simpler and smaller (the network uses ~5 million weights versus ~10 million for Stockfish's).
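For illustration only, here is a rough back-of-the-envelope parameter count. The Stockfish 12 HalfKP dimensions (41024 input features, a 256-wide accumulator per perspective, then 2x256 -> 32 -> 32 -> 1) are public; the smaller variant below is purely hypothetical (it simply halves the accumulator width) and is not meant to describe Orion's actual layout:

    # Rough weight counts for HalfKP-style NNUE networks.
    # The 41024/256 figures are the public Stockfish 12 dimensions;
    # the 41024/128 variant is hypothetical, shown only to illustrate
    # how a network of roughly 5 million weights could arise.
    def nnue_params(features, acc, hidden=(32, 32)):
        total = features * acc + acc        # feature transformer (weights + biases, shared by both perspectives)
        prev = 2 * acc                      # both perspectives concatenated
        for h in hidden:
            total += prev * h + h           # affine hidden layer
            prev = h
        return total + prev + 1             # output neuron

    print(nnue_params(41024, 256))          # ~10.5 million (Stockfish 12-like)
    print(nnue_params(41024, 128))          # ~5.3 million (hypothetical smaller net)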

I think what you mean by an "own" network is the fact that I used one of Sergio Vieri's best networks to train mine.

That was a deliberate choice at this stage, as my goal was to build an NN trainer and check whether it could produce a reasonably strong evaluation (at least better than my handcrafted one!).

As announced, the next step for me is to try to train networks without relying on external knowledge. At this stage, I see three alternatives:

1) Use Orion v0.7's handcrafted eval. That doesn't sound great: the engine is weak, and after a simple (and perhaps naive?) calculation, it appears that training a net on 360 million positions searched at depth 8 would require more than a year (assuming 1 s per search with 8 parallel processes, that is more than 520 days; a quick check of this number follows after the list)!

2) Use game results only. I just gave this a chance, and it failed badly (an experiment based on all available CCRL games)!

3) Learn by playing games (the "zero" approach). That's my current preference. Again, this would require a lot of computational power, but it looks like a great and exciting new challenge and should give positive results!
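As a quick sanity check of the "more than 520 days" figure in point 1 (assuming, as stated there, 1 second per depth-8 search and 8 searches running in parallel):

    # Rough time estimate for labelling 360 million positions at depth 8,
    # assuming 1 second per search and 8 parallel processes.
    positions = 360_000_000
    seconds_per_search = 1.0
    parallel_processes = 8

    total_seconds = positions * seconds_per_search / parallel_processes
    print(total_seconds / 86_400)   # ~520.8 days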

Coming back to the CEGT list, and as stated on my webpage, Stockfish's evaluation has been used since Orion v0.4 for the parameter tuning of my evaluation function. I'll let you decide whether all versions of Orion from v0.4 onwards should be removed from your lists!

Don't worry, I understand that my choice may be disapproved of by some people who prefer 100% originality (or who simply cannot test every engine derivative!). Let me just say that reusing other people's research work, trying to understand it, and then enhancing it can also, sometimes, lead to great innovations!

I am perfectly fine with not being ranked at all. Developing Orion remains a hobby for me, not a competition. My objective is to learn Artificial Intelligence techniques (see my previous experiments with genetics), and chess is a great way to achieve that!

Regards, David.
Werner
Posts: 2871
Joined: Wed Mar 08, 2006 10:09 pm
Location: Germany
Full name: Werner Schüle

Re: CEGT - rating lists December 06th 2020

Post by Werner »

Hello David,
Thanks for the answer.
We are still in an internal discussion, and in discussion with other engine authors. The situation is not easy.
At the very least, we do not want to test engines which simply use the Stockfish NNUE network instead of their own evaluation. But the problem is that we cannot really see what a programmer has done, and we already have other NNUE engines in the list with exactly this issue...
We will see what happens next Sunday.
Regards,
Werner