The Speedy Rating List

Discussion of computer chess matches and engine tournaments.

Moderators: hgm, Rebel, chrisw

Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: The Speedy Rating List

Post by Rebel »

cucumber wrote: Thu May 28, 2020 8:24 pm Wow, this is really cool. Do you think you could talk about the methodology behind Lc1.epd?
Chris analyzed the Lcx positions with Lc0 on his RTX 2080 on long time control.
Also, would you be able to test Fire 7.1 again? I'm surprised to see it so low. It seems like a major outlier compared to what rating lists show.
There is something strange with Fire: I never get the results the rating lists show, and not only with this tool. I have Fire running now for the 4000ms list. Maybe it's just a scaling issue.
90% of coding is debugging, the other 10% is writing bugs.
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: The Speedy Rating List

Post by xr_a_y »

Rebel wrote: Thu May 28, 2020 8:24 pm
xr_a_y wrote: Thu May 28, 2020 4:46 pm
Rebel wrote: Thu May 28, 2020 4:34 pm
xr_a_y wrote: Thu May 28, 2020 4:21 pm Thanks for considering Minic in here !

Can you please give a shot to Minic 2.32 (unofficial release that can be found here https://github.com/tryingsomestuff/Mini ... ter/Minic2)

I suspect it might be stronger on this type of test ...
Oki doki, but which executable to pick?

   minic_2.32_mingw_x64_nehalem.exe  minic 2.32   May 28, 2020  
   minic_2.32_mingw_x64_skylake.exe  minic 2.32   May 28, 2020  
   minic_2.32_mingw_x64_x86-64.exe  minic 2.32   May 28, 2020  
In this order, best to worst, depending on your hardware :
skylake == avx2/bmi2
nehalem == sse4.2
x86-64 == just popcnt
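
The ordering above can be sketched as a small lookup. This is only an illustration: the flag spellings (`avx2`, `bmi2`, `sse4_2`, `popcnt`) are an assumption borrowed from how Linux reports CPU features, and `pick_build` is a hypothetical helper, not part of Minic.

```python
# Builds listed best to worst, each paired with the CPU features the post
# associates with it. The flag spellings are an assumption (Linux-style names).
BUILDS = [
    ("minic_2.32_mingw_x64_skylake.exe", {"avx2", "bmi2"}),
    ("minic_2.32_mingw_x64_nehalem.exe", {"sse4_2"}),
    ("minic_2.32_mingw_x64_x86-64.exe", {"popcnt"}),
]


def pick_build(cpu_flags):
    """Return the first (best) build whose required features are all present."""
    flags = {flag.lower() for flag in cpu_flags}
    for name, required in BUILDS:
        if required <= flags:  # subset test: every required flag is available
            return name
    return None  # none of the listed builds matches this CPU


# Example: a CPU with SSE4.2 and POPCNT but without AVX2/BMI2.
print(pick_build({"sse4_2", "popcnt"}))
```

On a machine that reports both `avx2` and `bmi2`, the same call would return the skylake build instead.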

                                                               Max            Time   Hash          
    Engine           Points  Used Time   Found   Pos    Elo   Score   Score    ms     Mb  Cpu  Errors
29  Minic 2.32       289394  11:01:03.5  19880  40000  2894  400000  72.35%   1000   128    1     0
32  Minic 2.25       287730  10:54:48.6  19780  40000  2877  400000  71.93%   1000   128    1     0
Used the sse4.2 executable for 2.32, not sure which one I used for 2.25, likely the sse4.2 also.
Thanks a lot !
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: The Speedy Rating List

Post by Rebel »

Added 10 new engines.

Winter 0.8
ProDeo 2.2
Benjamin
Devel 3.0.0b
FoxSee 3.3.3
Topple 0.7.5
GreKo 2020.03
CT800 1.40
Combusken 1.2
Minic 2.32

http://rebel13.nl/download/speedy-rating-list.html
cucumber
Posts: 144
Joined: Sun Oct 14, 2018 8:21 pm
Full name: JSmith

Re: The Speedy Rating List

Post by cucumber »

Rebel wrote: Thu May 28, 2020 8:46 pm
cucumber wrote: Thu May 28, 2020 8:24 pm Wow, this is really cool. Do you think you could talk about the methodology behind Lc1.epd?
Chris analyzed the Lcx positions with Lc0 on his RTX 2080 on long time control.
Also, would you be able to test Fire 7.1 again? I'm surprised to see it so low. It seems like a major outlier compared to what rating lists show.
There is something strange with Fire: I never get the results the rating lists show, and not only with this tool. I have Fire running now for the 4000ms list. Maybe it's just a scaling issue.
That's really interesting! I wonder if Norman knows what might be causing that. Thanks for such a quick reply.
jorose
Posts: 358
Joined: Thu Jan 22, 2015 3:21 pm
Location: Zurich, Switzerland
Full name: Jonathan Rosenthal

Re: The Speedy Rating List

Post by jorose »

This is an interesting benchmark to me. For the most part this seems reasonably accurate.

I find it interesting to compare your lists with 1s/move and 4s/move. Booot moving up relative to Laser and RubiChess was somewhat expected. Notable is Ethereal's improvement, which seems to cement it well above "the rest" with more time.

I wonder how much this list represents tactical or positional capabilities of an engine?

P.S.: Thank you for testing Winter 0.8 :D
-Jonathan
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: The Speedy Rating List

Post by Rebel »

cucumber wrote: Fri May 29, 2020 2:12 am
Rebel wrote: Thu May 28, 2020 8:46 pm
cucumber wrote: Thu May 28, 2020 8:24 pm Wow, this is really cool. Do you think you could talk about the methodology behind Lc1.epd?
Chris analyzed the Lcx positions with Lc0 on his RTX 2080 on long time control.
Also, would you be able to test Fire 7.1 again? I'm surprised to see it so low. It seems like a major outlier compared to what rating lists show.
There is something strange with Fire: I never get the results the rating lists show, and not only with this tool. I have Fire running now for the 4000ms list. Maybe it's just a scaling issue.
That's really interesting! I wonder if Norman knows what might be causing that. Thanks for such a quick reply.
4000ms did not help.

                                                              Max            Time   Hash          
    Engine           Points  Used Time   Found   Pos    Elo   Score   Score    ms     Mb  Cpu  Errors
 1  Stockfish 11     333316  44:49:32.7  24467  40000  3333  400000  83.33%   4000   128    1     0
 2  Komodo 14        326627  44:27:14.3  23558  40000  3266  400000  81.66%   4000   128    1     0
 3  Houdini 6.03     323896  44:47:34.1  23189  40000  3238  400000  80.97%   4000   128    1     0
 4  Ethereal 12      321795  44:44:14.7  22957  40000  3218  400000  80.45%   4000   128    1     0
 5  rofChade 2.3     318859  44:39:33.9  22599  40000  3188  400000  79.71%   4000   128    1    41
 6  Xiphos 0.6       317879  43:21:42.3  22555  40000  3178  400000  79.47%   4000   128    1     0
 7  Schooner 2.2     317565  43:41:15.3  22562  40000  3175  400000  79.39%   4000   128    1     0
 8  Booot 6.4        316479  47:12:45.4  22523  40000  3164  400000  79.12%   4000   128    1     2
 9  RubiChess 1.7.2  314933  44:43:11.4  22096  40000  3149  400000  78.73%   4000   128    1     0
10  Laser 1.7        314814  44:45:07.0  22345  40000  3148  400000  78.70%   4000   128    1     0
11  Fire 7.1         308708  44:42:14.0  21560  40000  3087  400000  77.18%   4000   128    1     0
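
For reading the table: the Score column appears to be Points divided by Max Score, expressed as a percentage. The thread does not state the tool's formula, so the sketch below is only a check that the published columns are consistent with that reading; `score_percent` is a hypothetical helper name.

```python
def score_percent(points, max_score):
    """Points as a percentage of the maximum attainable score, to 2 decimals."""
    return round(100.0 * points / max_score, 2)


# Points taken from the 4000ms table above; Max Score is 400000 for every row.
rows = {
    "Stockfish 11": 333316,
    "Komodo 14": 326627,
    "Fire 7.1": 308708,
}

for engine, points in rows.items():
    print(f"{engine}: {score_percent(points, 400000):.2f}%")
```

For these three rows the computed values match the Score column (83.33%, 81.66%, 77.18%).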
Did some cute-chess experiments and with SOMU -> Match Stats I got:

1. Cute-chess-run-1 : 8-noves.pgn  book-moves Fire 54.981 | book-moves opponent 45.275
2. Cute-chess-run-2 : 4-noves.pgn  book-moves Fire 29.099 | book-moves opponent 19.766
3. Cute-chess-run-3 : endgame.pgn  book-moves Fire      0 | book-moves opponent      0
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: The Speedy Rating List

Post by Rebel »

jorose wrote: Fri May 29, 2020 2:47 am This is an interesting benchmark to me. For the most part this seems reasonably accurate.

I find it interesting to compare your lists with 1s/move and 4s/move. Booot moving up relative to Laser and RubiChess was somewhat expected. Notable is Ethereal's improvement, which seems to cement it well above "the rest" with more time.

I wonder how much this list represents tactical or positional capabilities of an engine?

P.S.: Thank you for testing Winter 0.8 :D
I have no idea.

Positions mainly come from a 9.8 million EPD set from Dann Corbit and the 40,000 positions of lc1.epd were randomly chosen.
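
How the random choice was actually made is not described in the thread. One common way to draw a fixed-size uniform sample from a file too large to hold comfortably in memory is reservoir sampling; the sketch below assumes one position per line, and `sample_positions` is a hypothetical name.

```python
import random


def sample_positions(lines, k, seed=None):
    """Reservoir sampling (Algorithm R): choose k items uniformly at random
    from an iterable of unknown length, in a single pass."""
    rng = random.Random(seed)
    reservoir = []
    for i, line in enumerate(lines):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(line)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = line
    return reservoir


# Demo on generated placeholder strings instead of a real EPD file.
fake_positions = (f"position {n}" for n in range(100000))
subset = sample_positions(fake_positions, 40, seed=1)
print(len(subset))
```

With a real file, the same function could be fed an open file handle, for example `sample_positions(open("big.epd"), 40000)`, where `big.epd` is a placeholder file name.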
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: The Speedy Rating List

Post by Rebel »

Added a scaling list that compares each engine's Elo gain from 1000ms to 4000ms.

http://rebel13.nl/download/speedy-rating-list.html
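
A scaling comparison of this kind can be sketched as a per-engine difference between two lists. This is not the tool's code: `scaling_list` is a hypothetical helper, and the engine names and Elo numbers in the example are placeholders, not results from the lists.

```python
def scaling_list(elo_short, elo_long):
    """Return (engine, gain) pairs for engines present in both lists,
    sorted by Elo gain, largest first (ties broken by name)."""
    common = elo_short.keys() & elo_long.keys()
    gains = [(name, elo_long[name] - elo_short[name]) for name in common]
    return sorted(gains, key=lambda pair: (-pair[1], pair[0]))


# Placeholder data for illustration only, not real measurements.
elo_1000ms = {"Engine A": 3000, "Engine B": 2950, "Engine C": 2900}
elo_4000ms = {"Engine A": 3040, "Engine B": 3010}

for name, gain in scaling_list(elo_1000ms, elo_4000ms):
    print(f"{name}: {gain:+d}")
```

Engines missing from either list (Engine C here) are skipped rather than treated as zero gain.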
Ras
Posts: 2487
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: The Speedy Rating List

Post by Ras »

Rebel wrote: Thu May 28, 2020 10:49 pm Added 10 new engines.

CT800 1.40
Thanks for testing! :D
Rasmus Althoff
https://www.ct800.net
Terje
Posts: 347
Joined: Tue Nov 19, 2019 4:34 am
Location: https://github.com/TerjeKir/weiss
Full name: Terje Kirstihagen

Re: The Speedy Rating List

Post by Terje »

Given that a lot of engines have exactly 41 errors it would be interesting to know which positions (or even 1 of them) these occur in.