The Speedy Rating List

Rebel · Post by **Rebel** » Thu May 28, 2020 8:46 pm

cucumber wrote: ↑Thu May 28, 2020 8:24 pm Wow, this is really cool. Do you think you could talk about the methodology behind Lc1.epd?

Chris analyzed the Lcx positions with Lc0 on his RTX 2080 on long time control.

Also, would you be able to test Fire 7.1 again? I'm surprised to see it so low. It seems like a major outlier compared to what rating lists show.

There is something strange with Fire: I never get the results the rating lists report, and not only with this tool. I have Fire running now for the 4000ms list. Maybe it's just a scaling issue.

xr_a_y · Post by **xr_a_y** » Thu May 28, 2020 9:42 pm

Rebel wrote: ↑Thu May 28, 2020 8:24 pm
xr_a_y wrote: ↑Thu May 28, 2020 4:46 pm
Rebel wrote: ↑Thu May 28, 2020 4:34 pm
xr_a_y wrote: ↑Thu May 28, 2020 4:21 pm Thanks for considering Minic in here !

Can you please give a shot to Minic 2.32 (unofficial release that can be found here https://github.com/tryingsomestuff/Mini ... ter/Minic2)

I suspect it might be stronger on this type of test ...
Oki doki, but which executable to pick?
Code:
   minic_2.32_mingw_x64_nehalem.exe  minic 2.32   May 28, 2020  
   minic_2.32_mingw_x64_skylake.exe  minic 2.32   May 28, 2020  
   minic_2.32_mingw_x64_x86-64.exe  minic 2.32   May 28, 2020  
In this order, best to worst, depending on your hardware :
skylake == avx2/bmi2
nehalem == sse4.2
x86-64 == just popcnt
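As a rough illustration of the ordering above, here is a minimal Python sketch that maps a set of CPU feature flags to the matching Minic build. The function names are made up for this example, and the flag spellings (`avx2`, `bmi2`, `sse4_2`) follow the Linux `/proc/cpuinfo` convention, which is an assumption; on Windows you would have to obtain the flags some other way.

```python
def pick_minic_build(cpu_flags):
    """Return the build suffix of the fastest executable the CPU supports.

    Ordering taken from the post: skylake (avx2/bmi2) > nehalem (sse4.2)
    > x86-64 (just popcnt). Flag names are assumed to be Linux-style.
    """
    flags = {f.lower() for f in cpu_flags}
    if {"avx2", "bmi2"} <= flags:
        return "skylake"
    if "sse4_2" in flags:
        return "nehalem"
    return "x86-64"


def executable_name(cpu_flags, version="2.32"):
    """Build the file name in the pattern shown in the release listing."""
    return f"minic_{version}_mingw_x64_{pick_minic_build(cpu_flags)}.exe"


print(executable_name(["popcnt", "sse4_2"]))
```

A CPU that reports `sse4_2` but not both `avx2` and `bmi2` ends up with the nehalem executable, which is the one used for the run below.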
Code:
                                                               Max            Time   Hash          
    Engine           Points  Used Time   Found   Pos    Elo   Score   Score    ms     Mb  Cpu  Errors
29  Minic 2.32       289394  11:01:03.5  19880  40000  2894  400000  72.35%   1000   128    1     0
32  Minic 2.25       287730  10:54:48.6  19780  40000  2877  400000  71.93%   1000   128    1     0
Used the sse4.2 executable for 2.32, not sure which one I used for 2.25, likely the sse4.2 also.

Thanks a lot !

Rebel · Post by **Rebel** » Thu May 28, 2020 10:49 pm

Added 10 new engines.

Winter 0.8
ProDeo 2.2
Benjamin
Devel 3.0.0b
FoxSee 3.3.3
Topple 0.7.5
GreKo 2020.03
CT800 1.40
Combusken 1.2
Minic 2.32

http://rebel13.nl/download/speedy-rating-list.html

cucumber · Post by **cucumber** » Fri May 29, 2020 2:12 am

Rebel wrote: ↑Thu May 28, 2020 8:46 pm
cucumber wrote: ↑Thu May 28, 2020 8:24 pm Wow, this is really cool. Do you think you could talk about the methodology behind Lc1.epd?
Chris analyzed the Lcx positions with Lc0 on his RTX 2080 on long time control.

Also, would you be able to test Fire 7.1 again? I'm surprised to see it so low. It seems like a major outlier compared to what rating lists show.
There is something strange with Fire: I never get the results the rating lists report, and not only with this tool. I have Fire running now for the 4000ms list. Maybe it's just a scaling issue.

That's really interesting! I wonder if Norman knows what might be causing that. Thanks for such a quick reply.

jorose · Post by **jorose** » Fri May 29, 2020 2:47 am

This is an interesting benchmark to me. For the most part this seems reasonably accurate.

I find it interesting to compare your lists with 1s/move and 4s/move. Booot moving up relative to Laser and RubiChess was somewhat expected. Notable is Ethereal's improvement, which seems to cement it well above "the rest" with more time.

I wonder how much this list represents tactical or positional capabilities of an engine?

P.S.: Thank you for testing Winter 0.8

Rebel · Post by **Rebel** » Fri May 29, 2020 7:33 am

cucumber wrote: ↑Fri May 29, 2020 2:12 am
Rebel wrote: ↑Thu May 28, 2020 8:46 pm
cucumber wrote: ↑Thu May 28, 2020 8:24 pm Wow, this is really cool. Do you think you could talk about the methodology behind Lc1.epd?
Chris analyzed the Lcx positions with Lc0 on his RTX 2080 on long time control.

Also, would you be able to test Fire 7.1 again? I'm surprised to see it so low. It seems like a major outlier compared to what rating lists show.
There is something strange with Fire: I never get the results the rating lists report, and not only with this tool. I have Fire running now for the 4000ms list. Maybe it's just a scaling issue.
That's really interesting! I wonder if Norman knows what might be causing that. Thanks for such a quick reply.

4000ms did not help.

Code:

                                                              Max            Time   Hash          
    Engine           Points  Used Time   Found   Pos    Elo   Score   Score    ms     Mb  Cpu  Errors
 1  Stockfish 11     333316  44:49:32.7  24467  40000  3333  400000  83.33%   4000   128    1     0
 2  Komodo 14        326627  44:27:14.3  23558  40000  3266  400000  81.66%   4000   128    1     0
 3  Houdini 6.03     323896  44:47:34.1  23189  40000  3238  400000  80.97%   4000   128    1     0
 4  Ethereal 12      321795  44:44:14.7  22957  40000  3218  400000  80.45%   4000   128    1     0
 5  rofChade 2.3     318859  44:39:33.9  22599  40000  3188  400000  79.71%   4000   128    1    41
 6  Xiphos 0.6       317879  43:21:42.3  22555  40000  3178  400000  79.47%   4000   128    1     0
 7  Schooner 2.2     317565  43:41:15.3  22562  40000  3175  400000  79.39%   4000   128    1     0
 8  Booot 6.4        316479  47:12:45.4  22523  40000  3164  400000  79.12%   4000   128    1     2
 9  RubiChess 1.7.2  314933  44:43:11.4  22096  40000  3149  400000  78.73%   4000   128    1     0
10  Laser 1.7        314814  44:45:07.0  22345  40000  3148  400000  78.70%   4000   128    1     0
11  Fire 7.1         308708  44:42:14.0  21560  40000  3087  400000  77.18%   4000   128    1     0
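
For readers wondering how the columns relate: the Score column is Points divided by Max Score, and Max Score (400000 for 40000 positions) works out to 10 points per position. The Elo column appears to track Points / 100 to within a point, but the exact formula is not stated in the thread, so treat that part as an observation rather than a definition. A small Python sketch checking this against three rows of the table above:

```python
def score_percent(points, max_score):
    """Score column: points as a percentage of the maximum, two decimals."""
    return round(100.0 * points / max_score, 2)


MAX_SCORE = 400000  # 40000 positions, i.e. 10 points per position

# (engine, Points, Elo) copied from the 4000ms table
rows = [
    ("Stockfish 11", 333316, 3333),
    ("Komodo 14", 326627, 3266),
    ("Fire 7.1", 308708, 3087),
]

for name, points, elo in rows:
    pct = score_percent(points, MAX_SCORE)
    # Observation only: the listed Elo stays within one point of Points / 100.
    close = abs(elo - points / 100) < 1
    print(f"{name:14s} {pct:6.2f}%  Elo {elo}  within 1 of Points/100: {close}")
```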

Did some cute-chess experiments and with SOMU -> Match Stats I got:

Code:

1. Cute-chess-run-1 : 8-noves.pgn  book-moves Fire 54.981 | book-moves opponent 45.275
2. Cute-chess-run-2 : 4-noves.pgn  book-moves Fire 29.099 | book-moves opponent 19.766
3. Cute-chess-run-3 : endgame.pgn  book-moves Fire      0 | book-moves opponent      0

Rebel · Post by **Rebel** » Fri May 29, 2020 8:04 am

jorose wrote: ↑Fri May 29, 2020 2:47 am This is an interesting benchmark to me. For the most part this seems reasonably accurate.

I find it interesting to compare your lists with 1s/move and 4s/move. Booot moving up relative to Laser and RubiChess was somewhat expected. Notable is Ethereal's improvement, which seems to cement it well above "the rest" with more time.

I wonder how much this list represents tactical or positional capabilities of an engine?

P.S.: Thank you for testing Winter 0.8

I have no idea.

Positions mainly come from a 9.8 million EPD set from Dann Corbit and the 40,000 positions of lc1.epd were randomly chosen.
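
For anyone who wants to build a similar subset: the post does not say how the random choice was made, so the following is only one possible approach, not the method actually used. It is a minimal Python sketch of reservoir sampling (Algorithm R), which picks k lines uniformly from a file too large to load comfortably. The file name in the usage comment is hypothetical.

```python
import random


def sample_lines(lines, k, seed=None):
    """Uniformly sample k items from an iterable of unknown length.

    Classic reservoir sampling (Algorithm R): keep the first k items, then
    replace a random slot with item i with probability k / (i + 1).
    """
    rng = random.Random(seed)
    reservoir = []
    for i, line in enumerate(lines):
        if i < k:
            reservoir.append(line)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = line
    return reservoir


# Usage with a hypothetical file name:
# with open("positions.epd") as f:
#     subset = sample_lines(f, 40000)
demo = sample_lines(range(1000), 40, seed=1)
print(len(demo))
```

The single pass over the input is the reason to prefer this over `random.sample` when the source set has millions of lines.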

Rebel · Post by **Rebel** » Sat May 30, 2020 9:58 am

Added a scaling list that compares each engine's Elo gain from 1000ms to 4000ms.

http://rebel13.nl/download/speedy-rating-list.html
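
A scaling list of this kind can be sketched in a few lines of Python. This is an assumption about the calculation (the 4000ms Elo minus the 1000ms Elo, sorted by gain), not a copy of how the published list is produced, and the engine names and numbers below are placeholders, not results from the list.

```python
def scaling_list(elo_short, elo_long):
    """Return (engine, gain) pairs sorted by Elo gain, largest first.

    Only engines present in both lists are compared. Ties are broken
    alphabetically so the order is deterministic.
    """
    common = elo_short.keys() & elo_long.keys()
    gains = {name: elo_long[name] - elo_short[name] for name in common}
    return sorted(gains.items(), key=lambda kv: (-kv[1], kv[0]))


# Placeholder values for illustration only
elo_1000ms = {"Engine A": 3000, "Engine B": 2900, "Engine C": 2800}
elo_4000ms = {"Engine A": 3080, "Engine B": 3010}

for name, gain in scaling_list(elo_1000ms, elo_4000ms):
    print(f"{name}: {gain:+d}")
```

"Engine C" is skipped because it has no 4000ms result to compare against.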

Ras · Post by **Ras** » Sat May 30, 2020 10:55 am

Rebel wrote: ↑Thu May 28, 2020 10:49 pm Added 10 new engines.

CT800 1.40

Thanks for testing!

Terje · Post by **Terje** » Sat May 30, 2020 1:59 pm

Given that a lot of engines have exactly 41 errors, it would be interesting to know which positions (or even just one of them) these occur in.
