MRL - The MEA Rating List

Rebel · Post by **Rebel** » Fri Jun 08, 2018 9:43 pm

Added 2 tools:

MRT.EXE creates a rating list, as used here, from a MEA Excel (*.csv) file, so you can create (and publish) your own findings. Syntax: mrt [source] [destination]

MRE.EXE creates 2 EPD files from a MEA log-file: one that contains the positions where an engine scored no points at all, and a second one that contains the positions where the best (10 points) move was not found. See the examples [ one ] and [ two ] of the Stockfish 9 log-file at 60 seconds per move. Syntax: mre [source]

http://rebel13.nl/rebel13/mrl.html#tools

Albert Silver · Post by **Albert Silver** » Sun Jun 10, 2018 5:40 am

Rebel wrote: ↑Thu Jun 07, 2018 10:42 pm
Albert Silver wrote: ↑Thu Jun 07, 2018 9:20 pm I'm quite curious to see whether Leela can fit into that system, being the bipolar gal she is.
I don't follow the zero news, what is the download location?

It depends somewhat on your hardware. If you have an Nvidia GPU, preferably not too old, then you will get hugely better performance than without one. The NN is an incredibly unusual engine: even on my fairly modern $270 graphics card, I get about 5000 to 7000 nodes per second. Yes, nodes, not kilonodes, and it will still perform along the lines of 3100-3200 CCRL. Without a GPU, you are probably looking at 250 nodes per second at best.

The official site is: https://github.com/LeelaChessZero/lc0/w ... ng-Started

If you have a GPU, let me know, as there is another executable you will want to use instead.

Ferdy · Post by **Ferdy** » Sun Jun 10, 2018 10:14 am

Tried to establish a relationship between the CEGT 40/4 rating and the MEA score at 5s/pos for selected engines.

Data points: 18
              Player MEAScore  CEGT40/4  MEARating  Error
0     [Critter 1.6a]  [13758]      3028       3132   -104
1      [Komodo 9.02]  [13679]      3200       3114     85
2            [Gull3]  [13392]      3078       3046     31
3    [Andscacs 0.93]  [13101]      3091       2977    113
4          [Booot 6]  [12999]      2890       2953    -63
5       [texel 1.07]  [12924]      2969       2935     33
6        [Laser 1.5]  [12888]      2942       2926     15
7    [Protector 1.9]  [12876]      2956       2923     32
8     [Hannibal 1.7]  [12780]      2981       2901     79
9      [arasan 20.5]  [12625]      2905       2864     40
10           [ice 3]  [12614]      2941       2861     79
11        [senpai 2]  [12560]      2943       2849     93
12      [Dirty 2017]  [12223]      2744       2769    -25
13  [Deuterium 2018]  [11955]      2800       2705     94
14       [Tornado 8]  [11825]      2667       2675     -8
15       [Fruit 2.1]  [11406]      2496       2575    -79
16       [Gandalf 7]  [11071]      2475       2496    -21
17     [Ruffian 2.1]  [10884]      2443       2452     -9

MEARating = (0.23676 x MEAScore) + (-124.6)
Error = cegt40/4 - MEARating
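
The formula above can be checked against any row of the table. A minimal sketch in Python, assuming the table values are truncated to whole numbers (which is what the listed figures suggest); the function names are illustrative and not part of MEA:

```python
# Coefficients taken from the formula quoted above (5s/pos vs CEGT 40/4).
SLOPE = 0.23676
INTERCEPT = -124.6

def mea_rating(mea_score):
    """Estimated rating from a MEA score."""
    return SLOPE * mea_score + INTERCEPT

def rating_error(cegt_rating, mea_score):
    """CEGT 40/4 rating minus the MEA-based estimate."""
    return cegt_rating - mea_rating(mea_score)

# Critter 1.6a from the table: MEAScore 13758, CEGT 40/4 3028.
print(int(mea_rating(13758)))          # 3132
print(int(rating_error(3028, 13758)))  # -104
```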

Rebel · Post by **Rebel** » Sun Jun 10, 2018 11:26 am

Looks reasonably good.

IMO you can also make a graph from the 10 sec results vs the CEGT 40/20 list since the time control used is based on the hardware of 2006, when they started.

Example from the download section -

[Event "Wasp"]
[Site "Werner"]
[Date "2018.05.12"]
[Round "1"]
[White "Wasp 3.00 x64 1CPU"]
[Black "Chess22k 1.9 x64"]
[Result "1/2-1/2"]
[ECO "A03"]
[WhiteElo "2200"]
[BlackElo "2200"]
[PlyCount "132"]
[EventDate "2018.??.??"]
[TimeControl "40/480:40/480:40/480"]

Meaning 12 secs per move on average (480 seconds for 40 moves).
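
The 12 second figure follows directly from the TimeControl tag. A small sketch of that arithmetic in Python, assuming every period has the plain "moves/seconds" form shown in the header above (other PGN time control notations are not handled); the function name is made up for illustration:

```python
def average_seconds_per_move(time_control):
    """Average seconds per move for a 'moves/seconds:...' TimeControl value."""
    total_moves = 0
    total_seconds = 0
    for period in time_control.split(":"):
        moves, seconds = period.split("/")
        total_moves += int(moves)
        total_seconds += int(seconds)
    return total_seconds / total_moves

print(average_seconds_per_move("40/480:40/480:40/480"))  # 12.0
```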

Ferdy · Post by **Ferdy** » Sun Jun 10, 2018 12:14 pm

Rebel wrote: ↑Sun Jun 10, 2018 11:26 am Looks reasonably good.

IMO you can also make a graph from the 10 sec results vs the CEGT 40/20 list since the time control used is based on the hardware of 2006, when they started.

Plot for 10s/pos and CEGT 40/20:

Data points: 18
              Player MEAScore  CEGT40/20  MEARating  Error
0      [Komodo 9.02]  [13825]       3301       3230     70
1    [Andscacs 0.93]  [13294]       3083       3054     28
2    [Protector 1.9]  [13221]       2961       3030    -69
3        [Laser 1.5]  [13181]       2983       3017    -34
4     [Hannibal 1.7]  [13112]       3095       2994    100
5       [texel 1.07]  [13084]       2968       2985    -17
6      [arasan 20.5]  [13052]       2918       2974    -56
7         [Wasp 3.0]  [13007]       2907       2959    -52
8         [senpai 2]  [12840]       2947       2904     42
9            [ice 3]  [12838]       2935       2903     31
10      [Dirty 2017]  [12443]       2739       2773    -34
11  [Deuterium 2018]  [12360]       2818       2745     72
12      [Pedone 1.7]  [12304]       2922       2727    194
13       [Tornado 8]  [12239]       2698       2705     -7
14      [ProDeo 2.2]  [11829]       2531       2570    -39
15       [Fruit 2.1]  [11763]       2499       2548    -49
16       [Gandalf 7]  [11457]       2464       2447     16
17     [Ruffian 2.1]  [11354]       2432       2412     19
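
No formula is quoted for this second table. Assuming the MEARating column was again produced by a straight line in MEAScore, its coefficients can be estimated from the listed values with an ordinary least-squares fit. The sketch below uses plain Python; the result is only approximate because the table shows whole-number ratings, and it is not necessarily the exact formula used for the plot:

```python
# (MEAScore, MEARating) pairs copied from the 10s/pos table above.
rows = [
    (13825, 3230), (13294, 3054), (13221, 3030), (13181, 3017),
    (13112, 2994), (13084, 2985), (13052, 2974), (13007, 2959),
    (12840, 2904), (12838, 2903), (12443, 2773), (12360, 2745),
    (12304, 2727), (12239, 2705), (11829, 2570), (11763, 2548),
    (11457, 2447), (11354, 2412),
]

def fit_line(points):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(rows)
print(f"MEARating ~ ({slope:.5f} x MEAScore) + ({intercept:.1f})")
```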

Ray Bongalon · Post by **Ray Bongalon** » Tue Jun 12, 2018 11:28 pm

Hi.

Regarding the estimated rating set by the program after running the test, I always get the same rating value (e.g. 2500) even if I use different 'movetime' values. Thank you.

Regards

Rebel · Post by **Rebel** » Wed Jun 13, 2018 9:29 am

Ray Bongalon wrote: ↑Tue Jun 12, 2018 11:28 pm Hi.

Regarding the estimated rating set by the program after running the test, I always get the same rating value (e.g. 2500) even if I use different 'movetime' values. Thank you.

Regards

You need to run MRT.EXE; it's the util that calculates the actual rating. See http://rebel13.nl/rebel13/mrl.html#tools

Rebel · Post by **Rebel** » Wed Jun 13, 2018 9:40 am

Added a lot of new engines. It's amazing to see the old (2010-2012) Robbolito-based clique (Houdini 1.5, Bouquet, Critter) dominate the various lists, while programs rated 250-300 Elo higher, like Komodo and Stockfish, are unable to surpass them.

http://rebel13.nl/mea2.html
http://rebel13.nl/mea3.html

Rebel · Post by **Rebel** » Wed Jun 13, 2018 11:05 am

Interesting MEA experiment with ProDeo investigating the reliability of the 1500-position STS test suite:

http://rebel13.nl/sts.html

Joost Buijs · Post by **Joost Buijs** » Sat Jun 16, 2018 4:15 pm

Rebel wrote: ↑Wed Jun 13, 2018 9:40 am Added a lot of new engines. It's amazing to see the old (2010-2012) Robbolito-based clique (Houdini 1.5, Bouquet, Critter) dominate the various lists, while programs rated 250-300 Elo higher, like Komodo and Stockfish, are unable to surpass them.

http://rebel13.nl/mea2.html
http://rebel13.nl/mea3.html

Really, this doesn't surprise me at all. As I already said before, STS is based on analysis with engines from at least 5 years back, and those were indeed engines based on Robbolito and friends. Current results show that it is unreliable to use STS to determine playing strength; maybe it gives a rough indication, but that's it.
