MRL - The MEA Rating List

Rebel
Posts: 7207
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: MRL - The MEA Rating List

Post by Rebel »

Added 2 tools:

MRT.EXE creates a rating list, in the format used here, from a MEA Excel (*.csv) file, so you can create (and publish) your own findings. Syntax: mrt [source] [destination]

MRE.EXE creates two EPD files from a MEA log file: one containing the positions where an engine scored no points at all, and a second containing the positions where the best (10-point) move was not found. See the examples [ one ] and [ two ] from the Stockfish 9 log file at 60 seconds per move. Syntax: mre [source]

http://rebel13.nl/rebel13/mrl.html#tools
90% of coding is debugging, the other 10% is writing bugs.
Albert Silver
Posts: 3026
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: MRL - The MEA Rating List

Post by Albert Silver »

Rebel wrote: Thu Jun 07, 2018 10:42 pm
Albert Silver wrote: Thu Jun 07, 2018 9:20 pm I'm quite curious to see whether Leela can fit into that system, being the bipolar gal she is. :-)
I don't follow the zero news, what is the download location?
It depends somewhat on your hardware. If you have an Nvidia GPU, preferably not too old, you will get hugely better performance than without one. The NN is an incredibly unusual engine: even on my fairly modern $270 graphics card, I get about 5000 to 7000 nodes per second (yes, nodes, not kilonodes), yet it performs along the lines of 3100-3200 CCRL. Without a GPU, you are probably looking at 250 nodes per second at best.

The official site is: https://github.com/LeelaChessZero/lc0/w ... ng-Started

If you have a GPU, let me know, as there is another executable you will want to use instead.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
Ferdy
Posts: 4840
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: MRL - The MEA Rating List

Post by Ferdy »

I tried to establish a relationship between the CEGT 40/4 rating and the MEA score at 5 s/pos for selected engines.

Image

Data points: 18
              Player MEAScore  CEGT40/4  MEARating  Error
0     [Critter 1.6a]  [13758]      3028       3132   -104
1      [Komodo 9.02]  [13679]      3200       3114     85
2            [Gull3]  [13392]      3078       3046     31
3    [Andscacs 0.93]  [13101]      3091       2977    113
4          [Booot 6]  [12999]      2890       2953    -63
5       [texel 1.07]  [12924]      2969       2935     33
6        [Laser 1.5]  [12888]      2942       2926     15
7    [Protector 1.9]  [12876]      2956       2923     32
8     [Hannibal 1.7]  [12780]      2981       2901     79
9      [arasan 20.5]  [12625]      2905       2864     40
10           [ice 3]  [12614]      2941       2861     79
11        [senpai 2]  [12560]      2943       2849     93
12      [Dirty 2017]  [12223]      2744       2769    -25
13  [Deuterium 2018]  [11955]      2800       2705     94
14       [Tornado 8]  [11825]      2667       2675     -8
15       [Fruit 2.1]  [11406]      2496       2575    -79
16       [Gandalf 7]  [11071]      2475       2496    -21
17     [Ruffian 2.1]  [10884]      2443       2452     -9

MEARating = (0.23676 x MEAScore) + (-124.6)
Error = cegt40/4 - MEARating
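
For anyone who wants to check a row by hand, here is a minimal Python sketch of the two formulas above. The coefficients are the ones quoted in this post; the function names are just illustrative, and the use of int() is an assumption based on the table values looking truncated rather than rounded.

```python
# Coefficients copied from the fit quoted in the post (5 s/pos vs CEGT 40/4).
SLOPE = 0.23676
INTERCEPT = -124.6


def mea_rating(mea_score: float) -> float:
    """Map a MEA score onto the CEGT 40/4 rating scale using the quoted line."""
    return SLOPE * mea_score + INTERCEPT


def rating_error(cegt_rating: float, mea_score: float) -> float:
    """Error = CEGT 40/4 rating minus the rating predicted from the MEA score."""
    return cegt_rating - mea_rating(mea_score)


# Critter 1.6a row from the table: MEAScore 13758, CEGT 40/4 rating 3028.
# int() truncates toward zero (assumption: this is how the table was printed).
print(int(mea_rating(13758)), int(rating_error(3028, 13758)))
```

A negative error means the MEA score predicts a higher rating than the engine has on the CEGT list.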
Rebel
Posts: 7207
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: MRL - The MEA Rating List

Post by Rebel »

Looks reasonably good.

IMO you can also make a graph from the 10 sec results vs the CEGT 40/20 list since the time control used is based on the hardware of 2006, when they started.

Example from the download section -

[Event "Wasp"]
[Site "Werner"]
[Date "2018.05.12"]
[Round "1"]
[White "Wasp 3.00 x64 1CPU"]
[Black "Chess22k 1.9 x64"]
[Result "1/2-1/2"]
[ECO "A03"]
[WhiteElo "2200"]
[BlackElo "2200"]
[PlyCount "132"]
[EventDate "2018.??.??"]
[TimeControl "40/480:40/480:40/480"]

That means 12 seconds per move on average (480 seconds for 40 moves).
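
The arithmetic can be sketched in a few lines of Python. This assumes the TimeControl tag uses the moves/seconds form shown in the header above and only looks at the first period; the function name is made up for illustration.

```python
def seconds_per_move(time_control: str) -> float:
    """Average seconds per move for the first period of a 'moves/seconds' time control."""
    first_period = time_control.split(":")[0]   # e.g. "40/480"
    moves, seconds = first_period.split("/")
    return int(seconds) / int(moves)


# Tag value copied from the PGN header above.
print(seconds_per_move("40/480:40/480:40/480"))
```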
Ferdy
Posts: 4840
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: MRL - The MEA Rating List

Post by Ferdy »

Rebel wrote: Sun Jun 10, 2018 11:26 am Looks reasonably good.

IMO you can also make a graph from the 10 sec results vs the CEGT 40/20 list since the time control used is based on the hardware of 2006, when they started.
Plot for 10 s/pos and CEGT 40/20:

Image

Data points: 18
              Player MEAScore  CEGT40/20  MEARating  Error
0      [Komodo 9.02]  [13825]       3301       3230     70
1    [Andscacs 0.93]  [13294]       3083       3054     28
2    [Protector 1.9]  [13221]       2961       3030    -69
3        [Laser 1.5]  [13181]       2983       3017    -34
4     [Hannibal 1.7]  [13112]       3095       2994    100
5       [texel 1.07]  [13084]       2968       2985    -17
6      [arasan 20.5]  [13052]       2918       2974    -56
7         [Wasp 3.0]  [13007]       2907       2959    -52
8         [senpai 2]  [12840]       2947       2904     42
9            [ice 3]  [12838]       2935       2903     31
10      [Dirty 2017]  [12443]       2739       2773    -34
11  [Deuterium 2018]  [12360]       2818       2745     72
12      [Pedone 1.7]  [12304]       2922       2727    194
13       [Tornado 8]  [12239]       2698       2705     -7
14      [ProDeo 2.2]  [11829]       2531       2570    -39
15       [Fruit 2.1]  [11763]       2499       2548    -49
16       [Gandalf 7]  [11457]       2464       2447     16
17     [Ruffian 2.1]  [11354]       2432       2412     19
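
To experiment with a fit on these 18 points, a plain least-squares line is one option. The sketch below is only that: the thread does not say how the line behind the MEARating column was fitted, so the coefficients it produces may differ from the ones used in the table. The data pairs are copied from the table above; the function and variable names are illustrative.

```python
# (MEAScore at 10 s/pos, CEGT 40/20 rating) pairs copied from the table above.
DATA = [
    (13825, 3301), (13294, 3083), (13221, 2961), (13181, 2983),
    (13112, 3095), (13084, 2968), (13052, 2918), (13007, 2907),
    (12840, 2947), (12838, 2935), (12443, 2739), (12360, 2818),
    (12304, 2922), (12239, 2698), (11829, 2531), (11763, 2499),
    (11457, 2464), (11354, 2432),
]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(DATA)
# Residual = CEGT rating minus the rating predicted from the MEA score.
residuals = [y - (slope * x + intercept) for x, y in DATA]
print(f"slope={slope:.5f} intercept={intercept:.1f}")
print(f"largest absolute residual: {max(abs(r) for r in residuals):.0f}")
```

With a least-squares line that includes an intercept, the residuals sum to zero by construction, which is a quick sanity check on the fit.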
Ray Bongalon
Posts: 20
Joined: Mon May 17, 2010 3:53 pm
Location: Christchurch, New Zealand

Re: MRL - The MEA Rating List

Post by Ray Bongalon »

Hi.

Regarding the estimated rating set by the program after running the test, I always get the same rating value (e.g. 2500) even if I use different 'movetime' values. Thank you.

Regards
Rebel
Posts: 7207
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: MRL - The MEA Rating List

Post by Rebel »

Ray Bongalon wrote: Tue Jun 12, 2018 11:28 pm Hi.

Regarding the estimated rating set by the program after running the test, I always get the same rating value (e.g. 2500) even if I use different 'movetime' values. Thank you.

Regards
You need to run MRT.EXE; it's the utility that calculates the actual rating. See http://rebel13.nl/rebel13/mrl.html#tools
Rebel
Posts: 7207
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: MRL - The MEA Rating List

Post by Rebel »

Added a lot of new engines. It's amazing to see the old (2010-2012) RobboLito-based clique (Houdini 1.5, Bouquet, Critter) dominate the various lists, while programs rated 250-300 Elo higher, like Komodo and Stockfish, are unable to surpass them.

http://rebel13.nl/mea2.html
http://rebel13.nl/mea3.html
Rebel
Posts: 7207
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: MRL - The MEA Rating List

Post by Rebel »

An interesting MEA experiment with ProDeo, investigating the reliability of the 1500-position STS test suite:

http://rebel13.nl/sts.html
Joost Buijs
Posts: 1597
Joined: Thu Jul 16, 2009 10:47 am
Location: Almere, The Netherlands

Re: MRL - The MEA Rating List

Post by Joost Buijs »

Rebel wrote: Wed Jun 13, 2018 9:40 am Added a lot of new engines. It's amazing to see the old (2010-2012) RobboLito-based clique (Houdini 1.5, Bouquet, Critter) dominate the various lists, while programs rated 250-300 Elo higher, like Komodo and Stockfish, are unable to surpass them.

http://rebel13.nl/mea2.html
http://rebel13.nl/mea3.html
Really, this doesn't surprise me at all. As I said before, STS is based on analysis with engines from at least 5 years back, and those were indeed engines based on RobboLito and friends. Current results show that it is unreliable to use STS to determine playing strength; maybe it gives a rough indication, but that's it.