MEA and temere.epd


pohl4711
Posts: 2439
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: MEA and temere.epd

Post by pohl4711 »

Rebel wrote: Sat Apr 25, 2020 7:38 pm No problem here, make sure MEA.EXE is in the same folder as the Lc0 files.

Or in other words, install Lc0 in the TEMERE folder, not in the engines folder.
I found a way to avoid that MEA bug and still place all engines in the engines folder. It works with Fat Fritz. All you have to do is give MEA the full path to the engine exe:

wrong (but works with Lc0 0.24.1 and Stockfish etc.):
set EXE=engines\FatFritz_cpu_1\lc0-fatfritz-blas.exe

right:
set EXE=C:\MEA\engines\FatFritz_cpu_1\lc0-fatfritz-blas.exe
(if MEA is directly on C:\)
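
If you generate the MEA batch files with a script, the same fix can be sketched in a few lines of Python. This is only an illustration: the helper name `engine_set_line` and the idea of generating the `set EXE=` line from a script are my assumptions, not part of MEA. It turns a relative engine path into an absolute one under the MEA root.

```python
from pathlib import PureWindowsPath


def engine_set_line(mea_root: str, engine_path: str) -> str:
    """Build a 'set EXE=...' line with an absolute engine path.

    Hypothetical helper: a relative path is joined onto the MEA root,
    an already absolute path is left unchanged.
    """
    path = PureWindowsPath(engine_path)
    if not path.is_absolute():
        path = PureWindowsPath(mea_root) / path
    return f"set EXE={path}"


# MEA installed directly on C:\ as in the example above
print(engine_set_line(r"C:\MEA", r"engines\FatFritz_cpu_1\lc0-fatfritz-blas.exe"))
```

`PureWindowsPath` is used so the sketch handles Windows paths the same way on any operating system.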
Rebel
Posts: 6995
Joined: Thu Aug 18, 2011 12:04 pm

Re: MEA and temere.epd

Post by Rebel »

pohl4711 wrote: Sun Apr 26, 2020 5:28 am
Dann Corbit wrote: Sun Apr 26, 2020 12:16 am It must be one of those "laws of big numbers" things that makes it work so well.
If it can be used to make engines play better, then it is revolutionary.
It is! And it is perfect for fast tests (only 1 calculated node!) of neural-nets...
Look on my website:
https://www.sp-cc.de/nn-mea-testing.htm

Example:


Engine                           :  Top1  Top1Rate  Score  ScoreRate
lc0 0.24.1 LS 14.3 (20x256)      : 17929    0.515  257442    0.739
lc0 0.24.1 LS 14.2 (20x256)      : 17899    0.514  257012    0.738
lc0 0.24.1 LS 14.1 (20x256)      : 17769    0.510  255712    0.734
lc0 0.24.1 LS 14 (20x256)        : 17749    0.509  255361    0.733
Not only is the ranking of all LS 14 nets correct; additionally, the wider gap between LS 14.1 and LS 14.2 is correct. Awesome!
Great, more good news.

Because of the lack of strength options in today's top engines, I turned to my own. I compared a somewhat stronger ProDeo (10-15 Elo) with the last official one from 2016.

http://rebel13.nl/mea/ProDeo%203.html

I want to do one more test before releasing the new tool. I am thinking of comparing the current Stockfish with version 11. Where can I download the source code, and how much stronger is it?
90% of coding is debugging, the other 10% is writing bugs.
pohl4711
Posts: 2439
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: MEA and temere.epd

Post by pohl4711 »

Dann Corbit
Posts: 12541
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: MEA and temere.epd

Post by Dann Corbit »

You can get the very latest development stockfish build from here:
https://abrok.eu/stockfish/
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Rebel
Posts: 6995
Joined: Thu Aug 18, 2011 12:04 pm

Re: MEA and temere.epd

Post by Rebel »

Okay, thanks. Compiled both to have equal NPS. The result is not great.


    EPD  : epd\45000.epd
    Time : 100ms
                                                       Solving       Max    Total    Time   Hash          
    Engine           Score   Used Time   Found   Pos     Time       Score    Rate     ms     Mb  Cpu  CCRL
 1  sf11-april-26    951897  01:36:21.1  22370  45000  00:08:01.4  1350000  70.5%    100    128    1  2900
 2  sf11-release     950327  01:36:18.6  22293  45000  00:07:54.8  1350000  70.4%    100    128    1  2900
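
For readers who want to check the percentages: the "Total Rate" column is consistent with Score divided by Max Score. The short sketch below recomputes it from the two rows above. The "found rate" (Found divided by Pos) is an extra figure I derive for illustration; MEA does not print it in this table.

```python
# Rows copied from the table above: name, Score, Found, Pos, Max Score
rows = [
    ("sf11-april-26", 951897, 22370, 45000, 1350000),
    ("sf11-release", 950327, 22293, 45000, 1350000),
]


def rates(score: int, found: int, positions: int, max_score: int) -> tuple[float, float]:
    """Return (total rate, found rate) as fractions between 0 and 1."""
    return score / max_score, found / positions


for name, score, found, positions, max_score in rows:
    total_rate, found_rate = rates(score, found, positions, max_score)
    print(f"{name:15s} total rate {total_rate:.1%}  found rate {found_rate:.1%}")
```

The two builds differ by 1570 points out of 1350000, which is roughly 0.1 percentage points of total rate.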
pohl4711
Posts: 2439
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: MEA and temere.epd

Post by pohl4711 »

Rebel wrote: Mon Apr 27, 2020 12:43 am Okay, thanks. Compiled both to have equal NPS. The result is not great.


    EPD  : epd\45000.epd
    Time : 100ms
                                                       Solving       Max    Total    Time   Hash          
    Engine           Score   Used Time   Found   Pos     Time       Score    Rate     ms     Mb  Cpu  CCRL
 1  sf11-april-26    951897  01:36:21.1  22370  45000  00:08:01.4  1350000  70.5%    100    128    1  2900
 2  sf11-release     950327  01:36:18.6  22293  45000  00:07:54.8  1350000  70.4%    100    128    1  2900
IMHO MEA is not good for testing really strong AB engines. Why? I tested Stockfish at 5''/position (with the huge 34844-position EPD set that I use for NN testing) on a hexacore and got a score rate of more than 87%, which is much too high (I believe that beyond 85% the results are no longer reliable). So the conclusion is that Stockfish should only be tested with a very short time control, as you did here. But what makes Stockfish so incredibly strong is that its search is very well tuned and tricky, and with only 100 ms of thinking time this strength cannot unfold. Because of this, MEA cannot measure the progress of Stockfish under those conditions.
MEA is good for testing weaker AB engines, and it is perfect for testing NNs (without any search, only 1 node/position). But for Stockfish I would not recommend it.
Rebel
Posts: 6995
Joined: Thu Aug 18, 2011 12:04 pm

Re: MEA and temere.epd

Post by Rebel »

I think that's a bit premature. What you have so far is only the temere util, which is meant to create a reasonable ranking list (without Elo) with an error bar of -25/+25 Elo. The util to be released in a couple of days is an attempt to narrow that gap to 5-10 Elo and is meant for further improvement. I think the system has potential, but it will be a long ride to get the maximum out of it. It can also fail.
Rebel
Posts: 6995
Joined: Thu Aug 18, 2011 12:04 pm

Re: MEA and temere.epd

Post by Rebel »

Two more examples, for demonstration purposes only.

Tuned 7 parameters in the hope of finding an indication (emphasis added) of an improvement.

http://rebel13.nl/mea/mix1.html

No cigar.

Run time: 43 minutes. Preparing the batch files almost took longer :lol:

Tuning the Bishop Value.

http://rebel13.nl/mea/Bishop_Value.html

Too bad.

I'm starting to itch to re-tune that old dinosaur; it has been a long time.
Dann Corbit
Posts: 12541
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: MEA and temere.epd

Post by Dann Corbit »

Here is my data for the temere positions.


I have 21,586 positions where we agree on the best move (temerity-arg.epd)
I have 13,212 positions where we disagree on the best move (temerity-dis.epd)

I do not have your data file, so I am not sure what the evaluations and depths are.
Hence, it is difficult for me to make contrasts and comparisons.
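
In case it helps others reproduce such a split, here is a minimal sketch of how two sets of best moves could be partitioned into "agree" and "disagree" lists. It is not the tool I used. The function name and the dictionary layout (position string mapped to best move) are assumptions for illustration, and the second move in `theirs` is made up.

```python
def split_by_agreement(mine: dict[str, str], theirs: dict[str, str]) -> tuple[list[str], list[str]]:
    """Partition the positions found in both sets by whether the best moves match.

    Positions missing from either set are skipped, since there is
    nothing to compare them with.
    """
    agree, disagree = [], []
    for position, best_move in mine.items():
        other = theirs.get(position)
        if other is None:
            continue
        (agree if best_move == other else disagree).append(position)
    return agree, disagree


# Toy data; the best moves in `theirs` are hypothetical
mine = {
    "1b1qrr2/1p4pk/1np4p/p3Np1B/Pn1P4/R1N3B1/1Pb2PPP/2Q1R1K1 b - -": "Bxe5",
    "1k1r2r1/1b4p1/p4n1p/1pq1pPn1/2p1P3/P1N2N2/1PB1Q1PP/3R1R1K b - -": "Nxf3",
    "1k1r3r/pb1q2p1/B4p2/2p4p/Pp1bPPn1/7P/1P2Q1P1/R1BN1R1K b - -": "Bxa6",
}
theirs = {
    "1b1qrr2/1p4pk/1np4p/p3Np1B/Pn1P4/R1N3B1/1Pb2PPP/2Q1R1K1 b - -": "Bxe5",
    "1k1r2r1/1b4p1/p4n1p/1pq1pPn1/2p1P3/P1N2N2/1PB1Q1PP/3R1R1K b - -": "Qxa3",
}

agree, disagree = split_by_agreement(mine, theirs)
print(len(agree), "agree,", len(disagree), "disagree")
```

A plain string comparison only works if both sides write moves in the same notation, so real data would likely need normalising first.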
Rebel
Posts: 6995
Joined: Thu Aug 18, 2011 12:04 pm

Re: MEA and temere.epd

Post by Rebel »

Dann Corbit wrote: Tue Apr 28, 2020 4:39 am Here is my data for the temere positions.


I have 21,586 positions where we agree on the best move (temerity-arg.epd)
I have 13,212 positions where we disagree on the best move (temerity-dis.epd)

I do not have your data file, so I am not sure what the evaluations and depths are.
Hence, it is difficult for me to make contrasts and comparisons.
MEA creates perfect EPDs in the "epd_out" folder, for example:


1b1qrr2/1p4pk/1np4p/p3Np1B/Pn1P4/R1N3B1/1Pb2PPP/2Q1R1K1 b - - bm Bxe5; ce 203; acd 12;
1k1r2r1/1b4p1/p4n1p/1pq1pPn1/2p1P3/P1N2N2/1PB1Q1PP/3R1R1K b - - bm Nxf3; ce 124; acd 13;
1k1r3r/pb1q2p1/B4p2/2p4p/Pp1bPPn1/7P/1P2Q1P1/R1BN1R1K b - - bm Bxa6; ce 178; acd 14;
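
To pull the evaluations and depths out of lines like these, a small parser is enough. The sketch below is a simplified reading of the EPD layout (four position fields followed by `opcode operand;` pairs), not MEA's own code. It assumes no operand contains a semicolon, which holds for `bm`, `ce` and `acd` but not necessarily for quoted string operands.

```python
def parse_epd(line: str) -> dict:
    """Split one EPD line into its position fields and its operations.

    Simplified sketch: operations are split on ';', so quoted operands
    that contain a semicolon are not handled.
    """
    fields = line.strip().split(None, 4)
    position = " ".join(fields[:4])  # board, side to move, castling, en passant
    ops = {}
    if len(fields) > 4:
        for op in fields[4].split(";"):
            op = op.strip()
            if not op:
                continue
            opcode, _, operand = op.partition(" ")
            ops[opcode] = operand.strip()
    return {"position": position, "ops": ops}


record = parse_epd(
    "1b1qrr2/1p4pk/1np4p/p3Np1B/Pn1P4/R1N3B1/1Pb2PPP/2Q1R1K1 b - - bm Bxe5; ce 203; acd 12;"
)
print(record["ops"]["bm"], int(record["ops"]["ce"]), int(record["ops"]["acd"]))
```

Here `bm` is the best move, `ce` the evaluation in centipawns and `acd` the analysis depth, following the standard EPD opcodes.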
BTW, you must have noticed by now that many positions come from your 110 million EPD database, which is excellent for creating random sets.