Rebel wrote: ↑Sat Apr 25, 2020 7:38 pm
No problem here; make sure MEA.EXE is in the same folder as the Lc0 files.
Or in other words, install Lc0 in the TEMERE folder, not in the engines folder.
I found a way to avoid that MEA bug and still keep all engines in the engines folder. It works with Fat Fritz. All you have to do is give the full path to the engine EXE in MEA:
wrong (but works with Lc0 0.24.1 and Stockfish etc.):
set EXE=engines\FatFritz_cpu_1\lc0-fatfritz-blas.exe
right:
set EXE=C:\MEA\engines\FatFritz_cpu_1\lc0-fatfritz-blas.exe
(if MEA is directly on C:\)
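The underlying issue is generic: a relative path is resolved against whatever the current working directory happens to be, so it breaks as soon as MEA and the engine files live in different folders, while an absolute path resolves the same way from anywhere. A small illustration (this is not MEA code, just a sketch using Python's Windows-path module):

```python
import ntpath  # Windows path semantics, regardless of host OS

# Paths from the post above. The relative form only works if the
# process's working directory is the MEA folder; the absolute form
# is unambiguous from anywhere.
rel = r"engines\FatFritz_cpu_1\lc0-fatfritz-blas.exe"
absolute = r"C:\MEA\engines\FatFritz_cpu_1\lc0-fatfritz-blas.exe"

print(ntpath.isabs(rel))       # False - depends on the working directory
print(ntpath.isabs(absolute))  # True - resolves the same from anywhere
```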
Dann Corbit wrote: ↑Sun Apr 26, 2020 12:16 am
It must be one of those "laws of big numbers" things that makes it work so well.
If it can be used to make engines play better, then it is revolutionary.
Not only is the ranking of all LS 14 nets correct; additionally, the wider gap between LS 14.1 and LS 14.2 is correct. Awesome!
Great, more good news.
Because of the lack of strength options in today's top engines, I turned to my own. I compared a somewhat stronger ProDeo (10-15 Elo) against the last official release of 2016.
I want to do one more test before releasing the new tool; I am thinking of comparing the current Stockfish with version 11. Where can I download the source code, and how much stronger is it?
90% of coding is debugging, the other 10% is writing bugs.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
EPD : epd\45000.epd
Time : 100ms
                                   Solving   Max               Total  Score  Time  Hash
  Engine          Score  Used Time   Found   Pos        Time   Score   Rate    ms    Mb  Cpu  CCRL
1 sf11-april-26  951897 01:36:21.1   22370 45000  00:08:01.4 1350000  70.5%   100   128    1  2900
2 sf11-release   950327 01:36:18.6   22293 45000  00:07:54.8 1350000  70.4%   100   128    1  2900
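The Score Rate column appears to be the solving score divided by the maximum total score (1350000 for 45000 positions, i.e. 30 points per position); that is my reading of the numbers, not something taken from MEA's source. A quick check against the table above:

```python
# Assumed formula: rate = score / max_score, as a percentage.
def score_rate(score: int, max_score: int) -> float:
    """Return the solving score as a percentage of the maximum total score."""
    return 100.0 * score / max_score

# Values from the table: 45000 positions, max total score 1350000.
print(round(score_rate(951897, 1_350_000), 1))  # 70.5 (sf11-april-26)
print(round(score_rate(950327, 1_350_000), 1))  # 70.4 (sf11-release)
```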
90% of coding is debugging, the other 10% is writing bugs.
IMHO MEA is not good for testing really strong AB engines. Why? I tested Stockfish at 5"/position (with the huge 34844-position EPD set I use for NN testing) on a hexacore and got a much too high score rate of more than 87% (I believe that beyond 85% the results are no longer reliable). So the conclusion is that Stockfish should be tested only with a very short time control, as you did here. But what makes Stockfish so incredibly strong is that its search is very, very well tuned and tricky, and with only 100 ms of thinking time that strength cannot unfold. Because of this, MEA cannot measure the progress of Stockfish under those conditions.
MEA is good for testing weaker AB engines, and it is perfect for testing NNs (without any search, only 1 node/position). But for Stockfish I would not recommend using it.
I think that's a bit premature; all you have is the temere util, which is meant to create a reasonable ranking list (without Elo) with an error bar of -25/+25 Elo. The util to be released in a couple of days is an attempt to narrow that gap to 5-10 Elo, and it is meant for further improvement. I think the system has potential, but it will be a long ride to get the maximum out of it. It can also fail.
90% of coding is debugging, the other 10% is writing bugs.
I have 21,586 positions where we agree on the best move (temerity-arg.epd)
I have 13,212 positions where we disagree on the best move (temerity-dis.epd)
I do not have your data file, so I am not sure what the evaluations and depths are.
Hence, it is difficult for me to make contrasts and comparisons.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Dann Corbit wrote: ↑Tue Apr 28, 2020 4:39 am
Here is my data for the temere positions.
MEA creates perfect EPDs in the "epd_out" folder. For example:
1b1qrr2/1p4pk/1np4p/p3Np1B/Pn1P4/R1N3B1/1Pb2PPP/2Q1R1K1 b - - bm Bxe5; ce 203; acd 12;
1k1r2r1/1b4p1/p4n1p/1pq1pPn1/2p1P3/P1N2N2/1PB1Q1PP/3R1R1K b - - bm Nxf3; ce 124; acd 13;
1k1r3r/pb1q2p1/B4p2/2p4p/Pp1bPPn1/7P/1P2Q1P1/R1BN1R1K b - - bm Bxa6; ce 178; acd 14;
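These records use standard EPD opcodes: bm (best move), ce (centipawn evaluation), and acd (analysis depth). A minimal sketch of reading such a line (illustrative only, not part of MEA):

```python
# Split an EPD record into its four FEN-like position fields and a
# dict of opcode -> operand pairs (bm, ce, acd, ...).
def parse_epd(line: str):
    fields = line.strip().split(" ", 4)   # 4 position fields, then opcodes
    position = " ".join(fields[:4])
    opcodes = {}
    for op in fields[4].split(";"):
        op = op.strip()
        if op:
            key, _, value = op.partition(" ")
            opcodes[key] = value
    return position, opcodes

pos, ops = parse_epd(
    "1b1qrr2/1p4pk/1np4p/p3Np1B/Pn1P4/R1N3B1/1Pb2PPP/2Q1R1K1 b - - "
    "bm Bxe5; ce 203; acd 12;"
)
print(ops["bm"], ops["ce"], ops["acd"])  # Bxe5 203 12
```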
BTW, you must have noticed by now that many positions come from your 110-million-position EPD database, which is excellent for creating random sets.
90% of coding is debugging, the other 10% is writing bugs.