EAS-Tool new version for engine developers

pohl4711 · Post by **pohl4711** » Fri Apr 05, 2024 1:27 pm

Here is the EAS-Tool with a hardcoded short-win move limit. The limit is hardcoded to 60 moves, but you can change the value in the first lines of the source (.bat file). This version is in the "for_engine_developers" folder.

Normally the EAS-Tool calculates the short-win move limit (the limit below which an engine gets bonus points for a won game, because it is a short win) from the average length of all won games in the input.pgn.
But this can lead to unstable results when engine developers play gauntlets with their new engine version vs. a bunch of opponents (compared to the EAS results when a round robin is evaluated). So, here is the EAS-Tool with a hardcoded short-win move limit of 60 (or you can choose another value). 60 moves works fine when strong engines play each other without adjudication of the games (except tablebase adjudication).
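
To illustrate the difference between the two modes, here is a minimal Python sketch (not the EAS-Tool's own code, which is a .bat file). It assumes that the dynamic limit is simply the average length of all won games; the post only says the limit is "based on" that average, so the real formula may differ. The function names `split_games`, `won_game_lengths` and `short_win_limit` are hypothetical.

```python
import re

# Tiny demo input: two decisive games (3 and 2 moves) and one draw.
SAMPLE_PGN = """[Event "demo"]
[White "A"]
[Black "B"]
[Result "1-0"]

1. e4 e5 2. Qh5 Ke7 3. Qxe5# 1-0

[Event "demo"]
[White "B"]
[Black "A"]
[Result "1/2-1/2"]

1. d4 d5 2. c4 c6 1/2-1/2

[Event "demo"]
[White "A"]
[Black "B"]
[Result "0-1"]

1. f3 e5 2. g4 Qh4# 0-1
"""


def split_games(pgn_text):
    """Yield (result, movetext) for each game in a PGN string."""
    result, moves = None, []
    for line in pgn_text.splitlines():
        line = line.strip()
        if line.startswith("[Event "):
            # A new tag section starts: emit the previous game, if any.
            if moves:
                yield result, " ".join(moves)
            result, moves = None, []
        elif line.startswith("[Result "):
            result = line.split('"')[1]
        elif line and not line.startswith("["):
            moves.append(line)
    if moves:
        yield result, " ".join(moves)


def won_game_lengths(pgn_text):
    """Return the length in full moves of every decisive (won) game."""
    lengths = []
    for result, movetext in split_games(pgn_text):
        if result not in ("1-0", "0-1"):
            continue  # draws and unfinished games do not count
        movetext = re.sub(r"\{[^}]*\}", " ", movetext)  # strip comments
        numbers = re.findall(r"\b(\d+)\.", movetext)    # move numbers like "12."
        if numbers:
            lengths.append(int(numbers[-1]))
    return lengths


def short_win_limit(win_lengths, fixed_limit=None):
    """Return the short-win move limit.

    With fixed_limit set (e.g. 60), the limit ignores the input games.
    Otherwise the average won-game length is used; ASSUMPTION: the real
    tool derives its limit from this average in a way not stated here.
    """
    if fixed_limit is not None:
        return fixed_limit
    if not win_lengths:
        raise ValueError("no won games in the input")
    return sum(win_lengths) / len(win_lengths)


lengths = won_game_lengths(SAMPLE_PGN)
dynamic_limit = short_win_limit(lengths)                 # depends on the input
fixed_limit = short_win_limit(lengths, fixed_limit=60)   # always 60
```

The point of the sketch is only that the dynamic limit changes whenever the set of won games in the input changes, while the fixed limit stays the same for every PGN file.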

Just re-download the EAS-Tool (the normal version is, of course, still included in the download):
https://www.sp-cc.de/files/engines_aggr ... cs_tool.7z

pohl4711 · Post by **pohl4711** » Fri Apr 05, 2024 5:17 pm

Here is the EAS score of Patricia 2.0 after 3100 of 10000 games in my test run vs. 10 engines (strength around Rybka 4.1). Balanced openings, single thread, 3min+1sec. The strength of the 10 opponents fits very well: right now, Patricia 2.0 has an overall score of 48.5% against all 10 opponents.
I used the new version of the EAS-Tool, where the limit for short-win EAS points is hardcoded to 60 (the same value the EAS-Tool calculates for my UHO-Top15 Ratinglist games, which makes a comparison a little more reliable).
Here is the result:

                                 bad  avg.win 
Rank  EAS-Score  sacs   shorts  draws  moves  Engine/player 
-------------------------------------------------------------------
   1    317753  41.01%  35.87%  04.46%   68   Patricia 2.0  
   2    157590  09.68%  57.26%  33.00%   59   Critter 1.6a  
   3    142925  03.00%  54.00%  31.53%   60   Rybka 4.1  
   4    142073  05.62%  59.55%  31.82%   60   Laser 1.5  
   5    134347  03.08%  56.15%  29.00%   62   Andscacs 0.88  
   6    132193  02.54%  43.22%  25.96%   63   Komodo 5  
   7    101786  03.60%  45.95%  29.63%   63   Texel 1.7  
   8     90665  08.74%  42.72%  37.27%   64   Hannibal 1.7  
   9     87798  02.46%  44.26%  40.57%   66   Houdini 1.5a  
  10     75949  01.32%  31.58%  23.01%   71   Princhess 0.16  
  11     65072  02.61%  39.13%  45.61%   65   Nirvanachess 2.4

The opponents (which are all at the same strength level as Patricia 2.0) have such high short-win percentages because Patricia 2.0 loses a lot of games quickly: it plays so riskily and sacrifices so often that a game is lost fast if something goes wrong. So this definitely makes sense.

317753 is a really great EAS score for Patricia 2.0. No other engine (except OpenTal, but that engine has only 2300 Elo and just plays crazy) has ever achieved more than 300000 EAS points. Velvet 4.1 and Komodo 14.1 aggressive had around 280000 EAS points in my old SPCC Ratinglist. And no engine ever had more than 38% sacs, so Patricia 2.0's 41% is just fantastic. And no other engine ever had a bad-draw ratio below 8%, so Patricia 2.0's 4.46% bad draws is just outstanding.

Whiskers · Post by **Whiskers** » Fri Apr 05, 2024 8:14 pm

I really appreciate your new addition to the tool! I will definitely use it, now I can easily test with gauntlets rather than with round robins.

I’m somewhat surprised that a sac rate of over 40%, a short-win rate of 35%, and a bad-draw rate of under 5% doesn’t score more than it does. I got similar individual metrics for my test that gave a 425k EAS score. I suppose with the longer TC most of the sacrifices are of less value and the short games are often longer.

Damas Clásicas · Post by **Damas Clásicas** » Sat Apr 06, 2024 9:10 pm

pohl4711 wrote: ↑Fri Apr 05, 2024 5:17 pm Here is the EAS score of Patricia 2.0 after 3100 of 10000 games in my test run vs. 10 engines (strength around Rybka 4.1). Balanced openings, single thread, 3min+1sec. The strength of the 10 opponents fits very well: right now, Patricia 2.0 has an overall score of 48.5% against all 10 opponents.

Nice info! I have been following your page for a long time.

BTW, didn't you have any crashes with Patricia 2.0?

pohl4711 · Post by **pohl4711** » Sun Apr 07, 2024 6:28 am

Damas Clásicas wrote: ↑Sat Apr 06, 2024 9:10 pm
Nice info! I have been following your page for a long time.

BTW, didn't you have any crashes with Patricia 2.0?

No. Right now, 6700 of the 10000 games of my Patricia 2.0 test run are done and all games are OK. (I made a small batch tool that quickly searches PGN files of games played with cutechess for time losses, disconnects or illegal moves, and it found no problems in the games played so far.)
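
A check of this kind can be sketched in a few lines of Python. This is not the batch tool mentioned above, only an illustration of the idea. It assumes that cutechess marks abnormal endings in the PGN `Termination` tag with the values listed in `SUSPICIOUS_TERMINATIONS`; please verify those strings against your own PGN files. The function names are hypothetical.

```python
import re

# ASSUMPTION: Termination tag values that cutechess writes for time losses,
# disconnects, illegal moves and stalled engines. Check your own PGN output.
SUSPICIOUS_TERMINATIONS = {
    "time forfeit",
    "abandoned",
    "illegal move",
    "stalled connection",
}

# Matches one PGN tag line such as: [White "Patricia 2.0"]
TAG_LINE = re.compile(r'\[(\w+) "(.*)"\]$')

# Tiny demo input: the second game ends with a time loss.
SAMPLE_PGN = """[Event "demo"]
[White "A"]
[Black "B"]
[Result "1-0"]

1. e4 e5 2. Qh5 Ke7 3. Qxe5# 1-0

[Event "demo"]
[White "B"]
[Black "A"]
[Result "0-1"]
[Termination "time forfeit"]

1. d4 d5 2. c4 {White loses on time} 0-1
"""


def read_tag_sections(pgn_text):
    """Return one dict of tag name -> value per game, in file order."""
    games = []
    for line in pgn_text.splitlines():
        match = TAG_LINE.match(line.strip())
        if not match:
            continue  # movetext or blank line
        name, value = match.groups()
        if name == "Event":
            games.append({})  # an Event tag starts a new game
        if games:
            games[-1][name] = value
    return games


def find_problem_games(pgn_text):
    """Return (game_number, white, black, termination) for suspicious games."""
    problems = []
    for number, tags in enumerate(read_tag_sections(pgn_text), start=1):
        termination = tags.get("Termination", "")
        if termination in SUSPICIOUS_TERMINATIONS:
            problems.append(
                (number, tags.get("White", "?"), tags.get("Black", "?"), termination)
            )
    return problems


problems = find_problem_games(SAMPLE_PGN)
for number, white, black, termination in problems:
    print(f"game {number}: {white} vs {black}: {termination}")
```

An empty result list means that none of the assumed markers was found, which is the "no problems" case described above.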

I use the PC that I normally use for Lc0 testing. It has a Ryzen 7 6800H 8-core CPU. I run 12 games simultaneously in cutechess-cli, at 3min+1sec with 256MB hash.
All works fine here.

Werewolf · Post by **Werewolf** » Wed May 01, 2024 3:53 pm

pohl4711 wrote: ↑Fri Apr 05, 2024 1:27 pm Here is the EAS-Tool with a hardcoded short-win move limit. The limit is hardcoded to 60 moves, but you can change the value in the first lines of the source (.bat file). This version is in the "for_engine_developers" folder.

Have you tested Lc0 BT4? I can't see it on your list.

pohl4711 · Post by **pohl4711** » Wed May 01, 2024 7:19 pm

https://www.sp-cc.de/nn-vs-sf-testing.htm

And please read this:
https://www.sp-cc.de/files/lc0_uholist_explanation.txt
