Question for Stefan about his EAS tool

Rebel
Posts: 7307
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Post by Rebel »

It's about the "bad draws" calculation, from the README -

3) Bad draws: Bad draws are games, which were drawn before endgame (material check is done, the
number of played moves does not matter) and draws after the engine had a material advantage of
at least 1 pawn during a game, because the engine should win a game, if material was won. All
these bad draws are finally checked for a material disadvantage of at least 1 pawn: Because draws
with material disadvantage prevented a possible loss and so, these games are no bad draws and are
not counted.


Unless we have a language misunderstanding, the assumption highlighted in red is wrong.

A pawn up or down often doesn't say much. Think of passed pawns in the endgame: two connected passed pawns on the 6th rank may easily receive a 5-pawn bonus from the evaluation.

But maybe you can explain.
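
To make the rule under discussion concrete, here is a rough sketch of the README's "bad draws" check as quoted above. It is not the EAS tool's actual code. The piece values, the `ENDGAME_LIMIT` threshold and all function names are assumptions made for illustration, and the sketch ignores temporary imbalances in the middle of an exchange.

```python
# Sketch of the README's "bad draw" rule, using only material counts from FEN strings.
# Assumptions (not from the EAS tool): piece values, ENDGAME_LIMIT, function names.

PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}  # assumed values, in pawns
ENDGAME_LIMIT = 26  # hypothetical: at most this much non-pawn material means "endgame"


def material_balance(fen: str) -> int:
    """White material minus Black material, in pawns, from the FEN board field."""
    balance = 0
    for ch in fen.split()[0]:
        value = PIECE_VALUES.get(ch.lower())
        if value is None:
            continue  # digits, slashes and kings carry no material value here
        balance += value if ch.isupper() else -value
    return balance


def non_pawn_material(fen: str) -> int:
    """Total non-pawn material of both sides, in pawns."""
    return sum(
        PIECE_VALUES[ch.lower()]
        for ch in fen.split()[0]
        if ch.lower() in "nbrq"
    )


def is_bad_draw(fens: list[str], engine_is_white: bool, margin: int = 1) -> bool:
    """Classify a DRAWN game, given the FEN after every move.

    Bad draw: drawn before the endgame, or drawn after the engine was at least
    `margin` pawns ahead. Not counted if the engine was ever `margin` pawns behind.
    """
    if not fens:
        raise ValueError("need at least one position")
    sign = 1 if engine_is_white else -1
    balances = [sign * material_balance(fen) for fen in fens]
    drawn_before_endgame = non_pawn_material(fens[-1]) > ENDGAME_LIMIT
    was_ahead = max(balances) >= margin
    was_behind = min(balances) <= -margin
    return (drawn_before_endgame or was_ahead) and not was_behind


if __name__ == "__main__":
    pawn_up = ["4k3/8/8/8/8/4P3/4K3/8 w - - 0 1"]  # White: king and pawn vs. bare king
    print(is_bad_draw(pawn_up, engine_is_white=True))   # engine was a pawn up
    print(is_bad_draw(pawn_up, engine_is_white=False))  # engine was a pawn down
```

Note that a check like this only sees the material count; it knows nothing about pawn structure or passed pawns, which is exactly the objection raised above.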
90% of coding is debugging, the other 10% is writing bugs.
RubiChess
Posts: 642
Joined: Fri Mar 30, 2018 7:20 am
Full name: Andreas Matthies

Re: Question for Stefan about his EAS tool

Post by RubiChess »

EAS is like horse races that force the horses to run on two legs because humans do it.
My 2 ct.

Regards, Andreas
pohl4711
Posts: 2723
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: Question for Stefan about his EAS tool

Post by pohl4711 »

Rebel wrote: Wed Jul 31, 2024 7:15 am It's about the "bad draws" calculation, from the README -

3) Bad draws: Bad draws are games, which were drawn before endgame (material check is done, the
number of played moves does not matter) and draws after the engine had a material advantage of
at least 1 pawn during a game, because the engine should win a game, if material was won. All
these bad draws are finally checked for a material disadvantage of at least 1 pawn: Because draws
with material disadvantage prevented a possible loss and so, these games are no bad draws and are
not counted.


Unless we have a language misunderstanding, the assumption highlighted in red is wrong.

A pawn up or down often doesn't say much. Think of passed pawns in the endgame: two connected passed pawns on the 6th rank may easily receive a 5-pawn bonus from the evaluation.

But maybe you can explain.
I doubt that this is wrong. I know one extra pawn is not that much, but superhumanly strong engines should win once they have won material in a game. Of course, sometimes this is not enough to win: there are endgames that are drawish even with an extra knight or bishop.
But the point is not that something in the EAS-Tool can fail from time to time (failures can happen in the sacrifice filtering, too; you mentioned a wrongly detected queen sac here on talkchess yourself). The point is that it works overall when considering a huge number of games (and it must run fast; a slower but better solution is no alternative).
If you look at my EAS-Ratinglist or the Patricia EAS calculations, the "bad draws" concept works fine (Patricia's bad-draw values are 5% and below, while the less aggressively playing engines have bad-draw values of up to 26% in my test runs). So for me it is important that a concept works, even if it sounds a little strange and fails from time to time.
Last edited by pohl4711 on Wed Jul 31, 2024 9:03 am, edited 2 times in total.
pohl4711
Posts: 2723
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: Question for Stefan about his EAS tool

Post by pohl4711 »

RubiChess wrote: Wed Jul 31, 2024 8:41 am EAS is like horse races that force the horses to run on two legs because humans do it.
My 2 ct.

Regards, Andreas
And what is wrong with that? The EAS-Tool ranks engines by their aggressiveness, and of course aggressiveness is measured on a "human scale", because humans watch the games and the engines. So the human perspective on aggressive play is the only way to measure it:
EAS-Score = Aggressiveness of the engine = Entertainment factor of the engine (for humans watching or playing the engine!!!)
RubiChess
Posts: 642
Joined: Fri Mar 30, 2018 7:20 am
Full name: Andreas Matthies

Re: Question for Stefan about his EAS tool

Post by RubiChess »

pohl4711 wrote: Wed Jul 31, 2024 8:58 am
RubiChess wrote: Wed Jul 31, 2024 8:41 am EAS is like horse races that force the horses to run on two legs because humans do it.
My 2 ct.

Regards, Andreas
And what is wrong with that? The EAS-Tool ranks engines by their aggressiveness, and of course aggressiveness is measured on a "human scale", because humans watch the games and the engines. So the human perspective on aggressive play is the only way to measure it:
EAS-Score = Aggressiveness of the engine = Entertainment factor of the engine (for humans watching or playing the engine!!!)
Nothing wrong, just my 2ct.
Rebel
Posts: 7307
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Question for Stefan about his EAS tool

Post by Rebel »

There are too many exceptions; we are speaking of the endgame:

1. Endings with isolated doubled pawns hardly count as a material advantage.

2. Endings with a one-pawn advantage on one flank of the board are draws most of the time.

3. Opposite-colored bishop endings: often even being 2 pawns up is not enough to win.

4. Q vs RN, Q vs RB and Q vs RR are draws most of the time.

5. R vs BP: with a healthy pawn the game is a draw, and often R vs NP too.

In my MRI tool I use a simple condition: if both sides agree that the score is above a margin and the game ends in a draw, then the side with the positive score probably missed a win. The margin is flexible, say 2.50 or 3.0.

I doubt pgn-extract is capable of handling scores, but I can write a stand-alone utility for you that you can call from your batch file and that passes the information on.
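
The score-margin condition described above could look roughly like the following sketch. This is not the MRI tool's code: the function name, the input format (one list of evaluations per engine, in pawns, each from that engine's own point of view and paired by move number) and the result strings are assumptions made for illustration.

```python
# Sketch of a "missed win" check based on engine scores instead of material.
# Assumptions (not from the MRI tool): function name, input format, default margin.
from typing import Optional, Sequence


def missed_win(
    white_scores: Sequence[float],
    black_scores: Sequence[float],
    result: str,
    margin: float = 2.5,
) -> Optional[str]:
    """Return "white" or "black" if that side probably missed a win, else None.

    Each score is the engine's own evaluation in pawns from its own point of
    view, so "both sides agree" means one score is >= margin while the
    opponent's score for the same move number is <= -margin.
    """
    if result != "1/2-1/2":
        return None  # only drawn games can contain a missed win in this sense
    for white, black in zip(white_scores, black_scores):
        if white >= margin and black <= -margin:
            return "white"
        if black >= margin and white <= -margin:
            return "black"
    return None


if __name__ == "__main__":
    white = [0.3, 2.8, 3.1, 0.0]
    black = [-0.2, -2.6, -3.0, 0.0]
    print(missed_win(white, black, "1/2-1/2"))
```

Requiring agreement from both engines filters out cases where only one side's evaluation is optimistic, such as the fortress-like endings listed above.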
chrisw
Posts: 4626
Joined: Tue Apr 03, 2012 4:28 pm
Location: Midi-Pyrénées
Full name: Christopher Whittington

Re: Question for Stefan about his EAS tool

Post by chrisw »

There's not much point in complaining about it. The EAS Tool is written using other "tools" and is limited by what those tools can do. It's like being written in a high-level language where not a lot is possible, and that limits its capabilities. If you want to change things, you'll need to write the code for an EAS(new) using, let's say, Python and Python Chess, and trap all the events you refer to.
Somebody could just go it alone and publish the source on GitHub as a community resource, or collaborate with Stefan and use his general algorithm and weights.
Rebel
Posts: 7307
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Question for Stefan about his EAS tool

Post by Rebel »

Indeed, C++ or Python would be best. Do I hear you volunteering? :wink:

Meanwhile, it's possible to improve what is available at the moment, which is already good. I wasn't complaining.
pohl4711
Posts: 2723
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: Question for Stefan about his EAS tool

Post by pohl4711 »

Rebel wrote: Wed Jul 31, 2024 7:54 pm There are too many exceptions; we are speaking of the endgame:

1. Endings with isolated doubled pawns hardly count as a material advantage.

2. Endings with a one-pawn advantage on one flank of the board are draws most of the time.

3. Opposite-colored bishop endings: often even being 2 pawns up is not enough to win.

4. Q vs RN, Q vs RB and Q vs RR are draws most of the time.

5. R vs BP: with a healthy pawn the game is a draw, and often R vs NP too.

In my MRI tool I use a simple condition: if both sides agree that the score is above a margin and the game ends in a draw, then the side with the positive score probably missed a win. The margin is flexible, say 2.50 or 3.0.

I doubt pgn-extract is capable of handling scores, but I can write a stand-alone utility for you that you can call from your batch file and that passes the information on.
I tried the EAS calculation on the games of my UHO-Top15 rating list with a new bad-draw filter: a draw now counts as a bad draw after a material advantage of 2 pawns instead of 1 pawn:

New (2 pawns):

                                 bad  avg.win 
Rank  EAS-Score  sacs   shorts  draws  moves  Engine/player 
-------------------------------------------------------------------
   1    211903  22.95%  20.21%  04.89%   72   Stockfish 16.1 240224  
   2    206312  21.22%  18.65%  04.88%   73   Stockfish 240719 avx2  
   3    166964  13.02%  19.71%  06.90%   72   Torch 3 popavx2  
   4    165200  16.71%  13.25%  06.86%   80   Lizard 10.5 avx2  
   5    145303  11.75%  15.92%  06.49%   77   Obsidian 13.0 avx2  
   6    141636  13.40%  11.93%  06.99%   78   Clover 7.0 avx2  
   7    137832  15.47%  16.99%  09.32%   74   KomodoDragon 3.3 avx2  
   8    131777  11.29%  11.24%  06.77%   80   PlentyChess 2.1 avx2  
   9    131663  15.11%  14.57%  08.53%   76   RubiChess 240112 avx2  
  10    128598  15.30%  08.90%  07.48%   84   Ethereal 14.38 avx2  
  11    117698  10.35%  09.36%  08.05%   81   Alexandria 7.0 avx2  
  12    117505  09.72%  08.52%  07.50%   80   Berserk 13 avx2  
  13    113101  13.80%  05.72%  09.57%   86   Titan 1.1 avx2  
  14    105180  10.84%  09.91%  10.58%   80   Caissa 1.19 avx2  
  15    102749  10.66%  08.77%  10.11%   79   Viridithas 13.0 avx2  
  16     86883  07.23%  12.21%  13.11%   77   Seer 2.8.0 avx2  
-------------------------------------------------------------------
*** Average length of all won games:     77 moves

Here is the result using the normal EAS-Tool (1 pawn):

                                 bad  avg.win 
Rank  EAS-Score  sacs   shorts  draws  moves  Engine/player 
-------------------------------------------------------------------
   1    179620  22.95%  20.21%  09.55%   72   Stockfish 16.1 240224  
   2    174029  21.22%  18.65%  09.29%   73   Stockfish 240719 avx2  
   3    123948  13.02%  19.71%  14.10%   72   Torch 3 popavx2  
   4    114143  16.71%  13.25%  16.24%   80   Lizard 10.5 avx2  
   5     94767  15.47%  16.99%  19.11%   74   KomodoDragon 3.3 avx2  
   6     90662  11.75%  15.92%  17.04%   77   Obsidian 13.0 avx2  
   7     86995  13.40%  11.93%  17.97%   78   Clover 7.0 avx2  
   8     84905  11.29%  11.24%  15.85%   80   PlentyChess 2.1 avx2  
   9     82678  15.11%  14.57%  19.11%   76   RubiChess 240112 avx2  
  10     80277  15.30%  08.90%  17.45%   84   Ethereal 14.38 avx2  
  11     68713  10.35%  09.36%  19.54%   81   Alexandria 7.0 avx2  
  12     67229  13.80%  05.72%  20.07%   86   Titan 1.1 avx2  
  13     62400  09.72%  08.52%  19.45%   80   Berserk 13 avx2  
  14     59509  10.66%  08.77%  21.84%   79   Viridithas 13.0 avx2  
  15     59343  10.84%  09.91%  22.72%   80   Caissa 1.19 avx2  
  16     46392  07.23%  12.21%  26.70%   77   Seer 2.8.0 avx2  
-------------------------------------------------------------------
*** Average length of all won games:     77 moves

You can see that the ranking is nearly the same (that is what the EAS-Tool is about). Only the bad-draw percentages change: with the 2-pawn rule they are roughly half of the original values. Komodo lost 2 ranks, but in Komodo's range the EAS results are very close (EAS-Score differences below 10000 points between engines are mostly random).
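
The comparison can be checked directly from the two tables above. The following sketch only re-enters the rank and bad-draw columns from those tables and computes, per engine, the ratio of the two bad-draw percentages and the rank shift; the variable names are my own.

```python
# Compare the two tables above: (engine, rank and bad-draw % with the 2-pawn rule,
# rank and bad-draw % with the normal 1-pawn rule). Values copied from the tables.
ROWS = [
    ("Stockfish 16.1 240224", 1, 4.89, 1, 9.55),
    ("Stockfish 240719 avx2", 2, 4.88, 2, 9.29),
    ("Torch 3 popavx2", 3, 6.90, 3, 14.10),
    ("Lizard 10.5 avx2", 4, 6.86, 4, 16.24),
    ("Obsidian 13.0 avx2", 5, 6.49, 6, 17.04),
    ("Clover 7.0 avx2", 6, 6.99, 7, 17.97),
    ("KomodoDragon 3.3 avx2", 7, 9.32, 5, 19.11),
    ("PlentyChess 2.1 avx2", 8, 6.77, 8, 15.85),
    ("RubiChess 240112 avx2", 9, 8.53, 9, 19.11),
    ("Ethereal 14.38 avx2", 10, 7.48, 10, 17.45),
    ("Alexandria 7.0 avx2", 11, 8.05, 11, 19.54),
    ("Berserk 13 avx2", 12, 7.50, 13, 19.45),
    ("Titan 1.1 avx2", 13, 9.57, 12, 20.07),
    ("Caissa 1.19 avx2", 14, 10.58, 15, 22.72),
    ("Viridithas 13.0 avx2", 15, 10.11, 14, 21.84),
    ("Seer 2.8.0 avx2", 16, 13.11, 16, 26.70),
]

# Ratio of bad-draw percentages (2-pawn rule divided by 1-pawn rule).
ratios = {name: bad2 / bad1 for name, _, bad2, _, bad1 in ROWS}
# Positive shift: the engine ranks lower (worse) under the 2-pawn rule.
shifts = {name: rank2 - rank1 for name, rank2, _, rank1, _ in ROWS}

if __name__ == "__main__":
    for name, *_ in ROWS:
        print(f"{name:24s} ratio {ratios[name]:.2f}  rank shift {shifts[name]:+d}")
    print(f"ratio range: {min(ratios.values()):.2f} to {max(ratios.values()):.2f}")
    print(f"largest rank shift: {max(abs(s) for s in shifts.values())}")
```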
As I said: I know there are false detections in the EAS-Tool; that will never change. But the failures are statistically evenly distributed, so this (nearly) does not matter. And a higher bad-draw value is much better for the calculation, because really aggressive engines like Patricia could otherwise get a bad-draw value close to zero. Because the scoring system is exponential, that is bad for stable scoring (internally, the EAS-Tool calculates the EAS points for bad draws from the percentage of good draws, that is, the bad-draw percentage subtracted from 100%). So any number close to 0% or 100% is a bad thing when calculating exponentially (the same problem appears in normal Elo calculations when head-to-head results between engines are close to 100% or 0%). So, for practical reasons, the normal EAS-Tool bad-draw detection is better.
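
The Elo analogy can be illustrated with the standard logistic Elo formula. This sketch shows only that analogy, not the EAS-Tool's internal scoring (which is not given in this thread); the function name is my own.

```python
# Why percentages near 0% or 100% are unstable: the standard logistic Elo model
# maps a score fraction p to a rating difference D = -400 * log10(1/p - 1).
import math


def elo_diff(score: float) -> float:
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    # The same small change in percentage moves the result far more near 100%.
    for pct in (50.0, 75.0, 90.0, 99.0, 99.9):
        print(f"{pct:5.1f}% -> {elo_diff(pct / 100.0):7.1f} Elo")
```

Near the edges a tiny change in the percentage, for example one extra drawn game, produces a large jump in the result, which is why values well away from 0% and 100% give more stable numbers.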
pohl4711
Posts: 2723
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: Question for Stefan about his EAS tool

Post by pohl4711 »

chrisw wrote: Wed Jul 31, 2024 8:19 pm There's not much point in complaining about it. The EAS Tool is written using other "tools" and is limited by what those tools can do. It's like being written in a high-level language where not a lot is possible, and that limits its capabilities. If you want to change things, you'll need to write the code for an EAS(new) using, let's say, Python and Python Chess, and trap all the events you refer to.
That is correct... If somebody wants to do so, that is fine with me. If I can help (explaining my code or the EAS scoring system or so), I will...