patricia devlog

lithander · Post by **lithander** » Mon Feb 12, 2024 5:47 pm

Before releasing Leorik 3 I did a mini gauntlet with a few opponents and this is the EAS score when I stuff the PGN in the EAS tool:

Rank  EAS-Score   wins  moves   sacs    sacsQ    sacs5+   sacs4    sacs3    sacs2    sacs1    all shorts short40  short45  short50  short55  short60   draws    Engine/player
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   1    188464      900   69   20.11% =[00.00% + 00.44% + 00.56% + 01.78% + 05.11% + 12.22%]    33.78% = [07.89% + 05.56% + 05.11% + 07.11% + 08.11%]  16.92%   Leorik-3.0  
   2     71251      167   83   07.19% =[00.00% + 00.60% + 00.00% + 00.00% + 00.60% + 05.99%]    11.38% = [01.20% + 00.60% + 01.20% + 01.80% + 06.59%]  18.03%   Nalwald-18  
   3     51066      245   85   09.80% =[00.00% + 00.41% + 00.00% + 00.41% + 01.63% + 07.35%]    05.71% = [00.00% + 00.00% + 00.00% + 01.22% + 04.49%]  21.67%   frozenight-6  
   4     43004      160   89   05.00% =[00.00% + 00.63% + 00.00% + 00.00% + 01.25% + 03.13%]    06.88% = [00.00% + 00.63% + 01.25% + 02.50% + 02.50%]  25.00%   StockNemo-5.7  
   5     27105      227   91   02.64% =[00.00% + 00.00% + 00.00% + 00.44% + 00.00% + 02.20%]    06.17% = [00.44% + 00.44% + 00.00% + 00.44% + 04.85%]  26.98%   zahak-10  
   6     22320       46   84   00.00% =[00.00% + 00.00% + 00.00% + 00.00% + 00.00% + 00.00%]    13.04% = [00.00% + 00.00% + 00.00% + 08.70% + 04.35%]  32.17%   PeSTO

Based on the score, comparing it to https://www.sp-cc.de/eas-ratinglist.htm Leorik should be one of the most aggressive engines ever without even trying? But I doubt that's really the case.

I don't understand how the EAS tool works exactly but I think what it means is that you need to have a wide range of engines that are both stronger and weaker before you can draw meaningful conclusions.

Whiskers · Post by **Whiskers** » Mon Feb 12, 2024 9:53 pm

lithander wrote: ↑Mon Feb 12, 2024 5:47 pm Before releasing Leorik 3 I did a mini gauntlet with a few opponents and this is the EAS score when I stuff the PGN in the EAS tool:

Based on the score, comparing it to https://www.sp-cc.de/eas-ratinglist.htm Leorik should be one of the most aggressive engines ever without even trying? But I doubt that's really the case.

I don't understand how the EAS tool works exactly but I think what it means is that you need to have a wide range of engines that are both stronger and weaker before you can draw meaningful conclusions.

Those results look a bit odd. Leorik sacrificing roughly twice as often as any of its opponents and winning way faster on average? And how did PeSTO not play a single sacrifice? The raw data clearly suggests Leorik is really aggressive. Maybe a couple of its opponents are vulnerable to tactical shots? Thanks for the info; it gives me something to think about.

Whiskers · Post by **Whiskers** » Mon Feb 12, 2024 9:55 pm

Leorik winning 8% of its games in under 40 moves seems to suggest there were one or two hopeless engines that Leorik crushed over and over again, while the other engines couldn’t enjoy that benefit because they only played Leorik.
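One way to check this theory would be to break Leorik's short wins down by opponent. The following is only a minimal sketch: it assumes a standard PGN with `White`, `Black` and `Result` tags, treats a "short win" as a win whose last move number is at most 40 (which may not match the EAS tool's own definition), and the function name `short_wins_by_opponent` is made up for illustration.

```python
import re
from collections import Counter

HEADER = re.compile(r'\[(\w+) "([^"]*)"\]')
MOVE_NUMBER = re.compile(r'(\d+)\.')

def short_wins_by_opponent(pgn_text, engine, max_moves=40):
    """Count wins by `engine` that ended within `max_moves` moves, per opponent."""
    counts = Counter()
    # Assumption: every game starts with an [Event "..."] tag.
    for game in re.split(r'(?=\[Event )', pgn_text):
        tags = dict(HEADER.findall(game))
        if not tags:
            continue
        white, black = tags.get("White"), tags.get("Black")
        result = tags.get("Result")
        if engine == white and result == "1-0":
            opponent = black
        elif engine == black and result == "0-1":
            opponent = white
        else:
            continue  # not a win for `engine`
        # Keep only the movetext and drop {...} comments (evals, times).
        movetext = "\n".join(
            line for line in game.splitlines() if not line.startswith("[")
        )
        movetext = re.sub(r'\{[^}]*\}', '', movetext)
        numbers = [int(n) for n in MOVE_NUMBER.findall(movetext)]
        if numbers and max(numbers) <= max_moves:
            counts[opponent] += 1
    return counts
```

If most of the short wins came against a single opponent, that would support the idea that one weak engine is inflating the score.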

lithander · Post by **lithander** » Mon Feb 12, 2024 10:31 pm

Whiskers wrote: ↑Mon Feb 12, 2024 9:55 pm Leorik winning 8% of its games in under 40 moves seems to suggest there were one or two hopeless engines that Leorik crushed over and over again, while the other engines couldn’t enjoy that benefit because they only played Leorik.

For context, here's how the match was set up:

./cutechess-cli.exe -engine conf="Leorik-3.0" -engine conf="frozenight-6" -engine conf="PeSTO" -engine conf="StockNemo-5.7" -engine conf="Nalwald-18" -engine conf="zahak-10" -each tc=40/30 book=varied.bin option.Hash=32 -pgnout leorik3_gauntlet1_30per40.pgn -rounds 1000 -games 2 -repeat -concurrency 7 -tournament gauntlet

Anchoring the engines to their CCRL ratings shows that the only really hopeless engine was PeSTO. Maybe PeSTO was providing Leorik with all the opportunities to play short, aggressive wins. That's a good theory!

 .\ordo-win64.exe -p .\leorik3_gauntlet1_30per40.pgn -m anchors_3.0.txt
   # PLAYER           :  RATING  POINTS  PLAYED   (%)
   1 frozenight-6     :  3374.0   406.5     626    65
   2 zahak-10         :  3342.0   366.0     625    59
   3 Leorik-3.0       :  3289.4  1591.5    3128    51
   4 Nalwald-18       :  3289.0   344.5     625    55
   5 StockNemo-5.7    :  3287.0   316.0     626    50
   6 PeSTO            :  3122.0   103.5     626    17
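The Ordo numbers also show how much of Leorik's score depends on PeSTO. The sketch below only uses the POINTS and PLAYED columns from the table above, and assumes that every game in the gauntlet involved Leorik, so each opponent's lost points are points scored by Leorik.

```python
# (points, games played) per opponent, copied from the Ordo output above
opponents = {
    "frozenight-6": (406.5, 626),
    "zahak-10": (366.0, 625),
    "Nalwald-18": (344.5, 625),
    "StockNemo-5.7": (316.0, 626),
    "PeSTO": (103.5, 626),
}

# In a gauntlet every game has Leorik on one side, so the totals must match
# Leorik's own row in the table (1591.5 points from 3128 games).
leorik_games = sum(played for _, played in opponents.values())
leorik_points = sum(played - points for points, played in opponents.values())

pesto_points, pesto_games = opponents["PeSTO"]
vs_pesto = pesto_games - pesto_points  # points Leorik scored against PeSTO

print(f"Leorik total: {leorik_points}/{leorik_games}")
print(f"Leorik vs PeSTO: {vs_pesto}/{pesto_games} = {vs_pesto / pesto_games:.1%}")
print(f"Share of Leorik's points from PeSTO: {vs_pesto / leorik_points:.1%}")
```

PeSTO accounts for about a fifth of Leorik's games but roughly a third of its points, while none of the other engines got to play PeSTO at all.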

Guenther · Post by **Guenther** » Tue Feb 13, 2024 1:04 pm

lithander wrote: ↑Mon Feb 12, 2024 5:47 pm Before releasing Leorik 3 I did a mini gauntlet with a few opponents and this is the EAS score when I stuff the PGN in the EAS tool:

Based on the score, comparing it to https://www.sp-cc.de/eas-ratinglist.htm Leorik should be one of the most aggressive engines ever without even trying? But I doubt that's really the case. I don't understand how the EAS tool works exactly but I think what it means is that you need to have a wide range of engines that are both stronger and weaker before you can draw meaningful conclusions.

So far I have never tried this tool, but it seems obvious that it will only show meaningful results in matches or full round-robin tournaments (i.e. not in gauntlets, and not in tournaments where players have different numbers of games), provided its calculations are correct in the first place.

@ Adam
Thanks for the dev log of your new 'Patricia' - will enjoy your posts about it, as I did and still do with Thomas' Leorik :)

Whiskers · Post by **Whiskers** » Thu Feb 15, 2024 4:26 pm

I added RFP, LMR, NMP, and history to Patricia; she's now at around 2900 strength already. She also remains more aggressive than the corresponding Willow version:

Rank  EAS-Score  sacs   shorts  draws  moves  Engine/player 
-------------------------------------------------------------------
   1    118857  15.97%  23.00%  16.59%   71   Patricia 0.1  
   2     70876  11.84%  24.49%  24.89%   70   Willow 2.8

It's time to work a little bit more on Patricia's aggressiveness. This time around, I plan to retrain my network at a lower learning rate on "aggressive" data filtered from my Willow dataset. This should keep most of the knowledge that the net already has, just slanting it a little bit towards the new positions I'm adding in; I expect this new net to be slightly worse at regular chess, but much better at aggressive chess, which is what I want.
The only problem is that my data is all in binpack format, so I need to write a converter.

chesskobra · Post by **chesskobra** » Thu Feb 15, 2024 10:45 pm

Interesting project, I will be following. Do the position filters work on epd files or pgn files? Do they compile on linux?

pohl4711 · Post by **pohl4711** » Fri Feb 16, 2024 7:43 am

Whiskers wrote: ↑Thu Feb 15, 2024 4:26 pm I added RFP, LMR, NMP, and history to Patricia; she's now at around 2900 strength already. She also remains more aggressive than the corresponding Willow version:
It's time to work a little bit more on Patricia's aggressiveness. This time around, I plan to retrain my network on "aggressive" data filtered from my Willow dataset at lower LR. This should keep most of the knowledge that the net already has, just slanting it a little bit towards the new positions I'm adding in; I expect this new net to be slightly worse at regular chess, but much better at aggressive chess, which is what I want.
The only problem is that my data is all in binpack format, so I need to write a converter.

Great! Can't wait to see the results.

Whiskers · Post by **Whiskers** » Fri Feb 16, 2024 11:53 pm

chesskobra wrote: ↑Thu Feb 15, 2024 10:45 pm Interesting project, I will be following. Do the position filters work on epd files or pgn files? Do they compile on linux?

They work on text files that contain FEN lines. I did this because I have a lot of Willow data lying around in that format (that I convert to binpacks for NNUE training). It would not be very difficult to change it to work with PGNs, but I don't see the point for my purposes right now. If demand is high enough I'll generalize the position filter programs and make them easy to build too.
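To give a rough idea of the shape of such a filter, here is a minimal sketch that reads FEN lines and keeps the ones matching a predicate. It is not the actual Patricia filter: the criterion used here (the side to move is at least two pawns down in material, a crude proxy for "has sacrificed something") and the names `material_balance`, `is_interesting` and `filter_fens` are purely hypothetical stand-ins.

```python
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}

def material_balance(fen):
    """White material minus black material, in pawns, from the FEN board field."""
    board = fen.split()[0]
    balance = 0
    for ch in board:
        value = PIECE_VALUES.get(ch.lower())
        if value is not None:  # skips digits, '/', and kings
            balance += value if ch.isupper() else -value
    return balance

def is_interesting(fen, min_deficit=2):
    """Hypothetical criterion: side to move is at least `min_deficit` pawns down."""
    side_to_move = fen.split()[1]
    balance = material_balance(fen)
    if side_to_move == "b":
        balance = -balance  # view the balance from the side to move
    return balance <= -min_deficit

def filter_fens(lines, predicate=is_interesting):
    """Yield the non-empty lines whose FEN satisfies `predicate`."""
    for line in lines:
        line = line.strip()
        if line and predicate(line):
            yield line
```

Because only the first two FEN fields are read, lines that carry extra data after the FEN (such as a score or result) would pass through unchanged. Swapping in a different predicate is all it takes to try another definition of an "aggressive" position.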

chesskobra · Post by **chesskobra** » Fri Feb 16, 2024 11:59 pm

Of course, I am not asking you to generalise them. I was only curious whether such programs could be used to extract interesting positions from game databases for human training.
