Ratinglist-testrun of Berserk 10 finished.
https://www.sp-cc.de
Also take a look at the EAS-Ratinglist, the world's first engine rating list that measures not the strength of engines but their style of play:
https://www.sp-cc.de/eas-ratinglist.htm
(You may have to clear your browser cache (press <CTRL>+<SHIFT>+<DEL>) or reload the website.)
SPCC: Testrun of Berserk 10 finished
Moderator: Ras
-
- Posts: 2732
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
-
- Posts: 1632
- Joined: Tue Aug 21, 2018 7:52 pm
- Full name: Dietrich Kappe
Re: SPCC: Testrun of Berserk 10 finished
At 3657 up from Berserk 9 at 3644. Seems there is a bug in 10, however.
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
-
- Posts: 122
- Joined: Wed Feb 17, 2021 3:16 pm
- Full name: Jay Honnold
Re: SPCC: Testrun of Berserk 10 finished
With regards to your EAS rating list, have you checked that the EAS-Score doesn't vary heavily when playing engines against different pools? Berserk is always at the bottom of your EAS list and I'm curious if it has to do with the engine or the pool.
Looking at this pool, Berserk has a pretty poor score of 43%, mostly due to 4000 games being against engines 100+ Elo stronger than it. If you replaced the two Stockfish dev versions with SF HCE and Rubichess, does Berserk's EAS-Score rise? If you played Berserk only vs a pool of engines where it is favored, does the EAS-Score rise?
6 Berserk 10 avx2 : 3657 9000 (+775,=6221,-2004), 43.2 %
Stockfish 221004 avx2 : 1000 (+ 2,=566,-432), 28.5 %
Stockfish 220927 avx2 : 1000 (+ 1,=582,-417), 29.2 %
KomodoDragon 3.1 MCTS : 1000 (+ 40,=779,-181), 43.0 %
KomodoDragon 3.1 avx2 : 1000 (+ 7,=682,-311), 34.8 %
Revenge 3.0 avx2 : 1000 (+147,=777,- 76), 53.5 %
Ethereal 13.75 nnue : 1000 (+178,=774,- 48), 56.5 %
Koivisto 8.13 avx2 : 1000 (+120,=814,- 66), 52.7 %
Slow Chess 2.9 avx2 : 1000 (+278,=683,- 39), 62.0 %
Stockfish 15 220418 : 1000 (+ 2,=564,-434), 28.4 %
If the EAS-Score varies heavily based on pools, then there is a flaw in your list in my opinion.
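The percentages in the table follow from the win/draw/loss counts if a draw is counted as half a point, and the "100+ Elo stronger" remark can be put into numbers with the standard logistic Elo expectation. A minimal Python sketch (the function names are illustrative and not part of any rating tool):

```python
def score_percent(wins: int, draws: int, losses: int) -> float:
    """Score in percent, counting a draw as half a point."""
    games = wins + draws + losses
    return 100.0 * (wins + 0.5 * draws) / games


def expected_score(elo_diff: float) -> float:
    """Expected score (0..1) under the standard logistic Elo model.

    elo_diff is own rating minus opponent rating.
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# Berserk 10 overall, from the table above: +775 =6221 -2004
print(f"{score_percent(775, 6221, 2004):.1f} %")   # 43.2 %
# Against Stockfish 221004: +2 =566 -432
print(f"{score_percent(2, 566, 432):.1f} %")       # 28.5 %
# Expected score against an opponent rated 100 Elo higher
print(f"{100 * expected_score(-100):.1f} %")       # 36.0 %
```

So a score well below 50% is expected against opponents that are 100 or more Elo stronger; whether that also moves the EAS-Score is the separate question raised here.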
-
- Posts: 122
- Joined: Wed Feb 17, 2021 3:16 pm
- Full name: Jay Honnold
Re: SPCC: Testrun of Berserk 10 finished
That being said, I do think Berserk is a pretty simple-minded engine and deserves its low ranking on your list, but I still believe you should check that your EAS-Score isn't impacted by unforeseen factors.
-
- Posts: 2732
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
Re: SPCC: Testrun of Berserk 10 finished
jhonnold wrote: Mon Oct 17, 2022 7:06 pm
With regards to your EAS rating list, have you checked that the EAS-Score doesn't vary heavily when playing engines against different pools? Berserk is always at the bottom of your EAS list and I'm curious if it has to do with the engine or the pool.
Looking at this pool, Berserk has a pretty poor score of 43%, mostly due to 4000 games being against engines 100+ Elo stronger than it. If you replaced the two Stockfish dev versions with SF HCE and Rubichess, does Berserk's EAS-Score rise? If you played Berserk only vs a pool of engines where it is favored, does the EAS-Score rise?
6 Berserk 10 avx2 : 3657 9000 (+775,=6221,-2004), 43.2 %
Stockfish 221004 avx2 : 1000 (+ 2,=566,-432), 28.5 %
Stockfish 220927 avx2 : 1000 (+ 1,=582,-417), 29.2 %
KomodoDragon 3.1 MCTS : 1000 (+ 40,=779,-181), 43.0 %
KomodoDragon 3.1 avx2 : 1000 (+ 7,=682,-311), 34.8 %
Revenge 3.0 avx2 : 1000 (+147,=777,- 76), 53.5 %
Ethereal 13.75 nnue : 1000 (+178,=774,- 48), 56.5 %
Koivisto 8.13 avx2 : 1000 (+120,=814,- 66), 52.7 %
Slow Chess 2.9 avx2 : 1000 (+278,=683,- 39), 62.0 %
Stockfish 15 220418 : 1000 (+ 2,=564,-434), 28.4 %
If the EAS-Score varies heavily based on pools, then there is a flaw in your list in my opinion.
Just look at Slow Chess 2.9:
Score 43.1%, like Berserk 10. And its EAS-Score is 95348, nearly 3x that of Berserk 10. Rank 4 in the EAS ratinglist...
The EAS calculations are all done with percentage values, precisely because it should not matter how strong the engine plays or how high its score is!
From my website:
"Because a weaker player can be playing aggressive, too, the EAS-Score (= Engine Aggressivenes Score, see explanation below) and all other statistics are build on percents from the won games of an engine/player. So, if an engine has won more games, it must win more short games or win games with sacrifices. A weaker engine, which has won less games, need less wins of short games or win games with sacrifices."
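To illustrate why taking percentages of the won games makes such a statistic independent of how many games an engine wins, here is a minimal Python sketch. It is not the EAS-tool's actual formula, which is not given in this thread; the function name and the game counts are made up for illustration.

```python
def share_of_wins(special_wins: int, total_wins: int) -> float:
    """Percentage of an engine's won games that fall into some category
    (for example short wins or wins with a sacrifice)."""
    if total_wins == 0:
        # With no wins at all, a percentage of wins cannot be computed.
        raise ValueError("engine has no wins, share is undefined")
    return 100.0 * special_wins / total_wins


# Hypothetical numbers: a strong engine with many wins and a weak one with few.
strong = share_of_wins(special_wins=500, total_wins=2000)
weak = share_of_wins(special_wins=50, total_wins=200)
print(strong, weak)  # 25.0 25.0
```

Both hypothetical engines get the same share although one has ten times as many wins, which is the property the quoted explanation describes.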
Or look at the full ratinglist, which stores all games the engines have played and includes no Stockfish dev versions (the full EAS-ratinglist follows below the full ratinglist):
https://www.sp-cc.de/files/spcc_full_list.txt
Here you have Berserk 9 with 13000 played games, no SF-devs as opponent and a score of 49.2% (nearly 50%):
17 Berserk 9 avx2 : 3647 5 5 13000 49.2% 3653 70.2%
And the EAS-score stays as bad as always (rank 158 of 166 entries!!!):
158 35269 10.09% 05.89% 26.71% Berserk 9 avx2
And SlowChess 2.9 has 17000 games here, with a score of only 42.5%:
26 Slow Chess 2.9 avx2 : 3585 4 4 17000 42.5% 3641 66.7%
And the EAS-score stays high (Rank 17 of 166):
17 92177 23.99% 23.54% 17.84% Slow Chess 2.9 avx2
Q.E.D.
Last edited by pohl4711 on Mon Oct 17, 2022 7:54 pm, edited 1 time in total.
-
- Posts: 2732
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
Re: SPCC: Testrun of Berserk 10 finished
Or look at my tests of the old King engine, when I built settings for my King chess computer:
Program Elo + - Games Score Av.Op. Draws
1 Rebel 13 : 2567 5 5 7500 63.5 % 2466 22.4 %
2 Delfi 5.4 : 2533 6 6 7500 59.1 % 2466 23.6 %
3 TheKing Razorback : 2525 8 8 3500 52.1 % 2511 29.1 %
4 TheKing Researcher : 2525 8 8 3500 52.0 % 2511 28.1 %
5 K2 0.95 : 2523 5 5 7500 57.7 % 2466 21.1 %
6 TheKing TrS Normal : 2521 8 8 3500 51.5 % 2511 28.1 %
7 RedQueen 1.1.98 : 2519 5 5 7500 57.2 % 2466 19.6 %
8 Gandalf 7 : 2508 6 6 7500 55.7 % 2466 24.6 %
9 TheKing TrS Solid : 2502 8 8 3500 48.8 % 2511 37.1 %
10 TheKing Normal : 2500 8 8 3500 48.5 % 2511 28.3 %
11 TheKing SPCC Normal : 2499 8 8 3500 48.4 % 2511 27.5 %
12 TheKing SPCC Solid : 2498 8 8 3500 48.3 % 2511 38.3 %
13 TheKing Solid : 2481 8 8 3500 45.9 % 2511 31.5 %
14 TheKing TrS Active : 2478 8 8 3500 45.4 % 2511 21.1 %
15 TheKing Active : 2474 8 8 3500 44.8 % 2511 20.8 %
16 Ruffian Leiden : 2470 6 6 7500 50.4 % 2466 21.7 %
17 TheKing SPCC Active : 2461 8 8 3500 43.1 % 2511 18.3 %
18 Orion 0.6 : 2456 6 6 7500 48.4 % 2466 27.0 %
19 TheKing TrS AkAg : 2449 9 9 3500 41.3 % 2511 13.1 %
20 TheKing TrS Aggressive : 2395 8 8 3500 34.2 % 2511 7.1 %
21 TheKing Aggressive : 2355 9 9 3500 29.4 % 2511 6.8 %
22 Open Tal 1.2 : 2329 9 9 3500 26.4 % 2511 7.5 %
OpenTal 1.2 has a score of only 26.4% and is in last place on the list...
And now look at the EAS-calculation:
Rank EAS-Score sacs shorts bad draws Engine/player
-------------------------------------------------------------
1 688980 66.67% 85.23% 10.31% Open Tal 1.2
2 459251 46.75% 81.19% 15.61% TheKing Aggressive
3 449557 40.39% 82.28% 26.91% TheKing TrS Aggressive
4 373875 34.92% 76.25% 26.64% TheKing TrS AkAg
5 312244 31.50% 71.99% 16.19% TheKing Active
6 284089 22.95% 66.72% 19.76% TheKing TrS Active
7 251190 14.13% 67.70% 31.77% TheKing SPCC Active
8 231634 17.23% 64.86% 16.67% TheKing Researcher
9 225428 15.72% 64.97% 19.88% TheKing Normal
10 224918 20.24% 66.00% 15.74% TheKing TrS Normal
11 209408 14.84% 62.40% 17.19% TheKing Razorback
12 202066 11.35% 63.55% 23.05% Gandalf 7
13 201970 11.08% 67.80% 24.83% K2 0.95
14 191813 08.07% 67.76% 27.23% Delfi 5.4
15 188982 15.26% 58.77% 18.53% TheKing Solid
16 185300 08.18% 70.63% 22.49% RedQueen 1.1.98
17 182529 05.81% 67.10% 30.53% Rebel 13
18 180321 08.57% 62.00% 22.97% TheKing SPCC Normal
19 174379 07.98% 61.34% 32.68% Orion 0.6
20 158870 04.75% 62.31% 34.48% Ruffian Leiden
21 157007 12.65% 52.79% 17.71% TheKing TrS Solid
22 155722 07.26% 56.43% 21.92% TheKing SPCC Solid
Q.E.D. Part 2
The EAS-tool should measure the aggressiveness of an engine, no matter how strong or weak the engine is and no matter how strong or weak the opponents are (of course, the opponents should not be so strong that the engine cannot win any games...). And that's exactly what the EAS-tool does! And that's why I am really proud of this tool, because such an aggressiveness ratinglist / scoring system never existed before in computer chess.
And beyond 3200 Elo, the strength of an engine becomes more and more insignificant, while an interesting, aggressive playing style becomes more and more important. IMO.
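As a rough cross-check of how the two King-test lists relate, one can compare each engine's rank by Elo with its rank by EAS-Score. The Python sketch below does this with Spearman's rank correlation; that statistic is an illustrative choice made here and is not part of the EAS-tool. The two name lists are copied from the tables in this post.

```python
# Engines in Elo order (rank 1 first), copied from the Elo table.
ELO_ORDER = [
    "Rebel 13", "Delfi 5.4", "TheKing Razorback", "TheKing Researcher",
    "K2 0.95", "TheKing TrS Normal", "RedQueen 1.1.98", "Gandalf 7",
    "TheKing TrS Solid", "TheKing Normal", "TheKing SPCC Normal",
    "TheKing SPCC Solid", "TheKing Solid", "TheKing TrS Active",
    "TheKing Active", "Ruffian Leiden", "TheKing SPCC Active", "Orion 0.6",
    "TheKing TrS AkAg", "TheKing TrS Aggressive", "TheKing Aggressive",
    "Open Tal 1.2",
]
# The same engines in EAS-Score order (rank 1 first), copied from the EAS table.
EAS_ORDER = [
    "Open Tal 1.2", "TheKing Aggressive", "TheKing TrS Aggressive",
    "TheKing TrS AkAg", "TheKing Active", "TheKing TrS Active",
    "TheKing SPCC Active", "TheKing Researcher", "TheKing Normal",
    "TheKing TrS Normal", "TheKing Razorback", "Gandalf 7", "K2 0.95",
    "Delfi 5.4", "TheKing Solid", "RedQueen 1.1.98", "Rebel 13",
    "TheKing SPCC Normal", "Orion 0.6", "Ruffian Leiden",
    "TheKing TrS Solid", "TheKing SPCC Solid",
]


def spearman_rho(order_a, order_b):
    """Spearman rank correlation for two orderings of the same items (no ties)."""
    if sorted(order_a) != sorted(order_b):
        raise ValueError("both orderings must contain the same items")
    rank_b = {name: i for i, name in enumerate(order_b, start=1)}
    n = len(order_a)
    d_squared = sum((i - rank_b[name]) ** 2
                    for i, name in enumerate(order_a, start=1))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


rho = spearman_rho(ELO_ORDER, EAS_ORDER)
print(f"Spearman rho = {rho:.2f}")  # Spearman rho = -0.44
```

A value near +1 would mean the EAS order simply mirrors the Elo order. Here it is negative, which fits OpenTal 1.2 being last by Elo and first by EAS-Score. The number describes only this one list of 22 engines.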