Sudden SF progress!

Jouni
Posts: 3655
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Sudden SF progress!

Post by Jouni »

Elo: 7.66 ± 1.4 (95%) LOS: 100.0%
Total: 60000 W: 16017 L: 14694 D: 29289
Ptnml(0-2): 86, 6160, 16212, 7429, 113
nElo: 15.68 ± 2.8 (95%) PairsRatio: 1.21

Tested at 60+0.6 with 1 thread. Surprising!
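For anyone who wants to check the numbers above, here is a minimal Python sketch that recomputes Elo, LOS, nElo and PairsRatio from the posted W/L/D and Ptnml counts. The function names are made up for illustration, and the formulas (logistic Elo, normal-approximation LOS, nElo taken as the score surplus divided by sqrt(2) times the pair standard deviation) are my assumption about how these figures are derived, not code taken from Fishtest.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by a score fraction in (0, 1)."""
    return 400.0 * math.log10(score / (1.0 - score))

def trinomial_stats(wins, losses, draws):
    """Score, Elo and LOS from plain win/loss/draw counts."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # LOS via the usual normal approximation on decisive games only.
    z = (wins - losses) / math.sqrt(2.0 * (wins + losses))
    los = 0.5 * (1.0 + math.erf(z))
    return score, elo_from_score(score), los

def pentanomial_stats(ptnml):
    """ptnml: counts of game pairs that scored 0, 0.5, 1, 1.5 and 2 points."""
    pairs = sum(ptnml)
    per_game = [0.0, 0.25, 0.5, 0.75, 1.0]  # pair result expressed per game
    mean = sum(n * x for n, x in zip(ptnml, per_game)) / pairs
    var = sum(n * (x - mean) ** 2 for n, x in zip(ptnml, per_game)) / pairs
    sigma_pair = math.sqrt(var)
    # Assumed nElo definition: surplus over 50% in units of the per-game
    # standard deviation (sqrt(2) * pair deviation), scaled by 800 / ln(10).
    nelo = (mean - 0.5) / (math.sqrt(2.0) * sigma_pair) * 800.0 / math.log(10)
    # Pairs won (1.5 or 2 points) divided by pairs lost (0 or 0.5 points).
    pairs_ratio = (ptnml[3] + ptnml[4]) / (ptnml[0] + ptnml[1])
    return mean, nelo, pairs_ratio

score, elo, los = trinomial_stats(16017, 14694, 29289)
mean, nelo, pairs_ratio = pentanomial_stats([86, 6160, 16212, 7429, 113])
print(f"Elo {elo:.2f}  LOS {100 * los:.1f}%  nElo {nelo:.2f}  PairsRatio {pairs_ratio:.2f}")
```

Under these assumptions the script reproduces the posted 7.66 Elo, 15.68 nElo and 1.21 PairsRatio, which suggests the formulas are at least consistent with the test output.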
Jouni
dkappe
Posts: 1632
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: Sudden SF progress!

Post by dkappe »

That’s vs SF_16, correct? So a test against a somewhat older version? If you have more context, I'd like to hear it.
Jouni
Posts: 3655
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: Sudden SF progress!

Post by Jouni »

Yes, against SF16. The previous test was Elo: 0.58 ± 1.4.
Jouni
Dariusz
Posts: 379
Joined: Sat Jun 13, 2015 10:08 am
Location: Poland
Full name: Dariusz Domagała

Re: Sudden SF progress!

Post by Dariusz »

MCERL (ongoing)

 # PLAYER                          : RATING  POINTS  PLAYED    (%)
 1 Stockfish dev-20230911-3f7fb5ac :   3800  1599.5    2200  72.7%
 2 Stockfish dev-20230811-81929458 :   3788  2455.5    4082  60.2%
 3 Stockfish dev-20230824-4c4cb185 :   3786  1404.0    2028  69.2%
 4 Stockfish dev-20230319-02e46970 :   3778  2464.0    4440  55.5%
 5 Stockfish dev-20230412-acb0d204 :   3777   818.5    1078  75.9%
 6 Stockfish dev-20230425-41f50b2c :   3776   259.0     520  49.8%
 7 Stockfish dev-20230305-cdec775a :   3771   796.0    1328  59.9%
 8 Stockfish 16                    :   3770  2205.5    3363  65.6%
Regards, Darius
https://chessengeria.eu
Ozymandias
Posts: 1537
Joined: Sun Oct 25, 2009 2:30 am

Re: Sudden SF progress!

Post by Ozymandias »

About 4 Elo in 3 months, according to NCM and SPCC. But of course the latest version will walk that back, as happens any time there's a net architecture change.

Nothing sudden, probably not even progress in the end.
Jouni
Posts: 3655
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: Sudden SF progress!

Post by Jouni »

The bigger net (L1-2560, nn-ac1dbea57aa3.nnue) was -7/-8 Elo at NCM. Nice progress ;-)
Jouni
Ozymandias
Posts: 1537
Joined: Sun Oct 25, 2009 2:30 am

Re: Sudden SF progress!

Post by Ozymandias »

As anticipated, progress disappeared 3 months ago with the net change and has barely recovered. We're still at 5 Elo, 6 months after v16.

Will there be a v17 with single-digit gains? Dragon did it all the time, so why not?
pohl4711
Posts: 2807
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: Sudden SF progress!

Post by pohl4711 »

Right now, the policy is to release a new full version of Stockfish when the current development version beats the latest officially released SF by 100 normalized Elo in the Fishtest progression test run. Normalized Elo is a bit complicated, but Gamepair-Elo progress is quite similar to normalized Elo numbers.
And according to my Gamepair-rescored UHO-Top15 Ratinglist, Stockfish 231222 is +38 Gamepair-Elo stronger than SF 16.
So 1/3 of the way to SF 17 is done.

   # PLAYER                   :  RATING  ERROR  PLAYED     W     D     L   (%)  CFS(%)
   1 Stockfish 231222 avx2    :    3859     14    7500  6413   964   123  91.9     100
   2 Stockfish 16 230630      :    3821   ----    7500  6230  1067   203  90.2     100
   3 Torch 1 popavx2          :    3666     14    7500  5339  1517   644  81.3     100
   4 KomodoDragon 3.3 avx2    :    3557     14    7500  4575  1849  1076  73.3     100
   5 Berserk 12 avx2          :    3482     14    7500  3988  2069  1443  67.0     100
   6 Ethereal 14.25 nnue      :    3343     14    7500  2710  2549  2241  53.1     100
   7 Caissa 1.15 avx2         :    3276     14    7500  2094  2676  2730  45.8      82
   8 RubiChess 230918 avx2    :    3272     14    7500  2063  2665  2772  45.3     100
   9 CSTal 2.0 avx2           :    3200     14    7500  1437  2694  3369  37.1      56
  10 Obsidian 9.0 avx2        :    3200     14    7500  1408  2740  3352  37.0     100
  11 Clover 6.1 avx2          :    3177     14    7500  1280  2617  3603  34.5     100
  12 Koivisto 9.2 avx2        :    3160     14    7500  1198  2495  3807  32.6     100
  13 Rebel EAS avx2           :    3134     15    7500   986  2495  4019  29.8     100
  14 Seer 2.7.0 avx2          :    3121     14    7500   872  2516  4112  28.4      67
  15 RofChade 3.1 avx2        :    3119     14    7500   867  2494  4139  28.2     100
  16 Uralochka 3.40a avx2     :    3084     14    7500   677  2319  4504  24.5     ---


------------------------------------------------------------------- 
--- Number of all Gamepairs          : 60000 
--- Number of drawn Gamepairs overall: 17863 (= 29.77%) 
--- Number of 1:1 drawn Gamepairs    : 8411  (= 14.02%) 
--- Number of 2-draws drawn Gamepairs: 9452  (= 15.75%) 
------------------------------------------------------------------- 

Head-to-Head Gamepair-result of Stockfish 231222 vs SF 16 is:
500 ( 138+, 273=, 89-), 54.9% : +38 (Gamepair-)Elo
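As a rough cross-check of the gamepair figures, the sketch below rescoring rule and conversion can be written in a few lines of Python. The rule in `rescore_pair` (more than 1 point in a pair counts as a win, exactly 1 point as a draw) is my reading of the summary block above, and the plain logistic conversion is an assumption; the rating list itself may be computed differently. All function names are hypothetical.

```python
import math

def rescore_pair(pair_points):
    """Collapse the points scored in a two-game pair into one result."""
    if pair_points > 1.0:
        return "win"    # 1.5 or 2 points
    if pair_points < 1.0:
        return "loss"   # 0 or 0.5 points
    return "draw"       # 1:1 (one win each) or two draws

def gamepair_score(wins, draws, losses):
    """Score fraction over rescored pairs."""
    return (wins + 0.5 * draws) / (wins + draws + losses)

def logistic_elo(score):
    """Elo difference implied by a score fraction in (0, 1)."""
    return 400.0 * math.log10(score / (1.0 - score))

# Head-to-head figures from the post: 138 won, 273 drawn, 89 lost pairs.
score = gamepair_score(138, 273, 89)
elo = logistic_elo(score)
print(f"{100 * score:.1f}% -> {elo:+.1f} Elo (plain logistic conversion)")

# Draw breakdown from the summary block.
total_pairs, one_one, two_draws = 60000, 8411, 9452
drawn = one_one + two_draws
print(f"drawn pairs: {drawn} ({100 * drawn / total_pairs:.2f}%)")
```

The 54.9% and the 29.77% drawn-pair share come out as posted. The plain logistic conversion of 54.9% gives roughly +34 Elo rather than +38; the +38 matches the 3859 - 3821 gap in the list, so it presumably comes from the rating fit over all opponents rather than from the head-to-head alone.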
Modern Times
Posts: 3748
Joined: Thu Jun 07, 2012 11:02 pm

Re: Sudden SF progress!

Post by Modern Times »

pohl4711 wrote: Sun Dec 31, 2023 10:47 am Right now, the policy is to release a new full version of Stockfish when the current development version beats the latest officially released SF by 100 normalized Elo in the Fishtest progression test run.
As it gets harder and harder to achieve Elo gains at this level, perhaps they should make that policy +50 normalised Elo.
Ozymandias
Posts: 1537
Joined: Sun Oct 25, 2009 2:30 am

Re: Sudden SF progress!

Post by Ozymandias »

pohl4711 wrote: Sun Dec 31, 2023 10:47 am Right now, the policy is to release a new full version of Stockfish when the current development version beats the latest officially released SF by 100 normalized Elo in the Fishtest progression test run. Normalized Elo is a bit complicated, but Gamepair-Elo progress is quite similar to normalized Elo numbers.
And according to my Gamepair-rescored UHO-Top15 Ratinglist, Stockfish 231222 is +38 Gamepair-Elo stronger than SF 16.
So 1/3 of the way to SF 17 is done.

90% of the openings are sidelines, and the 500 game-pairs are made up of ultra-fast games. I prefer NCM; its progress tracking is much more realistic.