Stockfish 20251005 is clear improvement in test suites

Jouni
Posts: 3683
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Stockfish 20251005 is clear improvement in test suites

Post by Jouni »

In HTC it solved 82, whereas 17.1 solved 76. In a private suite it solved 64, whereas 17.1 solved 59. The first test has 110 positions and the second has 100. There is already a version 20251007 now :lol: . It may be better or worse, who knows?
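To put the counts above on a common scale, here is a minimal sketch that turns them into solve rates. The numbers are the ones quoted in this post; the dictionary layout and function names are only illustrative.

```python
# Solve counts quoted in the post; suite sizes are 110 and 100 positions.
SUITES = {
    "HTC": {"positions": 110, "SF 20251005": 82, "SF 17.1": 76},
    "private": {"positions": 100, "SF 20251005": 64, "SF 17.1": 59},
}


def solve_rate(solved: int, positions: int) -> float:
    """Return the share of solved positions as a percentage."""
    return 100.0 * solved / positions


def summarize(suites: dict) -> dict:
    """Map each suite to per-version solve rates and the gain in positions."""
    summary = {}
    for name, row in suites.items():
        total = row["positions"]
        dev, old = row["SF 20251005"], row["SF 17.1"]
        summary[name] = {
            "dev_pct": round(solve_rate(dev, total), 1),
            "old_pct": round(solve_rate(old, total), 1),
            "gain": dev - old,
        }
    return summary


if __name__ == "__main__":
    for name, s in summarize(SUITES).items():
        print(f"{name}: {s['dev_pct']}% vs {s['old_pct']}% (+{s['gain']} positions)")
```

With suites of about 100 positions, a gain of 5 or 6 solved positions is a few percentage points; these counts alone do not say how much of that is run-to-run variation.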
Jouni
Uri
Posts: 513
Joined: Thu Dec 27, 2007 9:34 pm

Re: Stockfish 20251005 is clear improvement in test suites

Post by Uri »

The Stockfish project is a failure.

Stockfish is still very weak when it comes to complex endgames or complex openings, and its strategic knowledge is still very limited.

They have been trying to improve Stockfish for many years and months on end now, and Stockfish has not improved by much, so I think we should just let the Stockfish project drop.
Jouni
Posts: 3683
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: Stockfish 20251005 is clear improvement in test suites

Post by Jouni »

Stockfish is not intended for solving problems. Rems is, and in the same tests it solves 102 + 85.
Jouni
Master Om
Posts: 453
Joined: Wed Nov 24, 2010 10:57 am
Location: INDIA

Re: Stockfish 20251005 is clear improvement in test suites

Post by Master Om »

Jouni wrote: Tue Oct 07, 2025 7:12 pm Stockfish is not intended for solving problems. Rems is, and in the same tests it solves 102 + 85.
What settings did you use in Rems to test the suite? Could you please share them?
Always Expect the Unexpected
Jouni
Posts: 3683
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: Stockfish 20251005 is clear improvement in test suites

Post by Jouni »

Simply increase the following from the default 2: Random Op. MultiPV. Then test which value works best on your CPU.
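As a rough illustration of "test which value works best", the sketch below builds the UCI commands for a small sweep. It uses only `MultiPV`, which is a standard UCI option; the exact option names Rems exposes are not spelled out in this thread, and the candidate values are hypothetical.

```python
def uci_setoption(name: str, value) -> str:
    """Format a UCI 'setoption' command line."""
    return f"setoption name {name} value {value}"


def multipv_sweep(candidates):
    """Return one setoption command per candidate MultiPV value."""
    return [uci_setoption("MultiPV", v) for v in candidates]


if __name__ == "__main__":
    # Hypothetical candidate values; rerun the test suite once per setting
    # and keep whichever solves the most positions on your hardware.
    for command in multipv_sweep([2, 4, 8]):
        print(command)
```

Each command would be sent to the engine before running the suite, so the comparison is one full suite run per value.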
Jouni
Uri Blass
Posts: 10903
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Stockfish 20251005 is clear improvement in test suites

Post by Uri Blass »

Uri wrote: Tue Oct 07, 2025 4:49 pm The Stockfish project is a failure.

Stockfish is still very weak when it comes to complex endgames or complex openings, and its strategic knowledge is still very limited.

They have been trying to improve Stockfish for many years and months on end now, and Stockfish has not improved by much, so I think we should just let the Stockfish project drop.
Stockfish is not a failure, because its developers are interested only in beating other engines, and they do not fail at what they are trying to achieve.

I also do not agree that it is very weak in complex endgames or complex openings, and I would like to see specific positions that demonstrate the weakness if you claim it is very weak.

I only know that it is very weak at finding the best moves in positions where one side has a big advantage.

I would expect the score to go up when you search deeper, because the engine reaches positions that are closer to mate. But Stockfish does not care about moves that mate faster, because that does not add Elo (and for some strange reason it takes care not to count material in the evaluation when one side has a big advantage, so it cannot tell from the evaluation which of several winning positions is better).

Even in the following position it cannot see a score above +10 in a reasonable time, although it had already seen more than +10 at very small depths.
[d]1nb1kbn1/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQ - 0 1
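For reference, a plain material count of this position can be read straight from the FEN. This is a minimal sketch using the classical 1/3/3/5/9 piece values, which are an assumption for illustration and not Stockfish's evaluation; as far as I know the score the engine prints is an internally normalized value, so the two numbers are not directly comparable.

```python
# Classical piece values (an assumption for illustration only).
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}


def material_from_fen(fen: str) -> tuple:
    """Return (white, black) material totals from the board field of a FEN."""
    board = fen.split()[0]
    white = black = 0
    for ch in board:
        value = PIECE_VALUES.get(ch.lower())
        if value is None:
            continue  # digits (empty squares) and '/' (rank separators)
        if ch.isupper():
            white += value
        else:
            black += value
    return white, black


FEN = "1nb1kbn1/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQ - 0 1"

if __name__ == "__main__":
    white, black = material_from_fen(FEN)
    print(f"white={white} black={black} difference={white - black}")
```

Black is missing the queen and both rooks, so under these values White is ahead by 19 points of material.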

Stockfish_25100709_x64_avx2:
Available processors: 0-7
Using 6 threads
NNUE evaluation using nn-1c0000000000.nnue (133MiB, (22528, 3072, 15, 32, 1))
NNUE evaluation using nn-37f18f62d772.nnue (6MiB, (22528, 128, 15, 32, 1))
1/2 00:00 130 19k +6.95 Nb1-c3
2/2 00:00 577 82k +10.61 Nb1-c3
3/4 00:00 821 117k +10.29 Nb1-c3 b7-b6 a2-a3
4/7 00:00 1k 160k +10.20 Nb1-c3 Ng8-f6
5/6 00:00 1k 179k +10.17 Nb1-c3 Nb8-c6 h2-h4 Ng8-f6 Rh1-h2
6/7 00:00 2k 224k +9.73 Nb1-c3 Ng8-f6 Ng1-f3 Nf6-g4
7/8 00:00 2k 283k +9.76 Nb1-c3 Ng8-f6 Nc3-b5 Nf6-g4
8/14 00:00 12k 1,190k +9.79 Nb1-c3 Ng8-f6 Ng1-f3 e7-e6 d2-d4 Nf6-g4
9/16 00:00 35k 2,158k +9.86 Nb1-c3 Nb8-c6 Ng1-f3 Ng8-f6 a2-a3 e7-e5 d2-d3
10/14 00:00 42k 2,472k +10.05 Nb1-c3 Nb8-c6 f2-f3 Ng8-f6 e2-e3 e7-e5 Ng1-h3 Nc6-b4 a2-a3 d7-d5 a3xb4 Bc8xh3
11/15 00:00 63k 3,022k +9.84 Nb1-c3 Ng8-f6 a2-a3 Nb8-c6 d2-d4 e7-e5 e2-e3
12/18 00:00 94k 3,477k +9.80 Nb1-c3 Ng8-f6 Nc3-b5 Nb8-c6 Nb5xc7+ Ke8-d8 Nc7-b5 Kd8-e8 c2-c3 Nf6-g4 Ng1-h3 Nc6-e5 f2-f3
13/24 00:00 116k 3,637k +9.80 Nb1-c3 Ng8-f6 h2-h4 Nb8-c6 e2-e3 e7-e5 d2-d3 d7-d5 Bf1-e2 e5-e4 a2-a4
14/17 00:00 297k 3,813k +9.76 f2-f3 Ng8-f6 Nb1-c3 Nb8-c6 Ra1-b1 e7-e5 e2-e3 e5-e4
15/23 00:00 432k 3,893k +9.76 f2-f3 Nb8-c6 Nb1-c3 e7-e5 a2-a3 Ng8-f6 b2-b4 d7-d5 Bc1-b2 d5-d4 Nc3-a4
16/26 00:00 1,010k 4,372k +9.72 f2-f3 Nb8-c6 Nb1-c3 Ng8-f6 b2-b3 d7-d5 Bc1-b2 d5-d4 Nc3-b5 e7-e5 Qd1-c1
17/28 00:00 3,364k 5,028k +9.57 b2-b4 Ng8-f6 Bc1-b2 d7-d5 Qd1-c1 e7-e5 Bb2xe5 Nb8-c6 Be5xf6 g7xf6 b4-b5 Nc6-e5 Ng1-f3 Ke8-d8 Nb1-c3 Bf8-h6 Rh1-g1 Ne5-g4
18/30 00:00 3,683k 5,059k +9.61 b2-b4 Ng8-f6 Bc1-b2 d7-d5 Qd1-c1 e7-e5 Bb2xe5 Nf6-e4 f2-f3 Nb8-c6 Be5-b2 a7-a6 d2-d3 Ke8-d8 d3xe4 Bf8-d6 c2-c3
19/39 00:03 6,058k 1,998k +9.62 b2-b4 d7-d5 Bc1-b2 Ng8-f6 Qd1-c1 e7-e5 Bb2xe5 Nb8-c6 Be5xf6 g7xf6 Ng1-f3 Ke8-d8 Rh1-g1 Nc6xb4 c2-c3 Nb4-c6 Nb1-a3 Nc6-a5
20/34 00:04 9,547k 1,945k +9.57 b2-b4 Ng8-f6 Bc1-b2 d7-d5 f2-f3 e7-e5 Bb2xe5 Nb8-c6 Be5xc7 Nc6xb4 c2-c3 Nb4-c6 Qd1-c1 Bc8-f5 Ng1-h3 d5-d4 Rh1-g1 d4-d3 e2xd3
21/32 00:05 11,004k 1,988k +9.60 b2-b4 e7-e5 Bc1-b2 d7-d5 Qd1-c1 Bf8xb4 Bb2xe5 Nb8-c6 Be5xc7 Ng8-f6 c2-c3 Bb4-e7 Bc7-f4 Bc8-e6 Nb1-a3
22/31 00:06 16,913k 2,519k +9.52 b2-b4 e7-e5 Bc1-b2 d7-d5 Qd1-c1 Bf8xb4 Bb2xe5 Nb8-c6 Be5xc7 Ng8-f6 f2-f3 Ke8-d7 Bc7-g3 Nf6-h5 Bg3-f2 Kd7-d8 g2-g3 Bb4-e7 Bf2-e3 Be7-d6 c2-c3 d5-d4 Be3xd4 Nc6xd4 c3xd4
23/39 00:09 33,284k 3,638k +9.55 b2-b4 e7-e5 Bc1-b2 d7-d5 Qd1-c1 Ke8-d8 Nb1-c3 d5-d4 Nc3-e4 c7-c5 b4-b5 Ng8-f6 Ne4-g5 c5-c4 f2-f3 Kd8-e8 Ng5-h3 Bc8xh3 Ng1xh3
24/44 00:12 58,413k 4,556k +9.57 f2-f3 d7-d5 Nb1-c3 d5-d4 Nc3-e4 Ng8-f6 Ne4xf6+ e7xf6 c2-c3 Nb8-c6 Qd1-c2 Bf8-d6 Qc2xh7 Bc8-e6 b2-b4 Ke8-d8 Qh7-b1
25/35 00:13 63,426k 4,620k +9.54 f2-f3 d7-d5 Nb1-c3 d5-d4 Nc3-e4 Ng8-f6 Ne4xf6+ e7xf6 a2-a3 Bf8-d6 d2-d3 Ke8-f8 g2-g3 Nb8-d7 Bf1-g2 Nd7-e5 Ke1-f1 b7-b5 Ng1-h3 Bc8xh3 Bg2xh3
26/49 00:20 108,557k 5,253k +9.67 Ng1-f3 Ng8-f6 Nb1-c3 d7-d5 d2-d4 Bc8-f5 a2-a3 Nb8-c6 Bc1-f4 Nf6-e4 Nc3xd5 e7-e6 Nd5xc7+ Ke8-d7 Nc7-b5 e6-e5 Nf3xe5+ Nc6xe5 Bf4xe5 f7-f6 Be5-f4
27/44 00:26 146,988k 5,584k +9.69 Ng1-f3 d7-d5 Nb1-c3 Ng8-f6 d2-d4 Bc8-f5 a2-a3 c7-c5 e2-e3 Nb8-c6 d4xc5 e7-e6 Nf3-d4 g7-g6 f2-f3 a7-a6 Nd4xf5 g6xf5 Bc1-d2 Nc6-e5 Qd1-b1 Ke8-d7 Nc3-a4 Bf8-g7 Rh1-g1
28/43 00:31 178,381k 5,743k +9.72 Ng1-f3 d7-d5 Nb1-c3 Ng8-f6 d2-d4 Bc8-f5 a2-a3 c7-c5 d4xc5 e7-e6 Nf3-d4 Bf5-g6 f2-f3 Bf8xc5 Nd4-b3 Bc5-b6 e2-e3 Ke8-e7 Bf1-e2 Bb6-c7 g2-g3 Nf6-g4 f3xg4 Nb8-d7 g4-g5
29/43 00:38 225,547k 5,890k +9.73 Ng1-f3 d7-d5 Nb1-c3 Ng8-f6 d2-d4 Bc8-f5 a2-a3 c7-c5 d4xc5 e7-e6 Nf3-h4 Bf5-e4 Bc1-g5 Nf6-d7 f2-f3 Nd7-e5 e2-e3 Bf8-e7 Bg5xe7 Ke8xe7
30/44 00:40 236,566k 5,911k +9.73 Ng1-f3 d7-d5 Nb1-c3 Ng8-f6 d2-d4 Bc8-f5 a2-a3 h7-h6 Rh1-g1 e7-e6 Bc1-f4 Nb8-c6 e2-e3 Nf6-e4 Bf4xc7 Ke8-d7 Bc7-g3 Bf8-e7 Nf3-d2 Ne4xd2 Ke1xd2 Be7-d8 Bf1-b5 Bd8-a5 Rg1-e1 a7-a6 Bb5xc6+ b7xc6 Bg3-h4 Ba5xc3+ b2xc3 Kd7-c8 Qd1-c1
31/47 01:00 371,025k 6,085k +9.85 Nb1-c3 Ng8-f6 d2-d4 Nb8-c6 d4-d5 Nc6-e5 Ng1-f3 Ne5-g6 h2-h4 h7-h5 Qd1-d4 Nf6-g4 Qd4-c5 Ke8-d8 d5-d6 e7xd6 Qc5xh5 Ng4-f6 Qh5-a5 Ng6-e5 Nf3xe5 d6xe5 Bc1-g5 Bf8-e7 Qa5xe5 d7-d6 Qe5-g3 d6-d5 h4-h5
32/45 01:10 426,071k 6,066k +9.89 Nb1-c3 Ng8-f6 d2-d4 Nb8-c6 d4-d5 Nc6-e5 Ng1-f3 Ne5-g6 h2-h4 h7-h5 Qd1-d4 e7-e6 Qd4xa7 Bf8-b4 Bc1-d2 Bb4xc3 Bd2xc3 Nf6xd5 Qa7-d4 Nd5-f6 Qd4-c5 Nf6-e4 Qc5xc7 Ng6-e7 Qc7-b8 Ne7-d5
33/55 02:11 815,222k 6,192k +9.98 Nb1-c3 Ng8-f6 d2-d4 e7-e6 e2-e4 a7-a6 Bf1-d3 b7-b5 a2-a3 c7-c5 d4xc5 Bf8xc5 Bc1-e3 Bc5xe3 f2xe3 Bc8-b7 Ng1-f3 g7-g5 Ke1-e2 g5-g4 Nf3-e5 Bb7xe4 Bd3xe4 b5-b4 a3xb4 Nf6xe4 Nc3xe4
34/56 03:14 1,209,768k 6,235k +9.93 Nb1-c3 Ng8-f6 d2-d4 d7-d5 Bc1-f4 Nb8-c6 f2-f3 Ke8-d8 e2-e4 e7-e6 e4-e5 Nf6-h5 Bf4-e3 Bc8-d7 Bf1-d3 Kd8-c8 Ng1-e2 f7-f6 e5xf6 Nh5xf6 Qd1-d2 e6-e5 d4xe5 Nc6xe5 Be3-d4 Ne5-c6 Bd3-b5 Bd7-f5 Bb5-a4 Bf8-b4 Ba4xc6 b7xc6 a2-a3 Bb4-d6
35/58 07:22 2,810,067k 6,348k +9.85 Nb1-c3 c7-c5 Ng1-f3 Ng8-f6 e2-e4 Nb8-c6 d2-d3 b7-b6 Bf1-e2 Bc8-b7 Bc1-f4 a7-a6 O-O e7-e5 Bf4-g3 Nf6-h5 Nf3-g5 Nh5xg3 f2xg3 Nc6-d4 Kg1-h1 c5-c4 Be2-h5 c4xd3 c2xd3 g7-g6 Bh5-f3 Nd4xf3 Qd1xf3 f7-f5
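The output above can be checked mechanically. Below is a small parsing sketch that assumes the columns are depth/seldepth, elapsed time (mm:ss), nodes, nodes per second, score and PV; that layout is inferred from the values, not taken from any GUI documentation. The sample holds a few lines copied from the log, with the last PV cut short.

```python
import re
from typing import Optional

# Assumed layout: depth/seldepth, mm:ss, nodes, nodes per second, score, PV.
LINE_RE = re.compile(
    r"^(?P<depth>\d+)/(?P<seldepth>\d+)\s+(?P<time>\d+:\d+)\s+"
    r"(?P<nodes>[\d,]+k?)\s+(?P<nps>[\d,]+k?)\s+"
    r"(?P<score>[+-]\d+\.\d+)\s+(?P<pv>.+)$"
)


def parse_count(text: str) -> int:
    """Convert '2,810,067k' or '130' to an integer count."""
    digits = text.replace(",", "")
    if digits.endswith("k"):
        return int(digits[:-1]) * 1000
    return int(digits)


def parse_line(line: str) -> Optional[dict]:
    """Parse one search line; return None for headers and other text."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    minutes, seconds = match["time"].split(":")
    return {
        "depth": int(match["depth"]),
        "seldepth": int(match["seldepth"]),
        "seconds": int(minutes) * 60 + int(seconds),
        "nodes": parse_count(match["nodes"]),
        "nps": parse_count(match["nps"]),
        "score": float(match["score"]),
        "pv": match["pv"].split(),
    }


SAMPLE = [
    "1/2 00:00 130 19k +6.95 Nb1-c3",
    "2/2 00:00 577 82k +10.61 Nb1-c3",
    "6/7 00:00 2k 224k +9.73 Nb1-c3 Ng8-f6 Ng1-f3 Nf6-g4",
    "14/17 00:00 297k 3,813k +9.76 f2-f3 Ng8-f6 Nb1-c3 Nb8-c6 Ra1-b1 e7-e5 e2-e3 e5-e4",
    # PV cut to the first three moves to keep the sample short.
    "35/58 07:22 2,810,067k 6,348k +9.85 Nb1-c3 c7-c5 Ng1-f3",
]

rows = [row for row in map(parse_line, SAMPLE) if row is not None]
peak = max(rows, key=lambda row: row["score"])
last = rows[-1]

if __name__ == "__main__":
    print(f"peak {peak['score']:+.2f} at depth {peak['depth']}")
    print(f"depth {last['depth']}: {last['score']:+.2f} after {last['seconds']} s")
```

On this sample the highest score is the +10.61 from depth 2, while depth 35 shows +9.85 after more than seven minutes.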
Jouni
Posts: 3683
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: Stockfish 20251005 is clear improvement in test suites

Post by Jouni »

The dev version is now clearly less selective than 17.1. Here is a 1-core analysis from the starting position.

Stockfish 17.1:
...
1.e4
= (0.20 ++) Depth: 45/54 00:04:05 156mN
1.e4 e5
= (0.17 --) Depth: 45/56 00:04:16 161mN
1.e4 e5 2.Nf3 Nc6 3.Bb5 Nf6 4.0-0 Nxe4 5.Re1 Nd6 6.Nxe5 Nxe5 7.Rxe5+ Be7 8.Bf1 0-0 9.d4 Bf6 10.Re1 Nf5 11.c3 d5 12.Nd2 c6 13.Nf3 Nd6 14.Bf4 Bf5 15.a4 a5 16.h3 Be4 17.Bh2 Re8 18.Ne5 Bh4 19.Nd3 Bxd3 20.Bxd3 Rxe1+ 21.Qxe1 Ne8 22.Qe5 Be7 23.Re1 Bd6
= (0.16) Depth: 45/56 00:04:26 166mN
1.e4 e5
= (0.16 --) Depth: 46/60 00:04:52 182mN
1.e4
= (0.17 ++) Depth: 46/60 00:05:23 202mN
1.e4 e5 2.Nf3 Nc6 3.Bb5 Nf6 4.0-0 Nxe4 5.Re1 Nd6 6.Nxe5 Nxe5 7.Rxe5+ Be7 8.Bf1 0-0 9.d4 Bf6 10.Re1 Nf5 11.c3 d5 12.a4 a5 13.h3 Be7 14.Nd2 Bd6 15.Nf3 Nh4 16.Ne5 Ng6 17.Qh5 Qh4 18.Qxh4 Nxh4 19.Bf4 c6 20.Bd3 Bb8 21.Bg5 Nf5 22.Nf3 h6 23.Bd2 Be6 24.Re2 Nd6 25.Bf4
= (0.15) Depth: 46/62 00:05:42 215mN
1.e4 e5
= (0.14 --) Depth: 47/56 00:06:07 232mN
1.e4
= (0.16 ++) Depth: 47/57 00:06:20 240mN
1.e4 e5
= (0.13 --) Depth: 47/57 00:06:37 252mN
1.e4 e5
= (0.11 --) Depth: 47/59 00:07:49 298mN
1.e4
= (0.13 ++) Depth: 47/62 00:07:52 300mN
1.e4 e5 2.Nf3 Nc6 3.Bb5 Nf6 4.0-0 Nxe4 5.Re1 Nd6 6.Nxe5 Nxe5 7.Bf1 Be7 8.Rxe5 0-0 9.d4 Bf6 10.Re1 Nf5 11.c3 d5 12.a4 a5 13.Nd2 c6 14.Nf3 Nd6 15.Bf4 Bf5 16.h3 Be4 17.Bh2 Re8 18.Ne5 h6 19.Nd3 Bxd3 20.Rxe8+ Qxe8 21.Bxd3 Qe6 22.Qg4 Qxg4 23.hxg4 Ne8 24.Re1 g6 25.Bf4 Bg5 26.Bxg5 hxg5
= (0.11) Depth: 47/62 00:07:55 302mN
1.e4 e5
= (0.11 --) Depth: 48/59 00:08:13 315mN
1.e4 e5
= (0.09 --) Depth: 48/59 00:08:31 328mN
1.e4
= (0.11 ++) Depth: 48/60 00:09:09 353mN
1.e4 e5 2.Nf3 Nc6 3.Bb5 Nf6 4.0-0 Nxe4 5.Re1 Nd6 6.Nxe5 Nxe5 7.Bf1 Be7 8.Rxe5 0-0 9.d4 Bf6 10.Re1 Nf5 11.c3 d5 12.a4 a5 13.Nd2 c6 14.Nf3 Nd6 15.Bf4 Bf5 16.h3 Be4 17.Bh2 Re8 18.Ne5 h6 19.Nd3 Bxd3 20.Rxe8+ Nxe8 21.Bxd3 Be7 22.Be5 Bd6 23.f4 Nf6 24.Qb3 Qc7 25.Re1
= (0.12) Depth: 48/60 00:09:10 354mN
1.e4 e5
= (0.10 --) Depth: 49/54 00:09:27 366mN
1.e4
= (0.11 ++) Depth: 49/56 00:09:32 369mN
1.e4
= (0.12 ++) Depth: 49/57 00:09:33 370mN
1.e4
= (0.15 ++) Depth: 49/57 00:09:34 370mN
1.e4 e5 2.Nf3 Nc6 3.Bb5 Nf6 4.0-0 Nxe4 5.Re1 Nd6 6.Nxe5 Nxe5 7.Bf1 Be7 8.Rxe5 0-0 9.d4 Bf6 10.Re1 Re8 11.Bf4 Rxe1 12.Qxe1 Ne8 13.c3 d5 14.Nd2 c6 15.Nb3 Bf5 16.Qe2 g6 17.Re1 Ng7 18.g4 Bc8 19.Bh6 b6 20.h3 a5 21.a3
= (0.10) Depth: 49/57 00:09:37 373mN
1.e4 e5
= (0.10 --) Depth: 50/57 00:09:45 378mN

SF20251005:

1.e4
= (0.03 ++) Depth: 43/70 00:06:29 268mN
1.e4 e5 2.Nf3 Nc6 3.Bb5 Nf6 4.0-0 Nxe4 5.Re1 Nd6 6.Nxe5 Be7 7.Bf1 Nxe5 8.Rxe5 0-0 9.d4 Ne8 10.c4 Bf6 11.Re1 d5 12.cxd5 Qxd5 13.Be3 Be6 14.Nc3 Qd7 15.d5 Bf5 16.a4 Nd6 17.a5 Rfe8 18.a6 bxa6 19.Rxa6 h6 20.Bxa7 Rxe1 21.Qxe1 Re8
= (0.03) Depth: 43/70 00:06:32 270mN
1.e4 e5 2.Nf3 Nc6 3.Bb5 Nf6 4.0-0 Nxe4 5.Re1 Nd6 6.Nxe5 Be7 7.Bf1 Nxe5 8.Rxe5 0-0 9.d4 Ne8 10.c4 Bf6 11.Re1 d5 12.cxd5 Qxd5 13.Be3 Be6 14.Nc3 Qd7 15.d5 Bf5 16.a4 Nd6 17.a5 Rfe8 18.a6 bxa6 19.Rxa6 Rab8 20.Qc1 h6 21.h3 Qd8 22.Rxa7 Bg6 23.Bd2 Rxe1 24.Qxe1
= (0.03) Depth: 44/58 00:07:14 296mN
1.e4 e5
= (0.02 --) Depth: 45/61 00:07:28 305mN
1.e4 e5
= (0.01 --) Depth: 45/61 00:07:41 314mN
1.e4
= (0.01 ++) Depth: 45/67 00:08:29 343mN
1.d4
= (0.03 ++) Depth: 45/67 00:09:07 368mN
1.d4 Nf6
= (0.00 --) Depth: 45/67 00:09:23 378mN
1.e4
= (0.00 ++) Depth: 45/67 00:10:05 405mN
1.e4 e5 2.Nf3 Nc6 3.Bb5 Nf6 4.0-0 Nxe4 5.Re1 Nd6 6.Nxe5 Be7 7.Bf1 Nxe5 8.Rxe5 0-0 9.d4 Ne8 10.d5 Bc5 11.Re1 d6 12.Nc3 Bf5 13.Bd3 Qh4 14.Ne4 Bxe4 15.Rxe4 Qxf2+ 16.Kh1 Qf6 17.Rf4 Qe5 18.Re4
= (0.00) Depth: 45/69 00:10:19 415mN
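One rough way to put a number on "less selective" is to compare how many nodes each version needed before it reported an exact score at the same depth. The sketch below assumes that `++` and `--` mark fail-high and fail-low results, that a line without either is an exact score, and that `mN` means million nodes. The sample lines are depth-45 lines copied from the two logs above; the 17.1 log starts with elided lines, so its first exact depth-45 line is taken from what is shown.

```python
import re
from typing import Optional

ROW_RE = re.compile(
    r"\((?P<score>-?\d+\.\d+)(?:\s+(?P<bound>\+\+|--))?\)\s+"
    r"Depth:\s+(?P<depth>\d+)/(?P<seldepth>\d+)\s+"
    r"(?P<h>\d+):(?P<m>\d+):(?P<s>\d+)\s+(?P<mnodes>\d+)mN"
)


def first_exact(lines, depth: int) -> Optional[dict]:
    """Return time and nodes of the first exact-score line at the given depth."""
    for line in lines:
        m = ROW_RE.search(line)
        if m and int(m["depth"]) == depth and m["bound"] is None:
            seconds = int(m["h"]) * 3600 + int(m["m"]) * 60 + int(m["s"])
            return {"seconds": seconds, "mnodes": int(m["mnodes"])}
    return None


SF_17_1 = [
    "= (0.20 ++) Depth: 45/54 00:04:05 156mN",
    "= (0.17 --) Depth: 45/56 00:04:16 161mN",
    "= (0.16) Depth: 45/56 00:04:26 166mN",
]
SF_DEV = [
    "= (0.02 --) Depth: 45/61 00:07:28 305mN",
    "= (0.00 ++) Depth: 45/67 00:10:05 405mN",
    "= (0.00) Depth: 45/69 00:10:19 415mN",
]

old = first_exact(SF_17_1, 45)
dev = first_exact(SF_DEV, 45)

if __name__ == "__main__":
    print(f"17.1: {old['mnodes']} mN in {old['seconds']} s")
    print(f"dev:  {dev['mnodes']} mN in {dev['seconds']} s")
    print(f"node ratio dev/17.1: {dev['mnodes'] / old['mnodes']:.2f}")
```

On these lines the dev version used 415 mN against 166 mN for an exact depth-45 score, a ratio of 2.5. This is a single 1-core run per version, so it is only an indication.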
Jouni
MOBMAT
Posts: 399
Joined: Sat Feb 04, 2017 11:57 pm
Location: USA

Re: Stockfish 20251005 is clear improvement in test suites

Post by MOBMAT »

I would like to be able to use the 6-man Nalimov EGTB in SF.

The Nalimov TBs are the main reason I still use Houdini for mate solving.
i7-6700K @ 4.00Ghz 32Gb, Win 10 Home, EGTBs on PCI SSD
Benchmark: Stockfish15.1 NNUE x64 bmi2 (nps): 1277K
Ajedrecista
Posts: 2134
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Stockfish 20251005 is clear improvement in test suites.

Post by Ajedrecista »

Hello:
MOBMAT wrote: Sat Oct 11, 2025 12:39 am I would like to be able to use the 6-man Nalimov EGTB in SF.

The Nalimov TBs are the main reason I still use Houdini for mate solving.
CPW - Nalimov Tablebases
CPW wrote:[...] however the license policy requires explicit permission by Eugene Nalimov.
As far as I remember, this permission has been nearly impossible to obtain for many years. Houdini could be one of the few exceptions, but having Gaviota EGTB and later Syzygy EGTB seems to be enough for (almost) anybody.

Regards from Spain.

Ajedrecista.