Reminder: Stockfish is best with 2 cores

Werewolf
Posts: 2042
Joined: Thu Sep 18, 2008 10:24 pm

Re: Reminder: Stockfish is best with 2 cores

Post by Werewolf »

gordonr wrote: Sun Oct 05, 2025 4:13 am
Werewolf wrote: Sat Oct 04, 2025 11:39 am looks sound

[d]1r4k1/4pp1p/pp1pq1p1/r2R4/PbP1P3/1P1QBP1P/R5P1/7K w - - 0 1

averaged over 3 runs
2 cores: 11.08s
16 cores: 12.95s

SF dev 24th Aug 2025, AMD Ryzen 9 5950X 16-Core Processor (3.40 GHz), 2 GB hash, 6 men TBs

100 runs

2 threads:

13.9, 7.49, 13.02, 9.74, 13.15, 10.9, 16.3, 8.24, 4.72, 14.38, 11.25, 6.3, 36.06, 9.19, 6.98, 15.07, 1.35, 7.33, 9.92, 25.73, 
9.14, 52.39, 7.56, 14.18, 19.37, 10.94, 6.28, 15.05, 15.05, 14.73, 10.93, 9.95, 11.27, 18.47, 12.06, 9.19, 18.56, 12.14, 12.1, 3.91, 
9.07, 14.55, 19.2, 18.16, 22.59, 16.56, 9.38, 2.74, 8.99, 12.64, 14.05, 11.25, 11.66, 10.17, 8.98, 1.23, 3.2, 5.24, 3.42, 40.82, 
14.43, 5.34, 13.96, 6.61, 32.61, 9.16, 12.07, 3.4, 13.02, 4.88, 9.79, 18.28, 14.75, 11.83, 19.5, 20.37, 4.78, 14.25, 16.32, 11.81, 
6.89, 12.15, 12.3, 8.44, 6.55, 22.23, 6.38, 8.77, 7.48, 16.05, 10.59, 12.1, 1.46, 11.24, 9.47, 10.25, 7.28, 11.2, 6.74, 12.84, 

Mean: 12.26 secs

[1.23, 1.35, 1.46, 2.74, 3.2, 3.4, 3.42, 3.91, 4.72, 4.78, 4.88, 5.24, 5.34, 6.28, 6.3, 6.38, 6.55, 6.61, 6.74, 6.89, 6.98, 7.28, 7.33, 7.48, 7.49, 7.56, 8.24, 8.44, 8.77, 8.98, 8.99, 9.07, 9.14, 9.16, 9.19, 9.19, 9.38, 9.47, 9.74, 9.79, 9.92, 9.95, 10.17, 10.25, 10.59, 10.9, 10.93, 10.94, 11.2, 11.24, 11.25, 11.25, 11.27, 11.66, 11.81, 11.83, 12.06, 12.07, 12.1, 12.1, 12.14, 12.15, 12.3, 12.64, 12.84, 13.02, 13.02, 13.15, 13.9, 13.96, 14.05, 14.18, 14.25, 14.38, 14.43, 14.55, 14.73, 14.75, 15.05, 15.05, 15.07, 16.05, 16.3, 16.32, 16.56, 18.16, 18.28, 18.47, 18.56, 19.2, 19.37, 19.5, 20.37, 22.23, 22.59, 25.73, 32.61, 36.06, 40.82, 52.39]

Median: 11.25 secs

------

16 threads:

5.33, 3.66, 2.2, 4.0, 4.11, 5.4, 1.32, 4.42, 4.18, 6.73, 3.91, 2.67, 4.21, 5.68, 7.21, 2.19, 3.83, 3.05, 4.98, 2.85, 
5.0, 7.28, 6.72, 3.82, 3.16, 8.15, 9.25, 4.32, 2.2, 4.9, 6.79, 5.45, 12.95, 7.07, 5.16, 4.82, 10.44, 13.05, 7.04, 5.4, 
19.18, 10.26, 5.84, 6.11, 7.1, 2.01, 9.8, 7.7, 13.13, 7.63, 11.34, 7.62, 12.5, 3.77, 5.08, 9.39, 8.19, 2.03, 4.26, 7.84, 
5.46, 2.62, 4.92, 5.5, 9.5, 13.8, 7.02, 9.22, 7.21, 7.24, 6.69, 4.19, 10.51, 3.92, 7.17, 11.49, 7.09, 1.26, 4.19, 11.34, 
5.58, 12.32, 6.81, 3.94, 9.6, 7.67, 5.34, 6.67, 7.66, 11.98, 7.17, 5.45, 9.39, 9.81, 5.07, 13.94, 11.88, 13.38, 6.13, 3.97, 

Mean: 6.8 secs

[1.26, 1.32, 2.01, 2.03, 2.19, 2.2, 2.2, 2.62, 2.67, 2.85, 3.05, 3.16, 3.66, 3.77, 3.82, 3.83, 3.91, 3.92, 3.94, 3.97, 4.0, 4.11, 4.18, 4.19, 4.19, 4.21, 4.26, 4.32, 4.42, 4.82, 4.9, 4.92, 4.98, 5.0, 5.07, 5.08, 5.16, 5.33, 5.34, 5.4, 5.4, 5.45, 5.45, 5.46, 5.5, 5.58, 5.68, 5.84, 6.11, 6.13, 6.67, 6.69, 6.72, 6.73, 6.79, 6.81, 7.02, 7.04, 7.07, 7.09, 7.1, 7.17, 7.17, 7.21, 7.21, 7.24, 7.28, 7.62, 7.63, 7.66, 7.67, 7.7, 7.84, 8.15, 8.19, 9.22, 9.25, 9.39, 9.39, 9.5, 9.6, 9.8, 9.81, 10.26, 10.44, 10.51, 11.34, 11.34, 11.49, 11.88, 11.98, 12.32, 12.5, 12.95, 13.05, 13.13, 13.38, 13.8, 13.94, 19.18]

Median: 6.67 secs
How are you testing? Are you running each position discretely, or as part of a suite? I don't think Fritz clears the hash between each position in a testsuite?

Also you're using a later version of Stockfish.
peter
Posts: 3412
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: Reminder: Stockfish is best with 2 cores

Post by peter »

Werewolf wrote: Mon Oct 06, 2025 8:10 pm I don't think Fritz clears the hash between each position in a testsuite?
I think it does. A ucinewgame command should be sent with each position newly loaded from a database. So engines like Stockfish, which treat ucinewgame like (or similarly to) a clear-hash command, should start with an empty hash on positions loaded automatically from a .cbh file (the ChessBase equivalent of .pgn). That is not quite the same as a fresh start of the GUI, so some hash remnants might remain, but test positions in test suites normally aren't similar enough to each other for such remnants to help much anyhow, at least not the single-best-move positions that I normally use with Fritz. Regards,
Peter.
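
Editorial sketch, not from the original posts: the per-position sequence described above could look roughly like this at the UCI level. The helper name `commands_for_position` and the default search time are hypothetical, and whether Fritz sends exactly this sequence is not confirmed anywhere in this thread. The commands themselves (`ucinewgame`, `isready`, `position`, `go`) are standard UCI.

```python
def commands_for_position(fen, movetime_ms=10000):
    """Build the UCI command lines for analysing one test position.

    Hypothetical helper for illustration only.
    """
    return [
        "ucinewgame",                  # next search is unrelated to the previous one
        "isready",                     # caller should wait for 'readyok' after this
        f"position fen {fen}",         # set up the test position
        f"go movetime {movetime_ms}",  # search for a fixed time in milliseconds
    ]


if __name__ == "__main__":
    fen = "1r4k1/4pp1p/pp1pq1p1/r2R4/PbP1P3/1P1QBP1P/R5P1/7K w - - 0 1"
    for line in commands_for_position(fen):
        print(line)
```

How an engine reacts to `ucinewgame` (for example, by clearing its hash) is up to the engine, which is why the reply above is worded cautiously.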
Werewolf
Posts: 2042
Joined: Thu Sep 18, 2008 10:24 pm

Re: Reminder: Stockfish is best with 2 cores

Post by Werewolf »

peter wrote: Mon Oct 06, 2025 9:02 pm
Werewolf wrote: Mon Oct 06, 2025 8:10 pm I don't think Fritz clears the hash between each position in a testsuite?
I think it does. A ucinewgame command should be sent with each position newly loaded from a database. So engines like Stockfish, which treat ucinewgame like (or similarly to) a clear-hash command, should start with an empty hash on positions loaded automatically from a .cbh file (the ChessBase equivalent of .pgn). That is not quite the same as a fresh start of the GUI, so some hash remnants might remain, but test positions in test suites normally aren't similar enough to each other for such remnants to help much anyhow, at least not the single-best-move positions that I normally use with Fritz. Regards,
Thanks, good to know!
Werewolf
Posts: 2042
Joined: Thu Sep 18, 2008 10:24 pm

Re: Reminder: Stockfish is best with 2 cores

Post by Werewolf »

SF17.1 / 16GB Hash

16C
3.39, 9.42, 14.28, 5.17, 5.59, 16.14, 5.16, 4.08, 5.63, 15.13, 14.05, 13.94, 13.59, 8.97, 6.45, 6.56, 2.58, 7.48, 16.52, 8.58, 10.31, 4.63, 3.03, 3.20, 4.38, 7.78, 9.52, 4.64, 3.44, 6.47, 6.09, 7.89, 8.33, 4.44, 1.92, 3.58, 2.20, 9.64, 5.61, 12.94, 9.17, 8.02, 15.66, 7.25, 2.89, 13.38, 5.84, 4.88, 6.55, 2.97, 9.72, 7.41, 7.25, 14.94, 12.39, 5.47, 3.81, 11.63, 4.64, 4.14, 7.25, 4.88, 9.48, 4.84, 3.20, 2.63, 15.94, 9.73, 1.39, 6.47, 5.73, 8.88, 13.02, 6.86, 3.34, 6.58, 9.06, 3.81, 7.98, 10.36, 4.41, 8.86, 4.30, 2.77, 14.81, 16.11, 2.61, 6.09, 1.23, 4.00, 6.91, 4.47, 9.19, 19.00, 10.83, 18.78, 1.42, 6.59, 5.47, 8.06
Average: 7.58s

2C
12.69, 9.75, 44.34, 11.50, 11.75, 16.39, 33.25, 25.36, 6.63, 24.97, 35.83, 31.27, 11.95, 21.08, 19.30, 72.44, 3.98, 27.00, 12.42, 15.38, 5.86, 12.03, 24.14, 27.53, 11.11, 12.02, 6.50, 33.80, 13.02, 8.27, 17.73, 16.41, 9.23, 3.34, 17.88, 20.20, 11.92, 34.58, 11.86, 3.25, 4.56, 3.41, 28.72, 37.77, 38.55, 12.84, 7.78, 25.22, 3.84, 22.25, 12.36, 44.19, 43.19, 67.77, 52.64, 8.86, 33.75, 10.81, 4.09, 13.36, 15.13, 9.31, 45.16, 23.63, 47.22, 2.39, 80.75, 27.91, 11.83, 34.67, 37.67, 20.77, 9.75, 36.31, 32.38, 31.94, 62.08, 17.86, 26.00, 18.33, 42.61, 5.97, 28.56, 24.86, 41.73, 20.83, 44.64, 51.05, 6.16, 20.94, 8.27, 18.80, 39.20, 17.11, 30.91, 69.63, 58.39, 30.97, 38.86, 47.53
Average: 24.62s

That's conclusive.
Now for the very hard question: why is this not translating into more solves when the time is fixed?
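
Editorial sketch, not from the original posts: one way to relate time-to-solution samples to fixed-time results is to count the fraction of runs that finish within the time limit. The function name `solve_rate` and the two limits are illustrative assumptions, and the lists are only the first ten 2C and 16C values from the data above, not the full samples.

```python
def solve_rate(times, limit):
    """Fraction of runs whose time to solution is within `limit` seconds."""
    if not times:
        raise ValueError("need at least one run")
    return sum(1 for t in times if t <= limit) / len(times)


# First ten values of each list above (a subset, for illustration only).
two_cores = [12.69, 9.75, 44.34, 11.50, 11.75, 16.39, 33.25, 25.36, 6.63, 24.97]
sixteen_cores = [3.39, 9.42, 14.28, 5.17, 5.59, 16.14, 5.16, 4.08, 5.63, 15.13]

for limit in (15, 30):
    print(limit, solve_rate(two_cores, limit), solve_rate(sixteen_cores, limit))
```

On this subset the gap between the two configurations narrows as the limit grows, so a large difference in average time need not show up as an equally large difference in solve counts. Whether that accounts for the suite results discussed in this thread is not something this sketch can establish.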
gordonr
Posts: 236
Joined: Thu Aug 06, 2009 8:04 pm
Location: UK

Re: Reminder: Stockfish is best with 2 cores

Post by gordonr »

Werewolf wrote: Mon Oct 06, 2025 8:10 pm How are you testing? Are you running each position discretely, or as part of a suite? I don't think Fritz clears the hash between each position in a testsuite?
I use my own test code and I restart the SF engine between each test position (not just clearing hash) - so e.g. 200 runs means 200 starts of the SF engine. See here: https://www.talkchess.com/forum/viewtop ... 36#p657336
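
Editorial sketch, not gordonr's actual code (which is behind the link above): a minimal way to time one run with a freshly started engine process. It assumes that "solved" means the first time the solution move leads the principal variation in an `info` line; the function names, defaults, and that definition are all assumptions made for illustration.

```python
import subprocess
import time


def first_pv_move(info_line):
    """Return the first move of the PV in a UCI 'info' line, or None."""
    tokens = info_line.split()
    if not tokens or tokens[0] != "info" or "pv" not in tokens:
        return None
    idx = tokens.index("pv")
    return tokens[idx + 1] if idx + 1 < len(tokens) else None


def time_to_solution(engine_path, fen, solution, threads=2, hash_mb=2048, limit_s=120.0):
    """Start a new engine process and time until `solution` first leads the PV.

    Returns elapsed seconds, or None if the search ends without that happening.
    """
    with subprocess.Popen([engine_path], stdin=subprocess.PIPE,
                          stdout=subprocess.PIPE, text=True, bufsize=1) as proc:

        def send(command):
            proc.stdin.write(command + "\n")
            proc.stdin.flush()

        def wait_for(expected):
            for line in proc.stdout:
                if line.strip() == expected:
                    return
            raise RuntimeError(f"engine exited before sending {expected!r}")

        try:
            send("uci")
            wait_for("uciok")
            send(f"setoption name Threads value {threads}")
            send(f"setoption name Hash value {hash_mb}")
            send("isready")
            wait_for("readyok")
            send(f"position fen {fen}")
            start = time.perf_counter()
            send(f"go movetime {int(limit_s * 1000)}")
            for line in proc.stdout:
                if first_pv_move(line) == solution:
                    return time.perf_counter() - start
                if line.startswith("bestmove"):
                    break
            return None
        finally:
            proc.kill()  # a fresh process per run, so nothing carries over
```

A call would look like `time_to_solution("/path/to/stockfish", fen, "<solution move in UCI notation>", threads=16)`, repeated once per run, with the engine path and solution move supplied by the tester.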
Werewolf wrote: Mon Oct 06, 2025 8:10 pm Also you're using a later version of Stockfish.
You reached your own conclusion, but here's my data for SF 17.1:

SF 17.1

200 runs

2 threads:

7.82, 12.1, 6.28, 11.39, 8.12, 12.77, 10.81, 10.93, 9.95, 5.66, 3.46, 6.65, 4.88, 3.51, 3.52, 5.15, 4.81, 9.04, 7.14, 5.91, 
3.47, 5.78, 3.55, 5.72, 9.89, 6.18, 6.09, 16.12, 7.94, 8.47, 13.96, 4.52, 11.0, 3.76, 3.15, 9.41, 13.83, 7.82, 7.93, 1.93, 
2.7, 2.49, 6.86, 4.34, 9.44, 3.43, 9.43, 4.65, 8.73, 4.03, 5.02, 4.34, 6.0, 10.86, 5.91, 1.98, 1.95, 9.21, 8.11, 2.43, 
7.39, 11.6, 6.57, 6.24, 8.04, 9.81, 13.22, 3.52, 3.9, 12.38, 5.97, 7.33, 5.87, 4.15, 10.9, 8.27, 3.65, 6.74, 3.3, 5.75, 
6.8, 5.31, 9.04, 8.21, 5.21, 6.11, 6.69, 4.95, 8.82, 18.82, 29.23, 7.25, 4.69, 5.8, 5.54, 15.06, 6.09, 5.25, 6.21, 5.2, 
2.98, 2.06, 6.35, 5.37, 7.19, 4.58, 3.68, 5.38, 3.14, 5.0, 1.35, 4.52, 10.74, 8.28, 10.81, 8.28, 3.7, 11.14, 12.65, 10.01, 
5.6, 3.6, 18.08, 4.62, 6.05, 8.29, 3.43, 4.71, 7.45, 6.08, 8.81, 9.38, 10.78, 4.88, 5.09, 1.85, 3.17, 3.43, 6.45, 3.8, 
2.6, 5.04, 8.32, 1.42, 9.25, 8.52, 9.74, 5.91, 6.44, 7.21, 5.57, 8.28, 1.45, 6.65, 3.91, 1.0, 3.6, 4.39, 11.48, 7.87, 
3.44, 3.51, 4.63, 6.13, 6.55, 6.29, 9.08, 5.18, 13.51, 10.49, 7.66, 4.73, 4.83, 11.25, 4.75, 8.2, 5.41, 10.78, 9.01, 10.51, 
11.6, 3.23, 7.22, 8.46, 9.14, 5.3, 4.76, 8.75, 4.54, 8.8, 41.22, 9.96, 1.68, 2.33, 5.76, 11.22, 5.17, 2.36, 4.42, 2.49, 

Mean: 7.01 seconds

[1.0, 1.35, 1.42, 1.45, 1.68, 1.85, 1.93, 1.95, 1.98, 2.06, 2.33, 2.36, 2.43, 2.49, 2.49, 2.6, 2.7, 2.98, 3.14, 3.15, 3.17, 3.23, 3.3, 3.43, 3.43, 3.43, 3.44, 3.46, 3.47, 3.51, 3.51, 3.52, 3.52, 3.55, 3.6, 3.6, 3.65, 3.68, 3.7, 3.76, 3.8, 3.9, 3.91, 4.03, 4.15, 4.34, 4.34, 4.39, 4.42, 4.52, 4.52, 4.54, 4.58, 4.62, 4.63, 4.65, 4.69, 4.71, 4.73, 4.75, 4.76, 4.81, 4.83, 4.88, 4.88, 4.95, 5.0, 5.02, 5.04, 5.09, 5.15, 5.17, 5.18, 5.2, 5.21, 5.25, 5.3, 5.31, 5.37, 5.38, 5.41, 5.54, 5.57, 5.6, 5.66, 5.72, 5.75, 5.76, 5.78, 5.8, 5.87, 5.91, 5.91, 5.91, 5.97, 6.0, 6.05, 6.08, 6.09, 6.09, 6.11, 6.13, 6.18, 6.21, 6.24, 6.28, 6.29, 6.35, 6.44, 6.45, 6.55, 6.57, 6.65, 6.65, 6.69, 6.74, 6.8, 6.86, 7.14, 7.19, 7.21, 7.22, 7.25, 7.33, 7.39, 7.45, 7.66, 7.82, 7.82, 7.87, 7.93, 7.94, 8.04, 8.11, 8.12, 8.2, 8.21, 8.27, 8.28, 8.28, 8.28, 8.29, 8.32, 8.46, 8.47, 8.52, 8.73, 8.75, 8.8, 8.81, 8.82, 9.01, 9.04, 9.04, 9.08, 9.14, 9.21, 9.25, 9.38, 9.41, 9.43, 9.44, 9.74, 9.81, 9.89, 9.95, 9.96, 10.01, 10.49, 10.51, 10.74, 10.78, 10.78, 10.81, 10.81, 10.86, 10.9, 10.93, 11.0, 11.14, 11.22, 11.25, 11.39, 11.48, 11.6, 11.6, 12.1, 12.38, 12.65, 12.77, 13.22, 13.51, 13.83, 13.96, 15.06, 16.12, 18.08, 18.82, 29.23, 41.22]

Median: 6.11 seconds

----------

16 threads:

7.43, 1.0, 2.26, 1.16, 2.08, 1.06, 3.07, 4.6, 2.79, 1.09, 1.68, 3.45, 6.0, 2.33, 3.15, 3.21, 5.09, 1.41, 1.86, 1.01, 
4.23, 2.56, 1.4, 2.39, 1.71, 5.58, 1.78, 2.69, 1.77, 7.34, 1.95, 1.62, 2.28, 2.32, 4.41, 1.35, 1.26, 2.01, 1.77, 1.52, 
7.94, 2.53, 2.89, 3.27, 2.47, 3.89, 1.12, 5.61, 2.39, 2.25, 1.0, 2.87, 4.2, 4.05, 3.03, 2.19, 2.78, 9.31, 3.21, 3.39, 
4.19, 1.26, 2.03, 2.25, 2.79, 3.04, 1.97, 1.22, 4.29, 4.43, 1.44, 3.02, 1.73, 1.96, 3.28, 1.46, 3.02, 3.0, 1.78, 3.99, 
2.06, 3.07, 2.45, 1.15, 1.76, 5.77, 2.74, 12.66, 2.37, 5.38, 4.13, 2.6, 0.99, 2.07, 14.85, 3.66, 1.85, 2.33, 1.74, 1.8, 
2.16, 1.92, 8.8, 1.51, 1.0, 3.25, 2.97, 1.95, 3.71, 1.77, 2.32, 6.04, 7.54, 2.56, 2.38, 2.83, 3.35, 1.77, 3.3, 2.99, 
5.79, 2.36, 3.09, 2.31, 4.17, 3.09, 5.88, 6.67, 3.98, 3.29, 2.41, 1.42, 2.03, 1.16, 2.25, 7.91, 5.03, 1.84, 3.45, 2.99, 
2.96, 3.67, 5.14, 2.63, 8.3, 1.87, 4.17, 3.37, 2.52, 1.84, 3.07, 2.5, 2.73, 4.25, 1.79, 3.56, 3.19, 1.89, 2.52, 9.2, 
2.09, 2.44, 2.96, 2.58, 2.53, 1.47, 3.4, 3.36, 1.81, 1.81, 0.98, 1.78, 3.88, 2.38, 3.94, 4.03, 1.89, 2.06, 1.44, 3.43, 
1.22, 2.39, 3.57, 3.91, 3.36, 1.48, 3.5, 6.49, 2.88, 1.85, 4.34, 2.78, 3.27, 1.9, 5.3, 4.48, 5.88, 1.51, 1.52, 2.72, 

Mean: 3.13 seconds

[0.98, 0.99, 1.0, 1.0, 1.0, 1.01, 1.06, 1.09, 1.12, 1.15, 1.16, 1.16, 1.22, 1.22, 1.26, 1.26, 1.35, 1.4, 1.41, 1.42, 1.44, 1.44, 1.46, 1.47, 1.48, 1.51, 1.51, 1.52, 1.52, 1.62, 1.68, 1.71, 1.73, 1.74, 1.76, 1.77, 1.77, 1.77, 1.77, 1.78, 1.78, 1.78, 1.79, 1.8, 1.81, 1.81, 1.84, 1.84, 1.85, 1.85, 1.86, 1.87, 1.89, 1.89, 1.9, 1.92, 1.95, 1.95, 1.96, 1.97, 2.01, 2.03, 2.03, 2.06, 2.06, 2.07, 2.08, 2.09, 2.16, 2.19, 2.25, 2.25, 2.25, 2.26, 2.28, 2.31, 2.32, 2.32, 2.33, 2.33, 2.36, 2.37, 2.38, 2.38, 2.39, 2.39, 2.39, 2.41, 2.44, 2.45, 2.47, 2.5, 2.52, 2.52, 2.53, 2.53, 2.56, 2.56, 2.58, 2.6, 2.63, 2.69, 2.72, 2.73, 2.74, 2.78, 2.78, 2.79, 2.79, 2.83, 2.87, 2.88, 2.89, 2.96, 2.96, 2.97, 2.99, 2.99, 3.0, 3.02, 3.02, 3.03, 3.04, 3.07, 3.07, 3.07, 3.09, 3.09, 3.15, 3.19, 3.21, 3.21, 3.25, 3.27, 3.27, 3.28, 3.29, 3.3, 3.35, 3.36, 3.36, 3.37, 3.39, 3.4, 3.43, 3.45, 3.45, 3.5, 3.56, 3.57, 3.66, 3.67, 3.71, 3.88, 3.89, 3.91, 3.94, 3.98, 3.99, 4.03, 4.05, 4.13, 4.17, 4.17, 4.19, 4.2, 4.23, 4.25, 4.29, 4.34, 4.41, 4.43, 4.48, 4.6, 5.03, 5.09, 5.14, 5.3, 5.38, 5.58, 5.61, 5.77, 5.79, 5.88, 5.88, 6.0, 6.04, 6.49, 6.67, 7.34, 7.43, 7.54, 7.91, 7.94, 8.3, 8.8, 9.2, 9.31, 12.66, 14.85]

Median: 2.63 seconds
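
Editorial sketch, not from the original posts: with run-to-run variation this large, a bootstrap interval is one simple way to judge whether two medians really differ. The function name `bootstrap_median_diff`, the resample count, and the seed are illustrative assumptions; the example lists are only the first ten values of each SF 17.1 sample above.

```python
import random
import statistics


def bootstrap_median_diff(a, b, resamples=2000, seed=0):
    """Approximate 95% interval for median(a) - median(b) by bootstrap resampling."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(resamples):
        resampled_a = rng.choices(a, k=len(a))  # sample with replacement
        resampled_b = rng.choices(b, k=len(b))
        diffs.append(statistics.median(resampled_a) - statistics.median(resampled_b))
    diffs.sort()
    lower = diffs[int(0.025 * resamples)]
    upper = diffs[int(0.975 * resamples) - 1]
    return lower, upper


# First ten values of each SF 17.1 list above (a subset, for illustration only).
two_threads = [7.82, 12.1, 6.28, 11.39, 8.12, 12.77, 10.81, 10.93, 9.95, 5.66]
sixteen_threads = [7.43, 1.0, 2.26, 1.16, 2.08, 1.06, 3.07, 4.6, 2.79, 1.09]

print(bootstrap_median_diff(two_threads, sixteen_threads))
```

If the whole interval lies above zero, the 2-thread median is plausibly larger than the 16-thread median for this position; running it on the full 200-value lists would give a tighter interval than the ten-value subsets used here.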

peter
Posts: 3412
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: Reminder: Stockfish is best with 2 cores

Post by peter »

Werewolf wrote: Mon Oct 06, 2025 11:34 pm why is this not translating into more solves when the time is fixed?
Who says it isn't?
I mean, apart from Jouni, who so far hasn't given any data here to support a claim that contradicts all experience, both with game playing and with any statistically relevant positional testing?
:)
E.g. here

https://forum.computerschach.de/cgi-bin ... #pid176682

and here

https://forum.computerschach.de/cgi-bin ... #pid176738

I recently gave some examples of testing with suites of a reasonably large sample size, to see how SF dev (20250913 in the linked examples) scales on them at a given hardware TC.

I just haven't taken part in this discussion so far because I had the impression that everyone taking part was determined to discuss single positions only. Of course, you can take any single position you want and try to get a statistically relevant difference in time to solution between different hardware, thread counts, hash sizes, concurrencies and GUIs, however you define time to solution. But you won't get any transitivity to the next single position, so the result tells you nothing about the statistically relevant results you would get from that next position. That's the old nonsense discussion about comparability between positional testing and game playing all over again. In game playing you don't play out games from a single position either, do you? You want enough different starting positions and enough games for the results to be comparable with each other. Who thinks there's a qualitative difference from any other kind of positional testing? Game playing from given starting positions is simply played-out positional testing, isn't it?

So don't hope to get statistically relevant results out of positional testing without sample size. No matter how you do it (interactively, with tools like MEA, or with GUIs), even if all the positions are correctly and completely analysed for best move, second-best move, eval and outcome in terms of WDL probability, and however many criteria you define, what you still need in order to get results that are statistically relevant and in any way comparable to other kinds of positional testing, played out or not, is sample size, you see?
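
Editorial sketch, not from the original post: the sample-size point can be illustrated with the standard error of an observed solve rate. This treats every position as an independent trial with the same solve probability, which is a simplification (real test positions differ in difficulty); the function name is an illustrative assumption.

```python
import math


def solve_rate_standard_error(p, n):
    """Standard error of an observed solve rate `p` over `n` independent positions."""
    if n <= 0 or not 0.0 <= p <= 1.0:
        raise ValueError("need n > 0 and 0 <= p <= 1")
    return math.sqrt(p * (1.0 - p) / n)


# Quadrupling the number of positions halves the error bar.
for n in (25, 100, 400, 1600):
    print(n, round(solve_rate_standard_error(0.5, n), 4))
```

Under these assumptions, two engines or settings whose true solve rates differ by a couple of percentage points cannot be told apart with a few dozen positions, because the error bar is larger than the difference.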

If engine-engine game playing (and testing by this one method only) dies a draw death (also if you count game pairs only, where a too-big one-sided advantage in the opening positions simply replaces pairs of 2 draws with drawn pairs of 2 wins), who can hope with good reason that positional testing doesn't die a draw death as well? You can save some hardware time and get statistically relevant results somewhat sooner by not letting all your test positions be played out to full games and evaluating the output and move choice on the test positions "only" instead, yet the closeness of the engines (or of their settings and hardware usage at different TCs) and their performances won't fundamentally get any smaller that way. You can get results more quickly with especially selective positions (or stick to a single one, as said before), but what you gain in distinction relative to the error bar, and so in statistical relevance reached sooner, you pay for with less transitivity to e.g. game-playing results and to other samples of test positions. So if you also want results that mean anything for "overall playing strength" (good old "elosion", as I like to call it; "overall" never existed at all, neither for engines nor for humans, and certainly not in the sense of "positional independence" of any kind of testing), you have to have both: sample size for statistical significance, and a selection of positions that gets the biggest distinction-to-error-bar ratio out of the smallest sample size.
The third thing you need, if you want any comparability to the kind of engine-engine game playing we usually agree is practically doable in terms of engine-pool size and the time needed to get enough games done, or maybe even to a different kind of game playing (different engine pool, different hardware TC, different starting positions): then you need even more positions of even more varied character (in the hardware TC needed for the best engines to fully solve them for the right reasons on average; tactical ones with forced follow-up lines; positional ones with more than one solution move of nearly the same WDL probability; positions with much and with little material on the board; with more and with less advantage for one side; positions to be won and positions to be defended...).
Just my two cents
:)
Peter.