I doubt it would be fair to say therefore: 16 threads would = about 34,769,850 NPS
and 8 threads = 17,384,925 NPS
But perhaps it is NOT a linear relationship like that. Would anyone have a 'rule of thumb' if not?
Generally, Stockfish NNUE NPS also depends on the network size. Currently it makes 1M to 2M NPS per core on modern CPUs (depending, for example, on frequency and on the vector unit, such as SSE, AVX2 or NEON), and with HyperThreading on (SMT2, 2 threads per core) you gain roughly 1.5x more NPS.
--
Srdja
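As a rough illustration of the rule of thumb above, here is a minimal Python sketch. The function name `estimate_nps_range` and its parameters are made up for this example; the 1M to 2M NPS per core and the ~1.5x SMT gain are just the ballpark figures quoted in the post, not measurements.

```python
def estimate_nps_range(cores, smt=False,
                       per_core_low=1_000_000,
                       per_core_high=2_000_000,
                       smt_gain=1.5):
    """Return a (low, high) ballpark for total NPS.

    Assumptions (taken from the post, not measured):
    - each physical core delivers 1M to 2M NPS,
    - NPS scales linearly across physical cores,
    - running 2 threads per core (SMT/HT) adds roughly 1.5x on top.
    """
    factor = smt_gain if smt else 1.0
    low = int(cores * per_core_low * factor)
    high = int(cores * per_core_high * factor)
    return low, high


if __name__ == "__main__":
    # 16 physical cores, one thread per core
    print(estimate_nps_range(16))
    # same 16 cores with SMT on and 32 threads
    print(estimate_nps_range(16, smt=True))
```

The real figure depends on the network, clock speed and vector unit, so this should be treated as an order-of-magnitude guess only.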
So, in general, when doubling the thread count (say from 8 to 16), instead of a 2x gain in NPS you get closer to a 1.5x gain in NPS? Instead of going from, say, 100 to 200, you actually get closer to 150?
If thread count and core count are equal, you can expect a doubling of NPS: going from 8 cores with 8 threads to 16 cores with 16 threads doubles the NPS, assuming the same architecture and frequency, because most engines nowadays scale linearly NPS-wise across cores on a single socket. Then you turn SMT (or HT) on and you get a further ~1.5x NPS, depending on architecture and engine.
Modern Intel and AMD processors both profit from two threads per core (2-way SMT), in varying percentages. Some people prefer SMT (or HT) off during testing.
--
Srdja
Just to clarify: "then you turn SMT (or HT) on and run twice as many threads as cores, and you get a further ~1.5x NPS, depending on architecture and engine."
--
Srdja
Indeed. Running 16 threads on an 8-core machine is not the same as running 16 threads on a 16-core machine: those extra threads are competing for the same resources. On top of that, depending on architecture, when you fully load the CPU with those extra threads, the CPU clock will usually slow down so that it stays within thermal and/or power limits.
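To put numbers on the difference between the two cases, here is a small Python sketch that scales the measured figure from the original question (8 threads = 17,384,925 NPS). The function `scale_nps` is hypothetical. It assumes the measurement was taken with one thread per physical core, it interpolates linearly between no SMT gain and the full ~1.5x gain (my own simplification), and it ignores the clock slowdown under full load mentioned above, so real results will likely come in lower.

```python
def scale_nps(measured_nps, measured_threads, cores, threads, smt_gain=1.5):
    """Scale a measured NPS figure to another core/thread configuration.

    Assumptions (rules of thumb from this thread, not guarantees):
    - the measurement used one thread per physical core,
    - NPS is linear in threads while threads <= cores,
    - at 2 threads per core the total is ~smt_gain times the
      one-thread-per-core figure; in between we interpolate linearly,
    - clock throttling under full load is NOT modelled.
    """
    per_core = measured_nps / measured_threads
    if threads <= cores:
        return per_core * threads
    # Threads beyond the core count share cores with existing threads.
    extra = min(threads, 2 * cores) - cores
    return per_core * cores * (1 + (smt_gain - 1) * extra / cores)


if __name__ == "__main__":
    measured = 17_384_925  # NPS at 8 threads, from the original question
    # 16 threads on 16 physical cores: linear doubling
    print(f"{scale_nps(measured, 8, cores=16, threads=16):,.1f}")
    # 16 threads on 8 physical cores with SMT: roughly 1.5x
    print(f"{scale_nps(measured, 8, cores=8, threads=16):,.1f}")
```

Under these assumptions, 16 threads on 16 cores comes out at about 34.8M NPS, while 16 threads on 8 cores comes out at about 26.1M NPS.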
Thanks, that's what I was wondering. I need a new system and am thinking of the Zen 7950X because of all the cores, and I will largely use it for chess analysis. I'm thinking of lowering it to 115 Watts for 'overnight' or 'extended' work on many (but not all) cores/threads. Heck, given the number of cores/threads, I may leave it at 115 Watts more often than not, since it seems you get 80%+ of the goodness of the core for a lot less energy penalty. Besides, it's not like AMD has a lot of experience with pushing max heat/wattage thru their CPUs, and I would like the system to last a nice long time.
And thanks for that old link Vinvin - I've bookmarked it for future reference.
Sounds reasonable to buy the newest architecture (TSMC 5nm fab process) with many cores and then run it in ECO mode. AMD Zen3 and Zen4 are decent architectures for running chess IMO.