Stockfish and serious hardware: 384 threads
Moderator: Ras
-
- Posts: 3570
- Joined: Wed Mar 08, 2006 8:15 pm
- Full name: Jouni Uski
Stockfish and serious hardware: 384 threads
In the SF forum they have run some tests with 384 threads. The machine is 8x Intel Xeon Platinum 8168, reaching 431.403.814 nodes/s. Way to go! One test was 384 vs 64 threads, which gave ELO: +93.95 +-11.9 (95%) LOS: 100.0%. Hyperthreading also seems to be beneficial for SF.
Jouni
-
- Posts: 1494
- Joined: Thu Mar 30, 2006 2:08 pm
Re: Stockfish and serious hardware: 384 threads
Wow! I'm quite surprised. Thread-doubling experiments on an earlier Stockfish showed just a 6 Elo gain between 16 and 32 cores. Since 64 to 384 is between 2 and 3 doublings, I would have expected a much lower gain. There is something unexpected to learn here.
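To put numbers on the "between 2 and 3 doublings" remark, here is a minimal Python sketch. It assumes, purely for illustration, that the Elo gain is spread evenly across doublings, which real scaling need not do. The helper names `doublings` and `elo_per_doubling` are made up for this example.

```python
import math

def doublings(threads_from, threads_to):
    """Number of thread doublings between two thread counts."""
    return math.log2(threads_to / threads_from)

def elo_per_doubling(elo_gain, threads_from, threads_to):
    """Average Elo gain per doubling, ASSUMING the gain is spread evenly."""
    return elo_gain / doublings(threads_from, threads_to)

# 64 -> 384 threads, using the +93.95 Elo figure quoted in the first post
print(f"{doublings(64, 384):.2f} doublings")
print(f"{elo_per_doubling(93.95, 64, 384):.1f} Elo per doubling")
```

Under that even-spread assumption the quoted result comes to roughly 36 Elo per doubling, far above the 6 Elo per doubling remembered from the older 16 vs 32 core experiment.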
-
- Posts: 3570
- Joined: Wed Mar 08, 2006 8:15 pm
- Full name: Jouni Uski
Re: Stockfish and serious hardware: 384 threads
But I find it difficult to understand how they can test:
scaling result going from 192 threads to 384 hyperthreads
ELO: 22.27 +-9.7 (95%) LOS: 100.0%
Total: 1000 W: 134 L: 70 D: 796
when the Google server used is ALWAYS running with hyperthreads??
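For anyone who wants to check the quoted figures, the Elo, error margin and LOS can be recomputed from the W/L/D totals. The sketch below uses the common logistic Elo formula, a normal approximation for the 95% interval, and the usual erf-based LOS formula. That the original tool used exactly these formulas is an assumption, and `elo_from_score` and `match_stats` are illustrative names.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def match_stats(wins, losses, draws):
    """Return (elo, 95% margin, LOS) for a match result."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0)
    var = (wins * (1.0 - score) ** 2
           + losses * score ** 2
           + draws * (0.5 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    lo = elo_from_score(score - 1.96 * se)
    hi = elo_from_score(score + 1.96 * se)
    # Likelihood of superiority; draws do not enter this formula
    los = 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))
    return elo_from_score(score), (hi - lo) / 2.0, los

elo, margin, los = match_stats(134, 70, 796)
print(f"ELO: {elo:.2f} +-{margin:.1f} (95%) LOS: {los * 100:.1f}%")
```

With W: 134 L: 70 D: 796 this gives values that agree with the quoted 22.27 +-9.7 and LOS 100.0%.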
Jouni
-
- Posts: 694
- Joined: Sun Nov 08, 2015 11:10 pm
- Full name: Bojun Guo
Re: Stockfish and serious hardware: 384 threads
It is quite straightforward:
With ponder off, while the 192-thread engine is running, the worst-case scenario has 96 threads on real cores and 96 threads on HT. Regardless of any clever OS scheduling strategy that prevents this from happening, you can still see a fairly consistent scaling factor for doubling the number of threads, whatever the distribution. One can argue that this result for HT effectiveness may be an upper bound; people could then always run some 8 cores vs 4 cores + 4 HT tests on two machines, since we now know roughly how that would scale.