Elo gain by core doubling - Komodo 14, Stockfish 11

fastgm
Posts: 818
Joined: Mon Aug 19, 2013 6:57 pm

Elo gain by core doubling - Komodo 14, Stockfish 11

Post by fastgm »

AMD Ryzen Threadripper 3990X, 64 cores, 128 threads
Komodo 14 POPCNT vs Komodo 14 POPCNT, default, 128 MB Hash, TC = 10 + 0.1 sec, 3000 games
Stockfish 11 POPCNT vs Stockfish 11 POPCNT, default, 128 MB Hash, TC = 10 + 0.1 sec, 3000 games

Komodo 14

Threads   2 vs 1   4 vs 2   8 vs 4   16 vs 8   32 vs 16   64 vs 32   128 vs 64
Elo         78       74       59        43        25         33           4
Draw %      67.7     68.8     72.9      77.5      81.0       80.9        84.4

Stockfish 11

Threads   2 vs 1   4 vs 2   8 vs 4   16 vs 8   32 vs 16   64 vs 32   128 vs 64
Elo         91       81       66        49        46         47          14
Draw %      58.8     62.6     67.9      72.4      73.7       76.0        79.8

tpm = time per move, d = depth, nps = nodes per second

Komodo 14 T1      (tpm=213.0 d=19.25 nps=  2.233.619)
Komodo 14 T2      (tpm=210.4 d=20.74 nps=  4.447.790) - (tpm=212.5 d=20.25 nps=  4.414.826)
Komodo 14 T4      (tpm=210.3 d=21.72 nps=  8.648.412) - (tpm=215.4 d=21.32 nps=  8.096.784)
Komodo 14 T8      (tpm=213.7 d=22.70 nps= 15.485.326) - (tpm=214.4 d=23.04 nps= 17.355.067)
Komodo 14 T16     (tpm=215.9 d=24.33 nps= 31.353.128) - (tpm=216.8 d=24.62 nps= 33.949.832)
Komodo 14 T32     (tpm=215.0 d=25.00 nps= 58.672.598) - (tpm=213.2 d=25.92 nps= 73.913.606)
Komodo 14 T64     (tpm=205.6 d=26.61 nps=144.891.763) - (tpm=207.0 d=26.48 nps=142.430.040)
Komodo 14 T128    (tpm=191.8 d=25.54 nps=217.664.278)

Stockfish 11 T1   (tpm=207.6 d=22.86 nps=  1.974.300)
Stockfish 11 T2   (tpm=203.5 d=25.84 nps=  3.916.578) - (tpm=204.6 d=24.19 nps=  3.958.171)
Stockfish 11 T4   (tpm=201.4 d=26.85 nps=  7.648.719) - (tpm=202.6 d=25.56 nps=  8.055.504)
Stockfish 11 T8   (tpm=199.9 d=27.82 nps= 15.198.315) - (tpm=197.9 d=27.30 nps= 17.985.686)
Stockfish 11 T16  (tpm=196.2 d=28.97 nps= 31.934.055) - (tpm=197.9 d=28.37 nps= 34.072.265)
Stockfish 11 T32  (tpm=196.7 d=29.75 nps= 58.780.235) - (tpm=195.5 d=30.26 nps= 75.623.541)
Stockfish 11 T64  (tpm=194.0 d=31.94 nps=143.885.696) - (tpm=194.5 d=30.90 nps=142.776.895)
Stockfish 11 T128 (tpm=194.6 d=30.71 nps=213.497.473)
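
For readers who want the per-doubling speed ratios, here is a minimal sketch that parses the Stockfish 11 lines above. It assumes the dots in the nps figures are thousands separators and uses only the first parenthesised group of each line (the post does not say which match each group comes from). The names `parse_stats` and `doubling_ratios` are made up for illustration.

```python
import re

# First (tpm, d, nps) group of each Stockfish 11 line, copied from the post.
# Assumption: dots inside the nps figures are thousands separators.
SF11_LINES = """\
Stockfish 11 T1   (tpm=207.6 d=22.86 nps=  1.974.300)
Stockfish 11 T2   (tpm=203.5 d=25.84 nps=  3.916.578)
Stockfish 11 T4   (tpm=201.4 d=26.85 nps=  7.648.719)
Stockfish 11 T8   (tpm=199.9 d=27.82 nps= 15.198.315)
Stockfish 11 T16  (tpm=196.2 d=28.97 nps= 31.934.055)
Stockfish 11 T32  (tpm=196.7 d=29.75 nps= 58.780.235)
Stockfish 11 T64  (tpm=194.0 d=31.94 nps=143.885.696)
Stockfish 11 T128 (tpm=194.6 d=30.71 nps=213.497.473)
"""

LINE_RE = re.compile(r"T(\d+)\s+\(tpm=([\d.]+) d=([\d.]+) nps=\s*([\d.]+)\)")


def parse_stats(text):
    """Return {threads: (average depth, nps)} for every matching line."""
    stats = {}
    for m in LINE_RE.finditer(text):
        threads = int(m.group(1))
        depth = float(m.group(3))
        nps = int(m.group(4).replace(".", ""))  # strip thousands separators
        stats[threads] = (depth, nps)
    return stats


def doubling_ratios(stats):
    """Return {threads: (nps ratio, depth gain)} versus half as many threads."""
    out = {}
    for t in sorted(stats):
        if t > 1 and t // 2 in stats:
            d_hi, n_hi = stats[t]
            d_lo, n_lo = stats[t // 2]
            out[t] = (n_hi / n_lo, d_hi - d_lo)
    return out


for t, (ratio, gain) in doubling_ratios(parse_stats(SF11_LINES)).items():
    print(f"{t // 2:>3} -> {t:<3} threads: nps x{ratio:.2f}, depth {gain:+.2f}")
```

On these figures the 64 to 128 step comes out well below 2x in nps and with a lower average depth, so it should probably be read separately from the physical-core doublings.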
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Elo gain by core doubling - Komodo 14, Stockfish 11

Post by Laskos »

fastgm wrote: Mon Jun 15, 2020 9:02 am AMD Ryzen Threadripper 3990X, 64 cores, 128 threads
Komodo 14 POPCNT vs Komodo 14 POPCNT, default, 128 MB Hash, TC = 10 + 0.1 sec, 3000 games
Stockfish 11 POPCNT vs Stockfish 11 POPCNT, default, 128 MB Hash, TC = 10 + 0.1 sec, 3000 games

Nice, thanks. SF 11 seems to scale a bit better, although the draw rates are higher for Komodo 14, which somewhat compresses its gains. One would have to use Normalized Elo to see this a bit more clearly. Do you have the complete table of the results (w/d/l)?
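
As a rough illustration of the Normalized Elo idea, the sketch below derives it from the Elo and draw % figures in the first post. It assumes a plain win/draw/loss (trinomial) model and takes normalized Elo to be (score - 0.5) divided by the per-game standard deviation, scaled by 800 / ln(10); the function name `normalized_elo` is mine, not from any engine-testing tool.

```python
import math


def normalized_elo(elo, draw_rate):
    """Normalized Elo from a logistic Elo difference and a draw rate in 0..1.

    Assumption: trinomial (win/draw/loss) model, normalized Elo defined as
    (score - 0.5) / per-game standard deviation, scaled by 800 / ln(10).
    """
    score = 1.0 / (1.0 + 10.0 ** (-elo / 400.0))
    win = score - draw_rate / 2.0
    loss = 1.0 - score - draw_rate / 2.0
    if win < 0 or loss < 0:
        raise ValueError("draw rate is inconsistent with this Elo difference")
    variance = win + draw_rate / 4.0 - score ** 2
    if variance <= 0:
        raise ValueError("zero variance: normalized Elo is undefined")
    return (score - 0.5) / math.sqrt(variance) * 800.0 / math.log(10)


# (Elo, draw %) for each doubling, copied from the tables in the first post.
komodo = [(78, 67.7), (74, 68.8), (59, 72.9), (43, 77.5),
          (25, 81.0), (33, 80.9), (4, 84.4)]
stockfish = [(91, 58.8), (81, 62.6), (66, 67.9), (49, 72.4),
             (46, 73.7), (47, 76.0), (14, 79.8)]

for name, rows in (("Komodo 14", komodo), ("Stockfish 11", stockfish)):
    print(name, [round(normalized_elo(e, d / 100.0)) for e, d in rows])
```

With no draws and a small Elo difference the function returns roughly the plain Elo value; higher draw rates shrink the per-game variance and so raise the normalized figure, which is why the two engines end up closer together on this scale.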
fastgm
Posts: 818
Joined: Mon Aug 19, 2013 6:57 pm

Re: Elo gain by core doubling - Komodo 14, Stockfish 11

Post by fastgm »

Laskos wrote: Mon Jun 15, 2020 9:08 am Do you have the complete table of the results (w/d/l)?
Komodo 14

Threads   2 vs 1         4 vs 2         8 vs 4         16 vs 8        32 vs 16       64 vs 32       128 vs 64
w-d-l     815-2030-155   783-2064-153   657-2187-156   520-2326-154   392-2431-177   430-2427-143   251-2531-218

Stockfish 11

Threads   2 vs 1         4 vs 2         8 vs 4         16 vs 8        32 vs 16       64 vs 32       128 vs 64
w-d-l     1004-1764-232  902-1879-219   762-2038-200   626-2171-203   592-2212-196   560-2280-160   363-2395-242
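
The Elo and draw % figures can be re-derived from these records. The sketch below does that and adds an approximate 95% interval, assuming games are independent win/draw/loss trials (which ignores opening pairing and colour effects); `elo_with_interval` is a hypothetical helper, not the tool used to produce the original numbers.

```python
import math


def elo_from_score(score):
    """Logistic Elo difference corresponding to an expected score in (0, 1)."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_with_interval(wins, draws, losses, z=1.96):
    """Return (elo, low, high, draw %) for a W-D-L record.

    Assumption: games are independent trinomial trials; z=1.96 gives an
    approximate 95% interval on the mean score, mapped through the Elo curve.
    """
    n = wins + draws + losses
    w, d = wins / n, draws / n
    score = w + d / 2.0
    stderr = math.sqrt((w + d / 4.0 - score ** 2) / n)
    return (elo_from_score(score),
            elo_from_score(score - z * stderr),
            elo_from_score(score + z * stderr),
            100.0 * d)


# Records copied from the w-d-l tables above.
records = {
    "Komodo 14      2 vs 1": (815, 2030, 155),
    "Komodo 14    128 vs 64": (251, 2531, 218),
    "Stockfish 11   2 vs 1": (1004, 1764, 232),
    "Stockfish 11 128 vs 64": (363, 2395, 242),
}
for label, wdl in records.items():
    elo, low, high, draw_pct = elo_with_interval(*wdl)
    print(f"{label}: {elo:+.1f} Elo [{low:+.1f}, {high:+.1f}], draws {draw_pct:.1f}%")
```

Under these assumptions the Komodo 14 interval for 128 vs 64 threads straddles zero, while the Stockfish 11 interval for the same step stays positive.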
yurikvelo
Posts: 710
Joined: Sat Dec 06, 2014 1:53 pm

Re: Elo gain by core doubling - Komodo 14, Stockfish 11

Post by yurikvelo »

Another interesting data point would be an Mn/sec benchmark (startpos, go depth 28) at different core counts,
to distinguish scaling problems (speed gain after doubling cores) from diminishing returns (Elo gain after doubling nodes).
Even then, scaling problems are hard to distinguish from the effects of AMD Precision Boost technology.

This is Freq vs Threads:
[Image: CPU frequency vs. thread count]

Also, was it Linux or Windows?
If a CPU has more than 64 logical cores, Windows splits them into 2 processor groups.

An application needs special support to be able to run across multiple groups (up to 64 threads in each group).
Alayan
Posts: 550
Joined: Tue Nov 19, 2019 8:48 pm
Full name: Alayan Feh

Re: Elo gain by core doubling - Komodo 14, Stockfish 11

Post by Alayan »

Stockfish gets much better scaling results according to this data, but Stockfish 11 uses contempt by default, while Komodo 14 defaults to a contempt of 0. This might make a noticeable difference, as contempt will increase the doubling gain in such tests.
Raphexon
Posts: 476
Joined: Sun Mar 17, 2019 12:00 pm
Full name: Henk Drost

Re: Elo gain by core doubling - Komodo 14, Stockfish 11

Post by Raphexon »

Wow, thanks for the data.
Very informative.

Stockfish 11 T64 (tpm=194.0 d=31.94 nps=143.885.696) - (tpm=194.5 d=30.90 nps=142.776.895)
Stockfish 11 T128 (tpm=194.6 d=30.71 nps=213.497.473)

Average depth drops going from 64 to 128 threads (so time-to-depth gets worse), but there is still an Elo gain from widening.

This makes me assume that increasing the hash size would yield more Elo when doubling threads than when doubling the TC, since the search tree is wider (larger effective branching factor) with more threads.
fastgm
Posts: 818
Joined: Mon Aug 19, 2013 6:57 pm

Re: Elo gain by core doubling - Komodo 14, Stockfish 11

Post by fastgm »

yurikvelo wrote: Mon Jun 15, 2020 9:49 am Another interesting data point would be an Mn/sec benchmark (startpos, go depth 28) at different core counts,
to distinguish scaling problems (speed gain after doubling cores) from diminishing returns (Elo gain after doubling nodes).
Even then, scaling problems are hard to distinguish from the effects of AMD Precision Boost technology.
AMD Precision Boost was disabled, of course. Speed was 3.7 GHz

Stockfish 11, 2048 MB Hash, depth 28

1 thread

info depth 28 seldepth 40 multipv 1 score cp 61 nodes 41087252 nps 1.724.398 hashfull 163 tbhits 0 time 23827 pv d2d4 d7d5 c2c4 g8f6 g1f3 c7c5 c4d5 c5d4 f3d4 d8d5 b1c3 d5a5 d4b3 a5h5 h2h4 c8d7 c1d2 e7e5 f2f3 h5g6 g2g4 f6g4 f3g4 g6g4 h1h3 b8c6 e2e4 f8e7 d2e3 e7h4 h3h4 g4h4 e3f2 h4h2
bestmove d2d4 ponder d7d5

-------------------------

2 threads

info depth 28 seldepth 45 multipv 1 score cp 39 nodes 87577846 nps 3.444.691 hashfull 295 tbhits 0 time 25424 pv e2e4 e7e6 d2d4 d7d5 e4d5 e6d5 c2c4 g8f6 g1f3 f8b4 b1c3 e8g8 f1e2 c8e6 c4d5 f6d5 c1d2 b8c6 e1g1 b4e7 f1e1 e7f6 d2e3 c6e7 c3e4 e7f5 e4f6 d5f6 e2d3 d8d5 e3d2 f5d4 f3d4 d5d4 d2c3 d4h4 d1c2 c7c6 a2a3
bestmove e2e4 ponder e7e6

-------------------------

4 threads

info depth 28 seldepth 42 multipv 1 score cp 56 nodes 122385893 nps 6.745.254 hashfull 430 tbhits 0 time 18144 pv d2d4 d7d5 c2c4 e7e6 g1f3 g8f6 b1c3 c7c6 e2e3 f8d6 f1e2 b8d7 e1g1 e8g8 b2b3 b7b6 c1b2 h7h6 e2d3 c8b7 f1e1 f8e8 e3e4 d5e4 c3e4 f6e4 d3e4 d7f6 f3e5 f6e4 e1e4
bestmove d2d4 ponder d7d5

-------------------------

8 threads

info depth 28 seldepth 40 multipv 1 score cp 53 nodes 301457188 nps 13.543.767 hashfull 827 tbhits 0 time 22258 pv e2e4 e7e6 d2d4 d7d5 e4d5 e6d5 g1f3 g8f6 f1d3 c7c5 e1g1 c5c4 f1e1 f8e7 d3f1 e8g8 b2b3 c4b3 a2b3 f6e4 f1d3 c8f5 f3d2 f8e8 d2e4 d5e4 d3e4 e7b4 c2c3 e8e4 e1e4 f5e4 c3b4 e4g6 d4d5 d8f6
bestmove e2e4 ponder e7e6

-------------------------

16 threads

info depth 28 seldepth 43 multipv 1 score cp 64 nodes 409911336 nps 26.900.599 hashfull 904 tbhits 0 time 15238 pv d2d4 g8f6 c2c4 c7c6 c1f4 d7d5 e2e3 c8f5 b1c3 b8d7 g1f3 e7e6 f1e2 f6h5 f4g5 f8e7 g5e7 d8e7 e1g1 e8g8 h2h3 h5f6 a2a3 f6e4 c3e4 d5e4 f3d2 f5g6 f1e1 c6c5 d4d5 e6d5
bestmove d2d4 ponder g8f6

-------------------------

32 threads

info depth 28 seldepth 48 multipv 1 score cp 49 nodes 649109661 nps 53.649.860 hashfull 979 tbhits 0 time 12099 pv e2e4 e7e6 g1f3 d7d5 e4d5 e6d5 d2d4 g8f6 f1d3 c7c5 e1g1 c5c4 f1e1 f8e7 d3f1 e8g8 b2b3 c8e6 b3c4 d5c4 e1e6 f7e6 f1c4 f6d5 d1e2 d8d7 c4d3 b8c6 c2c3 a8d8 a2a4 h7h6 a4a5 g8h8 e2e4 d5f6 e4h4 e7d6 c3c4
bestmove e2e4 ponder e7e6

-------------------------

64 threads

info depth 28 seldepth 45 multipv 1 score cp 44 nodes 969567683 nps 102.795.555 hashfull 995 tbhits 0 time 9432 pv d2d4 e7e6 c2c4 d7d5 g1f3 g8f6 b1c3 c7c6 e2e3 b8d7 f1d3 d5c4 d3c4 b7b5 c4e2 b5b4 c3a4 f8e7 e1g1 e8g8 b2b3 c6c5 c1b2 c8b7 d4c5 d7c5 a4c5 e7c5 f3d2 d8e7 e2f3 f8d8 f3b7 e7b7
bestmove d2d4 ponder e7e6

-------------------------

128 threads

info depth 28 seldepth 40 multipv 1 score cp 53 nodes 1481877075 nps 146.358.229 hashfull 1000 tbhits 0 time 10125 pv g1f3 e7e6 e2e4 d7d5 e4d5 e6d5 d2d4 g8f6 f1d3 f8d6 d1e2 d6e7 e1g1 e8g8 f1e1 f8e8 c1g5 c8e6 c2c4 b8c6 b1c3 c6b4 c4d5 b4d5 f3e5 d5b4 d3c4 f6d5 c3d5 b4d5 g5e7 d8e7
bestmove g1f3 ponder e7e6
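
To compare the runs at a glance, here is a small sketch that tabulates nps speedup, per-thread efficiency, and time-to-depth speedup relative to the 1-thread run. The numbers are copied from the `info depth 28` lines above (nps written without the separator dots); `scaling_table` is an illustrative name, and since this is a single position the figures are noisy.

```python
# (threads, nodes, nps, time in ms) copied from the `info depth 28` lines above.
# nps is written here without the thousands-separator dots.
BENCH = [
    (1, 41087252, 1724398, 23827),
    (2, 87577846, 3444691, 25424),
    (4, 122385893, 6745254, 18144),
    (8, 301457188, 13543767, 22258),
    (16, 409911336, 26900599, 15238),
    (32, 649109661, 53649860, 12099),
    (64, 969567683, 102795555, 9432),
    (128, 1481877075, 146358229, 10125),
]


def scaling_table(bench):
    """Return rows of (threads, nps speedup, efficiency, time-to-depth speedup, node growth).

    Every figure is relative to the first entry of `bench` (the 1-thread run).
    """
    _, base_nodes, base_nps, base_ms = bench[0]
    rows = []
    for threads, nodes, nps, ms in bench:
        speedup = nps / base_nps
        rows.append((threads, speedup, speedup / threads,
                     base_ms / ms, nodes / base_nodes))
    return rows


print("threads  nps_x  eff   ttd_x  nodes_x")
for threads, speedup, eff, ttd, growth in scaling_table(BENCH):
    print(f"{threads:>7}  {speedup:5.1f}  {eff:4.2f}  {ttd:5.2f}  {growth:7.1f}")
```

In this run nps scales close to linearly up to 64 threads, but the time to reach depth 28 improves far less, because the number of nodes searched to reach that depth also grows strongly with the thread count.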

yurikvelo wrote: Mon Jun 15, 2020 9:49 am Also, was it Linux or Windows?
Windows 10 Enterprise
Last edited by fastgm on Mon Jun 15, 2020 12:09 pm, edited 1 time in total.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Elo gain by core doubling - Komodo 14, Stockfish 11

Post by Milos »

fastgm wrote: Mon Jun 15, 2020 9:02 am AMD Ryzen Threadripper 3990X, 64 cores, 128 threads
Komodo 14 POPCNT vs Komodo 14 POPCNT, default, 128 MB Hash, TC = 10 + 0.1 sec, 3000 games
Stockfish 11 POPCNT vs Stockfish 11 POPCNT, default, 128 MB Hash, TC = 10 + 0.1 sec, 3000 games

All really nice results. One big caveat, though: going from 64 to 128 threads you are not measuring scaling over physical cores but SMT (hyperthreading) scaling. So gaining 14 Elo from only about 40% extra nps at such a huge number of threads is actually not a bad result for SF.
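
One rough way to put that in numbers is to rescale the gain to a full speed doubling. This sketch assumes Elo grows linearly with log2(speed) over such a small range, which is only an approximation; `elo_per_speed_doubling` is a made-up helper.

```python
import math


def elo_per_speed_doubling(elo_gain, nps_ratio):
    """Scale an Elo gain to what a full 2x speedup would give.

    Assumption: Elo is linear in log2(speed) over this small range.
    """
    return elo_gain / math.log2(nps_ratio)


# Stockfish 11, 128 vs 64 threads: +14 Elo.
# Using the "about 40%" figure:
print(round(elo_per_speed_doubling(14, 1.40), 1))
# Using the in-game nps figures for T128 and T64 from the first post:
print(round(elo_per_speed_doubling(14, 213_497_473 / 143_885_696), 1))
```

Under that assumption, the 14 Elo corresponds to a noticeably larger gain per effective speed doubling, which is the sense in which the SMT step is not a bad result.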
chrisw
Posts: 4319
Joined: Tue Apr 03, 2012 4:28 pm

Re: Elo gain by core doubling - Komodo 14, Stockfish 11

Post by chrisw »

fastgm wrote: Mon Jun 15, 2020 11:57 am
yurikvelo wrote: Mon Jun 15, 2020 9:49 am Another interesting data point would be an Mn/sec benchmark (startpos, go depth 28) at different core counts,
to distinguish scaling problems (speed gain after doubling cores) from diminishing returns (Elo gain after doubling nodes).
Even then, scaling problems are hard to distinguish from the effects of AMD Precision Boost technology.
AMD Precision Boost was disabled, of course. Speed was 3.7 GHz

Stockfish 11, 2048 MB Hash, depth 28

Nice! But, and there’s always a but. A graph of threads against nps should show nps falling off slowly at higher thread counts. Your data does show that, but there is a glitch at 64 threads, probably because SF’s nps rate varies depending on what’s going on in the tree, and you’ve tested only one position, so tree variability will be high. If you rerun on several positions and average the nps, the falloff should be smoother, with fewer glitches. My guess as to why nps falls off at high thread counts is that each thread has its own memory requirement and, at a certain point, more and more memory accesses land outside the cache and are thus slower. I haven’t kept up with whether chip designs now provide cache memory for each core or whether all cores have to share; obviously a per-core cache would be cool.
Anyway, improvements at high thread counts are going to depend quite heavily on how small a memory footprint each engine has, so they are a function of the engine’s data design.
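
A minimal sketch of such a multi-position run, driving an engine over the standard UCI text protocol. The engine path and the FEN list are placeholders to be supplied by the tester, the function names are made up, and the result is total nodes divided by total time (so longer searches weigh more) rather than a plain mean of per-position nps.

```python
import subprocess


def last_nodes_and_time(lines):
    """Return (nodes, time_ms) from the last UCI `info` line that reports both."""
    for line in reversed(lines):
        tokens = line.split()
        if tokens[:1] == ["info"] and "nodes" in tokens and "time" in tokens:
            nodes = int(tokens[tokens.index("nodes") + 1])
            time_ms = int(tokens[tokens.index("time") + 1])
            return nodes, time_ms
    raise ValueError("no info line with nodes and time found")


def average_nps(engine_path, fens, threads, depth=20, hash_mb=128):
    """Search each FEN to a fixed depth and return total nodes / total seconds.

    engine_path is a placeholder for a local UCI engine binary (assumption).
    """
    proc = subprocess.Popen([engine_path], stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True, bufsize=1)

    def send(command):
        proc.stdin.write(command + "\n")
        proc.stdin.flush()

    def read_until(prefix):
        lines = []
        for line in proc.stdout:
            lines.append(line.strip())
            if lines[-1].startswith(prefix):
                return lines
        raise RuntimeError("engine closed its output unexpectedly")

    send("uci")
    read_until("uciok")
    send(f"setoption name Threads value {threads}")
    send(f"setoption name Hash value {hash_mb}")
    total_nodes = total_ms = 0
    for fen in fens:
        send("ucinewgame")  # ask the engine to reset state between positions
        send("isready")
        read_until("readyok")
        send(f"position fen {fen}")
        send(f"go depth {depth}")
        nodes, ms = last_nodes_and_time(read_until("bestmove"))
        total_nodes += nodes
        total_ms += ms
    send("quit")
    proc.wait()
    return 1000.0 * total_nodes / max(total_ms, 1)
```

Calling `average_nps` once per thread count with the same position set would give one averaged point per thread count for the threads-against-nps graph.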
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: Elo gain by core doubling - Komodo 14, Stockfish 11

Post by corres »

yurikvelo wrote: Mon Jun 15, 2020 9:49 am Another interesting data point would be an Mn/sec benchmark (startpos, go depth 28) at different core counts,
to distinguish scaling problems (speed gain after doubling cores) from diminishing returns (Elo gain after doubling nodes).
Even then, scaling problems are hard to distinguish from the effects of AMD Precision Boost technology.

This is Freq vs Threads:
[Image: CPU frequency vs. thread count]

Also, was it Linux or Windows?
If a CPU has more than 64 logical cores, Windows splits them into 2 processor groups.

An application needs special support to be able to run across multiple groups (up to 64 threads in each group).
It is no accident that I have always emphasized that testers should not use HT (SMT) or frequency turbo, only physical cores with a fixed CPU clock. Another important point is good cooling of the machine.