What is the value of logical cores ( HT) for chess ?

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: What is the value of logical cores ( HT) for chess ?

Post by MikeB »

MikeB wrote: Mon Feb 24, 2020 1:53 pm
MikeB wrote: Mon Feb 24, 2020 1:06 pm
corres wrote: Mon Feb 24, 2020 8:46 am
MikeB wrote: Mon Feb 24, 2020 6:30 am ...
Even though it's 50% more threads, the NPS only goes up about 25% - logical cores are not as productive as real cores. We might see a 10 Elo difference at best. At this level, there are going to be a lot of draws between identical engines. I see my energy usage has also gone up. It's about 25-30 more watts when the 48-thread SF is thinking.
Logical cores (HT for Intel, SMT for AMD) use only the leftover resources of the CPU. They add minimal computing power but increase the CPU's power consumption and heat production. If somebody wants to run serious, repeatable tests, it is very advisable to turn them off, together with every automatic frequency tuning (OC) feature of the CPU.
This is slow going, as only one game can be played at a time.
Score of Stockfish-022220-32 vs Stockfish-022220-48: 7 - 2 - 54 [0.540]
Elo difference: 27.6 +/- 32.0, LOS: 95.2 %, DrawRatio: 85.7 %

63 of 100 games finished.
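
For anyone who wants to check result lines like the one above by hand, here is a minimal Python sketch. It assumes the usual logistic Elo model, a normal-approximation 95% error margin, and an LOS computed from wins and losses only. Those formulas are my assumption about what sits behind the printed output, not something confirmed from the tool itself, and the helper names `elo_from_score` and `match_stats` are made up for this example.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by an expected score under the logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_stats(wins: int, losses: int, draws: int) -> dict:
    """Summarise a W-L-D result (hypothetical helper, not part of any tool).

    Assumes at least one decisive game and a score strictly between 0 and 1.
    """
    n = wins + losses + draws
    w, l, d = wins / n, losses / n, draws / n
    score = w + 0.5 * d

    # Per-game variance of the score, then the standard error of the mean.
    var = w * (1.0 - score) ** 2 + l * (0.0 - score) ** 2 + d * (0.5 - score) ** 2
    stderr = math.sqrt(var / n)

    # Map the 95% interval on the score to Elo and take half its width.
    z = 1.959964
    margin = (elo_from_score(score + z * stderr) - elo_from_score(score - z * stderr)) / 2.0

    # Likelihood of superiority, ignoring draws (normal approximation).
    los = 0.5 + 0.5 * math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses)))

    return {
        "score": score,
        "elo": elo_from_score(score),
        "margin": margin,
        "los": los,
        "draw_ratio": d,
    }


if __name__ == "__main__":
    s = match_stats(7, 2, 54)
    print(f"score {s['score']:.3f}, Elo {s['elo']:.1f} +/- {s['margin']:.1f}, "
          f"LOS {100 * s['los']:.1f} %, draws {100 * s['draw_ratio']:.1f} %")
```

With 7 wins, 2 losses and 54 draws this should land on the figures quoted above. Note how wide the margin is compared with the Elo estimate itself: with so many draws and so few games, the interval still includes zero.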

I will let it finish the 100-game set, but at least on my setup it appears to be counterproductive. One artifact that caught my attention and made me suspicious about logical cores was that on some benches I was running with 32 threads vs 64 threads, the 32-thread benches often completed quicker than the 64-thread benches, although the 64-thread benches showed noticeably higher (50%) nps.

Anyway, perhaps my machine is an outlier. I would be very interested for others who have a Threadripper 3970X, 3990X or even the 2990WX to run a similar type of test. With cutechess-GUI it's very easy to set up different numbers of threads. As an aside, I did run 32 threads versus 16 threads on the 32-core Threadripper, and those results were as expected, with the 32-thread SF dominating the 16-thread SF, showing about a 90 Elo superiority. But that's 32 real cores vs 16 real cores. If this is true, it would not make sense to use logical cores at all for computer chess (with CPUs similar to the Threadripper 3970X), as there is no chess benefit and it costs more (higher power consumption => higher electric bills).
I'm killing it here, as SMT is just not for my system.

Score of Stockfish-022220-32 vs Stockfish-022220-64: 8 - 4 - 60 [0.528]
Elo difference: 19.3 +/- 32.6, LOS: 87.6 %, DrawRatio: 83.3 %

72 of 100 games finished.
Now, in my quest to find out how to turn off SMT, this looks very interesting, as it may be possible to run at even higher frequencies with SMT turned off.
https://www.amd.com/system/files/docume ... -guide.pdf
I wish I had known about the AMD Ryzen Master app earlier. From there you can turn off SMT and then easily auto-tune for the highest sustainable frequency, which in my case is about 4.117 GHz and keeps the system temp hovering at 82C. The SF bench of 2048 32 26 completes faster than when using the 64 logical cores with SMT enabled. For me, this works; ymmv.
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: What is the value of logical cores ( HT) for chess ?

Post by Zenmastur »

MikeB wrote: Mon Feb 24, 2020 3:49 pm
I wish I had known about the AMD Ryzen Master app earlier. From there you can turn off SMT and then easily auto-tune for the highest sustainable frequency, which in my case is about 4.117 GHz and keeps the system temp hovering at 82C. The SF bench of 2048 32 26 completes faster than when using the 64 logical cores with SMT enabled. For me, this works; ymmv.
The problem with this analysis is it doesn't tell you the quality of the search. If you do two searches and one gets a better time to depth but has a lower score, which search is better? You can't tell just from the time to depth. It could be that the apparently "slower" search returns a much better score. You shouldn't use an equal position for this type of test. It needs to be a position where one side has a marked advantage. That way the difference between the searches, if there is one, should be apparent in the score. A test suite seems in order.
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.
Alayan
Posts: 550
Joined: Tue Nov 19, 2019 8:48 pm
Full name: Alayan Feh

Re: What is the value of logical cores ( HT) for chess ?

Post by Alayan »

Bench completion time is a largely irrelevant metric when running bench at different core counts, as the tree searched with more threads is different.

However, SMP's gains per doubling get smaller and smaller with more cores, so a point at which SMT stops being useful certainly exists.
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: What is the value of logical cores ( HT) for chess ?

Post by Zenmastur »

I'm sure that it does. The question becomes how much do the controllable variables affect when this point is reached. I suspect that they may contribute more than you might expect.
lkaufman
Posts: 6284
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: What is the value of logical cores ( HT) for chess ?

Post by lkaufman »

I suspect that running 48 vs 32 threads on one machine may not really tell us which is stronger, because switching sides constantly means starting and stopping threads with unknown consequences. That's why I ran my tests vs. Lc0 on a GPU, which uses just 2 of the CPU threads, so it's almost like running against another computer. Another approach is to use time-to-depth, keeping in mind that more threads are always better if time-to-depth is equal. So if the minimum time-to-depth is at 48 threads, for example (as I found in a tiny sample), then the optimum number of threads is 48 or more. If you are considering electricity cost, then just using the thread count with the minimum time-to-depth is probably a very good compromise between Elo and cost, since increasing the thread count beyond that point will mean a time vs. quality tradeoff and hence only a small Elo gain at best.
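
The time-to-depth rule described above can be written down in a few lines. This is only a sketch under stated assumptions: the function name `pick_thread_count` and the optional `tolerance` parameter are invented for illustration, and the timings in the usage example are made-up placeholders, not measurements from anyone's machine.

```python
from typing import Dict


def pick_thread_count(time_to_depth: Dict[int, float], tolerance: float = 0.0) -> int:
    """Pick the thread count with the lowest time-to-depth.

    time_to_depth maps thread count -> seconds to reach a fixed depth.
    When several counts are tied (within `tolerance`, a fraction of the
    best time), the largest count wins, following the idea that more
    threads are preferable at equal time-to-depth.
    """
    if not time_to_depth:
        raise ValueError("need at least one measurement")
    best = min(time_to_depth.values())
    tied = [threads for threads, secs in time_to_depth.items()
            if secs <= best * (1.0 + tolerance)]
    return max(tied)


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    sample = {16: 41.0, 32: 27.5, 48: 24.0, 64: 26.0}
    print(pick_thread_count(sample))
```

In practice the timings would need to be averaged over many positions and runs, since multi-threaded searches are not deterministic and a tiny sample can easily point at the wrong thread count.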
Komodo rules!
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: What is the value of logical cores ( HT) for chess ?

Post by MikeB »

That's a good idea, using Lc0. I will try that tonight with a gauntlet of Lc0 vs the 32- and 48-thread versions. Thanks.
mbabigian
Posts: 220
Joined: Tue Oct 15, 2013 2:34 am
Location: US
Full name: Mike Babigian

Re: What is the value of logical cores ( HT) for chess ?

Post by mbabigian »

Apparently, 4.15 GHz is the sweet spot for my machine. It stays right at 80C, which is critical for my system. Above 80C, funny things start happening. Very similar to the Pi in that respect; keeping the Pi at 80C or below was critical too.
80C is crazy hot. Might I suggest you at least check the thermal paste application on your CPU and if that doesn't help, reconsider your CPU cooling solution. Very few of the 3970X builds are running anywhere near that without something being wrong. Thermal Grizzly makes excellent thermal paste.

I can't break 65C unless I also max out my GPU at the same time (single cooling loop). Although I am water cooled there are plenty of people running air that are nowhere near 80C. See attached screen print. This was after more than 6 minutes with Stockfish running 63 threads and LC0 running 1 thread to heat the GPU. GPU max temp was bouncing a bit, but maxed at 40C.

I completely agree that using hyperthreads is silly. I get 4.184 GHz at 32 threads and only about 4.11 GHz at 64 threads. Normally with 32 threads and GPU maxed, I can't get over 62C with a room temperature of 70F. Without turning up the GPU I often top out at only 59C for 32 threads. The Elo difference analyzing 4 threads vs 8 with hyperthreads might be something, but 32 real vs 48 or 64 logical can't justify the power or heat. When summer comes, I don't get to run this computer for free (I heat with electricity in winter).

Mike
P.S. Board rejected the attachment. See: http://software.farseer.org/64Threads&MaxGPU.jpg
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: What is the value of logical cores ( HT) for chess ?

Post by MikeB »

Now that you mentioned it, I think my AIO cooler is a POS - what cooler are you using? My unit is not home built. 80C is cool for my machine; otherwise it can run right up to 95C and then it throttles big time. Now, I know 95C is too high, since that is the limit, but I wasn't really thinking 80C was high, since the little Pis also run hot like that. Years ago, nothing would run at 80C.
mbabigian
Posts: 220
Joined: Tue Oct 15, 2013 2:34 am
Location: US
Full name: Mike Babigian

Re: What is the value of logical cores ( HT) for chess ?

Post by mbabigian »

Now that you mentioned it, I think my AIO cooler is a POS - what cooler are you using? My unit is not home built.

My water block is nothing special. Corsair Hydro X XC9. Doesn't fully cover the heat spreader. If you are running at least a 360mm AIO with good flow rate, you should be fine.

Before building, I read tests that showed the Ryzen chips starting to slow above 65C. No significant slowing until much higher, but that set my goal for the build. I have a very high flow rate pump too. I have one 420mm and two 480mm radiators because I planned for two 3080 Ti's when released, and I often have it pegged for months without so much as a reboot. All that said, I've seen screenshots on the overclockers forums showing my temps on systems using a 360mm AIO.

I'm concerned your builder screwed up the thermal paste. You can buy some for under $20 and it is an easy thing to try. Don't fry your $2k chip. Something isn't right.

Mike
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: What is the value of logical cores ( HT) for chess ?

Post by MikeB »

https://www.dropbox.com/s/bpo8wm2jj4a5x ... e.JPG?dl=0
Ryzen Master shows anything under 80 degrees in green, so Asus and AMD are not too worried about it. I'm using 30 cores (out of 32), getting 2M nps, which for a 2 min + 1 sec increment game is decent. If I were using 60 out of 64 logical cores here, the nps would be down to 1300 nps; lower game quality means more noise and more inconsistent results. Also, I finished my testing against Lc0 with 30-thread and 60-thread SF. Going to 64 threads gains about 15 Elo on my machine (ymmv), relative to setting the machine for 32 threads with SMT off. The nps difference between that setup and the 64-thread setup is only 30% (since the 32-thread setup can be clocked higher by almost 10% on my machine). When I use all 64 threads, my machine barely runs over 3.7 GHz if I want to keep it under 80C.
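
To put the nps figures from this thread on a common footing, here is a rough Python sketch of the arithmetic. The function name `marginal_thread_efficiency` is made up for this example, and the calculation assumes that all of the extra nps comes from the added threads, which is a simplification (clock speed changes with thread count, as noted above).

```python
def marginal_thread_efficiency(threads_before: int, threads_after: int,
                               nps_gain: float) -> float:
    """Throughput of each added thread relative to a baseline thread.

    nps_gain is the fractional nps increase, e.g. 0.25 for +25%.
    A result of 1.0 would mean added threads are as productive as the
    original ones; lower values mean diminishing returns.
    """
    if threads_after <= threads_before:
        raise ValueError("threads_after must exceed threads_before")
    thread_gain = threads_after / threads_before - 1.0
    return nps_gain / thread_gain


if __name__ == "__main__":
    # Figures quoted in the thread: +25% nps going from 32 to 48 threads,
    # and +30% nps going from 32 threads (SMT off) to 64 threads.
    print(marginal_thread_efficiency(32, 48, 0.25))
    print(marginal_thread_efficiency(32, 64, 0.30))
```

On these numbers, each logical thread added beyond the 32 physical cores contributes roughly a third to a half of what a physical core does in raw nps, and nps alone still says nothing about search quality.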