Hi everybody,
I recently upgraded my cpu from a 3900X to a 5950X as my old motherboard died.
I would like to post my personal experience regarding hyperthreading in my checkers engine and NNUE chess engines.
On my old 3900X when I used the additional threads (going from 12 to 24 threads) I always got a substantial speed increase.
From the starting position my checkers engine reaches 48 MN/s using 12 threads; with 24 threads it reached around 60 MN/s.
So I got an additional 12 MN/s using all threads.
Enter the 5950X: my engine reaches 78 MN/s from the starting position using 16 threads.
But using 32 threads, I only got 83 MN/s.
I first noticed this a few months ago on a friend's 5600X (6 cores/12 threads).
On that cpu I saw practically no speedup from the extra threads with my checkers engine.
But I didn't believe it at the time; I thought my friend must have had some problem with the chipset drivers.
My mind simply didn't accept that hyperthreading on a Ryzen 5000 series cpu was worse than on the 3000 series.
I haven't seen anyone mention this in reviews and benchmarks on the web, but then again few people use chess and checkers engines.
I see the same behaviour with Stockfish NNUE engines. I haven't tested non-NNUE engines though.
So my new cpu is faster than the previous one, but those additional threads have gone out the window for me and I won't be clapping my hands to the Ryzen 5000 series.
For those wanting to upgrade to a Ryzen 5000 series cpu please take my experience into consideration.
It's as if these cpus were exclusively tuned for gaming and not for general computing.
best regards,
Alvaro
Hyperthreading on Ryzen 3000 vs Ryzen 5000 series
Moderator: Ras
-
Cardoso
- Posts: 363
- Joined: Thu Mar 16, 2006 7:39 pm
- Location: Portugal
- Full name: Alvaro Cardoso
-
Modern Times
- Posts: 3792
- Joined: Thu Jun 07, 2012 11:02 pm
Re: Hyperthreading on Ryzen 3000 vs Ryzen 5000 series
Well, I just ran Stockfish 15 on my 5900X on infinite analysis. Measured at the 1 minute mark, the nps was 25% higher on 24 threads than 12 threads despite the clock dropping by about 10%. (Closing the GUI and engine between the runs). Hard to interpret that - the CPU is obviously less efficient overall when using the extra threads beyond the core count, but also you can't expect Stockfish to double its speed either. So...
-
Ras
- Posts: 2727
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: Hyperthreading on Ryzen 3000 vs Ryzen 5000 series
OK, so the raw numbers:
3900X (12T): 48 MN/s
3900X (24T): 60 MN/s (+25%)
5950X (16T): 78 MN/s
5950X (32T): 83 MN/s (+6%)
5950X (16T) vs 3900X (12T), normed to core count: +22%
5950X (32T) vs 3900X (24T), normed to core count: +4%
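For anyone who wants to check the percentages above, here is a minimal Python sketch. The throughput figures are the ones Alvaro posted; the helper names (`gain_percent`, `per_thread`) are just made up for illustration, and "normed to core count" is taken to mean dividing MN/s by the number of threads used in each run.

```python
# Throughput in MN/s, as reported in the first post.
results = {
    ("3900X", 12): 48.0,
    ("3900X", 24): 60.0,
    ("5950X", 16): 78.0,
    ("5950X", 32): 83.0,
}

def gain_percent(new, old):
    """Relative gain of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0

def per_thread(cpu, threads):
    """Throughput divided by the number of threads used in that run."""
    return results[(cpu, threads)] / threads

# SMT uplift within each CPU.
smt_3900x = gain_percent(results[("3900X", 24)], results[("3900X", 12)])
smt_5950x = gain_percent(results[("5950X", 32)], results[("5950X", 16)])

# Cross-generation comparison, normalised by thread count.
gen_no_smt = gain_percent(per_thread("5950X", 16), per_thread("3900X", 12))
gen_smt = gain_percent(per_thread("5950X", 32), per_thread("3900X", 24))

print(f"3900X SMT uplift: {smt_3900x:+.1f}%")
print(f"5950X SMT uplift: {smt_5950x:+.1f}%")
print(f"5950X vs 3900X without SMT, per thread: {gen_no_smt:+.1f}%")
print(f"5950X vs 3900X with SMT, per thread: {gen_smt:+.1f}%")
```

Rounded to whole percent, this gives the +25%, +6%, +22% and +4% listed above.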
So the 5000 doesn't actually look bad here. With SMT, you have to consider two things: if it's "too" much of a speed-up, it means that there's a serious bottleneck in the core design, and the question is why the core doesn't use its resources in single thread properly. The uplift you see with 5000-SMT can be smaller because you already got the uplift without SMT, which points to some core bottlenecks that were removed in Zen3. In that case, SMT doesn't have much CPU "waiting" time of one thread to fill with the second thread. If you use bitboards, that might be related to the much faster PEXT in Zen3.
The other factor is that SMT isn't helpful when the application runs into contention, e.g. over mutexes, but that's unlikely here because you did see a change from 3900X/12 vs /24, but not with 5600X/6 vs /12. Or maybe it's memory bound, i.e. the memory bandwidth is the limiting factor. SMT is most useful in CPU bound computation that doesn't exhibit memory pressure. Gamers Nexus also noted that Zen3 not only performs better with two RAM sticks than with one, which is trivial, but also with four symmetric RAM sticks than with two.
So, to sum up: Ryzen 5000 roughly gives you the +22% speedup for your specific application already for non-SMT, which is most of what 3900X only unlocked with SMT (+25%). That's a good thing because in chess and checkers, you want a given computational power split over as few cores as possible. The multithread application scaling doesn't increase linearly with the core count because you end up doing calculations in parallel that you wouldn't even have had sequentially.
Rasmus Althoff
https://www.ct800.net
-
Cardoso
- Posts: 363
- Joined: Thu Mar 16, 2006 7:39 pm
- Location: Portugal
- Full name: Alvaro Cardoso
Re: Hyperthreading on Ryzen 3000 vs Ryzen 5000 series
Very interesting,
one thing that comes to mind is that both the 5900X and the 5950X have the exact same number of chiplets (but the 5900X has 4 cores disabled).
And so they both have the same amount of L3 cache (32 MB x 2 = 64 MB), but the 5950X has 4 more cores and 8 more threads for the same exact amount of L3 cache. So maybe the 5950X has less L3 cache "per core".
Of course, by saying this I'm contradicting myself in connection to a previous post:
forum3/viewtopic.php?f=2&t=79736&p=924799#p924743
It would be interesting to do these experiments in a 3950X.
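To put a rough number on the "less L3 per core" idea, here is a small Python sketch. It uses the 64 MB total from above and simply divides it by the core and thread counts. That is only an average: the L3 is shared within each chiplet rather than statically split between cores, so treat the result as an illustration and not as how the hardware actually allocates cache. The function name `l3_share_mb` is made up for this example.

```python
L3_TOTAL_MB = 64  # 2 chiplets x 32 MB, as stated above

def l3_share_mb(cores, smt=False):
    """Average L3 per core, or per hardware thread when smt=True (2 threads/core)."""
    units = cores * 2 if smt else cores
    return L3_TOTAL_MB / units

for name, cores in (("5900X", 12), ("5950X", 16)):
    print(f"{name}: {l3_share_mb(cores):.2f} MB per core, "
          f"{l3_share_mb(cores, smt=True):.2f} MB per thread with SMT")
```

On these averages the 5950X has 4 MB per core against about 5.33 MB on the 5900X, and with all SMT threads busy that halves to 2 MB against about 2.67 MB.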
Last edited by Cardoso on Fri Apr 22, 2022 10:52 pm, edited 1 time in total.
-
Raphexon
- Posts: 476
- Joined: Sun Mar 17, 2019 12:00 pm
- Full name: Henk Drost
Re: Hyperthreading on Ryzen 3000 vs Ryzen 5000 series
The program in question might also be bottlenecked by the memory...
So you'll see less gain with SMT on the 5950x because the RAM can no longer keep up.
Stockfish definitely is sensitive to it. (check NPS with only 1 ramstick for example)
-
Cardoso
- Posts: 363
- Joined: Thu Mar 16, 2006 7:39 pm
- Location: Portugal
- Full name: Alvaro Cardoso
Re: Hyperthreading on Ryzen 3000 vs Ryzen 5000 series
I hadn't thought of that either. My 2x16GB sticks have low-end timings (the kit cost me 170 euros at the time), although they are running at 3200 MT/s.
Besides the hashtable, my evaluation function has a lot of data structures, so maybe they are thrashing the L3 cache.
So those 32 threads are demanding too much from the memory subsystem.
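As a back-of-the-envelope check on the memory pressure, here is a Python sketch of the theoretical peak bandwidth and what that leaves per thread. It assumes the two sticks run in dual-channel mode and that each DDR4 channel is 64 bits (8 bytes) wide; sustained real-world bandwidth will be noticeably lower than this peak, and the function name `peak_bandwidth_gb_s` is just illustrative.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s, channels=2, bytes_per_transfer=8):
    """Theoretical peak bandwidth in GB/s (assumes 64-bit channels, dual channel)."""
    return mega_transfers_per_s * 1e6 * bytes_per_transfer * channels / 1e9

peak = peak_bandwidth_gb_s(3200)  # 3200 MT/s, as mentioned above

for threads in (16, 32):
    print(f"{threads} threads: {peak / threads:.1f} GB/s per thread "
          f"(peak {peak:.1f} GB/s)")
```

Under those assumptions the peak is 51.2 GB/s, so each thread's share drops from 3.2 GB/s at 16 threads to 1.6 GB/s at 32 threads.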