CornfedForever wrote: ↑Sun Jul 16, 2023 8:05 am
If you used half your cores (4), would you get roughly 11,162,474, and 5,581,237 on 2 cores?
Here are the NPS measurements for different thread counts (the factor in parentheses is the gain over half as many threads):
16: 22324947 (x1.32)
8: 16937908 (x1.82)
4: 9286272 (x2.00)
2: 4642347 (x2.11)
1: 2199671
The sharp scaling drop when going from 8 to 16 threads is because I have 8 physical cores, so 16 threads means hyperthreading. One hyperthreaded core is worth only about 1.5 plain physical cores, not 2.
Otherwise, you can see the scaling already dropping off between 1 and 8 threads. That's because all cores use the same system RAM, so the memory bandwidth gets shared. Desktop CPUs have only dual-channel memory (even when equipped with 4 sticks). That's also why AMD doesn't release consumer desktop CPUs with more than 16 cores: the memory bandwidth isn't there to support that.
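As a quick check of the numbers above, here is a small Python sketch (not part of the original post; the function names are made up for illustration). It recomputes the step factors from the NPS table and also shows the NPS per thread relative to the single-thread run, where 1.0 would mean perfectly linear scaling.

```python
# NPS measurements copied from the post above (threads -> nodes per second).
nps = {1: 2199671, 2: 4642347, 4: 9286272, 8: 16937908, 16: 22324947}


def step_factors(nps):
    """Gain of each thread count over the next lower measured thread count."""
    counts = sorted(nps)
    return {hi: nps[hi] / nps[lo] for lo, hi in zip(counts, counts[1:])}


def efficiency(nps):
    """NPS per thread relative to the 1-thread NPS (1.0 = perfectly linear)."""
    base = nps[1]
    return {n: v / (n * base) for n, v in nps.items()}


if __name__ == "__main__":
    factors = step_factors(nps)
    eff = efficiency(nps)
    for n in sorted(nps, reverse=True):
        factor = f"x{factors[n]:.2f}" if n in factors else "-"
        print(f"{n:2d} threads: {nps[n]:>10,d} NPS  {factor:>5}  efficiency {eff[n]:.2f}")
```

This only looks at raw NPS, so it says nothing about how much of the extra node count turns into useful search.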
Note that the NPS with multiple threads doesn't show the real speed-up, since there is significant overlap in the work the threads do.
It is difficult to measure the real speed-up, but using test positions gives you an approximation.
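One common way to do this with test positions, assumed here and not necessarily what was meant above, is to time how long the engine needs to reach a fixed depth on each position with 1 thread and with N threads, then average the per-position ratios. A minimal Python sketch, with a hypothetical function name and the timings supplied by you:

```python
from math import exp, log


def time_to_depth_speedup(single_thread_secs, multi_thread_secs):
    """Geometric mean of per-position speed-ups (1-thread time / N-thread time).

    Both lists hold seconds to reach the same fixed depth, one entry per
    test position, in the same order.
    """
    if not single_thread_secs or len(single_thread_secs) != len(multi_thread_secs):
        raise ValueError("need the same non-zero number of timings in both lists")
    ratios = [s / m for s, m in zip(single_thread_secs, multi_thread_secs)]
    # The geometric mean keeps one very slow or very fast position from dominating.
    return exp(sum(log(r) for r in ratios) / len(ratios))
```

Multi-threaded search is not deterministic, so single runs can vary a lot; averaging over many positions (and several runs) gives a steadier estimate.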
Regards anst
Ras wrote: ↑Sun Jul 16, 2023 8:55 am
Otherwise, you can see the scaling already dropping off between 1 and 8 threads. That's because all cores use the same system RAM, so the memory bandwidth gets shared. Desktop CPUs have only dual-channel memory (even when equipped with 4 sticks). That's also why AMD doesn't release consumer desktop CPUs with more than 16 cores: the memory bandwidth isn't there to support that.
It might also be shared caches that prevent linear scaling. Or a combination.
Memory bandwidth shouldn't be that much of a problem, and modern memory systems seem to be able to process many simultaneous reads and writes.