Too few games to draw any conclusions.
ouachita wrote:
The Quest for the Holy Grail ends here:
SF010214-16 v. H4B-16, Blitz 1m+1s 0
1 Houdini 4 Pro x64B            +28  +29/=50/-21  54.00%  54.0/100
2 Stockfish 020114 64 SSE4.2    -28  +21/=50/-29  46.00%  46.0/100
SF010214-32T v. H4B-16C, 1+1 0
1 Houdini 4 Pro x64B            +7   +21/=60/-19  51.00%  51.0/100
2 Stockfish 020114 64 SSE4.2    -7   +19/=60/-21  49.00%  49.0/100
The single difference between these two test bases, aside from kPa and RH, was changing SF to 32 threads. As the narrator said so often in The Wonder Years, "there you have it."
Stockfish 020114 - Houdini 4 x64A Testing 39 of 100 played.
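To put rough numbers on the "too few games" objection, here is a minimal sketch that converts a match score into an Elo difference with the usual logistic formula and attaches an approximate 95% interval based on the per-game score variance. The function names and the normal approximation are assumptions made for illustration, not anything from the posts; only the W/D/L counts are taken from the two tables above.

```python
import math

def elo_from_score(p: float) -> float:
    """Elo difference implied by an expected score p (0 < p < 1), logistic model."""
    return -400.0 * math.log10(1.0 / p - 1.0)

def match_summary(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, elo_low, elo_high) for one side of a match.

    Assumption: games are independent and the mean score is roughly normal,
    so the interval is score +/- z * standard error.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)
    return (elo_from_score(score),
            elo_from_score(score - z * se),
            elo_from_score(score + z * se))

# Houdini's side of the two 100-game matches quoted above.
for label, wdl in (("SF 16 threads", (29, 50, 21)), ("SF 32 threads", (21, 60, 19))):
    elo, low, high = match_summary(*wdl)
    print(f"{label}: Houdini {elo:+.0f} Elo, 95% interval {low:+.0f} to {high:+.0f}")
```

Under these assumptions both intervals are many tens of Elo wide and include zero, so neither 100-game match can separate the two thread settings.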
Moderator: Ras
-
Vinvin
- Posts: 5312
- Joined: Thu Mar 09, 2006 9:40 am
- Full name: Vincent Lejeune
Re: Video on how hypethreading works in a Intel CPU.
-
mwyoung
- Posts: 2727
- Joined: Wed May 12, 2010 10:00 pm
Re: Video on how hypethreading works in a Intel CPU.
I agree, but that is not the question. How can HT cripple an engine, or just cripple Houdini, somehow or for some reason?
arjuntemurnikar wrote:
There is a difference between using the BIOS option to switch off HT and using 4/8 cores with HT=ON. You are right, the number of pipes stays the same, but the difference is in how information gets piped through them.
mwyoung wrote:
I have 7 computers in my house, some AMD, some Intel, some with a BIOS HT option. On my engine vs engine computer I keep nothing else running but chess testing, though I can test on any of them. I was referring to the computer I run my tests on, which I set up to keep the overhead for engine vs engine testing as low as possible. This is why my CPU % at rest is almost 0%.
I can do many things on my other computers, like type this response.
Other testers also have many computers at their homes and are posting the results here on your HT question.
I have run many tests on this, including running test positions. I get the same NPS and the same time per depth, except for the normal variation one would expect from testing MP. That is why I tested this many times to get a good average.
Larry, I don't know what more I can do or say, but let others post their results.
You need to explain your theory: how can a CPU core processing only 1 pipe per core with HT on cripple an engine, compared with having HT off?
The CPU is the same with HT on or off in the BIOS. All the BIOS HT on/off option does is keep programs restricted to 1 pipe on the core. But the 2 pipes are still there regardless of the HT setting in the BIOS.
If my results are incorrect, and anything is possible, that is why we test. Intel is going to make many customers upset when they run their single-threaded apps on their Intel HT CPUs, because HT CPUs are mostly all Intel is making.
And if somehow this is true, I am sure we would have read about it from AMD.
Sorry I have to disagree, because of my testing. But others are also testing this and posting their results.
For example, consider a quad-core system with 4 real cores (1, 2, 3 & 4), each divided into 2 logical cores (A & B).
With HT=ON, an 8-thread engine will fire all 8 logical cores (1A, 1B, 2A, 2B, 3A, 3B, 4A & 4B).
With HT=ON, a 4-thread engine will fire 4/8 logical cores, e.g. 1A, 2A, 3A, 4A, or it could even be 1A, 1B, 2A, 3A, depending on how the CPU controller distributes the load internally. The engine has no control over which cores it's using. So effectively it's using 4/8 logical cores, or 50% of the total load.
Now, with HT=OFF at the BIOS level, effectively the two logical cores in each real core act as ONE real core. So this means, a 4 thread engine will fire all 4 real cores (i.e. 1A+1B, 2A+2B, 3A+3B & 4A+4B). This means it will utilize all cores for full CPU load of 100%. This is what the task manager indicates to you.
So, in conclusion, the only way to test HT vs no-HT is by using two identical systems. Tests on single systems are simply flawed to begin with.
Now, that said, all these tests were anyway done with small sample sizes, so the mystery of HT vs no-HT remains unsolved.
And this can be tested on one computer, like I have done with test positions. If HT on vs. HT off cripples an engine, it would show up in test positions as longer think times and lower NPS. It does not cripple an engine in any way, whether Houdini or Stockfish, running on 4 CPUs with HT on or HT off on a 4-core system.
Unless the theory is that test positions are not affected by this...
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
-
ouachita
- Posts: 454
- Joined: Tue Jan 15, 2013 4:33 pm
- Location: Ritz-Carlton, NYC
- Full name: Bobby Johnson
Re: Video on how hypethreading works in a Intel CPU.
for over 2000 years.
arjuntemurnikar wrote:
So the quest for the "Holy Grail" remains incomplete.
SIM, PhD, MBA, PE
-
arjuntemurnikar
- Posts: 204
- Joined: Tue Oct 15, 2013 10:22 pm
- Location: Singapore
Re: Video on how hypethreading works in a Intel CPU.
In the case of Houdini, the decrease in efficiency of the parallel alpha-beta search overpowers the increase in performance from hyperthreading. This is mentioned in Houdini's official FAQ, so there is no mystery here. The actual performance value will change from system to system, config to config. For example, an increase from 2 -> 4 logical cores may be a net gain in Elo, while an increase from 4 -> 8 logical cores may be a net loss in Elo.
mwyoung wrote:
I agree, but that is not the question. How can HT cripple an engine, or just cripple Houdini, somehow or for some reason?
If Stockfish shows a gain in Elo for 4 -> 8 logical cores, then that means it has a better MP implementation, as the decrease in efficiency of parallel alpha-beta search is not enough to overcome the increase in performance from hyperthreading. (This may be different for 8 -> 16 logical cores or 16 -> 32 logical cores, and by the way, now you are in multi-processor territory, so that's a whole different ball game!) Anyway, there is not enough data to be conclusive.
Test positions are not the way to gauge the performance of an engine. HT will lead to higher NPS because there is a performance increase. (Intel says 30%, but it depends on the program and the CPU architecture.) This can easily be outweighed by the decrease in efficiency of parallel alpha-beta search. I think this leads to slower time-to-depth speeds.
mwyoung wrote:
And this can be tested on one computer, like I have done with test positions. If HT on vs. HT off cripples an engine, it would show up in test positions as longer think times and lower NPS. It does not cripple an engine in any way, whether Houdini or Stockfish, running on 4 CPUs with HT on or HT off on a 4-core system.
Unless the theory is that test positions are not affected by this...
I have explained this enough times now. I will stop.
-
mwyoung
- Posts: 2727
- Joined: Wed May 12, 2010 10:00 pm
Re: Video on how hypethreading works in a Intel CPU.
I agree....
"I think this leads to slower time to depth speeds." This is exactly my point, because all we are trying to answer is Larry's question: does HT hurt Houdini on a 4-core Intel CPU, HT on vs. HT off, when Houdini is set to 4 threads?
You can test this with test positions, and the answer is NO.
The answer shows because there is no degradation in time to depth in test positions when testing 4 threads with HT on or HT off.
Bye...
Last edited by mwyoung on Wed Jan 08, 2014 4:03 am, edited 1 time in total.
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
-
ouachita
- Posts: 454
- Joined: Tue Jan 15, 2013 4:33 pm
- Location: Ritz-Carlton, NYC
- Full name: Bobby Johnson
Re: Video on how hypethreading works in a Intel CPU.
Agreed, but with only one variable changed, the result suggests SF may indeed improve with HT on my machine as I defined it here. I long ago ran a test with H with HT and it performed worse. All of this is headed toward that 16 v 6 match.
Vinvin wrote:
Too few games to draw any conclusions.
-
lkaufman
- Posts: 6284
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Video on how hypethreading works in a Intel CPU.
I am not a hardware expert so I won't presume to try to explain why, but as far as I know everyone who has compared performance of any engine with HT disabled and with HT enabled, but in each case with threads set equal to real cores, has found that performance is better with HT disabled, typically by 10 to 15%. Whether being able to set threads to twice the number of cores provides Stockfish enough benefit to offset this I don't know, although I rather doubt it. It pretty clearly is not the case with Houdini, and with Komodo my best guess from all evidence is that Komodo runs better on 8 threads than on 4 on an i7 with HT on, but better still with 4 threads and HT off.
mwyoung wrote:
I have 7 computers in my house, some AMD, some Intel, some with a BIOS HT option. On my engine vs engine computer I keep nothing else running but chess testing, though I can test on any of them. I was referring to the computer I run my tests on, which I set up to keep the overhead for engine vs engine testing as low as possible. This is why my CPU % at rest is almost 0%.
-
mwyoung
- Posts: 2727
- Joined: Wed May 12, 2010 10:00 pm
Re: Video on how hypethreading works in a Intel CPU.
I talked to the computer engineer at my company. He said yes, there could be a bit of a difference, but it would be hard to detect. He said less than 10%, because the system is very good at routing processes to real cores that are not busy. This is less so if your system has more background usage.
lkaufman wrote:
I am not a hardware expert so I won't presume to try to explain why, but as far as I know everyone who has compared performance of any engine with HT disabled and with HT enabled, but in each case with threads set equal to real cores, has found that performance is better with HT disabled, typically by 10 to 15%. Whether being able to set threads to twice the number of cores provides Stockfish enough benefit to offset this I don't know, although I rather doubt it. It pretty clearly is not the case with Houdini, and with Komodo my best guess from all evidence is that Komodo runs better on 8 threads than on 4 on an i7 with HT on, but better still with 4 threads and HT off.
mwyoung wrote:
I have 7 computers in my house, some AMD, some Intel, some with a BIOS HT option. On my engine vs engine computer I keep nothing else running but chess testing, though I can test on any of them. I was referring to the computer I run my tests on, which I set up to keep the overhead for engine vs engine testing as low as possible. This is why my CPU % at rest is almost 0%.
The reason for the slight difference, he said, is that if a core is busy the system has to route the process to a different CPU than it expected, which creates a cache miss.
Even at his max 10% speed penalty, this cannot explain Stockfish's results playing with 8 CPUs. I asked him if an affinity lock would solve this problem for a program running fewer threads than logical cores. He said it would work on any setting of threads that is using real cores on an HT system, because you are locking those cores to the program, and you cannot lock a logical core unless the real cores are used first. So the system will no longer route processes to the affinity-locked cores, as that program now has the highest priority. The system will only override the affinity lock if it has no other choice, say if you tried to play the computers with ponder on.
I tested this with Houdini. Without doing extensive testing, it looked like Houdini got 4.3 MNps with affinity lock and 4.0 MNps without.
On my other i7 running at 1.7 GHz, where I could turn HT on or off in the BIOS, I also got less than a 10% difference. But it is hard to be exact in a quick test because of the natural variation you get with MP. I will take his word that there is a difference.
You can also test this I assume.
What kind of results can you see with Houdini HT vs No HT using all the real cores?
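For anyone who wants to try the affinity lock described above without a GUI tool, here is a minimal sketch of the idea: pick one logical CPU per physical core, then restrict a process to that set. The core layout below is hypothetical and the helper names are made up for illustration. Real logical CPU numbering differs between machines and operating systems (on Linux the siblings of each CPU can be read from /sys/devices/system/cpu/cpuN/topology/thread_siblings_list), and os.sched_setaffinity is only available on some platforms, so treat this as a sketch under those assumptions.

```python
import os
from typing import Dict, List

def one_cpu_per_core(siblings: Dict[int, List[int]]) -> List[int]:
    """Pick the lowest-numbered logical CPU on each physical core."""
    return sorted(min(cpus) for cpus in siblings.values())

def pin_to(cpus: List[int], pid: int = 0) -> bool:
    """Restrict a process (0 = the current one) to the given logical CPUs.

    os.sched_setaffinity exists only on some platforms (e.g. Linux),
    so report False instead of failing where it is missing.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(pid, set(cpus))
    return True

# Hypothetical layout: 4 physical cores, 2 logical CPUs each (HT on).
# Keys are physical core ids, values are the logical CPUs on that core.
layout = {0: [0, 4], 1: [1, 5], 2: [2, 6], 3: [3, 7]}
chosen = one_cpu_per_core(layout)
print(chosen)  # [0, 1, 2, 3]
```

Calling pin_to(chosen, pid) with the engine's process id would then keep a 4-thread engine on four different physical cores, which is the situation the 4-threads-with-HT-on tests are trying to guarantee.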
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
-
shrapnel
- Posts: 1339
- Joined: Fri Nov 02, 2012 9:43 am
- Location: New Delhi, India
Re: Video on how hypethreading works in a Intel CPU.
Bravo Arjun ! You are the only one who has truly understood what I was trying to convey. You also have explained it beautifully. I thought the images I posted should have cleared the doubts in the minds of M. Young and Barry, but apparently not !
arjuntemurnikar wrote:
There is a difference between using the BIOS option to switch off HT and using 4/8 cores with HT=ON. You are right, the number of pipes stay the same, but the difference is in how information gets piped through them.
They can't seem to see the Woods for the trees... LOL.
I'm not really interested in getting into a debate on a matter which is so obvious !
Whether they believe it or not is up to them.... let them continue to waste their time on Testing which is worthless, as the very basic premise on which they base their Testing is wrong, at least in this case !
If Stockfish is better at utilising HT than Houdini,then well, bully for Stockfish and bad show by Houdini, I'm NOT disputing THAT !
BUT, if you want to fairly test ANY Engine against Houdini 4, you HAVE TO DISABLE HYPERTHREADING IN BIOS ITSELF IN THE PC RUNNING HOUDINI !
( Let me repeat it twice more, maybe it will go through )
DISABLE HYPERTHREADING IN BIOS ITSELF IN THE PC RUNNING HOUDINI !
DISABLE HYPERTHREADING IN BIOS ITSELF IN THE PC RUNNING HOUDINI !
....otherwise your Testing jack....
Now, think of me as a Field Tester.
When I go online for an Engine-Engine match while using Houdini 4 on my 6-Core PC, I want maximum strength, so I would ALWAYS disable HT in my BIOS itself, thereby leaving me with only 6 REAL THREADS and NOTHING ELSE. I would NEVER NEVER ENABLE HT and THEN proceed to play with 6 threads !
Similarly, if I decided to use Stockfish instead, and obviously wanting maximum strength, I would ENABLE HT and proceed to play with all 12 threads available to me ! (On the premise that HT suits SF, which like Larry Kaufman, I'm extremely doubtful about ).
So, obviously, each Engine would like to play at its MAXIMUM strength, whether it's an online match like I play, or your Testing !
Anything else would be unfair !
Just my 2 cents...not participating in this debate any more ! ( Have to play online ) !
i7 5960X @ 4.1 Ghz, 64 GB G.Skill RipJaws RAM, Twin Asus ROG Strix OC 11 GB Geforce 2080 Tis
-
lkaufman
- Posts: 6284
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Video on how hypethreading works in a Intel CPU.
I don't have two identical computers so I can't test this myself. I think a 10% speed difference is a good estimate. This would probably mean about 15 Elo at the bullet speeds we are talking about.
lkaufman wrote:
I am not a hardware expert so I won't presume to try to explain why, but as far as I know everyone who has compared performance of any engine with HT disabled and with HT enabled, but in each case with threads set equal to real cores, has found that performance is better with HT disabled, typically by 10 to 15%. Whether being able to set threads to twice the number of cores provides Stockfish enough benefit to offset this I don't know, although I rather doubt it. It pretty clearly is not the case with Houdini, and with Komodo my best guess from all evidence is that Komodo runs better on 8 threads than on 4 on an i7 with HT on, but better still with 4 threads and HT off.
mwyoung wrote:
I have 7 computers in my house, some AMD, some Intel, some with a BIOS HT option. On my engine vs engine computer I keep nothing else running but chess testing, though I can test on any of them. I was referring to the computer I run my tests on, which I set up to keep the overhead for engine vs engine testing as low as possible. This is why my CPU % at rest is almost 0%.
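The estimate above, that a 10% speed difference is worth about 15 Elo at bullet speeds, can be turned into a rough conversion if one assumes Elo grows linearly with the logarithm of speed (a fixed gain per doubling). That log-linear model is an assumption made here for illustration, not something stated in the thread, and the function names are hypothetical. The 4.3 vs 4.0 MNps figures are the affinity-lock numbers reported earlier, and treating NPS as effective speed is a further simplification.

```python
import math

def elo_per_doubling(speed_ratio: float, elo_gain: float) -> float:
    """Elo per doubling of speed implied by one (ratio, gain) data point."""
    return elo_gain / math.log2(speed_ratio)

def elo_from_speed(speed_ratio: float, per_doubling: float) -> float:
    """Elo gain for a given speed ratio under the log-linear assumption."""
    return per_doubling * math.log2(speed_ratio)

# Data point from the post: 10% faster is worth roughly 15 Elo at bullet.
k = elo_per_doubling(1.10, 15.0)
print(f"implied Elo per doubling of speed: {k:.0f}")

# Affinity-lock NPS figures reported earlier in the thread.
nps_ratio = 4.3 / 4.0
print(f"Elo for a {nps_ratio - 1:.1%} speedup: {elo_from_speed(nps_ratio, k):.0f}")
```

Under these assumptions the estimate corresponds to about 109 Elo per doubling of speed, and the 7.5% NPS gain from the affinity lock would be worth about 11 Elo, well inside the error margin of a 100-game match.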