Hyperthreading and Computer Chess: Intel i5-3210M

Laskos · Post by **Laskos** » Fri Apr 12, 2013 8:56 pm

bob wrote:
Regardless of urban legend, I have NEVER seen one example where using hyper threading improves the performance of a chess engine. Not a single one.

It is you spreading the urban legend that HT doesn't work for chess engines. On an i7 2600 me and others had unequivocal beneficial results of HT.
http://talkchess.com/forum/viewtopic.ph ... 1&start=66

Modern Times · Post by **Modern Times** » Fri Apr 12, 2013 9:08 pm

bob wrote: Bottom line is to NOT use hyper threading when playing chess. You don't have to disable it as new operating system process schedulers understand the issues and will make sure each thread runs on a physical core, unless you run more threads than physical cores. At that point, you start to hurt performance.

Which new operating systems are you referring to? Do you include Windows 7 in that?

bob · Post by **bob** » Fri Apr 12, 2013 9:15 pm

Modern Times wrote:
bob wrote: Bottom line is to NOT use hyper threading when playing chess. You don't have to disable it as new operating system process schedulers understand the issues and will make sure each thread runs on a physical core, unless you run more threads than physical cores. At that point, you start to hurt performance.
Which new operating systems are you referring to? Do you include Windows 7 in that?

Yes. Windows 7/8, recent Linux kernels (no more than 3 years old), Mac OS X, etc.

bob · Post by **bob** » Fri Apr 12, 2013 9:29 pm

Laskos wrote:
bob wrote:
Regardless of urban legend, I have NEVER seen one example where using hyper threading improves the performance of a chess engine. Not a single one.
It is you spreading the urban legend that HT doesn't work for chess engines. On an i7 2600 me and others had unequivocal beneficial results of HT.
http://talkchess.com/forum/viewtopic.ph ... 1&start=66

I remain unconvinced. And fortunately, I have run a LOT of tests, not just some tactical positions. Did you disable turbo-boost? Do you REALLY have a parallel search that has little or no overhead, which is required to get a speedup from the relatively modest improvement HT gives?

Here are a few quick comparisons between mt=2 and mt=4 on my MacBook dual-core i7:

log.001: time=11.68 mat=0 n=101057466 fh=95% nps=8.7M
log.002: time=12.71 mat=0 n=139025987 fh=95% nps=10.9M

In each pair, the first run (log.001) is always mt=2 and the second (log.002) is mt=4.

NPS goes up, time to the same depth gets longer, tree size gets larger.

A few others, just for fun. I normally run about 300 positions, and for 4 threads, I run each test at least 8 times and average. For 2 threads I run at least 4 times and average.

log.001: time=15.79 mat=0 n=128285112 fh=93% nps=8.1M
log.002: time=21.87 mat=0 n=211812947 fh=93% nps=9.7M

log.001: time=48.39 mat=0 n=348924409 fh=93% nps=7.2M
log.002: time=40.65 mat=0 n=358124220 fh=92% nps=8.8M

log.001: time=9.99 mat=0 n=110069531 fh=94% nps=11.0M
log.002: time=12.25 mat=0 n=149579319 fh=93% nps=12.2M

log.001: time=26.42 mat=0 n=223055725 fh=93% nps=8.4M
log.002: time=27.37 mat=0 n=280907999 fh=93% nps=10.3M

What I am citing is NOT "urban legend". It is something that is well-known and well-understood by those that have actually spent time developing a parallel search and testing it for improvements.

BTW, searching tactical positions is not a valid way of testing parallel search. The key there is that the best move is often ordered later in the list by the very nature of the position (the best move is usually some sort of 'surprise'). This plays right into the hands of a parallel search, which by its very nature tends to do better when move ordering is sub-optimal.
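For readers who want to check the arithmetic, the five log pairs above can be reduced to three ratios per position: time-to-depth speedup (mt=2 time divided by mt=4 time), tree growth, and NPS gain. What follows is a minimal Python sketch with the numbers copied from the logs. The names `RUNS`, `compare` and `summarize` are made up for illustration and do not come from any engine or tool mentioned in the thread.

```python
# (time in seconds, nodes searched) for mt=2 and mt=4, copied from the log lines above.
RUNS = [
    ((11.68, 101057466), (12.71, 139025987)),
    ((15.79, 128285112), (21.87, 211812947)),
    ((48.39, 348924409), (40.65, 358124220)),
    ((9.99, 110069531), (12.25, 149579319)),
    ((26.42, 223055725), (27.37, 280907999)),
]


def compare(run):
    """Return the three ratios for one (mt=2, mt=4) pair."""
    (t2, n2), (t4, n4) = run
    return {
        "speedup": t2 / t4,                # > 1 means mt=4 reached the depth sooner
        "tree_growth": n4 / n2,            # > 1 means mt=4 searched more nodes
        "nps_gain": (n4 / t4) / (n2 / t2), # > 1 means mt=4 searched nodes faster
    }


def summarize(runs):
    """Per-position ratios plus the overall speedup on summed times."""
    rows = [compare(r) for r in runs]
    total_t2 = sum(r[0][0] for r in runs)
    total_t4 = sum(r[1][0] for r in runs)
    return rows, total_t2 / total_t4


if __name__ == "__main__":
    rows, overall = summarize(RUNS)
    for row in rows:
        print({k: round(v, 3) for k, v in row.items()})
    print("overall time-to-depth speedup:", round(overall, 3))
```

Only one of the five pairs reaches the same depth faster with mt=4, even though NPS is higher in every pair. Five positions are a small sample, so this only illustrates how to read the logs, not a result in its own right.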

Laskos · Post by **Laskos** » Fri Apr 12, 2013 9:45 pm

Time to depth is not very convincing either, as the tree is wider with more threads. I actually played some matches with 8 threads versus 4 threads on 4 cores (HT on), and the results seemed conclusive for HT efficiency (10-20 Elo points). The only glitch I saw is that HT reaches full NPS only after several seconds per move with Houdini 3; at ultra-fast time controls it does indeed seem worthless.

bob · Post by **bob** » Fri Apr 12, 2013 10:23 pm

Laskos wrote:
Time to depth is not very convincing too, as the tree is wider with more threads. I actually played some matches 8-t versus 4-t on 4 cores (HT on), the results seemed to be conclusive for HT efficiency (10-20 Elo points). The only glitch I saw is that HT kicks to full NPS only after several seconds per move of Houdini 3, on ultra-fast controls it seems worthless indeed.

Time to depth is the ONLY valid way to measure parallel search improvement. And measuring speed improvement is the way to improve the search. I've never seen HT "take a while to kick in" and that makes no sense to me at all from all the HT testing I have done (starting back with the original PIV).

syzygy · Post by **syzygy** » Fri Apr 12, 2013 10:36 pm

bob wrote: Time to depth is the ONLY valid way to measure parallel search improvement.

The tree is different. The selected move found can be different. The only valid measure is playing strength.

Laskos · Post by **Laskos** » Fri Apr 12, 2013 10:56 pm

bob wrote:
Laskos wrote:
Time to depth is not very convincing too, as the tree is wider with more threads. I actually played some matches 8-t versus 4-t on 4 cores (HT on), the results seemed to be conclusive for HT efficiency (10-20 Elo points). The only glitch I saw is that HT kicks to full NPS only after several seconds per move of Houdini 3, on ultra-fast controls it seems worthless indeed.
Time to depth is the ONLY valid way to measure parallel search improvement. And measuring speed improvement is the way to improve the search. I've never seen HT "take a while to kick in" and that makes no sense to me at all from all the HT testing I have done (starting back with the original PIV).

No, time to depth is not convincing either: the tree is wider with more threads at the same depth, and the move chosen at the same depth is on average better with more threads. I really did observe that 8 threads on 4 physical cores (HT on) need time to kick in, judging by both Houdini's NPS and the match results. I don't know why that happens.

The only criterion is playing strength, and in several tests at not-very-fast time controls I did observe a 15-20 Elo improvement (+/- 10 at 2 SD) from HT. The matches were at 4'+2'' and 5'+5'', with an LOS of some 98%, IIRC.
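As a rough illustration of how figures like "Elo difference", "2 SD" and "LOS" are usually derived from a match score, here is a minimal Python sketch using the standard logistic Elo formula and the common normal-approximation LOS formula. The win/loss/draw counts in the example are hypothetical and are NOT the results of the matches mentioned above; the function names are made up for illustration.

```python
import math


def score_to_elo(score):
    """Logistic Elo difference for a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_diff(wins, losses, draws):
    """Elo difference implied by a match result."""
    n = wins + losses + draws
    return score_to_elo((wins + 0.5 * draws) / n)


def elo_margin(wins, losses, draws, n_sd=2.0):
    """Half-width of the Elo interval at n_sd standard deviations of the score."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)
    return (score_to_elo(score + n_sd * se) - score_to_elo(score - n_sd * se)) / 2.0


def los(wins, losses):
    """Likelihood of superiority (normal approximation; draws do not enter)."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))


if __name__ == "__main__":
    # Hypothetical match: 1000 games, NOT data from this thread.
    w, l, d = 330, 280, 390
    print("Elo diff:", round(elo_diff(w, l, d), 1))
    print("2 SD margin:", round(elo_margin(w, l, d), 1))
    print("LOS:", round(los(w, l), 3))
```

The sketch assumes independent games and does not handle a score of exactly 0 or 1, where the logistic formula is undefined.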

bob · Post by **bob** » Sat Apr 13, 2013 1:33 am

syzygy wrote:
bob wrote: Time to depth is the ONLY valid way to measure parallel search improvement.
The tree is different. The selected move found can be different. The only valid measure is playing strength.

The selected move varies only infrequently. If you want to measure an improvement to the search, you can try to play 100K games, or you can measure the normal speedup that everyone relies on. Hyperthreading is NOT a win here.
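To give a feel for why game counts get so large, here is a back-of-the-envelope Python sketch relating the number of games to the Elo error margin. It assumes two near-equal opponents, independent games, and a draw ratio of 40%; that draw ratio is an arbitrary illustrative value, not a figure from this thread, and the function names are hypothetical.

```python
import math

# Slope of the logistic Elo curve at a 50% score: d(Elo)/d(score) = 1600 / ln(10).
ELO_PER_SCORE = 1600.0 / math.log(10.0)


def margin_for_games(games, draw_ratio=0.4, n_sd=2.0):
    """Approximate Elo error margin (at n_sd standard deviations) after `games` games."""
    # Per-game score variance for equal opponents: (1 - draw_ratio) / 4.
    per_game_var = (1.0 - draw_ratio) / 4.0
    return n_sd * ELO_PER_SCORE * math.sqrt(per_game_var / games)


def games_needed(margin_elo, draw_ratio=0.4, n_sd=2.0):
    """Approximate number of games needed to shrink the margin to `margin_elo`."""
    per_game_var = (1.0 - draw_ratio) / 4.0
    se_needed = margin_elo / (n_sd * ELO_PER_SCORE)
    return math.ceil(per_game_var / se_needed ** 2)


if __name__ == "__main__":
    for margin in (10.0, 5.0, 2.0):
        print(f"+/- {margin} Elo at 2 SD: about {games_needed(margin)} games")
    print("margin after 100000 games:", round(margin_for_games(100000), 2))
```

Halving the margin takes roughly four times as many games, which is why resolving a difference of a few Elo points needs tens of thousands of games while a time-to-depth run over a few hundred positions is much cheaper.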

bnemias · Post by **bnemias** » Sat Apr 13, 2013 1:48 am

syzygy wrote: The tree is different. The selected move found can be different. The only valid measure is playing strength.

Heh. Hard to argue with that.

This subject comes up every so often, and it's hard to believe people still think searching a larger tree with a small increase in NPS is beneficial. I'm not completely convinced there's a relationship between time to depth and playing strength either. But lacking any data of my own, I tend to believe Bob.
