Hyperthreading and Computer Chess: Intel i5-3210M

Mike S.
Posts: 1480
Joined: Thu Mar 09, 2006 4:33 am

Hyperthreading and Computer Chess: Intel i5-3210M

Post by Mike S. » Fri Apr 12, 2013 12:24 am

Intel i5-3210M, 2 x 2.5 GHz (-2.9 GHz)
2 physical cores; 4 logical cores (HT)
512 M hash tables (DDR3-RAM / 800 MHz)

Engine           | P#1 depth  time(2T)  time(4T) | P#2 depth  time(2T)  time(4T)
--------------------------------------------------------------------------------
Critter 1.6a     |      20      34        20     |      19      59        34
Deep Fritz 13    |      21      48        62 ?   |      22      32        15
Houdini 1.5a     |      21      40        26     |      21     113        36 !
Rybka 2.3.2a     |      16      36        26     |      16     113       136 ?
Stockfish 100413 |      24      34        13 !   |      25      66        41
--------------------------------------------------------------------------------
Average Time Relation:            1.60:1         |                 1.89:1
--------------------------------------------------------------------------------
Average Time Relation total:     ~1.75:1
                      ------------------
P#1: r1bq1rk1/2ppbppp/p1n2n2/1p2p3/4P3/1B3N2/PPPP1PPP/RNBQR1K1 w - -
P#2: 5k2/6p1/2p2p2/P7/1Q6/2P1pqPP/7K/8 b - - bm c5; id Quick-19;
Regards, Mike
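
For anyone who wants to check the "Average Time Relation" rows: they appear to be the arithmetic mean of the per-engine ratio time(2T) / time(4T), which reproduces the posted 1.60 and 1.89. A minimal Python sketch, with the times copied from the table above (the names `P1`, `P2` and `average_time_relation` are only illustrative):

```python
from statistics import mean

# (engine, time with 2 threads, time with 4 threads), copied from the table.
P1 = [("Critter 1.6a", 34, 20), ("Deep Fritz 13", 48, 62),
      ("Houdini 1.5a", 40, 26), ("Rybka 2.3.2a", 36, 26),
      ("Stockfish 100413", 34, 13)]
P2 = [("Critter 1.6a", 59, 34), ("Deep Fritz 13", 32, 15),
      ("Houdini 1.5a", 113, 36), ("Rybka 2.3.2a", 113, 136),
      ("Stockfish 100413", 66, 41)]

def average_time_relation(rows):
    """Arithmetic mean of time(2T) / time(4T) over all engines."""
    return mean(t2 / t4 for _, t2, t4 in rows)

avg1 = average_time_relation(P1)
avg2 = average_time_relation(P2)
total = mean([avg1, avg2])
print(f"P#1 {avg1:.2f}:1  P#2 {avg2:.2f}:1  total {total:.2f}:1")
```

Note that a plain mean of ratios is pulled around by single outliers such as the rows marked "!" and "?", which is one reason a handful of single runs is hard to interpret.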

Vinvin
Posts: 4330
Joined: Thu Mar 09, 2006 8:40 am
Full name: Vincent Lejeune

Re: Hyperthreading and Computer Chess: Intel i5-3210M

Post by Vinvin » Fri Apr 12, 2013 8:15 am

Don't forget that with more than 1 thread, the shape of the tree is not constant. The move tree analyzed can be very different if there are fail highs/lows in some part of it.
I suggest you take at least 10 measurements to get an average.

lucasart
Posts: 3036
Joined: Mon May 31, 2010 11:29 am
Full name: lucasart

Re: Hyperthreading and Computer Chess: Intel i5-3210M

Post by lucasart » Fri Apr 12, 2013 8:38 am

Thank you Mike for this interesting analysis. This is also what I experienced in testing. For most practical purposes HT cores are like real cores, and it's a waste of CPU resources to limit yourself to the real cores only.

Here's the setup that I would recommend, assuming you have N cores, including the HT ones:

=> SMP capable engines
- pondering off: N-1 threads
- pondering on: (N-1)/2 threads

=> non SMP engines
- pondering off: play N-1 games concurrently
- pondering on: play (N-1)/2 games concurrently

The reason for N-1 is that you need to leave some CPU power to the operating system and other things running in the background, otherwise you introduce a significant amount of noise in the results.

That's for playing games. For analyzing a position, you should use N threads, of course.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
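
The rule of thumb above can be written down as a small helper. This is only a sketch of the recommendation as posted; the function name is hypothetical, and rounding (N-1)/2 down with a minimum of 1 is an assumption, since the post does not say how to handle odd values.

```python
def recommended_threads(logical_cores, pondering, analysis=False):
    """Thread count per engine following the rule of thumb in the post.

    logical_cores is N, counting the HT cores as well.
    For non-SMP engines, read the result as the number of concurrent games.
    """
    if analysis:
        # Analyzing a single position: use all N logical cores.
        return logical_cores
    # Leave one logical core to the OS and background tasks.
    usable = max(logical_cores - 1, 1)
    if pondering:
        # Both engines think at the same time, so each gets half.
        # Assumption: round down, but never below one thread.
        return max(usable // 2, 1)
    return usable

# The i5-3210M from the first post has N = 4 logical cores.
print(recommended_threads(4, pondering=False))
print(recommended_threads(4, pondering=True))
print(recommended_threads(4, pondering=False, analysis=True))
```

On the 4-logical-core laptop this gives 3 threads with pondering off, 1 with pondering on, and 4 for analysis.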

geots
Posts: 4790
Joined: Fri Mar 10, 2006 11:42 pm

Re: Hyperthreading and Computer Chess: Intel i5-3210M

Post by geots » Fri Apr 12, 2013 11:35 am

lucasart wrote:Thank you Mike for this interesting analysis. This is also what I experienced in testing. For most practical purposes HT cores are like real cores, and it's a waste of CPU resources to limit yourself to the real cores only.

Here's the setup that I would recommend, assuming you have N cores, including the HT ones:

=> SMP capable engines
- pondering off: N-1 threads
- pondering on: (N-1)/2 threads

=> non SMP engines
- pondering off: play N-1 games concurrently
- pondering on: play (N-1)/2 games concurrently

The reason for N-1 is that you need to leave some CPU power to the operating system and other things running in the background, otherwise you introduce a significant amount of noise in the results.

That's for playing games. For analyzing a position, you should use N threads, of course.

Heading to get some sleep, but thought I would ask you this before I did. I have in transit a new i7 6-core system that has HT. But it can be completely disabled in the bios. Would you please repeat your above comments for me in English (meaning layman's terms) and how I should handle it in testing- HT disabled or not?


Best,

george

lucasart
Posts: 3036
Joined: Mon May 31, 2010 11:29 am
Full name: lucasart

Re: Hyperthreading and Computer Chess: Intel i5-3210M

Post by lucasart » Fri Apr 12, 2013 12:43 pm

geots wrote:
Heading to get some sleep, but thought I would ask you this before I did. I have in transit a new i7 6-core system that has HT. But it can be completely disabled in the bios. Would you please repeat your above comments for me in English (meaning layman's terms) and how I should handle it in testing- HT disabled or not?


Best,

george
Do NOT disable HT in the BIOS.

There are some very specific use cases where HT should be switched off, but chess is not one of them.

Houdini
Posts: 1471
Joined: Mon Mar 15, 2010 11:00 pm

Re: Hyperthreading and Computer Chess: Intel i5-3210M

Post by Houdini » Fri Apr 12, 2013 1:11 pm

If these data are just a single run on each position, could you please repeat the procedure at least 6 times, so that you get statistically more meaningful results?
With a single run the random variance probably is significantly larger than whatever it is you want to measure.

Robert
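
As a rough illustration of what repeating the runs buys you, here is a minimal Python sketch that summarizes repeated time-to-depth measurements for one engine, position and thread count. The function name is hypothetical, and the example timings are made up purely to show the call; they are not measurements from this thread.

```python
from math import sqrt
from statistics import mean, stdev

def summarize_runs(times):
    """Return (mean, sample standard deviation, standard error of the mean)."""
    if len(times) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    m = mean(times)
    s = stdev(times)
    # The standard error shrinks with the square root of the number of runs.
    return m, s, s / sqrt(len(times))

# Hypothetical timings in seconds, NOT data from this thread.
example_times = [10.0, 12.0, 14.0]
m, s, se = summarize_runs(example_times)
print(f"mean={m:.2f}  stdev={s:.2f}  stderr={se:.2f}")
```

If the standard deviation across runs is of the same order as the difference between the 2-thread and 4-thread times, a single run per configuration cannot separate the two.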

Daniel Shawul
Posts: 3723
Joined: Tue Mar 14, 2006 10:34 am
Location: Ethiopia

Re: Hyperthreading and Computer Chess: Intel i5-3210M

Post by Daniel Shawul » Fri Apr 12, 2013 3:09 pm

I just did a quick run for Scorpio with 30 positions and sd=19 on an i7. I used the same positions for root-splitting SMP testing here. Well, it does help a bit, but definitely not as much as in your results. Maybe there have been some improvements since i5, but it used to be so bad that you couldn't get anything out of it. Sometimes the parallel version searches too few nodes, making it look as if it had searched a tree of the same size more quickly when it actually did much less work. Look at the results for the first two positions below, which show a speedup of more than 2x for the parallel search; that is unrealistic.

Summary:

Overhead=1.119790877	Speedup=1.278925236	NpsScale=1.253819895

Overhead is the ratio of nodes searched by the parallel search to nodes searched by the single-threaded search, Speedup is the time ratio you calculated, and NpsScale is the ratio of the nps figures.

Breakdown by position (Time is the per-position speedup, i.e. 1-thread time divided by 2-thread time):

Pos	Overhead	Time	Nps
1	0.564162854	2.395285585	1.351342101
2	0.461966673	2.799729364	1.292907252
3	0.747177059	1.78276699	1.331771051
4	1.326429632	1.004076641	1.331512428
5	0.926284322	1.471530249	1.362900279
6	1.326118952	0.993322204	1.316821923
7	2.049017317	0.628380262	1.287769291
8	1.351376686	0.895705521	1.211843191
9	1.970543459	0.643058956	1.267186644
10	1.410484565	0.895223712	1.262791622
11	0.958216627	1.302304147	1.248225839
12	0.774485341	1.517084282	1.175044964
13	0.736167635	1.639139863	1.206637108
14	1.135163923	1.025959368	1.164103135
15	0.614583773	1.837416481	1.129635144
16	0.928486158	1.398373984	1.298479336
17	0.846912203	1.509052183	1.277809895
18	1.509545139	0.883116883	1.331911156
19	0.850412293	1.472677596	1.251809033
20	0.98837578	1.302170284	1.287858813
21	1.712199611	0.75	1.285624017
22	0.801207492	1.545054945	1.237142435
23	1.040297195	1.168458781	1.215226641
24	0.854206042	1.405472637	1.200861908
25	1.227304906	0.959276018	1.177709846
26	0.965279498	1.274656679	1.230093076
27	2.019117814	0.620960699	1.253483118
28	0.922701724	1.322807018	1.220769908
29	1.061150907	1.125816993	1.195203546
30	1.514350717	0.798878767	1.210122143

Raw data for the 1-thread (left columns) and 2-thread (right columns) searches:

Nodes	Time	NPS	splits	bad		Nodes	Time	NPS	splits	bad
=====	====	===	======	===		=====	====	===	======	===
38218858	26.42	1446314	0	0		21561660	11.03	1954465	968	116
31400921	20.69	1517832	0	0		14506179	7.39	1962416	895	74
22715937	14.69	1546669	0	0		16972827	8.24	2059809	1099	144
46040373	24.63	1869583	0	0		61069315	24.53	2489373	1565	172
11453398	8.27	1384598	0	0		10609103	5.62	1887069	555	39
34011510	23.8	1429355	0	0		45103308	23.96	1882206	866	88
36181540	22.54	1605001	0	0		74136602	35.87	2066871	632	77
11153341	5.84	1908185	0	0		15072365	6.52	2312421	1113	172
66639698	34.14	1952010	0	0		131316421	53.09	2473561	1743	247
88505491	47.42	1866456	0	0		124835629	52.97	2356945	2435	181
49801820	28.26	1761960	0	0		47720932	21.7	2199324	1139	165
22787410	13.32	1710252	0	0		17648515	8.78	2009623	1007	131
148356423	90.71	1635592	0	0		109215197	55.34	1973566	1159	149
15029728	9.09	1653253	0	0		17061205	8.86	1924557	709	49
15240295	8.25	1847084	0	0		9366438	4.49	2086531	843	66
17245165	12.04	1432203	0	0		16011897	8.61	1859686	720	68
23278988	14.17	1643300	0	0		19715259	9.39	2099825	934	74
13485712	9.52	1417310	0	0		20357291	10.78	1887731	1013	97
16534614	10.78	1534107	0	0		14061239	7.32	1920409	810	42
12083650	7.8	1548193	0	0		11943187	5.99	1993854	1177	69
8087397	5.01	1612641	0	0		13847238	6.68	2073250	1716	98
25145349	14.06	1788558	0	0		20146642	9.1	2212701	744	66
48924050	29.34	1667656	0	0		50895552	25.11	2026580	1261	89
8811538	5.65	1559564	0	0		7526869	4.02	1872821	679	49
10167660	6.36	1597683	0	0		12478819	6.63	1881607	1199	135
35938248	20.42	1759953	0	0		34690454	16.02	2164906	858	61
14650681	7.11	2061443	0	0		29581451	11.45	2583984	736	81
12312435	7.54	1632949	0	0		11360705	5.7	1993455	781	103
11135076	6.89	1615652	0	0		11815996	6.12	1931033	735	38
18686209	11.4	1638566	0	0		28297474	14.27	1982865	754	88
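
To make the relation between the raw data and the per-position table explicit, here is a small Python sketch that recomputes the three ratios from the first two raw rows. The `Run` class and `ratios` function are hypothetical names introduced for illustration; the numbers are copied from the raw data above.

```python
from dataclasses import dataclass

@dataclass
class Run:
    nodes: int    # total nodes searched
    time: float   # search time as printed in the raw data
    nps: int      # nodes per second

def ratios(single, parallel):
    """Return (overhead, speedup, nps_scale) for one position."""
    overhead = parallel.nodes / single.nodes   # > 1 means extra nodes searched
    speedup = single.time / parallel.time      # > 1 means parallel was faster
    nps_scale = parallel.nps / single.nps      # raw throughput gain
    return overhead, speedup, nps_scale

# First two rows of the raw data: (1 thread, 2 threads).
rows = [
    (Run(38218858, 26.42, 1446314), Run(21561660, 11.03, 1954465)),
    (Run(31400921, 20.69, 1517832), Run(14506179, 7.39, 1962416)),
]

for pos, (one, two) in enumerate(rows, start=1):
    overhead, speedup, nps_scale = ratios(one, two)
    print(f"{pos}\t{overhead:.6f}\t{speedup:.6f}\t{nps_scale:.6f}")
```

For these two positions the overhead comes out well below 1, i.e. the 2-thread search visited far fewer nodes, which is why the speedup of more than 2x should not be taken at face value. The Summary line appears to be the arithmetic mean of each column over all 30 positions.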

Mike S.
Posts: 1480
Joined: Thu Mar 09, 2006 4:33 am

Re: Hyperthreading and Computer Chess: Intel i5-3210M

Post by Mike S. » Fri Apr 12, 2013 3:59 pm

Houdini wrote:could you please repeat the procedure at least 6 times
I understand this request, since MP search is not deterministic, but sorry, I cannot do that. If I ran it six times, then a 7th run would probably again be much different from the average of 1...6.

Sometimes I need to simplify my world. :mrgreen:
Regards, Mike

ernest
Posts: 1851
Joined: Wed Mar 08, 2006 7:30 pm

Re: Hyperthreading and Computer Chess: Intel i5-3210M

Post by ernest » Fri Apr 12, 2013 4:31 pm

Mike S. wrote:I'd run it six times, then a 7th run would probably again be much different from the average of 1...6.
Then you should learn about statistics...
(and computer chess has a lot to do with statistics!)
But you know all about that, Mike! :)

Don't confuse simple and simple-minded! 8-)

bob
Posts: 20401
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: Hyperthreading and Computer Chess: Intel i5-3210M

Post by bob » Fri Apr 12, 2013 5:05 pm

geots wrote:
Heading to get some sleep, but thought I would ask you this before I did. I have in transit a new i7 6-core system that has HT. But it can be completely disabled in the bios. Would you please repeat your above comments for me in English (meaning layman's terms) and how I should handle it in testing- HT disabled or not?


Best,

george
Regardless of urban legend, I have NEVER seen one example where using hyper threading improves the performance of a chess engine. Not a single one.

You don't need to disable HT, but if you have 6 cores, you should not run more than 6 processes in a parallel chess engine.

The reason for this is as follows.

In a good implementation of parallel search (most seem to use some flavor of the Crafty parallel search) each additional thread used adds about 30% overhead, meaning a new thread improves performance by only 70%. But with hyperthreading, you take a physical core and divide it into two equal halves.

With one core, if your tree search searches exactly 100M nodes, then on a hyper-threaded core (two threads, only one physical core) each thread will search about 50M nodes in the same amount of time. Unfortunately, one of those threads will have that 30% overhead, so only about 35M of its 50M nodes are useful work, which means you only searched the equivalent of 85M nodes as opposed to your original 100M. The search is slower. The dual threads might speed it up a bit, but nowhere near enough to offset that 30% overhead.

Bottom line is to NOT use hyper threading when playing chess. You don't have to disable it as new operating system process schedulers understand the issues and will make sure each thread runs on a physical core, unless you run more threads than physical cores. At that point, you start to hurt performance.
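
The arithmetic in this argument can be sketched in a few lines of Python. This only restates the model described in the post (two threads sharing one core equally, with 30% of the second thread's work lost to parallel overhead); the function names and the `ht_throughput_gain` parameter are assumptions added for illustration, the latter to express the "might speed it up a bit" case.

```python
def effective_nodes(baseline_nodes, overhead=0.30, ht_throughput_gain=1.0):
    """Useful nodes searched by two HT threads on one physical core.

    baseline_nodes: nodes a single thread searches alone in the same time.
    overhead: fraction of the second thread's work that is wasted.
    ht_throughput_gain: combined raw nodes of both threads relative to one
        thread alone (1.0 = the core is simply split into two equal halves).
    """
    per_thread = baseline_nodes * ht_throughput_gain / 2
    return per_thread + per_thread * (1 - overhead)

def break_even_gain(overhead=0.30):
    """Raw throughput gain HT must deliver to match one thread, in this model."""
    return 1 / (1 - overhead / 2)

print(effective_nodes(100_000_000))   # the 85M figure from the post
print(round(break_even_gain(), 3))    # required raw gain under this model
```

Under these assumptions hyperthreading has to raise raw node throughput by roughly 18% just to break even; whether a given engine and CPU achieve that is exactly what the measurements earlier in the thread are trying to establish.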
