TCEC Super Final News...

Eelco de Groot
Posts: 4162
Joined: Sun Mar 12, 2006 1:40 am
Location: Groningen

Re: TCEC Super Final News...

Post by Eelco de Groot » Mon Oct 31, 2016 1:19 pm

Latest news: Kiran Panditrao has made two different binaries, one of them from Marco's very new NUMA branch, with some code based on Texel's NUMA support (by Peter Österlund). So if Martin or Anton Mihailov choose to leave Hyperthreading ON this time, at least Stockfish is prepared. Thanks Marco and Kiran! By the way, anyone can download these very fast binaries for their own system. https://groups.google.com/forum/#!topic ... bDn-Gyfjrk
This is probably close to Stockfish 8, at least as it stands now? There may of course still be changes, but probably not many before the TCEC final.
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan

Dan Cooper
Posts: 184
Joined: Sun Nov 01, 2015 2:15 am

Re: TCEC Super Final News...

Post by Dan Cooper » Mon Oct 31, 2016 1:55 pm

And as we all know, SF team submitting a binary from something other than master has never backfired before...

syzygy
Posts: 4455
Joined: Tue Feb 28, 2012 10:56 pm

Re: TCEC Super Final News...

Post by syzygy » Mon Oct 31, 2016 5:12 pm

Dan Cooper wrote:And as we all know, SF team submitting a binary from something other than master has never backfired before...
You are referring to a bug that was also present in master at the time. Unsurprisingly, after reverting to the old executable based on master the bug hit once more in that same round.

CheckersGuy
Posts: 273
Joined: Wed Aug 24, 2016 7:49 pm

Re: TCEC Super Final News...

Post by CheckersGuy » Mon Oct 31, 2016 6:47 pm

How much does Stockfish gain from hyperthreading? It's not that much, I assume :?

Eelco de Groot
Posts: 4162
Joined: Sun Mar 12, 2006 1:40 am
Location: Groningen

Re: TCEC Super Final News...

Post by Eelco de Groot » Mon Oct 31, 2016 11:35 pm

CheckersGuy wrote:How much does Stockfish gain from hyperthreading? It's not that much, I assume :?
This is not about hyperthreading as such, but about a Windows limitation/bug with this many processors. So in this case a lot of speed is actually lost: Marco thinks about a 20 to 30% loss with HT ON and the number of threads unchanged. That's just a feature of Windows.

But even with far fewer cores there can be a difference. It is surprising that we don't actually have hard data. It also depends on how modern your hyperthreading processor is. If on my modern i7 6700 I go from 4 threads to 7, that is, three hyperthreads, the Task Manager goes from 60% to almost 100%. This actually is another bug in Windows, I am told. You also don't see the chess engines in the list of processes, so from that list you can't tell where the 100% is actually coming from. This bug was not in the old Windows XP Task Manager, and XP is more than 10 years old now. But back to hyperthreading, here is just one example with 7 threads. Try to compare the time to depth, because you can't infer everything from the node counts either.

49/61 13:12 -2.85 1.Bb2 Bc6 2.Ke2 f3+ 3.Kd2 Bd7 4.Bc1 Kd4
5.c6 Bxc6 6.Ke1 Ke5 7.Be3 Kd5 8.Kd1 Bb7
9.Bb6 Kc4 10.Ke1 Bd5 11.Bc7 Kd3
12.Bf4 Bc6 13.Kf1 Kd4 14.Bb8 (12.339.655.921) 15567

50/73 18:14 -2.85 1.c6 Bxc6 2.Bc3 Bb7 3.Bb2 Kd3 4.Be5 f3
5.Kf1 Be4 6.Bd6 Kc3 7.Bc7 Kc2 8.Ke1 Kb3
9.Kd1 Bb7 10.Ke1 Kxa3 11.Ba5 Ka2
12.Kf2 Kb2 13.Ke1 Kc3 14.Kf2 (17.084.715.174) 15612

51/73 18:50 -2.85 1.c6 Bxc6 2.Bc3 Bb7 3.Bb2 Kd3 4.Be5 f3
5.Kf1 Kc2 6.Bd6 Bc6 7.Bg3 Be4 8.Bc7 Kc3
9.Be5+ Kb3 10.Bc7 Bb7 11.Bb6 Kxa3
12.Ba5 Kb3 13.Ke1 Bc6 14.Kd2 (17.661.325.264) 15625

52/73 19:10 -2.85 1.c6 Bxc6 2.Bc3 Bb7 3.Bb2 Kd3 4.Be5 f3
5.Kf1 Kc2 6.Bd6 Bc6 7.Bg3 Be4 8.Bc7 Kb2
9.Kf2 Kb3 10.Bd8 Kxa3 11.Ba5 Ka2
12.Kg3 Kb2 13.Kf2 Kc3 14.Kg1 (17.978.807.470) 15622

53/73 19:24 -2.85 1.c6 Bxc6 2.Bc3 Bb7 3.Bb2 Kd3 4.Be5 f3
5.Kf1 Kc2 6.Bd6 Bc6 7.Bg3 Be4 8.Bc7 Kb2
9.Kf2 Kb3 10.Bd8 Kxa3 11.Ba5 Ba8
12.Kg1 Bb7 13.Kf1 Bc6 14.Ke1 (18.180.245.580) 15613

54/73 20:19 -2.85 1.c6 Bxc6 2.Bc3 Bb7 3.Bb2 Kd3 4.Be5 f3
5.Kf1 Kc2 6.Bd6 Bc6 7.Bg3 Be4 8.Bc7 Kb2
9.Kf2 Kc3 10.Bd8 Kc4 11.Ke1 Kb3
12.Bc7 Kxa3 13.Ba5 Kb3 14.Kf1 (19.009.339.990) 15591

55/73 22:10 -2.85 1.c6 Bxc6 2.Bc3 Bb7 3.Bb2 Kd3 4.Be5 f3
5.Kf1 Kc2 6.Bd6 Bc6 7.Bg3 Be4 8.Bc7 Kb2
9.Kf2 Kb3 10.Ke3 Bc6 11.Kd2 Kxa3
12.Ba5 Bd5 13.Ke3 Ka4 14.Kd2 (20.760.362.936) 15606

56/73 26:28 -2.93-- 1.c6 Bxc6 (24.569.558.183) 15464

56/73 28:03 -2.93 1.c6 Bxc6 2.Bc3 Bb7 3.Bb2 Kd3 4.Be5 f3
5.Kf1 Kc2 6.Bd6 Bc6 7.Bc7 Kb2 8.Bb6 Kb3
9.Bc7 Bb7 10.Kf2 Kc3 11.Be5+ Kc2
12.Bd4 Kb3 13.Bb6 Kxa3 14.Ba5 (26.087.344.060) 15491

57/73 32:38 -2.93 1.c6 Bxc6 2.Bc3 Bb7 3.Bb2 Kd3 4.Be5 f3
5.Kf1 Kc2 6.Bd6 Bc6 7.Bc7 Kb2 8.Bb6 Kb3
9.Bc7 Kc4 10.Kf2 Kd4 11.Kf1 Kd3
12.Bg3 Kc4 13.Kf2 Kc3 14.Bc7 (30.600.134.120) 15626

58/73 36:37 -2.93 1.c6 Bxc6 2.Bc3 Bb7 3.Bb2 Kd3 4.Be5 f3
5.Kf1 Kc2 6.Bd6 Bc6 7.Bc7 Kb2 8.Ba5 Kxa3
9.Ke1 Bd5 10.Kf1 Kb3 11.Kf2 Kc4
12.Kg3 Kd4 13.Bb6+ Ke4 14.Kf2 (34.162.487.604) 15547

59/79 46:09 -2.93 1.c6 Bxc6 2.Bc3 Bb7 3.Bb2 Kd3 4.Be5 f3
5.Kf1 Kc2 6.Bd6 Bc6 7.Bc7 Kb2 8.Ba5 Kxa3
9.Ke1 Bd5 10.Kf1 Kb2 11.Kf2 Kc3
12.Kf1 Kc4 13.Kf2 Bc6 14.Ke1 (43.010.859.903) 15529

60/79 66:22 -2.94 1.c6 Bxc6 2.Bc3 Bb7 3.Bb2 Kd3 4.Be5 f3
5.Kf1 Kc2 6.Bd6 Bc6 7.Bc7 Kb2 8.Ba5 Kxa3
9.Ke1 Bd5 10.Kf1 Kb2 11.Kf2 Kc3
12.Kg1 Kd4 13.Bb6+ Ke4 14.Kf2 (61.686.504.434) 15489

61/79 70:10 -2.94 1.c6 Bxc6 2.Bc3 Bb7 3.Bb2 Kd3 4.Be5 f3
5.Kf1 Kc2 6.Bd6 Bc6 7.Bc7 Kb2 8.Ba5 Kxa3
9.Kf2 Kb3 10.Ke1 Kb2 11.Bb6 Kc3
12.Ba5 Bd5 13.Kf2 Be4 14.Kf1 (65.029.344.656) 15443

62/79 76:15 -2.94 1.c6 Bxc6 2.Bc3 Bb7 3.Bb2 Kd3 4.Be5 f3
5.Kf1 Kc2 6.Bd6 Bc6 7.Bc7 Kb2 8.Ba5 Kxa3
9.Kf2 Kb3 10.Ke1 Kb2 11.Bb6 Kc3
12.Ba5 Bd5 13.Kf2 Be4 14.Ke3 (70.491.062.057) 15405

63/79 138:43 -2.94 1.c6 Bxc6 2.Bc3 Bb7 3.Bb2 Kd3 4.Be5 f3
5.Kf1 Kc2 6.Bd6 Bc6 7.Bc7 Kb2 8.Ba5 Kxa3
9.Kf2 Kb3 10.Kg3 Kc4 11.Kf2 Bd5
12.Kf1 Kd3 13.Kf2 Kc3 14.Kg1 (133.182.933.714) 16000

64/79 214:42 -2.94 1.c6 Bxc6 2.Bc3 Bb7 3.Bb2 Kd3 4.Be5 f3
5.Kf1 Kc2 6.Bd6 Bc6 7.Bc7 Kb2 8.Ba5 Kxa3
9.Kf2 Bd5 10.Kg3 Ka4 11.Kf2 Kb3
12.Kf1 Kc3 13.Kf2 Bb7 14.Kf1 (206.512.296.728) 16030

65/79 256:24 -2.94 1.c6 Bxc6 2.Bc3 Bb7 3.Bb2 Kd3 4.Be5 f3
5.Kf1 Kc2 6.Bd6 Bc6 7.Bc7 Kb2 8.Ba5 Kxa3
9.Kf2 Bd5 10.Kg3 Kb2 11.Bb6 Kc3
12.Ba5 Kc4 13.Kf2 Bc6 14.Ke3 (248.901.221.740) 16178

best move: c5-c6 time: 296:28.078 min n/s: 16.178.540 nodes: 290.747.361.950


That was with 7 threads. And with just four threads, equal to the number of cores:

[D]8/8/p4Bp1/1pPb2P1/1P2kp2/P7/5K2/8 w - -

Engine: Rainbow Serpent 20160903_018 MP (512 MB)
by T. Romstad, M. Costalba, J. Kiiski, G. Linscott

29/36 0:01 -0.58 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bc7 Kd4
5.Bd6 Kd3 6.Be7 Kc2 7.Bd8 Be4 8.Bc7 Kd3
9.Bd6 Bb7 10.Be5 Bc6 11.Bc7 Bd5
12.Ba5 Kd4 13.Bc7 Bc6 14.Bd8 (18.422.729) 10661

30/37 0:02 -0.61 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bc7 Kd4
5.Bd6 Kd3 6.Bc7 Ke4 7.Bd8 Kd4 8.Bc7 Bb7
9.Bd8 Kc3 10.Bc7 Bd5 11.Be5+ Kb3
12.Bc7 Kxa3 13.Ba5 Be4 14.Ke3 (25.301.425) 10761

31/40 0:04 -0.61 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bc7 Kd4
5.Bd6 Bc6 6.Bc7 Kc3 7.Ke3 Kb3 8.Ba5 Kxa3
9.Kf2 Bd5 10.Ke3 Kb3 11.Kf2 Bc6
12.Ke3 Kc4 13.Kf2 Bb7 14.Ke3 (50.044.666) 10989

32/40 0:05 -0.61 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bf4 Kd4
5.Bc7 Bb7 6.Bd8 Ke4 7.Bf6 Bc6 8.Bd8 Kf4
9.Bf6 Ba8 10.Be7 Ke4 11.Bf6 Kd3
12.Bd8 Bb7 13.Bc7 Ke4 14.Bd8 (60.315.815) 11034

33/49 0:07 -0.56 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bf4 Kd4
5.Bc7 Bb7 6.Bd8 Ke4 7.Bf6 Bc6 8.Be7 Kf4
9.Bf6 Ba8 10.Be7 Ke4 11.Bf6 Bb7
12.Be7 Bd5 13.Bf8 Kd3 14.Be7 (84.668.322) 11140

34/49 0:08 -0.56 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bf4 Kd4
5.Bc7 Bb7 6.Bd8 Ke4 7.Bf6 Bd5 8.Be7 Kf4
9.Bf6 Kf5 10.Ke3 Kg4 11.Kf2 Kf4
12.Be7 Ba8 13.Bf6 Bc6 14.Bd8 (95.002.454) 11119

35/49 0:11 -0.56 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bf4 Kd4
5.Bc7 Bb7 6.Bd8 Ke4 7.Bf6 Bd5 8.Be7 Kf4
9.Bf6 Kf5 10.Ke3 Kg4 11.Kf2 Kf4
12.Be7 Ke4 13.Bf8 Kd3 14.Be7 (128.280.803) 11159

36/49 0:11 -0.56 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bf4 Kd4
5.Bc7 Bb7 6.Bd8 Ke4 7.Be7 Kd3 8.Bd8 Be4
9.Bc7 Bd5 10.Bd6 Ke4 11.Be7 Ba8
12.Bd6 Bc6 13.Bc7 Kd4 14.Bd8 (132.822.777) 11164

37/49 0:18 -0.56 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bf4 Kd4
5.Bc7 Bb7 6.Bd8 Ke4 7.Be7 Kf5 8.Ke3 Kg4
9.Kf2 Ba8 10.Bd8 Kf4 11.Be7 Ke4
12.Bd6 Kd4 13.Be7 Kc4 14.Bd8 (208.984.548) 11136

38/49 0:20 -0.56 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bf4 Kd4
5.Bc7 Bb7 6.Bd8 Ke4 7.Be7 Kf5 8.Ke3 Be4
9.Bd8 Bc6 10.Bf6 Kg4 11.Kf2 Ba8
12.Ke1 Bb7 13.Kf2 Kf5 14.Ke3 (230.676.960) 11151

39/49 0:31 -0.56 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bf4 Kd4
5.Bc7 Bb7 6.Bd8 Ke4 7.Be7 Kf5 8.Ke3 Be4
9.Bd8 Bc6 10.Kf2 Ke4 11.Bc7 Kd3
12.Bd8 Bb7 13.Bb6 Bd5 14.Bc7 (345.263.124) 11082

40/49 0:36 -0.56 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bf4 Kd4
5.Bc7 Bb7 6.Bd8 Ke4 7.Be7 Kf5 8.Ke3 Be4
9.Bd8 Bc6 10.Kf2 Bd5 11.Be7 Kg4
12.Bd8 Kf4 13.Bf6 Bc6 14.Be7 (407.972.583) 11032

41/51 0:54 -0.56 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bf4 Kd4
5.Bc7 Bb7 6.Bd8 Ke4 7.Be7 Bd5 8.Bf6 Kd3
9.Be7 Kc4 10.Bd8 Bc6 11.Bc7 Kd4
12.Ba5 Ba8 13.Bd8 Ke4 14.Be7 (599.323.450) 10992

42/51 1:04 -0.56 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bf4 Kd4
5.Bc7 Bb7 6.Bd8 Ke4 7.Be7 Kf5 8.Bf6 Kf4
9.Bd8 Bc6 10.Be7 Bd5 11.Bf6 Ke4
12.Be7 Kf5 13.Bd8 Be4 14.Ke3 (701.967.636) 10903

43/51 1:39 -0.56 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bf4 Kd4
5.Bc7 Bb7 6.Bd8 Ke4 7.Be7 Kf5 8.Bf6 Kf4
9.Bd8 Bc6 10.Be7 Bd5 11.Bf6 Kf5
12.Ke3 Kg4 13.Kf2 Bb7 14.Bd8 (1.091.016.587) 10975

44/51 2:08 -0.56 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bf4 Kd4
5.Bc7 Bb7 6.Bd8 Ke4 7.Be7 Kf5 8.Bf6 Kf4
9.Bd8 Bc6 10.Be7 Bd5 11.Bf6 Be4
12.Be7 Bc6 13.Bf6 Ke4 14.Be7 (1.407.152.315) 10961

45/51 3:10 -0.56 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bf4 Kd4
5.Bc7 Bb7 6.Bd8 Ke4 7.Be7 Bc6 8.Bd8 Kd3
9.Bc7 Kd4 10.Bd8 Ke4 11.Be7 Kd3
12.Bd8 Bb7 13.Be7 Kc4 14.Ke3 (2.085.584.523) 10935

46/54 3:32 -0.56 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bf4 Kd4
5.Bc7 Bb7 6.Bd8 Ke4 7.Be7 Bc6 8.Bd8 Bd7
9.Bc7 Be8 10.Bd8 Kf4 11.Bc7+ Kg4
12.Bd6 Bc6 13.Be7 Kf4 14.Bd8 (2.322.430.493) 10932

47/54 5:17 -0.56 1.Be7 Kd4 2.Bf6+ Kc4 3.Be5 f3 4.Bf4 Kd4
5.Bc7 Bb7 6.Bd8 Bc6 7.Bf6+ Ke4 8.Bd8 Kf4
9.Bf6 Bd7 10.Bg7 Ke4 11.Bf8 Bc6
12.Be7 Bd5 13.Bd8 Kf4 14.Be7 (3.442.620.568) 10852

48/70 7:47 -0.64-- 1.Be7 Kd3 (5.146.493.382) 11003

48/70 8:50 -0.72-- 1.Be7 Kd3 (5.823.462.990) 10982

48/70 11:24 -0.85-- 1.Be7 Kd3 (7.585.028.803) 11074

48/70 12:29 -1.02-- 1.Be7 Kd4 (8.313.283.076) 11087

48/70 14:28 -1.25-- 1.Be7 Kd4 (9.751.400.363) 11221

48/70 15:43 -1.55-- 1.Be7 Kd4 (10.654.245.464) 11292

48/70 17:18 -1.95-- 1.Be7 Kd4 (11.744.976.056) 11306

48/70 18:17 -2.48-- 1.Be7 Kd4 (12.409.514.000) 11308

48/70 21:04 -2.87 1.Be7 Kd4 2.Bf8 Be4 3.Bd6 f3 4.Kg3 Kc3
5.Kf2 Kb2 6.Bc7 Kxa3 7.Ba5 Kb3 8.Kg3 Kc3
9.c6 Bxc6 10.Kf2 Kc4 11.Kg3 Kd3
12.Kf2 Ke4 13.Bc7 Kf5 14.Bd8 (14.221.759.624) 11247

49/70 23:42 -2.87 1.Be7 Kd4 2.Bf8 Be4 3.Bd6 f3 4.Kg3 Kc3
5.Kf2 Kc4 6.c6 Bxc6 7.Bc7 Ba8 8.Ba5 Bb7
9.Bc7 Kb3 10.Kf1 Kxa3 11.Ba5 Kb2
12.Kf2 Bc6 13.Kf1 Kb3 14.Kf2 (16.006.210.878) 11254

50/70 24:12 -2.87 1.Be7 Kd4 2.Bf8 Be4 3.Bd6 f3 4.Kg3 Kc3
5.Kf2 Kc4 6.c6 Bxc6 7.Bc7 Ba8 8.Ba5 Bb7
9.Bd8 Bd5 10.Kg3 Bc6 11.Bf6 Kb3
12.Bd8 Kxa3 13.Ba5 Kb3 14.Kf2 (16.319.975.528) 11236

51/70 24:38 -2.87 1.Be7 Kd4 2.Bf8 Be4 3.Bd6 f3 4.Kg3 Kc3
5.Kf2 Kc4 6.c6 Bxc6 7.Bc7 Ba8 8.Ba5 Bb7
9.Bd8 Bd5 10.Kg3 Bc6 11.Bf6 Kb3
12.Bd8 Kxa3 13.Ba5 Kb3 14.Kf2 (16.590.924.472) 11224

52/70 25:24 -2.87 1.Be7 Kd4 2.Bf8 Be4 3.Bd6 f3 4.Kg3 Kc3
5.Kf2 Kc4 6.c6 Bxc6 7.Bc7 Ba8 8.Ba5 Bb7
9.Bd8 Bd5 10.Kg3 Bc6 11.Bf6 Bb7
12.Be7 Kb3 13.Bd8 Kb2 14.Bb6 (17.095.860.132) 11210

53/76 59:14 -2.87 1.c6 Bxc6 2.Kf1 Bd5 3.Bb2 f3 4.Kf2 Kf5
5.Bc1 Kg4 6.Be3 Be6 7.Kf1 Bf5 8.Ke1 Be4
9.Bc1 Bb7 10.Be3 Bc6 11.Kf1 Be4
12.Kg1 Kg3 13.Kf1 Bd5 14.Ke1 (40.215.227.138) 11314

54/76 59:37 -2.87 1.c6 Bxc6 2.Kf1 Bd5 3.Bb2 f3 4.Kf2 Kf5
5.Bc1 Kg4 6.Be3 Be6 7.Kf1 Bf5 8.Ke1 Be4
9.Bc1 Kg3 10.Be3 Bd5 11.Kf1 Be6
12.Ke1 Kh4 13.Kd1 Bb3+ 14.Ke1 (40.475.089.679) 11314

55/76 59:59 -2.87 1.c6 Bxc6 2.Kf1 Bd5 3.Bb2 f3 4.Kf2 Kf5
5.Bc1 Kg4 6.Be3 Be6 7.Kf1 Bf5 8.Ke1 Be4
9.Bc1 Bd3 10.Be3 Bf5 11.Kd1 Kg3
12.Ke1 Be6 13.Bb6 Kf4 14.Kf2 (40.735.789.118) 11318

56/76 60:15 -2.87 1.c6 Bxc6 2.Kf1 Bd5 3.Bb2 f3 4.Kf2 Kf5
5.Bc1 Kg4 6.Be3 Be6 7.Kf1 Bf5 8.Ke1 Be4
9.Bc1 Bd3 10.Be3 Bf5 11.Kd1 Kg3
12.Ke1 Be6 13.Bb6 Kf4 14.Kf2 (40.950.484.784) 11326

57/76 61:06 -2.87 1.c6 Bxc6 2.Kf1 Bd5 3.Bb2 f3 4.Kf2 Kf5
5.Bc1 Kg4 6.Be3 Be6 7.Kf1 Bf5 8.Ke1 Be4
9.Bc1 Bd3 10.Be3 Bf5 11.Kd1 Kg3
12.Ke1 Be6 13.Bb6 Kf4 14.Kf2 (41.550.652.912) 11332

58/76 73:08 -2.87 1.c6 Bxc6 2.Kf1 Bd5 3.Bb2 f3 4.Kf2 Kf5
5.Bc1 Kg4 6.Be3 Be6 7.Kf1 Kg3 8.Ke1 Bb3
9.Bd4 Kf4 10.Kf2 Bd5 11.Bf6 Kf5
12.Be7 Ke4 13.Bd8 Kd3 14.Bb6 (50.643.871.932) 11538

59/76 75:41 -2.87 1.c6 Bxc6 2.Kf1 Bd5 3.Bb2 f3 4.Kf2 Kf5
5.Bc1 Kg4 6.Be3 Be6 7.Kf1 Bf5 8.Ke1 Bd7
9.Kf1 Bc8 10.Bd2 Kf5 11.Be3 Ke4
12.Bc1 Bb7 13.Ke1 Bc6 14.Kf2 (52.476.764.372) 11556

60/76 84:29 -2.95-- 1.c6 Bxc6 (58.327.665.310) 11505

60/79 123:35 -2.95 1.c6 Bxc6 2.Kf1 Bd5 3.Bb2 f3 4.Kf2 Kd3
5.Be5 Kc4 6.Bf4 Kc3 7.Bd6 Kb3 8.Bc7 Bb7
9.Kf1 Kxa3 10.Ba5 Ka2 11.Bb6 Kb3
12.Ba5 Kc4 13.Ke1 Kd3 14.Kf2 (87.166.272.874) 11753

61/79 148:13 -2.95 1.c6 Bxc6 2.Kf1 Bd5 3.Bb2 f3 4.Kf2 Bb7
5.Bc1 Kd4 6.Bf4 Kc3 7.Bc1 Kc2 8.Bf4 Kb2
9.Bc7 Kxa3 10.Ba5 Bd5 11.Ke1 Kb3
12.Kf2 Bc6 13.Kg3 Ba8 14.Kf2 (105.259.319.178) 11835

62/79 177:41 -2.95 1.c6 Bxc6 2.Kf1 Kd3 3.Bb2 Bd5 4.Be5 f3
5.Kf2 Kc2 6.Bf6 Kb3 7.Bd8 Kxa3 8.Ba5 Kb3
9.Ke3 Kc3 10.Kf2 Bc6 11.Ke3 Kb3
12.Kf2 Ka3 13.Kf1 Be4 14.Kg1 (125.184.086.836) 11741

63/79 219:06 -2.95 1.c6 Bxc6 2.Kf1 Kd3 3.Bb2 Bd5 4.Be5 f3
5.Kf2 Kc4 6.Bf6 Kb3 7.Bd8 Kxa3 8.Ba5 Kb3
9.Ke3 Kc3 10.Kf2 Kd2 11.Bd8 Kd3
12.Bc7 Ke4 13.Ba5 Bb7 14.Bb6 (156.676.283.589) 11918

64/79 234:20 -2.95 1.c6 Bxc6 2.Kf1 Kd3 3.Bb2 Bd5 4.Be5 f3
5.Kf2 Kc4 6.Bf6 Kb3 7.Bd8 Kxa3 8.Ba5 Kb3
9.Ke3 Kc3 10.Kf2 Kd2 11.Bb6 Ba8
12.Ba5 Bb7 13.Bc7 Kd3 14.Bd8 (167.802.948.421) 11934

For any depth you pick, the difference in time to depth is sometimes only slight, and depth is not a fixed quantity anymore (Lazy SMP). Sometimes the difference in time to a certain ply depth is almost a factor of two, and reaching the same depth in half the time is about 50 Elo better. But that is an upper limit, and it would mean that 7 threads are twice as fast as four, which is theoretically impossible, especially if the extra ones are all hyperthreads. 'Super-linear scaling', it is called. Time to find 1.c6 is actually bad in both cases (version 018 probably has an inferior search, and strictly speaking this is not Stockfish anymore), but that is beside the point. There is also a lot of noise, which makes it difficult to measure the real differences.
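To make the time-to-depth comparison concrete, here is a small sketch that takes a few depths from the two logs above and divides the 4-thread time by the 7-thread time. The choice of depths and the helper names (`to_seconds`, `speedups`) are only illustrative assumptions; for each depth the time of the last completed line at that depth is used, and with a single run per setting the ratios are noisy.

```python
def to_seconds(clock):
    """Convert an 'mm:ss' clock string from the logs into seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


# Depth -> time of the last completed line at that depth, copied from the logs.
TIME_7_THREADS = {49: "13:12", 56: "28:03", 60: "66:22", 63: "138:43", 64: "214:42"}
TIME_4_THREADS = {49: "23:42", 56: "60:15", 60: "123:35", 63: "219:06", 64: "234:20"}


def speedups(slow, fast):
    """Return depth -> (slow time / fast time) for depths present in both runs."""
    return {
        depth: to_seconds(slow[depth]) / to_seconds(fast[depth])
        for depth in sorted(slow.keys() & fast.keys())
    }


if __name__ == "__main__":
    for depth, ratio in speedups(TIME_4_THREADS, TIME_7_THREADS).items():
        print(f"depth {depth}: 7 threads reach it {ratio:.2f}x sooner than 4 threads")
```

The ratio swings a lot from one depth to the next, which is the noise mentioned above: a single pair of runs cannot separate a real hyperthreading gain from Lazy SMP search variance.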

Milos
Posts: 3387
Joined: Wed Nov 25, 2009 12:47 am

Re: TCEC Super Final News...

Post by Milos » Mon Oct 31, 2016 11:58 pm

Eelco de Groot wrote:If on my modern i7 6700 I go from 4 threads to 7, that is, three hyperthreads, the Task Manager goes from 60% to almost 100%. This actually is another bug in Windows, I am told. You also don't see the chess engines in the list of processes, so from that list you can't tell where the 100% is actually coming from. This bug was not in the old Windows XP Task Manager, and XP is more than 10 years old now. But back to hyperthreading, here is just one example with 7 threads.
I am surprised that there are still people who use the Windows Task Manager. Just replace it with Process Explorer (practically a Microsoft tool, which has existed for almost two decades now); it is infinitely better.

syzygy
Posts: 4455
Joined: Tue Feb 28, 2012 10:56 pm

Re: TCEC Super Final News...

Post by syzygy » Tue Nov 01, 2016 12:08 am

Eelco de Groot wrote:
CheckersGuy wrote:How much does Stockfish gain from hyperthreading? It's not that much, I assume :?
This is not about hyperthreading as such, but about a Windows limitation/bug with this many processors. So in this case a lot of speed is actually lost: Marco thinks about a 20 to 30% loss with HT ON and the number of threads unchanged. That's just a feature of Windows.
Let's do the math.

If the Windows-based dual-node 22-core server has HT on, Windows will count 2 x 22 x 2 = 88 logical processors. Since 88 > 64, it will divide those logical processors over two processor groups, each having 2 x 22 logical processors corresponding to the 22 physical cores of one node. If a multi-threaded program does nothing special, Windows will schedule all its threads on the cores of a single processor group. So SF would run on only one of the two 22-core cpus.

Suppose SF could get 1 Mnps out of each physical core. It should then run at 44 Mnps. But if the processor-group limitation kicks in, it will run two search threads on each physical core of a single node (one per logical processor). This does not quite halve the speed, because HT is apparently giving about 30% extra nps. So SF should get about 1.3 x 0.5 x 44 = 28.6 Mnps instead of 44 Mnps, a loss of 35%.

I'll add that there is nothing inherently bad about having HT on in the BIOS. It's just Windows that makes it problematic.
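The arithmetic above can be replayed in a few lines. This is only a sketch of the reasoning in this post, not of how Windows or Stockfish actually work: the 64-logical-processor group limit, the 1 Mnps per core and the 30% HT gain are the figures used above, the even split of cores over groups is a simplifying assumption, and the function names are made up for illustration.

```python
import math

MAX_GROUP_SIZE = 64  # logical processors per Windows processor group


def processor_groups(nodes, cores_per_node, ht_on):
    """Return (logical processor count, processor group count) in this simple model."""
    logical = nodes * cores_per_node * (2 if ht_on else 1)
    return logical, math.ceil(logical / MAX_GROUP_SIZE)


def estimated_mnps(nodes, cores_per_node, ht_on, mnps_per_core=1.0, ht_gain=1.3):
    """Estimate speed for an engine that ignores processor groups and starts
    one search thread per physical core of the whole machine."""
    _, groups = processor_groups(nodes, cores_per_node, ht_on)
    physical = nodes * cores_per_node
    if groups == 1:
        # All threads can spread over all physical cores.
        return physical * mnps_per_core
    # Threads are confined to one group: only that group's cores are used,
    # each running two threads and gaining only the assumed HT factor.
    usable_cores = physical // groups
    return usable_cores * mnps_per_core * ht_gain


def relative_loss(nodes, cores_per_node):
    """Fraction of speed lost by switching HT on without group awareness."""
    off = estimated_mnps(nodes, cores_per_node, ht_on=False)
    on = estimated_mnps(nodes, cores_per_node, ht_on=True)
    return 1.0 - on / off


if __name__ == "__main__":
    print(processor_groups(2, 22, ht_on=True))
    print(f"estimated loss: {relative_loss(2, 22):.0%}")
```

Changing `ht_gain` shows how sensitive the estimate is to the assumed hyperthreading benefit; with no gain at all, the loss would be the full 50%.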

MikeB
Posts: 3467
Joined: Thu Mar 09, 2006 5:34 am
Location: Pen Argyl, Pennsylvania

Re: TCEC Super Final News...

Post by MikeB » Tue Nov 01, 2016 12:30 am

Dan Cooper wrote:And as we all know, SF team submitting a binary from something other than master has never backfired before...
+1 touché nice...

MikeB
Posts: 3467
Joined: Thu Mar 09, 2006 5:34 am
Location: Pen Argyl, Pennsylvania

Re: TCEC Super Final News...

Post by MikeB » Tue Nov 01, 2016 12:56 am

Mac Pro 12 physical cores
[D]8/8/p4Bp1/1pPb2P1/1P2kp2/P7/5K2/8 w - -

threads 12
hash 2048

info depth 68 seldepth 71 multipv 1 score cp -66 nodes 3360603603 nps 25621582 hashfull 999 tbhits 0 time 131163 pv f6e7 e4d3 e7d6 f4f3 d6e7 d5b7 e7d6 d3c2 f2e3 c2b2 d6c7 b2a3 c7a5 a3b3 e3f2 b3a4 f2e3 b7c6 e3f2 c6d5 f2e3 a4b3 e3f2 d5b7 f2e3 b3c4 e3f2 b7c6 f2e3 c6d5 e3f2 c4b3 f2e3 d5c6 e3d3 b3b2 d3e3 b2c3 e3f2 c6d5 f2e3 c3c2 e3f2 d5b7 f2e3 c2b1 e3f2 b7c6 f2e3 b1c2 e3f2 c2b3 f2e3 b3c4 e3f2 c6b7 f2e3 c4c3 e3f2 b7a8 f2e3 c3c4 e3f2 c4d4 a5d8 a8d5 d8a5 d5c6 a5c7 c6d5

threads 18
hash 2048

info depth 68 seldepth 71 multipv 1 score cp -114 nodes 2869441476 nps 31597950 hashfull 999 tbhits 0 time 90811 pv f6e7 e4d4 e7d6 f4f3 d6h2 d5c6 h2c7 d4d3 c7d6 d3e4 d6c7 c6d5 c7h2 d5b7 h2g3 b7a8 g3h4 a8c6 f2g1 e4d4 h4g3 d4c3 g3c7 c3c2 g1f1 c6d5 c7a5 c2b3 a5b6 d5c6 b6c7 b3b2 c7a5 c6b7 a5b6 b2c2 b6a5 b7e4 a5c7 c2b2 c7a5 b2b3 f1e1 e4c6 e1f1 b3c4 a5d8 c6b7 f1f2 c4c3 d8c7 c3c2 f2e3 c2b2 c7a5 b7e4 e3f2 e4d5 a5c7 b2b1 c7a5 d5b7 f2e3 b7c6 a5c7 b1c2 e3f2 c6d5 c7b6 c2b3
Trying to use 24 threads in this position is a disaster...

threads 24
hash 2048

info depth 62 seldepth 67 multipv 1 score cp -74 nodes 3313354076 nps 33230574 hashfull 999 tbhits 0 time 99708 pv f6d8 e4d4 d8c7 f4f3 c7f4 d4c3 f4c7 d5c6 c7a5 c6b7 a5b6 c3c4 b6d8 b7d5 d8c7 c4b3 f2e3 b3a3 c7a5 a3b3 e3f2 b3c3 f2e3 c3c4 e3f2 d5c6 f2g3 c4c3 g3f2 c3d4 a5c7 c6d5 c7a5 d4c4 f2e3 c4c3 e3f2 d5b7 f2e3 c3c4 e3f2 c4d3 a5c7 b7d5 c7d8 d5c6 d8a5 d3e4 a5c7 e4f5 c7d8 f5g4 d8e7 c6b7 e7d8 g4f4 d8f6 f4e4 f6e7 e4d3 e7d8 d3d4 d8a5 d4c4 f2e3 b7c6 e3f2
Larry Kaufman had suggested, for Komodo, setting the number of threads to 1.75 times the number of physical cores; that seems to work for Stockfish as well.
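For anyone who wants to compare such runs without reading the raw lines, here is a minimal sketch that pulls the numeric fields out of the three `info` lines above and compares nps and time against the 12-thread run. The lines are copied from this post with the pv cut off after two moves to keep the example short; `parse_info` is a made-up helper, not part of any UCI library.

```python
# UCI info lines from the three runs, keyed by thread count (pv truncated).
INFO_LINES = {
    12: "info depth 68 seldepth 71 multipv 1 score cp -66 nodes 3360603603 "
        "nps 25621582 hashfull 999 tbhits 0 time 131163 pv f6e7 e4d3",
    18: "info depth 68 seldepth 71 multipv 1 score cp -114 nodes 2869441476 "
        "nps 31597950 hashfull 999 tbhits 0 time 90811 pv f6e7 e4d4",
    24: "info depth 62 seldepth 67 multipv 1 score cp -74 nodes 3313354076 "
        "nps 33230574 hashfull 999 tbhits 0 time 99708 pv f6d8 e4d4",
}

WANTED = {"depth", "seldepth", "nodes", "nps", "time"}


def parse_info(line):
    """Return the integer fields of a UCI 'info' line, ignoring score and pv."""
    tokens = line.split()
    fields = {}
    for key, value in zip(tokens, tokens[1:]):
        if key == "pv":
            break  # everything after 'pv' is moves, not key/value pairs
        if key in WANTED:
            fields[key] = int(value)
    return fields


if __name__ == "__main__":
    runs = {threads: parse_info(line) for threads, line in INFO_LINES.items()}
    base = runs[12]
    for threads, run in runs.items():
        print(
            f"{threads} threads: depth {run['depth']}, "
            f"nps x{run['nps'] / base['nps']:.2f}, "
            f"time {run['time'] / 1000:.0f} s"
        )
```

Note that the 24-thread line is at depth 62 rather than 68, so its time is not directly comparable with the other two; only the nps ratio is.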

Eelco de Groot
Posts: 4162
Joined: Sun Mar 12, 2006 1:40 am
Location: Groningen

Re: TCEC Super Final News...

Post by Eelco de Groot » Tue Nov 01, 2016 1:11 am

MikeB wrote:
Dan Cooper wrote:And as we all know, SF team submitting a binary from something other than master has never backfired before...
+1 touché nice...
I don't quite remember what this might have been, sorry. But the alternative binary, I think, was mainly meant by Marco for Anton to measure whether there is a large speed difference on this system, which would indicate whether HT is set in the BIOS. We assume it will not be, but if it is, there is a 35% speed gain at stake if Ronald's computation is correct, so it is probably better to go with the other one for the final :shock: