g8B! 2009
kb6/p5P1/P6p/1K3P1P/3p3P/3P4/1p1N4/qB6 w - - 0 1
Something strange in Crystal 3.1: AVX2 seems better than BMI2, even on an Intel Core!?
When I run this position on my Core i5-9600K (6 cores) with 16 GB of hash:
With the AVX2 build, 1.g8B is found after 3 min 36 s
With the BMI2 build, 1.g8B is found after 22 min 02 s
With the BMI2 build (only 1 core), 1.g8B is found after 44 min 05 s
Do you think AVX2 is better for Crystal, even on an Intel Core?
Or is this just a special case?
Analysis by Crystal 150121 3.1 AVX2: (Core I5 9600K 6 cores)
..
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.Ba2 b1Q 5.Bxb1 Qd7 6.Ba2 Qe6 7.Kb5 Qxf6 8.g8Q Qf5+ 9.Ne5 Qd7+ 10.Kc4 Qb5+ 11.Kxd4 Qxe5+ 12.Kc4 Qc7+ 13.Kd4
= (0.00) Depth: 29/27 00:00:15 119MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.Ba2 b1Q 5.Bxb1 Qd7 6.Ba2 Qe6 7.Kb5 Qxf6 8.g8Q Qf5+ 9.Ne5 Qd7+ 10.Kc4 Qb5+ 11.Kxd4 Qxe5+ 12.Kc4 Qc7+ 13.Kd4
= (0.00) Depth: 30/27 00:00:19 153MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.Ba2 b1Q 5.Bxb1 Qd7 6.Ba2 Qe6 7.Kb5 Qxf6 8.g8Q Qf5+ 9.Ne5 Qd7+ 10.Kc4 Qb5+ 11.Kxd4 Qxe5+ 12.Kc4 Qc7+ 13.Kd4
= (0.00) Depth: 31/27 00:00:20 165MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.Ba2 b1Q 5.Bxb1 Qd7 6.Ba2 Qe6 7.Kb5 Qxf6 8.g8Q Qf5+ 9.Ne5 Qd7+ 10.Kc4 Qb5+ 11.Kxd4 Qxe5+ 12.Kc4 Qc7+ 13.Kd4
= (0.00) Depth: 32/27 00:00:28 247MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.Ba2 b1Q 5.Bxb1 Qd7 6.Ba2 Qe6 7.Kb5 Qxf6 8.g8Q Qf5+ 9.Ne5 Qd7+ 10.Kc4 Qb5+ 11.Kxd4 Qxe5+ 12.Kc4 Qc7+ 13.Kd4
= (0.00) Depth: 33/27 00:00:50 452MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.Ba2 b1Q 5.Bxb1 Qd7 6.Ba2 Qe6 7.Kb5 Qxf6 8.g8Q Qf5+ 9.Ne5 Qd7+ 10.Kc4 Qb5+ 11.Kxd4 Qxe5+ 12.Kc4 Qc7+ 13.Kd4
= (0.00) Depth: 34/27 00:01:02 571MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.Ba2 b1Q 5.Bxb1 Qd7 6.Ba2 Qe6 7.Kb5 Qxf6 8.g8Q Qf5+ 9.Ne5 Qd7+ 10.Kc4 Qb5+ 11.Kxd4 Qxe5+ 12.Kc4 Qc7+ 13.Kd4
= (0.00) Depth: 35/27 00:01:10 658MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.Ba2 b1Q 5.Bxb1 Qd7 6.Ba2 Qe6 7.Kb5 Qxf6 8.g8Q Qf5+ 9.Ne5 Qd7+ 10.Kc4 Qb5+ 11.Kxd4 Qxe5+ 12.Kc4 Qc7+ 13.Kd4
= (0.00) Depth: 36/27 00:01:24 790MN
1.g8B
= (0.08 ++) Depth: 37/32 00:03:35 2071MN
1.g8B
= (0.16 ++) Depth: 37/32 00:03:36 2074MN
1.g8B
= (0.28 ++) Depth: 37/32 00:03:36 2077MN
1.g8B
+/= (0.46 ++) Depth: 37/32 00:03:36 2081MN
....
Analysis by Crystal 140121 3.1 BMI2 (Core I5 9600K 6 cores)
..
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.f7 Qd7 5.g8Q Qb5+ 6.Kxd4 Qxc4+ 7.Ke3 Qf4+ 8.Ke2 Qh2+ 9.Ke3
= (0.00) Depth: 28/19 00:00:12 91400kN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.f7 Qd7 5.g8Q Qb5+ 6.Kxd4 Qxc4+ 7.Ke3 Qf4+ 8.Ke2 Qh2+ 9.Ke3
= (0.00) Depth: 29/19 00:00:19 170MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.f7 Qd7 5.g8Q Qb5+ 6.Kxd4 Qxc4+ 7.Ke3 Qf4+ 8.Ke2 Qh2+ 9.Ke3
= (0.00) Depth: 30/19 00:00:28 257MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.f7 Qd7 5.g8Q Qb5+ 6.Kxd4 Qxc4+ 7.Ke3 Qf4+ 8.Ke2 Qh2+ 9.Ke3
= (0.00) Depth: 31/19 00:00:33 303MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.f7 Qd7 5.g8Q Qb5+ 6.Kxd4 Qxc4+ 7.Ke3 Qf4+ 8.Ke2 Qh2+ 9.Ke3
= (0.00) Depth: 32/19 00:01:03 599MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.f7 Qd7 5.g8Q Qb5+ 6.Kxd4 Qxc4+ 7.Ke3 Qf4+ 8.Ke2 Qh2+ 9.Ke3
= (0.00) Depth: 33/19 00:01:43 999MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.f7 Qd7 5.g8Q Qb5+ 6.Kxd4 Qxc4+ 7.Ke3 Qf4+ 8.Ke2 Qh2+ 9.Ke3
= (0.00) Depth: 34/19 00:01:46 1035MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.f7 Qd7 5.g8Q Qb5+ 6.Kxd4 Qxc4+ 7.Ke3 Qf4+ 8.Ke2 Qh2+ 9.Ke3
= (0.00) Depth: 35/19 00:03:38 2147MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.f7 Qd7 5.g8Q Qb5+ 6.Kxd4 Qxc4+ 7.Ke3 Qf4+ 8.Ke2 Qh2+ 9.Ke3
= (0.00) Depth: 36/19 00:08:18 4831MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.f7 Qd7 5.g8Q Qb5+ 6.Kxd4 Qxc4+ 7.Ke3 Qf4+ 8.Ke2 Qh2+ 9.Ke3
= (0.00) Depth: 37/19 00:17:39 10344MN
1.g8B
= (0.08 ++) Depth: 38/34 00:22:02 12969MN
1.g8B
= (0.16 ++) Depth: 38/34 00:22:03 12978MN
1.g8B
= (0.28 ++) Depth: 38/38 00:22:04 12984MN
1.g8B
+/= (0.46 ++) Depth: 38/39 00:22:05 12990MN
Analysis by Crystal 140121 3.1 BMI2 (Core I5 9600K 1 core)
...
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.Ba2 b1Q 5.Bxb1 Qd7 6.Ba2 Qe6 7.Kb5 Qxf6 8.g8Q Qf5+ 9.Ne5 Qd7+ 10.Kc4 Qb5+ 11.Kxd4 Qb2+ 12.Ke4 Qxe5+ 13.Kf3 Qf4+ 14.Ke2 Qe5+
= (0.00) Depth: 38/34 00:10:10 1127MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.Ba2 b1Q 5.Bxb1 Qd7 6.Ba2 Qe6 7.Kb5 Qxf6 8.g8Q Qf5+ 9.Ne5 Qd7+ 10.Kc4 Qb5+ 11.Kxd4 Qb2+ 12.Ke4 Qxe5+ 13.Kf3 Qf4+ 14.Ke2 Qe5+
= (0.00) Depth: 39/34 00:12:28 1381MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.Ba2 b1Q 5.Bxb1 Qd7 6.Ba2 Qe6 7.Kb5 Qxf6 8.g8Q Qf5+ 9.Ne5 Qd7+ 10.Kc4 Qb5+ 11.Kxd4 Qb2+ 12.Ke4 Qxe5+ 13.Kf3 Qf4+ 14.Ke2 Qe5+
= (0.00) Depth: 40/34 00:34:02 3778MN
1.f6 Qa3 2.Nc4 Qb3+ 3.Kc5 Qa4 4.Ba2 b1Q 5.Bxb1 Qd7 6.Ba2 Qe6 7.Kb5 Qxf6 8.g8Q Qf5+ 9.Ne5 Qd7+ 10.Kc4 Qb5+ 11.Kxd4 Qb2+ 12.Ke4 Qxe5+ 13.Kf3 Qf4+ 14.Ke2 Qe5+
= (0.00) Depth: 41/34 00:36:54 4099MN
1.g8B
= (0.08 ++) Depth: 42/34 00:44:05 4897MN
1.g8B
= (0.16 ++) Depth: 42/34 00:44:09 4905MN
1.g8B
= (0.28 ++) Depth: 42/34 00:44:12 4912MN
1.g8B
+/= (0.46 ++) Depth: 42/34 00:44:17 4921MN
...
1.g8B Bf4 2.Bga2 Bxd2 3.f6 Qxa2 4.Bxa2 b1Q+ 5.Bxb1 Bc1 6.Kc6 Kb8 7.Kd7 Bd2 8.Ba2 Be1 9.f7 Bb4 10.Bd5 Ba3 11.Bc4 Bf8 12.Bb5 Bg7 13.Ke7 Kc7 14.f8Q Bxf8+ 15.Kxf8 Kd8 16.Bc4 Kc8 17.Ba2 Kd7 18.Kg7 Kd6 19.Kxh6 Kd7 20.Bd5 Kc7 21.Bc4 Kc6 22.Kg6 Kb6 23.Kg7 Ka5 24.Kf6 Kb4 25.h6
+- (#78 ++) Depth: 44/80 01:52:28 15022MN
(17.02.2021)
Crystal 3.1 AVX2 or BMI2 for Intel Core ?
-
Raphexon
- Posts: 476
- Joined: Sun Mar 17, 2019 12:00 pm
- Full name: Henk Drost
Re: Crystal 3.1 AVX2 or BMI2 for Intel Core ?
Multithreaded search is not deterministic, so results can (and will) change from run to run.
If you redo the test, you will most likely get different times for both builds.
This result can be attributed to a lucky run for the AVX2 build (or an unlucky one for BMI2).
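One way to check this is to repeat the search several times per build and compare the spread instead of a single timing. A minimal sketch, assuming you note the time-to-solution of each run by hand in the GUI's clock format; the function names are hypothetical and nothing here talks to the engine itself:

```python
from statistics import median

def parse_clock(text):
    """Convert 'HH:MM:SS' (or 'MM:SS') into seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds

def summarize_runs(clock_times):
    """Return (fastest, median, slowest) in seconds for repeated runs."""
    secs = sorted(parse_clock(t) for t in clock_times)
    if not secs:
        raise ValueError("need at least one run")
    return secs[0], median(secs), secs[-1]

def ranges_overlap(a, b):
    """True if the [fastest, slowest] ranges of two summaries overlap,
    i.e. the runs do not clearly separate the two builds."""
    return a[0] <= b[2] and b[0] <= a[2]
```

With only one run per build, as in the first post, each summary collapses to a single point, so the comparison says nothing about run-to-run variance. Several runs per build are needed before `ranges_overlap` becomes informative.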
-
Jouni
- Posts: 3232
- Joined: Wed Mar 08, 2006 8:15 pm
Re: Crystal 3.1 AVX2 or BMI2 for Intel Core ?
BMI2 is faster; look at the node speed. But the difference from AVX2 is minimal, maybe 2-3%.
Jouni
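The node speed can be read straight off the status lines in the logs above. A small sketch, assuming "kN" means thousands of nodes and "MN" millions of nodes, and that the clock is elapsed time since the start of the search; the regular expression and the function name are my own, not part of any GUI or engine:

```python
import re

# Matches a GUI status line, e.g. "= (0.08 ++) Depth: 37/32 00:03:35 2071MN"
STATUS = re.compile(r"Depth:\s*\d+/\d+\s+(\d+):(\d+):(\d+)\s+(\d+)(kN|MN)")

def nodes_per_second(line):
    """Return average nodes/second for one status line, or None if no match."""
    m = STATUS.search(line)
    if m is None:
        return None
    hours, minutes, seconds, count, unit = m.groups()
    elapsed = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
    if elapsed == 0:
        return None
    # Assumption: kN = 1,000 nodes, MN = 1,000,000 nodes
    nodes = int(count) * (1_000 if unit == "kN" else 1_000_000)
    return nodes / elapsed

# First fail-high line of each 6-core log quoted in the first post
avx2 = nodes_per_second("= (0.08 ++) Depth: 37/32 00:03:35 2071MN")
bmi2 = nodes_per_second("= (0.08 ++) Depth: 38/34 00:22:02 12969MN")
print(f"AVX2 {avx2 / 1e6:.2f} MN/s, BMI2 {bmi2 / 1e6:.2f} MN/s, ratio {bmi2 / avx2:.3f}")
```

On these two lines the BMI2 build comes out roughly 2% ahead in nodes per second, even though it needed far longer to find 1.g8B. Raw speed and time-to-solution are separate measurements.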