A possibly harder test position from the game is one ply earlier, where the task is to avoid Rb5:
[d]8/4nk2/1p3p2/2rp2pp/1P1R1N1P/6P1/3KPP2/8 b - - 0 49 am Rb5
I am not sure whether avoiding Rb5 saves the game, but it seems that at least White does not have a forced win.
I am not sure how to evaluate the knight endgame after Rc4 and did not analyze it.
A test position from the european under 18 championship
-
Uri Blass
- Posts: 11150
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
-
Sorgenkind
Re: A test position from the european under 18 championship
Hi!
It's a quad Opteron 6180 SE (12 cores @ 2.5 GHz each).
Here are the results on 1, 2, 4, 8, 16, 32 and 48 cores with 64 GB hash (I re-ran the last one, which previously used 128 GB hash). I have marked the time to solution:
THREADS = 1
1. h4xg5 f6xg5! 2. Nf4xh5 Ne7-c6 3. Rd4-d3! Nc6-e5 4. Rd3-d4! Ne5-c6
= (0.00) Depth: 22/54 00:19:07.11 1012578kN (883 KN/s, 0 splits, 0 aborts)
1. Nf4-d3 Ne7-c6 2. e2-e3! Kf7-g6 3. Kd2-c3 Nc6xd4 4. e3xd4! Kg6-f5! 5. f2-f3 g5-g4! 6. Kc3-b3 g4xf3 7. Nd3-f2! Kf5-g6 8. Kb3-a4 Rb5xb4 9. Ka4xb4! f6-f5 10. Kb4-b5 f5-f4 11. g3xf4 Kg6-f5! 12. Kb5xb6 Kf5xf4! 13. Kb6-c5 Kf4-g3 14. Nf2-d3 Kg3xh4 15. Kc5xd5 Kh4-g5 16. Kd5-e4
= (1.51) Depth: 22/63 00:41:47.16 2235332kN (892 KN/s, 0 splits, 0 aborts)
1. Nf4-d3 Ne7-c6 2. e2-e3! Kf7-g6 3. Kd2-c3 Nc6xd4 4. e3xd4! Kg6-f5! 5. f2-f3 g5-g4! 6. Kc3-b3 g4xf3 7. Nd3-f2! Kf5-g6 8. Kb3-a4 Rb5xb4 9. Ka4xb4! f6-f5 10. Kb4-b5 f5-f4 11. g3xf4 Kg6-f5! 12. Kb5xb6 Kf5xf4! 13. Kb6-c5 Kf4-g3 14. Nf2-d3 Kg3xh4 15. Kc5xd5 Kh4-g5 16. Kd5-e4
= (1.51) Depth: 22/63 00:44:59.87 2401093kN (889 KN/s, 0 splits, 0 aborts)
THREADS = 2
1. h4xg5 f6xg5! 2. Nf4xh5 Ne7-c6 3. Rd4-d3! Nc6-e5 4. Rd3-d4! Ne5-c6
= (0.00) Depth: 22/49 00:10:13.71 1007283kN (1641 KN/s, 261122 splits, 17998 aborts)
1. Nf4-d3 Ne7-f5 2. e2-e3! Nf5-d6 3. Kd2-c3 Nd6-e4 4. Kc3-b2 Kf7-e6 5. Kb2-a3 Ke6-d6 6. Nd3-c1! Ne4xf2 7. Nc1-e2 Nf2-e4 8. Ka3-a4 Rb5-c5 9. b4xc5! b6xc5 10. Rd4-d1 Ne4-f2 11. Rd1-a1 Kd6-e5 12. Ka4-b3 Nf2-e4
= (1.91) Depth: 22/63 00:29:06.64 2981660kN (1707 KN/s, 520923 splits, 38232 aborts)
1. Nf4-d3 Ne7-f5 2. e2-e3! Nf5-d6 3. Kd2-c3 Nd6-e4 4. Kc3-b2 Kf7-e6 5. Kb2-a3 Ke6-d6 6. Nd3-c1! Ne4xf2 7. Nc1-e2 Nf2-e4 8. Ka3-a4 Rb5-c5 9. b4xc5! b6xc5 10. Rd4-d1 Ne4-f2 11. Rd1-a1 Kd6-e5 12. Ka4-b3 Nf2-e4
= (1.91) Depth: 22/63 00:30:27.89 3109912kN (1701 KN/s, 548341 splits, 41091 aborts)
1. Nf4-d3 Ne7-f5 2. e2-e3! Nf5-d6 3. Kd2-c3 Nd6-e4 4. Kc3-b2 Kf7-e6 5. Kb2-a3 Ke6-d6 6. Nd3-c1 Ne4xf2 7. Nc1-e2 Nf2-e4 8. Ka3-a4 Rb5-c5 9. b4xc5! b6xc5 10. Rd4-d1 Ne4-f2 11. Rd1-a1 Kd6-e5 12. Ka4-b3 Nf2-e4
THREADS = 4
1. h4xg5 f6xg5! 2. Nf4xh5! Ne7-c6 3. Rd4-d3! Nc6-e5 4. Rd3-d4 Ne5-c6
= (0.00) Depth: 22/50 00:07:24.29 1347414kN (3033 KN/s, 1319277 splits, 84310 aborts)
1. Nf4-d3 Ne7-c6 2. e2-e3! Kf7-e6 3. Kd2-c3 Ke6-f5 4. f2-f3 Nc6xd4 5. e3xd4! g5xh4 6. g3xh4! Kf5-e6 7. Kc3-b3 Rb5-c5! 8. Nd3xc5 Ke6-f5 9. Kb3-a4 b6-b5 10. Ka4xb5
= (1.46) Depth: 22/63 00:13:58.18 2626482kN (3134 KN/s, 2009764 splits, 133168 aborts)
1. Nf4-d3 Ne7-c6 2. e2-e3! Kf7-e6 3. Kd2-c3 Ke6-f5 4. f2-f3 Nc6xd4 5. e3xd4! g5xh4 6. g3xh4! Kf5-e6 7. Kc3-b3 Rb5-c5! 8. Nd3xc5 Ke6-f5 9. Kb3-a4 b6-b5 10. Ka4xb5
= (1.46) Depth: 22/63 00:15:03.08 2814896kN (3117 KN/s, 2236086 splits, 152963 aborts)
THREADS = 8
1. h4xg5 f6xg5! 2. Nf4xh5 Ne7-c6 3. Rd4-d3! Nc6-e5 4. Rd3-d4 Ne5-c6
= (0.00) Depth: 22/52 00:03:23.17 1048777kN (5162 KN/s, 2621744 splits, 146236 aborts)
1. Nf4-d3 Ne7-c6 2. e2-e3 Kf7-e6 3. Kd2-c2 Ke6-f5 4. g3-g4 h5xg4 5. h4-h5 Nc6-b8 6. Kc2-c3 Nb8-a6 7. Nd3-e1 g4-g3 8. f2xg3 g5-g4 9. Ne1-c2 Kf5-g5 10. Nc2-a3 Rb5xb4 11. Rd4xb4 Na6xb4 12. Kc3xb4 Kg5xh5
= (1.37) Depth: 22/60 00:09:13.72 3203603kN (5786 KN/s, 5388243 splits, 315848 aborts)
1. Nf4-d3 Ne7-c6 2. e2-e3 Kf7-e6 3. Kd2-c2 Ke6-f5 4. g3-g4 h5xg4 5. h4-h5 Nc6-b8 6. Kc2-c3 Nb8-a6 7. Nd3-e1 g4-g3 8. f2xg3 g5-g4 9. Ne1-c2 Kf5-g5 10. Nc2-a3 Rb5xb4 11. Rd4xb4 Na6xb4 12. Kc3xb4 Kg5xh5
= (1.37) Depth: 22/60 00:09:48.73 3383309kN (5747 KN/s, 5915638 splits, 354794 aborts)
THREADS = 16
1. h4xg5 f6xg5 2. Nf4xh5 Ne7-c6 3. Rd4-d3 Nc6-e5 4. Rd3-d4 Ne5-c6
= (0.00) Depth: 23/59 00:01:19.42 725061kN (9129 KN/s, 2478539 splits, 120256 aborts)
1. Nf4-d3 Ne7-c6 2. e2-e3 Nc6xd4 3. e3xd4 Kf7-g6 4. Kd2-c3 Kg6-f5 5. f2-f3 g5-g4 6. f3xg4 Kf5xg4 7. Kc3-b3 Kg4xg3 8. Kb3-a4 Rb5xb4 9. Nd3xb4 Kg3xh4 10. Ka4-b5 Kh4-g5 11. Nb4xd5
= (1.17) Depth: 23/59 00:03:10.19 1962846kN (10320 KN/s, 5060703 splits, 253415 aborts)
1. Nf4-d3 Ne7-c6 2. e2-e3 Nc6xd4 3. e3xd4 Kf7-g6 4. Kd2-c3 Kg6-f5 5. f2-f3 g5-g4 6. f3xg4 Kf5xg4 7. Kc3-b3 Kg4xg3 8. Kb3-a4 Rb5xb4 9. Nd3xb4 Kg3xh4 10. Ka4-b5 Kh4-g5 11. Nb4xd5
= (1.17) Depth: 23/59 00:03:29.02 2133147kN (10205 KN/s, 5846257 splits, 298719 aborts)
THREADS = 32
1. h4xg5 f6xg5 2. Nf4xh5 Ne7-c6 3. Rd4-d3 Nc6-e5 4. Rd3-d4 Ne5-c6
= (0.00) Depth: 22/52 00:00:33.92 397371kN (11714 KN/s, 3159646 splits, 131587 aborts)
1. Nf4-d3 Ne7-f5 2. e2-e3 Nf5xd4 3. e3xd4 Kf7-g6 4. Kd2-c3 Kg6-f5 5. f2-f3 g5-g4 6. f3xg4 Kf5xg4 7. Kc3-b3 Kg4xg3 8. Kb3-a4 Rb5-a5 9. b4xa5 b6xa5 10. Ka4xa5 Kg3-f3 11. Ka5-b4
= (1.28) Depth: 22/52 00:01:20.64 1207747kN (14976 KN/s, 7028155 splits, 306079 aborts)
1. Nf4-d3 Ne7-f5 2. e2-e3 Nf5xd4 3. e3xd4 Kf7-g6 4. Kd2-c3 Kg6-f5 5. f2-f3 g5-g4 6. f3xg4 Kf5xg4 7. Kc3-b3 Kg4xg3 8. Kb3-a4 Rb5-a5 9. b4xa5 b6xa5 10. Ka4xa5 Kg3-f3 11. Ka5-b4
= (1.28) Depth: 22/52 00:01:28.05 1286317kN (14608 KN/s, 7936072 splits, 355695 aborts)
THREADS = 48
1. h4xg5 f6xg5 2. Nf4xh5 Ne7-c6 3. Rd4-d3 Nc6-e5 4. Rd3-d4 Ne5-c6
= (0.00) Depth: 22/46 00:00:29.47 420266kN (14259 KN/s, 4727174 splits, 189790 aborts)
1. Nf4-d3 Ne7-c6 2. e2-e3 Kf7-e6 3. Kd2-c3 Nc6xd4 4. e3xd4 Ke6-f5 5. f2-f3 g5-g4 6. f3xg4 Kf5xg4 7. Kc3-b3 Kg4xg3 8. Kb3-a4 Rb5-a5 9. b4xa5 b6xa5 10. Ka4xa5 Kg3-f3 11. Ka5-b4
= (1.28) Depth: 22/60 00:01:09.00 1260596kN (18268 KN/s, 10913359 splits, 451647 aborts)
1. Nf4-d3 Ne7-c6 2. e2-e3 Kf7-e6 3. Kd2-c3 Nc6xd4 4. e3xd4 Ke6-f5 5. f2-f3 g5-g4 6. f3xg4 Kf5xg4 7. Kc3-b3 Kg4xg3 8. Kb3-a4 Rb5-a5 9. b4xa5 b6xa5 10. Ka4xa5 Kg3-f3 11. Ka5-b4
= (1.28) Depth: 22/60 00:01:15.69 1342571kN (17735 KN/s, 12187166 splits, 516283 aborts)
With more than 16 threads, it takes a very long search before the speed in nodes per second converges to its maximum (which, in middle-game positions, is usually between 30 MN/s and 35 MN/s on 48 threads).
Best regards,
Steffen
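As a rough way to read the results above, the time-to-solution speed-up can be tabulated directly from the reported timestamps. This is only a sketch: it assumes that the "marked" time is the first line in which 1. Nf4-d3 appears for each thread count, and the names `TIME_TO_SOLUTION`, `to_seconds` and `speedups` are illustrative, not part of any engine.

```python
# Time to solution per thread count, copied from the Zappa output above.
# Assumption: the "marked" time is the first line showing 1. Nf4-d3.
TIME_TO_SOLUTION = {
    1: "00:41:47.16",
    2: "00:29:06.64",
    4: "00:13:58.18",
    8: "00:09:13.72",
    16: "00:03:10.19",
    32: "00:01:20.64",
    48: "00:01:09.00",
}


def to_seconds(stamp):
    """Convert an HH:MM:SS.ss timestamp into seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def speedups(times):
    """Speed-up of each run relative to the single-thread run."""
    base = to_seconds(times[1])
    return {threads: base / to_seconds(stamp) for threads, stamp in times.items()}


if __name__ == "__main__":
    for threads, factor in speedups(TIME_TO_SOLUTION).items():
        efficiency = factor / threads
        print(f"{threads:2d} threads: {factor:5.1f}x faster, {efficiency:6.1%} per-thread efficiency")
```

The comparison is not exact: the 16-thread run reports depth 23 where the others report depth 22, and a parallel search does not visit the same tree from run to run, so time to solution can vary more than node speed does.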
-
Sorgenkind
Re: A test position from the european under 18 championship
Hi Louis!
Zappa is not a particularly fast searcher in terms of nodes per second (usually around 800 kN/s on one Opteron 6180 SE core). It does not have any user-adjustable parameters related to SMP search, but as far as I know it was designed to perform well on many cores, so I used it as a benchmark.
See my reply to kgburcham for the speed-up on 1, 2, ..., 48 cores.
Best regards,
Steffen
-
rbarreira
- Posts: 900
- Joined: Tue Apr 27, 2010 3:48 pm
Re: A test position from the european under 18 championship
Wow, that is a very interesting machine. I hope you won't mind a question about it. I have been interested in finding out how necessary it is to use NUMA-aware memory allocation in such a quad-cpu beast.
Have you compared, for example, Crafty on all 48 cores with and without NUMA libraries to see the speed boost? Does Zappa Mexico use the NUMA memory allocation libraries?
-
Houdini
- Posts: 1471
- Joined: Tue Mar 16, 2010 12:00 am
Re: A test position from the european under 18 championship
I've highlighted the node speeds.
Sorgenkind wrote:
Hi!
It's a quad Opteron 6180 SE (12 cores @ 2.5 GHz each).
Here are the results on 1, 2, 4, 8, 16, 32 and 48 cores (I re-ran the last one which previously used 128 GB hash) with 64 GB hash, I have marked the time to solution:
THREADS = 1
889 KN/s
THREADS = 2
1701 KN/s
THREADS = 4
3117 KN/s
THREADS = 8
5747 KN/s
THREADS = 16
10205 KN/s
THREADS = 32
14608 KN/s
THREADS = 48
17735 KN/s
These numbers are very typical of the performance I observed before implementing the NUMA awareness in Houdini 2, so I assume that Zappa is not NUMA-aware.
Each Opteron 6180 is identified by the system as 2 NUMA nodes, so in total your system has 8 NUMA nodes.
The node speed with 48 cores is only 3 times higher than with 8 cores; basically you're only using the hardware at 50% of its capabilities.
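The 50% figure can be reproduced from the highlighted node speeds. The sketch below divides the measured speed ratio by the ideal linear ratio; `NODE_SPEED_KNPS` and `scaling_efficiency` are names made up for this illustration.

```python
# Final node speeds (kN/s) per thread count, as highlighted in the post above.
NODE_SPEED_KNPS = {1: 889, 2: 1701, 4: 3117, 8: 5747, 16: 10205, 32: 14608, 48: 17735}


def scaling_efficiency(speeds, base_threads, threads):
    """Measured speed ratio divided by the ideal (linear) ratio."""
    measured = speeds[threads] / speeds[base_threads]
    ideal = threads / base_threads
    return measured / ideal


if __name__ == "__main__":
    ratio = NODE_SPEED_KNPS[48] / NODE_SPEED_KNPS[8]
    print(f"48 vs 8 threads: {ratio:.2f}x the node speed (ideal: 6.00x)")
    print(f"efficiency relative to 8 threads: {scaling_efficiency(NODE_SPEED_KNPS, 8, 48):.0%}")
    print(f"efficiency relative to 1 thread:  {scaling_efficiency(NODE_SPEED_KNPS, 1, 48):.0%}")
```

Relative to the 8-thread run, the ratio comes out at a little over 3 against an ideal of 6, which is where the "about 50%" comes from; measured against the single-thread run the efficiency is lower still.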
Robert
-
rbarreira
- Posts: 900
- Joined: Tue Apr 27, 2010 3:48 pm
Re: A test position from the european under 18 championship
So you're saying you got about a doubling of speed when making Houdini NUMA-aware? What was your test hardware?
-
Houdini
- Posts: 1471
- Joined: Tue Mar 16, 2010 12:00 am
Re: A test position from the european under 18 championship
I'm saying that Steffen's results show that his 48 cores only deliver 3 times the performance of 8 cores, meaning that the hardware efficiency is only 50%.
My NUMA development tests were made on a 16-core dual Opteron 6128.
I have received results of Houdini 2 running on a platform similar to Steffen's (quad Opteron 6178SE); with 31 cores the node speeds are about 32,000 to 35,000 kN/s. Without NUMA awareness it probably wouldn't get much higher than about 25,000 kN/s.
Robert
-
rbarreira
- Posts: 900
- Joined: Tue Apr 27, 2010 3:48 pm
Re: A test position from the european under 18 championship
So something like a 20-40% speed boost by using local memory when possible. That's pretty good, thanks for the info!
I would love to get my hands on one of those multi-socket Opterons, but they're not cheap...
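For reference, the boost implied by the figures quoted in this exchange can be worked out in a couple of lines. This is only arithmetic on the reported numbers (32,000 to 35,000 kN/s with NUMA awareness against an estimated 25,000 kN/s without); `relative_boost` is a hypothetical helper name.

```python
def relative_boost(numa_aware_knps, baseline_knps):
    """Fractional gain of the NUMA-aware node speed over the baseline."""
    return numa_aware_knps / baseline_knps - 1.0


# Estimate for the non-NUMA-aware case, taken from the post above.
BASELINE_KNPS = 25_000

for observed in (32_000, 35_000):
    gain = relative_boost(observed, BASELINE_KNPS)
    print(f"{observed} kN/s vs {BASELINE_KNPS} kN/s: {gain:.0%} faster")
```

On these numbers the gain is 28% to 40%, so the lower end of the "20-40%" estimate is slightly conservative.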
-
jdart
- Posts: 4420
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: A test position from the european under 18 championship
Arasan, latest build, on a Q9550 (4 cores), 5-man TBs:
"Grandelius-Raznikov, EU Youth Chess Ch, Albena 2011" bm Nd3
1 0.01 Ng2 +0.16 2538
1 0.01 hxg5 +1.36 2622
2 0.01 hxg5 +1.36 2625
3 0.01 hxg5 +1.36 2649
4 0.01 hxg5 +1.24 2912
5 0.01 hxg5 +1.12 3881
6 0.01 hxg5 +1.04 5457
7 0.01 -- +0.54 8806
7 0.01 hxg5 +0.16 10098
8 0.03 Nxh5 +0.16 21167
9 0.03 Nxh5 +0.04 39881
10 0.05 Nxh5 -0.24 61737
11 0.06 Nxh5 -0.44 101310
12 0.09 Nxh5 -0.45 177420
13 0.15 Nxh5 +0.00 304563
14 0.25 Nxh5 -0.45 563656
14 0.31 hxg5 +0.00 763573
15 0.36 hxg5 +0.00 875912
16 0.51 hxg5 -0.04 1352988
17 0.87 hxg5 +0.00 2368741
18 1.45 hxg5 +0.00 4078720
19 2.42 hxg5 +0.00 6993060
20 4.83 hxg5 +0.00 14138598
21 7.75 hxg5 +0.00 22622569
22 12.50 hxg5 +0.00 37034340
23 20.78 hxg5 +0.00 61698909
24 34.56 hxg5 +0.00 99638273
25 63.48 hxg5 +0.00 180954036
26 116.92 hxg5 +0.00 319454842
26 250.90 Nd3! +0.50 709058992
26 256.50 Nd3 +0.42 725813201
27 277.17 Nd3 +0.38 785546691
2.84M nodes/second.
36537 tablebase probes, 36537 tablebase hits
22767 splits, average thread usage=3.95
result: Nd3 score: +0.38 ++ solved in 250.90 sec. (709.06M nodes)
Nd3 Nf5 e3 Nxd4 exd4 Ke6 Kc3 Kf5 f3 g4 fxg4+ Kxg4 Kb3 Kxg3 Ka4 Ra5+ bxa5 bxa5 Kxa5 f5 Kb5 f4 Kc5 f3 Kxd5 Kxh4 Ke5 Kg3 Nc5
solution times:
0 1 2 3 4 5 6 7 8 9
0 | 250.90
correct : 1/1
nodes to solution : 709.06M
depth to solution : 25.00
time to solution : 250.90 sec.
test complete
-
perejaslav
- Posts: 240
- Joined: Sat Mar 18, 2006 4:01 am
- Location: Cold
Re: A test position from the european under 18 championship
[d]8/4nk2/1p3p2/1r1p2pp/1P1R1N1P/6P1/3KPP2/8 w - - 0 1
Analysis by Critter 0.90 64-bit SSE4:
1. +/= (0.30): 1.Nd3 Nf5 2.e3 Ke6 3.Kc3 Nd6 4.hxg5 fxg5 5.Kb3 Ne4 6.f3 Nc5+ 7.Kc3 Nxd3 8.Rxd3 g4 9.fxg4 hxg4 10.Rd4 Kf5 11.Rf4+ Kg5 12.Rf2
2. = (0.00): 1.hxg5 fxg5 2.Nxh5 Kg6 3.g4 Nc6 4.Rd3 Ne5 5.Rd4
3. = (0.00): 1.Nxh5 Nc6 2.Rg4 Kg6 3.Nxf6 Kxf6 4.Rxg5 Nxb4 5.g4 Rc5 6.f4 b5 7.h5 Rc2+ 8.Ke3 Rc3+ 9.Kf2 Rc4 10.Kf3 Rc3+ 11.e3 Nc2 12.Rf5+ Kg7 13.Rg5+
4. -/+ (-0.77): 1.Ng2 Nc6 2.Rd3 d4 3.Ne1 gxh4 4.gxh4 Rxb4 5.Nc2 Ra4 6.Rb3 Ne5 7.Rb5 d3 8.exd3 Rxh4 9.Ke2 Nd7 10.Ne3 Rh1 11.Nd5 h4 12.Nxb6 Nxb6 13.Rxb6 h3 14.Rb4 Ke6 15.Rh4 Ke5 16.Kf3 Rd1 17.Rxh3 Rxd3+ 18.Kg2 Rxh3 19.Kxh3
5. -/+ (-0.88): 1.Nh3 Nc6 2.Rd3 gxh4 3.Rc3 Ne7 4.Rb3 hxg3 5.fxg3 d4 6.Nf4 Rg5 7.Ra3 Nc6 8.Nd3 Ke7 9.Ra6 Rb5 10.Ra4 Ne5 11.Nf4 Nc4+ 12.Ke1 Kd6 13.Kf2 Ne3
It takes 5 seconds for Critter 0.90 to solve the test with 5 multiPV lines.