old test position

jdart
Posts: 4406
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: old test position

Post by jdart »

For the record, this is from the game Nunn-Nataf, FRA-chT 1999 (http://www.chessgames.com/perl/chessgame?gid=1333500).

Arasan takes a long time to get this. About 25 minutes. 3 best moves at 1hr/move:

"Nunn-Nataf, FRA-chT 1999" bm Nxf2
result: Nxf2 score: -0.28 ++ solved in 1460.69 sec. (3128.91M nodes)
Nxf2 Qd5+ Kh8 Rf1 Ng4 Rxf8+ Qxf8 Nc2 Qf6 O-O-O Qh6+ Kb1 Be6 Qd2 Qxd2 Rxd2 Nge5 Nd5 Bg5 Rd1 Rf8 Nce3 Nb4
result(2): Qb6 score: -1.01 ** not solved in 3600.05 secs. (7809.51M nodes)
Qb6 Rb1 Nf6 Nc2 Bh3 Qd3 Rf7 b4 Qa7 f4 Ng4 Kd2 Rd8 Bh4 Bg2 Bxe7 Nxe7 Bxg4 Bxh1 Rxh1 Rxf4
result(3): Nh6 score: -1.56 ** not solved in 3600.14 secs. (7871.62M nodes)
Nh6 Rg1 Nf7 Qd2 Bf6 O-O-O Be6 f4 Bd4 Rge1 Qb6 Na4 Qa7 Nc2 Bf2 Rf1 Bxg3 hxg3 Rac8
Eelco de Groot
Posts: 4669
Joined: Sun Mar 12, 2006 2:40 am
Full name:   Eelco de Groot

Re: old test position

Post by Eelco de Groot »

jdart wrote:For the record, this is from the game Nunn-Nataf, FRA-chT 1999 (http://www.chessgames.com/perl/chessgame?gid=1333500).

Arasan takes a long time to get this. About 25 minutes. 3 best moves at 1hr/move:

"Nunn-Nataf, FRA-chT 1999" bm Nxf2
result: Nxf2 score: -0.28 ++ solved in 1460.69 sec. (3128.91M nodes)
Nxf2 Qd5+ Kh8 Rf1 Ng4 Rxf8+ Qxf8 Nc2 Qf6 O-O-O Qh6+ Kb1 Be6 Qd2 Qxd2 Rxd2 Nge5 Nd5 Bg5 Rd1 Rf8 Nce3 Nb4
result(2): Qb6 score: -1.01 ** not solved in 3600.05 secs. (7809.51M nodes)
Qb6 Rb1 Nf6 Nc2 Bh3 Qd3 Rf7 b4 Qa7 f4 Ng4 Kd2 Rd8 Bh4 Bg2 Bxe7 Nxe7 Bxg4 Bxh1 Rxh1 Rxf4
result(3): Nh6 score: -1.56 ** not solved in 3600.14 secs. (7871.62M nodes)
Nh6 Rg1 Nf7 Qd2 Bf6 O-O-O Be6 f4 Bd4 Rge1 Qb6 Na4 Qa7 Nc2 Bf2 Rf1 Bxg3 hxg3 Rac8
If I assume that everybody posted scores from White's point of view except Jon (I think this must be Arasan evaluating Qb6 higher in multi-PV, so from Black's point of view) and Ray's Crafty, which is also from Black's point of view (I can't imagine that Crafty would evaluate this as better for Black right from the first iteration?), then so far, although some only had a very short time, none of the engines actually finds a plus score for Black?

Maybe it depends on what the exact variation is. Ted's Rybka 3, for instance, only finds a repetition by perpetual check in its short 1'47'', but I think they all first have to find this repetition and only then can discover that Black can actually still win if he invests another Bishop! None of the given analysis shows this :P

So far Ancalagon also has great trouble from the start position, but after giving it a few moves from Bright's line there is an easier position, still difficult for Ancalagon; now the object is to score a win for Black :) First a perpetual, otherwise it is too difficult to find. Trying more extensions can actually make the variations much worse, or improve the depth at which Nxf2 is found, but then pursuing all the checks still ruins the time to solution...


The move to find is 4... Nb4 (giving a queen check, as it were; the White queen then has to go to the queenside, where further attacks on both king and queen should secure the victory)

This is Stockfish 1.3, 32-bit and with only half the CPU. Stockfish starts up very slowly (see the knps at the end of each PV line in the Shredder GUI), but Ancalagon also has problems with this. I'm not sure exactly what causes it; partly I should just reboot Windows, which I haven't done for a while, and of course it is with two GUIs running:

[FEN "r1bq1rk1/1p2b1pp/p1np4/8/2P1P1n1/N1N3B1/PP2BP1P/R2QK2R b KQ -"]

1... Nxf2 2. Qd5+ Kh8 3. Bxf2 Rxf2 4. Kxf2 *


r1bq3k/1p2b1pp/p1np4/3Q4/2P1P3/N1N5/PP2BK1P/R6R b - -

Engine: Stockfish 1.3.1 JA (Athlon 2009 MHz 50% CPU, 64 MB)
by Tord Romstad, Marco Costalba

2.01 0:03 -3.01 4...Qb6+ 5.c5 Bh4+ 6.Kg2 Qxb2 (402) 0

2.05 0:03 -2.78 4...Bh4+ 5.Kg1 Qb6+ 6.c5 Qxb2 (652) 0

3.01 0:04 -2.78 4...Bh4+ 5.Kg1 Qb6+ 6.c5 Qxb2 (1.660) 0

4.01 0:05 -2.33 4...Bh4+ 5.Kg2 Ne5 6.Rhf1 Qe7 (5.354) 0

5.01 0:06 -3.60 4...Bh4+ 5.Kg2 Nb4 6.Qh5 Be6 7.Bg4 (8.237) 1

5.02 0:08 -2.82 4...Qb6+ 5.c5 Qxb2 6.Qb3 Qxb3 7.axb3 dxc5 (13.081) 1

6.01 0:10 -2.88 4...Qb6+ 5.c5 Qxb2 6.Qb3 Qxb3 7.axb3 Bh4+
8.Kg2 dxc5 (23.595) 2

7.01 0:11 -2.27 4...Qb6+ (34.096) 3

8.01 0:12 -1.92 4...Qb6+ 5.c5 Qxb2 6.Qb3 Qd2 7.Rhd1 Bh4+
8.Kg1 Qg5+ 9.Kh1 dxc5 (59.001) 4

9.01 0:13 -2.03 4...Qb6+ 5.c5 Qxb2 6.Qb3 Qd2 7.Rhd1 Qf4+
8.Kg1 dxc5 9.Rf1 Qg5+ 10.Kh1 (89.256) 6

10.01 0:13 -2.15 4...Qb6+ 5.c5 Qxb2 6.Qb3 Qd2 7.Rhd1 Qf4+
8.Kg1 Qe3+ 9.Kh1 dxc5 10.Rg1 Bd6 (155.787) 11

11.01 0:14 -1.80 4...Qb6+ (297.114) 21

12.01 0:15 -1.47 4...Qb6+ 5.c5 Qxb2 6.cxd6 Bh4+ 7.Kg2 Qxc3
8.Qd3 Qe5 9.Rhd1 Bd7 10.Nc4 Qf6
11.Qf3 Qe6 (776.018) 49

13.01 0:18 -1.58 4...Qb6+ 5.c5 Qxb2 6.cxd6 Bh4+ 7.Kg2 Qxc3
8.Qd3 Qe5 9.Rhd1 Be6 10.Nc4 Qg5+
11.Kh1 Bf2 12.Rab1 b5 (1.700.098) 90

14.01 0:25 -1.49 4...Qb6+ 5.c5 Qxb2 6.cxd6 Bh4+ 7.Kg2 Qxc3
8.Qd3 Qc5 9.Rhf1 Be6 10.Rac1 Qg5+
11.Kh1 Ne5 12.Qc3 Rd8 13.Rg1 (3.885.421) 150

15.01 0:49 -1.17 4...Qb6+ (11.202.947) 225

16.01 1:15 -1.23 4...Qb6+ 5.c5 Qxb2 6.Qb3 Qd2 7.Rhd1 Qh6
8.Kg2 dxc5 9.Kh1 Bd6 10.Rxd6 Qxd6
11.Rg1 Qe7 12.Nc2 Be6 13.Nd5 (18.911.151) 251

17.01 3:18 -1.01 4...Qb6+ 5.c5 Qxb2 6.Qb3 Qd2 7.Rhd1 Qh6
8.Kg2 dxc5 9.Kh1 b5 10.Qd5 b4 11.Na4 bxa3
12.Nxc5 Bh3 13.e5 Rd8 (54.898.340) 277

18.01 5:14 -1.15 4...Qb6+ 5.c5 Qxb2 6.Qb3 Qd2 7.Rhd1 Qh6
8.Kg2 dxc5 9.Kh1 b5 10.Qd5 Bh3
11.Nc2 Rf8 12.Qh5 Qxh5 13.Bxh5 Rf2
14.Ne3 (88.259.912) 280

18.03 5:57 -0.21 4...Nb4 (101.538.535) 284

19.01 6:33 0.00 4...Nb4 5.Qh5 Bh4+ 6.Kg2 g6 7.Qf3 Qg5+
8.Kf1 Bh3+ 9.Qxh3 Rf8+ 10.Bf3 Qe3
11.Qxh4 Qxf3+ 12.Kg1 Qe3+ 13.Kg2 Qf3+
14.Kg1 (112.417.122) 286

20.01 9:21 0.00 4...Nb4 5.Qh5 Bh4+ 6.Kg2 g6 7.Qf3 Qg5+
8.Kf1 Bh3+ 9.Qxh3 Rf8+ 10.Bf3 Qe3
11.Qxh4 Qxf3+ 12.Kg1 Qe3+ 13.Kg2 Qf3+
14.Kg1 (162.956.521) 290

21.01 14:02 +0.58 4...Nb4 (247.820.439) 294

22.01 26:38 +2.94 4...Nb4 5.Qh5 Bh4+ 6.Kg2 g6 7.Qf3 Qg5+
8.Kf1 Bh3+ 9.Qxh3 Rf8+ 10.Bf3 Qe3
11.Qxh4 Qxf3+ 12.Kg1 Qe3+ 13.Kg2 Qf3+
14.Kg1 (479.213.600) 299

23.01 61:22 +3.27 4...Nb4 5.Qh5 Bh4+ 6.Kg2 g6 7.Qf3 Qg5+
8.Qg3 Bxg3 9.hxg3 Nd3 10.Raf1 Nxb2
11.Nd5 Qd2 12.Rf2 Nd3 13.Rd1 Ne1+
14.Kg1 Qxa2 15.Nb1 Qa5 16.Ndc3 Nc2
17.Rxd6 Bh3 (1.134.464.836) 308

24.01 184:02 +3.43 4...Nb4 5.Qh5 Bh4+ 6.Kg2 g6 7.Qf3 Qg5+
8.Qg3 Bxg3 9.hxg3 Be6 10.Rad1 Qe5
11.Rd2 Rf8 12.Rf1 Rxf1 13.Bxf1 Qh5
14.Nc2 Qh6 15.Rf2 Nxc2 16.Rxc2 Bh3+
17.Kg1 Qe3+ (3.463.126.980) 313

25.01 402:54 +3.37 4...Nb4 5.Qh5 Bh4+ 6.Kg2 g6 7.Qf3 Qg5+
8.Qg3 Bxg3 9.hxg3 Be6 10.Rad1 Qe5
11.Rd2 Rf8 12.Rf1 Rxf1 13.Bxf1 Qh5
14.Nc2 Qh6 15.Rf2 Nxc2 16.Rxc2 Bh3+
17.Kg1 Qe3+ (8.226.439.264) 340


best move: Nc6-b4 time: 405:43.671 min n/s: 342.506 nodes: 8.337.780.252

This is with the latest Ancalagon, this time with 100% CPU; in the first plies it is not yet very convincing... Some builds are much worse than this, not finding Bh4, or only reaching a 0.00 score if I extend too much :(


r1bq3k/1p2b1pp/p1np4/3Q4/2P1P3/N1N5/PP2BK1P/R6R b - -

Engine: Ancalagon 1.3 WS180 Build 163 (256 MB)
by Romstad, Costalba, Kiiski, de Groot

2.01 0:02 -3.64 4...Qb6+ 5.c5 dxc5 (8.725) 3

3.01 0:03 -2.52 4...Qb6+ 5.c5 Qxb2 6.Qb3 Qxb3 7.axb3 dxc5 (165.673) 51

4.01 0:03 -2.05 4...Qb6+ 5.c5 Qxb2 6.Qb3 Qd2 7.cxd6 Bxd6 (392.538) 106

5.01 0:04 -1.70 4...Qb6+ 5.c5 Qxb2 6.Qb3 Qd2 7.Rhd1 Qf4+
8.Kg1 dxc5 (923.063) 198

6.01 0:04 -1.66 4...Qb6+ 5.c5 Qxb2 6.Qb3 Qd2 7.Rhd1 Qf4+
8.Kg1 dxc5 9.Nd5 Qxe4 10.Nxe7 Nxe7 (1.078.821) 221

7.01 0:05 -1.43 4...Qb6+ 5.c5 Qxb2 6.Qb3 Qd2 7.Rhd1 Qf4+
8.Kg1 Nd4 9.Rxd4 Qe3+ 10.Kh1 Qxd4 (1.421.688) 259

8.01 0:09 -1.60 4...Qb6+ 5.c5 Qxb2 6.Qb3 Qd2 7.Rhd1 Qh6
8.Kg1 Nd4 9.Qf7 Qg5+ 10.Kh1 dxc5 (3.647.139) 403

9.01 0:15 -1.43 4...Qb6+ 5.c5 Qxb2 6.Qb3 Qd2 7.Rhd1 Qf4+
8.Kg1 Bf6 9.Nd5 Qg5+ 10.Qg3 Qxg3+
11.hxg3 Bxa1 12.Rxa1 dxc5 (7.850.321) 509

10.01 0:37 -1.60 4...Qb6+ 5.c5 Qxb2 6.Qb3 Qd2 7.Rhd1 Qf4+
8.Kg1 dxc5 9.Nd5 Qxe4 10.Nxe7 Qxe7
11.Bc4 Nd4 (21.590.253) 580

10.08 1:17 -0.90 4...Nb4 (49.589.925) 643

11.01 1:18 0.00 4...Nb4 5.Qh5 Bh4+ 6.Kg1 g6 7.Qf3 Qg5+
8.Kf1 Bh3+ 9.Qxh3 Rf8+ 10.Bf3 Qe3
11.Qxh4 Qxf3+ 12.Kg1 Qe3+ 13.Kg2 Qf3+
14.Kg1 (50.774.060) 646

12.01 8:00 0.00 4...Nb4 5.Qh5 Bh4+ 6.Kg1 g6 7.Qf3 Qg5+
8.Kf1 Bh3+ 9.Qxh3 Rf8+ 10.Bf3 Qe3
11.Qxh4 Qxf3+ 12.Kg1 Qe3+ 13.Kg2 Qf3+
14.Kg1 (306.328.605) 637

13.01 10:25 0.00 4...Nb4 5.Qh5 Bh4+ 6.Kg1 g6 7.Qf3 Qg5+
8.Kf1 Bh3+ 9.Qxh3 Rf8+ 10.Bf3 Qe3
11.Qxh4 Qxf3+ 12.Kg1 Qe3+ 13.Kg2 Qf3+
14.Kg1 (400.385.044) 640

14.01 18:02 0.00 4...Nb4 5.Qh5 Bh4+ 6.Kg1 g6 7.Qf3 Qg5+
8.Kf1 Bh3+ 9.Qxh3 Rf8+ 10.Bf3 Qe3
11.Qxh4 Qxf3+ 12.Kg1 Qe3+ 13.Kg2 Qf3+
14.Kg1 (695.944.102) 643

15.01 53:44 +0.19 4...Nb4 (1.730.402.895) 536

16.01 57:59 +0.98 4...Nb4 (1.810.207.378) 520

17.01 96:45 +4.11 4...Nb4 (2.524.745.201) 434

18.01 318:23 +4.80 4...Nb4 5.Qh5 Bh4+ 6.Kg2 g6 7.Qf3 Qg5+
8.Qg3 Bxg3 9.hxg3 Be6 10.Raf1 Qd2
11.Nab1 Qxb2 12.Rd1 Rd8 13.c5 Nxa2
14.Rd2 Qb4 15.Rxa2 Qxc5 (6.678.757.459) 349


best move: Nc6-b4 time: 407:03.796 min n/s: 344.030 nodes: 8.402.460.114
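
As a sanity check on the two summary lines above, the reported n/s figure can be recomputed from the node count and the elapsed time. The sketch below is only an illustration: it assumes the dots in the GUI output are thousands separators and that the time is printed as minutes:seconds. The function name parse_summary is made up for this example and is not part of any engine or GUI.

```python
import re

# Assumed layout of the GUI summary line (taken from the output above):
#   best move: <move> time: <min>:<sec> min n/s: <nps> nodes: <nodes>
SUMMARY = re.compile(
    r"best move: (?P<move>\S+)\s+"
    r"time: (?P<min>\d+):(?P<sec>\d+\.\d+) min\s+"
    r"n/s: (?P<nps>[\d.]+)\s+"
    r"nodes: (?P<nodes>[\d.]+)"
)


def parse_summary(line: str) -> dict:
    """Parse one summary line and recompute nodes per second."""
    m = SUMMARY.search(line)
    if m is None:
        raise ValueError("not a summary line: " + line)
    seconds = int(m["min"]) * 60 + float(m["sec"])
    # Assumption: dots are thousands separators, so strip them.
    nodes = int(m["nodes"].replace(".", ""))
    reported_nps = int(m["nps"].replace(".", ""))
    return {
        "move": m["move"],
        "seconds": seconds,
        "nodes": nodes,
        "reported_nps": reported_nps,
        "computed_nps": nodes / seconds,
    }


ancalagon = parse_summary(
    "best move: Nc6-b4 time: 407:03.796 min n/s: 344.030 nodes: 8.402.460.114"
)
print(ancalagon["reported_nps"], round(ancalagon["computed_nps"]))
```

If the recomputed value lands within a few nodes per second of the reported one, the reading of the separators and the time format is probably right.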
jdart
Posts: 4406
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: old test position

Post by jdart »

Eelco de Groot wrote:
If I assume that everybody posted scores from White's point of view, except Jon
Arasan's scores are always from the perspective of the side to move. So Nxf2 is better for Black (less negative).

--Jon
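
To compare output from engines that use different conventions, one option is to convert everything to a single point of view first. The following is a minimal sketch, not anything taken from Arasan: the helper name to_white_pov is hypothetical, and the scores are the three multi-PV values posted earlier in this thread, with Black to move.

```python
def to_white_pov(score: float, side_to_move: str) -> float:
    """Convert a side-to-move score into a White-relative score.

    side_to_move is 'w' or 'b', as in the active-color field of a FEN.
    """
    if side_to_move not in ("w", "b"):
        raise ValueError("side_to_move must be 'w' or 'b'")
    return score if side_to_move == "w" else -score


# Arasan's multi-PV scores from the post above (Black to move,
# scores from the side to move's perspective).
arasan = {"Nxf2": -0.28, "Qb6": -1.01, "Nh6": -1.56}

# Same numbers seen from White's side: all positive, i.e. good for White.
white_pov = {move: to_white_pov(s, "b") for move, s in arasan.items()}

# With side-to-move scores, the best move for Black is the highest score.
best_for_black = max(arasan, key=arasan.get)
print(white_pov, best_for_black)
```

Under this convention the least negative score is the best one for the side to move, which is why Nxf2 comes out on top.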
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: old test position

Post by Dann Corbit »

jdart wrote:
If I assume that everybody posted scores from White's point of view, except Jon
Arasan's scores are always from the perspective of the side to move. So Nxf2 is better for Black (less negative).

--Jon
When analyzing an EPD record, here is what the standard says:

"16.2.5.6: Opcode "ce": centipawn evaluation
The opcode "ce" indicates the evaluation of the indicated position in centipawn units. It takes a single operand, an optionally signed integer that gives an evaluation of the position from the viewpoint of the active player; i.e., the player with the move. Positive values indicate a position favorable to the moving player while negative values indicate a position favorable to the passive player; i.e., the player without the move. A centipawn evaluation value close to zero indicates a neutral positional evaluation.

Values are restricted to integers that are equal to or greater than -32767 and are less than or equal to 32766.

A value greater than 32000 indicates the availability of a forced mate to the active player. The number of plies until mate is given by subtracting the evaluation from the value 32767. Thus, a winning mate in N fullmoves is a mate in ((2 * N) - 1) halfmoves (or ply) and has a corresponding centipawn evaluation of (32767 - ((2 * N) - 1)). For example, a mate on the move (mate in one) has a centipawn evaluation of 32766 while a mate in five has a centipawn evaluation of 32758.

A value less than -32000 indicates the availability of a forced mate to the passive player. The number of plies until mate is given by subtracting the evaluation from the value -32767 and then negating the result. Thus, a losing mate in N fullmoves is a mate in (2 * N) halfmoves (or ply) and has a corresponding centipawn evaluation of (-32767 + (2 * N)). For example, a mate after the move (losing mate in one) has a centipawn evaluation of -32765 while a losing mate in five has a centipawn evaluation of -32757.

A value of -32767 indicates an illegal position. A stalemate position has a centipawn evaluation of zero as does a position drawn due to insufficient mating material. Any other position known to be a certain forced draw also has a centipawn evaluation of zero."

Unfortunately, for game play, the sign and magnitude of the score are not spelled out.

Bottom line:
For EPD analysis, Jon is doing it the right way. If the analysis comes from game play, then there is no standard and anything goes (sounds like a defect in the standard to me.)
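
The mate arithmetic in the quoted "ce" text can be written out as a short sketch. The constants and formulas come straight from the quoted section; the function names are made up for this example and do not belong to any EPD library.

```python
MATE_BASE = 32767       # from the quoted standard
MATE_THRESHOLD = 32000  # beyond this magnitude the value encodes a mate


def ce_for_winning_mate(fullmoves: int) -> int:
    """ce for a mate in N fullmoves delivered by the side to move."""
    return MATE_BASE - (2 * fullmoves - 1)


def ce_for_losing_mate(fullmoves: int) -> int:
    """ce when the side to move gets mated in N fullmoves."""
    return -MATE_BASE + 2 * fullmoves


def describe_ce(ce: int) -> str:
    """Explain a ce operand, always from the side to move's viewpoint."""
    if not -32767 <= ce <= 32766:
        raise ValueError("ce out of range")
    if ce == -MATE_BASE:
        return "illegal position"
    if ce > MATE_THRESHOLD:
        plies = MATE_BASE - ce
        return f"mate in {(plies + 1) // 2} ({plies} ply)"
    if ce < -MATE_THRESHOLD:
        plies = MATE_BASE + ce
        return f"mated in {plies // 2} ({plies} ply)"
    return f"{ce / 100:+.2f} pawns for the side to move"


print(ce_for_winning_mate(5), ce_for_losing_mate(5))
print(describe_ce(32758))
print(describe_ce(-28))
```

The four worked values in the quoted text (32766, 32758, -32765, -32757) are a convenient check for any implementation of this encoding.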