Arasan test suite update

Dann Corbit
Posts: 12777
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Arasan test suite update

Post by Dann Corbit »

BubbaTough wrote:
I would argue that both moves definitely win, and so one move is not better than the other.
I like this criterion, but rarely see it applied in test positions. For example, when you are an exchange up almost all moves win, but some moves might take 30 moves and others might give forced mate in 5. Using something along the lines of the Nunn criteria for an ! might be a nice approach to defining best move criteria for a problem set.

-Sam
If a move clearly wins the game, then it is a mistake to exclude it from the best moves list, unless all other moves also clearly win the game, in which case there are no best moves (since all are equal).

I consider a test position which excludes a clearly winning move to be buggy.

The best test positions are unambiguous.
BubbaTough
Posts: 1154
Joined: Fri Jun 23, 2006 5:18 am

Post by BubbaTough »

I consider a test position which excludes a clearly winning move to be buggy.
I think we are on the same page here...my minor adjustment would be that all moves other than the best move decrease the result of the game with perfect play. There must be a cleaner way to say that. For example, best move wins, all other moves draw or lose. Another example: best move draws, all other moves lose. I don't like "this move wins faster than that move" stuff. Such tests artificially discourage exploration of pruning/eval options that concentrate on improving results rather than exact calculations.
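
The criterion can be stated compactly in code. What follows is only a sketch under stated assumptions: `Outcome` and `best_moves` are hypothetical names, and the per-move game-theoretic results are assumed to be supplied from elsewhere (tablebases or careful analysis). A move is a solution only if it reaches the best achievable result and at least one other move does worse.

```python
from enum import IntEnum


class Outcome(IntEnum):
    """Result with perfect play, from the side to move's point of view."""
    LOSS = 0
    DRAW = 1
    WIN = 2


def best_moves(outcomes):
    """Return the moves that reach the best result, or [] if the position is no test.

    `outcomes` maps a move string to its Outcome (hypothetical input format).
    """
    if not outcomes:
        return []
    top_result = max(outcomes.values())
    top = sorted(move for move, result in outcomes.items() if result == top_result)
    # If every move gives the same result, there is nothing to distinguish.
    if len(top) == len(outcomes):
        return []
    return top
```

Under this definition the speed of the win never enters the picture; only the result does.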

Now, there is room in my life for testbeds that concentrate on exploring "best practical chance" type moves, but that is a different breed of testbed altogether.

-Sam
Dann Corbit

Post by Dann Corbit »

BubbaTough wrote:
I consider a test position which excludes a clearly winning move to be buggy.
I think we are on the same page here...my minor adjustment would be that all moves other than the best move decrease the result of the game with perfect play. There must be a cleaner way to say that. For example, best move wins, all other moves draw or lose. Another example: best move draws, all other moves lose. I don't like "this move wins faster than that move" stuff. Such tests artificially discourage exploration of pruning/eval options that concentrate on improving results rather than exact calculations.

Now, there is room in my life for testbeds that concentrate on exploring "best practical chance" type moves, but that is a different breed of testbed altogether.

-Sam
I think that there are also "best moves" that result in a faster mate but are, in a certain sense, really inferior to a mate that takes longer:

The shortest mate involves some large piece sacrifice and must be played exactly along a single PV or it dies. Longer mates do not need the sacrifice and lead inexorably towards mate.

The shorter mate is usually more fun to watch, because of the fireworks. But the longer, more boring mates are much safer paths. The programs that tend to choose the safer paths will tend to make better decisions on average.

And if you let them think for eons, they will usually find the shortest mate also.
jdart
Posts: 4401
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Post by jdart »

This problem (10.1) has a "shortest mate" solution but there are other moves that are forced mates, too, just longer. So I agree in this case (and for similar mate problems) that you shouldn't just count the shortest mate as correct. Some engines will hit on a longer mate first.

But I disagree in general that all "winning" moves should be equivalent solutions, if you mean by "win" less than a mate score. I think if a move gives a superior eval (you can pick your number but I generally like to see +1 pawn at least over alternatives) then it can be considered best.
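
The two rules above (any forced mate counts; otherwise the best move must lead the alternatives by a margin) could be sketched as follows. This is an illustration only: `accepted_solutions` is a hypothetical helper, the scores are assumed to be centipawns from the side to move's point of view, and the `mate_cp` threshold assumes the common convention of encoding mates as very large scores.

```python
def accepted_solutions(scores_cp, margin_cp=100, mate_cp=30000):
    """Return the moves that would count as correct for a test position.

    scores_cp -- dict mapping move -> score in centipawns (assumed input format)
    margin_cp -- required lead of the best move over the runner-up (+1 pawn here)
    mate_cp   -- scores at or above this are treated as forced mates (assumption)
    """
    if not scores_cp:
        return []
    # Rule 1: every forced mate is a correct answer, not only the shortest one.
    mates = sorted(move for move, score in scores_cp.items() if score >= mate_cp)
    if mates:
        return mates
    # Rule 2: otherwise the top move must beat the second-best by the margin.
    ranked = sorted(scores_cp.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) == 1:
        return [ranked[0][0]]
    (best_move, best_score), (_, second_score) = ranked[0], ranked[1]
    return [best_move] if best_score - second_score >= margin_cp else []
```

An empty result flags a position whose key move is not clearly separated from the alternatives and which therefore deserves a second look.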

--Jon
BubbaTough

Post by BubbaTough »

But I disagree in general that all "winning" moves should be equivalent solutions, if you mean by "win" less than a mate score. I think if a move gives a superior eval (you can pick your number but I generally like to see +1 pawn at least over alternatives) then it can be considered best.
I understand your point of view, and as someone doing the bulk of the work your opinion is really the one that counts. As a human problem solver though I know I hated finding a solution that won a piece and being told I was wrong because another move won a rook. My complaint would always be "why does it matter, I am going to win the game with either move!". As an engine writer I feel the same way. I enjoy playing with pruning techniques that spend their energy making sure they get a "safe" win, instead of finding the best win. For example, if you find a win of a piece going into what seems like a winning position, instead of considering other moves that might win more, spend your time making really sure your move really wins (search deeper, less pruning, etc.). This is what people do, and it is an interesting avenue of engine research in my opinion. Testbeds that ignore this aspect of practical chess make worse testbeds for me. I must admit that this is likely relevant to few other engines (maybe none?) but while I am feeling chatty I thought I would throw my cent in there. Feel free to ignore it.

-Sam
Dann Corbit

Post by Dann Corbit »

I guess that 98% of moves where the score is +300 centipawns {given a long search with a strong engine} win the game.

I guess that in 25% of the positions where one strong program chooses move X and another strong program chooses move Y after a long search, the move with the lower score is really better.

Until a move is resolved as checkmate we do not know for sure that it has an outcome of "won" {though a score of e.g. +20 pawns is practically won if the engine is decent and the search is not super-fast}.
BubbaTough

Post by BubbaTough »

I guess that 98% of moves where the score is +300 centipawns {given a long search with a strong engine} win the game.

I guess that in 25% of the positions where one strong program chooses move X and another strong program chooses move Y after a long search, the move with the lower score is really better.

Until a move is resolved as checkmate we do not know for sure that it has an outcome of "won" {though a score of e.g. +20 pawns is practically won if the engine is decent and the search is not super-fast}.
That is true I guess. For me, if humans pretty much agree that one move is clearly superior in that it leads to a better result with best play than any other move, even in the face of any evidence engines provide, I am happy to accept that as the best move. I like to fall back on human judgement (using computer tools of course) instead of purely computer decisions...otherwise we risk the scenario where computers are training computers, in which case engines start emulating each other instead of making progress (which I define as combining computer calculation strength with human judgement).

-Sam
jdart

Post by jdart »

I appreciate the info. If I am slow to respond I am not ignoring you: just busy and it is taking me time to work through your results. I am finding your output hard to follow because it looks like in some cases you've chopped off part of the test identifier, so I can't easily tell what test you are referring to.

Some notes:

for 10.163 and 10.10 it seems Arasan gives a clear advantage to the test move but Rybka doesn't. Arasan isn't very speculative in its eval generally but it's always possible it's got something wrong.

Re 10.46 (Ra2) I think we agree this is not a good test move because there are alternative mates. I'd like to replace it with this position from a recent game, which is interesting but not terribly hard:

b4r1k/5ppq/8/1P1pPP2/4p1P1/2R3Q1/6K1/3R2Br b - - bm d4; id "arasan10.46"; c0 "Crafty-Arasan, ICC 2008";

10.199 (d5) is also suspect (not sure if this is one in your list but it was in an earlier post). I have a candidate replacement:

r1b1rk2/p1pq2p1/1p1b1p1p/n2P4/2P1NP2/P2B1R2/1BQ3PP/R6K w - - bm Nxf6; id "arasan10.199"; c0 "Gutsche-Jones, W-ch24 sf05 email 2000";

This is very difficult for Arasan but Rybka solved it in about a minute. Shredder 10 took about 10 minutes (on my dual box).

10.108 (Bxh6) and 10.15 (Kf6) I will look at more closely.
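
For anyone scripting against replacement records like the ones above, here is a minimal sketch of splitting an EPD line into its position and its operations (`bm`, `id`, `c0`). The function name `parse_epd` is hypothetical, and the parser makes a simplifying assumption: no semicolon appears inside a quoted operand.

```python
def parse_epd(line):
    """Split one EPD record into (position, operations).

    position   -- the four position fields joined by spaces
    operations -- dict mapping opcode (e.g. 'bm', 'id', 'c0') to its operand text
    Assumes no ';' occurs inside a quoted operand (simplification).
    """
    fields = line.strip().split(None, 4)
    if len(fields) < 4:
        raise ValueError("EPD needs placement, side, castling and en passant fields")
    position = " ".join(fields[:4])
    operations = {}
    rest = fields[4] if len(fields) > 4 else ""
    for chunk in rest.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        opcode, _, operand = chunk.partition(" ")
        # Drop surrounding double quotes from string operands.
        operations[opcode] = operand.strip().strip('"')
    return position, operations
```

Checking that every record yields a non-empty `bm` and a unique `id` is a cheap way to catch quoting slips before a suite is published.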
Dann Corbit

Post by Dann Corbit »

BubbaTough wrote:
I guess that 98% of moves where the score is +300 centipawns {given a long search with a strong engine} win the game.

I guess that in 25% of the positions where one strong program chooses move X and another strong program chooses move Y after a long search, the move with the lower score is really better.

Until a move is resolved as checkmate we do not know for sure that it has an outcome of "won" {though a score of e.g. +20 pawns is practically won if the engine is decent and the search is not super-fast}.
That is true I guess. For me, if humans pretty much agree that one move is clearly superior in that it leads to a better result with best play than any other move, even in the face of any evidence engines provide, I am happy to accept that as the best move. I like to fall back on human judgement (using computer tools of course) instead of purely computer decisions...otherwise we risk the scenario where computers are training computers, in which case engines start emulating each other instead of making progress (which I define as combining computer calculation strength with human judgement).

-Sam
I guess that human judgement is wrong about as often as computer judgement is wrong. And that is a surprisingly large number. In other words, old test sets that have not been carefully scrutinized are always full of bugs.
Dann Corbit

Post by Dann Corbit »

jdart wrote: I appreciate the info. If I am slow to respond I am not ignoring you: just busy and it is taking me time to work through your results. I am finding your output hard to follow because it looks like in some cases you've chopped off part of the test identifier, so I can't easily tell what test you are referring to.
>>
Arena did that. Nothing I can do about it.
<<
Some notes:

for 10.163 and 10.10 it seems Arasan gives a clear advantage to the test move but Rybka doesn't. Arasan isn't very speculative in its eval generally but it's always possible it's got something wrong.
>>
It looks to me on 10.163 that both moves are good. Here is the output from Rybka 3 and Toga CMLX:

4) Re5; id "arasan10.16 
    Searching move: Re1-e5
    Best move (TogaCMLX): Re1-e5
    identical moves! Found in: 01:44
     2/18	00:00	       1.462	0	+1.65	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Re1e5
     3/19	00:00	       3.295	0	+1.63	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 f2f3 Nf6d5 Qc8xc6
     4/17	00:00	       3.328	0	+1.60	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 h2h3 Qd2c3 Qc8b7
     5/22	00:00	      15.197	0	+1.65	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 h2h3 g7g6 Qc8b7 Nh6f5 Qb7xa7
     6/30	00:00	      45.195	0	+1.45	Ng5xe6+ Kd8e8 Ra1d1 Qd2xa5 Re1e5 Qa5a2 Qb8xc8+ Ke8f7 h2h3 Qa2c2
     7/27	00:00	      70.964	0	+1.38	Ng5xe6+ Kd8e8 Ra1d1 Qd2xa5 Re1e5 Qa5b6 Qb8xc8+ Ke8f7 h2h4 Nh6g4
     8/30	00:00	     140.087	0	+1.61	Ng5xe6+ Kd8e8 Ra1d1 Qd2xa5 Qb8xc8+ Ke8f7 Qc8xc6 Qa5d5 Qc6a6 Qd5d7 Re1e5 Kf7g8
     9/32	00:00	     261.997	0	+1.49	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Ra1d1 Qd2xa5 Qc8xc6 Nh6f5 Re1e5 Nf5xd4 Ne6g5+ Kf7g6 Qc6xf6+ g7xf6 Re5xa5 e7e5
    10/37	00:00	     670.842	0	+2.07	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6a6 g7g6 Ra1d1 Qd2h6 Qa6xa7 Bf8g7 Qa7b7 Rh8e8 a5a6 Kf7g8
    11/37	00:01	   1.130.137	0	+2.19	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6a6 g7g6 Ra1d1 Qd2h6 Qa6xa7 Nf6d5 Re1e5 Nd5f4 d4d5 Nf4xe6 d5xe6+ Kf7g7
    12/38	00:02	   1.942.276	4.511.381	+1.81	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6a6 g7g6 Ra1d1 Qd2b4 Rd1b1 Qb4c3 Qa6xa7 Bf8h6 Kg1h1 Nf6d5 Re1e2
    12/37	00:02	   2.023.697	4.511.381	+1.95	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6a6 g7g6 Ra1d1 Qd2h6 Qa6xa7 Nf6d5 Re1e5 Nd5f4 Qa7a6 Nf5d6 d4d5 Nf4xe6 d5xe6+ Kf7g7
    13/48	00:05	   5.408.795	4.477.318	+1.60	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6a6 g7g6 Re1d1 Qd2h6 Ne6d8+ Kf7g7 Qa6xa7 Nf6d5 Ra1c1 Nd5f4 Rc1c6 Kg7h7 Nd8f7 Nf4e2+ Kg1h1
    14/48	00:09	  10.531.108	4.615.385	+1.94	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6c4 Nf5d6 Qc4b3 g7g6 h2h3 Qd2h6 a5a6 Nf6e8 Qb3b8 Ne8g7 Ne6d8+ Kf7g8 Qb8xa7 Kg8h7
    15/52	00:17	  19.433.671	4.617.850	+1.69	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6c4 Nf5d6 Qc4b3 g7g6 h2h3 Qd2h6 a5a6 Kf7g8 Qb3b8 Nf6e4 Qb8xa7 Qh6d2 f2f3 Qd2f2+ Kg1h1 Ne4g3+ Kh1h2
    15/55	00:17	  18.753.255	4.617.850	+1.76	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6c4 Nf5d6 Qc4b3 g7g6 h2h3 Qd2h6 Ra1b1 Kf7g8 h3h4 Kg8h7 Ne6g5+ Kh7g7 Re1e6 Kg7g8
    16/60	00:30	  34.029.094	4.642.998	+1.84	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6c4 Nf5d6 Qc4b3 g7g6 h2h3 Qd2h6 a5a6 Kf7g8 Qb3b8 Nf6e4 Qb8xa7 Qh6d2 f2f3 Nd6b5 Qa7d7
    16/53	01:08	  72.601.097	4.477.305	+2.28	Re1e5 Qd2c3 Ra1e1 Qc3a3 Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Ne6g5+ Kf7g6 Qc8xc6 Qa3b3 Ng5e6 Nh6f7 Re5e3 Qb3b8 Re3g3+ Kg6h7 Qc6c2+ Kh7g8 Re1b1 Qb8d6 Qc2f5
    16/54	01:08	  76.284.404	4.477.305	+2.50	Re1e5 Qd2c3 Ra1e1 Qc3a3 Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Ne6g5+ Kf7g6 Qc8xc6 Qa3a2 Ng5e6 Kg6f7 Qc6c7 a7a6 Ne6xf8 Rh8xf8 Re5xe7+ Kf7g8 Re7xg7+ Kg8h8 Qc7e7 Rf8f7 Rg7xf7 Qa2xf7
    16/54	01:27	  91.208.125	4.504.571	+3.33	Re1e5 Nf6d5 Ng5xe6+ Kd8d7 Qb8xa7+ Kd7e8 Qa7b8 Ke8f7 Qb8xc8 g7g6 Qc8xc6 Nd5c3 Qc6f3+ Nh6f5 a5a6 Bf8g7 Ne6xg7 Kf7xg7 a6a7 Nf5xd4 Qf3b7 Nd4f5 a7a8Q Rh8xa8 Qb7xa8 Nc3e2+ Kg1h1 Ne2f4
    17/68	01:35	 105.184.901	4.512.903	+2.01	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6c4 Nf5d6 Qc4b3 g7g6 h2h3 Rh8g8 Re1c1 Qd2h6 Rc1c7 a7a6 Ra1d1 Nd6e4 Rc7c8 Ne4d2 Ne6g5+ Kf7g7 Qb3f7+ Kg7h8 Ng5e6
    17/65	01:44	 118.092.201	4.524.642	+3.56	Re1e5 Nf6d5 Ng5xe6+ Kd8d7 Qb8xa7+ Kd7e8 Qa7b8 Ke8f7 Qb8xc8 g7g6 Qc8xc6 Nh6g4 Re5xd5 Qd2xf2+ Kg1h1 Bf8h6 a5a6 Bh6e3 h2h3 Qf2g3 Ra1f1+ Be3f2 Ne6g5+ Kf7g7 Qc6c7 Qg3xc7 Ng5e6+ Kg7f7 Ne6xc7 e7e6
    18/75	02:59	 204.847.346	4.580.024	+4.10	Re1e5 g7g6 Re5c5 Qd2xg5 Rc5xg5 Nh6f5 Rg5xg6 Bf8h6 Qb8xa7 Nf6d5 a5a6 Nf5h4 Rg6g3 Bh6f4 Rg3b3 Rh8g8 g2g3 Nh4f5 Kg1h1 Bf4d2 Rb3b8 Rg8g4
    19/61	04:49	 311.965.721	4.599.222	+4.43	Re1e5 g7g6 Re5c5 Qd2xg5 Rc5xg5 Nh6f5 Rg5xg6 Bf8h6 Qb8xa7 Bh6f4 a5a6 Nf6d5 Qa7c5 Kd8d7 a6a7 Bc8b7 Ra1b1 Bb7a8 Rb1b2 Bf4d6 Qc5c4 Nd5f4 Rg6g5 Nf4d5
    20/66	07:34	 500.908.500	4.665.228	+4.49	Re1e5 g7g6 Re5c5 Qd2xg5 Rc5xg5 Nh6f5 Rg5xg6 Bf8h6 Qb8xa7 Nf6d5 a5a6 Bh6f4 Qa7c5 Kd8d7 a6a7 Bc8b7 Ra1b1 Bb7a8 g2g3 Bf4d6 Qc5c4 e6e5 d4xe5 Bd6xe5 Qc4e4 Nf5d4
    20/75	07:34	 529.304.639	4.665.228	+4.50	Re1e5 Qd2c3 Ra1e1 Rh8g8 Ng5xe6+ Kd8e8 Ne6xf8 Ke8xf8 Re5xe7 Qc3xe1+ Re7xe1 Kf8f7 Qb8xa7+ Kf7g6 Qa7b6 Nh6f5 a5a6 Bc8xa6 Qb6xa6 Rg8d8 Qa6xc6 Nf5xd4 Qc6b6 Rd8d7 f2f4 Kg6h7 Qb6c5 h5h4 f4f5
    21/72	12:31	 860.365.655	4.669.820	+4.46	Re1e5 g7g6 Re5c5 Qd2xg5 Rc5xg5 Nh6f5 Qb8xa7 Bf8g7 Rg5xg6 Nf6d7 Rg6xg7 Nf5xg7 Ra1c1 Ng7e8 Rc1xc6 Rh8f8 Rc6c1 Rf8f5 Qa7a8 Ne8d6 a5a6 Rf5a5 a6a7 Nd6c4 Qa8c6 Nd7b6 Rc1xc4 Nb6xc4 a7a8Q Ra5xa8 Qc6xa8
   9/20/2008 11:27:13 PM, Time for this analysis: 00:25:00, Rated time: 48:00

4) Re5; id "arasan10.16 
    Searching move: Re1-e5
    Best move (Rybka 3): Ng5xe6
    Not found in: 25:00
      2	00:00	         567	580.608	+1.19	Ng5xe6+
      3	00:00	         968	991.232	+1.35	Ng5xe6+
      4	00:00	       2.034	2.082.816	+1.43	Ng5xe6+
      5	00:00	       3.084	3.158.016	+1.36	Ng5xe6+ Kd8e8 Qb8xc8+
      6	00:00	      13.647	291.136	+1.43	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 h2h3 g7g5 Qc8xc6 Nh6f5
      7	00:00	      26.896	248.121	+1.38	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 h2h3 Nh6f5 Ne6xf8 Qd2g5 Nf8e6
      8	00:00	      68.745	263.651	+1.57	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 h2h3 Nh6f5 Ne6xf8 Qd2g5 Nf8e6 Qg5xg2+ Kg1xg2 Rh8xc8 Ra1c1 Kf7g6
      9	00:00	      94.046	256.125	+1.57	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 h2h3 Nh6f5 Ne6xf8 Qd2g5 Nf8e6 Qg5xg2+ Kg1xg2 Rh8xc8 Ra1c1 Nf5h4+ Kg2g3 Nh4f5+ Kg3h2 Kf7g8 Re1e5
     10	00:00	     150.736	266.586	+1.50	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 h2h3 Nh6f5 Ne6xf8 Qd2f4 Nf8e6 Rh8xc8
     11	00:02	     598.681	267.123	+1.39	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6b7 Nf5d6 Qb7b3 Rh8h6
     12+	00:04	     643.563	260.580	+1.59	Ng5xe6+
     12	00:04	     793.890	261.733	+1.71	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6b7 Nf5d6 Qb7b3 Rh8h6 Ra1d1 Qd2xa5 Ne6g5+ Kf7e8 Qb3b8+ Qa5d8 Qb8xa7 Nf6h7 Ng5e6
     13	00:06	   1.595.894	265.809	+1.72	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6b7 Nf5d6 Qb7b3 g7g6 a5a6 Rh8g8 f2f3 h5h4 h2h3
     14+	00:11	   2.633.208	247.945	+1.92	Ng5xe6+
     14	00:14	   3.677.409	256.220	+2.00	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6c4 Nf5d6 Qc4b3 g7g6 h2h3 a7a6 Re1e5 Nd6b5 Ra1e1 h5h4 Ne6d8+ Kf7g7 Re5e2
     15+	00:27	   6.699.118	250.837	+2.20	Ng5xe6+
     15+	00:35	   8.314.796	240.213	+2.40	Ng5xe6+
     15	00:53	  12.835.158	249.552	+2.64	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6c4 Nf5d6 Qc4b3 g7g6
     16+	01:36	  23.791.042	252.649	+2.84	Ng5xe6+
     16+	02:35	  36.613.565	241.005	+3.04	Ng5xe6+
     16	02:44	  39.531.232	246.750	+3.01	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6c4 Nf5d6 Qc4b3 g7g6 h2h3
     17	04:08	  59.892.796	247.240	+3.05	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6c4 Nf5d6 Qc4b3 g7g6 h2h3 Qd2h6 Re1e5 Rh8h7 Ne6g5+ Kf7g7 Ra1e1 a7a6 Qb3b6 Nd6f7 Ng5xh7 Kg7xh7 Re5e6 Qh6g5 Qb6xa6 Nf6d5 Qa6b5 Nf7d6 Qb5c5 Nd6b7
     18	07:32	 111.653.874	252.593	+3.18	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6c4 Nf5d6 Qc4b3 g7g6 h2h3 Qd2h6 Re1e5 Rh8h7 Ne6g5+ Kf7g7 Ra1e1 a7a6 Qb3b6 Nd6f7 Ng5xh7 Kg7xh7 Re5e6 Nf6d5 Qb6b3 Qh6g5 Qb3f3 Nd5f6 Re6xa6 Qg5d2 Qf3d1
     19	16:14	 248.839.727	261.628	+3.33	Ng5xe6+ Kd8e8 Qb8xc8+ Ke8f7 Qc8xc6 Nh6f5 Qc6c4 Nf5d6 Qc4b3 g7g6 h2h3 Qd2h6 Re1e5 Rh8h7 Ne6g5+ Kf7g7 Ra1e1 Nd6f5 Qb3f7+ Kg7h8 Ng5xh7 Nf6xh7 Re5e6 Qh6d2 Qf7xg6 Nf5g7 Re6e5 e7e6 Qg6e4 Nh7f6 Qe4e3
   9/19/2008 9:36:01 PM, Time for this analysis: 00:25:00, Rated time: 25:20
It seems obvious that both moves win. I am not convinced that one is better than the other. There is a higher score for the key move, but Rybka 3 is much stronger than Toga CMLX.
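
For working through logs like these without reading every line, here is a rough sketch of a parser. It assumes the layout seen in the pasted output (tab-separated columns: depth, time, nodes, nodes per second, score, PV, with dots as thousands separators); `parse_analysis_line` and `solved_at` are hypothetical helper names, not part of Arena.

```python
def _normalize(move):
    """Make 'Re1-e5' and 'Re1e5' (or 'Ng5xe6+') comparable."""
    return move.replace("-", "").rstrip("+#")


def parse_analysis_line(line):
    """Parse one analysis row; return None for headers, footers and other lines."""
    cols = [c.strip() for c in line.split("\t")]
    if len(cols) < 6 or not cols[5]:
        return None
    depth_txt, clock, nodes_txt, _nps, score_txt, pv = cols[:6]
    try:
        depth = int(depth_txt.rstrip("+").split("/")[0])  # '16/53' or '12+' -> int
        seconds = 0
        for part in clock.split(":"):                      # 'mm:ss' or 'hh:mm:ss'
            seconds = seconds * 60 + int(part)
        nodes = int(nodes_txt.replace(".", ""))            # '1.942.276' -> 1942276
        score = float(score_txt)                           # '+2.28' -> 2.28 pawns
    except ValueError:
        return None
    return {"depth": depth, "seconds": seconds, "nodes": nodes,
            "score": score, "move": pv.split()[0]}


def solved_at(lines, key_move):
    """Seconds at which the engine switched to key_move and stayed on it, else None."""
    key = _normalize(key_move)
    found = None
    for line in lines:
        rec = parse_analysis_line(line)
        if rec is None:
            continue
        if _normalize(rec["move"]) == key:
            if found is None:
                found = rec["seconds"]
        else:
            found = None  # engine changed its mind; restart the run
    return found
```

Run over the Toga CMLX rows above with key move `Re1-e5`, this rule gives the start of the final uninterrupted run on the key move, which agrees with the "Found in: 01:44" figure in the log; whether Arena computes it the same way is an assumption.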

10.10 is very similar:

5) g4; id "arasan10.10" 
    Searching move: g5-g4
    Best move (Rybka 3): Bg7xc3
    Not found in: 25:00
      2	00:00	         621	635.904	+0.26	Bg7xc3
      3	00:00	         741	758.784	+0.09	Bg7xc3
      4+	00:01	       1.262	1.292.288	+0.29	Bg7xc3
      4	00:01	       1.496	1.531.904	+0.28	Bg7xc3
      5	00:01	       2.059	2.108.416	+0.46	Bg7xc3 Qd2xc3 f7f6
      6	00:02	       4.934	315.776	+0.27	Bg7xc3 Qd2xc3 Qe7e5 Qc3xe5
      7	00:02	       8.481	271.392	+0.44	Bg7xc3 Qd2xc3 Rd5d8 Rb8xd8 Qe7xd8 Qc3d2 Qd8d5
      8	00:02	      12.018	256.384	+0.44	Bg7xc3 Qd2xc3 Rd5d8 Rb8xd8 Qe7xd8 Qc3d2 Qd8d5
      9	00:02	      33.821	274.862	+0.49	Bg7xc3 Qd2xc3 Rd5d8 Rb8xd8 Qe7xd8 Qc3d2 Qd8d5 Kh1g2 g5g4 Kg2g1
     10	00:03	      44.169	262.959	+0.49	Bg7xc3 Qd2xc3 Rd5d8 Rb8xd8 Qe7xd8 Qc3d2 Qd8d5 Kh1g2 g5g4 Kg2g1
     11+	00:03	     387.284	343.358	+0.69	Bg7xc3
     11+	00:03	     557.211	351.561	+0.89	Bg7xc3
     11+	00:04	   1.004.401	346.882	+2.09	Bg7xc3
     11	00:04	   1.136.419	350.088	+2.11	Bg7xc3 Qd2xc3 d3d2 Qc3h8+ Kh7g6 Qh8g8+ Kg6f6
     12	00:05	   1.240.160	352.365	+2.11	Bg7xc3 Qd2xc3 d3d2 Qc3h8+ Kh7g6 Qh8g8+ Kg6f6 Qg8h8+ Kf6e6 Rb8b6+ Rd5d6 Qh8c8+ Qe7d7 Rb6xd6+ Ke6xd6 Qc8c2 Kd6e5 Qc2d1 g5g4 Kh1g2 Qd7d5 Kg2g1 g4xf3 Kg1f2
     13+	00:05	   1.942.819	352.239	+2.31	Bg7xc3
     13+	00:07	   2.542.039	358.793	+2.51	Bg7xc3
     13+	00:13	   4.615.862	351.475	+3.71	Bg7xc3
     13	00:27	   9.533.985	363.618	+3.96	Bg7xc3 Qd2xc3 d3d2 Qc3h8+ Kh7g6 Qh8g8+ Kg6f6 Qg8h8+ Kf6e6 Rb8b6+ Rd5d6 Qh8d4 Rd6xb6 Qd4e3+ Ke6d7 Qe3xd2+ Kd7e8 Qd2xa5 Rb6xb3 Qa5a8+ Qe7d8 Qa8c6+ Ke8e7 Qc6c5+ Ke7e6 Qc5f2
     14+	00:32	  11.187.985	359.633	+4.16	Bg7xc3
     14+	00:36	  12.742.658	360.216	+4.36	Bg7xc3
     14	00:43	  14.900.212	353.591	+4.71	Bg7xc3 Qd2xc3 d3d2 Qc3h8+ Kh7g6 Qh8g8+ Kg6f6 Qg8h8+ Kf6e6 Rb8b6+ Rd5d6 Qh8d4 Rd6xb6 Qd4e3+ Ke6d7 Qe3xd2+ Kd7e8 Qd2xa5 Rb6xb3 Qa5a8+ Qe7d8 Qa8c6+ Ke8e7 Qc6e4+ Ke7f8 Kh1g2
     15	00:50	  17.212.763	350.994	+4.71	Bg7xc3 Qd2xc3 d3d2 Qc3h8+ Kh7g6 Qh8g8+ Kg6f6 Qg8h8+ Kf6e6 Rb8b6+ Rd5d6 Qh8d4 Rd6xb6 Qd4e3+ Ke6d7 Qe3xd2+ Kd7e8 Qd2xa5 Rb6xb3 Qa5a8+ Qe7d8 Qa8c6+ Ke8e7 Qc6e4+ Ke7f8
     16+	01:12	  24.578.456	346.728	+4.91	Bg7xc3
     16+	01:38	  32.891.628	343.627	+5.11	Bg7xc3
     16	19:18	 420.198.215	371.490	+5.37	Bg7xc3
   9/19/2008 10:01:02 PM, Time for this analysis: 00:25:00, Rated time: 50:20

5) g4; id "arasan10.10" 
    Searching move: g5-g4
    Best move (TogaCMLX): g5-g4
    identical moves! Found in: 00:33
     2/15	00:00	       1.929	0	-0.18	Bg7xc3 Qd2xc3 d3d2 Qc3h8+ Kh7g6 Rb8g8+ Kg6f5 Qh8h7+ Kf5e6
     3/19	00:00	       4.389	0	+0.12	Bg7xc3 Qd2xc3 Qe7e5 Qc3c6 Qe5e1+ Kh1g2 Qe1e2+ Kg2g1 Qe2xf3
     4/21	00:00	       5.465	0	+0.45	Bg7xc3 Qd2xc3 Qe7e5 Qc3c6 Qe5e1+ Kh1g2 Qe1e2+ Kg2h3 Qe2xf3
     5/23	00:00	      11.765	0	+0.37	Bg7xc3 Qd2xc3 Qe7e5 Qc3xe5 Rd5xe5 Rb8d8 Re5e3 Rd8d7 Re3xf3
     6/23	00:00	      15.643	0	+0.24	Bg7xc3 Qd2xc3 Qe7e5 Qc3xe5 Rd5xe5 Rb8d8 Re5e1+ Kh1g2 Re1e3 Kg2f1 Re3xf3+ Kf1e1
     7/29	00:00	      70.045	0	+0.04	Bg7xc3 Qd2xc3 d3d2 Qc3h8+ Kh7g6 Qh8g8+ Kg6f6 Qg8h8+ Kf6e6 Rb8b6+ Rd5d6 Qh8h3+ f7f5 Qh3h6+ Ke6d5 Rb6b5+ Kd5d4
     8/36	00:00	     126.964	0	+0.99	Bg7xc3 Qd2xc3 d3d2 Qc3h8+ Kh7g6 Qh8g8+ Kg6f6 Qg8h8+ Kf6e6 Rb8b6+ Rd5d6 Qh8c8+ Qe7d7 Rb6xd6+ Ke6xd6 Qc8c2 Kd6e6 Qc2e4+ Ke6d6 Qe4d4+ Kd6c6
     9/35	00:00	     226.855	0	+0.94	Bg7xc3 Qd2xc3 d3d2 Qc3h8+ Kh7g6 Qh8g8+ Kg6f6 Qg8h8+ Kf6e6 Rb8b6+ Rd5d6 Qh8c8+ Qe7d7 Rb6xd6+ Ke6xd6 Qc8c2 Kd6e5 Qc2d1 Qd7d4
    10/35	00:00	     444.813	0	+0.77	Bg7xc3 Qd2xc3 d3d2 Qc3h8+ Kh7g6 Qh8g8+ Kg6f6 Qg8h8+ Kf6e6 Rb8b6+ Rd5d6 Qh8c8+ Qe7d7 Rb6xd6+ Ke6xd6 Qc8c2 Kd6e5 Qc2d1 Qd7d4 Kh1g2
    11/41	00:01	   1.089.568	0	+0.50	Bg7xc3 Qd2xc3 d3d2 Qc3h8+ Kh7g6 Qh8g8+ Kg6f6 Qg8h8+ Kf6e6 Rb8b6+ Rd5d6 Qh8c8+ Qe7d7 Rb6xd6+ Ke6xd6 Qc8c2 Kd6e5 Qc2d1 Qd7d4 Kh1g2 Ke5f5
    12/43	00:02	   2.026.732	5.285.996	+0.63	Bg7xc3 Qd2xc3 d3d2 Qc3h8+ Kh7g6 Qh8g8+ Kg6f6 Qg8h8+ Kf6e6 Rb8b6+ Rd5d6 Qh8c8+ Qe7d7 Rb6xd6+ Ke6xd6 Qc8c2 Kd6e5 Qc2d1 Qd7d4 Kh1g2 f7f6 h2h4 g5xh4 g3xh4
    13/56	00:04	   5.967.850	5.128.205	+0.50	Bg7xc3 Qd2xc3 d3d2 Qc3h8+ Kh7g6 Qh8g8+ Kg6f6 Qg8h8+ Kf6e6 Rb8b6+ Rd5d6 Qh8c8+ Qe7d7 Rb6xd6+ Ke6xd6 Qc8c2 Kd6e5 Qc2d1 Qd7d5 Kh1g1 f7f6 Kg1f2 Qd5d4+ Kf2e2
    14/56	00:09	  12.098.519	5.136.971	+0.52	Bg7xc3 Qd2xc3 d3d2 Qc3h8+ Kh7g6 Qh8g8+ Kg6f6 Qg8h8+ Kf6e6 Rb8b6+ Rd5d6 Qh8c8+ Qe7d7 Rb6xd6+ Ke6xd6 Qc8c2 Kd6e5 Qc2d1 Qd7d5 Kh1g1 Qd5d3 Kg1g2 Ke5d6 h2h4 g5xh4 g3xh4
    14/70	00:33	  42.223.451	5.256.410	+4.02	g5g4 Nc3e4 g4xf3 Ne4g5+ Qe7xg5 Qd2xg5 Rd5xg5 Rb8d8 f3f2 Kh1g2 Rg5f5 Kg2f1 Rf5f3 Rd8d7 Bg7e5 Rd7e7 d3d2 Re7d7 Be5c3 g3g4
    15/58	00:36	  47.500.930	5.235.278	+4.13	g5g4 Nc3e4 g4xf3 Ne4g5+ Qe7xg5 Qd2xg5 Rd5xg5 Rb8d8 f3f2 Kh1g2 Rg5f5 Kg2f1 Rf5f3 Rd8d7 Bg7e5 Rd7e7 d3d2 Re7d7 Be5c3 g3g4 Kh7g6
    16/70	00:44	  53.297.688	5.141.049	+4.27	g5g4 Nc3e4 g4xf3 Ne4g5+ Qe7xg5 Qd2xg5 Rd5xg5 Rb8d8 f3f2 Kh1g2 Rg5f5 Kg2f1 Rf5f3 Rd8d7 Bg7e5 Rd7e7 d3d2 Re7d7 Be5c3 g3g4 Kh7g6 h2h4 f7f5 Rd7d6+ Kg6f7
    17/70	01:17	  90.605.656	4.909.103	+6.12	g5g4 Nc3e4 g4xf3 Ne4g5+ Qe7xg5 Qd2xg5 Rd5xg5 Rb8d8 Rg5c5 Kh1g1 Rc5c3 Rd8d7 Bg7h6 Kg1f2 d3d2 h2h4 Rc3xb3 g3g4 Rb3b1 g4g5 d2d1Q Rd7xd1 Rb1xd1 g5xh6 Rd1d3 Kf2g3 Kh7xh6 Kg3f2 Kh6g6
    18/63	02:01	 146.226.999	4.808.877	+6.33	g5g4 Nc3e4 g4xf3 Ne4g5+ Qe7xg5 Qd2xg5 Rd5xg5 Rb8d8 Rg5c5 Kh1g1 Rc5c3 Rd8d7 Bg7h6 Kg1f2 d3d2 h2h4 Rc3xb3 g3g4 Rb3b1 Rd7xf7+ Kh7g6 Rf7d7 d2d1Q Rd7xd1 Rb1xd1 Kf2xf3 Rd1f1+ Kf3e2 Rf1b1 Ke2f3 Rb1b3+ Kf3e2
    19/70	02:50	 196.947.022	4.808.956	+6.42	g5g4 Nc3e4 g4xf3 Ne4g5+ Qe7xg5 Qd2xg5 Rd5xg5 Rb8d8 Rg5c5 Kh1g1 Rc5c3 Rd8d7 Bg7h6 Kg1f2 d3d2 Rd7xf7+ Kh7g6 Rf7d7 Rc3xb3 g3g4 Rb3b1 Rd7d6+ Kg6f7 h2h4 Bh6f4 Rd6xd2 Bf4xd2 Kf2xf3 Kf7e6 g4g5 Ke6f5 Kf3e2
    20/70	04:10	 292.642.903	4.887.041	+6.43	g5g4 Nc3e4 g4xf3 Ne4g5+ Qe7xg5 Qd2xg5 Rd5xg5 Rb8d8 Rg5c5 Kh1g1 Rc5c3 Kg1f2 Bg7h6 Rd8d7 d3d2 Rd7xf7+ Kh7g6 Rf7d7 Rc3xb3 g3g4 Rb3b1 Rd7d6+ Kg6f7 h2h4 Bh6f4 Rd6d7+ Kf7e6 Rd7xd2 Bf4xd2 Kf2xf3 Rb1b4 Kf3e2 Bd2f4
    21/70	08:16	 551.082.015	4.869.311	+7.41	g5g4 Nc3e4 g4xf3 Ne4g5+ Qe7xg5 Qd2xg5 Rd5xg5 Rb8d8 Rg5c5 Kh1g1 Rc5c3 Kg1f2 Bg7h6 Kf2xf3 d3d2+ Kf3e2 Rc3e3+ Ke2xd2 Re3e8+ Kd2c3 Re8xd8 Kc3c4 Bh6d2 Kc4b5 Kh7g6 Kb5c4 Rd8h8 h2h4 Bd2e1 Kc4b5 Be1xg3 Kb5xa5 Rh8xh4 Ka5b5 Rh4h5+ Kb5c4
    22/76	17:18	1.271.577.648	4.899.395	+8.11	g5g4 Nc3e4 g4xf3 Ne4g5+ Qe7xg5 Qd2xg5 Rd5xg5 Rb8d8 Rg5c5 Kh1g1 Rc5c3 Kg1f2 Bg7h6 Kf2xf3 d3d2+ Kf3e2 Rc3e3+ Ke2xd2 Re3e8+ Kd2c3 Re8xd8 Kc3c4 Rd8b8 Kc4c3 Bh6g7+ Kc3c4 Kh7g6 Kc4d5 Rb8xb3 Kd5c4 Rb3b2 h2h4 Rb2b4+ Kc4d5 Rb4xa4 Kd5c5 Ra4e4
   9/20/2008 11:52:14 PM, Time for this analysis: 00:25:00, Rated time: 48:33
Again, the score is better for the key move as found by Toga, but Rybka is 100 Elo stronger and sees more than a rook's advantage for its choice.
Rybka also tends to report lower scores (for reference, Shredder gives scores of larger magnitude).
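
To put a 100 Elo gap in perspective, the standard Elo expected-score formula can be evaluated directly. This is a small sketch; `expected_score` is a hypothetical helper name, and the 100-point figure is just the estimate quoted above.

```python
def expected_score(elo_diff):
    """Expected score (0..1) for the stronger side under the Elo logistic model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# A 100-point edge corresponds to an expected score of roughly 64%.
print(round(expected_score(100), 3))
```

So the stronger engine is favored, but far from infallible on any single position.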
<<
Re 10.46 (Ra2) I think we agree this is not a good test move because there are alternative mates. I'd like to replace it with this position from a recent game, which is interesting but not terribly hard:

b4r1k/5ppq/8/1P1pPP2/4p1P1/2R3Q1/6K1/3R2Br b - - bm d4; id "arasan10.46"; c0 "Crafty-Arasan, ICC 2008";

10.199 (d5) is also suspect (not sure if this is one in your list but it was in an earlier post). I have a candidate replacement:

r1b1rk2/p1pq2p1/1p1b1p1p/n2P4/2P1NP2/P2B1R2/1BQ3PP/R6K w - - bm Nxf6; id "arasan10.199"; c0 "Gutsche-Jones, W-ch24 sf05 email 2000";

This is very difficult for Arasan but Rybka solved it in about a minute. Shredder 10 took about 10 minutes (on my dual box).

10.108 (Bxh6) and 10.15 (Kf6) I will look at more closely.