Surprisingly difficult stalemate problem (for some engines)

carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Re: Surprisingly difficult stalemate problem (for some engin

Post by carldaman »

A similar thing happened once in a slow test game I ran between SF and Tornado 5, where SF took the Rook and allowed stalemate. It was about a year and a half ago, and SF also had trouble seeing the stalemate in analysis mode, but a newer SF may detect it faster, based on what I see in the other replies.

Regards,
CL
Warp wrote:One day I was running a Spike 1.4 vs. Stockfish 6 game at a moderate time control (40 moves in 20 minutes), and by move 67 Stockfish was clearly ahead (eval over +6 for black). Then, suddenly and surprisingly, Spike, playing white, forced stalemate in just a few moves. Stockfish completely failed to see the stalemate and avoid it.

The crucial point in the game was this (white to play and stalemate):

[d]8/5R2/1p2p1pk/p6p/P2R3P/8/r5r1/7K w - - 0 68

It turns out that for some engines (including Stockfish), it's surprisingly hard to see the stalemate, while for other engines (such as Spike) it's very easy. When I analyze that position with several engines, using 4 threads on an i5, it takes roughly this long for them to see the stalemate (it can vary quite a lot between runs):

- Stockfish (even very recent versions): Between 2 and 8 minutes.
- Texel 1.02: Between 1 and 8 minutes.
- Gull 3: Doesn't seem to ever see the stalemate, no matter how long I let it run.
- Spike 1.4: Less than a second.
- Hakkapeliitta 3.0: 6 seconds (very consistently).
- Rybka 2.3.2a: Between 2 and 14 seconds.
- Bobcat 7.1: Doesn't seem to ever see the stalemate, no matter how long I let it run.
- Ruffian 1.0.5: Less than a second.
- Hermann 2.8: About one second.

I don't have Komodo, but I have been told that it, too, can take several minutes to see the stalemate (although I don't know about the most recent versions).

What I find fascinating and interesting about this is that it's not an artificially constructed position, but it's an actual position that came up in an engine vs. engine game (and in which the stronger engine missed victory because it didn't see it).
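
The drawing idea in the diagram can be checked mechanically: once Black has captured both white rooks, White has no legal move left. Below is a minimal sketch of that check, written for this position only. It assumes the black king is left on h6 as in the diagram, handles just the piece types that remain (white king and pawns; black king, rooks and pawns), and ignores en passant and double pawn pushes. All function names are illustrative.

```python
FEN = "8/5R2/1p2p1pk/p6p/P2R3P/8/r5r1/7K w - - 0 68"

def parse_board(fen):
    """Map (file, rank) -> piece letter, with a1 = (0, 0) and h8 = (7, 7)."""
    board = {}
    for row, text in enumerate(fen.split()[0].split("/")):
        file = 0
        for ch in text:
            if ch.isdigit():
                file += int(ch)  # run of empty squares
            else:
                board[(file, 7 - row)] = ch
                file += 1
    return board

def attacked_by_black(board, sq):
    """True if a black king, pawn or rook attacks the square sq."""
    f, r = sq
    for (pf, pr), piece in board.items():
        if piece == "k" and max(abs(pf - f), abs(pr - r)) == 1:
            return True
        if piece == "p" and pr - 1 == r and abs(pf - f) == 1:
            return True
        if piece == "r" and (pf == f) != (pr == r):
            # Walk from the rook towards sq; any piece in between blocks it.
            df, dr = (f > pf) - (f < pf), (r > pr) - (r < pr)
            x, y = pf + df, pr + dr
            while (x, y) != sq and (x, y) not in board:
                x, y = x + df, y + dr
            if (x, y) == sq:
                return True
    return False

def white_moves(board):
    """Yield the board after each pseudo-legal white king or pawn move."""
    for (f, r), piece in list(board.items()):
        targets = []
        if piece == "K":
            targets = [(f + df, r + dr) for df in (-1, 0, 1) for dr in (-1, 0, 1)
                       if (df or dr) and 0 <= f + df < 8 and 0 <= r + dr < 8
                       and not board.get((f + df, r + dr), "").isupper()]
        elif piece == "P":
            if (f, r + 1) not in board:
                targets.append((f, r + 1))  # single push only
            targets += [(f + df, r + 1) for df in (-1, 1)
                        if board.get((f + df, r + 1), "").islower()]
        for target in targets:
            new = dict(board)
            del new[(f, r)]
            new[target] = piece
            yield new

def white_is_stalemated(board):
    """True if White is not in check and has no legal king or pawn move."""
    if any(p not in "KPkrp" for p in board.values()):
        raise ValueError("sketch only supports K, P (white) and k, r, p (black)")
    def king_safe(b):
        king = next(sq for sq, p in b.items() if p == "K")
        return not attacked_by_black(b, king)
    return king_safe(board) and not any(king_safe(b) for b in white_moves(board))

# Remove both white rooks, as if Black had captured them.
no_rooks = {sq: p for sq, p in parse_board(FEN).items() if p != "R"}
print(white_is_stalemated(no_rooks))  # True
```

The white king is boxed in by the black rooks on a2 and g2, and both white pawns are blocked, which is why every rook check can be offered for free.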
JVMerlino
Posts: 1357
Joined: Wed Mar 08, 2006 10:15 pm
Location: San Francisco, California

Re: Surprisingly difficult stalemate problem (for some engin

Post by JVMerlino »

Even my mediocre Myrddin announces a draw in less than one second. No excuse for stronger engines, then. :-)

jm
tpoppins
Posts: 919
Joined: Tue Nov 24, 2015 9:11 pm
Location: upstate

Re: Surprisingly difficult stalemate problem (for some engin

Post by tpoppins »

Warp wrote:The crucial point in the game was this (white to play and stalemate):

[d]8/5R2/1p2p1pk/p6p/P2R3P/8/r5r1/7K w - - 0 68

It turns out that for some engines (including Stockfish), it's surprisingly hard to see the stalemate, while for other engines (such as Spike) it's very easy. When I analyze that position with several engines, using 4 threads on an i5, it takes roughly this long for them to see the stalemate (it can vary quite a lot between runs):

- Stockfish (even very recent versions): Between 2 and 8 minutes.
- Texel 1.02: Between 1 and 8 minutes.
- Gull 3: Doesn't seem to ever see the stalemate, no matter how long I let it run.
- Spike 1.4: Less than a second.
- Hakkapeliitta 3.0: 6 seconds (very consistently).
- Rybka 2.3.2a: Between 2 and 14 seconds.
- Bobcat 7.1: Doesn't seem to ever see the stalemate, no matter how long I let it run.
- Ruffian 1.0.5: Less than a second.
- Hermann 2.8: About one second.
Deep Fritz 14 (AKA Pandix), several hundred Elo points below SF on the rating lists, also sees it instantly:

[d]8/5R2/1p2p1pk/p6p/P2R3P/8/r5r1/7K w - - 0 68

Analysis by Deep Fritz 14 x64:

[...]
68.Rh7+ Kxh7 69.Rd7+ Kh8 70.Rh7+ Kg8 71.Rg7+ Kf8 72.Rf7+ Ke8 73.Re7+ Kd8 74.Rd7+ Kc8 75.Rc7+ Kb8 76.Rb7+ Kxb7 
  =  (0.00)   Depth: 17   00:00:01  6905kN, tb=9
So does Houdini:

Analysis by Houdini 4 Pro x64 B:

68.Rh7+ Kxh7 69.Rd7+ Kh8 70.Rh7+ Kg8 71.Rg7+ Kh8 72.Rh7+ Kg8 
  =  (0.00)   Depth: 13/34   00:00:01  596kN, tb=4
Gull (the 040516 Syzygy build) does find the draw, but only after a looong think:

Analysis by Gull 3 x64 (syzygy):

[...]
68.Rh7+ Kxh7 69.Rd7+ Kh8 70.Rh7+ Kg8 71.Rg7+ Kf8 72.Rf7+ Ke8 73.Re7+ Kd8 74.Rd7+ Kc8 75.Rc7+ Kb8 76.Rb7+ Ka8 77.Ra7+ Kb8 78.Rb7+ 
  -+  (-2.74 ++)   Depth: 33/45   00:44:45  44780MN, tb=4957816
68.Rh7+ Kxh7 69.Rd7+ Kh8 70.Rh7+ Kg8 71.Rg7+ Kf8 72.Rf7+ Ke8 73.Re7+ Kd8 74.Rd7+ Kc8 75.Rc7+ Kb8 76.Rb7+ Ka8 77.Ra7+ Kb8 78.Rb7+ 
  =  (0.00)   Depth: 34/50   00:44:46  44788MN, tb=4962275
Warp wrote: I don't have Komodo, but I have been told that it, too, can take several minutes to see the stalemate (although I don't know about the most recent versions).
K9.3 with Null-Pruning off needs almost 20,000 MN to see it at d=30 (15 min on the same setup). I am not using v9.42, as anyone can prove any eval he wants by tweaking Analysis Contempt.
Warp wrote:What I find fascinating and interesting about this is that it's not an artificially constructed position, but it's an actual position that came up in an engine vs. engine game (and in which the stronger engine missed victory because it didn't see it).
That this should happen to Stockfish is the direct consequence of its design philosophy.
Uri Blass
Posts: 10410
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Surprisingly difficult stalemate problem (for some engin

Post by Uri Blass »

I agree that Stockfish's blindness in some positions is a direct consequence of its design philosophy.

I do not like that design philosophy: it cares only about Elo, and so it prevents fixing obvious bugs (bugs from my point of view; the Stockfish developers may think differently).

A better design philosophy should also consider how long positions take to solve.

Stockfish does not have to be as fast as Houdini in solving every position, but being more than 100 times slower should certainly not be acceptable.

If Houdini solves it in 0.1 seconds, then Stockfish should take no more than 10 seconds on the same hardware.

I believe that bugs like this can be fixed with no significant change in Elo, but the Stockfish design philosophy does not allow it: they will not fix the bug, even if the fix passes SPRT(-3,1) twice, unless the fix is also a simplification.
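
For readers unfamiliar with the notation, SPRT(-3,1) refers to a sequential probability ratio test with Elo bounds of -3 and +1: testing stops once the evidence clearly favors one bound over the other. The following is a minimal sketch of such a test, assuming the logistic Elo model, a normal approximation for the log-likelihood ratio, and error rates alpha = beta = 0.05. These are assumptions made for illustration, the function names are hypothetical, and this is not the code of Stockfish's testing framework.

```python
import math

def expected_score(elo):
    """Expected score for an Elo difference under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def sprt_llr(wins, draws, losses, elo0, elo1):
    """Approximate log-likelihood ratio of H1 (elo1) versus H0 (elo0)."""
    games = wins + draws + losses
    mean = (wins + 0.5 * draws) / games
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    variance = (wins + 0.25 * draws) / games - mean ** 2
    if variance == 0:
        raise ValueError("need at least two different game results")
    s0, s1 = expected_score(elo0), expected_score(elo1)
    return games * (s1 - s0) * (2.0 * mean - s0 - s1) / (2.0 * variance)

def sprt_decision(llr, alpha=0.05, beta=0.05):
    """Compare the ratio against the two stopping bounds."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    if llr >= upper:
        return "pass"
    if llr <= lower:
        return "fail"
    return "continue"

# A patch that scores exactly 50% needs many games before SPRT(-3,1) accepts it.
print(sprt_decision(sprt_llr(1000, 2000, 1000, -3, 1)))     # continue
print(sprt_decision(sprt_llr(15000, 30000, 15000, -3, 1)))  # pass
```

With bounds of -3 and +1, a change that neither gains nor loses Elo can eventually pass, which is why such bounds suit patches that are expected to be Elo-neutral.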
Branko Radovanovic
Posts: 89
Joined: Sat Sep 13, 2014 4:12 pm
Location: Zagreb, Croatia
Full name: Branko Radovanović

Re: Surprisingly difficult stalemate problem (for some engin

Post by Branko Radovanovic »

Stockfish 4 solves it in a matter of seconds, while DD doesn't see it for at least a couple of minutes or so (perhaps not at all?).

So, between 4 and DD there must have been a patch that added Elo (presumably), yet "broke" this...
Nordlandia
Posts: 2821
Joined: Fri Sep 25, 2015 9:38 pm
Location: Sortland, Norway

Re: Surprisingly difficult stalemate problem (for some engin

Post by Nordlandia »

[pgn][Event "18.p WCCT7 Ty (tt)"]
[Site "?"]
[Date "2004.??.??"]
[Round "?"]
[White "Predrag=N"]
[Black "(=1031.57d3a1)"]
[Result "1/2-1/2"]
[SetUp "1"]
[FEN "7Q/4p3/4p3/p1p1P3/Pp2P3/3Kp3/p1PbP3/kN6 w - - 0 1"]
[PlyCount "22"]
[EventDate "2004.??.??"]

{EG#14182} 1. Na3 (1. Nxd2 $2 exd2 2. Kxd2 Kb2 3. Qh3 c4 $1) 1... bxa3 (1...
Kb2 2. Nc4+ Kb1 3. Qh1+ Bc1 4. Nxa5 $1 a1=Q 5. Qxc1+ Kxc1 6. Nb3+) 2. Qh1+ $1
Kb2 3. Qa1+ $1 Kxa1 4. c4 $1 Bc1 (4... Kb1) (4... Be1 5. Kc2 Bd2 6. Kd1 Kb1) 5.
Kc2 Bb2 6. Kb3 $1 Bxe5 (6... Kb1) 7. Kc2 Bb2 8. e5 $1 (8. Kb3 $2 Kb1 $1 9. e5
a1=N#) 8... Bxe5 9. Kc1 Bc3 (9... Bb2+ 10. Kc2) 10. Kc2 Bb2 (10... Bd2 11. Kd1
$1 (11. Kd3 $2 Kb2) 11... Kb1) 11. Kb3 Kb1 1/2-1/2

[/pgn]
Vinvin
Posts: 5228
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: Surprisingly difficult stalemate problem (for some engin

Post by Vinvin »

Nice one!
Max
Posts: 247
Joined: Tue Apr 13, 2010 10:41 am

Re: Surprisingly difficult stalemate problem (for some engin

Post by Max »

hgm wrote:Rabid Rook is notoriously difficult to recognize. The 50-move barrier is usually way beyond the horizon, so you are dependent on 3-fold reps to recognize the draw. Hash hits can often shield the repeats, however.
Seriously?

[D]8/5R2/1p2p1pk/p6p/P2R3P/8/r5r1/7K w - - 0 1

Analysis by Junior 7:
68.Rh7+ Kxh7 69.Rd7+ Kg8 70.Rg7+ Kf8 71.Rf7+ Ke8 72.Re7+ Kd8 73.Rd7+ Kc8 74.Rc7+ Kb8 75.Rb7+ Ka8 76.Rb8+ Kxb8
= (0.00) Depth: 18 00:00:02 10594kN

Analysis by The King 3.50:
68.Rh7+ Kxh7 69.Rd7+ Kg8 70.Rg7+ Kf8 71.Rf7+ Ke8 72.Re7+ Kd8 73.Rd7+ Kc8 74.Rc7+ Kb8 75.Rb7+ Ka8 76.Ra7+ Kxa7
= (0.00) Depth: 13 00:00:01 862kN

Analysis by Shredder 7.04:
68.Rh7+ Kxh7 69.Rd7+ Kg8 70.Rg7+ Kh8 71.Rg8+ Kxg8
= (0.00) Depth: 11/11 00:00:01 10525kN

Analysis by SOS:
68.Rh7+ Kxh7 69.Rd7+ Kh8 70.Rh7+ Kg8 71.Rg7+ Kf8 72.Rf7+ Ke8 73.Re7+ Kd8 74.Rd7+ Kc8 75.Rc7+ Kb8 76.Rb7+ Ka8 77.Ra7+ Kxa7
= (0.00) Depth: 17/30 00:00:01 2992kN

Analysis by Rebel 12:
68.Rh7+ Kxh7 69.Rd7+ Kg8 70.Rg7+ Kf8 71.Rf7+ Ke8 72.Re7+ Kd8 73.Rd7+ Kc8 74.Rc7+ Kb8 75.Rb7+ Ka8 76.Ra7+
= (0.00) Depth: 16 00:00:03 12220kN

Analysis by Phalanx XXII:
68.Rh7+ Kxh7 69.Rd7+ Kh8 70.Rh7+ Kg8 71.Rg7+ Kf8 72.Rf7+ Ke8 73.Re7+ Kd8 74.Rd7+ Kc8 75.Rc7+ Kb8 76.Rb7+ Kxb7
= (0.00) Depth: 11/31 00:00:01 2026kN

Analysis by OliThink 5.30:
68.Rh7+ Kxh7 69.Rd7+ Kh8 70.Rh7+ Kg8 71.Rg7+ Kf8 72.Rf7+ Ke8 73.Re7+ Kd8 74.Rd7+ Kc8 75.Rc7+ Kb8 76.Rb7+ Kxb7
= (0.00) Depth: 21 00:00:01 3873kN

Analysis by Fritz 6:
68.Rh7+ Kxh7 69.Rd7+ Kh8 70.Rh7+ Kg8 71.Rg7+ Kf8 72.Rf7+ Ke8
= (0.00) Depth: 17/32 00:00:03 13225kN

Analysis by Fruit 1.0:
68.Rh7+ Kxh7 69.Rd7+ Kh8 70.Rh7+ Kg8 71.Rg7+ Kf8 72.Rf7+ Ke8 73.Re7+ Kd8 74.Rd7+ Kc8 75.Rc7+ Kb8 76.Rb7+ Ka8 77.Ra7+ Kxa7
= (0.00) Depth: 19 00:00:01 2613kN

Analysis by Colossus 2008b:
68.Rh7+ Kxh7 69.Rd7+ Kh8 70.Rh7+ Kg8 71.Rg7+ Kf8 72.Rf7+ Ke8 73.Re7+ Kd8 74.Rd7+ Kc8 75.Rc7+ Kb8 76.Rb7+ Ka8 77.Rb8+ Kxb8
= (0.00) Depth: 19/34 00:00:01 2579kN

Analysis by Comet B68:
68.Rh7+ Kxh7 69.Rd7+ Kh8 70.Rh7+ Kg8 71.Rg7+ Kh8 72.Rg8+ Kxg8
= (0.00) Depth: 13/32 00:00:08 17158kN

Analysis by Crafty 14.12:
68.Rh7+ Kxh7 69.Rd7+ Kg8 70.Rg7+ Kf8 71.Rf7+ Ke8 72.Re7+ Kd8 73.Rd7+ Kc8 74.Rc7+ Kb8 75.Rb7+ Ka8 76.Ra7+ Kxa7
= (0.00) Depth: 16 00:00:03 8520kN
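
hgm's point quoted at the top of this post can be made concrete with a small sketch. It shows a path-based repetition check of the kind a search might run at each node, and how far away the 50-move rule is by comparison. This is an illustration under assumptions, not any engine's actual code: the string keys are hypothetical stand-ins for hash keys, and only the remaining white rook, the black king and the side to move are encoded, because nothing else changes in this line.

```python
def plies_until_fifty_move_draw(halfmove_clock):
    """Plies still to be played before the 50-move rule applies."""
    return max(0, 100 - halfmove_clock)

def is_repetition(history, key, halfmove_clock):
    """True if key already occurred earlier on the current line.

    history holds the keys of earlier positions, oldest first. Only
    positions since the last capture or pawn move can repeat, and only
    those with the same side to move (every second ply back).
    """
    reversible = history[-halfmove_clock:] if halfmove_clock > 0 else []
    return key in reversible[-2::-2]

# Houdini's line from earlier in the thread, starting after 68.Rh7+ Kxh7.
line = [
    "Rd4 Kh7 w",  # after 68...Kxh7 (capture, halfmove clock back to 0)
    "Rd7 Kh7 b",  # 69.Rd7+
    "Rd7 Kh8 w",  # 69...Kh8
    "Rh7 Kh8 b",  # 70.Rh7+
    "Rh7 Kg8 w",  # 70...Kg8
    "Rg7 Kg8 b",  # 71.Rg7+
    "Rg7 Kh8 w",  # 71...Kh8
]

# 72.Rh7+ reaches the same position as 70.Rh7+.
print(is_repetition(line, "Rh7 Kh8 b", halfmove_clock=7))  # True
print(plies_until_fifty_move_draw(7))                      # 93
```

The check depends on the path that led to the position, whereas a hash table entry is keyed on the position alone. A stored score can therefore end the search of a node before the repeating position further down the line is ever reached, which is the shielding hgm describes.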
Hope we're not just the biological boot loader for digital super intelligence. Unfortunately, that is increasingly probable - Elon Musk
OliverBr
Posts: 725
Joined: Tue Dec 18, 2007 9:38 pm
Location: Munich, Germany
Full name: Dr. Oliver Brausch

Re: Surprisingly difficult stalemate problem (for some engines)

Post by OliverBr »

Actually, OliThink 5.3.5 sees the stalemate after 0.34 seconds, at ply 11:

13	  0.00 	1.66M  	0:00.35	f7h7 h6h7 d4d7 h7h8 d7h7 h8g8 h7g7 g8f8 g7f7 f8e8 f7e7 e8d8 e7d7 d8c8 d7c7 c8b8 c7b7 b8a8 b7a7 a8a7  
 12	  0.00 	1.64M  	0:00.34	f7h7 h6h7 d4d7 h7h8 d7h7 h8g8 h7g7 g8f8 g7f7 f8e8 f7e7 e8d8 e7d7 d8c8 d7c7 c8b8 c7b7 b8a8 b7a7 a8a7  
 11	  0.00 	1.63M  	0:00.34	f7h7 h6h7 d4d7 h7h8 d7h7 h8g8 h7g7 g8f8 g7f7 f8e8 f7e7 e8d8 e7d7 d8c8 d7c7 c8b8 c7b7 b8a8 b7a7 a8a7  
 10	 -2.92 	798891	0:00.17	f7f3 g2h2 h1g1 a2g2 g1f1 g2c2 f1g1 h2e2 g1f1 e6e5 d4d6 e5e4  
  9	 -2.88 	425055	0:00.09	f7f3 g2h2 h1g1 a2g2 g1f1 g2c2 f1g1 h2e2 d4d6 c2c1 f3f1 c1c4  
  8	 -2.80 	155519	0:00.03	d4e4 g2b2 h1g1 e6e5 f7f6 b2b1 f6f1 b1b4 e4e5  
  7	 -2.78 	76262  	0:00.01	d4e4 g2c2 h1g1 e6e5 e4e5 a2a1 f7f1 a1a4  
  6	 -2.78 	35347  	0:00.00	d4e4 g2c2 h1g1 e6e5 e4e5 a2a1 f7f1  
  5	 -2.75 	15392  	0:00.00	d4e4 g2c2 h1g1 c2c1 f7f1 c1c2  
  4	 -2.68 	5864    	0:00.00	f7f6 g2h2 h1g1 a2g2 g1f1 e6e5  
  3	 -2.65 	1128    	0:00.00	d4e4 g2h2 h1g1 a2g2 g1f1  
  2	 -2.61 	422      	0:00.00	d4e4 g2h2 h1g1  
  1	 -2.25 	42        	0:00.00	d4d7  
  0	# 
Chess Engine OliThink: http://brausch.org/home/chess
OliThink GitHub:https://github.com/olithink
Andrew
Posts: 231
Joined: Thu Mar 09, 2006 12:51 am
Location: Australia

Re: Surprisingly difficult stalemate problem (for some engines)

Post by Andrew »

Just read the earlier posts. This position is still a problem for Komodo 14, hope it can be fixed!

I have also confirmed that Deep Fritz 14 finds it quickly (along with Fritz 13 and 15). But Fritz 17, no!

Don't have Fritz 16.

Andrew