I tried this position with Crafty-23.2 and Stockfish 1.8 and they both give scores > 8 after analyzing for more than 1 minute, while the position is a tablebase draw. These seem like remarkably optimistic evaluations with no win in sight.
[D]2k5/8/Pp1K4/8/7B/8/P7/8 w - - 0 1 bm a4; id "Fine 149 draw";
Position crafty and stockfish both badly mis-evaluate
Re: Position crafty and stockfish both badly mis-evaluate
jwes wrote:
I tried this position with Crafty-23.2 and Stockfish 1.8 and they both give scores > 8 after analyzing for more than 1 minute, while the position is a tablebase draw. These seem like remarkably optimistic evaluations with no win in sight.
[D]2k5/8/Pp1K4/8/7B/8/P7/8 w - - 0 1 bm a4; id "Fine 149 draw";

Has to be a bug in Crafty. It should say White can't win here, since rook pawn(s) + wrong bishop can't promote with the king in front of the pawn. I'll look, as it should get this correct.
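The "rook pawn(s) + wrong bishop" rule mentioned here can be sketched in a few lines of C. This is only an illustration of the general idea, not Crafty's code: the helper name wrong_bishop_draw, the square numbering (a1 = 0 ... h8 = 63) and the "defending king within one square of the promotion corner" test are all assumptions made for the example. It also covers only the bare-king case with White as the attacker, so it would not by itself settle the position in this thread, where Black still has a b-pawn and the king is on c8.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Assumed square numbering: a1 = 0, b1 = 1, ..., h8 = 63. */
static int file_of(int sq) { return sq & 7; }
static int rank_of(int sq) { return sq >> 3; }

/* A square is light when file + rank is odd (a1 is dark). */
static bool is_light(int sq) { return ((file_of(sq) + rank_of(sq)) & 1) != 0; }

/* King-move (Chebyshev) distance between two squares. */
static int king_distance(int a, int b) {
    int df = abs(file_of(a) - file_of(b));
    int dr = abs(rank_of(a) - rank_of(b));
    return df > dr ? df : dr;
}

/*
 * Hypothetical helper. Assumes White has only a king, one bishop and
 * pawns that all stand on one rook file (pawn_file is 0 for a, 7 for h),
 * and that Black has a bare king.
 * Returns true when the bishop does not control the promotion square
 * and the black king is on or next to that square.
 */
static bool wrong_bishop_draw(int pawn_file, int bishop_sq, int black_king_sq) {
    if (pawn_file != 0 && pawn_file != 7)
        return false;                      /* not a rook pawn */
    int promo_sq = 56 + pawn_file;         /* a8 or h8 */
    if (is_light(bishop_sq) == is_light(promo_sq))
        return false;                      /* right-coloured bishop */
    return king_distance(black_king_sq, promo_sq) <= 1;
}
```

For example, with the bishop on h4 (square 31) and a-pawns only, the helper reports a draw once the black king reaches b8 (square 57), but not while it is still on c8 (square 58).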
Re: Position crafty and stockfish both badly mis-evaluate
This position is tricky because in some positions the b-pawn can be forced to advance, letting White convert his second a-pawn into a b-pawn, which will foil most evals if it happens near the leaves of the tree. If you moved the b-pawn to any other file, I expect more programs would understand this position better.
When trying to fix things in these positions, care must be taken. This particular position is drawn, but there are similar positions which are not.
-Sam
Re: Position crafty and stockfish both badly mis-evaluate
BubbaTough wrote:
This position is tricky because the b pawn in some positions can be forced to advance letting white convert his 2nd a pawn into a b pawn, which will foil most evals if done near the leafs of the tree. If you made the b pawn into a pawn on any other file, I expect more programs would understand this position better.
When trying to fix things in these positions, care must be taken. This particular positions is drawn, but there are similar positions which are not.
-Sam

I was thinking that too, but I can't construct a position with B and 2 a pawns vs b pawn where that works.
Re: Position crafty and stockfish both badly mis-evaluate
BubbaTough wrote:
This position is tricky because the b pawn in some positions can be forced to advance letting white convert his 2nd a pawn into a b pawn, which will foil most evals if done near the leafs of the tree. If you made the b pawn into a pawn on any other file, I expect more programs would understand this position better.
When trying to fix things in these positions, care must be taken. This particular positions is drawn, but there are similar positions which are not.
-Sam

Gaviota was supposed to know all this but this position is very tricky for other reasons.
First of all, White can force Black to take the bishop, and if the program does not recognize KPPKP with two rook pawns as a draw, it will keep giving a high positive score.
Second, if there is no stalemate detection in quies, the search can steer the PV so that the last quies move always captures the b-pawn and leaves a stalemate. The evaluation will then return a winning score (since the pawn is now on the b-file).
Third, if the futility margin in quies() is not big enough, it makes the whole thing worse, particularly through a nasty interaction with the hash tables. I needed to correct all three of these things, and now it works:
Gaviota 0.76.6-modified.
No tablebases
Code: Select all
setboard 2k5/8/Pp1K4/8/7B/8/P7/8 w - - 0 1 bm a4; id "Fine 149 draw";
d
+-----------------+
| . . k . . . . . |
| . . . . . . . . |
| P p . K . . . . |
| . . . . . . . . | Castling:
| . . . . . . . B | ep: -
| . . . . . . . . |
| P . . . . . . . |
| . . . . . . . . | [White]
+-----------------+
tbuse off
analyze
********* Starts iterative deepening, thread = 0
set timer to infinite
25 1: 0.0 +1.46 1.Kc6
124 2 0.0 :-(
215 2: 0.0 +0.19 1.Kc6 Kb8
598 3: 0.0 +0.18 1.Kc6 Kb8 2.Bg3+ Ka7
2096 4: 0.0 +0.19 1.Kc6 Kb8 2.Kb5 Ka7
6849 5: 0.0 +0.19 1.Kc6 Kb8 2.Bg3+ Ka8 3.Kb5 Ka7
9286 6 0.0 +0.19 1.Kc6 Kb8 2.Kb5 Ka8 3.Bf2 Ka7
22413 6: 0.1 +0.19 1.Kc6 Kb8 2.Kb5 Ka8 3.Bf2 Ka7
28938 7 0.1 +0.19 1.Kc6 Kb8 2.Bg3+ Ka7 3.Kb5 Ka8 4.a4 Ka7
50131 7 0.2 +0.19 1.Bg3 Kb8 2.Kd7+ Ka7 3.Kc8 Ka8 4.Bb8 b5
62389 7: 0.2 +0.19 1.Bg3 Kb8 2.Kd7+ Ka7 3.Kc8 Ka8 4.Bb8 b5
80678 8 0.3 +0.20 1.Bg3 Kb8 2.Kd7+ Ka7 3.Kc8 Ka8 4.Bb8 b5
5.a7 b4
145606 8: 0.4 +0.20 1.Bg3 Kb8 2.Kd7+ Ka7 3.Kc8 Ka8 4.Bb8 b5
5.a7 b4
175886 9 0.5 +0.20 1.Bg3 Kb8 2.Kd7+ Ka7 3.Kc8 Ka8 4.a4 Ka7
5.Bb8+ Ka8 6.a7 b5
305102 9: 0.7 +0.20 1.Bg3 Kb8 2.Kd7+ Ka7 3.Kc8 Ka8 4.a4 Ka7
5.Bb8+ Ka8 6.a7 b5
381384 10 0.8 +0.20 1.Bg3 Kb8 2.Kd7+ Ka7 3.Kc6 Ka8 4.Bf2
Ka7 5.a4 Kb8 6.a7+ Ka8
638300 10: 1.2 +0.20 1.Bg3 Kb8 2.Kd7+ Ka7 3.Kc6 Ka8 4.Bf2
Ka7 5.a4 Kb8 6.a7+ Ka8
748613 11 1.4 +0.20 1.Bg3 Kb8 2.Kd7+ Ka7 3.Kc6 Ka8 4.Bf2
Ka7 5.Kb5 Ka8 6.a4 Kb8
1194167 11: 2.0 +0.20 1.Bg3 Kb8 2.Kd7+ Ka7 3.Kc6 Ka8 4.Bf2
Ka7 5.Kb5 Ka8 6.a4 Kb8
1505765 12 2.6 +0.20 1.Bg3 Kb8 2.Kd7+ Ka7 3.Kc6 Ka8 4.Kc7
Ka7 5.Kc8 Ka8 6.Bb8 b5 7.a7 b4
2259774 12: 3.6 +0.20 1.Bg3 Kb8 2.Kd7+ Ka7 3.Kc6 Ka8 4.Kc7
Ka7 5.Kc8 Ka8 6.Bb8 b5 7.a7 b4
2894063 13 5.0 +0.20 1.Bg3 Kb8 2.Kd7+ Ka7 3.Kc6 Ka8 4.Kc7
Ka7 5.Kc8 Ka8 6.a4 Ka7 7.Bb8+ Ka8 8.a7
b5
3035733 13 5.1 +0.20 1.Kc6 Kb8 2.a4 Ka7 3.Kb5 Ka8 4.Bf2 Kb8
5.a7+ Kb7 6.Bd4 Ka8 7.Ka6 b5
4195838 13: 6.8 +0.20 1.Kc6 Kb8 2.a4 Ka7 3.Kb5 Ka8 4.Bf2 Kb8
5.a7+ Kb7 6.Bd4 Ka8 7.Ka6 b5
5935338 14 9.9 +0.20 1.Kc6 Kb8 2.Bg3+ Ka7 3.Kb5 Ka8 4.a4 Ka7
5.Bf4 Ka8 6.Be3 Kb8 7.Kc6 Ka8 8.a7 b5
8741050 14: 14.3 +0.20 1.Kc6 Kb8 2.Bg3+ Ka7 3.Kb5 Ka8 4.a4 Ka7
5.Bf4 Ka8 6.Be3 Kb8 7.Kc6 Ka8 8.a7 b5
Code: Select all
analyze
********* Starts iterative deepening, thread = 0
set timer to infinite
25 1: 0.0 +1.46 1.Kc6
120 2 0.0 :-(
211 2: 0.0 +0.19 1.Kc6 Kb8
559 3: 0.0 +0.18 1.Kc6 Kb8 2.Bg3+ Ka7
1363 4: 0.0 +0.19 1.Kc6 Kb8 2.Kb5 Ka7
4083 5: 0.0 +0.19 1.Kc6 Kb8 2.Bg3+ Ka8 3.Kb5 Ka7
5038 6 0.0 +0.19 1.Kc6 Kb8 2.Kb5 Ka8 3.Bf2 Ka7
9017 6: 0.0 +0.19 1.Kc6 Kb8 2.Kb5 Ka8 3.Bf2 Ka7
11984 7 0.1 +0.19 1.Kc6 Kb8 2.Bg3+ Ka7 3.Kb5 Ka8 4.Bc7
Ka7
14472 7 0.1 +0.19 1.Bg3 Kb8 2.Kd7+ Ka7 3.Kc8 Ka8 4.Bb8 b5
19252 7 0.1 +0.20 1.a4 Kb8 2.Bf2 Ka8 3.a7 Kb7 4.Bd4 Ka8
22091 7: 0.1 +0.20 1.a4 Kb8 2.Bf2 Ka8 3.a7 Kb7 4.Bd4 Ka8
25991 8 0.2 +0.19 1.a4 Kb8 2.Bf2 Ka8 3.Kc6 Kb8 4.Kb5 Ka7
29635 8 0.2 +0.19 1.Kc6 Kb8 2.Bf2 Ka8 3.Kb5 Kb8 4.Bd4 Ka7
41888 8: 0.2 +0.19 1.Kc6 Kb8 2.Bf2 Ka8 3.Kb5 Kb8 4.Bd4 Ka7
49934 9 0.2 +0.20 1.Kc6 Kb8 2.Bg3+ Ka7 3.Kb5 Ka8 4.Bc7
Ka7 5.a4 Ka8
57226 9 0.3 +0.20 1.a4 Kb8 2.Bf2 Ka8 3.Kc6 Kb8 4.a7+ Ka8
5.Bd4 b5
80054 9: 0.3 +0.20 1.a4 Kb8 2.Bf2 Ka8 3.Kc6 Kb8 4.a7+ Ka8
5.Bd4 b5
96398 10 0.4 +0.20 1.a4 Kb8 2.Kc6 Ka7 3.Kb5 Ka8 4.Bg3 Ka7
5.Bc7 Ka8
159552 10: 0.5 +0.20 1.a4 Kb8 2.Kc6 Ka7 3.Kb5 Ka8 4.Bg3 Ka7
5.Bc7 Ka8
181370 11 0.5 +0.20 1.a4 Kb8 2.Bf2 Ka8 3.Kc6 Kb8 4.a7+ Ka8
5.Kb5 Kb7 6.Bd4 Ka8
259037 11: 0.7 +0.20 1.a4 Kb8 2.Bf2 Ka8 3.Kc6 Kb8 4.a7+ Ka8
5.Kb5 Kb7 6.Bd4 Ka8
331581 12 0.9 +0.20 1.a4 Kb8 2.Bf2 Ka8 3.Kc6 Kb8 4.Kb5 Ka8
5.Bg1 Kb8 6.Be3 Ka8
621119 12: 1.5 +0.20 1.a4 Kb8 2.Bf2 Ka8 3.Kc6 Kb8 4.Kb5 Ka8
5.Bg1 Kb8 6.Be3 Ka8
682369 13 1.6 +0.20 1.a4 Kb8 2.Bf2 Ka8 3.Kc6 Kb8 4.a7+ Ka8
5.Kb5 Kb7 6.Bd4 Ka8 7.Ka6 b5
864100 13: 2.0 +0.20 1.a4 Kb8 2.Bf2 Ka8 3.Kc6 Kb8 4.a7+ Ka8
5.Kb5 Kb7 6.Bd4 Ka8 7.Ka6 b5
1262935 14 2.8 +0.20 1.a4 Kb8 2.Bf2 Ka8 3.Kc6 Kb8 4.Kb5 Ka8
5.Bg1 Kb8 6.Be3 Ka8 7.Bd4 Kb8
2723616 14: 5.8 +0.20 1.a4 Kb8 2.Bf2 Ka8 3.Kc6 Kb8 4.Kb5 Ka8
5.Bg1 Kb8 6.Be3 Ka8 7.Bd4 Kb8
2820170 15 6.0 +0.20 1.a4 Kb8 2.Bf2 Ka8 3.Kc6 Kb8 4.Kb5 Ka8
5.Bg1 Kb8 6.Be3 Ka8 7.Kc6 Kb8 8.a7+ Ka8
4174275 15: 9.4 +0.20 1.a4 Kb8 2.Bf2 Ka8 3.Kc6 Kb8 4.Kb5 Ka8
5.Bg1 Kb8 6.Be3 Ka8 7.Kc6 Kb8 8.a7+ Ka8
4304496 16 9.7 +0.20 1.a4 Kb8 2.Bf2 Ka8 3.Kc6 Kb8 4.Kb5 Ka8
5.Bg1 Kb8 6.Be3 Ka8 7.Bf4 Ka7 8.Bc7 Ka8
5028243 16 11.0 +0.20 1.a3 Kb8 2.Kd7 Ka7 3.Kc8 Ka8 4.Bg3 b5
5.Be5 Ka7 6.Kc7 Ka8 7.Kc6 Ka7 8.Bd4+
Kb8 9.a7+ Ka8
5805792 16: 12.9 +0.20 1.a3 Kb8 2.Kd7 Ka7 3.Kc8 Ka8 4.Bg3 b5
5.Be5 Ka7 6.Kc7 Ka8 7.Kc6 Ka7 8.Bd4+
Kb8 9.a7+ Ka8
6107334 17 13.5 +0.20 1.a3 Kb8 2.Kd7 Ka7 3.Kc8 Ka8 4.Bg3 b5
5.Be5 Ka7 6.Kc7 Ka8 7.Kd6 Kb8 8.Kc6+
Ka7 9.Bd4+ Kb8 10.a7+ Ka8
7571773 17: 17.3 +0.20 1.a3 Kb8 2.Kd7 Ka7 3.Kc8 Ka8 4.Bg3 b5
5.Be5 Ka7 6.Kc7 Ka8 7.Kd6 Kb8 8.Kc6+
Ka7 9.Bd4+ Kb8 10.a7+ Ka8
8282772 18 18.8 +0.20 1.a3 Kb8 2.Kd7 Ka7 3.Kc8 Ka8 4.Bg3 b5
5.Be5 Ka7 6.Kc7 Ka8 7.Kc6 Ka7 8.Bd4+
Kb8 9.a7+ Ka8 10.Kb6 b4
Re: Position crafty and stockfish both badly mis-evaluate
jwes wrote:
I was thinking that too, but I can't construct a position with B and 2 a pawns vs b pawn where that works.

[D]8/2K3B1/k7/Pp6/8/P7/8/8 w - - 0 1
Bd4 wins: if Kxa5 then Kb7, and if b4 then axb4.
Miguel
Re: Position crafty and stockfish both badly mis-evaluate
Quote:
Gaviota was supposed to know all this but this position is very tricky for other reasons.
First of all, white can force black to take the bishop, and if the program does not recognize KPPKP with two rook pawns as draw, it will keep giving a high positive score.
Second, if there is no detection of stalemate in quies, the search can wisely direct the PV to make sure that always the last quies move is taking the b pawn with stalemate. The evaluation will be with a winning score (since the pawn now is in the "b" column).
Third, if the futility margin is not big enough in quies(), it will make the whole thing worst, particularly with an evil interaction with the hashtables. I needed to correct all this three things, and now it works:

Thanks!!!!
GnuChess was exhibiting the same problems as the other engines. So I was going to start a debugging session but now you have explained it all!
EDIT: Now how do you check efficiently for stalemate during quiescence search?
Re: Position crafty and stockfish both badly mis-evaluate
Michel wrote:
Thanks!!!!
GnuChess was exhibiting the same problems as the other engines. So I was going to start a debugging session but now you have explained it all!
EDIT: Now how do you check efficiently for stalemate during quiescence search?

I did not say I do this efficiently.
What I do now is very crude, but I plan to improve it. I cannot prove stalemate efficiently, but it is easy to prove "no stalemate" efficiently "most of the time". For instance, with bitboard variables I do:
Code: Select all
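if ((king_moves[kingsquare] & ~opponent_attacks & ~mypieces) == 0) {
    /* no king move available: stalemate is possible, so do the full check */
    stalemate = full_check_for_stalemate(); /* expensive but I rarely need to do this */
} else {
    /* the king has a safe square, so this cannot be stalemate */
    stalemate = FALSE;
}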
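The cheap "no stalemate" pre-filter described above can be written as a self-contained sketch. This is not Gaviota's code: the Bitboard typedef, the a1 = 0 ... h8 = 63 numbering, and the helper names bit, king_moves_from and king_can_move are assumptions made for the example. It also assumes that opponent_attacks contains every square the opponent attacks, including squares occupied by the opponent's own defended pieces, so that a protected piece next to the king does not count as an escape.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Bitboard;

/* Assumed numbering: a1 = 0 ... h8 = 63; bit i stands for square i. */
static Bitboard bit(int sq) { return (Bitboard)1 << sq; }

/* All squares a king on sq could step to on an empty board. */
static Bitboard king_moves_from(int sq) {
    Bitboard b = 0;
    int f = sq & 7, r = sq >> 3;
    for (int df = -1; df <= 1; df++) {
        for (int dr = -1; dr <= 1; dr++) {
            int nf = f + df, nr = r + dr;
            if ((df != 0 || dr != 0) && nf >= 0 && nf < 8 && nr >= 0 && nr < 8)
                b |= bit(nr * 8 + nf);
        }
    }
    return b;
}

/*
 * Returns true when the king has at least one square that is neither
 * attacked by the opponent nor occupied by a friendly piece. In that
 * case the side to move has a legal move and the position cannot be
 * stalemate, so the expensive full check can be skipped.
 */
static bool king_can_move(int king_sq, Bitboard opponent_attacks,
                          Bitboard my_pieces) {
    return (king_moves_from(king_sq) & ~opponent_attacks & ~my_pieces) != 0;
}
```

As a usage example, take a black king on a8 (square 56) facing a white pawn on a7 and a white king on b6: a7 (48), b7 (49) and b8 (57) are all attacked, so king_can_move returns false and only then would a full stalemate test be needed.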
A couple of years ago it took me several days of debugging to understand a similar (but more complex) position posted by Uri. The interaction with the hashtable is really nasty. I thought I eliminated most problems with "fail hard" in quies(), but obviously I did not.
Miguel
Re: Position crafty and stockfish both badly mis-evaluate
Your example does not "count." He implicitly meant a position with white to move where white does not have an immediate axb capture and the black king is on a7, a8, b7 or b8. Other examples are not interesting.
A 2nd black pawn on b5 would sort of count. But with only one black b-pawn, it does not seem possible.
Re: Position crafty and stockfish both badly mis-evaluate
Critter evaluates this position as "almost draw"
Code: Select all
2k5/8/Pp1K4/8/7B/8/P7/8 w - -
Engine: Critter 0.80 32-bit (128 MB)
by Richard Vida
24/70 0:27 +0.05 1.Kc6 Kb8 2.Bg3+ Ka7 3.Kc7 Ka8 4.Kd7 Ka7
5.Kc8 Ka8 6.Bb8 b5 7.a7 b4 8.Kd7 Kb7
9.Kd6 Ka8 10.Kc5 Kb7 11.Kb5 Ka8
12.Be5 Kxa7 13.Bd4+ Kb7 14.Kxb4 (32.469.029) 1186
25/70 0:32 +0.05 1.Kc6 Kb8 2.Bg3+ Ka7 3.Kc7 Ka8 4.Kd7 Ka7
5.Kc8 Ka8 6.Bb8 b5 7.a7 b4 8.Kd7 Kb7
9.Kd6 Ka8 10.Kc5 Kb7 11.Kb5 Ka8
12.Be5 Kxa7 13.Bd4+ Kb7 14.Kxb4 (38.599.235) 1204
26/70 0:38 +0.05 1.Kc6 Kb8 2.Bg3+ Ka7 3.Kc7 Ka8 4.Kd7 Ka7
5.Kc8 Ka8 6.Bb8 b5 7.a7 b4 8.Kd7 Kb7
9.Kd6 Ka8 10.Kc5 Kb7 11.Kb5 Ka8
12.Be5 Kxa7 13.Bd4+ Kb7 14.Kxb4 (46.922.983) 1224
27/70 0:46 +0.05 1.Kc6 Kb8 2.Bg3+ Ka7 3.Kc7 Ka8 4.Kd7 Ka7
5.Kc8 Ka8 6.Bb8 b5 7.a7 b4 8.Kd7 Kb7
9.Kd6 Ka8 10.Kc5 Kb7 11.Kb5 Ka8
12.Be5 Kxa7 13.Bd4+ Kb7 14.Kxb4 (58.018.378) 1247
28/70 0:56 +0.05 1.Kc6 Kb8 2.Bg3+ Ka7 3.Kc7 Ka8 4.Kd7 Ka7
5.Kc8 Ka8 6.Bb8 b5 7.a7 b4 8.Kd7 Kb7
9.Kd6 Ka8 10.Kc5 Kb7 11.Kb5 Ka8
12.Be5 Kxa7 13.Bd4+ Kb7 14.Kxb4 (70.335.746) 1255
29/70 1:09 +0.05 1.Kc6 Kb8 2.Bg3+ Ka7 3.Kc7 Ka8 4.Kd7 Ka7
5.Kc8 Ka8 6.Bb8 b5 7.a7 b4 8.Kd7 Kb7
9.Kd6 Ka8 10.Kc5 Kb7 11.Kb5 Ka8
12.Be5 Kxa7 13.Bd4+ Kb7 14.Bf2 (88.024.778) 1265
30/70 1:29 +0.05 1.Kc6 Kb8 2.Bg3+ Ka7 3.Kc7 Ka8 4.Kd7 Ka7
5.Kc8 Ka8 6.Bb8 b5 7.a7 b4 8.Kd7 Kb7
9.Kd6 Ka8 10.Kc5 Kb7 11.Kb5 Ka8
12.Be5 Kxa7 13.Bd4+ Kb7 14.Kxb4 (113.074.690) 1259
best move: Kd6-c6 time: 1:43.235 min n/s: 1.267.947 nodes: 130.677.248