SF misevaluating pawn endings
Moderator: Ras
-
- Posts: 436
- Joined: Thu Aug 02, 2012 7:48 pm
- Location: Germany
Re: SF misevaluating pawn endings
I could use 5-men TBs, they are installed, but I wanted to show that something is wrong with how SF evaluates some pawn endings without the help of third-party software... Komodo, for example, has no problems with the 2 positions I posted, with or without TB access, but that's another story of course.
Truths are illusions of which we have forgotten that they are illusions.
-
- Posts: 436
- Joined: Thu Aug 02, 2012 7:48 pm
- Location: Germany
Re: SF misevaluating pawn endings
Well, "wrong" is a harsh word... I'm sorry for that, excuse me... I should have said that there is room for improvement in how SF evaluates pawn endings... full credit to the SF team!
-
- Posts: 1766
- Joined: Wed Jun 03, 2009 12:14 am
Re: SF misevaluating pawn endings
Spliffjiffer wrote:
you both tested under different conditions than I did:
jouni used multipv and too little time for pos2... SF will change from Nxe3 to Nd5 later on, misevaluating the pawn ending
michael used tb's
yanquis1972 wrote:
confirm: no TBs, single PV, latest SF likes Kd4, very weird
MikeB wrote:
actually I'm not seeing that with multiple threads - single pv, no tb, it finds and sticks with f4, and yes, I do see Kd4 in single thread mode
so probably the wider search from multiple cores led to a 'lucky' hit, but as joerg says the issue is still there.
ofc the big question is, is it actually an issue? yes, 5-men syzygy are cheap, but should engines be absolutely reliant on them? i'm sure some would say yes, if it means speed/elo gains; personally, perhaps even especially when it's an engine as widely used as SF, i think it's a problem that should be rectified.
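Since the results above differ depending on threads, MultiPV and TB access, it may help to pin the test conditions down as a fixed list of UCI commands. The following is only a sketch: `uciAnalysisScript` is a hypothetical helper, not part of any engine or GUI, and it assumes the engine accepts the standard `Threads` and `MultiPV` options. No `SyzygyPath` is set, so the engine searches without tablebases.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: builds the UCI commands for a single-PV analysis
// of one position with a chosen number of threads and a fixed move time.
// SyzygyPath is deliberately left untouched, i.e. no tablebases.
std::vector<std::string> uciAnalysisScript(const std::string& fen,
                                           int threads,
                                           long long movetimeMs) {
    return {
        "uci",
        "setoption name Threads value " + std::to_string(threads),
        "setoption name MultiPV value 1",
        "ucinewgame",
        "position fen " + fen,
        "go movetime " + std::to_string(movetimeMs),
    };
}
```

Calling `uciAnalysisScript("8/1p4p1/5k2/5p2/8/3K3P/5PP1/8 w - - 0 1", 1, 900000)` gives the single-thread, 15 minute setup for position 1; changing only the `threads` argument gives the SMP run, so the two results stay comparable.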
-
- Posts: 3393
- Joined: Sat Feb 16, 2008 7:38 am
- Full name: Peter Martan
Re: SF misevaluating pawn endings
Hi Peer!
Very good findings of yours, thanks!
Spliffjiffer wrote:
2 simple pawn endings SF has problems with...
Engine: asmfish 04.11.2016; CPU: amd a10 7870k (quad core); TB's: None
[d]8/1p4p1/5k2/5p2/8/3K3P/5PP1/8 w - - 0 1
1.f4...only drawing move...after a 15 min search asmfish fails to find this, scoring 1.Kd4 as best with -4.49 at depth 48.
After forcing 1.f4 asmfish fails high with 1...b5, giving a score of over +18 for black at low depths...then quickly resolves to 0.00.
[d]8/6pk/7p/8/8/2npPP1P/1RnK2P1/8 b - - 0 1
1...Nxe3...only drawing move...after a 15 min search asmfish fails to find this, scoring 1...Nd5 as best with -2.47 at depth 43.
After 1...Nxe3 2.Kxc3 Nd1 3.Kb3 [here asmfish stubbornly tries to avoid the pawn endgame with 3...Ne3?] Nxb2 4.Kxb2 Kg6 5.Kc3 Kf5 6.g3 h5! (the only move, which seems to be the one that makes asmfish misevaluate the whole line) it's a draw... After 6.g3 SF fails high, scoring the position +41.34 for White at lower depths, until it settles on +0.08 after around 20 sec.
regards
While the first one seems to be a real half-blind spot in SF's search that sometimes does and sometimes doesn't show up with SMP, the second one might be a more difficult position, in which K10.2 and Fire5 only succeeded without the help of TBs in single trials. SF and H5 failed at short TC.
Edit: In the meantime I had a single success with SF too. 1...Nxe3 shows up in the output now and then right away but is soon dropped for ...Nd5; this time the right move was kept, but I couldn't reproduce it after restarting the GUI.
Peter.
-
- Posts: 3393
- Joined: Sat Feb 16, 2008 7:38 am
- Full name: Peter Martan
Re: SF misevaluating pawn endings
And the reason for finding or not finding 1...Nxe3 Peer has already spotted quite well in his drawing line after 1...Nxe3: 3...Nxb2 has to be found.
peter wrote:
...
Spliffjiffer wrote:
2 simple pawn endings SF has problems with...
the second one might be a more difficult position, in which K10.2 and Fire5 only succeeded without the help of TBs in single trials. SF and H5 failed at short TC.
Edit: In the meantime I had a single success with SF too. 1...Nxe3 shows up in the output now and then right away but is soon dropped for ...Nd5; this time the right move was kept, but I couldn't reproduce it after restarting the GUI.
1...Nxe3 2.Kxc3 Nd1+ 3.Kb3:
[d]8/6pk/7p/8/8/1K1p1P1P/1R4P1/3n4 b - - 0 1
Analysis by Brainfish 031216 64 POPCNT:
...
3...Ne3 4.Rd2 Nf5 5.Kc3 Ng3 6.Kxd3 h5 7.Kd4 Nf1 8.Rf2 Ng3 9.Ke5 Kg6 10.Rd2 Nf1 11.Rd6+ Kf7 12.Kf4 h4 13.Kg5 Ne3 14.Rd7+ Ke6 15.Rd4 Kf7 16.Rxh4 Nd5 17.Rf4+ Kg8 18.Rd4 Ne7 19.Rd7 Nc6 20.Rc7 Ne5 21.Kf5 Nd3 22.Rd7 Nc5 23.Rd5 Na4 24.Kg6 Kf8 25.Rh5 Nc3 26.Rh8+ Ke7 27.Rh4 Ke6 28.Rf4 Nb1 29.Re4+ Kd5 30.Kxg7 Kd6 31.h4 Na3 32.h5 Nb1 33.h6
+- (12.45) Depth: 39/69 00:03:39 7733MN
...
3...Ne3 4.Rd2
+- (11.87 --) Depth: 40/69 00:03:42 7815MN
And yes, TBs again help to find the simple-looking drawing move 3...Nxb2, but without them I would still call that at least a half-blind spot too.
Peter.
-
- Posts: 4660
- Joined: Sun Mar 12, 2006 2:40 am
- Full name: Eelco de Groot
Re: SF misevaluating pawn endings
Joerg posted on https://github.com/official-stockfish/S ... issues/760
I don't know how much extra code is involved.... If your point of view is that Syzygy bases should solve this, I don't think we have much chance. It depends on whether it can be done cleanly.
By the way, for Peer's position 1 I have an alternative suggestion that even lowers the bench number.
Old bench with Vondele's latest refactor patch:
bench: 5884767
New bench with modified latest development Stockfish:
Stockfish 20161205 MOD _02
===========================
Total time (ms) : 3112
Nodes searched : 5879059
Nodes/second : 1889157
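As a quick sanity check on the bench output above, the nodes-per-second figure follows from the other two numbers. This is a minimal sketch, assuming the figure is simply nodes times 1000 divided by elapsed milliseconds with integer truncation; `nodesPerSecond` is a name made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Nodes per second from a node count and elapsed milliseconds.
// Integer division truncates the result instead of rounding it.
std::uint64_t nodesPerSecond(std::uint64_t nodes, std::uint64_t elapsedMs) {
    if (elapsedMs == 0) return 0;   // avoid division by zero on very short runs
    return nodes * 1000 / elapsedMs;
}
```

With the values from the bench run, `nodesPerSecond(5879059, 3112)` gives 1889157, matching the reported line.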
It will always return something like this:
8/1p4p1/5k2/5p2/8/3K3P/5PP1/8 w - -
Engine: Stockfish 20161205 MOD _02 HT (7 threads, i7 6700, 512 MB)
by T. Romstad, M. Costalba, J. Kiiski, G. Linscott
30/55 0:01 0.00 1.f4 b5 2.Kd4 Ke6 3.g4 Kf7 4.Kc5 g5
5.Kxb5 gxf4 6.Kc4 fxg4 7.hxg4 Kf6
8.Kd3 Kg5 9.Ke2 Kxg4 10.Kf2 f3
11.Kf1 f2 12.Kxf2 Kf4 13.Ke2 Ke4
14.Kf2 (28.197.992) 19857
31/55 0:01 0.00 1.f4 b5 2.Kd4 Ke6 3.g4 Kf7 4.Kc5 g5
5.Kxb5 gxf4 6.Kc4 fxg4 7.hxg4 Kf6
8.Kd3 Kg5 9.Ke2 Kxg4 10.Kf2 f3
11.Kf1 f2 12.Kxf2 Kf4 13.Ke2 Ke4
14.Kf2 (33.953.160) 20114
32/55 0:01 0.00 1.f4 b5 2.Kd4 Ke6 3.g4 Kf7 4.Kc5 g5
5.Kxb5 gxf4 6.Kc4 fxg4 7.hxg4 Kf6
8.Kd3 Kg5 9.Ke2 Kxg4 10.Kf2 f3
11.Kf1 f2 12.Kxf2 Kf4 13.Ke2 Ke4
14.Kf2 (40.140.731) 20222
33/55 0:02 0.00 1.f4 b5 2.Kd4 Ke6 3.g4 Kf7 4.Kc5 g5
5.Kxb5 gxf4 6.Kc4 fxg4 7.hxg4 Kf6
8.Kd3 Kg5 9.Ke2 Kxg4 10.Kf2 f3
11.Kf1 f2 12.Kxf2 Kf4 13.Ke2 Ke4
14.Kf2 (57.295.204) 20706
34/55 0:03 0.00 1.f4 b5 2.Kd4 Ke6 3.g4 Kf7 4.Kc5 g5
5.Kxb5 gxf4 6.Kc4 fxg4 7.hxg4 Kf6
8.Kd3 Kg5 9.Ke2 Kxg4 10.Kf2 f3
11.Kf1 Kg3 12.Kg1 f2+ 13.Kf1 Kh2
14.Kxf2 (80.122.010) 21286
35/55 0:05 0.00 1.f4 b5 2.Kd4 Ke6 3.g4 Kf7 4.Kc5 g5
5.Kxb5 gxf4 6.Kc4 fxg4 7.hxg4 Kf6
8.Kd3 Kg5 9.Ke2 Kxg4 10.Kf2 f3
11.Kf1 Kg3 12.Kg1 f2+ 13.Kf1 Kh2
14.Kxf2 (113.809.223) 22407
36/55 0:09 0.00 1.f4 b5 2.Kd4 Ke6 3.g4 Kf7 4.Kc5 g5
5.Kxb5 gxf4 6.Kc4 fxg4 7.hxg4 Kf6
8.Kd3 Kg5 9.Ke2 Kxg4 10.Kf2 f3
11.Kf1 Kg3 12.Kg1 f2+ 13.Kf1 Kh2
14.Kxf2 (234.291.665) 24842
37/55 0:10 0.00 1.f4 b5 2.Kd4 Ke6 3.g4 Kf7 4.Kc5 g5
5.Kxb5 gxf4 6.Kc4 fxg4 7.hxg4 Kf6
8.Kd3 Kg5 9.Ke2 Kxg4 10.Kf2 f3
11.Kf1 Kg3 12.Kg1 f2+ 13.Kf1 Kh2
14.Kxf2 (253.566.325) 24964
38/55 0:12 0.00 1.f4 b5 2.Kd4 Ke6 3.g4 Kf7 4.Kc5 g5
5.Kxb5 gxf4 6.Kc4 fxg4 7.hxg4 Kf6
8.Kd3 Kg5 9.Ke2 Kxg4 10.Kf2 f3
11.Kf1 Kg3 12.Kg1 f2+ 13.Kf1 Kh2
14.Kxf2 (304.774.696) 25387
39/55 0:19 0.00 1.f4 b5 2.Kd4 Ke6 3.g4 Kf7 4.Kc5 g5
5.Kxb5 gxf4 6.Kc4 fxg4 7.hxg4 Kf6
8.Kd3 Kg5 9.Ke2 Kxg4 10.Kf2 f3
11.Kf1 Kg3 12.Kg1 f2+ 13.Kf1 Kh2
14.Kxf2 (516.614.250) 27164
40/55 1:34 0.00 1.f4 b5 2.Kd4 Ke6 3.g4 Kf7 4.Kc5 g5
5.Kxb5 gxf4 6.Kc4 fxg4 7.hxg4 Kf6
8.Kd3 Kg5 9.Ke2 Kxg4 10.Kf2 f3
11.Kf1 Kg3 12.Kg1 f2+ 13.Kf1 Kh2
14.Kxf2 (2.782.650.592) 29573
41/55 4:04 0.00 1.f4 b5 2.Kd4 Ke6 3.g4 Kf7 4.Kc5 g5
5.Kxb5 gxf4 6.Kc4 fxg4 7.hxg4 Kf6
8.Kd3 Kg5 9.Ke2 Kxg4 10.Kf2 f3
11.Kf1 Kg3 12.Kg1 f2+ 13.Kf1 Kh2
14.Kxf2 (7.373.300.745) 30140
42/55 4:09 0.00 1.f4 b5 2.Kd4 Ke6 3.g4 Kf7 4.Kc5 g5
5.Kxb5 gxf4 6.Kc4 fxg4 7.hxg4 Kf6
8.Kd3 Kg5 9.Ke2 Kxg4 10.Kf2 f3
11.Kf1 Kg3 12.Kg1 f2+ 13.Kf1 Kh2
14.Kxf2 (7.490.207.986) 30054
43/55 4:31 0.00 1.f4 b5 2.Kd4 Ke6 3.g4 Kf7 4.Kc5 g5
5.Kxb5 gxf4 6.Kc4 fxg4 7.hxg4 Kf6
8.Kd3 Kg5 9.Ke2 Kxg4 10.Kf2 f3
11.Kf1 Kg3 12.Kg1 f2+ 13.Kf1 Kh2
14.Kxf2 (8.122.271.067) 29967
Joerg wrote:
@Stefano80 @Kingdefender But will this also be helpful in positions with more pawns like the one mentioned above? I am not sure, maybe I will give it a try tomorrow. It should be easy enough to try.
I'd rather reply here now because I think Rocky's issue is getting a bit sidetracked. This is a very different position. I don't think treating KPk bitbase evals differently will help a lot here. Your suggestion for changing LMP will, but unfortunately you lose some speed, I think. Stefano's suggestion is a more logical change because you make better use of exact results in the KPk bitbase. I guess that in positions where you hit KPk it should even be a small speedup, because you don't need to search a drawn position anymore (only at the root, to get a move), but not by so much that Marco will be satisfied. It should leave LMR intact, though.
Code:
// Step 13. Pruning at shallow depth
if (   !rootNode
    && !(pos.count<ALL_PIECES>(WHITE) + pos.count<ALL_PIECES>(BLACK) <= 5) // New condition
    /* && !(PvNode && !pos.non_pawn_material(pos.side_to_move())) */
    && bestValue > VALUE_MATED_IN_MAX_PLY)
{
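To see when the new condition above would switch shallow pruning off, the piece count can be reproduced outside the engine from a FEN string. This is only an illustration under stated assumptions: `pieceCount` and `shallowPruningAllowed` are hypothetical names, and the second function mirrors only the piece-count part of the condition, not the `rootNode` or `bestValue` checks.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Counts all pieces (kings and pawns included) in the board field of a FEN.
int pieceCount(const std::string& fen) {
    int n = 0;
    for (char c : fen) {
        if (c == ' ') break;  // the board field ends at the first space
        if (std::isalpha(static_cast<unsigned char>(c))) ++n;
    }
    return n;
}

// Hypothetical stand-in for the new condition only:
// shallow pruning stays enabled while more than 5 pieces are on the board.
bool shallowPruningAllowed(const std::string& fen) {
    return !(pieceCount(fen) <= 5);
}
```

Peer's position 1 has 8 pieces, so at the root nothing changes; the condition only takes effect deeper in the tree, once enough captures have brought the position down to 5 pieces or fewer.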
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan