Stockfish endgame evaluation problem

Discussion of chess software programming and technical issues.

Moderator: Ras

jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: Stockfish endgame evaluation problem

Post by jwes »

zamar wrote:
jwes wrote: Stockfish evaluates this position as +8.20 for White. This seems like a very optimistic evaluation.
White has an extra piece and a passed pawn on the 7th rank. With very few exceptions this is always winning, so I see nothing wrong with the evaluation.
One problem is that it evaluates the first position higher than this one at low depths, though the eval quickly rises. This suggests a possible solution: if the side that is ahead has pawns, the PV does not reset the 50-move counter for a long time, and the eval does not change significantly, make the evaluation more drawish.
[d]5k2/6p1/8/5P2/4K3/5R2/8/8 w - - 0 1
zamar wrote:
It also seems that there should be some way the search could see that it is not won. There are fewer than 20,000 relevant positions (bk on f8 or e7, white pieces anywhere; or, without the wp on f7, bk on f8, g8 or h8, white pieces anywhere), and there should be some way the program can see that it cannot force any other positions.
[d]5k2/5Pp1/6P1/8/4K3/8/2B5/8 b - - 0 36
For a human it's easy to detect these kinds of fortress positions. But for computers... nope.
lech
Posts: 1169
Joined: Sun Feb 14, 2010 10:02 pm

Re: Stockfish endgame evaluation problem

Post by lech »

I modified my deep idea, no tweak, and now I have my first real chess engine. :lol:

[d] 1r6/4k3/r2p2p1/2pR1p1p/2P1pP1P/pPK1P1P1/P7/1B6 b - - 0 1


Sfx:
   1	00:00	         456	2.425	+2,86	Ra6a8
   2	00:00	         505	2.686	+2,58	Ra6a8 Rd5d1
   3	00:00	         648	3.446	+2,58	Ra6a8 Rd5d1 Ke7e6
   4	00:00	         869	4.622	+2,50	Ra6a8 Rd5d1 Ke7e6 Bb1c2
   4	00:00	       1.192	6.340	+2,54	Ra6a7 Rd5d1 Ke7e6 Bb1c2
   5	00:00	       1.555	7.622	+2,46	Ra6a7 Rd5d1 Ke7e6 Bb1c2 Ra7b7
   6+	00:00	       2.131	10.446	+2,58	Ra6a7 Rd5d1 Ke7e6 Bb1c2 Ra7b7
   6	00:00	       2.757	13.514	+2,58	Ra6a7 Rd5d2 Ke7e6 Bb1c2 Ra7b7 Rd2d1
   6	00:00	       3.456	15.780	+2,62	Ke7e6 Bb1c2 Ra6a8 Rd5d1 Rb8b7 Rd1d2
   7	00:00	       4.662	19.838	+2,54	Ke7e6 Bb1c2 Ra6a8 Rd5d1 Rb8b7 Rd1d2 Ra8b8 Rd2d1
   8	00:00	       6.901	27.604	+2,62	Ke7e6 Bb1c2 Ra6a7 Rd5d2 Ra7b7 Rd2d1 Rb7c7 Rd1d2
   9-	00:00	       9.676	38.704	+2,50	Ke7e6 Bb1c2 Ra6a7 Rd5d2 Ra7b7 Rd2d1 Rb7c7 Rd1d2
   9	00:00	      12.563	47.229	+2,50	Ke7e6 Bb1c2 Ra6a7 Rd5d2 Ra7b7 Rd2d1 Rb7c7 Rd1d2 Rc7a7 Rd2d1 Ra7b7 Rd1d2 Rb7a7
  10	00:00	      14.733	55.387	+2,54	Ke7e6 Bb1c2 Ra6a7 Rd5d2 Ra7b7 Rd2d1 Rb7c7 Rd1d2 Rc7a7 Rd2d1 Ra7b7 Rd1d2 Rb7a7
  11	00:00	      21.428	72.148	+2,58	Ke7e6 Bb1c2 Ra6a7 Rd5d2 Ra7b7 Rd2d1 Rb8f8 Rd1d2 Rf8c8 Bc2d1 Rc8b8
  12	00:00	      39.260	114.127	+2,54	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8a8 Rd5d1 Ra8e8 Kc3c2 Re8b8 Kc2c3 Rb8a8
  13+	00:00	      56.696	145.002	+2,62	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8a8 Rd5d1 Ra8e8 Kc3c2 Re8c8 Kc2c3 Rc8a8 Rd1d2 Ra8h8
  13-	00:00	      72.140	170.947	+2,46	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8a8 Rd5d1 Ra8e8 Kc3c2 Re8c8 Kc2c3 Rc8a8 Rd1d2 Ra8h8 Rd2d1 Rh8e8
  13	00:00	      98.079	209.123	+2,46	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8a8 Rd5d1 Rb7a7 Rd1d2 Ra8h8 Be2f1 Rh8b8
  14	00:00	     117.423	220.719	+2,50	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8a8 Rd5d1 Rb7a7 Rd1d2 Ra8h8 Rd2d5 Ra7b7 Rd5d1 Rh8e8 Kc3c2 Re8c8 Kc2c3 Rc8a8 Rd1d2 Ra8h8 Rd2d1
  15	00:00	     163.093	267.365	+2,46	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8a8 Rd5d1 Ra8h8 Kc3c2 Rb7f7 Be2f1 Rh8b8 Bf1g2 Rf7b7
  16	00:00	     227.115	309.000	+2,46	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Rd5d2 Rb7f7 Rd2d1 Rf7d7 Rd1d2 Rh8b8 Rd2d1 Rd7a7 Rd1d5 Rb8h8 Rd5d1 Ra7d7
  17-	00:00	     310.982	349.025	+2,38	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Rd5d2 Rb7f7 Rd2d1 Rf7d7 Rd1d2 Rh8b8 Rd2d1 Rd7a7 Rd1d5 Rb8a8 Rd5d1 Ra7d7 Rd1d5
  17	00:01	     557.722	414.971	+2,34	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Rd5d2 Rb7f7 Rd2d1 Rf7d7 Rd1d2 Rh8b8 Rd2d1 Rd7a7 Rd1d5 Rb8a8 Rd5d1 Ra7b7 Rd1d2 Ra8h8 Rd2d1 Rh8d8 Rd1d2 Rd8a8 Rd2d1 Ra8a7 Be2f1 Ra7a8 Bf1e2
  18	00:01	     814.220	453.099	+2,30	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Rd5d2 Rb7f7 Rd2d1 Rf7d7 Rd1d2 Rh8b8 Rd2d1 Rd7a7 Rd1d5 Rb8b7 Rd5d1 Rb7b8
  19	00:02	   1.094.037	486.238	+2,30	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Rd5d2 Rb7f7 Rd2d1 Rf7d7 Kc3c2 Rd7b7 Rd1d2 Rb7a7 Kc2c3 Ra7b7 Rd2d1 Rh8c8 Rd1d2 Ke6d7 Rd2d1 Kd7c6 Rd1d5 Rb7a7 Rd5d1 Rc8b8 Be2f1 Rb8f8 Bf1g2 Ra7b7 Bg2f1 Rb7a7
  20-	00:02	   1.503.296	517.129	+2,22	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Rd5d2 Rb7f7 Rd2d1 Rf7d7 Kc3c2 Rd7b7 Rd1d2 Rb7a7 Kc2c3 Ra7b7 Kc3c2
  20-	00:05	   2.461.435	478.785	+2,14	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Rd5d2 Ke6d7 Rd2d1 Rh8a8 Rd1d2 Kd7c6 Be2f1 Ra8h8 Rd2d1 Rh8d8 Rd1d5 Kc6d7 Bf1e2 Rd8h8 Rd5d1
  20	00:06	   3.224.027	485.473	+2,10	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Rd5d2 Ke6d7 Rd2d1 Rh8a8 Rd1d5 Kd7e6 Rd5d1 Ra8f8 Rd1d2 Rb7h7 Be2d1 Rf8c8 Bd1e2 Rh7f7 Rd2d5 Rf7a7 Rd5d1 Rc8a8 Rd1d2 Ra7d7 Rd2d1 Ra8b8 Rd1d5 Rd7h7 Rd5d2 Rb8h8 Rd2d1 Rh7f7 Rd1d5 Rh8h6 Rd5d1 Rh6h8
  21	00:07	   3.729.103	486.066	+2,10	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Rd5d2 Ke6d7 Rd2d1 Rh8a8 Rd1d2 Kd7c6 Rd2d1 Ra8a7 Rd1d5 Rb7d7 Rd5d1 Rd7b7
  22-	00:09	   4.545.615	499.847	+1,97	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Rd5d2 Ke6d7 Rd2d1 Rh8a8 Rd1d2 Kd7c6 Rd2d1 Ra8a7 Rd1d5 Rb7d7 Rd5d1 Rd7b7
  22-	00:11	   6.005.776	536.038	+1,85	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Rd5d2 Rb7a7 Rd2d1 Rh8d8 Rd1d5 Ra7d7 Rd5d1 Rd7a7
  22	00:14	   8.335.136	566.284	+1,77	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Rd5d2 Rb7a7 Rd2d5 Rh8h7 Rd5d1 Ra7b7 Rd1d2 Rb7b6 Rd2d5 Rb6c6 Rd5d1 Rh7a7 Rd1d2 Ra7e7 Rd2d5 Rc6b6 Rd5d2 Rb6b7 Rd2d5 Re7c7 Rd5d1 Rc7c8 Rd1d2 Rc8d8 Rd2d5 Rd8h8
  23-	00:21	  12.718.238	592.400	+1,45	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Rd5d1 Rh8h7 Rd1d5 Rh7h8
  23	00:32	  19.299.622	589.571	+1,29	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Be2d1 Rh8h6 Bd1e2 Rb7h7 Rd5d1 Rh7h8 Rd1c1 Rh8f8 Rc1b1 Rh6h8 Rb1d1 Rf8c8 Rd1d2 Rc8d8 Rd2d1 Ke6d7 Rd1d5 Rh8h7 Rd5d1 Rh7h8
  24	00:41	  24.686.153	601.866	+1,21	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Be2d1 Rh8h6 Bd1e2 Rb7h7 Rd5d1 Rh7h8 Kc3c2 Rh8f8 Rd1d2 Rf8d8 Kc2c3 Rd8a8 Rd2d1 Rh6h8 Rd1d2 Ra8b8 Rd2d5 Rb8b7
  25	00:54	  33.194.600	605.420	+1,13	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Be2d1 Rh8h6 Bd1e2 Rb7h7 Be2d1 Rh7f7 Bd1e2 Rh6h7 Kc3c2 Rf7a7 Rd5d1 Rh7h8 Kc2c3 Rh8c8 Rd1d2 Ra7a6 Rd2d5 Ke6d7 Rd5d2 Rc8b8 Rd2d1 Kd7c7 Rd1d5 Kc7b7 Rd5d2 Ra6c6 Rd2d5 Rb8e8 Rd5d1 Kb7a6 Rd1d5
  26	01:04	  38.759.575	597.588	+1,13	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Be2d1 Rh8h6 Bd1e2 Rb7h7 Be2d1 Rh7f7 Bd1e2 Rh6h7 Kc3c2 Rf7a7 Rd5d1 Rh7h8 Rd1d5 Ra7a5 Kc2c3 Ke6d7 Rd5d1 Kd7c7 Rd1d5 Rh8b8 Rd5d2
  27-	01:27	  51.098.264	582.104	+1,05	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8h8 Be2d1 Rh8h6 Bd1e2 Rb7h7 Be2d1 Rh7f7 Bd1e2 Rh6h7 Kc3c2 Rf7a7 Rd5d2 Rh7b7 Kc2c3 Rb7b8 Rd2d5 Ke6e7 Be2f1 Ra7a6 Bf1e2 Ke7f6 Rd5d2
  27-	01:52	  64.709.730	577.523	+0,96	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Ke6e7 Kc3c2 Rb7a7 Kc2c3 Rb8b4 Rd5d1 Rb4b6 Rd1d5 Ra7b7 Kc3c2 Rb7b8 Kc2c3 Rb8b7
  27	02:05	  72.185.049	574.817	+0,96	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Ke6e7 Kc3c2 Rb7a7 Kc2c3 Ra7d7 Be2d1 Ke7f8 Bd1e2 Kf8g7 Rd5d1 Kg7h8 Rd1d5 Rb8b6 Rd5d1 Kh8g8 Rd1d2 Kg8f8 Rd2d5 Rd7d8 Be2d1 Kf8g7 Rd5d2 Kg7f8 Rd2d5
  28-	02:35	  88.660.131	569.645	+0,80	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Ke6e7 Be2f1 Rb7b6 Bf1e2 Ke7f7 Be2d1 Kf7e6 Kc3c2 Rb6b5 Bd1e2 Rb5b6 Be2d1
  28	03:02	 101.405.074	556.308	+0,80	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Ke6e7 Be2f1 Rb7b6 Bf1e2 Ke7f7 Be2d1 Kf7e6 Kc3c2 Rb6b5 Bd1e2 Rb5b4 Be2d1 Rb4b7 Bd1e2 Ke6d7 Be2d1 Rb8e8 Kc2d2 Re8e6 Kd2c3 Kd7d8 Bd1e2 Kd8e7 Be2f1 Rb7d7 Bf1e2 Ke7d8 Be2f1 Kd8c7 Bf1e2 Kc7b6 Rd5d2 Kb6a5 Rd2d5 Rd7e7 Rd5d1 Re6f6 Rd1d5 Re7b7 Rd5d1 Rb7f7 Rd1d5 Rf6e6 Rd5d2 Ka5a6 Rd2d5 Ka6a5
  29	03:21	 110.888.247	550.740	+0,80	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Ke6e7 Be2f1 Rb7b6 Bf1e2 Ke7f7 Be2d1 Kf7e6 Kc3c2 Rb6b5 Bd1e2 Rb5b4 Kc2c3 Rb4b6 Kc3c2 Rb6b4
  30-	04:12	 137.626.903	544.686	+0,72	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Ke6e7 Kc3c2 Rb8d8 Kc2c3 Rd8d7 Be2d1 Rb7b6 Bd1e2 Rb6b4 Be2d1 Ke7d8 Kc3c2
  30-	05:01	 163.296.513	541.529	+0,64	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb7f7 Be2d1 Ke6d7 Bd1e2 Rf7g7 Be2d1 Rb8a8 Bd1e2 Kd7e6 Kc3c2 Ra8f8 Kc2c3 Rf8g8 Kc3d2 Rg8e8 Kd2c2 Ke6e7 Kc2c3 Ke7d7 Be2f1 Re8b8 Bf1e2 Kd7c7 Be2f1 Rb8h8 Bf1e2 Kc7c6 Be2d1 Rh8h7 Bd1e2 Kc6c7 Be2d1 Kc7b8 Bd1e2 Kb8c7
  30-	06:00	 194.307.189	538.315	+0,48	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb7f7 Be2d1 Ke6f6 Bd1e2 Rf7e7 Kc3c2 Rb8b6 Kc2c3 Kf6g7 Kc3c2 Kg7h6 Kc2c3 Re7e8 Kc3c2 Rb6c6 Rd5d1 Re8e6 Kc2c3 Re6e8 Rd1d2 Rc6c8 Rd2d5 Rc8a8 Rd5d1 Re8e6 Rd1d2 Re6e8 Rd2d1
  30-	08:01	 252.318.034	523.633	+0,16	Ke7e6 Bb1c2 Ra6a7 Bc2d1 Ra7b7 Bd1e2 Rb8e8 Kc3c2 Ke6d7 Be2f1 Rb7b8 Bf1e2 Rb8d8 Kc2c3 Kd7c8 Kc3c2 Kc8b7 Kc2c3 Rd8a8 Kc3c2 Re8f8 Kc2d2 Ra8d8 Kd2c2 Kb7a7 Kc2d2 Rf8f6 Kd2c2 Rf6e6 Kc2c3 Rd8d7 Rd5d2 Rd7d8 Rd2d5
  30+	09:33	 291.248.252	507.470	+0,88	Rb8xb3+ Kc3xb3 Ra6b6+ Kb3c2 Rb6b2+ Kc2c1 Rb2e2 Rd5d1 Re2xe3 Rd1g1 Re3c3+ Kc1d2 Rc3xc4 Bb1c2 d6d5 Rg1b1 d5d4 Kd2d1 Ke7d6 Rb1b6+ Kd6c7 Rb6xg6 d4d3 Bc2b3 Rc4b4 Kd1d2 c5c4 Bb3d1 Rb4b2+ Kd2c3 e4e3 Bd1xh5 e3e2 Rg6e6 Rb2xa2 Kc3xc4 Ra2d2 Re6e5 a3a2 Re5c5+ Kc7d6 Rc5d5+ Kd6e6 Rd5e5+ Ke6f6
  30+	10:12	 306.380.402	500.136	+0,96	Rb8xb3+ Kc3xb3 Ra6b6+ Kb3c2 Rb6b2+ Kc2c1 Rb2e2 Rd5d1 Re2xe3 Rd1g1 Re3c3+ Kc1d2 Rc3xc4 Bb1c2 d6d5 Rg1b1 d5d4 Kd2d1 Ke7d6 Rb1b6+ Kd6c7 Bc2b3 Rc4c3 Rb6xg6 c5c4 Bb3c2 Rc3f3 Bc2a4 c4c3 Rg6c6+ Kc7b7 Rc6c4 Rf3d3+ Kd1c1
  30	11:08	 328.039.848	490.423	+1,09	Rb8xb3+ Kc3xb3 Ra6b6+ Kb3c2 Rb6b2+ Kc2c1 Rb2e2 Rd5d1 Re2xe3 Rd1g1 Re3c3+ Kc1d2 Rc3xc4 Bb1c2 d6d5 Rg1b1 d5d4 Bc2d1 Rc4c3 Rb1b3 e4e3+ Kd2e2 Rc3c1 Rb3xa3 c5c4 Ra3a4 Ke7d6 a2a3 Kd6c5 Ra4a5+ Kc5b6 Ra5a4 d4d3+ Ke2xe3 Rc1xd1 Ra4xc4 Rd1g1 Ke3xd3 Rg1xg3+ Kd3d4 Rg3xa3 Rc4c1
No other engine is able to find 1...Rxb3.
Strange? :lol:
zamar
Posts: 613
Joined: Sun Jan 18, 2009 7:03 am

Re: Stockfish endgame evaluation problem

Post by zamar »

jwes wrote:
if the side that is ahead has pawns, the PV does not reset the 50-move counter for a long time, and the eval does not change significantly, make the evaluation more drawish.
I've thought of something like that, but the problem is that this breaks path invariance, which is one of the cornerstones of the TT (transposition table).
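
To make the path-invariance concern concrete, here is a minimal Python sketch. It is not Stockfish code. It assumes a transposition table keyed only by the position (the 50-move counter is not part of the key, which is common in engines) and a hypothetical `faded_eval` that fades the score toward a draw as the counter grows. All names, and the 20-ply starting point of the fade, are invented for the illustration.

```python
# Hypothetical illustration: a path-dependent score stored in a
# position-keyed transposition table (TT) can be reused on a path
# where it no longer applies.

FREE_PLIES = 20   # assumed: no fading for the first 20 plies
MAX_PLIES = 100   # 50-move rule = 100 plies without a capture or pawn move

tt = {}  # position key -> stored score in centipawns


def faded_eval(raw_cp, halfmove_clock):
    """Assumed path-dependent eval: linear fade to 0 between FREE_PLIES and MAX_PLIES."""
    clock = min(max(halfmove_clock, FREE_PLIES), MAX_PLIES)
    return int(raw_cp * (MAX_PLIES - clock) / (MAX_PLIES - FREE_PLIES))


def probe_or_eval(position_key, raw_cp, halfmove_clock):
    """Return the TT score if present, otherwise evaluate and store it."""
    if position_key in tt:
        return tt[position_key]
    score = faded_eval(raw_cp, halfmove_clock)
    tt[position_key] = score
    return score


key = "5k2/5Pp1/6P1/8/4K3/8/2B5/8 b"          # same position, two different paths
via_short_path = probe_or_eval(key, 820, 10)   # reached with the counter at 10 plies
via_long_path = probe_or_eval(key, 820, 60)    # reached with the counter at 60 plies
fresh_long_path = faded_eval(820, 60)          # what the long path would compute itself
```

In this sketch the second probe gets 820 back from the table, although the same position reached with a counter of 60 would score 410 under the fading scheme. The stored value depends on which path filled the table first.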

It also introduces a new horizon effect: the side with more material will use every trick to delay captures close to the leaves.

So in short: I don't believe in this idea.
Joona Kiiski
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: Stockfish endgame evaluation problem

Post by jwes »

zamar wrote:
jwes wrote:
if the side that is ahead has pawns, the PV does not reset the 50-move counter for a long time, and the eval does not change significantly, make the evaluation more drawish.
I've thought of something like that, but the problem is that this breaks path invariance, which is one of the cornerstones of the TT (transposition table).
Path invariance is largely an illusion when the 50-move rule is involved.
zamar wrote: It also introduces a new horizon effect: the side with more material will use every trick to delay captures close to the leaves.
It is the side with less material that would want to delay captures, and if it can delay captures that long then maybe there is something about the position. Also, pawn moves are likely more important than captures. My thought was to have no penalty for the first 10-20 plies and an increasing penalty thereafter.
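
A rough Python sketch of the penalty schedule described above: no penalty for the first 20 plies without a capture or pawn move, then a penalty that grows until the score reaches zero at 100 plies, where the 50-move rule applies. The linear ramp, the function name `drawish_eval`, and the parameter names are assumptions made for illustration. The sketch also leaves out the other conditions from the proposal (the side ahead has pawns, the eval is not changing significantly).

```python
def drawish_eval(raw_cp, plies_since_progress, free_plies=20, max_plies=100):
    """Scale a centipawn score toward 0 as the 50-move counter grows.

    raw_cp: static evaluation in centipawns.
    plies_since_progress: plies since the last capture or pawn move.
    free_plies: plies with no penalty (assumed value from the 10-20 range).
    max_plies: 100 plies, i.e. the 50-move rule.
    """
    if plies_since_progress <= free_plies:
        return raw_cp
    clock = min(plies_since_progress, max_plies)
    remaining = max_plies - clock
    # int() truncates toward zero, so negative scores also shrink toward 0.
    return int(raw_cp * remaining / (max_plies - free_plies))


# How a +8.20 evaluation would fade under this schedule.
schedule = [drawish_eval(820, plies) for plies in (0, 20, 40, 60, 80, 100)]
```

With these assumed parameters, `schedule` is `[820, 820, 615, 410, 205, 0]`.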
zamar wrote:So in short: I don't believe in this idea.
Do you have a better idea or are you saying that the problem is insoluble?
Uri Blass
Posts: 10889
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Stockfish endgame evaluation problem

Post by Uri Blass »

zamar wrote:
jwes wrote:
if the side that is ahead has pawns, the PV does not reset the 50-move counter for a long time, and the eval does not change significantly, make the evaluation more drawish.
I've thought of something like that, but the problem is that this breaks path invariance, which is one of the cornerstones of the TT (transposition table).

It also introduces a new horizon effect: the side with more material will use every trick to delay captures close to the leaves.

So in short: I don't believe in this idea.
I do not think that path invariance is a good idea.

Movei 10 10 10 (which has path-dependent evaluation) is something like 20 or 30 Elo better than the default version of Movei (which does not).

Note that I do not use path-dependent evaluation to detect fortresses but simply to detect progress.
I compare the static evaluation of the leaf with the static evaluation 2 plies earlier and give a bonus of 10 centipawns if there is progress,
20 centipawns if there is double progress (progress also from 4 plies earlier to 2 plies earlier), and 30 centipawns if there is triple progress (progress also from 6 plies earlier to 4 plies earlier).

Note that I used to think like you. Early versions of Movei, which did not use the hash for pruning, used path-dependent evaluation, but I made no path-dependent evaluation the default for Movei 00.8.438 because I did not believe the same idea could work when the hash is used for pruning. I was surprised to find that it does work.

It may not work for other programs because of a different hash implementation or a different evaluation.


From the CCRL 40/40 rating list:

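Rank  Name                        Rating    +    −   Score   AvOp   Draws  Games
1     Movei 00.8.438 (10 10 10)    2773   +13  −13   47.9%  +11.2   38.2%   2184
      Movei 00.8.438               2743   +19  −19   50.0%   +0.5   37.3%    871

Likelihood that the first entry is stronger than the second: 99.4%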

10 10 10 is better than the default with 99.4% confidence.


Uri
zamar
Posts: 613
Joined: Sun Jan 18, 2009 7:03 am

Re: Stockfish endgame evaluation problem

Post by zamar »

Uri Blass wrote: Note that I do not use path-dependent evaluation to detect fortresses but simply to detect progress.
I compare the static evaluation of the leaf with the static evaluation 2 plies earlier and give a bonus of 10 centipawns if there is progress,
20 centipawns if there is double progress (progress also from 4 plies earlier to 2 plies earlier), and 30 centipawns if there is triple progress (progress also from 6 plies earlier to 4 plies earlier).
I'm not sure that I'm following you... You are giving progress bonuses only for the side to move? What's the justification for this? You want to increase the value of tempo when the static score is rising? Definitely an original idea, but it is very hard for me to believe that this could actually work.

Are you sure that simply increasing the value of tempo doesn't give the same benefit?
Joona Kiiski
zamar
Posts: 613
Joined: Sun Jan 18, 2009 7:03 am

Re: Stockfish endgame evaluation problem

Post by zamar »

zamar wrote:So in short: I don't believe in this idea.
Do you have a better idea or are you saying that the problem is insoluble?
I'm saying neither. I don't have better ideas, and I think it's possible to construct algorithms to solve this problem. It's just a lot of trouble for a minimal Elo gain.
Joona Kiiski
Uri Blass
Posts: 10889
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Stockfish endgame evaluation problem

Post by Uri Blass »

zamar wrote:
Uri Blass wrote: Note that I do not use path-dependent evaluation to detect fortresses but simply to detect progress.
I compare the static evaluation of the leaf with the static evaluation 2 plies earlier and give a bonus of 10 centipawns if there is progress,
20 centipawns if there is double progress (progress also from 4 plies earlier to 2 plies earlier), and 30 centipawns if there is triple progress (progress also from 6 plies earlier to 4 plies earlier).
I'm not sure that I'm following you... You are giving progress bonuses only for the side to move? What's the justification for this? You want to increase the value of tempo when the static score is rising? Definitely an original idea, but it is very hard for me to believe that this could actually work.

Are you sure that simply increasing the value of tempo doesn't give the same benefit?
This is not about the side to move (the point is that I do not trust comparisons between evaluations with different sides to move).

I will give some examples.
Suppose I need to evaluate the position after
1.e4 e5 2.Nf3 Nc6 3.Bb5

Example 1:

static evaluation after 1.e4 is 0.4 pawns for white
static evaluation after 1.e4 e5 2.Nf3 is 0.42 pawns for white
static evaluation after 1.e4 e5 2.Nf3 Nc6 3.Bb5 is 0.46 pawns for white

0.46 > 0.42, so there is progress for White in the last 2 plies (and I add 0.1 to the static evaluation from White's point of view).
0.42 > 0.4, so there is further progress in the previous 2 plies (and I add another 0.1 to the static evaluation).

The value that I return is not 0.46 but 0.46 + 0.1 + 0.1 = 0.66.

Example 2:
static evaluation after 1.e4 is 0.44 pawns for white
static evaluation after 1.e4 e5 2.Nf3 is 0.42 pawns for white
static evaluation after 1.e4 e5 2.Nf3 Nc6 3.Bb5 is 0.46 pawns for white

This time there is progress in the last 2 plies but not in the previous 2 plies, so I return 0.56 for White.

Example 3:

static evaluation after 1.e4 is 0.5 pawns for white
static evaluation after 1.e4 e5 2.Nf3 is 0.48 pawns for white
static evaluation after 1.e4 e5 2.Nf3 Nc6 3.Bb5 is 0.46 pawns for white

This time there is double progress for Black, so I subtract 0.2 from the evaluation (from White's point of view) and return 0.26 for White.
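
The three examples can be reproduced with a short Python sketch. This is a reading of the description in the posts, not Movei's actual code: the function name `progress_adjusted_eval`, the list-based interface, and the handling of equal evaluations (no bonus) are assumptions. Scores are in centipawns from White's point of view, sampled every 2 plies so that the same side is always to move.

```python
BONUS_CP = 10   # bonus per step of progress, in centipawns (from the posts)
MAX_STEPS = 3   # single, double, and triple progress


def progress_adjusted_eval(evals_cp):
    """Return the leaf eval plus a progress bonus.

    evals_cp: static evals from White's point of view, one every 2 plies
    along the current path, oldest first, leaf last (must be non-empty).
    """
    leaf = evals_cp[-1]
    if len(evals_cp) < 2 or evals_cp[-1] == evals_cp[-2]:
        return leaf  # nothing to compare against, or no change

    # +1 if White made progress in the last 2 plies, -1 if Black did.
    direction = 1 if evals_cp[-1] > evals_cp[-2] else -1

    # Count consecutive 2-ply steps of progress in that direction,
    # walking backwards from the leaf.
    steps = 0
    i = len(evals_cp) - 1
    while i > 0 and steps < MAX_STEPS:
        if (evals_cp[i] - evals_cp[i - 1]) * direction <= 0:
            break
        steps += 1
        i -= 1

    return leaf + direction * BONUS_CP * steps


example_1 = progress_adjusted_eval([40, 42, 46])  # double progress for White
example_2 = progress_adjusted_eval([44, 42, 46])  # single progress for White
example_3 = progress_adjusted_eval([50, 48, 46])  # double progress for Black
```

The results are 66, 56, and 26 centipawns, matching the 0.66, 0.56, and 0.26 given in the examples.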
zamar
Posts: 613
Joined: Sun Jan 18, 2009 7:03 am

Re: Stockfish endgame evaluation problem

Post by zamar »

Thanks for the explanation, Uri.

This sounds really interesting. I'll definitely give this idea a try in the near future.
Joona Kiiski