Improving evaluation of passed pawns

bob · Post by **bob** » Sun Jan 30, 2011 3:18 pm

Gerd Isenberg wrote:
bob wrote:
bhlangonijr wrote: Is there any engine that correctly evaluates the position below?

[D] 8/p1p5/6pp/PPP2k2/8/4PK2/8/8 w - - 0 43

Stockfish gives me the static score -24

My engine's static evaluation is -61
Crafty says -0.06, but a simple 3 ply search sees the eval skyrocket for white as it should. I don't like letting the eval try to handle this. This is a "dynamic" issue that is best left to the search, because there are so many special cases to deal with.

In my case the wrong evaluation is due to the bigger bonuses for the connected passers, but a human can instantly see it is a winning position for white.

This specific position is an easy win, and the static eval can be fixed to handle it correctly, since only pawns are left on the board and the opponent's king is far from the promotion line of the candidate passed pawn. If the opposing side still has a minor piece, though, things become more difficult to evaluate statically. BTW the 3 ply score is +6.something in 0.00 seconds. By the time we get to depth 13 at 0.05 seconds, eval is roughly +11.0

The next given position my engine reasonably evaluates as 11:

[D] 8/p1p2p2/6pp/PPP2k2/8/4PKP1/8/8 w - - 0 43

Stockfish gives the same score for this one: -24.
Crafty says +0.28 statically. A 0.02 second search says +10.0...
What does Crafty eval say without a-pawns and/or c-pawns?

Static eval does not change. I do not count a candidate the same as a normal pawn at present. Used to but found that it played significantly worse on the cluster. I have this on my to-do list to fix. The problem is, both sides can have candidates and that is harder to get right statically. And we detect the case where we have two passed pawns and while neither can promote (out-run enemy king) by itself, both can, since the king eventually has to stop one, leaving the other one free. With candidates, that is a problem, because candidate passers take tempi to become real passers... It is more complicated to handle "can it run" when you have to factor in the moves to generate the passer first, and I simply put that on my to-do list for later...
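
The "can it run" test for a real passer is usually done with the square-of-the-pawn rule. A minimal sketch follows, assuming a pure pawn ending, a white pawn with a clear path, and no help from the white king. The function names and the 0-based (file, rank) coordinates are illustrative assumptions, not Crafty's code, and the sketch does not cover the candidate case or the two-passer case described above.

```python
def chebyshev(a, b):
    """King-move distance between two (file, rank) squares."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def white_pawn_outruns_king(pawn, black_king, white_to_move):
    """Square-of-the-pawn test for a white passed pawn.

    pawn and black_king are (file, rank) tuples, 0-based (a1 = (0, 0)).
    Assumes nothing blocks the pawn and only the black king defends.
    """
    file, rank = pawn
    # A pawn still on its starting rank may advance two squares at once.
    pawn_steps = min(5, 7 - rank)
    king_steps = chebyshev(black_king, (file, 7))
    if not white_to_move:
        king_steps -= 1  # the defender gets a free tempo
    return king_steps > pawn_steps
```

For example, a white pawn on b6 against a black king on f5 with Black to move gives `white_pawn_outruns_king((1, 5), (5, 4), False)`, which is true: the king needs four moves to reach b8 and the pawn needs two.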

bob wrote:
I wonder if there is a better way of statically evaluating such positions. I think it might not significantly increase the size of the code and may be worth a few Elo points.

One quick & dirty way of fixing that would be giving bigger bonuses for candidate passers when there are only pawns left on the board and the opposing king is away from the promotion line.
This is dangerous. The opponent might have a pawn that can't "run" by the square of the pawn rule, but the king can be positioned such that the pawn can't be stopped, and it promotes before the "unstoppable passer" and wins. This gets to be _very_ problematic if the eval spits out grossly wrong answers. And trying to determine all the special cases is a pain. I don't have the position handy, but there is a position where it appears black can't possibly win because its king is too far from white's passed pawn. Black has two choices. It can move its king up to help its own passed pawn promote, but then white defends with a king move. Or it can move its king toward the enemy passer but it can't stop it. The winning plan is to do both at the same time. Move the king toward both its own passer and the opponent's passer, so that white can't take the time to push its passer or black will also promote, and white's passer can't win because the black king is closer to supporting its own pawn and will draw. It is a classic that will blow your mind if you have not seen it before. And doing that statically would be one complicated piece of code. Yet a search can solve it in milliseconds, as it should...

Below is another position which is evaluated incorrectly:

[D] 8/p1p2p2/6p1/PPP2k1p/K2P4/4P3/8/8 w - - 0 43

Stockfish static score: -80
In my case my engine evaluates this position as -1400 because of the many unstoppable pawns on the black side. (It is such a huge bonus...)

Any thoughts?

Regards,
One can always "compute" where the candidate turns into a passer, and use that for the square of the pawn calculation. But it is complex and dangerous. Crafty says -6.0 statically but +8 after less than 0.05 seconds.

I suppose it is all about whether you want your static eval to handle dynamic considerations..
The point here is that, in a pawn ending, the advanced candidate on the fifth rank with an open frontspan has the same number of helpers as sentries, which makes the candidate almost as valuable as if it were already a passer. The sentries are forced to exchange against the helpers, and white does not lose any tempo. Additionally, the candidate's frontspan is not blocked by its own king, and the opponent's king is outside the candidate's square.
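
The helper/sentry count can be sketched in a few lines. This is a simplified illustration, not any engine's actual code: it assumes a sentry is any enemy pawn on an adjacent file ahead of the pawn, and a helper is any own pawn on an adjacent file level with or behind it. The function names and the 0-based (file, rank) coordinates are hypothetical, and the king conditions mentioned above are not checked.

```python
def pawns_from_fen(fen):
    """Return (white_pawns, black_pawns) as sets of 0-based (file, rank)."""
    white, black = set(), set()
    for i, row in enumerate(fen.split()[0].split("/")):
        rank, file = 7 - i, 0
        for ch in row:
            if ch.isdigit():
                file += int(ch)  # run of empty squares
                continue
            if ch == "P":
                white.add((file, rank))
            elif ch == "p":
                black.add((file, rank))
            file += 1
    return white, black


def classify_white_pawn(pawn, white, black):
    """Classify a white pawn as 'blocked', 'passed', 'candidate' or 'none'."""
    f, r = pawn
    front_span = {(f, y) for y in range(r + 1, 8)}
    if front_span & (white | black):
        return "blocked"  # a pawn stands somewhere in front on the same file
    adjacent = (f - 1, f + 1)
    sentries = sum(1 for (x, y) in black if x in adjacent and y > r)
    helpers = sum(1 for (x, y) in white if x in adjacent and y <= r)
    if sentries == 0:
        return "passed"
    return "candidate" if helpers >= sentries else "none"


white, black = pawns_from_fen("8/p1p5/6pp/PPP2k2/8/4PK2/8/8 w - - 0 43")
print(classify_white_pawn((1, 4), white, black))  # the b5 pawn
```

On the first position of the thread, the b5 pawn has two helpers (a5, c5) and two sentries (a7, c7), so the sketch reports it as a candidate.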

I guess you are referring to the Réti endgame study. Yes, there are many cases where static knowledge might fail due to multiple threats and all kinds of interactions and counter-threats. But again, here with the mentioned conditions it is safe to treat the b5 candidate as only slightly worse than a passer, as each sentry/helper pair cancels out.

True for this case. But you have already exposed my point, special cases and dynamic knowledge applied in a static domain.

bob · Post by **bob** » Sun Jan 30, 2011 3:21 pm

bhlangonijr wrote:
bob wrote:
One can always "compute" where the candidate turns into a passer, and use that for the square of the pawn calculation. But it is complex and dangerous. Crafty says -6.0 statically but +8 after less than 0.05 seconds.

I suppose it is all about whether you want your static eval to handle dynamic considerations..
I suppose almost every engine can solve all these positions instantly through search when they are close to the root. The problem is when these positions appear close to the leaves, where all the pruning mechanisms take place. Mis-evaluating such positions close to the leaves may delay the correct ordering of the root move that eventually leads to them, as all reduction/pruning schemes rely heavily on the static evaluation. The point is whether we can evaluate such positions better. I am not saying we have to completely solve it statically, but rather give a better "hint" to the search.
One may argue that such situations would be statistically insignificant, but I would like to get as many correct "answers" in my evaluation function as possible.
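
To make the dependence on static eval concrete, here is a minimal sketch of one common scheme, futility pruning at a frontier node. It is an illustration under stated assumptions, not the code of any engine in this thread: the function name, the alpha bound, the margin, and the +550 "corrected" eval are all hypothetical, and many engines exempt passed-pawn pushes from this kind of pruning. Only the -61 static score comes from the thread.

```python
def futility_prunes(static_eval_cp, alpha_cp, margin_cp, is_quiet):
    """True if a quiet move at a frontier node would be skipped.

    The move is skipped when even static eval plus a safety margin
    cannot reach alpha, so the decision rests entirely on static eval.
    """
    return is_quiet and static_eval_cp + margin_cp <= alpha_cp


ALPHA = 100   # hypothetical bound the search is trying to beat
MARGIN = 120  # hypothetical futility margin

# Static eval of -61 (as reported above): the quiet push is pruned.
print(futility_prunes(-61, ALPHA, MARGIN, True))   # True
# A hypothetical eval of +550 that recognises the passer: not pruned.
print(futility_prunes(550, ALPHA, MARGIN, True))   # False
```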

It doesn't have to be close to the root for these. Most are done in a 3 ply search. In endgames, normal depth is 30+, so we can search 25+ plies, and still recognize such won positions since we have 5 plies to invest to run things out to some sort of static conclusion...

As far as being "correct" goes, that is critical. Or the search will hunt up all those positions where you make a mistake in the eval, and use those to lose the game...

Don · Post by **Don** » Sun Jan 30, 2011 7:35 pm

bhlangonijr wrote:
bob wrote:
One can always "compute" where the candidate turns into a passer, and use that for the square of the pawn calculation. But it is complex and dangerous. Crafty says -6.0 statically but +8 after less than 0.05 seconds.

I suppose it is all about whether you want your static eval to handle dynamic considerations..
I suppose almost every engine can solve all these positions instantly through search when they are close to the root. The problem is when these positions appear close to the leaves, where all the pruning mechanisms take place. Mis-evaluating such positions close to the leaves may delay the correct ordering of the root move that eventually leads to them, as all reduction/pruning schemes rely heavily on the static evaluation. The point is whether we can evaluate such positions better. I am not saying we have to completely solve it statically, but rather give a better "hint" to the search.
One may argue that such situations would be statistically insignificant, but I would like to get as many correct "answers" in my evaluation function as possible.

Ideally you would want this of course, but if you could do it perfectly you would not need search.

In this position about the only "reasonable" attempt would be to consider one of the queen side pawns passed without loss of tempo as someone pointed out. However that in itself would not prove a win and you would have to consider all interactions to prove it's a win.

It's probably possible to create a special king and pawn ending evaluation function that has a great deal of sophistication. However, the question is whether the slowdown is worth it. It's better to address the things that make the evaluation more balanced in general. In this position a tiny search quickly discovers the win, but look for positions that may have serious concepts missing that may take MANY ply to discover by search.

Gerd Isenberg · Post by **Gerd Isenberg** » Sun Jan 30, 2011 8:23 pm

Don wrote:
bhlangonijr wrote:
bob wrote:
One can always "compute" where the candidate turns into a passer, and use that for the square of the pawn calculation. But it is complex and dangerous. Crafty says -6.0 statically but +8 after less than 0.05 seconds.

I suppose it is all about whether you want your static eval to handle dynamic considerations..
I suppose almost every engine can solve all these positions instantly through search when they are close to the root. The problem is when these positions appear close to the leaves, where all the pruning mechanisms take place. Mis-evaluating such positions close to the leaves may delay the correct ordering of the root move that eventually leads to them, as all reduction/pruning schemes rely heavily on the static evaluation. The point is whether we can evaluate such positions better. I am not saying we have to completely solve it statically, but rather give a better "hint" to the search.
One may argue that such situations would be statistically insignificant, but I would like to get as many correct "answers" in my evaluation function as possible.
Ideally you would want this of course, but if you could do it perfectly you would not need search.

In this position about the only "reasonable" attempt would be to consider one of the queen side pawns passed without loss of tempo as someone pointed out. However that in itself would not prove a win and you would have to consider all interactions to prove it's a win.

It's probably possible to create a special king and pawn ending evaluation function that has a great deal of sophistication. However, the question is whether the slowdown is worth it. It's better to address the things that make the evaluation more balanced in general. In this position a tiny search quickly discovers the win, but look for positions that may have serious concepts missing that may take MANY ply to discover by search.

Typical advice from someone with commercial intentions, ignoring simple to implement knowledge about candidates, apparently to avoid RedQueen becoming too strong

Don · Post by **Don** » Sun Jan 30, 2011 9:08 pm

Gerd Isenberg wrote:
Don wrote:
bhlangonijr wrote:
bob wrote:
One can always "compute" where the candidate turns into a passer, and use that for the square of the pawn calculation. But it is complex and dangerous. Crafty says -6.0 statically but +8 after less than 0.05 seconds.

I suppose it is all about whether you want your static eval to handle dynamic considerations..
I suppose almost every engine can solve all these positions instantly through search when they are close to the root. The problem is when these positions appear close to the leaves, where all the pruning mechanisms take place. Mis-evaluating such positions close to the leaves may delay the correct ordering of the root move that eventually leads to them, as all reduction/pruning schemes rely heavily on the static evaluation. The point is whether we can evaluate such positions better. I am not saying we have to completely solve it statically, but rather give a better "hint" to the search.
One may argue that such situations would be statistically insignificant, but I would like to get as many correct "answers" in my evaluation function as possible.
Ideally you would want this of course, but if you could do it perfectly you would not need search.

In this position about the only "reasonable" attempt would be to consider one of the queen side pawns passed without loss of tempo as someone pointed out. However that in itself would not prove a win and you would have to consider all interactions to prove it's a win.

It's probably possible to create a special king and pawn ending evaluation function that has a great deal of sophistication. However, the question is whether the slowdown is worth it. It's better to address the things that make the evaluation more balanced in general. In this position a tiny search quickly discovers the win, but look for positions that may have serious concepts missing that may take MANY ply to discover by search.
Typical advice from someone with commercial intentions, ignoring simple to implement knowledge about candidates, apparently to avoid RedQueen becoming too strong

I expect Red Queen to become very strong if Ben Hur sticks with computer chess. So anything I can do to throw him off will be good ...

My advice is to focus on king and pawn specific endings.

bhlangonijr · Post by **bhlangonijr** » Sun Jan 30, 2011 9:26 pm

Don wrote: I expect Red Queen to become very strong if Ben Hur sticks with computer chess. So anything I can do to throw him off will be good ...

I wouldn't be afraid of that. RQ is simply my little toy I play with on the weekends.
But hopefully RQ will be able to take some points off Komodo. We never know.

Back to topic, I wouldn't be surprised if I find out that Houdini has such knowledge in the static evaluation.

bhlangonijr · Post by **bhlangonijr** » Sun Jan 30, 2011 9:43 pm

bhlangonijr wrote:
Don wrote: I expect Red Queen to become very strong if Ben Hur sticks with computer chess. So anything I can do to throw him off will be good ...
I wouldn't be afraid of that. RQ is simply my little toy I play with on the weekends.
But hopefully RQ will be able to take some points off Komodo. We never know.

Back to topic, I wouldn't be surprised if I find out that Houdini has such knowledge in the static evaluation.

Well, I just checked, and this is what I found:

Code:

Houdini 1.5a w32
(c) 2010-11 Robert Houdart

info string 128 MB Hash
uci
id name Houdini 1.5a w32
id author Robert Houdart
option name Hash type spin min 4 max 1024 default 128
option name Clear Hash type button
option name Threads type spin min 1 max 8 default 8
option name Split_Depth type spin min 8 max 99 default 10
option name Ponder type check default false
option name Contempt type spin min 0 max 2 default 1
option name Analysis_Contempt type check default false
option name MultiPV type spin min 1 max 16 default 1
option name GaviotaTbPath type string default <empty>
option name GaviotaTbCache type spin min 4 max 1024 default 64
option name Hard_Probe_Depth type spin min 2 max 99 default 24
option name Soft_Probe_Depth type spin min 2 max 99 default 16
uciok
position fen 8/p1p5/6pp/PPP2k2/8/4PK2/8/8 w - - 0 43
eval
go depth 0
info multipv 1 depth 1 seldepth 6 score cp 564  time 41 nodes 28 nps 0 tbhits 0 hashfull 0 pv b5b6 c7b6 c5b6 a7b6 a5b6
bestmove b5b6 ponder c7b6

That doesn't necessarily mean it has this knowledge in the static evaluation, though, because Houdini appears to be kind of Rybka-ish in the search and does a 5-ply search in the first iteration.
Whatever it is, Houdini evaluated it nicely in the very first iteration - differently than most engines...

It would be nice to hear from Robert if Houdini does that.

bhlangonijr · Post by **bhlangonijr** » Sun Jan 30, 2011 9:58 pm

bhlangonijr wrote:
bhlangonijr wrote:
Don wrote: I expect Red Queen to become very strong if Ben Hur sticks with computer chess. So anything I can do to throw him off will be good ...
I wouldn't be afraid of that. RQ is simply my little toy I play with on the weekends.
But hopefully RQ will be able to take some points off Komodo. We never know.

Back to topic, I wouldn't be surprised if I find out that Houdini has such knowledge in the static evaluation.

Well, I just checked, and this is what I found:

Code:

Houdini 1.5a w32
(c) 2010-11 Robert Houdart

info string 128 MB Hash
uci
id name Houdini 1.5a w32
id author Robert Houdart
option name Hash type spin min 4 max 1024 default 128
option name Clear Hash type button
option name Threads type spin min 1 max 8 default 8
option name Split_Depth type spin min 8 max 99 default 10
option name Ponder type check default false
option name Contempt type spin min 0 max 2 default 1
option name Analysis_Contempt type check default false
option name MultiPV type spin min 1 max 16 default 1
option name GaviotaTbPath type string default <empty>
option name GaviotaTbCache type spin min 4 max 1024 default 64
option name Hard_Probe_Depth type spin min 2 max 99 default 24
option name Soft_Probe_Depth type spin min 2 max 99 default 16
uciok
position fen 8/p1p5/6pp/PPP2k2/8/4PK2/8/8 w - - 0 43
eval
go depth 0
info multipv 1 depth 1 seldepth 6 score cp 564  time 41 nodes 28 nps 0 tbhits 0 hashfull 0 pv b5b6 c7b6 c5b6 a7b6 a5b6
bestmove b5b6 ponder c7b6

That doesn't necessarily mean it has this knowledge in the static evaluation, though, because Houdini appears to be kind of Rybka-ish in the search and does a 5-ply search in the first iteration.
Whatever it is, Houdini evaluated it nicely in the very first iteration - differently than most engines...

On second thought, the score 564 is a huge bonus. If Houdini had solved it through search, the score would be close to queen value... So yeah, Houdini definitely has this knowledge...

It would be nice to hear from Robert if Houdini does that.

Houdini · Post by **Houdini** » Sun Jan 30, 2011 10:27 pm

bhlangonijr wrote: That doesn't necessarily mean it has this knowledge in the static evaluation, though, because Houdini appears to be kind of Rybka-ish in the search and does a 5-ply search in the first iteration.
Whatever it is, Houdini evaluated it nicely in the very first iteration - differently than most engines...

On second thought, the score 564 is a huge bonus. If Houdini had solved it through search, the score would be close to queen value... So yeah, Houdini definitely has this knowledge...

It would be nice to hear from Robert if Houdini does that.

After playing b5-b6 Houdini knows it has a passed pawn that cannot be stopped by the Black King, see diagram:

[D] 8/p1p5/1P4pp/P1P2k2/8/4PK2/8/8 b - -
An unstoppable passed pawn at the 6th rank is given a score of 550 cp.
In your run Houdini performed a 1-ply search with quiescence. The quiescence examines the pawn swaps at b6 but doesn't really change the evaluation in this case.

Robert

bhlangonijr · Post by **bhlangonijr** » Sun Jan 30, 2011 10:47 pm

Houdini wrote:
bhlangonijr wrote: That doesn't necessarily mean it has this knowledge in the static evaluation, though, because Houdini appears to be kind of Rybka-ish in the search and does a 5-ply search in the first iteration.
Whatever it is, Houdini evaluated it nicely in the very first iteration - differently than most engines...

On second thought, the score 564 is a huge bonus. If Houdini had solved it through search, the score would be close to queen value... So yeah, Houdini definitely has this knowledge...

It would be nice to hear from Robert if Houdini does that.
After playing b5-b6 Houdini knows it has a passed pawn that cannot be stopped by the Black King, see diagram:

[D] 8/p1p5/1P4pp/P1P2k2/8/4PK2/8/8 b - -
An unstoppable passed pawn at the 6th rank is given a score of 550 cp.
In your run Houdini performed a 1-ply search with quiescence. The quiescence examines the pawn swaps at b6 but doesn't really change the evaluation in this case.

Robert

Thanks for the quick reply, Robert, but what I really want to know is: what does Houdini's static evaluation give for the original position I posted?
