To make sure that mating potential is properly appreciated when Fairy-Max determines piece values, I equipped it with a system similar to Fruit's: the advantage according to the 'naive' evaluation is reduced by a multiplier when that advantage is largely illusory because mating potential is in jeopardy or non-existent. Because it has to account for an infinite variety of piece types, I could not make it as specific and detailed as Fruit's, of course. So I am trying the following approximation:
The system only kicks in when the number of Pawns drops below two. (So it will not notice that KNNPPKBN is tainted because a double sac of BN vs PP puts you in a sore spot. But neither does Fruit.) I apply 3 different multipliers: 1/2, 1/4 and 1/8. Unlike Fruit, I don't bother with multiplier 0; even in KNNK a factor 1/8 is enough to reduce the advantage to below a Pawn, so that it will prefer KPK over KNNK, which is all that is needed. (Multipliers 0 have the disadvantage that they will allow the opponent to creep toward you to near equality without offering any resistance, and who knows if due to a miseval such near-equality will not actually be a win for him. E.g. in KNKPP, where you carelessly let him advance his passers, to then suddenly discover you cannot stop promotion because they advanced too far.)
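As a rough check of the KNNK versus KPK arithmetic, here is a minimal sketch. The piece values (Knight 325 cP, Pawn 100 cP) are assumed Kaufman-like numbers for illustration, not values taken from Fairy-Max, and `scaled` is a hypothetical helper name.

```python
# Illustrative piece values in centipawns (assumed, not Fairy-Max's own).
PAWN = 100
KNIGHT = 325

def scaled(advantage_cp, divisor):
    """Reduce a positive advantage by an integer divisor (2, 4 or 8)."""
    return advantage_cp // divisor

knnk = scaled(2 * KNIGHT, 8)  # KNNK is a 'dead draw', so multiplier 1/8
kpk = PAWN                    # KPK: nothing can be traded for the Pawn, no reduction

print("KNNK:", knnk, "KPK:", kpk)
```

With these assumed values the reduced KNNK score stays below one Pawn, so KPK is preferred, without ever needing a multiplier of 0.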
I use the factors as follows:
1) Without Pawns for the strong side, the factor is 1/2 (if not less), irrespective of the Piece material.
2) Within case (1), 'dead draws' like KNK... get 1/8.
3) With a single Pawn for the strong side, a factor 1/4 is applied when case (2) can be reached by trading the weakest piece of the weak side for the Pawn.
Case (1) should discourage conversion to endgames like KQKBN (which could have fortress draws), and even KQKR (which might be too difficult to find the win, and in cases where you have a piece that is just marginally weaker than a Queen, such as an RN (=C) or BN (=A) compound, can indeed be a dead draw).
The 'dead draws' that are recognized have at most two (non-Pawn, non-King) pieces for the strong side, and at most one for the weak side. Such end-games are classified as dead draws when they have
a) less than 350 cP advantage in the Piece material ('1 minor ahead'). This catches KBK.., KNK.., KRKB.., KRKN.., KRBKR.., KRNKR.., KQBKQ.., KQNKQ... But it does not catch cases like KQBKC (which are often won), because Q-C is just large enough to drive the difference above 350.
a-1) An exception to rule (a) is the case where the strong side has a light piece (<350 cP) which has mating potential. For instance, a 'Short Rook' ('S'), which moves as a Rook but at most two steps, is worth less than a Knight. But KQSKQ and KRSKR are won, because trading Q or R leaves you with a won KSK. Even KSNKS is won, although you are only a (non-mating) Knight ahead, because you can force the trade of N vs the defending S. So increasing the S value would not do it, and you have to look at mating potential apart from value. Whether a piece is a 'mating minor' has to be programmed by hand in the piece description file. (Too complex to figure that out automatically...)
b) A single, arbitrarily strong piece that is color bound (and thus obviously has no mating potential). Some pieces stronger than a Rook suffer from this (e.g. a Bishop that can also jump 2 orthogonally, worth slightly over 500, or that can even repeat such jumps until it encounters an obstacle, worth ~700). This rule should really be extended to having a pair of color-bound pieces on the same color. But in Fairy-Max it would be cumbersome and time consuming to figure out which color pieces are on (it requires a board scan), and I don't plan to do test games with multiple color-bound piece types for the moment. (And a pair of the same type will always start on unlike colors.)
c) 'deficient pairs' of equal pieces, like two Knights. The piece type would have to be marked as such by hand in the piece definition file. It is quite uncommon that two minors of a different type lack mating potential, but with a pair of equal pieces you have much less choice in doing things, and lack of mating potential is quite common.
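Rules (a) through (c) could be sketched roughly as below. This is not Fairy-Max code: the `Piece` record, its flag names and the piece values are all assumptions made for illustration, and the function presumes the strong side has no Pawns.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Piece:
    name: str
    value: int                    # centipawns, illustrative only
    color_bound: bool = False
    mating_minor: bool = False    # hand-marked: light piece that can force mate
    defective_pair: bool = False  # hand-marked: two of these cannot force mate

def is_dead_draw(strong, weak):
    """strong/weak: lists of non-Pawn, non-King pieces; strong side is Pawn-less."""
    if len(strong) > 2 or len(weak) > 1:
        return False                       # too much material: not recognized
    if len(strong) == 1 and strong[0].color_bound:
        return True                        # rule (b)
    if len(strong) == 2 and strong[0] == strong[1] and strong[0].defective_pair:
        return True                        # rule (c)
    advantage = sum(p.value for p in strong) - sum(p.value for p in weak)
    has_mating_minor = any(p.mating_minor for p in strong)
    return advantage < 350 and not has_mating_minor   # rule (a) with exception (a-1)

N = Piece("N", 325, defective_pair=True)
B = Piece("B", 325, color_bound=True)
R = Piece("R", 500)
Q = Piece("Q", 950)
S = Piece("S", 300, mating_minor=True)   # 'Short Rook', value assumed

print(is_dead_draw([N, N], []), is_dead_draw([Q, S], [Q]))
```

Under these assumed values KNNK, KRKB and KRBKR come out as dead draws, while KQSKQ and KSNKS do not, because the S carries the 'mating minor' mark.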
What is still missing is a way to devalue an advantage in an 'unlike Bishops' situation. Partly because it is costly to figure out colors. (Perhaps I really should switch to using a material index.) But also because it is not so obvious how to generalize this with unorthodox pieces. The weakness is of course that your only remaining piece (pieces?) have no clout on one of the colors, so the opponent will set up a defense there that you don't have the power to break. But for that it is not necessary per se that the piece of the defender is color bound as well. Yet, in Chess, KBPPKN is not considered drawish. Perhaps this is because a Knight is a color-alternator, so to cover squares of the color the Bishop has no access to, it has to be on the other color, where it can be chased away. So what seems to count is the subset of moves that stay on the same color. And the distance of the Pawns compared to the range of the piece, of course. A Bishop defender is pretty good in this respect, as it is a slider.
One thing that worries me is whether always reducing the score by a factor 2 when you have no Pawns is overdoing it. It would unnecessarily bias the program against trading its last Pawn when it is very much ahead. E.g. in KRPKB there is no reduction, but after a B-P trade, KRK would be depreciated by a factor 2, leading to +2.5, while the advantage in KRPKB would be +3 to +4, even though KRK is an easy win. Perhaps I should also make an exception to rule 1 when the opponent King is bare.
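The worry about KRPKB versus KRK can be illustrated with a small sketch. The values (R 500, B 325, P 100 cP) and the function name `pawnless_divisor` are assumptions for illustration only.

```python
# Illustrative values in centipawns (assumed).
ROOK, BISHOP, PAWN = 500, 325, 100

def pawnless_divisor(weak_piece_count, exempt_bare_king):
    """Divisor for a Pawn-less strong side that is not a dead draw."""
    if exempt_bare_king and weak_piece_count == 0:
        return 1   # proposed exception: no reduction against a bare King
    return 2       # rule 1: halve the advantage

krpkb = ROOK + PAWN - BISHOP                     # one Pawn left, no reduction
krk_halved = ROOK // pawnless_divisor(0, False)  # rule 1 applied blindly
krk_exempt = ROOK // pawnless_divisor(0, True)   # with the bare-King exception

print(krpkb, krk_halved, krk_exempt)
```

Without the exception the engine would score KRK below KRPKB and avoid the trade; with it, the trade into the trivially won KRK is scored higher.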
Comments are welcome.
End-game eval factors for drawishness
Moderator: Ras
-
hgm
- Posts: 28445
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
-
kbhearn
- Posts: 411
- Joined: Thu Dec 30, 2010 4:48 am
Re: End-game eval factors for drawishness
You could add an extra condition to the no-pawn halving, something like 'if material difference < 4, and no pawn for stronger side, then halve it'.
This would halve a few long winning conditions, but perhaps that's not so bad (KBBvKN would be the easiest one I can think of that would be halved but that you might not want to). KRvKPP would be halved, but that seems reasonable as there's a decent chance one of the pawns will be able to be supported by the king and force a rook sac. KQvKRP would be halved, but that's probably not a bad thing as there are some fortresses, and the ones that aren't fortresses are very long wins if the king supports the pawn and the rook can't be won immediately.
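The suggested condition could be sketched as follows. The classical values (Q 9, R 5, B 3, N 3, P 1) and the material-string interface are assumptions for illustration.

```python
# Classical piece values in Pawn units (assumed for illustration).
VALUES = {"Q": 9, "R": 5, "B": 3, "N": 3, "P": 1}

def material(pieces):
    """Sum the values of a material string such as 'BB' or 'RP' (Kings omitted)."""
    return sum(VALUES[p] for p in pieces)

def should_halve(strong, weak):
    """Halve only if the strong side has no Pawn and is less than 4 ahead."""
    return "P" not in strong and material(strong) - material(weak) < 4

for strong, weak in [("BB", "N"), ("R", "PP"), ("Q", "RP"), ("R", ""), ("Q", "R")]:
    print(f"K{strong}vK{weak}: halve={should_halve(strong, weak)}")
```

With these values the three examples named above are halved, while KRvK and KQvKR (a difference of exactly 4) are not.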
As for the opposite bishops, does the drawishness in bishops not translate to generalised colorbound pieces?
-
hgm
Re: End-game eval factors for drawishness
kbhearn wrote: As for the opposite bishops, does the drawishness in bishops not translate to generalised colorbound pieces?
Well, I can see how a color-bound piece for the strong side would in general suffer a similar problem, but it is much less clear what the weak side should have for the problem to become manifest. 'Strong-enough presence on the opposite color', I suppose. But what would be 'strong enough' here? Would a piece that steps 1 or jumps 2 diagonally, and has a color-switching non-capture move 1-step sideways (such as the Lieutenant from Spartan Chess, which turns out to be stronger than a Bishop) have it? It is not color bound, but that is in fact an asset. It does not have to switch color if it doesn't want to. (Unlike the Knight.) Would a piece that only occasionally has to stray on the other color be good enough (like step-1 or jump-2 diagonally forward, but move as Knight backwards)? Would that depend on how strong the color bound piece is?
-
kbhearn
Re: End-game eval factors for drawishness
Well, seeing as the problem for the stronger side is complete inability to control the opposite color squares, I would say a hybrid with some ability should not share this problem. I.e., assuming the game still has pawn promotion as a theme of the endgame, if you can make a Knight move backwards to change color and then control the next step forward for a pawn with Bishop attacks forward, then you ought to be able to convert an extra pawn advantage in general.
A semi-colorbound piece seems like it should only share bishop characteristics in tactical phases of the game where there shouldn't be a significant opposite bishop 'drawish' penalty anyways because instead it fosters attacks.
With regard to a completely colorbound piece on the stronger side vs a semicolorbound piece, the completely colorbound piece would still tend to have drawish aspects probably if it is indeed the stronger side. Reason being that a blockade on the color opposite to the one the stronger side can control is still possible.
-
hgm
Re: End-game eval factors for drawishness
kbhearn wrote: With regard to a completely colorbound piece on the stronger side vs a semicolorbound piece, the completely colorbound piece would still tend to have drawish aspects probably if it is indeed the stronger side. Reason being that a blockade on the color opposite to the one the stronger side can control is still possible.
I guess a Bishop as defender is so good at this because it is usually trivial to position it such that it covers squares in front of both Pawns (if they are not too far advanced). All you then have to do is 'follow' the opponent's King, so that by the time he protects the square where the Bishop is stopping the Pawn, you attack it a second time with yours. With a Knight as defender this never works, because the Bishop can then attack the Knight (and allowing it to trade will be fatal). In addition the Pawns could simply be too far apart for a Knight to stop them both.
When the defender has step-1 and jump-2 moves diagonally forward and straight backward (a 'Y'), it is conceivable that he could stop two Pawns too (e.g. white Pd4, g5, black Yf7, Kf5). But the Y is so clumsy that it would be hard to manoeuvre it into the right position, and in the meantime it could be too late. In addition there is the range problem, and against, for instance, b- and g-Pawns it has no chance. Also there could be a zugzwang problem. A Bishop is really a very strong piece on the color it does have access to, and anything weaker when you restrict it to that color might not be able to put up a good-enough defense (other than in some special cases), so it is not worth making it a general exception. With a defending Ferz (1-step diagonal mover) you can easily set up a fortress against connected passers (w: Pd6,e7 b: Ke8,Fd7, moving the Ferz between c8, d7 and e6, depending on whether the white King attacks c8 or e6). But when the passers are further apart, it might be impossible.
I guess there is no easy automatic detection of this, and it would have to be empirically tested which pieces can effectively defend against color-bound + Pawns, and which cannot. For now I could put as minimal requirement 'slides diagonally', though. Or perhaps 'slides at least 3 diagonally'.
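The tentative 'slides at least 3 diagonally' requirement could be tested along these lines. The move representation, a list of `(dx, dy, max_steps)` tuples, is a hypothetical one chosen for this sketch and is not Fairy-Max's piece description format.

```python
def slides_diagonally(moves, min_range=3):
    """True if any move is a diagonal slide of at least min_range steps.

    moves: iterable of (dx, dy, max_steps); max_steps == 1 is a plain step or leap.
    """
    return any(abs(dx) == 1 and abs(dy) == 1 and max_steps >= min_range
               for dx, dy, max_steps in moves)

DIAGONALS = [(dx, dy) for dx in (-1, 1) for dy in (-1, 1)]
BISHOP = [(dx, dy, 7) for dx, dy in DIAGONALS]
FERZ = [(dx, dy, 1) for dx, dy in DIAGONALS]
KNIGHT = [(dx, dy, 1) for dx, dy in
          ((1, 2), (2, 1), (-1, 2), (-2, 1), (1, -2), (2, -1), (-1, -2), (-2, -1))]

print(slides_diagonally(BISHOP), slides_diagonally(FERZ), slides_diagonally(KNIGHT))
```

Only the Bishop passes this test; the Ferz fortress against connected passers would remain one of the special cases that such a simple criterion does not cover.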
-
hgm
Re: End-game eval factors for drawishness
The entire package seems to be working now. Because of proper accounting of the B-pair, I could switch to using Kaufman piece values, rather than using the micro-Max kludge of valuing B always higher than N.
This brings about 25 Elo.
Most of this is from the better piece values. When I measured the effect of drawishness evaluation alone, with the old piece values, it even seemed weaker. (But that was before I exempted bare King end-games from getting a reduction. Not sure how much that matters.)
So the rules are now
Lack of mating potential is assumed if:
1) we have a single color-bound piece, no matter how strong
2) we have 1 or 2 pieces, are less than 350cP ahead, and don't have a piece marked as 'mating minor'
3) we have two equal pieces, of a type marked as 'defective pair'
Without mating potential and no Pawns we reduce the score by a factor 8 ('dead draws')
With 1 Pawn we reduce the score by a factor 4 if the opponent can sac his weakest piece for that Pawn to leave a dead draw.
In other cases, if we have no Pawns and the opponent is not bare King, we reduce the score by a factor 2
Note that we never reduce when we have at least 2 Pawns, or at least 3 Pieces. This is used as a cheap filter to skip the highly inefficient code for testing the complex conditions.
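Put together, the selection of the reduction factor might look like the sketch below. The function name and its boolean parameters are hypothetical; the two complex conditions (lack of mating potential, and whether a sac of the weakest piece for the Pawn leaves a dead draw) are assumed to be computed elsewhere and passed in.

```python
def drawishness_divisor(pawns, pieces, lacks_mating_potential,
                        dead_draw_after_pawn_sac, weak_is_bare_king):
    """Return the divisor (1, 2, 4 or 8) applied to the strong side's score.

    pawns, pieces: Pawn and non-Pawn, non-King piece counts of the strong side.
    """
    # Cheap filter: never reduce with at least 2 Pawns or at least 3 pieces.
    if pawns >= 2 or pieces >= 3:
        return 1
    if pawns == 0:
        if lacks_mating_potential:
            return 8                        # 'dead draw'
        return 1 if weak_is_bare_king else 2
    # Exactly one Pawn: the opponent may sac his weakest piece for it.
    return 4 if dead_draw_after_pawn_sac else 1

print(drawishness_divisor(0, 2, True, False, True))    # e.g. KNNK
print(drawishness_divisor(1, 1, False, True, False))   # e.g. KNPKB
print(drawishness_divisor(0, 1, False, False, False))  # e.g. KQKR
```

Placing the cheap filter first mirrors the stated intent of skipping the expensive tests whenever possible.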
The whole thing is implemented in micro-Max' kludgey style, by keeping a pair bonus variable for each piece type. A bonus of 3 (cP) is used to indicate a mating minor, and a bonus of -4 to indicate a defective pair. True pair bonuses are set by default to 1/8 of the piece value if the program can determine that the piece is color bound (which it can for simple leapers and sliders), and 0 otherwise. This can be overruled by specifying a pair bonus by hand in the piece description file, after the last move of the piece (disguised as a move with a null-step, to provide backward compatibility).
For regular Chess the only addition in the game description is that the Knight is marked as defective pair, writing 0,3 behind its move list. (The B-pair bonus is set by default.)
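The idea of overloading the pair-bonus variable with two sentinel values could be sketched as follows. Only the sentinel values 3 and -4 and the 1/8 default come from the description above; the function names, the step-vector representation and the parity test for color-boundness are assumptions, not Fairy-Max's actual code.

```python
MATING_MINOR = 3      # sentinel: light piece with mating potential
DEFECTIVE_PAIR = -4   # sentinel: a pair of this type cannot force mate

def is_color_bound(steps):
    """A simple leaper or slider never changes square color if every
    step vector (dx, dy) has an even coordinate sum."""
    return all((dx + dy) % 2 == 0 for dx, dy in steps)

def default_pair_bonus(value_cp, steps):
    """Default: 1/8 of the piece value for color-bound pieces, else 0."""
    return value_cp // 8 if is_color_bound(steps) else 0

def interpret(bonus):
    """Decode what a stored pair-bonus value means."""
    if bonus == MATING_MINOR:
        return "mating minor"
    if bonus == DEFECTIVE_PAIR:
        return "defective pair"
    return "pair bonus"

BISHOP_STEPS = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
KNIGHT_STEPS = [(1, 2), (2, 1), (-1, 2), (-2, 1), (1, -2), (2, -1), (-1, -2), (-2, -1)]

print(default_pair_bonus(325, BISHOP_STEPS), default_pair_bonus(325, KNIGHT_STEPS))
```

The sentinels work because real pair bonuses of exactly 3 or -4 cP are not expected to occur, so a single variable per piece type can carry either a bonus or a flag.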