Playing vs perfect player?

Look · Post by **Look** » Sun Jun 27, 2010 8:47 am

Hi,

Has anyone tried playing any engine against endgame table-bases, if so what are the results? Here are some points:

- Playing against 2 and 3man EGTB: Naturally engine should play perfectly here, with correct eval. Relevant data can be specified precisely like KBk is draw.

- Playing vs 4man and more EGTB: data on this endgame should not be specified exactly, heuristics may be used for engine.

It would be interesting to see the results here for various N-man games.

Look · Post by **Look** » Sun Jun 27, 2010 9:44 am

Hi,

Has anyone tried playing any engine against endgame table-bases, if so what are the results? Here are some points:

- Playing against 2 and 3man EGTB: Naturally engine should play perfectly here, with correct eval. Relevant data can be specified precisely like KBk is draw.

- Playing vs 4man and more EGTB: data on this endgame should not be specified exactly, heuristics may be used for engine.

It would be interesting to see the results here for various N-man games.

Sorry for self talking, but I remembered that a perfect player is not necessarily the best perfect player. That is, there could be several best moves. even though one can objectively always play best move in each position, which move matters in this situation as it could say narrow down possibilities of opponent but another objectively same strength move may give opponent better choices for next moves.

wgarvin · Post by **wgarvin** » Mon Jun 28, 2010 4:28 pm

The standard I would apply, is to ask "can the engine win all positions in this endgame with a won GTV, against an omniscient opponent".

What I mean is, suppose the engine plays the ending using some heuristic (probably augmented by a bitbase to tell it which positions are won/lost/draw). And suppose that the opponent plays the *optimal moves against this strategy*. If the engine can still win all won positions without falling afoul of the 50-move rule, and doesn't require excessive amounts of search time to find its moves--because time may be short near the end--then I'd say its good enough, even if some of the wins seem inelegant to humans.

Suppose you have written a heuristic scoring function to help your engine play a certain endgame. To prove that its good enough, what you could do is use retrograde analysis to construct a "proof database". I.e. basically make a DTZ50 database where one side chooses moves purely using the heuristic, and the other side chooses moves which delay losing for as long as possible according to the DTZ counts in the database. Note that because this is asymmetric, you'd want to build two of them for endings where either side has some winnable positions (one for each side to use the heuristic). After building such a database, you know for every position, the maximum number of moves it will take your engine to win (or at least, to reset the 50-move counter) against the strongest possible resistance. You could use any regular endgame database to confirm that the engine doesn't blunder away any winnable positions into draws, or drawn positions into losses.

[Edit: I guess I emphasized winning positions in this post. For drawn and lost positions, you'd want to try and play into positions which are difficult for imperfect players (like humans) to preserve their win or draw in. Some engines have a "swindle" mode that does this; I think Crafty does.]

bob · Post by **bob** » Mon Jun 28, 2010 6:08 pm

Look wrote:Hi,

Has anyone tried playing any engine against endgame table-bases, if so what are the results? Here are some points:

- Playing against 2 and 3man EGTB: Naturally engine should play perfectly here, with correct eval. Relevant data can be specified precisely like KBk is draw.

- Playing vs 4man and more EGTB: data on this endgame should not be specified exactly, heuristics may be used for engine.

It would be interesting to see the results here for various N-man games.

I have tried some 5 piece endings to test specific code in Crafty.

KBN vs K is one, and while Crafty is not perfect here, it can mate in less than 50 moves from the worst starting position which is (I think) a mate in 34 with perfect play.

KQ vs KR is one I thought would be impossible, but it wins these easily as well, and although again not optimal, it beats the 50 move rule easily.

KRP vs KR is probably the toughest. Playing the KR side it doesn't make any losing mistakes and draws as it should. On the KRP side it does well, but there are a few subtle cases where it draws when it should win, and I continue to work on those.

I've not tried any 6-piece tests at all.

wgarvin · Post by **wgarvin** » Mon Jun 28, 2010 9:59 pm

bob wrote:
Look wrote:Hi,

Has anyone tried playing any engine against endgame table-bases, if so what are the results? Here are some points:

- Playing against 2 and 3man EGTB: Naturally engine should play perfectly here, with correct eval. Relevant data can be specified precisely like KBk is draw.

- Playing vs 4man and more EGTB: data on this endgame should not be specified exactly, heuristics may be used for engine.

It would be interesting to see the results here for various N-man games.
I have tried some 5 piece endings to test specific code in Crafty.

KBN vs K is one, and while Crafty is not perfect here, it can mate in less than 50 moves from the worst starting position which is (I think) a mate in 34 with perfect play.

KQ vs KR is one I thought would be impossible, but it wins these easily as well, and although again not optimal, it beats the 50 move rule easily.

KRP vs KR is probably the toughest. Playing the KR side it doesn't make any losing mistakes and draws as it should. On the KRP side it does well, but there are a few subtle cases where it draws when it should win, and I continue to work on those.

I've not tried any 6-piece tests at all.

I guess it is a pretty good result if you can mate or convert within 50 for all positions against the endgame databases, but it is not the harshest possible test of the engine's heuristics. Those databases assume perfect play by both sides, so if your engine will prefer certain sub-optimal moves, the "opponent" may not take maximum advantage of this. But a dedicated opponent who knew your heuristic for choosing moves (and he does since Crafty is open source), could build an "anti-Crafty" database for certain endings, and might be able to lengthen the line of play even more in some positions, and escape with a draw due to the 50-move rule.

Anyway, to really prove that some heuristic is sufficient to always win any winnable position before the 50-move rule, I think building proof databases is the way to go. It does sound like a lot of work though, and if there's no evidence that Crafty loses games in these endings then I guess it wouldn't be worth the bother.

I think it would be interesting to build these proof databases as part of a project to build an "endgame player" consisting of compressed bitbases (which possibly omit some small subset of the positions?) combined with various scoring heuristics. The proof databases would objectively confirm whether or not the heuristic was good enough to win all winnable positions against any possible opponent (and if not, they could probably be data-mined for ideas about how to improve the heuristics).

bob · Post by **bob** » Mon Jun 28, 2010 11:14 pm

wgarvin wrote:
bob wrote:
Look wrote:Hi,

Has anyone tried playing any engine against endgame table-bases, if so what are the results? Here are some points:

- Playing against 2 and 3man EGTB: Naturally engine should play perfectly here, with correct eval. Relevant data can be specified precisely like KBk is draw.

- Playing vs 4man and more EGTB: data on this endgame should not be specified exactly, heuristics may be used for engine.

It would be interesting to see the results here for various N-man games.
I have tried some 5 piece endings to test specific code in Crafty.

KBN vs K is one, and while Crafty is not perfect here, it can mate in less than 50 moves from the worst starting position which is (I think) a mate in 34 with perfect play.

KQ vs KR is one I thought would be impossible, but it wins these easily as well, and although again not optimal, it beats the 50 move rule easily.

KRP vs KR is probably the toughest. Playing the KR side it doesn't make any losing mistakes and draws as it should. On the KRP side it does well, but there are a few subtle cases where it draws when it should win, and I continue to work on those.

I've not tried any 6-piece tests at all.
I guess it is a pretty good result if you can mate or convert within 50 for all positions against the endgame databases, but it is not the harshest possible test of the engine's heuristics. Those databases assume perfect play by both sides, so if your engine will prefer certain sub-optimal moves, the "opponent" may not take maximum advantage of this. But a dedicated opponent who knew your heuristic for choosing moves (and he does since Crafty is open source), could build an "anti-Crafty" database for certain endings, and might be able to lengthen the line of play even more in some positions, and escape with a draw due to the 50-move rule.

Anyway, to really prove that some heuristic is sufficient to always win any winnable position before the 50-move rule, I think building proof databases is the way to go. It does sound like a lot of work though, and if there's no evidence that Crafty loses games in these endings then I guess it wouldn't be worth the bother.

I think it would be interesting to build these proof databases as part of a project to build an "endgame player" consisting of compressed bitbases (which possibly omit some small subset of the positions?) combined with various scoring heuristics. The proof databases would objectively confirm whether or not the heuristic was good enough to win all winnable positions against any possible opponent (and if not, they could probably be data-mined for ideas about how to improve the heuristics).

In my analysis, the most problematic case (inside 3-4-5 piece endings) is KRP vs KR. It is pretty straightforward to defend, but if you are winning, one wrong move can turn it into a draw. The ones I thought would be difficult are, in fact, trivial. KQ vs KR is one. At one point I thought this was not winnable without egtb's until Amir Ban made a comment in the old r.g.c.c that caused me to test it, and it turns out only a very shallow search is needed. A fraction of a second 10 years ago was good enough if you (a) drive the opponent's king toward the edge, and (b) keep your king active and close as well. For me, as a human, it takes some thinking as if you let the king escape once, it is generally over and drawn.

I suppose it might be possible to try to exploit your opponents non-egtb winning strategy. One measure I watched was the "steps back" by the non-egtb player. If you go from mate in 23 to mate in 25, you lost a couple of moves. If you go from mate in 23 to mate in 43, you might have screwed up the win.

Look · Post by **Look** » Tue Jun 29, 2010 7:04 am

I think it would be interesting to build these proof databases as part of a project to build an endgame player consisting of compressed bitbases (which possibly omit some small subset of the positions?) combined with various scoring heuristics. The proof databases would objectively confirm whether or not the heuristic was good enough to win all winnable positions against any possible opponent (and if not, they could probably be data-mined for ideas about how to improve the heuristics).

I assume something like this could be basis for implementing heuristics for end games even those which do not have bit-bases yet. Some thing like creating heuristics for 7-man end games and using bit-bases when available (like for 6-man) when position transfers to one of those, and then building these heuristics on for various 7-man then 8-man etc. But as you mentioned earlier, I take your winnings as scoring best possible result.

Look · Post by **Look** » Tue Jun 29, 2010 7:27 am

I suppose it might be possible to try to exploit your opponents non-egtb winning strategy. One measure I watched was the steps back by the non-egtb player. If you go from mate in 23 to mate in 25, you lost a couple of moves. If you go from mate in 23 to mate in 43, you might have screwed up the win.

Maybe it is better to specify an interesting point of this thread. Suppose a decent engine scores 50% versus 2 and 3-man EGTB. Some engines can not score 50% for 4-man EGTB because of KBBk for instance. So that would be slighly less than 50%. For 5-man EGTB as you say for Crafty it is less than 50% but close to it actually. Naturally that percentage reduces for 6-man EGTB in any engine. So if some one has the data for these even roughly that would be nice to know. Now comes the interesting part, Consider each engine as a heuristic for 32-man EGTB. If we can estimate upper and lower bounds for N-man EGTB score vs engines, then we can estimate(although very naively) the rating of say a perfect player. I still do not know which perfect player that would be, as best PP would play according factors other that actual scoring value of a move too. Maybe that would be random perfect player, that is, a perfect player that when facing with several options would choose one randomly, or maybe a PP who chooses randomly when best result would be draw or loss, but chooses the shortest path to mate when position is winning.

metax · Post by **metax** » Thu Jul 01, 2010 10:04 pm

Look wrote:Maybe that would be random perfect player, that is, a perfect player that when facing with several options would choose one randomly, or maybe a PP who chooses randomly when best result would be draw or loss, but chooses the shortest path to mate when position is winning.

If you choose the shortest path to mate when winning, it might make sense to choose the longest path to mate when losing, and only choosing randomly when the position is drawn or when several choices have the same distance to mate.

wgarvin · Post by **wgarvin** » Fri Jul 02, 2010 7:32 pm

metax wrote:
Look wrote:Maybe that would be random perfect player, that is, a perfect player that when facing with several options would choose one randomly, or maybe a PP who chooses randomly when best result would be draw or loss, but chooses the shortest path to mate when position is winning.
If you choose the shortest path to mate when winning, it might make sense to choose the longest path to mate when losing, and only choosing randomly when the position is drawn or when several choices have the same distance to mate.

Against an imperfect opponent, when drawing or losing you want to pick the move that gives them the best chance to make a mistake. Maybe you can do a short search and count how many move choices they had that preserved the GTV, and how many would be blunders. Play down paths with a lot of potential blunders.

Playing vs perfect player?

Playing vs perfect player?

Re: Playing vs perfect player?

Re: Playing vs perfect player?

Re: Playing vs perfect player?

Re: Playing vs perfect player?

Re: Playing vs perfect player?

Re: Playing vs perfect player?

Re: Playing vs perfect player?

Re: Playing vs perfect player?

Re: Playing vs perfect player?