wgarvin wrote:I love this bidding mechanic by the way (in general--maybe not for a chess variant). ...
This type of game is an instance of the "Prisoner's Dilemma". Because this might be something well outside the experience of Chess programmers, perhaps I should say some more about this.
In general, in such games, the opponent would always win when he could adapt his choice to yours, if he would heve known it. This rules out that there could be a single deterministic choice that is best. If there was, the opponent would know it too, and adapt his choice accordingly to exploit it. E.g. suppose both players could bid either 0 or 1, with the following pay-off matrix (POV of player A):
Code: Select all
0 1 <--- Player A
0 2 -3
1 -1 2
|___ Player B
So A receives 2 from B on equal bids, but has to pay 3 to B if he bids 1 and B bids 0, etc. At first glance this seems fair, as 2+2 = 1+3. But that is only of relevance when A and B were random movers that played 0 and 1 with 50% probability. 0 cannot be the single best choice for A, as it is refuted by B playing 1. Likewise 1 cannot be best choice, being refuted by B playing 0. To win on average, A would have to keep B in the dark about his choice. Playing 0 and 1 randomly, 50/50, also is a failing strategy: B would refute it by always playing 0, so that in half the cases he pays 2, but in the other half he receives 3.
But by optimizing the probabilities with which A randomly plays 0 and 1, A can have the upper hand: suppose A plays 1 with probability p (and thus 0 with probability 1-p), and B plays 1 with probability q. Then the expected payoff is
2*(1-p)*(1-q) + 2*p*q - (1-p)*q - 3*p*(1-q)
= 2 - 2*p - 2*q + 2*p*q + 2*p*q - q + p*q - 3*p + 3*p*q
= 2 - 5*p - 3*q + 8*p*q
Now the best strategy for each player must have this payoff independent to small changes in the probability they control, for if it were not, they could change that probability in one direction or the other to improve their result. So the derivatives with respect to p and q must be zero:
-5 + 8*q = 0
-3 + 8*p = 0
or q = 5/8 and p = 3/8. If player A sets p = 3/8, the payoff becomes
2 - 5*(3/8) - 3*q + 8*(3/8)*q
= 2 - 15/8 - 3*q + 3*q
= 16/8 - 15/8 = 1/8
In other words, it is positive (in favor of A), and independent of q (so there is nothing B can do against it; however he sets his probabilities, he will consistently lose). Deeper analysis shows that this is because although 2+2 = 1+3 (irrelevant), 2*2 > 1*3 (relevant).
This example shows how this type of situation should be handled: depending on the payoff matrix, both players will choose to play their possible bids with fixed probabilities to optimize their result. In the Chess-2-like game I proposed here as a model, the payoff matrix itself would have to be obtained by recursively searching the outcome of every possible biddings. Once these are known, the Optimize() routine would do a calculation similar to the one I presented in this post.
(Note that it can happen that the condition of zero derivative cannot be met for a probability in the range [0,1]. In that case the optimum would be an edge extremum, obtained by setting the probability 0 or 1. E.g. when you have 4 coins, and you know the opponent only has 1, you should obviously never bid 3 or 4.)
The Optimize() routine finally tells you the expectation value of the scores in the payoff matrix. To minimax it against the alternative moves, you would have to compare such expectation values against absolutely deterministic scores, or in general against expectation values with another probability distribution. This also is non-trivial. E.g. when you are a minor ahead, you would prefer certain loss of a Pawn over a 50-50 probability of losing or gaining a Queen (which on average is neutral). While when you were a minor behind you would actually prefer that. In both cases the gamble to gain/lose a Queen gives you a 50% chance to win the game, but in the first case it is compared against a certain win, while in the second case against an even more certain loss. I think this problem would go away if you used winning probabilities as scores everywhere, because the expectation value of stochastically picked winning probabilities is still a pure winning probability, indistinguishable from any other winning probability of the same value. (If you ignore draws, that is, so that the probability implies the distribution of the two possible outcomes, win or lose.) Not sure how to factor in draws, however. But I guess that would be a general problem in Chess engines, even in orthodox Chess.