I wonder how badly this interferes with alpha-beta pruning.
Are you sure there is an impact on the concept of a-b pruning?
What the OP is proposing, I think, is that the static eval would depend on whether the side to move (stm) is white or black, to simulate an opponent with a different evaluation function.
Since the stm is a property of the position, I do not see any problems.
EDIT: Hmm...the proposal seems different, but I confess I do not really understand it as written.
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
I wonder how badly this interferes with alpha-beta pruning.
Are you sure there is an impact on the concept of a-b pruning?
Are we sure the idea actually works when simply doing mini-max?
Maybe I am wrong, but I have severe doubts.
The idea of minimax is that I look for a strategy that optimises my evaluation function against all possible opponent strategies. I do not limit myself to those strategies that I think my opponent prefers assuming my opponent thinks like me.
Yes, the idea works. In game theory it is usually called "backward induction": given a game tree, each player has his own eval and maximizes his own reward in every node where he is to move. I suspect alpha-beta works only for zero-sum games (where one player's eval is simply the negation of the other's), but maybe there's a way to redefine it in this context. Food for thought.
By having different evals it ceases to be a zero-sum game, and this introduces some new concepts. But in principle, minimax is still the approach that maximizes your outcome against optimal play.
One of the new concepts, however, is 'trust': can you trust your opponent to play optimally (for him)? If not, he might not play his best move you counted on (which was also good for you), but might blunder and pick a stupid move that is a bit worse for him, but disastrous for you.
So opponent modeling becomes much more important, and against an opponent you cannot trust you might want to play safe, and avoid moves where a mistake of his can lead to disaster for you.
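One way to make this notion of trust concrete is to score an opponent node as a blend of his modeled best reply and an average over all his replies, so that low trust drags risky positions toward their expected rather than best-case outcome. The sketch below is my own illustration under assumed interfaces, not anything proposed in the thread.

```python
# Sketch: scoring an opponent node when the opponent cannot be fully
# trusted to play his modeled best move. `trust` is an assumed
# parameter in [0, 1]: 1.0 recovers plain minimax against the model,
# 0.0 treats every opponent reply as equally likely.

def opponent_node_value(our_scores_after_his_moves, trust):
    """our_scores_after_his_moves: our eval of each position reachable
    by an opponent move, with index 0 being the move his model prefers.
    Returns our score for this opponent-to-move node."""
    scores = our_scores_after_his_moves
    modeled_best = scores[0]
    expected = sum(scores) / len(scores)
    return trust * modeled_best + (1.0 - trust) * expected
```

With low trust, a node where one opponent blunder is disastrous for us scores much worse than its minimax value, which is exactly the "play safe" effect described above.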
On plies where it is the program's move, we take the max of the scores from our own eval, and on plies where it is the opponent's move, we take the max of the scores from the eval that is supposed to model the opponent.
Ok so the eval of any node is a pair (w,b). So what is the propagation rule?
I mean what is the rule that computes the eval for a node from that of its child nodes? The result should be a pair. The rule above seems to return only a single value.
As I understand it the value of the node is the pair (w,b) that has the best w in nodes where white is to move, and that has the best b in nodes where black is to move.
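That pair-valued propagation rule can be sketched as below; the `Node` class and leaf evals are illustrative assumptions of mine, not code from the discussion.

```python
# Sketch of the pair-valued rule described above: every node gets a
# value pair (w, b); the side to move selects the child pair that
# maximizes his own component, and the other component is simply
# whatever that choice carries along.

class Node:
    def __init__(self, children=(), leaf_value=None):
        self.children = list(children)
        self.leaf_value = leaf_value  # (w, b) pair at leaves

def value(node, white_to_move):
    """Return the (w, b) pair of `node` under backward induction."""
    if not node.children:
        return node.leaf_value
    child_pairs = [value(c, not white_to_move) for c in node.children]
    own = 0 if white_to_move else 1  # index of the mover's component
    return max(child_pairs, key=lambda pair: pair[own])
```

In a two-ply example, white can end up avoiding the branch where black's preferred reply is bad for white, even if that branch also contains white's best leaf: the non-zero-sum interaction discussed in this thread.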
hgm wrote:By having different evals it ceases to be a zero-sum game, and this introduces some new concepts. But in principle, minimax is still the approach that maximizes your outcome against optimal play.
Minimax is indeed still the approach that maximises your outcome against optimal play, but what is being described is not that same minimax.
One of the new concepts, however, is 'trust': can you trust your opponent to play optimally (for him)? If not, he might not play his best move you counted on (which was also good for you), but might blunder and pick a stupid move that is a bit worse for him, but disastrous for you.
Exactly. So your opponent might now have a line of play that you are blind to. So you're not going to do well against all strategies anymore, which means you're not going to do well against (objectively) optimal play.
In my view, the "right" way to take into account expected opponent weaknesses and strengths is by making the evaluation asymmetric. If your opponent playing as black is bad with knights, then assign white a bonus if the position has many knights.
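The knight example above could be sketched as an asymmetric term added to a symmetric material count; the piece values and the bonus constant below are assumptions for illustration, not values from the thread.

```python
# Sketch of an asymmetric eval: symmetric material plus a bonus for
# white per knight on the board, reflecting a black opponent assumed
# to handle knights badly. All numeric values are assumed.

PIECE_VALUES = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9}
KNIGHT_AWKWARDNESS_BONUS = 0.25  # assumed tuning constant

def evaluate(white_pieces, black_pieces):
    """Score from white's point of view; pieces are lists of letters."""
    material = (sum(PIECE_VALUES[p] for p in white_pieces)
                - sum(PIECE_VALUES[p] for p in black_pieces))
    # Asymmetric term: every knight still on the board, either color,
    # is assumed to favor white, since black misplays knight positions.
    knights = white_pieces.count('N') + black_pieces.count('N')
    return material + KNIGHT_AWKWARDNESS_BONUS * knights
```

Because the bonus enters the one shared eval, ordinary zero-sum alpha-beta search still applies unchanged; the modeling lives entirely in the leaf scores.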
If your engine has found the best move, you program it to play a different, inferior move instead: an asymmetric move.
This is poker.
And, as Marco said, useless.
syzygy wrote:Exactly. So your opponent might now have a line of play that you are blind to. So you're not going to do well against all strategies anymore, which means you're not going to do well against (objectively) optimal play.
This doesn't necessarily follow. And the whole point is that you will not need to do well against objectively optimal play, because you will be faced with the silly moves you can seduce your opponent into making, and want to exploit those. It is the kind of strategy that will make you play 2. Qh5 after 1. e4 e5, because you know he will play 2... g6?, and there is no way any solid move will gain you a Rook after 4 moves.
In my view, the "right" way to take into account expected opponent weaknesses and strengths is by making the evaluation asymmetric. If your opponent playing as black is bad with knights, then assign white a bonus if the position has many knights.
Most misconceptions of your opponent cannot be efficiently exploited this way. E.g. when he undervalues Archbishops and thinks they are an equal trade for Knight + Bishop, you have to offer him N + B + P for A to make sure he squanders his Archbishop in the illusion that it gains him a Pawn. I don't know how any asymmetric eval could encourage you to do that.