Evert wrote: However, fixing the value of the pawn comes at a price: it is not guaranteed that the probability of winning the game given a +1 evaluation (say) is the same in the middle-game as it is in the end game. In fact, this is very probably not true (and I'm not even thinking about pathological cases where the extra pawn is meaningless; those you should detect and score as nearly equal anyway), but this is a tacit assumption in the tuning method described by Peter.
I don't think there is any such assumption in my method. My method only fixes K, not the value of any evaluation weight (such as the nominal value of a pawn). Also, my method optimizes on the scores returned by the q-search function, which are scalars, not vector values. The interpolation between MG and EG scores has to happen before search sees the score, otherwise alpha-beta would not work.
In fact, my method assumes that there is not a perfect match between q-search scores and win probability. The goal of the optimization is to adjust the weights to make the mismatch smaller.
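As a rough sketch of what "mismatch" could mean here, the code below computes a mean squared error between game results and a logistic mapping of q-search scores. The base-10 sigmoid with divisor 400, the function names, and the sample format are assumptions made for illustration; only the scaling constant K is taken from this post.

```python
def win_probability(score_cp, k):
    """Map a q-search score (centipawns, White's view) to an expected result in [0, 1].

    Assumed form: 1 / (1 + 10^(-k * score / 400)).
    """
    exponent = -k * score_cp / 400.0
    # Clamp so that very large scores cannot overflow the float power.
    exponent = max(-300.0, min(300.0, exponent))
    return 1.0 / (1.0 + 10.0 ** exponent)


def mean_squared_error(samples, k):
    """Average squared difference between game result and predicted result.

    samples: iterable of (qsearch_score_cp, result) pairs, where result is
    1.0 (White win), 0.5 (draw) or 0.0 (Black win). Hypothetical data layout.
    """
    samples = list(samples)
    total = 0.0
    for score_cp, result in samples:
        total += (result - win_probability(score_cp, k)) ** 2
    return total / len(samples)
```

With K held fixed, an optimizer would change the evaluation weights, recompute the q-search scores, and keep any change that lowers this error.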
Also note that if the evaluation function internally uses (MG,EG) pairs for its weights, such weights would correspond to two parameters in the optimization problem. You could also use additional parameters to describe the weighting formula. For example, in texel many evaluation terms are weighted like this:
Code:
S = S_mg                                                           if material >= param1
    S_eg                                                           if material <= param2
    S_mg + (S_eg - S_mg) * (param1 - material) / (param1 - param2) otherwise
The parameters "param1" and "param2" are also included in the optimization problem.
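A minimal Python sketch of that weighting formula, assuming param1 > param2 and a single scalar material measure. The function name interpolate_score is made up for illustration, and this is not texel's actual code.

```python
def interpolate_score(s_mg, s_eg, material, param1, param2):
    """Blend a middle-game and an end-game score based on remaining material.

    Assumes param1 > param2. At or above param1 the pure MG score is used,
    at or below param2 the pure EG score, and a linear blend in between.
    """
    if material >= param1:
        return s_mg
    if material <= param2:
        return s_eg
    return s_mg + (s_eg - s_mg) * (param1 - material) / (param1 - param2)
```

Because param1 and param2 are ordinary arguments, a tuner can treat them as two more parameters alongside s_mg and s_eg.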
Evert wrote: Does that sound right? It does explain why I got nothing sensible when I tried to apply it to just positional weights (but not piece values).
From your post I get the impression that you are not really using the method I described, but instead a variation of it. In my method, the q-search score is used to compute E, the set of training positions includes non-quiet positions and positions where the evaluation score is very large, and no evaluation weights are fixed to a particular value (such as pawn=100).
It is possible to optimize only a subset of the evaluation parameters. However, as I wrote in the description of my method, one big objection to this method is that it assumes that "correlation implies causation". This is also a possible reason you didn't see any improvement. Nevertheless, I have so far increased the strength of texel by about 150 Elo points in 2.5 months using this method. It is quite likely that the method will stop working at some point though, probably long before texel gets close to Stockfish strength.