Daniel Shawul wrote: What about the evaluation to be used to get the games? What effect does it have if I use different values for Q and A? I am guessing Q and A can be set to the same value there. Then Q vs A + P and Q vs A are played and equations obtained as you explained. What effect will the values assigned to the rest of the pieces have?
I always set the values slightly different, to minimize the probability of unforced trading and to keep the imbalance during as large a part of the game as possible. With different values there is always one side that tries to avoid the trade.
Funnily enough, the sign of the difference does not seem to matter very much. Whether I set C = Q - 25cP or C = Q + 25cP, in both cases the Q has the upper hand in Q-C imbalances with about 58%. But when the values turn out not to be as I expected, I always redo the experiment with corrected piece values. In principle one should iterate to self-consistency, but my initial guesses are usually good enough that the second iteration just confirms the result of the first. (With Ultima that might be a bit different!)
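For readers who prefer Elo over percentages, a score like the 58% above can be converted with the standard logistic Elo formula. This is only a minimal sketch; the function name `elo_from_score` is made up for illustration and is not part of any engine or tool mentioned here.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction (0 < score < 1),
    using the standard logistic Elo model."""
    return 400.0 * math.log10(score / (1.0 - score))

# The 58% quoted above corresponds to roughly 56 Elo.
print(round(elo_from_score(0.58)))
```

Note that this says nothing about how many centiPawns the advantage is worth; that conversion depends on the engine and would have to be calibrated separately, e.g. from Pawn-odds games.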
How deep are the searches? I.e. what is the time control for the games?
I used 1-min games on a 2.4GHz Core 2 Duo. At some point I was worried that Knights would be under-estimated at faster TC compared to Bishops, because they would start to play very stupidly in the end-game, as promotions would come within the horizon only when it was too late for the Knight to catch the passer, while a Bishop can usually see it early enough in a 4-ply search. So I did some systematic studies with B vs N+P imbalances, at 30 sec, 1 min, 2 min and 5 min. But I did not find any systematic dependence of the score on the TC in that range.
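Whether two time controls give "the same" score can be judged by comparing the difference with the statistical error of each match. The following is a rough sketch of such a check, not the procedure actually used for the tests above; the function names and the 2-sigma threshold are assumptions chosen for illustration.

```python
import math

def score_and_error(wins, draws, losses):
    """Return (score fraction, standard error) for one match result."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, which is 1, 0.5 or 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    return score, math.sqrt(var / n)

def differ_significantly(result_a, result_b, z=2.0):
    """True if two (score, error) pairs differ by more than z combined sigmas."""
    (score_a, err_a), (score_b, err_b) = result_a, result_b
    return abs(score_a - score_b) > z * math.hypot(err_a, err_b)
```

One would call `score_and_error` once per time control with the win/draw/loss counts of that run and then pass the two results to `differ_significantly`. With a thousand games per run the standard error is on the order of 1.5%, so only score differences of several percent would stand out.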
Have you thought about Monte Carlo? Not that it is a good alternative to real games, but just to compare the effect of brute-force statistics vs smart methodology in arriving at the optimal values. I have a playout searcher that I want to refine a bit and play many games with, to see if it comes up with anything useful. But it is kind of dumb right now throwing away. For checkers it is a little bit smarter, since captures are forced there.
I have never tried that. It would be interesting to see if it would work; if it does, it would be a real time saver.
I don't know if a change of 200cp can take that many games to detect and that it is only 10cp. Maybe the queen was grossly mis-evaluated in the first place. For very close calls involving exchanges, R vs B for instance, that 200cp difference should show up quickly. But with the aggressive tuning of today's engines I wouldn't be surprised. The piece values determined are usually good for that engine only. Queen, bishop pair, etc. have been highly over-evaluated in many engines, at least according to human grandmasters. Also, in modern engines there are many positional factors that contribute to the overall eval, so the piece values are for that eval only. Since our engines are simplistic except for the piece-square tables, I would say we should get representative piece values that humans would find acceptable. An aggressive king-safety evaluator usually needs its queen badly. Which reminds me about different piece values in middle/end game phases too. For the endgames it should be quicker to determine, as the games are started from highly advanced positions.
Well, it is easy enough to test this (intentionally mistuned engine against a normal one). The Queen might be an unfavorable case, as it is difficult to trade against other material anyway. Reducing the value to 7 will still not reverse the sign of any simple trade; you would need at least Q vs R+B or something like that. Setting the Rook less valuable than the Knight would lead to squandering of the Rook much more often, through simple R-B or R-N trades, for which there should be many opportunities, considering that you have two Rooks and the opponent has 4 minors. But then we are of course also talking about a much larger percentage error (300 instead of 500). Setting Q as low as R+P would probably also have a larger impact.
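The point about which trades change sign can be checked by brute force. The sketch below assumes the classical values N = B = 3, R = 5, Q = 9 (in Pawns) as the "true" values. Those numbers, like the helper name `reversed_trades`, are assumptions for illustration and not the values of any particular engine.

```python
from itertools import combinations_with_replacement

# Assumed "true" values in Pawns (classical textbook numbers).
NORMAL = {"N": 3, "B": 3, "R": 5, "Q": 9}

def reversed_trades(piece, mistuned_value, values=NORMAL, max_pieces=2):
    """List trades of `piece` for up to `max_pieces` other pieces whose
    sign flips when the engine believes `piece` is worth `mistuned_value`.
    Trades that come out exactly even are not counted as flipped."""
    others = [p for p in values if p != piece]
    flipped = []
    for k in range(1, max_pieces + 1):
        for combo in combinations_with_replacement(others, k):
            total = sum(values[p] for p in combo)
            true_diff = values[piece] - total   # what the trade is really worth
            seen_diff = mistuned_value - total  # what the mistuned engine thinks
            if true_diff * seen_diff < 0:
                flipped.append("+".join(combo))
    return flipped

print(reversed_trades("Q", 7))    # ['N+R', 'B+R']
print(reversed_trades("R", 2.9))  # ['N', 'B']
```

Under these assumed values a Queen set to 7 only misjudges two-for-one trades, while a Rook set just below the Knight already misjudges the simple one-for-one trades against either minor.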
Indeed, end-game values can be determined through the same method, but I did not do very many of them. Just enough to make sure the surprisingly high Archbishop value was not game-stage dependent. (Sometimes you have this, that a piece that is essentially worthless in the end-game has very good forking power on a crowded board, so that you can almost always force a trade for something intrinsically more valuable. For instance with a Camel (= (3,1) leaper, where Knight = (2,1) leaper).)
What I do in such a case is select a number of diverse and plausible Pawn structures, where each side has 4-6 Pawns (always in pairs that are color-conjugated mirror images as far as the Pawns are concerned), and then add the piece imbalance in a tactically passive way. E.g. if I give white Pawns on a2, b3, c4, g2, h2, and black Pawns on a7, b6, f7, g6, h7, and want to play A vs R+N, I would put the A and R on a1/a8, and the N on f1/f8. Some Pawn structures would be symmetric, others would have passers, two groups of 3, three groups of 2, etc. And then just play a thousand games and see who has the upper hand.
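Generating the color-conjugated partner of such a start position is mechanical: flip the ranks and swap the colors. Here is a minimal sketch using the example above. It assumes white holds the A in the first position of the pair, it leaves out the Kings because their squares are not specified above, and the dict representation and the name `conjugate` are made up for illustration.

```python
def conjugate(position, ranks=8):
    """Return the color-conjugated mirror image of a position:
    every piece moves to the vertically mirrored square and changes color.
    `position` maps squares like 'a2' to piece letters
    (upper case = white, lower case = black)."""
    mirrored = {}
    for square, piece in position.items():
        file, rank = square[0], int(square[1:])
        mirrored[f"{file}{ranks + 1 - rank}"] = piece.swapcase()
    return mirrored

# First position of the pair: white A on a1 vs black R on a8 + N on f8.
start = {sq: "P" for sq in ("a2", "b3", "c4", "g2", "h2")}
start.update({sq: "p" for sq in ("a7", "b6", "f7", "g6", "h7")})
start.update({"a1": "A", "a8": "r", "f8": "n"})

# Second position: black has the A, white the R+N, Pawn structures swapped.
partner = conjugate(start)
```

Playing both positions of each pair makes sure that neither the Pawn structure nor the side to move systematically favors one side of the imbalance.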