hgm wrote: ↑Wed Jan 13, 2021 5:31 pm
Desperado wrote: ↑Sun Jan 10, 2021 12:46 pm
Code: Select all
Material:       P    N    B    R     Q   |      P    N    B     R     Q
Start:    MG:  80  300  320  500   980   | EG: 100  300  320   500   980
End:      MG:   5   24   32   48    68   | EG: 192  595  610  1020  1990
I have not read this entire discussion, and am entering it only now, so forgive me if I address points that have already been addressed.
Either there is something very wrong in your optimizing algorithm that prevents convergence, or you are not feeding it a wide enough variety of positions to determine the parameters unambiguously. The MG piece values you get as 'optimum' above are unacceptably low. With these values you will get nonsensical result predictions for early middle-game positions. Basically every such position will be predicted as a draw, or at least very close to 50-50, even those where one of the players is a Queen ahead. That should give a huge error, as in practice it would be a 100% result. So it cannot give an optimal fit with Q=68; it should always be better to increase the Q value.
Of course when you have no positions in your test set where one of the players has all pieces, and the other only lacks a Queen, this prediction error will remain totally unnoticed, as the prediction will never have to be made. It becomes a bit like trying to tune the Rook value on a test set of positions none of which contains a Rook: anything flies.
My first guess is that this is your problem. You don't have positions that are heavily unbalanced in a very early game stage in the test set, because you took the positions from high-quality games, and GMs do not blunder away Queens early in the game. Opening positions in high-quality games will always be approximately equal, and by only showing it such positions you deluded the optimizer into thinking that only the game phase matters, and that early middle-game positions can always go either way. To make it understand that being a Queen or Rook behind is totally fatal even if it is the first piece you lose, a significant fraction of the positions should be Queen-odds or Rook-odds positions.
Hello HG,
thanks for joining. To give you a short summary:
From the start of the thread to now every component was simplified to catch the issue. The current setup is...
* material only evaluation
* CPW algorithm (step size 5 or 1)
* standard phase calculation: mg(24), eg(0); minor(1), rook(2), queen(4)
* scaling factor K = 1.0 (to analyse the situation)
* 50K sample size
The conclusion is that it is not about the algorithm or the code implementation; the challenge lies in the data.
As you pointed out, it might be connected with missing/unbalanced properties in the positions.
The latest step was to switch to the publicly available EPD file I mentioned before, which is created from CCRL games.
So was my own, but now everybody has access and can reproduce the issue. I use static evaluation instead of qs(),
but the issue stays nearly the same.
At least my own data was shuffled, and I used batch sizes of 1M to be sure every important type of position is represented.
Unfortunately, nothing changes the result.
Even balancing out the impact of error per phase did not change anything. The MG error still dominates, so the tuner reduces
it much more.
My next thought is (but I'll take a break) that the error is a property of the position. So it might be natural that the MG error
can be reduced much more than the EG error: the distance from an MG position to the outcome of the game is usually bigger.
Having a statistically balanced error across the game phases, now using qs() instead of static eval, might give the desired results.
Alternatively, the amount of error per game phase might be adjusted: more endgame positions or something like that... need to think.
In any case, the problem is caused by the training data.
Need to leave now...