I do have another question about how you guys deal with that problem.
How do you generate a lot of games which aren't too similar? Or, to phrase it differently, how do you generate a lot of unique positions?
Do you use a very large opening book as a starting point?
Greetings
Robin
tuning for the uninformed
Moderators: hgm, Rebel, chrisw
-
- Posts: 931
- Joined: Tue Mar 09, 2010 3:46 pm
- Location: New York
- Full name: Álvaro Begué (RuyDos)
Re: tuning for the uninformed
CheckersGuy wrote:
I do have another question about how you guys deal with that problem. How do you generate a lot of games which aren't too similar? Or, to phrase it differently, how do you generate a lot of unique positions? Do you use a very large opening book as a starting point?
Greetings
Robin
I downloaded games from CCRL, took positions from those games, and analyzed them with my program RuyDos. I saved the positions on which the evaluation function was called after searching 1000 nodes. I then labelled each position by running one very quick SF8-vs-SF8 game.
https://bitbucket.org/alonamaloh/ruy_tu ... th_results
EDIT: In that file each position has been replaced by the position from which quiescence search got its score.
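A minimal sketch of the de-duplication side of this (names hypothetical; assumes you already have a stream of FEN strings from wherever you extract positions):

```python
def unique_positions(fens, max_positions=1_000_000):
    """Collect distinct positions, ignoring the move counters so that
    the same position reached at different move numbers collapses to one."""
    seen = set()
    out = []
    for fen in fens:
        # Keep only piece placement, side to move, castling rights, en passant.
        key = " ".join(fen.split()[:4])
        if key not in seen:
            seen.add(key)
            out.append(fen)
            if len(out) >= max_positions:
                break
    return out
```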
-
- Posts: 349
- Joined: Sat Aug 06, 2016 8:31 pm
- Location: United States
Re: tuning for the uninformed
CheckersGuy wrote:
This is the local search algorithm, but I would assume that it is better to run some gradient-based algorithm first (maybe gradient descent or Gauss-Newton). Then, if the error doesn't change by much anymore, I would switch to local search.
Could you describe these pre-local-search algorithms for a non-mathematician?
Also, suppose you have a file of positions labeled with evals from an oracle engine instead of game results. Is there a way to adapt the error formula used in Texel tuning to take an eval instead of a game result? Perhaps the sigmoid of the oracle eval minus the tuning engine's sigmoid, instead of result - sigmoid?
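In code, that substitution might look something like this (just a sketch of the idea; all names are made up):

```python
def sigmoid(q, K=1.0):
    """Base-10 logistic, as commonly used in Texel tuning."""
    return 1.0 / (1.0 + 10.0 ** (-K * q))

def oracle_error(my_evals, oracle_evals, K=1.0):
    """Squared difference of the two sigmoids: the game result term
    (result - sigmoid) is replaced by (sigmoid(oracle) - sigmoid(mine))."""
    n = len(my_evals)
    return sum((sigmoid(o, K) - sigmoid(m, K)) ** 2
               for m, o in zip(my_evals, oracle_evals)) / n
```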
-
- Posts: 4367
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: tuning for the uninformed
Texel tuning is basically a form of logistic regression. There is a large literature on this, and a lot of explanatory material online. As initially described it is slightly unorthodox because it is using a squared-error loss function to handle the 3 categories of win, loss, draw. But the general principle is still the same.
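A minimal sketch of that squared-error loss (names hypothetical; scores are the eval outputs and results are in {0, 0.5, 1}):

```python
def texel_error(scores, results, K=1.0):
    """Mean squared error between game results and the logistic
    prediction, as in the original Texel tuning description."""
    def sigmoid(q):
        # Base-10 sigmoid; K scales eval units into win probability.
        return 1.0 / (1.0 + 10.0 ** (-K * q))
    n = len(scores)
    return sum((r - sigmoid(q)) ** 2 for q, r in zip(scores, results)) / n
```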
--Jon
-
- Posts: 4367
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: tuning for the uninformed
Henk wrote:
None of this works if the search space has a great many local optima and only very few global optima that you are interested in. But simulated annealing takes too long.
If the loss function is convex, and it generally is, then you are going to be able to converge to a minimum using local search or gradient descent.
Nobody uses simulated annealing. There are much better algorithms.
For large training sets, IMO you want to use a gradient-descent method such as Adam.
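A bare-bones sketch of Adam (hypothetical names; assumes the tuner supplies a grad(theta) function returning the loss gradient):

```python
import numpy as np

def adam(grad, theta, steps=1000, lr=0.001,
         beta1=0.9, beta2=0.999, eps=1e-8):
    """Plain Adam: exponential moving averages of the gradient and its
    square, with bias correction, applied to a parameter vector."""
    m = np.zeros_like(theta)
    v = np.zeros_like(theta)
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1 - beta1) * g          # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g      # second-moment estimate
        m_hat = m / (1 - beta1 ** t)             # bias correction
        v_hat = v / (1 - beta2 ** t)
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta
```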
--Jon
-
- Posts: 4367
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: tuning for the uninformed
Proof of strict convexity for logistic loss function:
http://qwone.com/~jason/writing/convexLR.pdf
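The crux of that proof, paraphrased in standard logistic-regression notation (not quoted verbatim from the linked note):

```latex
\ell(w) = -\sum_i \Big[ y_i \log \sigma(x_i^\top w)
          + (1 - y_i) \log\big(1 - \sigma(x_i^\top w)\big) \Big], \qquad
\nabla^2 \ell(w) = X^\top S X, \quad
S = \operatorname{diag}\big(\sigma_i (1 - \sigma_i)\big),
```

so for any direction $z$, $z^\top \nabla^2 \ell(w)\, z = \sum_i \sigma_i (1-\sigma_i) (x_i^\top z)^2 \ge 0$: the Hessian is positive semidefinite, hence the loss is convex (strictly so under mild conditions on $X$).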
--Jon
-
- Posts: 349
- Joined: Sat Aug 06, 2016 8:31 pm
- Location: United States
Re: tuning for the uninformed
jdart wrote:
Texel tuning is basically a form of logistic regression. There is a large literature on this, and a lot of explanatory material online. As initially described it is slightly unorthodox because it uses a squared-error loss function to handle the 3 categories of win, loss, draw. But the general principle is still the same.
If this was in response to my post directly above it: I have Texel tuning implemented using the local optimization function given as pseudo-code on the CPW, but it sounds like there are ways to get closer first via gradient descent or Gauss-Newton. Unfortunately, what I've read of these so far was over my head.
Regarding the 3 categories (win, loss, draw) being unusual: does that mean using an oracle eval would actually be more appropriate? Would it just plug in where the result (Ri) goes, or does it have to be converted to a -1 to 1 range first (using the sigmoid function)?
-
- Posts: 4367
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: tuning for the uninformed
The theory of logistic regression with discrete outcomes (dependent variables) deals mainly with three cases:
1. binary (0 or 1) outcomes
2. ordinal outcomes (such as poor, average, good).
3. discrete but unordered outcomes.
None of these commonly use a squared error distance to measure goodness of fit, as the Texel method does.
However, our problem is basically 2, but with a twist: there are three possible values for a game (0, 0.5 or 1), but the values are meaningful in the sense that 0.5 is really equidistant between 0 and 1. So in that case the Texel method may be a reasonable approach, it is just not as theoretically grounded.
As for using the eval as an oracle: yes, you can do this. Your label is then the oracle's value and you regress to find the best match between predicted and the oracle value. All this changes in the whole procedure is the "loss function" that measures goodness of fit. You could use mean absolute or squared difference between predicted and oracle eval, for example.
Yet another approach is to make predicted moves based on your eval match actually played moves by an "oracle" such as a strong program or a strong human player. This approach has been used in Shogi - there is a paper by Hoki and Kaneko.
But realize that then you are really solving a different problem. The regression will make your eval conform to the oracle's eval. In effect, this is a fancy way of reverse engineering the oracle's eval. It may or may not make your eval predict game results better, or actually play better, but if the oracle is much stronger then it likely will cause improvement.
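As a sketch of that oracle regression with a linear eval and squared loss (all names hypothetical; features are whatever terms your eval is linear in):

```python
import numpy as np

def fit_to_oracle(features, oracle_evals, steps=2000, lr=0.01):
    """Least-squares fit of linear eval weights to oracle evaluations,
    by gradient descent on the mean squared difference. The learning
    rate should be scaled to your feature magnitudes."""
    X = np.asarray(features, dtype=float)   # one row of features per position
    y = np.asarray(oracle_evals, dtype=float)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        residual = X @ w - y                # predicted minus oracle eval
        w -= lr * (2.0 / len(y)) * (X.T @ residual)
    return w
```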
--Jon
-
- Posts: 4367
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: tuning for the uninformed - Poisson?
It has occurred to me that the correct model here might be Poisson regression (https://en.wikipedia.org/wiki/Poisson_regression). If we ran a match of n games for each position, and if the probability of a win is some value p, then the game results (doubled to make them integers) could be modeled as a Poisson distribution, and the parameters could be tuned using that method to model the outcomes. I don't know if this is completely valid, but it seems plausible. Texel tuning as initially described is the special case of n = 1.
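A sketch of that idea with the usual log link, fit by gradient ascent on the Poisson log-likelihood (hypothetical names; counts would be the doubled match scores):

```python
import numpy as np

def poisson_fit(X, counts, steps=3000, lr=0.01):
    """Poisson regression with log link: counts ~ Poisson(exp(X @ w)).
    The log-likelihood gradient is X.T @ (counts - exp(X @ w))."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(counts, dtype=float)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        mu = np.exp(X @ w)                  # predicted mean count per position
        w += lr * X.T @ (y - mu) / len(y)   # ascend the log-likelihood
    return w
```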
--Jon
-
- Posts: 1334
- Joined: Sun Jul 17, 2011 11:14 am
Re: tuning for the uninformed - Poisson?
jdart wrote:
It has occurred to me that the correct model here might be Poisson regression (https://en.wikipedia.org/wiki/Poisson_regression). If we ran a match of n games for each position, and if the probability of a win is some value p, then the game results (doubled to make them integers) could be modeled as a Poisson distribution, and the parameters could be tuned using that method. Texel tuning as initially described is the special case of n = 1.
This is only mildly related to the topic, but Jon, I think you said that Arasan used a closed-form equation for the derivatives of your eval. Would you mind going over how you calculated those? It confuses me a lot.
For example, even if I take a stupid material-only tapered eval with mean-squared error, I get this:
Code: Select all
Let count_p_w, count_n_w etc be the count of pawns, knights etc for white.
Let count_p_b, count_n_b etc be the count of pawns, knights etc for black.
Let phase_p, phase_n etc be the phase weight of pawns, knights, etc for tapered eval.
Let value_p_o, value_n_o etc be the material value of pawns, knights, etc in the opening.
Let value_p_e, value_n_e etc be the material value of pawns, knights, etc in the endgame.
value_o = (count_p_w - count_p_b) * value_p_o + (count_n_w - count_n_b) * value_n_o + ...
value_e = (count_p_w - count_p_b) * value_p_e + (count_n_w - count_n_b) * value_n_e + ...
total_phase = 16 * phase_p + 4 * phase_n + ...
phase = (count_p_w + count_p_b) * phase_p + (count_n_w + count_n_b) * phase_n + ...
value = ((phase * value_o) + ((total_phase - phase) * value_e)) / total_phase
sigmoid = 1 / (1 + 10 ** (-K*value))
error = (result - sigmoid) * (result - sigmoid)
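For what it's worth, working the chain rule through that pseudo-code for one parameter, say value_p_o, gives something like this (just a sketch; the ln 10 factor appears because the sigmoid uses base 10, and d sigmoid / d value = K * ln(10) * sigmoid * (1 - sigmoid)):

```python
import math

def grad_value_p_o(result, value, K, count_p_w, count_p_b, phase, total_phase):
    """Derivative of (result - sigmoid)^2 with respect to the opening pawn
    value, by the chain rule. sigmoid = 1 / (1 + 10**(-K * value)), and
    value is linear in value_p_o with coefficient
    (count_p_w - count_p_b) * phase / total_phase."""
    s = 1.0 / (1.0 + 10.0 ** (-K * value))
    ds_dvalue = K * math.log(10.0) * s * (1.0 - s)
    dvalue_dparam = (count_p_w - count_p_b) * phase / total_phase
    return -2.0 * (result - s) * ds_dvalue * dvalue_dparam
```

The same pattern applies to every material and phase weight: only the dvalue_dparam factor changes, since value is linear in each weight.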
Some believe in the almighty dollar.
I believe in the almighty printf statement.