I am also interested in tuning piece values. I have tons of variants in which I don't know the piece values (can't even make a guess most of them). Later I may even be forced to determine piece values at run time..
Ok my question is CLOP seems to do local optimization guessing from the name. Can it find optimal values where I have no idea of what kind of games I am about to play ?
What is the advantage of CLOP over other "general" type of response surface models that use for example a radial basis function (RBF) to fit about any curve. Does the fact that the objective function is stochastic affect the choice above ? In the past i have used gradient descent methods to optimize eval parameters one by one. Does CLOP take into consideration correlation of parameters as that seems very important for determining piece values? By that I mean given 5 parameters to tune , does it give me correlation table after I run the optimization ?
Thank you
CLOP for Noisy Black-Box Parameter Optimization
Moderators: hgm, Rebel, chrisw, Ras, hgm, chrisw, Rebel, Ras
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
-
- Posts: 438
- Joined: Mon Apr 24, 2006 8:06 pm
Re: CLOP - a trial for xiangqi engine
Yes, CLOP is completely domain-independent. I never use it for chess, personally.Daniel Shawul wrote:I am also interested in tuning piece values. I have tons of variants in which I don't know the piece values (can't even make a guess most of them). Later I may even be forced to determine piece values at run time..
Ok my question is CLOP seems to do local optimization guessing from the name. Can it find optimal values where I have no idea of what kind of games I am about to play ?
Most of the time, the function to be optimized does not have any trickly local optimum. By taking advantage of this assumption, CLOP can be faster than methods that try to perform global optimization instead of local optimization.What is the advantage of CLOP over other "general" type of response surface models that use for example a radial basis function (RBF) to fit about any curve.
No. It is mainly a choice of local vs global optimization.Does the fact that the objective function is stochastic affect the choice above ?
Yes, CLOP is good at taking correlations into consideration. For chess, Knight and Bishop have very strong correlation, so it is important.In the past i have used gradient descent methods to optimize eval parameters one by one. Does CLOP take into consideration correlation of parameters as that seems very important for determining piece values? By that I mean given 5 parameters to tune , does it give me correlation table after I run the optimization ?
On the web site of CLOP, you can find a data file by Gian-Carlo Pascutto that shows the application of CLOP to his program. CLOP should work well to optimize piece values.
Rémi
-
- Posts: 2286
- Joined: Mon Sep 29, 2008 1:50 am
Issue....
I managed to lose data of 100000 games!
Here is what I did.
I ran some 75000 games. Unfortunately CLOP seemed to be converging to a local maximum which I knew was not optimal.
Since I knew better parameters than what CLOP was converging to, I decided to let CLOP run for 25000 samples using a small window around the known good values. The idea was that after enlarging the window again
the regression would be pulled towards the good values.
Unfortunately this had the effect that the previous 75000 samples were (silently) cropped to the small window, making them worthless:-(
Normally I would have still had the backup file, but since I did not immediately understand what was going on I restarted clop several
times, losing the backup file in the process.
I think it would be more logical if samples lying outside the parameter
window were simply ignored, instead of cropped.
Here is what I did.
I ran some 75000 games. Unfortunately CLOP seemed to be converging to a local maximum which I knew was not optimal.
Since I knew better parameters than what CLOP was converging to, I decided to let CLOP run for 25000 samples using a small window around the known good values. The idea was that after enlarging the window again
the regression would be pulled towards the good values.
Unfortunately this had the effect that the previous 75000 samples were (silently) cropped to the small window, making them worthless:-(
Normally I would have still had the backup file, but since I did not immediately understand what was going on I restarted clop several
times, losing the backup file in the process.
I think it would be more logical if samples lying outside the parameter
window were simply ignored, instead of cropped.
-
- Posts: 438
- Joined: Mon Apr 24, 2006 8:06 pm
Re: Issue....
CLOP crops the display to the current parameter range, but does not lose data. It should take that data into consideration. If you re-widen the ranges again, you should find your old data again. It is only a display problem.Michel wrote:I managed to lose data of 100000 games!
Here is what I did.
I ran some 75000 games. Unfortunately CLOP seemed to be converging to a local maximum which I knew was not optimal.
Since I knew better parameters than what CLOP was converging to, I decided to let CLOP run for 25000 samples using a small window around the known good values. The idea was that after enlarging the window again
the regression would be pulled towards the good values.
Unfortunately this had the effect that the previous 75000 samples were (silently) cropped to the small window, making them worthless:-(
Normally I would have still had the backup file, but since I did not immediately understand what was going on I restarted clop several
times, losing the backup file in the process.
I think it would be more logical if samples lying outside the parameter
window were simply ignored, instead of cropped.
If there is a convergence problem with your data, I am curious to take a look.
Rémi
-
- Posts: 2286
- Joined: Mon Sep 29, 2008 1:50 am
Re: Issue....
Unfortunately this was not the case. This is how I noticed. The data wasIf you re-widen the ranges again, you should find your old data again.
there, but clipped.
I can try to debug it although I currently have a bit of problem finding my way around in the source code.
Code: Select all
/////////////////////////////////////////////////////////////////////////////
// Transform x to the [-1, 1] range
/////////////////////////////////////////////////////////////////////////////
double CIntegerParameter::TransformToQLR(double x) const
{
return -1.0 + 2.0 * (x - Min + 0.5) / (Max - Min + 1.0);
}
Code: Select all
/////////////////////////////////////////////////////////////////////////////
// Transform x to the [Min, Max] range
/////////////////////////////////////////////////////////////////////////////
double CIntegerParameter::TransformFromQLR(double x) const
{
int i = Min + int((Max - Min + 1) * (x + 1.0) * 0.5);
if (i > Max)
i = Max;
return i;
}
Code: Select all
/////////////////////////////////////////////////////////////////////////////
// Transform x to the [Min, Max] range
/////////////////////////////////////////////////////////////////////////////
double CLinearParameter::TransformFromQLR(double x) const
{
return Min + (Max - Min) * (x + 1.0) * 0.5;
}
If there is a convergence problem with your data, I am curious to take a look.
Well I lost my data:-( If I had not lost it there were various things I could
have checked.
EDIT: fixed confusion between IntegerParameter and IntegerGammaParameter.
-
- Posts: 438
- Joined: Mon Apr 24, 2006 8:06 pm
Re: Issue....
You are right, sorry. I'll improve clop for this case.
Rémi
Rémi
-
- Posts: 2286
- Joined: Mon Sep 29, 2008 1:50 am
Re: Issue....
No need to say sorry. Software of this complexity is bound to have little issues.You are right, sorry. I'll improve clop for this case.
Something else: would it be meaningful to choose the prior differently?
Instead of taking the quadratic function ax^2+bx+c with a,b,c Gaussian
with mean zero and fixed stdDev perhaps one could take
a(x-b)^2+c
with a,b,c Gaussian. This would make it possible to prebias CLOP around
a suggested value for b.
EDIT: I guess I am suggesting to trade in the linear term in the quadratic
function for the choice of an extremum.
-
- Posts: 438
- Joined: Mon Apr 24, 2006 8:06 pm
Re: Issue....
I thought about it, and many other alternatives. A major problem with this approach is the non-linearity it introduces. With ax^2+bx+c, strength estimation depends linearly on parameters of the regression. This is an extremely good property to have. It allows very efficient algorithms, and ensures a unique maximum of the posterior. Introducing non-linearities makes the regression a lot more complex.Michel wrote:No need to say sorry. Software of this complexity is bound to have little issues.You are right, sorry. I'll improve clop for this case.
Something else: would it be meaningful to choose the prior differently?
Instead of taking the quadratic function ax^2+bx+c with a,b,c Gaussian
with mean zero and fixed stdDev perhaps one could take
a(x-b)^2+c
with a,b,c Gaussian. This would make it possible to prebias CLOP around
a suggested value for b.
EDIT: I guess I am suggesting to trade in the linear term in the quadratic
function for the choice of an extremum.
I already gave this advice in the README of clop: if you have an idea of good values for the parameters, you can prime the clop experiment by starting with a narrow interval around these values. Then, you can widen parameter range. (I am not 100% certain now, but I expect this would cause no data loss with the current clop). It seems to be the right way to bias the clop process with prior knowledge.
Rémi
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: CLOP for Noisy Black-Box Parameter Optimization
Is that the reason why the elo I calculate from winningRate percentage sometimes do not match the one displayed on the gui with 95% confidence? For example I saw a 47% winning rate suggesting -21 elo, but only -5 elo is displayed. Also since the current selected parameters do not get all the games played (some early games are probably truncated), how are the mean and confidence intervals estimated ?Note that, in general, you cannot use CLOP to estimate the strength of the optimised program accurately. Win rates produced by CLOP are biased. Win rate over all samples is pessimistic. Local win rate tends to be optimistic. The real win rate of optimal parameters is somewhere in-between.
-
- Posts: 2286
- Joined: Mon Sep 29, 2008 1:50 am
Re: CLOP for Noisy Black-Box Parameter Optimization
I can confirm this is safe. It does not lose data.(I am not 100% certain now, but I expect this would cause no data loss with the current clop).
Yes this is what I am doing now. I assume it will work (still only 33000 games).It seems to be the right way to bias the clop process with prior knowledge.
From a theoretical point of view this method appears to be slightly inelegant though...