CLOP for Noisy Black-Box Parameter Optimization

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: CLOP for Noisy Black-Box Parameter Optimization

Post by Rein Halbersma »

Rémi Coulom wrote: I don't really understand your question.

The objective function can be the expected value of any random variable that depends on the parameters. So if, instead of playing one game and observing the result, you play a tournament and observe the LOS against a specific opponent, you can use CLOP to optimize that. But it would be a strange way to use CLOP. If you wish to optimize a program against a set of opponents instead of just one, you can use the "Replications" option of CLOP to play a game against each opponent; CLOP will then maximize the average winning rate against all of them.
I don't mean to maximize the number of points against all these opponents. I mean to maximize the chance that I end up first in a tournament. For example, playing thousands of games against opposition rated 2950 +/- 1 Elo, I would prefer my program to be rated 2940 +/- 20 rather than 2945 +/- 1. The former should have a much larger chance of finishing first in a tournament than the latter, even though it would score fewer points on average.

Is there any way to do such optimization? Can you simply make LOS the objective function?
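A toy Monte Carlo makes the point concrete. It treats each program's tournament performance as a single normal draw whose spread matches the +/- figures above (a deliberate oversimplification, since the +/- from bayeselo is rating uncertainty rather than performance spread) and estimates the chance of topping a hypothetical field of ten such opponents:

Code:

import numpy as np

rng = np.random.default_rng(0)
n_opponents, trials = 10, 100_000

# one normal performance draw per program per simulated tournament
opponents = rng.normal(2950, 1, size=(trials, n_opponents))
safe = rng.normal(2945, 1, size=trials)    # stronger on average, consistent
risky = rng.normal(2940, 20, size=trials)  # weaker on average, volatile

print((safe > opponents.max(axis=1)).mean())   # essentially zero
print((risky > opponents.max(axis=1)).mean())  # roughly 0.3

With these numbers the volatile program wins a meaningful fraction of tournaments, while the slightly stronger but consistent one almost never finishes first.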
Rémi Coulom
Posts: 438
Joined: Mon Apr 24, 2006 8:06 pm

Re: CLOP for Noisy Black-Box Parameter Optimization

Post by Rémi Coulom »

Rein Halbersma wrote: I don't mean to maximize the number of points against all these opponents. I mean to maximize the chance that I end up first in a tournament. For example, playing thousands of games against opposition rated 2950 +/- 1 Elo, I would prefer my program to be rated 2940 +/- 20 rather than 2945 +/- 1. The former should have a much larger chance of finishing first in a tournament than the latter, even though it would score fewer points on average.

Is there any way to do such optimization? Can you simply make LOS the objective function?
Well, the +/- interval in bayeselo's evaluation does not really depend on your program's style. It depends mostly on the number of games you play. If you get the same win rate against the same opponents with the same number of games, you'll have the same confidence interval.

Maybe the interval will be a little narrower if you have more draws. But that's an effect of the specific model bayeselo is using for draws. I don't believe it is meaningful.

If you wish to make your program more aggressive to increase your chances of winning a tournament against stronger opponents, you could optimize its winning rate with CLOP while counting a draw as a loss, as I suggested in my previous message.
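One way to wire that up, assuming a connection script that prints a single W, L or D character like the xboard script later in this thread (play_game.py is a hypothetical stand-in for whatever script you actually use):

Code:

#!/usr/bin/env python
# Wrapper so CLOP optimizes pure winning rate: draws are reported
# as losses. CLOP's arguments (seed, parameter values) are passed
# through to the real connection script, which is assumed to print
# W, L or D.
import subprocess
import sys

out = subprocess.check_output(["python", "play_game.py"] + sys.argv[1:])
result = out.decode().strip()
print("W" if result == "W" else "L")  # fold D (and anything else) into L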

Rémi
petero2
Posts: 684
Joined: Mon Apr 19, 2010 7:07 pm
Location: Sweden
Full name: Peter Osterlund

Re: CLOP for Noisy Black-Box Parameter Optimization

Post by petero2 »

This seems very useful. I have one question, though. What do the Max-1, Max-2, ... columns in the "Max" tab of the GUI mean?
Rémi Coulom
Posts: 438
Joined: Mon Apr 24, 2006 8:06 pm

Re: CLOP for Noisy Black-Box Parameter Optimization

Post by Rémi Coulom »

petero2 wrote: This seems very useful. I have one question, though. What do the Max-1, Max-2, ... columns in the "Max" tab of the GUI mean?
This is a feature I started to implement but was too lazy to finish. The "Mean" column is the weighted mean of all samples. Max-1, Max-2, ... were supposed to be the maximization of the quadratic function, starting from Mean, along each eigenvector one by one, starting from the most negative eigenvalue. In fact, only Max-1 is computed, when the regression is negative definite, and it is the maximum of the regression. Max-2, ... will always remain empty. So Mean is the weighted average of the samples, and Max-1 is the maximum of the regression, when it exists.
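For the curious, the linear algebra behind Max-1 fits in a few lines. This is only a sketch of the computation described above, not CLOP's actual code: with a fitted quadratic f(x) = c + g.x + 0.5 x'Hx, the maximum x* = -H^-1 g exists exactly when the Hessian H is negative definite.

Code:

import numpy as np

def quadratic_max(g, H):
    # Maximum of f(x) = c + g.x + 0.5 x'Hx, or None when H is not
    # negative definite (the case where Max-1 stays empty).
    if np.linalg.eigvalsh(H).max() >= 0:
        return None
    return np.linalg.solve(H, -g)  # solves the stationarity condition g + Hx = 0

# toy regression over two parameters
g = np.array([0.2, -0.1])
H = np.array([[-1.0, 0.2],
              [0.2, -0.5]])
print(quadratic_max(g, H))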

I'll clear that up in the next version.

Rémi
ernest
Posts: 2041
Joined: Wed Mar 08, 2006 8:30 pm

Re: CLOP for Noisy Black-Box Parameter Optimization

Post by ernest »

Hi Rémi,

Since I see you are active here now, would you care to give your opinion in the discussion I initiated on Elo rating at
http://www.talkchess.com/forum/viewtopi ... 196#422196

Merci,

Ernest
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: CLOP for Noisy Black-Box Parameter Optimization

Post by Don »

Rémi Coulom wrote: Comments and questions are welcome.

Rémi
Larry and I are playing around with CLOP using clop-gui.

In the Max tab there appear the columns Mean, Max-1, Max-2, and apparently as many Max-n columns as there are parameters to be tuned.

However, nothing ever appears in those columns after a day of running. What do those columns mean (no pun intended)?

In the paper you mention that this is supposed to converge, so given enough games and time, will this eventually stop running? I told Larry that even with a very efficient algorithm, it is likely to take millions of games to see values that improve on already well-selected values.

And which values are taken to be the final values to use? Are the Max-n columns supposed to represent superior parameters?

Addendum: we tried tuning only a couple of parameters, and values started to appear in the Max-1 column after a few minutes. But we don't really know what they mean.

Thanks.
Rémi Coulom
Posts: 438
Joined: Mon Apr 24, 2006 8:06 pm

Re: CLOP for Noisy Black-Box Parameter Optimization

Post by Rémi Coulom »

Don wrote:
Rémi Coulom wrote: Comments and questions are welcome.

Rémi
Larry and I are playing around with CLOP using clop-gui.

In the Max tab there appear the columns Mean, Max-1, Max-2, and apparently as many Max-n columns as there are parameters to be tuned.

However, nothing ever appears in those columns after a day of running. What do those columns mean (no pun intended)?
I explained earlier in this thread: it is a feature I did not implement. Only Max-1 has a meaning: it is the maximum of the quadratic regression, if it exists; it sometimes does not. Mean is the weighted mean of all samples. I recommend using the values in the "Mean" column for your parameters; they usually perform better according to my artificial experiments.
In the paper you mention that this is supposed to converge, so given enough games and time, will this eventually stop running? I told Larry that even with a very efficient algorithm, it is likely to take millions of games to see values that improve on already well-selected values.
It does not converge in a finite number of games. It converges in the sense that the expected simple regret decreases over time and can be made arbitrarily small with a large enough number of games. Note that I do not have a mathematical proof of this convergence, but I am convinced it does converge if the function to be optimized is continuous.
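For reference, "simple regret" here is the standard notion: the gap between the best achievable win rate and the win rate at the recommended parameters. In LaTeX, with f the win-rate function, x^* its maximizer, and \hat{x}_n the parameters recommended after n games:

Code:

r_n = f(x^*) - f(\hat{x}_n), \qquad \mathbb{E}[r_n] \to 0 \text{ as } n \to \infty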

In the paper I ran experiments on artificial problems. They give an idea of how close to optimal parameters you get with a given number of games. Around a 50% winning rate, one Elo point corresponds to a win-rate difference of about 0.0014. For the Log problem in one dimension, it takes about 10k games to reach an expected simple regret of 0.001 (so, less than one Elo point). For the 5-dimensional problem, it takes a few hundred thousand games to reach 1-Elo expected simple regret (that's 0.2 Elo per parameter).
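The 0.0014 figure is easy to verify from the standard logistic Elo model (a quick sanity check, not a computation from the paper):

Code:

import math

def win_prob(elo_diff):
    # standard logistic Elo model
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

print(win_prob(1) - win_prob(0))  # ~0.00144 win rate per Elo point near 50%
print(math.log(10) / 1600)        # analytic slope at equal strength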

The magic of CLOP is that it approaches optimal performance faster than it can be measured. Simple regret is smaller than the confidence interval around the estimated win rate.
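A back-of-the-envelope check of that last claim, using a plain binomial model of the measured win rate and the 1-D Log problem figures above (my numbers, not from the paper):

Code:

import math

n = 10_000                               # games played on the 1-D Log problem
half_width = 1.96 * math.sqrt(0.25 / n)  # 95% CI half-width of a win rate near 50%
print(half_width)                        # ~0.0098, i.e. about 7 Elo
# The ~0.001 simple regret reached at this point is well inside that interval.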

Rémi
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: CLOP for Noisy Black-Box Parameter Optimization

Post by Don »

Rémi Coulom wrote: [full explanation of Mean, Max-1, and convergence quoted above]
Thanks for explaining that, Rémi. I have a script for chess if anyone wants to use it. For simplicity the script does not have a legal move generator, so draw and win detection is a bit of a kludge, but it works well.
Rémi Coulom
Posts: 438
Joined: Mon Apr 24, 2006 8:06 pm

Re: CLOP for Noisy Black-Box Parameter Optimization

Post by Rémi Coulom »

Don wrote: Thanks for explaining that, Rémi. I have a script for chess if anyone wants to use it. For simplicity the script does not have a legal move generator, so draw and win detection is a bit of a kludge, but it works well.
I can put a link on the CLOP web site, or host it there.

Rémi
F. Bluemers
Posts: 868
Joined: Thu Mar 09, 2006 11:21 pm
Location: Nederland

Re: CLOP for Noisy Black-Box Parameter Optimization

Post by F. Bluemers »

Don wrote:
Thanks for explaining that, Rémi. I have a script for chess if anyone wants to use it. For simplicity the script does not have a legal move generator, so draw and win detection is a bit of a kludge, but it works well.
I wrote one that calls xboard to run the game. I redirect the output to a file and parse it like this to get the outcome:

Code:

for line in open("herr"):
    parts = line.split(' ')
    x = 0
    while x + 1 < len(parts):
        if parts[x] == 'score':
            # the token after 'score' is a wins-losses-draws triple
            parts[x + 1] = parts[x + 1].strip()
            if parts[x + 1] == '0-0-1':
                print('D')
            if parts[x + 1] == '0-1-0':
                print('L')
            if parts[x + 1] == '1-0-0':
                print('W')
        x = x + 1
My first hack in Python, so I bet it can be done a lot better. I parse all lines; if xboard is invoked with -debug, the output contains more lines (the last one holding the result, but I wanted to keep the code simple).
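For what it's worth, the same parse can be written more compactly. This variant keeps the "herr" file name from the post above and assumes the same "score w-l-d" token format:

Code:

result_codes = {'1-0-0': 'W', '0-1-0': 'L', '0-0-1': 'D'}
with open("herr") as f:
    for line in f:
        parts = line.split()  # also strips the trailing newline
        for a, b in zip(parts, parts[1:]):
            if a == 'score' and b in result_codes:
                print(result_codes[b])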