CLOP/cutechess-cli

lucasart · Post by **lucasart** » Tue Feb 07, 2012 9:49 am

I would like to CLOP-timize my piece values. So here's what I have done so far:

* modified my engine so that it understands the command setvalue (as sent by clop-cutechess-cli.py). I have 5 parameters to optimize, the PNBRQ opening values (pieces have the same opening and endgame values for now, and the pawn endgame value is hardcoded to 100; it is the reference point).
* modified the script clop-cutechess-cli.py
* I managed to compile CLOP also, in particular clop-console and clop-gui which are the only ones useful here as I understand.

So it works, and returns "W/L/D" values. When I call the script I can specify a CPU_ID, SEED, and different piece values (amongst my 5 aforementioned parameters).
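
The calling convention described here can be sketched as follows. This is only a minimal sketch, assuming the argument order of CLOP's DummyScript example (processor id, then seed, then alternating parameter names and values) and a single W, L or D printed as the result. The names `parse_clop_args`, `play_one_game` and `main` are hypothetical, and the game itself is stubbed out.

```python
def parse_clop_args(argv):
    """Split CLOP-style arguments into (cpu_id, seed, params).

    Assumed layout: argv[0] = processor id, argv[1] = seed,
    followed by alternating parameter names and values.
    """
    if len(argv) < 2 or len(argv) % 2 != 0:
        raise ValueError("expected: CPU_ID SEED [NAME VALUE]...")
    cpu_id, seed = argv[0], int(argv[1])
    names, values = argv[2::2], argv[3::2]
    params = {name: float(value) for name, value in zip(names, values)}
    return cpu_id, seed, params


def play_one_game(cpu_id, seed, params):
    """Hypothetical stub. A real script would launch cutechess-cli here,
    pass the values to the engine, and map the game result to W, L or D."""
    return "D"


def main(argv):
    """Return the one-letter outcome for the given argument list."""
    cpu_id, seed, params = parse_clop_args(argv)
    return play_one_game(cpu_id, seed, params)


# Example call with explicit arguments (a script would use sys.argv[1:]).
print(main(["cpu1", "42", "P", "90", "N", "400"]))
```

In an actual script the last line would be `print(main(sys.argv[1:]))`, so that whatever launches the script reads the outcome from standard output.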

Now what do I do with that? What value do I give to CPU_ID? As I understand it, CLOP is responsible for sending the SEED and the parameter values, but how do I get that to work together?

PS: While doing this I discovered a few ridiculous bugs in my eval code, which is nice

Rémi Coulom · Post by **Rémi Coulom** » Tue Feb 07, 2012 7:54 pm

CLOP should also send the cpu id. You just have to list your cpus in the .clop file. See DummyScript.clop for an example. Maybe you missed that you have to write a .clop file in order to start your experiment.

Rémi

lucasart · Post by **lucasart** » Fri Feb 10, 2012 4:20 am

Thank you Remi. I've been running my CLOP experiment for over 12 hours now, and piece values are starting to converge nicely.

* What do you think is a reasonable "stopping rule"? Is there a rule of thumb I can apply to the Hessian matrix and its eigenvalues? Do you think 10,000 games, for example, is enough to obtain good estimates of 5 parameters that are as influential as piece values?

* Where do I read the estimated values? There is a tab called Max in clop-gui where I see a column "mean" for my parameters. Is that the mean since the beginning of the experiment (which isn't really what I want, although by a Cesàro argument it would converge too), or is it really the CLOP estimate (which is some kind of weighted mean, right)?

lucasart · Post by **lucasart** » Fri Feb 10, 2012 4:37 am

My two-cent guess for a precondition to stop would be: all eigenvalues are strictly negative. Does that make sense to you?

Rémi Coulom · Post by **Rémi Coulom** » Fri Feb 10, 2012 3:22 pm

Negative eigenvalues are a very bad stopping rule. They may occur extremely early. They may also never happen.

CLOP will optimize forever, so the optimal stopping time depends on how close to the optimum you wish to be. If you want to be within 1 Elo of the best, you'll have to play many more games than if 10 Elo is good enough.
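
As a rough illustration of why the required number of games grows so quickly, here is a back-of-the-envelope calculation. It assumes the standard logistic Elo model and a plain head-to-head match with independent games; it is not an estimate of CLOP's simple regret, only an order of magnitude for resolving a difference of $d$ Elo.

$$
E(d) = \frac{1}{1 + 10^{-d/400}}, \qquad
\left.\frac{dE}{dd}\right|_{d=0} = \frac{\ln 10}{1600} \approx 0.00144 \text{ per Elo}
$$

The standard error of the mean score after $n$ games is at most $0.5/\sqrt{n}$. Requiring two standard errors to be no larger than the score shift caused by $d$ Elo gives

$$
\frac{2 \times 0.5}{\sqrt{n}} \le 0.00144\, d
\quad\Longrightarrow\quad
n \ge \left(\frac{695}{d}\right)^{2}
$$

so roughly 4,800 games for $d = 10$ and roughly 480,000 games for $d = 1$. Draws reduce the variance, so the real figures are somewhat lower, but the factor of 100 between the two cases remains.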

It is not very easy to estimate the expected simple regret. I don't know how to do it. You can take a look at the plots in the paper to get an idea of the expected simple regret for a given number of games, for different kinds of functions.

You can also get an idea of how uncertain parameter estimation is by looking at the weighted samples in the plot.

"Mean" is the weighted mean of all samples. I recommend it as a choice of weight values, because it is more robust than "Max". "Max" may be better in case CLOP is still sampling on the domain boundaries, though, because then boundaries create a bias. That's why I also recommend using wide intervals for parameters: it is better to let CLOP figure out by itself where it should take samples.
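
To make "weighted mean of all samples" concrete, here is a minimal sketch. How CLOP assigns a weight to each sample is not described in this thread, so the weights are simply taken as an input; `weighted_mean` is a hypothetical helper, not part of CLOP.

```python
def weighted_mean(samples, weights):
    """Weighted mean of parameter vectors.

    samples: list of equal-length sequences, one parameter vector per sample
    weights: list of non-negative weights, one per sample (assumed given)
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(samples[0])
    return [
        sum(w * s[i] for s, w in zip(samples, weights)) / total
        for i in range(dim)
    ]


# Two samples of (P, N); the second one carries three times the weight.
print(weighted_mean([[100, 300], [80, 400]], [1, 3]))  # [85.0, 375.0]
```

Samples with a small weight barely move the result, which is why such an estimate can be more stable than the single best point of a fitted model.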

Rémi

lucasart · Post by **lucasart** » Fri Feb 10, 2012 4:34 pm

thank you!

CLOP is really great; it's fascinating to see the convergence slowly happening, and I would never have been able to guess the CLOP-timal values. It's interesting to see how all known theory regarding piece values is a load of BS... For example, knights and bishops are now worth close to 4 pawns, not 3, rooks 6, and queens 11.5.

Of course, this depends on other eval terms that (all other things being equal) already introduce a bias into the piece values, and CLOP is there to help me calculate the bias correction embedded in the piece values.

ethanara · Post by **ethanara** » Fri Feb 10, 2012 4:55 pm

How many games have you played?
I am trying to cloptimize an experimental version of Sungorus. It has played 1150 games so far, but I know it should play many more.
What is your experience?

And to Remi, I have a question:
What does "95% UCB, ELO x, 95% LCB" and "95% UCB, winrate x, 95% LCB " mean? It is annoying me to not know what those values are

Regards
Ethan

lucasart · Post by **lucasart** » Fri Feb 10, 2012 5:06 pm

ethanara wrote: How many games have you played?

Just over 10 thousand games so far (time control 6 s + 0.1 s, 2 games in parallel, as I have a dual-core processor).

ethanara wrote: What is your experience?

Name CLOPUCI
Script ./cutechess-cli/clop-cutechess-cli.py

IntegerParameter P 50  100
IntegerParameter N 200 500
IntegerParameter B 200 500
IntegerParameter R 300 800
IntegerParameter Q 500 1500

Processor cpu1
Processor cpu2

Replications 2
DrawElo 100

H 3
Correlations all

I'm still waiting to stop the experiment, as the 5th eigenvalue is significantly positive and the corresponding eigenvector seems to suggest there is some more P decrease and N/B/Q increase to do, which is in line with the recent variation of the CLOP estimator over the last 2k games or so. I don't know when I'll stop, but I'm guessing 15k should be reasonable.

Rémi Coulom · Post by **Rémi Coulom** » Fri Feb 10, 2012 5:13 pm

ethanara wrote:
And to Remi, I have a question:
What does "95% UCB, ELO x, 95% LCB" and "95% UCB, winrate x, 95% LCB " mean? It is annoying me to not know what those values are
Regards
Ethan

These are bounds on the 95% confidence interval (UCB = upper confidence bound, LCB = lower confidence bound).
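
To illustrate what such bounds look like, here is a minimal sketch for a fixed match result. It uses a plain normal approximation on the mean score and the standard logistic Elo formula. This is an assumption for illustration only: CLOP derives its bounds from its own regression model, and `score_interval_95` and `elo_from_score` are hypothetical helpers.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def score_interval_95(wins, losses, draws):
    """Return (lcb, mean, ucb) for the mean score of a match.

    Normal approximation: mean +/- 1.96 standard errors.
    """
    n = wins + losses + draws
    if n == 0:
        raise ValueError("no games played")
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the score, which takes values 1, 0.5 or 0.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    half_width = 1.96 * math.sqrt(var / n)
    return mean - half_width, mean, mean + half_width


lcb, mean, ucb = score_interval_95(wins=50, losses=50, draws=0)
# Converting to Elo is only valid while both bounds stay inside (0, 1).
print([round(elo_from_score(s), 1) for s in (lcb, mean, ucb)])
```

The interval narrows like one over the square root of the number of games, so halving its width takes about four times as many games.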

Rémi

lucasart · Post by **lucasart** » Sun Feb 12, 2012 4:51 am

Rémi Coulom wrote:"Mean" is the weighted mean of all samples. I recommend it as a choice of weight values

Would you consider adding another column to show the weighted stdev of the parameters? After thousands of samples, I would be curious to see how much CLOP still moves the parameters around. Or even a (weighted) 95% LCB/UCB, if that is feasible.
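
For what it's worth, the quantity asked for here could be computed along these lines. This is a minimal sketch of a weighted standard deviation for a single parameter, using the simple weighted-population form (no bias correction) and taking the sample weights as given; `weighted_std` is a hypothetical helper, not an existing CLOP feature.

```python
import math


def weighted_std(values, weights):
    """Weighted standard deviation of one parameter.

    values:  sampled values of the parameter
    weights: non-negative weights, one per sample (assumed given)
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return math.sqrt(var)


# Two equally weighted samples of one piece value.
print(weighted_std([80, 100], [1, 1]))  # 10.0
```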
