Tuning piece values with CLOP

F. Bluemers · Post by **F. Bluemers** » Mon Oct 15, 2012 8:30 pm

Evert wrote:
F. Bluemers wrote: I would have clopped the material values first and would go for
at least 150000 games.
Material converged for Dirty but there was still a range of +- 10 points (pawns),
even after a lot of games.
Values for rooks and queens were noisier for Dirty as far as I remember.
You might have a look at the clop results from sjeng, they are on remi's site.
My results with clop on razor/pruning values did not look convincing to me. The output stayed very noisy, it did not seem to converge.
I had a look at the piece value tuning on the website, I find it hard to interpret the results though since there is no real indication of the error bar.

Interesting that you also found rooks and queens to be quite noisy. I wonder if that's something more people find? Any thoughts on the effect of missing/incorrectly tuned evaluation terms on this?

It might come from some endings that the queen/rook can't win against lesser material (I didn't use tablebases or bitbases).

Odd that the razor and pruning values remained noisy though... doesn't that suggest that there is something wrong there? There should be a clear optimal value for those, right?

I thought so too, but maybe a correct upper or lower bound is enough for them?

Another thing I could not clop was the size for the aspiration window.

diep · Post by **diep** » Mon Oct 15, 2012 8:39 pm

Evert wrote:
diep wrote: So the only question then to Evert is how many parameters he gave to CLOP to tune.
Twelve in total.
Five piece values, two bad exchange modifiers, a bishop pair bonus and four pruning margins (nominally supposed to correspond to "minor", "minor", "rook" and "queen").

12 ^ 5 = 12 * 12 * 12 * 12 * 12 = 248832 games as worst case for CLOP.

You did tens of thousands and it's still big SHJT.

syzygy · Post by **syzygy** » Tue Oct 16, 2012 12:07 am

diep wrote: There is an article from Remi showing mathematical 'proof' so you want that this tuner works.

If you introduce a few fata morgana patterns which already confuse CLOP then, that means that all what Remi Coulom wrote is total BS about CLOP, as it should be able to deal with it.

So you don't think that Remi stated some preconditions for his proof to work, e.g. that there are no redundant parameters?

Evert · Post by **Evert** » Tue Oct 16, 2012 9:11 am

F. Bluemers wrote: It might come from some endings that the queen/rook can't win against lesser material (I didn't use tablebases or bitbases).

Maybe. Jazz has some heuristics for dragging the score closer to 0 if it recognises some of those combinations, and so it should avoid them if possible, but it's not very complete.
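A minimal sketch of the kind of heuristic described here, for illustration only. This is not Jazz's code: the table name `DRAWISH_PAWNLESS`, the function `scaled_score`, the piece-string encoding and the 0.25 factor are all hypothetical. It scales the evaluation toward 0 for a few pawnless material combinations that are usually drawn.

```python
# Hypothetical table of pawnless material signatures that are usually drawn.
# Keys are (stronger side pieces, weaker side pieces), letters sorted,
# kings omitted. The scale factor 0.25 is an arbitrary illustrative choice.
DRAWISH_PAWNLESS = {
    ("R", "B"): 0.25,    # rook vs bishop
    ("R", "N"): 0.25,    # rook vs knight
    ("BR", "R"): 0.25,   # rook + bishop vs rook
    ("NR", "R"): 0.25,   # rook + knight vs rook
    ("NN", ""): 0.0,     # two knights vs bare king
}


def scaled_score(score_cp: int, strong: str, weak: str) -> int:
    """Drag a centipawn score toward 0 if the material looks drawish.

    `strong` and `weak` list the non-king pieces of each side as letters,
    e.g. "RB" and "R". Positions with pawns are assumed to be handled
    elsewhere and should not be passed in.
    """
    key = ("".join(sorted(strong)), "".join(sorted(weak)))
    factor = DRAWISH_PAWNLESS.get(key, 1.0)  # unknown material: no scaling
    return int(score_cp * factor)


if __name__ == "__main__":
    print(scaled_score(200, "R", "B"))   # scaled down
    print(scaled_score(900, "Q", ""))    # left alone
```

A table like this is necessarily incomplete, which is consistent with the concern that rook and queen values can be distorted by endings the evaluation does not recognise.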

I thought so too, but maybe a correct upper or lower bound is enough for them?

Another thing I could not clop was the size for the aspiration window.

I think the aspiration window size doesn't matter much for playing strength, so you'd need tons of games to optimise that. I have no idea what the inherent error margin you can expect from CLOP is, but I would be surprised if you can tune for a few Elo using it (given the spread in piece values you still see at the end).

As for the razor/pruning margins, I suspect it's probably true that you're not very sensitive to the exact value beyond "has to be good enough", but there has to be a difference between 0 and 500cp in terms of how well it works.

Evert · Post by **Evert** » Tue Oct 16, 2012 9:21 am

Well, I stopped my trial last night after some 50000 games and ran a verification gauntlet overnight.

The result was that the new (CLOP optimised) piece values (and margins) perform no better or worse than the old ones.

One possibility is that the strength bottleneck is not actually the piece values, so tuning them does very little. Could there be an issue with the time controls? For optimising under CLOP I used much faster time controls than for the verification match (I suppose this would be easy to test).

Another question: CLOP currently runs single games for a particular parameter (two actually, so I can play the same settings with alternating colours). Would it be better to set the repeat higher (say 100, still far away from what you'd do for a more standard gauntlet) to beat down the random noise a bit?

Rémi Coulom · Post by **Rémi Coulom** » Tue Oct 16, 2012 9:27 am

Evert wrote: Well, I stopped my trial last night after some 50000 games and ran a verification gauntlet overnight.

The result was that the new (CLOP optimised) piece values (and margins) perform no better or worse than the old ones.

One possibility is that the strength bottleneck is not actually the piece values, so tuning them does very little. Could there be an issue with the time controls? For optimising under CLOP I used much faster time controls than for the verification match (I suppose this would be easy to test).

Another question: CLOP currently runs single games for a particular parameter (two actually, so I can play the same settings with alternating colours). Would it be better to set the repeat higher (say 100, still far away from what you'd do for a more standard gauntlet) to beat down the random noise a bit?

You should not expect to get an improvement when tuning 12 parameters with only 50,000 games if they are already reasonably tuned. That's about 4k games per parameter. With so few games, you cannot measure strength accurately enough.

More than two replications per parameter value won't help. Well, it may make computations faster, which will be noticeable only in really high dimensions. But it may hurt the number of games required to find good parameter values.

Rémi
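To put a rough number on "cannot measure strength accurately enough", here is a small sketch that turns a win/draw/loss count into an Elo estimate with a 95% interval. It assumes the usual logistic Elo model and a normal approximation for the mean score; the function names and the example result (1500 wins, 1200 draws, 1500 losses) are made up for illustration and are not taken from the thread.

```python
import math


def elo_from_score(score: float) -> float:
    """Convert a mean score in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_error_bar(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, lower, upper) using a normal approximation.

    z = 1.96 corresponds to a two-sided 95% interval.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    half_width = z * math.sqrt(var / n)
    return (elo_from_score(score),
            elo_from_score(score - half_width),
            elo_from_score(score + half_width))


if __name__ == "__main__":
    # Hypothetical even match of 4200 games, about 50000 / 12.
    elo, lo, hi = elo_error_bar(1500, 1200, 1500)
    print(f"{elo:+.1f} Elo, 95% interval [{lo:+.1f}, {hi:+.1f}]")
```

Under these assumptions a match of about 4200 games still leaves an interval of a bit under ±10 Elo, so differences of a few Elo between parameter sets are not resolved at that sample size.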

Evert · Post by **Evert** » Tue Oct 16, 2012 9:32 am

Rémi Coulom wrote: You should not expect to get an improvement when tuning 12 parameters with only 50,000 games if they are already reasonably tuned. That's about 4k games per parameter. With so few games, you cannot measure strength accurately enough.

Well, most of them are not actually tuned at all.

Still, point taken. Would it generally be better to (a) run (many) more games, or (b) optimise fewer parameters at a time?

Rémi Coulom · Post by **Rémi Coulom** » Tue Oct 16, 2012 9:36 am

syzygy wrote:
diep wrote: There is an article from Remi showing mathematical 'proof' so you want that this tuner works.

If you introduce a few fata morgana patterns which already confuse CLOP then, that means that all what Remi Coulom wrote is total BS about CLOP, as it should be able to deal with it.
So you don't think that Remi stated some preconditions for his proof to work, e.g. that there are no redundant parameters?

I have no proof of anything regarding the convergence of CLOP. Only empirical data. I would not be surprised if it may fail to converge in some high-dimensional settings with a narrow sinuous valley. I am quite convinced it cannot fail in dimension 1, though, but have no mathematical proof.

In practice, I found that CLOP is an order of magnitude faster than any other method I am aware of. I use it a lot, and found that it works very well. It works no miracles, though. If you want to find parameters within a few Elo points of optimal, you'll have to play a lot of games.

Rémi

Rémi Coulom · Post by **Rémi Coulom** » Tue Oct 16, 2012 9:40 am

Evert wrote:
Rémi Coulom wrote: You should not expect to get an improvement when tuning 12 parameters with only 50,000 games if they are already reasonably tuned. That's about 4k games per parameter. With so few games, you cannot measure strength accurately enough.
Well, most of them are not actually tuned at all.

Still, point taken. Would it generally be better to (a) run (many) more games, or (b) optimise fewer parameters at a time?

Both are good ideas. I recommend not tuning independent parameters together. Of course, it is difficult to guess which parameters are independent, and, in fact, there may be a small dependence for most of them.

Rémi

diep · Post by **diep** » Tue Oct 16, 2012 12:00 pm

Rémi Coulom wrote:
Evert wrote: Well, I stopped my trial last night after some 50000 games and ran a verification gauntlet overnight.

The result was that the new (CLOP optimised) piece values (and margins) perform no better or worse than the old ones.

One possibility is that the strength bottleneck is not actually the piece values, so tuning them does very little. Could there be an issue with the time controls? For optimising under CLOP I used much faster time controls than for the verification match (I suppose this would be easy to test).

Another question: CLOP currently runs single games for a particular parameter (two actually, so I can play the same settings with alternating colours). Would it be better to set the repeat higher (say 100, still far away from what you'd do for a more standard gauntlet) to beat down the random noise a bit?
You should not expect to get an improvement when tuning 12 parameters with only 50,000 games if they are already reasonably tuned. That's about 4k games per parameter. With so few games, you cannot measure strength accurately enough.

More than two replications per parameter value won't help. Well, it may make computations faster, which will be noticeable only in really high dimensions. But it may hurt the number of games required to find good parameter values.

Rémi

With an algorithm from 1983 you can already PERFECTLY tune 12 parameters, in this specific program.

Heh, even something childish like TD would be able to tune 12 parameters in the end.

Now you publish something and: "we should not expect an improvement"

Oh come on.

This is high school tuning you know, just 12 parameters in a program with no further complexity in evaluation function. This is as close as you can get to linear programming with such a simple evaluation function.

Can I take away your professor title?