Tuning piece values with CLOP

diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Tuning piece values with CLOP

Post by diep »

Rémi Coulom wrote:
syzygy wrote:
diep wrote:There is an article from Remi showing a mathematical 'proof', so you would expect this tuner to work.

If introducing a few fata morgana patterns already confuses CLOP, then everything Remi Coulom wrote about CLOP is total BS, as it should be able to deal with that.
So you don't think that Remi stated some preconditions for his proof to work, e.g. that there are no redundant parameters?
I have no proof of anything regarding the convergence of CLOP. Only empirical data. I would not be surprised if it may fail to converge in some high-dimensional settings with a narrow sinuous valley. I am quite convinced it cannot fail in dimension 1, though, but have no mathematical proof.

In practice, I found that CLOP is an order of magnitude faster than any method I am aware of. I use it a lot, and found that it works very well in practice. It makes no miracle, though. If you want to find optimal parameters a few elo points from optimal, you'll have to play a lot of games.

Rémi
"a few Elo points".

200 Elo points is a few?

Define a hard limit on 'a few' instead of talking your way out of it...
syzygy
Posts: 5554
Joined: Tue Feb 28, 2012 11:56 pm

Re: Tuning piece values with CLOP

Post by syzygy »

diep wrote:
Rémi Coulom wrote:
syzygy wrote:
diep wrote:There is an article from Remi showing a mathematical 'proof', so you would expect this tuner to work.

If introducing a few fata morgana patterns already confuses CLOP, then everything Remi Coulom wrote about CLOP is total BS, as it should be able to deal with that.
So you don't think that Remi stated some preconditions for his proof to work, e.g. that there are no redundant parameters?
I have no proof of anything regarding the convergence of CLOP. Only empirical data. I would not be surprised if it may fail to converge in some high-dimensional settings with a narrow sinuous valley. I am quite convinced it cannot fail in dimension 1, though, but have no mathematical proof.

In practice, I found that CLOP is an order of magnitude faster than any method I am aware of. I use it a lot, and found that it works very well in practice. It makes no miracle, though. If you want to find optimal parameters a few elo points from optimal, you'll have to play a lot of games.

Rémi
"a few Elo points".

200 Elo points is a few?

Define a hard limit on 'a few' instead of talking your way out of it...
Maybe you missed "a lot of games" ?

Clearly 50,000 game outcomes heavily impacted by randomness cannot give sufficient information for determining optimal values of 12 parameters. You can't blame any algorithm for that.
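
To put rough numbers on how noisy game outcomes are, here is a minimal sketch of the 95% error margin of a plain head-to-head match between two nearly equal engines. It uses the usual logistic Elo model; the 30% draw ratio and the function names are assumptions for illustration, not something from this thread, and it describes a simple match rather than CLOP's own regression.

```python
import math

def elo_from_score(score: float) -> float:
    """Convert an expected score in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_margin_95(n_games: int, draw_ratio: float = 0.3) -> float:
    """Approximate 95% half-width, in Elo, for a match between near-equal engines.

    draw_ratio is an assumed value; measure it for your own engine and time control.
    """
    # Per-game score variance at a 50% expected score: wins and losses each
    # deviate by 0.5 from the mean, draws by 0.
    variance = (1.0 - draw_ratio) * 0.25
    std_error = math.sqrt(variance / n_games)
    return elo_from_score(0.5 + 1.96 * std_error)

for n in (1_000, 4_000, 50_000):
    print(f"{n:>6} games: +/- {elo_margin_95(n):.1f} Elo")
```

Under these assumptions the margin shrinks only with the square root of the number of games, which is why resolving differences of a few Elo points takes so many of them.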
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Tuning piece values with CLOP

Post by Adam Hair »

diep wrote:
Rémi Coulom wrote:
Evert wrote:Well, I stopped my trial last night after some 50000 games and ran a verification gauntlet overnight.

The result was that the new (CLOP optimised) piece values (and margins) perform no better or worse than the old ones.

One possibility is that the strength bottleneck is not actually the piece values, so tuning them does very little. Could there be an issue with the time controls? For optimising under CLOP I used much faster time controls than for the verification match (I suppose this would be easy to test).

Another question: CLOP currently runs single games for a particular parameter (two actually, so I can play the same settings with alternating colours). Would it be better to set the repeat higher (say 100, still far away from what you'd do for a more standard gauntlet) to beat down the random noise a bit?
You should not expect to get an improvement when tuning 12 parameters with only 50,000 games if they are already reasonably tuned. That's about 4k games per parameter. With so few games, you cannot measure strength accurately enough.

More than two replications per parameter value won't help. Well, it may make computations faster, which will be noticeable only in really high dimensions. But it may hurt the number of games required to find good parameter values.

Rémi

With an algorithm from 1983 you can already PERFECTLY tune 12 parameters in this specific program.

Heh, even something childish like TD would be able to tune 12 parameters in the end.

Now you publish something and say: "we should not expect an improvement".

Oh, come on.

This is high-school tuning, you know: just 12 parameters in a program with no further complexity in its evaluation function. With such a simple evaluation function, this is as close as you can get to linear programming.

Can I take away your professor title?
<moderation>

Vincent, it would be much more acceptable if you were to be less vague and less aggressive.

<moderation>
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Tuning piece values with CLOP

Post by Evert »

Yesterday morning I started a new tuning run, this time with just the base piece values. No bad trade penalties, no bishop pair, no razor margins (these are all still in the code, obviously).

After ~40000 games, piece values seemed to go to what I said before: bishop and knight go to ~400 points (and are very strongly correlated, already visible after a few thousand games; maybe this is the effect HGM discussed the other week), rook goes to ~550 points (which is actually its value in the code), queen goes to ~1200, but is all over the place really. The pawn actually goes down to ~65 points (!). Now, I haven't actually validated these values, but to me they don't "look" right. What I noticed in the other run is that the value of the minor pieces came down when the razor margin was also tuned down, from ~300 to ~200. My interpretation of this is that what the code is actually doing is tuning the value of a minor piece to 100 points above the razor margin. That's interesting if true, and I'll run some tests to verify this.

The run is now at 60000 games and something interesting is happening: there is a strong clustering of points around the piece values I just mentioned, but in most plots a second "peak" seems to be developing, with knight ~ bishop ~ 350, rook ~550, queen ~950, pawn ~80. I'll leave it running, but it'll be interesting to see where this goes.

All in all, an interesting experience, but not quite as clear-cut and smooth as I had hoped for.
nionita
Posts: 175
Joined: Fri Oct 22, 2010 9:47 pm
Location: Austria

Re: Tuning piece values with CLOP

Post by nionita »

I don't understand why the pawn is also variable. Something must represent the unit, and this should be the pawn, fixed at 100 (centipawns). All other values will be then in centipawns. (Or 1000 millipawns, and then everything is in millipawns, etc)

Then, by keeping the other terms of the evaluation function fixed (they are not tuned in this experiment), you somehow guide the piece values. But (static) piece values are, for me, something so basic that I would say they must be determined before every other term. Only after that can you introduce other evaluation terms and tune them (with the static piece values fixed at the ones found before). Then of course those terms depend on the piece values, which means the pieces have different values depending on the concrete position (i.e., dynamic piece value = static piece value + terms related to that piece + some part for correlations between pieces).
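
To illustrate the "pawn as the unit" idea, here is a small sketch that rescales a set of tuned values so that the pawn is exactly 100. The two input sets are the approximate values Evert reported for his two clusters; the function name is made up for this example. Note that a pure rescaling like this says nothing about the untuned evaluation terms, whose weight relative to material changes along with the pawn value.

```python
def normalise_to_pawn(values: dict[str, float], unit: float = 100.0) -> dict[str, float]:
    """Rescale all piece values so that the pawn equals `unit` (centipawns)."""
    scale = unit / values["pawn"]
    return {piece: round(value * scale, 1) for piece, value in values.items()}

# Approximate values from Evert's run (first cluster and the emerging second peak).
first_peak = {"pawn": 65, "knight": 400, "bishop": 400, "rook": 550, "queen": 1200}
second_peak = {"pawn": 80, "knight": 350, "bishop": 350, "rook": 550, "queen": 950}

for name, values in (("first peak", first_peak), ("second peak", second_peak)):
    print(name, normalise_to_pawn(values))
```

Expressed in pawns, both clusters put the pieces well above the conventional values, which is one way to see why the raw numbers do not "look" right.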
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Tuning piece values with CLOP

Post by Evert »

nionita wrote:I don't understand why the pawn is also variable. Something must represent the unit, and this should be the pawn, fixed at 100 (centipawns). All other values will be then in centipawns. (Or 1000 millipawns, and then everything is in millipawns, etc)
Nominally it is, but very few programs actually do that. The reason is the value of the pawn varies a lot throughout the game. I think most programs have the (opening) base value of the pawn below 100 "cp".

Of course the real value of a pawn (as with any piece) is not its static value, but a combination of all evaluation terms. That may bring the value back up to 100 (or, if everything is properly centred, the base value may already be close to that), or perhaps slightly less in the opening and slightly more in the end game. The point though is that the base value alone doesn't tell you everything.

You are right though that varying the value of the pawn but not other evaluation terms changes the relative weight of material versus other terms. I don't think that's automatically bad though.
Then, by keeping the other terms of the evaluation function fixed (they are not tuned in this experiment), you somehow guide the piece values. But (static) piece values are, for me, something so basic that I would say they must be determined before every other term. Only after that can you introduce other evaluation terms and tune them (with the static piece values fixed at the ones found before). Then of course those terms depend on the piece values, which means the pieces have different values depending on the concrete position (i.e., dynamic piece value = static piece value + terms related to that piece + some part for correlations between pieces).
I've sort of done that, of course, by starting with (more or less) canonical piece values. What I'm doing now is retuning them after adding in other evaluation terms. I don't think there's much point in ripping out all other evaluation terms and keeping just the material, just to re-derive the standard piece values. Even more so because doing so will drop the strength of the program by hundreds of Elo points.
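
The point about the pawn being worth less in the opening and more in the endgame is usually implemented as a tapered evaluation. A minimal sketch follows; the opening and endgame values and the 24-step phase scale are hypothetical and chosen only for illustration, they are not taken from any program mentioned here.

```python
def tapered(opening: int, endgame: int, phase: int, max_phase: int = 24) -> int:
    """Interpolate between an opening and an endgame value.

    phase == max_phase means full opening material, phase == 0 a bare endgame.
    """
    phase = max(0, min(phase, max_phase))
    return (opening * phase + endgame * (max_phase - phase)) // max_phase

# Hypothetical pawn values, below 100 cp in the opening and above it in the endgame.
PAWN_OPENING, PAWN_ENDGAME = 80, 120

for phase in (24, 12, 0):
    print(f"phase {phase:>2}: pawn = {tapered(PAWN_OPENING, PAWN_ENDGAME, phase)} cp")
```

With a scheme like this there is no single "base value" of the pawn, so fixing it at 100 only pins down one point on the interpolation.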
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Tuning piece values with CLOP

Post by diep »

syzygy wrote:
diep wrote:
Rémi Coulom wrote:
syzygy wrote:
diep wrote:There is an article from Remi showing a mathematical 'proof', so you would expect this tuner to work.

If introducing a few fata morgana patterns already confuses CLOP, then everything Remi Coulom wrote about CLOP is total BS, as it should be able to deal with that.
So you don't think that Remi stated some preconditions for his proof to work, e.g. that there are no redundant parameters?
I have no proof of anything regarding the convergence of CLOP. Only empirical data. I would not be surprised if it may fail to converge in some high-dimensional settings with a narrow sinuous valley. I am quite convinced it cannot fail in dimension 1, though, but have no mathematical proof.

In practice, I found that CLOP is an order of magnitude faster than any method I am aware of. I use it a lot, and found that it works very well in practice. It makes no miracle, though. If you want to find optimal parameters a few elo points from optimal, you'll have to play a lot of games.

Rémi
"a few Elo points".

200 Elo points is a few?

Define a hard limit on 'a few' instead of talking your way out of it...
Maybe you missed "a lot of games" ?

Clearly 50,000 game outcomes heavily impacted by randomness cannot give sufficient information for determining optimal values of 12 parameters. You can't blame any algorithm for that.
Tuning 12 parameters is nearly the same as simple linear programming.

We already know his algorithm is pathetically bad compared to existing methods of linear tuning, say from 1983. There are half a million variations of that algorithm now, all with exotic names.

In that sense it's a totally useless publication; I didn't even take the effort to try his algorithm, of course. You can write anything if you never google for other people's publications in other fields where they also tune...

Now it appears it doesn't even tune this algorithm.

If you look at the fluctuation of the parameters as presented, this means the algorithm simply doesn't converge and can get lost in local optima.

This already for 12 parameters... even the biggest beancounter has 50+ parameters... which is trivial to tune, of course.

A lot harder are the hundreds of parameters that were tuned in Rybka 1.0, in case everyone forgot about all those material parameters that were tuned there.

How many games were played to achieve that?

Guys like Donninger find all this so simple that he says: "it's only about creating the chess knowledge". That's how simple he considers all this tuning for a beancounter... and that's with a lot fewer parameters.

And a lot fewer games.
Eelco de Groot
Posts: 4557
Joined: Sun Mar 12, 2006 2:40 am

Re: Tuning piece values with CLOP

Post by Eelco de Groot »

F. Bluemers wrote:
Evert wrote:
F. Bluemers wrote:I would have clopped the material values first and would go for at least 150000 games.
Material converged for Dirty, but there was still a range of +- 10 points (pawns), even after a lot of games.
Values for rooks and queens were noisier for Dirty, as far as I remember.
You might have a look at the CLOP results from Sjeng; they are on Remi's site.
My results with CLOP on razor/pruning values did not look convincing to me. The output stayed very noisy; it did not seem to converge.
I had a look at the piece value tuning on the website, I find it hard to interpret the results though since there is no real indication of the error bar.

Interesting that you also found rooks and queens to be quite noisy. I wonder if that's something more people find? Any thoughts on the effect of missing/incorrectly tuned evaluation terms on this?
It might come from some endings that the queen/rook can't win against lesser material (I didn't use tablebases or bitbases).

Odd that the razor and pruning values remained noisy though... doesn't that suggest that there is something wrong there? There should be a clear optimal value for those, right?
I thought so too, but maybe a correct upper or lower bound is enough for them?

Another thing I could not clop was the size for the aspiration window.
I have not really looked at how CLOP works, but I was just wondering whether having twelve or more different parameters to tune at once is really such a good thing, notwithstanding that Vincent thinks that is peanuts for any linear programming optimization.

As for the noise in the rook and queen values, my suspicion is that it has something to do with the fact that the value of a queen is a bit hard to define: you usually exchange it for a queen, and both rook and queen should have some redundancy factor in case you have more than one of them (see Larry Kaufman's site). Also, the base values of the pieces can be radically altered by a material imbalance table (apart from the already mentioned redundancy factors). In Stockfish especially, the material imbalance had a big influence on the value of the knight, but that is mostly a linear factor, so it could be transferred to the base material value.

I think I would rather tune one value at a time (and after tuning PNBRQ, repeat the process for all the pieces until you reach some equilibrium), and even split the endgame and midgame material values. So tune just the endgame material value of knights, using a lot of more or less random endgame positions. That isn't perfect, and I think Harm Geert Muller argues that you always have to base material values on whole games, so that is the opposite view. Look at the value for pawns in Komodo though: the difference between the opening value and the endgame value is really huge, and then you are not even counting (for now) things such as bonuses for candidate and passed pawns. Tuning that (i.e. one base value of a pawn) to one base value that you then want to set at 100 (to better compare with chess theory and other programs) becomes a somewhat meaningless exercise in my eyes...

Eelco
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
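
The one-value-at-a-time scheme described in this post (tune each piece in turn, then repeat until nothing changes) is essentially coordinate descent. Below is a minimal sketch under strong assumptions: `toy_objective` is a made-up, noise-free stand-in for "play a match and measure strength", and its target values are arbitrary. A real objective would be noisy and would need many games per evaluation.

```python
from typing import Callable, Dict

Values = Dict[str, int]

def tune_one_at_a_time(start: Values,
                       objective: Callable[[Values], float],
                       step: int = 10,
                       max_sweeps: int = 50) -> Values:
    """Adjust one parameter at a time; stop when a full sweep brings no gain."""
    values = dict(start)
    best = objective(values)
    for _ in range(max_sweeps):
        improved = False
        for piece in list(values):
            for delta in (step, -step):
                trial = dict(values)
                trial[piece] += delta
                score = objective(trial)
                if score > best:
                    values, best, improved = trial, score, True
                    break  # accept the step and move on to the next piece
        if not improved:
            break  # equilibrium: no single-parameter step helps
    return values

def toy_objective(v: Values) -> float:
    """Hypothetical noise-free objective with an arbitrary optimum."""
    target = {"knight": 320, "bishop": 330, "rook": 500, "queen": 950}
    return -sum((v[p] - target[p]) ** 2 for p in target)

start = {"knight": 300, "bishop": 300, "rook": 450, "queen": 900}
print(tune_one_at_a_time(start, toy_objective))
```

The toy objective is separable, so each parameter can be optimised independently. With strongly correlated parameters, such as the knight and bishop values reported earlier in the thread, this kind of scheme can converge much more slowly.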
hgm
Posts: 27701
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Tuning piece values with CLOP

Post by hgm »

Rémi Coulom wrote:In practice, I found that CLOP is an order of magnitude faster than any method I am aware of. I use it a lot, and found that it works very well in practice. It makes no miracle, though. If you want to find optimal parameters a few elo points from optimal, you'll have to play a lot of games.
That could be the case, but the problem is that the whole method of tuning by empirically trying what effect program parameters have is a bust. You can do with orders of magnitude fewer games by simply measuring what the piece values are, instead of measuring the effect of what the program thinks they are.
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Tuning piece values with CLOP

Post by Evert »

Ok, this is starting to get annoying.

I disabled all "extra" terms: exchange penalties, razor and futility margins. I fixed the value of the pawn at 100 to keep things simple, leaving just the values of the pieces to tune. In this case I just used self-play against the same untuned version of the program (without those extra terms), since the drop in strength due to disabling those terms is enough to make the gauntlet unreliable.

After some 50000 games the result actually converged. In fact, the difference in piece value between the "mean" and "max" columns was stable at 1 or 2 cp. The (weighted) parameter vs. parameter plots now looked like nice regular ellipses. The strength indicator (I know it's unreliable) showed +100 +/- 20 Elo.

So I ran a verification match with the exact same parameters, exact same time control and same opponent (i.e., the untuned version of the program). The result shows -20 Elo for the tuned version over the untuned version.

I just don't get it...
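
One way to judge whether a verification result like this is significant is to compute the Elo estimate together with the likelihood of superiority (LOS) from the win/draw/loss counts. The sketch below uses the usual logistic Elo formula and the common normal approximation for LOS; the match result in it is invented for illustration, since the actual counts were not posted.

```python
import math

def elo_estimate(wins: int, draws: int, losses: int) -> float:
    """Elo difference implied by a match score (logistic model)."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    return -400.0 * math.log10(1.0 / score - 1.0)

def likelihood_of_superiority(wins: int, losses: int) -> float:
    """Probability that the first engine is stronger (normal approximation).

    Draws carry no information about which side is better, so they drop out.
    """
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))

# Hypothetical 1000-game verification match from the tuned version's point of view.
wins, draws, losses = 330, 300, 370
print(f"Elo: {elo_estimate(wins, draws, losses):+.1f}")
print(f"LOS: {likelihood_of_superiority(wins, losses):.1%}")
```

If the LOS of a result like this is not close to 0% or 100%, the match alone cannot distinguish a small regression from noise, although it would still be hard to reconcile with a claimed +100 Elo.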