Tuning piece values with CLOP
-
Evert
- Posts: 2929
- Joined: Sat Jan 22, 2011 12:42 am
- Location: NL
Re: Tuning piece values with CLOP
hgm wrote:
    You can do with orders of magnitude fewer games by simply measuring what the piece values are, instead of measuring the effect of what the program thinks they are.

And how do you reliably measure what piece values are? I'd love to know, because I can't think of a reliable way to do it. Getting rough estimates by playing material imbalances off each other I can understand (although I've never done it), but then the imbalance itself might mess up the result.
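A note on the arithmetic behind hgm's suggestion: play a large batch of games from positions that differ only by one fixed material imbalance and turn the measured score into a rating difference. The conversion below is just the usual logistic Elo model, a sketch rather than anything specific to CLOP or to hgm's setup; how many centipawns that rating difference corresponds to depends on the engine and the time control.

Code:

import math

def score_to_elo(score):
    """Standard logistic conversion from a match score in (0, 1) to an Elo difference."""
    return 400.0 * math.log10(score / (1.0 - score))

# Example: a 65% score for the queen's side (hgm's figure later in the thread)
# corresponds to roughly a +108 Elo advantage for that imbalance.
print(round(score_to_elo(0.65)))

Comparing such numbers across different imbalances (B vs N, R vs B+P, Q vs R+B, and so on) gives a relative ordering of the material values that does not depend on what the program itself believes.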
-
Rémi Coulom
- Posts: 438
- Joined: Mon Apr 24, 2006 8:06 pm
Re: Tuning piece values with CLOP
Evert wrote:
    Ok, this is starting to get annoying.
    I disabled all "extra" terms: exchange penalties, razor and futility margins. I fixed the value of the pawn at 100 to keep things simple, leaving just the values of the pieces to tune. In this case I just used self-play against the same untuned version of the program (without those extra terms), since the drop in strength due to disabling those terms is enough to make the gauntlet unreliable.
    After some 50000 games the result actually converged. In fact, the difference in piece value between the "mean" and "max" columns was stable at 1 or 2 cp. The (weighted) parameter vs. parameter plots now looked like nice regular ellipses. The strength indicator (I know it's unreliable) showed +100 +/- 20 elo.
    So I ran a verification match with the exact same parameters, exact same time control and same opponent (i.e., the untuned version of the program). The result shows -20 elo for the tuned version against the untuned version.
    I just don't get it...

You don't play enough games.
Imagine you try to do it manually with bayeselo. How many games will it take? I suppose 50,000 games is the number of games many programmers use to measure the strength of just one version. Unless the "All" win rate shows progress, you can never be sure you have any. The 100 +/- 20 Elo is biased towards optimistic values.
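To put rough numbers on "you don't play enough games": a sketch of the approximate 95% error bar on a measured Elo difference as a function of the number of games, treating each game as an independent win/loss trial (draws make the real error somewhat smaller, so this is a conservative bound).

Code:

import math

def elo_error_bar(games, score=0.5):
    """Approximate 2-sigma error (in Elo) on a match result after `games` games,
    treating each game as an independent win/loss trial; draws make the true
    error somewhat smaller, so this is a conservative bound."""
    sigma_score = math.sqrt(score * (1.0 - score) / games)
    slope = 400.0 / (math.log(10.0) * score * (1.0 - score))  # d(Elo)/d(score)
    return 2.0 * sigma_score * slope

for n in (1000, 10000, 50000):
    print(n, round(elo_error_bar(n), 1))
# roughly 22, 7 and 3 Elo: resolving a handful of Elo really does take tens of
# thousands of games, and CLOP additionally spreads its games over many
# parameter settings rather than concentrating them on a single one.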
-
hgm
- Posts: 27701
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Tuning piece values with CLOP
Evert wrote:
    And how do you reliably measure what piece values are? I'd love to know, because I can't think of a reliable way to do it. Getting rough estimates by playing material imbalances off each other I can understand (although I've never done it), but then the imbalance itself might mess up the result.

Why do you think the imbalance 'might mess up the result'? Piece values are all about imbalances; they are a heuristic for knowing which imbalances are better than others, so you should go for them. If Q consistently beats R+B in a large variety of situations by 65%, then that is just it. If the program can make the trade, it will up its winning chances from 50% to 65%. It makes no sense to say that 'in reality' R+B is worth much more than Q, and that it only seems better because you play them against each other.
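Purely to illustrate "playing material imbalances off each other" (this is my own sketch, not hgm's actual tooling): strip pieces from a normal position so that one side gives up its queen and the other a rook and a bishop, then play the resulting positions in large numbers with colours reversed and openings varied.

Code:

def remove_pieces(fen, pieces):
    """Delete the first occurrence of each given piece letter from a FEN board.
    Note: removing a rook can leave stale castling rights in the FEN tail,
    which should be fixed up for a real experiment."""
    board, rest = fen.split(" ", 1)
    expanded = "".join("1" * int(c) if c.isdigit() else c for c in board)
    for p in pieces:
        expanded = expanded.replace(p, "1", 1)
    out, run = [], 0
    for c in expanded:                 # re-compress runs of empty squares
        if c == "1":
            run += 1
        else:
            if run:
                out.append(str(run))
                run = 0
            out.append(c)
    if run:
        out.append(str(run))
    return "".join(out) + " " + rest

start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
# Black keeps the queen, White keeps the extra rook and bishop: Q vs R+B.
print(remove_pieces(start, "Qrb"))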
-
Evert
- Posts: 2929
- Joined: Sat Jan 22, 2011 12:42 am
- Location: NL
Re: Tuning piece values with CLOP
Rémi Coulom wrote:
    You don't play enough games.
    Imagine you try to do it manually with bayeselo. How many games will it take? I suppose 50,000 games is the number of games many programmers use to measure the strength of just one version. Unless the "All" win rate shows progress, you can never be sure you have any. The 100 +/- 20 Elo is biased towards optimistic values.

I actually wouldn't normally play 50,000 games to tune one value; I don't have the computational resources to do that. Or at least, I would not play that many blindly: depending on the expected gain I would break out early. I also don't need 50,000 games to detect the regression in the tuned values.
However, I do think I found what the problem might be: if I restrict the range of values to be much narrower, centred around the old (un-tuned) piece values and excluding the optimum found earlier, things seem to converge to a different set of piece values. It looks like what is happening is that the tuner converges on a local optimum that may not be the best in the parameter range (but that is easier to converge to; perhaps the valley is wider?). Of course you are right that, playing enough games, the system will eventually pick out other optima and move over to them if they are better.
I'll see where the current run goes (and try to let it play more games) and then decide whether to pick up the previous run again.
-
Evert
- Posts: 2929
- Joined: Sat Jan 22, 2011 12:42 am
- Location: NL
Re: Tuning piece values with CLOP
What the...
I started a run on Monday night and stopped it just now because I need the machine for some other (serious) work. I checked the progress: there were >150000 games played, all columns showed an elo gain (the all column showed about 50 elo) and all piece values had more-or-less converged (except for maybe the queen). So I closed the program, then started it up again to double-check some numbers, opened the experiment - and it's truncated at some 30000 games! The time-stamp on the .dat file confirms that the last time it was written to was some time yesterday. All the rest of the data is just gone.
Is this a known problem?
-
lucasart
- Posts: 3232
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: Tuning piece values with CLOP
diep wrote:
    Evert wrote:
        diep wrote: So the only question then to Evert is how many parameters he gave to CLOP to tune.
        Twelve in total. Five piece values, two bad exchange modifiers, a bishop pair bonus and four pruning margins (nominally supposed to correspond to "minor", "minor", "rook" and "queen").
    12 ^ 5 = 12 * 12 * 12 * 12 * 12 = 248832 games as worst case for CLOP. You did tens of thousands and it's still big SHJT.

It's not "SHJT", it's simply the best algo there is. Tuning 12 values, with complex interactions and differences that are so small (while the relative noise is so high), is extremely hard.
IMO Evert needs to reduce the number of parameters. 12 is way too much. How about:
* pawn = 80 (opening) 100 (endgame = pivot point)
* knight = bishop = variable1
* bishop pair hardcoded to something like 50
* rook = variable1+variable2 (so variable2 represents the exchange value and is > 0, which avoids stupid sampling of rook < knight values)
* queen = rook+variable3 = var1+var2+var3 (same reason)
So you now optimize in 3 dimensions and can obtain precise values in a reasonable time, especially as these variables represent things that are much more orthogonal than Evert's initial choice.
Then you can fix these values and optimize the bishop pair, and the difference bishop-knight=epsilon (probably small and hard to measure).
There really is no better way of optimizing 12 variables at a time. If you can make something better than CLOP, please feel free to share it!
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
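Spelled out as code, lucasart's re-parameterisation looks something like the sketch below (the names and example numbers are mine): CLOP tunes three deltas that are constrained to be positive, and the engine reconstructs the piece values from them, so the sampler can never propose rook < knight or queen < rook.

Code:

def piece_values(var1, var2, var3):
    """Map lucasart's three tuning variables to piece values in centipawns.
    var2 and var3 are kept positive by the parameter bounds, so the ordering
    minor <= rook <= queen can never be violated by the sampler."""
    knight = bishop = var1
    rook = var1 + var2            # var2 is the exchange (rook minus minor)
    queen = rook + var3           # var3 is queen minus rook
    return {
        "pawn_mg": 80, "pawn_eg": 100,   # fixed, as lucasart suggests
        "knight": knight, "bishop": bishop,
        "bishop_pair": 50,               # hard-coded for this pass
        "rook": rook, "queen": queen,
    }

# Illustrative numbers only: var1=325, var2=175, var3=450 -> N=B=325, R=500, Q=950
print(piece_values(325, 175, 450))

The point is not that three parameters are magic, but that cumulative deltas make the axes closer to independent, which is lucasart's "more orthogonal" argument.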
-
Rémi Coulom
- Posts: 438
- Joined: Mon Apr 24, 2006 8:06 pm
Re: Tuning piece values with CLOP
Evert wrote:
    What the...
    I started a run on Monday night and stopped it just now because I need the machine for some other (serious) work. I checked the progress: there were >150000 games played, all columns showed an elo gain (the all column showed about 50 elo) and all piece values had more-or-less converged (except for maybe the queen). So I closed the program, then started it up again to double-check some numbers, opened the experiment - and it's truncated at some 30000 games! The time-stamp on the .dat file confirms that the last time it was written to was some time yesterday. All the rest of the data is just gone.
    Is this a known problem?

That sounds strange. CLOP writes two files: x.dat and x.log. When an experiment is restarted, it will open these files to re-read the data. A new .dat file is created, and the old .dat file is renamed to x-old.dat. I never observed the problem you describe. The files are flushed after each sample. Maybe you can find a clue by looking at x.log. Or maybe you ran out of disk space?
Rémi
-
Evert
- Posts: 2929
- Joined: Sat Jan 22, 2011 12:42 am
- Location: NL
Re: Tuning piece values with CLOP
Rémi Coulom wrote:
    That sounds strange. CLOP writes two files: x.dat and x.log. When an experiment is restarted, it will open these files to re-read the data. A new .dat file is created, and the old .dat file is renamed to x-old.dat. I never observed the problem you describe. The files are flushed after each sample. Maybe you can find a clue by looking at x.log. Or maybe you ran out of disk space?

Nope, over 600GB left on the drive. It does look like both the .dat and the .log file are truncated at the same point. It's almost as though CLOP just stopped writing output to the files.
I suppose it wouldn't have mattered much if I'd checked the files before exiting the GUI since there is no way to force a rewrite of the data by hand. That's assuming CLOP keeps the result in memory and only reads back the files when it loads the results.
I'm not particularly keen to try to reproduce the problem, but I'll keep an eye out for it.
-
Evert
- Posts: 2929
- Joined: Sat Jan 22, 2011 12:42 am
- Location: NL
Re: Tuning piece values with CLOP
lucasart wrote:
    * pawn = 80 (opening) 100 (endgame = pivot point)

Jazz very clearly overvalues the pawns early on (perhaps later too, I haven't really checked), but fixing that is not as easy as defining an opening and an endgame value for it (though I could of course add the code for doing so quite easily).
Regardless, I'll be revisiting the value of the pawn later.
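For reference, the usual way of giving a term separate opening and endgame values is the standard "tapered" interpolation by a material-based game-phase count. The sketch below is generic, not Jazz's actual code.

Code:

def game_phase(n_knights, n_bishops, n_rooks, n_queens):
    """Game phase from the pieces still on the board (both sides combined),
    with the common weights minor = 1, rook = 2, queen = 4 (start position = 24)."""
    return n_knights + n_bishops + 2 * n_rooks + 4 * n_queens

def tapered(mg, eg, phase, max_phase=24):
    """Full middlegame value at max_phase, full endgame value at phase 0."""
    phase = max(0, min(phase, max_phase))
    return (mg * phase + eg * (max_phase - phase)) // max_phase

# A pawn worth 80 in the opening and 100 in the endgame:
print(tapered(80, 100, 24), tapered(80, 100, 12), tapered(80, 100, 0))  # 80 90 100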
lucasart wrote:
    * knight = bishop = variable1
    * bishop pair hardcoded to something like 50
    * rook = variable1+variable2 (so variable2 represents the exchange value and is > 0, which avoids stupid sampling of rook < knight values)
    * queen = rook+variable3 = var1+var2+var3 (same reason)

I was now tuning in four dimensions, with any material-imbalance tweaks (apart from the bishop pair) set to 0. This seems to work OK (but I disabled razoring and forward pruning, which have margins that make implicit assumptions about the piece values and which, my gut tells me, would interfere with tuning them).
I did specify parameter ranges that are "sensible": knight ~ bishop, knight ~ 300cp +/- delta, rook ~ knight + 200cp, queen ~ 3 knights ~ 2 rooks.
lucasart wrote:
    Then you can fix these values and optimize the bishop pair, and the difference bishop-knight=epsilon (probably small and hard to measure).

Actually, one of the things that came out of all the tuning runs I've done (so it seems to be very robust) is that the knight needs to be devalued by 25cp relative to the bishop. I haven't really looked into this much yet or verified it, but it probably comes from two things: an untuned (fixed) bishop-pair bonus, and unbalanced piece-square tables (so their average contribution is non-zero). Fixing the piece-square tables to correct that is not entirely trivial (in Jazz) because the piece-square tables are not static, unless I just add a constant to them. Which seems a bit silly, since the constant could (should?) just be absorbed into the piece value anyway.
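On absorbing the piece-square-table average into the piece value: for a plain 64-entry static table, the normalisation hinted at here is just a shift (sketch below). Jazz's non-static tables would need the same idea applied per table or per phase, and strictly the bias depends on which squares a piece actually tends to occupy, not on the plain average.

Code:

def normalise_pst(piece_value, pst):
    """Shift a 64-entry piece-square table to (roughly) zero mean and fold the
    shift into the base piece value, so the table no longer inflates or deflates
    one piece type relative to another."""
    shift = round(sum(pst) / len(pst))
    return piece_value + shift, [v - shift for v in pst]

# Toy example: a knight table averaging +15 quietly adds 15cp to every knight.
value, table = normalise_pst(300, [15] * 64)
print(value, table[0])   # 315 0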
-
Rémi Coulom
- Posts: 438
- Joined: Mon Apr 24, 2006 8:06 pm
Re: Tuning piece values with CLOP
Evert wrote:
    Rémi Coulom wrote: That sounds strange. CLOP writes two files: x.dat and x.log. When an experiment is restarted, it will open these files to re-read the data. A new .dat file is created, and the old .dat file is renamed to x-old.dat. I never observed the problem you describe. The files are flushed after each sample. Maybe you can find a clue by looking at x.log. Or maybe you ran out of disk space?
    Nope, over 600GB left on the drive. It does look like both the .dat and the .log file are truncated at the same point. It's almost as though CLOP just stopped writing output to the files.
    I suppose it wouldn't have mattered much if I'd checked the files before exiting the GUI since there is no way to force a rewrite of the data by hand. That's assuming CLOP keeps the result in memory and only reads back the files when it loads the results.
    I'm not particularly keen to try to reproduce the problem, but I'll keep an eye out for it.

CLOP flushes the data file after each sample, so there should be no need to make it re-write the file. It may be a good idea not to kill the application violently, and to pause the experiment before closing it. But it should not be necessary.
If you fear losing data again, what you can do is copy the data file before closing CLOP, and check that it contains all the samples. If it does not, then I really wonder what the reason could be.
Please tell me if you ever experience this problem again.
Rémi
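A trivial way to follow that advice before closing the GUI; the experiment name below is hypothetical, and the line count is only a rough sanity check, since I have not verified how CLOP lays out its .dat file beyond the per-sample flush mentioned above.

Code:

import shutil

name = "PieceValues"          # hypothetical experiment name; use your own

for ext in (".dat", ".log"):
    src = name + ext
    shutil.copy(src, src + ".backup")   # keep a copy before closing CLOP
    with open(src, "rb") as f:
        n_lines = sum(1 for _ in f)
    # If samples are appended roughly one per line (an assumption), this should
    # be in the same ballpark as the game count shown in the GUI.
    print(src, n_lines, "lines")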