Crafty eval tweak

jarkkop · Post by **jarkkop** » Mon Aug 30, 2010 6:15 pm

Bob, would you be willing to test these parameter changes on your cluster, to see whether these settings give Crafty 23.3 a boost of a few Elo points?

personality 11 -101 101
personality 12 -325 325
personality 13 -322 322
personality 14 -521 521
personality 15 -1028 1028
personality 21 -99 99

I gained about 15 Elo points in the silver suite against Rybka1.1.

bob · Post by **bob** » Mon Aug 30, 2010 6:33 pm

Will be happy to give it a whirl. It might be this evening before it starts; we tested two + changes last night/this AM. I have combined them and am testing both together to be sure there is no interaction, although I don't see how there could be.

bob · Post by **bob** » Mon Aug 30, 2010 10:14 pm

Here's an early result, just over 1/4 of the way through the test. R07 is the stock 23.4 version, but with the suggested eval term changes... So far, -25, although it might pick up a bit (or not) as the rest of the games play out. More early this evening when the entire 30K games have finished. We are still running on 1/2 the cluster, so it takes a couple of hours to run the test, and at the moment I am sharing it with another user and am getting about 1/4 of the normal cluster, so it is a 3-4 hour wait...

Name                  Elo    +    - games score oppo. draws
Crafty-23.4-1        2674    6    6 30000   65%  2552   22%
Crafty-23.3-1        2668    6    6 30000   65%  2552   22%
Crafty-23.4R07       2649    8    8  8109   62%  2552   21%

bob · Post by **bob** » Mon Aug 30, 2010 11:17 pm

Ignore those results. I got in too big a hurry and was not using the best version plus your changes. Restarting the test now. Should have results by 9pm tonight or so...

bob · Post by **bob** » Tue Aug 31, 2010 3:12 am

OK, better numbers this time around. Looks to be essentially identical to 23.4... -1 Elo actually, but the error bar is +/- 3, so I don't pay much attention. One interesting point is that the last change in your personality list does nothing. The "bad_trade" score has not been used in a couple of versions now, so that change has no effect. I am not surprised that the piece value changes don't have any effect, as we have tuned those until we were sick of 'em. I am re-running the test for confirmation since there is nothing else scheduled for tonight. Will post all BayesElo output in the morning.

jarkkop · Post by **jarkkop** » Tue Aug 31, 2010 6:21 am

I will try to find better settings anyway and report if I am successful.

bob · Post by **bob** » Tue Aug 31, 2010 5:53 pm

jarkkop wrote:I will try to find better anyway and report If I am successful.

Here are the final results. I did edit 23.4 to remove that "bad_trade" option, since in 23.3 you can set the value but it is not used anywhere.

Name                  Elo    +    - games score oppo. draws
Crafty-23.4-2        2658    4    4 30000   66%  2536   22%
Crafty-23.4-1        2655    4    4 30000   65%  2536   22%
Crafty-23.4R07-1     2655    4    4 30000   65%  2536   22%
Crafty-23.4R07-2     2654    4    4 30000   65%  2536   22%

23.4R07 is the version with your .craftyrc changes. 23.4-1 and 23.4-2 are the current 23.4 version.
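
For anyone who wants to read a table like this mechanically, here is a minimal Python sketch that parses the rows and checks whether each entry's rating interval (Elo minus the "-" column up to Elo plus the "+" column) overlaps the top entry's. The names `Rating`, `parse_rows` and `intervals_overlap` are made up for this illustration and are not part of BayesElo or Crafty. Overlapping intervals are only a rough screen for "no measurable difference", not a substitute for a proper likelihood-of-superiority calculation.

```python
from dataclasses import dataclass


@dataclass
class Rating:
    """One row of the results table (hypothetical helper type)."""
    name: str
    elo: int
    plus: int
    minus: int

    @property
    def low(self) -> int:
        return self.elo - self.minus

    @property
    def high(self) -> int:
        return self.elo + self.plus


def parse_rows(text: str) -> list[Rating]:
    """Parse whitespace-separated rows: name, Elo, +, -, then the rest."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        rows.append(Rating(fields[0], int(fields[1]), int(fields[2]), int(fields[3])))
    return rows


def intervals_overlap(a: Rating, b: Rating) -> bool:
    """True if the two [low, high] rating intervals share at least one point."""
    return a.low <= b.high and b.low <= a.high


# Rows copied from the final results posted above.
TABLE = """
Crafty-23.4-2        2658    4    4 30000   66%  2536   22%
Crafty-23.4-1        2655    4    4 30000   65%  2536   22%
Crafty-23.4R07-1     2655    4    4 30000   65%  2536   22%
Crafty-23.4R07-2     2654    4    4 30000   65%  2536   22%
"""

ratings = parse_rows(TABLE)
best = ratings[0]
for r in ratings[1:]:
    verdict = "overlaps" if intervals_overlap(best, r) else "is separated from"
    print(f"{r.name} [{r.low}, {r.high}] {verdict} {best.name} [{best.low}, {best.high}]")
```

Note that the two runs of the unmodified 23.4 already differ by 3 Elo from each other, which gives a feel for the run-to-run noise at 30,000 games.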

jarkkop · Post by **jarkkop** » Tue Aug 31, 2010 6:16 pm

What was the time control in these matches?

Have you considered adding a free Rybka to your tournament pool? I think versions 1.0 through 2.3.2 are now available free of charge.

bob · Post by **bob** » Tue Aug 31, 2010 8:54 pm

jarkkop wrote:What was the time control in these matches?

Have you considered taking a free Rybka in your tournament pool? I think versions from 1.0 to 2.3.2 are nowadays available free of charge.

These were the usual 10s +0.1s games. I can try longer, but have yet to find any eval terms that look better at longer time controls than they do at shorter ones. Sometimes a search change might do better at longer (or sometimes better at shorter) time controls...

I have not tried Rybka. I can't use non-source engines, as I need to compile them on our cluster, using our cluster libraries, to get them to run, and I have not yet seen a source distribution of Rybka.

I have tried some ip* and family, but they are both strong and unreliable. Crashes are unpredictable, and I don't need unexpected randomness in my tests so I gave up on 'em. Some of the recent ones are not available in source form so I can't use those at all.

jarkkop · Post by **jarkkop** » Wed Sep 01, 2010 9:20 am

My testing was done at a time control of 1 min + 1 sec on a 3.2GHz Q9300 (one core), 100 games.

Previous versions were 156 Elo behind Rybka11.
The new settings, after 80 games, are only 100 Elo behind.
If the gap stays about the same, I will post the new parameters.
If you have time, you can try those.
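
As a rough way to judge how much an 80-game result can be trusted, here is a Python sketch that turns an Elo gap into an expected score with the standard logistic Elo formula and then puts an approximate 95% interval around it. The function names are invented for this illustration, and the variance is taken from a plain win/loss binomial model, which is an assumption: draws make the real variance somewhat smaller, so this is an upper estimate of the spread.

```python
import math


def expected_score(elo_diff: float) -> float:
    """Expected score for a player rated elo_diff above the opponent (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_from_score(score: float) -> float:
    """Inverse of expected_score: the Elo difference implied by a score fraction."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_interval(elo_diff: float, games: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate interval for an Elo difference measured over `games` games.

    Assumes independent games and a win/loss binomial variance (no draws),
    which overstates the spread when draws occur.
    """
    p = expected_score(elo_diff)
    se = math.sqrt(p * (1.0 - p) / games)
    lo = max(p - z * se, 1e-6)        # clamp so the log stays defined
    hi = min(p + z * se, 1.0 - 1e-6)
    return elo_from_score(lo), elo_from_score(hi)


# The 80-game figure from the post above: about 100 Elo behind.
low, high = elo_interval(-100, 80)
print(f"80 games at -100 Elo: roughly {low:.0f} to {high:.0f}")

# For comparison, the same gap measured over 30,000 games.
low_big, high_big = elo_interval(-100, 30000)
print(f"30000 games at -100 Elo: roughly {low_big:.0f} to {high_big:.0f}")
```

Under these assumptions the 80-game interval is wide enough to still contain the earlier -156 figure, so the apparent gain is not yet distinguishable from noise.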
