New Tool for Tuning with Skopt

Discussion of chess software programming and technical issues.

Moderators: bob, hgm, Harvey Williamson

zenpawn
Posts: 296
Joined: Sat Aug 06, 2016 6:31 pm
Location: United States

Re: New Tool for Tuning with Skopt

Post by zenpawn » Sun Sep 08, 2019 7:46 pm

I've run the script with my xboard engine, but I don't set usermove=1. I do get timeout errors loading the engine, even though it loads in well under a second and sends done=0 and done=1. Nevertheless, it runs anyway.
Erin Dame
Author of RookieMonster
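For reference, the done=0/done=1 handshake mentioned above can be sketched like this (a minimal, hypothetical engine-side example per the xboard/CECP protocol; "MyEngine" and the feature list are placeholders, not zenpawn's engine):

```python
import sys

def on_protover(version):
    """Minimal xboard/CECP feature handshake: the engine brackets its
    feature list with done=0 and done=1 so the GUI knows when the list
    is complete instead of relying on a timeout."""
    print("feature done=0")    # more features are coming; GUI should wait
    print('feature myname="MyEngine"')
    print("feature usermove=0 sigint=0 sigterm=0")
    print("feature done=1")    # handshake finished
    sys.stdout.flush()
```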

thomasahle
Posts: 71
Joined: Thu Feb 27, 2014 7:19 pm

Re: New Tool for Tuning with Skopt

Post by thomasahle » Sun Sep 08, 2019 9:02 pm

xr_a_y wrote:
Sun Sep 08, 2019 10:53 am
I'm trying it now, using xboard protocol, and I'm not really successful because python-chess seems to be rejecting "usermove=1".
Anyone experience this before ?
Yes, this is frustrating. I had to implement usermove=0 in sunfish to get it to work with python-chess.

Would be great if anyone could send a pull request to niklas about this.

JVMerlino
Posts: 1003
Joined: Wed Mar 08, 2006 9:15 pm
Location: San Francisco, California

Re: New Tool for Tuning with Skopt

Post by JVMerlino » Sun Sep 08, 2019 10:44 pm

thomasahle wrote:
Sun Sep 08, 2019 9:02 pm
xr_a_y wrote:
Sun Sep 08, 2019 10:53 am
I'm trying it now, using xboard protocol, and I'm not really successful because python-chess seems to be rejecting "usermove=1".
Anyone experience this before ?
Yes, this is frustrating. I had to implement usermove=0 in sunfish to get it to work with python-chess.

Would be great if anyone could send a pull request to niklas about this.
Well, Myrddin's in luck here. Since "usermove=0" is the default, I never implemented it.

thomasahle
Posts: 71
Joined: Thu Feb 27, 2014 7:19 pm

Re: New Tool for Tuning with Skopt

Post by thomasahle » Mon Sep 09, 2019 4:03 am

JVMerlino wrote:
Sun Sep 08, 2019 10:44 pm
Well, Myrddin's in luck here. Since "usermove=0" is the default, I never implemented it.
Well, I didn't mean "implement" it as anything more than trying to parse unrecognised commands as moves.
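The fallback being described can be sketched as follows (a hypothetical input handler, not sunfish's actual code; KNOWN_COMMANDS and try_move are illustrative names):

```python
KNOWN_COMMANDS = {"xboard", "protover", "new", "force", "go", "quit"}

def handle_line(line, try_move):
    """Dispatch one line of xboard input. With usermove=1 the GUI sends
    'usermove e2e4'; with usermove=0 (the default) it sends the bare move,
    so anything not recognised as a command is tried as a move."""
    parts = line.split()
    if not parts:
        return False
    if parts[0] == "usermove" and len(parts) > 1:
        return try_move(parts[1])          # usermove=1 style
    if parts[0] in KNOWN_COMMANDS:
        return False                       # a real command, handled elsewhere
    return try_move(parts[0])              # usermove=0 style: bare move
```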

xr_a_y
Posts: 769
Joined: Sat Nov 25, 2017 1:28 pm
Location: France

Re: New Tool for Tuning with Skopt

Post by xr_a_y » Wed Sep 11, 2019 5:21 am

After trying this script a little on some search parameters, I haven't had much success.
Any advice on choosing a good script configuration and a reasonable number of games would be welcome.

For now, 1000 games to tune 4 or 5 parameters at a time doesn't seem to be enough.

thomasahle
Posts: 71
Joined: Thu Feb 27, 2014 7:19 pm

Re: New Tool for Tuning with Skopt

Post by thomasahle » Wed Sep 11, 2019 5:09 pm

xr_a_y wrote:
Wed Sep 11, 2019 5:21 am
After trying this script a little on some search parameter I am not in success.
Any advice will be welcome about choosing the good script configuration and a reasonable number of games.
For now, 1000 games to tune 4 or 5 parameters at a time seems not good.
People have had some problems with skopt not doing enough exploration.
I personally have found that increasing the assumed noise (say -acq-noise 10) made it much better at this.
Another approach is to tweak the -n-initial-points, and some of the other parameters under "Optimization parameters".
If you want to increase -n above 1000 it might be useful to change the -base-estimator to GBRT or ET, since the standard gaussian optimizer gets slow.

Reports of any experiences you've had would be beneficial. We are all learning.

nionita
Posts: 161
Joined: Fri Oct 22, 2010 7:47 pm
Location: Austria

Re: New Tool for Tuning with Skopt

Post by nionita » Wed Sep 11, 2019 7:30 pm

thomasahle wrote:
Wed Sep 11, 2019 5:09 pm
xr_a_y wrote:
Wed Sep 11, 2019 5:21 am
After trying this script a little on some search parameter I am not in success.
Any advice will be welcome about choosing the good script configuration and a reasonable number of games.
For now, 1000 games to tune 4 or 5 parameters at a time seems not good.
People have had some problems with skopt not doing enough exploration.
I personally have found that increasing the assumed noise (say -asq-noise 10) made it much better at this.
Another approach is to tweak the -n-initial-points, and some of the other parameters under "Optimization parameters".
If you want to increase -n above 1000 it might be useful to change the -base-estimator to GBRT or ET, since the standard gaussian optimizer gets slow.

Reports of any experiences you've had would be beneficial. We are all learning.
Thanks for sharing this! I don't use your script, mainly because I already had some Python scripts for tuning based on DSPSA, but after seeing it I also wrote a Bayesian optimizer using skopt and started experimenting with this again last week.

I have a feeling this method will not work if the "measurements" (i.e. the game samples) are too noisy. That is, if you play only a few games per parameter configuration, I suspect it will fail. For the last 2-3 days I have been experimenting with 1k to 2.5k games per configuration; I don't yet have confirmation of how strong the results are. But in the skopt documentation the function to be approximated is called "(very) expensive", which suggests that taking a measurement should take much longer than optimizing the surrogate function.
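The noise argument can be made concrete with a back-of-the-envelope calculation (my own sketch, not part of the tool): the standard error of a match score shrinks only with the square root of the number of games, so halving the Elo error bar costs four times the games.

```python
import math

def score_stderr_elo(n_games, draw_ratio=0.4):
    """Approximate 1-sigma Elo error of an n-game match between roughly
    equal opponents, assuming the given draw ratio. Near a 50% score,
    one unit of score corresponds to about 1600/ln(10) ~ 695 Elo."""
    per_game_var = (1.0 - draw_ratio) * 0.25   # wins/losses deviate by +/-0.5 from the mean
    stderr = math.sqrt(per_game_var / n_games)
    return stderr * 1600.0 / math.log(10)

# roughly 27 Elo at 100 games, but only ~8.5 Elo at 1000 games
```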

Joerg Oster
Posts: 691
Joined: Fri Mar 10, 2006 3:29 pm
Location: Germany

Re: New Tool for Tuning with Skopt

Post by Joerg Oster » Wed Sep 11, 2019 8:25 pm

nionita wrote:
Wed Sep 11, 2019 7:30 pm
thomasahle wrote:
Wed Sep 11, 2019 5:09 pm
xr_a_y wrote:
Wed Sep 11, 2019 5:21 am
After trying this script a little on some search parameter I am not in success.
Any advice will be welcome about choosing the good script configuration and a reasonable number of games.
For now, 1000 games to tune 4 or 5 parameters at a time seems not good.
People have had some problems with skopt not doing enough exploration.
I personally have found that increasing the assumed noise (say -asq-noise 10) made it much better at this.
Another approach is to tweak the -n-initial-points, and some of the other parameters under "Optimization parameters".
If you want to increase -n above 1000 it might be useful to change the -base-estimator to GBRT or ET, since the standard gaussian optimizer gets slow.

Reports of any experiences you've had would be beneficial. We are all learning.
Thanks for sharing this! I do not use your script, basically because I had already some python scripts for tuning based on DSPSA, but after seeing it, I wrote also a Bayesian optimizer using skopt and started last week to experiment with this again.

I have a feeling that this method will not work if the "measuments" (i.e. the game samples) are too noisy. That means, if you play a few games per parameter configuration, I guess it will fail. I experiment since 2-3 days with 1k to 2.5k games per configuration, I don't have yet confirmation of how strong the results are. But in the skopt documentation the function to be approximated is called "(very) expensive", which somehow suggests that taking a measurement must take much longer than doing the optimization of the surrogate function.
That's why I wanted to be able to increase the number of games per parameter setting.
Another question is how much of the parameter space (possible configurations) needs to be explored to get a good estimate.

I'm currently experimenting with GBRT as the base estimator and an increased kappa value to allow broader exploration.
Jörg Oster
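The effect of kappa can be illustrated with a toy lower-confidence-bound comparison (my own illustration; the numbers are made up):

```python
def lcb(mean, std, kappa):
    """skopt-style lower confidence bound: the optimizer (which minimizes)
    picks the candidate with the smallest mean - kappa * std, so larger
    kappa rewards uncertainty, i.e. exploration."""
    return mean - kappa * std

# A: well-measured and slightly good; B: unmeasured and very uncertain
a_mean, a_std = -0.02, 0.01
b_mean, b_std = 0.00, 0.10

def pick(kappa):
    return "A" if lcb(a_mean, a_std, kappa) < lcb(b_mean, b_std, kappa) else "B"

# kappa=0 exploits the known-good A; kappa=3 explores the uncertain B
```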

User avatar
xr_a_y
Posts: 769
Joined: Sat Nov 25, 2017 1:28 pm
Location: France

Re: New Tool for Tuning with Skopt

Post by xr_a_y » Fri Sep 13, 2019 5:16 pm

thomasahle wrote:
Wed Sep 11, 2019 5:09 pm
xr_a_y wrote:
Wed Sep 11, 2019 5:21 am
After trying this script a little on some search parameter I am not in success.
Any advice will be welcome about choosing the good script configuration and a reasonable number of games.
For now, 1000 games to tune 4 or 5 parameters at a time seems not good.
People have had some problems with skopt not doing enough exploration.
I personally have found that increasing the assumed noise (say -asq-noise 10) made it much better at this.
Another approach is to tweak the -n-initial-points, and some of the other parameters under "Optimization parameters".
If you want to increase -n above 1000 it might be useful to change the -base-estimator to GBRT or ET, since the standard gaussian optimizer gets slow.

Reports of any experiences you've had would be beneficial. We are all learning.
Yes, sorry for not giving enough details.

I'm trying to tune search parameters (for example static nullmove depth and coeff) in Minic.

My last try was:

Code: Select all

python3 tune.py minic_dev_uci -opt staticNullMoveDepthInit0 0 800 -opt staticNullMoveDepthInit1 0 800 -opt staticNullMoveDepthCoeff0 0 800 -opt staticNullMoveDepthCoeff1 0 800 -opt staticNullMoveMaxDepth0 0 20 -opt staticNullMoveMaxDepth1 0 20 -movetime 30 -conf ~/.config/cutechess/engines.json -concurrency=7 -games-file out.pgn -n 8000 -base-estimator GBRT -acq-noise 5 
This gave these very unclear results, 0 ± 190 Elo:

Code: Select all

Best expectation (κ=0): [106 207 361 782   5   9] = -0.000 ± 0.500 (ELO-diff -0.000 ± 190.849)
Best expectation (κ=1): [112 477 663  48  13   0] = -0.000 ± 0.261 (ELO-diff -0.000 ± 92.678)
Best expectation (κ=2): [112 477 663  48  13   0] = -0.000 ± 0.261 (ELO-diff -0.000 ± 92.678)
Best expectation (κ=3): [112 477 663  48  13   0] = -0.000 ± 0.261 (ELO-diff -0.000 ± 92.678)
where only the first answer looks like it might be a reasonable configuration.

I think the current values of the search parameters in Minic are quite good, and I wonder why these optimization runs don't easily confirm that.

I've also tried various other margins and coefficients without success.

I may be doing something wrong, of course! And my knowledge of optimization is very poor...

pedrox
Posts: 992
Joined: Fri Mar 10, 2006 5:07 am
Location: Basque Country (Spain)

Re: New Tool for Tuning with Skopt

Post by pedrox » Sat Sep 14, 2019 11:12 am

Once I have played a series of games and have the file data.log, if I run the test again reusing that file, I get different results on each run, and they seem quite random.

Code: Select all

...
Using [8.0, 90.0, 5.0, 114.0] => 0.5 from log-file
Using [1.0, 144.0, 1.0, 121.0] => 0.0 from log-file
Using [9.0, 107.0, 6.0, 99.0] => 0.0 from log-file
Using [8.0, 158.0, 7.0, 125.0] => -0.5 from log-file
Using [8.0, 101.0, 9.0, 134.0] => -1.0 from log-file
Using [3.0, 116.0, 9.0, 112.0] => -0.5 from log-file
Using [1.0, 133.0, 3.0, 142.0] => -1.0 from log-file
Using [6.0, 154.0, 8.0, 145.0] => 0.0 from log-file
Using [3.0, 156.0, 7.0, 115.0] => 0.0 from log-file
Using [1.0, 135.0, 4.0, 144.0] => 0.0 from log-file
Using [5.0, 148.0, 10.0, 126.0] => -0.5 from log-file
Using [2.0, 90.0, 4.0, 111.0] => -1.0 from log-file
Using [4.0, 115.0, 3.0, 84.0] => -0.5 from log-file
Using [8.0, 92.0, 1.0, 123.0] => 0.0 from log-file
Using [8.0, 100.0, 4.0, 129.0] => 0.0 from log-file
Using [3.0, 104.0, 0.0, 128.0] => -0.5 from log-file
Using [2.0, 128.0, 9.0, 134.0] => 0.5 from log-file
Using [6.0, 100.0, 4.0, 104.0] => -0.5 from log-file
Using [5.0, 133.0, 3.0, 107.0] => 0.0 from log-file
Using [6.0, 117.0, 10.0, 105.0] => 0.5 from log-file
Using [3.0, 100.0, 8.0, 97.0] => -0.5 from log-file
Fitting first model
Summarizing best values
Best expectation (κ=0): [  1.  83.   2. 135.] = -0.000 ± 0.375 (ELO-diff -0.000 ± 338.039)
Best expectation (κ=1): [  9. 155.   2.  92.] = -0.000 ± 0.255 (ELO-diff -0.000 ± 195.793)
Best expectation (κ=2): [  9. 155.   2.  92.] = -0.000 ± 0.255 (ELO-diff -0.000 ± 195.793)
Best expectation (κ=3): [  9. 155.   2.  92.] = -0.000 ± 0.255 (ELO-diff -0.000 ± 195.793)

Code: Select all

...
Using [8.0, 90.0, 5.0, 114.0] => 0.5 from log-file
Using [1.0, 144.0, 1.0, 121.0] => 0.0 from log-file
Using [9.0, 107.0, 6.0, 99.0] => 0.0 from log-file
Using [8.0, 158.0, 7.0, 125.0] => -0.5 from log-file
Using [8.0, 101.0, 9.0, 134.0] => -1.0 from log-file
Using [3.0, 116.0, 9.0, 112.0] => -0.5 from log-file
Using [1.0, 133.0, 3.0, 142.0] => -1.0 from log-file
Using [6.0, 154.0, 8.0, 145.0] => 0.0 from log-file
Using [3.0, 156.0, 7.0, 115.0] => 0.0 from log-file
Using [1.0, 135.0, 4.0, 144.0] => 0.0 from log-file
Using [5.0, 148.0, 10.0, 126.0] => -0.5 from log-file
Using [2.0, 90.0, 4.0, 111.0] => -1.0 from log-file
Using [4.0, 115.0, 3.0, 84.0] => -0.5 from log-file
Using [8.0, 92.0, 1.0, 123.0] => 0.0 from log-file
Using [8.0, 100.0, 4.0, 129.0] => 0.0 from log-file
Using [3.0, 104.0, 0.0, 128.0] => -0.5 from log-file
Using [2.0, 128.0, 9.0, 134.0] => 0.5 from log-file
Using [6.0, 100.0, 4.0, 104.0] => -0.5 from log-file
Using [5.0, 133.0, 3.0, 107.0] => 0.0 from log-file
Using [6.0, 117.0, 10.0, 105.0] => 0.5 from log-file
Using [3.0, 100.0, 8.0, 97.0] => -0.5 from log-file
Fitting first model
Summarizing best values
Best expectation (κ=0): [  0.  95.   9. 113.] = -0.000 ± 0.375 (ELO-diff -0.000 ± 338.039)
Best expectation (κ=1): [ 10. 149.   1. 120.] = -0.000 ± 0.255 (ELO-diff -0.000 ± 195.793)
Best expectation (κ=2): [ 10. 149.   1. 120.] = -0.000 ± 0.255 (ELO-diff -0.000 ± 195.793)
Best expectation (κ=3): [ 10. 149.   1. 120.] = -0.000 ± 0.255 (ELO-diff -0.000 ± 195.793)

Code: Select all

...
Using [8.0, 90.0, 5.0, 114.0] => 0.5 from log-file
Using [1.0, 144.0, 1.0, 121.0] => 0.0 from log-file
Using [9.0, 107.0, 6.0, 99.0] => 0.0 from log-file
Using [8.0, 158.0, 7.0, 125.0] => -0.5 from log-file
Using [8.0, 101.0, 9.0, 134.0] => -1.0 from log-file
Using [3.0, 116.0, 9.0, 112.0] => -0.5 from log-file
Using [1.0, 133.0, 3.0, 142.0] => -1.0 from log-file
Using [6.0, 154.0, 8.0, 145.0] => 0.0 from log-file
Using [3.0, 156.0, 7.0, 115.0] => 0.0 from log-file
Using [1.0, 135.0, 4.0, 144.0] => 0.0 from log-file
Using [5.0, 148.0, 10.0, 126.0] => -0.5 from log-file
Using [2.0, 90.0, 4.0, 111.0] => -1.0 from log-file
Using [4.0, 115.0, 3.0, 84.0] => -0.5 from log-file
Using [8.0, 92.0, 1.0, 123.0] => 0.0 from log-file
Using [8.0, 100.0, 4.0, 129.0] => 0.0 from log-file
Using [3.0, 104.0, 0.0, 128.0] => -0.5 from log-file
Using [2.0, 128.0, 9.0, 134.0] => 0.5 from log-file
Using [6.0, 100.0, 4.0, 104.0] => -0.5 from log-file
Using [5.0, 133.0, 3.0, 107.0] => 0.0 from log-file
Using [6.0, 117.0, 10.0, 105.0] => 0.5 from log-file
Using [3.0, 100.0, 8.0, 97.0] => -0.5 from log-file
Fitting first model
Summarizing best values
Best expectation (κ=0): [ 5. 97. 10. 88.] = -0.000 ± 0.375 (ELO-diff -0.000 ± 338.039)
Best expectation (κ=1): [  5. 159.   6. 140.] = -0.000 ± 0.255 (ELO-diff -0.000 ± 195.793)
Best expectation (κ=2): [  5. 159.   6. 140.] = -0.000 ± 0.255 (ELO-diff -0.000 ± 195.793)
Best expectation (κ=3): [  5. 159.   6. 140.] = -0.000 ± 0.255 (ELO-diff -0.000 ± 195.793)
I thought the data could be reused, whether after a crash or to expand the number of iterations, but I am confused that each run produces different results even though all the games have already been played.
