A paper about parameter tuning

joelv · Post by **joelv** » Sun Jan 17, 2010 1:58 am

Thanks for this. N=1000 will be fine.

I have spare CPU time at the beginning of February. Once the results are on my website, I will let you know.

diep · Post by **diep** » Sun Jan 17, 2010 4:21 pm

Sorry for the selective snipping...

mcostalba wrote: In chess engines, testing numbers are more important than ideas (because they take much longer to get properly), and much more important than the struggle to be first. In your paper the experimental data is small, not clearly documented, and too focused on the attempted direction.

Do I read this correctly: are you saying that people who test things are more important than people who have ideas?

Please confirm if so.

Vincent

diep · Post by **diep** » Sun Jan 17, 2010 5:31 pm

mcostalba wrote:
joelv wrote: Suppose I did what you suggested earlier, and repeated the experiment 10 times (I don't have the CPU time to think about any more) with different sets of random starting weights and reported the results. Assuming they were similar to what was reported, would this be validation enough for you? If not, what would it take?
Even 5 runs are enough. Here are the conditions:

1) Take 5 different starting random sets and run the trial for N games.

2) At the end, run a tournament among the 5 resulting engines plus one chosen at random from the 5 random starting engines (to show that there actually was an improvement).

3) If the estimated ELO of all 5 trained engines ends up within a 20 ELO range, and is higher than that of the starting engine by a margin comparable to your paper's result, we (I) will applaud the miracle.

Now _you_ have to choose N. Of course, the bigger N is, the easier the task becomes, because you allow your algorithm more time to converge. The choice is up to you, but if you really want to amaze me you can take something like N=1000 (which I consider very small) and show that 1000 training games are enough to stabilize the result.

I would also like the test conditions to be documented:

- Time control used for trial games
- Book used for trial games, ponder off, etc.
- Hardware used: CPU and number of cores, and possibly the average middle game search depth (if it is easy for you to get this important number)

- Time control used for ELO measuring and final tournament verification
- Book used for final ELO verification, ponder off, etc.
- Number of tournament rounds: how many games were played to estimate the ELO of the resulting engines.
- Hardware used for final tournament.
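Condition 3 above can be written down as a small check. This is only a sketch under stated assumptions: the 20-point spread comes from the post, but `min_gain` is a hypothetical placeholder for "a margin comparable to your paper's result", the function names are invented for illustration, the ratings in the example are made up, and requiring even the weakest trained engine to clear the margin is one possible reading of the condition. The score-to-rating conversion is the standard logistic Elo formula.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by an average score, using the standard logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


def passes_check(trained_elos, baseline_elo, max_spread=20.0, min_gain=100.0):
    """Condition 3: trained engines agree within max_spread and all beat the baseline.

    min_gain is a hypothetical stand-in for the margin reported in the paper.
    """
    spread = max(trained_elos) - min(trained_elos)
    gain = min(trained_elos) - baseline_elo  # weakest trained engine vs. baseline
    return spread <= max_spread and gain >= min_gain


if __name__ == "__main__":
    # Made-up tournament ratings, for illustration only.
    trained = [1010.0, 1000.0, 1015.0, 1005.0, 1012.0]
    baseline = 800.0
    print(passes_check(trained, baseline))
    print(round(elo_from_score(0.75), 1))
```

In practice the ratings would come from the verification tournament, so the spread check is only meaningful if the error bars on each rating are well below 20 points.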

joelv wrote: Is this parametrized version of Stockfish available as open source? Sounds like it would be the perfect test-bed for some of the ideas in the paper...
I have sent you a pm on this point.

Nice of you to post all this. In the meantime, where is the tunable Stockfish code, and the tuner used to tune it?

Thanks,
Vincent

mcostalba · Post by **mcostalba** » Sun Jan 17, 2010 6:13 pm

diep wrote:Sorry for the selective snipping...

mcostalba wrote: In chess engines, testing numbers are more important than ideas (because they take much longer to get properly), and much more important than the struggle to be first. In your paper the experimental data is small, not clearly documented, and too focused on the attempted direction.
Do I read this correctly: are you saying that people who test things are more important than people who have ideas?

Please confirm if so.

Vincent

Yes, I confirm, especially when it is the same person, as in my case. I (but not only me) spend _much_ more time testing than thinking. I have bunches of new ideas and possible tweaks, and they come very easily, but I am able to test only a small part of them because testing resources are a real bottleneck for us.

We use the following methodology for testing:

- If an idea touches many aspects, we split it into several single-focus parts and test each part independently.

- We test each idea up to the point where it is more or less reliably verified to work, and that means a lot of games.

- We can use different ways to filter out non-working ideas early, but in the end _all_ the code changes that are committed are tested with real games, and that takes a lot of time.

So for us testing is the biggest part of development.
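To illustrate why "reliably verified" means a lot of games, here is a rough sketch of the error bar on an ELO estimate from a win/draw/loss record. It is not the Stockfish team's procedure (which is not described in this thread); it assumes independent games, a normal approximation for the mean score, and the standard logistic Elo formula. The function names are invented for this example.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by an average score, using the standard logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_interval(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (estimate, lower, upper) in Elo for a match result.

    z = 1.96 gives an approximate 95% interval under a normal approximation.
    """
    n = wins + draws + losses
    if n == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / n
    if not 0.0 < score < 1.0:
        raise ValueError("score of 0% or 100% has no finite Elo estimate")
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    eps = 1e-9               # keep the bounds inside (0, 1)
    lo = min(max(score - z * se, eps), 1.0 - eps)
    hi = min(max(score + z * se, eps), 1.0 - eps)
    return elo_from_score(score), elo_from_score(lo), elo_from_score(hi)


if __name__ == "__main__":
    est, lo, hi = elo_with_interval(wins=300, draws=400, losses=300)
    print(f"{est:+.1f} Elo, interval [{lo:+.1f}, {hi:+.1f}]")
```

With this made-up 1000-game record the interval is still roughly 17 Elo on either side, so a tweak worth only a few Elo cannot be confirmed or rejected at that sample size under these assumptions.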

mcostalba · Post by **mcostalba** » Sun Jan 17, 2010 6:17 pm

diep wrote: Nice of you to post all this. In the meantime, where is the tunable Stockfish code, and the tuner used to tune it?

It is private, and private it will remain, because we think that the auto-tuning system we have developed is superior to what is available in the literature (it starts from an already almost tuned engine, not an academic toy with a random parameter vector), and it is the biggest reason for the ELO boost in SF from 1.3 to 1.6, passing through 1.4 and 1.5.1.

The methodology was developed and finalized during the SF 1.4 release cycle.

diep · Post by **diep** » Sun Jan 17, 2010 6:30 pm

mcostalba wrote:
diep wrote: Nice of you to post all this. In the meantime, where is the tunable Stockfish code, and the tuner used to tune it?
It is private, and private it will remain, because we think that the auto-tuning system we have developed is superior to what is available in the literature (it starts from an already almost tuned engine, not an academic toy with a random parameter vector), and it is the biggest reason for the ELO boost in SF from 1.3 to 1.6, passing through 1.4 and 1.5.1.

The methodology was developed and finalized during the SF 1.4 release cycle.

There are 2 aspects to engine tuning:

a) modifications to the source code of Stockfish so that it CAN get tuned
b) the tuner program with your new secret ideas

Obviously, if you want to keep the (b) program secret on some government supercomputer, I understand.

I'm interested in (a).

Where is the modified Stockfish code that CAN get tuned, i.e. with its parameters made tunable?

Where is that code?

Thanks,
Vincent

mcostalba · Post by **mcostalba** » Sun Jan 17, 2010 6:35 pm

diep wrote:

There are 2 aspects to engine tuning:

a) modifications to the source code of Stockfish so that it CAN get tuned
b) the tuner program with your new secret ideas

Obviously, if you want to keep the (b) program secret on some government supercomputer, I understand.

I'm interested in (a).

Where is the modified Stockfish code that CAN get tuned, i.e. with its parameters made tunable?

Where is that code?

Thanks,
Vincent

The point is that (a) and (b) are mixed together: there is a specific way to modify a family of parameters, which is not one by one, and this specific way is what makes the tuner work.

diep · Post by **diep** » Sun Jan 17, 2010 7:06 pm

mcostalba wrote:
The point is that (a) and (b) are mixed together: there is a specific way to modify a family of parameters, which is not one by one, and this specific way is what makes the tuner work.

/*
Glaurung, a UCI chess playing engine.
Copyright (C) 2004-2008 Tord Romstad

Glaurung is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

Glaurung is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/

And repeated in the Stockfish code:

GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007

Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

...

You realize what that means?

mcostalba · Post by **mcostalba** » Sun Jan 17, 2010 7:19 pm

diep wrote: You realize what that means?

Unfortunately for you, I understand much better than you what the GPL means!

If we _distribute_ or _release_ the derived tunable version of SF, we have to release it with sources. But the GPL does not forbid anybody from modifying the code and keeping the modified version for internal use.

Have you ever heard of private versions of Fruit? There is even one in the CEGT lists these days.

http://www.husvankempen.de/nunn/40_40%2 ... on/95.html

We are in an even less exposed position than this, because we don't even distribute it for testing.

OK, you tried... try again next time, perhaps you'll be luckier.

Aaron Becker · Post by **Aaron Becker** » Sun Jan 17, 2010 7:21 pm

diep wrote: You realize what that means?

The GPL does not obligate them to release any code that's not part of the stockfish binary that they distribute. Although I do think it's a bit of a jerk move to withhold the tunable version of stockfish and then complain that others don't work on tuning top engines.
