Tuning misery
Moderators: hgm, Dann Corbit, Harvey Williamson
-
Henk
- Posts: 7210
- Joined: Mon May 27, 2013 10:31 am
Tuning misery
Chess programs have a lot of parameters that need to be tuned. Tuning these parameters is not easy and it takes a lot of time.
If a change in the parameters only influences the speed and not the moves selected, it is easy to check whether playing strength has improved.
But what to do if different moves are selected and playing strength only changes a little? In that case it is not clear whether a (very) small step has been taken in a good direction or not.
Maybe one should only test big steps?!
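[Editor's note: one way to make the "only the speed changed" case concrete, assuming two UCI builds of the same engine, is to search a small set of positions to an identical node budget and check whether the chosen moves differ. The sketch below uses the python-chess library; the engine paths, the FEN list and the node limit are placeholders, not anything from this thread.]

```python
# Sketch: do two builds pick the same moves at an identical node budget?
# Engine paths, positions and the node limit are placeholders.
import chess
import chess.engine

FENS = [
    chess.STARTING_FEN,
    "r1bqkbnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq - 2 3",
]

def chosen_moves(engine_path, fens, nodes=200_000):
    """Run a fixed-node search on every position and collect the move played."""
    engine = chess.engine.SimpleEngine.popen_uci(engine_path)
    moves = []
    try:
        for fen in fens:
            result = engine.play(chess.Board(fen), chess.engine.Limit(nodes=nodes))
            moves.append(result.move)
    finally:
        engine.quit()
    return moves

old = chosen_moves("./engine_old", FENS)
new = chosen_moves("./engine_new", FENS)
diff = [i for i, (a, b) in enumerate(zip(old, new)) if a != b]
print("positions where the chosen move differs:", diff)
```

If the list stays empty over a large suite, the change really only affected speed; otherwise the harder question from the opening post applies.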
-
velmarin
- Posts: 1600
- Joined: Mon Feb 21, 2011 9:48 am
Re: Tuning misery
The Bouquet development version: in a quick count, the pawn evaluation alone has more than 2000 parameters (there are many 64 * 2 arrays).
If I ever win the lottery I'll have a room full of clusters.
Better still: when I ask my wife, she says we're going to an island beach without electronics...
That is the best idea.
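[Editor's note: a toy illustration, not Bouquet's actual evaluation, of where a count like "more than 2000 parameters just for pawns" comes from: every per-square term stored as a 64 * 2 array (one middlegame and one endgame value per square) contributes 128 tunable numbers on its own. The term names below are made up.]

```python
# Toy illustration of how 64 * 2 arrays add up; the term names are hypothetical.
SQUARES = 64
PHASES = 2  # one value for the middlegame, one for the endgame

pawn_terms = ["pawn_psqt", "passed_pawn", "candidate_pawn",
              "isolated_pawn", "backward_pawn", "doubled_pawn"]

total = len(pawn_terms) * SQUARES * PHASES
print(f"{len(pawn_terms)} per-square pawn terms -> {total} parameters")
# 6 terms -> 768 parameters; a real evaluation has many more such arrays
```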
-
hgm
- Posts: 27703
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Tuning misery
Henk wrote: If a change in the parameters only influences the speed and not the moves selected, it is easy to check whether playing strength has improved.
Is it? I would say it is only easy to test whether the speed improved. By definition, if it selects the same moves, its playing strength would not have changed.
Even testing if the speed has changed is not that easy, and requires you to average over many representative positions. Sometimes a change in move ordering that increases average speed backfires in certain positions.
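[Editor's note: a rough sketch of what averaging over many representative positions can look like in practice (not hgm's actual setup): time a fixed-depth search on each position of a suite and compare per-position speed against the average. The engine path, FEN list and depth are placeholders, and the python-chess library is assumed.]

```python
# Sketch: measure search speed per position and on average over a small suite.
# Engine path, positions and depth are placeholders.
import chess
import chess.engine

FENS = [
    chess.STARTING_FEN,
    "r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq - 0 1",
    "8/2k5/3p4/p2P1p2/P2P1P2/8/8/4K3 w - - 0 1",
]

engine = chess.engine.SimpleEngine.popen_uci("./myengine")
try:
    speeds = []
    for fen in FENS:
        info = engine.analyse(chess.Board(fen), chess.engine.Limit(depth=14))
        nodes, secs = info.get("nodes", 0), info.get("time", 0.0)
        if secs:
            speeds.append(nodes / secs)
            print(f"{nodes:>12} nodes in {secs:6.2f}s  ({nodes / secs:,.0f} nps)  {fen}")
    if speeds:
        print(f"average: {sum(speeds) / len(speeds):,.0f} nps over {len(speeds)} positions")
finally:
    engine.quit()
```

A single position can easily be the one where the change backfires, which is exactly why the average alone is not the whole story.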
-
Henk
- Posts: 7210
- Joined: Mon May 27, 2013 10:31 am
Re: Tuning misery
I do not like islands. I watched the movie 'Papillon' when I was 12, on a holiday on a Dutch island. This has changed my life (not).
Maybe one should only change parameters if they are not correlated.
It may also happen that after you have tuned the parameters well, you get a good idea, change the algorithm, and have to start tuning again.
Better to do very lousy or sloppy tuning.
-
velmarin
- Posts: 1600
- Joined: Mon Feb 21, 2011 9:48 am
Re: Tuning misery
At 12 years old you should not be watching movies like "Papillon"...
Well, I think you raised the issue already knowing the answer...
Or was the question just to keep yourself occupied... did it have an answer?
-
Uri Blass
- Posts: 10108
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Tuning misery
hgm wrote:
Henk wrote: If a change in the parameters only influences the speed and not the moves selected, it is easy to check whether playing strength has improved.
Is it? I would say it is only easy to test whether the speed improved. By definition, if it selects the same moves, its playing strength would not have changed.
Even testing if the speed has changed is not that easy, and requires you to average over many representative positions. Sometimes a change in move ordering that increases average speed backfires in certain positions.
I think that the intention is that it chooses the same moves, not in the same time but with the same number of nodes.
If the program searches exactly the same tree and does it faster, and if it has no serious bug that causes it to play weaker at longer time controls (usually programs do not), then it is obvious that a speed improvement means an improvement in playing strength.
Testing a change in move ordering is a little more complicated because the tree is different, but I still believe it may be possible to be convinced of some Elo improvement without testing it in many games, based on testing at fixed depth and showing that the average time to finish the depth is faster while the result is usually the same.
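[Editor's note: a sketch of the fixed-depth comparison described above, assuming two UCI builds and the python-chess library (engine paths, positions and depth are placeholders): search every position to the same depth with both builds, then compare the average time to finish the depth and how often the chosen move agrees.]

```python
# Sketch: compare average time-to-depth and move agreement between two builds.
# Engine paths, positions and the depth are placeholders.
import chess
import chess.engine

FENS = [chess.STARTING_FEN]  # in practice: hundreds of representative positions

def run_suite(engine_path, fens, depth=12):
    """Search each position to a fixed depth; return average time and the moves chosen."""
    engine = chess.engine.SimpleEngine.popen_uci(engine_path)
    times, moves = [], []
    try:
        for fen in fens:
            info = engine.analyse(chess.Board(fen), chess.engine.Limit(depth=depth))
            times.append(info.get("time", 0.0))
            moves.append(info["pv"][0] if info.get("pv") else None)
    finally:
        engine.quit()
    return sum(times) / len(times), moves

old_avg, old_moves = run_suite("./engine_old", FENS)
new_avg, new_moves = run_suite("./engine_new", FENS)
same = sum(a == b for a, b in zip(old_moves, new_moves))
print(f"average time to depth: old {old_avg:.2f}s, new {new_avg:.2f}s")
print(f"same move chosen in {same}/{len(FENS)} positions")
```

If the new build reaches the same depth faster and the chosen moves mostly agree, that is the kind of evidence Uri describes as convincing without playing many games.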