tpetzke wrote: In fact the solution found after 100 generations (each one 8*7 games)
This is only 5,600 games in total. Consider the error bar that each pairing has. You are probably expecting too much given the small number of games that you play. The parameters are still largely influenced by randomness.
You have to play more generations, more games per generation and, depending on the number of values you want to tune, a population size that is big enough. The downside to this is that there is no overnight run.
In my eval tuning I use 1000 generations, about 600 games per generation (an increasing number in later generations) and a population size of 256. It takes 3 to 4 weeks to complete, but then it gets good results.
But it takes more than 600,000 games.
Thomas...
Yes, you're right. This first experiment has a time constraint, because tomorrow the IGT tournament starts in Torino and I have to be there with a playing engine. I will go back to hand-made values for now.
Your numbers are very important to me; they are a good starting point for the next tuning session.
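In outline, a tuning run like yours would look something like the sketch below (all names are hypothetical; only the population size, generation count and game budget are taken from your numbers):

```cpp
#include <algorithm>
#include <random>
#include <vector>

struct Individual {
    std::vector<int> params;   // the evaluation weights being tuned
    double fitness = 0.0;      // accumulated over one generation's games
};

constexpr int kPopulation  = 256;
constexpr int kGenerations = 1000;
constexpr int kGamesPerGen = 600;   // could be raised in later generations

// Placeholder: play one game and update both players' fitness.
void playGame(Individual& a, Individual& b) { /* engine match goes here */ }

int main() {
    std::mt19937 rng(42);
    std::vector<Individual> pop(kPopulation);

    for (int gen = 0; gen < kGenerations; ++gen) {
        for (auto& ind : pop) ind.fitness = 0.0;

        // Spend the generation's game budget on random pairings.
        std::uniform_int_distribution<int> pick(0, kPopulation - 1);
        for (int g = 0; g < kGamesPerGen; ++g)
            playGame(pop[pick(rng)], pop[pick(rng)]);

        // Selection: keep the fitter half, refill with copies of it.
        std::sort(pop.begin(), pop.end(),
                  [](const Individual& x, const Individual& y) {
                      return x.fitness > y.fitness;
                  });
        for (int i = kPopulation / 2; i < kPopulation; ++i)
            pop[i] = pop[i - kPopulation / 2];  // mutation would go here
    }
}
```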
Another problem in my approach is the evaluation of each result. Let's say that player A plays against player B. I stop at move 50, and by then the game may or may not be decided. If not, I run a fast iteration at depth 4 (because of the time constraints mentioned above) on the final position, just to assign a value to the two engines. This value is from White's side, so I add it to the White player (A) and subtract it from the Black player (B).
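Schematically, that scoring step looks like this (a rough sketch; all names are placeholders, not my actual code):

```cpp
struct Position { /* board state of the final position */ };
struct Individual { double fitness = 0.0; /* tuned parameters, etc. */ };

// Placeholder for the fast depth-4 iteration; the score is returned
// from White's point of view, in centipawns.
int evaluateFinalPosition(const Position& pos, int depth) { return 0; }

// Called when the game is stopped at move 50 without a decisive result.
void scoreUnfinishedGame(const Position& finalPos,
                         Individual& white,   // player A
                         Individual& black) { // player B
    int score = evaluateFinalPosition(finalPos, 4);
    white.fitness += score;   // add the White-side score to A
    black.fitness -= score;   // subtract it from B
}
```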
The problems are:
1) the final evaluation is based on player A's parameters and doesn't take player B's point of view into account
2) more generally: maybe it would be better to choose an "external player" as evaluator; maybe another engine, or Satana with hand-made values, for example
Point 1 is not so important, because the parameters tend to be almost equal among the engines in the population.
Point 2 is more complex. If I choose an external evaluator, the whole population would tend to imitate that engine. That's not my goal.
Maybe ignoring the value, or considering only the material in the final position, could be a solution.
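A material-only adjudication, for example, would look roughly like this (a sketch; the piece values and names are just placeholders):

```cpp
#include <array>

// Piece values in centipawns: pawn, knight, bishop, rook, queen.
constexpr std::array<int, 5> kPieceValue = {100, 320, 330, 500, 900};

// Piece counts remaining on the board in the final position.
struct MaterialCount { std::array<int, 5> white{}, black{}; };

// Positive score = White is ahead on material.
int materialScore(const MaterialCount& c) {
    int score = 0;
    for (int p = 0; p < 5; ++p)
        score += kPieceValue[p] * (c.white[p] - c.black[p]);
    return score;
}
```

The appeal is that no tuned parameter from any individual enters the verdict, so the population is not pulled toward anybody's evaluation.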
I would add another point:
3) stopping after an arbitrary number of moves prevents the right parameters from being chosen for endgames or long middlegames; maybe only complete games, or a bigger move limit, would be better
I think this is an unexplored field that will grow in the future. Just talking about yet another way to gain 10 Elo points with an alphabeta-FLDSMDFR algorithm is becoming almost annoying.
