Has anyone tried tuning eval by looking at the concordance of scores among successive evaluations? For instance, play a game, and see if scores earlier are indicative of scores further along in the game. Then, select for evaluation parameters that best give this effect.
Of course, some considerations will have to be taken to avoid training functions in a non-useful manner (ie, evals that always give 0, etc).
tuning by evaluation concordance
-
nionita
- Posts: 175
- Joined: Fri Oct 22, 2010 9:47 pm
- Location: Austria
Re: tuning by evaluation concordance
Yes, I tried this very extensively in the early days of my engine Abulafia, as I had no idea what the parameters of the evaluation function should be. I tried to minimise the eval error between ply 0 and ply 1, then between ply 0 and ply 2, over a set of positions (I am not sure which set I chose), along with some other variations. The parameters I got from this optimization were mostly meaningless, and much worse than the guess I had begun with.
Perhaps it must be done over many more positions (I think I had about 6000 positions). But the fact that there was absolutely no improvement discouraged me from continuing down this path, so I began to optimise the parameters as everyone does, by playing real games.
-
jdart
- Posts: 4361
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: tuning by evaluation concordance
The problem is, most chess evaluation functions are very nonlinear. Plus, chess programs still suffer from the so-called horizon effect, which "hides" the result of a move until some time after the move has been played. So it is not uncommon to have sudden jumps in value from one move to another. For this reason I don't think your suggested tuning method will work well.
-
AlvaroBegue
- Posts: 931
- Joined: Tue Mar 09, 2010 3:46 pm
- Location: New York
- Full name: Álvaro Begué (RuyDos)
Re: tuning by evaluation concordance
I remember reading an article about a similar method in the ICCA magazine many years ago. Here it is: http://citeseerx.ist.psu.edu/viewdoc/do ... 1&type=pdf
I don't have any hard evidence, but I think it's probably better to try to estimate the result of the game directly, as 1/(1+exp(-eval)). I am working on implementing a tuning mechanism for non-linear evaluation functions using a method called BFGS. If my results are interesting, I'll post them here.
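A minimal sketch of the result-prediction idea above, in Python. The names `predicted_result` and `mean_squared_error`, the `scale` parameter, and the result encoding (1 = win, 0.5 = draw, 0 = loss, from the same side's point of view as the eval) are assumptions for illustration, not something taken from the post.

```python
import math

def predicted_result(eval_score, scale=1.0):
    """Map an eval to an expected game result in [0, 1] via 1/(1+exp(-eval)).

    `scale` is a hypothetical knob for converting eval units (e.g. centipawns)
    into the units the logistic curve expects.
    """
    x = scale * eval_score
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # Equivalent form that avoids overflow for large negative evals.
    e = math.exp(x)
    return e / (1.0 + e)

def mean_squared_error(samples, scale=1.0):
    """Average squared gap between predicted and actual results.

    `samples` is a list of (eval_score, result) pairs, with result assumed
    to be 1.0, 0.5 or 0.0.
    """
    total = 0.0
    for eval_score, result in samples:
        total += (result - predicted_result(eval_score, scale)) ** 2
    return total / len(samples)
```

A tuner would then adjust the evaluation parameters to make this error small over a large set of positions labelled with game results; which optimizer to use is a separate choice.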
-
Rémi Coulom
- Posts: 438
- Joined: Mon Apr 24, 2006 8:06 pm
Re: tuning by evaluation concordance
DrRibosome wrote:
Has anyone tried tuning eval by looking at the concordance of scores among successive evaluations? For instance, play a game, and see if scores earlier are indicative of scores further along in the game. Then, select for evaluation parameters that best give this effect.
Of course, some considerations will have to be taken to avoid training functions in a non-useful manner (ie, evals that always give 0, etc).

What you describe is called the temporal-difference method (TD). It has been applied to many games, with some spectacular success in backgammon. You'll find a lot by googling it. I am not so familiar with computer chess any more, but I would not be surprised if some of the strong programs use a form of TD learning.
I applied TD learning with success to my Othello program. I simply tuned the evaluation parameters such that the static evaluation matches a 3-ply search, over hundreds of thousands of games. Maybe you can try it in chess too. Make sure you apply it only to quiet positions.
Rémi
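As a rough illustration of tuning a static evaluation toward a shallow search score, here is a Python sketch. It assumes a linear evaluation (a weighted sum of position features) and a plain gradient step on the squared error; the post does not say which evaluation form or update rule was actually used. `static_eval`, `update_toward_search`, `features`, `search_score` and `learning_rate` are all hypothetical names, and `search_score` stands for whatever the engine's 3-ply search returns for a quiet position.

```python
def static_eval(weights, features):
    """Linear evaluation: weighted sum of position features (an assumption)."""
    return sum(w * f for w, f in zip(weights, features))

def update_toward_search(weights, features, search_score, learning_rate=0.01):
    """One gradient step that moves the static eval toward the search score.

    Minimizes 0.5 * (search_score - static_eval)^2 with respect to the weights.
    Returns a new weight list; the input list is not modified.
    """
    error = search_score - static_eval(weights, features)
    return [w + learning_rate * error * f for w, f in zip(weights, features)]
```

In practice this step would be repeated over many quiet positions, each with its own feature vector and search score.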
-
DrRibosome
- Posts: 19
- Joined: Tue Mar 12, 2013 5:31 pm
Re: tuning by evaluation concordance
I am not referring to TD learning, etc.
Instead, more specifically, calculate something like the Kendall tau statistic for the score predictions over the course of sample games. Then use some selection method (perhaps a GA?) to generate new parameters with the goal of maximizing the concordance. That way, you try to maximize the correctness of the predictions.
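One possible reading of this, sketched in Python: pair each score in a game with the score `lag` plies later, pool the pairs over all sample games, and compute Kendall's tau over them. The pairing scheme, the `lag` parameter, the function names, and the assumption that every score is from the same side's point of view are illustrative choices, not something the post specifies. The tau-a variant is used here, so tied pairs count as neither concordant nor discordant.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs if n_pairs else 0.0

def concordance(games, lag=4):
    """Tau between scores and the scores `lag` plies later, pooled over games.

    `games` is a list of per-game score lists (hypothetical input format).
    """
    early, late = [], []
    for scores in games:
        for i in range(len(scores) - lag):
            early.append(scores[i])
            late.append(scores[i + lag])
    return kendall_tau(early, late)
```

With tau-a, an evaluation that always returns 0 produces only ties and scores 0 rather than 1, which goes some way toward the degenerate case raised in the first post. The pairwise loop is quadratic in the number of score pairs, so a large game set would need a faster tau implementation.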