tuning by evaluation concordance

Discussion of chess software programming and technical issues.

Moderators: hgm, Dann Corbit, Harvey Williamson

DrRibosome
Posts: 19
Joined: Tue Mar 12, 2013 5:31 pm

tuning by evaluation concordance

Post by DrRibosome »

Has anyone tried tuning eval by looking at the concordance of scores among successive evaluations? For instance, play a game, and see if scores earlier in the game are indicative of scores further along. Then, select the evaluation parameters that best produce this effect.

Of course, some care will have to be taken to avoid training the function in a non-useful direction (i.e., evals that always return 0, etc).
nionita
Posts: 175
Joined: Fri Oct 22, 2010 9:47 pm
Location: Austria

Re: tuning by evaluation concordance

Post by nionita »

Yes, I tried this very extensively at the beginning of my engine Abulafia, as I had no idea what the parameters of the evaluation function should be. I tried to minimise the error of the eval between ply 0 and ply 1, then between ply 0 and ply 2, over a set of positions (not sure which set I chose), and some other variations. The parameters I got from this optimisation were mostly meaningless, and much worse than the guess I had begun with.

Perhaps it must be done over many more positions (I think I had about 6000 positions). But the fact that there was absolutely no improvement demotivated me from continuing this way, and I then began to optimise the parameters as everyone does, by playing real games.
jdart
Posts: 4361
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: tuning by evaluation concordance

Post by jdart »

The problem is that most chess evaluation functions are very nonlinear. Plus, chess programs still suffer from the so-called horizon effect, which "hides" the result of a move until some time after it has been played. So it is not uncommon to have sudden jumps in value from one move to another. For this reason I don't think your suggested tuning method will work well.
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: tuning by evaluation concordance

Post by AlvaroBegue »

I remember reading an article about a similar method in the ICCA magazine many years ago. Here it is: http://citeseerx.ist.psu.edu/viewdoc/do ... 1&type=pdf

I don't have any hard evidence, but I think it's probably better to try to estimate the result of the game directly, as 1/(1+exp(-eval)). I am working on implementing a tuning mechanism for non-linear evaluation functions using a method called BFGS. If my results are interesting, I'll post them here.
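As a rough illustration of the idea of fitting the eval directly to game results, here is a minimal sketch in Python: plain stochastic gradient descent on the cross-entropy between 1/(1+exp(-eval)) and the game outcome, assuming a simple linear eval over a feature vector. The feature representation and learning rate are placeholders, not anything from Álvaro's actual BFGS implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tune_logistic(positions, results, weights, lr=0.01, epochs=100):
    """Fit eval weights so that sigmoid(eval) predicts the game result
    (1 = win, 0.5 = draw, 0 = loss).  `positions` is a list of feature
    vectors; eval is assumed linear: dot(weights, features)."""
    for _ in range(epochs):
        for feats, result in zip(positions, results):
            e = sum(w * f for w, f in zip(weights, feats))
            p = sigmoid(e)
            # gradient of cross-entropy w.r.t. weight k is (p - result) * f_k
            for k, f in enumerate(feats):
                weights[k] -= lr * (p - result) * f
    return weights
```

A full-strength version would use a quasi-Newton method such as BFGS instead of SGD, but the objective being minimised is the same.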
Rémi Coulom
Posts: 438
Joined: Mon Apr 24, 2006 8:06 pm

Re: tuning by evaluation concordance

Post by Rémi Coulom »

DrRibosome wrote:Has anyone tried tuning eval by looking at the concordance of scores among successive evaluations? For instance, play a game, and see if scores earlier are indicative of scores further along in the game. Then, select for evaluation parameters that best give this effect.

Of course, some considerations will have to be taken to avoid training functions in a non-useful manner (ie, evals that always give 0, etc).
What you describe is called the temporal-difference method (TD). It has been applied to many games, with some spectacular success in backgammon. You'll find a lot by googling it. I am not so familiar with computer chess any more, but I would not be surprised if some of the strong programs use a form of TD learning.

I applied TD learning with success to my Othello program. I simply tuned the evaluation parameters such that the static evaluation matches a 3-ply search, over hundreds of thousands of games. Maybe you can try it in chess too. Make sure you apply it only to quiet positions.
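A bare-bones sketch of that "match the static eval to a shallow search" recipe, assuming a linear eval and a caller-supplied `search_score` function (e.g. a 3-ply search) — both are placeholders for the engine's own code, not anything specific to Rémi's Othello program:

```python
def td_tune(positions, weights, search_score, lr=1e-4):
    """One pass over a batch of quiet positions: nudge the static-eval
    weights toward the score returned by a shallow search.

    `search_score(pos)` is assumed to return the (say) 3-ply search score,
    and `pos.features` the position's feature vector."""
    for pos in positions:
        target = search_score(pos)
        static = sum(w * f for w, f in zip(weights, pos.features))
        err = static - target
        # squared-error gradient: move each weight against the error
        for k, f in enumerate(pos.features):
            weights[k] -= lr * err * f
    return weights
```

Repeated over many games' worth of positions, this pulls the static eval toward agreement with a search it can't afford to do at the leaves — which is why restricting it to quiet positions matters, since tactical positions make the shallow-search target noisy.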

Rémi
DrRibosome
Posts: 19
Joined: Tue Mar 12, 2013 5:31 pm

Re: tuning by evaluation concordance

Post by DrRibosome »

I am not referring to TD learning, etc.

Instead, more specifically: calculate something like the Kendall tau statistic associated with score predictions over the course of sample games. Then, use some selection mechanism (perhaps a GA?) to generate new parameter sets with the goal of maximizing the concordance. That way, one tries to maximize the correctness of the predictions.
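A minimal sketch of such a fitness function, assuming each game is just a list of evaluation scores at successive plies (the lag and pooling scheme are illustrative choices, not part of the proposal):

```python
def kendall_tau(xs, ys):
    """Plain O(n^2) Kendall tau over paired observations (no tie correction)."""
    n = len(xs)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

def concordance_fitness(games, lag=10):
    """Rank correlation between the eval at ply t and the eval at ply
    t+lag, pooled over all games.  Higher means earlier scores rank
    positions the same way later scores do."""
    early, late = [], []
    for evals in games:
        for t in range(len(evals) - lag):
            early.append(evals[t])
            late.append(evals[t + lag])
    return kendall_tau(early, late)
```

Note that the degenerate case mentioned earlier in the thread (an eval that always returns 0) produces all ties, so this fitness comes out 0 rather than 1 — rank correlation penalises constant evals automatically.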