Texel tuning method question

hgm · Post by **hgm** » Wed Jun 07, 2017 4:13 pm

Sven Schüle wrote:I understand that. But what is the advantage of doing a search, then, instead of directly calling static eval for each (root) position and calculating the gradient of the corresponding error function, based on initially gathering reference eval scores?

The QS root might not be quiet. It would introduce a lot of noise, and possibly bias, to include non-quiet positions in your test set. E.g. if you were tuning a 'side-to-move bonus', it would get very large, indicating how much hanging material you can gobble up on the average in the test set.

jdart · Post by **jdart** » Wed Jun 07, 2017 4:47 pm

So at first glance i don't know in what way the "eval at the end of the pv" is different to the search result score.

It isn't. But after obtaining the PV in the first iteration I don't then actually do another search after adjusting the parameters. Each gradient descent step just does a static eval for each position and assumes the PV does not change. (I don't take credit for this idea: it was used in the related MMTO optimization method by Hoki and Kaneko). I have the option of periodically re-calculating the PV after some number of iterations.

--Jon

jdart · Post by **jdart** » Wed Jun 07, 2017 4:51 pm

Gradient descent is fast and 400 parameters is not a problem for that.

--Jon

Desperado · Post by **Desperado** » Wed Jun 07, 2017 4:57 pm

hgm wrote:
Sven Schüle wrote:I understand that. But what is the advantage of doing a search, then, instead of directly calling static eval for each (root) position and calculating the gradient of the corresponding error function, based on initially gathering reference eval scores?
The QS root might not be quiet. It would introduce a lot of noise, and possibly bias, to include non-quiet positions in your test set. E.g. if you were tuning a 'side-to-move bonus', it would get very large, indicating how much hanging material you can gobble up on the average in the test set.

Again and very simple !

I simply asked for the terminologie "end-of-PV evals" which is a nonsense term imho because any normal search will produce a "search result score" that is the value of a pv leaf node
(is this the crux at this point ? are we talking of qs value of the leaf node or a static eval score of the leaf node).

And what has it to do with the Texel algorihm itself? I ask again because if you have blackbox which provides the score, the algorithm will work exactly the same.

If i come close with my understanding which would mean a static score from the pv leaf, i bet it provides the same noise as any other static score you compute in any other situation.

Robert Pope · Post by **Robert Pope** » Wed Jun 07, 2017 5:01 pm

Sven Schüle wrote: I understand that. But what is the advantage of doing a search, then, instead of directly calling static eval for each (root) position and calculating the gradient of the corresponding error function, based on initially gathering reference eval scores?

What I think about are the situations where you have a series of exchanges in the PV. If you minimize error on the root, then you have several extra pieces on the board that will have their values affecting the score when they shouldn't, adding noise and misfitting.

AlvaroBegue · Post by **AlvaroBegue** » Wed Jun 07, 2017 5:08 pm

Desperado wrote:
hgm wrote:
Sven Schüle wrote:I understand that. But what is the advantage of doing a search, then, instead of directly calling static eval for each (root) position and calculating the gradient of the corresponding error function, based on initially gathering reference eval scores?
The QS root might not be quiet. It would introduce a lot of noise, and possibly bias, to include non-quiet positions in your test set. E.g. if you were tuning a 'side-to-move bonus', it would get very large, indicating how much hanging material you can gobble up on the average in the test set.
Again and very simple !

I simply asked for the term "end-of-PV evals" which is a nonsense term imho because any normal search will produce a "search result score" that is the value of a pv leaf node (is this the crux at this point ? are we talking of qs value of the leaf node or a static eval score of the leaf node).

And what has it to do with the Texel algorihm itself? I ask again because if you have blackbox which provides the score, the algorithm will work exactly the same.

If i come close with my understanding which would mean a static score from the pv leaf, i bet i provides the same noise as any other static score you compute in any other situation.

I don't know if I can explain it any clearer than it has been explained already, but I'll give it another try.

In Texel's tuning method you are trying to minimize a function E, which is the average value of the square of the difference between the result of a game and the sigmoid of the QS score (with infinite alpha-beta window) of a position that happened during the game, with the average taken over a large number of training pairs.

In order to do this minimization, we'll use the gradient of E with respect to the parameters we are trying to tune. By linearity of the derivative, this is the average of the gradients of the functions that compute the square error for a single position.

OK, now that the stage has been set: The gradient computation is done by performing the QS search on the position, recovering the leaf of the QS search tree whose score was propagated to the root of the QS search, and then computing the gradient on that position.

Yes, the end-of-PV static evaluation will be the same as the result of the QS search, but thinking of it as the end-of-PV static evaluation allows you to compute the gradient more easily.

I'll be happy to explain technical details about how the gradient can be computed, if that might help.

Desperado · Post by **Desperado** » Wed Jun 07, 2017 5:10 pm

jdart wrote:
So at first glance i don't know in what way the "eval at the end of the pv" is different to the search result score.
It isn't. But after obtaining the PV in the first iteration I don't then actually do another search after adjusting the parameters. Each gradient descent step just does a static eval for each position and assumes the PV does not change. (I don't take credit for this idea: it was used in the related MMTO optimization method by Hoki and Kaneko). I have the option of periodically re-calculating the PV after some number of iterations.

--Jon

Hello Jon,

this assumption can always be done, even if you do not compute the pv but retrieve the score from a blackbox. I don't see the clue. But thx for some more clarification that helped at least a little bit.

Desperado · Post by **Desperado** » Wed Jun 07, 2017 5:37 pm

hgm wrote:
Sven Schüle wrote:I understand that. But what is the advantage of doing a search, then, instead of directly calling static eval for each (root) position and calculating the gradient of the corresponding error function, based on initially gathering reference eval scores?
The QS root might not be quiet. It would introduce a lot of noise, and possibly bias, to include non-quiet positions in your test set. E.g. if you were tuning a 'side-to-move bonus', it would get very large, indicating how much hanging material you can gobble up on the average in the test set.

Sorry HG, but this isn't a matter of the tuning algorithm, but a matter of the raw data. If you build a database that already satisfies the qualitiy requierments you can simply perform direct calls an the static eval for the root.

The algorithm is not affected and the choice of the raw data is a different matter. Finally the term "end-of-PV evals" is a redundant vocabulary that does not introduce some special meaning.

Well, imho a good preparation of the raw data allows fast computing of the gradient because the of the assumption that a pv does not change and the assumption that we have a collection of positions with minimal noise.

Now, imho the Texel algorithm is not effected and no special terminology is needed.

Thx to all of you.

Desperado · Post by **Desperado** » Wed Jun 07, 2017 5:44 pm

Hi Alvaro,

... 'll be happy to explain technical details about how the gradient can be computed, if that might help...

Of course

, thx for your patience.

hgm · Post by **hgm** » Wed Jun 07, 2017 5:46 pm

Desperado wrote:If i come close with my understanding which would mean a static score from the pv leaf, i bet it provides the same noise as any other static score you compute in any other situation.

What you seem to miss is that the tuning method doesn't use only the score, but also the gradient of the score, i.e. the direction in parameter space in which the score increases fastest. It needs this to know the relative magnitude of the changes it has to apply to all the parameters to get closer to the optimum.

To know this, you have to know how the score of the individual test cases will change as a result of a change in each parameter. But you cannot see how much each parameter contributes to the score in the root. E.g. the root could have a Rook for the opponent, and none for you. Then the Root score would get worse for you if the Rook value increases. But the PV might be NxR, PxN, so that the position at the end of the PV does not contain any Rooks at all, and the static eval of that position (and thus the root score )would be insensitive to the Rook value.

Texel tuning method question

Re: Texel tuning method question

Re: Texel tuning method question

Re: Texel tuning method question

Re: Texel tuning method question

Re: Texel tuning method question

Re: Texel tuning method question

Re: Texel tuning method question

Re: Texel tuning method question

Re: Texel tuning method question

Re: Texel tuning method question