texel tuning

flok · Post by **flok** » Fri Jun 01, 2018 10:35 am

Am looking at texel tuning.

From the wiki: "K is a scaling constant. Compute the K that minimizes E."
How to do so?

Also the wiki says to run a QS. But if I use quiet positions, isn't it enough to run only eval and use the result of that?

sedicla · Post by **sedicla** » Fri Jun 01, 2018 1:31 pm

You need a function that loops over a range of K values and keeps the one that minimizes E. You choose the range yourself. Here is my code.

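Code:

#include <float.h>
#include <stdio.h>

/* calc_e_main(), tune_param_value, MAX_THREADS and tune_thread are defined
   elsewhere in the engine; calc_e_main() returns the error E for a given K. */
double calc_min_k(void)
{
    double k = 0.0;        /* best K found so far */
    double s = DBL_MAX;    /* smallest error seen so far */
    int step;

    /* Scan K from -2.0 to 2.0 in steps of 0.1. An integer counter avoids
       the rounding drift of repeatedly adding 0.1 to a double. */
    for (step = -20; step <= 20; step++) {
        double i = step / 10.0;
        double e = calc_e_main(i, tune_param_value, MAX_THREADS, tune_thread);
        if (e < s) {
            k = i;
            s = e;
        }
        printf("i=%3.10f e=%3.10f k=%3.10f\n", i, e, k);
    }

    printf("\nk=%1.8f\n", k);

    return k;
}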

I am using quiesce. I'm not sure whether it makes a difference when using quiet positions; I haven't tested it. Maybe others have tried and can provide some insight.
HTH.

flok · Post by **flok** » Fri Jun 01, 2018 1:46 pm

Hi,

sedicla wrote: ↑Fri Jun 01, 2018 1:31 pm You have to create a function to loop a range of values and keep the one that minimizes e. You can select the range values.

That's not entirely what I meant.

From the wiki: "K is a scaling constant. Compute the K that minimizes E." How to do so?

What I meant is: how do I determine that scaling constant?

Currently the score of my eval routine lies between -99.99 and +99.99. Maybe a dumb question, but do I just feed that evaluation value (or the score from the quiesce function) into the sigmoid function and then check whether the result is lower than the previous one?

I tried that, and it only made the scaling values "bigger". For example, PSQ would no longer be divided by 4 but by 1, and the sum of the evaluation values for each piece on the board would get multiplied by 99 (the limit I gave for that setting) instead of 1. The game play did not necessarily get better (far from it, actually).

jdart · Post by **jdart** » Fri Jun 01, 2018 4:48 pm

What you are trying to minimize is the sum of the squared error between the predicted outcome of the game (which is the sigmoid function of the eval) and the actual result (0, 0.5 or 1), summed over all positions. This is what is termed "logistic regression."

K is a constant used in the sigmoid function and changing it will vary the "goodness" of prediction between the eval and the end result of the game (changing the eval weights modifies the predictiveness, too, but K is also a factor).

I use K = 0.75 and that is near-optimal for me. If you want to optimize it, run the whole tuning process (or at least one iteration of it, over all the positions) and choose the K value that minimizes the objective (mean-squared error).

--Jon

Robert Pope · Post by **Robert Pope** » Fri Jun 01, 2018 8:26 pm

jdart wrote: ↑Fri Jun 01, 2018 4:48 pm What you are trying to minimize is the sum of the squared error between the predicted outcome of the game (which is the sigmoid function of the eval) and the actual result (0, 0.5 or 1), summed over all positions. This is what is termed "logistic regression."

K is a constant used in the sigmoid function and changing it will vary the "goodness" of prediction between the eval and the end result of the game (changing the eval weights modifies the predictiveness, too, but K is also a factor).

I use K = 0.75 and that is near-optimal for me. If you want to optimize it, run the whole tuning process (or at least one iteration of it, over all the positions) and choose the K value that minimizes the objective (mean-squared error).

--Jon

Just to elaborate a bit, you are creating a training set of X thousand positions, and recording your evaluation/quiesce score and the game outcome:
1.5 1.0
4.7 1.0
1.6 0.5
-3.7 0.0
...
Then you calculate the squared error for each position, e.g. [sigmoid(1.5) - 1.0]^2, and sum those across all positions in your training set to get a total error value (SSE).

In stage 1, you tweak K to find the value where the sigmoid curve fits your current evaluation function most naturally. I would say this is useful because it reduces the amount of eval term adjustment needed to fit the sigmoid curve later on, and it helps retain the original evaluation scaling (the "100 = one pawn advantage" type of metric), but it isn't absolutely required. In other domains it's not unusual to define K=1 and just go from there.

In stage 2, you are tweaking your evaluation terms to find the values that best fit the game outcomes.

Ferdy · Post by **Ferdy** » Sun Jun 03, 2018 4:55 pm

flok wrote: ↑Fri Jun 01, 2018 10:35 am Am looking at texel tuning.

From the wiki: "K is a scaling constant. Compute the K that minimizes E."
How to do so?

Also the wiki says to run a QS. But if I use quiet positions, isn't it enough to run only eval and use the result of that?

The sigmoid is like a scoring rate, and it varies from engine to engine. If your engine evaluates a certain position at 100 cp, what is its scoring rate? Is it 100%, 90%, 80%, etc.? This scoring rate can be adjusted with the factor K in the following formula.

Code:

sigmoid = 1 / [1 + 10^(-ks/400)]

Example:
s = 100 cp
k = 1
sigmoid = 0.64

If you use K = 1, can your engine achieve a scoring rate of 0.64 or 64% against equal to stronger opponents?

If we can find the engine's scoring rate in positions it evaluates as a 1-pawn advantage, then we can derive K from the sigmoid formula.

One approach to finding your engine's scoring rate is to collect its games, extract the scores it reports in positions where it is to move, and average the game results for each score range.

Example:

Game 1: Win
[White "yourengine"]
[Black "opp"]
[Result "1-0"]

1. e4 {book} e5 {book} 2. Nf3 {book} Nc6 {book} 3. Bb5 {book} a6 {book}
4. Ba4 {book} Nf6 {book} 5. O-O {book} Be7 {book} 6. Re1 {book} b5 {book}
7. Bb3 {book} O-O {book} 8. d3 {book} d6 {book} 9. a4 {book} b4 {book}
10. Nbd2 {book} Na5 {book} 11. Ba2 {book} c5 {book} 12. Nc4 {+0.21/13 0.75s}
Nxc4 {-0.11/16 0.73s} 13. Bxc4 {+0.08/17 0.55s} Be6 {-0.04/16 0.67s} ...

12. +0.21
13. +0.08
...

Say
35. +1.0
36. +1.05
37. +1.10

Since your engine won the game, count it as a win for that sample score range. The figures below are the running total of points in that range.
[1.0, 1.10] = 1.0

Game 2: Draw
[1.0, 1.10] = 1.5

Game 3: Win
[1.0, 1.10] = 2.5

scoring rate = 2.5/3 = 0.83
The more games, the better.
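The tally above could be automated along these lines. This is a sketch under assumptions: the GameSample struct is hypothetical and holds one sampled score per game (in pawns) plus the result from the engine's point of view; parsing the PGN comments is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-game record: one score the engine reported in the
   game (in pawns) and the game result from the engine's point of view. */
typedef struct {
    double score;
    double result;     /* 1.0 win, 0.5 draw, 0.0 loss */
} GameSample;

/* Scoring rate over all games whose sampled score lies in [lo, hi].
   Returns -1.0 when no game falls in that range. */
static double bucket_scoring_rate(const GameSample *games, size_t n,
                                  double lo, double hi)
{
    assert(lo <= hi);
    double points = 0.0;
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (games[i].score >= lo && games[i].score <= hi) {
            points += games[i].result;
            count++;
        }
    }
    return count > 0 ? points / (double)count : -1.0;
}
```

For the three games in the example (win, draw, win, all with a score in [1.0, 1.10]) this returns 2.5/3.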

Now if you have that scoring rate, we can solve for K in sigmoid.
sigmoid = 1 / (1 + 10^(-ks/400))
a = -ks/400
sigmoid = 1 / (1 + 10^a)
sigmoid * (1 + 10^a) = 1
1 + 10^a = 1/sigmoid
10^a = (1/sigmoid) - 1
b = (1/sigmoid) - 1
10^a = b

log10(b) = a
a = -ks/400

log10(b) = -ks/400
-ks/400 = log10(b)
-ks = 400 * log10(b)
-k = 400/s * log10(b)
k = -400/s * log10(b)
b = (1/sigmoid) - 1

k = -400/s * log10((1/sigmoid) - 1)

Given:
s = 100 or 1 pawn
sigmoid or scoring rate = 0.58

k = -400/100 * log10((1/0.58) - 1)
k = -400/100 * (-0.1402)
k = 0.56

[Plot: Deuterium's (D2018 and D2014) scoring rate, extracted from games against various common opponents of similar or lower strength, together with the sigmoid for varying K values.]

lucasart · Post by **lucasart** » Fri Jun 08, 2018 1:13 pm

flok wrote: ↑Fri Jun 01, 2018 10:35 am Am looking at texel tuning.

From the wiki: "K is a scaling constant. Compute the K that minimizes E."
How to do so?

It's trivial using the Newton-Raphson method. What you want to compute is K such that E'(K)=0. Don't they teach this at high school anymore?

flok wrote: Also the wiki says to run a QS. But if I use quiet positions, isn't it enough to run only eval and use the result of that?

In theory, yes. In practice you'll have to generate tens of thousands of positions, so how will you know they're all quiet without qsearching them?

Sven · Post by **Sven** » Sat Jun 09, 2018 12:02 pm

lucasart wrote: ↑Fri Jun 08, 2018 1:13 pm
flok wrote: ↑Fri Jun 01, 2018 10:35 am Am looking at texel tuning.

From the wiki: "K is a scaling constant. Compute the K that minimizes E."
How to do so?

It's trivial using the Newton-Raphson method. What you want to compute is K such that E'(K)=0. Don't they teach this at high school anymore?

flok wrote: Also the wiki says to run a QS. But if I use quiet positions, isn't it enough to run only eval and use the result of that?

In theory, yes. In practice you'll have to generate tens of thousands of positions, so how will you know they're all quiet without qsearching them?

The selection of quiet positions from a given set of arbitrary positions can be done once in advance. And there are people who did that and published their results already, so one could rely on their work. I use such a quiet set and call eval() instead of qs(), and it works like a charm. It also saves a lot of computation time.
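One way such a one-time pre-filter might look is sketched below. It assumes a position counts as quiet when quiescence search returns the static eval unchanged; that criterion is an assumption made here, not something stated in the post. The Candidate struct and its fields are hypothetical, with both scores already computed by the engine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: static eval and quiescence-search score for the
   same position, both in centipawns, plus the game outcome. */
typedef struct {
    int static_eval;
    int qs_score;
    double outcome;
} Candidate;

/* Treat a position as quiet when quiescence search returns the static
   eval unchanged, i.e. no capture sequence alters the score. */
static int is_quiet(const Candidate *c)
{
    return c->static_eval == c->qs_score;
}

/* Compact the array in place, keeping only quiet positions.
   Returns the number of positions kept. */
static size_t keep_quiet(Candidate *set, size_t n)
{
    assert(set != NULL || n == 0);
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (is_quiet(&set[i]))
            set[kept++] = set[i];
    }
    return kept;
}
```

After this pass, the tuner can call eval() directly on the remaining positions on every iteration.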

RubiChess · Post by **RubiChess** » Sat Jun 09, 2018 12:40 pm

I'm reusing this thread for another question related to Texel tuning:

Suppose I have introduced a new evaluation feature and want to tune its parameters.
Now the question: is it important that at least one, or even both, of the engines used to generate the game data for tuning support this new feature? I could imagine that if neither engine knows about this evaluation feature, the games simply cannot expose its value and tuning will fail.

- Andreas

Sven · Post by **Sven** » Sat Jun 09, 2018 5:49 pm

RubiChess wrote: ↑Sat Jun 09, 2018 12:40 pm I reuse this thread for another Question related to Texel-Tuning:

Suppose I have introduced a new evaluation feature and want to tune its parameters.
Now the question: Is it important that at least one or even both engines that are used to generate the game data used for tuning do support this new feature? I could imagine that if both engines don't know about this evaluation feature they simply cannot expose its value and tuning will fail.

- Andreas

I think it is not important. More relevant is that the set of positions you use for tuning sufficiently covers the cases that your new feature addresses. For example, if you introduce detection of a rook trapped by bishop and pawn near a corner in the late endgame but you mostly use middlegame positions, then tuning will not help much.
