Weird Results from Texel Tuning

Discussion of chess software programming and technical issues.

Moderator: Ras

Re: Weird Results from Texel Tuning

Post by Ras »

mvanthoor wrote: Fri Jan 19, 2024 11:04 pm When you use a different training set T2 with evaluation E, you should recompute K.
You can either do that, or compute the K for both training sets on the pre-tuned eval. I didn't gain anything from training both on the Zurichess and the Lichess data, so I went with the Lichess ones only because those gave slightly better results for me.
Do you also need to recompute K if you use the old training data T with a new evaluation E2?
That wouldn't gain you anything; it would only inflate your eval, for the reasons I mentioned before with positions such as KQ:K. The whole point is not the absolute eval or its MSE, it is its ability to tell advantages from disadvantages.
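For illustration, a minimal sketch in Rust of how K can be determined, assuming the training positions have already been reduced to (static eval in centipawns, game result) pairs, with the result encoded as 1 / 0.5 / 0 from the eval's point of view. The start value, the scan steps and the names are made up for the example and not taken from any engine.

Code:

fn sigmoid(score_cp: f64, k: f64) -> f64 {
    // Texel's logistic mapping from a centipawn score to an expected result.
    1.0 / (1.0 + 10f64.powf(-k * score_cp / 400.0))
}

fn mse(data: &[(f64, f64)], k: f64) -> f64 {
    // Mean squared error between game results and the sigmoid of the eval.
    let sum: f64 = data
        .iter()
        .map(|&(score_cp, result)| {
            let e = sigmoid(score_cp, k);
            (result - e) * (result - e)
        })
        .sum();
    sum / data.len() as f64
}

// Simple coarse-to-fine scan for the K that minimizes the MSE of the
// pre-tuning eval on this data set.
fn find_k(data: &[(f64, f64)]) -> f64 {
    let mut best_k = 1.0;
    let mut best_mse = mse(data, best_k);
    let mut step = 0.1;
    while step > 1e-4 {
        loop {
            let mut improved = false;
            for &cand in &[best_k - step, best_k + step] {
                let m = mse(data, cand);
                if m < best_mse {
                    best_mse = m;
                    best_k = cand;
                    improved = true;
                }
            }
            if !improved {
                break;
            }
        }
        step /= 10.0;
    }
    best_k
}

Whatever K comes out of this is then kept for the tuning run on that data set.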
Rasmus Althoff
https://www.ct800.net

Re: Weird Results from Texel Tuning

Post by mvanthoor »

Ras wrote: Fri Jan 19, 2024 11:19 pm That wouldn't gain you anything; it would only inflate your eval, for the reasons I mentioned before with positions such as KQ:K. The whole point is not the absolute eval or its MSE, it is its ability to tell advantages from disadvantages.
I don't understand. If I add terms to the evaluation (king safety for example) and I then re-tune on the data set I used before, I assume that K should now be different, because my evaluation has changed and it should now be better at telling an advantage from a disadvantage.

What I mean is: if I only have PSQTs, and I compute K by putting a data set through the evaluation and finding the K that minimizes the MSE, then I'm quite sure that, if I add terms to the evaluation and recompute K for the same data set, it should be different. If a training set had a certain K that never changed for any evaluation, why would it need to be computed at all? Someone could just do it once (and if so, on the basis of what evaluation function?) and provide the K value with the data set.

I can't believe I can just compute K once for a training set, and then use that K and training set over and over even when I add 2, 3, or 10 terms to the evaluation.
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL

Re: Weird Results from Texel Tuning

Post by Ras »

mvanthoor wrote: Fri Jan 19, 2024 11:43 pm If I add terms to the evaluation (king safety for example) and I then re-tune on the data set I used before, I assume that K should now be different
That's correct, yes.
I can't believe I can just compute K once for a training set, and then use that K and training set over and over even when I add 2, 3, or 10 terms to the evaluation.
That's indeed not how to do it. K is kept constant within the algorithm, i.e. you determine it once before you run the actual tuner. I'd not start with the already trained eval, but ditch the previously gained weights and start over from 0 with the new eval.
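As a rough sketch of that order of operations, with the MSE over the training set supplied as a closure and everything else invented for the example (this is not CT800's or Rustic's code):

Code:

// Order of operations: K is fixed first, the weights are reset, and only then
// does the tuner run, with that same K used for every error computation.
fn tune_from_scratch(
    n_weights: usize,
    k_fixed: f64,                        // determined once, before tuning
    mse: impl Fn(&[f64], f64) -> f64,    // error of a weight vector at this K
) -> Vec<f64> {
    // Ditch the previously gained weights and start over from 0.
    let mut weights = vec![0.0; n_weights];
    let mut best = mse(&weights[..], k_fixed);

    // Classic Texel local search: nudge one weight at a time by +/-1 and keep
    // the change whenever the error drops; repeat until nothing improves.
    let mut improved = true;
    while improved {
        improved = false;
        for i in 0..weights.len() {
            for &delta in &[1.0, -1.0] {
                weights[i] += delta;
                let m = mse(&weights[..], k_fixed);
                if m < best {
                    best = m;
                    improved = true;
                    break; // keep the change, move to the next weight
                }
                weights[i] -= delta; // neither direction helped: revert
            }
        }
    }
    weights
}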
Rasmus Althoff
https://www.ct800.net

Re: Weird Results from Texel Tuning

Post by mvanthoor »

Ras wrote: Sat Jan 20, 2024 9:32 am
mvanthoor wrote: Fri Jan 19, 2024 11:43 pm If I add terms to the evaluation (king safety for example) and I then re-tune on the data set I used before, I assume that K should now be different
That's correct, yes.
I can't believe I can just compute K once for a training set, and then use that K and training set over and over even when I add 2, 3, or 10 terms to the evaluation.
That's indeed not how to do it. K is kept constant within the algorithm, i.e. you determine it once before you run the actual tuner. I'd not start with the already trained eval, but ditch the previously gained weights and start over from 0 with the new eval.
Then it seems I'm understanding this correctly. Another thing I didn't know is that I should be starting with a 'clear' eval each time. In that case, it would be best to make a "default eval" setup to which I can just add weights and give them a name. (I was initially planning to convert the existing eval into a list of parameters/weights, and keep tuning that, one tune on top of the other.)
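For example, such a setup could be little more than a flat list of named, tunable weights; this is only a sketch of the idea, with placeholder names and values, not an actual engine layout:

Code:

// A "default eval" as a flat list of named, tunable weights. Material gets
// sensible defaults; newly added terms start from 0 for every tuning run.
struct Weight {
    name: String,
    mg: i32, // middlegame value
    eg: i32, // endgame value
}

fn default_weights() -> Vec<Weight> {
    let mut w = vec![
        Weight { name: "pawn_value".to_string(), mg: 100, eg: 100 },
        Weight { name: "knight_value".to_string(), mg: 300, eg: 300 },
        Weight { name: "bishop_pair".to_string(), mg: 0, eg: 0 },
    ];
    // PSQT entries as named weights, all starting from 0.
    for sq in 0..64 {
        w.push(Weight { name: format!("knight_psqt_{}", sq), mg: 0, eg: 0 });
    }
    w
}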
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL

Re: Weird Results from Texel Tuning

Post by Ras »

mvanthoor wrote: Sat Jan 20, 2024 11:58 am Another thing I didn't know is that I should be starting with a 'clear' eval each time.
You probably don't have to, but if you don't, how are you going to keep things reproducible? I like to have clear documentation on how exactly I arrived at the weights, with the additional advantage that I could easily apply the same steps to different training data and compare the strength.
In that case, it would be best to make a "default eval" setup to which I can just add weights and give them a name.
I have default material values, but other than that, the PSQTs and the bishop pair simply start from 0 when the autotune define is set during compilation. Then I run a series of I iterations with step size S, from coarse to fine, and print the results in the terminal for copy/paste into the source editor as the init values used when the autotune define is not set.
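Roughly like the following sketch, with the MSE again supplied as a closure over the training data and the frozen K; the step sizes, the iteration scheme and the output format here are invented for the illustration and are not the actual CT800 values:

Code:

// Coarse-to-fine: run the same one-weight-at-a-time search with a shrinking
// step size, then print the result for copy/paste as init values.
fn coarse_to_fine(weights: &mut [f64], mse: impl Fn(&[f64]) -> f64) {
    for &step in &[8.0, 4.0, 2.0, 1.0] {
        let mut best = mse(&weights[..]);
        let mut improved = true;
        while improved {
            improved = false;
            for i in 0..weights.len() {
                for &delta in &[step, -step] {
                    weights[i] += delta;
                    let m = mse(&weights[..]);
                    if m < best {
                        best = m;
                        improved = true;
                        break; // keep the change at this step size
                    }
                    weights[i] -= delta; // revert if it didn't help
                }
            }
        }
    }
    // Output in a form that can be pasted back into the source as init
    // values for builds where the autotune define is not set.
    for (i, w) in weights.iter().enumerate() {
        println!("weight[{}] = {},", i, w.round() as i32);
    }
}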
Rasmus Althoff
https://www.ct800.net