Help with Texel's tuning

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
maksimKorzh
Posts: 771
Joined: Sat Sep 08, 2018 5:37 pm
Location: Ukraine
Full name: Maksim Korzh

Re: Help with Texel's tuning

Post by maksimKorzh »

OMG! It seems like it's working!!!
I'm running the proof-of-concept session now with only 1000 FENs,
but the pipeline seems to be working!
It's so amazing to see my PST getting updated!

Ah! I wasn't even this happy when I passed the perft test for the first time ever...
Here's my noob's implementation:
https://github.com/maksimKorzh/wukongJS ... eval_tuner

Even though Wukong JS is written in JavaScript, I chose Python for the tuner
because writing the eval and quiescence in Python is much easier and faster
than writing a PGN parser and FEN generator in JavaScript from scratch.
The tuner generates JavaScript code that I can simply copy-paste into my engine when the values
are updated. Bearing in mind that my engine evaluates only material and PST with a tapered eval,
the evaluation function itself is very portable, so it can easily be implemented in whatever language and
applied to whatever engine while making use of the tuned material and PST weights.
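
In case it's useful to anyone, here is a minimal sketch of such a material + PST tapered eval in Python. All names and values here (MG_VALUES, the 24-point phase scale, etc.) are illustrative assumptions, not Wukong JS's actual identifiers, and the PSTs are zeroed for brevity:

Code: Select all

# Minimal material + PST tapered-eval sketch; values are only in the usual ballpark.
MG_VALUES = {'P': 82, 'N': 337, 'B': 365, 'R': 477, 'Q': 1025, 'K': 0}
EG_VALUES = {'P': 94, 'N': 281, 'B': 297, 'R': 512, 'Q': 936, 'K': 0}
MG_PST = {p: [0] * 64 for p in MG_VALUES}   # tuned opening tables would go here
EG_PST = {p: [0] * 64 for p in EG_VALUES}   # tuned endgame tables would go here
PHASE_WEIGHTS = {'P': 0, 'N': 1, 'B': 1, 'R': 2, 'Q': 4, 'K': 0}
MAX_PHASE = 24  # 4*1 knights + 4*1 bishops + 4*2 rooks + 2*4 queens

def evaluate(pieces):
    # pieces: iterable of (piece, color, square); score from White's point of view.
    mg = eg = phase = 0
    for piece, color, square in pieces:
        sign = 1 if color == 'w' else -1
        sq = square if color == 'w' else square ^ 56   # mirror PST for Black
        mg += sign * (MG_VALUES[piece] + MG_PST[piece][sq])
        eg += sign * (EG_VALUES[piece] + EG_PST[piece][sq])
        phase += PHASE_WEIGHTS[piece]
    phase = min(phase, MAX_PHASE)
    # Tapered eval: blend the middlegame and endgame scores by game phase.
    return (mg * phase + eg * (MAX_PHASE - phase)) // MAX_PHASE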

For now I'm using FENs from the Ethereal data dump thread,
but I'm going to generate my own set from gm2600.pgn.

I can't believe Wukong JS is gonna obtain its own unique Texel-tuned PST values!!!
Thanks to everyone supporting me, and especially to Ronald Friederich for taking the TIME to explain the implementation details.

P.S. Just to give you an idea of what it means to me: I tried to master Texel's tuning for several months without success, until now.
User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Help with Texel's tuning

Post by mvanthoor »

Congratulations :)

Where did you find the explanation with regard to Texel tuning, and the millions of games to test against? Or did you get all the explanation needed by PM and create the test data yourself by (for example) scraping lichess games?

I'd be _very_ much interested in a video about this topic, because I feel that evaluation function tuning is something I just can't get my head around properly. Maybe I'm just missing some details. Writing a Texel tuner (for me) seems infinitely more complex than understanding and writing a magic bitboard engine.

I'd love to implement Texel tuning as one of the first features after the transposition table, so I can use it right from the beginning, even when only using PSTs. I'm wondering how much Texel tuning could improve the PSTs. (It'll probably compensate for a huge lack of knowledge, at least in the beginning. For example, having the rook on the 7th rank is now in the PST, instead of being a discrete parameter.)
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
BrianNeal
Posts: 8
Joined: Sat Dec 26, 2020 5:58 pm
Full name: Brian Neal

Re: Help with Texel's tuning

Post by BrianNeal »

Would shuffling the parameters before each iteration make sense (besides shuffling the training positions)?
User avatar
maksimKorzh
Posts: 771
Joined: Sat Sep 08, 2018 5:37 pm
Location: Ukraine
Full name: Maksim Korzh

Re: Help with Texel's tuning

Post by maksimKorzh »

BrianNeal wrote: Thu Jan 07, 2021 3:16 pm Would shuffling the parameters before each iteration make sense (besides shuffling the training positions)?
You should ask someone more competent than I)
In my noob's understanding, shuffling training positions makes perfect sense because the mean squared error becomes more objective:
for instance, imagine you have only opening positions to tune endgame parameters. The mean squared error can still be minimized, but it
would result in bullshit values, well at least IMO.

Shuffling parameters doesn't make sense, because if we loop over all of the parameters anyway and adjust each of them to minimize the
mean squared error, then the order doesn't matter. Maybe the order matters if gradient descent is used instead of incrementing/decrementing
params by one, but I don't know, because I haven't yet looked at the idea behind gradient descent. I'm happy to stick with the
simplest implementation possible for now.
Pio
Posts: 334
Joined: Sat Feb 25, 2012 10:42 pm
Location: Stockholm

Re: Help with Texel's tuning

Post by Pio »

BrianNeal wrote: Thu Jan 07, 2021 3:16 pm Would shuffling the parameters before each iteration make sense (besides shuffling the training positions)?
It shouldn't matter much, but I think it will be a little bit worse. Doing my variable delta trick for each feature will speed up convergence, and so will doing PCA without any reduction, since it will orthogonalize the problem, reducing the dependencies between different features.
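
The trick isn't spelled out here, but roughly: keep a separate delta per feature that grows while a direction keeps improving the error and shrinks back when it stops. A sketch of the flavor only; the doubling/halving scheme below is an assumption, not the exact implementation:

Code: Select all

def tune_with_variable_delta(params, mse, max_delta=32):
    # Hypothetical per-feature "variable delta": accelerate steps that keep
    # paying off, back off when they stop. mse(params) returns the current error.
    deltas = [1] * len(params)              # one step size per parameter
    best = mse(params)
    for i in range(len(params)):
        for sign in (+1, -1):
            while True:
                params[i] += sign * deltas[i]
                err = mse(params)
                if err < best:
                    best = err
                    deltas[i] = min(deltas[i] * 2, max_delta)  # accelerate
                else:
                    params[i] -= sign * deltas[i]              # undo last step
                    deltas[i] = max(deltas[i] // 2, 1)         # back off
                    break
    return params, best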
User avatar
maksimKorzh
Posts: 771
Joined: Sat Sep 08, 2018 5:37 pm
Location: Ukraine
Full name: Maksim Korzh

Re: Help with Texel's tuning

Post by maksimKorzh »

mvanthoor wrote: Thu Jan 07, 2021 12:41 pm Congratulations :)

Where did you find the explanation with regard to Texel tuning, and the millions of games to test against? Or did you get all the explanation needed by PM and create the test data yourself by (for example) scraping lichess games?

I'd be _very_ much interested in a video about this topic, because I feel that evaluation function tuning is something I just can't get my head around properly. Maybe I'm just missing some details. Writing a Texel tuner (for me) seems infinitely more complex than understanding and writing a magic bitboard engine.

I'd love to implement Texel tuning as one of the first features after the transposition table, so I can use it right from the beginning, even when only using PSTs. I'm wondering how much Texel tuning could improve the PSTs. (It'll probably compensate for a huge lack of knowledge, at least in the beginning. For example, having the rook on the 7th rank is now in the PST, instead of being a discrete parameter.)
I got private explanations from Ronald Friederich, author of the engines rofChade and PeSTO, via PM here on talkchess.
Initially I asked him to share his PeSTO (3098) material weights and PSTs, which serve as its ONLY evaluation.
PeSTO's tables worked worse for me compared to rofChade's tables, which is surprising because rofChade's eval also includes
some additional params apart from material and PST.

Then I asked about Texel's tuning, and Ronald provided me with pseudo code; see the first page of this thread.
Then he explained some core details.

First I tried FEN positions with results from the Ethereal data dump thread.
Then I generated my own from pgn2600.pgn; see this thread: http://talkchess.com/forum3/viewtopic.php?f=7&t=76251

The most difficult part for me was understanding the pipeline flow:
1. generate a dataset (from self-play or an external source, like in my case): a FEN string followed by the result of the game
2. loop over all positions (people loop over millions; I tried no more than 10000, for my code is slow as hell)
3. evaluate() or quiescence() the position, depending on whether it's tactical (I removed tactical positions during dataset generation to use eval(), for it's faster)
4. convert the scalar eval value, say 25, to a probability using a sigmoid, say it would be 0.75
5. get the error for every position, e.g. a win is 1.0, a draw is 0.5, a loss is 0.0, so for 0.75 the error would be 1.0 - 0.75 = 0.25, which then gets squared
6. sum all the squared errors (for all positions) and divide by the number of positions: the mean squared error (see the sketch below)
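
In code, steps 4-6 boil down to a few lines. A minimal sketch in Python; the K scaling constant and the (fen, result) tuple format are assumptions of mine, not anything canonical:

Code: Select all

import math

K = 1.0  # sigmoid scaling constant; usually fitted once to the dataset

def sigmoid(eval_cp):
    # Step 4: map a centipawn score to an expected game result in [0, 1].
    return 1.0 / (1.0 + 10.0 ** (-K * eval_cp / 400.0))

def mean_squared_error(positions, evaluate):
    # positions: list of (fen, result), result in {1.0, 0.5, 0.0} from White's view.
    total = 0.0
    for fen, result in positions:
        total += (result - sigmoid(evaluate(fen))) ** 2  # step 5: squared error
    return total / len(positions)                        # step 6: the mean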

And then just loop over all the eval parameters.
Adjust the first one by one.
Recalculate the mean squared error (this is so damn slow with many positions, hence optimizations like gradient descent are involved).
If the mean squared error didn't decrease, decrement the parameter by one instead and recalculate the mean squared error again.

So if we drop the particular implementation, Texel's tuning is all about calculating the mean squared error and changing eval parameters
(material weights, opening/endgame phase margins, PSTs for opening/endgame: around 794 params in total for me).
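
Put together, the whole local search is only a handful of lines. A sketch reusing the mean_squared_error() above; evaluate_with() is a hypothetical helper that builds an evaluate(fen) function from the current parameter list:

Code: Select all

def texel_tune(params, positions, evaluate_with):
    # Repeat full passes of +/-1 adjustments until no parameter improves the error.
    best = mean_squared_error(positions, evaluate_with(params))
    improved = True
    while improved:
        improved = False
        for i in range(len(params)):
            for step in (+1, -1):           # try incrementing, then decrementing
                params[i] += step
                err = mean_squared_error(positions, evaluate_with(params))
                if err < best:
                    best = err              # improvement: keep the change
                    improved = True
                    break
                params[i] -= step           # no improvement: revert
    return params, best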

I'm definitely going to make an extended video explanation on this very soon.
Understanding the essence of the method opens up opportunities for customization depending on your goals.
For now you can have a look at and even run my implementation: https://github.com/maksimKorzh/wukongJS ... eval_tuner
It takes initial_weights.json as input;
then, while the script is running, the eval params are changing;
as output it generates JavaScript code that I later copy-paste into my engine, see the /temp_weights folder.

I'm going to alter the original method for Wukong JS to obtain material weights/PSTs from scratch (from 0 values)
and then document it within the project repo and also make a video.
It still needs lots of testing.
User avatar
maksimKorzh
Posts: 771
Joined: Sat Sep 08, 2018 5:37 pm
Location: Ukraine
Full name: Maksim Korzh

Re: Help with Texel's tuning

Post by maksimKorzh »

Pio wrote: Thu Jan 07, 2021 3:40 pm
BrianNeal wrote: Thu Jan 07, 2021 3:16 pm Would shuffling the parameters before each iteration make sense (besides shuffling the training positions)?
It shouldn't matter much, but I think it will be a little bit worse. Doing my variable delta trick for each feature will speed up convergence, and so will doing PCA without any reduction, since it will orthogonalize the problem, reducing the dependencies between different features.
I can't believe I actually understood what you just said... amazing)
Ronald's explanation made me a smart monkey)
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Help with Texel's tuning

Post by Ferdy »

maksimKorzh wrote: Thu Jan 07, 2021 3:31 pm
BrianNeal wrote: Thu Jan 07, 2021 3:16 pm Would shuffling the parameters before each iteration make sense (besides shuffling the training positions)?
You should ask someone more competent than I)
In my noob's understanding, shuffling training positions makes perfect sense because the mean squared error becomes more objective:
for instance, imagine you have only opening positions to tune endgame parameters. The mean squared error can still be minimized, but it
would result in bullshit values, well at least IMO.

Shuffling parameters doesn't make sense, because if we loop over all of the parameters anyway and adjust each of them to minimize the
mean squared error, then the order doesn't matter. Maybe the order matters if gradient descent is used instead of incrementing/decrementing
params by one, but I don't know, because I haven't yet looked at the idea behind gradient descent. I'm happy to stick with the
simplest implementation possible for now.
The "shuffling the parameters" BrianNeal mentioned is not actually like shuffling randomly. It is more like "ordering the parameters". As you have noticed, in Texel tuning the first parameter changes the whole evaluation which would affect later parameters. For example you have the parameters to optimize.
1. PawnValue
2. PawnPST
3. QueenAttacksKing

First you try PawnValue with +/-1, error does not improve.
Second you try PawnPST ... error does not improve.
Third you try QueenAttacksKing with +1 and error improves.

Now you can change the order of tuning because QueenAttacksKing is more sensitive to error.

1. QueenAttacksKing
2. PawnValue
3. PawnPST

It might happen that after adjusting QueenAttacksKing first, PawnValue or PawnPST now improves the error.

As you try more parameters, like two_bishop advantage and double_pawn_penalty and other important parameters like passed_pawn, kingattack, threats and mobility, it would be interesting to order your parameters based on how many times a parameter improves the error. If kingattack improves the error in 4/10 iterations and mobility improves it in 2/10, order kingattack above mobility for the next iteration.

It is also possible not to order your parameters dynamically, but to try to order them by importance in the first place:
* passed_pawn is more important than PST
* material is more important than PST
* passed_pawn is more important than material

So that would be
1. passed_pawn
2. material
3. PST

or, as another example, knight PST is more important than pawn PST, so knight comes before pawn.
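
A sketch of the dynamic version of this idea, as a hit counter per parameter bolted onto the usual +/-1 loop (mse() stands for any callable returning the mean squared error for the current params; all names here are mine):

Code: Select all

def tune_ordered(params, mse, iterations=10):
    # +/-1 local search where parameters that improved the error more often
    # in past iterations are tried first in the next iteration.
    hits = [0] * len(params)
    order = list(range(len(params)))
    best = mse(params)
    for _ in range(iterations):
        for i in order:
            for step in (+1, -1):
                params[i] += step
                err = mse(params)
                if err < best:
                    best = err
                    hits[i] += 1            # parameter i is sensitive to the error
                    break
                params[i] -= step           # revert
        order.sort(key=lambda i: hits[i], reverse=True)  # most hits go first
    return params, best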
fabianVDW
Posts: 146
Joined: Fri Mar 15, 2019 8:46 pm
Location: Germany
Full name: Fabian von der Warth

Re: Help with Texel's tuning

Post by fabianVDW »

mvanthoor wrote: Thu Jan 07, 2021 12:41 pm Congratulations :)

Where did you find the explanation with regard to Texel tuning, and the millions of games to test against? Or did you get all the explanation needed by PM and create the test data yourself by (for example) scraping lichess games?

I'd be _very_ much interested in a video about this topic, because I feel that evaluation function tuning is something I just can't get my head around properly. Maybe I'm just missing some details. Writing a Texel tuner (for me) seems infinitely more complex than understanding and writing a magic bitboard engine.

I'd love to implement Texel tuning as one of the first features after the transposition table, so I can use it right from the beginning, even when only using PSTs. I'm wondering how much Texel tuning could improve the PSTs. (It'll probably compensate for a huge lack of knowledge, at least in the beginning. For example, having the rook on the 7th rank is now in the PST, instead of being a discrete parameter.)
Depending on your mathematical education, you can read https://github.com/AndyGrant/Ethereal/b ... Tuning.pdf. I recommend at least some basic knowledge of multivariate calculus; then it is easy. For more questions on its contents, you can also ask Andrew directly, I suppose, or me here.
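
The core idea in brief: with a linear eval (score = dot(weights, features)) the gradient of the mean squared error has a closed form, so instead of re-evaluating all positions per +/-1 step you can nudge every weight at once. A rough sketch in Python, with my notation rather than the paper's:

Code: Select all

import math

K = 1.0                          # sigmoid scaling constant (assumed)
C = math.log(10) * K / 400.0     # scale factor from the sigmoid's derivative

def sigmoid(q):
    return 1.0 / (1.0 + 10.0 ** (-K * q / 400.0))

def mse_gradient(weights, data):
    # data: list of (features, result); eval is assumed linear in the weights.
    grad = [0.0] * len(weights)
    for features, result in data:
        q = sum(w * f for w, f in zip(weights, features))
        s = sigmoid(q)
        common = 2.0 * (s - result) * C * s * (1.0 - s)  # d(error^2)/dq
        for j, f in enumerate(features):
            grad[j] += common * f
    return [g / len(data) for g in grad]

# One gradient-descent step with an assumed learning rate of 0.1:
# weights = [w - 0.1 * g for w, g in zip(weights, mse_gradient(weights, data))]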
Author of FabChess: https://github.com/fabianvdW/FabChess
A UCI compliant chess engine written in Rust.
FabChessWiki: https://github.com/fabianvdW/FabChess/wiki
fabianvonderwarth@gmail.com
User avatar
maksimKorzh
Posts: 771
Joined: Sat Sep 08, 2018 5:37 pm
Location: Ukraine
Full name: Maksim Korzh

Re: Help with Texel's tuning

Post by maksimKorzh »

Ferdy wrote: Thu Jan 07, 2021 5:37 pm
maksimKorzh wrote: Thu Jan 07, 2021 3:31 pm
BrianNeal wrote: Thu Jan 07, 2021 3:16 pm Would shuffling the parameters before each iteration make sense (besides shuffling the training positions)?
You should ask someone more competent than I)
In my noob's understanding, shuffling training positions makes perfect sense because the mean squared error becomes more objective:
for instance, imagine you have only opening positions to tune endgame parameters. The mean squared error can still be minimized, but it
would result in bullshit values, well at least IMO.

Shuffling parameters doesn't make sense, because if we loop over all of the parameters anyway and adjust each of them to minimize the
mean squared error, then the order doesn't matter. Maybe the order matters if gradient descent is used instead of incrementing/decrementing
params by one, but I don't know, because I haven't yet looked at the idea behind gradient descent. I'm happy to stick with the
simplest implementation possible for now.
The "shuffling the parameters" BrianNeal mentioned is not actually like shuffling randomly. It is more like "ordering the parameters". As you have noticed, in Texel tuning the first parameter changes the whole evaluation which would affect later parameters. For example you have the parameters to optimize.
1. PawnValue
2. PawnPST
3. QueenAttacksKing

First you try PawnValue with +/-1, error does not improve.
Second you try PawnPST ... error does not improve.
Third you try QueenAttacksKing with +1 and error improves.

Now you can change the order of tuning because QueenAttacksKing is more sensitive to error.

1. QueenAttacksKing
2. PawnValue
3. PawnPST

It might happen that after adjusting QueenAttacksKing first, PawnValue or PawnPST now improves the error.

As you try more parameters, like two_bishop advantage and double_pawn_penalty and other important parameters like passed_pawn, kingattack, threats and mobility, it would be interesting to order your parameters based on how many times a parameter improves the error. If kingattack improves the error in 4/10 iterations and mobility improves it in 2/10, order kingattack above mobility for the next iteration.

It is also possible not to order your parameters dynamically, but to try to order them by importance in the first place:
* passed_pawn is more important than PST
* material is more important than PST
* passed_pawn is more important than material

So that would be
1. passed_pawn
2. material
3. PST

or, as another example, knight PST is more important than pawn PST, so knight comes before pawn.
Thanks for such an amazing insight, Ferdinand!
This is a very interesting idea to play around with!