100 games is not enough to draw conclusions: the statistical uncertainty in the result (+/- 28.3) is larger than the measured Elo difference. I personally play around 2000 or more games in my tests, on multiple threads (my computer has 6 cores, 12 threads), and leave the computer running overnight at the expense of a higher electricity bill.

algerbrex wrote: ↑Fri Jul 16, 2021 1:02 pm

Score of BlunderOld vs Blunder 1.0.0: 28 - 41 - 131 [0.468]
...      BlunderOld playing White: 17 - 19 - 64 [0.490] 100
...      BlunderOld playing Black: 11 - 22 - 67 [0.445] 100
...      White vs Black: 39 - 30 - 131 [0.522] 200
Elo difference: -22.6 +/- 28.3, LOS: 5.9 %, DrawRatio: 65.5 %
200 of 200 games finished.
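For illustration, here is how an error margin like the +/- 28.3 above can be derived from the raw win/draw/loss counts. This is a sketch using a normal approximation of the score distribution; the exact method cutechess-cli uses may differ in detail:

```python
import math

def elo(score):
    """Convert a score fraction (0..1) to an Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_error(wins, losses, draws):
    """Elo difference and an approximate 95% confidence margin."""
    n = wins + losses + draws
    s = (wins + 0.5 * draws) / n  # score fraction
    # per-game variance of the score around its mean
    var = (wins * (1.0 - s) ** 2
           + draws * (0.5 - s) ** 2
           + losses * s ** 2) / n
    se = math.sqrt(var / n)       # standard error of the mean score
    margin = (elo(s + 1.96 * se) - elo(s - 1.96 * se)) / 2.0
    return elo(s), margin

diff, margin = elo_with_error(28, 41, 131)
print(f"{diff:+.1f} +/- {margin:.1f}")
```

With the 200-game result quoted above this reproduces roughly -22.6 +/- 28.3, i.e. the error bar swallows the measured difference.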
I found the article on CPW pretty good and sufficient to implement the tuner. The only other resource I used was a set of quiet positions from zurichess available here: https://bitbucket.org/zurichess/tuner/downloads/
Do you have any specific questions? Imo it's not conceptually very difficult, nor does it require deep math skills, if you are okay with using simple local optimization. Local optimization really is just trying out different parameter values and keeping those that performed best.
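A minimal sketch of such a local search, assuming a `mean_squared_error(params)` helper (defined further below) that scores a full parameter set:

```python
def local_search(params, mean_squared_error, step=1):
    """Nudge each parameter up/down by `step`; keep changes that lower the error.
    Repeats full passes over the parameters until no nudge helps anymore."""
    best_error = mean_squared_error(params)
    improved = True
    while improved:
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                params[i] += delta
                error = mean_squared_error(params)
                if error < best_error:
                    best_error = error
                    improved = True
                else:
                    params[i] -= delta  # revert the unhelpful change
    return params, best_error
```

This finds a local minimum only, but for PST tuning that is usually good enough as a first implementation.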
And what does "perform best" mean? After each change you iterate over all positions in your set of annotated positions and evaluate each one with the current set of PST values. Then you compare the evaluation of each position with the outcome of its game, and you try to improve the average predictive quality of your PST-based evaluation.
How do you improve the predictive quality of the PST? You try to minimize the mean squared error over all positions, where the per-position error is a number between 0 and 2 expressing how much the evaluation mispredicted the actual outcome of the game.
How do you compare an evaluation given in centipawns with the outcome of the game, given as -1, 0, or 1? That's where the sigmoid function comes into play: it maps the centipawn score into the interval (-1, 1). Look at a plot of the sigmoid, and see how it changes when you vary what the article calls K. So before you can start with tuning you need to settle on a value of K you want to use.
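Following the -1/0/+1 convention used here (note the CPW article uses 0/0.5/1 instead, which only shifts and scales the same idea), the sigmoid and the resulting error function might look like this. The `evaluate` callback and the choice of K are placeholders you must supply yourself:

```python
def sigmoid(cp, k):
    """Map a centipawn score to (-1, 1); K controls the steepness."""
    return 2.0 / (1.0 + 10.0 ** (-k * cp / 400.0)) - 1.0

def mean_squared_error(positions, evaluate, k):
    """positions: list of (position, outcome) pairs, outcome in {-1, 0, 1}.
    evaluate(position) returns the PST-based score in centipawns."""
    total = sum((outcome - sigmoid(evaluate(pos), k)) ** 2
                for pos, outcome in positions)
    return total / len(positions)
```

Note that `sigmoid(0, k)` is exactly 0 (a drawish score predicts a draw), and large positive or negative scores approach +1 or -1, matching the game outcomes.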
And then you just write code that changes the values in your PST until no further change gets the average error closer to 0. (Consider multithreading!) Then you're done.
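The multithreading hint can be as simple as splitting the position set into chunks and summing the errors in parallel. An illustrative sketch, with the positions pre-evaluated into (evaluation, outcome) pairs to keep it short; in CPython, threads only help if the evaluation releases the GIL, so a pure-Python tuner would use processes instead, but the split-and-sum idea carries over directly to a compiled engine language:

```python
from concurrent.futures import ThreadPoolExecutor

def error_sum(chunk):
    """Sum of squared errors over one chunk of (evaluation, outcome) pairs."""
    return sum((outcome - evaluation) ** 2 for evaluation, outcome in chunk)

def parallel_mse(pairs, workers=4):
    """Split the position set into `workers` chunks, sum errors in parallel."""
    chunks = [pairs[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(error_sum, chunks)) / len(pairs)
```

Since the error of each position is independent of the others, this parallelizes with essentially no coordination beyond the final sum.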
That's the gist of it.