Tapered Evaluation and MSE (Texel Tuning)


Pio
Posts: 334
Joined: Sat Feb 25, 2012 10:42 pm
Location: Stockholm

Post by Pio »

hgm wrote: Fri Jan 15, 2021 3:38 pm
Pio wrote: Fri Jan 15, 2021 2:09 pm I thought about that this morning too; it could explain it as well. Another problem might be that it looks like you haven’t computed K from your test data (see code below). That means the mapping between centipawn scores and probabilities might be completely off. Calculate the K value and try what hgm suggested; then I am quite confident you will get reasonable values. If you haven’t calculated the K value, using one value as an anchor will be even worse.
An anchor is only needed when the quantities to be fitted can be arbitrarily scaled. Here that is not the case; you want to reproduce a given sigmoid. This should fix all values. (At least when the data points cover the entire space that can be spanned by the parameters. If they cover only a lower-dimensional subspace, such as when you only include materially balanced positions, then the optimum is degenerate, and you can impose additional requirements to lift that degeneracy.)
Yes, I know that. That is why I said it will make things a lot worse if you use an anchor without having calculated K 😀
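
To make the role of K concrete, here is a minimal sketch of how K could be fitted before tuning any evaluation parameters. It assumes the commonly used Texel form of the sigmoid, 1 / (1 + 10^(-K*s/400)), which may differ from the scaling used by the posters in this thread. The function names, the search interval and the toy data are all hypothetical.

```python
def sigmoid(score_cp, k):
    """Map a centipawn score (White's view) to an expected result in [0, 1].

    Assumption: the common Texel form with a 400-centipawn scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-k * score_cp / 400.0))


def mse(samples, k):
    """Mean squared error over (score_cp, result) pairs for a given K."""
    total = sum((result - sigmoid(score, k)) ** 2 for score, result in samples)
    return total / len(samples)


def fit_k(samples, lo=0.0, hi=5.0, iterations=60):
    """Ternary search for the K that minimises the MSE.

    Assumption: the MSE has a single minimum in K on [lo, hi].
    """
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if mse(samples, m1) < mse(samples, m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0


# Toy data: scores whose "results" were generated with K = 1.5.
toy = [(s, sigmoid(s, 1.5)) for s in range(-800, 801, 100)]
best_k = fit_k(toy)
print(f"fitted K = {best_k:.4f}, mse = {mse(toy, best_k):.3e}")
```

With real game results the minimum will not be zero, but the idea is the same: fix the evaluation, fit K first, and only then tune the evaluation parameters with that K held constant.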
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Post by Ferdy »

Desperado wrote: Fri Jan 15, 2021 9:33 am
Desperado wrote: Fri Jan 15, 2021 9:00 am Hello,

can someone confirm the number below please.

Code: Select all

    double error = mse();
    printf("\nTotal error ccrl-40-15-elo-3200.epd with K=1.0: %.16f", error);
    getchar();
Loaded Epd Positions: 1537380
Total error ccrl-40-15-elo-3200.epd with K=1.0: 0.1216016900516253

Of course, I need to further divide the problem. Although I was convinced that the data was the problem,
I have to accept that there is a hidden problem with my code. Obviously, it's not the general algorithms,
nor the now-simplified scoring function. The bug seems to be hidden in the helper functions, if there is one.
So I want to start by checking if error calculation and epd routines work correctly.

So if someone can do a simple static material evaluation with the vector 100,300,300,500,1000 on the mentioned file,
I can just see if the error sum, mentioned above, is identical. Depending on the result then further steps will follow.

Thanks a lot in advance.
I use the phase encoding 1,2,4 (minor, rook, queen) with a maximum of 24 for mg positions.

The error for the starting vector

Code: Select all

int Eval::mgMat[7] = {0,110,310,310,510,1010,0};
int Eval::egMat[7] = {0, 90,290,290,490,990,0};
Loaded Epd Positions: 1537380
Total error ccrl-40-15-elo-3200.epd with K=1.0: 0.1215975237992763
I took the ccrl-3200 epd from here https://rebel13.nl/misc/epd.html.

Got this result.

Code: Select all

K: 0, Pos: 4041988, total_sq_error: 416103.0,          mse: 0.1029451349187578
K: 1, Pos: 4041988, total_sq_error: 451012.9131711059, mse: 0.11158195253699563
K: 2, Pos: 4041988, total_sq_error: 520544.6291908135, mse: 0.12878430841229946
Reformatted epd file is here

Example:

Code: Select all

rnb2rk1/4bppp/2p1p3/p6q/Pp6/4NNP1/1PQ1PPBP/R2R2K1 w - -,1-0
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Post by Ferdy »

Ferdy wrote: Fri Jan 15, 2021 5:30 pm Got this result.

Code: Select all

K: 0, Pos: 4041988, total_sq_error: 416103.0,          mse: 0.1029451349187578
K: 1, Pos: 4041988, total_sq_error: 451012.9131711059, mse: 0.11158195253699563
K: 2, Pos: 4041988, total_sq_error: 520544.6291908135, mse: 0.12878430841229946
Reformatted epd file is here

Example:

Code: Select all

rnb2rk1/4bppp/2p1p3/p6q/Pp6/4NNP1/1PQ1PPBP/R2R2K1 w - -,1-0
Piece value.

Code: Select all

pvalue = [[100,300,300,500,1000], [100,300,300,500,1000]]
Full result:

Code: Select all

K: 0, Pos: 4041988, total_sq_error: 416103.0,          mse: 0.1029451349187578
K: 1, Pos: 4041988, total_sq_error: 451012.9131711059, mse: 0.11158195253699563
K: 2, Pos: 4041988, total_sq_error: 520544.6291908135, mse: 0.12878430841229946
K: 3, Pos: 4041988, total_sq_error: 584036.6130747806, mse: 0.14449241637401708
K: 4, Pos: 4041988, total_sq_error: 633132.59761596,   mse: 0.15663891075776576
K: 5, Pos: 4041988, total_sq_error: 666981.03949793,   mse: 0.16501311718340825
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Post by Desperado »

Hello Ferdy, thank you!

I will check it as soon as possible. Just to be sure this time: what is the POV of the result score?

1. pov result == side to move
2. pov result == white to move

Thanks.

@All Running my code on quiet-labeled.epd, I get "normal" results, even with a mini-batch size like 50k.

quiet-labeled.epd
70 115 330 285 345 305 430 520 975 935 best: 0.069547 epoch: 14 SE/B=50K/K=1.6308
70 115 325 290 340 310 425 525 985 935 best: 0.064040 epoch: 15 QS/B=50K/K=1.6797
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Post by Ferdy »

Desperado wrote: Fri Jan 15, 2021 6:41 pm Hello Ferdy, thank you!

I will check it as soon as possible. Just to be sure this time: what is the POV of the result score?

1. pov result == side to move
2. pov result == white to move

Code: Select all

rnb2rk1/4bppp/2p1p3/p6q/Pp6/4NNP1/1PQ1PPBP/R2R2K1 w - -,1-0
In that format, 1-0 means White wins and 0-1 means Black wins, regardless of which side is to move.
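
As an illustration of that convention, here is a small sketch that reads one line of the reformatted file and turns the result into a White-POV target, with an optional flip to side-to-move POV. The helper names are hypothetical, and the "1/2-1/2" token for draws is an assumption, since only 1-0 and 0-1 are shown in this thread.

```python
# Result token -> score from White's point of view.
# Assumption: draws are written as "1/2-1/2" in the file.
RESULT_TO_WHITE_SCORE = {"1-0": 1.0, "0-1": 0.0, "1/2-1/2": 0.5}


def parse_line(line):
    """Split 'FEN-fields,result' into (fen, white_pov_score)."""
    fen, result = line.strip().rsplit(",", 1)
    return fen, RESULT_TO_WHITE_SCORE[result]


def side_to_move_score(fen, white_score):
    """Convert a White-POV score to the POV of the side to move."""
    side = fen.split()[1]
    return white_score if side == "w" else 1.0 - white_score


line = "rnb2rk1/4bppp/2p1p3/p6q/Pp6/4NNP1/1PQ1PPBP/R2R2K1 w - -,1-0"
fen, white_score = parse_line(line)
print(fen, white_score, side_to_move_score(fen, white_score))
```

Whichever POV the tuner uses, the evaluation score fed into the sigmoid has to use the same one, otherwise positions with Black to move contribute errors with the wrong sign.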
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Post by Desperado »

Ferdy wrote: Fri Jan 15, 2021 6:04 pm
Ferdy wrote: Fri Jan 15, 2021 5:30 pm Got this result.

Code: Select all

K: 0, Pos: 4041988, total_sq_error: 416103.0,          mse: 0.1029451349187578
K: 1, Pos: 4041988, total_sq_error: 451012.9131711059, mse: 0.11158195253699563
K: 2, Pos: 4041988, total_sq_error: 520544.6291908135, mse: 0.12878430841229946
Reformatted epd file is here

Example:

Code: Select all

rnb2rk1/4bppp/2p1p3/p6q/Pp6/4NNP1/1PQ1PPBP/R2R2K1 w - -,1-0
Piece value.

Code: Select all

pvalue = [[100,300,300,500,1000], [100,300,300,500,1000]]
Full result:

Code: Select all

K: 0, Pos: 4041988, total_sq_error: 416103.0,          mse: 0.1029451349187578
K: 1, Pos: 4041988, total_sq_error: 451012.9131711059, mse: 0.11158195253699563
K: 2, Pos: 4041988, total_sq_error: 520544.6291908135, mse: 0.12878430841229946
K: 3, Pos: 4041988, total_sq_error: 584036.6130747806, mse: 0.14449241637401708
K: 4, Pos: 4041988, total_sq_error: 633132.59761596,   mse: 0.15663891075776576
K: 5, Pos: 4041988, total_sq_error: 666981.03949793,   mse: 0.16501311718340825
Hi, that looks pretty good

Code: Select all

K=0: MSE 0.1029451349187578
K=1: MSE 0.1115819525369956
K=2: MSE 0.1287843084122995
K=3: MSE 0.1444924163740171
K=4: MSE 0.1566389107577658
So I don't have any issues with my EPD utilities or my error computation!

That was very useful for me.

Of course I will do some tuning later and look at the results, because this is simply a different database than before.
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Post by Desperado »

Hi, before I start checking the update operations of the data structure and the algorithm logic,
I thought I would repeat what I already did this afternoon, but with the current data.

In the latest posts we reported the MSE of the full file for different K and a given parameter vector for the material scores.
Fine! Independent code leads to the same results.

Here is a puzzle that might surprise you.

Code: Select all

int Eval::mgMat[7] = {0,1,1,1,1,1,0};
int Eval::egMat[7] = {0,1,1,1,1,1,0};

K=1: MSE 0.1029094541968299

int Eval::mgMat[7] = {0,-1,-2,-3,-4,-5,0};
int Eval::egMat[7] = {0, 2, 3, 4, 5, 6,0};

K=1: MSE 0.1028872134400059

int Eval::mgMat[7] = {0,100,300,300,500,1000,0};
int Eval::egMat[7] = {0,100,300,300,500,1000,0};

K=1: MSE 0.1115819525369956

Both artificial and meaningless vectors result in a significantly better MSE than the starting vector! No doubt this time.

This is consistent with my previous observations.
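
For readers who want to reproduce numbers like these, here is a rough sketch of a tapered material evaluation. It uses the phase weights mentioned earlier in the thread (1 per minor, 2 per rook, 4 per queen, capped at 24) and assumes the usual linear interpolation between the mg and eg scores. The dictionary layout and function names are hypothetical and not taken from anyone's engine.

```python
MAX_PHASE = 24
# Phase weights per piece type: minors 1, rooks 2, queens 4.
PHASE_WEIGHT = {"N": 1, "B": 1, "R": 2, "Q": 4}


def game_phase(fen):
    """Sum the phase weights of all pieces on the board, capped at MAX_PHASE."""
    board = fen.split()[0]
    phase = sum(PHASE_WEIGHT.get(ch.upper(), 0) for ch in board)
    return min(phase, MAX_PHASE)


def material(fen, values):
    """Material balance in centipawns, White minus Black."""
    board = fen.split()[0]
    total = 0
    for ch in board:
        value = values.get(ch.upper())
        if value is not None:
            total += value if ch.isupper() else -value
    return total


def tapered_material(fen, mg_values, eg_values):
    """Blend mg and eg material by game phase (assumed linear interpolation)."""
    phase = game_phase(fen)
    mg = material(fen, mg_values)
    eg = material(fen, eg_values)
    return (mg * phase + eg * (MAX_PHASE - phase)) / MAX_PHASE


FEN = "rnb2rk1/4bppp/2p1p3/p6q/Pp6/4NNP1/1PQ1PPBP/R2R2K1 w - -"
START = {"P": 100, "N": 300, "B": 300, "R": 500, "Q": 1000}
print(game_phase(FEN), tapered_material(FEN, START, START))
```

Because the blend is weighted by phase, a data set dominated by high-phase positions constrains the mg values much more than the eg values, and the other way round.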
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Post by hgm »

Not really a surprise. It was already clear from the constant eval experiment that there is something horribly wrong in the calculation of the MSE, and that it is not calculating what it should be calculating at all. Even with a completely nonsensical data set (e.g. just random numbers between 0 and 1 assigned to each position) the MSE should have a parabolic dependence on a constant evaluation. You did not have that.
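
The parabola can be checked numerically. With a constant evaluation every position gets the same predicted probability p, so the MSE is mean((r - p)^2) = p^2 - 2*p*mean(r) + mean(r^2), which is a parabola in p with its minimum at p = mean(r). The sketch below uses random numbers as results, as in the example above; the names and the sample size are hypothetical.

```python
import random


def mse_constant(results, p):
    """MSE when every position is predicted with the same probability p."""
    return sum((r - p) ** 2 for r in results) / len(results)


random.seed(0)
# Nonsensical "results": random numbers between 0 and 1.
results = [random.random() for _ in range(1000)]
mean_r = sum(results) / len(results)
mean_r2 = sum(r * r for r in results) / len(results)


def parabola(p):
    """Closed form of the same quantity: p^2 - 2*p*mean(r) + mean(r^2)."""
    return p * p - 2.0 * p * mean_r + mean_r2


for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"p={p:.2f}  mse={mse_constant(results, p):.6f}  parabola={parabola(p):.6f}")
```

Note that the parabola is in the predicted probability p. Expressed in centipawns, the curve is that parabola composed with the sigmoid, so it still has a single minimum but is no longer an exact parabola.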
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Post by Desperado »

Desperado wrote: Fri Jan 15, 2021 9:50 pm Hi, before I start checking the update operations of the data structure and the algorithm logic,
I thought I would repeat what I already did this afternoon, but with the current data.

In the latest posts we reported the MSE of the full file for different K and a given parameter vector for the material scores.
Fine! Independent code leads to the same results.

Here is a puzzle that might surprise you.

Code: Select all

int Eval::mgMat[7] = {0,1,1,1,1,1,0};
int Eval::egMat[7] = {0,1,1,1,1,1,0};

K=1: MSE 0.1029094541968299

int Eval::mgMat[7] = {0,-1,-2,-3,-4,-5,0};
int Eval::egMat[7] = {0, 2, 3, 4, 5, 6,0};

K=1: MSE 0.1028872134400059

int Eval::mgMat[7] = {0,100,300,300,500,1000,0};
int Eval::egMat[7] = {0,100,300,300,500,1000,0};

K=1: MSE 0.1115819525369956

Both artificial and meaningless vectors result in a significantly better MSE than the starting vector! No doubt this time.

This is consistent with my previous observations.

Code: Select all

int Eval::mgMat[7] = {0,-20,-50,-45,-140, -5,0};
int Eval::egMat[7] = {0, 80,270,280, 435,685,0};

K=1: MSE 0.0997049328336036
LOL
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Post by Desperado »

hgm wrote: Fri Jan 15, 2021 10:47 pm Not really a surprise. It was already clear from the constant eval experiment that there is something horribly wrong in the calculation of the MSE, and that it is not calculating what it should be calculating at all. Even with a completely nonsensical data set (e.g. just random numbers between 0 and 1 assigned to each position) the MSE should have a parabolic dependence on a constant evaluation. You did not have that.
HG, you missed something: the computation was verified as correct by Ferdy! There is nothing wrong with the calculation!
He posted several MSE values for different K for the complete file with 4041988 positions. The MSE values are completely identical!
And both of us used our own code base. Just look three posts back...

Then I computed the MSE for the meaningless vectors, which have a clearly better MSE.

After comparing the data base, we made separate measurements. There is a link and anyone can insert the vector and calculate the MSE (including you) for the data