Tapered Evaluation and MSE (Texel Tuning)

Ferdy · Post by **Ferdy** » Thu Jan 14, 2021 10:08 am

Ferdy wrote: ↑Thu Jan 14, 2021 8:08 am

Next I will try to run at 300k batch size.

300k is done with the following plot.

Iterations: 5
Best mse: 0.11082894164608197
Best parameters:
+----------+--------+---------+
| par      |   init |   tuned |
+==========+========+=========+
| KnightOp |    300 |     305 |
+----------+--------+---------+
| KnightEn |    300 |     295 |
+----------+--------+---------+
| BishopOp |    300 |     305 |
+----------+--------+---------+
| BishopEn |    300 |     310 |
+----------+--------+---------+
| RookOp   |    500 |     490 |
+----------+--------+---------+
| RookEn   |    500 |     490 |
+----------+--------+---------+
| QueenOp  |   1000 |    1010 |
+----------+--------+---------+
| QueenEn  |   1000 |     990 |
+----------+--------+---------+

Next use ccrl3200 positions with material imbalance only as training positions.

hgm · Post by **hgm** » Thu Jan 14, 2021 10:50 am

If you believe the tuning algorithm works properly (i.e. it indeed minimizes the rms error on the total set), the only explanation for nonsensical parameter values producing the minimum error is that your test set doesn't contain enough positions that punish nonsensical values with a huge error.

So you have 15% phase = maxPhase = 24 positions in the set. Fine. So that already gives 15% that are TOTALLY USELESS for determining the values of non-Pawn pieces, because you can only have phase=24 when all pieces are still present, without material imbalance. At best you might hope to get a Pawn value from that, but I have no great hopes for that either. It is not visible from the game phase, but in most of those positions the Pawns were probably balanced too, and where they weren't, they were probably intentionally sacrificed for huge compensation.

It would be more revealing to know how many positions you have with phase = 23, and how large the error is on that set.
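A minimal Python sketch of the breakdown asked for here: count the positions per phase and average their squared error. It assumes the usual phase weighting (1 per minor, 2 per rook, 4 per queen, capped at 24). The names `game_phase` and `error_by_phase` and the `(fen, squared_error)` input layout are illustrative and not taken from any tuner in this thread.

```python
from collections import defaultdict

# Assumed phase weights: minor = 1, rook = 2, queen = 4, capped at MAX_PHASE.
PHASE_WEIGHT = {"n": 1, "b": 1, "r": 2, "q": 4}
MAX_PHASE = 24


def game_phase(fen: str) -> int:
    """Phase of a position, computed from the board field of a FEN/EPD string."""
    board = fen.split()[0]  # drop side to move etc., so the 'b' flag is not counted
    phase = sum(PHASE_WEIGHT.get(ch.lower(), 0) for ch in board)
    return min(phase, MAX_PHASE)


def error_by_phase(samples):
    """samples: iterable of (fen, squared_error). Returns {phase: (count, mean_sq_error)}."""
    totals = defaultdict(lambda: [0, 0.0])
    for fen, sq_err in samples:
        bucket = totals[game_phase(fen)]
        bucket[0] += 1
        bucket[1] += sq_err
    return {phase: (n, s / n) for phase, (n, s) in sorted(totals.items())}
```

Feeding it the per-position squared errors of a tuned parameter set would show directly how many phase-23 positions there are and how badly they are predicted.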

Desperado · Post by **Desperado** » Thu Jan 14, 2021 10:51 am

Ferdy wrote: ↑Thu Jan 14, 2021 10:08 am

Ferdy wrote: ↑Thu Jan 14, 2021 8:08 am

Next I will try to run at 300k batch size.

300k is done with the following plot.

Iterations: 5
Best mse: 0.11082894164608197
Best parameters:
+----------+--------+---------+
| par      |   init |   tuned |
+==========+========+=========+
| KnightOp |    300 |     305 |
+----------+--------+---------+
| KnightEn |    300 |     295 |
+----------+--------+---------+
| BishopOp |    300 |     305 |
+----------+--------+---------+
| BishopEn |    300 |     310 |
+----------+--------+---------+
| RookOp   |    500 |     490 |
+----------+--------+---------+
| RookEn   |    500 |     490 |
+----------+--------+---------+
| QueenOp  |   1000 |    1010 |
+----------+--------+---------+
| QueenEn  |   1000 |     990 |
+----------+--------+---------+

Next use ccrl3200 positions with material imbalance only as training positions.

Hi Ferdy, from your readme

Positions are saved with the following conditions:
* If the move in the game is not a capture, and not a checking move and not a promote move and the side to move is not in check and the game has either 1-0 or 0-1 result.

Does that mean you ignore draws, and also positions where the move recorded in the epd/pgn is of one of the mentioned types?

Desperado · Post by **Desperado** » Thu Jan 14, 2021 11:21 am

hgm wrote: ↑Thu Jan 14, 2021 10:50 am If you believe the tuning algorithm works properly (i.e. it indeed minimizes the rms error on the total set), the only explanation for nonsensical parameter values producing the minimum error is that your test set doesn't contain enough positions that punish nonsensical values with a huge error.

So you have 15% phase = maxPhase = 24 positions in the set. Fine. So that already gives 15% that are TOTALLY USELESS for determining the values of non-Pawn pieces, because you can only have phase=24 when all pieces are still present, without material imbalance. At best you might hope to get a Pawn value from that, but I have no great hopes for that either. It is not visible from the game phase, but in most of those positions the Pawns were probably balanced too, and where they weren't, they were probably intentionally sacrificed for huge compensation.

It would be more revealing to know how many positions you have with phase = 23, and how large the error is on that set.

I understand, good point! Especially if the pawn value is an anchor, the phase-24 positions do not provide any information the algorithm can learn from.
They would never update anything for n, b, r or q. That already reduces the effective training sample by 15%. So it would be ideal to use positions that include an imbalance of the term being tuned. The easy way to achieve something like that is to make the general batch size bigger: the more positions are used, the higher the probability that this condition is met. The smart way would be to collect/filter the specific data and tune on that collection.
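The "smart way" could look roughly like the Python sketch below, which keeps only records whose position has a non-zero count difference for the piece types being tuned. The `fen,result` record layout follows the training line quoted elsewhere in this thread. The function names are hypothetical.

```python
from collections import Counter


def piece_delta(fen: str, piece: str) -> int:
    """White count minus black count for one piece letter ('p', 'n', 'b', 'r', 'q')."""
    counts = Counter(fen.split()[0])  # board field only
    return counts[piece.upper()] - counts[piece.lower()]


def has_imbalance(fen: str, pieces: str = "nbrq") -> bool:
    """True if any of the given piece types is unevenly distributed."""
    return any(piece_delta(fen, p) != 0 for p in pieces)


def filter_imbalanced(records, pieces: str = "nbrq"):
    """Keep 'fen,result' records that carry information about the tuned pieces."""
    return [r for r in records if has_imbalance(r.split(",")[0], pieces)]
```

Passing `pieces="r"`, for example, would build a collection for tuning the rook values only.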

Ferdy · Post by **Ferdy** » Thu Jan 14, 2021 11:23 am

Ferdy wrote: ↑Thu Jan 14, 2021 10:08 am Next use ccrl3200 positions with material imbalance only as training positions.

This run uses ccrl3200 positions with material imbalance only. The pawn value is included in the tuning.

successive iteration without error improvement: 3
Exit tuning, error is not improved.

Iterations: 8
Best mse: 0.11876824568747148
Best parameters:
+----------+--------+---------+
| par      |   init |   tuned |
+==========+========+=========+
| PawnOp   |    100 |     100 |
+----------+--------+---------+
| PawnEn   |    100 |      80 |
+----------+--------+---------+
| KnightOp |    300 |     290 |
+----------+--------+---------+
| KnightEn |    300 |     305 |
+----------+--------+---------+
| BishopOp |    300 |     310 |
+----------+--------+---------+
| BishopEn |    300 |     320 |
+----------+--------+---------+
| RookOp   |    500 |     480 |
+----------+--------+---------+
| RookEn   |    500 |     480 |
+----------+--------+---------+
| QueenOp  |   1000 |    1000 |
+----------+--------+---------+
| QueenEn  |   1000 |     980 |
+----------+--------+---------+
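The exit message in this log suggests a patience-style stopping rule. The following Python sketch shows one way such a loop could be written. It is a plain local search over the parameters and not the actual tuner. The function `tune`, its arguments, and the assumption that `error_fn` may see a different batch on every call are all illustrative.

```python
def tune(params, error_fn, step=5, patience=3, max_iter=100):
    """Local search: nudge each parameter by +/- step and keep any change that lowers the error.

    Stops after `patience` successive iterations without improvement
    (or after max_iter iterations). Returns (best_params, best_error, iterations).
    """
    names = list(params)
    best = dict(params)
    best_err = error_fn(best)
    stale = 0       # successive iterations without error improvement
    iterations = 0
    while iterations < max_iter and stale < patience:
        iterations += 1
        improved = False
        for name in names:
            for delta in (step, -step):
                trial = dict(best)
                trial[name] += delta
                err = error_fn(trial)
                if err < best_err:
                    best, best_err = trial, err
                    improved = True
                    break  # keep this direction, move on to the next parameter
        stale = 0 if improved else stale + 1
    return best, best_err, iterations
```

With a deterministic `error_fn` the first iteration without improvement already means a local minimum has been reached, so a patience above 1 only makes sense when the error is evaluated on a freshly sampled batch each time.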

Ferdy · Post by **Ferdy** » Thu Jan 14, 2021 11:46 am

Desperado wrote: ↑Thu Jan 14, 2021 10:51 am Hi Ferdy, from your readme

Positions are saved with the following conditions:
* If the move in the game is not a capture, and not a checking move and not a promote move and the side to move is not in check and the game has either 1-0 or 0-1 result.
Does that mean you ignore draws, and also positions where the move recorded in the epd/pgn is of one of the mentioned types?

Yes.

Desperado · Post by **Desperado** » Thu Jan 14, 2021 12:00 pm

Hello Ferdy,

I have some questions, because looking into https://github.com/fsmosca/Piece-Value- ... aster/data
I am noticing more and more differences compared to what I am doing.

1. Choice of positions related to the game result (I just saw that you already answered this, thanks)

a. Do you use draws, or do you ignore them for the ccrl database?
b. Are there any other restrictions?

2. data presentation R4n2/4k3/P3n2p/5R2/8/7P/2r3PK/8 b - -,2,-2,0,1,0,1

a. I don't see how you interpolate the phase score from the deltas. I mean, you can generate

mg = 2 * 100 + -2 * 300 + 1 * 500
eg = 2 * 110 + -2 * 290 + 1 * 550
Is the error computation separated for mg/eg? Or do you include logic to interpolate before computing the error, like eval() in the engine does? At the beginning I pointed out that "my" problem does not occur if I compute the
phases separately (it doesn't matter whether I do that in parallel or in sequence).

b. The imbalance is represented differently

"2,1,0,-1,0" counts as very imbalanced because it is not "0,0,0,0,0", while my eval() would consider the resulting score of 0 as balanced.
So the resulting test set would look very different, I guess.

And a question about your ccrl computation: you said that you tuned the pawn value too. Is that both mg and eg?
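For the interpolation question in 2a, here is a Python sketch of how a delta row could be turned into a single tapered score before the error is computed. The column order (pawn, knight, bishop, rook, queen, then result) is inferred from the sample row and is an assumption. The eg values for bishop and queen are placeholders, since only pawn, knight and rook appear in the mg/eg lines above. The phase has to come from the FEN because it cannot be recovered from the deltas.

```python
MG = {"p": 100, "n": 300, "b": 300, "r": 500, "q": 1000}
EG = {"p": 110, "n": 290, "b": 300, "r": 550, "q": 1000}  # 'b' and 'q' are placeholder values
ORDER = "pnbrq"  # assumed column order of the delta fields
PHASE_WEIGHT = {"n": 1, "b": 1, "r": 2, "q": 4}


def parse_row(row: str):
    """Split 'fen,dP,dN,dB,dR,dQ,result' into its parts (assumed layout)."""
    fields = row.split(",")
    fen = fields[0]
    deltas = [int(x) for x in fields[1:6]]
    result = float(fields[6])
    return fen, deltas, result


def tapered_score(fen: str, deltas):
    """Return (mg, eg, phase, interpolated score) for one row."""
    mg = sum(d * MG[p] for d, p in zip(deltas, ORDER))
    eg = sum(d * EG[p] for d, p in zip(deltas, ORDER))
    board = fen.split()[0]
    phase = min(sum(PHASE_WEIGHT.get(c.lower(), 0) for c in board), 24)
    # Float division here; an engine would typically use integer division.
    return mg, eg, phase, (mg * phase + eg * (24 - phase)) / 24


row = "R4n2/4k3/P3n2p/5R2/8/7P/2r3PK/8 b - -,2,-2,0,1,0,1"
fen, deltas, result = parse_row(row)
print(tapered_score(fen, deltas))
```

For the sample row this gives mg = 100, eg = 190 and phase = 8, so the interpolated score is (100 * 8 + 190 * 16) / 24 = 160.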

Desperado · Post by **Desperado** » Thu Jan 14, 2021 12:04 pm

Ferdy wrote: ↑Thu Jan 14, 2021 11:46 am
Desperado wrote: ↑Thu Jan 14, 2021 10:51 am Hi Ferdy, from your readme

Positions are saved with the following conditions:
* If the move in the game is not a capture, and not a checking move and not a promote move and the side to move is not in check and the game has either 1-0 or 0-1 result.
Does that mean you ignore draws, and also positions where the move recorded in the epd/pgn is of one of the mentioned types?
Yes.

So you also check the move in the epd for attributes like capture, promotion or check. At least that is what I would expect now.

hgm · Post by **hgm** » Thu Jan 14, 2021 12:07 pm

Desperado wrote: ↑Thu Jan 14, 2021 11:21 am I understand, good point! Especially if the pawn value is an anchor, the phase-24 positions do not provide any information the algorithm can learn from.
They would never update anything for n, b, r or q. That already reduces the effective training sample by 15%. So it would be ideal to use positions that include an imbalance of the term being tuned. The easy way to achieve something like that is to make the general batch size bigger: the more positions are used, the higher the probability that this condition is met. The smart way would be to collect/filter the specific data and tune on that collection.

It is not only that they should occur, but they should occur frequently enough to carry enough weight to actually pay attention to them. It is the relative occurrence that counts. Otherwise the optimizer will just take its losses on a few idiotic predictions to get the large majority of the positions marginally better predicted. I don't think there would be any hope to get sensible middle-game piece values if a significant fraction of the positions (say 10%) doesn't combine a large material imbalance (> 1 Pawn) with phase >= 28.

Ferdy · Post by **Ferdy** » Thu Jan 14, 2021 12:11 pm

Desperado wrote: ↑Wed Jan 13, 2021 1:04 pm Hello everybody,

To understand what is going on, I thought I could use a database that I did not generate myself.
So I used ccrl-40-15-elo-3200.epd from https://rebel13.nl/misc/epd.html.

Setup:

1. material only evaluator
2. cpw algorithm
3. scaling factor K=1.0
4. pawn values are anchors for mg and eg [100,100]
5. starting values [300,300],[300,300],[500,500],[1000,1000]
6. loss function uses squared error
7. 50K sample size
8. phase value computation
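    // Game phase: 1 per minor piece, 2 per rook, 4 per queen, capped at 24.
    int phase = 1 * Bit::popcnt(Pos::minors(pos));
    phase += 2 * Bit::popcnt(Pos::rooks(pos));
    phase += 4 * Bit::popcnt(Pos::queens(pos));
    phase = min(phase, 24);

    // Tapered score: blend mg and eg linearly by phase, then return it from the side-to-move point of view.
    int s = (score.mg * phase + score.eg * (24 - phase)) / 24;
    return pos->stm == WHITE ? s : -s;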

Just checking, given this training position,

1r6/1p1r4/2p1p3/2PBP1Bk/R2P2bP/5p2/1R1K4/8 b - -,1-0

Your material eval would return a score of -300 cp from the side-to-move point of view (spov).

Note the side to move and the result.

Check your score and formula to see whether your sigmoid is the same as mine.

If the side to move is black, set score to -score, then apply the sigmoid.

K=1.0
sigmoid = 1.0/(1.0 + 10.0**(-K*score/400.0))

sigmoid: 0.8490204427886767
result = 1.0
error = 1.0 - sigmoid
squared_error = error*error
