Re: Training data

Posted: Tue Jan 12, 2021 11:17 pm
by maksimKorzh
Desperado, here are my MSE test results for two sets of material + PST params:

My own PST: 0.13688435023553586
rofChade PST (3000+ engine): 0.13744401297556394

I calculated the MSE on 30 000 quiet positions from my gm2600.pgn.
Now, despite the fact that rofChade's error is slightly bigger, its PSTs score around 70 Elo stronger.
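
For readers who want to reproduce this kind of number, here is a minimal sketch of an MSE calculation. It assumes the usual Texel-tuning formulation (static eval in centipawns mapped to an expected score by a sigmoid, compared against the game result). The post does not state the exact formula used, so the scaling constant `k`, the function names and the sample data are all hypothetical.

```python
def sigmoid(score_cp, k=1.0):
    """Map a centipawn score to an expected game result in [0, 1].

    Assumption: the common Texel-style curve 1 / (1 + 10^(-k*s/400)).
    """
    return 1.0 / (1.0 + 10.0 ** (-k * score_cp / 400.0))


def mean_squared_error(samples, k=1.0):
    """samples: list of (score_cp, result), result in {0.0, 0.5, 1.0}
    from White's point of view."""
    total = 0.0
    for score_cp, result in samples:
        total += (result - sigmoid(score_cp, k)) ** 2
    return total / len(samples)


# Made-up data, only to show the call; not taken from gm2600.pgn.
samples = [(0, 0.5), (400, 1.0), (-400, 0.0)]
mse = mean_squared_error(samples)
print(mse)
```

Running the same function with two different parameter sets over the same positions gives two directly comparable error values, which is the comparison made above.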

Re: Training data

Posted: Wed Jan 13, 2021 7:54 am
by Desperado
Hello Maksim,

Is there a question involved? Sorry, I don't understand what you want to tell me.

In general it is not unusual that a smaller error does not produce better gameplay.
There is nothing wrong with that. There can be many reasons for such an observation.

Re: Training data

Posted: Wed Jan 13, 2021 9:26 am
by Desperado
Here is what I do with success now:

1. I generated 111,661,993 positions from the CCRL database, using only games where both players are rated over 2800 Elo.
2. I shuffled the file and picked 4M positions at random.

That is what I did before too. Now the improvement...

3. I did a 3-ply search for each position.
4. I played out the PV to move 3 and created the resulting EPD entry with result and score (I just kept the result?!).

Now validating...

5. I ran a training session for material on the new dataset. The result is stable now. No diverging numbers anymore.

Perfect! This process lets me take noisy input and create useful training data out of it,
especially because the data reflects the statistical distribution of position types in games.
That the data now also correlates well with the engine is of course a special bonus.

There is also something to play with, because the value of the position is also output and can be used in the error
calculation in some way.
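
The steps above can be sketched roughly as follows. This is only an illustration under assumptions: `sample_positions`, `relabel` and the `search` callback are hypothetical names, and `fake_search` is a stand-in for a real engine doing the 3-ply search and PV playout.

```python
import random


def sample_positions(positions, count, seed=None):
    """Steps 1-2: pick `count` entries uniformly at random, without repeats."""
    rng = random.Random(seed)
    return rng.sample(positions, min(count, len(positions)))


def relabel(entries, search, depth=3):
    """Steps 3-4: replace each position by the end of the searched PV.

    `search(fen, depth)` is a hypothetical hook that returns
    (position after playing out the PV, score of the search).
    The game result of the original entry is kept unchanged.
    """
    dataset = []
    for fen, result in entries:
        leaf_fen, score = search(fen, depth)
        dataset.append((leaf_fen, result, score))
    return dataset


def fake_search(fen, depth):
    # Stand-in for a real engine: tags the position and returns score 0.
    return fen + " [pv played out]", 0


# Placeholder strings instead of real FEN/EPD lines.
entries = [("pos%d" % i, 0.5) for i in range(10)]
picked = sample_positions(entries, 4, seed=1)
dataset = relabel(picked, fake_search)
```

Keeping the score next to the result in each entry leaves room to experiment with using it in the error calculation later.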

Re: Training data

Posted: Wed Jan 13, 2021 10:09 am
by Desperado
Something went wrong ... what I wrote is not true. Sorry for the noise!

Re: Training data

Posted: Wed Jan 13, 2021 12:31 pm
by maksimKorzh
Desperado wrote:
Wed Jan 13, 2021 7:54 am
Hello Maksim,

Is there a question involved? Sorry, I don't understand what you want to tell me.

In general it is not unusual that a smaller error does not produce better gameplay.
There is nothing wrong with that. There can be many reasons for such an observation.
Just shared my experiment.
No question here)