Desperado, here are my MSE test results for two sets of material + PST params:
My own PST: 0.13688435023553586
rofChade PST (3000+ engine): 0.13744401297556394
I calculated the MSE on 30,000 quiet positions from my gm2600.pgn.
Despite the fact that rofChade's error is slightly bigger, its PSTs score around 70 Elo stronger.
Training data
Moderators: hgm, Dann Corbit, Harvey Williamson
- maksimKorzh
- Posts: 627
- Joined: Sat Sep 08, 2018 3:37 pm
- Location: Ukraine
- Full name: Maksim Korzh
Re: Training data
Wukong Xiangqi (Chinese chess engine + apps to embed into 3rd party websites):
https://github.com/maksimKorzh/wukong-xiangqi
Chess programming YouTube channel:
https://www.youtube.com/channel/UCB9-pr ... KKqDgXhsMQ
Re: Training data
Hello Maksim,

maksimKorzh wrote: ↑Tue Jan 12, 2021 11:17 pm
Desperado, here are my MSE test results for two sets of material + PST params:
My own PST: 0.13688435023553586
rofChade PST (3000+ engine): 0.13744401297556394
I calculated the MSE on 30,000 quiet positions from my gm2600.pgn.
Despite the fact that rofChade's error is slightly bigger, its PSTs score around 70 Elo stronger.

is there a question involved? Sorry, I don't understand what you want to tell me.
In general it is not unusual that a smaller error does not produce better gameplay.
There is nothing wrong with that. There can be many reasons for such an observation.
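
For readers following along, here is a minimal sketch of how an MSE figure like the ones quoted above is commonly computed in Texel-style tuning. It is an assumption that this matches what either poster actually ran: the logistic mapping, the scaling constant `k`, and the names `win_probability` and `mean_squared_error` are illustrative and do not come from the thread.

```python
import math


def win_probability(score_cp, k=1.0):
    """Map a centipawn score (White's point of view) to an expected result in [0, 1].

    The logistic form and the constant k are assumptions (Texel-style tuning);
    the thread does not say which mapping was used.
    """
    return 1.0 / (1.0 + 10.0 ** (-k * score_cp / 400.0))


def mean_squared_error(samples, k=1.0):
    """Average squared difference between game result and predicted result.

    samples: iterable of (score_cp, result) pairs, where result is
    1.0 (White win), 0.5 (draw) or 0.0 (Black win).
    """
    total = 0.0
    count = 0
    for score_cp, result in samples:
        total += (result - win_probability(score_cp, k)) ** 2
        count += 1
    if count == 0:
        raise ValueError("no samples")
    return total / count


# Made-up (score, result) pairs, purely for illustration.
samples = [(120, 1.0), (-35, 0.5), (-210, 0.0), (15, 1.0)]
print(mean_squared_error(samples))
```

Because the error only measures how well static scores predict game results on one fixed position set, two parameter sets with nearly equal MSE can still behave very differently inside a search, which is consistent with the observation above.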
Re: Training data
Here is what I now do with success:
1. I generated 111,661,993 positions from the CCRL database, from games in which both players are rated over 2800 Elo.
2. I shuffled the file and picked 4M positions at random.
That is what I did before too. Now the improvement...
3. I ran a 3-ply search on each position.
4. I played out the PV to move 3 and created the resulting EPD entry with result and score (I just kept the result ?!)
Now validating...
5. I ran a training session for material on the new dataset. The result was stable now. No more diverging numbers.
Perfect! This process allows me to use noisy input and create useful training data out of it,
especially because it statistically reflects the distribution of position types in games.
That the data now correlates well with the engine is of course a special bonus.
There is also something to play with, because the value of the position is also output and can be used in the error
calculation in some way.
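
As a side note on step 2: with over 100 million positions, shuffling the whole file can be awkward. The sketch below shows reservoir sampling, which is a different technique from the shuffle-then-pick approach described in the post; it draws a uniform random subset in one pass without holding the full file in memory. The function name `reservoir_sample` and the `seed` parameter are hypothetical, chosen only for this illustration.

```python
import random


def reservoir_sample(lines, k, seed=None):
    """Return up to k items chosen uniformly at random from an iterable.

    Only k items are kept in memory at any time, so this also works on a
    file object that yields one position (e.g. one EPD line) per iteration.
    """
    rng = random.Random(seed)
    sample = []
    for i, line in enumerate(lines):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(line)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # both bounds inclusive
            if j < k:
                sample[j] = line
    # Shuffle so the output order carries no trace of the input order.
    rng.shuffle(sample)
    return sample


# Stand-in data instead of a real position file.
positions = (f"position {n}" for n in range(1000))
picked = reservoir_sample(positions, 10, seed=1)
print(len(picked))
```

In practice `lines` would be an open file handle and `k` would be 4,000,000; the later steps (3-ply search, PV playout, EPD output) depend on the engine and are not sketched here.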
Re: Training data
Something went wrong ... what I wrote is not true. Sorry for the noise!

Desperado wrote: ↑Wed Jan 13, 2021 9:26 am
Here is what I now do with success:
1. I generated 111,661,993 positions from the CCRL database, from games in which both players are rated over 2800 Elo.
2. I shuffled the file and picked 4M positions at random.
That is what I did before too. Now the improvement...
3. I ran a 3-ply search on each position.
4. I played out the PV to move 3 and created the resulting EPD entry with result and score (I just kept the result ?!)
Now validating...
5. I ran a training session for material on the new dataset. The result was stable now. No more diverging numbers.
Perfect! This process allows me to use noisy input and create useful training data out of it,
especially because it statistically reflects the distribution of position types in games.
That the data now correlates well with the engine is of course a special bonus.
There is also something to play with, because the value of the position is also output and can be used in the error
calculation in some way.
- maksimKorzh
Re: Training data
Just shared my experiment.

Desperado wrote: ↑Wed Jan 13, 2021 7:54 am
Hello Maksim,
maksimKorzh wrote: ↑Tue Jan 12, 2021 11:17 pm
Desperado, here are my MSE test results for two sets of material + PST params:
My own PST: 0.13688435023553586
rofChade PST (3000+ engine): 0.13744401297556394
I calculated the MSE on 30,000 quiet positions from my gm2600.pgn.
Despite the fact that rofChade's error is slightly bigger, its PSTs score around 70 Elo stronger.
is there a question involved? Sorry, I don't understand what you want to tell me.
In general it is not unusual that a smaller error does not produce better gameplay.
There is nothing wrong with that. There can be many reasons for such an observation.

No question here)