Desperado wrote: ↑Fri Mar 11, 2022 6:25 pm
How was the data generated
====
1. 75000 games were played between three slightly different versions of zurichess
using the 2moves_v1.pgn opening book to ensure high play variability.
2. From each game 20 positions were sampled from the millions of positions evaluated by
the engine during the game play. This resulted in 1500k random positions which were stored in violent.epd.
3. From the set were removed all positions on which quiescence search found a winning capture.
The remaining positions were stored in quiet.epd.
4. Next, a game was played from each quiet position using Stockfish 080916.
The results were stored in quiet-labeled.epd.
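For illustration, the sampling in steps 1–2 could look roughly like the sketch below, assuming python-chess and a single PGN file of the self-play games. Note that zurichess sampled from positions evaluated during search, whereas this sketch only samples from the game line; the file names and the helper code are placeholders, not the original tooling.

[code]
import random
import chess.pgn

# Sketch of steps 1-2: sample up to 20 positions per self-play game and store them as EPD.
# "selfplay_games.pgn" and "violent.epd" are placeholder file names.
with open("selfplay_games.pgn") as pgn, open("violent.epd", "w") as out:
    while (game := chess.pgn.read_game(pgn)) is not None:
        board = game.board()
        epds = []
        for move in game.mainline_moves():
            board.push(move)
            epds.append(board.epd())
        # Pick up to 20 positions from this game, uniformly at random.
        for epd in random.sample(epds, min(20, len(epds))):
            out.write(epd + "\n")
[/code]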
1. Can someone explain the idea behind steps 1 and 2, which create violent.epd and quiet.epd?
Why these extra steps? How is this different from playing the Stockfish games right away and filtering the quiet positions out of them?
So what is the point of playing 75K games if the training signal (the game result) comes from a different engine anyway?
Quiet positions from different game phases could also be obtained by picking positions from any dataset, couldn't they?
2. And what was the time control for the Stockfish playouts?
3. quiet-labeled.epd contains 750K positions, so 750K playouts were played by Stockfish?
1. The idea is that the evaluation you are tuning should be as accurate as possible on exactly those positions that the engine you are optimizing has to evaluate statically during search. The positions an engine encounters in its static evaluation are not necessarily of the same class as normal positions in a game. For example, if you look at the leaves of quiescence search, you'll often see positions that, while no longer having winning captures, are still extremely winning for one side because of some tactic. Which kinds of positions an engine encounters at quiescence leaves (and wherever else a position needs to be statically evaluated during search) likely differ slightly from engine to engine, so it may help to sample these positions from the engine whose evaluation you are actually optimizing. Additionally, the sampled positions are representative of a typical game the engine will play, so they naturally contain the right proportion of opening/endgame positions.
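To make the "no winning capture" filter from step 3 concrete, here is a minimal sketch of that kind of check: a capture-only search with a material-only evaluation, written with python-chess. zurichess used its own quiescence search and evaluation for this, so the piece values, the margin, and the function names below are just assumptions for illustration.

[code]
import chess

# Rough piece values in centipawns; the king is never capturable in legal moves.
PIECE_VALUES = {chess.PAWN: 100, chess.KNIGHT: 300, chess.BISHOP: 300,
                chess.ROOK: 500, chess.QUEEN: 900, chess.KING: 0}

def material(board: chess.Board) -> int:
    """Material balance from the side to move's point of view."""
    score = 0
    for piece_type, value in PIECE_VALUES.items():
        score += value * (len(board.pieces(piece_type, board.turn))
                          - len(board.pieces(piece_type, not board.turn)))
    return score

def qsearch(board: chess.Board, alpha: int, beta: int) -> int:
    """Capture-only negamax with stand-pat and a material-only leaf eval."""
    stand_pat = material(board)
    if stand_pat >= beta:
        return beta
    alpha = max(alpha, stand_pat)
    for move in board.generate_legal_captures():
        board.push(move)
        score = -qsearch(board, -beta, -alpha)
        board.pop()
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha

def is_quiet(board: chess.Board, margin: int = 50) -> bool:
    """Keep a position only if capture search cannot win material beyond a margin."""
    return qsearch(board, -100000, 100000) <= material(board) + margin
[/code]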
2. I don't know, but for the set I created for Nalwald (using almost the same method, if I understand correctly), the time per move during the playouts was 80 milliseconds. With Stockfish you'd obviously need less time per move to reach the same quality. (To put this into perspective: when I use my Nalwald set together with the zurichess set, the engine plays roughly 10 Elo better than when I use only the Nalwald set.)
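A single labeling playout (step 4) could be scripted roughly as below with python-chess. The engine path, the 80 ms per move, and the c9-style result annotation are assumptions for the sketch, not a description of the original scripts.

[code]
import chess
import chess.engine

def label_position(epd: str, engine_path: str = "./stockfish",
                   movetime: float = 0.08) -> str:
    """Play a game out from an EPD position and return the EPD with the result attached."""
    board, _ = chess.Board.from_epd(epd)
    with chess.engine.SimpleEngine.popen_uci(engine_path) as engine:
        while not board.is_game_over(claim_draw=True):
            result = engine.play(board, chess.engine.Limit(time=movetime))
            board.push(result.move)
        outcome = board.result(claim_draw=True)  # "1-0", "0-1" or "1/2-1/2"
    return f'{epd} c9 "{outcome}";'
[/code]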
3. I guess so. This process can take a very long time; I only did it myself because I had access to a machine with 80 threads, and it still took more than a whole day.
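If you want to spread the playouts over many cores, a rough sketch with Python's multiprocessing could look like this. It assumes the label_position function from the previous sketch lives in a (hypothetical) module playout.py; the chunk size and file names are arbitrary, and starting a fresh engine per position is simple but wasteful, so a real script would reuse one engine per worker.

[code]
import multiprocessing as mp

# Hypothetical import: label_position from the previous sketch, saved as playout.py.
from playout import label_position

if __name__ == "__main__":
    with open("quiet.epd") as f:
        epds = [line.strip() for line in f if line.strip()]
    # One worker per core; each call starts its own engine process.
    with mp.Pool(processes=mp.cpu_count()) as pool:
        labeled = pool.map(label_position, epds, chunksize=64)
    with open("quiet-labeled.epd", "w") as out:
        out.write("\n".join(labeled) + "\n")
[/code]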