Leela data worth +20 elo

Rebel · Post by **Rebel** » Mon Mar 28, 2022 5:18 am

dkappe wrote: ↑Fri Mar 25, 2022 6:26 pm Looks like a test by the SF team has shown that leela + sf data is 20 elo stronger than sf data by itself. +20 elo is no small thing at this level.

Indeed, it would be around 100 elo (or so) for a 3200 engine.

But since the thread exploded, allow me a question: it's unclear to me how you can mix data from 2 different engines, that's a recipe for trouble. I am just guessing, but maybe they use Leela games and analyze them with SF itself? Then the whole argument that SF is using Leela data is pretty much lame.

dkappe · Post by **dkappe** » Mon Mar 28, 2022 6:14 am

Rebel wrote: ↑Mon Mar 28, 2022 5:18 am
Indeed, it would be around 100 elo (or so) for a 3200 engine.

But since the thread exploded, allow me a question: it's unclear to me how you can mix data from 2 different engines, that's a recipe for trouble. I am just guessing, but maybe they use Leela games and analyze them with SF itself? Then the whole argument that SF is using Leela data is pretty much lame.

Your instinct is in line with my own experience. I used bad gyal (mcts/nn) self-play data to train night nurse (nnue), then used night nurse running in CFish to generate data, which I then tried to mix in, in a variety of ways. This was with the original nnue architecture and the nodchip trainer.

The results were always worse than plain bad gyal data.

Apparently training a net first with SF, then with a second run using leela data works a charm. Is it because of the new architecture or the new training software? I can’t say.

I have heard that they do filter the leela data in some ways, but I don’t know too much about that. Perhaps Sopel can speak to that.

Rebel · Post by **Rebel** » Mon Mar 28, 2022 10:30 am

dkappe wrote: ↑Mon Mar 28, 2022 6:14 am
Your instinct is in line with my own experience. I used bad gyal (mcts/nn) self-play data to train night nurse (nnue), then used night nurse running in CFish to generate data, which I then tried to mix in, in a variety of ways. This was with the original nnue architecture and the nodchip trainer.

The results were always worse than plain bad gyal data.

Apparently training a net first with SF, then with a second run using leela data works a charm. Is it because of the new architecture or the new training software? I can’t say.

I have heard that they do filter the leela data in some ways, but I don’t know too much about that. Perhaps Sopel can speak to that.

Meaning, it's totally unclear how Lc0 data is used; it could even be used only to create volume. Nothing fishy about that.

Sopel · Post by **Sopel** » Mon Mar 28, 2022 11:25 am

dkappe wrote: ↑Mon Mar 28, 2022 6:14 am
Apparently training a net first with SF, then with a second run using leela data works a charm. Is it because of the new architecture or the new training software? I can’t say.

I have heard that they do filter the leela data in some ways, but I don’t know too much about that. Perhaps Sopel can speak to that.

Training first on SF data and then on mixed SF/Lc0 data has always been better, even with the very simple initial architecture. It's still a mystery why using SF/Lc0 data from the beginning produces worse nets. I don't remember if we ever tested this with the nodchip trainer but I doubt it's the trainer's fault.

As for filtering: we used to just filter out samples where the bestmove is a capture or the king is in check. This works well for all the data we've tried. Recently vondele implemented additional stochastic skipping of positions based on how well the evaluation correlates with the game result (a position is skipped more often the less its evaluation matches the result). This was shown to be slightly positive for Lc0 data but not for SF data, possibly because SF data is of worse quality and its results have a lot of noise; that would also be consistent with SF data earlier achieving worse results with lambda<1.0. Also, since the introduction of that stochastic skipping it's best to use lambda=1.0 for mixed SF/Lc0 data, compared to the previous best of lambda~=0.8.

(target = some_sigmoid(pos_eval) * self.lambda_ + game_result * (1.0 - self.lambda_))
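To make the formula and the filtering above concrete, here is a minimal Python sketch. It is not the actual trainer code: the `Sample` fields, the sigmoid scale of 400 centipawns, the `skip_strength` parameter, and the exact form of the skip probability are all assumptions made for illustration. Only the lambda blend and the capture/in-check filter follow directly from the description.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Sample:
    """One training position (hypothetical layout, for illustration only)."""
    eval_cp: float             # engine evaluation in centipawns, side to move
    result: float              # game result for side to move: 1.0, 0.5 or 0.0
    bestmove_is_capture: bool
    in_check: bool


def eval_to_expected_score(eval_cp: float, scale: float = 400.0) -> float:
    """Stand-in for some_sigmoid: maps an eval to (0, 1). The scale is assumed."""
    x = max(-50.0, min(50.0, eval_cp / scale))  # clamp to keep exp() in range
    return 1.0 / (1.0 + math.exp(-x))


def blended_target(sample: Sample, lambda_: float) -> float:
    """lambda_ = 1.0 trains purely on evals, lambda_ = 0.0 purely on results."""
    return (eval_to_expected_score(sample.eval_cp) * lambda_
            + sample.result * (1.0 - lambda_))


def keep_sample(sample: Sample, rng: random.Random,
                skip_strength: float = 0.0) -> bool:
    """Deterministic filter plus an assumed form of stochastic skipping."""
    # Filter described in the post: drop captures and in-check positions.
    if sample.bestmove_is_capture or sample.in_check:
        return False
    # Assumed skip rule: the larger the eval/result mismatch, the more
    # likely the position is skipped. skip_strength = 0.0 disables it.
    mismatch = abs(eval_to_expected_score(sample.eval_cp) - sample.result)
    return rng.random() >= skip_strength * mismatch


quiet = Sample(eval_cp=0.0, result=1.0, bestmove_is_capture=False, in_check=False)
target = blended_target(quiet, lambda_=0.8)
kept = keep_sample(quiet, random.Random(0))
```

For the `quiet` sample the eval maps to 0.5, so with lambda=0.8 the target works out to 0.5 * 0.8 + 1.0 * 0.2 = 0.6; with lambda=1.0 the game result drops out entirely and the target is just the squashed eval.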

Rebel wrote: ↑Mon Mar 28, 2022 10:30 am Meaning, it's totally unclear how Lc0 data is used; it could even be used only to create volume. Nothing fishy about that.

We're perfectly clear about what data we're using and how. Just look at the commit messages that introduce new nets.

Rebel · Post by **Rebel** » Mon Mar 28, 2022 6:36 pm

Sopel wrote: ↑Mon Mar 28, 2022 11:25 am
We're perfectly clear about what data we're using and how. Just look at the commit messages that introduce new nets.

Thanks for the transparency. One more question about the Leela data: do the games come with comments (scores), and are the scores part of the learning process?

Sopel · Post by **Sopel** » Mon Mar 28, 2022 8:18 pm

Rebel wrote: ↑Mon Mar 28, 2022 6:36 pm
Thanks for the transparency. One more question about the Leela data: do the games come with comments (scores), and are the scores part of the learning process?

Evaluation is an integral part of the training data. Lc0 data has evals from lc0, stockfish data has evals from stockfish. The evaluations are used to train the net.

Rebel · Post by **Rebel** » Mon Mar 28, 2022 8:41 pm

Sopel wrote: ↑Mon Mar 28, 2022 8:18 pm
Evaluation is an integral part of the training data. Lc0 data has evals from lc0, stockfish data has evals from stockfish. The evaluations are used to train the net.

Meaning that those who believe (not me) that a NN should be made by the engine itself have a point.

dkappe · Post by **dkappe** » Tue Mar 29, 2022 1:35 am

Rebel wrote: ↑Mon Mar 28, 2022 8:41 pm
Meaning that those who believe (not me) that a NN should be made by the engine itself have a point.

I don’t agree with this. Making weird guidelines for TCEC about network and data origins is pointless. I think fully exploiting the value of lc0 data is a good thing and I’m glad the stockfish project did that. I would like to see an attempt made, however, to train a network based purely on pre-lc0 stockfish data. That’s an opportunity missed.

Rebel · Post by **Rebel** » Tue Mar 29, 2022 8:54 am

dkappe wrote: ↑Tue Mar 29, 2022 1:35 am
I don’t agree with this. Making weird guidelines for TCEC about network and data origins is pointless. I think fully exploiting the value of lc0 data is a good thing and I’m glad the stockfish project did that. I would like to see an attempt made, however, to train a network based purely on pre-lc0 stockfish data. That’s an opportunity missed.

I think the only restriction should be that authors create their networks from the ground up.

dkappe · Post by **dkappe** » Tue Mar 29, 2022 9:53 am

Rebel wrote: ↑Tue Mar 29, 2022 8:54 am I think the only restriction should be that authors create their networks from the ground up.

Restriction? By whom for whom? I think there ought to be no restrictions, just a desire to do interesting work. Any restrictions that get in the way should be discarded.
