That would be policy. At root, after a decent number of nodes, policy is dwarfed by evaluation. If your complaint is that policy is particularly refined in the opening, fair enough. Tournament directors should use a TC that gives Leela at least half a second to think in the opening, which puts aside concerns about policy dictating Leela's opening preferences. Are there any tournaments that don't do this?

Milos wrote: ↑Mon Sep 14, 2020 4:43 am
You, my friend, have a conflict with basic logic. The "fact" that a net trained with 18 pieces can't memorize the opening with 32 pieces does nothing to refute that a net trained with 32 pieces can memorize the opening with 32 pieces. You are just repeating your non sequitur argument, nothing else.
To simplify the argument so you can follow it: an NN is equal to book + evaluation. When you enter a 32-piece position into an NN trained only on 18-piece positions, the NN performs only eval. When you enter a 32-piece position that the net has actually been trained on, it outputs a book score adjusted by its eval.
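The "policy dwarfed by evaluation" claim above can be illustrated with the standard PUCT selection formula (a simplification of Lc0's actual search, shown here only as a sketch): the policy prior enters through the exploration term U, which shrinks roughly as 1/sqrt(N) relative to the accumulated Q evaluation as visits grow.

```python
import math

def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    """Q (averaged eval) term plus U (policy-driven exploration) term."""
    u = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u

# Policy influence for a move with prior 0.4, as the search deepens:
for n_parent, n_child in [(1, 0), (100, 40), (10000, 4000)]:
    u = puct_score(0.0, 0.4, n_parent, n_child)
    print(f"parent={n_parent:>6} child={n_child:>5} policy term U={u:.4f}")
```

At one node the policy term dominates entirely; after a few thousand visits it contributes only hundredths of a pawn's worth of score, and Q takes over.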
Asking to people who believe Leela NN is a book, what they think about SF NN now?
- Posts: 144
- Joined: Sun Oct 14, 2018 8:21 pm
- Full name: JSmith
Re: Asking to people who believe Leela NN is a book, what they think about SF NN now?
- Posts: 343
- Joined: Sun Aug 25, 2019 8:33 am
- Full name: .
Re: Asking to people who believe Leela NN is a book, what they think about SF NN now?
Overfitting is basically memorization. But I don't think Lc0 overfits much. Using modern methods you can now train huge neural nets that don't overfit, and Lc0's nets aren't even large.

AndrewGrant wrote: ↑Sun Sep 13, 2020 11:00 pm
I'm behind NN research by four decades and even I'm aware that it's a well-documented phenomenon that NNs bake in a memorization of the dataset.

Ovyron wrote: ↑Sun Sep 13, 2020 10:13 pm
Whenever an NN checks the opening position, it's the first time it sees it; to bring 1.e4 and 1.d4 to the top it has to do so from scratch. It just does this extremely intelligently and fast: after a single generated position it recognizes the patterns it has learned, and that can suffice to rank them at the top. But this has nothing to do with the opening position specifically. If you switch the pieces around, it will also come up with a decent move after a single position, and the only reason it's not as good is that the person training it hasn't shown it the patterns that arise from the piece-switched position.

https://arxiv.org/pdf/1611.03530.pdf — read that, and if you still don't think it's possible for Leela to "memorize" a selection of openings, please email the authors at the top of that document; they will be far less generous than users here.
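The central point of that paper (that networks with more parameters than training samples can fit even labels carrying no signal at all) is easy to reproduce at toy scale. The sketch below is an illustration in the spirit of the paper, not its actual experiments: a one-hidden-layer random-features network with width far above the sample count fits completely random labels exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, dim, width = 32, 10, 256   # width >> n_samples: overparameterized

X = rng.standard_normal((n_samples, dim))
y = rng.integers(0, 2, n_samples) * 2.0 - 1.0   # random +/-1 labels: nothing to "learn"

# Fixed random hidden layer; only the output weights are fitted.
W = rng.standard_normal((dim, width))
H = np.tanh(X @ W)                        # hidden activations, shape (32, 256)
v, *_ = np.linalg.lstsq(H, y, rcond=None) # least-squares output weights

preds = np.sign(H @ v)
print("train accuracy on random labels:", (preds == y).mean())  # 1.0: pure memorization
```

Since the 32 activation vectors are generically independent in 256 dimensions, the least-squares fit interpolates the labels exactly; the "accuracy" is 100% memorization, with zero generalizable signal by construction.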
- Posts: 89
- Joined: Sat Nov 09, 2019 3:24 pm
- Full name: .
Re: Asking to people who believe Leela NN is a book, what they think about SF NN now?
"Overfitting is memorization" does not imply that a net which isn't overfitting cannot memorize too.
- Posts: 4190
- Joined: Wed Nov 25, 2009 1:47 am
Re: Asking to people who believe Leela NN is a book, what they think about SF NN now?
That is a huge simplification, and a logically problematic one. While overfitting can be an indication of memorization, there is zero indication that nets which don't overfit have no memorization. Assuming that is plain wrong.

mmt wrote: ↑Mon Sep 14, 2020 7:29 am
Overfitting is basically memorization. But I don't think Lc0 overfits much. Using modern methods you can now train huge neural nets that don't overfit, and Lc0's nets aren't even large.

AndrewGrant wrote: ↑Sun Sep 13, 2020 11:00 pm
I'm behind NN research by four decades and even I'm aware that it's a well-documented phenomenon that NNs bake in a memorization of the dataset.

Ovyron wrote: ↑Sun Sep 13, 2020 10:13 pm
Whenever an NN checks the opening position, it's the first time it sees it; to bring 1.e4 and 1.d4 to the top it has to do so from scratch. It just does this extremely intelligently and fast: after a single generated position it recognizes the patterns it has learned, and that can suffice to rank them at the top. But this has nothing to do with the opening position specifically. If you switch the pieces around, it will also come up with a decent move after a single position, and the only reason it's not as good is that the person training it hasn't shown it the patterns that arise from the piece-switched position.

https://arxiv.org/pdf/1611.03530.pdf — read that, and if you still don't think it's possible for Leela to "memorize" a selection of openings, please email the authors at the top of that document; they will be far less generous than users here.
- Posts: 4190
- Joined: Wed Nov 25, 2009 1:47 am
Re: Asking to people who believe Leela NN is a book, what they think about SF NN now?
The "book" (i.e. policy) information certainly helps Lc0 quite a lot, so if you want fair tournament conditions you either need to give A/B engines much more time in the opening than Leela gets, or give them access to an opening book. The whole discussion started from the point that current tournament conditions are not fair to A/B engines.

cucumber wrote: ↑Mon Sep 14, 2020 6:19 am
That would be policy. At root, after a decent number of nodes, policy is dwarfed by evaluation. If your complaint is that policy is particularly refined in the opening, fair enough. Tournament directors should use a TC that gives Leela at least half a second to think in the opening, which puts aside concerns about policy dictating Leela's opening preferences. Are there any tournaments that don't do this?

Milos wrote: ↑Mon Sep 14, 2020 4:43 am
You, my friend, have a conflict with basic logic. The "fact" that a net trained with 18 pieces can't memorize the opening with 32 pieces does nothing to refute that a net trained with 32 pieces can memorize the opening with 32 pieces. You are just repeating your non sequitur argument, nothing else.
To simplify the argument so you can follow it: an NN is equal to book + evaluation. When you enter a 32-piece position into an NN trained only on 18-piece positions, the NN performs only eval. When you enter a 32-piece position that the net has actually been trained on, it outputs a book score adjusted by its eval.
- Posts: 550
- Joined: Tue Nov 19, 2019 8:48 pm
- Full name: Alayan Feh
Re: Asking to people who believe Leela NN is a book, what they think about SF NN now?
With a pure eval, and without something like Leela's policy head that is explicitly trained to suggest moves that were successful in training/tuning games (rather than merely keeping eval parameters that happen to produce successful moves overall), memorizing is harder.
Another element is that the parameter space of a classical eval is much smaller. The memorization capacity of NNs is related to the huge size of their parameter space, which in many instances far exceeds the size of the training dataset. The study linked by Andrew earlier shows that CNNs can be made to fit arbitrary datasets.
Nonetheless, you'd have a point if SPSA tuning were done from the start position. That would definitely cause some memorization, however minor and however hidden in normal-looking parameters it might be.
SF's SPSA tuning is done from a book containing tens of thousands of positions, however. That is much bigger than SF's eval parameter space.
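For readers unfamiliar with the tuning method being discussed, here is a minimal sketch of SPSA (simultaneous perturbation stochastic approximation): it estimates a gradient from just two noisy evaluations of the objective along a random ±1 perturbation of all parameters at once. The quadratic loss below is a toy stand-in for "engine strength as a function of eval parameters", not Stockfish's actual game-based objective.

```python
import numpy as np

def spsa(loss, theta, steps=2000, a=0.1, c=0.1, seed=0):
    """Minimal SPSA: two loss evaluations per step, whatever the dimension."""
    rng = np.random.default_rng(seed)
    for k in range(1, steps + 1):
        ak = a / k ** 0.602                      # standard SPSA gain schedules
        ck = c / k ** 0.101
        delta = rng.choice([-1.0, 1.0], size=theta.shape)
        # Per-component gradient estimate from a single pair of probes:
        g_hat = (loss(theta + ck * delta) - loss(theta - ck * delta)) / (2 * ck) / delta
        theta = theta - ak * g_hat
    return theta

target = np.array([1.0, -2.0, 0.5])              # "optimal" eval parameters (toy)
loss = lambda th: float(np.sum((th - target) ** 2))

theta = spsa(loss, np.zeros(3))
print(theta)  # converges close to [1.0, -2.0, 0.5]
```

The key property, relevant to the parameter-space argument above, is that the tuned parameters only absorb whatever improves the objective averaged over all test positions; with tens of thousands of starting positions and far fewer parameters, there is no room to store per-position answers.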
- Posts: 1631
- Joined: Tue Aug 21, 2018 7:52 pm
- Full name: Dietrich Kappe
Re: Asking to people who believe Leela NN is a book, what they think about SF NN now?
Crickets. Just people who like to argue.

dkappe wrote: ↑Mon Sep 14, 2020 5:18 am
So your hypothesis is that a Leela-type network memorizes openings. How do we test this hypothesis? What evidence, for instance, would show it to be false? If there is no possible way for the hypothesis to be disproven, then it is vacuous.

Milos wrote: ↑Mon Sep 14, 2020 4:43 am
You, my friend, have a conflict with basic logic. The "fact" that a net trained with 18 pieces can't memorize the opening with 32 pieces does nothing to refute that a net trained with 32 pieces can memorize the opening with 32 pieces. You are just repeating your non sequitur argument, nothing else.
To simplify the argument so you can follow it: an NN is equal to book + evaluation. When you enter a 32-piece position into an NN trained only on 18-piece positions, the NN performs only eval. When you enter a 32-piece position that the net has actually been trained on, it outputs a book score adjusted by its eval.
So, how would one go about trying to disprove it?
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
- Posts: 144
- Joined: Sun Oct 14, 2018 8:21 pm
- Full name: JSmith
Re: Asking to people who believe Leela NN is a book, what they think about SF NN now?
I'm not convinced.

Alayan wrote: ↑Mon Sep 14, 2020 2:07 pm
With a pure eval, and without something like Leela's policy head that is explicitly trained to suggest moves that were successful in training/tuning games (rather than merely keeping eval parameters that happen to produce successful moves overall), memorizing is harder.
Another element is that the parameter space of a classical eval is much smaller. The memorization capacity of NNs is related to the huge size of their parameter space, which in many instances far exceeds the size of the training dataset. The study linked by Andrew earlier shows that CNNs can be made to fit arbitrary datasets.
Nonetheless, you'd have a point if SPSA tuning were done from the start position. That would definitely cause some memorization, however minor and however hidden in normal-looking parameters it might be.
SF's SPSA tuning is done from a book containing tens of thousands of positions, however. That is much bigger than SF's eval parameter space.
SPSA has done a ridiculous amount to teach SF theory. And SF's parameter space can capture more than enough for an optimizer to make it function as a highly compressed book.
Stockfish 070620 at depth 12: The PV in its entirety follows theory for 12 straight plies with 72,023 nodes.
NNUE at depth 12: PV follows theory for 9 straight plies with a mere 9,781 nodes.
Leela, with the latest T60 net (64988), follows theory for 4 plies with 10,381 nodes before playing weird moves.
Ethereal, with 161,564 nodes, is able to follow theory for four plies. Clearly, Ethereal demonstrates highly advanced opening-book tendencies just like Leela.
NNUE and classical eval both follow theory perfectly well at laughably small node counts, where other engines (Leela included) struggle tremendously. Even the classical eval follows some amount of theory at nearly any depth, regardless of node count.
Stockfish is the best book out there.
- Posts: 144
- Joined: Sun Oct 14, 2018 8:21 pm
- Full name: JSmith
Re: Asking to people who believe Leela NN is a book, what they think about SF NN now?
SF12 knows opening theory better than Leela does. In fact, both SF11 and SF12 are far more efficient at rediscovering theory than either Leela or non-SPSA'd engines like Ethereal.

Milos wrote: ↑Mon Sep 14, 2020 11:43 am
The "book" (i.e. policy) information certainly helps Lc0 quite a lot, so if you want fair tournament conditions you either need to give A/B engines much more time in the opening than Leela gets, or give them access to an opening book. The whole discussion started from the point that current tournament conditions are not fair to A/B engines.

cucumber wrote: ↑Mon Sep 14, 2020 6:19 am
That would be policy. At root, after a decent number of nodes, policy is dwarfed by evaluation. If your complaint is that policy is particularly refined in the opening, fair enough. Tournament directors should use a TC that gives Leela at least half a second to think in the opening, which puts aside concerns about policy dictating Leela's opening preferences. Are there any tournaments that don't do this?

Milos wrote: ↑Mon Sep 14, 2020 4:43 am
You, my friend, have a conflict with basic logic. The "fact" that a net trained with 18 pieces can't memorize the opening with 32 pieces does nothing to refute that a net trained with 32 pieces can memorize the opening with 32 pieces. You are just repeating your non sequitur argument, nothing else.
To simplify the argument so you can follow it: an NN is equal to book + evaluation. When you enter a 32-piece position into an NN trained only on 18-piece positions, the NN performs only eval. When you enter a 32-piece position that the net has actually been trained on, it outputs a book score adjusted by its eval.
Do you think we should give other engines more time in the opening when playing against Stockfish as well, then? Do we only penalize engines with policy heads? If so, can large decision trees be used in place of a neural policy head? Is it just the neural network structure that's problematic? How many parameters can search and move ordering code have before we need to give other engines extra time?
Last edited by cucumber on Mon Sep 14, 2020 9:49 pm, edited 1 time in total.
- Posts: 546
- Joined: Sat Aug 17, 2013 12:36 am
Re: Asking to people who believe Leela NN is a book, what they think about SF NN now?
On 1 node (or equal node counts), or via search? Because I have a hard time believing the former.

cucumber wrote:
SF12 knows opening theory better than Leela does.