syzygy wrote: But they don't help if the search simply never looks at the key move.
Albert Silver wrote: 80,000 NPS didn't hurt either...
I think that's the key point of the whole tactics discussion. Leela misses tactics whenever the policy prior on one of the moves of the tactical line is low. But this only proves that Leela is no AlphaZero (yet), not that she can never reach that level: the ResNet architecture, with enough layers and filters, is assuredly capable of resolving such tactics once it has been shown enough examples. Dirichlet noise in self-play provides the momentum to learn deeper tactics, by exploring moves the policy head does not yet know; this process just takes a while, and its performance ceiling is limited by the size of the network. Once that is increased, improvement resumes.
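The mechanism described here can be sketched as AlphaZero-style PUCT selection, where a near-zero policy prior starves a move of visits, and root Dirichlet noise counteracts this during self-play. A minimal illustrative sketch; the constants, data layout, and function names are my own assumptions, not Leela's actual code:

```python
import math
import random

def puct_select(children, c_puct=1.5):
    """Pick the child maximizing Q + U. A move with a tiny prior P
    gets a tiny exploration bonus U, so it is almost never visited."""
    total_visits = sum(ch["visits"] for ch in children)
    def score(ch):
        q = ch["value_sum"] / ch["visits"] if ch["visits"] else 0.0
        u = c_puct * ch["prior"] * math.sqrt(total_visits + 1) / (1 + ch["visits"])
        return q + u
    return max(children, key=score)

def add_dirichlet_noise(priors, alpha=0.3, frac=0.25):
    """Mix Dirichlet noise into the root priors during self-play, so
    even moves the policy head dislikes get some exploration."""
    noise = [random.gammavariate(alpha, 1.0) for _ in priors]
    s = sum(noise)
    noise = [n / s for n in noise]
    return [(1 - frac) * p + frac * n for p, n in zip(priors, noise)]
```

With priors like [0.95, 0.04, 0.01], the key tactical move at prior 0.01 gets almost no visits under plain PUCT; after noise mixing, its effective prior can rise enough for the search to stumble onto it, which is how the policy head eventually receives training signal for the tactic.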
LCZero: Progress and Scaling. Relation to CCRL Elo
Moderators: hgm, Rebel, chrisw
-
- Posts: 143
- Joined: Wed Jan 17, 2018 1:26 pm
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
jkiliani wrote: I think that's the key point to the whole tactics discussion. Leela misses tactics whenever the policy priors on one of the moves of the tactical line are low. [...]
Still, I don't see how it will improve dramatically on, say, WAC positions, which most reasonable AB engines solve overwhelmingly. The current-size network, which is surely not saturated yet, shows no progress on tactical shots; on WAC it even shows a regression:
6s/position 4 CPU threads (equivalent to 1s on GTX 1060)
ID227: 120/300
ID252: 113/300
After more than 1 million games with the bigger net, this is worrying. These WAC positions are a piece of cake for reasonable AB engines, and an exploit could be built to make this sort of position occur more often in games. Don't you suspect that even A0 was severely sub-par compared to the 280+/300 WAC results of reasonable AB engines? Also, the network itself at 1 playout seems to improve tactically (visible when playing against it), but the search does not get better at solving tactics as the network improves.
It was interesting to see LC0 trashing an AB engine in normal games from normal starting positions:
Openings: 3moves_GM.epd (side and reversed)
Score of LC0_245 vs Predateur 2.2.1: 93 - 1 - 6 [0.960] 100
ELO difference: 552.08 +/- 173.83
Finished match
But from WAC starting positions (side and reversed), i.e. in games containing one tactical shot, it manages to lose to an engine considered almost 600 Elo points weaker under normal conditions:
Openings: WAC300.epd (side and reversed)
Score of LC0_245 vs Predateur 2.2.1: 42 - 52 - 6 [0.450] 100
ELO difference: -34.86 +/- 66.75
Finished match
Again, I suspect such a thing can happen even to A0 against SF or even much weaker AB engines.
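The reported Elo differences follow directly from the standard logistic Elo model; a quick sanity check of the two match scores above (a sketch using the textbook formula, nothing engine-specific):

```python
import math

def elo_diff(score):
    """Elo difference implied by an average match score in (0, 1)."""
    return 400.0 * math.log10(score / (1.0 - score))

# 93 wins, 1 loss, 6 draws out of 100 games -> score 0.96
print(round(elo_diff(0.96), 2))   # 552.08, as reported
# 42 wins, 52 losses, 6 draws -> score 0.45
print(round(elo_diff(0.45), 2))   # -34.86, as reported
```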
-
- Posts: 5566
- Joined: Tue Feb 28, 2012 11:56 pm
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
jkiliani wrote: I think that's the key point to the whole tactics discussion. Leela misses tactics whenever the policy priors on one of the moves of the tactical line are low. [...]
The tactics it missed here it will probably learn eventually, but my point is that it is indeed the NN that *must* have learned the tactic before the search will ever start paying attention to such a bad-looking move.
There are so many possible tactics that I doubt it can learn enough of them to ever measure itself, tactically, against SF.
Maybe I will turn out to be wrong on this, or maybe Leela's better positional understanding will outweigh its tactical shortcomings. We'll see.
-
- Posts: 291
- Joined: Wed May 08, 2013 6:49 am
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
The way I understand it, the net does a positional evaluation of every position, expressed as probabilities. But sharp tactics are not meant to be evaluated positionally; they are meant to be searched until a quiescent state is reached. Even humans don't do positional evaluation in the midst of a capture sequence. I wish it were possible to exclude captures and checks from the net's training and just let a quiescence search handle the rest.
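What is being wished for is essentially the standard quiescence search of A/B engines: only quiet positions get a static (positional) evaluation, while captures are searched out first. A minimal negamax-style sketch, assuming hypothetical `evaluate`, `gen_captures`, and `make_move` helpers rather than any real engine API:

```python
INF = 10**9

def quiescence(pos, alpha, beta, evaluate, gen_captures, make_move):
    """Search captures until the position is quiet, then trust the
    static evaluation. The three callbacks are assumed helpers."""
    stand_pat = evaluate(pos)      # static eval as a quiet baseline
    if stand_pat >= beta:
        return beta                # fail-hard beta cutoff
    alpha = max(alpha, stand_pat)
    for move in gen_captures(pos):
        child = make_move(pos, move)
        score = -quiescence(child, -beta, -alpha,
                            evaluate, gen_captures, make_move)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha
```

The point of the sketch: the static evaluation is only ever trusted as a lower bound ("stand pat"), and any pending capture sequence is resolved before the evaluation counts, which is exactly the separation between positional judgment and forcing tactics the poster describes.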
-
- Posts: 13447
- Joined: Wed Mar 08, 2006 9:02 pm
- Location: Dallas, Texas
- Full name: Matthew Hull
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
Laskos wrote: ↑Sun May 06, 2018 2:50 pm Still, I don't see how it will be improved dramatically, say on WAC positions, which most reasonable AB engines solve overwhelmingly. [...]
This seems normal to me. As an example, Chest excels at chess puzzles, and at cooking them, but not at chess itself. WAC is in a way a set of puzzles, and obviously represents positions LC0 doesn't encounter very often in self-play. Just as A/B chess programs don't excel at chess puzzles, LC0 doesn't excel at tactics, at least not at the beginning. People are concerned about tactical holes, but consider that Leela is learning positional chess: principles of position that lead to favorable tactics. If we think of positional chess as deep rather than shallow tactics, then her knowledge base is being filled in, as it were, in the reverse order from an A/B program, which is after all a human teaching/writing a program in the order that humans tend to learn chess. LC0 is learning chess under a different guidance regime, one dictated by the nature of neural-network training. We shouldn't expect it to learn the game the way we learn, and thus the way we teach our A/B programs. So I see nothing of concern here.
Matthew Hull
-
- Posts: 3019
- Joined: Wed Mar 08, 2006 9:57 pm
- Location: Rio de Janeiro, Brazil
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
Laskos wrote: ↑Sun May 06, 2018 2:50 pm Still, I don't see how it will be improved dramatically, say on WAC positions, which most reasonable AB engines solve overwhelmingly. [...]
Were both runs done with the same LCZero version?
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
Albert Silver wrote: ↑Mon May 07, 2018 5:01 pm Were both runs done with the same LCZero version?
Yes, v0.8.
And, now ID258 has 110/300. Seems consistent with worsening on WAC.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
Laskos wrote: ↑Mon May 07, 2018 5:13 pm Yes, v0.8. And now ID258 has 110/300. Seems consistent with worsening on WAC. [...]
Interesting: on a positional opening suite, the trend is exactly the opposite. Here are the results for the first 15x192 net compared to the last one:
LCZero v0.8
6s/position 4 CPU threads (equivalent to 1s on GTX 1060)
WAC300 tactical:
ID227: 120/300
ID258: 110/300
performance below that of 1800-Elo AB engines, and worsening
Openings200 positional:
ID227: 98/200
ID258: 111/200
performance above that of 3200-Elo AB engines, and improving
There seems to be some conflict between these two aspects, at least in the net+search combination.
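Worth noting: with samples this small, the WAC shift is within noise, while the positional gain is only suggestive. A rough two-proportion z-test on the solve rates above (my own back-of-envelope check using the standard normal approximation, not from the thread):

```python
import math

def z_two_proportions(k1, n1, k2, n2):
    """z-statistic for the difference between two solve rates k1/n1 and k2/n2."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

print(round(z_two_proportions(120, 300, 110, 300), 2))  # WAC 227 vs 258: 0.84, not significant
print(round(z_two_proportions(111, 200, 98, 200), 2))   # positional 258 vs 227: 1.30, suggestive at best
```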
-
- Posts: 1142
- Joined: Thu Dec 28, 2017 4:06 pm
- Location: Argentina
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
Kai, can you try some nets in the 231-236 range? Particularly 231, 232 and 236. Those are the ones that several of us consider the strongest.
Follow my tournament and some Leela gauntlets live at http://twitch.tv/ccls
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
Laskos wrote: ↑Mon May 07, 2018 6:40 pm Interesting, in positional opening suite, the trend is exactly the opposite. [...]
It is going to be a massive heartbreak for many who believe the NN is going to solve tactics.
Hardware + cherry-picking seems to be the only explanation left so far ...
syzygy gets it too: judge only by the evidence presented so far on tactics, which is none.
Daniel