LCZero: Progress and Scaling. Relation to CCRL Elo

Joost Buijs · Post by **Joost Buijs** » Sat May 05, 2018 4:39 pm

AdminX wrote:
Joost Buijs wrote:
AlvaroBegue wrote:
mar wrote:[...] So tactics seems to be the Achilles heel of Leela, at least at this fast TC.
I have evidence that Leela is horrible at tactics at any time control. I posted a reproducible problem on the LCZero forum, but the responses I got are mostly from fan boys that don't want to see the problem.

https://groups.google.com/forum/#!topic ... q3lg9QV2XQ

Basically you can enter these UCI commands:
position fen r4rk1/pp1b1ppp/1npb4/q5B1/3P3N/1BP5/P4PPP/R2QR1K1 w - - 9 16 moves g5d2 f8e8 d1f3 d7e6 a1d1 e6b3 a2b3 a5a2 c3c4
go infinite
[d]r3r1k1/pp3ppp/1npb4/8/3P3N/1PP2Q2/q2B1PPP/3RR1K1 w - - 1 20[/d]

This is a position where Qxd2 (a2d2) gains a bishop. The queen cannot be recaptured because of a back-rank mate. This tactic is so simple that it is obvious to me (Elo ~1500).

After 3 minutes of thinking time and over 200K playouts, LCZero had considered the correct move 0 times (!!!).

They need a better search algorithm but they are not looking for one yet.
Indeed, this is a simple 4-ply tactic and I would expect the MCTS to pick it up. Maybe the network thinks Qxd2 is so bad that it never considers this move in the playouts. I have no clue how the LCZero algorithm works exactly, as I never looked at the code, but I have the feeling that many things need improvement before you can even start thinking about reaching the level of Stockfish.
Well, isn't the theory that she's supposed to teach herself? Otherwise, what is the purpose of self-play and reinforcement learning? The same might even be said of giving it Syzygy support.

She only learns about things she has seen before, mainly positional. The network doesn't understand anything about tactics; maybe she will learn some shallow tactics by means of the history planes, but the majority of the tactics have to be resolved by the MCTS, and it looks like there is something amiss there.
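To make the "never considers this move" point concrete, here is a minimal sketch of an AlphaZero-style PUCT selection rule at a single node. It is not LCZero's actual code: the move names, priors, values, the exploration constant `c_puct`, and the choice to start unvisited moves at Q = 0 are all assumptions picked for illustration.

```python
import math

def puct_visits(priors, values, playouts, c_puct=1.0):
    """Run a one-level PUCT bandit and return the visit count of each move.

    priors: move -> policy prior P(a) from the (hypothetical) network
    values: move -> fixed value returned whenever that move is evaluated
    """
    visits = {move: 0 for move in priors}
    value_sum = {move: 0.0 for move in priors}

    for n in range(playouts):
        # Square root of the parent's visit count; clamped to 1 so that the
        # priors already break the tie on the very first playout.
        sqrt_total = math.sqrt(max(1, n))

        def score(move):
            # Assumption: an unvisited move starts with Q = 0.
            q = value_sum[move] / visits[move] if visits[move] else 0.0
            u = c_puct * priors[move] * sqrt_total / (1 + visits[move])
            return q + u

        best = max(priors, key=score)
        visits[best] += 1
        value_sum[best] += values[best]

    return visits


# Illustrative numbers only, not taken from any real network.
values = {"quiet_a": 0.1, "quiet_b": 0.0, "Qxd2": 0.9}
tiny_prior = {"quiet_a": 0.7, "quiet_b": 0.2999, "Qxd2": 0.0001}
fair_prior = {"quiet_a": 0.65, "quiet_b": 0.30, "Qxd2": 0.05}

print(puct_visits(tiny_prior, values, 5000))
print(puct_visits(fair_prior, values, 5000))
```

With the tiny prior, the exploration bonus of the capture stays below about 0.0001 × √5000 ≈ 0.007, which never catches up with the 0.1 average of the quiet move, so the capture gets no visits at all. With the larger prior it is tried early and then takes most of the playouts. How closely this toy matches the real engine depends on details (such as the initial Q of unvisited moves) that are not covered in this thread.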

Jhoravi · Post by **Jhoravi** » Sat May 05, 2018 4:45 pm

AdminX wrote:
syzygy wrote:
AdminX wrote:Well, isn't the theory that she's supposed to teach herself? Otherwise, what is the purpose of self-play and reinforcement learning?
But there will necessarily be a limit on the tactics that its NN can resolve itself. If the NN incorrectly classifies a tactical move as bad with the result that the search never even looks at it, there is a problem.
Sounds like she will need an additional tactical module added to her system.

It's probably because tactics create a conflict in its learning: dropping a piece is usually learned as bad. For example, out of 100 queen sacrifices, 99 might be losing and only one wins. As a result of those odds, it learns to avoid giving up its queen.

mar · Post by **mar** » Sat May 05, 2018 4:57 pm

AlvaroBegue wrote:They need a better search algorithm but they are not looking for one yet.

Fully agreed. This is THE thing they should focus on now.
Current SF dev should be as strong as A0.

I wonder if something simple and stupid (like dedicating 1 thread to, say, a material-only alpha-beta searcher to guide tree expansion) may work, since net evals are very expensive.

But shouldn't the search converge regardless given more time to think? Your position is 2 plies + qsearch, piece of cake for any AB searcher out there...

I can't imagine A0 beating SF when missing tactics such as this one.

EDIT: the tactic is slightly more complicated than I thought: not 2 plies, but still very easy for AB searchers.

AdminX · Post by **AdminX** » Sat May 05, 2018 5:17 pm

Jhoravi wrote:
AdminX wrote:
syzygy wrote:
AdminX wrote:Well, isn't the theory that she's supposed to teach herself? Otherwise, what is the purpose of self-play and reinforcement learning?
But there will necessarily be a limit on the tactics that its NN can resolve itself. If the NN incorrectly classifies a tactical move as bad with the result that the search never even looks at it, there is a problem.
Sounds like she will need an additional tactical module added to her system.
It's probably because tactics create a conflict in its learning: dropping a piece is usually learned as bad. For example, out of 100 queen sacrifices, 99 might be losing and only one wins. As a result of those odds, it learns to avoid giving up its queen.

So she is unable to comprehend paradoxical situations in her NN. Odd, given some of her exchange sacs.

Ralph Stoesser · Post by **Ralph Stoesser** » Sat May 05, 2018 5:41 pm

Joost Buijs wrote:
AlvaroBegue wrote:
mar wrote:[...] So tactics seems to be the Achilles heel of Leela, at least at this fast TC.
I have evidence that Leela is horrible at tactics at any time control. I posted a reproducible problem on the LCZero forum, but the responses I got are mostly from fan boys that don't want to see the problem.

https://groups.google.com/forum/#!topic ... q3lg9QV2XQ

Basically you can enter these UCI commands:
position fen r4rk1/pp1b1ppp/1npb4/q5B1/3P3N/1BP5/P4PPP/R2QR1K1 w - - 9 16 moves g5d2 f8e8 d1f3 d7e6 a1d1 e6b3 a2b3 a5a2 c3c4
go infinite
[d]r3r1k1/pp3ppp/1npb4/8/3P3N/1PP2Q2/q2B1PPP/3RR1K1 w - - 1 20[/d]

This is a position where Qxd2 (a2d2) gains a bishop. The queen cannot be recaptured because of a back-rank mate. This tactic is so simple that it is obvious to me (Elo ~1500).

After 3 minutes of thinking time and over 200K playouts, LCZero had considered the correct move 0 times (!!!).

They need a better search algorithm but they are not looking for one yet.
Indeed, this is a simple 4-ply tactic and I would expect the MCTS to pick it up. Maybe the network thinks Qxd2 is so bad that it never considers this move in the playouts. I have no clue how the LCZero algorithm works exactly, as I never looked at the code, but I have the feeling that many things need improvement before you can even start thinking about reaching the level of Stockfish.

On the other hand, it means that LCZero is already extremely strong positionally if it can reach the 3000 Elo level while being this weak tactically. I have the opposite feeling: an update for better tactical search shouldn't be that hard to find, and therefore SF will be bettered soon.

AdminX · Post by **AdminX** » Sat May 05, 2018 5:55 pm

Ralph Stoesser wrote:
Joost Buijs wrote:
AlvaroBegue wrote:
mar wrote:[...] So tactics seems to be the Achilles heel of Leela, at least at this fast TC.
I have evidence that Leela is horrible at tactics at any time control. I posted a reproducible problem on the LCZero forum, but the responses I got are mostly from fan boys that don't want to see the problem.

https://groups.google.com/forum/#!topic ... q3lg9QV2XQ

Basically you can enter these UCI commands:
position fen r4rk1/pp1b1ppp/1npb4/q5B1/3P3N/1BP5/P4PPP/R2QR1K1 w - - 9 16 moves g5d2 f8e8 d1f3 d7e6 a1d1 e6b3 a2b3 a5a2 c3c4
go infinite
[d]r3r1k1/pp3ppp/1npb4/8/3P3N/1PP2Q2/q2B1PPP/3RR1K1 w - - 1 20[/d]

This is a position where Qxd2 (a2d2) gains a bishop. The queen cannot be recaptured because of a back-rank mate. This tactic is so simple that it is obvious to me (Elo ~1500).

After 3 minutes of thinking time and over 200K playouts, LCZero had considered the correct move 0 times (!!!).

They need a better search algorithm but they are not looking for one yet.
Indeed, this is a simple 4-ply tactic and I would expect the MCTS to pick it up. Maybe the network thinks Qxd2 is so bad that it never considers this move in the playouts. I have no clue how the LCZero algorithm works exactly, as I never looked at the code, but I have the feeling that many things need improvement before you can even start thinking about reaching the level of Stockfish.
On the other hand, it means that LCZero is already extremely strong positionally if it can reach the 3000 Elo level while being this weak tactically. I have the opposite feeling: an update for better tactical search shouldn't be that hard to find, and therefore SF will be bettered soon.

While totally destroying the relationship between tactics and strategy in chess in terms of value as we knew it.

Who would have thought that something so tactically weak could slaughter a Grandmaster?

jkiliani · Post by **jkiliani** » Sat May 05, 2018 5:55 pm

mar wrote:
AlvaroBegue wrote:They need a better search algorithm but they are not looking for one yet.
Fully agreed. This is THE thing they should focus on now.
Current SF dev should be as strong as A0.

I wonder if something simple and stupid (like dedicating 1 thread to, say, a material-only alpha-beta searcher to guide tree expansion) may work, since net evals are very expensive.

But shouldn't the search converge regardless given more time to think? Your position is 2 plies + qsearch, piece of cake for any AB searcher out there...

I can't imagine A0 beating SF when missing tactics such as this one.

EDIT: the tactic is slightly more complicated than I thought: not 2 plies, but still very easy for AB searchers.

A sufficiently deep neural net with enough training can increasingly resolve tactics without search. You can already see that even current Leela on 1 node (i.e. easy mode on play.lczero.org) can at least see simple tactics such as which pieces are protected and which aren't. If a neural net were incapable of resolving any tactics, this would not be possible. It often misses when tactics get more complicated, yet this tactical ability is already a lot better than it was a few weeks ago, or the pure net could not defeat Stockfish at Skill level 5-6 (which it can now).

AlphaZero used a larger neural net (256x20) and ran a lot more training games on it before the match against Stockfish, so it makes sense that their network had better tactical understanding than current Leela.

Albert Silver · Post by **Albert Silver** » Sat May 05, 2018 6:16 pm

jkiliani wrote:
mar wrote:
AlvaroBegue wrote:They need a better search algorithm but they are not looking for one yet.
Fully agreed. This is THE thing they should focus on now.
Current SF dev should be as strong as A0.

I wonder if something simple and stupid (like dedicating 1 thread to, say, a material-only alpha-beta searcher to guide tree expansion) may work, since net evals are very expensive.

But shouldn't the search converge regardless given more time to think? Your position is 2 plies + qsearch, piece of cake for any AB searcher out there...

I can't imagine A0 beating SF when missing tactics such as this one.

EDIT: the tactic is slightly more complicated than I thought: not 2 plies, but still very easy for AB searchers.
A sufficiently deep neural net with enough training can increasingly resolve tactics without search. You can already see that even current Leela on 1 node (i.e. easy mode on play.lczero.org) can at least see simple tactics such as which pieces are protected and which aren't. If a neural net were incapable of resolving any tactics, this would not be possible. It often misses when tactics get more complicated, yet this tactical ability is already a lot better than it was a few weeks ago, or the pure net could not defeat Stockfish at Skill level 5-6 (which it can now).

AlphaZero used a larger neural net (256x20) and ran a lot more training games on it before the match against Stockfish, so it makes sense that their network had better tactical understanding than current Leela.

80,000 NPS didn't hurt either...

syzygy · Post by **syzygy** » Sat May 05, 2018 6:50 pm

Albert Silver wrote:80,000 NPS didn't hurt either...

But they don't help if the search simply never looks at the key move.

Albert Silver · Post by **Albert Silver** » Sat May 05, 2018 7:14 pm

syzygy wrote:
Albert Silver wrote:80,000 NPS didn't hurt either...
But they don't help if the search simply never looks at the key move.

Assuredly, but you also have to give the network time, since as it trains it will refine its concepts. All I have to do is look at how the engine has improved over the last month, even in just tactics. In any case, one can widen or narrow the move selection directly in the UCI engine settings. It warrants testing, and that's exactly what I'm doing. I'll report when I have some data.
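As a rough illustration of what "widening" the move selection could mean, the sketch below applies a temperature to the policy priors before the search uses them. This is one generic way to do it and not necessarily what any LCZero setting does; the function name `soften_priors`, the move names, and the prior values are all hypothetical.

```python
def soften_priors(priors, temperature):
    """Rescale priors as p ** (1 / temperature) and renormalise to sum to 1.

    temperature > 1 flattens the distribution (wider move selection),
    temperature < 1 sharpens it (narrower move selection).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    powered = {move: p ** (1.0 / temperature) for move, p in priors.items()}
    total = sum(powered.values())
    return {move: p / total for move, p in powered.items()}


# Illustrative priors only, not taken from any real network.
priors = {"quiet_a": 0.7, "quiet_b": 0.2999, "Qxd2": 0.0001}

for temperature in (0.5, 1.0, 2.0):
    adjusted = soften_priors(priors, temperature)
    print(temperature, {move: round(p, 5) for move, p in adjusted.items()})
```

A temperature above 1 raises the share of a move the network almost dismissed, at the cost of spending more playouts on moves that really are bad, which is the trade-off such testing would have to measure.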
