LCZero: Progress and Scaling. Relation to CCRL Elo

AlvaroBegue · Post by **AlvaroBegue** » Sat May 05, 2018 3:36 pm

mar wrote:[...] So tactics seems to be the Achilles heel of Leela, at least at this fast TC.

I have evidence that Leela is horrible at tactics at any time control. I posted a reproducible problem on the LCZero forum, but the responses I got are mostly from fan boys that don't want to see the problem.

https://groups.google.com/forum/#!topic ... q3lg9QV2XQ

Basically you can enter these UCI commands:

Code:

position fen r4rk1/pp1b1ppp/1npb4/q5B1/3P3N/1BP5/P4PPP/R2QR1K1 w - - 9 16 moves g5d2 f8e8 d1f3 d7e6 a1d1 e6b3 a2b3 a5a2 c3c4
go infinite

This is a position where Qxd2 (a2d2) gains a bishop. The queen cannot be recaptured because of a back-rank mate. This tactic is so simple that it is obvious to me (Elo ~1500).

After 3 minutes of thinking time and over 200K playouts, LCZero had considered the correct move 0 times (!!!).

They need a better search algorithm but they are not looking for one yet.

jp · Post by jp » Sat May 05, 2018 3:59 pm

AlvaroBegue wrote: This is a position where Qxd2 (a2d2) gains a bishop.

After 3 minutes of thinking time and over 200K playouts, LCZero had considered the correct move 0 times (!!!).

How many minutes and how many 100Ks of playouts does LC0 need to consider and play Qxd2?

syzygy · Post by **syzygy** » Sat May 05, 2018 4:02 pm

AlvaroBegue wrote:After 3 minutes of thinking time and over 200K playouts, LCZero had considered the correct move 0 times (!!!).

They need a better search algorithm but they are not looking for one yet.

So you really mean:
[d]r3r1k1/pp3ppp/1npb4/8/2PP3N/1P3Q2/q2B1PPP/3RR1K1 b - - 0 20
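
A quick way to confirm that this FEN matches the UCI move list from the original post is to replay the moves on the starting position. The sketch below is illustrative only: the helper names (`parse_placement`, `to_placement`, `replay`) are made up for this example, and it handles only plain from-to moves (no castling, en passant, or promotion), which is all this move list needs.

```python
# Minimal sketch: replay plain coordinate moves on the piece-placement
# field of a FEN string. Castling, en passant and promotion are NOT handled.

FILES = "abcdefgh"

def parse_placement(placement):
    """Turn the first FEN field into a dict mapping squares ('e4') to piece letters."""
    board = {}
    for rank_index, row in enumerate(placement.split("/")):
        rank = 8 - rank_index              # FEN lists rank 8 first
        file_index = 0
        for ch in row:
            if ch.isdigit():
                file_index += int(ch)      # a digit means that many empty squares
            else:
                board[FILES[file_index] + str(rank)] = ch
                file_index += 1
    return board

def to_placement(board):
    """Inverse of parse_placement: rebuild the FEN piece-placement field."""
    rows = []
    for rank in range(8, 0, -1):
        row, empty = "", 0
        for f in FILES:
            piece = board.get(f + str(rank))
            if piece is None:
                empty += 1
            else:
                if empty:
                    row += str(empty)
                    empty = 0
                row += piece
        if empty:
            row += str(empty)
        rows.append(row)
    return "/".join(rows)

def replay(placement, moves):
    """Apply space-separated from-to moves such as 'g5d2' and return the new placement."""
    board = parse_placement(placement)
    for move in moves.split():
        src, dst = move[:2], move[2:4]
        board[dst] = board.pop(src)        # a capture simply overwrites the target square
    return to_placement(board)

START = "r4rk1/pp1b1ppp/1npb4/q5B1/3P3N/1BP5/P4PPP/R2QR1K1"
MOVES = "g5d2 f8e8 d1f3 d7e6 a1d1 e6b3 a2b3 a5a2 c3c4"
print(replay(START, MOVES))
```

It prints `r3r1k1/pp3ppp/1npb4/8/2PP3N/1P3Q2/q2B1PPP/3RR1K1`, the same piece placement as in the diagram above.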

Without a search, it makes sense not to expect Qxd2 to be a good move.

But what I don't get about the so-called MCTS (but without the MC) of Alpha/LC Zero is that move probabilities (which decide what lines to explore) seem to correlate with moves expected to be good. What you want to explore are moves that potentially cast doubt on your initial evaluation of the position.

I also wonder about the supposed applicability of the multi-armed bandit problem. The multi-armed bandit problem is about maximising expected payout. I see no reason (at all) why that would work particularly well for finding tactical shots.

Laskos · Post by **Laskos** » Sat May 05, 2018 4:09 pm

AlvaroBegue wrote:
mar wrote:[...] So tactics seems to be the Achilles heel of Leela, at least at this fast TC.
I have evidence that Leela is horrible at tactics at any time control. I posted a reproducible problem on the LCZero forum, but the responses I got are mostly from fan boys that don't want to see the problem.

https://groups.google.com/forum/#!topic ... q3lg9QV2XQ

Basically you can enter these UCI commands:
Code:
position fen r4rk1/pp1b1ppp/1npb4/q5B1/3P3N/1BP5/P4PPP/R2QR1K1 w - - 9 16 moves g5d2 f8e8 d1f3 d7e6 a1d1 e6b3 a2b3 a5a2 c3c4
go infinite

This is a position where Qxd2 (a2d2) gains a bishop. The queen cannot be recaptured because of a back-rank mate. This tactic is so simple that it is obvious to me (Elo ~1500).

After 3 minutes of thinking time and over 200K playouts, LCZero had considered the correct move 0 times (!!!).

They need a better search algorithm but they are not looking for one yet.

That's very interesting. The position is even easier than the usual WAC positions, which are easy for most reasonable AB engines. I tested WAC (300 easy tactical shots):

1s/position
Fruit 2.1 (2700 CCRL)
294/300

1s/position
Predateur 2.2.1 (1800 CCRL)
272/300

6s/position (to compensate for absence of GPU, 4 CPU threads)
LC0 ID246
119/300

What Elo would ID246 be on that easy super-tactical suite?

This is such a weakness that one could build an exploit, say an AB engine specifically designed to set up easy tactical traps. Maybe A0 had a similar problem, but at a higher level, where such positions occur rarely in ordinary games against an ordinary engine like SF.

Joost Buijs · Post by **Joost Buijs** » Sat May 05, 2018 4:19 pm

AlvaroBegue wrote:
mar wrote:[...] So tactics seems to be the Achilles heel of Leela, at least at this fast TC.
I have evidence that Leela is horrible at tactics at any time control. I posted a reproducible problem on the LCZero forum, but the responses I got are mostly from fan boys that don't want to see the problem.

https://groups.google.com/forum/#!topic ... q3lg9QV2XQ

Basically you can enter these UCI commands:
Code:
position fen r4rk1/pp1b1ppp/1npb4/q5B1/3P3N/1BP5/P4PPP/R2QR1K1 w - - 9 16 moves g5d2 f8e8 d1f3 d7e6 a1d1 e6b3 a2b3 a5a2 c3c4
go infinite

This is a position where Qxd2 (a2d2) gains a bishop. The queen cannot be recaptured because of a back-rank mate. This tactic is so simple that it is obvious to me (Elo ~1500).

After 3 minutes of thinking time and over 200K playouts, LCZero had considered the correct move 0 times (!!!).

They need a better search algorithm but they are not looking for one yet.

Indeed, this is a simple 4-ply tactic. I would expect the MCTS to pick this up; maybe the network thinks Qxd2 is so bad that it never considers the move in the playouts. I have no clue how LCZero's algorithm works exactly, since I never looked at the code, but I have the feeling that many things need improvement before you can even start thinking about reaching the level of Stockfish.

Joost Buijs · Post by **Joost Buijs** » Sat May 05, 2018 4:23 pm

Laskos wrote:
AlvaroBegue wrote:
mar wrote:[...] So tactics seems to be the Achilles heel of Leela, at least at this fast TC.
I have evidence that Leela is horrible at tactics at any time control. I posted a reproducible problem on the LCZero forum, but the responses I got are mostly from fan boys that don't want to see the problem.

https://groups.google.com/forum/#!topic ... q3lg9QV2XQ

Basically you can enter these UCI commands:
Code:
position fen r4rk1/pp1b1ppp/1npb4/q5B1/3P3N/1BP5/P4PPP/R2QR1K1 w - - 9 16 moves g5d2 f8e8 d1f3 d7e6 a1d1 e6b3 a2b3 a5a2 c3c4
go infinite

This is a position where Qxd2 (a2d2) gains a bishop. The queen cannot be recaptured because of a back-rank mate. This tactic is so simple that it is obvious to me (Elo ~1500).

After 3 minutes of thinking time and over 200K playouts, LCZero had considered the correct move 0 times (!!!).

They need a better search algorithm but they are not looking for one yet.
That's very interesting. The position is even easier than the usual WAC positions, which are easy for most reasonable AB engines. I tested WAC (300 easy tactical shots):

1s/position
Fruit 2.1 (2700 CCRL)
294/300

1s/position
Predateur 2.2.1 (1800 CCRL)
272/300

6s/position (to compensate for absence of GPU, 4 CPU threads)
LC0 ID246
119/300

What Elo would ID246 be on that easy super-tactical suite?

This is such a weakness that one could build an exploit, say an AB engine specifically designed to set up easy tactical traps. Maybe A0 had a similar problem, but at a higher level, where such positions occur rarely in ordinary games against an ordinary engine like SF.

119/300 on WAC is really bad; my engine (certainly not a top engine) scores 298/300 with just 1 second per move on a single core.

AdminX · Post by **AdminX** » Sat May 05, 2018 4:25 pm

Joost Buijs wrote:
AlvaroBegue wrote:
mar wrote:[...] So tactics seems to be the Achilles heel of Leela, at least at this fast TC.
I have evidence that Leela is horrible at tactics at any time control. I posted a reproducible problem on the LCZero forum, but the responses I got are mostly from fan boys that don't want to see the problem.

https://groups.google.com/forum/#!topic ... q3lg9QV2XQ

Basically you can enter these UCI commands:
Code:
position fen r4rk1/pp1b1ppp/1npb4/q5B1/3P3N/1BP5/P4PPP/R2QR1K1 w - - 9 16 moves g5d2 f8e8 d1f3 d7e6 a1d1 e6b3 a2b3 a5a2 c3c4
go infinite

This is a position where Qxd2 (a2d2) gains a bishop. The queen cannot be recaptured because of a back-rank mate. This tactic is so simple that it is obvious to me (Elo ~1500).

After 3 minutes of thinking time and over 200K playouts, LCZero had considered the correct move 0 times (!!!).

They need a better search algorithm but they are not looking for one yet.
Indeed, this is a simple 4-ply tactic. I would expect the MCTS to pick this up; maybe the network thinks Qxd2 is so bad that it never considers the move in the playouts. I have no clue how LCZero's algorithm works exactly, since I never looked at the code, but I have the feeling that many things need improvement before you can even start thinking about reaching the level of Stockfish.

Well, isn't the theory that she's supposed to teach herself? Otherwise, what is the purpose of self-play and reinforcement learning? The same might even be said of giving it Syzygy support.

syzygy · Post by **syzygy** » Sat May 05, 2018 4:30 pm

AdminX wrote:Well, isn't the theory that she's supposed to teach herself? Otherwise, what is the purpose of self-play and reinforcement learning?

But there will necessarily be a limit on the tactics that its NN can resolve itself. If the NN incorrectly classifies a tactical move as bad with the result that the search never even looks at it, there is a problem.
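
To make this failure mode concrete, here is a toy model of AlphaZero-style PUCT selection at the root only. It is a sketch under stated assumptions, not LCZero's actual code: the priors, the move values, `c_puct`, and the value given to unvisited moves (`fpu`) are all made-up numbers, and each move simply returns a fixed value instead of running a real subtree search. The selection score used is Q + c_puct * P * sqrt(N_parent) / (1 + n).

```python
import math

def puct_visits(priors, values, simulations, c_puct=1.0, fpu=0.0):
    """Run root-only PUCT selection and return the visit count of each move.

    priors: policy probabilities P(a) from a hypothetical network
    values: fixed value each move returns when visited (stand-in for a subtree)
    fpu:    value assumed for moves that have never been visited (an assumption)
    """
    visits = [0] * len(priors)
    for done in range(simulations):
        sqrt_parent = math.sqrt(1 + done)          # parent visits so far, root included
        best, best_score = 0, -math.inf
        for move, prior in enumerate(priors):
            q = values[move] if visits[move] else fpu
            u = c_puct * prior * sqrt_parent / (1 + visits[move])
            if q + u > best_score:
                best, best_score = move, q + u
        visits[best] += 1
    return visits

def min_playouts_before_first_visit(prior, best_q, c_puct=1.0, fpu=0.0):
    """Lower bound on parent visits before an unvisited move can win selection.

    The unvisited move scores fpu + c_puct * prior * sqrt(N); it cannot be
    picked while that is below the value of an already-visited rival move.
    """
    return ((best_q - fpu) / (c_puct * prior)) ** 2

VALUES = [0.10, 0.05, 0.90]    # the third move is the winning tactic

# Network badly underrates the tactic: prior 0.0001.
overlooked = puct_visits([0.6, 0.3999, 0.0001], VALUES, simulations=20000)
# Network gives the tactic a small but usable prior: 0.01.
noticed = puct_visits([0.6, 0.39, 0.01], VALUES, simulations=20000)

print("visits with prior 0.0001:", overlooked)
print("visits with prior 0.01:  ", noticed)
print("lower bound for prior 0.0001:", min_playouts_before_first_visit(0.0001, 0.10))
```

With a prior of 0.0001 the winning move gets no visits at all in 20,000 simulations, while with a prior of 0.01 it is found early and ends up with the most visits. In this toy model the underrated move cannot win selection before roughly a million parent visits; real numbers depend on the network and on the engine's search parameters.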

AdminX · Post by **AdminX** » Sat May 05, 2018 4:37 pm

syzygy wrote:
AdminX wrote:Well, isn't the theory that she's supposed to teach herself? Otherwise, what is the purpose of self-play and reinforcement learning?
But there will necessarily be a limit on the tactics that its NN can resolve itself. If the NN incorrectly classifies a tactical move as bad with the result that the search never even looks at it, there is a problem.

Sounds like she will need an additional tactical module added to her system.

peter · Post by **peter** » Sat May 05, 2018 4:39 pm

Hi Kai!

Laskos wrote:What Elo would ID246 be on that easy super-tactical suite?

This is such a weakness that one could build an exploit, say an AB engine specifically designed to set up easy tactical traps. Maybe A0 had a similar problem, but at a higher level, where such positions occur rarely in ordinary games against an ordinary engine like SF.

I wrote about network 240 here
http://www.talkchess.com/forum/viewtopi ... 43&t=66945
and don't want to repeat it all over again for 246.

And I don't like Elo ratings derived from single test suites; there have been too many different Elo ratings around in computer chess for a long time anyhow.

Still, I'm really happy that you are starting to see some of the points I have been writing about since A0 appeared.

It might simply be that the gap between Celo (computer Elo) measured in bookless engine-engine matches (or as good as bookless, given run-of-the-mill test books or opening test sets) and results from testing on positions further away from the initial position is bigger here than it is among AB engines alone.

No wonder, either: AB engines are (at least at the top of the rating lists) much more similar to each other than they are to AI engines like A0 or LC0, with their purist Zero concept of self-learning.
