LCZero: Progress and Scaling. Relation to CCRL Elo

Robert Pope
Posts: 558
Joined: Sat Mar 25, 2006 8:27 pm

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Robert Pope »

duncan wrote: So is LCZero 'meant' to make Stockfish obsolete by 2019, or is this hype?
Hype. It's meant to validate that a completely self-learning chess player is viable, and to see how far that approach will take us.

It may stall out well below Stockfish, or it may surpass it in at least some aspects. The big thing is showing that something other than alpha-beta and an endlessly hand-crafted/tuned evaluation function can work as an approach.
MonteCarlo
Posts: 188
Joined: Sun Dec 25, 2016 4:59 pm

Post by MonteCarlo »

Just to be clear, Leela does not REQUIRE different hardware than normal engines.

It's just that as the network gets bigger, the performance penalty of running a CPU-only version will grow.

For now, though, if you have a low-to-middle-end GPU, you might actually get better performance from the CPU-only version; the network's small enough right now that the speed gap between a mid-range GPU and a decent CPU core can be overcome by throwing more cores at it.

Alas, that will change eventually, and then people like me who don't have GPUs will have to live with much slower searches :)
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Post by Milos »

Dann Corbit wrote:At some point, the power of tensor arrays will also apply to ordinary computer chips as well, once some smart people figure out how to use them.
Dann really man, it doesn't hurt to get to know things a bit, especially when they are soooo simple.
NN evaluation is all about one thing and one thing only: the dot product. That has existed for 10 years as DPPS, and more recently, in Haswell and later, as an AVX256 instruction.
There is nothing more to figure out. If you want more performance, you need more ALUs that can execute multiple DPPS instructions in parallel, or ALUs with longer operands, like AVX512/1024.
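
To make the dot-product point concrete, here is a minimal sketch in plain Python of one fully connected layer. The weights, biases and inputs are made-up numbers, and the ReLU activation is an assumption chosen for illustration. Leela's actual network is convolutional (the logs in this thread report 64 channels and 6 residual blocks), but a convolution also reduces to many dot products of this kind.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def dense_layer(weights, bias, inputs):
    """One layer: each output is relu(dot(weight_row, inputs) + bias)."""
    return [max(0.0, dot(row, inputs) + b) for row, b in zip(weights, bias)]


# Hypothetical numbers, chosen only so the arithmetic is easy to follow.
weights = [[1.0, -2.0, 0.5],
           [0.0, 1.0, 1.0]]
bias = [0.5, -1.0]
inputs = [2.0, 1.0, 4.0]

outputs = dense_layer(weights, bias, inputs)
print(outputs)  # [2.5, 4.0]
```

Every output costs one dot product, so throughput scales with how many multiply-adds the hardware can do in parallel, which is the point about wider SIMD units and more ALUs.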
Uri Blass
Posts: 10267
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Post by Uri Blass »

MonteCarlo wrote:Indeed. Leela's output is an expected score, not actually a win%.

The wording on the site has since been changed, it seems, although now it includes this "50%=draw" bit in the legend, which caused some debate in the discord.

Probably should just say "50%=equal chances" or some such thing ("expected score" is pretty self-explanatory, so could probably do away with the legend altogether), but not a big deal. :)

Last net to pass is actually fairly reasonable now. It'll be interesting to see where we are a week from now (it's not even been a week since the last big bug was fixed).
Something is clearly wrong with the probability.

After 1.f3 e5 2.g4 I get the right move, but also the following message:
Leela thinks her expected score is 97.17%.

How is it possible that it is not 100%?
MonteCarlo
Posts: 188
Joined: Sun Dec 25, 2016 4:59 pm

Post by MonteCarlo »

Probably that "winrate" (expected score) is the average at the root; I'd have to look at the code to be sure, but that (or something similar) is almost certainly the case.

It knows that Qh4 is 100%:

Code:

Using 2 thread(s).
Generated 1924 moves
Detecting residual layers...v1...64 channels...6 blocks.
BLAS Core: Haswell
position startpos moves f2f3 e7e5 g2g4
go
info depth 6 nodes 10 nps 1000 score cp 79 winrate 70.48% time 8 pv d8h4
info depth 8 nodes 40 nps 2167 score cp 191 winrate 89.13% time 17 pv d8h4
info depth 9 nodes 85 nps 3000 score cp 238 winrate 93.24% time 27 pv d8h4

d8h4 ->      93 (V: 100.00%) (N: 21.60%) PV: d8h4
d7d5 ->       7 (V: 58.17%) (N: 46.13%) PV: d7d5 e2e3 d8h4 e1e2
h7h5 ->       1 (V: 57.52%) (N:  5.71%) PV: h7h5 g4h5
b8c6 ->       0 (V: 58.08%) (N:  4.87%) PV: b8c6
g8f6 ->       0 (V: 58.08%) (N:  3.00%) PV: g8f6
f8c5 ->       0 (V: 58.08%) (N:  2.43%) PV: f8c5
h7h6 ->       0 (V: 58.08%) (N:  2.22%) PV: h7h6
c7c6 ->       0 (V: 58.08%) (N:  1.87%) PV: c7c6
f8e7 ->       0 (V: 58.08%) (N:  1.69%) PV: f8e7
g8e7 ->       0 (V: 58.08%) (N:  1.27%) PV: g8e7
d7d6 ->       0 (V: 58.08%) (N:  1.15%) PV: d7d6
a7a6 ->       0 (V: 58.08%) (N:  1.02%) PV: a7a6
a7a5 ->       0 (V: 58.08%) (N:  0.97%) PV: a7a5
g7g6 ->       0 (V: 58.08%) (N:  0.77%) PV: g7g6
c7c5 ->       0 (V: 58.08%) (N:  0.70%) PV: c7c5
f8d6 ->       0 (V: 58.08%) (N:  0.48%) PV: f8d6
b8a6 ->       0 (V: 58.08%) (N:  0.48%) PV: b8a6
g8h6 ->       0 (V: 58.08%) (N:  0.41%) PV: g8h6
b7b5 ->       0 (V: 58.08%) (N:  0.40%) PV: b7b5
g7g5 ->       0 (V: 58.08%) (N:  0.39%) PV: g7g5
b7b6 ->       0 (V: 58.08%) (N:  0.35%) PV: b7b6
f7f5 ->       0 (V: 58.08%) (N:  0.35%) PV: f7f5
e5e4 ->       0 (V: 58.08%) (N:  0.32%) PV: e5e4
d8e7 ->       0 (V: 58.08%) (N:  0.30%) PV: d8e7
f7f6 ->       0 (V: 58.08%) (N:  0.28%) PV: f7f6
e8e7 ->       0 (V: 58.08%) (N:  0.23%) PV: e8e7
d8f6 ->       0 (V: 58.08%) (N:  0.18%) PV: d8f6
d8g5 ->       0 (V: 58.08%) (N:  0.17%) PV: d8g5
f8b4 ->       0 (V: 58.08%) (N:  0.17%) PV: f8b4
f8a3 ->       0 (V: 58.08%) (N:  0.09%) PV: f8a3

info depth 9 nodes 102 nps 439 score cp 296 winrate 96.30% time 229 pv d8h4
bestmove d8h4
Note the 100% eval for Qh4 in the move list. The longer the search is, the more visits Qh4 gets compared to everything else, and the closer the winrate in the final output gets to 100%.

Code:

position startpos moves f2f3 e7e5 g2g4
go movetime 60000
info depth 6 nodes 2 nps 1000 score cp 29 winrate 58.13% time 0 pv d7d5 d2d3
info depth 9 nodes 83 nps 20500 score cp 230 winrate 92.62% time 3 pv d8h4
info depth 10 nodes 254 nps 13316 score cp 303 winrate 96.56% time 18 pv d8h4
info depth 11 nodes 436 nps 10357 score cp 318 winrate 97.06% time 41 pv d8h4
info depth 12 nodes 1167 nps 14395 score cp 366 winrate 98.24% time 80 pv d8h4
info depth 13 nodes 2950 nps 20199 score cp 408 winrate 98.89% time 145 pv d8h4
info depth 14 nodes 8138 nps 29165 score cp 452 winrate 99.31% time 278 pv d8h4
info depth 15 nodes 22636 nps 45090 score cp 496 winrate 99.57% time 501 pv d8h4
info depth 16 nodes 69834 nps 75659 score cp 546 winrate 99.76% time 922 pv d8h4
info depth 17 nodes 211877 nps 123184 score cp 596 winrate 99.86% time 1719 pv d8h4
info depth 18 nodes 631465 nps 192931 score cp 645 winrate 99.92% time 3272 pv d8h4
info depth 19 nodes 1956793 nps 302347 score cp 696 winrate 99.95% time 6471 pv d8h4
info depth 20 nodes 6045328 nps 435197 score cp 747 winrate 99.97% time 13890 pv d8h4
info depth 21 nodes 19163934 nps 572024 score cp 799 winrate 99.98% time 33501 pv d8h4

d8h4 -> 38121480 (V: 100.00%) (N: 21.60%) PV: d8h4
d7d5 ->    6558 (V: 63.09%) (N: 46.13%) PV: d7d5 e2e3 d8h4 e1e2 h7h5 g4h5 h4h5 e2e1 g8f6 d2d4 b8c6 b1c3
h7h5 ->     885 (V: 66.13%) (N:  5.71%) PV: h7h5 g4g5 d8g5 d2d4 g5h4 e1d2 e5d4 c2c3 h4f2 g1h3
b8c6 ->     692 (V: 63.13%) (N:  4.87%) PV: b8c6 e2e3 d7d5 d2d4 d8h4 e1e2 e5d4 e3d4 h7h5 g4g5
g8f6 ->     381 (V: 58.82%) (N:  3.00%) PV: g8f6 g4g5 f6h5 e2e3 d8g5 d2d4 g5h4 e1d2
f8c5 ->     319 (V: 60.11%) (N:  2.43%) PV: f8c5 e2e3 d8h4 e1e2 d7d5 d2d4 e5d4 e3d4 c5b6 b1c3 b8c6
h7h6 ->     301 (V: 61.45%) (N:  2.22%) PV: h7h6 e2e4 d8h4 e1e2 f8c5 d2d4 e5d4 c2c3 d4c3 b1c3
c7c6 ->     249 (V: 60.72%) (N:  1.87%) PV: c7c6 e2e4 d7d5 b1c3 d8h4 e1e2 h4d8 e2e1
f8e7 ->     229 (V: 61.48%) (N:  1.69%) PV: f8e7 e2e4 d7d5 e4d5 e7h4 e1e2 d8d5 b1c3
d7d6 ->     161 (V: 62.49%) (N:  1.15%) PV: d7d6 e2e4 d8h4 e1e2 h7h5 g4h5 b8c6 b1c3
g8e7 ->     159 (V: 58.18%) (N:  1.27%) PV: g8e7 e2e4 d7d5 b1c3 d5d4 c3e2 e7g6 d2d3
a7a6 ->     138 (V: 61.08%) (N:  1.02%) PV: a7a6 e2e4 d8h4 e1e2 h4d8 e2e1 d7d5 e4d5
a7a5 ->     134 (V: 62.06%) (N:  0.97%) PV: a7a5 e2e4 d8h4 e1e2 h4d8 e2e1 d7d5 e4d5
g7g6 ->     101 (V: 59.92%) (N:  0.77%) PV: g7g6 e2e4 d8h4 e1e2 b8c6 b1c3 f8c5 d1e1
c7c5 ->      91 (V: 60.02%) (N:  0.70%) PV: c7c5 e2e4 d8h4 e1e2 b8c6 d2d3 h4d8 e2f2
f8d6 ->      67 (V: 62.55%) (N:  0.48%) PV: f8d6 e2e4 d8h4 e1e2 b8c6 d2d3 d6c5 c1e3
b8a6 ->      65 (V: 61.47%) (N:  0.48%) PV: b8a6 e2e4 d8h4 e1e2 h7h5 d2d4 e5d4
b7b5 ->      52 (V: 59.24%) (N:  0.40%) PV: b7b5 e2e4 d8h4 e1e2 h4d8 e2e1 d8h4
g8h6 ->      51 (V: 58.15%) (N:  0.41%) PV: g8h6 e2e4 d7d5 e4d5 c7c6 d5c6
f7f5 ->      50 (V: 63.71%) (N:  0.35%) PV: f7f5 e2e4 d8h4 e1e2 f5g4 b1c3 f8c5
e5e4 ->      47 (V: 63.89%) (N:  0.32%) PV: e5e4 f1g2 d7d5 d2d3 e4d3 c2d3
b7b6 ->      47 (V: 61.17%) (N:  0.35%) PV: b7b6 e2e4 d8h4 e1e2 c8a6 d2d3 f8c5
g7g5 ->      44 (V: 54.49%) (N:  0.39%) PV: g7g5 d2d4 e5d4 d1d4 f7f6 b1c3 b8c6 d4d1
d8e7 ->      39 (V: 59.76%) (N:  0.30%) PV: d8e7 e2e4 e7h4 e1e2 b8c6 b1c3 f8c5
f7f6 ->      32 (V: 54.52%) (N:  0.28%) PV: f7f6 e2e4 d7d5 e4d5 d8d5 b1c3
e8e7 ->      24 (V: 52.33%) (N:  0.23%) PV: e8e7 d2d4 e5d4 d1d4 e7e8 d4d1 d7d5
d8f6 ->      23 (V: 60.27%) (N:  0.18%) PV: d8f6 e2e4 f6h4 e1e2 d7d5
f8b4 ->      21 (V: 58.58%) (N:  0.17%) PV: f8b4 c2c3 b4e7 d2d4 e5d4 c3d4 d7d5
d8g5 ->      20 (V: 54.08%) (N:  0.17%) PV: d8g5 d2d4 g5h4 e1d2 e5d4 c2c3 d4c3
f8a3 ->      16 (V: 71.33%) (N:  0.09%) PV: f8a3 b1a3 d8h4

info depth 21 nodes 38132477 nps 634262 score cp 831 winrate 99.99% time 60120 pv d8h4
bestmove d8h4
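
A minimal sketch of that idea, assuming the root figure is simply a visit-weighted average of the child values (I have not checked the code, so this is a guess at the mechanism, not the actual formula). The visit counts and V values are the top three rows of the two move lists above; the remaining rows are left out for brevity.

```python
def root_expected_score(children):
    """Visit-weighted average of child values.

    children: list of (visits, value) pairs, value in [0, 1].
    """
    total_visits = sum(n for n, _ in children)
    return sum(n * v for n, v in children) / total_visits


# (visits, V) for d8h4, d7d5, h7h5 in the short search above.
short_search = [(93, 1.0), (7, 0.5817), (1, 0.5752)]
# The same three moves after "go movetime 60000".
long_search = [(38121480, 1.0), (6558, 0.6309), (885, 0.6613)]

print(f"{root_expected_score(short_search):.2%}")
print(f"{root_expected_score(long_search):.2%}")
```

The short search lands in the mid-90s, in the same neighborhood as the 96.30% the engine printed but not identical to it, so the real calculation presumably differs in detail. The long search comes out just under 100%, because the few thousand visits spent on other moves are swamped by the 38 million visits to Qh4.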
Cheers!
Uri Blass
Posts: 10267
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Post by Uri Blass »

Thanks.
I do not understand why the 100% eval for Qh4 is not shown as the expected score in the output.

I also read the following:

info depth 6 nodes 2 nps 1000 score cp 29 winrate 58.13% time 0 pv d7d5 d2d3

I have no problem with the move being wrong, but
how does it get a winrate of 58.13% with only 2 nodes?
MonteCarlo
Posts: 188
Joined: Sun Dec 25, 2016 4:59 pm

Post by MonteCarlo »

Leela does not do playouts to the end of the game and then tabulate results, so the expected scores returned don't have to somehow be evenly divisible into the number of playouts/visits.

Even if the MCTS only searches one line of play, and that line of play is only one ply deep, the NN will return an expected score; that expected score can be anything between 0 and 1.
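
A small sketch of the difference. With playouts tallied as win/draw/loss, two playouts can only produce a handful of scores; a value network can return anything. The mapping from a value in [-1, 1] to an expected score is an assumption here (it is the usual AlphaZero-style convention, not something taken from Leela's code), and the function names are hypothetical.

```python
from fractions import Fraction


def playout_scores(num_playouts):
    """All expected scores reachable by tallying win=1, draw=1/2, loss=0."""
    return sorted({Fraction(k, 2 * num_playouts)
                   for k in range(2 * num_playouts + 1)})


def value_to_expected_score(v):
    """Map an assumed value-head output v in [-1, 1] to a score in [0, 1]."""
    return (v + 1.0) / 2.0


# With 2 playouts only 0, 1/4, 1/2, 3/4 and 1 are possible.
print([str(s) for s in playout_scores(2)])

# A network output of 0.1626 (a made-up number) gives 58.13% directly.
print(f"{value_to_expected_score(0.1626):.2%}")
```

So a 58.13% winrate after 2 nodes is not a sign of anything going wrong; it is just the network's own estimate for the positions it has looked at so far.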

Cheers!
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Post by Laskos »

MonteCarlo wrote:Thanks for the update Kai!

On the one hand, it's quite possible that a fundamental change to its MCTS implementation will be required at some point if it wants to compete at the highest level, and the work Daniel Shawul has done with Scorpio could prove quite useful in that case (well, it's fantastic work in any case; it's just in this case that it would benefit LC0 :) ).

On the other hand, unless you subscribe to some form of conspiracy theory around the A0 results, we're nowhere near the limits of this sort of approach, so I wouldn't worry too much about that just yet.

Right now there are still a bunch of bugs being worked out, the network is still rather small, and the project is rather young (barely a month old, and it's barely been a week since the last major bug was discovered and fixed).

Some patience is required. It might turn out that switching to a new implementation of MCTS is required; it might also turn out that the NN at some level gives good enough prior probabilities for moves that even MCTS with averaging is good tactically.

We'll just have to give it some time :)
First, I replayed the gauntlets (40 games each) against another pool of engines that are even more stable, stronger, and well tested on CCRL: GreKo 6.5 (2336 CCRL Elo) and Cheese 1.2 (2330 CCRL Elo).

LC0 ID83 CPU 4 threads i7 Haswell.

At 1s/move:
10.0/40
Performance 2140 CCRL Elo points

At 10s/move:
17.0/40
Performance 2280 CCRL Elo points

So, previous results are confirmed, and the scaling seems good.
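
For reference, these performance figures can be reproduced with a few lines of Python. This is a sketch under two assumptions that are not stated above: the standard logistic Elo model, and using the plain average of the two opponents' CCRL ratings as the opposition strength.

```python
import math


def performance_rating(score, games, opponent_avg):
    """Elo performance under the logistic model.

    Only defined for 0 < score < games.
    """
    p = score / games
    return opponent_avg - 400.0 * math.log10(1.0 / p - 1.0)


# Average of GreKo 6.5 (2336) and Cheese 1.2 (2330).
opponents = (2336 + 2330) / 2

print(round(performance_rating(10.0, 40, opponents)))  # 1s/move
print(round(performance_rating(17.0, 40, opponents)))  # 10s/move
```

This gives roughly 2142 and 2280, which matches the quoted 2140 and 2280 after rounding. With only 40 games per time control the error bars are wide, so the gain of about 140 Elo from 10x the time is indicative rather than precise.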

Then I had a look at the 10s/move games. The first observation is that as soon as it gets tactical, LC0 blunders more often. When the positions are quiet and progress is slow, LC0 seems to outplay even this pretty strong opposition. In tactical endgames, LC0 might blunder grossly. Here is an example:

FEN: 3R2rk/4P3/4Q1p1/p6p/6K1/2q5/8/8 w - - 0 136

LC0 is White. GreKo 6.5 has just played 135...h5+, but the position is completely won for LC0. Analysis by Stockfish 9:

Code:

136.Kf4 Qc1+ 137.Qe3 Qc4+ 138.Qd4+ Qxd4+ 139.Rxd4 Kg7 140.Rd8 g5+ 141.Kf5 Re8 142.Rxe8 Kf7 143.Rg8 Kxe7 144.Rxg5 Kd6 145.Kf4 a4 146.Rxh5 Kc7 147.Ra5 Kc6 148.Rxa4 Kc5 149.Ra5+ Kd4 150.Kf3 Kc4 151.Ke4 Kb4 152.Re5 Kc3 153.Ke3 Kc4 154.Rh5 Kc3 155.Rc5+ Kb4 156.Kd4 Kb3 157.Kd3 Kb4 158.Re5 Kb3 159.Rb5+ Ka4 160.Kc4 Ka3 161.Kc3 Ka2 162.Ra5+ Kb1 163.Ra7 Kc1 164.Ra1# 
White mates: +- (#29)  Depth: 49/58   00:00:22  270MN, tb=5623189
SF9 needs less than 0.01s and fewer than 10,000 nodes to find the correct move, Kf4 (one of only 3 legal moves), with a large advantage for White; it then sees the mate for White within a second or so.

LC0 here blundered and lost the game quickly, at 10s/move. It played Kh4, which is... Mate in 4 for Black! Out of only 3 legal moves.

I then let LC0 (ID83) analyze the position on 4 threads for 60 seconds:

Code:

lczero.exe -n -w latest.txt -p 0 --noponder --threads 4
Using 4 thread(s).
Generated 1924 moves
Detecting residual layers...v1...64 channels...6 blocks.
BLAS Core: Haswell

position fen 3R2rk/4P3/4Q1p1/p6p/6K1/2q5/8/8 w - - 0 136
go movetime 60000

info depth 7 nodes 3 nps 154 score cp -146 winrate 16.67% time 12 pv g4h4 g8d8
info depth 9 nodes 11 nps 476 score cp -68 winrate 32.01% time 20 pv g4g5 c3g7 d8g8
info depth 10 nodes 23 nps 759 score cp -11 winrate 46.85% time 28 pv g4g5 c3g3 e6g4 h5g4
info depth 11 nodes 31 nps 789 score cp 13 winrate 53.70% time 37 pv g4g5 c3g3 g5f6 g8d8 e7d8q
info depth 12 nodes 55 nps 1000 score cp 31 winrate 58.57% time 53 pv g4g5 c3g3 g5f6 g3f4 e6f5 f4f5
info depth 13 nodes 101 nps 1163 score cp 68 winrate 67.98% time 85 pv g4f4 g8d8 e7d8q h8g7 d8g8 g7h6 e6g6
info depth 14 nodes 185 nps 1203 score cp 110 winrate 77.06% time 152 pv g4f4 g8d8 e7d8q h8g7 d8g8 g7h6 e6g6
info depth 15 nodes 356 nps 1286 score cp 122 winrate 79.41% time 275 pv g4f4 c3c7 f4g5 g8d8 e7d8q c7d8 g5g6 d8g8 g6h5 g8e6
info depth 16 nodes 709 nps 1487 score cp 154 winrate 84.56% time 475 pv g4h4 c3g7 d8g8 g7g8 e6g8 h8g8 e7e8q g8g7 e8e7 g7h6 e7f6 a5a4 f6a6
info depth 17 nodes 1279 nps 1555 score cp 180 winrate 87.95% time 821 pv g4h4 c3g7 d8g8 g7g8 e6g8 h8g8 e7e8q g8g7 e8e7 g7h6 e7f6 h6h7 h4g5
info depth 18 nodes 2428 nps 1580 score cp 193 winrate 89.39% time 1535 pv g4h4 c3b4 h4g5 b4g4 e6g4 h5g4 d8g8 h8g8 e7e8q g8g7 e8e7 g7g8 g5g6 g4g3 e7e8
info depth 19 nodes 5262 nps 1931 score cp 226 winrate 92.32% time 2723 pv g4h4 c3b4 h4g5 b4g4 e6g4 h5g4 d8g8 h8g8 e7e8q g8h7 e8f7 h7h8 g5g6 g4g3 f7h7
info depth 20 nodes 9967 nps 2236 score cp 241 winrate 93.44% time 4456 pv g4h4 c3b4 h4g5 b4g4 e6g4 h5g4 d8g8 h8g8 e7e8q g8h7 e8e7 h7g8 g5g6 g4g3 e7e8
info depth 21 nodes 19755 nps 2652 score cp 261 winrate 94.68% time 7448 pv g4h4 c3g7 d8g8 g7g8 e7e8q g8e8 e6e8 h8g7 h4g5 h5h4 e8e7 g7g8 g5g6 h4h3 e7e8
info depth 22 nodes 38930 nps 3225 score cp 275 winrate 95.37% time 12071 pv g4h4 c3g7 d8g8 g7g8 e7e8r g8e8 e6e8 h8g7 h4g5 h5h4 e8e7 g7g8 g5g6 h4h3 e7e8
info depth 23 nodes 63572 nps 3072 score cp 259 winrate 94.55% time 20693 pv g4h4 c3b4 h4g3 b4c3 g3h4 c3g7 d8g8 g7g8 e7e8q g8e8 e6e8 h8g7 e8e7 g7h6 e7f6 a5a4 f6h8
info depth 24 nodes 107476 nps 3275 score cp 260 winrate 94.60% time 32814 pv g4h4 c3b4 h4g3 b4g4 e6g4 h5g4 d8g8 h8g8 e7e8q g8g7 g3g4 g7f6 g4f4 g6g5 f4g4 a5a4 e8a4 f6e6 a4a6 e6e5
info depth 25 nodes 192280 nps 4112 score cp 261 winrate 94.66% time 46761 pv g4h4 c3b4 h4g3 b4a3 g3h2 a3b2 h2h3 b2g7 d8g8 g7g8 e7e8r g8e8 e6e8 h8g7 e8e7 g7h6 e7a7 h6g5 a7a5 g5f4 a5e1 g6g5 e1f1

g4h4 ->  268306 (V: 94.82%) (N: 17.28%) PV: g4h4 c3b4 h4g3 b4a3 g3h2 a3b2 h2h3 b2g7 d8g8 g7g8 e7e8r g8e8 e6e8 h8g7 e8e7 g7h6 e7a7 h6g5 a7a5 g5f4 a5e1 g6g5 e1f1 f4e3
g4f4 ->    3491 (V: 89.91%) (N: 38.80%) PV: g4f4 c3c7 e6e5 c7e5 f4e5 h5h4 d8g8 h8g8 e5f6 h4h3 e7e8q g8h7 e8g6 h7h8 g6h6 h8g8 h6h3 a5a4 h3a3 g8h7 a3a4 h7h8 a4h4
g4g5 ->     231 (V: 10.77%) (N: 43.91%) PV: g4g5 c3g3 g5f6 g3f4 e6f5 f4f5



info depth 25 nodes 272029 nps 4536 score cp 262 winrate 94.68% time 59967 pv g4h4 c3b4 h4g3 b4a3 g3h2 a3b2 h2h3 b2g7 d8g8 g7g8 e7e8r g8e8 e6e8 h8g7 e8e7 g7h6 e7a7 h6g5 a7a5 g5f4 a5e1 g6g5 e1f1 f4e3
bestmove g4h4
Only after some 1 million playouts does it switch to the winning Kf4. I don't know how these MCTS rollouts work here, but my experience with MCTS Go engines, especially the newer Leela or Crazy Stone, is different. Tactically, this blunder (from a won position to -M4) is equivalent to losing an elementary ladder in Go, something that has not happened with strong MCTS Go engines for years.
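
One way to picture how "MCTS with averaging" can hide a forced loss is the sketch below. All numbers are hypothetical and only illustrate the mechanism: after a candidate move, the opponent has three replies, and one rarely visited reply refutes the move outright. Values are from the point of view of the side that played the candidate move.

```python
def averaged_value(replies):
    """Visit-weighted average over the opponent's replies (MCTS-style backup)."""
    total_visits = sum(n for n, _ in replies)
    return sum(n * v for n, v in replies) / total_visits


def minimax_value(replies):
    """Worst case over the opponent's replies (alpha-beta-style backup)."""
    return min(v for _, v in replies)


# (visits, value) per reply; made-up numbers. The last reply is the
# refutation, but it has received only 1% of the visits.
replies = [(950, 0.95), (40, 0.90), (10, 0.0)]

print(averaged_value(replies))  # stays high: the refutation is averaged away
print(minimax_value(replies))   # 0.0: one refutation is enough
```

With averaging, the move keeps looking good until the refuting reply collects a large share of the visits, which fits the search above holding on to Kh4 for so long.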
megamau
Posts: 37
Joined: Wed Feb 10, 2016 6:20 am
Location: Singapore

Post by megamau »

Laskos wrote: Only after some 1 million playouts does it switch to the winning Kf4. I don't know how these MCTS rollouts work here, but my experience with MCTS Go engines, especially the newer Leela or Crazy Stone, is different. Tactically, this blunder (from a won position to -M4) is equivalent to losing an elementary ladder in Go, something that has not happened with strong MCTS Go engines for years.
Actually, "zero" NN engines in Go are known for exactly this. They can become very strong (up to 5-6 dan amateur and even above) while still struggling with relatively simple ladders.
Leela Zero had issues with ladders for the whole duration of the training.
They just seem to be even more "human-like" than humans... great strategically and poor tactically.
MonteCarlo
Posts: 188
Joined: Sun Dec 25, 2016 4:59 pm

Post by MonteCarlo »

Thanks for the update and that position, Kai!

The biggest problem in that position is just that Black's mate (in the line 136.Kh4 g5+ 137.Kxh5 Qf3+ 138.Kh6 Qh1+ 139.Qh3 Qxh3#) consists of a series of moves that are each the only winning move, and the net assigns all of them but the last a very low probability of being played (4%, 4.5%, and 5%), so the line just doesn't really get explored.

Combined with the fact that even after those 3 moves have been played, the raw net gives 139.Qh3+ an expected score of 55%, we can really see that the net has a lot of room for improvement :)

It's essentially the same as a case of bad pruning in a traditional engine; it's ordering some moves far too low on the list, and as a result is missing a critical line (it's just this example is much easier than the ones we're used to seeing from traditional engines, because the line is so shallow).

It's also similar to those traditional cases in that once 136... g5+ is forced, it takes only 3 seconds on a single thread on my slow laptop for it to find the loss.

If the net were even slightly better with its assignment of probabilities to any of g5+, Qf3+, and Qh1+, it would find this quickly.
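
To illustrate how a low prior starves a move of visits, here is a toy simulation assuming an AlphaZero-style PUCT selection rule. The exploration constant, the values and the two-move setup are all hypothetical; only the 4% prior is taken from the position above. Each move gets a fixed value, standing in for a net that does not yet see the mate behind g5+.

```python
import math


def simulate(children, budget, c_puct=1.0):
    """Distribute `budget` visits with a PUCT-style rule.

    children: name -> (prior, value), value fixed and seen from the mover's side.
    Returns name -> visit count.
    """
    visits = {name: 0 for name in children}
    for _ in range(budget):
        parent_visits = sum(visits.values()) + 1

        def score(name):
            prior, value = children[name]
            q = value if visits[name] else 0.0  # unvisited moves start at 0
            u = c_puct * prior * math.sqrt(parent_visits) / (1 + visits[name])
            return q + u

        best = max(children, key=score)
        visits[best] += 1
    return visits


# "other" lumps together every reply except g5+ (hypothetical values).
children = {"other": (0.96, 0.10), "g5+": (0.04, 0.05)}
print(simulate(children, 200))
```

In this toy setup g5+ ends up with only a handful of the 200 visits. In the real position the same thing has to happen three plies in a row (g5+, Qf3+, Qh1+) before the mate is seen, which is why even a slightly higher prior on any one of them would help a lot.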

We'll just have to see how it develops. It might be that the net eventually learns to do really well on move ordering so that these examples go away, or it might not. That latter case is when we'd have to start considering making changes to the search.

Even having said all of that, there are still some known bugs that haven't yet been removed (it's actually kind of funny: the net has seemingly learned to correct for them, but presumably at some cost to what else it's able to represent).

Time will tell :)