Future plans of Leela,(reset of the current one)

Nay Lin Tun · Post by **Nay Lin Tun** » Fri Aug 31, 2018 5:52 am

(copy from Leela blog)

As planned, we have concluded our test10 run, and now it is time for another one.
Test10 was undoubtedly a success, but it has reached its limit. The vote on Discord showed that the community wanted the reset as soon as possible, and that's what we did.

We used to keep network identifiers in line with test numbers (e.g. test5 had network ids 5xx), but test10 produced so many networks that its ids overflowed into the 11xxx range, so the next test is called test20.

It is expected that, at the current game rate, it will take 6-7 weeks for test20 to become stronger than the latest networks from test10.

Changes

What didn't change

Before describing what's new in the next run, let me list what we promised but is not there:
Weights quantization is not enabled.
It is implemented, but we didn't test it enough to confirm that it doesn't lead to weaker nets.
SWA (Stochastic Weight Averaging).
The implementation turned out to be too slow; optimizations are needed.
Training multiple networks in parallel.
With the frequent training that we plan, the training pipeline would not be able to keep up.
There are plans to employ several GPUs during training, but that's not implemented yet.
It's not main2, but rather test20.
It's running on the test server, but at least we updated the server version.

What did change

And now, how test20 will be different from test10:
Cpuct will be equal to 5
That's the value that DeepMind used in AlphaGo (they did not mention the value of Cpuct in the AlphaGo Zero and AlphaZero papers).
It is expected that this will make Leela better in tactics, and will add more variance to openings.
Rule50 bug fixed.
Leela will be able to use information about the number of moves since the last capture or pawn move.
Cache history bug fixed.
We recently found a bug where a different transposition of the same position could be served from the NN cache, even though the NN can return a different output depending on the move history. That was fixed.
Better resign threshold handling.
We'll track the eval value at which a resignation is correct 95% of the time and adjust the threshold dynamically.
Frequent network generation, ~40 networks per day.
Test10 started with only ~4 networks per day.
Larger batch size in the training pipeline.
This is closer to what DeepMind did for AlphaZero and should reduce overfitting.
Ghost Batch Normalization from start
(I don't really know what it is). Also closer to what DeepMind did and also prevents overfitting.
En passant + threefold repetition bug is fixed.
This was a minor bug which probably won't have much effect. After a pawn moved two squares, the resulting position was never counted towards threefold repetition.
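
To make the Cpuct change more concrete, here is a minimal sketch of PUCT move selection in the form published for AlphaGo Zero (score = Q + Cpuct * P * sqrt(N_parent) / (1 + N_child)). It is not lc0's code: the names `Child`, `puct_score` and `select` are hypothetical, the numbers are made up, and details such as how unvisited moves are valued are simplifying assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Child:
    name: str         # move label, for illustration only
    prior: float      # policy prior P from the network
    visits: int       # visit count N of this move
    value_sum: float  # sum of backed-up values, so Q = value_sum / visits

    @property
    def q(self) -> float:
        # Assumption: unvisited moves get Q = 0; engines may use other defaults.
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(child: Child, parent_visits: int, cpuct: float) -> float:
    # Exploration bonus grows with the prior and shrinks as the move is visited.
    u = cpuct * child.prior * math.sqrt(parent_visits) / (1 + child.visits)
    return child.q + u


def select(children: list, cpuct: float) -> Child:
    parent_visits = sum(c.visits for c in children)
    return max(children, key=lambda c: puct_score(c, parent_visits, cpuct))


children = [
    Child("main", prior=0.6, visits=90, value_sum=45.0),  # Q = 0.5, heavily searched
    Child("side", prior=0.1, visits=5, value_sum=1.5),    # Q = 0.3, barely searched
]
print(select(children, cpuct=1.0).name)  # low Cpuct keeps exploiting the best Q
print(select(children, cpuct=5.0).name)  # high Cpuct favours the less-visited move
```

With these made-up numbers, the small Cpuct keeps searching the move with the best average value, while Cpuct = 5 sends the next visit to the rarely visited alternative. That is the kind of extra exploration the blog expects to help with tactics and opening variety.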

Rubinus · Post by **Rubinus** » Fri Aug 31, 2018 10:49 am

http://testserver.lczero.org/active_users
And link engine 0.18?

AdminX · Post by **AdminX** » Fri Aug 31, 2018 12:13 pm

Rubinus wrote: ↑Fri Aug 31, 2018 10:49 am http://testserver.lczero.org/active_users
And link engine 0.18?

You will need to compile it.
Version 18 is still in development, but here is the link to the source (Use the Master branch): https://github.com/LeelaChessZero/lc0

jp · Post by jp » Fri Aug 31, 2018 5:36 pm

Leela blog wrote:Update3
test20 training is finally started! First network training from non-random self-play games will be id20058. Networks id20000–20056 were intermediate networks from initial training, and id20057 is the final seed network.

What exactly do they mean by "reset"? Using nothing at all from previous runs (no past weights, no past games) & everything from scratch from 20058 on?
"Non-random" games?
20057 "seed network", but no relation to 20058?

crem, can you explain?

crem · Post by **crem** » Fri Aug 31, 2018 7:49 pm

jp wrote: ↑Fri Aug 31, 2018 5:36 pm
Leela blog wrote:Update3
test20 training is finally started! First network training from non-random self-play games will be id20058. Networks id20000–20056 were intermediate networks from initial training, and id20057 is the final seed network.
What exactly do they mean by "reset"? Using nothing at all from previous runs (no past weights, no past games) & everything from scratch from 20058 on?
"Non-random" games?
20057 "seed network", but no relation to 20058?

crem, can you explain?

Random games were generated and used to train the neural network initially. The result of that process is id20057 (the networks before it are just intermediate steps towards it).
After that, the usual training was started. Clients downloaded id20057, generated selfplay games, and from those games id20058 is generated (with id20057 as a base). And so on.

What used to be id1 is now id20057. What used to be id2 is now id20058.

No past games and no past weights from test10 or any other run were used. Even the random games are completely fresh.
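
The loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual training pipeline: `run_generations`, `selfplay` and `train` are hypothetical names, weights are stand-in strings, and each network is trained only on the games of its immediate predecessor, which is a simplification.

```python
from typing import Callable, Dict, List

Games = List[str]


def run_generations(
    seed_id: int,
    seed_weights: str,
    n_generations: int,
    selfplay: Callable[[str], Games],
    train: Callable[[str, Games], str],
) -> Dict[int, str]:
    """Return {network_id: weights}; each network is trained from its predecessor."""
    networks = {seed_id: seed_weights}
    current_id = seed_id
    for _ in range(n_generations):
        base = networks[current_id]
        games = selfplay(base)                          # clients play with the newest net
        networks[current_id + 1] = train(base, games)   # previous net is the base
        current_id += 1
    return networks


# Stand-in callables so the sketch runs; real self-play and training are far heavier.
nets = run_generations(
    seed_id=20057,                 # the seed network trained from random games
    seed_weights="seed",
    n_generations=2,
    selfplay=lambda weights: [f"game played by {weights}"],
    train=lambda base, games: f"{base}+{len(games)}",
)
print(sorted(nets))  # ids 20057, 20058, 20059
```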

whereagles · Post by **whereagles** » Fri Aug 31, 2018 11:23 pm

doing it all over again? what's the point?

Damir · Post by **Damir** » Sat Sep 01, 2018 5:16 pm

bug fixes and code cleanup needs a new start....

jp · Post by jp » Sat Sep 01, 2018 5:41 pm

Damir wrote: ↑Sat Sep 01, 2018 5:16 pm bug fixes and code cleanup needs a new start....

Bug fixes & better parameters like cpuct.
The real problem is how to know you've got rid of all bugs and using the best parameters for training.

Damir · Post by **Damir** » Sat Sep 01, 2018 6:12 pm

How to know if not trying it anyway ?

jp · Post by jp » Sun Sep 02, 2018 8:01 am

Yep, and if it's close to flatlining might as well.
