LCZero update (2)

Guenther · Post by **Guenther** » Fri Apr 06, 2018 9:21 pm

Henk wrote: If you create a bigger net, does tuning have to start from scratch or not? Can you reuse the weights?

Yes.

hgm · Post by **hgm** » Fri Apr 06, 2018 9:29 pm

Are you sure? I thought they were just reusing the training games. Reusing the weights seems possible only for very specific extensions of the network.

koedem · Post by **koedem** » Fri Apr 06, 2018 9:33 pm

hgm wrote: Are you sure? I thought they were just reusing the training games. Reusing the weights seems possible only for very specific extensions of the network.

I think so too. As I understand it, the idea is to take the most recent 300K training games and somehow train a bigger net on them.

Also worth noting: while ID 92, for example, is rated much lower on the graph than ID 83, it actually beat 83 in a match. So it probably is stronger, and self-play match strength is just not transitive (though it could also be that 92 just happens to beat 83; hard to say at this point).

AlvaroBegue · Post by **AlvaroBegue** » Fri Apr 06, 2018 9:42 pm

Henk wrote: If you create a bigger net, does tuning have to start from scratch or not? Can you reuse the weights?

My understanding is that net tuning takes very little computation compared to running the games, so not reusing weights is not a big deal. It would also be possible to reuse weights, but it's messy and it might introduce weird biases, so I would just do it from scratch. I imagine that's what LCZero does, but I don't know.

Henk · Post by **Henk** » Fri Apr 06, 2018 10:03 pm

In my program, training the net is the slowest operation. But perhaps that is because I set the number of simulations to a low value.

The forward pass, backpropagation, and Adam optimization are all slow. On top of that, a network change means starting all over again.

Henk · Post by **Henk** » Fri Apr 06, 2018 11:46 pm

Training games alone do not give enough information: how would you get the move probabilities from them? So you would have to store all training positions together with their move probabilities. But there can be a great many of them, so I don't store them anymore.
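To make the storage question concrete: in AlphaZero-style training the move probabilities for a position are usually taken to be the normalized visit counts of the root's children after the search. The following is only a minimal sketch under that assumption. The names `policy_target` and `make_example`, the dictionary layout, and the use of a FEN string as the position key are all hypothetical; this is not how Henk's program or LCZero actually stores its data.

```python
def policy_target(visit_counts):
    """Turn root visit counts {move: visits} into move probabilities."""
    total = sum(visit_counts.values())
    if total <= 0:
        raise ValueError("no visits recorded for this position")
    return {move: visits / total for move, visits in visit_counts.items()}


def make_example(fen, visit_counts, outcome):
    """Bundle one training position (hypothetical record layout).

    fen          -- position as a FEN string
    visit_counts -- {move: visits} taken from the search at this position
    outcome      -- final game result from the side to move's view (+1, 0, -1)
    """
    return {
        "fen": fen,
        "policy": policy_target(visit_counts),
        "outcome": outcome,
    }


example = make_example(
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
    {"e2e4": 60, "d2d4": 30, "g1f3": 10},
    outcome=1,
)
```

The moves of a game alone cannot reproduce the `policy` field, because the visit counts exist only while the search runs. That is why each position has to be saved with its probabilities, and why the storage cost grows with the number of positions rather than the number of games.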

brianr · Post by **brianr** » Sat Apr 07, 2018 12:05 am

Search the Leela Go GitHub for net2net. There is a paper and some things that facilitate reusing weights to accelerate the re-training. One mention here:

https://github.com/gcp/leela-zero/pull/704
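As an illustration of what reusing weights for a bigger net can mean, here is a minimal sketch of function-preserving widening in the spirit of net2net: every new hidden unit copies an existing one, and the outgoing weights of each copied unit are divided by its number of copies, so the widened net computes the same output as the original. The tiny two-layer ReLU network and the helper names (`forward`, `widen`) are assumptions made for this example. It is not the code from the linked pull request, and a real residual convolutional net needs more care.

```python
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def forward(W1, b1, W2, b2, x):
    """Two-layer net: y = W2 * relu(W1 * x + b1) + b2."""
    hidden = [max(0.0, z + b) for z, b in zip(matvec(W1, x), b1)]
    return [z + b for z, b in zip(matvec(W2, hidden), b2)]


def widen(W1, b1, W2, new_width, rng):
    """Grow the hidden layer to new_width units without changing the output."""
    old_width = len(W1)
    if new_width < old_width:
        raise ValueError("new_width must be at least the current width")
    # source[j] is the old hidden unit that new unit j copies.
    extra = [rng.randrange(old_width) for _ in range(new_width - old_width)]
    source = list(range(old_width)) + extra
    copies = [source.count(j) for j in range(old_width)]
    # Incoming weights and biases are copied unchanged.
    W1_new = [list(W1[s]) for s in source]
    b1_new = [b1[s] for s in source]
    # Outgoing weights are split evenly among the copies of each unit.
    W2_new = [[row[s] / copies[s] for s in source] for row in W2]
    return W1_new, b1_new, W2_new


rng = random.Random(0)
W1 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
b1 = [rng.uniform(-1, 1) for _ in range(4)]
W2 = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
b2 = [rng.uniform(-1, 1) for _ in range(2)]

W1_wide, b1_wide, W2_wide = widen(W1, b1, W2, new_width=7, rng=rng)
```

Because the copied units start out identical, some noise or further training is needed before they diverge and the extra capacity is actually used. This also illustrates the point made earlier in the thread that weight reuse works only for specific kinds of network extension.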

Henk · Post by **Henk** » Wed Apr 11, 2018 3:19 pm

If I set the number of simulations to a bigger value, the forward pass through the net becomes the slowest operation (the inner products?). So saving the generated training examples might make sense if you don't care much about disk space.

sovaz1997 · Post by **sovaz1997** » Wed Apr 11, 2018 3:36 pm

Strange tactical mistake by LCZero on the 36th move.

Zevra - LCZero (107), 1m/move

[pgn]
[Event "?"]
[Site "?"]
[Date "2018.04.11"]
[Round "?"]
[White "zevra"]
[Black "lczero"]
[Result "1-0"]
[ECO "C11"]
[GameEndTime "2018-04-11T12:28:57.548 RTZ 2 (winter)"]
[GameStartTime "2018-04-11T11:04:41.189 RTZ 2 (winter)"]
[Opening "French"]
[PlyCount "109"]
[TimeControl "60/move"]
[Variation "Steinitz Variation"]

1. e4 {+0.37/20 60s} e6 {-0.04/22 42s} 2. d4 {+0.42/18 60s} d5 {-0.04/22 26s}
3. Nc3 {+0.49/18 60s} Nf6 {-0.03/22 34s} 4. e5 {+0.64/17 60s}
Nfd7 {+0.05/22 30s} 5. Nf3 {+0.68/16 60s} c5 {+0.10/22 27s} 6. a3 {+0.45/15 60s}
Nc6 {+0.21/22 41s} 7. Nb5 {-0.20/16 60s} cxd4 {+0.26/22 36s}
8. Bf4 {-0.28/16 60s} Qb6 {+0.26/23 57s} 9. Rb1 {-0.18/15 60s} h6 {+0.24/22 40s}
10. h4 {-0.17/15 60s} Nc5 {+0.22/23 53s} 11. Nfxd4 {+0.60/15 60s}
Bd7 {+0.25/22 40s} 12. Nxc6 {+0.48/15 60s} Bxc6 {+0.27/23 46s}
13. b4 {+0.87/17 60s} Nd7 {+0.05/23 50s} 14. Be3 {+0.88/16 60s}
Qd8 {+0.07/21 30s} 15. f4 {+0.54/16 60s} Nb6 {0.00/23 32s}
16. Qd4 {+0.86/14 60s} Be7 {+0.07/22 44s} 17. Nd6+ {+0.91/17 60s}
Bxd6 {+0.07/22 18s} 18. exd6 {+0.53/18 60s} Qxd6 {+0.05/22 13s}
19. Qxg7 {+1.39/16 60s} O-O-O {+0.17/22 30s} 20. Qf6 {+0.61/15 60s}
d4 {+0.22/21 32s} 21. Rd1 {+0.65/17 60s} Rhg8 {+0.20/22 21s}
22. Rxd4 {+1.55/17 60s} Nd5 {+0.25/21 23s} 23. Qxh6 {+0.82/15 60s}
Qc7 {+0.35/21 30s} 24. Bd2 {+0.61/15 60s} e5 {+0.29/22 30s}
25. Rc4 {-0.24/15 60s} Nb6 {+0.31/23 33s} 26. Rxc6 {+1.51/15 60s}
bxc6 {+0.31/22 17s} 27. fxe5 {+1.49/14 60s} Qxe5+ {+0.58/21 30s}
28. Qe3 {+1.44/16 60s} Qd6 {+0.65/22 36s} 29. Ba6+ {+1.56/15 60s}
Kc7 {+0.19/22 39s} 30. Qf4 {+0.50/15 60s} Rxg2 {+0.14/22 26s}
31. Qxd6+ {+0.26/17 60s} Kxd6 {+0.09/22 17s} 32. Rf1 {+0.17/18 60s}
Rh8 {+0.54/21 39s} 33. c4 {+0.12/16 60s} Nd7 {+0.54/21 40s}
34. c5+ {-0.11/17 60s} Kd5 {+0.48/22 51s} 35. Be2 {+0.07/17 60s}
Ne5 {+0.46/22 27s} 36. Rf5 {+0.41/19 60s} f6 {+0.70/22 49s}
37. Bf3+ {+5.04/21 60s} Ke6 {-1.03/21 31s} 38. Rxe5+ {+5.08/23 60s}
fxe5 {-1.16/22 25s} 39. Bxg2 {+4.34/23 60s} Rxh4 {-1.16/22 12s}
40. Bxc6 {+5.16/22 60s} Rh3 {-1.18/23 37s} 41. a4 {+5.18/20 60s}
Ra3 {-1.21/22 32s} 42. Ke2 {+5.48/20 60s} Ke7 {-1.24/22 37s}
43. a5 {+5.65/20 60s} Kd8 {-1.33/22 27s} 44. Bd5 {+6.41/20 60s}
Kc7 {-1.31/22 49s} 45. b5 {+11.30/19 60s} Kc8 {-1.60/23 56s}
46. b6 {+11.47/19 60s} axb6 {-1.83/22 29s} 47. axb6 {+16.77/18 60s}
Kb8 {-2.03/23 53s} 48. c6 {+M25/15 60s} Ra1 {-2.43/22 58s} 49. c7+ {+M13/15 60s}
Kc8 {-1.97/15 0.061s} 50. Be6+ {+M11/14 60s} Kb7 {-2.45/15 0.005s}
51. c8=Q+ {+M9/14 60s} Kxb6 {-2.69/15 0.017s} 52. Bd5 {+M7/13 60s}
Rb1 {-3.30/22 50s} 53. Qc6+ {+M5/13 60s} Ka7 {-3.08/16 0.042s}
54. Qc7+ {+M3/13 60s} Ka6 {-6.46/21 31s} 55. Qa5# {+M1/13 60s, White mates} 1-0

[/pgn]

Henk · Post by **Henk** » Wed Apr 11, 2018 4:21 pm

But if I set the number of training cycles to a much higher value, training the network becomes the bottleneck again. So I don't know yet which is slowest.

The Alpha0 method is tremendously slow (costly) on my computer, that's for sure.
