LCZero: Progress and Scaling. Relation to CCRL Elo
Moderators: hgm, Rebel, chrisw
-
- Posts: 560
- Joined: Sun Nov 08, 2015 11:10 pm
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
Are you sure that the new networks performing better at low node counts but scaling worse is not because of over-fitting?
-
- Posts: 143
- Joined: Wed Jan 17, 2018 1:26 pm
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
The value head of recent nets used to be still very slightly worse than that of 237. With 321, this may no longer be the case according to some measurements. The reduced quality of the value head after the regression phase was indeed due to over-fitting, but it has been recovering since 287 and getting better ever since. Once it has completely recovered (which may or may not already have happened), scaling properties should also be fully restored, and the new nets will be better at any time control.
noobpwnftw wrote: ↑Sun May 20, 2018 11:32 pm Are you sure that at low node count condition the new networks perform better but scale worse is not because of over-fitting?
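Why would value-head quality be "the deciding factor in determining scaling properties"? In an AlphaZero-style PUCT search, move choice at high node counts is dominated by the averaged value-head output, not the policy prior. A minimal sketch (a generic illustration of the PUCT formula, not lc0's actual code; all names here are made up):

```python
import math

# A minimal sketch (not lc0's actual code) of AlphaZero-style PUCT child
# selection. Q is the running average of value-head outputs backed up
# through the tree; U is an exploration bonus driven by the policy prior.
def puct_select(children, c_puct=1.5):
    """children: list of dicts with 'N' (visits), 'W' (summed value), 'P' (prior)."""
    total_n = sum(ch["N"] for ch in children)

    def score(ch):
        q = ch["W"] / ch["N"] if ch["N"] > 0 else 0.0              # value-head average
        u = c_puct * ch["P"] * math.sqrt(total_n) / (1 + ch["N"])  # exploration term
        return q + u

    return max(range(len(children)), key=lambda i: score(children[i]))

# As visits grow, U shrinks relative to Q, so the averaged value-head
# estimate dominates move choice, which is why value-head quality
# governs how well a net scales at long time controls.
kids = [
    {"N": 900, "W": 495.0, "P": 0.6},  # Q = 0.55, heavily explored
    {"N": 100, "W": 60.0,  "P": 0.4},  # Q = 0.60, less explored
]
print(puct_select(kids))  # → 1 (the higher-Q child)
```

At 800 nodes the prior still carries a lot of weight; at long time controls the Q term takes over, consistent with the observation that a net can win at fast TC yet scale worse.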
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
I hope the quality of the value head will continue to improve substantially, as for now it seems very strange that at nodes=1 NN320 is almost 200 Elo points stronger than NN237, yet scales worse and is still tactically weaker. There seems to be a lot of room for improvement with the 192x15 net and a bug-fixed engine. I already got a win from LC0 CUDA NN237 in game 7 against Komodo 10.2 at 15'+ 15'' time control; the score for LC0 against Komodo 10.2 is +1 -3 =3, or about a 100 Elo point difference, putting LC0 NN237 in these longer time control conditions above the 3200 CCRL 40/4' Elo level, as in the games in similar conditions against Houdini 1.5a. Keep in mind that the longer the TC, the better LC0's rating (at least with the NN237 value head), because it scales better. Here is the game won by LC0 against Komodo 10.2. Komodo is a tough opponent, as it has a very good eval of the initial phases of the game and of imbalances, where LC0 usually gains large advantages against weaker opponents. But in this game, Komodo 10.2 was pretty clueless about what was happening up to move 35, believing it had a large advantage while the game was proceeding very well for LC0. The match will end in 2-3 hours with a total of 10 games. Anyway, I am pretty amazed by these performances at longer TC against Houdini 1.5a and Komodo 10.2, some of the recent top dogs, and objections that the samples are too small don't impress me. All in all (I have some other games played), LC0 CUDA NN237 with Albert's settings on a GTX 1060 6GB on 2 CPU threads performs unexpectedly well at longer TC.
jkiliani wrote: ↑Sun May 20, 2018 10:14 pm There was a discussion about rollback on Discord yesterday, it isn't happening. At low node counts (800), current nets are far stronger than Id 237, although as you observed they don't scale quite as well (yet). But the quality of the value head is still improving, which is also the deciding factor in determining scaling properties.
I'm not too worried this won't fix itself in the end, since we're going to upgrade to a 256x20 network eventually when there's no more improvement on the 192x15 architecture. Lc0 beating Komodo on your setup may not be happening yet, but I'm optimistic that it will soon, either still on 192x15 or at the latest once we go 256x20 (the AlphaZero size).
Laskos wrote: ↑Sun May 20, 2018 8:24 pm It seems to be a bit more complicated. At 1'+ 1'' on GTX 1060 with LC0 CUDA, latest nets seem even stronger than NN237. But at 15'+ 15'', NN237 seems stronger. I left NN319 against Komodo 10.2, it lost 5 games in a row due to tactical blunders. Eval graph was also unstable. I interrupted the match and reverted to NN237, and in 5 games until now, there are 2 wins of Komodo and 3 draws. Only one game was lost due to blunder. Still waiting for one win of LC0 in 10 games. The sample is too small, but I saw a similar thing in games against Houdini 1.5a. It seems NN237 scales better with TC or playouts, having a better value head eval. It is strange, as at nodes=1, latest nets are some 150-200 Elo points stronger than NN237. Really, they have to roll back to v0.7 engine, the current nets are trained in some schizophrenic way with v0.10.
[Event "My Tournament"]
[Site "?"]
[Date "2018.05.20"]
[Round "7"]
[White "LC0_GPU_CUDA"]
[Black "Komodo 10.2"]
[Result "1-0"]
[FEN "r1bqkbnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPPQPPP/RNB1KB1R w KQkq - 0 1"]
[PlyCount "151"]
[SetUp "1"]
[TimeControl "900+15"]
1. g3 {+0.09/2 49s} Bc5 {+0.10/25 58s} 2. c3 {+0.11/2 52s} Nf6 {+0.06/25 37s}
3. Bg2 {+0.14/2 34s} O-O {+0.07/26 42s} 4. b4 {+0.15/2 46s} Bb6 {+0.02/25 49s}
5. O-O {+0.14/2 63s} a6 {+0.09/26 44s} 6. d3 {+0.15/2 42s} Re8 {+0.10/26 49s}
7. Bg5 {+0.22/2 39s} d6 {+0.07/26 65s} 8. Nbd2 {+0.21/2 34s} h6 {+0.08/27 47s}
9. Bh4 {+0.14/2 60s} g5 {+0.74/24 25s} 10. Nxg5 {0.00/2 20s} hxg5 {+0.38/25 44s}
11. Bxg5 {0.00/2 8.8s} Bg4 {+0.58/24 15s} 12. Bf3 {+0.04/2 38s}
Be6 {+0.52/24 45s} 13. Nc4 {+0.20/2 38s} Ba7 {+0.41/25 62s}
14. Ne3 {+0.29/2 42s} Kg7 {+0.43/24 16s} 15. Ng4 {+0.25/2 22s}
Bxg4 {+0.57/25 20s} 16. Bxg4 {+0.22/2 6.7s} Rh8 {+0.49/25 30s}
17. Kg2 {+0.21/2 31s} Qe8 {+0.44/25 73s} 18. f4 {+0.20/2 34s}
Nxg4 {+0.69/25 18s} 19. Qxg4 {+0.73/2 32s} Qc8 {+0.50/26 31s}
20. f5 {+0.59/2 33s} Kf8 {+0.80/21 15s} 21. Rad1 {+0.46/2 44s}
Qd7 {+0.72/25 45s} 22. h4 {+0.86/2 34s} Rh7 {+0.74/24 17s} 23. a4 {+0.67/2 48s}
Re8 {+0.62/23 63s} 24. Qf3 {+0.54/2 48s} Rh8 {+0.62/25 45s}
25. Qe2 {+0.71/2 42s} Ne7 {+0.71/24 35s} 26. f6 {+0.66/2 32s} Nc8 {+0.64/23 18s}
27. Qd2 {+0.83/2 47s} Rd8 {+0.76/22 13s} 28. Bh6+ {+0.72/2 21s}
Ke8 {+0.92/24 10s} 29. Bg7 {+0.73/2 9.1s} Rh7 {+1.03/26 18s}
30. Qe2 {+0.81/2 12s} Qxa4 {+1.12/23 21s} 31. h5 {+1.02/2 28s}
Qd7 {+0.43/24 54s} 32. Qf3 {+1.03/2 24s} Qe6 {+0.95/20 13s} 33. h6 {+1.12/2 17s}
Nb6 {+1.12/23 39s} 34. d4 {+1.22/2 9.0s} c6 {+0.44/22 59s} 35. g4 {+1.42/2 26s}
exd4 {-0.18/25 78s} 36. cxd4 {+1.76/2 25s} Kd7 {-0.62/25 68s}
37. g5 {+1.67/2 17s} Rdh8 {-0.90/21 27s} 38. Qh3 {+2.49/2 36s}
Qxh3+ {-0.52/22 25s} 39. Kxh3 {+2.83/2 25s} Nc4 {-0.67/25 33s}
40. Rfe1 {+2.94/2 42s} Nb2 {-0.12/26 17s} 41. e5 {+3.18/2 17s}
Ke6 {-1.07/20 8.3s} 42. exd6+ {+2.95/2 33s} Kxd6 {-1.49/22 7.5s}
43. Rd2 {+3.05/2 5.0s} Nc4 {-1.40/24 5.7s} 44. Rdd1 {+3.22/2 5.6s}
Nb2 {-1.39/26 16s} 45. Rb1 {+4.05/2 20s} Bxd4 {-1.29/24 17s}
46. Re4 {+4.22/2 17s} c5 {-1.35/24 11s} 47. bxc5+ {+5.93/2 23s}
Kxc5 {-1.70/24 37s} 48. Rxd4 {+6.43/2 15s} Kxd4 {-2.95/19 4.4s}
49. Rxb2 {+6.61/2 8.8s} b5 {-4.63/22 27s} 50. Kg4 {+6.98/2 12s}
Rb8 {-4.35/22 7.1s} 51. g6 {+8.27/2 23s} fxg6 {-6.32/24 23s}
52. f7+ {+8.92/2 35s} Kd5 {-6.48/20 4.4s} 53. Re2 {+10.80/2 32s}
Rf8 {-7.31/22 12s} 54. Bxf8 {+11.30/2 16s} Rxf7 {-7.52/24 6.6s}
55. Bg7 {+11.45/2 7.9s} Rf1 {-12.01/25 36s} 56. Rh2 {+11.67/2 17s}
Rg1+ {-12.07/24 5.3s} 57. Kf4 {+12.07/2 31s} Rf1+ {-12.07/20 19s}
58. Kg5 {+12.06/2 20s} Rg1+ {-12.07/25 7.0s} 59. Kf6 {+12.08/2 10s}
Rf1+ {-12.07/21 16s} 60. Kxg6 {+12.91/2 22s} Rg1+ {-12.10/26 12s}
61. Kf7 {+13.08/2 14s} Rc1 {-250.00/25 22s} 62. Rh5+ {+12.87/2 33s}
Ke4 {-250.00/21 18s} 63. h7 {+12.75/2 19s} Rc8 {-M40/23 18s}
64. Re5+ {+16.10/2 27s} Kf3 {-M34/21 1.1s} 65. Re8 {+17.93/2 22s}
Rxe8 {-M32/21 0.95s} 66. Kxe8 {+17.83/2 13s} Ke3 {-M32/21 1.5s}
67. Kd7 {+19.28/2 31s} Kf4 {-M30/20 1.6s} 68. h8=Q {+20.38/2 42s}
b4 {-M26/20 2.6s} 69. Ke6 {+21.24/3 26s} Kg3 {-M16/20 3.1s}
70. Kf5 {+25.35/2 18s} Kf2 {-M10/35 1.7s} 71. Ke4 {+35.99/3 13s}
Kg2 {-M8/99 0.59s} 72. Be5 {+51.55/2 14s} b3 {-M6/99 0.037s}
73. Qh2+ {+M75/2 13s} Kf1 {-M4/5 0s} 74. Kf3 {+122.28/2 9.9s} a5 {-M2/99 0.008s}
75. Qe2+ {+127.03/2 19s} Kg1 {-M2/5 0s} 76. Qg2# {+128.00/2 12s, White mates}
1-0
-
- Posts: 3019
- Joined: Wed Mar 08, 2006 9:57 pm
- Location: Rio de Janeiro, Brazil
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
Not sure why, but NN321 is now the best in tactics, beating NN237 in WAC Revised by one position. Also interesting is that the 20x256 net scored the same in tactics as NN237 when both were tested with LC0 Optimized; frankly, I had not expected this at half the speed.
Laskos wrote: ↑Mon May 21, 2018 12:05 am I hope the quality of the value head will continue to improve substantially, as for now it seems very strange that at nodes=1 the NN320 is almost 200 Elo points stronger than NN237, but scales worse and tactically it is still weaker.
jkiliani wrote: ↑Sun May 20, 2018 10:14 pm There was a discussion about rollback on Discord yesterday, it isn't happening. At low node counts (800), current nets are far stronger than Id 237, although as you observed they don't scale quite as well (yet). But the quality of the value head is still improving, which is also the deciding factor in determining scaling properties. I'm not too worried this won't fix itself in the end, since we're going to upgrade to a 256x20 network eventually when there's no more improvement on the 192x15 architecture. Lc0 beating Komodo on your setup may not be happening yet, but I'm optimistic that it will soon, either still on 192x15 or at the latest once we go 256x20 (the AlphaZero size).
Laskos wrote: ↑Sun May 20, 2018 8:24 pm It seems to be a bit more complicated. At 1'+ 1'' on GTX 1060 with LC0 CUDA, latest nets seem even stronger than NN237. But at 15'+ 15'', NN237 seems stronger. I left NN319 against Komodo 10.2, it lost 5 games in a row due to tactical blunders. Eval graph was also unstable. I interrupted the match and reverted to NN237, and in 5 games until now, there are 2 wins of Komodo and 3 draws. Only one game was lost due to blunder. Still waiting for one win of LC0 in 10 games. The sample is too small, but I saw a similar thing in games against Houdini 1.5a. It seems NN237 scales better with TC or playouts, having a better value head eval.
It is strange, as at nodes=1, latest nets are some 150-200 Elo points stronger than NN237. Really, they have to roll back to v0.7 engine, the current nets are trained in some schizophrenic way with v0.10.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
-
- Posts: 1470
- Joined: Mon Apr 23, 2018 7:54 am
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
What is the 20x256 Net??Albert Silver wrote: ↑Mon May 21, 2018 12:42 am Not sure why, but NN321 is now the best in tactics, beating NN237 in WAC Revised by one position. Also interesting, is that the 20x256 Net scored the same in tactics to NN237, when both are tested with LC0 Optimized. I had not expected this frankly, with half speed.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
The later NNs are already at some 3200 CCRL Elo level on a GTX 1060, even at the short 1m + 1s time control. Here is the result against Houdini 1.5a (3170 CCRL):
Albert Silver wrote: ↑Mon May 21, 2018 12:42 am Not sure why, but NN321 is now the best in tactics, beating NN237 in WAC Revised by one position. Also interesting, is that the 20x256 Net scored the same in tactics to NN237, when both are tested with LC0 Optimized. I had not expected this frankly, with half speed.
Laskos wrote: ↑Mon May 21, 2018 12:05 am I hope the quality of the value head will continue to improve substantially, as for now it seems very strange that at nodes=1 the NN320 is almost 200 Elo points stronger than NN237, but scales worse and tactically it is still weaker.
jkiliani wrote: ↑Sun May 20, 2018 10:14 pm
There was a discussion about rollback on Discord yesterday, it isn't happening. At low node counts (800), current nets are far stronger than Id 237, although as you observed they don't scale quite as well (yet). But the quality of the value head is still improving, which is also the deciding factor in determining scaling properties. I'm not too worried this won't fix itself in the end, since we're going to upgrade to a 256x20 network eventually when there's no more improvement on the 192x15 architecture. Lc0 beating Komodo on your setup may not be happening yet, but I'm optimistic that it will soon, either still on 192x15 or at the latest once we go 256x20 (the AlphaZero size).
Code: Select all
1m + 1s
Score of LC0_CUDA_NN322 vs Houdini 1.5a: 43 - 33 - 24 [0.550]
Elo difference: 34.86 +/- 60.09
100 of 100 games finished.
-
- Posts: 143
- Joined: Wed Jan 17, 2018 1:26 pm
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
You might try Id 329 sometime. A test of the value head today yielded, for 329, the best result of any network so far, which is a very promising indication that the net may also scale very well.
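The post doesn't say how the value head was tested; one plausible way (a hypothetical sketch, not the project's actual test harness) is to score the net's position evaluations against known game outcomes with a mean squared error:

```python
# Hypothetical sketch (not the project's actual test harness) of scoring a
# value head against known game outcomes: mean squared error between the
# net's evaluation of each position (here in [-1, 1], from the side to
# move's point of view) and the game's actual result.
def value_head_mse(predictions, outcomes):
    """Lower is better; predictions and outcomes are parallel lists."""
    assert len(predictions) == len(outcomes) and predictions
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(predictions)

# Three positions whose games ended win, draw, loss for the side to move:
print(value_head_mse([0.8, 0.1, -0.6], [1.0, 0.0, -1.0]))
```

Under a metric like this, "best result of any network so far" would mean the lowest error over a fixed set of test positions.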
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
Yes, I was curious myself, and upon arriving home I played LTC games to check a bit (they take time). Here is the result (I found it sufficiently interesting to post in a new thread):
http://talkchess.com/forum3/viewtopic.php?f=2&t=67537
It seems that by now the newest nets are the best at all time controls.
-
- Posts: 51
- Joined: Mon Feb 20, 2017 8:29 am
- Location: Rialto, Venice
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
Laskos wrote: ↑Sun Apr 01, 2018 11:37 am
Hi Kai,
peter wrote: Hi Robin!
LC0 seems already close to very strong engines in this opening suite. At this pace of advancement in positional understanding, I will be very curious how it develops.
CheckersGuy wrote: That's indeed a very impressive result, but that's probably what neural nets are good at. It's kind of interesting. Weaker traditional alpha-beta engines are decent at tactics and suffer from bad positional play, while with Leela0 it's the other way around.
What about your latest experiments with this opening suite?
Are the latest LC0 nets (those >300) positionally better than Stockfish & Komodo?
-
- Posts: 3019
- Joined: Wed Mar 08, 2006 9:57 pm
- Location: Rio de Janeiro, Brazil
Re: LCZero: Progress and Scaling. Relation to CCRL Elo
Offhand, I'd say maybe, but that is a very speculative maybe. One cannot remove tactics from the equation, so oversights in its calculations will affect its decisions. An argument such as "this would be a great move if... it didn't lose a piece" holds no water in my book.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."