Lc0 51010


nabildanial
Posts: 126
Joined: Thu Jun 05, 2014 5:29 am
Location: Malaysia

Re: Lc0 51010

Post by nabildanial »

lkaufman wrote: Sun Mar 31, 2019 8:07 am The situation has now become even more dramatic. I tested version 51058 (on my RTX 2080), which is self-rated slightly below 1700, against Fire 7.1 on 4 very fast threads, which is rated 3337 on CEGT blitz and 3425 on CCRL blitz. The two-day-old Lc0 won by 12 to 8 (+70 Elo), putting it over 3400 CEGT blitz and nearly 3500 CCRL blitz!! So I guess we won't be hearing that the self-play ratings are inflated anymore, although it remains to be seen whether further gains in self-play ratings continue to translate into real gains. One question: does anyone know roughly how many 2080s would be needed to duplicate the training that this 51xxx series has averaged so far? It doesn't mean much to say that it has trained for two days without stating what the average resources used for the training were. I imagine that they were just a tiny percentage of the resources used to train AlphaZero in 9 hours or so.
I have a hard time believing this result. How does the best T30 fare against Fire on your machine?
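For what it's worth, the +70 Elo figure quoted above is just the standard logistic conversion of the 12-8 score, as this quick sketch shows (20 games is of course a tiny sample, so the error bars are huge):

[code]
# Quick check of the quoted "+70 Elo": the Elo difference implied by a 12-8 score.
import math

def elo_diff(score: float) -> float:
    """Standard logistic Elo difference implied by a fractional score (0 < score < 1)."""
    return -400 * math.log10(1 / score - 1)

print(round(elo_diff(12 / 20)))   # -> 70
[/code]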
Albert Silver
Posts: 3019
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: Lc0 51010

Post by Albert Silver »

lkaufman wrote: Fri Mar 29, 2019 8:20 pm
Werewolf wrote: Fri Mar 29, 2019 7:40 pm
lkaufman wrote: Fri Mar 29, 2019 5:51 pm
mwyoung wrote: Fri Mar 29, 2019 8:43 am Stockfish was not as impressed as Fruit. Lc0 51012 vs Stockfish

[Attached image: Results.jpg]
Well I don't expect a 3250 engine to score much against SF on a powerful computer. The point is that if it is 3250 level with a self-play rating under 1500, what should it be with a 3000+ self-play rating in a few weeks?
Sadly not 3250 + 1500. The self-play ratings are..."not to scale", as an architect might put it.
Yes, I know that. But then why is the engine self-rated below 1500 playing at 3250 level? It should be playing way below 1500 level, assuming that the rating of zero for random play is fair. I actually think that random play would be rated lower than zero if CCRL or CEGT tested it. Part of the answer is my hardware, but that hardly seems like a full explanation. Somehow it seems that self-play understates rating differences up to some level, then starts to overstate them beyond that level.
This is a lot more believable than you think. The problem, of course, is thinking the self-play ratings have any relevance. The T51 run is issuing nets every 1000 steps instead of 250, which means that id51075 is equivalent to id50300 in terms of sheer training steps. This also helps in understanding the pace of training in general and the Elo gains.

Let's suppose a 10b net will achieve about 100 Elo less (vs SF and others) than the equivalent 20b (like T40). The Elo gains will be, very roughly: 70% of the total by the first LR drop, then 17%, then 8%, then 2% after each subsequent drop. Do not quote those numbers as gospel; they are merely meant to help illustrate how the training progresses.

It would likely break 3000-3100 by the first LR drop, which was at id51055, at which point it surges for a short while. Now, my estimate of 3000-3100 hinges on an NPS ratio of about 1000:1, but a 2080 against a mere quad will be crushing and enjoy at least a 5 times better ratio, so add about 300 Elo to that mix against those A/B engines stuck on a quad. Suddenly those results don't seem quite so insane, do they?
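As a very rough illustration of how those fractions play out, here is a small sketch; the start and end Elo figures are invented placeholders, not measurements:

[code]
# Hypothetical milestones implied by the rough 70/17/8/2 split described above.
# start_elo and final_elo are invented placeholders purely for illustration.
start_elo, final_elo = 2000, 3200
fractions = [0.70, 0.17, 0.08, 0.02]   # share of the total gain realised by each LR drop

elo = start_elo
for i, frac in enumerate(fractions, start=1):
    elo += frac * (final_elo - start_elo)
    print(f"after LR drop {i}: ~{elo:.0f}")
# The fractions sum to 0.97; the last few percent trickle in after the final drop.
[/code]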
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Lc0 51010

Post by lkaufman »

Laskos wrote: Sun Mar 31, 2019 10:58 am
jp wrote: Sun Mar 31, 2019 8:37 am
lkaufman wrote: Sun Mar 31, 2019 8:07 am One question: does anyone know roughly how many 2080s would be needed to duplicate the training that this 51xxx series has averaged so far? It doesn't mean much to say that it has trained for two days without stating what the average resources used for the training were. I imagine that they were just a tiny percentage of the resources used to train AlphaZero in 9 hours or so.
NN 51010 has had 559441 games. The NN size is smaller (10x128) than the A0 one and 41xxx NNs (20x256), but the training param is visits=10000 (visits=800 for 41xxx).

I'll let others say what 2080 time that works out to.
I will try to estimate (writing on the phone, I am on vacation :) ). About 0.2 s/move on a 2080 means games of roughly 100 moves take about 20 s. That is roughly 200 games per hour, so half a million games need about 2500 hours, or about 100 days of training for NN 51010 on a single 2080. Or 100x 2080 GPUs to train it in one day.
The main unknown is the time needed for 10000 visits; I took it as 0.2 s on a 2080. So this calculation is only good to an order of magnitude.
Thanks. So the training resources were a lot more than I thought they were. It makes the amazing results at least easier to accept.
Komodo rules!
Albert Silver
Posts: 3019
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: Lc0 51010

Post by Albert Silver »

lkaufman wrote: Sun Mar 31, 2019 6:30 pm
Laskos wrote: Sun Mar 31, 2019 10:58 am
jp wrote: Sun Mar 31, 2019 8:37 am
lkaufman wrote: Sun Mar 31, 2019 8:07 am One question: does anyone know roughly how many 2080s would be needed to duplicate the training that this 51xxx series has averaged so far? It doesn't mean much to say that it has trained for two days without stating what the average resources used for the training were. I imagine that they were just a tiny percentage of the resources used to train AlphaZero in 9 hours or so.
NN 51010 has had 559441 games. The NN size is smaller (10x128) than the A0 one and 41xxx NNs (20x256), but the training param is visits=10000 (visits=800 for 41xxx).

I'll let others say what 2080 time that works out to.
I will try to estimate (writing on the phone, I am on vacation :) ). About 0.2 s/move on a 2080 means games of roughly 100 moves take about 20 s. That is roughly 200 games per hour, so half a million games need about 2500 hours, or about 100 days of training for NN 51010 on a single 2080. Or 100x 2080 GPUs to train it in one day.
The main unknown is the time needed for 10000 visits; I took it as 0.2 s on a 2080. So this calculation is only good to an order of magnitude.
Thanks. So the training resources were a lot more than I thought they were. It makes the amazing results at least easier to accept.
I will repeat: the average number of nodes per move in this training run is around 800, not 10 thousand. The 10000 figure reflects a complete misunderstanding of what is being done; the average nodes per move per game is still around 800.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Lc0 51010

Post by Laskos »

Albert Silver wrote: Sun Mar 31, 2019 6:37 pm
lkaufman wrote: Sun Mar 31, 2019 6:30 pm
Laskos wrote: Sun Mar 31, 2019 10:58 am
jp wrote: Sun Mar 31, 2019 8:37 am
lkaufman wrote: Sun Mar 31, 2019 8:07 am One question: does anyone know roughly how many 2080s would be needed to duplicate the training that this 51xxx series has averaged so far? It doesn't mean much to say that it has trained for two days without stating what the average resources used for the training were. I imagine that they were just a tiny percentage of the resources used to train AlphaZero in 9 hours or so.
NN 51010 has had 559441 games. The NN size is smaller (10x128) than the A0 one and 41xxx NNs (20x256), but the training param is visits=10000 (visits=800 for 41xxx).

I'll let others say what 2080 time that works out to.
I will try to estimate (writing on the phone, I am on vacation :) ). About 0.2 s/move on a 2080 means games of roughly 100 moves take about 20 s. That is roughly 200 games per hour, so half a million games need about 2500 hours, or about 100 days of training for NN 51010 on a single 2080. Or 100x 2080 GPUs to train it in one day.
The main unknown is the time needed for 10000 visits; I took it as 0.2 s on a 2080. So this calculation is only good to an order of magnitude.
Thanks. So the training resources were a lot more than I thought they were. It makes the amazing results at least easier to accept.
I will repeat: the average number of nodes per move in this training run is around 800, not 10 thousand. The 10000 figure reflects a complete misunderstanding of what is being done; the average nodes per move per game is still around 800.
OK, then I would estimate a game on a 2080 to take some 6 seconds, meaning about 30 days of 2080 training for half a million games, or 30x 2080 GPUs training for one day.
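A minimal sketch of that back-of-the-envelope arithmetic, using the rough per-game timings guessed in this thread (they are guesses, not measurements; using the exact game count pushes the totals a little above the rounded 100-day and 30-day figures):

[code]
# GPU-days for a single RTX 2080 to self-play NN 51010's training games,
# under the two per-game timing guesses discussed above.
GAMES = 559_441                  # training games reported for NN 51010
SECONDS_PER_DAY = 24 * 3600

def gpu_days(seconds_per_game: float, games: int = GAMES) -> float:
    """Days of a single 2080 needed to produce `games` self-play games."""
    return games * seconds_per_game / SECONDS_PER_DAY

print(f"~20 s/game (10000-visit guess): {gpu_days(20):.0f} 2080-days")   # ~130
print(f" ~6 s/game (800-visit guess):   {gpu_days(6):.0f} 2080-days")    # ~39
[/code]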
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Lc0 51010

Post by lkaufman »

nabildanial wrote: Sun Mar 31, 2019 5:58 pm
lkaufman wrote: Sun Mar 31, 2019 8:07 am The situation has now become even more dramatic. I tested version 51058 (on my RTX 2080), which is self-rated slightly below 1700, against Fire 7.1 on 4 very fast threads, which is rated 3337 on CEGT blitz and 3425 on CCRL blitz. The two-day-old Lc0 won by 12 to 8 (+70 Elo), putting it over 3400 CEGT blitz and nearly 3500 CCRL blitz!! So I guess we won't be hearing that the self-play ratings are inflated anymore, although it remains to be seen whether further gains in self-play ratings continue to translate into real gains. One question: does anyone know roughly how many 2080s would be needed to duplicate the training that this 51xxx series has averaged so far? It doesn't mean much to say that it has trained for two days without stating what the average resources used for the training were. I imagine that they were just a tiny percentage of the resources used to train AlphaZero in 9 hours or so.
I have a hard time believing this result. How does the best T30 fare against Fire on your machine?
Any highly rated T30 or T40 on my machine will defeat Stockfish 10 on 7 threads (I have an 8-core CPU) in a match, so of course Fire (about 200 Elo lower rated than SF10) on four threads would be no contest. I'm not suggesting that T51 is anywhere near the level of the best networks yet, just that its self-play Elo is so ridiculously low that some explanation seems to be needed, and that the future is bright.
Komodo rules!
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Lc0 51010

Post by lkaufman »

Laskos wrote: Sun Mar 31, 2019 6:50 pm
Albert Silver wrote: Sun Mar 31, 2019 6:37 pm
lkaufman wrote: Sun Mar 31, 2019 6:30 pm
Laskos wrote: Sun Mar 31, 2019 10:58 am
jp wrote: Sun Mar 31, 2019 8:37 am
lkaufman wrote: Sun Mar 31, 2019 8:07 am One question: does anyone know roughly how many 2080s would be needed to duplicate the training that this 51xxx series has averaged so far? It doesn't mean much to say that it has trained for two days without stating what the average resources used for the training were. I imagine that they were just a tiny percentage of the resources used to train AlphaZero in 9 hours or so.
NN 51010 has had 559441 games. The NN size is smaller (10x128) than the A0 one and 41xxx NNs (20x256), but the training param is visits=10000 (visits=800 for 41xxx).

I'll let others say what 2080 time that works out to.
I will try to estimate (writing on the phone, I am on vacation :) ). About 0.2 s/move on a 2080 means games of roughly 100 moves take about 20 s. That is roughly 200 games per hour, so half a million games need about 2500 hours, or about 100 days of training for NN 51010 on a single 2080. Or 100x 2080 GPUs to train it in one day.
The main unknown is the time needed for 10000 visits; I took it as 0.2 s on a 2080. So this calculation is only good to an order of magnitude.
Thanks. So the training resources were a lot more than I thought they were. It makes the amazing results at least easier to accept.
I will repeat: the average number of nodes per move in this training run is around 800, not 10 thousand. The 10000 figure reflects a complete misunderstanding of what is being done; the average nodes per move per game is still around 800.
OK, then I would estimate a game on a 2080 to take some 6 seconds, meaning about 30 days of 2080 training for half a million games, or 30x 2080 GPUs training for one day.
Much closer to what I imagined.
Komodo rules!
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Lc0 51010

Post by lkaufman »

Albert Silver wrote: Sun Mar 31, 2019 6:20 pm The problem, of course, is thinking the self-play ratings have any relevance.
It is well known that self-play ratings can exaggerate Elo gains. But they are not random numbers either; otherwise Stockfish would never have made progress. Can you (or anyone) suggest a reason why in this case self-play seems to have grossly UNDERSTATED Elo gains? That's what I'm trying to understand.
Komodo rules!
Uri Blass
Posts: 10268
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Lc0 51010

Post by Uri Blass »

Ozymandias wrote: Sun Mar 31, 2019 12:23 pm
Uri Blass wrote: Sun Mar 31, 2019 11:20 am The claim is that TCEC used some opening book to reduce the number of draws, and that with a good opening book we are going to get almost 100% draws.

Larry's words about it:
"It seems pretty clear that if you take the strongest Stockfish and the strongest NN engine on TCEC type hardware, and have them play a long match at TCEC time controls, the results will depend heavily on the openings. If you give each side a really good, deep opening book and have them play only the optimum or near optimum openings, nearly every game will end in a draw."

I am not sure about it, and I have two problems:
1) How do you decide whether a book is a good, deep opening book?
If you do not get the expected draw result in a game, you can claim that the book was not good enough or not deep enough, so you need some definition that does not depend on the result.

2) How do you decide whether an opening is the optimum or a near-optimum opening?
It is possible that the best move is a move that the engines do not like because they do not search deep enough.

Larry's second claim:
"The only way the stronger engine can win games is to play suboptimal lines with White to exit from book early and hope to win despite the poor opening moves."

On what basis does he claim that a line is suboptimal?
A line that the best engines do not suggest is not necessarily a suboptimal line.

There should be a basis for claiming that a line is suboptimal (for example, if the engine that beat another engine loses against itself in the same line).
I guess Larry means lines in which White has equality instead of a minimal advantage, and in that case I doubt it is correct to call the line suboptimal, because it does not lose and can help win against weaker engines.
I'm surprised to see this topic still being debated after so long; maybe you missed the draw death of Freestyle chess. The answer to your questions is there.
The draw death of Freestyle chess proves nothing, because it is possible that the participants are simply at an equal level and that it is possible to beat them by playing better than they do.

There are also OTB tournaments in which most of the games are drawn, and not because the players make no mistakes.
Ozymandias
Posts: 1532
Joined: Sun Oct 25, 2009 2:30 am

Re: Lc0 51010

Post by Ozymandias »

Uri Blass wrote: Sun Mar 31, 2019 8:27 pm it is possible that the participants are simply at an equal level and that it is possible to beat them by playing better than they do.
Get any game from any of the top performers, and tell me what you'd do differently (from any side of the board).