Something goes wrong with lc0 since yesterday?

Milos · Post by **Milos** » Sun Aug 26, 2018 4:54 pm

jp wrote: ↑Sun Aug 26, 2018 6:08 am
Milos wrote: ↑Sat Aug 25, 2018 11:57 pm I don't know about A0 scaling, but if that figure is correct for A0 scaling, it is quite bad. I was talking about SF scaling in that figure. It is simply bogus. SF8 scaling is simply much, much better than that. This thing is relatively easy to check, it just requires time. You run SF8 selfplay with timing odds, like 60s vs 1s. You don't need 64 threads machine. You can just scale time. I was running on 28 threads (with HT) and TC was 150s vs 2.5s. Difference was 250Elo. In that figure difference is less than 50Elo. As I said it is rather trivial to prove that SF8 scaling in that figure is bogus and if A0 scaling is correct it is much worse than SF8 and hoping that Lc0 will somehow have better scaling than A0 is frankly speaking dreaming, but ppl can dream ofc .
What did your 2.5s correspond to, i.e. how many nps, i.e. what part of the curve?

I get 25 Mnps at the root position with 28 threads, IIRC. SF8 bench is around 38 Mnps. At 2.5 s that is equivalent to 62.5 Mnps at a 1 s TC, or 95 Mnps going by SF8 bench.
In that Google preprint they talk about 70 Mnps, but it is not clear whether that is the starting position or SF8 bench. Either way, 62.5 Mnps on 28 threads should be very close to, if not stronger than, 70 Mnps on 64 threads, so my two points should match the 1 s and 60 s points on the Figure 2 SF8 curve in the paper quite well.
And I got +250 Elo after 100 games, while in Figure 2 it is not even 50 Elo. One can nitpick about many things, but the SF8 performance in Figure 2 of the paper is so obviously bogus, fake, and I dare say outright cheating, that I can't take anything written by those Google guys for granted.
This kind of obvious cheating could not pass any peer review, so it is no surprise that, beside its PR value, that crappy preprint has no purpose.
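
The node-rate equivalence in this post is plain arithmetic and can be sketched as below. This is only an illustration under one assumption: nodes searched scale linearly with time at a fixed thread count. The function name `equivalent_mnps` is made up here; the figures are the ones quoted in the post.

```python
def equivalent_mnps(mnps: float, seconds: float, reference_seconds: float = 1.0) -> float:
    """Node rate (in Mnps) that would search the same number of nodes
    in `reference_seconds` as `mnps` does in `seconds`.

    Assumption: node count grows linearly with time.
    """
    return mnps * seconds / reference_seconds


# Figures quoted in the post: 25 Mnps at root, 38 Mnps on bench, 2.5 s short TC.
root_equiv = equivalent_mnps(25.0, 2.5)    # 2.5 s at 25 Mnps as a 1 s rate
bench_equiv = equivalent_mnps(38.0, 2.5)   # same, using the bench figure

# Time odds of the test: 150 s vs 2.5 s is the same ratio as 60 s vs 1 s.
time_odds = 150.0 / 2.5

print(root_equiv, bench_equiv, time_odds)
```

With the quoted numbers this gives the 62.5 Mnps and 95 Mnps mentioned above, and time odds of 60:1.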

Laskos · Post by **Laskos** » Sun Aug 26, 2018 5:27 pm

I remember that was the first odd thing I noticed. Six SF doublings at that TC, even on that many cores, should be worth at least 200 Elo points, not 50. I am on vacation on a sunny island now and cannot improvise any tests, but the scaling of Lc0, as I also observed when discussing TCEC performance, might be problematic at longer TC × hardware. I had some issues too, but they were only collateral and not completely conclusive. I have some ideas for checking indirectly (checking directly needs big hardware).
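
The "six doublings" figure follows from the 60:1 time ratio. A rough sketch of that arithmetic is below; it assumes a constant Elo gain per doubling across the whole range, which is a simplification, and the function names are made up for illustration.

```python
import math


def doublings(time_ratio: float) -> float:
    """Number of time doublings contained in a time ratio."""
    return math.log2(time_ratio)


def elo_per_doubling(total_elo: float, time_ratio: float) -> float:
    """Average Elo gain per doubling, assuming the gain is spread evenly."""
    return total_elo / doublings(time_ratio)


ratio = 60.0  # 60 s vs 1 s, as discussed in the thread
print(f"doublings: {doublings(ratio):.2f}")
for total in (50.0, 200.0, 250.0):
    print(f"{total:.0f} Elo over the range -> "
          f"{elo_per_doubling(total, ratio):.1f} Elo per doubling")
```

Under that assumption, 50 Elo over the whole range would mean under 10 Elo per doubling, while 200 to 250 Elo corresponds to roughly 34 to 42 Elo per doubling.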

yanquis1972 · Post by **yanquis1972** » Sun Aug 26, 2018 6:57 pm

Have you tried 28 threads with 256MB hash?

The one egregious flaw in the A0-SF8 match was the hash table. I don’t think anything shared in the DeepMind paper is bogus. We don’t know specifics about their search (I’ve read they’ve been evasive when asked even general questions); we do know Stockfish had a suboptimal setup, and that the openings used were 1-6 ply, much shorter than those typically tested.

jp · Post by jp » Mon Aug 27, 2018 9:46 am

yanquis1972 wrote: ↑Sun Aug 26, 2018 6:57 pm We don’t know specifics about their search (ive read they’ve been evasive when asked even general questions); we do know stockfish had a suboptimal setup and the openings used, which were 1-6 ply, much fewer than typically tested.

0 ply for the 100-game match they and media focus on, ignoring the 1200 other games, so SF commits opening suicide 100 times.

Yeah, search specifics may be the most crucial thing for lc0.

yanquis1972 · Post by **yanquis1972** » Mon Aug 27, 2018 9:13 pm

Wait, really?? Can’t believe I missed that. I don’t think the issue is opening suicide so much as repeated openings. Still, the 1200-game match scores are extremely impressive (with the White pieces). I don’t think Lc0 is quite there yet. Maybe a few million more games at the latest (lowest?) LR to solidify endgame evaluation.

George Tsavdaris · Post by **George Tsavdaris** » Wed Aug 29, 2018 11:20 pm

I found this result very interesting:

Leela Lc0 11089 running on 4x P100 GPUs versus Stockfish dev 180829 running on 43 cores 16 GB hash, ended +1 -0 =9 in favor of Leela!!

Leela was getting 20 kN/s to 60 kN/s and Stockfish 35 MN/s to 60 MN/s !!
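
For scale, a +1 -0 =9 result can be converted to an Elo estimate with the usual logistic formula. The sketch below is illustrative only: the function names are made up, and the error estimate is a simple normal approximation that is very loose for 10 games.

```python
import math


def match_score(wins: int, draws: int, losses: int) -> float:
    """Fraction of points scored (win = 1, draw = 0.5, loss = 0)."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def score_standard_error(wins: int, draws: int, losses: int) -> float:
    """Standard error of the mean score from per-game results."""
    n = wins + draws + losses
    s = match_score(wins, draws, losses)
    variance = (wins * (1.0 - s) ** 2
                + draws * (0.5 - s) ** 2
                + losses * (0.0 - s) ** 2) / n
    return math.sqrt(variance / n)


s = match_score(1, 9, 0)
se = score_standard_error(1, 9, 0)
low, high = s - 1.96 * se, s + 1.96 * se  # rough 95% interval on the score
print(f"score {s:.2f}, Elo {elo_from_score(s):+.0f}, "
      f"rough range {elo_from_score(low):+.0f} to {elo_from_score(high):+.0f}")
```

The point estimate is about +35 Elo, but with only 10 games the rough interval still includes zero, so the result is suggestive rather than conclusive.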

Vinvin · Post by **Vinvin** » Thu Aug 30, 2018 12:16 am

Nice! These conditions are very close to the AlphaZero vs Stockfish 8 match!

Wikipedia wrote:AlphaZero searches just 80,000 positions per second in chess ..., compared to 70 million for Stockfish

Vinvin · Post by **Vinvin** » Thu Aug 30, 2018 1:04 am

And a sheet with estimates to recap: https://docs.google.com/spreadsheets/d/ ... id=8107660

CMCanavessi · Post by **CMCanavessi** » Thu Aug 30, 2018 10:32 pm

There's an interesting ongoing match between lc0 on 1 P100 and SFdev on 57 threads here: http://tcecbeta.chessdb.cn/bonusbeta/live.html

So far after 11 games, SF is +1.
