lc0-win-20180512-cuda90-cudnn712-00

shrapnel · Post by **shrapnel** » Mon May 21, 2018 8:20 pm

Latest lc0 binary dated 21st May by Alexander Lyashuk works very well!
Output now showing up in Arena GUI.
Also, much faster, more than 21k!

Milos · Post by **Milos** » Tue May 22, 2018 12:05 am

shrapnel wrote: ↑Mon May 21, 2018 8:20 pm Latest lc0 binary dated 21st May by Alexander Lyashuk works very well !
Output now showing up in Arena GUI .
Also, much faster, more than 21k !

He also added zlib and changed the build model to Meson, where he builds zlib from source as a static library.
Since most people use VS2017 in their build chain, even fewer will now be able to compile it for Windows.
I prefer my compile switches, so I will keep an old vcxproj file (it produces a faster binary anyway).

Laskos · Post by **Laskos** » Tue May 22, 2018 8:53 am

shrapnel wrote: ↑Mon May 21, 2018 6:34 pm As a side note, using -t3 and after switching from cudnn-9.2-windows7-x64-v7.14 to cudnn-9.2-windows10-x64-v7.14, I'm getting much better speeds, over 18k, as of now, and over 95 % GPU Utilization.
(I'm using Windows 8.1).
Better and better....

The first CUDA 9.2 version of lc0 was buggy: it showed good NPS but played significantly weaker than the CUDA 9.0 version. The problem seems solved now with the latest CUDA 9.2 version of lc0, and I am switching to it as my standard. The NPS increase is not dramatic, some 5%, but the bugs were eliminated. I compared the two versions in a gauntlet at 1s/move against Houdini 1.5a.

Code:

Games Completed = 200 of 200 (Avg game length = 131.265 sec)
Settings = Gauntlet/64MB/1000ms per move/M 9000cp for 30 moves, D 150 moves/EPD:C:\LittleBlitzer\3moves_GM_04.epd(817)
Time = 27816 sec elapsed, 0 sec remaining

 1.  Houdini 1.5a             	110.0/200	73-53-74  	(L: m=53 t=0 i=0 a=0)	(D: r=54 i=7 f=6 s=0 a=7)	(tpm=993.1 d=19.07 nps=2773781)

 2.  LC0 CUDA 9.2             	48.0/100	30-34-36  	(L: m=34 t=0 i=0 a=0)	(D: r=24 i=6 f=3 s=0 a=3)	(tpm=761.2 d=2.65 nps=6013)
 3.  LC0 CUDA 9.0             	42.0/100	23-39-38  	(L: m=39 t=0 i=0 a=0)	(D: r=30 i=1 f=3 s=0 a=4)	(tpm=761.8 d=2.58 nps=5776)

The difference is small, and so is the number of games. What can be said is that the CUDA 9.2 version of lc0 is 40 +/- 80 (2SD) Elo points stronger than the CUDA 9.0 version, or that there is about an 80-85% probability that the CUDA 9.2 version is better. I am getting about 7-8k nodes per second after, say, one minute on the initial position on my GTX 1060, so your 18k with a GTX 1080 Ti is expected.
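For anyone who wants to check that kind of estimate, here is a rough Python sketch of one way to get an Elo difference, a 2SD margin and a "probability of being better" from the two gauntlet rows above. It is not necessarily the method used in the post. It assumes the logistic Elo model, a normal approximation, and that the two 100-game samples are independent; the function names are made up for illustration.

```python
import math

def score_stats(wins, losses, draws):
    """Return (mean score, standard error of the mean) for one engine."""
    n = wins + losses + draws
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of a result taking values 1, 0.5 or 0.
    var = (wins * (1 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * mean ** 2) / n
    return mean, math.sqrt(var / n)

def elo(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_sd(wins, losses, draws):
    """Elo versus the common opponent and its approximate standard deviation."""
    mean, se = score_stats(wins, losses, draws)
    # Delta method: d(elo)/d(score) = 400 / (ln(10) * s * (1 - s)).
    slope = 400.0 / (math.log(10) * mean * (1 - mean))
    return elo(mean), slope * se

# Win-loss-draw counts taken from the LittleBlitzer table above.
new_elo, new_sd = elo_with_sd(30, 34, 36)  # LC0 CUDA 9.2
old_elo, old_sd = elo_with_sd(23, 39, 38)  # LC0 CUDA 9.0

diff = new_elo - old_elo
diff_sd = math.hypot(new_sd, old_sd)  # assumes the two samples are independent
# Probability that the true difference is positive (normal approximation).
p_better = 0.5 * (1 + math.erf(diff / (diff_sd * math.sqrt(2))))

print(f"Elo difference: {diff:.1f} +/- {2 * diff_sd:.1f} (2SD)")
print(f"P(CUDA 9.2 better): {p_better:.2f}")
```

With only 100 games per engine the margin stays much larger than the difference itself, which is why the result should be read as a hint rather than a measurement.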
I use the following UCI options for both:

Scale thinking time=2.87
Cpuct MCTS Option=3.17
First Play Urgency Reduction=-0.068

The test was performed with one of the latest nets.
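A GUI such as Arena or LittleBlitzer normally sends these settings to the engine itself, but if you drive the engine by hand, the UCI protocol sets options with `setoption name <name> value <value>` lines. Below is a small Python sketch that only builds those lines from the values listed above. The option names are copied from the post; whether a particular lc0 build accepts exactly these names is an assumption, and the helper `setoption_commands` is hypothetical.

```python
def setoption_commands(options):
    """Build UCI `setoption` lines from a mapping of option name to value."""
    return [f"setoption name {name} value {value}"
            for name, value in options.items()]

# Option names and values as listed in the post (names not verified
# against any specific lc0 build).
options = {
    "Scale thinking time": 2.87,
    "Cpuct MCTS Option": 3.17,
    "First Play Urgency Reduction": -0.068,
}

commands = setoption_commands(options)
for line in commands:
    print(line)
```

The lines would be written to the engine's standard input after the `uci` handshake and before the search starts.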

Nay Lin Tun · Post by **Nay Lin Tun** » Tue May 22, 2018 10:20 am

OMG, a good configuration on a GTX 1060 is close to Houdini 1.5.
So Leela is only 300 Elo behind Stockfish 9.

Laskos · Post by **Laskos** » Tue May 22, 2018 12:10 pm

Nay Lin Tun wrote: ↑Tue May 22, 2018 10:20 am Omg, Good configuration on GTX 1060 is close to Houdini 1.5.
So Leela is only 300 elo behind stockfish 9.

That was at 1s/move. At longer TC, LC0 will perform even better. Even at 60''+ 1'', about double TC compared to 1s/move, it is already above Houdini 1.5a.

Code:

1m + 1s
Score of LC0_CUDA_NN322 vs Houdini 1.5a: 43 - 33 - 24 [0.550]
Elo difference: 34.86 +/- 60.09

And that was with CUDA 9.0.

Werewolf · Post by **Werewolf** » Tue May 22, 2018 12:38 pm

Laskos wrote: ↑Tue May 22, 2018 12:10 pm
Nay Lin Tun wrote: ↑Tue May 22, 2018 10:20 am Omg, Good configuration on GTX 1060 is close to Houdini 1.5.
So Leela is only 300 elo behind stockfish 9.
That was at 1s/move. At longer TC, LC0 will perform even better. Even at 60''+ 1'', about double TC compared to 1s/move, it is already above Houdini 1.5a.
Code:
1m + 1s
Score of LC0_CUDA_NN322 vs Houdini 1.5a: 43 - 33 - 24 [0.550]
Elo difference: 34.86 +/- 60.09
And that was with CUDA 9.0.

Using the latest CUDA, how much stronger would you say LCZero is than the normal package, which on my 1060 card runs at about 800 nps?

Milos · Post by **Milos** » Tue May 22, 2018 1:05 pm

Werewolf wrote: ↑Tue May 22, 2018 12:38 pm
Laskos wrote: ↑Tue May 22, 2018 12:10 pm
Nay Lin Tun wrote: ↑Tue May 22, 2018 10:20 am Omg, Good configuration on GTX 1060 is close to Houdini 1.5.
So Leela is only 300 elo behind stockfish 9.
That was at 1s/move. At longer TC, LC0 will perform even better. Even at 60''+ 1'', about double TC compared to 1s/move, it is already above Houdini 1.5a.
Code:
1m + 1s
Score of LC0_CUDA_NN322 vs Houdini 1.5a: 43 - 33 - 24 [0.550]
Elo difference: 34.86 +/- 60.09
And that was with CUDA 9.0.
using the latest CUDA, how much stronger would you say LCZero is than the normal package which on my 1060 card runs at about 800 nps?

~80 Elo

shrapnel · Post by **shrapnel** » Tue May 22, 2018 1:09 pm

Laskos wrote: ↑Tue May 22, 2018 12:10 pm At longer TC, LC0 will perform even better. Even at 60''+ 1'', about double TC compared to 1s/move, it is already above Houdini 1.5a.

That's a given. I remember that in the Google AlphaZero paper, the graphs indicated that AlphaZero gained strength rapidly as more time was given, much more so than Stockfish.
So it's not surprising that lc0 is showing a similar picture.
In fact, I'm running all my tests in Arena using a 1 minute/move time control. It takes time, of course, but the results are more realistic, I think.
If Google had run 5- or 10-minute matches, maybe Stockfish would have beaten AlphaZero!

Laskos · Post by **Laskos** » Tue May 22, 2018 1:56 pm

Milos wrote: ↑Tue May 22, 2018 1:05 pm
Werewolf wrote: ↑Tue May 22, 2018 12:38 pm
Laskos wrote: ↑Tue May 22, 2018 12:10 pm
That was at 1s/move. At longer TC, LC0 will perform even better. Even at 60''+ 1'', about double TC compared to 1s/move, it is already above Houdini 1.5a.
Code:
1m + 1s
Score of LC0_CUDA_NN322 vs Houdini 1.5a: 43 - 33 - 24 [0.550]
Elo difference: 34.86 +/- 60.09
And that was with CUDA 9.0.
using the latest CUDA, how much stronger would you say LCZero is than the normal package which on my 1060 card runs at about 800 nps?
~80 Elo

Add another 80-100 for Cpuct and FPU settings, different from defaults.

Werewolf · Post by **Werewolf** » Tue May 22, 2018 1:59 pm

Laskos wrote: ↑Tue May 22, 2018 1:56 pm
Milos wrote: ↑Tue May 22, 2018 1:05 pm
Werewolf wrote: ↑Tue May 22, 2018 12:38 pm

using the latest CUDA, how much stronger would you say LCZero is than the normal package which on my 1060 card runs at about 800 nps?
~80 Elo
Add another 80-100 for Cpuct and FPU settings, different from defaults.

Wowzers. Will all this ever become part of the official package? It would seem good to include a CUDA option.
