The Lc0 0.25.1 CUDA 10.2 compile is faster.
You can download this lc0 binary for Windows 10 on Discord under test-discuss (May 1).
Or here, with all needed Nvidia 10.2 drivers (340 MB):
https://filehorst.de/d/dyhbuIiz
Short test of the CUDA 10.2 compile under Windows 10 on my slow GTX 1050 Ti, after 60 s:
Sergio 3200, new 950 nps (old 650 nps)
Sergio 1810, new 3000 nps (old 2100 nps)
701820, new 15 kns (old 14 kns)
Fat Fritz 1.1, new 2900 nps (old 2500 nps)
old = Lc0 0.25.1 with old Nvidia drivers; new = Lc0 0.25.1_10.2 compile with Nvidia drivers 10.2.
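To put the figures above on a common scale, the relative gain of each old/new pair can be worked out directly. The short Python sketch below does only that arithmetic. The dictionary layout and the helper name `speedup_percent` are illustrative choices, and the 701820 values are converted on the assumption that 1 kns = 1000 nps.

```python
# Nodes per second from the 60 s test on the GTX 1050 Ti: (old, new).
# 701820 was reported in kns; converted here assuming 1 kns = 1000 nps.
RESULTS = {
    "Sergio 3200": (650, 950),
    "Sergio 1810": (2100, 3000),
    "701820": (14_000, 15_000),
    "Fat Fritz 1.1": (2500, 2900),
}


def speedup_percent(old_nps, new_nps):
    """Relative gain of the new build over the old one, in percent."""
    return (new_nps / old_nps - 1.0) * 100.0


if __name__ == "__main__":
    for net, (old, new) in RESULTS.items():
        print(f"{net}: {speedup_percent(old, new):+.0f}%")
```

On these numbers the gain ranges from roughly 7% for 701820 to roughly 46% for Sergio 3200.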
Great work, thanks!
Lc0 CUDA 10.2 compile (0.25.1) is faster
-
- Posts: 1439
- Joined: Sat Oct 27, 2018 12:58 am
- Location: Germany
- Full name: N.N.
-
- Posts: 195
- Joined: Sun Apr 12, 2020 1:09 am
- Full name: Marc-O Moisan-Plante
Re: Lc0 CUDA 10.2 compile (0.25.1) is faster
Thank you for the info.
Unfortunately, something is wrong with the new drivers+compile combo on a GTX 1660 Ti: with 20x256 nets the speed dropped drastically from 9000 nps to a bare 1000 nps. Does anyone else have the same problem?
-
- Posts: 3657
- Joined: Wed Nov 18, 2015 11:41 am
- Location: hungary
Re: Lc0 CUDA 10.2 compile (0.25.1) is faster
If I remember correctly, CUDA 10.2 needs the VS 2019 compiler and a newer NVIDIA GPU driver.
-
- Posts: 2872
- Joined: Wed Mar 08, 2006 10:09 pm
- Location: Germany
- Full name: Werner Schüle
Re: Lc0 CUDA 10.2 compile (0.25.1) is faster
For GTX 16...
set `--backend-opts=custom_winograd=false` when using the official v0.25 build.
Werner
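As a rough illustration of where that option goes, the sketch below builds a command line for lc0's `benchmark` mode with the setting applied. It is a minimal sketch under stated assumptions: the helper name `lc0_benchmark_args`, the executable name `lc0.exe` and the weights file name `network.pb.gz` are placeholders, and the choice of the cudnn-fp16 backend is taken from the rest of this thread rather than from the post above.

```python
import shlex


def lc0_benchmark_args(lc0_path, weights_path, custom_winograd=False):
    """Build an argument list for `lc0 benchmark` (hypothetical helper)."""
    flag = "true" if custom_winograd else "false"
    return [
        lc0_path,
        "benchmark",
        f"--weights={weights_path}",
        # Assumption: the cudnn-fp16 backend, as discussed in this thread.
        "--backend=cudnn-fp16",
        f"--backend-opts=custom_winograd={flag}",
    ]


if __name__ == "__main__":
    # Placeholder names, not files mentioned in the thread.
    args = lc0_benchmark_args("lc0.exe", "network.pb.gz")
    print(shlex.join(args))
    # To actually run the benchmark: subprocess.run(args, check=True)
```

When lc0 runs inside a chess GUI instead, the same value would normally be entered in the engine's backend options field rather than on the command line.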
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Lc0 CUDA 10.2 compile (0.25.1) is faster
I am using it, seems fine, both this compile and the older v25.1 compile. Only that I am not getting any speed gain with this CUDA 10.2 build. Maybe only fp32 gets big gains.
-
- Posts: 3657
- Joined: Wed Nov 18, 2015 11:41 am
- Location: hungary
Re: Lc0 CUDA 10.2 compile (0.25.1) is faster
Laskos wrote: Mon May 04, 2020 9:44 am
"I am using it, seems fine, both this compile and the older v25.1 compile. Only that I am not getting any speed gain with this CUDA 10.2 build. Maybe only fp32 gets big gains."
Thanks for the info but if I am right you use DX12 backend and not cudnn-fp16 backend.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Lc0 CUDA 10.2 compile (0.25.1) is faster
corres wrote: Mon May 04, 2020 10:39 am
"Thanks for the info but if I am right you use DX12 backend and not cudnn-fp16 backend."
No, I benchmarked that T...160 20x256 JHorthos net using cudnn-fp16 backend. DX12 is good with larger nets, but here the issue was this CUDA 10.2 compile and libraries.
-
- Posts: 3657
- Joined: Wed Nov 18, 2015 11:41 am
- Location: hungary
Re: Lc0 CUDA 10.2 compile (0.25.1) is faster
Laskos wrote: Mon May 04, 2020 10:50 am
"No, I benchmarked that T...160 20x256 JHorthos net using cudnn-fp16 backend. DX12 is good with larger nets, but here the issue was this CUDA 10.2 compile and libraries."
I see.
-
- Posts: 2438
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
Re: Lc0 CUDA 10.2 compile (0.25.1) is faster
Same here on my RTX 2060 (mobile). No speed gain compared to the official 0.25.1 release with cudnn-fp16.