LC0 cuda speed question
Moderators: hgm, Dann Corbit, Harvey Williamson
Re: LC0 cuda speed question
Well, how do I use those weights instead?
Re: LC0 cuda speed question
Save the weight file in the same folder as lczero.010.exe (or a newer lczero.xxx.exe).
You do not need to unzip or unpack it.
Remove any other txt files from that folder.
Lczero will then use the file ending in txt.gz.
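If you want to double-check the folder before starting the engine, here is a minimal Python sketch. It only looks at file names, based on the description above (one txt.gz weight file, no other txt files next to the exe). The function name `check_weights_dir` is made up for this example, and it does not reproduce how Lczero itself picks the file.

```python
from pathlib import Path


def check_weights_dir(engine_dir):
    """Return (weight_files, stray_txt_files) found directly in engine_dir.

    weight_files:    names ending in ".txt.gz" (the packed weight files)
    stray_txt_files: names ending in plain ".txt", which the post above
                     says should be removed from the folder
    """
    folder = Path(engine_dir)
    files = [p.name for p in folder.iterdir() if p.is_file()]
    weight_files = sorted(n for n in files if n.endswith(".txt.gz"))
    stray_txt_files = sorted(n for n in files if n.endswith(".txt"))
    return weight_files, stray_txt_files
```

Ideally the first list has exactly one entry and the second list is empty.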
Re: LC0 cuda speed question
2x GTX 1060 Net ID 10970
--threads=4
--minibatch-size=256
--backend=multiplexing
--backend-opts=(backend=cudnn,gpu=0,max_batch=1024),(backend=cudnn,gpu=1,max_batch=1024)
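The options above follow a regular pattern, one parenthesised group per GPU. A small Python sketch that builds the same flag list is below. The helper names `multiplexing_opts` and `lc0_flags` are hypothetical, and extending the pattern beyond the two GPUs shown here is an assumption I have not tested.

```python
def multiplexing_opts(num_gpus, backend="cudnn", max_batch=1024):
    """Build the --backend-opts value: one (backend=...,gpu=N,max_batch=...) group per GPU."""
    groups = [
        f"(backend={backend},gpu={gpu},max_batch={max_batch})"
        for gpu in range(num_gpus)
    ]
    return ",".join(groups)


def lc0_flags(num_gpus, threads=4, minibatch_size=256):
    """Return the four flags from the post above as a list of strings."""
    return [
        f"--threads={threads}",
        f"--minibatch-size={minibatch_size}",
        "--backend=multiplexing",
        f"--backend-opts={multiplexing_opts(num_gpus)}",
    ]


if __name__ == "__main__":
    # Print the flags for the 2x GTX 1060 setup, one per line.
    print("\n".join(lc0_flags(2)))
```

With `num_gpus=2` this gives exactly the four lines posted above.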


"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers
Re: LC0 cuda speed question
With my GTX 770 and default options with NN 588 I get about 2,750 nps after "hashfull", which is after 130 seconds. Eleven seconds is too short for it to "spin up"; at that point I get about 2,100 nps. So, your results look like mine.
Brian
Re: LC0 cuda speed question
Net 588 is "small", i.e. a 15x192 net.
You would get at least 2x lower nps with a "large", i.e. 20x256, net.
Btw, the GTX 770 is 2.5x slower than the 1060.
With the GTX 770 and a 20x256 net I was getting around 1400 nps after 300k nodes.
With the GTX 1060 and a 20x256 net I am getting close to 4000 nps after 300k nodes. However, my GTX 1060 is strongly OC'ed; otherwise it would be the usual 3600 nps.
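For anyone who wants to check the ratios, here is a quick Python sketch using only the nps figures quoted in this thread. The function name `speedup` is just for illustration. The figures come from different machines and different run lengths, so treat the results as rough.

```python
def speedup(fast_nps, slow_nps):
    """Return how many times faster fast_nps is than slow_nps."""
    return fast_nps / slow_nps


# nps figures as quoted in this thread
GTX770_20X256 = 1400      # after 300k nodes
GTX1060_20X256 = 3600     # "usual" figure, not overclocked
GTX1060_OC_20X256 = 4000  # strongly overclocked card
GTX770_15X192 = 2750      # net 588, after "hashfull" (about 130 seconds)

if __name__ == "__main__":
    print(round(speedup(GTX1060_20X256, GTX770_20X256), 2))     # 1060 vs 770
    print(round(speedup(GTX1060_OC_20X256, GTX770_20X256), 2))  # OC'ed 1060 vs 770
    print(round(speedup(GTX770_15X192, GTX770_20X256), 2))      # small vs large net on the 770
```

The first ratio comes out a little above 2.5, in line with the "2.5x slower" figure, and the small-versus-large net ratio on the 770 comes out close to 2.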
Re: LC0 cuda speed question
I'm running LC0 cuda (default settings) with my Geforce GTX 760 and only getting 875 nps. Something doesn't seem right?
[Attachment: Capture.JPG]
Dragon by Komodo Chess (beta tester)
Re: LC0 cuda speed question
That seems about right if you are using 20x256 net. On my Dell Laptop with an Nvidia MX150 GPU I am getting between 865 - 900 nps.
Last edited by AdminX on Thu Aug 23, 2018 9:59 pm, edited 1 time in total.
"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers
Re: LC0 cuda speed question
I was responding to original post by jmartus, which I probably should have quoted, as I just happened to have a GTX 770. In any case, thanks for your info.
Currently, I'm running some test matches between 6x64, 15x192, and 20x256 NNs on my GTX 1070 system against various other engines. Although slower in nps, the larger NNs are clearly stronger. I would like to get a better idea of how much stronger.
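To put a number on "how much stronger", one common approach is to convert a match score into an Elo difference with the standard logistic formula. A minimal Python sketch is below. The function names are made up for this example, it gives no error bars, and no actual match results from these tests are assumed.

```python
import math


def match_score(wins, draws, losses):
    """Fraction of points scored: a win counts 1, a draw 0.5."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    return (wins + 0.5 * draws) / games


def elo_diff(score):
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)
```

An even score gives a difference of 0, and a 75% score corresponds to roughly 190 Elo. With few games the estimate is noisy, so longer matches give a more trustworthy number.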
Milos wrote: Thu Aug 23, 2018 8:05 pm
net 588 is "small" i.e. 15x192 net.
You would get at least 2x smaller nps with "large" i.e. 20x256 net.
Btw, GTX770 is 2.5x slower than 1060.
With GTX770 and 20x256 net I was getting around 1400nps after 300k nodes.
With GTX1060 and 20x256 net I am getting close to 4000nps after 300k nodes. However, my GTX1060 is strongly OC'ed, otherwise it would be the usual 3600nps.
Re: LC0 cuda speed question
I'm using a 20x256 net. Well, thanks for letting me know.

Dragon by Komodo Chess (beta tester)