LC0 cuda speed question


jmartus
Posts: 256
Joined: Sun May 16, 2010 2:50 am

Re: LC0 cuda speed question

Post by jmartus »

Well, how do I use those weights instead?
consen
Posts: 80
Joined: Tue Mar 11, 2014 6:09 pm
Location: Norge

Re: LC0 cuda speed question

Post by consen »

Save the weights file in the same folder as lczero.010.exe
(or a newer lczero.xxx.exe).
You do not need to unzip or unpack it.
Then remove any other txt files from that directory (folder).
Lczero will then use the file ending in txt.gz.
(No need to unpack the gz file.)
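
The folder check described above can be sketched in Python. This is only an illustration: the function name `find_weights` is made up here, and it assumes (going by the post, not by the engine's source) that the engine picks up a file ending in .txt.gz and that plain .txt files in the same folder get in the way.

```python
from pathlib import Path


def find_weights(engine_dir):
    """Return (weights_file, stray_txt_files) for an engine folder.

    Assumption taken from the post: the engine uses the file ending in
    .txt.gz, and other .txt files in the folder should be moved away.
    """
    folder = Path(engine_dir)
    # Candidate weight files: still gzipped, no unpacking needed.
    gz_files = sorted(p for p in folder.iterdir() if p.name.endswith(".txt.gz"))
    # Plain .txt files; a name ending in .txt.gz has suffix ".gz", so it is not matched.
    stray = sorted(p for p in folder.iterdir() if p.suffix == ".txt")
    weights = gz_files[0] if gz_files else None
    return weights, stray
```

Calling `find_weights` on the folder that holds the engine executable returns the weights file it found (or `None`) and the list of .txt files you would move elsewhere.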
AdminX
Posts: 6340
Joined: Mon Mar 13, 2006 2:34 pm
Location: Acworth, GA

Re: LC0 cuda speed question

Post by AdminX »

2x GTX 1060 Net ID 10970

--threads=4
--minibatch-size=256
--backend=multiplexing
--backend-opts=(backend=cudnn,gpu=0,max_batch=1024),(backend=cudnn,gpu=1,max_batch=1024)
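
For anyone with a different number of cards, the flags above follow a simple pattern: one (backend=cudnn,gpu=N,max_batch=...) group per GPU. A minimal Python sketch that assembles the same flag strings is below. The helper name `multiplexing_flags` is hypothetical, the defaults are simply the values from this post, and whether they suit other hardware is untested.

```python
def multiplexing_flags(num_gpus, threads=4, minibatch_size=256, max_batch=1024):
    """Build the flag list from the post for `num_gpus` cudnn backends.

    Only flags that appear in the post are used; the values are that
    poster's settings for 2x GTX 1060, not tuned recommendations.
    """
    # One parenthesised option group per GPU, joined by commas.
    per_gpu = ",".join(
        f"(backend=cudnn,gpu={i},max_batch={max_batch})" for i in range(num_gpus)
    )
    return [
        f"--threads={threads}",
        f"--minibatch-size={minibatch_size}",
        "--backend=multiplexing",
        f"--backend-opts={per_gpu}",
    ]


# With two GPUs this prints the four flags in the same form as the post.
print("\n".join(multiplexing_flags(2)))
```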

"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers
brianr
Posts: 536
Joined: Thu Mar 09, 2006 3:01 pm

Re: LC0 cuda speed question

Post by brianr »

With my GTX-770 and default options with NN 588, I get about 2,750 nps after "hashfull", which is after 130 seconds. Eleven seconds is too short for it to "spin up"; at that point I get about 2,100 nps. So, your results look like mine.
Brian
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: LC0 cuda speed question

Post by Milos »

brianr wrote: Thu Aug 23, 2018 9:02 pm With my GTX-770 and default options with NN 588, I get about 2,750 nps after "hashfull", which is after 130 seconds. Eleven seconds is too short for it to "spin up"; at that point I get about 2,100 nps. So, your results look like mine.
Brian
Net 588 is a "small" net, i.e. 15x192.
With a "large" net, i.e. 20x256, your nps would be at least 2x lower.
Btw, the GTX 770 is 2.5x slower than the 1060.
With a GTX 770 and a 20x256 net I was getting around 1400 nps after 300k nodes.
With a GTX 1060 and a 20x256 net I am getting close to 4000 nps after 300k nodes. However, my GTX 1060 is strongly overclocked; without the overclock it would be the usual 3600 nps.
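
As a rough sanity check, the ratios implied by the numbers quoted in this thread can be worked out in a few lines of Python. The variable names are made up for illustration, and the figures were measured at different points in the search ("hashfull" vs. 300k nodes), so treat the results as ballpark only.

```python
# nps figures quoted in this thread
gtx770_small = 2750      # GTX 770, net 588 (15x192), after "hashfull"
gtx770_large = 1400      # GTX 770, 20x256 net, after 300k nodes
gtx1060_large = 3600     # GTX 1060 without overclock, 20x256 net
gtx1060_large_oc = 4000  # same GTX 1060, overclocked

# How much faster the 1060 is than the 770 on the same net size.
card_ratio = gtx1060_large / gtx770_large
# How much the larger net costs in nps on the same card.
net_ratio = gtx770_small / gtx770_large
# What the overclock buys on the 1060.
oc_gain = gtx1060_large_oc / gtx1060_large

print(f"GTX 1060 vs GTX 770 (20x256): {card_ratio:.2f}x")
print(f"15x192 vs 20x256 (GTX 770):   {net_ratio:.2f}x")
print(f"overclock gain (GTX 1060):    {oc_gain:.2f}x")
```

The first two ratios come out close to the "2.5x" and "2x" figures given in the post.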
cc2150dx
Posts: 325
Joined: Sat Nov 30, 2013 9:51 am
Location: Canada
Full name: Jason Coombs

Re: LC0 cuda speed question

Post by cc2150dx »

I'm running LC0 cuda (default settings) with my Geforce GTX 760 and only getting 875 nps. Something doesn't seem right?
AdminX
Posts: 6340
Joined: Mon Mar 13, 2006 2:34 pm
Location: Acworth, GA

Re: LC0 cuda speed question

Post by AdminX »

cc2150dx wrote: Thu Aug 23, 2018 10:27 pm I'm running LC0 cuda (default settings) with my Geforce GTX 760 and only getting 875 nps. Something doesn't seem right?
That seems about right if you are using a 20x256 net. On my Dell laptop with an Nvidia MX150 GPU I am getting between 865 and 900 nps.
Last edited by AdminX on Thu Aug 23, 2018 11:59 pm, edited 1 time in total.
brianr
Posts: 536
Joined: Thu Mar 09, 2006 3:01 pm

Re: LC0 cuda speed question

Post by brianr »

I was responding to the original post by jmartus, which I probably should have quoted, as I just happened to have a GTX 770. In any case, thanks for your info.

Currently, I'm running some test matches between 6x64, 15x192, and 20x256 NNs on my GTX 1070 system against various other engines. Although slower in nps, the larger NNs are clearly stronger. I would like to get a better idea of how much stronger.
Milos wrote: Thu Aug 23, 2018 10:05 pm
brianr wrote: Thu Aug 23, 2018 9:02 pm With my GTX-770 and default options with NN 588, I get about 2,750 nps after "hashfull", which is after 130 seconds. Eleven seconds is too short for it to "spin up"; at that point I get about 2,100 nps. So, your results look like mine.
Brian
Net 588 is a "small" net, i.e. 15x192.
With a "large" net, i.e. 20x256, your nps would be at least 2x lower.
Btw, the GTX 770 is 2.5x slower than the 1060.
With a GTX 770 and a 20x256 net I was getting around 1400 nps after 300k nodes.
With a GTX 1060 and a 20x256 net I am getting close to 4000 nps after 300k nodes. However, my GTX 1060 is strongly overclocked; without the overclock it would be the usual 3600 nps.
cc2150dx
Posts: 325
Joined: Sat Nov 30, 2013 9:51 am
Location: Canada
Full name: Jason Coombs

Re: LC0 cuda speed question

Post by cc2150dx »

AdminX wrote: Thu Aug 23, 2018 11:55 pm
cc2150dx wrote: Thu Aug 23, 2018 10:27 pm I'm running LC0 cuda (default settings) with my Geforce GTX 760 and only getting 875 nps. Something doesn't seem right?
That seems about right if you are using a 20x256 net. On my Dell laptop with an Nvidia MX150 GPU I am getting between 865 and 900 nps.
I'm using a 20x256 net. Well, thanks for letting me know :)