
lczero faq

Posted: Fri Apr 13, 2018 10:05 am
by duncan
I don't understand much of what is going on, and I suspect there are many others in a similar position. Is anybody interested in creating, or linking to, an FAQ?

Re: lczero faq

Posted: Fri Apr 13, 2018 10:06 am
by AdminX

Re: lczero faq

Posted: Fri Apr 13, 2018 5:27 pm
by duncan
Thanks for the links, although they don't help me to know what these networks are, how they work and improve with each version, or what buffers are, etc.

Re: lczero faq

Posted: Fri Apr 13, 2018 5:36 pm
by Robert Pope
Basically, the networks are the weights of the evaluation function. As LCZero learns, the weights in the networks get better tuned and it plays stronger.
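To make that concrete, here is a toy sketch (not LCZero's real architecture, just an illustration): the "network" is fixed arithmetic whose behaviour is entirely determined by its weights, so training changes the weight values, not the structure.

```python
# Hypothetical one-layer "evaluation network": the weights are the network.
def evaluate(features, weights, bias):
    # A single linear layer plus a clamp, standing in for a real deep net.
    score = sum(f * w for f, w in zip(features, weights)) + bias
    return max(-1.0, min(1.0, score))  # value in [-1, 1], like a game outcome

features = [1.0, 0.5, -0.25]                           # made-up board features
untrained = evaluate(features, [0.1, 0.1, 0.1], 0.0)   # arbitrary starting weights
trained   = evaluate(features, [0.9, -0.2, 0.4], 0.1)  # "better tuned" weights
```

Same function, same amount of math either way; only the weight values differ between the untrained and trained calls.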

Re: lczero faq

Posted: Fri Apr 13, 2018 5:58 pm
by duncan
Robert Pope wrote:Basically, the networks are the weights of the evaluation function. As LCZero learns, the weights in the networks get better tuned and it plays stronger.
Thanks. Why does it take longer to make moves as the network gets better, and why is a CPU not so suitable for play?

Re: lczero faq

Posted: Fri Apr 13, 2018 6:23 pm
by MonteCarlo
It doesn't take longer just because the network gets better.

It's slower now because the network size was recently increased substantially.

A larger network means evaluating it is more computationally expensive: essentially, the NN is just a giant set of math operations, and now there are a lot more of them. With a fixed network size (which this should now be for a while), the speed will stay the same whether the network improves or regresses, because training doesn't increase the number of weights/operations; it just changes their values to ones that work better (or worse).
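A rough cost model makes the point. Assuming (as an illustration, not LCZero's exact op count) that a stack of convolutional blocks with F filters costs on the order of blocks × F² multiply-adds per board square, the 6x64 → 10x128 change mentioned later in the thread works out to several times more math per evaluation:

```python
# Rough, hypothetical cost model: ops ~ blocks * filters^2 per square.
def relative_cost(blocks, filters):
    return blocks * filters * filters

old = relative_cost(6, 64)     # the smaller net
new = relative_cost(10, 128)   # the larger net
ratio = new / old              # roughly 6.7x more math per evaluation
```

Note the model only depends on the network's dimensions; plugging in better or worse weight values doesn't change the operation count at all.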

The operations performed by the NN basically require doing the same operation on a lot of different data independently; this makes them quite amenable to running on GPUs, which are designed for just such purposes. At a very abstract level, GPUs basically have thousands of units for doing math, so as long as you have thousands of math operations that can be done independently, they'll be well-suited to the task.

They're much less well-suited for things that involve a lot of branching or task switching, or where you have to figure out the result of calculation N before you can perform calculation N+1, exactly the sorts of things that CPUs are optimized for.
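The contrast can be sketched in a few lines (conceptual only, not a benchmark): NN math is mostly "apply the same operation to many values independently", which maps onto a GPU's thousands of lanes, while a dependent chain of calculations cannot be split up that way.

```python
# Data-parallel: every multiply is independent of the others, so in principle
# each one could run on a separate GPU lane at the same time.
weights = [0.5] * 8
inputs  = [float(i) for i in range(8)]
parallel_friendly = [w * x for w, x in zip(weights, inputs)]

# Sequential: step N+1 needs the result of step N, so extra lanes can't help.
acc = 0.0
for x in inputs:
    acc = acc * 0.9 + x   # each iteration depends on the previous one
```

The first loop is the shape of NN evaluation; the second is the shape of the branchy, dependent work that CPUs are optimized for.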

Evaluating an NN is just the sort of problem that lies in the GPU's sweet spot, and isn't in the CPU's. The problem is exacerbated by larger network sizes. The previous network was small enough that the gap between CPU and GPU was not too difficult to overcome by using a handful of CPU cores.

After the recent increase in network size, though, GPU users were only mildly affected, because the previous net was so small that decent GPUs weren't even getting 100% utilized.

For CPU users, however, NPS dropped by 2-3x (although the strength improvement from the new larger network basically made this a wash for gameplay).

Re: lczero faq

Posted: Sun Apr 15, 2018 12:06 am
by duncan
MonteCarlo wrote:It doesn't take longer just because the network gets better. [...]
thanks for your reply.


I read that the network changed from 6*64 to 10*128. What do these figures mean?

And how is it decided when to change to a larger network?