with cudnn 7.3 and 411.63 driver available at nvidia.com
minibatch-size=512, network id: 11250, go nodes 1000000
fp32 fp16
Titan V: 13295 29379
RTX 2080: 9708 26678
RTX 2080Ti: 12208 32472
About fp32 and fp16: this is the calculation precision of the neural network inference. fp32 refers to 32-bit floats, fp16 to 16-bit floats. It has been experimentally confirmed that the reduced floating-point accuracy of 16-bit NN inference does not significantly reduce Lc0's playing strength. However, there is not much point in using it with GTX 10xx GPUs, since those are not optimised for fp16. The RTX cards, on the other hand, are: in their case fp16 gains a large amount of speed, as can be seen from those benchmarks.
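To see why halving the precision costs so little accuracy, here is a small sketch (using NumPy's half-precision type as a stand-in for the GPU's fp16 arithmetic):

```python
import numpy as np

# fp32 carries ~7 significant decimal digits, fp16 only ~3.
# A typical NN weight survives the fp32 -> fp16 round-trip with a
# relative error on the order of 1e-4, which is why inference
# strength barely suffers.
w32 = np.float32(0.123456789)
w16 = np.float16(w32)
rel_err = abs(float(w16) - float(w32)) / float(w32)
print(f"fp32: {float(w32):.9f}  fp16: {float(w16):.9f}  rel. error: {rel_err:.1e}")

# fp16 also has a much narrower range: anything above ~65504
# overflows to infinity, so activations must stay well scaled.
print(np.float16(70000.0))  # inf
```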
As for how to use it: you initialise Lc0 with "backend=cudnn-fp16" instead of "backend=cudnn".
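For concreteness, a sketch of how that looks in practice with a v0.17/v0.18-era Lc0 build (the weights filename here is illustrative, not a real download name):

```shell
# Start Lc0 with the fp16 cuDNN backend (Volta/Turing cards only):
./lc0 --backend=cudnn-fp16 --weights=weights_11250.pb.gz

# Or, from a UCI GUI, send the equivalent option:
#   setoption name Backend value cudnn-fp16
```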
So the Titan V benefits the least from tensor cores, then the 2080 Ti, with the 2080 benefiting the most: it gains almost 175% on top of its fp32 performance. Interesting.
Still, even with the 2080 the tensor cores are using only 10% of their maximum bandwidth. With the Titan V that was less than 5%.
So an RTX 2080 will give around a factor-3 improvement over the GTX 1080 Ti when using fp16, which the GTX doesn't support. For the 2080 Ti, the improvement is a factor of 3.6. The RTX 2070 isn't released yet, but I would guess it should still pull around 20k nps on fp16, at a much lower power consumption and purchase price than the 1080 Ti.
Considering how small the tensor cores' impact on performance is, the 2070 will definitely be the best bang for the buck.
Plus, two 2070s should in practice be about 25% stronger than a 2080 Ti at the same price and power consumption.
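The speed-ups being discussed can be checked directly against the benchmark figures (the GTX 1080 Ti and RTX 2070 numbers are the ones quoted later in the thread):

```python
# nps (nodes per second) figures copied from the benchmark tables in this thread
nps = {
    "Titan V":    {"fp32": 13295, "fp16": 29379},
    "RTX 2070":   {"fp32": 8841,  "fp16": 23721},
    "RTX 2080":   {"fp32": 9708,  "fp16": 26678},
    "RTX 2080Ti": {"fp32": 12208, "fp16": 32472},
}
gtx_1080ti_fp32 = 8996  # the 1080 Ti has no usable fp16 mode

# fp16 gain over fp32 on the same card
for card, v in nps.items():
    print(f"{card}: fp16 is {v['fp16'] / v['fp32']:.2f}x its own fp32 speed")

# RTX fp16 vs the fp32-only GTX 1080 Ti
print(f"2080 vs 1080Ti:   {nps['RTX 2080']['fp16'] / gtx_1080ti_fp32:.1f}x")
print(f"2080Ti vs 1080Ti: {nps['RTX 2080Ti']['fp16'] / gtx_1080ti_fp32:.1f}x")

# Two 2070s vs one 2080 Ti, in raw nps (real multi-GPU scaling is worse)
print(f"2x 2070 vs 2080Ti: {2 * nps['RTX 2070']['fp16'] / nps['RTX 2080Ti']['fp16']:.2f}x")
```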
How/where do you initialise Lc0 with "backend=cudnn-fp16"?
Robert Pope wrote: ↑Wed Oct 17, 2018 8:34 pm
Now that the 2070 is out, I was wondering if we have confirmation that it will do FP16, and how it stacks up against the others:
I have the same question. From the specifications it seems it does support fp16, and should come in at ~20,000 nps, i.e. about 3-3.5 times faster than my GTX 1060. In a month or so I will have one, assuming that estimate holds.
with cudnn 7.3 and 411.63 driver available at nvidia.com
minibatch-size=512, network id: 11250, go nodes 1000000
v0.17, default values for all other settings
(2070 run was with v0.18.1 lc0 build but with same settings)
fp32 fp16
GTX 1080Ti: 8996 -
Titan V: 13295 29379
RTX 2070: 8841 23721
RTX 2080: 9708 26678
RTX 2080Ti: 12208 32472
I'm not sure, though, whether a v0.17 vs v0.18.1 comparison is fair; I don't remember which changes went in between.
Wow, thanks! RTX 2070 seems to be the best buy, and is about 4 times faster than my GTX 1060. Necessary procurement.
I just splashed out on a new PC with a 2080 Ti, which is theoretically arriving in a few days. I am also wanting to get another 2080 Ti card to swap in, or to sit alongside the 1060 in my current PC. Has anybody done a one-for-one replacement swap of these two? Did any issues arise, or is it straightforward? Win10, btw. It's amazing how resnet programming gets you into spending money!
Wow, thanks! RTX 2070 seems to be the best buy, and is about 4 times faster than my GTX 1060. Necessary procurement.
It would get really interesting if you bought two of them and perhaps ended up quite a bit faster than a 2080 Ti.