Post by JJJ » Fri Sep 21, 2018 1:47 pm

megamau wrote:
Fri Sep 21, 2018 12:25 pm
Milos wrote:
Sun Sep 16, 2018 5:22 am
ankan wrote:
Sun Sep 16, 2018 4:09 am
Milos wrote:
Sat Sep 15, 2018 11:24 pm
Since FP16 is not enabled in 20xx cards, just as it wasn't enabled in 10xx cards, the only gain is the 15% from extra CUDA cores and higher frequency. Therefore the 2080Ti will be faster than the 1080Ti by exactly that 15%. Anyone who believes in some other magical speed-up is, frankly speaking, just daydreaming.
This is definitely not true. The fp16 path of lc0 uses tensor cores on Volta, and they do help 3x3 convolutions. The reason you see only about a 3x speedup at best (compared to 8x if you compare peak fp16 tensor math against regular fp32 throughput) is that the fp32 path uses the Winograd algorithm, which is 2-3x faster than the regular implicit GEMM algorithm used by the fp16 path. As you said, tensor cores just give you 4x4 matrix multiplications, and making them work with the Winograd algorithm is hard.

2080Ti should be almost as fast as a TitanV for lc0 (or ~3X faster than 1080Ti when using fp16 mode).
Well you might be one of thousands of other Indian guys writing drivers for Nvidia, but what you are writing is definitively false.
Titan V has FP16 working in CUDA, 2080Ti doesn't. ....
Since the 2080Ti doesn't have FP16 working in CUDA cores, its additional speed-up can be only 5% thanks to Tensor cores (plus around 15% thanks to more CUDA cores). Your stories about a 3x speed-up for the 2080Ti compared to the 1080Ti are nothing but marketing for your company. You are simply biased since you have a vested interest.
So Milos, now that the cards and the benchmarks are available, is it time to admit you were wrong?
arrogance precedes the fall

