Next-Gen GPUs for LC0


Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Next-Gen GPUs for LC0

Post by Milos »

smatovic wrote: Tue Sep 29, 2020 12:21 pm
Yes, Big Navi with RDNA 2 is supposed to catch up with Nvidia's high-end line, at least for gaming. We already saw a second launch of the same architecture with better performance/price in the RTX 20xx Super series; it remains open whether such a thing will happen after AMD launches its high-end series... Sad that Intel does not launch its Xe-HPG this year... would have been fun :)

Milos wrote: Tue Sep 29, 2020 12:39 pm
Both OpenCL and ROCm are crap compared to CUDA and cuDNN, so I see little point in mentioning Big Navi in the context of DL.
One needs a 2x faster RDNA2 card in terms of TFLOPS to match the performance of an RTX card.

smatovic wrote: Tue Sep 29, 2020 12:48 pm
- some people prefer AMD over Nvidia
- the DX12 backend of Lc0 runs on AMD too?
- you miss the point: competition is good for us end users; if we have three gaming vendors competing, we profit from the performance/price competition

--
Srdja

Milos wrote: Tue Sep 29, 2020 1:01 pm
Gamers profit for sure, ML scientists not at all. Who wants to buy an AMD card that costs $1200 and has worse performance for ML than an NVIDIA card that costs $500???

smatovic wrote: Tue Sep 29, 2020 1:55 pm
Hmm, why did DOE choose Intel (Aurora), AMD (Frontier), AMD (El Capitan) for its upcoming exaFLOP systems, and not IBM/Nvidia?

--
Srdja

Because they don't care about wasting taxpayers' money. ;)
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Next-Gen GPUs for LC0

Post by Laskos »

mehmet123 wrote: Mon Sep 28, 2020 5:58 pm
Lc0 benchmarks with SV-3010 network (384x30)

Default settings (minibatch-size=256)
--------------------------------------------------
GPU          baseline   optimized   perf gain (%)
--------------------------------------------------
Titan RTX    17443      20084       15.1
RTX 3090     26820      29767       11.0
A100         41785      48815       16.8

Minibatch-size=1024, all other settings default:
--------------------------------------------------
GPU          baseline   optimized   perf gain (%)
--------------------------------------------------
Titan RTX    20211      23003       13.8
RTX 3090     33032      36924       11.8
A100         52732      59134       12.1

(From Lc0 Discord)

Are there reasons the 3xxx cards need larger batches than the 2xxx cards? The goal is playing strength, not NPS, and with 2xxx cards the best batch size was 256.
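
For reference, here is a rough sketch of how much each card gains in raw speed from the larger batch, using only the baseline figures from the table quoted above. The names `BASELINE_NPS`, `batch_gain_percent` and `batch_gains` are made up for this illustration, and treating the table values as NPS is an assumption. It says nothing about playing strength, which is the actual question.

```python
# Baseline figures copied from the quoted benchmark table.
# Assumption: the values are nodes per second (NPS); the table does not label units.
BASELINE_NPS = {
    "Titan RTX": {256: 17443, 1024: 20211},
    "RTX 3090": {256: 26820, 1024: 33032},
    "A100": {256: 41785, 1024: 52732},
}


def batch_gain_percent(nps_small, nps_large):
    """Relative speed gain (in percent) when moving from the small to the large batch."""
    return 100.0 * (nps_large - nps_small) / nps_small


def batch_gains(table):
    """Map each GPU to its gain from minibatch-size=256 to 1024, rounded to 0.1%."""
    return {
        gpu: round(batch_gain_percent(nps[256], nps[1024]), 1)
        for gpu, nps in table.items()
    }


if __name__ == "__main__":
    for gpu, gain in batch_gains(BASELINE_NPS).items():
        print(f"{gpu}: +{gain}% at minibatch-size=1024")
```

On these numbers the Titan RTX gains about 15.9%, the RTX 3090 about 23.2% and the A100 about 26.2%, so the newer cards do pick up more raw speed from the bigger batch. Whether that speed translates into strength would still have to be tested in games.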
mmt
Posts: 343
Joined: Sun Aug 25, 2019 8:33 am
Full name: .

Re: Next-Gen GPUs for LC0

Post by mmt »

TensorFlow FP16 benchmarks for the 3080 and 3090: https://www.pugetsystems.com/labs/hpc/R ... nary-1902/. The FP16 improvement with CUDA 11 is disappointing, but they say:

"The current CUDA 11.0 does not have full support for the GA102 chips used in the RTX 3090 and RTX3080 (sm_86)."

"The surprising results were how much better the RTX20 GPUs performed with CUDA 11 and TensorFlow 1.15."