smatovic wrote: ↑Tue Sep 29, 2020 12:21 pm
Yes, Big Navi with RDNA 2 is supposed to catch up with Nvidia's high-end line, at least for gaming. With the RTX 20xx Super series we already saw a second launch of the same architecture at a better performance/price; it remains open whether the same will happen after AMD launches its high-end series... A pity that Intel does not launch its Xe-HPG this year... would have been fun.

Milos wrote: ↑Tue Sep 29, 2020 12:39 pm
Both OpenCL and ROCm are crap compared to CUDA and cuDNN, so I see little point in mentioning Big Navi in the context of DL. One needs an RDNA2 card with 2x the TFLOPS to match the performance of an RTX card.

smatovic wrote: ↑Tue Sep 29, 2020 12:48 pm
- some people prefer AMD over Nvidia
- the DX12 backend of Lc0 runs on AMD too?
- you miss the point: competition is good for us end users; with three gaming vendors competing, we profit from the performance/price competition
--
Srdja

Milos wrote: ↑Tue Sep 29, 2020 1:01 pm
Gamers profit for sure, ML scientists not at all. Who wants to buy an AMD card that costs $1200 and has worse performance for ML than an NVIDIA card that costs $500???

smatovic wrote: ↑Tue Sep 29, 2020 1:55 pm
Hmm, why did DOE choose Intel (Aurora), AMD (Frontier), and AMD (El Capitan) for its upcoming exa-FLOP systems, and not IBM/Nvidia?
--
Srdja

Because they don't care about wasting taxpayers' money.
Next-Gen GPUs for LC0
Moderators: hgm, Rebel, chrisw
- Posts: 4190
- Joined: Wed Nov 25, 2009 1:47 am
Re: Next-Gen GPUs for LC0
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Next-Gen GPUs for LC0
mehmet123 wrote: ↑Mon Sep 28, 2020 5:58 pm
Lc0 benchmarks with SV-3010 network (384x30)

Default settings (minibatch-size=256):
---------------------------------------------
GPU          baseline   optimized   gain (%)
---------------------------------------------
Titan RTX      17443      20084       15.1
RTX 3090       26820      29767       11.0
A100           41785      48815       16.8

Minibatch-size=1024, all other settings default:
---------------------------------------------
GPU          baseline   optimized   gain (%)
---------------------------------------------
Titan RTX      20211      23003       13.8
RTX 3090       33032      36924       11.8
A100           52732      59134       12.1

(From Lc0 Discord)

Are there reasons the 3xxx units need larger batches than the 2xxx cards? The goal is strength, not NPS, and with the 2xxx cards the best batch size was 256.
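For what it's worth, the "gain (%)" column in those tables is just the relative NPS improvement of the optimized build over the baseline. A quick sketch to reproduce it (the figures are taken from the minibatch-size=256 table above; the helper name is mine, not from Lc0):

```python
# Reproduce the "gain (%)" column: relative NPS improvement of the
# optimized build over the baseline, rounded to one decimal place.
def perf_gain(baseline: int, optimized: int) -> float:
    return round((optimized - baseline) / baseline * 100, 1)

# (baseline, optimized) NPS pairs from the minibatch-size=256 table
results = {
    "Titan RTX": (17443, 20084),
    "RTX 3090": (26820, 29767),
    "A100": (41785, 48815),
}

for gpu, (base, opt) in results.items():
    print(f"{gpu}: {perf_gain(base, opt)}%")
# Titan RTX: 15.1%, RTX 3090: 11.0%, A100: 16.8%
```

The computed values match the posted column, so the tables are at least internally consistent.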
- Posts: 343
- Joined: Sun Aug 25, 2019 8:33 am
- Full name: .
Re: Next-Gen GPUs for LC0
TensorFlow FP16 benchmarks for the 3080 and 3090: https://www.pugetsystems.com/labs/hpc/R ... nary-1902/. Disappointing improvement for FP16 with CUDA 11. But they say:
"The current CUDA 11.0 does not have full support for the GA102 chips used in the RTX 3090 and RTX 3080 (sm_86)."
"The surprising results were how much better the RTX20 GPUs performed with CUDA 11 and TensorFlow 1.15."