towforce wrote: ↑Thu Sep 12, 2024 12:18 am
[...]
As a general thought: graphics cards are not ideal for AI - TPUs and the like are better suited - they just happen to be relatively cheap and readily available high powered computing devices that can be put to uses other than drawing pictures.
Preamble: I am not qualified to write about the low-level implementation of neural networks for generative AI, Transformers, Stable Diffusion, etc.
What we know is that image recognition with CNNs got a boost in the 2010s from GPUs and data-level parallelism, the so-called "embarrassingly parallel" workloads:
https://en.wikipedia.org/wiki/Embarrassingly_parallel
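As a toy illustration (my own sketch, the filter and data are made up): every item in a batch can be processed independently of the others, so the work maps one-to-one onto parallel workers with no synchronization, which is exactly the pattern GPUs exploit:

    # "Embarrassing parallelism": each row is filtered independently,
    # so the batch can be farmed out to workers with no coordination.
    from multiprocessing import Pool

    def blur_row(row):
        # toy 1-D box filter over one row of pixel values
        return [sum(row[max(0, i - 1):i + 2]) / len(row[max(0, i - 1):i + 2])
                for i in range(len(row))]

    if __name__ == "__main__":
        rows = [[float(i + j) for i in range(8)] for j in range(4)]
        with Pool() as pool:
            blurred = pool.map(blur_row, rows)  # no data dependencies between rows
        print(blurred[0])

A GPU does the same thing at a much finer grain: thousands of threads, each handling one pixel or a small tile.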
What we know is that AI needs scalar, vector, and matrix operations on our computing devices, which is why by now all AI silicon offers these to various degrees and in different flavors.
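In code the three levels look like this (a minimal NumPy sketch of my own; the shapes are arbitrary):

    # Scalar, vector, and matrix operations: the three levels that
    # AI silicon accelerates to different degrees.
    import numpy as np

    a, b = 2.0, 3.0
    s = a * b                            # scalar op (plain ALU/FPU)

    x = np.arange(8, dtype=np.float32)
    v = 2.0 * x + 1.0                    # vector op (SIMD: AVX, NEON, ...)

    W = np.ones((4, 8), dtype=np.float32)
    m = W @ x                            # matrix op (tensor cores, TPUs, NPUs)

    print(s, v, m)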
What we know is that data-center GPUs (better: "AI accelerators") and AI models co-evolve; the hardware and software develop together.
But what the article mentions is that we will see a shift from special-purpose hardware to general-purpose hardware in this regard.
Take Lc0 with CNNs and SF with NNUE in our computer chess domain, for example.
Around 2018 Lc0 took off, and people thought they needed a high-end GPU costing several thousand dollars to play top-notch computer chess. Then in 2020 NNUE took off, and a common CPU was sufficient.
With NNUE we have a division of labor: neural networks are trained on big data via GPUs, while neural network inference runs more efficiently on a CPU.
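As a rough sketch of that split (my own simplification, not actual NNUE code): weights are trained in floating point, typically on a GPU, then quantized to small integers so inference runs on cheap CPU integer SIMD instructions:

    import numpy as np

    # 1) Training side (GPU): weights live as float32.
    W_f32 = np.random.randn(16, 32).astype(np.float32)

    # 2) Deployment side (CPU): quantize to int8 for fast integer SIMD.
    scale = 127.0 / np.abs(W_f32).max()
    W_i8 = np.round(W_f32 * scale).astype(np.int8)

    # 3) Inference: integer multiply-accumulate, then de-scale once.
    x = np.random.randint(-64, 64, size=32).astype(np.int8)
    acc = W_i8.astype(np.int32) @ x.astype(np.int32)
    y = acc.astype(np.float32) / scale
    print(y[:4])

Real NNUE additionally updates the first layer incrementally as pieces move, instead of recomputing it from scratch for every position, which is where the "efficiently updatable" in the name comes from.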
We already see the "AI PC" with a dedicated NPU chip, so the big players might have concluded that such a division makes sense for generative AI too.
I recently saw an article suggesting that we might not need matrix multiplications for neural networks at all, we just need to rethink our AI models; so there is certainly some hardware-software co-evolution in progress.
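If it is the idea I think it is (ternary weights, as in some recent "MatMul-free" papers; this sketch is my own guess at the mechanism, not code from the article): restricting weights to {-1, 0, +1} turns a layer into pure additions and subtractions, with no multiplications left:

    import numpy as np

    W = np.random.choice([-1, 0, 1], size=(4, 8))     # ternary weight matrix
    x = np.random.randn(8).astype(np.float32)

    # Same result as the matmul W @ x, but using only adds and subtracts:
    y = np.array([x[W[r] == 1].sum() - x[W[r] == -1].sum() for r in range(4)])
    print(np.allclose(y, W @ x))                      # True

Hardware that only has to add and subtract can be much simpler and cheaper than hardware built around wide multiply-accumulate units.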
--
Srdja