
For neural nets it's actually the opposite - half-precision bfloat16 is enough. You need a large range, but not much precision.
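Here's a quick PyTorch sketch of that trade-off (assuming PyTorch is installed; the specific values are just for illustration): bfloat16 keeps float32's 8-bit exponent, so the range is huge, but it stores only 7 mantissa bits, so the precision is coarse.

    import torch

    # Large range: ~1e38 survives the round-trip to bfloat16...
    big = torch.tensor(1e38, dtype=torch.float32)
    print(big.to(torch.bfloat16))   # ~1e38, no overflow

    # ...while float16 overflows (its max is ~65504).
    print(big.to(torch.float16))    # inf

    # Coarse precision: near 1.0 the spacing is 2**-7, so nearby
    # values collapse to the same bfloat16 number.
    a = torch.tensor(1.000, dtype=torch.bfloat16)
    b = torch.tensor(1.003, dtype=torch.bfloat16)
    print(a == b)                   # tensor(True)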

Yes, the exact numbers are going to vary, but I was just giving a data point to indicate the order of magnitude. If you want to quibble, there's CPU SIMD too.




For gaming, it's double precision that matters. And we were talking about a certain GPU, one that's used for gaming, not AI. That's why AI chips exist in the first place - dedicated hardware for dedicated tasks (ASICs, for short).


The NVIDIA cards are all dual-use for gaming and compute/ML. Some features like the RTX 4070's Tensor Cores (incl. bfloat16) are there primarily for ML, and other features like ray tracing are there for gaming.


The NVIDIA cards are for mining crypto-coins too, and they did that successfully for years before being made obsolete in that area by ASICs. Now it's time for the same thing to happen in AI/ML, which is why AI chips are being developed - they are the ASICs for this domain. That's the big picture. In 2 to 3 years no one is going to use NVIDIA gaming cards for AI/ML anymore, no matter how many GFLOPS future 5000/6000 series cards offer. They will be for gaming only. End of story.


ASICs aren't magic - they are just chips designed to do a single function fast (e.g. run a crypto-mining algorithm) as an alternative to a general-purpose CPU/GPU, whose generality comes at the cost of some performance overhead.

If your application calls for generality - like a gaming card's need to run custom shaders, or an ML model's need to run custom compute kernels - then an ASIC won't help you. These applications still need a general-purpose processor, just one that provides huge parallelism.

It seems you may be thinking that all an ML chip does is matrix multiplication, and so a specialized ASIC would make sense, but that's not the case. An ML chip needs to run the entire model - think of it as a PyTorch accelerator, not a matmul accelerator.
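To make that concrete, here's a rough sketch of a transformer-style block in PyTorch (the layer sizes are arbitrary, purely for illustration): besides the matmuls it runs softmax attention, layer norms, residual adds, and an elementwise nonlinearity, and the accelerator has to execute all of it.

    import torch
    import torch.nn as nn

    class TinyBlock(nn.Module):
        def __init__(self, dim=64, heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.norm1 = nn.LayerNorm(dim)        # reductions, not matmuls
            self.norm2 = nn.LayerNorm(dim)
            self.ff = nn.Sequential(
                nn.Linear(dim, 4 * dim),          # matmul
                nn.GELU(),                        # elementwise nonlinearity
                nn.Linear(4 * dim, dim),          # matmul
            )

        def forward(self, x):
            a, _ = self.attn(x, x, x)             # matmuls + softmax
            x = self.norm1(x + a)                 # residual add + norm
            return self.norm2(x + self.ff(x))

    x = torch.randn(2, 16, 64)                    # (batch, seq, dim)
    print(TinyBlock()(x).shape)                   # torch.Size([2, 16, 64])

The attention softmax, normalizations, and activation functions aren't matmuls, which is why the chip needs general-purpose programmability rather than just a matrix unit.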

Finally, the market for consumer (vs. data center) ML cards is tiny relative to the gaming market, and these chips/cards are expensive to develop. Unless this changes, it doesn't make sense for companies like NVIDIA to develop ML-only consumer cards when, with minimal effort, they can leverage their data center designs and build dual-use GPU/compute consumer cards.



