
For neural nets it's actually the opposite - half-precision bfloat16 is enough. You need a large range, but not much precision.
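Here's a quick PyTorch sketch of that trade-off (assuming PyTorch is installed; the specific values are just for illustration): bfloat16 keeps float32's 8-bit exponent, so the range is huge, but it stores only 7 mantissa bits, so the precision is coarse.

    import torch

    # Large range: ~1e38 survives the round-trip to bfloat16...
    big = torch.tensor(1e38, dtype=torch.float32)
    print(big.to(torch.bfloat16))   # ~1e38, no overflow

    # ...while float16 overflows (its max is ~65504).
    print(big.to(torch.float16))    # inf

    # Coarse precision: near 1.0 the spacing is 2**-7, so nearby
    # values collapse to the same bfloat16 number.
    a = torch.tensor(1.000, dtype=torch.bfloat16)
    b = torch.tensor(1.003, dtype=torch.bfloat16)
    print(a == b)                   # tensor(True)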

Yes, the exact numbers are going to vary, but I was just giving a data point to indicate the order of magnitude. If you want to quibble, there's CPU SIMD too.




For gaming, it's double precision that matters. And we were talking about a certain GPU, one that's used for gaming, not AI. That's why AI chips exist in the first place - dedicated hardware for dedicated tasks (ASICs, for short).


The NVIDIA cards are all dual-use for gaming and compute/ML. Some features like the RTX 4070's Tensor Cores (incl. bfloat16) are there primarily for ML, and other features like ray tracing are there for gaming.


The NVIDIA cards are for mining crypto-coins too, and they did that successfully for years before being made obsolete in that area by ASICs. Now it's time for the same thing to happen in AI/ML, which is why AI chips are being developed - they are the ASICs for this domain. That's the big picture. In 2 to 3 years no one is going to use NVIDIA gaming cards for AI/ML anymore, no matter how many GFLOPS future 5000/6000 series cards offer. They will be for gaming only. End of story.


ASICs aren't magic - they are just chips designed to do a single function fast (e.g. run a crypto-mining algorithm) as an alternative to a general-purpose CPU/GPU, whose generality comes at the cost of some performance overhead.

If your application calls for generality - like a gaming card's need to run custom shaders, or an ML model's need to run custom compute kernels - then an ASIC won't help you. These applications still need a general-purpose processor, just one that provides huge parallelism.

It seems you may be thinking that all an ML chip does is matrix multiplication, and so a specialized ASIC would make sense, but that's not the case. An ML chip needs to run the entire model - think of it as a PyTorch accelerator, not a matmul accelerator.
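To make that concrete, here's a rough sketch of a transformer-style block in PyTorch (the layer sizes are arbitrary, purely for illustration): besides the matmuls it runs softmax attention, layer norms, residual adds, and an elementwise nonlinearity, and the accelerator has to execute all of it.

    import torch
    import torch.nn as nn

    class TinyBlock(nn.Module):
        def __init__(self, dim=64, heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.norm1 = nn.LayerNorm(dim)        # reductions, not matmuls
            self.norm2 = nn.LayerNorm(dim)
            self.ff = nn.Sequential(
                nn.Linear(dim, 4 * dim),          # matmul
                nn.GELU(),                        # elementwise nonlinearity
                nn.Linear(4 * dim, dim),          # matmul
            )

        def forward(self, x):
            a, _ = self.attn(x, x, x)             # matmuls + softmax
            x = self.norm1(x + a)                 # residual add + norm
            return self.norm2(x + self.ff(x))

    x = torch.randn(2, 16, 64)                    # (batch, seq, dim)
    print(TinyBlock()(x).shape)                   # torch.Size([2, 16, 64])

The attention softmax, normalizations, and activation functions aren't matmuls, which is why the chip needs general-purpose programmability rather than just a matrix unit.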

Finally, the market for consumer (vs. data center) ML cards is tiny relative to the gaming market, and these chips/cards are expensive to develop. Unless this changes, it doesn't make sense for companies like NVIDIA to develop ML-only consumer cards when, with minimal effort, they can leverage their data center designs and build dual-use GPU/compute consumer cards.



