It's been talked to death, but non-CUDA implementations have their challenges regardless of use case. That's what first-mover advantage and more than 15 years of investment by Nvidia in its overall ecosystem will do for you.

But support for production serving of inference workloads outside of CUDA is universally dismal. This is where I spend most of my time, and compared to CUDA everything else is either non-existent or a non-starter unless you're all-in on the packaged, API-driven tooling from Google/Amazon/etc. built around their TPUs (or whatever). It's the most significant vendor/cloud lock-in I think I've ever seen.

Efficient and high-scale serving of inference workloads is THE thing you need to do to serve customers and actually have a chance at ever making any money. It's shocking to me that Nvidia/CUDA has a complete stranglehold on this obvious use case.

A great illustration of how unserious NVIDIA's competitors are is how long it took AMD's flagship consumer/retail GPU, the 7900 XT[X], to gain ROCm support.

That's quite literally unacceptable.


For those who don't know: one year after launch.

Meanwhile, Nvidia will go as far as backporting Hopper support to CUDA 11.8 so it "just runs" on launch day with everything you already have.
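
To illustrate the "just runs" point: CUDA binaries that embed PTX get JIT-compiled forward onto newer architectures, and CUDA 11.8 recognizes Hopper as sm_90. Here's a minimal sketch of the runtime capability query (standard CUDA runtime API; nothing in it is Hopper-specific):

    // Minimal sketch: enumerate CUDA devices and report compute capability.
    // Assumes only the CUDA runtime (11.8+ needed to see Hopper as sm_90).
    #include <cuda_runtime.h>
    #include <stdio.h>

    int main(void) {
        int count = 0;
        cudaError_t err = cudaGetDeviceCount(&count);
        if (err != cudaSuccess || count == 0) {
            fprintf(stderr, "No CUDA devices found: %s\n", cudaGetErrorString(err));
            return 1;
        }
        for (int i = 0; i < count; ++i) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, i);
            // A Hopper card reports major == 9 (sm_90). Older binaries that
            // embed PTX are JIT-compiled forward, which is why launch-day
            // hardware works with the toolchain you already have.
            printf("Device %d: %s (sm_%d%d)\n", i, prop.name, prop.major, prop.minor);
        }
        return 0;
    }

(Compile with nvcc; embedding PTX, e.g. -gencode arch=compute_80,code=compute_80, is what enables that forward JIT path.)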
