My experience with many of these services renting mostly A100s:

LambdaLabs: For on-demand instances, they are the cheapest available option. Their offering is straightforward, and I've never had a problem. The downside is that their instance availability is spotty. It seems like things have gotten a little better in the last month, and 8x machines are available more often than not, but single A100s were rarely available for most of this year. Another downside is the lack of persistent storage, meaning you have to transfer your data every time you start a new instance. They have some persistent storage in beta, but it's effectively useless since it's only in one region and there are no instances in that region that I've seen.

Jarvis: Didn't work for me when I tried them a couple of months ago. The instances would never finish booting. It's also a pre-paid system, so you have to fill up your "balance" before renting machines. But their customer service was friendly and gave me a full refund, so, shrug.

GCP: This is my go-to so far. A100s are $1.10/hr interruptible, and of course you get all the other Google offerings like persistent disks, Cloud Storage (their S3 equivalent), managed SQL, container registry, etc. Availability of interruptible instances has been consistently quite good, if a bit confusing. I've had some machines up for a week solid without interruption, while other times I can tear down a stack of machines and immediately request a new one only to be told they are out of availability. The downsides are the usual GCP downsides: poor documentation, sometimes weird glitches, and perhaps the worst billing system I've seen outside of the healthcare industry.

Vast.ai: They can be a good chunk cheaper, but at the cost of privacy, security, support, and reliability. Pre-paid only (you load a balance first). For certain workloads, and if you're highly cost sensitive, this is a good option to consider.

RunPod: Terrible performance issues. Pre-paid only (you load a balance first). Unresponsive customer support. I ended up having to get my credit card company involved.

Self-hosted: As a sibling comment points out, self-hosting is a great option to consider. In particular, "Having the dedicated hardware also meant that researchers were willing to experiment more". I've got a couple of cards in my lab that I use for experimentation, and then I move to the cloud for big runs.




Please consider CoreWeave. I've had a great experience so far.


For 2 to 3 times the cost of Lambda on-demand and GCP preemptible?


+1 to this



