New – EC2 Instances (G5) with Nvidia A10G Tensor Core GPUs (amazon.com)
95 points by my123 70 days ago | 71 comments

I take pretty unhealthy pleasure in looking at EC2/GC announcements and comparing them to random offers on the WebHostingTalk's Dedicated Hosting Offers section. The cloud offerings almost always end up being 2-4x more expensive (or more) and 2-4x slower.

Except when GPUs are involved. This just seems like something that the dedicated hosting industry isn't on top of _at all_. I don't know if it's hardware availability, upfront costs, unreliable demand, an inability to compete on price, or if the usage really is so much more elastic.

The best I could [quickly] find was a 5600X w/32G Ram, 1TB SSD and 12GB 3060 for $189/m.

You’re not paying for the compute, you’re paying for IAM, the service ecosystem, unified enterprise billing, and the engineering experience. Totally different market than bare metal colocated. Like comparing a crated LS engine to a showroom Corvette.

That's one reason. Some others:

1 - Free credits

2 - No one ever got fired for buying IBM

3 - AWS is sold to CTO/CIOs. AWS is a massive marketing and sales success

4 - An entire job type has sprung up around cloud-complexity. These people have a vested interest in staying on cloud.

5 - Growing number of people think the choice is exclusively between cloud or colo.

6 - Entire generations brought up on cloud: a lot of people don't realize just how expensive cloud is.

> 3 - AWS is sold to CTO/CIOs. AWS is a massive marketing and sales success

Not always. People on the ground push for cloud stuff because it makes their lives easier and they are used to it from their side projects (where they, more likely than not, used the generous free credits).

It's just like back in ye olde days when all you needed to do to get a copy of Photoshop was the official installer and a keygen... Adobe didn't invest much in marketing and didn't do much against piracy until the end of CS6/the rise of CC, simply to become the universal industry standard.

Obviously 5 is really only a problem when you get into enterprises that have enough servers and monthly spend to where it would make more sense to invest in the overhead that is having your own servers.

> 6 - Entire generations brought up on cloud: a lot of people don't realize just how expensive cloud is.

Ah, so true. Heroku was my "solution" for so long, when it's really so expensive on its own. All the pricing models are so messed up.

I have some Digital Ocean nodes at either $5 or $20/mo, I need to shut those down.

Deploying stuff that can communicate to other stuff is so easy right now and so cheap ... somewhere. Vercel to the rescue, for now.

> 6 - Entire generations brought up on cloud: a lot of people don't realize just how expensive cloud is.

The same people who know it’s expensive seem to not know how cheap it is.

If we're talking about a low amount of traffic, then maybe cloud is cheaper than dedicated. But for anything with a medium to high amount of traffic, cloud will always end up more expensive. Or do you have some unique use case where cloud is actually cheaper than dedicated over time?

If we are only considering traffic then sure, AWS can be expensive: it costs me $730 for 17TB of data at the moment. But look at where money is being saved. 6 EC2 instances previously cost $648.24/m; now it's $6.82/m after moving to Step Functions / Lambdas, and it just works better as there's no fighting for resources anymore.

You know what's expensive in AWS? S3 storage. Got some odd ~500TB stored in there. But at 6k/m I can't imagine trying to keep that in dedicated hardware. At least in S3 I feel it's safe.
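For readers who want the unit costs behind those numbers, here is a rough sketch of the arithmetic. The dollar amounts and volumes are the ones quoted in the comment above (and "~500TB" and "6k/m" are approximate there); the decimal TB-to-GB conversion and all variable names are assumptions added for illustration, not AWS list prices.

```python
# Unit-cost arithmetic for the figures quoted in the comment above.
GB_PER_TB = 1000  # assumption: decimal terabytes


def unit_cost(total_usd: float, units: float) -> float:
    """Cost per unit, given a total bill and the number of units billed."""
    return total_usd / units


# Storage: roughly $6k/month for roughly 500 TB in S3.
s3_per_tb = unit_cost(6000, 500)
s3_per_gb = s3_per_tb / GB_PER_TB

# Traffic: $730 for 17 TB of data.
transfer_per_tb = unit_cost(730, 17)

# Compute: six EC2 instances versus Step Functions / Lambda, per month.
ec2_before, lambda_after = 648.24, 6.82
monthly_saving = ec2_before - lambda_after
reduction_factor = ec2_before / lambda_after

print(f"S3 storage: ${s3_per_tb:.2f}/TB-month (${s3_per_gb:.3f}/GB-month)")
print(f"Traffic:    ${transfer_per_tb:.2f}/TB")
print(f"Compute:    ${monthly_saving:.2f}/month saved, {reduction_factor:.0f}x cheaper")
```

On these figures the traffic costs several times more per TB than a month of storing the same TB, which fits the first sentence of the comment above.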

A lot of hobby and or small websites and business could benefit from a pay-per-ms model like AWS Lambda.

It's literally pennies in some cases, vs. renting a server that sits idle 99% of the time.

What "hobby and or small websites and business" deal with medium to high amount of traffic? Those seem to strictly fall into the "low amount of traffic" bucket.

Besides, you could either rent a $5/month VPS that you know will always be $5/month, or you can use AWS Lambda and never know what your bill is gonna be at the end. Your new blogpost hit frontpage of HN? Good news, your bill is now $40 for this month.

Most (85% maybe) of my clients I work with want consistent costs over cheaper costs.

I can rent a VM for a few €/month that idles all the time, but who cares, or struggle developing with AWS Lambda and risk paying orders of magnitude more.

All excellent points.

Yeah, but for compute or bandwidth intense businesses, you might not need IAM, unified enterprise billing, or the "engineering experience".

I helped a niche API-as-a-service company go from -20% margins to being profitable. How? Moving away from AWS to Hetzner.

Administration is not free. Given today’s risks I really doubt a small scale company can afford in-house properly secured hosting/storing of anything valuable.

Yes your monthly bill is lower, but now the liability is all yours.

These types of comments constantly come up. It's a false dichotomy. Following HN guidelines, I'll assume best intention rather than trolling or FUD.

There are other options than the two that you've described - namely VPS providers and dedicated providers.

Can you help me understand why you thought those were the only two options? Maybe people who joined the industry after, say ~2010, were primarily exposed only to AWS?

I don’t think anyone is saying that VPS and dedicated server providers aren’t an option. Rather, most startups rule them out since, if you grow enough to where the administration control of the Cloud is warranted, that past decision now warrants a host migration and the month+ of planning that goes into that. Unless the company really started as a side-project whose main funding was extra cash from the founder, it doesn’t make sense to artificially limit your growth.

Once more, it’s not like AWS is absolutely nothing. Chances are that any web-based startup would indeed be able to take advantage of the many autoscale-as-a-service features of AWS, enabling them to handle the next quarterly 100x increase in customers and throughput.

Feels to me like you have zero experience scaling outside the big cloud providers, as this is simply not true.

I'm only talking about why you would choose a cloud provider over VPS/dedicated, not that they're the only possible choice if you want to scale, so I don't see what is "not true" about my comment.

>These types of comments constantly come up

because triple++ digit margins enable cloud service providers to allocate vast amounts of money towards marketing.

Give us your contacts to receive our gift cards then.

You don't need in house hosting. Managed hosting at smaller providers almost always beats AWS by a huge margin. There are valid reasons to pick AWS but cost is almost never one of them - it's the expensive luxury option.

Anecdotal: I work for a small scale company and we've just moved from self-hosted GitLab to GitHub for exactly this reason, after our (not-patched-since-eternity-because-no-one-has-time) GitLab servers got infected and started participating in a [D]DoS attack.

Unrelated to the original post, but why not go from hosted gitlab to SaaS gitlab? As someone who has used both, I would have stuck with gitlab, even if it was simply to keep things as similar as possible for engineers.

We were actually deciding between GitHub and GitLab, and think GitHub in the long run was better as it's now owned by a giant like Microsoft which I'd trust more in terms of company size and stability. I know it's kind of a capitalistic view, but I have to think about my own (projects') security/safety and letting MS host feels safer than letting GitLab host.

If your service goes down, then usually customers aren't impressed by you screaming about it being some supplier's fault and it being their liability.

My point is that you can't outsource reputational damage and blame.

You can outsource it, at least partially, by buying the market leader, i.e. AWS or previously IBM.

When my service is down and my competitor's isn't, the customers don't care that "it's ok, it's because I chose the market leader and they're having an outage".

My customers don't have a contract with AWS. A service they're paying for is down, and what am I going to do about it?

Your customers cannot sue you for reasonable downtime. Your contract always has a downtime clause, and nobody is promising 100% availability.

But they can sue you to bankruptcy if you mess up with their data. And these incidents happen way too frequently, even to big players, to ignore them.

You cannot casually say "I am able to set up a Linux box myself, no need to hire experts or outsource this task to experts."

IMO, you're mostly paying for the future value of having your ecosystem in aws. It's nice being able to leverage the things aws provides without an additional switching cost.

Some other reasons to pay extra charges are: Trust, Compliance and Competency. When AWS says my RDS is backed up and point in time recovery is available, I can trust that it is really available. In self hosted databases, admins sometimes make mistakes and data is lost.

I really like that analogy

I think this really is much more elastic for some workloads. I want to use hundreds of A100s for a few weeks. Then not use them for a quarter while I use and analyze what I trained, fine tune on smaller machines, etc. We’d never be able to justify buying a cluster that large and managing it for intermittent use, much better to have it burst properly.

When I needed GPU instances, I asked my long-term bare metal hoster (hetzner.de) and they sent me a private price list.

My theory is that publicly advertising GPU instances tends to attract people who want to convert stolen credit cards into crypto mining power. I mean it's the same reason why the 3090 == GPU most suitable for crypto has been sold out for a year.

That's also in line with the fact that I had to sign a voucher stating that I will NOT do any kind of crypto hosting / mining / processing on any of their servers before I got that GPU instance.

Are the prices at the private price list considerably cheaper?

When was that?

I also asked Hetzner for GPU instances after they took down the ones from the website, and a price list was not available at the time, but for a sufficiently sized order they said they would provide me some.

I thought it is possible to send them your own GPU and they would put it in your server.

Apparently Hetzner discontinued theirs because of cryptocurrency abuse (whatever that means).

There was a related discussion here on YC News.

The gist of it was that these hosting providers exchange credit for cash-equivalents. For example, you can sign up with a credit card, and then immediately start using the compute that you have purchased to mine crypto.

Now, if you pay your bill, this ought not to matter. Who cares what you use your rented computer power for? (As long as it is not hideously illegal child porn hosting or what have you.)

The problem was that the smaller providers were being mercilessly abused by criminals signing up with stolen credit card numbers. They would extract sufficient crypto coins before the provider shut them down for this to be worth doing over and over.

The providers could have simply requested up-front payment for compute until a "track record" was established, and this would have solved the problem.

Instead, they all opted to maximise the "sign ups per month" metric to get those sweet, sweet KPIs met, and then killed the crypto-mining accounts mercilessly with AI detection.

... which also killed several legitimate customers' businesses outright. Typically with no recourse.


Specifically Hetzner required me to put a deposit, when I rented a server a few months ago. It was a reasonable amount, and you're right about the reasons.

Edit. Found their message:

"At the following link you can upload the copy of your ID, or you can use our pre-paid option, where you pay €20 in advance via PayPal: https://accounts.hetzner.com"

If you like following WHT dedicated hosting offers, one fun thing is to just see how long-lasting (or not) these providers are.

Despite supposedly being 4x faster and 4x cheaper, these dedicated hosting offer guys come and go like moths circling a burning flame.

Same thing with the s3 competitor file hosting companies. I haven't followed this space recently from hotfile / filesonic etc etc. All supposedly way cheaper than S3 - but boy do they disappear in the night at times.

On the flip side, there's no shortage of providers that have been around since before AWS was a thing.

For example, the quote that I found was from a provider I used over a decade ago (no affiliation), and who have been in business since 1999.

I just checked my old provider pair networks.

Processor 8 Cores HD 240GB SSD RAM 16GB $311 / month

This is not that competitive with m5a.4xlarge reserved for one year.

64GB / 16 core machines - good modern gen cores

Those are managed dedicated servers, where they'll install and maintain software for you. It's an entirely separate class.

And even if it wasn't, the existence of a single expensive dedicated server provider wouldn't disprove parent's claims.

IIRC, you are not allowed to use a RTX/GTX in a datacentre.

Says who, Nvidia? They might not honor the warranty, but there's nothing they can do to stop you.

EULA says it

If you are at AWS scale, NVIDIA would definitely sue you ...

Licenses are a thing?

I bought a card not a license

There’s vast.ai!

Still way too expensive for individuals and startups imo. These A10 cards are just slightly slower than RTX3090, which makes it an easy comparison. $1/hour for the cheapest option means renting for 62 days is equivalent to buying outright (add a few weeks to account for the price of the other computer components)
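A quick way to check that break-even claim, as a minimal sketch: the $1/hour rate and the 62 days come from the comment above, while the function names and the 8-hours-per-day scenario are hypothetical additions for illustration. It ignores power, hosting, and the other components the comment mentions.

```python
# Back-of-the-envelope rent-vs-buy check for the figures quoted above.
HOURS_PER_DAY = 24


def rental_cost_usd(hourly_rate_usd: float, days: float) -> float:
    """Total cost of renting around the clock for `days` days."""
    return hourly_rate_usd * HOURS_PER_DAY * days


def break_even_days(purchase_price_usd: float, hourly_rate_usd: float) -> float:
    """Days of 24/7 rental after which buying would have been cheaper."""
    return purchase_price_usd / (hourly_rate_usd * HOURS_PER_DAY)


# $1/hour for 62 days, as in the comment: the card price this implies.
implied_card_price = rental_cost_usd(1.00, 62)

# Hypothetical scenario: the card is only busy 8 hours a day, so each
# calendar day accrues a third of the rental cost and break-even takes
# three times as many calendar days.
days_at_one_third_use = break_even_days(implied_card_price, 1.00) * (24 / 8)

print(f"Implied purchase price: ${implied_card_price:.0f}")
print(f"Break-even at 8 h/day:  {days_at_one_third_use:.0f} days")
```

The 62-day figure therefore assumes round-the-clock use; for bursty workloads the break-even point moves out proportionally.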

A big part of the difference is in Nvidia's datacenter tax (The A10 is basically a 3090, just double the price). Hopefully AMD's new accelerators will bring some much needed competition to the market.

Most people don't just rent a single GPU to run something outside of training. A10 tensors are a faster version of T4, which are cheaper for running inference (albeit slower than typical GPU).

Cheapest reasonable GPU instance you can get on EC2 is P3.2xlarge with $3.06/h.

Given 5 year old cards can still perform at some level, buying makes much more sense than renting at $1 per hour.

There are a lot of other costs to consider than only the total cost of the physical item. If you need full-time dedicated tensor cards and are willing to foot the datacenter and staffing costs, then maybe purchasing the cards outright is a better option.

Why do you need to put them in some expensive datacenter?

I am a fan of AWS but it is all too easy to run up large bills. I began using accelerated compute instances, but as a newbie it was hard for me to get a feel for the performance: how large a dataset could we run before it fell over or overheated? If we were spending other people's money I would have gone all in on AWS GPU-enabled instances, but as a bootstrapped firm doing some experiments in the ML space it was more cost effective to buy a small server and get a feel for the performance envelope. We just doubled our hardware spec for another on-prem computer. The best thing is, Amazon could not offer better value.

Still awaiting GPU runtime in Lambda. The only alternatives I can think of are ECS auto-scaling or hooking up a Kubernetes cluster/Knative-Kuda solution.

Forget the GPUs, the top instance type here (g5.48xlarge) has 192 vCPUs! I'd love to get that in a compute-optimized instance type, without the GPUs.

(The largest available compute-optimized instance is currently the c6i.32xlarge at 128 vCPUs, not counting the absurd u-* family of instances that mere mortals can't get. 192 vCPUs would be a substantial upgrade; 256 would be even better.)

Just out of curiosity, what workload do you have that requires so much compute? Could you move it to the gpu?

I run batch computing jobs at >1000-CPU scale (I work at a large payment processing company). Being able to spread those cores across fewer physical nodes improves performance considerably.

And no, GPUs aren't appropriate for this workload type. We're bound on IO and memory more than raw compute.

Software compilation. And no, nobody's written a way to run massively parallel code generation/optimization on a GPU that I'm aware of.

Could you split up the compilation to run over multiple nodes?

Working on that, but meanwhile, it's still faster and simpler to run it on bigger nodes.

Does anyone know how this will compare with the v100 GPUs that AWS offers currently? Any benchmark pointers?

A10G is a cut-down 3090:

> Unlike the fully unlocked GeForce RTX 3090 Ti, which uses the same GPU but has all 10752 shaders enabled, NVIDIA has disabled some shading units on the A10G to reach the product's target shader count. It features 9216 shading units, 288 texture mapping units, and 96 ROPs. Also included are 288 tensor cores which help improve the speed of machine learning applications. The card also has 72 raytracing acceleration cores.

I used to be engaged in this part of the industry, but haven't looked in a while. So, sincere question: Are V100's still meaningfully popular? I know they were incredible for a long time, but I figured most usage would have shifted to A100 for high performance or T4 for cost.

It's a cut down and under-clocked A6000 with 24 GB GDDR6 instead of 48. I would expect that it performs about the same as a V100 or a tad slower. https://www.techpowerup.com/gpu-specs/a10g.c3798 https://www.techpowerup.com/gpu-specs/rtx-a6000.c3686 - https://lambdalabs.com/gpu-benchmarks

I can’t believe people are still paying for cloud providers. The next generation of computing is decentralized and cheap. AWS stands on the back of poor engineering and massive marketing BS.

As a CTO of two unicorns, I’d say that it’s pretty much wasted money.

The "Remote Workstations" use case is interesting: what kind of software do people use to do that nowadays? Is x2go still a thing?

Crazy that for each new type of gpu (or new instance type in general) aws needs to make those available in many many regions at the same time.

Or, do your inference on a $10/month ($0.015 per hour) AVX-512 server at Vultr (https://www.vultr.com/products/cloud-compute/) with NN-512 (https://NN-512.com)
