AWS costs every programmer should know (hatanian.com)
382 points by dizzih on June 9, 2019 | 193 comments

For certain apps, bandwidth cost is stupid expensive on AWS.

For example, I've been following the development of an online manga reading site (mangadex.org) that is now pushing over 1PB/month of images. If they had built it on AWS, even at the lowest rate of $0.02/GB with CloudFront/S3, it would be $20,000 per month. But they ended up paying nothing by using Cloudflare, which gave them unmetered bandwidth.

(well, until they got throttled earlier this year but they got out of that by upgrading to a measly $200/month plan)
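A quick back-of-the-envelope check of that figure, as a minimal sketch. It assumes decimal units (1 PB = 1,000,000 GB) and a single flat per-GB rate, which is a simplification of real tiered pricing; the function name is made up for illustration.

```python
# Rough monthly egress bill at a flat per-GB rate.
# Assumptions: decimal units (1 PB = 1,000,000 GB), no tiering, no request fees.
GB_PER_PB = 1_000_000


def monthly_egress_cost(petabytes: float, usd_per_gb: float) -> float:
    """Return the monthly bill in USD for the given volume at a flat rate."""
    return petabytes * GB_PER_PB * usd_per_gb


if __name__ == "__main__":
    # 1 PB/month at the $0.02/GB rate quoted in the comment above.
    cost = monthly_egress_cost(1, 0.02)
    print(f"${cost:,.0f} per month")
```

Plugging in a higher per-GB rate shows how quickly the bill grows for the same volume.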

Just for everyone's clarity, you can use Cloudflare with AWS just fine.

A previous site I built (https://hearthstonejson.com/) is pushing on the order of 100TB / month in images and JSON data, with 96% cached requests and 99.9% cloudflare-cached bandwidth. Almost nothing comes out of S3 itself.

Cloudfront is the ridiculously-priced offering IMO. I don't really understand why people use it.

I'm afraid you are in for a nasty surprise: Cloudflare specifically forbids using it mostly as an image/video CDN unless you are on the Enterprise tier (clause 2.8 in the Terms). I found it out the hard way a couple of years ago when I was helping with maintenance of a pretty large image hosting website (we got to the 1-2 PB/mo range). CF may not pay attention to you while you are small, but when they do, it's going to cost you.

via https://www.cloudflare.com/terms/ -

"2.8 Limitation on Non-HTML Caching The Service is offered primarily as a platform to cache and serve web pages and websites. Unless explicitly included as a part of a Paid Service purchased by you, you agree to use the Service solely for the purpose of serving web pages as viewed through a web browser or other application and the Hypertext Markup Language (HTML) protocol or other equivalent technology. Use of the Service for the storage or caching of video (unless purchased separately as a Paid Service) or a disproportionate percentage of pictures, audio files, or other non-HTML content, is prohibited."


"disproportionate percentage" seems wide open

> CloudFlare specifically forbids using it mostly as an image/video CDN unless you are on the Enterprise tier

For those of us not in the know, what else would you use it for?

I remember them billing themselves as free DDoS protection.

I don't follow. They're in the business of serving static content, right?

They are willing to serve as your CDN if you don't make them front too much image or audio traffic and you don't make them front any video. Their free tier is for non-media-focused sites only.

Why don't they just limit it by bandwidth/hits? Is there anything special about media content that it should be considered differently?

Video might mean adaptive bitrate I suppose.


No enterprise contract required. Just for video, obviously.

I wonder if that's just for specifically image hosting websites like your case. Parent comment and my previous experiences seem to imply that embedded media is okay.

Even then, just a few terabytes of traffic will make Cloudfront reach the thousands.

What’s the enterprise cost per month?

Pro is $200, enterprise is $2000.

This is probably customized per client and could be more for big companies.

$Contact us

That. We ended up at $3500/month for ~200 TB/month. Reading this thread, I might have to negotiate harder when renewal time comes.

Subject to negotiation. But it's going to be in the ballpark of many $k/mo.

I wonder if image data URIs fly below the radar

>> Cloudfront is the ridiculously-priced offering IMO. I don't really understand why people use it.

Companies that get their highly regulated product (health insurance, hospitals, government contractors) "certified" for use in the AWS ecosystem. Once they do that, they are likely to use it for everything, and pass the cost on to the customer.

Yes, I came here to talk about bandwidth too, which is not mentioned in the article at all but can be the biggest engineering cost focus depending on app/scale :-). Over 10 TB of transfer you can (and should) negotiate some cheaper pre-committed pricing on bandwidth.

There’s also engineering around the network involved in your application. A lot of architectures internally reflect traffic around (more so if you’re also accepting some ridiculous amount of traffic rather than just serving it). In those cases you can save a lot of money by figuring out ways to pass traffic through unmetered connections like internal ALBs and Amazon-hosted DBs. Otherwise you can either rack up a lot of cross-AZ traffic charges or be on the hook for engineering your way to minimize those costs.

Right. At a previous company our Cloudfront bill ran up to $2000/mo because someone ran a test script a few times every minute that fetched a few resources from CF. Once we killed that script our CF bill dropped down to $200/mo.

Cloudfront’s pricing can absolutely bite you if you are not careful. Which is one of the reasons why I prefer Cloudflare over it.

Bandwidth is insanely expensive on all clouds. If you’re a large consumer, however, you can usually negotiate a much better rate.

> Bandwidth is insanely expensive on all clouds.

OVH Public Cloud[1] offers unmetered bandwidth (250-500 Mbps) in all regions, apart from Asia-Pacific.

Hetzner Cloud[2] offers 20 TB (1 Gbps) for each cloud instance, but has locations only in Europe.

Both of them offer dedicated servers with up to 1-3 Gbps unmetered bandwidth that could be used as exit nodes.

[1] https://www.ovh.com/world/public-cloud/instances/prices/

[2] https://www.hetzner.com/cloud

I've never had luck with the unlimited / unmetered folks.

Let's say you spin up 10 instances with OVH at the cheapest level (ie, $200/month). That is supposed to give you dedicated 3Gbps in addition to compute / storage etc or around 1PB data per month + compute and storage.

This compares favorably to google standard egress at $60,000/month. But as soon as you build a business model around this - poof - rate limiting -> some TOS violation claim pops up.

"Oh, we meant unlimited or unmetered, but only if..."

Seriously, with unlimited / unmetered at $20 these would make the great bases of things like CDN networks, image / static asset hosts for big properties etc. But it generally turns out to be total BS.

In contrast - paying for bandwidth with AWS / Google etc -> no one has ever complained to me (though my current usage is minuscule; in the distant past I had high-usage experience).
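For reference, here is a rough sketch of how a sustained link speed turns into a monthly volume. It reads the figures above as 3 Gbps in total, fully saturated for a 30-day month, in decimal units; those are assumptions, and the per-GB rate is just what the $60,000 figure implies for 1 PB, not a quoted price list.

```python
# Convert a sustained link speed into monthly transfer volume.
# Assumptions: 30-day month, decimal units, link saturated 100% of the time.
SECONDS_PER_MONTH = 30 * 24 * 3600


def monthly_transfer_tb(gbps: float, utilization: float = 1.0) -> float:
    """Return TB moved per month at the given sustained rate."""
    bytes_per_second = gbps * 1e9 / 8  # bits per second -> bytes per second
    return bytes_per_second * SECONDS_PER_MONTH * utilization / 1e12


def implied_usd_per_gb(monthly_usd: float, petabytes: float) -> float:
    """Return the per-GB rate implied by a monthly bill for a given volume."""
    return monthly_usd / (petabytes * 1_000_000)


if __name__ == "__main__":
    print(f"{monthly_transfer_tb(3):.0f} TB per month at 3 Gbps")
    print(f"${implied_usd_per_gb(60_000, 1):.2f} per GB implied by $60,000 for 1 PB")
```

So a saturated 3 Gbps link lands just under 1 PB per month, which is where the "around 1PB" estimate comes from.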

> But as soon as you build a business model around this - poof - rate limiting -> some TOS violation claim pops up.

If bandwidth was at the core of my business model, I would certainly want to pay for it separately to avoid possible interruptions. In case of OVH, I assume, that's what "Bandwidth upgrade"[1] with a "limit of 20 Gbps per customer, and per datacentre" is designed for. It's not unreasonable to consider "spinning up to 10 cheapest cloud instances to avoid bandwidth limits" as a service abuse.

[1] https://www.ovh.com/world/dedicated-servers/bandwidth-upgrad...

Hetzner specifically is not unlimited/unmetered: You get a ridiculous amount of traffic included, and then you pay per TB.

The current price per TB is 1 EUR + VAT if applicable, so even if we assume you'll have to pay the German VAT, it's 1.35 USD/TB.

> Hetzner specifically is not unlimited/unmetered

That's true for Hetzner Cloud, but Hetzner's dedicated servers have been unmetered (with a guarantee of 1 Gbps) since October 2018[1].

[1] https://www.hetzner.com/news/traffic-limit/

Using the AWS figures above, it's over $20/TB, so Hetzner still compares very favourably.

Is this actually unmetered? Or is it unmetered until you hit a secret limit that they don't tell you, and then demand you pay them, like Cloudflare? (I'm not saying this is a bad thing, it's reasonable for CF to want their highest-usage customers to pay, I'm just curious)

I believe it’s unmetered by them personally.

But there’s always a “secret limit”—the point where your bandwidth usage looks like a DDoS attack, and the tier-1 exchange feeding the cloud provider’s DC decides to blackhole your traffic for the sake of the network.

Their bandwidth limits are based on Mbps / Gbps, so the bandwidth is unmetered, but not unlimited. And unlike CDNs, public cloud companies make money from the computational power (CPU / RAM) they provide.

They're going to limit you if you spin up a bunch of cheap instances to use up their bandwidth, but if you pay $300/mo for their 3 Gbps connection you can saturate that and they don't care.

I assume if you saturate that 3Gbps connection they will definitely care. Just maybe not be in a position to do anything about it.

I guess they'll care about as much as a restaurant cares if you've eaten the meal you paid for.

You can’t compare it like that as the calculation is different. Whether someone eats a small part of their meal or the whole portion doesn’t change the outcome for the restaurant. They can’t sell the rest to someone else. If someone doesn’t use bandwidth, someone else is going to use it.

> I'm not saying this is a bad thing

Why not? It's deliberately misleading, no?

No, it’s not. The audience is meant to understand “if this isn’t specifically the thing you’re building to use, you won’t have to worry about it.”

That’s useful to me so I can just not worry about that thing.

Except that you do still have to worry, because it is metered, they're just being coy about the precise figures, and they're really just outright lying to you by using the word 'unmetered'.

Am I missing something? I'm not seeing another side to this. Secret fair-use rules are exactly as dishonest as when the telcos lie about 'unlimited' data plans, no?

Even Amazon's own Lightsail offers terabytes of traffic for cheap ($5 = 2 TB). Obviously AWS knows Lightsail can be used to cut down the bill: the number of Lightsail instances is limited, and using Lightsail for traffic 'engineering' is against the ToS.

Yeah, but what does the peering look like for these?

I think it's more that Hetzner and OVH are great until you want to go outside of Europe. Hetzner is only in Europe. OVH has one datacenter in America, one in Canada, one in Australia, and one in Singapore. It may work great for Europe, but most of the world's population is elsewhere. If trying to build a global service, the cost of making your service scale properly across different providers is often too high.

> It may work great for europe, but most of the world's population is elsewhere.

That's true, European providers work best for Europe.

But how many global locations does a service really need, if it's sitting behind a CDN? Three locations (East Coast, Western Europe, and Singapore) alone are enough to be within 100-150 ms of most of the world's population.

OVH probably wouldn't be the best choice for projects focused exclusively on South America or China, but it already covers the rest of the world pretty well, including three locations in North America (East Coast, West Coast, and Canada).

OVH has two data centers in America. One is down the street from me where I keep my server and the other is on the East Coast. Actually haven’t had many problems with their peering now that the DC is stable.

> Hetzner Cloud offers 20 TB (1 Gbps) for each cloud instance, but has locations only in Europe

I've looked at them to use for my email/small web server and like their prices and features, but am not sure of the GDPR implications.

Currently I use a US cloud provider, and I'm in the US, and so am a controller or processor not established in the Union, and all my data processing takes place outside the Union. All my GDPR obligations, if any, are those that arise under the extraterritorial jurisdiction provision of Article 3(2).

If my server was hosted in the EU by an EU company, would that still be the case? Or would GDPR now apply via the in Union jurisdiction provision of Article 3(1)?

That GDPR article says[1]:

This Regulation applies to the processing of personal data of data subjects who are in the Union by a controller or processor not established in the Union, where the processing activities are related to:

a) the offering of goods or services, irrespective of whether a payment of the data subject is required, to such data subjects in the Union; or

b) the monitoring of their behaviour as far as their behaviour takes place within the Union.

So it basically says that the GDPR applies. I don't think that hosting in Europe would change anything at all.

[1]: https://gdpr-info.eu/art-3-gdpr/

As I understand, the point of GDPR is to protect data and privacy of EU citizens. Therefore, you would have the same obligations to EU citizens even if your server was hosted outside the EU. On the other hand, if you don't serve EU citizens, GDPR might not apply to you even if your server is hosted in the EU.

It's broader than that. According to Recital 14, "The protection afforded by this Regulation should apply to natural persons, whatever their nationality or place of residence, in relation to the processing of their personal data".

It can't actually accomplish that goal, because the EU doesn't have the jurisdiction for that.

For controllers and processors that are in the Union, GDPR applies to their processing of personal data of people regardless of where those people are or what entities they are citizens of.

So, for example, as a US citizen residing in the US who has never set foot within about 7000 km of Europe, but has bought things from vendors in the EU, those vendors need to obey GDPR when dealing with my data.

For controllers and processors that are not in the Union, the EU lacks the authority to enforce such a broad requirement on them. Instead, the requirement is that if the person whose data you are processing is "in the Union" and you are offering goods and services to them or monitoring their behavior as far as their behavior takes place within the Union, GDPR applies.

(Whether or not they can actually enforce that is still an open question).

Putting this all together, if I'm in the US, with users in the US, but having my server in the EU makes me count as being in the EU for GDPR purposes, then I have to obey GDPR when dealing with US users. If having my server in the EU doesn't do this, so that for GDPR purposes I'm in the US, then GDPR does not apply to my dealings with people in the US.

One way to get Akamai to unilaterally lower their bandwidth prices is to threaten to use BitTorrent instead.


BitTorrent between TomTom (i.e. mobile) devices? Sounds like a terrible idea. Am I missing something? Users pay good money for upload bandwidth. Far more per megabyte than TomTom would pay to use an ordinary CDN.

Oh no, not on the device, but the desktop computer you plug it into via USB. So it can download maps while the device isn't connected, over your home internet connection (but probably not your phone), overnight or whenever.

Yes, it was not appropriate for situations where you had to pay for upload bandwidth. We were very careful in designing the gui to disclose the fact that it would use your upload bandwidth, explain the possible costs and benefits, let users opt-in if that was ok with them, monitor the download status, and switch the BitTorrent feature on and off.

TomTom real time traffic prediction system depends on users trusting them enough to opt-in to uploading anonymized trip measurements, so it was very important not to do something that broke their trust by abusing their network connection.


TomTom had an "iTunes-like" desktop content management and device control desktop app called TomTom Home, which was implemented in xulrunner (the underlying framework of Firefox and Thunderbird, kind of a predecessor to Electron for writing cross platform desktop apps in JavaScript with C++ XP/COM plugins).

The first thing I tried was to make an XP/COM plugin out of the LibTorrent library. That worked ok, but the idea of increasing the size and complexity of what was already essentially a web browser with a whole bunch more complex code that does lots of memory allocation and networking all the time didn't seem like a good design. This was long before Firefox supported multiple processes, so having the same app handle BitTorrent would bloat the app and degrade the user interface responsiveness.

However RedSwoosh, FoxTorrent and BitTorrent DNA all ran in separate processes that just handled all the BitTorrent stuff, and that you could talk to via https (to stream the file with a special url, and retrieve progress telemetry on another url). And it's incredibly easy to integrate xulrunner or any web browser with those servers via http, so no C++ XP/COM plugin required.

Another issue is that you don't want every application to have its own BitTorrent client built in, or you will trash the network and disk, since they would compete for resources. It needs to be a centralized system service, shared by any app that needs it.

BitTorrent DNA worked nicely that way. And it could fall back to the CDN to download at full speed if there weren't enough seeders.

> not on the device, but the desktop computer you plug it into via USB


P2P-with-failover-to-CDN is what Spotify famously used to do.

Does anyone have experience with bandwidth charges for colocated servers? If you pay $300 a month (or whatever it is, I really have no idea because most of these places don't post their prices publicly) to put a 1U somewhere with a 500mbps link, will they ever stop or throttle you if your use case is legitimate? It seems like colo could be significantly cheaper than cloud for a lot of high-bandwidth things that don't need HA.

I think I remember seeing ads for HE 100 Gbit @ 10K/month. The cloud markup for bandwidth is massive.

To be fair, there's a world of difference between a single POP HE connection and the globally distributed AWS CDN. Understand your use-case and what factors are important because HE has significant drawbacks in some cases.

That's the marketing side of the equation. In reality, cloud providers have such amazingly complex control planes that you will see a comparable amount of downtime. Notice the 2 Google Cloud outages were at the network layer.

Didn't you say some manga re-uploaders used Blogger for free image hosting bandwidth? Did that get shut down?

I don't know how you remember that comment, it was from over a year ago: https://news.ycombinator.com/item?id=16784267

But as far as I know, kissmanga.com is still abusing blogspot.com/tumblr.com by using them as free image cdns.

I thought most of the clouds offered free ingress?

yes, but the topic is the price of egress.

I thought this was obvious to everybody: public clouds want to lock you in, so they make it very cheap and easy to get data in, but charge stupendous amounts for getting your data out.

MangaDex serves very high resolution scans of comic pages (egress)

Ah. I misunderstood the “push” part of the comment.

Thanks for reminding me that I forgot to include the bandwidth costs! I've added them to the article.

I have worked at two startups now where we made the fatal mistake of being profitable. If you make this mistake then the investors will swoop in and demand you spend more on marketing and AWS infrastructure, because we're scaling up to 5 billion users of course.

Of course we started spending all the money on new people and AWS, and soon there was no money.

At one point we were dumping like $15K a month on AWS for a dozen unnecessary over-engineered toys that nobody was using. This is the real cost of AWS.

I'd love to see Amazon's data on money invested vs actual user traffic for small startups, that's got to be some of the most interesting and valuable data on earth. Forget companies, I'll bet Jeff is sitting around predicting when entire industries rise and fall weeks before anyone else based just on this data.

AWS or not, investors invest for the sole purpose of spending money with the hope of finding a unicorn. If an investor just wants to sit on their money, they can do that without giving it to a company.

Then those investors are stupid. Hope is not a strategy. But more likely, this take is wrong. It also sounds to me like GP maybe missed another consideration: architecting your application in such a manner that scaling is straightforward (not easy, straightforward; there is a difference).

Using AWS as an example, at one of the businesses I worked at, we used Kinesis as an event bus. One shard handles 2MB/sec of output. This worked pretty well for thousands of messages a second, and we even got up to the hundreds of thousands by compressing the payload of the event message. After that, you can employ any number of strategies that work easily, such as sharding and adding additional streams, and use a Lambda to pipe the output of one stream to another. It scaled up to millions of messages by essentially pushing buttons in the AWS Console.
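To make the shard math concrete, here is a minimal sizing sketch. The 2 MB/s read figure is the one quoted above; the write limits (1 MB/s and 1,000 records/s per shard) are the commonly documented Kinesis Data Streams values and should be checked against current AWS documentation. The function name and the example workloads are hypothetical.

```python
import math

# Per-shard limits. READ comes from the comment above; the WRITE values are
# assumptions based on the commonly documented limits, so verify before use.
READ_MB_PER_SEC = 2.0
WRITE_MB_PER_SEC = 1.0
WRITE_RECORDS_PER_SEC = 1000


def shards_needed(records_per_sec: int, avg_record_kb: float) -> int:
    """Return the smallest shard count that satisfies every per-shard limit."""
    mb_per_sec = records_per_sec * avg_record_kb / 1000  # decimal units
    return max(
        1,
        math.ceil(mb_per_sec / READ_MB_PER_SEC),
        math.ceil(mb_per_sec / WRITE_MB_PER_SEC),
        math.ceil(records_per_sec / WRITE_RECORDS_PER_SEC),
    )


if __name__ == "__main__":
    # Hypothetical workload: 500 events/s, before and after payload compression.
    print(shards_needed(500, 10))  # throughput-bound
    print(shards_needed(500, 2))   # smaller payloads need fewer shards
    print(shards_needed(5000, 0.5))  # record-count-bound despite tiny payloads
```

Note that compression only helps while the stream is throughput-bound; once the record-count limit dominates, more shards (or batching several events per record) is the remaining lever.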

Take a look at your architecture. More than likely, outside of a FB/Google/Netflix traffic scenario, there is an easier and more straightforward architecture you can use that scales to your realistic use cases. Worry about the billions when you get there, which you yourself probably won't at that point because you would most likely have exited or moved into a higher role by then.

On the other side of the spectrum, AWS's extensive cost report metrics via tagging are great for big companies.

I can now show exactly which departments and dev teams are driving all the costs, and on what (CPU, storage, network). In a way that I never could for on-prem stuff.

...sure, as long as they tag their resources properly.

The closest I got to an org that did this well was a big company that ran Cloud Custodian in all their AWS accounts and if you launched an EC2 instance, it would terminate it immediately with extreme prejudice if it didn't have values for three required tags, one to identify the "owner" individually and two for accounting purposes.
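The enforcement rule described there boils down to a small check. This is a plain-Python sketch of the logic only, not Cloud Custodian's policy syntax; the three tag names are hypothetical stand-ins for "one owner tag and two accounting tags".

```python
# Hypothetical required tags: one to identify the owner, two for accounting.
REQUIRED_TAGS = ("owner", "cost-center", "billing-code")


def missing_tags(tags: dict[str, str]) -> list[str]:
    """Return the required tag keys that are absent or have an empty value."""
    return [key for key in REQUIRED_TAGS if not tags.get(key, "").strip()]


def should_terminate(tags: dict[str, str]) -> bool:
    """An instance is terminated if any required tag is missing or blank."""
    return bool(missing_tags(tags))


if __name__ == "__main__":
    instance_tags = {"owner": "alice", "cost-center": "", "Name": "web-1"}
    print(missing_tags(instance_tags))
    print(should_terminate(instance_tags))
```

A check like this only proves the tags are present. It says nothing about whether the values are the right ones for the team that launched the resource.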

The only problem with that is there's no mechanism to make sure that the cost center values were correct. There was a bit of a scandal when one group (who presumably just copied and pasted a bunch of CloudFormation from another group's repo) was running 5 figures a month of infrastructure under the other group's billing codes.

ALSO, as many have said, bandwidth is a big part of the cost, and at this time it's nearly impossible to do showback/chargeback on bandwidth. There may be a way to do it using Flow Logs by correlating IP addresses to instances and using those tags, but I've never heard of someone doing this successfully.

Egress charges leverage tags now. You can get down to good detail. Here's an image showing it will use tags if you set them: https://blog.cloudability.com/wp-content/uploads/2017/02/dat...

In this case, a service tag, set in some cases, not in others.

A better way than tagging is to give each team an AWS account to maintain and pay from their own budget.

Then you have to manage a million different AWS accounts. Each of them may be set up differently.

That’s what CloudFormation and Organizations are for....

You probably still want tags to break down costs by dev/test/prod, subsystem, etc. Or tags to aggregate them by department, customer, etc.

AWS cost reporting is far from great. It's hard to learn current (daily) charges; RI is completely hidden and only visible in the final bill, blended; there's no way to limit the spending; and the detailed reports are in CSV, not user-readable.

Yes, there aren't great AWS provided tools, but the data is there. We happen to use Cloudability, though I'm sure there are other good tools, maybe even free ones.

Can you elaborate on that or do you have any best-practices for tagging and correlating costs?

We use a commercial tool. But, the most important tags are environment (dev, test, prod), application name, app version, and owning team.

For some apps, perhaps a "component" or "service" tag would also be important.

Only 15k a month?

A startup I was at burned 400k/month on average; sometimes, when buying RIs, we spent 800k.

Had a couple engineers route the database connections over the public load balancers for a month, that cost 20k in network alone.

AWS is not cheap at scale, period.

A startup being able to spend 400k/month on AWS should also be able to employ somebody to keep an eye on the AWS cost and look for possible optimizations. If that's not the case I wouldn't blame AWS for the spent money.

No blame was on AWS. The TAMs tried to help fix things but no one was listening. Many times I was the only person meeting the 2-3 TAMs we had on the account.

I have many stories about this startup. The idea is great. If the startup succeeds it will be in spite of how it's being managed technically.

Could you please slip me in the bills at your next place? I'm currently spending less than 500 bucks a month.

What was the monthly revenue for this startup? Those are absolutely insane numbers.

Not sure what the monthly revenue was but I do know there were several multimillion dollar deals every 6 months or so.

Problems come from many areas... like the fact it was in the data management space and the 1500 or so i3's running Cassandra plus the hybrid cloud approach of front end in AWS talking to APIs in GKE which talked to backends in AWS because cool technologies you know.

No architecting was done. Build this and put it in the cloud. The team from RU didn't even know about autoscaling groups and when I tried to bring them in I ran into resistance.

I have many examples of "opportunities to optimize" from this place. Needless to say, my stress level is much lower not being there.

If you're taking money from investors and you're interested in being profitable, then you're going to sacrifice growth for the sake of profitability:

  - Let's not hire for this idea because it eats into our burn
  - Hmm, let's hold off on launching this feature until we have more data

Investment $ means taking risks (within reason) to maximize shareholder value.

If the company you worked for was profitable, then they could've structured a leveraged debt payoff to the investors to get them off the cap-table. Unless the company took so much money that the investors owned 60%+ and they unanimously do not agree about being profitable, then this is something that can be passed as a board resolution.

It sounds like the founders at your company were just inexperienced.

What is the rule of thumb for cloud-spending costs of production versus development/CI?

Somebody mentioned 1:3.

> If you make this mistake then the investors will swoop in and demand you spend more on marketing and AWS infrastructure

Would there be any "conflict of interest" issues if some of the same people requiring AWS spending were also Amazon shareholders?

Dude. A bunch of billionaires just got away with completely tanking the economy and getting bailed out for it. Zero people ever even talked about the possibility of anyone going to jail. Everyone got their bonuses. Nobody suffered any consequences. Literally nobody.

If you think anyone cares about any conflict of interest among the investing class you're beyond naive, you're just delusional.

How is Amazon supposed to know actual user numbers? That's app specific.

You can't get exact numbers, but for consumer-facing products you can estimate it via unique IPs hitting the load balancers, etc.

That would require Amazon gathering customer information and possibly GDPR data on the usage of all of its customers, which is a thing Amazon doesn't do, and you don't want your cloud providers doing.

They probably meant page views and such.

5 billion users? That's substantially more than Facebook .. was that an error?

Most likely that was an exaggeration of the demands made by the investors.

That was a joke.

If you are profitable, why accept investor money anyway?

Grow faster?

Well then, be careful next time.

Just looking at the quote of "$58/mo for a vCPU", they're clearly taking the median cost of a vCPU across all instance types, which includes hyper-expensive instances with value-adds like GPUs or tons of memory.

The real, distinct cost of a vCPU is probably closer to $25/mo. You can look at a m5.large instance, their "general purpose" instance, 2vCPU+8gb @ $70/mo, which would put 1vCPU+4gb at $35/mo. Google Cloud specifically lists the distinct price of a vCPU in custom instances as $24/mo, and AWS feels close to that when you consider the cost of memory in that $35/mo.
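The arithmetic behind that estimate, as a small sketch using only the figures quoted above. Treating Google's $24/mo bare-vCPU price as comparable to AWS is the commenter's assumption, so the implied memory price below is illustrative rather than an actual AWS rate.

```python
def per_vcpu_bundle(monthly_usd: float, vcpus: int, memory_gb: float) -> tuple[float, float]:
    """Split an instance price evenly per vCPU: (USD per vCPU, GB of RAM per vCPU)."""
    return monthly_usd / vcpus, memory_gb / vcpus


def implied_memory_usd_per_gb(bundle_usd: float, bare_vcpu_usd: float, memory_gb: float) -> float:
    """What is left for memory once an assumed bare-vCPU price is subtracted."""
    return (bundle_usd - bare_vcpu_usd) / memory_gb


if __name__ == "__main__":
    # m5.large as quoted above: 2 vCPU + 8 GB at about $70/mo.
    cost, mem = per_vcpu_bundle(70, 2, 8)
    print(f"${cost:.0f}/mo per vCPU with {mem:.0f} GB")
    # Assumption: the $24/mo custom-instance vCPU price carries over.
    print(f"${implied_memory_usd_per_gb(cost, 24, mem):.2f}/mo per GB of memory implied")
```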

The lack of networking cost also seems like a big oversight; if I could force every engineer to know a single "AWS Cost" that affects every vertical of development, from frontend to backend, I want them to know how crazy expensive DC egress is.

I never understood the benefits of the cloud for 99% of the projects.

I just moved https://vlang.io from Google Cloud (my free credit expired) to an Amazon Lightsail VPS (any VPS will do), and my spending went from ~$70 to $3.5/month.

And the performance actually improved a bit.

That. I see tons of posts from undergrads and fresh grads trying to use microservices, k8s and Terraform for their tiny < 100 qps apps.

A properly designed monolith on a dedicated server/VM can easily serve tens of thousands of users just fine and will be far easier to maintain than this mess (plus way cheaper as well).

The main advantage of those tools is when you have a number of distributed teams and you want to scale to millions of users. For nearly anything else, a far smaller setup is what you want.

I would argue that the majority of AWS' profits come from inefficiencies of their customers.

They know, and try to educate.

The crazy thing is, even used inefficiently, the Cloud is still a very good value prop for a lot of businesses.

> The crazy thing is, even used inefficiently, the Cloud is still a very good value prop for a lot of businesses.

What is there to say? There are huge scale inefficiencies in maintaining a small datacenter.

> What is there to say? There are huge scale inefficiencies in maintaining a small datacenter.

At $dayjob we have a small DC (8 racks). We’re facing chiller, UPS, and core switching/cabling replacements all within the next two years.

We did the math, public Cloud is cheaper for us, especially since about 80% of our instances can be shut down outside business hours.
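A rough sketch of why the shutdown schedule matters so much for that math. The 80% share is from the comment above; the 50 on-hours per week is an assumed business-hours schedule, and the model ignores storage and other charges that keep accruing while instances are stopped.

```python
HOURS_PER_WEEK = 168


def scheduled_cost_fraction(schedulable_share: float, on_hours_per_week: float) -> float:
    """Fraction of the always-on compute bill left after scheduling shutdowns.

    schedulable_share: portion of instances that can be stopped off-hours (0..1).
    on_hours_per_week: hours per week the schedulable instances stay running.
    """
    always_on = 1.0 - schedulable_share
    scheduled = schedulable_share * (on_hours_per_week / HOURS_PER_WEEK)
    return always_on + scheduled


if __name__ == "__main__":
    # Assumption: 10 hours a day, 5 days a week.
    fraction = scheduled_cost_fraction(0.8, 50)
    print(f"{fraction:.0%} of the always-on compute bill remains")
```

Under those assumptions less than half of the always-on compute bill remains, which is a kind of saving a self-run DC cannot get, since the hardware is paid for whether or not it is powered on.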

Nice. It's always nice to see grass roots use cases for Cloud.

> I would argue that the majority of AWS' profits come from inefficiencies of their customers.

Agreed. I've spoken with a few companies that could save quite a bit of money by buying RIs with a few clicks. With a little bit of work they could save a lot more moving large parts of their workload to spot instances. Almost feels like they want to brag about the AWS bill size...

They might try to educate, but I don't think they engineer their products to make "getting it right" easy. It's not in their interests, but they _could_ make a product that is much easier to use correctly, cheaply, or both.

AWS creates work, and creates spend, much more than it needs to.

Yes, AWS incentives are not aligned with their customers. Any sufficiently sized organization on AWS ends up building custom tooling to help manage cost. They may also hire consultants to help optimize cost.

This is all by design.

My experience is that AWS is remarkably open to helping you get things right. They seem to focus on making things possible before easy, but (Enterprise) support is quite good and has specific incentives on spend efficiency that run counter to the short-term goals of Amazon.

I can only imagine how the Oracle cloud must go for victims^Wcustomers.

Systems engineering is hard, on-prem or in-cloud.

AWS TAMs I know spend a sizable portion of their time making sure that their costs are low for their customers, monitoring monthly spend and looking for inefficiencies. Business and Enterprise support customers also receive full access to Trusted Advisor, an automated tool which checks your account for spending inefficiencies and best practice violations.

When your profit margin is 25%+, the bigger risk is not people reducing spend, it's moving to a different cloud. Helping people decrease spend by increasing complexity is a form of vendor lock-in, and preserves the long-term revenue.

Enterprise support has saved our team from our own screwups in the past, so I've also had positive experiences with them.

TIL ^w

I don't get it yet... what does it mean?

Or you go for even smaller use cases. I host some stuff for some companies to share business information with external suppliers for example. These apps serve < 100 users a month.

Still, they all get a load balancer configuration to grab the free TLS certs with auto renewal while keeping configuration simple.

While I would not recommend premature TLS termination for critical business data, it is sufficient for many applications.

Admins get their familiar AWS console to configure the apps and I can make sure these are bundled in a way to be deployed on many different systems.

If I wanted to, I could use an AWS identity provider like Cognito and have it all in one place.

I think the main argument for cloud providers is the utility and convenience they provide for small projects.

While I do not like the business practices of Amazon and would never like to work for them on AWS, I am also lazy and cannot say that they didn't develop something very useful.

Generally companies don't really care if a solution is 70$ or 3,50$ a month. They do care about my wage much more. Unjustifiable!

Sure, you could dump it all on a server somewhere without using any external micro services. But in my experience these projects are much more likely to start to randomly haunt you from the past. And worse, I would actually have to maintain the servers.

And the price difference for inefficiently used services is mostly between 5$ and 15$. But these extra 10$ a month can easily be offset by reduced development or administration time.

If after ~40 years the service is still active, I would freely admit to that mistake.

Granted, this isn't applicable for applications that actually need to be scaled. But the time to launch for standard solutions that nudge a few bits here and there can be reduced quite a bit thanks to infrastructure like this. And it may be in use anyway because of all the Google Home/Alexa IoT stuff.

I used to host at Savvis, then the IT team pushed me into their datacenter for cost reasons, but they are so terrible with changes and support that Azure/AWS are very attractive for improving my dev speed, even if it means I have to do all the ops work. The cloud is probably more expensive than colo, but with colo I would have to go through IT, whereas in the cloud I have more freedom (until they figure that out, I'm sure).

> I see tons of posts of under/fresh grads trying to use microservices, k8s and TerraForm for their tiny < 100 qps apps

possibly because that's what most of the job postings are requiring?

I’m an AWS true believer, but in that case if I were on a budget, I would scrounge up enough cheap used hardware, set up my own home network and host everything myself just to learn.

I would think it makes a ton of sense for a graduating student to use managed micro services instead of their own monolith they need to continually support.

I'm confused, is that not still "the cloud" you are using?

There's certainly a huge range of pricing for the exact same services (sometimes worse services cough OVH dedi servers cough)

It's a server in "the cloud" but it's not cloud services. If you're running an OS on bare metal or a blank VM, you're basically just co-locating in someone else's datacenter which is a very traditional model.

I have a feeling you know this, but "the cloud" is nowadays generally taken to mean providers (such as AWS and Azure) that have IaaS, PaaS and SaaS offerings, with a focus on the latter two.

If you are not using load balancers, lambdas, provisioned iops, internet scale databases, pipelines, streams, cdns, data object storage, machine learning apis - you are not using “cloud”.

How is Amazon Lightsail not the "cloud"? In fact it gives you even less flexibility and control than managing your own EC2 instances.

Maybe I'm dense and there is some obvious source of data on the AWS admin pages that I haven't been able to find by clicking around, searching the documentation, and googling, but I still have this question:

Why is the cost and size of snapshots so opaque? Is there a way to see how big each one is, and how much it will cost to keep it around for a month?

I understand that snapshots are differential, and deleting one snapshot in a series may move blocks to other undeleted snapshots, of course.

I just want to know how much storage each snapshot requires right now, and to be able to accurately predict how much each one of them will cost at the end of the month, so I can decide how often to make snapshots and how long to keep them.

And is there an easy "serverless" way to automatically archive a snapshot in glacier storage or simply download it, without manually making it into a new volume and using "dd" or "tar" or "dump" or whatever?

I understand they're stored in S3. So why can't I see a list of them in the S3 interface, see how big they all are, and tell how much each of them is going to cost at the end of the month? It's my data. I'm paying for it. I think I deserve to be able to see it and know how much it will cost.

I get the feeling I'm missing something really obvious (probably a big blinking green button on the main page that my brain is filtering out). Surely many other customers must want these features?

Unfortunately there isn't currently any way to tell how much data each snapshot actually contains. There also isn't any way to archive these to Glacier, but the somewhat new lifecycle management tool might help you save some money with little effort: https://aws.amazon.com/blogs/aws/new-lifecycle-management-fo...

Any analysis of AWS costs that doesn't start off talking about outbound data transfer pricing, let alone mention it at all, is useless for anything beyond pure-CPU workloads.


I'm not sure AWS costs are something you can do by remembering a few guidelines, rather than actually assigning an order-of-magnitude estimate to every billable thing and adding it all up.

Like the one that bit me recently was a $0.05/1000 cost per thing, which is easy to translate to $0.00005/thing and then mentally round to $0. That one added up to real money at 10^8 things, and would have been a big problem at 10^9. The cool thing about horizontal scaling is that doing something 10^6 times or 10^9 times is going to feel pretty similar when you do it -- the only difference is the order of magnitude of the bill.
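To make the "mentally round to $0" trap concrete, here is a small sketch that multiplies it out. The $0.05 per 1000 price is from the comment above; the function name is just for illustration.

```python
from decimal import Decimal

PRICE_PER_THOUSAND = Decimal("0.05")  # USD, as quoted above


def bill(things):
    """Total charge in USD for a given number of billable things."""
    return PRICE_PER_THOUSAND * things / 1000


for exponent in (6, 8, 9):
    print(f"10^{exponent} things: ${bill(10 ** exponent):,.2f}")
# 10^6 things: $50.00
# 10^8 things: $5,000.00
# 10^9 things: $50,000.00
```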

I’d add to this, numbers every SaaS sales person should know with the resources used by their services, or roughly what the marginal cost of a new customer is.

Most SaaS services have very high markup, but get a few bits of infra wrong, over promise on a few things, and use expensive AWS services, and suddenly you’re selling $500 a month of AWS services for $99 a month. This is an easy mistake to make, I’ve seen it happen with these figures, and it only takes a few wrong assumptions across the engineering and sales teams.
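A quick sketch of the per-customer margin check being described, using the $99 price and $500 infrastructure cost from the comment above. The function is illustrative and ignores support, sales, and other non-infrastructure costs.

```python
def gross_margin(monthly_price, monthly_infra_cost):
    """Fraction of revenue left after paying for the customer's infrastructure."""
    return (monthly_price - monthly_infra_cost) / monthly_price


print(f"{gross_margin(99, 500):.0%}")  # prints "-405%"
```

Anything below zero means each new customer costs more to serve than they pay, which is the point at which sales and engineering need to compare assumptions.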

As others have said, bandwidth costs can be absolutely insane with AWS. This was actually the primary reason we moved from S3 to Backblaze B2 as documented at https://news.ycombinator.com/item?id=19648607, and saved ourselves thousands of dollars per month, especially in conjunction with Cloudflare's Bandwidth Alliance. https://www.cloudflare.com/webinars/cloud-jitsu-migrating-23...

We still use AWS for a few things and still have a small bill with them every month, but we're very careful about putting anything there that's going to push a lot of traffic.

People really should be aware of this, yes. Even if your client/management absolutely insists on S3 for reputed durability etc., if bandwidth costs are high, you can often get 'free' extensive compute resources by hiring servers elsewhere to proxy and cache S3 access to cut bandwidth bills and run other things next to the largely network bound load.

In general I find the big problem with AWS is that cost is handled 'in reverse': developers often get near free rein, and cost only gets handled when someone balks at the size of the AWS bill. Often it turns out to be trivial to cut by changing instance types or renting a server somewhere to act as a cache. At that point people have often spent tens of thousands in unnecessary fees.

There's an underserved niche for people to do AWS cost optimization on a 'no-win no-fee' basis.

I used to help clients with cutting costs on AWS and if people were in doubt I'd often offer that.

And the savings were often staggering (e.g. halving hosting costs was common; once we cut costs 90 percent by moving to Hetzner for a bandwidth intensive app even though long-term storage remained on S3).

The biggest challenge in serving that niche is getting people to realise they may be throwing money out the window as surprisingly many people still assume AWS is cheap, and offering to do an initial review for free and not charge if I couldn't achieve the promised savings made it a lot easier. Someone who likes the sales side more could make a killing doing that.

This is why the "Netflix uses AWS!" rhetoric is misleading. Yes, they use AWS extensively for front-end, analytics, transcoding, billing (now), etc. The one thing they don't use AWS for much at all is Content Delivery (AKA big bandwidth). That uses the Netflix Open Connect CDN which is entirely developed and run in-house.

Also, I've worked with companies a tiny fraction of Netflix who got steep discounts. It's quite possible AWS becomes cost effective when you have millions in yearly spend as leverage. The problem is surviving until you get there.

Similar numbers for Google Cloud in the less-expensive regions:

1 vCPU: $17/mo for sustained usage, $10.5/mo for 3yr commit, $5/mo for preemptible (similar to spot)

1 GB RAM: $2.2/mo for sustained usage, $1.4/mo for 3yr commit, $0.7/mo for preemptible

Google Cloud also lets you choose exactly how much CPU and RAM you want, within reasonable limits (you have to allocate 0.9GB RAM per vCPU and you pay a little more for RAM above 6.5GB per vCPU). So these numbers aren't medians/derived values for Google Cloud but numbers that you can find in the docs at https://cloud.google.com/compute/pricing. (I don't work for Google, I'm just a fan of Google Cloud)
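Since those are per-unit prices, a custom machine's monthly cost is just a weighted sum. A minimal sketch using only the figures quoted above; the plan names and function are made up for illustration, and the surcharge for RAM above 6.5GB per vCPU is not modelled because its rate isn't given here.

```python
# Per-unit monthly prices in USD, as quoted in the comment above.
PRICES = {
    "sustained": {"vcpu": 17.0, "gb_ram": 2.2},
    "commit_3yr": {"vcpu": 10.5, "gb_ram": 1.4},
    "preemptible": {"vcpu": 5.0, "gb_ram": 0.7},
}

MIN_GB_PER_VCPU = 0.9           # minimum RAM allocation per vCPU
MAX_STANDARD_GB_PER_VCPU = 6.5  # above this, RAM costs extra (not modelled)


def monthly_cost(vcpus, gb_ram, plan="sustained"):
    """Estimated monthly cost in USD for a custom machine shape."""
    ratio = gb_ram / vcpus
    if ratio < MIN_GB_PER_VCPU:
        raise ValueError("need at least 0.9 GB RAM per vCPU")
    if ratio > MAX_STANDARD_GB_PER_VCPU:
        raise ValueError("extended memory pricing is not modelled in this sketch")
    unit = PRICES[plan]
    return vcpus * unit["vcpu"] + gb_ram * unit["gb_ram"]


for plan in PRICES:
    print(f"{plan}: ${monthly_cost(2, 8, plan):.2f}/mo for 2 vCPU, 8 GB")
```

Check the pricing page before relying on any of this; the rates are a 2019 snapshot from a comment, not current list prices.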

Every programmer?

In our team, only like two, maybe three programmers are concerned about our AWS architecture at all. I don't really know why everyone in the team should know how much the individual components of AWS cost. They should just know how to not write performance-destroying code that mandates scaling instances up.

Eagerly await the follow up: Falsehoods programmers believe about AWS costs

And then:

- 10 reasons why AWS costs so much. #7 is mind-boggling.

- AWS costs considered harmful

Don't forget:

- "Developer uses one weird trick to reduce cloud bills by 97%. AWS hates him!"

10 Falsehoods Every Programmer Should Know (Number 3 will SHOCK you!)

It's not just code, and it's not always obvious. AWS' tools (as robust as they seem at first) may not give you the low-level insight that you need to understand where you can really cut down on expenses. I just reduced my AWS bill from about $5.5k/mo to $3k/mo. The rise in price was only slightly outpacing the app growth for a while, and it was very gradual. When the forecasted bill was $6k it became a major priority.

The title is a play on “Latency Numbers Every Programmer Should Know”[0], although these are (a) limited to a popular, yet not exclusive platform, (b) subject to change at any moment.

[0]: https://people.eecs.berkeley.edu/~rcs/research/interactive_l...

It's a pattern title. It's not meant to be taken literally and most reasonable people wouldn't read it that way.

I don't think that every programmer should know it off by heart.

I do think it should be a reasonably common skill to make common tradeoffs without too much investigation.

Besides, the title is an homage to the Latency Numbers article which follows a similar concept. You don't need to memorise numbers, but you probably should be able to reason about them in very vague terms.

These 'every programmer should know' things should stop -- surely a web developer isn't supposed to know latency figures (or 114 pages about the details of RAM, see [1]). Likewise, most programmers don't care about AWS costs.

[1] https://people.freebsd.org/~lstewart/articles/cpumemory.pdf

I've found it very helpful to know about the differences in latency between data in RAM on the client and data in a database connected to a remote webserver. It pushed me to reduce the number of API requests and think about what data is needed when.

These are things developers definitely should know.

If that's your job, then you should, but not every programmer. At least make the title of your post "every web programmer" or "every programmer using AWS". It makes the post a lot less clickbaity.

You shouldn't read headlines like that literally or you're destined to be disappointed. They're hyperbolic in order to attract more clicks. The way it should be parsed is, "things that are useful to know if you're working on things where stuff like this is important." It should be obvious that if you're not working with AWS or plan on doing so in the future then you just skip articles about AWS, right?

I disagree that a web developer shouldn't know latency figures. It's pretty developer/project dependent.

I do agree that "EVERY programmer should know.." with regards to some specific software package or service is silly. Given that there are a million ways to provide services similar to AWS, and AWS is really only a means to an end -- I don't get why every programmer should be aware of the specifics.

Latency is one of those things that hits almost all of CS. AWS isn't. "Things all programmers should know about AWS.." is, in my opinion, similar to saying something like "Things every programmer should know about the Python GIL."; it includes exactly the reason WHY every programmer need not know it.

Not an Amazon service user? Forget about AWS.

Not a Python user? Who cares about the GIL.

I also agree that your linked PDF is overboard -- but it's one of those things that looks overboard now, but that kind of innate cpu/memory knowledge would have been a lot more useful when lower level languages were more widely used. Touching memory is just one of those things that seems to be on its way out, and I, for one, am thankful for that.

Surely every programmer worth his salary should know latency figures, especially if he is working on a website! At least have a basic understanding of costs of HTTPS and latency of mobile connections.

> surely a web developer isn't supposed to know latency figures

...that's a terrible example, 2007 link notwithstanding. Nobody cares about AWS unless they use it (and many don't, either because they use something else, or because they're in a different domain), but awareness of true costs of what you're _actually_ doing can be the difference between O(n) and O(n^2).

114 pages? I'll never be able to read all of that!

This is an interesting idea. Could be handy to have a little tool to estimate and compare costs across providers, maybe with pros and cons, constraints, and even customer reviews. E.g. I only need 99% uptime, I will be serving images, most of them will be the same etc.

I must say though, the graphs in this could be a good bit better. There is no reason to use a line here, it makes it look like a time series plot. Each instance type should have its own bar and they should be ordered in some manner, either grouping by types or sorting by price.

Trackit.io does exactly this. It's available on GitHub.

Since it wasn't obvious from their website (no GitHub icon in their "social media tray"):


Sorry, I posted from my phone and hadn't much time :)

GPU prices are ridiculous. You can often purchase an equally performing PC, with a GeForce GPU, for the cost of just 1-2 months of cloud rent. Also, many of them bundle vGPUs with large amounts of CPU and RAM. As far as I remember, only Google allows custom specs; neither MS nor Amazon has custom machines for users who only need GPUs.

I think this is more nVidia’s fault due to their Datacenter specific pricing that AWS has to abide by (and then mark up).

Your GPU is not licensed in a way that allows AWS to use it.

I do agree, but the value proposition is more than simply the cost - being able to spin up GPUs on demand is extremely beneficial for a lot of projects.

Of course, once you have something in production, it might be worth considering buying your own GPUs, but really it all depends.

It's pretty useful for students or those who want to play with the tech and are not sure if they will be committed to it long term. That's a pretty big market with all the deep learning hype.

Also you can get a couple of hundred bucks of credit from multiple providers. It was enough for me to learn and move on to getting my own gpu.

Often understated is how difficult it is to set up your local gpu for machine learning. Still haven't figured out how to connect my 8 gpus in a functional way.

I just got into AWS last year and this article does not mention the big mistake I made: using RDS for a small-scale project instead of EC2.

RDS is super expensive and while you're gaining traction and users you might as well use an EC2 instance.

ElasticBeanstalk seemed like an easy intro to AWS but it steered me into RDS. Of course I've abandoned the whole of ElasticBeanstalk for serverless development lately.

My first intro to AWS was ElasticBeanstalk and serverless seemed too daunting. But now I've been smitten and can't stop thinking serverless.

I feel like serverless in a lot of cases is more expensive than just running a monolith, once you get past the toy example or past the MVP stage.

This is especially true once you factor in all the development overhead, which can ultimately be very expensive to do things correctly with serverless.

Azure have open sourced their serverless engine, so you even have the option of building out your own private serverless infrastructure.

What?! Please excuse my disbelief but I'm an old Linux veteran from the era when we typed M$. ;)

Where can I find this? I assume somewhere in the 1060 repos of https://github.com/Azure

I also assume that even though it's open source, it requires licensed MS Windows servers to run.

Still though, this is just another nail in the coffin of the OLD MS image.

It's MIT, and the host is built on .NET Core, so you can run it on Linux if you want:


Under Satya Nadella the change in Microsoft has been little short of astonishing. A few years ago, I could understand the continued cynicism - but if anything the pace of change has only intensified.

Have you considered using Aurora Serverless? If you're trying to prove market fit, it can be an order of magnitude cheaper than RDS and you don't have to worry about management in EC2.

AWS doesn't recommend Aurora Serverless for production, yet.

Yeah, great for an internal facing app on your company intranet that only needs infrequent access but requires a relational database.

It would also be useful to have the 'network' costs from an edge location to a user. That one just bit me so it's close in my mind right now.

The #1 cost is increased complexity and opportunity cost of converting your simple app to an interesting serverless/nosql/microservices/etc distributed system. Of course the financial savings can be heroic and justify the effort because everything is very expensive on AWS.

I'm not sure when AWS became the standard, but the programming communities have lots of young kids learning their first backends on AWS.

I'm not sure this is a good thing. Having a generation of developers dependent on a single company's tools seems like the future will be painful.

It's not a good thing, but no worse than having developers depend on Docker without a basic understanding of the underlying OS.

I'd contend that it's much worse. One (Docker) is a piece of software, one (Amazon) is a corporation with a motivated will.

They are two different problems. One is basically the problem of copy-paste/stackoverflow developers, the other is a walled garden problem that'll make itself worse over time.

Docker is written by a corporation, and so in many senses it is an expression of their will.

I don't know if these fears ever actually pan out. In the 90s we had a world full of kids learning on Windows, and many of them later in their career switched to Linux or Mac. In the 2000s we had a world of kids learning Rails and plenty of them moved on to Javascript or Clojure or Python. The world was no worse for their first experience, and alternative tools did not cease to exist.

The people who learn on AWS and have an actual desire to continue their knowledge will always find ways to continue their knowledge. Learning on AWS first is not going to stop that.

I would like to see a post like this comparing services like RDS or Mongodb Atlas on different instance sizes. For example: what is a typical writes|reads/second I can achieve on a Postgres RDS db.m4.large instance.

I find it easier to use this tool for quick estimations, and it is more encompassing:


I was surprised this is all about physical costs and doesn’t include digital costs that folks should be aware of. I was expecting to see the latency between AZs in a region and between regions for example.

A bit OT, but I wait for the day where bandwidth isn't a limiting factor and we can host machines doing computations anywhere, instead of having to rent out economies of scale. With Kubernetes and other private-cloud-like solutions, wouldn't it eventually be meaningless to depend on centralized cloud clusters, when you could instead host your own and tap into an equal-footing infrastructure? Projects like vast.ai (no affiliation) are examples of the ideal I hope for.

I was deciding between cloud hosts yesterday for a tiny project and got reminded how frustrating it is to predict costs on aws. Even to get a monthly ec2 price you need a calculator out, not including other costs. Their billing reports are also very unclear to me.

I’m planning on DigitalOcean because of this: their costs are straightforward and their interface is simple.

Edit: can anyone recommend an easy way to create different-provider backups of a DO server?

I use DigitalOcean because they're cheap and their costs are predictable over AWS. I don't do full server backups since I can just 'git pull' and for the most part have my server re-populated, but I do full backups of my Postgres database every night. I use pg_backup and the AWS CLI tools to push the backup to S3. I also keep a week of backups stored locally on the server in case I blow up the database but not the server, I can restore without having to bring it back from S3 (or in case S3 fails).
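For anyone wanting to copy that setup, here is a rough sketch of the nightly job: dump, push to S3, then prune local copies older than a week. It is not the parent's actual script. It uses `pg_dump` in place of whatever "pg_backup" refers to, and the database name, backup directory, and bucket are placeholders you would supply.

```python
import subprocess
import time
from pathlib import Path

KEEP_DAYS = 7  # "a week of backups stored locally"


def dump_command(database, out_path):
    """pg_dump invocation; custom format is restorable with pg_restore."""
    return ["pg_dump", "--format=custom", f"--file={out_path}", database]


def upload_command(local_path, bucket):
    """AWS CLI invocation copying one file into the (placeholder) bucket."""
    key = Path(local_path).name
    return ["aws", "s3", "cp", str(local_path), f"s3://{bucket}/{key}"]


def prune_old_backups(backup_dir, keep_days=KEEP_DAYS, now=None):
    """Delete *.dump files older than keep_days; return the names removed."""
    now = time.time() if now is None else now
    cutoff = now - keep_days * 86400
    removed = []
    for path in sorted(Path(backup_dir).glob("*.dump")):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed


def run_backup(database, backup_dir, bucket):
    """Dump, upload, then prune. check=True stops before pruning on failure."""
    stamp = time.strftime("%Y-%m-%d")
    out_path = Path(backup_dir) / f"{database}-{stamp}.dump"
    subprocess.run(dump_command(database, out_path), check=True)
    subprocess.run(upload_command(out_path, bucket), check=True)
    prune_old_backups(backup_dir)
```

Called from cron as something like `run_backup("mydb", "/var/backups/pg", "my-backup-bucket")` (all three names hypothetical), this keeps the local week of dumps only if the upload succeeded.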

Look at AWS Lightsail. It’s a simple VPS provider without all of the complexity of AWS.

Any AWS metric that doesn't have the network costs is useless. If anyone wants a dedicated unmetered server up to 20gbps I'd recommend datapacket.com.

Agreed. The cloud is great if your workload is low-egress... AWS/GCP bandwidth pricing is predatory at times.

There are various "AWS Costs Cheatsheet" posts but I couldn't find a comprehensive _and_ updated one. Any suggestions?

Not a cheat sheet, but a comprehensive list of specs and pricing for every EC2 instance, in a really nice UI far more useful than mentally joining Amazon's various tables: https://www.ec2instances.info/

Unless you're a big company, why not just run Linode/DigitalOcean/etc.? Those are much more user friendly, and AWS feels like a maze to me.

I wouldn’t trust any of those to run a business on.

Seems there are plenty of companies who do trust DO and Linode enough to run a business on.

[1] https://www.digitalocean.com/customers/

[2] https://www.linode.com/case-studies

The best non technical/political reason is that if AWS or Azure goes down or you have an issue with them none of your higher ups or investors are going to ask you, why did you choose AWS. It’s the old “no one ever got fired for buying IBM”.

Besides, you can never go wrong politically by saying I chose the vendor in the upper right corner of Gartner’s Magic Quadrant.

The technical reason is that if I want my hosting service to do more of the “undifferentiated heavy lifting” and provide managed services, I won’t get as much of an offering from Linode/DO. The last reason is that AWS Business Support is excellent. They’ve helped me out with some head scratchers and when I just didn’t want to figure something complex out myself. Their live chat is awesome.

Egress fees are a big 'hidden' cost.

Uh... no mention of RDS?
