
AWS costs every programmer should know - dizzih
https://david-codes.hatanian.com/2019/06/09/aws-costs-every-programmer-should-now.html
======
zawerf
For certain apps, bandwidth cost is stupid expensive on AWS.

For example I've been following the development of an online manga reading
site (mangadex.org) that is now pushing over 1PB/month of images. If they had
built it on aws, even at the lowest rate of $0.02/GB with cloudfront/S3, it
would be $20,000 per month. But they ended up paying nothing by using
cloudflare who gave them unmetered bandwidth.

(well, until they got throttled earlier this year but they got out of that by
upgrading to a measly $200/month plan)
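
The arithmetic behind that $20,000 figure can be checked with a minimal sketch. It assumes decimal units (1 PB = 1,000,000 GB) and a flat $0.02/GB rate as quoted above; real pricing is tiered, so this is an order-of-magnitude check, not a quote.

```python
GB_PER_PB = 1_000_000  # assumption: decimal units, 1 PB = 10^6 GB


def monthly_egress_cost(petabytes: float, usd_per_gb: float) -> float:
    """Flat-rate egress cost in USD for one month of traffic."""
    return petabytes * GB_PER_PB * usd_per_gb


# 1 PB/month at the $0.02/GB rate mentioned in the comment.
cost = monthly_egress_cost(1, 0.02)
print(f"${cost:,.0f} per month")
```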

~~~
stingraycharles
Bandwidth is insanely expensive on all clouds. If you’re a large consumer,
however, you can usually negotiate a much better rate.

~~~
krn
> Bandwidth is insanely expensive on all clouds.

OVH Public Cloud[1] offers unmetered bandwidth (250-500 Mbps) in all regions,
apart from Asia-Pacific.

Hetzner Cloud[2] offers 20 TB (1 Gbps) for each cloud instance, but has
locations only in Europe.

Both of them offer dedicated servers with up to 1-3 Gbps unmetered bandwidth
that could be used as exit nodes.

[1] [https://www.ovh.com/world/public-
cloud/instances/prices/](https://www.ovh.com/world/public-
cloud/instances/prices/)

[2] [https://www.hetzner.com/cloud](https://www.hetzner.com/cloud)

~~~
chipperyman573
Is this actually unmetered? Or is it unmetered until you hit a secret limit
that they don't tell you, and then demand you pay them, like Cloudflare? (I'm
not saying this is a bad thing, it's reasonable for CF to want their highest-
usage customers to pay, I'm just curious)

~~~
derefr
I believe it’s unmetered by them personally.

But there’s always a “secret limit”—the point where your bandwidth usage looks
like a DDoS attack, and the tier-1 exchange feeding the cloud provider’s DC
decides to blackhole your traffic for the sake of the network.

------
oldjokes
I have worked at two startups now where we made the fatal mistake of being
profitable. If you make this mistake then the investors will swoop in and
demand you spend more on marketing and AWS infrastructure, because we're
scaling up to 5 billion users of course.

Of course we started spending all the money on new people and AWS, and soon
there was no money.

At one point we were dumping like $15K a month on AWS for a dozen unnecessary
over-engineered toys that nobody was using. This is the real cost of AWS.

I'd love to see Amazon's data on money invested vs actual user traffic for
small startups, that's got to be some of the most interesting and valuable
data on earth. Forget companies, I'll bet Jeff is sitting around predicting
when entire industries rise and fall weeks before anyone else based just on
this data.

~~~
tyingq
On the other side of the spectrum, AWS's extensive cost report metrics via
tagging are great for big companies.

I can now show exactly which departments and dev teams are driving all the
costs, and on what (CPU, storage, network). In a way that I never could for
on-prem stuff.

~~~
joncrane
...sure, as long as they tag their resources properly.

The closest I got to an org that did this well was a big company that ran
Cloud Custodian in all their AWS accounts and if you launched an EC2 instance,
it would terminate it immediately with extreme prejudice if it didn't have
values for three required tags, one to identify the "owner" individually and
two for accounting purposes.
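
The kind of rule described here boils down to a presence check on tags. The sketch below is plain Python, not Cloud Custodian's actual policy syntax, and the three tag keys are hypothetical since the comment does not name them.

```python
# Hypothetical tag keys: one "owner" tag and two accounting tags.
REQUIRED_TAGS = ("owner", "cost_center", "project")


def missing_tags(tags: dict, required=REQUIRED_TAGS) -> list:
    """Return the required keys that are absent or blank on a resource."""
    return [key for key in required if not str(tags.get(key, "")).strip()]


def should_terminate(tags: dict) -> bool:
    """An instance is non-compliant if any required tag has no value."""
    return bool(missing_tags(tags))


print(missing_tags({"owner": "alice"}))
```

A check like this only proves that a value exists, not that it is the right value.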

The only problem with that is there's no mechanism to make sure that the
cost center values are correct. There was a bit of a scandal
when one group (who presumably just copied and pasted a bunch of
CloudFormation from another group's repo) was running 5 figures a month of
infrastructure under the other group's billing codes.

ALSO, as many have said, bandwidth is a big part of the cost, and at this time
it's nearly impossible to do showback/chargeback on bandwidth. There may be a
way to do it using Flow Logs by correlating IP addresses to instances and
using those tags, but I've never heard of someone doing this successfully.
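
A rough sketch of that correlation idea, under heavy assumptions: flow records are reduced to hypothetical `(source_ip, bytes)` pairs, and an IP-to-team mapping has already been built from instance tags. Real flow log records carry more fields, and IPs get reassigned over time, which is part of why this is hard in practice.

```python
from collections import defaultdict


def bytes_by_team(flow_records, ip_to_team):
    """Sum transferred bytes per team; unknown IPs land in 'untagged'."""
    totals = defaultdict(int)
    for src_ip, num_bytes in flow_records:
        totals[ip_to_team.get(src_ip, "untagged")] += num_bytes
    return dict(totals)


# Hypothetical sample data for illustration only.
records = [("10.0.0.1", 100), ("10.0.0.2", 50), ("10.0.0.1", 25), ("10.0.0.9", 5)]
owners = {"10.0.0.1": "team-a", "10.0.0.2": "team-b"}
print(bytes_by_team(records, owners))
```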

~~~
alien_
A better way than tagging is to give each team an AWS account to maintain and
pay from their own budget.

~~~
pm90
Then you have to manage a million different AWS accounts. Each of them may be
set up differently.

~~~
scarface74
That’s what CloudFormation and Organizations are for....

------
013a
Just looking at the quote of "$58/mo for a vCPU", they're clearly taking the
median cost of a vCPU across all instance types, which includes hyper-
expensive instances with value-adds like GPUs or tons of memory.

The real, distinct cost of a vCPU is probably closer to $25/mo. You can look
at a m5.large instance, their "general purpose" instance, 2vCPU+8gb @ $70/mo,
which would put 1vCPU+4gb at $35/mo. Google Cloud specifically lists the
distinct price of a vCPU in custom instances as $24/mo, and AWS feels close to
that when you consider the cost of memory in that $35/mo.
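
Spelled out, the estimate looks roughly like this. The sketch borrows Google's $24/mo vCPU price as a stand-in for AWS, which is an assumption of the argument rather than an AWS list price, and attributes whatever is left over to memory.

```python
M5_LARGE_USD_PER_MONTH = 70.0   # figure quoted in the comment
VCPUS = 2
MEMORY_GB = 8
GCP_VCPU_USD_PER_MONTH = 24.0   # Google's custom-instance vCPU price, per the comment

# Cost of one "slice" of the instance: 1 vCPU plus its share of memory.
per_vcpu_bundle = M5_LARGE_USD_PER_MONTH / VCPUS
gb_per_vcpu = MEMORY_GB / VCPUS

# If the vCPU itself costs about what Google charges, the rest is memory.
implied_memory_usd_per_gb = (per_vcpu_bundle - GCP_VCPU_USD_PER_MONTH) / gb_per_vcpu

print(f"1 vCPU + {gb_per_vcpu:.0f} GB: ${per_vcpu_bundle:.2f}/mo")
print(f"implied memory price: ${implied_memory_usd_per_gb:.2f}/GB/mo")
```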

The lack of networking cost also seems like a big oversight; if I could force
every engineer to know a single "AWS Cost" that affects every vertical of
development, from frontend to backend, I want them to know how crazy expensive
DC egress is.

------
amedvednikov
I never understood the benefits of the cloud for 99% of the projects.

I just moved [https://vlang.io](https://vlang.io) from Google Cloud (my free
credit expired) to an Amazon Lightsail VPS (any VPS will do), and my spending
went from ~$70 to $3.5/month.

And the performance actually improved a bit.

~~~
dRaBoQ
That. I see tons of posts of under/fresh grads trying to use microservices,
k8s and Terraform for their tiny < 100 qps apps.

A properly designed monolith on a dedicated server/VM can easily serve tens of
thousands of users just fine and will be far easier to maintain than this mess
(plus way cheaper as well).

The main advantage of those tools is when you have a number of distributed
teams and you want to scale to millions of users. For nearly anything else, a
far smaller setup is what you want.

~~~
joncrane
I would argue that the majority of AWS' profits come from inefficiencies of
their customers.

They know, and try to educate.

The crazy thing is, even used inefficiently, the Cloud is still a very good
value prop for a lot of businesses.

~~~
marcosdumay
> The crazy thing is, even used inefficiently, the Cloud is still a very good
> value prop for a lot of businesses.

What is there to say? There are huge scale inefficiencies in maintaining a
small datacenter.

~~~
tatersolid
> What is there to say? There are huge scale inefficiencies in maintaining a
> small datacenter.

At $dayjob we have a small DC (8 racks). We’re facing chiller, UPS, and core
switching/cabling replacements all within the next two years.

We did the math, public Cloud is cheaper for us, especially since about 80% of
our instances can be shut down outside business hours.
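
To get a feel for how much that shutdown schedule matters, here is a minimal sketch. It assumes on-demand hourly billing, instances of equal cost, and a hypothetical 50 business hours per week; none of those numbers come from the comment except the 80% share.

```python
HOURS_PER_WEEK = 24 * 7


def billed_fraction(schedulable_share: float, business_hours_per_week: float) -> float:
    """Fraction of an always-on bill still paid when part of the fleet
    runs only during business hours."""
    always_on = 1.0 - schedulable_share
    scheduled = schedulable_share * business_hours_per_week / HOURS_PER_WEEK
    return always_on + scheduled


# 80% of instances run 50 h/week (assumed), the rest run around the clock.
print(f"{billed_fraction(0.8, 50):.0%} of the always-on compute bill")
```

Under those assumptions the compute bill drops to about 44% of what running everything 24/7 would cost.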

~~~
joncrane
Nice. It's always nice to see grass roots use cases for Cloud.

------
DonHopkins
Maybe I'm dense and there is some obvious source of data on the AWS admin
pages that I haven't been able to find by clicking around, searching the
documentation, and googling, but I still have this question:

Why is the cost and size of snapshots so opaque? Is there a way to see how big
each one is, and how much it will cost to keep it around for a month?

I understand that snapshots are differential, and deleting one snapshot in a
series may move blocks to other undeleted snapshots, of course.

I just want to know how much storage each snapshot requires right now, and to
be able to accurately predict how much each one of them will cost at the end
of the month, so I can decide how often to make snapshots and how long to keep
them.

And is there an easy "serverless" way to automatically archive a snapshot in
glacier storage or simply download it, without manually making it into a new
volume and using "dd" or "tar" or "dump" or whatever?

I understand they're stored in S3. So why can't I see a list of them in the S3
interface, see how big they all are, and tell how much each of them is going
to cost at the end of the month? It's my data. I'm paying for it. I think I
deserve to be able to see it and know how much it will cost.

I get the feeling I'm missing something really obvious (probably a big
blinking green button on the main page that my brain is filtering out). Surely
many other customers must want these features?

~~~
res0nat0r
Unfortunately there isn't currently any way to tell how much data each
snapshot actually contains. There also isn't any way to archive these to
Glacier, but the somewhat new lifecycle management tool might help you save
some money with little effort: [https://aws.amazon.com/blogs/aws/new-
lifecycle-management-fo...](https://aws.amazon.com/blogs/aws/new-lifecycle-
management-for-amazon-ebs-snapshots/)

------
koolba
Any analysis of AWS costs that doesn't start off talking about outbound data
transfer pricing, let alone mention it at all, is useless for anything beyond
pure-CPU workloads.

~~~
privateSFacct
Agreed.

------
JackC
I'm not sure AWS costs are something you can do by remembering a few
guidelines, rather than actually assigning an order-of-magnitude estimate to
every billable thing and adding it all up.

Like the one that bit me recently was a $0.05/1000 cost per thing, which is
easy to translate to $0.00005/thing and then mentally round to $0. That one
added up to real money at 10^8 things, and would have been a big problem at
10^9. The cool thing about horizontal scaling is that doing something 10^6
times or 10^9 times is going to feel pretty similar when you do it -- the only
difference is the order of magnitude of the bill.
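
A quick sketch of how that rounding error grows, using the $0.05 per 1,000 rate from above (the function name is just for illustration):

```python
USD_PER_THOUSAND = 0.05  # rate quoted in the comment


def bill(count: int) -> float:
    """Total cost in USD for `count` billable things."""
    return count / 1000 * USD_PER_THOUSAND


for exponent in (6, 8, 9):
    print(f"10^{exponent} things: ${bill(10 ** exponent):,.0f}")
```

That works out to about $50 at 10^6, $5,000 at 10^8, and $50,000 at 10^9.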

------
danpalmer
I’d add to this: numbers every SaaS salesperson should know, such as the
resources used by their services, or roughly what the marginal cost of a new
customer is.

Most SaaS services have very high markup, but get a few bits of infra wrong,
over promise on a few things, and use expensive AWS services, and suddenly
you’re selling $500 a month of AWS services for $99 a month. This is an easy
mistake to make, I’ve seen it happen with these figures, and it only takes a
few wrong assumptions across the engineering and sales teams.

------
CherryJimbo
As others have said, bandwidth costs can be absolutely insane with AWS. This
was actually the primary reason we moved from S3 to Backblaze B2 as documented
at
[https://news.ycombinator.com/item?id=19648607](https://news.ycombinator.com/item?id=19648607),
and saved ourselves thousands of dollars per month, especially in conjunction
with Cloudflare's Bandwidth Alliance.
[https://www.cloudflare.com/webinars/cloud-jitsu-
migrating-23...](https://www.cloudflare.com/webinars/cloud-jitsu-
migrating-23tb-from-aws-s3-to-backblaze-b2-in-7-hours/)

We still use AWS for a few things and still have a small bill with them every
month, but we're very careful about putting anything there that's going to
push a lot of traffic.

~~~
vidarh
People really should be aware of this, yes. Even if your client/management
absolutely insists on S3 for reputed durability etc., if bandwidth costs are
high, you can often get 'free' extensive compute resources by hiring servers
elsewhere to proxy and cache S3 access to cut bandwidth bills and run other
things next to the largely network bound load.

In general I find the big problem with AWS is that cost is handled 'in
reverse': developers often get near free rein, and cost only gets handled
when someone balks at the size of the AWS bill. Often it turns out to be
trivial to cut by changing instance types or renting a server somewhere to act
as a cache. At that point people have often spent tens of thousands in
unnecessary fees.

There's an underserved niche for people to do AWS cost optimization on a 'no-
win no-fee' basis.

I used to help clients with cutting costs on AWS and if people were in doubt
I'd often offer that.

And the savings were often staggering (e.g. halving hosting costs was common;
once we cut costs 90 percent by moving to Hetzner for a bandwidth intensive
app even though long-term storage remained on S3).

The biggest challenge in serving that niche is getting people to realise they
may be throwing money out the window as surprisingly many people still assume
AWS is cheap, and offering to do an initial review for free and not charge if
I couldn't achieve the promised savings made it a lot easier. Someone who
likes the sales side more could make a killing doing that.

~~~
secabeen
This is why the "Netflix uses AWS!" rhetoric is misleading. Yes, they use AWS
extensively for front-end, analytics, transcoding, billing (now), etc. The one
thing they don't use AWS for much at all is Content Delivery (AKA big
bandwidth). That uses the Netflix Open Connect CDN which is entirely developed
and run in-house.

~~~
vidarh
Also, I've worked with companies a tiny fraction of Netflix's size who got steep
discounts. It's quite possible AWS becomes cost effective when you have
millions in yearly spend as leverage. The problem is surviving until you get
there.

------
dantillberg
Similar numbers for Google Cloud in the less-expensive regions:

1 vCPU: $17/mo for sustained usage, $10.5/mo for 3yr commit, $5/mo for
preemptible (similar to spot)

1 GB RAM: $2.2/mo for sustained usage, $1.4/mo for 3yr commit, $0.7/mo for
preemptible

Google Cloud also lets you choose exactly how much CPU and RAM you want,
within reasonable limits (you have to allocate 0.9GB RAM per vCPU and you pay
a little more for RAM above 6.5GB per vCPU). So these numbers aren't
medians/derived values for Google Cloud but numbers that you can find in the
docs at
[https://cloud.google.com/compute/pricing](https://cloud.google.com/compute/pricing).
(I don't work for Google, I'm just a fan of Google Cloud)
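
Since these are per-unit prices, a custom instance estimate is just a weighted sum. The sketch below uses only the rates listed above and deliberately refuses shapes above 6.5 GB per vCPU, because the extended-memory surcharge is not given here; plan names are made up for the example.

```python
# Monthly USD rates quoted above (less-expensive regions).
RATES = {
    "sustained": {"vcpu": 17.0, "ram_gb": 2.2},
    "commit_3yr": {"vcpu": 10.5, "ram_gb": 1.4},
    "preemptible": {"vcpu": 5.0, "ram_gb": 0.7},
}
MIN_GB_PER_VCPU = 0.9
MAX_STANDARD_GB_PER_VCPU = 6.5


def monthly_cost(vcpus: int, ram_gb: float, plan: str = "sustained") -> float:
    """Estimated monthly USD cost of a custom machine shape."""
    per_vcpu = ram_gb / vcpus
    if per_vcpu < MIN_GB_PER_VCPU:
        raise ValueError("below the 0.9 GB per vCPU minimum")
    if per_vcpu > MAX_STANDARD_GB_PER_VCPU:
        raise ValueError("extended-memory surcharge is not modelled here")
    rate = RATES[plan]
    return vcpus * rate["vcpu"] + ram_gb * rate["ram_gb"]


for plan in RATES:
    print(f"2 vCPU + 8 GB, {plan}: ${monthly_cost(2, 8, plan):.2f}/mo")
```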

------
Hamuko
Every programmer?

In our team, only like two, maybe three programmers are concerned about our
AWS architecture at all. I don't really know why everyone in the team should
know how much the individual components of AWS cost. They should just know how
to not write performance-destroying code that mandates scaling instances up.

~~~
bpicolo
Eagerly await the follow up: Falsehoods programmers believe about AWS costs

~~~
jraph
And then:

- 10 reasons why AWS costs so much. #7 is mind-boggling.

- AWS costs considered harmful

~~~
jsty
Don't forget:

- "Developer uses one weird trick to reduce cloud bills by 97%. AWS hates
him!"

------
meuk
These 'every programmer should know' things should stop -- surely a web
developer isn't supposed to know latency figures (or 114 pages about the
details of RAM, see [1]). Likewise, most programmers don't care about AWS
costs.

[1]
[https://people.freebsd.org/~lstewart/articles/cpumemory.pdf](https://people.freebsd.org/~lstewart/articles/cpumemory.pdf)

~~~
Kalium
I've found it very helpful to know about the differences in latency between
data in RAM on the client and data in a database connected to a remote
webserver. It pushed me to reduce the number of API requests and think about
what data is needed when.

These are things developers _definitely_ should know.

~~~
meuk
If that's your job, then you should, but not _every programmer_. At least make
the title of your post "every web programmer" or "every programmer using AWS".
It makes the post a lot less clickbaity.

------
OisinMoran
This is an interesting idea. Could be handy to have a little tool to estimate
and compare costs across providers, maybe with pros and cons, constraints, and
even customer reviews. E.g. I only need 99% uptime, I will be serving images,
most of them will be the same etc.

I must say though, the graphs in this could be a good bit better. There is no
reason to use a line here, it makes it look like a time series plot. Each
instance type should have its own bar and they should be ordered in some
manner, either grouping by types or sorting by price.

~~~
wazoox
Trackit.io does exactly this. It's available on GitHub.

~~~
mdaniel
Since it wasn't obvious from their website (no GitHub icon in their "social
media tray"):

[https://github.com/trackit/trackit-
server#readme](https://github.com/trackit/trackit-server#readme)

~~~
wazoox
Sorry, I posted from my phone and hadn't much time :)

------
Const-me
GPU prices are ridiculous. You can often purchase an equally performing PC,
with a GeForce GPU, for the cost of just 1-2 months of cloud rent. Also, many
of them bundle VGPUs with large amounts of CPU and RAM. As far as I remember,
only Google allows custom specs; neither MS nor Amazon has custom machines
for users who only need GPUs.

~~~
dRaBoQ
It's pretty useful for students or those who want to play with the tech and aren't
sure if they will be committed to it long term. That's a pretty big market with
all the deep learning hype.

~~~
xodast
Also you can get a couple of hundred bucks of credit from multiple providers.
It was enough for me to learn and move on to getting my own gpu.

Often understated is how difficult it is to set up your local gpu for machine
learning. Still haven't figured out how to connect my 8 gpus in a functional
way.

------
INTPenis
I just got into AWS last year and this article does not mention the big mistake
that I made: using RDS for a small-scale project instead of EC2.

RDS is super expensive and while you're gaining traction and users you might
as well use an EC2 instance.

ElasticBeanstalk seemed like an easy intro to AWS but it steered me into RDS.
Of course I've abandoned the whole of ElasticBeanstalk for serverless
development lately.

My first intro to AWS was ElasticBeanstalk and serverless seemed too daunting.
But now I've been smitten and can't stop thinking serverless.

~~~
iends
I feel like serverless in a lot of cases is more expensive than just running a
monolith, once you get past the toy example or past the MVP stage.

This is especially true once you factor in all the development overhead, which
can ultimately be very expensive to do things correctly with serverless.

~~~
GordonS
Azure have open sourced their serverless engine, so you even have the option
of building out your own private serverless infrastructure.

~~~
INTPenis
What?! Please excuse my disbelief but I'm an old Linux veteran from the era
when we typed M$. ;)

Where can I find this? I assume somewhere in the 1060 repos of
[https://github.com/Azure](https://github.com/Azure)

I also assume that even though it's open source, it requires licensed MS
Windows servers to run.

Still though, this is just another nail in the coffin of the OLD MS image.

~~~
GordonS
It's MIT, and the host is built on .NET Core, so you can run it on Linux if
you want:

[https://github.com/Azure/azure-functions-
host](https://github.com/Azure/azure-functions-host)

Under Satya Nadella the change in Microsoft has been little short of
astonishing. A few years ago, I could understand the continued cynicism - but
if anything the pace of change has only intensified.

------
TruffleMuffin
Useful to have also the 'network' costs from an edge location to a user. That
one just bit me so it's close in my mind right now.

------
fulafel
The #1 cost is increased complexity and opportunity cost of converting your
simple app to an interesting serverless/nosql/microservices/etc distributed
system. Of course the financial savings can be heroic and justify the effort
because everything is very expensive on AWS.

~~~
throwayEngineer
I'm not sure when AWS became the standard, but the programming communities
have lots of young kids learning their first backends on AWS.

I'm not sure this is a good thing. With a generation of developers dependent
on a single company's tools, it seems like the future will be painful.

~~~
quicksilver03
It's not a good thing, but no worse than having developers depend on Docker
without a basic understanding of the underlying OS.

~~~
serf
I'd contend that it's much worse. One (Docker) is a piece of software, one
(Amazon) is a corporation with a motivated will.

They are two different problems. One is basically the problem of copy-
paste/stackoverflow developers, the other is a walled garden problem that'll
make itself worse over time.

~~~
nmca
Docker is written by a corporation, and so in many senses it is an expression
of their will.

------
vbsteven
I would like to see a post like this comparing services like RDS or Mongodb
Atlas on different instance sizes. For example: what is a typical
writes|reads/second I can achieve on a Postgres RDS db.m4.large instance.

------
matwood
I find it easier to use this tool for quick estimations, and it is more
encompassing:

[https://calculator.s3.amazonaws.com/index.html](https://calculator.s3.amazonaws.com/index.html)

------
tbrock
I was surprised this is all about physical costs and doesn’t include digital
costs that folks should be aware of. I was expecting to see the latency
between AZs in a region and between regions for example.

------
arithma
A bit OT, but I wait for the day where bandwidth isn't a limiting factor and
we can host machines doing computations anywhere, instead of having to rent
out economies of scale. With Kubernetes and other private-cloud-like
solutions, wouldn't it eventually be meaningless to depend on centralized
cloud clusters and instead host your own and tap into an equal-footing
infrastructure. Projects like vast.ai (no affiliation) are examples of the
ideal I hope for.

------
jammygit
I was deciding between cloud hosts yesterday for a tiny project and got
reminded how frustrating it is to predict costs on aws. Even to get a monthly
ec2 price you need a calculator out, not including other costs. Their billing
reports are also very unclear to me.

I’m planning on digital ocean because of this: their costs are straightforward
and their interface is simple.

Edit: can anyone recommend an easy way to create different-provider backups of
a DO server?

~~~
freehunter
I use DigitalOcean because they're cheap and their costs are more predictable
than AWS's. I don't do full server backups since I can just 'git pull' and for the
most part have my server re-populated, but I do full backups of my Postgres
database every night. I use pg_backup and the AWS CLI tools to push the backup
to S3. I also keep a week of backups stored locally on the server in case I
blow up the database but not the server, I can restore without having to bring
it back from S3 (or in case S3 fails).
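
The "keep a week locally" part of a setup like this is a small retention rule. Here is a minimal sketch of just that rule; it assumes one backup per day identified by its date, and leaves the actual dump and upload steps out.

```python
from datetime import date, timedelta


def backups_to_delete(backup_dates, today, keep_days=7):
    """Return the dates of local backups that fall outside the retention
    window, keeping the most recent `keep_days` daily backups."""
    return sorted(d for d in backup_dates if (today - d).days >= keep_days)


# Hypothetical example: ten nightly backups, pruned on the tenth day.
today = date(2019, 6, 10)
existing = [today - timedelta(days=n) for n in range(10)]
print(backups_to_delete(existing, today))
```

With ten nightly backups, the three oldest are selected for deletion and the latest seven stay on disk.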

------
kakkksaknmdm
Any AWS metric that doesn't have the network costs is useless. If anyone wants
a dedicated unmetered server up to 20 Gbps I'd recommend datapacket.com.

~~~
corrigible
Agreed. The cloud is great if your workload is low-egress... AWS/GCP bandwidth
pricing is predatory at times.

------
gtirloni
There are various "AWS Costs Cheatsheet" posts but I couldn't find a
comprehensive _and_ updated one. Any suggestions?

~~~
EForEndeavour
Not a cheat sheet, but a comprehensive list of specs and pricing for every EC2
instance, in a really nice UI far more useful than mentally joining Amazon's
various tables:
[https://www.ec2instances.info/](https://www.ec2instances.info/)

------
ausjke
Unless you're a big company, why not just run Linode/DigitalOcean/etc.? Those
are much more user friendly, and AWS feels like a maze to me.

~~~
scarface74
I wouldn’t trust any of those to run a business on.

~~~
freehunter
Seems there are plenty of companies who _do_ trust DO and Linode enough to run
a business on.

[1]
[https://www.digitalocean.com/customers/](https://www.digitalocean.com/customers/)

[2] [https://www.linode.com/case-studies](https://www.linode.com/case-studies)

~~~
scarface74
The best non technical/political reason is that if AWS or Azure goes down or
you have an issue with them none of your higher ups or investors are going to
ask you, why did you choose AWS. It’s the old “no one ever got fired for
buying IBM”.

Besides, you can never go wrong politically by saying I chose the vendor in
the upper right corner of Gartner’s Magic Quadrant.

The technical reason is that if I want my hosting service to do more of the
“undifferentiated heavy lifting” and provide managed services, I won’t get as
much of an offering from Linode/DO. The last reason is that AWS Business
Support is excellent. They’ve helped me out with some head scratchers and when
I just didn’t want to figure something complex out myself. Their live chat is
awesome.

------
sonnyblarney
Egress fees are a big 'hidden' cost.

------
normalperson
Uh... no mention of RDS?

