
Effectively using AWS Reserved Instances - mglukhovsky
https://stripe.com/blog/aws-reserved-instances
======
jedberg
Despite being a strong advocate for AWS, this is where I will say Google
completely outshines Amazon.

Google's approach to pricing is, "do it as efficiently and quickly as
possible, and we'll make sure that's the cheapest option".

AWS's approach is more, "help us do capacity planning and we'll let you get a
price break for it".

Google applies bulk discounts after the fact, AWS makes you ask for them ahead
of time.

~~~
yani
Using GCP still feels like comparing an early Android to a modern iOS. I guess
they have to sell it cheap when the features and quality are not there.

~~~
outworlder
Whatever GCP has is rock-solid and often superior to AWS.

For example: when AWS encounters non-catastrophic issues with their
hypervisor, you are on the hook for moving the instances away (meaning stop-
start, or termination and relaunch for instance store). Depending on the
instance type, this can cause service disruption.

GCP will transparently migrate the VM _while it is running_ for you. You never
see it, your customers don't either.

Same for networking: if you use their "premium" network, you can have anycast
IPs to the closest POP, which will route traffic on Google's Network, not the
open internet. AWS does not have anything close to this, the closest is multi-
region VPC peering, without the fancy routing.

AWS offers more features though, which could be important if you require them.

~~~
stephengillie
> _GCP will transparently migrate the VM while it is running for you. You
> never see it, your customers don't either._

This is just hot-migration between hosts. It "stuns" a VM for a fraction of a
second - very high performance databases and videogame servers will sometimes
notice, while everything else just sees a spot of lag and keeps going.

AWS are an odd cloud vendor who don't support many common cloud features:

\- No hot migrate between hosts.

\- No hot-add RAM or CPU.

\- No memory-state snapshot, only disk snapshot.

\- No arbitrary CPU or RAM quantities, only "t-shirt" sizes - can't build
servers in "nonsensical" configurations like 12 CPUs and 1GB RAM, or 1 CPU and
128 GB RAM.

This is on top of having arbitrary separators in their data centers, so they
send you strange messages about having to delete and rebuild your servers in
the same data center. AWS may think it's cute to sell different "areas" in
their data center, but the way to have redundancy for servers in AWS us-east-1
is to have servers in Azure US West 2 or GCP us-central-1.

Like early iOS in the smartphone space, AWS are dominant in the VM space
through marketing, not features.

~~~
cthalupa
>AWS may think it's cute to sell different "areas" in their data center

[https://aws.amazon.com/about-aws/global-
infrastructure/](https://aws.amazon.com/about-aws/global-infrastructure/)

From re:invent presentations, we know that even availability zones might be
made up of multiple datacenters. In 2014, James Hamilton's presentation said
that one of the AZs in us-east-1 had 6 separate datacenters.

I don't think it's really accurate to say AWS is selling us different 'areas
of a datacenter' when we know that AZs are not only not sharing a datacenter,
but might be multiple datacenters themselves.

~~~
stephengillie
From a certain point of view, it's Amazon using different terminology - what
others call a data center, they call an AZ...or even a subset of an AZ, and
we'll never know exactly - nor the capacity of each[0]. And what they call a
data center, others call a region.

[0] When a server class is "sold out" in a region, you can't start your server
- but there's no indication of this anywhere until you try to start your
server. Other cloud providers auto-rebalance VMs to make space - using AWS is
sometimes more like using physical servers than VMs - maybe moreso with
paravirtualization.

~~~
cthalupa
I'm really confused as to what you're trying to say.

AWS has been very open over the years about what their terminology means. When
they say datacenter, they mean it in the traditional sense of the word. When
they say an AZ is made up of at least one but sometimes multiple datacenters,
they mean that an AZ has multiple physical datacenters. They're not
slicing up a server room and calling these multiple datacenters.

We also know that AZs are physically separated from each other.

So an AWS region has at least as many physical datacenters as it has AZs, and
potentially quite a few more.

James Hamilton has talked pretty extensively about this stuff at re:invent,
and as an AWS customer, his talks have been some of the most interesting to
me.

Other people calling a datacenter a region doesn't suddenly reduce an AWS
region down to a datacenter. A datacenter is a word with a pretty specific
definition, and an AWS region does not fit that definition.

~~~
stephengillie
The confusion is entirely mine. Your clarity is appreciated.

~~~
cthalupa
You're welcome!

My apologies if I came across as combative.

------
babaganoosh89
Google Cloud's pricing approach seems much more sane than each company having
to spend all this effort juggling reserved instances.

~~~
sonnyblarney
It's not so much sane, as possibly a little more friendly to those who can't,
or don't want to, do capacity planning.

Visibility in terms of outcomes means savings, or rather, variability means
cost.

Ultimately, you're going to bear the cost if you cannot provide visibility
because Google is not likely ever going to do it as well as you can for your
own business.

Ultimately, if you knew exactly what you needed over the next few years, the
cost would be significantly cheaper as there's no need to have slack or wasted
capacity. Google effectively forgoes this option entirely and assumes at least
some minimum volatility, which means more cost.

I think it would be nice to have Google's offer, but then also a longer term
'lock in' low price option as well, as frankly, this fits a lot of businesses.
Most of the economy is not as dynamic as Valley startups.

~~~
boulos
Disclosure: I work for Google Cloud.

> I think it would be nice to have Google's offer, but then also a longer term
> 'lock in' low price option as well, as frankly, this fits a lot of
> businesses.

We hear you. That's why we offer Committed Use Discounts [1]. Are you saying
that a 3-year commitment to a specific price (or lower, as we do price cuts)
is insufficient though? (I want to understand)

[1] [https://cloud.google.com/compute/docs/instances/signing-
up-c...](https://cloud.google.com/compute/docs/instances/signing-up-committed-
use-discounts)

~~~
sonnyblarney
I'm only making a very general reference to the fact that long-term visibility
and predictability entails lower cost and therefore lower price, and that
business owners are likely more empowered to determine that outlook than the
cloud provider, either AWS or Google. Ergo - some kind of customer oriented
long term lock-in would likely, in the long run, produce the cheapest prices
in the system. That's all.

~~~
boulos
Ahh, I misunderstood your point! Yes, it’s certainly easier for us providers
if you provide a clear demand signal. RIs, CUDs, SUDs, and even general
contracting terms each provide some measure of information between customer
and provider.

We actually had an interesting debate about this topic at the NSF Workshop on
Cloud Economics [1]. The AWS person sadly had to cancel, but both some Google
people and MSFT people were present along with CS and Economics academics.
There are lots of industries where similar behavior exists, e.g., airlines or
hotels. If you book in advance, or commit to a room block, you get a discount
in exchange for certainty. We had lots of amusing debate about how similar
cloud actually is to say energy markets (which turn out to be incredibly
distorted, regulated, and confusing). Hopefully David and the rest of the
organizers will have their summary report out within a few months.

[1] [https://umass-sustainablecomputinglab.github.io/nsfw/](https://umass-
sustainablecomputinglab.github.io/nsfw/)

~~~
Joe8Bit
(as a ridiculous aside, apparently having 'NSFW' in a link can trigger overly
aggressive web filters)

------
alex_young
The notion that the break even point is 70% is ignoring some really important
stuff.

If you reserve workload x on hardware y for n years, you're effectively
strapping yourself into a sure-to-be-obsolete and more expensive platform
which you'll have to then move off of at an arbitrary point n years in the
future.

If you don't move, you wind up paying a premium to be stuck with the obsolete
/ more expensive platform just to avoid the cost of migration.

RIs are a lock in.
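
The break-even argument above can be made concrete with a little arithmetic. This is a minimal sketch, assuming flat hourly pricing; the function names and the prices in the note below are hypothetical and are not taken from the article or from any AWS price list.

```python
def break_even_utilization(on_demand_hourly: float,
                           reserved_effective_hourly: float) -> float:
    """Fraction of the term an instance must run before reserving pays off.

    reserved_effective_hourly is the all-in cost of the reservation spread
    over every hour of the term, whether or not the instance is running.
    """
    if on_demand_hourly <= 0:
        raise ValueError("on_demand_hourly must be positive")
    return reserved_effective_hourly / on_demand_hourly


def reserved_is_cheaper(utilization: float,
                        on_demand_hourly: float,
                        reserved_effective_hourly: float) -> bool:
    """True when expected utilization meets or exceeds the break-even point."""
    threshold = break_even_utilization(on_demand_hourly,
                                       reserved_effective_hourly)
    return utilization >= threshold
```

With a hypothetical on-demand rate of 0.10 and an effective reserved rate of 0.07, the break-even utilization is about 70%. If a newer instance type later offers equivalent capacity at 0.08 on demand, the same reservation only pays off above 87.5% utilization, which is one way to put a number on the lock-in cost described above.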

~~~
hinkley
I wonder how this is better than buying hardware. Some of the same problems
but more flexibility than RI.

~~~
vidarh
It's not _unless_ you depend on a huge amount of other AWS services. Buying
hardware and colocating - or even paying month to month to rent servers from a
dedicated hosting provider will typically be much cheaper than reserved
instances.

~~~
wongarsu
Which is likely the reason AWS bandwidth charges are so high. That's their
lock-in factor that often makes it infeasible to use cheaper servers elsewhere
as long as anything else is on AWS.

------
msravi
My experience with AWS reserved instances has not been very good previously.

1\. Once you buy a reserved instance, you're locked in to that type and price
for the duration, even though newer types at lower prices may get introduced
(as they almost definitely would over 1-3 yrs).

2\. If you're from outside the US, you might not be able to resell your
reserved instance. So you're stuck with an old instance type at an inflated
cost.

In contrast, Google Cloud just gives you a price equivalent to a reserved
instance price (or better), based on hours of usage, without asking for an
upfront commitment.
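
To illustrate how a usage-based discount can be applied after the fact, here is a rough sketch of a tiered schedule in which each successive slice of the month is billed at a lower multiple of the base rate. The tier boundaries and multipliers are illustrative assumptions, and `usage_based_cost` is a hypothetical helper; check the provider's pricing page for the real schedule.

```python
# Illustrative tiers (an assumption, not a published price list):
# (upper bound of usage as a fraction of the month, multiplier on base rate)
ILLUSTRATIVE_TIERS = [(0.25, 1.0), (0.50, 0.8), (0.75, 0.6), (1.00, 0.4)]


def usage_based_cost(hours_used: float,
                     hours_in_month: float,
                     base_hourly: float,
                     tiers=ILLUSTRATIVE_TIERS) -> float:
    """Bill each slice of monthly usage at its tier's multiplier."""
    if not 0 <= hours_used <= hours_in_month:
        raise ValueError("hours_used must be between 0 and hours_in_month")
    fraction_used = hours_used / hours_in_month
    cost = 0.0
    lower = 0.0
    for upper, multiplier in tiers:
        if fraction_used <= lower:
            break  # usage never reached this tier
        # Portion of the month that falls inside this tier.
        portion = min(fraction_used, upper) - lower
        cost += portion * hours_in_month * base_hourly * multiplier
        lower = upper
    return cost
```

Under these assumed tiers, an instance that runs for the whole month pays 70% of the undiscounted total, and nobody has to predict that usage in advance.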

~~~
njovin
I’ve gotten proactive emails from our account manager when they release
new/cheaper instances and they offer us the option to transition and get a
credit for our existing RIs.

We aren’t a huge account (less than 30k/month) so I thought this was a nice
gesture on Amazon’s part.

~~~
scrollaway
Our 15k/month account did not get such an email when they introduced the dc2
redshift class, despite most of our spend being on Redshift.

~~~
Joe8Bit
I'm consistently surprised at how big the variance in quality of service is
from different AWS account managers; seemingly regardless of the size of
account.

The 2/3 reps we've had have been night and day in the level of service they've
given us, and we're a top 10% customer by volume.

------
hueving
The best answer is of course not to at all. Burst into the cloud, static
workloads in your own DC.

~~~
eldavido
This is terrible advice for all but the largest of organizations.

Running your own hardware is AWFUL. Get ready to dedicate an entire team to
network engineering, fixing broken hard disks, patching operating systems,
screwing around with RAID controllers, upgrading switches, planning power and
cooling, and endless vendor negotiation -- with ISPs, hardware manufacturers,
datacenter operators, etc.

Oh, and did I mention, throw elasticity out the window -- all of this has to
be planned in advance, and purchased, and installed, months ahead of when it
will be operable. So forget the ease and convenience of just spinning up more
capacity.

Also, there's a massive distraction of having to focus management attention on
this non-value-adding part of the business, all so that you can shave down
some cost, rather than investing in growing revenue.

Having been in this position firsthand, don't do this. If Netflix can run 1/3
of Internet traffic off of AWS, I guarantee they're much larger than you, and
it should say something that they'd rather outsource this part of their
business than deal with all this crap.

Focus on software and product/market fit. It's just a much, much, much better
use of expensive technical people than trying to replicate something someone
else already does for you, much better and at competitive costs. It will be
done on day 1, without any of the risk, hassle, or complexity of trying to
reinvent all of this yourself.

To be honest, I don't even want to deal with EC2 anymore; I'd rather just use
a PaaS.

~~~
Johnny555
It's been a few years since doing the cost benefit analysis between AWS and
self-hosting, but for around 50 racks worth of servers and storage, the
numbers came in on AWS's side.

That didn't even take into account the "free" multi-region capability you get
from Amazon. Splitting our physical servers into a second region with enough
capacity to failover would have nearly doubled our costs.

~~~
jjeaff
Why would you compare AWS vs managing your own data center?

You could also compare AWS vs building your own silicon.

I think it would be better to compare AWS vs renting dedicated servers from a
large provider? I think you will find that the scales tip heavily in favor of
renting bare metal as far as price is concerned.

~~~
Johnny555
_Why would you compare AWS vs managing your own data center?_

Because we were already managing our own data center.

 _I think it would be better to compare AWS vs renting dedicated servers from
a large provider? I think you will find that the scales tip heavily in favor
of renting bare metal as far as price is concerned._

We offloaded a lot of work to Amazon that we were doing ourselves -- database
hosting, storage system management, etc. (lots of little-used data went into
S3/Glacier that we previously had on live disks).

Also, we liked the ability to have a failover region essentially for free - we
only pay for enough servers to replicate the key data we need for failover,
and keep the rest of the infrastructure powered off.

~~~
mmt
> storage system management

I was a bit incredulous that any truly all-inclusive analysis could ever show
AWS being cheaper, but this phrasing made me realize that it could have been
the one (remarkably common) case where it usually does: enterprise hardware.

That world is _easily_ more expensive than AWS, especially considering that
hardware maintenance contracts are a thing (and a shockingly expensive one, to
those of us accustomed to the commodity hardware world).

> Also, we liked the ability to have a failover region essentially for free -
> we only pay for enough servers to replicate the key data we need for
> failover, and keep the rest of the infrastructure powered off.

That's a useful advantage, though there's a pitfall in that there's no
powering off EBS volumes.

------
rohan404
One of the major issues we've seen with our customers is that many of them
(especially startups and SMBs/SMEs) don't have the ability to dedicate a team
to just managing their RI capacity. We've also seen enterprise customers
optimizing up to 70% of their EC2 usage, but many of them have trouble
ensuring a level of utilization due to rapidly changing infrastructure. I'd
definitely argue that GCP has a better model for some use cases as it requires
less active effort for optimizing billing, however if you manage your RIs on
AWS effectively you can often get a better price. Looks like Azure has also
gone down the same route as AWS, which is quite an interesting move on their
part.

Disclosure: I head engineering/devOps at Engineer.ai - one of our products
Cloudops.ai allows our customers to save up to 15% of their AWS bill without
making RI purchases, as well as get discounted prices and additional
flexibility (custom lock-in periods) for RIs they do wish to purchase. Feel
free to reach out for information - my email address is in my about section.

~~~
jhatax
Disclosure: I am a PM on Oracle Cloud Infrastructure (OCI).

I am aware that AWS and GCP are the go-to options for this audience, and that
Oracle isn’t particularly favored for the Java lawsuit (among other things).
If you are able to set these grievances aside, the OCI pricing team has done
something unique: they have created a means by which you can effectively buy
credits from Oracle and use them for whatever service (current or future) you
need. It is called the Universal Credits Model (UCM) [1].

If you anticipate usage above a certain threshold, tier-based discounts are
available at the time of purchase. It’s like a store gift card; buy whatever
you want. This takes away some of the stress of capacity planning and
instance-type selection. Additionally, you can adopt new services and take
advantage of lower prices in the future.

With UCM, customers:

1\. Sign one single contract that provides unlimited access to all current and
future Oracle PaaS and IaaS services (Compute, DB, Block Storage, Blob
Storage, Network, etc.) spanning both Oracle Cloud and Oracle Cloud at
Customer.

2\. Gain on-demand access to all services plus the benefit of the lower cost
of pre-paid services. Depending on the projected spend, customers can
negotiate discounts.

3\. Possess the flexibility to upgrade, expand or move services _across
datacenters_ based on their requirements.

4\. Have the freedom to switch PaaS or IaaS services they are using without
having to notify Oracle.

5\. Can adopt new services when they GA.

Please send any questions my way, and I will get answers to you.

[1] [https://www.oracle.com/cloud/bring-your-own-
license/faq/univ...](https://www.oracle.com/cloud/bring-your-own-
license/faq/universal-credit-pricing.html)

------
jakozaur
My experience from Sumo Logic is that to take full advantage of RIs you need
to do capacity planning, and that takes some effort. Still, that's well over
30% in savings, which you need if you run at scale.

I would recommend using CloudHealth or another tool rather than custom ETL. I
tried to do it myself with my own tools, but got worse results than with a
dedicated tool.

However, a dedicated tool needs input from development. Sometimes it's worth
buying non-convertible RIs for bigger instances. Sometimes convertible RIs are
easier. I just found that amortisation for convertible RIs with some upfront
payment is incredibly tricky to calculate.
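
As a starting point for the amortisation problem, here is a minimal sketch that spreads a partial upfront fee evenly over the term. Straight-line amortisation is an assumption made for illustration; it does not model AWS's exchange rules for convertible RIs, and both function names are hypothetical.

```python
def amortized_hourly_cost(upfront_fee: float,
                          recurring_hourly: float,
                          term_hours: int) -> float:
    """Effective hourly cost: upfront fee spread over the term plus the hourly fee."""
    if term_hours <= 0:
        raise ValueError("term_hours must be positive")
    return upfront_fee / term_hours + recurring_hourly


def unamortized_upfront(upfront_fee: float,
                        term_hours: int,
                        hours_elapsed: int) -> float:
    """Portion of the upfront fee not yet 'used up' under straight-line amortisation."""
    if not 0 <= hours_elapsed <= term_hours:
        raise ValueError("hours_elapsed must be between 0 and term_hours")
    remaining_fraction = (term_hours - hours_elapsed) / term_hours
    return upfront_fee * remaining_fraction
```

The second function is where convertible RIs get awkward: every exchange partway through the term leaves some unamortised upfront value that has to be carried into the new reservation's books.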

------
toomuchtodo
> To automate this, we built an ETL process in SQL and Python that detects
> when we fall outside this band and automatically prepares a purchase for us
> to approve.

@Stripe: Will this (or parts of it) be open sourced?
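
For readers wondering what the band check in the quote might look like, here is a rough sketch. It is not Stripe's code: the coverage thresholds, the `PurchaseProposal` type, and `propose_purchase` are all hypothetical, and the sketch assumes the band is defined as reserved capacity as a percentage of baseline usage.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PurchaseProposal:
    instance_type: str
    count: int  # additional reservations to buy, pending human approval


def propose_purchase(instance_type: str,
                     baseline_instances: int,
                     reserved_instances: int,
                     lower_pct: int = 70,
                     target_pct: int = 80) -> Optional[PurchaseProposal]:
    """Return a proposal when RI coverage drops below the band, else None.

    Integer percentages avoid floating-point rounding at the band edges.
    """
    if baseline_instances <= 0:
        return None
    # Inside (or above) the band: nothing to buy.
    if reserved_instances * 100 >= lower_pct * baseline_instances:
        return None
    # Ceiling division: smallest reserved count that reaches the target.
    target_count = -(-target_pct * baseline_instances // 100)
    return PurchaseProposal(instance_type, target_count - reserved_instances)
```

The function only prepares a proposal and never buys anything itself, mirroring the "for us to approve" step in the quoted passage.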

~~~
citrablue
They published the code in a gist. It doesn't have a license, but since the
Python code is only 61 lines, it would be trivial to rewrite yourself from
their example.

[https://gist.github.com/lopopolo-
stripe/e00b4bfa0839c125ed7a...](https://gist.github.com/lopopolo-
stripe/e00b4bfa0839c125ed7aeb205a58164c)

~~~
p4lindromica
We've licensed the code with the MIT License.

------
meritt
Tangential to the point of the article but when did writing a select query
become an "ETL process"?

~~~
patio11
Suppose you have a data source and business logic which you want to run
periodically on the data source. Here are two scenarios which you could
reasonably implement this as:

Method one: You write a SQL query and some Python. You put a sticky note on
your computer "Remember to run that biweekly."

Method two: You pull up your shop's documentation for how to add the
(BIG_NUMBER)th entry into the data processing pipeline. This gets you
automatic scheduling, retries, monitoring, audit trails, alerts to the right
people in case of breakage, etc etc. You write a SQL query and some Python.
You plug it into the existing infrastructure.

------
bk_avalara
Similar idea to rolling your own, my company uses Cloudability for AWS
purchase planning. It saved a bunch of money as far as I remember.

[https://www.cloudability.com/](https://www.cloudability.com/)

------
yani
I thought that payment processors are using their own hardware. How is AWS
protecting their own customers' privacy? - can uncle Bob insert his fancy
flash drive, copy my data, and sell it? Before you say it is encrypted - where
does the encryption happen and don't AWS employees have access to the keys
too?

~~~
lowpro
AWS has different options for different companies/data. They even have options
for US government data that are certified by DSS I believe, and they have
options if you need PCI, HIPAA, and other types of compliance.

See: [https://aws.amazon.com/compliance/hipaa-
compliance/](https://aws.amazon.com/compliance/hipaa-compliance/)

[https://aws.amazon.com/compliance/pci-dss-
level-1-faqs/](https://aws.amazon.com/compliance/pci-dss-level-1-faqs/)

~~~
outworlder
It's worth noting that none of this is available on 'AWS' China.

