
Pinterest Cut Costs from $54 to $20 Per Hour With Automatic Shutdowns - autotravis
http://highscalability.com/blog/2012/12/12/pinterest-cut-costs-from-54-to-20-per-hour-by-automatically.html
======
RyanGWU82
I'm Ryan Park -- I'm the Pinterest engineer quoted in the article. I'm happy
to answer any questions about our setup.

Just to clarify, the auto-scaling was specifically for a pool of web
application servers. At the time I gathered the numbers, there were 80 servers
in that pool. In the last few months we've been moving toward a service-
oriented architecture, and we've been able to use the same code to auto-scale
the internal services. Of course it's not possible to auto-scale stateful
servers like databases, but it's still saving us a considerable amount of
money.

We implemented the auto-scaling in early 2012, so it's been in use for almost
a year now. It only took about 2 weeks of engineering to build the system. It
does need occasional maintenance, but it's still worth the effort given how
much money it saves us.

~~~
jedberg
> Of course it's not possible to auto-scale stateful servers like databases

That's not entirely true. :) It's just a lot _harder_ to scale them
elastically. At Netflix we've done some proof of concept work on elastically
scaling Cassandra, although we don't have it in production yet. I think that
is one of our goals for 2013.

~~~
thezilch
Theoretically, there's going to be a state of affairs where it's not possible
-- that's physics. Maybe with multi-host tenant, so that we're not relying on
AWS networking / control-plane screwing with what is going to be the most
important part of that dance.

But this is about saving money; not being a tenant of multiple providers; not
spending more than "2 weeks."

~~~
stonemetal
Even on a boring old DB: Spin up a few read slaves, give them a few min to
sync to current, turn them loose. Sure it is only read scalability, but every
bit counts.

~~~
thezilch
I think you're missing the point that I'm not going after the DB layer or your
own software states (eg. read only). Most AWS outages are related to network
layers; you have to already be sync'd. If you want to do writes, sync'd enough
to still guarantee a quorum for your writes AND reads, and if the AWS
network(s) are fractured enough, you could theoretically be without access to
bringing up a stable "wan" for yourself. And then you may simply be hamstrung
by ELB outages and be left without hope altogether.

If we're talking about read-only, I don't see why a DB, in the traditional
meaning, has much to do with the issue.

These are not necessarily AWS faults; they are certainly trying to help. It
might be the year for being a tenant of more than one PaaS though, but that is
only going to up your costs and still not give infinite 9s.

------
jackowayed
For now they really only saved $300k/yr, which is less than the cost of two
engineers. Though I guess cutting a nontrivial expense that is likely to grow
by nearly a factor of three is pretty great and probably worth investing in.

That said, this adds complexity to their systems with the only benefit being
cost savings. Given that we can assume that no code is perfect, it's likely
that at some point the auto-downscaling will cause an outage or period of slow
responses, which could easily lead to lost usage and trust that costs them as
much as they're saving on ops.
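
The roughly $300k/yr figure can be reproduced from the headline numbers. This is a minimal sketch that assumes the drop from $54 to $20 per hour applies to every hour of the year, which the article's title suggests but this thread does not confirm; the function name is illustrative.

```python
# Rough annual savings implied by the headline hourly figures.
# Assumption: the $54 -> $20 reduction holds for all 8,760 hours of a year.
HOURS_PER_YEAR = 24 * 365


def annual_savings(before_per_hour, after_per_hour):
    """Return yearly savings in dollars for a constant hourly reduction."""
    return (before_per_hour - after_per_hour) * HOURS_PER_YEAR


if __name__ == "__main__":
    print(annual_savings(54, 20))  # 297840
```

If the reduction only applies during off-peak hours, the real figure would be lower than this estimate.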

~~~
sliverstorm
IMO designing systems that can be powered on and off regularly (and quickly)
is a good thing in itself; it encourages "proper" setup. I've found servers
that have been on for months or years tend to need manual intervention after a
reboot. When the machine could reboot every day, you can't have that.

In other words, it's just good design.

~~~
calpaterson
Nitpicking: but it's not good "in itself", it just has other benefits.

------
cperciva
_If spot prices spike and spot instances are shut down, on-demand replacement
instances are launched. Spot instances will be relaunched when the price goes
back down._

I heard this at AWS re:invent, and thought I must have misunderstood. I'm
still confused as to how this can possibly be a good strategy.

The pool of EC2 on-demand instances has a finite size -- it isn't magical --
and does hit its capacity limit from time to time. When there's high demand
for EC2 instances -- say, when there's an outage in another AZ -- you're
likely to see both spot prices going up _and_ a lack of capacity in the on-
demand pool. As a result, this strategy seems designed to only ask for on-
demand instances at the times when they're least likely to be available.
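
For readers unfamiliar with the quoted strategy, here is a minimal sketch of the decision logic as described: cover the pool with on-demand instances while the spot price is above the bid, and move back to spot when it drops. All names (`PoolState`, `plan`, `max_bid`) are hypothetical and not taken from Pinterest's system, and the sketch assumes every launch request succeeds, which is exactly the assumption questioned above.

```python
from dataclasses import dataclass


@dataclass
class PoolState:
    spot_running: int       # spot instances currently alive
    on_demand_running: int  # on-demand replacements currently alive


def plan(state, target, spot_price, max_bid):
    """Return the launches and retirements needed to reach `target` servers."""
    total = state.spot_running + state.on_demand_running
    if spot_price > max_bid:
        # Spot is too expensive: cover any gap with on-demand instances.
        return {
            "launch_spot": 0,
            "launch_on_demand": max(0, target - total),
            "retire_on_demand": 0,
        }
    # Spot is affordable: request enough spot instances to reach the target,
    # and retire on-demand only once running capacity exceeds the target.
    surplus = max(0, total - target)
    return {
        "launch_spot": max(0, target - state.spot_running),
        "launch_on_demand": 0,
        "retire_on_demand": min(state.on_demand_running, surplus),
    }
```

A real controller would also have to handle the case where the on-demand request itself fails for lack of capacity.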

~~~
jedberg
Spot instances come from both leftover on-demand instances as well as unused
reserved instances. So it's quite possible to run out of on-demand and still
have a low spot price.

~~~
cperciva
It's possible, sure... but only if the people who find that they can't launch
on-demand instances don't think to try spot instances instead.

~~~
jedberg
Many people aren't set up to handle spot instances. You need to be much more
resilient to single instance failures than when using on-demand or reserved
instances.

------
dclusin
$54 per hour for what? Is this simply the cost of running the instances? It
seems to me like pinterest is throughput bound and the vast majority of their
costs would be accrued via transit and storage charges.

~~~
aioprisan
I agree with your questions. I'd be interested in what percentage that
represents of the total per-hour bill for all services: not just CPU time, but
bandwidth as well.

------
jules
Given the premium that Amazon charges, how does this compare to dedicated
servers? And why are dedicated servers in the US so much more expensive than
in Europe?

~~~
aioprisan
Power is cheaper in Europe than in the US.

~~~
PanMan
Nope, it's the reverse, so that can't be the case. Some quick googling:
<http://en.wikipedia.org/wiki/Electricity_pricing> US is between 8-17 cents.
EU is 20-30+.

See also <http://blogs.platts.com/2012/11/20/electric_prices/>

~~~
gizzlon
That wikipedia article is not enough to back your claim. As you can see, it
lists a lot of caveats. In my limited knowledge, electricity pricing is quite
complicated, and those numbers are probably not even close to what business
and industry are actually paying.

Edit: The blog-post is more convincing, but again, are those numbers really
comparable? Maybe they are, I don't know enough about the subject, I just find
it a little too simple to just compare numbers from different websites without
deep knowledge of the topic.

Here's another link, the prices differ (?):
[http://epp.eurostat.ec.europa.eu/statistics_explained/index....](http://epp.eurostat.ec.europa.eu/statistics_explained/index.php/Energy_price_statistics)

------
jkat
In our investigation, we've found AWS to be around 5x more expensive. Being
able to save 63% during off hours (or, let's just say, reducing AWS' bill an
average of 30%) doesn't really seem to make much of a difference - and that's
with paying a large amount upfront.
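
The arithmetic behind that claim, as a small sketch: if AWS costs about 5x the alternative and auto-scaling trims the bill by an average of 30%, the remaining multiple is still well above 1. Both inputs are the estimates from this comment, not measured data, and the function name is illustrative.

```python
def remaining_cost_multiple(baseline_multiple, average_reduction):
    """Cost relative to the alternative after an average fractional reduction."""
    return baseline_multiple * (1 - average_reduction)


if __name__ == "__main__":
    # 5x more expensive, bill reduced by 30% on average.
    print(round(remaining_cost_multiple(5, 0.30), 2))
```

Under these assumptions AWS would still cost about 3.5 times as much as the alternative.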

~~~
fbuilesv
5x more expensive than what? Owning/leasing servers? Similar cloud providers?

~~~
jkat
More popular alternatives: either dedicated or colocating.

------
jacques_chester
Reminds me a lot of how some energy-intensive plants can spin up less-
efficient / quick-start units during offpeak hours to squeeze a little extra
production out.

And vice versa: most electrical utilities have slow-starting, efficient-as-
possible turbines that never get turned off (baseload -- coal is most common),
and a bunch of relatively inefficient but flexible turbines (usually natural
gas).

~~~
brazzy
> relatively inefficient but flexible turbines (usually natural gas).

Actually, natural gas plants are at least as efficient as coal. They're just
more expensive, especially if you turn them on and off a lot, which is pretty
bad for the lifetime of a lot of components.

~~~
hga
Depends on the kind of plant: there are baseline gas plants that heat water to
steam and thence to steam turbines, they're akin to coal plants, high capital
costs, low operating costs. Then there are straight gas turbines, akin to the
ones that power jet airplanes but more like the ones that power most of the US
Navy's ships. These have lower capital costs but higher operating costs and
are used for peaking.

------
WestCoastJustin
Here is the AWS re:Invent STP 204: Pinterest Talks Rapid, Cost Effective
Scaling on Amazon Web Services --
<https://www.youtube.com/watch?v=73-G2zQ9sHU>

------
jbigelow76
That's pretty interesting. I've been frustrated in the past by both AWS and
GoGrid (and I'm sure every other cloud provider) that keep incurring VPS
instance costs even when the instance is shut off. I understand that even if
I'm not using the VPS the resources need to be kept in reserve (in theory),
but the solution of destroying and reprovisioning instances sucks pretty bad,
way too time-consuming if you are dealing with only a handful, and
operationalizing it is not worth it.

I'd love to move to a provider that let me provision an extra instance or two
for either failover or testing/staging but not be charged for it if I wasn't
running traffic to it.

EDIT: I stand corrected, I might have been thinking of Rackspace's cloud
(can't remember what it's called now) instead of AWS. But I know for a fact I
am right on GoGrid (and pretty sure Azure) because I have a long email chain
arguing about charges for provisioned instances in off states.

~~~
james33
Unless AWS has changed something since I last used it (which admittedly has
been at least a year), they don't charge when an instance is off except for
storage.

~~~
josegonzalez
Well, not exactly true. If you pay for the Heavy Utilization instances, you are
charged regardless of whether you even have the instance allocated.

~~~
zwily
That's also not exactly true. If you purchase a reserved instance, you pay the
up front price, and then the reduced hourly price _for whenever the instance
is running_.

~~~
zwily
To anyone reading this later... I am wrong here. Heavy Utilization
reservations are charged whether or not you have instances running. Medium and
Light are only charged when running.
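
The billing rule as corrected here can be sketched as follows. The rate and hour counts in the usage example are made-up placeholders, not actual AWS prices, the upfront fee is left out, and the function name is illustrative.

```python
def reserved_hourly_charges(utilization, hourly_rate, hours_in_term, hours_running):
    """Hourly charges over a reservation term, excluding the upfront fee."""
    if utilization == "heavy":
        # Heavy Utilization: billed for every hour of the term, running or not.
        billed_hours = hours_in_term
    elif utilization in ("medium", "light"):
        # Medium and Light Utilization: billed only while an instance runs.
        billed_hours = hours_running
    else:
        raise ValueError(f"unknown utilization level: {utilization!r}")
    return hourly_rate * billed_hours


if __name__ == "__main__":
    # Placeholder numbers: $0.50/hour, 100-hour term, instance up for 40 hours.
    print(reserved_hourly_charges("heavy", 0.5, 100, 40))  # 50.0
    print(reserved_hourly_charges("light", 0.5, 100, 40))  # 20.0
```

For a pool that is shut down during off hours, this difference determines whether the reservation keeps costing money while the instances are off.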

------
teeja
I'm wondering if you did (or anyone has done) any cost effectiveness on using
SSDs rather than HDs. (Or maybe that's an order-of-magnitude too low to
consider?)

~~~
thezilch
Netflix has done such benchmarks [0]; TLDR: _their workload_ cut costs in half
with hi1 instances.

[0] [http://techblog.netflix.com/2012/07/benchmarking-high-perfor...](http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html)

