
Which is less expensive: Amazon or self-hosted? - oscar-the-horse
http://gigaom.com/2012/02/11/which-is-less-expensive-amazon-or-self-hosted
======
benologist
Why does the author assume 1 EC2 extra large instance is equivalent to 1
dedicated server and then base the required dedicated servers on the peak
required EC2 instances?

He also neglects to weigh ordinary web hosting with dedicated servers: 131
(based on the alleged equality between VPS and dedicated) $300 boxes hosted by
someone else, with 2x or more CPU, 2x the required bandwidth, and all that
dedicated disk I/O, for a full $30,000 per month less than AWS and $20,000 per
month less than the calculated self-hosting (which includes the amortized cost
of buying the servers), while outsourcing the physical maintenance of those
servers. Not to mention regular old big VPSs at xyzhostingcompany.

After everything it reaches the very predictable ending that AWS is worth it
if your requirements are variable.

------
ilaksh
I don't understand why so many people are fixated on Amazon AWS. Someone
please "explain" this to me.

It's overpriced and underpowered. Linode, Rackspace, and many other VPS
providers perform better and are a much better value.

To me, where it makes sense to go with a dedicated or self-hosted solution is
when you start needing servers with lots of RAM, because all of the VPS
providers will gouge you on RAM. They will charge much more per month for the
server than the cost of the extra RAM chips and CPU, so you will have paid for
the server within a couple of months.

I think that VPS providers will have to start lowering the prices for their
higher RAM instances pretty soon because RAM prices have gone down so far.

~~~
dangrossman
> I think that VPS providers will have to start lowering the prices for their
> higher RAM instances pretty soon because RAM prices have gone down so far.

Pricing of VPS and dedicated servers scales with RAM but is entirely unrelated
to the cost of RAM chips. Hosting companies use RAM as a proxy for their real
costs -- power/cooling, hardware wear/replacement, bandwidth, and support.
There's a strong correlation between the amount of RAM a customer purchases
and that customer's hardware utilization, which is why the industry has
converged on that component as the main factor in pricing.

Thinking that the $25/GB/mo these companies charge for RAM has anything to do
with the cost of physical RAM (which would be paid off in the first month) is
a mistake.

I'm as frustrated as anyone with the difficulty of finding affordable high-RAM
instances/servers without colocating, but complaining to the host that their
pricing should change because of the price of physical RAM won't get us
anywhere. That's not how they set their prices.

~~~
barrkel
Their proxy is also a market distortion. By not pricing RAM at closer to its
marginal cost, they encourage people to burn CPU instead of RAM in algorithm
design, which in turn increases power usage and creates cooling problems.

~~~
chc
You seem to be saying "In my imagination, their empirical observations are
wrong." What you're saying could theoretically happen, but that doesn't mean
it actually happens in real life.

~~~
barrkel
You seem to be replying to a comment that exists in your imagination.

When targeting an environment where CPU usage is cheaper than a 30GB hash
table, I'll choose the CPU usage. It's very simple. I am not actually
commenting on anyone's empirical observations, theoretical, imagined, or
otherwise.

------
kalleboo
I think it's pretty well known that for most use cases, cloud hosting is more
expensive than dedicated hardware.

That said, we're currently moving from a dedicated server to AWS after a bit
of nasty downtime. We have dedicated servers with 1and1, and the RAID in our
server died and striped bad data all over one of the disks, trashing half the
files with junk. 1and1 tech support refused to acknowledge the problem (and
claimed we had a software RAID setup…) and it took us a few days to get back
online from our weekly backups.

What I'm hoping for from Amazon as a cloud provider is better failure
handling: with 1and1, a failed machine means a few days getting a new one, or
paying double for a hot spare. With Amazon, even if dead instances happen more
often, killing one and spinning up a new one is trivial, and can even be
automated. Backups can be made much more often, non-intrusively, by using
snapshots.
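A minimal sketch of those snapshot-based backups, using boto3 (the current AWS SDK for Python, which postdates this thread; the volume id and naming scheme are assumptions for illustration):

```python
from datetime import datetime, timezone

def snapshot_description(volume_id):
    # timestamped description so snapshots are easy to identify and prune later
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%MZ")
    return "backup of {} at {}".format(volume_id, stamp)

def backup_volume(volume_id):
    # boto3 is imported here so the helper above can be read without the SDK installed
    import boto3
    ec2 = boto3.client("ec2")
    snap = ec2.create_snapshot(VolumeId=volume_id,
                               Description=snapshot_description(volume_id))
    return snap["SnapshotId"]
```

Run from cron (hypothetically, hourly), this gives point-in-time copies without taking the server down, which is the non-intrusive part.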

For reliability's sake I also like the idea of having a few small instances
behind Elastic Load Balancer instead of one beefy machine. I haven't seen
anything like ELB with a dedicated hosting provider (aside from using an
actual load balancer, which is a very expensive proposition).

Of course, not having to plan your capacity so far in advance and being able
to start small and scale out at the drop of a hat if something on your server
goes viral is a really exciting proposition as well.

------
autarch
What about the amount of dev time and sysadmin needed to fully use each
option?

If you want to take advantage of AWS for spiky use, you need to automate the
heck out of starting and stopping instances, redistributing requests to new
machines, etc.

Horror stories about EBS make me think that you'd better reconsider storage if
you're hosting everything with AWS too.

Of course, the flip side is that with a totally self-hosted system, you'll
probably need more sysadmin work, and you may end up spending money on things
like remote hands when a drive fails or a network card dies.

Then there's managed hosting. You don't have the super awesome scaling magic
of AWS, but you don't have to deal with the physical bits much either. And you
still get real physical hardware attached directly to each system when it
comes to storage.

I think that really understanding the costs is a lot more complicated than
this article suggests.

~~~
yummyfajitas
_If you want to take advantage of AWS for spiky use, you need to automate the
heck out of starting and stopping instances, redistributing requests to new
machines, etc._

The thing is, this is not difficult to do with AWS:

    
    
        $ ec2-run-instances $AMI_ID -z $ZONE --user-data-file spin-up-a-new-webserver.sh
        ...(parse the output for the instance id)...
        $ elb-register-instances-with-lb $LOAD_BALANCER --instances $INSTANCE_ID
    

(In real life, use Boto (Python) or equivalent in your language.)
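For instance, the same two steps might look like this in Python with boto3, the current AWS SDK (it postdates this thread, and it registers with a target group rather than a classic ELB; the AMI, subnet, instance type, and bootstrap script are all made-up assumptions):

```python
def bootstrap_script():
    # hypothetical user-data script the new instance runs on first boot
    return ("#!/bin/bash\n"
            "yum -y install nginx\n"
            "service nginx start\n")

def add_webserver(ami_id, subnet_id, target_group_arn):
    # boto3 is imported here so the sketch can be read without the SDK installed
    import boto3
    ec2 = boto3.client("ec2")
    elb = boto3.client("elbv2")
    # step 1: launch the instance with the bootstrap script as user data
    reservation = ec2.run_instances(
        ImageId=ami_id, InstanceType="t3.micro",
        MinCount=1, MaxCount=1, SubnetId=subnet_id,
        UserData=bootstrap_script())
    instance_id = reservation["Instances"][0]["InstanceId"]
    # wait until the instance is running before the load balancer sees it
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    # step 2: register it with the load balancer's target group
    elb.register_targets(TargetGroupArn=target_group_arn,
                         Targets=[{"Id": instance_id}])
    return instance_id
```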

Once the new instance comes online and is legitimately serving up pages, the
load balancer will begin redirecting requests to it.

This is why we use Amazon - handling stuff like this is just a matter of
calling their utilities.

------
halayli
I recently bought a server (Xeon 5606 + 24GB RAM + 2TB storage with hardware
RAID) for $2k and I am hosting it in a colo for $75. I find this ideal for
small services: it gives you a lot of performance, letting you go a long way
with a couple of machines before you need to scale. Adding more
CPU/RAM/storage will not add to your monthly payment since you own the
machine. Now, of course, it's not as convenient as AWS, and if a machine goes
down you are responsible for getting it back up.

Scaling in AWS is a piece of cake and gives you a lot of flexibility, but when
it comes to performance, especially RDBMS performance, I find AWS to be far
behind. Some will say that you can scale the service to get good RDBMS
performance, but that does not avoid the per-instance disk I/O slowness.

~~~
megaman821
If your network card fails or your CPU overheats, how long are you down for?

If you care whether your website is up, there are other costs to doing colo.
You should have a spare for every component, which nearly doubles your
hardware cost, and have someone on call for sysadmin and hands-on fixes.

~~~
halayli
You can fall back to AWS in those cases.

------
verelo
Our new startup is all AWS, and honestly I don't think we could have pulled it
off any other way. Key factors:

* Getting into a data centre is costly and difficult without venture funding

* When stuff breaks, I need someone else to go fix it... it's just too expensive and time consuming otherwise

* I want predictable expenses because we don't have a lot of money; not having to pay for repairs, and being able to create new servers easily myself, gives me this

I can see how we may need to move away from AWS down the road to reduce costs,
but honestly I'm not convinced it's going to be the difference between success
and failure.

Given what AWS provides in the short term, unless you're talking expenses of
$40K/month I wouldn't even waste time on self hosting. System administration
(hardware) is very expensive...

Edit: I should also note we need lots of geographical locations, so we're a
little different in that regard. AWS again gives us an easy means of being in
7 locations without opening any additional accounts.

------
DanBlake
I've said this a few times before. The main benefit of cloud hosting is hourly
billing. Unless you are utilizing this feature extensively, you will almost
always be better off in a dedicated/colo environment. You will get more
power/bandwidth for less money.

That being said, there are some edge cases. If you make extreme use of
Amazon's additional features (SDB/EIP/etc.), then the Amazon "bundle" could
make it a better situation for you.

I'm only writing this because I see so many HN'ers say "Well, I can spin up an
extra large instance at any moment to handle extra traffic!" The fact is, most
of you don't do that. And even if you do, you would likely save money just
being on a dedicated server anyway. Cloud hosting is really for "overflow" and
nothing more.

Again: hourly billing is cloud hosting's biggest advantage.

~~~
blantonl
There are other significant benefits to cloud hosting. Specifically on AWS:

* Storage flexibility with EBS and S3

* instant migration abilities between instance sizes

* backup and restore

* provisioning of resources when needed

* staging, dev, test resources

SDB and ELB usage certainly aren't edge cases; they are core to many, many
deployments.

The provisioning flexibility alone is a tremendous benefit considering that
most highly trafficked Web properties are not static environments. They
constantly innovate and have demands for infrastructure that can really be
challenging in a dedicated environment.

In our case, we use AWS extensively, but where we have been able to carve out
a "static" set of resources, we do, and host it with 100TB.com -- simply
because they offer so much bandwidth for cheap, and those static resources
pump out a lot of bandwidth, which isn't cost effective on AWS for day-to-day
operations.

Edit: from a financial standpoint, if you choose to purchase reserved
instances for AWS, hosting becomes far closer in cost to dedicated
environments.

~~~
wahnfrieden
RDS also saves a lot of time if you're using MySQL with replication and
snapshotting. It makes a nontrivial amount of work trivial.

~~~
numlocked
At the risk of being an obnoxious pedant, "nontrivial amount of work" is, to
me, a non sequitur. Trivial describes the complexity of work, not the amount.
Work is either well understood (trivial) or not well understood (non-trivial).
In either case the work effort itself can be variable.

I can't comment on the accuracy of your comment otherwise. Those who can,
write content. Those who can't, nitpick semantics :)

~~~
georgemcbay
If you want to be a hardline language pedant about it, "trivial" describes
neither an amount _nor_ complexity, but commonness.

But either way, yeah, you're being obnoxious, because idiomatically "trivial"
applied to work fits fine and is easily understood in all of these senses
(complexity, amount, and common vs. uncommon).

I certainly wouldn't consider collecting all the trash in the LA metro area
"trivial work" by the well-understood meaning of the phrase even though it is
commonly done and it is easily broken down into non-complex steps.

~~~
wahnfrieden
Prescriptivist vs descriptivist, and entirely off-topic.

------
ctdean
> Because labor is a mostly fixed cost for each alternative, it will tend not
> to impact the relative comparison of the two alternatives

I don't buy this. My experience is that the ops costs of a co-location
facility are much higher than AWS's. The ops cost function doesn't seem to be
linear as he describes, either.

People costs dominate early on and are a huge factor until you start to reach
steady state, and that's the variable you need to optimize for.

Having said that, AWS is expensive. If dollars are worth more than hours to
you, then yes, by all means host things yourself.

------
garyrichardson
This is looking at one aspect of hosting: your hosting bill at the end of the
month.

If you are taking advantage of the various services AWS offers, you can save
development time. For example, if I can get up and running quickly using RDS
and save a bunch of time compared to setting up replicated MySQL, maintaining
backups, etc., I'll gladly pay the extra cost.

The same goes for load balancers, memcached, etc. Sure, I can save money in
the long run once I've established my app. Initially, I don't want to waste a
bunch of time bringing these services up on my own.

------
XERQ
I run SSD Nodes, Inc. (<http://www.ssdnodes.com>) and we have various business
clients using our services for their peak offloading while maintaining their
in-house infrastructure (I can't be more specific than that because of our
privacy policies). I would recommend doing both, mainly because scaling is
super easy.

~~~
getsat
Holy crap, you guys support FreeBSD, too. I've been looking for a good
alternative to Linode for a while.

Do you guys have automatic provisioning/pro-rated billing? Do you do any shady
stuff like requiring cancellations two months in advance?

~~~
XERQ
We're using XenServer, so auto-provisioning is still on our plate since we
have to build all of that in-house (solutions like SolusVM don't work with our
unique infrastructure). For the most part getting a new account up and running
takes 15-30 minutes, and you can request an OS reload at any time through a
support ticket.

We're very upfront with our cancellation policy, which is 24 hours from when
your bill is due (I can't imagine companies requiring a month or two in
advance; that's absurd). Our reasoning is that if the service is easy and
painless to cancel, people will be more than willing to order again.

~~~
getsat
Cool, thanks for the info. I will be checking you guys out.

Tilaa.nl is one company that does the shady two-month cancellation stuff. They
suck.

It seems non-Linode VPS providers either have shady cancellation/retention
policies OR lack auto-provisioning/pro-rated billing.

~~~
XERQ
Sorry, I overlooked that last bit: we do offer pro-rated billing. The way it
works is that when a new client orders, that becomes their monthly billing
date. Any upgrades are pro-rated over the remaining days in the billing cycle.
If you order another package as an existing client on a different date than
your monthly billing date, open a ticket with the billing department and we'll
pro-rate it over the remaining days in the cycle. Some clients like having
different billing dates for their services; others prefer them all on the same
day each month. We're flexible and will accommodate your preferences.
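The pro-rating described above boils down to charging for the days remaining in the cycle; a rough sketch (the 30-day cycle length and the rates are illustrative assumptions, and billing days past the 28th are ignored for simplicity):

```python
from datetime import date, timedelta

def next_billing_date(order_date, billing_day):
    # first occurrence of billing_day strictly after order_date (billing_day <= 28)
    if order_date.day < billing_day:
        return order_date.replace(day=billing_day)
    # otherwise roll over into the next month
    first_of_next = (order_date.replace(day=1) + timedelta(days=32)).replace(day=1)
    return first_of_next.replace(day=billing_day)

def prorated_charge(monthly_rate, order_date, billing_day):
    # charge only for the days until the client's existing billing date
    days = (next_billing_date(order_date, billing_day) - order_date).days
    return round(monthly_rate * days / 30, 2)

# e.g. a $20/mo upgrade ordered 15 days before the billing date costs $10
```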

~~~
getsat
Cool, thanks for the replies.

------
jedberg
I noticed a few things in the calculation that bias it towards self hosting:

* Using us-west, the more expensive option

* Not including labor, which is significantly higher if you have to rack and stack yourself.

That being said, I agree with the conclusions. If you're traffic isn't spiky
or variable, then you might be better off self hosting.

~~~
thadeus_venture
"You're" instead of "your"? Really?

~~~
RegEx
You're comment does not contribute to the thread at all. Really.

~~~
thadeus_venture
I don't give a shit.

------
giusemir1978
I tend to agree with this analysis. I work in a datacenter, and I've noticed
that people transition from shared cloud to self-hosting (or dedicated cloud)
if they have few or no traffic spikes.

Those who are already self hosted use AWS to absorb traffic/computation
spikes.

------
sbov
As someone who has never written an application to be run in a cloud
environment, I've been wondering for a while now: what sort of extra
complexity is introduced in your code to handle the unreliability of any given
server, the network, and latency in general? What techniques are used to deal
with it?

I'm approaching this more from the viewpoint of a webapp that might only need
~10 dedicated servers to handle peak load with redundancy rather than ~150, so
our colo solution doesn't have servers constantly breaking on us.

------
rburhum
Simple example: for one of my projects, I needed to host 16+ TB worth of
satellite imagery and elevation data. I put a machine together with 24GB RAM,
RAID 5, and 8 cores for around $3,800. I pay $200 a month for unlimited
bandwidth (and I still have 2Us that I am not using). I got pricing from
Rackspace, Amazon, GoGrid, etc. Nobody came remotely close to that price.

For most of my other projects, I always use Amazon. But for this use case, no
other cloud/hybrid cloud service could compete.

------
Uchikoma
I compared Amazon to rented servers (with monthly cancellation possible) a
year ago, and even if you have spiky traffic, you need to dig deep into the
costs to find out whether cloud is cheaper for you.

It wasn't for me.

[http://codemonkeyism.com/dark-side-virtualized-servers-cloud/](http://codemonkeyism.com/dark-side-virtualized-servers-cloud/)

------
PaulHoule
The difference between $60k and $70k isn't all that significant --
particularly if you find that AMZN saves you labor.

------
sheraz
If you have a lot of storage needs then self-hosted is the way to go. We have
30TB storage mirrored across two data centers and pay less than $500 a month.
There is a copy on the east coast, one on the west coast, and one in the
office.

I wonder how many nines of reliability that gives us?

------
atesti
Why would it be so simple to use AWS compared to a rented server from, say,
Hetzner? On EC2 you still have to administer a Linux OS, install patches, take
care of incompatibilities, etc.

------
chaostheory
It depends on what you use them for. I use ec2 as a testing platform to
experiment and gauge performance. From that viewpoint, it's cheaper and easier
than a normal host.

------
nirvana
To be honest, I'm not much of an ops guy, so I'm going with Hetzner dedicated
servers. AWS is more difficult for me ops-wise than dedicated, and with
dedicated I get all of the "let the people who know how to run data centers
host your servers" service that EC2 gives you.

AWS and traditional architectures are too much ops load, require too much
specialized knowledge, and have too many single points of failure. Plus, if
Amazon has terms you don't like (I personally won't do business with them
given their treatment of WikiLeaks), your reliance on their protocols and
services makes it non-trivial to migrate elsewhere... and if you just use EC2
as bare machines, then there's no advantage to AWS over any other bare-machine
host (and a big cost disadvantage).

I'm building a cluster of three-times-replicated data distributed on top of
Riak. Every node is a web server, DNS server, and database node. Round-robin
DNS distributes the load, and if a node goes down, I don't even have to get
out of bed... it can wait till morning, because nothing should break. (Of
course, this is what I'm building; I can't say it has performed in production
yet, so this is theoretical...)

I call this Nirvana.
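The round-robin DNS load distribution described above can be illustrated as a rotation over the cluster's A records (the addresses are hypothetical; a real DNS server rotates the record order per response rather than tracking state like this):

```python
from itertools import cycle

# hypothetical A records, one per combined Riak/web/DNS node
NODES = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

def make_resolver(nodes):
    # each lookup hands out the next address in rotation, spreading the load
    rotation = cycle(nodes)
    return lambda: next(rotation)

resolve = make_resolver(NODES)
# successive lookups cycle through the nodes: .1, .2, .3, .1, ...
```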

The only SPOF I should have is the whole datacenter going offline -- this is a
legitimate risk, and once I get large enough to handle that risk, I'll upgrade
to Riak Enterprise and host in 2 data centers.

I'm not certain I'm not missing anything, but I don't understand why I'm
seemingly cutting new ground here -- this seems like the way everything should
run. (And if you agree and are interested in Nirvana, follow me on Twitter;
I'll be open-sourcing it as soon as I possibly can.)

------
shingen
Really depends on what you're doing and what you can afford.

If you're Foursquare and you can afford to double the cost of your
infrastructure because you want some of the benefits of the AWS platform (and
there are plenty of benefits these days) - then it's tremendous to say the
least. Amazon is doing really incredible things with AWS.

If you're doing less than a million uniques per day, you can go get three
tremendous machines and lasso them together -- a web server + main db + slave
db -- for between $800 and $1,250. You can get 100TB of bandwidth on a 1Gbps
port (Amazon gives you none), dual 5645 Intel processors (or 2x 16-core AMD),
48GB of RAM on the db machines (96GB if you want to pay another $200/month),
with SAS 15k drives in a RAID 10 config. An equivalent setup with Amazon would
cost you $5k to $10k depending on your config. You can get this setup from
reputable hosts like WebNX and SecuredServers; if you want to pay more for a
better host and get a little less, you can go with Rackspace or Gigenet or
Softlayer.

