
This is a no-brainer if you've ever done anything at scale. The explanation is rather simple - hardware is always "on the premises", yours or Amazon's. Someone needs to swap drives and motherboards, man the networking gear, run cables, etc. Amazon doesn't really get a break on the hardware cost, because 10,000 servers do not cost less per server than 100 servers (in fact they cost more as the volume goes up if you need them to be identical). When it comes to labor cost: if you have enough hardware to keep at least one full-time datacenter tech busy, you're in the same boat as Amazon.

So you're paying Amazon to do the same work you would do otherwise - only you're subject to their rules and procedures, and Amazon, being a profitable business, needs to mark its services up.




> The explanation is rather simple - hardware is always "on the premises", yours or Amazon's. Someone needs to swap drives and motherboards, man the networking gear, run cables, etc.

> So you're paying Amazon to do the same work you would do otherwise - only you're subject to their rules and procedures, and Amazon, being a profitable business, needs to mark its services up.

But I thought that they were paying SoftLayer to do that stuff instead of Amazon. They're not doing it themselves - and yet it's still cheaper!


I would like to know the cost calculation after a year or two. With a handful of servers it's easy to get the false impression that HW failures are rare.


Oh, there wasn't just a handful of servers after we finished the migration (we migrated a bit late, IMO, so we had a lot of traffic even back then). And today, with a much larger infrastructure, with hardware clusters specifically tailored to our customers' needs, etc., I'm pretty sure the same infrastructure on EC2 would cost more than 2x as much.

(Update) Re: failures - with ~50 servers we see a hardware issue (a dead disk in a RAID or an ECC memory failure) about once a month or so. So far none of those failures has caused a single outage (RAID and ECC RAM FTW).


I ran several dozen Dell blade enclosures fully maxed out - well over 300 server blades - and in 3 years I had two disk failures, neither of which was critical. Hardware is pretty reliable these days.


How do you monitor HW and network failures, and how do you notify SoftLayer? Is that 1-2 hour replacement time true for all components of your server fleet?


1-2 hours is their new-server provisioning time. For HW issues we use Nagios (which checks RAID health and ECC memory health regularly), and at the moment we just file a ticket with SL about the issue, showing them the output from our monitoring. They react within an hour, and HW replacement is usually performed within a few hours after that (usually limited by our ability to quickly move our load away from a box to let them work on it).
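
For the curious, a check like that can be tiny. Here's a rough sketch of what such a Nagios-style plugin could look like, assuming Linux software RAID exposed via /proc/mdstat and the kernel's EDAC memory counters (a hardware RAID controller would need its vendor CLI instead):

    #!/usr/bin/env python3
    # Rough sketch of a Nagios-style RAID + ECC health check.
    # Assumes Linux md RAID (/proc/mdstat) and kernel EDAC counters;
    # hardware RAID would need the vendor's CLI instead.
    import glob, re, sys

    problems = []

    # md RAID: a degraded array shows e.g. [2/1] instead of [2/2]
    with open("/proc/mdstat") as f:
        for m in re.finditer(r"\[(\d+)/(\d+)\]", f.read()):
            if m.group(1) != m.group(2):
                problems.append("degraded RAID array %s" % m.group(0))

    # EDAC: nonzero uncorrectable-error counts mean failing RAM
    for path in glob.glob("/sys/devices/system/edac/mc/mc*/ue_count"):
        if int(open(path).read()) > 0:
            problems.append("uncorrectable ECC errors in %s" % path)

    if problems:
        print("CRITICAL: " + "; ".join(problems))
        sys.exit(2)  # Nagios exit code for CRITICAL
    print("OK: RAID and ECC healthy")
    sys.exit(0)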


HW failures are rare - at least hardware failures that matter. Disks in a RAID set dying or redundant power supply failures are not critical, and even those are rarer than you would generally expect. With a bit of standardization it's incredibly cheap to keep a pool of spares handy and RMA the failed components at a leisurely pace.

Plus, you're still engineering your applications to be just as fault-tolerant as if they were running in the cloud, right? The only difference is you are not paying the virtualization overhead tax. A single server dying should leave you in a no less redundant state than a single VM dying. They should also be nearly as easily deployable.
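
To make that concrete, here's a minimal sketch of the sort of client-side failover that behaves identically whether the backends are physical boxes or VMs (the hostnames and port are hypothetical placeholders):

    import socket

    # Hypothetical replica list; one dead physical server or VM is
    # equally invisible to the caller, which just tries the next one.
    REPLICAS = ["app1.example.com", "app2.example.com", "app3.example.com"]

    def query_any(payload, port=9000, timeout=2.0):
        for host in REPLICAS:
            try:
                with socket.create_connection((host, port), timeout=timeout) as s:
                    s.sendall(payload)
                    return s.recv(4096)
            except OSError:
                continue  # this replica is down: fall through to the next
        raise RuntimeError("all replicas unreachable")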

This is based on my personal experience in datacenters with 5,000-10,000 installed servers. Anything other than a PSU or HDD failure is exceedingly rare.


We have 100 physical servers and hardware failures really are very rare. Very rare.

In fact, over 4 years we have had only 3 hard drives fail and no other hardware failures.


Do you have any plans to replace your hard drives as they get old, or do you just wait for them to fail?


Generally I'd say the extra cost savings come from the lack of software needed to support the Amazon-style APIs. They may also have made a multi-year commitment, which would further drive the cost down.


Why would 10,000 servers cost more than 100 servers? It seems like if you are buying all of your parts in bulk they are going to be cheaper. I'm pretty sure Intel's CPUs are cheaper by the tray than individually.

I know at least that when I've bought 20-30 servers at a time, I was able to get a lower cost than when buying only one.


Prices go down up to a certain quantity, then start to increase. Analogy: you want to buy shares of company X. If you buy 1, brokerage costs are high compared to your investment. If you buy 100, you still get the shares from the top of the order book, and fees become negligible. Buy 10M, and you will pay much, much more per share because the supply is not going to be there.

Just think of the simple supply-demand curve: as demand increases, price increases as well. Bulk discount pricing is only valid for amounts that provide better utilization of the supply chain. If Intel can produce 1M chips a month and somebody orders the last 50k, he might get a discount; if someone wants 2M, he needs to pay a huge markup because the supply chain is not ready for it.

And Amazon is definitely big enough to move the equilibrium price up.


I see no explanation for why suppliers wouldn't match demand. Aside from the HDD shortage, which hit everyone, I've seen no issues like you're describing, where the market essentially runs out of servers/CPUs/etc.

That's the only way this theory applies: if suppliers are completely unable to meet demand due to some specific shortage in the market.

The only reason your shares analogy makes sense is because there ARE a finite number of shares available at any given time, and buying too many drives up the cost in the entire market. Most manufacturers can scale up production as demand increases.


There are a lot of reasons why suppliers might not match demand.

Let's say you have a factory that runs at 90% utilization and somebody crawls out of the woodwork who wants to order 3 factory-months' worth of widgets, delivered next week.

Well first of all, you cannot meet that schedule, so you turn away the order in the instant case.

Now the question is: if we were to scale up production, what is the chance that some new person will crawl out of the woodwork with a similar instant order once the factories are ready? Because if we judge what has happened to be a one-off case, then we will refuse to meet the demand, whether it is real or not.

(Of course we're also making a lot of simplifying assumptions here like that you have access to capital, that there is no regulatory issue with scaling up production, that increased production does not open you to new lines of attack from your competitors, etc. Which are not good assumptions in general.)

It is our judgment of the demand, rather than the real demand, that controls production. If we are manufacturing, say, kevlar vests in 2001, we may very well interpret a large order as representing an underlying demand shift. On the other hand, if our widgets are luxury cars in 2008, we may interpret a similar set of facts as a one-off order.

The insight here is that real demand is not known at the time that supply is trying to meet it; it is estimated. The extent to which the market clears depends on how good the estimation is. With something like oil we understand demand fairly well, but in markets like consumer electronics the demand predictions are poor. That is why on the one hand Apple is chronically short of iPhones and simultaneously Amazon cannot give its phones away: all the estimates were off.

In short, the more your widget is impacted by technological or cultural shocks, the more likely it is that suppliers won't adjust to meet demand.


No, they can't scale supply that quickly. And time matters. Also, there's almost no situation where demand can't be met (it happens extremely rarely, with inelastic goods). What happens is you raise prices until demand falls to match the supply. You never run out of products. Similarly, oil supplies will last forever; we won't ever run out of fossil fuels. It simply will not be worth using them because of the price.

Really, microeconomics 101, all of us should have studied this in the first semester of any engineering degree :)


> Really, microeconomics 101, all of us should have studied this in the first semester of any engineering degree :)

That seems like quite a snide remark.

I have in fact studied "economics 101," and while you're using basic economic theory to form your opinions, you're mixing it in with data you've just made up for the sake of supporting your original point.

Essentially you have no supportable reason to assume that supply cannot meet demand OR that Amazon cannot space out their demand/pre-warn the supply chain. Amazon could, for all we know, give them a 12-month lead time.

To be honest this entire conversation reminds me of that scene in Good Will Hunting where the guy in the bar is mouthing off about "market economy in the early colonies" because he just finished studying it last semester. Reading your posts, it comes across like you're trying to shoehorn in as much econ 101 knowledge as you can. And rather than provide data or any meaningful explanation for why you believe the market would go a certain way, you just shove in more econ 101 theory and hope for the best.

This post in particular lacks any substance and is just trying to impress upon us how much econ 101 you know. But really I am more interested in why you believe the market wouldn't meet demand than in how many buzzwords and theory names you can reproduce from your textbook.


You make it sound as if Amazon buying servers was some kind of unexpected freak event.

They are just one of many large companies who buy hardware constantly.

Google, Microsoft, Facebook, Rackspace, Leaseweb, to name a few others...


Since most vendors source their parts from two or three different places, you'll often find that even though you ordered 2000 "identical" computers, they'll have, for example, two or three different makes of hard drives in them, and sometimes different BIOS versions and RAM configurations (2x8GB instead of 4x4GB, for example).

If you need 10,000 identical servers (i.e. exactly the same firmware versions, motherboards, hard drive models, etc.) then that is a bit of a pain, since they can't just grab the next 10,000 servers out of inventory and ship them to you. You have to place it as a separate special order.


> Amazon doesn't really get a break on the hardware cost, because 10,000 servers do not cost less per server than 100 servers (in fact they cost more as the volume goes up if you need them to be identical).

Two problems here:

First off, 10,000 servers almost certainly cost less per server than 100. Not least because you can buy direct from the OEM rather than through a reseller (who takes a profit), and also because the buyer has more leverage for negotiations (that's a lot of money, and they COULD go elsewhere).

Second problem: the servers don't need to be identical, and in fact Amazon's EC2 instances aren't identical (they just pretend to be). If you spin up several EC2 instances over a few weeks and then look at, e.g., the CPU info, you'll see that they vary quite a lot while being similar-ish (this has caused people issues when they're using on-demand instances and their software relies on specific CPU features, in particular when those features only exist on current-gen CPUs).
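
An easy way to see this for yourself, or to guard against it, is to check the CPU feature flags when an instance boots. A sketch for Linux (the flags in `required` are just example values):

    # Sketch: fail fast at startup if the instance landed on a CPU
    # generation lacking features the software assumes. The flags in
    # `required` are example values, not a recommendation.
    def cpu_flags():
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
        return set()

    required = {"aes", "sse4_2"}
    missing = required - cpu_flags()
    if missing:
        raise SystemExit("CPU lacks required features: %s" % ", ".join(sorted(missing)))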

PS - Also, 10,000 is not even in the ballpark of how many physical servers Amazon has (try 450,000).

> When it comes to labor cost: if you have enough hardware to keep at least one full-time datacenter tech busy, you're in the same boat as Amazon.

I highly doubt that. Amazon's scale allows them to develop better automation, detection, and procedures in general, which keeps the number of staff per server very low. For example, a single dedicated tech might be able to handle 10-30 servers, MAYBE, whereas at Amazon that might be just a single rack, and effectively each tech might be responsible for hundreds of physical machines (even if automation does the lion's share of the heavy lifting).

> So you're paying Amazon to do the same work you would do otherwise - only you're subject to their rules and procedures, and Amazon, being a profitable business, needs to mark its services up.

I will fully admit that a company like SoftLayer (per the article) can give Amazon's EC2 a run for its money. However, as someone who's seen the costs associated with running servers in house (in particular staffing costs), I struggle to buy that you can undercut Amazon by doing so (at least until you have a LOT of servers, and even then, frankly, it is less hassle to outsource it anyway).

There are legitimate arguments for why you'd want to do so, e.g. privacy, security, legal reasons, unique hardware/OS, etc. However, if you're just doing something generic like web host + database, then outsourcing it to a dedicated company is more cost-effective, in particular when you start looking at the hidden costs of internal hosting (like office space, heating/electricity, security, and so on).


This is not entirely true. Certain components' pricing does decrease as the volume goes up, but that likely happens at a different scale than SoftLayer's.

Largely, the way to use Amazon efficiently is to turn off nodes when they're not needed for traffic. That is the service you are paying for.
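
For what it's worth, that part is only a few lines with boto3 (the instance IDs and the peak-hours policy below are hypothetical placeholders):

    import boto3

    # Sketch: stop worker instances outside peak hours, start them again
    # when traffic returns. Stopped EC2 instances accrue no compute charges.
    ec2 = boto3.client("ec2", region_name="us-east-1")
    WORKERS = ["i-0123456789abcdef0", "i-0fedcba9876543210"]  # placeholders

    def scale_for_traffic(peak_hours):
        if peak_hours:
            ec2.start_instances(InstanceIds=WORKERS)
        else:
            ec2.stop_instances(InstanceIds=WORKERS)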


Yeah, and you pay McDonald's to cook the burger for you -- but you can't do it cheaper than they can.



