I have set up a streaming website with 500k visitors per day, and it runs on 7 servers for $2,000 a month, give or take.
That's very, very cheap compared to GAE or AWS.
At stage one, with <1000 visitors per day, I can get by on a $5/mo Digital Ocean node or $20 Linode. Just the fact that they're even numbers means I'm leaving something on the table, but just the cost per hour for talking about buying, building, and colocating my own machines is more expensive than this.
Stage two is a sweet spot where running your own 5-25 machines makes sense for stable, medium-sized, medium traffic businesses.
After that, though, stage 3 is when it gets to be too much for one combination programmer/devops employee to manage in their spare time, and it makes sense to go back to the cloud and have someone else manage all of that for you. A lot of startups focused on rapid growth like to skip stage 2 and go straight to this stage. When you outgrow the "do whatever's cheapest" stage of hosting, it can become untenable to explain to your investors that a few thousand in server hardware will probably be adequate for the next few years, and that they'll see a nice return of a fraction of an employee per month on hosting if it's done most efficiently.
Somewhere in the stratosphere, stage 4 is when it again becomes more reasonable to hire an internal team to assemble and maintain your computing resources. Perhaps you can instead convince your investors that this is your target? After all, Facebook, Amazon, Google, and Microsoft are on their own hardware...
By this I mean that in such a business the issues surrounding servers and even software should be very minor compared to the bread and butter of helping people sell stuff so money is made. The costs of said servers and software should also be minimal vs things like sales and marketing costs.
For example: A dedicated m4.16xlarge EC2 instance in AWS is $3987/month. You could build that same server for $15,000 through Dell, lease it at $400/month (OpEx), and colo it with a 1 Gbps connection for $150/month.
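The arithmetic behind that comparison can be sketched roughly as follows. The dollar figures are the ones quoted above; the function names and the break-even calculation for an outright purchase are illustrative additions, and the sketch ignores staff time, lease terms, and anything not covered by the colo fee.

```python
# Monthly figures quoted in the comment above (USD).
EC2_MONTHLY = 3987        # dedicated m4.16xlarge
LEASE_MONTHLY = 400       # Dell lease for a comparable server
COLO_MONTHLY = 150        # colocation including the network connection
PURCHASE_PRICE = 15_000   # buying the server outright instead of leasing


def leased_monthly() -> int:
    """Monthly cost of the leased server plus colocation."""
    return LEASE_MONTHLY + COLO_MONTHLY


def monthly_savings() -> int:
    """How much less the leased setup costs per month than EC2."""
    return EC2_MONTHLY - leased_monthly()


def purchase_break_even_months() -> float:
    """Months until buying outright (then paying only colo) beats EC2."""
    return PURCHASE_PRICE / (EC2_MONTHLY - COLO_MONTHLY)


if __name__ == "__main__":
    print(f"Leased + colo: ${leased_monthly()}/month")
    print(f"Savings vs EC2: ${monthly_savings()}/month")
    print(f"Purchase break-even: {purchase_break_even_months():.1f} months")
```

Under these assumptions the leased box comes to $550/month, and an outright purchase would pay for itself in roughly four months.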
Yes, the cost per unit of compute (memory, CPU, etc.) is higher in the cloud. But here's what's never mentioned when people complain about the cloud being expensive:
a) ISP and CoLo costs
b) Redundancy/Multiple AZs
d) Failure recovery/cutover
f) Monitoring/Health statistics
g) The other goodies that come along with being in the cloud (Lambda, CloudFront, Route53, SQS, etc.)
I feel like we're at a point where people should have to justify why they are running their own data centers vs. having to justify why they're running in the cloud. There are obviously many very good reasons to do either, or both, but if you don't already have a data center, you'd better have some damn good reasons for starting one up.
A) Colo is $50-150/U with blended top tier bandwidth, 99th% billing and usually includes remote hands time.
B) Even with 100% redundancy you're still ahead by $3000/month. Please understand Amazon offers absolutely no redundancy built in and nodes go down regularly. It is up to you as a developer to build redundancy around the tools they offer.
C) Amazon doesn't take backups for you. You have to pay for this either way.
D) Again, Amazon doesn't handle this for you. You have to pay either way. Buying an exact replica of the hardware I've mentioned and colocating it elsewhere still puts you ahead by $3000/month.
E) What support do you need? Dell offers same-day or even 4-hour parts replacement with the appropriate warranty service. Most colos offer remote hands for free up to a certain number of hours.
F) Lots of ways to handle this. You can use IPMI, built in OS tools, etc. There isn't much exclusive to AWS you can't easily replicate elsewhere.
G) I never said not to use Amazon for anything. In fact you should be building your applications for scalability INTO the cloud. This is part of the idea behind the whole microservices movement.
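As a rough check on the "$3000/month" figure in points B and D, here is a sketch using the numbers from the earlier comment ($3987/month for EC2, $400 lease plus $150 colo per physical server). The function name is made up for illustration, and it assumes a second site costs exactly the same as the first.

```python
EC2_MONTHLY = 3987          # single dedicated m4.16xlarge, as quoted upthread
SERVER_MONTHLY = 400 + 150  # lease + colo for one physical server


def savings_with_replicas(replicas: int) -> int:
    """Monthly savings vs. ONE EC2 instance when running `replicas`
    identical physical servers (assumes every site costs the same)."""
    return EC2_MONTHLY - replicas * SERVER_MONTHLY


if __name__ == "__main__":
    for n in (1, 2):
        print(f"{n} server(s): ${savings_with_replicas(n)}/month ahead")
```

With two servers the difference works out to $2,887/month, close to the quoted figure, and that is measured against a single EC2 instance with no redundant counterpart on the AWS side.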
> Except then you're stuck with a $15K server, and you had to spend all 15K up front
Also, tying yourself exclusively to AWS services is not a good long term strategy because then you have vendor lock-in that's worse than being stuck with some old hardware.
If you don't need the scaling abilities of cloud, then it just doesn't justify the added expense because no matter what you say, you still need people to manage systems.
What I mean is: if you make a list of "shit that's wrong with my crafted-goods-selling business", "how the servers are made" wouldn't be in the top 10 things to worry about. The fact that they're arguing that said servers should be sourced in a very cost-inefficient way makes it even more odd.
Consider, for example, this: http://www.zdnet.com/article/snapchat-spending-2-billion-ove...
which, other than here, nobody seemed to question as strange.
So you need that m4.16xlarge to handle the load, and honestly, there's no single instance available on Amazon that I've found that can come anywhere near the performance of physical hardware when it comes to handling database loads. Right now I am running ten r4.16xlarge instances to do the job of two Dell R710s, at considerably reduced performance even still.
I do run a database on my server (which isn't even a Large) for an e-commerce site. It fits handily in half a gig of ram. The hundreds of products we offer have associated images, but we just store the filenames in the database and they load from disk (or, more likely, Cloudflare cache). Honestly, it could be a static site but the database is a convenient way to edit the content and control the presentation.
You have 640 CPUs running at multiple gigahertz and trillions of bytes of RAM. What kind of workload requires that insane amount of compute power?
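For reference, those totals follow from the published r4.16xlarge specification (64 vCPUs and 488 GiB of memory per instance) multiplied by the ten instances mentioned upthread. A quick sketch of the arithmetic, with variable names chosen purely for illustration:

```python
INSTANCES = 10                 # "ten r4.16xlarge instances"
VCPUS_PER_INSTANCE = 64        # r4.16xlarge vCPU count
MEM_GIB_PER_INSTANCE = 488     # r4.16xlarge memory in GiB

total_vcpus = INSTANCES * VCPUS_PER_INSTANCE
total_mem_bytes = INSTANCES * MEM_GIB_PER_INSTANCE * 1024**3

if __name__ == "__main__":
    print(f"Total vCPUs: {total_vcpus}")
    print(f"Total memory: {total_mem_bytes / 1e12:.2f} trillion bytes")
```

Note that these are vCPUs (hyperthreads), not physical cores.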
Part of the problem is that the network disk that EC2 provides as EBS is 100x slower than local disk, so keeping all the indexes and data in memory is the only way to replicate physical hardware performance.
If anyone knows of a better EC2 setup for PostgreSQL, I'm all ears.
> If anyone knows of a better EC2 setup for PostgreSQL, I'm all ears.
Have a proper replication and archiving setup, and use instance storage. If you have configured streaming replication to 1-2 other servers, and archive your WAL to S3 (using wal-e or such), you're already above EBS's guarantees (99.9% durability IIRC?).
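A minimal sketch of what the primary's side of such a setup might look like, written as a small Python helper that renders `postgresql.conf` lines. The setting names are standard PostgreSQL parameters and `wal-push` is wal-e's archiving subcommand; the specific values, the `/etc/wal-e.d/env` credentials directory, and the helper itself are assumptions for illustration, not a tested production config.

```python
# Assumed primary-side settings: streaming replication plus WAL archiving.
PRIMARY_SETTINGS = {
    "wal_level": "replica",      # enough WAL detail for standbys (9.6+)
    "max_wal_senders": "3",      # room for 1-2 standbys plus a base backup
    "archive_mode": "on",
    # Assumes wal-e credentials live in an envdir at /etc/wal-e.d/env.
    "archive_command": "envdir /etc/wal-e.d/env wal-e wal-push %p",
}


def render_conf(settings: dict) -> str:
    """Render settings as postgresql.conf lines with single-quoted values."""
    lines = []
    for key, value in settings.items():
        escaped = value.replace("'", "''")  # postgresql.conf quote escaping
        lines.append(f"{key} = '{escaped}'")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_conf(PRIMARY_SETTINGS), end="")
```

Each standby would additionally need its own connection and restore configuration, which is omitted here.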
Still doesn't even remotely compete with what you can get with "normal" hardware.
If this is all you see when it comes to getting servers or working with cloud servers/services, then you don't know what you are talking about.
You don't build for one massive single server, you build for a bunch of small ones that spin up and down as needed. It's micro services or you are wasting your time and money.
This seems unlikely to be true unless your software engineers love doing ops (and the business doesn't need them working on software).
The salaries for an operations team that can replace failed hardware 24/7 (and cover vacations) are alone more than a company like Etsy should be spending on ops. Then there is the cost of the pipe, power, and redundancies. Even higher up, managing all of the depreciation in accounting is going to have costs compared to an expense line item.
Hardware failures are far less common than you believe. I lost one hard drive in 10 years, and because it was in a RAID array, the array didn't even fail.
> Hardware failures are far less common than you believe.
Your experience may vary; I once rented metal from Softlayer (running ~15 shards, about 60 boxes), and over a 3-year period we had a number of drives* fail, a couple of the rack controllers, some RAID controllers, and one time a power supply. On the worst RAID failure, we sent one of our employees across the country to manage the recovery directly.
*Some hard drive failures were related to the 2012 Seagate 3TB drive issue. One failed within a week of being replaced.
Softlayer had a team monitoring our servers and working through issues with us; other than supplier issues, I blame them for nothing.
In that worst-case scenario, we ran off our geo-redundant slave for the better part of a week.