Hacker News

The cloud is very, very expensive. If you want low cost, you run your own servers for anything under 20 nodes. Which, given the power of modern machines and unmetered bandwidth, will take you to millions of users easily.

I have set up a streaming website with 500k visitors/day, and it runs on 7 servers for $2,000 a month, give or take.

That's very, very cheap compared to GAE or AWS.

I think there are a couple of sweet spots:

At stage one, with <1000 visitors per day, I can get by on a $5/mo Digital Ocean node or a $20 Linode. The fact that those are round numbers means I'm leaving something on the table, but even the cost of the hours spent discussing buying, building, and colocating my own machines would exceed this.

Stage two is a sweet spot where running your own 5-25 machines makes sense for stable, medium-sized, medium-traffic businesses.

After that, though, stage 3 is when it all gets to be too much for one combination programmer/devops employee to manage in their spare time, and it makes sense to go back to the cloud and have someone else manage it for you. A lot of startups focused on rapid growth like to skip stage 2 and go straight here. Once you outgrow the "do whatever's cheapest" stage of hosting, it can become untenable to explain to your investors that a few thousand dollars in server hardware will probably be adequate for the next few years, even though they'd see a nice return of a fraction of an employee per month on hosting if it were done most efficiently.

Somewhere in the stratosphere, stage 4 is when it again becomes more reasonable to hire an internal team to assemble and maintain your computing resources. Perhaps you can instead convince your investors that this is your target? After all, Facebook, Amazon, Google, and Microsoft are on their own hardware...

There has to be something very wrong when an article about the problems at a company that facilitates an online marketplace is talking about the details of how their servers are run and paid for.

By this I mean that in such a business the issues surrounding servers and even software should be very minor compared to the bread and butter of helping people sell stuff and make money. The costs of said servers and software should also be minimal vs. things like sales and marketing costs.

Except traditional "cloud" hosting is wicked expensive in comparison. Ideally you should shoot for a hybrid model where excess capacity spills into AWS/GCE/Azure.

For example: a dedicated m4.16xlarge EC2 instance in AWS is $3,987/month. You could build that same server for $15,000 through Dell, lease it at $400/month (OpEx), and colo it with a 1 Gbps connection for $150/month.

Except then you're stuck with a $15K server, and you had to spend all $15K up front. What happens when a component prematurely fails? Better buy two. What about the person-hours spent dealing with ISPs, the colo provider, etc.? You want network redundancy? Big ISP bills.

Yes, cost per unit of compute (memory, CPU, etc.) is higher in the cloud. But here's what's never mentioned when people complain about the cloud being expensive:

a) ISP and CoLo costs

b) Redundancy/Multiple AZs

c) Backups

d) Failure recovery/cutover

e) Support

f) Monitoring/Health statistics

g) The other goodies that come along with being in the cloud (Lambda, CloudFront, Route53, SQS, etc.)

I feel like we're at a point where people should have to justify running their own data centers rather than justify running in the cloud. There are obviously many very good reasons to do either, or both, but if you don't already have a data center, you'd better have some damn good reasons for starting one up.

That $15,000 server is leased across 36 months, becoming an OpEx expenditure of about $417/month. On the accounting side this is no different from AWS. So let's say you buy two for failover.

A) Colo is $50-150/U per month with blended top-tier bandwidth and 99th-percentile billing, and usually includes remote-hands time.

B) Even with 100% redundancy you're still ahead by $3,000/month. Please understand Amazon offers absolutely no redundancy built in, and nodes go down regularly. It is up to you as a developer to build redundancy on top of the tools they offer.

C) Amazon doesn't take backups for you. You have to pay for this either way.

D) Again, Amazon doesn't handle this for you; you have to pay either way. Buying an exact replica of the hardware I've mentioned and colocating it elsewhere still puts you ahead by $3,000/month.

E) What support do you need? Dell offers same-day or even 4-hour parts replacement with the appropriate warranty service. Most colos offer remote hands for free up to a certain number of hours.

F) There are lots of ways to handle this. You can use IPMI, built-in OS tools, etc. There isn't much exclusive to AWS that you can't easily replicate elsewhere.

G) I never said not to use Amazon for anything. In fact, you should be building your applications for scalability INTO the cloud. This is part of the idea behind the whole microservices movement.
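The $3,000/month arithmetic in points B and D above works out roughly like this (every figure is a commenter's estimate from this thread, not authoritative AWS or colo pricing):

```python
# Monthly cost comparison using the figures quoted in this thread; all
# numbers are the commenters' estimates, not authoritative pricing.
aws_monthly = 3987                  # dedicated m4.16xlarge, $/month
server_price = 15_000               # Dell build price, $
lease_months = 36
lease_monthly = server_price / lease_months  # ~$417/mo; the thread rounds to $400
colo_monthly = 150                  # colo + bandwidth per server, $/month

# Two fully redundant leased servers, each in its own colo:
self_hosted = 2 * (lease_monthly + colo_monthly)
savings = aws_monthly - self_hosted
print(f"self-hosted: ${self_hosted:,.0f}/mo, savings: ${savings:,.0f}/mo")
```

At roughly $2,850/month saved, the "ahead by $3,000/month" claim is in the right ballpark, and that's before any reserved-instance discount on the AWS side.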

> Except then you're stuck with a $15K server, and you had to spend all 15K up front

I think you missed the part where the parent said "lease it at $400/month (OpEx)".

Cloud doesn't automatically come with that stuff - you need to set up your own failovers, backups and monitoring. Also you still need to administer those servers.

Also, tying yourself exclusively to AWS services is not a good long term strategy because then you have vendor lock-in that's worse than being stuck with some old hardware.

If you don't need the scaling abilities of cloud, then it just doesn't justify the added expense because no matter what you say, you still need people to manage systems.

Yes, you do need to know your shit. But that's true for all kinds of products and services any business buys. Someone has to understand the deep weeds of health insurance at any company (in the USA).

You left out paying multiple people >$10k/month who can replace a hardware failure immediately (and somehow with zero downtime), the redundant hardware, etc.

Oh, preachin' to the choir with that: my company's SaaS service is deployed across 10 or so tin boxes that I personally screwed together from Supermicro parts, humming in a faceless DC in Santa Clara with a backhoe sat in the far corner of the parking lot.

What I mean is: if you make a list of "shit that's wrong with my crafted-goods-selling business", "how the servers are made" wouldn't be in the top 10 things to worry about. The fact that they're arguing that said servers should be sourced in a very cost-inefficient way makes it even odder.

Consider, for example, this: http://www.zdnet.com/article/snapchat-spending-2-billion-ove... which, other than here, nobody seemed to question as strange.

On the other hand, a reserved m4.large instance is $45/month. After a year of that, you're just $550 into hosting, barely enough to select and order the case and power supply for your server. It would be 30 years before you'd paid off the cost of that Dell, so unless you actually need that power and your time is free, it's easier to let this be a tiny fraction of Amazon's expenses and workload than a big part of yours.
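The "30 years" figure roughly checks out (using this comment's numbers; actual reserved pricing varies by term and region):

```python
# Payback arithmetic behind the "30 years" claim; both figures are the
# commenters' numbers, not current AWS or Dell pricing.
reserved_monthly = 45      # reserved m4.large, $/month
dell_price = 15_000        # up-front cost of the Dell build, $
payback_months = dell_price / reserved_monthly
print(f"{payback_months:.0f} months = {payback_months / 12:.1f} years")
```

Closer to 28 years than 30, but the same order of magnitude, and of course it ignores that the two machines are wildly different sizes, which is the next commenter's point.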

The $15,000 Dell being discussed is equivalent to an m4.16xlarge with 64 threads and 256 gigs of RAM. The m4.large you're referencing at $45/mo only has 2 CPUs and 8 gigs of RAM.

Not all instances are the same. You can't run a database on an m4.large, not any database you plan on running queries on.

So you need that m4.16xlarge to handle the load, and honestly, there's no single instance available on Amazon that I've found that can come anywhere near the performance of physical hardware when it comes to handling database loads. Right now I am running ten r4.16xlarge instances to do the job of two Dell R710s, at considerably reduced performance even still.

I guess our use cases are just different. Mind sharing some examples of your use case that demands that much compute power?

I do run a database on my server (which isn't even a Large) for an e-commerce site. It fits handily in half a gig of ram. The hundreds of products we offer have associated images, but we just store the filenames in the database and they load from disk (or, more likely, Cloudflare cache). Honestly, it could be a static site but the database is a convenient way to edit the content and control the presentation.

You have 640 CPUs running at multiple gigahertz and trillions of bytes of RAM. What kind of workload requires that insane amount of compute power?

We have billions of rows of data and very complicated CTEs doing joins on a dozen tables matching based on GEOS radius data, and it's the primary query for our application. We're running PostgreSQL 9.5 and need to have all the data in memory for the fastest results.

Part of the problem is that the network disk that EC2 provides as EBS is 100x slower than local disk, so keeping all the indexes and data in memory is the only way to replicate physical hardware performance.

If anyone knows of a better EC2 setup for PostgreSQL, I'm all ears.
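Keeping the working set in memory on boxes that size mostly comes down to a handful of postgresql.conf settings. A minimal sketch for a 488 GB r4.16xlarge; all values are illustrative assumptions, not the commenter's actual config:

```ini
# postgresql.conf sketch for a 488 GB r4.16xlarge (illustrative values only)
shared_buffers = 120GB          # ~25% of RAM is a common starting point
effective_cache_size = 360GB    # tells the planner the OS page cache is large
work_mem = 256MB                # per sort/hash operation; big CTE joins need headroom
maintenance_work_mem = 4GB      # for index builds and VACUUM
random_page_cost = 1.1          # SSD/EBS-backed storage, not spinning disk
```

With settings in this range, repeated runs of the same query should hit shared buffers and the OS cache rather than EBS, which is the effect the commenter is after.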

> Part of the problem is that the network disk that EC2 provides as EBS is 100x slower than local disk, so keeping all the indexes and data in memory is the only way to replicate physical hardware performance.

> If anyone knows of a better EC2 setup for PostgreSQL, I'm all ears.

Have a proper replication & archiving setup, and use instance storage. If you have configured streaming replication to 1-2 other servers, and archive your WAL to s3 (using wal-e or such), you're already above EBS's guarantees (99.9% durability IIRC?).

Instance storage is much more limited and is heavily tied to the specific instance types you pick.

Indeed. I didn't want to say it's a panacea, just that you can sometimes get a lot better performance for your money. E.g., the larger i3s have both decent I/O performance and large storage sizes.

Still doesn't even remotely compete with what you can get with "normal" hardware.

This is why I claim a hybrid model is the way to go. Your baseline/mean load should be on redundant dedicated hardware with spill-over into a cloud provider.

> For example: a dedicated m4.16xlarge EC2 instance in AWS is $3,987/month. You could build that same server for $15,000 through Dell, lease it at $400/month (OpEx), and colo it with a 1 Gbps connection for $150/month.

If this is all you see when it comes to getting servers or working with cloud servers/services, then you don't know what you are talking about.

You don't build for one massive single server; you build for a bunch of small ones that spin up and down as needed. It's microservices or you're wasting your time and money.

> If you want low cost, you do your own servers for anything under 20 nodes.

This seems unlikely to be true unless your software engineers love doing ops (and the business doesn't need them working on software).

The salaries for an operations team that can replace failed hardware 24/7 (vacations included) are alone more than a company like Etsy should be spending on ops. Then there is the cost of the pipe, power, and redundancies. Even higher up, managing all of the depreciation in accounting is going to cost more than a simple expense line item.

Here is the dirty little secret: most businesses run fine without high availability. Your customers will not leave if your service is down for a few hours during the year, if you are small.

On top of that, Amazon's single region EC2 SLA kicks in at 99.95%. So unless you are using multiple regions in AWS, going by their SLA you're already in the range of a few hours of downtime per year.
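For reference, the downtime that SLA permits (simple arithmetic, ignoring how AWS actually measures monthly uptime):

```python
# Yearly downtime implied by a 99.95% uptime SLA.
sla = 0.9995                       # single-region EC2 SLA from the comment above
hours_per_year = 365 * 24
allowed_downtime = (1 - sla) * hours_per_year
print(f"{allowed_downtime:.1f} hours/year")
```

So a single region within SLA can still be down for over four hours a year before any credits kick in.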

Don't believe the lie that you don't need ops people in the cloud. You're still running servers. So since you have the staff anyway, why not save 3-4x the cost?

Hardware failures are far less common than you'd believe. I lost one hard drive in 10 years, and because it was in a RAID array, the service didn't even go down.

You still need software ops, yes, but not hardware ops.

> Hardware failures are far less common than you believe.

Your experience may vary; I once rented metal from SoftLayer (running ~15 shards, about 60 boxes). We had a number of drives* fail, a couple of the rack controllers, some RAID controllers, and one power supply over a 3-year period. After the worst RAID failure, we sent one of our employees across the country to manage the recovery directly.

*Some hard drive failures were related to the 2012 Seagate 3TB drive issue. One failed within a week of being replaced.

SoftLayer had a team monitoring our servers and working through issues with us; other than supplier issues, I blame them for nothing.

In that worst scenario, we ran off our geo-redundant slave for the better part of a week.

I think you overstate the overhead of hardware operations. I had a team of five that managed over 1,000 physical servers in four global locations, and they had to visit a datacentre maybe twice in three years, other than for hardware installs, which were always easily planned out.

I'd love to pick your brain about that--what's a good way of getting a hold of you?

Try desmoulinmichel on gmail. But honestly I don't do anything special. varnish + redis + nginx + celery + postgres can take you a long way.

