
Then you remember that for years, Stack Overflow ran off a couple of well-administered servers. YAGNI. KISS. People forget the basics because "infrastructure astronautics" is fun, and it probably helps make a beautiful resume, too.



While infrastructure astronauts are typically wasting money and padding their resumes, I don't think SO is the best counterexample.

SO's read-to-write ratio is enormous. While they talk about a handful of bare-metal servers, the vast majority of their hits are handled by their caching layer, which is separate from those bare-metal machines.

Their write loads aren't particularly time-sensitive either: if there's a delay between a successful post and cache invalidation, it's not a big deal for the majority of their traffic.
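A minimal sketch of that pattern (the class and names here are illustrative, not SO's actual code): a read-through cache with a TTL, where a write simply invalidates the entry and readers may briefly see a stale value.

```python
import time

class ReadThroughCache:
    """Read-through cache with a TTL. Illustrates why a short delay
    between a write and invalidation is tolerable for read-heavy traffic."""

    def __init__(self, load_fn, ttl_seconds=60):
        self.load_fn = load_fn      # fetches from the database on a miss
        self.ttl = ttl_seconds
        self.store = {}             # key -> (value, expires_at)

    def get(self, key):
        entry = self.store.get(key)
        if entry and entry[1] > time.time():
            return entry[0]         # cache hit: no database round trip
        value = self.load_fn(key)   # cache miss: one database round trip
        self.store[key] = (value, time.time() + self.ttl)
        return value

    def invalidate(self, key):
        # Called after a successful write. Until this runs, readers
        # get the cached (slightly stale) value, which is fine here.
        self.store.pop(key, None)
```

Until `invalidate` runs (or the TTL expires), reads keep being served from memory, which is exactly why a huge read-to-write ratio lets a small database tier survive.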

Not every model is quite so forgiving. But even then there's a lot of stupid wasteful "infrastructure" made by people who think they're going to be Google, or at least tell their investors they'll be Google.


Well, it's an interactive database backed web site. That describes like maybe 90% of the things running on AWS.

StackOverflow is useful because it reminds people that machines are fast, and you probably don't need that many of them to scale to large sizes. StackOverflow is used by the entire global population of developers, more or less, and it runs off one large MS SQL Server plus some web servers. No autoscaling (not enough hardware expense to be worth it), no need for fancy cloud LBs, etc.

Sure, maybe your service is gonna scale to more than the world's population of developers. Great. But ... a lot of services won't even go that far. For them it's hard to conclude the fancy infrastructure is really needed.

To pull off a StackOverflow you do need skilled sysadmins though. As the article points out, a surprisingly large number of people who call themselves devops don't really know UNIX sysadmin anymore.

https://stackexchange.com/performance


Autoscaling is overused anyway. It's cheaper and much less complex to have 200-300% of average capacity running 24/7 on dedicated servers (OVH, Hetzner, whatever) than to try to scale up and down with demand.

Changing capacity automatically always has the potential to backfire. And since AWS & Co. have to keep those servers running during times of lower demand anyway, there's no way it's cheaper unless you have really unusual traffic patterns (and even then it probably isn't).
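Back-of-envelope arithmetic for that claim. All prices and capacity figures below are made-up assumptions for illustration, not real OVH/Hetzner/AWS quotes:

```python
# Flat overprovisioned dedicated servers vs. autoscaled cloud capacity.
DEDICATED_PRICE = 50   # $/server/month, dedicated rental (assumed)
CLOUD_PRICE = 150      # $/server/month equivalent, on-demand cloud (assumed)

avg_servers_needed = 10

# Dedicated: run 3x average capacity (300%) around the clock.
dedicated_cost = 3 * avg_servers_needed * DEDICATED_PRICE

# Cloud autoscaling: assume it tracks demand well, averaging
# only 1.3x average capacity over the month.
cloud_cost = 1.3 * avg_servers_needed * CLOUD_PRICE

print(int(dedicated_cost))  # 1500
print(int(cloud_cost))      # 1950
```

Even granting autoscaling a generous efficiency, the per-unit price gap between dedicated and on-demand capacity can swallow the savings; the crossover only happens with very spiky traffic.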


You're not wrong, but it's worth noting SO as a case study for site design has some important caveats. Beefy hardware is great, but aggressive caching and setting sane expectations around write latency can be a massive scaling win.


I would guess SO had a relatively small number of employees and their servers were running a pretty straightforward CRUD app and a database. Comparing that to the heterogeneous workloads that large organizations deal with is a bit silly. No doubt there's still a lot of fat to trim in those large organizations, but trimming that fat is rarely their largest opportunity (much to the chagrin of those of us who like simple, elegant systems).


my guy, you literally made me laugh. how is this https://stackexchange.com/performance different from any of the so-called workloads large organizations are running? you've got your app servers, db servers, load balancers and failover servers. pretty standard setup. yet SO is running on bare metal. resume-driven development and everyone thinking they're google | fb has both killed and made money in our industry


> how's this https://stackexchange.com/performance different from any so called workloads

StackExchange is largely a CRUD app. High volume of tiny requests that hit the app layer, then the database, and back. Other organizations have lower volumes of compute-intensive requests, async tasks, etc.

With respect to the size of an organization, the cost of coordinating deployments and maintenance over a handful of servers grows with the size of the organization. It frequently behooves larger organizations to allow dev teams to operate their own services.

None of this is to say there isn't waste or poor decision-making throughout our industry; only that waste isn't the sole factor, and SO's isn't the ideal architecture for every application.

> my guy

I'm not your guy.


It is a world top-50 site by Alexa rank.

Comparatively speaking, even against the same kind of CRUD app and DB, most sites are using 5 to 10x more servers to handle 1/2 to 1/5 of the traffic.


It is a large site by traffic measure, but I would guess the traffic is heavily read-only. Managing workloads with more data mutation introduces different complexities, which means you can't just cache everything and accept whatever staleness the TTL and cache invalidation impose on writes.

edit: To be clear, I'm not saying SO isn't an achievement, but it's one type of use case that yields a really simple tech stack.


Their stats are here:

https://stackexchange.com/performance

Their DB handles a peak of 11,000 qps at only 15% CPU usage. That's after caching. There are also some ElasticSearch servers. Sure, their traffic is heavily read-only, but it's also a site that exists purely for user-generated content. They could probably handle far higher write loads than they do, and they handle a lot of traffic as-is.
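Taking those two numbers from the performance page at face value, and assuming load scales roughly linearly with CPU (a simplification that ignores lock contention and I/O), the headroom works out like this:

```python
# Rough headroom estimate from the published figures:
# 11,000 queries/sec at ~15% of DB CPU.
peak_qps = 11_000
peak_cpu_fraction = 0.15

# Linear-scaling assumption: qps the box could serve at 100% CPU.
estimated_capacity_qps = peak_qps / peak_cpu_fraction
print(int(estimated_capacity_qps))  # 73333
```

So even by this crude estimate the single DB server is running at a small fraction of its ceiling, which is the point of citing it against premature scale-out.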

What specific complexities would be introduced by an even higher write load that AWS specifically would help them address?


> Comparatively speaking, even against the same kind of CRUD app and DB, most sites are using 5 to 10x more servers to handle 1/2 to 1/5 of the traffic.

No doubt, but how large are those other CRUD app organizations? Do they have a staff that is 20x the size all trying to coordinate deployments on the same handful of servers? What are their opportunity costs? Is trimming down server expenses really their most valuable opportunity? No doubt that SO has a great Ops capability, but it's not the only variable at play.


And you can probably cache all the data you need for the next 24h in a few terabytes of RAM, _if_ you knew what that data was.


I was thinking more of the ordinary startup than of large organizations.


SO also ran on .net, which is a far cry from the typical startup server load running on Python, Ruby, or PHP.


What do you mean?


I'd guess that, without a lot of optimization, .NET will be much more performant. The lower the per-server performance, the more servers you need, and that makes your fleet harder to manage.
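The fleet-size arithmetic behind that, using assumed throughput numbers purely for illustration (the per-server figures below are not benchmarks):

```python
import math

# Assumed requests/sec a single app server can sustain per runtime.
rps_per_server = {".NET": 4_000, "interpreted runtime": 500}

target_rps = 20_000  # traffic to serve (assumption)

# Servers needed = ceiling of traffic / per-server throughput.
servers_needed = {
    runtime: math.ceil(target_rps / rps)
    for runtime, rps in rps_per_server.items()
}
print(servers_needed)  # {'.NET': 5, 'interpreted runtime': 40}
```

The absolute numbers are invented, but the shape of the relationship holds: an order-of-magnitude throughput gap becomes an order-of-magnitude fleet-size gap, with all the coordination overhead that implies.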


Yes.


Another issue is that people don't understand how powerful modern hardware is. There are modern systems that process fewer transactions per second than ones from the 1970s.

Just look at modern SPAs. They are slower than ones from 10 years ago, even though JavaScript VMs are much faster, on top of all the hardware gains. Why does it take Twitter and Gmail a few seconds to load?


I don't doubt certain mainframe apps of the (late) 1970s could beat the TPS of a generically frameworked app in certain situations, but do you have any real numbers/situations/case studies to back that up?


Can't remember the name of the system that did 4k, but I found a paper about Bank of America processing 1k per second.


When you look at something like how Stack Exchange moved physical servers to a new datacenter, you can see where AWS benefits you.

Not all startups have the server and networking know-how to pull that kind of stuff off, or even set it up in the first place.

https://blog.serverfault.com/2015/03/05/how-we-upgrade-a-liv...


It's not like AWS services themselves require zero knowledge to use (especially if you don't want to be billed massive amounts).


That's the nice thing about Hetzner, OVH and Co. You don't need colocation but can instead rent servers monthly. That way you never have to bother with physical hardware; the knowledge needed is purely on the software side.

I also think colocation or owning datacenters is a poor fit for most. But dedicated servers are underrated. They can be offered much more cheaply, as it's simply rented standard hardware, and you don't need any of the expertise or scale you'd need for colocation.


Fedora just did this for all of their infrastructure servers this summer:

https://hackmd.io/@fedorainfra2020/Sybp76XvL



