Stack Exchange’s Architecture in Bullet Points (serverfault.com)
112 points by sagarun on Feb 11, 2011 | 31 comments

I'd be interested in finding out why they chose to go with Windows-based web servers. Administering 12 production nodes would not be fun and doing so over remote desktop seems like a pain to me.

<offtopic>I recently discovered puppet and I am just way too excited about what I can do with it.</offtopic>

This is at a VERY high level, but here you go:

In theory I think that different stacks (LAMP, Java, and Microsoft were the big contenders at the time) are similar enough in capabilities. There are a million pros and cons to each stack. But I strongly believe that how WELL you know your stack is MUCH more important than WHICH stack you use.

In other words, an experienced Java team will FAR outperform if they can use Java, and an experienced Windows team will FAR outperform if they can use Windows, and the skills of the team are much more significant than the variations between otherwise very very similar platforms. (See also http://www.joelonsoftware.com/items/2006/09/01.html).

So, when we started, Jeff was really good at Microsoft technology, so he was able to produce better code faster using Microsoft technology than if he had to learn Ruby or Python or whatever. And that was BY FAR the dominant decision point for us.

Also, the cost of Windows licenses is virtually insignificant. It's just a non-issue. Not just because of BizSpark (which we took advantage of), but because compiled C# code on Windows servers is so damn efficient you don't need very many servers. Our stack of ten web servers is SHOCKINGLY overprovisioned. They run at insanely low loads now.

It's not really C# that matters. Most web apps should be database bound, and the main database should be on a separate server, so language just shouldn't matter. SO is fast because Microsoft SQL Server is fast. It's fast right out of the box, unlike every other big SQL server, which needs tuning. Other servers may be a little bit faster if you tune them correctly, if you can't find a better use for your time. Also, SX splits itself into several separate sites (SO, SF, etc.), so scaling the database gets even easier - it's just a different database!
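To make the "it's just a different database" point concrete, here is a minimal sketch in which each site maps to its own database, so a new site adds a new database rather than more load on an existing one. The host names, database names, and the `database_for` helper are hypothetical; nothing in this thread describes how Stack Exchange actually routes requests.

```python
# Hypothetical site-to-database mapping; all names are illustrative only.
SITE_DATABASES = {
    "stackoverflow.com": "StackOverflow",
    "serverfault.com": "ServerFault",
    "superuser.com": "SuperUser",
}


def database_for(host: str) -> str:
    """Return the database name assumed to back the given request host."""
    # Normalise: lower-case, drop any port, drop a leading "www."
    host = host.lower().split(":")[0]
    if host.startswith("www."):
        host = host[len("www."):]
    try:
        return SITE_DATABASES[host]
    except KeyError:
        raise LookupError(f"no database configured for site {host!r}") from None


if __name__ == "__main__":
    for h in ("stackoverflow.com", "www.ServerFault.com:80"):
        print(h, "->", database_for(h))
```

Under this assumption, adding a site is one more entry in the mapping, and each database can live on whichever server has spare capacity.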

Everyone should keep this in mind when they worry about server architecture - SO runs off 12 servers. As long as an app doesn't do anything crazy, it's going to scale very easily throughout the first few years. 1 server should be enough for almost anyone. 1 database, and a few frontend web app servers will take you to millions of views per day.
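As a rough back-of-the-envelope check on "millions of views per day", the sketch below converts daily page views into average requests per second. The peak factor of 3 is an assumption for illustration, not a figure from this thread, and one page view is treated as one request to the app server.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rps(views_per_day: float) -> float:
    """Average requests per second if traffic were spread evenly over a day."""
    return views_per_day / SECONDS_PER_DAY


def peak_rps(views_per_day: float, peak_factor: float = 3.0) -> float:
    """Estimated peak rate; peak_factor is an assumed multiplier, not measured."""
    return average_rps(views_per_day) * peak_factor


if __name__ == "__main__":
    for views in (1_000_000, 5_000_000, 10_000_000):
        print(
            f"{views:>12,} views/day ~ {average_rps(views):6.1f} avg req/s, "
            f"{peak_rps(views):6.1f} req/s at an assumed 3x peak"
        )
```

Even ten million views a day averages out to a little over a hundred requests per second, which is why a handful of web servers in front of one database can go a long way.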

Unlike user software, most web apps get faster every year, and the stacks tend to get faster (due to better interpreters) not slower (due to silly 3D effects on the desktop).

pg loves to say that people who pay for server software are crazy. That might still be true for Google these days, but not for most people. You don't need many servers, and Windows is no longer slow.

Both Joel Spolsky and Jeff Atwood have written several articles or blog posts explaining this. Obviously, they've got the licenses cheap; they're using the technology (WISC) they know best; and WS2008 is far, far better than any previous MS server OS, due precisely to huge pressure from the Linux challenger :)

Note: I'm an FSF member and free software fanatic, but I won't pretend that something that obviously works well, doesn't.

StackOverflow is done in ASP.NET MVC. They were one of the earliest adopters of ASP.NET MVC. It would make sense to go with Windows servers as Mono w/ Linux may not be up-to-date with the latest C#/ASP.NET code. BTW, as they are part of the BizSpark program, they would have got the server licences for cheap.

And I am genuinely curious as to why you think 12 Windows servers are a pain compared to 12 Linux boxes?

He seems to think you have to log on via remote desktop to administer.

Out of honest curiosity, how do you administer them?

The same way as you administer any cluster.

Using SSH? That's what I've used to administer every server I need to for many years.

The last time I had to work on Windows servers, SSH wasn't really an option, so we used remote desktop. This was on win2k servers though, so things may have changed.

I think the question deserves a better reply than the glib "The same way as you administer any cluster." Is the answer really SSH? Or something else?

The introduction on the Powershell Wikipedia article is a good overview [1]. Powershell can also be used remotely, and various Microsoft products provide cmdlets (basically utilities) to use from the shell.

There are a few ways of dealing with IIS7:

* Remotely using IIS Manager [2]
* .NET-style configuration files - think httpd.conf
* Powershell cmdlets
* AppCmd.exe

SQL Server has Management Studio in various flavors, SqlCmd.exe or Powershell cmdlets.

The only time I log in to servers is to run installers or when laziness takes hold. These various tools work well enough that Microsoft offers Windows Server Core which only provides CLI access (and Powershell in the most recent version).

[1] http://en.wikipedia.org/wiki/PowerShell
[2] http://www.iis.net/download/IISManager
[3] http://learn.iis.net/page.aspx/334/install-and-configure-iis...

Using scripts and remote execution. "SSH" is merely a transport, it is irrelevant to the general principle.
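To illustrate "the transport is irrelevant", here is a minimal sketch where the fan-out logic knows nothing about how a command reaches a host. The `Transport` type, `run_everywhere`, and `fake_transport` are all hypothetical names made up for this example; a real transport might wrap SSH or PowerShell remoting, but that part is deliberately left out.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# A transport is any callable taking (host, command) and returning the output.
# Hypothetical interface: SSH, PowerShell remoting, etc. would each be one.
Transport = Callable[[str, str], str]


def run_everywhere(
    hosts: List[str], command: str, transport: Transport, max_workers: int = 8
) -> Dict[str, str]:
    """Run one command on every host in parallel and collect output per host."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {host: pool.submit(transport, host, command) for host in hosts}
        return {host: future.result() for host, future in futures.items()}


def fake_transport(host: str, command: str) -> str:
    """Stand-in transport for illustration; it only echoes what it was asked."""
    return f"{host}: ran {command!r}"


if __name__ == "__main__":
    results = run_everywhere(["web01", "web02", "web03"], "uptime", fake_transport)
    for host, output in results.items():
        print(output)
```

Swapping `fake_transport` for something that actually talks to a machine changes nothing in `run_everywhere`, which is the general principle being described.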

The idea that Windows can't be scripted hasn't been true since Perl 5's COM module in the 1990s... Haters always hatin'.

They could do almost everything (administration-wise) via Powershell if they chose to; they're not necessarily stuck using Remote Desktop.

@Igor: For management I do feel it is a little bit more tricky since everything isn't just a text file. However, we use some Powershell, Group Policies, and WSUS as management tools (I talk about this in this post: http://blog.serverfault.com/post/1097492931/). Windows also has other management solutions I haven't gotten into yet. So although I am a believer in what Raymond refers to as Textuality in the Art of Unix Programming, it doesn't mean other options don't exist.

From what I recall, it was what Jeff Atwood knew best when he started it.

It is a good example of all those stories of not being able to change technology after you become popular. It seems that they are accepting Linux where it fits best at least.

Doesn't Spolsky's company live in the MS world, as well? (More significant than Atwood's familiarity, I'd guess.)

Sort of. http://www.codinghorror.com/blog/2006/09/has-joel-spolsky-ju...

This suggests they're Windows native, but using custom tech to provide non-Windows compatibility.

It seems odd, too, that they would use both CentOS and Ubuntu. Why not run Redis on Ubuntu?

Someone asked the question in the comments:


What stack is the most widely used by outsource companies? I'm assuming WAMP.

Interesting that they've moved to Lucene.net for search - I remember hearing them say that full text search is one of the reasons they were so happy with SQL Server.

I remember this too. Jeff Atwood made the comment in the stackoverflow blog back in 2008:

"We rely heavily on full-text search on stackoverflow.com, which worked amazingly well for us under SQL Server 2005. Looks like that’s no longer the case for SQL Server 2008, unfortunately."

Here's the entry: http://blog.stackoverflow.com/2008/11/sql-2008-full-text-sea...

ugh, no, I can't imagine ever saying that. Full Text Search in SQL Server is very badly integrated, buggy, deeply incompetent, and I hate it. Hate it hate it hate it hate it hate it.

I'm not sure where I got that impression then. Thanks for the correction!

What about Microsoft FAST search? Why was it not used?

What would you recommend for unstructured searches on the .NET platform?

I always search SO by Google -> some key words site:stackoverflow.com

Here's an earlier explanation about their caching set-up using Redis:


I'd be interested to know what platform they're running Redis on. Is it Windows or Linux?

Edit: Oh, the article link mentions it's running on CentOS. I was looking for that in the parent's link.

Congrats to them. 12 page servers and only 2 dbs is super lean for ~100m pageviews. They seem a bit bloated on eng staff though.

Off topic, but more proof of how inaccurate Alexa is. I have a service that gets 1/10th of their p/v and is almost 3k points higher.

To be fair, Alexa doesn't purport to rank by page views.

I like posts like these; they provide a good blueprint for other .NET sites to use as a guide if they end up exploding in popularity like Stack Overflow.
