
The architecture of Stack Overflow [video] - schmylan
http://www.dev-metal.com/architecture-stackoverflow/
======
merak136
Some points that I find interesting:

[1] StackOverflow has VERY FEW tests. He says that StackOverflow doesn't use
many unit tests because of their active community and heavy usage of static
code.

[2] Most StackOverflow employees work remotely. This is very different than a
lot of companies that are now trying to force employees back into an office.

[3] Heavy usage of Static classes and methods. His main argument is that this
gives them better performance than a more standard OO approach.

[4] Caching even simple pages in order to avoid performance issues caused by
garbage collection.

[5] They don't worry about making a "Square Wheel". If their developers can
write something more lightweight than an already developed alternative, they
do! This is very different from the normal mindset of "don't reinvent the
wheel".

[6] Always using multiple monitors. I love this. I feel like my productivity
is nearly halved when I am working on one tiny screen.

Overall, I was surprised at how few of the "norms" they follow. Either
way, seems like it could be a pretty cool place to work.

~~~
thedufer
> StackOverflow employees work from home.

Many do, but they have a fairly large office in NYC and a smaller one in
London.

~~~
kmontrose
The Stack Overflow Q&A dev team has 2 people in New York, out of a team of 10.
The Careers dev team is more New York heavy, 3 remote and 5 in New York.
The sysadmin team is also quite remote, though I don't know the breakdown
offhand.

I believe at this point most new technical hires are remote.

Our offices are mostly sales, Denver and London exclusively so.

~~~
thedufer
I saw that Jason went remote recently. Any particular reason so many devs are
going remote? Is it people making individual decisions or the company
providing new incentives to do so? My impression when you were at 55 was that
most devs worked at the office (I've been at Fog Creek since a little before
you guys moved. Hi!).

~~~
jaydles
People making individual decisions. All else equal, we'd _slightly_ prefer to
have people in NYC, because we think the in-person time is a plus for the
casual interaction that happens in between "getting things done". But we've
set ourselves up to make real work and official team collaboration work
almost entirely online. We've learned that the in-person benefit is more than
outweighed by how much you get from being able to hire the best talent that
loves the product anywhere, not just the ones willing to live in the city you
happen to be in.

------
esw
Here are the slides for anyone who's interested:
[https://speakerdeck.com/sklivvz/the-architecture-of-stackove...](https://speakerdeck.com/sklivvz/the-architecture-of-stackoverflow-developer-conference-2013)

------
carsongross
The most important thing, technically, is having great developers who ship.

For pith's sake, I want to say "Everything else is noise" but that isn't true.
Everything else can help or hurt, depending on the application and how
doctrinaire the application of a given approach/methodology is, the
organizational knock-on effects (e.g. "Mr Tough Guy Testalot" holds up the
release train or nukes your architecture to make it 'testable'), etc. but,
seriously, "great developers who ship" is really what moves the needle.

~~~
WestCoastJustin
Having a great Ops staff also helps ;) Of note is Thomas Limoncelli, who wrote
"The Practice of System and Network Administration" [1] and "Time Management
for System Administrators" [2] and works for Stack Exchange (formerly at Google).
The Practice of System and Network Administration is basically the bible for
most sysadmins, myself included.

ps. I only singled Thomas Limoncelli out as an example just to highlight the
caliber of their Ops staff.

[1] [http://www.amazon.com/Practice-System-Network-Administration...](http://www.amazon.com/Practice-System-Network-Administration-Second/dp/0321492668)

[2] [http://www.amazon.com/Management-System-Administrators-Thoma...](http://www.amazon.com/Management-System-Administrators-Thomas-Limoncelli/dp/0596007833)

~~~
carsongross
Violently agree.

~~~
skeletonjelly
Vehemently? Or do you want to punch someone?

~~~
carsongross
Violently.

It's funnier.

------
skittles
He mentioned that they use the servicestack.text library. I've looked into
servicestack recently (using the nuget packages), but then found the library
to be pay-to-play. There's an older version (v3) that is BSD licensed that is
being maintained. Do any of you have experience with it? I have grown tired of
Microsoft pushing new solutions to the same problem (REST service with WCF and
then Asp.net web api).

~~~
sklivvz1971
We used it at the time I gave that talk, we don't anymore. We only used JSON
serialization and we have rolled out our own free solution, Jil.

[https://github.com/kevin-montrose/Jil](https://github.com/kevin-montrose/Jil)

~~~
kmontrose
Technically we use Newtonsoft and Jil, Jil replacing Newtonsoft as we become
increasingly confident in it.

I wouldn't suggest anyone use Jil in a production role unless you're at Stack
Overflow. It's too untested at the moment, and the typical person can't get me
on the horn to fix whatever just broke.

~~~
guiomie
Why would I use Jil over Newtonsoft?

~~~
JasonPunyon
You wouldn't right now (Kevin doesn't recommend it). But in the end you'll
want to use it if JSON serialization is a performance bottleneck for you.

------
y0ghur7_xxx
I would love to know more about the Databases:

\- Are they used for different things on the sites?

\- Is data partitioned across tables?

\- Are they all SQL Server instances?

~~~
zero1zero
I would like to know more about this as well.

It sounds like they are all SQL Server instances. However, he made it seem
like they are reproducing the schema once per site? I.e., a separate database
per site rather than sharding the shared data to multiple hosts per site. Did
I hear this right in the question/answer portion?

~~~
kmontrose
Stack Exchange has one database per-site, so Stack Overflow gets one, Super
User gets one, Server Fault gets one, and so on. The schema for these is the
same.

There are a few wrinkles. There is one "network wide" database which has
things like login credentials, and aggregated data (mostly exposed through
stackexchange.com user profiles, or APIs). Careers Stack Overflow,
stackexchange.com, and Area 51 all have their own unique database schema.

All databases are MS SQL Server.
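
To make the layout above concrete, here is a minimal sketch of "one database per site, same schema, plus one network-wide database". It is only an illustration: it uses in-memory SQLite as a stand-in for MS SQL Server, and the table names (`posts`, `accounts`), the columns, and the helper functions are all hypothetical, not the real Stack Exchange schema.

```python
import sqlite3

# Hypothetical schema shared by every per-site database (not the real one).
SITE_SCHEMA = "CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)"
# Hypothetical schema for the single network-wide database.
NETWORK_SCHEMA = "CREATE TABLE accounts (id INTEGER PRIMARY KEY, email TEXT)"


def build_databases(site_names):
    """Create one database per site (identical schema) plus one network-wide database."""
    sites = {}
    for name in site_names:
        conn = sqlite3.connect(":memory:")  # stand-in for a separate SQL Server database
        conn.execute(SITE_SCHEMA)
        sites[name] = conn
    network = sqlite3.connect(":memory:")
    network.execute(NETWORK_SCHEMA)
    return sites, network


def add_post(sites, site, title):
    """Site content is written only to that site's own database."""
    sites[site].execute("INSERT INTO posts (title) VALUES (?)", (title,))


def count_posts(conn):
    return conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]


def table_columns(conn, table):
    """Return the column names of a table, in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
```

With this shape, a question posted on one site never touches another site's database, while anything that spans sites (logins, aggregated data) would live in the network-wide one.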

~~~
avemg
How do you manage schema changes with release deployments across all of
the databases that are meant to be standard?

~~~
sklivvz1971
All the schema changes are applied to all site databases at the same time.
They need to be backwards compatible so, for example, if you need to rename a
column - a worst case scenario - it's a multi-step process: add a new
column, add code which works with both columns, back fill the new column,
change code so it works with the new column only, remove the old column.
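
The five steps for a column rename could be sketched roughly like this. It is a sketch under assumptions, not the actual Stack Overflow tooling: it uses in-memory SQLite instead of MS SQL Server, and the `users` table, the `name` and `display_name` columns, and every function name are hypothetical.

```python
import sqlite3


def step1_add_new_column(conn):
    """Step 1: add the new column next to the old one."""
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


def read_name(conn, user_id):
    """Step 2: transitional read that works with both columns."""
    row = conn.execute(
        "SELECT COALESCE(display_name, name) FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0]


def write_name(conn, user_id, value):
    """Step 2: transitional write that keeps both columns in sync."""
    conn.execute(
        "UPDATE users SET name = ?, display_name = ? WHERE id = ?",
        (value, value, user_id),
    )


def step3_backfill(conn):
    """Step 3: copy old values into the new column where it is still empty."""
    conn.execute("UPDATE users SET display_name = name WHERE display_name IS NULL")


def read_name_final(conn, user_id):
    """Step 4: code that uses the new column only."""
    row = conn.execute(
        "SELECT display_name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0]


def step5_drop_old_column(conn):
    """Step 5: remove the old column.

    Older SQLite versions lack DROP COLUMN, so this sketch rebuilds the table;
    on SQL Server this would be a plain ALTER TABLE ... DROP COLUMN.
    """
    conn.executescript(
        """
        CREATE TABLE users_new (id INTEGER PRIMARY KEY, display_name TEXT);
        INSERT INTO users_new (id, display_name) SELECT id, display_name FROM users;
        DROP TABLE users;
        ALTER TABLE users_new RENAME TO users;
        """
    )
```

The point of the ordering is that at every step, both the previously deployed code and the newly deployed code can run against the schema as it currently stands.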

~~~
avemg
Thanks for the reply. We have a similar architecture where I work so this is
interesting to me. A couple more questions if you don't mind:

\- Do you use any tools for orchestrating the rollout of those schema changes
or do you just have some homegrown scripts?

\- Do you separate your schema versioning and deployment process from your
application versioning and deployment process?

\- How do you handle cases where backwards-compatibility is not possible? For
example, a new application feature that depends on a brand new table.

------
schmylan
Before the title was moderated there was an important tidbit. StackOverflow
doesn't unit-test. Fascinating.

~~~
schmylan
tldw; He says he doesn't advocate it but they get away with it by having the
community test it out for them in their meta site. Then the community writes
up the bugs.

~~~
merak136
He actually says "I'm not advocating that you shouldn't put in tests. [The
reason we can get away with this] is that we have a great community."

I take this to mean that he feels that StackOverflow doesn't need tests. Not
that tests are useless.

~~~
BrandonY
User community as testers presents some interesting pros and cons.

Pros:

* Tests are self-updating. Add a new feature: tests come in for free. Change a feature: tests automatically update. Fail to document a change: tests fail.

* Tests are unusually thorough

* Eventually consistent testing. If nobody ever complains, it probably wasn't a bug worth fixing.

Cons:

* Tests cannot be run offline. Feature must be committed and deployed before tests can be run.

* Potentially large quantity of false positives (bad bug reports)

* Potentially large quantity of false negatives (nobody notices particular bug, release considered good)

* Does not work for non-user-visible features

So basically you trade the reliability of your tests for a substantial
build/release speedup. Some users experience each bug, but they are the users
who are actively using the meta-community and have signed up to experience
more bugs. Still, lack of pre-release unit testing must radically increase the
importance of VERY careful code reviews.

Not the decision I would have made, but definitely has the sorts of advantages
that a small team of engineers would drool over.

~~~
sklivvz1971
Remember that our community writes bug reports but also vets bug reports. We
rarely have to deal with bad reports. Interestingly, large quantities of false
negatives are a non-issue.

~~~
welegan
Presumably for the same reason they don't have a ton of bad questions on Stack
Overflow: their community scoring would apply just as much to bug reports

------
alexgartrell
Dear any Stack Overflow Developers,

Can you describe the network infrastructure in finer detail? Specifically what
type of load balancer are you running?

And what's peak RPS? Where are your network peaks? (I'm guessing major peak US
Pacific and minor US Atlantic?)

~~~
TacticalCoder
IIRC at first they had an entire Microsoft stack (I may be mistaken on that).

But nowadays, from what I've read here on HN by SE devs in other threads,
they're using lots and lots and lots of Linux: HAProxy, Redis, Nagios, etc.

I just double-checked the slide and although I didn't notice it at first, you
can see that 'HA Proxy' and 'Redis' are mentioned.

The core Q&A is in C#/MS-SQL so that's probably not going to move to Linux
anytime soon.

------
dimension64
This might be a Stack Overflow question, but what is static code?

------
notastartup
is there an open source, self-hosted version of stack overflow that you can
deploy on your own domain?

~~~
robzienert
Yes. [http://meta.stackoverflow.com/questions/2267/stack-overflow-...](http://meta.stackoverflow.com/questions/2267/stack-overflow-clones)

~~~
m_myers
To be clear: there is no version of the actual Stack Overflow code that is
publicly available. There are, however, numerous open-source reimplementations
of portions of the site code.

Also (as the video perhaps mentioned), the Stack Overflow developers have
often been able to spin off pieces of the code as open-source libraries. See
[http://blog.stackoverflow.com/2012/02/stack-exchange-open-so...](http://blog.stackoverflow.com/2012/02/stack-exchange-open-source-projects/)

------
dlazerka
I wouldn't trust Joel Spolsky's code expertise -- just look at Excel
internals! Nevertheless, Stack Overflow is super cool. But that tells nothing
about its architectural quality.

