StackOverflow has VERY FEW tests. He says that StackOverflow doesn't use many unit tests because of their active community and heavy usage of static code.
 Most StackOverflow employees work remotely. This is very different than a lot of companies that are now trying to force employees back into an office.
 Heavy usage of Static classes and methods. His main argument is that this gives them better performance than a more standard OO approach.
 Caching even simple pages in order to avoid performance issues caused by garbage collection.
 They don't worry about making a "Square Wheel". If their developers can write something more lightweight than an already developed alternative, they do! This is very different from the normal mindset of "don't reinvent the wheel".
 Always using multiple monitors. I love this. I feel like my productivity is nearly halved when I am working on one tiny screen.
Overall, I was surprised at how few of the "norms" they follow. Either way, seems like it could be a pretty cool place to work.
Seems very, very optimized and cost effective. It's brilliant.
If your methods are static, there is a tendency/temptation to use static member variables (making them stateful), which will cause side effects.
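To make the side-effect risk concrete, here is a minimal Python sketch (the thread is about C#/JVM code, so this is only an analogy). The class names and the currency example are made up for illustration. The first class keeps state in a class-level variable, so the same call can return different results depending on what other code did earlier. The second takes everything as arguments.

```python
class PriceFormatterStateful:
    # Hypothetical example: shared, mutable class-level state.
    # Every caller in the process sees (and can change) the same value.
    currency = "USD"

    @staticmethod
    def format(amount):
        return f"{amount:.2f} {PriceFormatterStateful.currency}"


class PriceFormatterStateless:
    @staticmethod
    def format(amount, currency):
        # Everything the method needs arrives as an argument,
        # so the result depends only on the inputs.
        return f"{amount:.2f} {currency}"


before = PriceFormatterStateful.format(10)
PriceFormatterStateful.currency = "EUR"    # some unrelated code path changes it
after = PriceFormatterStateful.format(10)  # same call, different result

stateless_a = PriceFormatterStateless.format(10, "USD")
stateless_b = PriceFormatterStateless.format(10, "USD")
```

Static methods by themselves are harmless here. The trouble starts when static state creeps in alongside them.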
Don't forget the following points too:
1) You still have pooled objects somewhere (stateless business logic classes like XYZServices, repository classes that may be backed by pooled DB connections and Transaction Managers) provided/managed by your Application Server or by a 3rd-party framework (Spring does this).
2) Your Application Server tends to have beefy hardware, good enough not to care about GC hiccups.
There are other reasons to use static methods but I don't think they're strong enough in this case.
Typically there are three lifetimes for objects in server processes. Those that are allocated around startup and are never deallocated; those that are allocated per-request and become garbage once the response goes out; and lifetimes that span multiple requests, like objects in caches.
The first are normally ultra-cheap to "collect": with a generational GC, you simply don't scan them at all, because they haven't changed.
The second group, per-request, are also fairly cheap to collect. Every so often, you GC the youngest generation, and you only need to keep track of references in registers and on the stack. Ideally many requests will have occurred between collections, and the only objects that get kept alive are objects that are in-flight for the current request. And this is why you need at least three generations; you really don't want to have to scan the oldest generation to collect these ephemeral objects after they've built up over a number of youngest generation collections.
It's the third group that kills you. You can save on the cost of scanning the whole heap, using write barriers to track new roots buried in the oldest generation; but that adds accounting costs, and eventually overtakes the cost of a whole heap GC. These guys can also cause the fragmentation you're worried about - they need to be compacted down, copied possibly multiple times. On the CLR, last time I checked, you need a full gen2 GC in order to get rid of them, as they've likely survived a gen1 collection.
With these guys, it's worthwhile doing the big object thing. In fact, it may be worthwhile not having any GC heap storage for them at all, and refer to them using different techniques, like ephemeral keys that look up in Redis, or native pointers stored in statically allocated arrays.
In app servers I've designed, I've never seen GC CPU usage over 5% or so, even with heavy usage of tiny short-lived objects. But you need to care about lifetime.
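A rough sketch of the three lifetimes, in Python purely for illustration. It does not measure or reproduce any GC behaviour, and `SETTINGS`, `page_cache`, `render`, and `handle_request` are hypothetical names. The point is only to show where each kind of object lives, and that the cross-request group is the one you have to bound deliberately.

```python
from collections import OrderedDict

# Lifetime 1: allocated at startup and never released.
SETTINGS = {"site_name": "example", "max_cached_pages": 2}

# Lifetime 3: spans many requests. Bounded so it cannot grow without limit.
page_cache = OrderedDict()


def render(path):
    # Lifetime 2: per-request temporaries; garbage once the response is built.
    parts = ["<h1>", SETTINGS["site_name"], "</h1>", "<p>", path, "</p>"]
    return "".join(parts)


def handle_request(path):
    if path in page_cache:
        page_cache.move_to_end(path)      # mark as most recently used
        return page_cache[path]
    body = render(path)
    page_cache[path] = body
    if len(page_cache) > SETTINGS["max_cached_pages"]:
        page_cache.popitem(last=False)    # evict the least recently used entry
    return body


for p in ["/a", "/b", "/a", "/c"]:
    handle_request(p)
cached_paths = list(page_cache)  # "/b" was evicted; "/a" and "/c" remain
```

The cache here is a plain in-process LRU. Moving it out of the managed heap entirely (for example behind keys in Redis, as suggested above) would be a different design with the same goal.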
When a Request comes in, the App-Server will allocate a thread (or take one from the pool) to serve that Request (in the .NET/JVM world; Ruby/Python use processes unless you use a different App Server).
If you create small objects within the scope of that Request (which usually lives inside a method) and those objects are contained and don't hold references to any long-lived objects, they will be GC-ed quickly (and potentially way quicker) once the method is finished.
The Thread is GC-ed as well once it's finished (unless you release it back to the 'unused' pool).
My feeling is that their use of static methods has nothing at all to do with GC.
Many do, but they have a fairly large office in NYC and a smaller one in London.
I believe at this point most new technical hires are remote.
Our offices are mostly sales, Denver and London exclusively so.
I can think of 3 devs who have gone remote, and 2 devs (including myself) who have moved to NYC since I've been here. Most people stay wherever they were hired. The only location-specific policy I'm aware of is a cost-of-living adjustment in NYC (though that may also apply to London/SF/etc., I don't honestly know).
For pith's sake, I want to say "Everything else is noise" but that isn't true. Everything else can help or hurt, depending on the application, how doctrinaire the application of a given approach/methodology is, the organizational knock-on effects (e.g. "Mr Tough Guy Testalot" holds up the release train or nukes your architecture to make it 'testable'), etc. But, seriously, "great developers who ship" is really what moves the needle.
ps. I only singled Thomas Limoncelli out as an example just to highlight the caliber of their Ops staff.
Every feature/user story has to go through a workflow of selected for development -> UX design (if required) -> Development -> Unit Tests (or the other way around) -> Staging -> Load Test -> Acceptance Test -> Production -> Analytics (to see if people actually use it) -> Learn from analytics -> back to start if required.
The goal is to get as many issues through the workflow as fast and rigorously (no shortcuts) as possible at a sustainable pace. Have a continuous flow of features rolling out through this process, ideally with continuous delivery to automate the majority of it.
There are really only two steps to great software development.
1. Hire good developers.
2. Don't hire bad developers.
I wouldn't suggest anyone use Jil in a production role unless you're at Stack Overflow. It's too untested at the moment, and the typical person can't get me on the horn to fix whatever just broke.
- Are they used for different things on the sites?
- Is data partitioned across tables?
- Are they all SQL Server instances?
It sounds like they are all SQL Server instances. However, he made it seem like they are reproducing the schema once per site? I.e., a separate database per site rather than sharding the shared data to multiple hosts per site. Did I hear this right in the question/answer portion?
There are a few wrinkles. There is one "network wide" database which has things like login credentials, and aggregated data (mostly exposed through stackexchange.com user profiles, or APIs). Careers Stack Overflow, stackexchange.com, and Area 51 all have their own unique database schema.
All databases are MS SQL Server.
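As I read the exchange above, the layout is roughly "one database per site, plus one network-wide database". A minimal Python sketch of that routing idea follows. The database names, the host-to-database mapping, and the `database_for` helper are all hypothetical and are not how Stack Exchange actually names or selects anything.

```python
# Hypothetical names throughout; only the overall layout comes from the thread.
NETWORK_WIDE_DB = "network_wide"   # login credentials, aggregated data

SITE_DATABASES = {
    "stackoverflow.com": "site_stackoverflow",
    "superuser.com": "site_superuser",
}


def database_for(host, needs_network_data=False):
    """Pick the database a query should go to for a given site host."""
    if needs_network_data:
        return NETWORK_WIDE_DB
    return SITE_DATABASES[host]


so_db = database_for("stackoverflow.com")
login_db = database_for("superuser.com", needs_network_data=True)
```

Sites with their own unique schema (Careers, stackexchange.com, Area 51) would need separate handling that this sketch leaves out.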
- Do you use any tools for orchestrating the rollout of those schema changes or do you just have some homegrown scripts?
- Do you separate your schema versioning and deployment process from your application versioning and deployment process?
- How do you handle cases where backwards-compatibility is not possible? For example, a new application feature that depends on a brand new table.
I take this to mean that he feels that StackOverflow doesn't need tests. Not that tests are useless.
* Tests are self-updating. Add a new feature: tests come in for free. Change a feature: tests automatically update. Fail to document a change: tests fail.
* Tests are unusually thorough
* Eventually consistent testing. If nobody ever complains, it probably wasn't a bug worth fixing.
* Tests cannot be run offline. Feature must be committed and deployed before tests can be run.
* Potentially large quantity of false positives (bad bug reports)
* Potentially large quantity of false negatives (nobody notices particular bug, release considered good)
* Does not work for non-user-visible features
So basically you trade the reliability of your tests for a substantial build/release speedup. Some users experience each bug, but they are the users who are actively using the meta-community and have signed up to experience more bugs. Still, lack of pre-release unit testing must radically increase the importance of VERY careful code reviews.
Not the decision I would have made, but it definitely has the sorts of advantages that a small team of engineers would drool over.
- Stack Exchange employee
Can you describe the network infrastructure in finer detail? Specifically what type of load balancer are you running?
And what's peak RPS? Where are your network peaks? (I'm guessing major peak US Pacific and minor US Atlantic?)
But nowadays, from what I've read here on HN by SE devs in other threads, they're using lots and lots and lots of Linux: HAProxy, Redis, Nagios, etc.
I just double-checked the slide and although I didn't notice it at first, you can see that 'HA Proxy' and 'Redis' are mentioned.
The core Q&A is in C#/MS-SQL so that's probably not going to move to Linux anytime soon.
Also (as the video perhaps mentioned), the Stack Overflow developers have often been able to spin off pieces of the code as open-source libraries. See http://blog.stackoverflow.com/2012/02/stack-exchange-open-so...