The Java servers get around 900K requests/s on beefy hardware while Rails squeezes out 6K. That's a 150x difference! Any real application will be slower than this, and on cloud hardware you can expect to be an order of magnitude slower as well. You just don't have much headroom if you use languages like Ruby before you have to scale out. And once you scale out you have to worry about sys. admin. and all the problems of a distributed system.
It's a one-off cost to learn an efficient language, but it pays returns forever.
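To put the headroom argument in numbers, here is a rough sketch using the two figures quoted above. The 10x penalties for "a real application" and for cloud hardware are assumptions read out of the "order of magnitude" wording, not measurements.

```python
# Rough headroom estimate from the figures quoted in the comment above.
# The two 10x penalty factors are assumptions, not benchmark results.

JAVA_RPS = 900_000   # quoted peak for the Java servers
RAILS_RPS = 6_000    # quoted peak for Rails


def speed_ratio(fast_rps: int, slow_rps: int) -> float:
    """How many times more requests/s the faster stack serves."""
    return fast_rps / slow_rps


def realistic_rps(benchmark_rps: int, app_penalty: int = 10, cloud_penalty: int = 10) -> float:
    """Scale a benchmark peak down by assumed application and cloud penalties."""
    return benchmark_rps / (app_penalty * cloud_penalty)


print(speed_ratio(JAVA_RPS, RAILS_RPS))   # 150.0
print(realistic_rps(JAVA_RPS))            # 9000.0
print(realistic_rps(RAILS_RPS))           # 60.0
```

Under those assumed penalties a single box goes from thousands of requests/s of headroom to a few dozen, which is the point being made about scaling out early.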
Except those TechEmpower benchmarks show .NET is not nearly as fast as Java. I think StackExchange prove that the platform is NOT the most important: it's much more important to make performance a priority in all engineering decisions, to benchmark everything beforehand and develop new technology where there's no good standard solution available. (think Dapper, protobuf-net, StackExchange.Redis)
To more accurately judge the request routing performance of C# tests on Windows, see Round 8 data. For example, see Round 8 limited to Java, C#, and Ruby tests on i7 hardware.
Other important notes on that data:
* The http-listener test is a low-level test implementation.
* The rack-jruby test is running on Torqbox (TorqueBox 4), which is based on Undertow and has very little Ruby code. This test is mostly about Undertow/Torqbox, with a bit of JRuby.
Another interesting test is Fortunes on i7 in Round 8. Fortunes involves request routing, a database query, ORM translation of multiple rows, dynamic addition of rows, in-memory sorting, server-side templating, XSS countermeasures, and UTF-8 encoding. Here you will see Java frameworks at about 45K rps, ASP.NET on Windows at about 15K rps, and Ruby at about 2.5K rps.
I didn't notice .Net in the benchmarks -- no .Net languages are listed but .Net is there as a framework. It's still x20 faster than Ruby.
How many applications are rate limited by the speed of the front-end language? Not that many; the speed of the backing store will usually be the bottleneck.
One example, Django template processing is slow
Java servers get around 900K requests/s [...] while Rails squeezes out 6K
I'm not sure it is really representative of the overall speed you'll get with whatever framework you'll be using: how often do you have a static page that's best served by your framework instead of letting nginx serve it directly?
I prefer watching the test results that actually involve the database (/db, /queries, /fortunes and /updates), as it shows less raw speed for serving static things and more overall speed for your dynamic pages.
With /queries for example, Java does ~11.3K, while php (on hhvm) does ~10.7K, Python is at ~7.7K and Dart ~12.8K (For the fastest framework of each language. Rails still does bad though).
I prefer the /json test because it gives me a performance ceiling. I know that's the max performance I can expect, and it's up to me to design my system to get as close to that as I can.
On the other hand the db tests don't really tell me much. Does my data and data access patterns match theirs? Probably not. So it is difficult to generalise from their results. If the systems you build are always stateless web servers talking to a relational db then I can see it might be more useful.
Meanwhile, the 20-query test is a bit pathological as it runs into a brick wall with the database wire protocol and efficiency of the database driver. Many otherwise high-performance frameworks and platforms become bottle-necked waiting on those 20 queries per request. But you and I agree, a 20-query-per-request scenario should be the exception and not the rule. When developing a smooth-running web-application, it's common to aim for zero queries per page load. (For those who find this to be crazy talk, note that I'm saying we aim for that ideal, and may not necessarily achieve it.)
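A minimal sketch of what "aim for zero queries per page load" can look like with a cache-aside read path. FakeDatabase, CachedStore and render_page are hypothetical names invented for this illustration, and the sketch ignores cache invalidation entirely.

```python
# Cache-aside sketch: count how many "database" queries a page load costs.
# FakeDatabase, CachedStore and render_page are hypothetical names.

class FakeDatabase:
    def __init__(self, rows):
        self.rows = rows
        self.query_count = 0

    def get(self, key):
        self.query_count += 1          # every call stands for one round trip
        return self.rows[key]


class CachedStore:
    def __init__(self, db):
        self.db = db
        self.cache = {}

    def get(self, key):
        if key not in self.cache:      # miss: pay for one query
            self.cache[key] = self.db.get(key)
        return self.cache[key]         # hit: no query at all


def render_page(store, keys):
    return ", ".join(store.get(k) for k in keys)


db = FakeDatabase({"title": "Hello", "author": "alice"})
store = CachedStore(db)

render_page(store, ["title", "author"])    # cold cache
cold_queries = db.query_count
render_page(store, ["title", "author"])    # warm cache
warm_queries = db.query_count - cold_queries
print(cold_queries, warm_queries)          # 2 0
```

The second render is the ideal case described above: the page is built without touching the database.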
I too particularly enjoy knowing the high-water mark set by the JSON and Plaintext tests. When we add the next test type (caching-enabled multiple queries), we should see some interesting results.
Update: To try to avoid turning this into a language war, there are other good reasons to use something like RoR. Ease of hiring is one. Using known tools is another, if you're not building anything you expect to get load.
Languages like ColdFusion and PHP are web languages that don't make much sense outside of the web.
It also imposes costs forever.
Software is never finished.
I'm curious what the reasoning is for "Efficient code ... makes code easier for programmers to understand". To my mind, efficient code (in this case, I assume coding to the hardware, as they mention elsewhere), has many benefits, but making it easier to understand is not one of them. A useful comment by the code may help, but that's not a result of efficient coding, that's a result of good commenting practice.
I've seen this a lot when working in PHP. I wrote some websites from the ground up (using something similar to my own framework). I optimized it multiple times to great results. The biggest benefits didn't come from making a particular function faster, they came from realizations that large chunks of complexity in templating, routing and permissions checking subsystems simply weren't necessary. It doesn't matter how cleverly those chunks were written, because I got rid of them completely.
I think this is what the author speaks about.
Note that the code is autogenerated, so it should be equally efficient. The Go version also happens to be very simple and no different than most humans would write by hand (without trying very hard to optimize).
Or are you saying that languages should always use UTF-8 natively? I would agree with you on that, but disagree that this proves your point that "efficient" and "easy to understand and verify correctness" aren't mutually exclusive. pb.getSomeId().charAt(1000) runs in constant time in Java (albeit failing to give correct data if getSomeId() contains codepoints higher than \uFFFF), but pb.GetSomeId()[1000] will give you garbage data in Go if your field contains non-ASCII text. To get a valid codepoint in Go, you'd need to do string([]rune(pb.GetSomeId())[1000]), which runs in linear time and omits the check for valid UTF-8 bytes.
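The same trade-off can be sketched in Python, where bytes plays the role of the raw UTF-8 field and str the role of decoded text. This is only an analogy to the Go and Java behaviour discussed above, and the sample string is made up.

```python
# Byte indexing vs. code point indexing on UTF-8 data.
text = "na\u00efve caf\u00e9"     # "naive cafe" with two non-ASCII letters
raw = text.encode("utf-8")        # the bytes as they would sit in the message

# Constant-time byte access, but index 3 lands in the middle of U+00EF.
byte_at_3 = raw[3]                # a continuation byte, not a character

# Decoding validates the UTF-8 and walks every byte (linear time);
# after that, indexing yields whole code points.
decoded = raw.decode("utf-8")
char_at_3 = decoded[3]


def is_continuation_byte(b: int) -> bool:
    """UTF-8 continuation bytes have the bit pattern 10xxxxxx."""
    return (b & 0xC0) == 0x80


print(is_continuation_byte(byte_at_3), char_at_3)   # True v
```

Cheap indexing and correct indexing are different operations here, which is the disagreement in this subthread.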
It's possible when they said "efficient", they did not mean "hardware efficient, runs fast, doesn't use a lot of resources" but "developer efficient, faster to write and easier to understand". But that entire phrase as is doesn't make sense, I agree with that.
It's also possible that it's a simple mistake. Maybe they went overboard with marketing phrases and claimed something that isn't quite true.
To be very clear, I don't doubt the author (you?) had a good point to make, just that it was unclear what that was from the way it was presented, and to such a degree that it took me out of the flow of reading the post, and perhaps a less ambiguous way of expressing that concept (or omitting it, if it's expressed succinctly elsewhere) would be clearer and more effective.
Unfortunately, if it is just poor communication on their part, we'll likely get no answers that can be considered more likely than others without the original author or someone related (work-wise) piping up.
Of course, as others have mentioned, it is validating the UTF-8 but the Go version is not.
The Go code is very concise and the Java code is very verbose.
The Go code does look more efficient but what is ImpressionData? Where is it checking the string is valid utf8?
users=db.get("SELECT name FROM users WHERE group='admin'")
tmp=db.get("SELECT name,group FROM users")
if (tmp[i]['group']=='admin') users=tmp[i][name];
Obviously you can make code both less efficient and less readable. But starting with code that a competent programmer has written, I find it seldom makes it more readable when you make it more efficient.
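For anyone who wants to run the two variants side by side, here is a sketch against an in-memory SQLite database. The schema and rows are made up, and the column is called user_group only because group is a reserved word in SQL.

```python
import sqlite3

# In-memory table; the schema and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, user_group TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "staff"), ("carol", "admin")],
)


def admins_filtered_in_db(conn):
    """Let the database filter; only matching rows cross the wire."""
    rows = conn.execute(
        "SELECT name FROM users WHERE user_group = ? ORDER BY name", ("admin",)
    )
    return [name for (name,) in rows]


def admins_filtered_in_app(conn):
    """Fetch every row, then filter in application code."""
    rows = conn.execute("SELECT name, user_group FROM users ORDER BY name")
    return [name for name, group in rows if group == "admin"]


print(admins_filtered_in_db(conn))    # ['alice', 'carol']
print(admins_filtered_in_app(conn))   # ['alice', 'carol']
```

Both return the same list; the second one just transfers and inspects every row to get there, and it is not shorter either.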
But who in their right mind would do the second one?
Surely you jest. I don't know who, but their handiwork is everywhere.
for i = 1 to 100
    if x = this
        do that and something with i
    something else with i
easy to read, less efficient than:
if x = this
    for i = 1 to 100
        do that and something with i
for i = 1 to 100
    do something else with i
from my experience, compilers usually don't hoist very well
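A runnable version of the two loop shapes above, as a sketch. The loop bodies are placeholders I made up, and the rewrite is only equivalent when the two bodies do not depend on each other's side effects.

```python
def interleaved(x, this, n=100):
    """Check the loop-invariant condition on every iteration."""
    that, other = [], []
    for i in range(1, n + 1):
        if x == this:
            that.append(i * 2)    # "do that and something with i"
        other.append(i + 1)       # "something else with i"
    return that, other


def unswitched(x, this, n=100):
    """Test the invariant condition once, outside the loops."""
    that, other = [], []
    if x == this:
        for i in range(1, n + 1):
            that.append(i * 2)
    for i in range(1, n + 1):
        other.append(i + 1)
    return that, other


print(interleaved(1, 1) == unswitched(1, 1))   # True
print(interleaved(1, 2) == unswitched(1, 2))   # True
```

The second form is what an optimizer would call loop unswitching; whether it is actually faster depends on the language and on how expensive the condition is.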
Essentially they're trying to keep SE from becoming a "chat" community, and keep it focused on being a Q&A community.
With Yahoo Answers, Usenets, even Subreddits, you have people having conversations within posts, which generates noise for people who were looking for answers to questions.
It'd be like going to Home Depot and needing information that is related to building supplies but the employees tell you you must go to a different store down the street to even ask your question.
I don't agree with your analogy; I think it's more akin to going to Home Depot to ask for vegetables. I think it makes perfect sense to have a separation between Cooking.SE, Photography.SE, Christianity.SE, etc.
There are a few edge cases, primarily in the tech sites, but I don't think those disprove the model.
But when you get to related topics such as servers, programming, security, tools, web development, web apps... all these items are so closely related that it's often disadvantageous to try to ask a vertically silo'd question.
I get that they are trying to create a detailed 'manual' for all questions, but you get to a point where it's just better to read the manual.
I'm often surprised at the paucity of test-coverage in relatively large companies.
Can any SO devs give us more details on "not many tests"? Or how many bug reports get filed vs rate of change of software?
You can look at reported bugs here: http://meta.stackexchange.com/questions/tagged/bug -- it's pretty frequent, at least one every couple of hours or so. The last commit in our "Tests" project was 14 days ago, whereas the last commit in the code base was 4 minutes ago.
If only for the wailing, gnashing of teeth and rending of garments.
Also, some might remember when they had Uncle Bob on the podcast, the topic being unit testing, and iirc Bob preaching the gospel (~ you must write plentiful unit tests, or else) and Jeff and Joel (especially) more or less saying they don't quite get why.
(Am I remembering this story correctly? I swear that's how I remember it.)
Then after a while it seems like an obvious way to work.
I have next to zero experience with server administration, but 384GB seems like a lot to me. Is that common for production servers for popular web services? Do you need a customized OS to address that much memory? Seems like you'd really need to beef up the cache hierarchy to make 0.38TB of RAM fast.
Memory support for versions of Windows:

                            User-mode process         Kernel-mode
                            address space             address space
                            32-bit      64-bit        all
Server Operating Systems:
  Windows Server 2008 R2    4 GB*       2048 GB       2048 GB
  Windows Server 2012       4 GB*       4096 GB       4096 GB
  Windows Server 2012 R2    4 GB*       131072 GB     131072 GB
Client Operating Systems:
  Windows 7 (x86)           3 GB*       n/a           2 GB
  Windows 7 (64-bit)        4 GB*       192 GB        192 GB
  Windows 8 (64-bit)        4 GB*       512 GB        512 GB
  Windows 8.1 (64-bit)      4 GB*       131072 GB     131072 GB

 *  requires IMAGE_FILE_LARGE_ADDRESS_AWARE flag, otherwise 2GB
 ** All units use binary exponents (GB = 2^30 bytes)
The system obviously behaves a bit differently from a standard desktop machine, e.g. different areas of RAM have different access speeds depending on the core on which your current process runs, you need to disable individual CPU lines displayed in top, etc., but apart from these, mostly everything seems “normal” to me.
this is NUMA, in case anyone 'new to sysadmin' or to architecture is trying to Google this
Still, I agree re. the difference in price. I'd buy too
(I honestly don't know, it's just surprising.)
Databases have inherent locking problems that take more resources to resolve with more machines, more so in some database systems than others. When scaling you often hit a point where you get down to macro logistics, so imagine a car highway: You can get more throughput by adding more lanes. But what if all cars have to merge into one lane at some point in the middle of the trip (because of a tunnel, bridge or something: it's impossible to have more than one lane at that point)? Now you won't get any throughput benefit from the multiple lanes after all! Just more latency, because queues build up before the single lane, and because of the queues all cars are going slow and need to accelerate on the single lane, so the average speed is low too. You also have too many resources after the single lane because you can never fill all those lanes. It might be better to have one beefed-up single-lane road all the way that people can go fast on. Basically: Remove locks and you get better overall performance.
Yes, this is at the expense of HA. Yes, the costs of scaling up grow asymptotically faster than the costs of scaling out. So this is definitely a trade-off in some sense.
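The single-lane bottleneck in the highway analogy is what Amdahl's law describes, so here is a small sketch of it. Naming Amdahl's law is my framing rather than the commenter's, and the serial fractions below are made-up inputs.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Upper bound on speedup when part of the work must run one at a time.

    serial_fraction is the share of the work spent in the "single lane"
    (for example, waiting on a lock); workers is the number of lanes.
    """
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


# With no single lane, 8 lanes give 8x; with 10% serialized work,
# even a very large number of lanes stays below 10x.
for workers in (2, 8, 64, 1024):
    print(workers, round(amdahl_speedup(0.10, workers), 2))
```

The model only covers the throughput ceiling. It says nothing about the extra latency from queueing that the comment also describes.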
They just run everything through one live database all the time under normal conditions. Personally, I think this is much simpler to manage than regularly spreading out queries over many servers.
If I remember correctly they're using Dell R720's for their SQL Server machines. At current Dell retail pricing 384GB of RAM would be about $5,900.00.
At that price I'd vote for throwing money at raw hardware and reaping the benefits of caching the database in RAM, versus trying to pay programmers to come up with clever techniques to deal with having less RAM.
I don't think addressing that memory is a problem on any x64 architecture. Windows Server 2012 has a 4TB upper limit on RAM, apparently, and as far as I can tell there's no reason for this other than product differentiation between different "levels" of the operating system.
Doing this on Amazon r3.8xlarge ($2.8/hr) would cost you $8100+ over 4 months.
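The arithmetic behind that figure, as a sketch. The hourly rate and the $5,900 RAM price are the numbers quoted in this thread; the 730 hours per month is my assumption. Note that this compares renting a whole instance against buying the RAM alone, so it is only a rough sanity check.

```python
# Prices are kept in cents to avoid floating-point rounding.
R3_8XLARGE_CENTS_PER_HOUR = 280   # $2.8/hr, as quoted above
HOURS_PER_MONTH = 730             # assumption: average month length in hours
RAM_UPGRADE_CENTS = 590_000       # the $5,900 retail figure quoted in the thread


def rental_cost_cents(months: int) -> int:
    """Cost of keeping one instance running continuously."""
    return R3_8XLARGE_CENTS_PER_HOUR * HOURS_PER_MONTH * months


def months_until_rental_exceeds(purchase_cents: int) -> int:
    """First whole month at which cumulative rental passes the purchase price."""
    months = 0
    while rental_cost_cents(months) <= purchase_cents:
        months += 1
    return months


print(rental_cost_cents(4) / 100)                       # 8176.0
print(months_until_rental_exceeds(RAM_UPGRADE_CENTS))   # 3
```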
A DIY server can make sense in some cases. E.g., if you are a very early-stage startup with almost no funding, you are good at assembling and troubleshooting computers, and you are OK with a mere warranty on parts as opposed to full vendor support from HP/Oracle/Dell, go for it. That's how many startups started (see Google and their makeshift half-donated half-loaned boxes in 1998).
10tb = roughly $1,000 per month
They're between 5 and 50 times more expensive on bandwidth than the options in dedicated / colo / self hosting.
The more RAM you have the better without exception here.
edit: wrong number
64-bit computers have a memory limit so large that it is hard to comprehend.
The ram is still lightning fast compared to other stuff.
No, that's not common for production servers for popular web services, though; they tend to shard across other dimensions because the web services are often CPU- or network-bound.
Says in the article:
Windows 2012 is used in New York but are upgrading to 2012 R2 (Oregon is already on it).
This is because you have to effectively (or literally) rebuild the Windows cluster from scratch and we just don't get that level of benefit from the 2012 to 2012 R2 upgrade. There are quite a few improvements we care about: native NVMe, better dynamic quorum, better DSC support, better SMB, and such...but not enough to make the upgrade worth it.
Nick Craver - Stack Exchange Sysadmin & Developer
Trust me, we're bitching about this as are most people and I think changes must be coming there. It doesn't matter how many fancy features you add to the OS if we can't upgrade to it, so they'll have to stop and address that problem.
Look towards the end at the tests for example of how to hook up a custom type (it's pretty simple).
(not to be confused with LINQ to SQL from Microsoft)
A huge problem with the tradeoff we had (well, still have in some areas) is the generated SQL is nasty, and finding the original code it came from is often non-trivial. Lack of ability to hint queries, control parameterization, etc. is also a big issue when trying to optimize queries. For example we (and by "we" I mean, "I made Marc Gravell do it") added literal replacement to Dapper to help with query parameterization which allows you to use things like filtered indexes. In Dapper we also intercept the SQL calls to Dapper and add exactly where they came from. Here's what that looks like, replicate for the other methods:
public static IEnumerable<T> Query<T>(this DataContext db, SqlBuilder.Template template, bool buffered = true, int? commandTimeout = null, IDbTransaction transaction = null, [CallerFilePath]string fromFile = null, [CallerLineNumber]int onLine = 0, string comment = null)
{
    return SqlMapper.Query<T>(db.Connection, MarkSqlString(template.RawSql, fromFile, onLine, comment), template.Parameters as object, transaction ?? db.Transaction, buffered, commandTimeout);
}

private static string MarkSqlString(string sql, string path, int lineNumber, string comment)
{
    if (path.IsNullOrEmpty() || lineNumber == 0)
        return sql;

    var commentWrap = " ";
    var i = sql.IndexOf(Environment.NewLine);
    // if we didn't find \n, or it was the very end, go to the first space method
    if (i < 0 || i == sql.Length - 1)
    {
        i = sql.IndexOf(' ');
        commentWrap = Environment.NewLine;
    }
    if (i < 0) return sql;

    // Grab one directory and the file name worth of the path
    // this dodges problems with the build server using temp dirs
    // but also gives us enough info to uniquely identify a queries location
    var split = path.LastIndexOf('\\') - 1;
    if (split < 0) return sql;
    split = path.LastIndexOf('\\', split);
    if (split < 0) return sql;
    split++; // just for Craver

    var sqlComment = " /* " + path.Substring(split) + "@" + lineNumber + (comment.HasValue() ? " - " + comment : "") + " */" + commentWrap;
    return sql.Substring(0, i) +
           sqlComment +
           sql.Substring(i);
}
The SQL that reaches the server then starts with a comment like this:
/* Models\Post.LinkedQuestions.cs@105 */
select top (@top)
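For readers who don't use C#, here is a rough Python port of the comment-tagging idea. It is not Dapper or Stack Exchange code: mark_sql and the sample path are names I made up, the caller location is passed in explicitly instead of coming from compiler attributes, and it splits on "\n" rather than the platform newline.

```python
from typing import Optional


def mark_sql(sql: str, path: str, line_number: int, comment: Optional[str] = None) -> str:
    """Insert a ' /* dir\\file@line */' comment after the first line of sql."""
    if not path or line_number == 0:
        return sql

    comment_wrap = " "
    i = sql.find("\n")
    # No newline, or the newline is the last character: use the first space.
    if i < 0 or i == len(sql) - 1:
        i = sql.find(" ")
        comment_wrap = "\n"
    if i < 0:
        return sql

    # Keep one directory plus the file name of a Windows-style path.
    split = path.rfind("\\") - 1
    if split < 0:
        return sql
    split = path.rfind("\\", 0, split + 1)   # search backwards from `split`
    if split < 0:
        return sql
    split += 1

    suffix = " - " + comment if comment else ""
    sql_comment = " /* " + path[split:] + "@" + str(line_number) + suffix + " */" + comment_wrap
    return sql[:i] + sql_comment + sql[i:]


sample_path = r"C:\build\Models\Post.LinkedQuestions.cs"   # hypothetical path
print(mark_sql("\nselect top (@top) Id\nfrom Posts", sample_path, 105))
```

When the query text begins with a newline, as in the usage line, the comment ends up on the first line of the statement, ahead of the select.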
The problem with scale-up is if you actually have to get a few times larger, it becomes super expensive. But fortunately hardware is increasing so much that you can probably just get away with it now. There's probably a crossover point we're rapidly approaching where even global-scale sites can just do all their transactions in RAM and keep it there (replicated). I know that's what VoltDB is counting on.
It's almost all over 9 servers, because 10 and 11 are only for meta.stackexchange.com, meta.stackoverflow.com, and the development tier. Those servers also run around 10-20% CPU which means we have quite a bit of headroom available. Here's a screenshot of our dashboard taken just now: http://i.stack.imgur.com/HPdtl.png We can currently handle the full load of all sites (including Stack Overflow) on 2 servers...not 1 though, that ends badly with thread exhaustion.
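A back-of-the-envelope check of that headroom claim, assuming CPU is the only limit. As the comment itself notes, it is not: a single server runs into thread exhaustion. The function name and the 100% ceiling are my assumptions.

```python
import math


def min_servers_by_cpu(server_count: int, cpu_percent: int, max_cpu_percent: int = 100) -> int:
    """Fewest servers that could absorb the current CPU load, ignoring everything but CPU."""
    total_load = server_count * cpu_percent   # in "percent of one server"
    return math.ceil(total_load / max_cpu_percent)


print(min_servers_by_cpu(9, 10))   # 1
print(min_servers_by_cpu(9, 15))   # 2
print(min_servers_by_cpu(9, 20))   # 2
```

At the upper end of the quoted 10-20% range the CPU-only estimate agrees with the "2 servers" figure. At the low end it would suggest one, which is where the thread limits mentioned above take over.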
We could add web servers pretty cheaply; these servers are approaching 4 years old and weren't even close to top-of-the-line back then. Even current generation replacements would be several times more powerful, if we needed to go that route.
Honestly the only scale-up problem we have is SSD space on the SQL boxes due to the growth pattern of reliability vs. space in the non-consumer space. By that I mean drives that have capacitors for power loss and such. I actually just wrote a lengthy email about what we're planning for storage on one of our SQL clusters...perhaps I should echo it verbatim as a blog post? I'm not sure how many people care about that sort of stuff outside our teams.
I'm sure some DBAs and devs here would find it interesting.
I actually just wrote a lengthy email about what we're planning for storage on one of our SQL clusters...perhaps I should echo it verbatim as a blog post? I'm not sure how many people care about that sort of stuff outside our teams.
TLS Session IDs and tickets of course are absolutely essential, but I'd be curious how many peak TPS (number of full handshakes / sec) you are seeing on HAproxy.
The alternative, fanning out your SSL termination to your IIS endpoints, unfortunately means running HAproxy at L2, so you lose all your nice logging.
Even with all that SSL we're only running around 15% CPU at peak so it's not having any trouble. Most of that CPU usage does come from the SSL termination though - it ran around 3-5% CPU before. We're also working on much larger infrastructure changes that will put SSL termination much closer to our users, which means the HAProxy load will drop to just about nothing again. I'm working on a followup to my previous SSL post: http://nickcraver.com/blog/2013/04/23/stackoverflow-com-the-... that will explain why it's not already on for all hits...I think Hacker News will have fun with that one.
All that being said, there's no reason we can't forward that syslog traffic from the listeners to our logging infrastructure to get at least a counter out of it. If you're curious how we log, I'll explain a bit.
We feed it into a logging display called Realog built by Kyle Brandt and Matt Jibson in Go. Here are a few snapshots of the dashboard: http://i.stack.imgur.com/OIqhm.png http://i.stack.imgur.com/iQOP9.png http://i.stack.imgur.com/bfpUb.png http://i.stack.imgur.com/JZNy6.png
This lets us forward on custom headers to logging that the web servers are sending back. It tells us how much time we spent in SQL, redis, etc. We can graph it (bottom of the first link) and identify the root cause of any issues faster. It also handles parsing of that traffic from the syslog entry already, so we use it as a single logging point from HAProxy and it handles the JSON generation to forward that traffic data into logstash (a 300 TB cluster we're working on setting up right now).
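To make the "custom headers to JSON" step concrete, here is a tiny sketch. It is not Realog: the X-Timing-* header names and the output field names are hypothetical, chosen only to show the shape of the transformation.

```python
import json


def timings_to_json(headers: dict) -> str:
    """Pick out timing headers and emit them as one JSON object.

    The "x-timing-" prefix is a made-up convention for this sketch.
    """
    prefix = "x-timing-"
    timings = {}
    for name, value in headers.items():
        key = name.lower()
        if key.startswith(prefix):
            timings[key[len(prefix):] + "_ms"] = int(value)
    return json.dumps(timings, sort_keys=True)


response_headers = {
    "Content-Type": "text/html",
    "X-Timing-Sql": "12",
    "X-Timing-Redis": "3",
}
print(timings_to_json(response_headers))   # {"redis_ms": 3, "sql_ms": 12}
```

A record like that is what a downstream indexer would need in order to graph the time spent in SQL or redis per request.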
As soon as we get the logstash piece working well and dogfooded thoroughly, I'll poke Kyle to open source Realog so hopefully some others can get use from it.
Not everyone has Google-like problems that are better solved by a battery of cheap boxes.
You simply put a load balancer (Nginx/ELB/HAProxy etc) in front of a fleet of smaller web/application servers that dynamically scale depending on traffic. That way it is cost effective, far more reliable, easier to scale and you can tolerate DC outages better.
So for us, the SQL data store is the real “up” part of the equation. We have a fair amount of headroom there, so if we can keep sharding (and other data strategies) out of the codebase, so much the better.
(Load-balancing HTTP requests “out” is not a big deal and we are doing that.)
Simply put, this is not true at all. Most of the websites today run happily on a single box and the choice of architecture is frankly irrelevant because of the abysmal amount of traffic they get.
Large sites, and by large I mean the top 100, are typically complex systems. Each one is different and bespoke. Making broad generalizations doesn't really work, because typically they have parts that need to be scaled up, parts that need to be scaled out, etc.
Finally, if you read the article you would know that we already use a load balancer with multiple web servers. What we are discussing here is whether it's better to have 100s of cheap/cloud boxes versus a few big ones.
In our use case (we want fast response times) the second solution is demonstrably better.
And this is compounded by people who have little ability to troubleshoot performance issues. It's quite easy to hunt down the cruddy SQL queries in a DB, or realize that your spinning rust is too slow. But when it starts to come down to things like a blocking network fabric, which has some big fat buffer between two servers, that is killing your transaction speed - many will just start to blame the devs.
Of course, the second point is compounded by the first - the less systems thinking that went into the design in the first place, the harder it is to produce accurate hypotheses about the system in order to troubleshoot.
Whilst there is a lot to be said for the "right tool for the job", if you watch an artisan crafting something, you'll realize that despite having a huge number of tools, they actually get by with relatively few. This is a generalization, but they only use the full range when doing something new, solving a particularly tricky problem, etc. "All the gear and no idea" is certainly applicable in many start ups.
Also StackOverflow often practices "scale-up" instead of "scale-out".
Usually you want your solution to be just right for the problem; solve the essential complexity but don't introduce any accidental complexity.
I don't understand this at all. What does TDD have to do with reducing garbage collection?
ps. That's my guess, but I'd encourage you to post your question to the meta site for SO.
It's not a theoretical risk we're trying to avoid here, it's a real problem that needed solving.
Separating Controller, Repository, and Services is good practice as well and, let's be honest, we're looking at 3 layers of methods at most.
Here's what happened in Java:
1. When you deploy your WAR, the DI will inject necessary component _once_ (and these components only instantiated _once_ for the whole web-app so there you go, Singleton without hardcoding).
2. A request comes in, gets processed by the pumped-up Servlet (already DI-ed). If the servlet has not been instantiated, it will be instantiated _once_ and the instance of the Servlet is kept in memory.
3. Another request comes in, gets processed by the same pumped-up Servlet that is already instantiated and has already been injected with the component (no more injection, no more instantiation, no more Object to create...)
So I've got to ask this question: what GC problem are we trying to solve here?
Some of the static methods are understandable but if Component A requires Component B and both of them have already been instantiated _once_ and have been properly wired up together, we have 2 Objects for the lifetime of the application.
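The "wired once, reused for every request" lifecycle described in the steps above can be sketched like this. The class names are hypothetical and this is plain constructor injection, not any particular DI container.

```python
class Repository:
    instances = 0

    def __init__(self):
        Repository.instances += 1


class Service:
    instances = 0

    def __init__(self, repository):
        Service.instances += 1
        self.repository = repository


class Handler:
    """Stands in for the servlet: built once, reused for every request."""

    def __init__(self, service):
        self.service = service
        self.requests_handled = 0

    def handle(self, request):
        self.requests_handled += 1
        return f"handled {request}"


def wire_up():
    """Composition root: runs once at startup (the 'deploy' step)."""
    return Handler(Service(Repository()))


handler = wire_up()
for n in range(1000):
    handler.handle(n)

print(Repository.instances, Service.instances, handler.requests_handled)   # 1 1 1000
```

The wiring itself adds no per-request objects. Whatever garbage a request produces comes from the work done inside handle, which is a separate question from how the components were injected.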
I'd pay for a wee bit extra hardware for the cost of maintainable code ;)
Discipline (and knowledge of the object graph) definitely helps to reach that point.
Rails and Django do things differently: to my knowledge they spawn processes instead of threads. There are app servers for Rails or Django that may use threads for efficiency/performance reasons, but I am under the impression the whole LAMP stack is still 1 request, 1 process (even though those processes are re-used from a pool of already allocated processes).
Not doing this will cause much self-righteous snickering from some.
At the risk of exposing my slow transmutation into a hipster programmer, I and a colleague found that mocking in Go was much easier than we anticipated, thanks to the way interfaces work.
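The comment is about Go, where any type with the right methods satisfies an interface. A loose Python analogue uses typing.Protocol, sketched below with made-up names (Clock, FixedClock, seconds_since).

```python
from typing import Protocol


class Clock(Protocol):
    """The dependency, described only by the method the caller needs."""

    def now(self) -> float: ...


class FixedClock:
    """Test double: satisfies Clock structurally, without inheriting from it."""

    def __init__(self, value: float):
        self.value = value

    def now(self) -> float:
        return self.value


def seconds_since(start: float, clock: Clock) -> float:
    """Code under test depends on the small interface, not a concrete clock."""
    return clock.now() - start


print(seconds_since(10.0, FixedClock(12.5)))   # 2.5
```

Because the interface is defined by the consumer and kept small, the mock is a few lines and needs no mocking framework, which seems to be the experience described above.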
It's frustrating to see time wasted obsessing over trying to maintain eventual consistency (or chasing bugs when you failed to do so) on systems that could quite happily run on a single, not that beefy machine.
Forgive me if I am misunderstanding you - but non-trivial applications can actually require scale out.
From my perspective, StackExchange is not technically that complex. They have built a very efficient, cost-effective and performant stack for their singular application and that works very well for them, but their forum is not an extraordinarily complex problem.
The fact is, the vast majority of projects that programmers are working on are less computationally complex than stack overflow. That's not to say that forum software is all that complex, more that most problems are pretty simple. Of course there are real reasons to use scale out - I simply advocate thinking hard about whether your problem will ever truly need it before taking the substantial complexity hit of coding for it.
Isn't that comparing apples to oranges? Taking any application built for large servers and putting the same application onto a cloud-based architecture will be more expensive.
Cloud-based architecture requires ground up differences in how the application is built. Now whether or not it is better to use a cloud-based approach or traditional bare metal is highly subjective and isn't my point.
On another site, in another context, it probably would be, but here it's really presented as a contrast of scale-up versus scale-out - something the regular audience of highscalability will certainly grok.
'Stack Overflow still uses a scale-up strategy. No clouds in site. With their SQL Servers loaded with 384 GB of RAM and 2TB of SSD, AWS would cost a fortune. The cloud would also slow them down, making it harder to optimize and troubleshoot system issues. Plus, SO doesn’t need a horizontal scaling strategy. Large peak loads, where scaling out makes sense, hasn’t been a problem because they’ve been quite successful at sizing their system correctly.'
It's an acknowledgement of their relatively unique strategy and the short list of caveats that make it possible.
And they've moved SSL termination to it.
That's a great quality product, and easy to setup.
Edit: I work on software that supports MySQL, PG and SQL Server. SQL Server seems to be the most stable and consistent in performance - they're hard to kill! One of the few MS products I like :D
Failures have not been a problem, even with hundreds of intel 2.5" SSDs in production, a single one hasn’t failed yet. One or more spare parts are kept for each model, but multiple drive failure hasn't been a concern.
All of those 2.5" Intels though, still trucking along! We're looking at some P3700s PCIe NVMe drives now, blog post coming about that.
For comparison, for a while we nearly had a 1 to 1 dev to killed (consumer) SSD in work machines (see: http://blog.codinghorror.com/the-hot-crazy-solid-state-drive...). Granted, we did adopt them (see: http://blog.codinghorror.com/the-state-of-solid-state-hard-d...) for local dev use much earlier than we started putting them in servers.
I've killed 2 in my time here, though I do horrible big-ish data things that are either not SSD friendly or actively hostile.
That said, all the SSDs I've had have taken repeated pounding with no complaints. I accidentally swapped a few terabytes to swap on an SSD, and was simply surprised that the job I was doing finished faster.
If nothing else, it keeps less data in each table, so queries should be faster due to smaller datasets.
Sounds like they know what they are doing.
for example, do not put every single customer of an saas solution into a separate database.
however, inexperienced developers and teams are not good enough, or fast enough, to do that, and end up painting themselves into a corner when they have thousands of live customers to support and need to change their entire application architecture + database schema when their backup, replication, and housekeeping tasks choke on 1,000+ databases.
and oftentimes, one-customer-per-db also means one-instance-of-application-per-customer, another anti pattern to avoid.
we see this kind of stuff ALL the time. it happens a lot - people make terrible decisions and then are stuck with them 5 years down the line and are looking at a monumental cost to redo. not everyone is experienced enough to work their way out of an awful situation like that.
With higher-capacity DDR4, Haswell or even Broadwell Xeons, and PCIe SSDs getting cheaper and faster, it is not hard to imagine they could handle 3-4 times the load when they upgrade their servers two years down the line.
But I would love to see Joel's take on it though, since he is a Ruby guy now, and I don't think you could even achieve 20% of SO performance with RoR.
Hardware is cheap, programmers are expensive! But I am sure there is a line where this crosses over though.
Is performance _really_ an issue here?
For SO, if we guess they max out around 2000 req/sec, then if a bunch of objects are being allocated for each request, there could be added GC pressure. From my own experience developing much higher-scale managed code, allocating objects is a real pain and people will take a lot of effort to avoid it.
public static decimal DiscountValue(Order order)
public static decimal DiscountValue(decimal orderValue, decimal orderDiscount)
public static decimal DiscountValue(int orderId)
It's a bit of a contrived example, as you're more likely to do the third method if you're trying to get some sort of summary data about an order out which would mean you don't really need/want to load the whole order object.
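The three signatures above, sketched in Python so the difference in what each caller has to load is visible. The discount formula (value times a fractional discount) and the in-memory ORDER_SUMMARIES lookup are assumptions made for this illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Order:
    id: int
    value: Decimal
    discount: Decimal   # assumed to be a fraction, e.g. Decimal("0.10")


def discount_value(order_value: Decimal, order_discount: Decimal) -> Decimal:
    """Second form: needs only the two values, no object at all."""
    return order_value * order_discount


def discount_value_from_order(order: Order) -> Decimal:
    """First form: the caller must already hold a fully loaded Order."""
    return discount_value(order.value, order.discount)


# Stand-in for a query that selects only the two columns needed.
ORDER_SUMMARIES = {42: (Decimal("200.00"), Decimal("0.10"))}


def discount_value_by_id(order_id: int) -> Decimal:
    """Third form: looks up just the summary data by id."""
    value, discount = ORDER_SUMMARIES[order_id]
    return discount_value(value, discount)


order = Order(id=42, value=Decimal("200.00"), discount=Decimal("0.10"))
print(discount_value_from_order(order) == discount_value_by_id(42))   # True
```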
This means that normally, you want to avoid data structures that introduce a lot of objects trees, linked lists and stuff like that. In some circumstances you might lose some speed, but win big time when the full GC comes. Additionally, It may be worthwhile to consider avoiding heap allocations by organizing your data as value types instead of ordinary objects.
The vast majority of programmers need not worry about this, but if you're putting millions of objects through a single app domain in a short time, then it's a concern you should be aware of. This applies to pretty much any managed language.
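Since this applies to pretty much any managed language, here is a rough Java sketch of the idea. Java has no user-defined value types, so primitive arrays stand in for what would be structs in C#. The class and method names (FlatLayoutSketch, Point, Points and so on) are made up for illustration, and this is not a benchmark.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch: the same data stored pointer-heavy and flat.
class FlatLayoutSketch {
    // Pointer-heavy layout: one heap object per point, plus one list node per point.
    static final class Point {
        final double x;
        final double y;

        Point(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    static List<Point> buildLinked(int n) {
        List<Point> points = new LinkedList<>();
        for (int i = 0; i < n; i++) {
            points.add(new Point(i, 2.0 * i));
        }
        return points;
    }

    // Flat layout: two primitive arrays, so the number of heap objects
    // stays fixed no matter how many points are stored.
    static final class Points {
        final double[] xs;
        final double[] ys;

        Points(int n) {
            xs = new double[n];
            ys = new double[n];
        }
    }

    static Points buildFlat(int n) {
        Points p = new Points(n);
        for (int i = 0; i < n; i++) {
            p.xs[i] = i;
            p.ys[i] = 2.0 * i;
        }
        return p;
    }

    static double sumLinked(List<Point> points) {
        double total = 0;
        for (Point p : points) {
            total += p.x + p.y;
        }
        return total;
    }

    static double sumFlat(Points p) {
        double total = 0;
        for (int i = 0; i < p.xs.length; i++) {
            total += p.xs[i] + p.ys[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        // Same result either way; only the number of objects the GC must trace differs.
        System.out.println(sumLinked(buildLinked(n)) == sumFlat(buildFlat(n)));
    }
}
```

With the linked version a full collection has to walk roughly two million small objects. With the flat version it sees a handful of large arrays with no references inside them.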
var a = InitGiantArray<Foo>();
var b = InitGiantArray<Bar>();
do {
    // Allocate short-lived garbage forever while the giant arrays stay live.
    Tuple.Create(new object(), new object());
} while (true);
Also the SSDs are funny to me because I recently had someone who was supposed to be an expert say that SSDs were a fatal mistake with a high likelihood of massive failure.
Or, to put it another way: 1 server per 8 pageviews per second.
Does not seem that efficient when you put it like that. Even a resource hog like a wordpress install can eclipse that level.
*made correction- still not impressed
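For reference, the back-of-the-envelope arithmetic behind a figure like that can be sketched as below. The 560M pageviews per month is an assumption taken from the headline number under discussion (it does not appear in this part of the thread), the 25 servers is the count mentioned elsewhere in the thread, and a 30-day month is assumed.

```java
// Hypothetical sketch of the averaging behind "pageviews per second per server".
class PageviewsPerServer {
    static double perServerPerSecond(long monthlyPageviews, int servers, int daysInMonth) {
        double secondsInMonth = daysInMonth * 86_400.0; // 86,400 seconds per day
        return monthlyPageviews / secondsInMonth / servers;
    }

    public static void main(String[] args) {
        // Assumed inputs: 560M pageviews/month, 25 servers, 30-day month.
        double rate = perServerPerSecond(560_000_000L, 25, 30);
        System.out.println(rate + " pageviews/s per server (monthly average)");
    }
}
```

Note that this is a monthly average spread evenly across every box. It says nothing about peak load, and it counts database, cache, and search servers the same as web servers.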
Does this mean the 2 Redis slaves, 2 SQL replicates etc. are in a secondary data center?
For redis both data centers have 2 servers, the chain would look like this typically:
Here's a screenshot of our redis dashboard just a moment ago (counts are low because we upgraded to 2.8.12 last week, which needed a restart): http://i.stack.imgur.com/IgaBU.png
They say that hardware is cheaper than programmers. Wouldn't that speak for the cloud solution (software based scaling instead of admins)?
Could you do it well if building for the cloud from day one? Yeah sure, I think so. Could you consistently render all your pages, performing several up-to-date queries and cache fetches across that cloud network you don't control, and get sub-50ms render times? Well, that's another matter, and unless you're talking about substantially higher cost (at least 3-4x), the answer is no - it's still more economical for us to host our own servers.
And yes I realize that's a fun claim to make without detail. As soon as I get a free day (I think there's one scheduled in 2017 somewhere), I'll get with George to do another estimate of what AWS and Azure would cost and what bottlenecks they each have.
If you happen to be familiar with the MS stack and are able to factor the license fees into your business plan there is no strong reason against and many for using that particular stack.
MS programmers tend to be a bit more expensive than people working with one of the many Linux offerings but that's not such a huge difference that it would become the deciding factor and if anything the MS stack is more performant than the Linux one on the same hardware (and I write that as an anti-MS guy, I strongly believe that politics and tech should be kept separate when it comes to discussing systems relative to each other).
So even if there are lots of reasons why I would not have made that particular choice I can completely understand why the SO people made their choice the other way.
Could you please provide some evidence to support that?
Keep in mind that the webserver is hardly ever the bottleneck.
My theory on why this is the case is very simple: MS can afford to throw vast amounts of money at optimizations that are next to impossible in Linux simply because the coupling between the layers in Linux is looser. And that's a good thing, it translates into better security and fewer bugs.
As always, optimization alone is not reason enough to go down a certain path. But for raw speed on requests it's fairly hard to beat IIS, if that's what you're after (I'm usually not, and even when it matters I can comfortably saturate most outbound links from a single server doing light processing, and as soon as the processing becomes the bottleneck the CPU cache size, RAM and so on matter more than your OS, but given identical hardware a 'dirty' approach should yield better results at a cost of unreliability/complexity).
It turns out there are kernel-mode webservers (or there were, at least), which seriously outperform user-mode ones.
https://www.usenix.org/legacy/events/usenix01/full_papers/jo... (page 11; article is old, but anyway)
This is provably not true.
There is a good reason the majority of the World-Wide-Web is run on 'Nix stacks -- and there is a good reason the majority of servers in general are 'Nix stacks. Also, an overwhelming percentage of super-computers (I know we are talking about webservers, but it illustrates performance capabilities) are running a 'Nix stack.
There is also cost involved, as well as flexibility of the stack. A 'Nix stack is infinitely flexible, which is not so much the case with a Microsoft stack. Also, the default MS stack comes with a lot of additional OS overhead that is not present in the 'Nix stack, which reduces the scalability of any given box.
Without trying to turn this into some sort of flame war - I was merely trying to suggest that the MS stack was not/is not the best choice for a highly scalable website. Take the top 10 websites -- they all run on 'Nix. The top cloud providers (except MS Azure), all 'Nix. These are companies that can easily afford MS licensing, so that's not part of the equation.
A MS stack at this scale is unusual to say the least.
That's not to say it won't scale (as evidenced by the SE team), but it doesn't mean there isn't a better alternative that saves more money and scales better with less hardware, etc.
Which is a pretty smart decision. Whether an MS stack at this scale is unusual does not say anything at all about whether or not it performs well.
However, I have to disagree with your assertion that the 'Nix stacks are less performant than the MS stack. The top tier web companies are not running 'Nix because TCO is lower; for most of these companies licensing costs are negligible and if it helped to scale better, it may even save them money going with a MS stack... but they don't go with a MS stack...
Sometimes a RHEL license can even cost more than a MS license. Couple that with an Oracle DB back-end, and you easily have a much more costly setup than the MS equivalent. It's not about the money.
These companies are choosing the 'Nix stack because it performs on an entirely different level than the MS stack, on everything from tiny embedded systems with 64k of RAM up to monster systems with TBs of RAM.
> "I Contribute to the Windows Kernel. We Are Slower Than Other Operating Systems. Here Is Why."
IMO, it has one of the best tooling sets of any platform out there (Visual Studio) which saves lots of programmer time and increases productivity. With BizSpark the initial cost argument is out...so how's that not pragmatic?
I'm biased of course, but I think it worked out pretty well.
* Currently three years of free licenses, all software you download during that time is free forever, and discounted MSDN subs thereafter.
: dealing with the problems that exist in a specific situation in a reasonable and logical way instead of depending on ideas and theories
> Microsoft infrastructure works and is cheap enough
not as cheap as just paying your team to maintain the boxes. I wonder how many times in the past few years SE has needed to call Microsoft for support? Would a RHEL license still be cheaper (probably)? If something like CentOS then there would be no support cost unless you need to bring in an outside contractor (for a particularly nasty issue).
I guess you're gonna have to define "more appropriate" for me, I have a feeling we'll have a fundamental disagreement there. We could (and have) run the entirety of our peak load with 1 SQL server and 2 web servers (and not pegging them). Don't forget we run with a crazy amount of headroom at all times, by design. I'm not sure how much better you picture that scaling on a linux environment.
> not as cheap as just paying your team to maintain the boxes
That makes a lot of assumptions about the team, their expertise, and what support issues would arise.
> I wonder how many times in the past few years SE has needed to call Microsoft for support
In the past 4 years I've been here? Twice. Both to report bugs in CTP versions of SQL server. At no cost, and we improved SQL server as a result of being testers. We have a very good relationship with Microsoft and talk to the developers that make the tools and platform we use in order to make life better for both of us. We do the same thing for redis, Elasticsearch, etc. It's the same reason we open source almost all of the tools we make.
> If something like CentOS then there would be no support cost unless you need to bring in an outside contractor (for a particularly nasty issue)
We use CentOS for all our linux systems and are deploying new servers on CentOS 7 now. We'll be migrating the others in the coming months. That doesn't mean it's free. Developer or sysadmin time to control Puppet deployments and such still eats some amount of time.
Even if it was the best choice at the time to go the Microsoft route, it may not be in the future... however -- SE has zero choice now otherwise they'd have to re-write their entire product... that sucks as a business because you have little choice over your own product now.
It seems like a bit of a waste of resources to try and run an internal bespoke application on every platform imaginable. Now, if you are developing a product that you are then going to sell to other people to run on their own hardware, this makes more sense.
There are some very big benefits to writing to a specific platform, especially in the performance space.
At the end of the day it is a trade-off between trying to eliminate every 3rd party dependency (next to impossible) or picking a solution or company you think will be around for a long time, and forging strong relationships with them.
We regularly talk to people at Microsoft (and all of the people who create and build the tools we use), give them feedback, and get bugs fixed. There is very little that comes down the pipe from them that we are not aware of ahead of time, and in some cases we have helped shape it through early access programs.
> SE has zero choice now otherwise they'd have to re-write their entire product...
This is not true at all, we have choices if MS decided to blow everything up. Not great choices, but we have them. Choosing between two or three crap options does not mean that options do not exist.
-George (SE Sysadmin)
Why is the Microsoft stack not appropriate? I am not aware of any spectacular performance difference between Apache and IIS, or Windows Server 2012 and Linux, and clearly the SO team is not religiously Microsoft, so it's hard to see why they wouldn't have switched if it was a clear win. MSSQL is actually, in my experience coming from MySQL, amazing - I'm back on MySQL and Postgres now and I often wish I wasn't. I'd say it's the most underrated of MS's products, probably because of the expense of the Enterprise/Datacenter editions.
We... probably need to make that link a bit more obvious.
(I'm an SE SysAdmin)
I would love that problem.
This really needs to stop. Go to stackexchange.com and you'll find that more than half of the HTTP requests are to cdn.sstatic.net.
Looking up the IP addresses for cdn.sstatic.net returned five entries for me, all owned by CloudFlare. None of the CloudFlare servers that they are using seem to be in that 25 count.
Sure, these are all for static assets, that isn't the point. There are way more than 25 servers being used to serve the StackOverflow sites.
Sure, we use a CDN, but not for CPU or I/O load, but to make you find your answers faster.