However, I'm starting to think that all of the advocates of various frameworks are now conspiring independently to make this comparison meaningless...any framework (except Cake for some reason) can be superoptimized towards a small set of tasks. If you do another round, could you increase the number of different tasks? Some examples could be:
1) Mixed bag of queries of various complexity
2) Static file serving
3) A few computation/memory-intense benchmarks (such as those in the Language Benchmarks Game)
As Pat points out, we definitely look forward to implementing some more computationally-intense request types in the future. This round does include the first server-side template test. We'd like to hear the community's opinions about more tests.
That said, I feel most of the frameworks' implementations of the existing tests are not cheating. Our objective in this project is to measure every framework with realistic production-style implementation of the tests. No doubt there is temptation to trim out unnecessary functionality and focus on the benchmark's particular behavior. We have attempted to identify any such tests that remove framework features to target the benchmark as "Stripped" and those can now be filtered out from the list.
In other words, our aim is that the implementation of each framework's test is idiomatic to that framework and platform. And if that's not the case for a test, we want to correct it.
Your concern could be clarified by pointing out that framework authors may be tuning up their JSON serialization, database connection pools, and template processing in order to improve their position on these charts. And, to be clear, I have already seen evidence of that in my interaction with framework authors. To that concern, however, I would say: That is awesome. I want those features to be fast.
I can also say that this benchmark inspired me to take a hard look at class loading and I was able to make some improvements to the framework's efficiency in general. So, in a way, I did some tuning - not for the benchmark, but rather as a result of the benchmark. Thanks to this benchmark all Phreeze users will gain a little performance.
I would also like to suggest a test idea. I think the biggest challenge for frameworks comes into play when you have to do table joins. Something like looping through all purchase orders and displaying the customer name from a 2nd table - that would be a very real-world type of test. I think foreign key type of queries are more telling about an ORM than a single table query.
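To make the suggested test concrete, here is a sketch of the shape of query an ORM would need to generate (a hypothetical schema in Python/sqlite3, purely illustrative; the names `customer` and `purchase_order` are made up):

```python
import sqlite3

# Hypothetical two-table schema: purchase orders referencing customers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase_order (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        total REAL
    );
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO purchase_order VALUES (1, 1, 99.5), (2, 2, 10.0), (3, 1, 5.25);
""")

# Loop through all purchase orders, pulling the customer name from the
# second table -- the foreign-key lookup an ORM would have to express.
rows = conn.execute("""
    SELECT po.id, c.name, po.total
    FROM purchase_order AS po
    JOIN customer AS c ON c.id = po.customer_id
    ORDER BY po.id
""").fetchall()

for order_id, customer_name, total in rows:
    print(order_id, customer_name, total)
```

An ORM that issues one extra query per order row (the classic N+1 problem) will behave very differently here than one that generates the single join above, which is exactly what would make such a test telling.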
Some readers may feel we are attempting to paint some frameworks in a poor light. Yes, we do have favorites, but we are absolutely intent on keeping this open and fair. If we're doing something wrong, help us fix it! A pull request is very happily received.
When I read reactions of that sort, I selfishly want to point the author to Jakejake's comments to demonstrate how awesome it is to see a framework improving. Speaking of that, I want to eventually have the ability to show performance over time (e.g., compare Round 1 to Round X) as a potentially interesting illustration of a framework's intent to improve performance.
Also, thanks for the idea for a future test. That sounds like a good one.
One example is that the framework loaded a lot of MySQL classes whether or not you do a DB query. So, now I wait to initialize the DB stuff until after you make a call that requires it. Phreeze has always been lazy about opening the DB connection, but now it's even lazier and doesn't even load the classes until you need them!
There were some other utility-type classes like XML parsing and such that probably don't even get used much. So that is lazy loaded now too.
For a non-DB request I was able to get it down from about 37 files that loaded to around 20. For a DB request I think it's still around 30 files, but I definitely consider that a performance improvement. The benchmark led me to scrutinize what is being loaded so I think it has already improved the framework.
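The pattern generalizes beyond PHP. Here is a minimal sketch of the same lazy-loading idea in Python (Phreeze itself is PHP; the `LazyLoader` class below is a made-up illustration of the technique, not Phreeze code):

```python
import importlib


class LazyLoader:
    """Defer importing a module until one of its attributes is first used.

    Analogous to deferring class-file loads until a request needs them:
    the import cost is paid only on the code path that requires it.
    """

    def __init__(self, module_name):
        self._module_name = module_name
        self._module = None

    def __getattr__(self, name):
        # Only reached for attributes not set in __init__, i.e. the
        # wrapped module's own names; import on first such access.
        if self._module is None:
            self._module = importlib.import_module(self._module_name)
        return getattr(self._module, name)


# The heavy module is not imported until a request actually needs it.
lazy_json = LazyLoader("json")
print(lazy_json.dumps({"lazy": True}))
```

A non-JSON request pays only the cost of constructing the tiny wrapper; the real import happens on first use.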
Except now it is clear that you are refusing optimizations for some frameworks due to a vague, aesthetic judgment of 'stripped', which means that you actually aren't measuring the minimum framework overhead. You are measuring the overhead of the defaults, or the overhead of not taking optimization seriously, with large amounts of performance left on the table. Worse, selectively applying optimizations means you are comparing one framework's defaults to another framework's minimum overhead. And since you have abandoned minimum overhead, there is very little sense in measuring performance independent of normal first-resort tactics like caching (who is running Cake without caching?).
If you were going to do that, you should have benchmarked defaults right down the line and allowed a full, normal range of simple deployment optimizations. Instead we have selective optimization and totally unrealistic deploys, so it really indicates very little.
I'm not sure where you get the impression that we are refusing tuned tests (what we call "Stripped" tests). We have accepted two of those and would accept further tests of that nature. An implementation of course still needs to work and meet the obligations of the test scenario. For example, each row must be fetched from the database individually and the response must be serialized JSON. We did "reject" one test that fetched all 20 rows using a WHERE IN clause, but that implementation was quickly reconfigured by the submitter to match our specification.
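To illustrate the distinction, here is a sketch of the two query patterns (Python/sqlite3 with a made-up `world` table loosely modeled on the test's schema; the real tests run 20 random ids against a full database server):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE world (id INTEGER PRIMARY KEY, randomNumber INTEGER)")
conn.executemany("INSERT INTO world VALUES (?, ?)",
                 [(i, i * 7 % 10000) for i in range(1, 10001)])

ids = [42, 512, 9001]  # the real test uses 20 random ids

# What the specification requires: each row fetched individually.
per_row = [conn.execute(
    "SELECT id, randomNumber FROM world WHERE id = ?", (i,)).fetchone()
    for i in ids]

# The rejected shortcut: all rows batched into a single WHERE IN query.
placeholders = ",".join("?" * len(ids))
batched = conn.execute(
    f"SELECT id, randomNumber FROM world WHERE id IN ({placeholders})",
    ids).fetchall()
```

Both produce the same data, but the batched form makes one round trip instead of twenty, which is why the specification mandates the former.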
We are expressly not including reverse proxy caches in these tests. We're not benchmarking the performance of the nginx proxy cache, Apache HTTPD's proxy cache, Varnish, or anything similar. You can find such benchmarks elsewhere. We are benchmarking the performance of the application framework for requests that do reach the application server. The tests are intended to be a viable minimum stand-in for application functionality in order to fulfill requests that, for whatever reason, reach your application server.
If the scenario is difficult to conceive, imagine your site cannot leverage a proxy cache because every request is providing private user information.
To be clear: none of the frameworks are being tested with a front-end cache.
Also presently, none of the tests use a back-end cache either, but future tests will include tests of back-end in-memory and near-memory caches.
For example, Yesod has client session and logging disabled. I'm also sure that quite a few frameworks have logging disabled.
Does that not count as "stripped" since it deviates from the norm for deployment?
These are very good points you bring up and I will need to address them in the site's FAQ in addition to this response. I would appreciate any follow-ups as I am open to revising the opinions I include below.
First, if there are any specific examples of frameworks that have been mis-characterized, I would appreciate that we address each individually as a GitHub issue. For example, I will create an issue to discuss the Yesod test and its session configuration.
Here is our basic thinking on sessions. None of the current test types exercise sessions, but if the test types were changed to make use of sessions, session functionality should remain available within the framework.
If a particular test implementation/configuration has gone out of its way to remove support for sessions from the framework, we consider that Stripped. If session functionality remains available but simply isn't being exercised because the test types we've created to date don't use sessions, then at least with respect to sessions, that is Realistic.
Logging is an important point that we need to address. We intentionally disabled logging in all of the tests we created and will need to be careful to review the configuration of community-contributed tests to do the same.
You're correct, disabling logging is not consistent with the production-class goal. So, why did we opt to disable logging? A few reasons:
* We didn't want to deal with cleaning up old log files in the test scripts.
* We didn't want to deal with normalizing the logging granularity across frameworks. (Or deal with not doing so.)
* In spot checks, we didn't observe much performance differential when logging is enabled.
We're not unmovable on logging, however, and if there is sufficient community demand, we would switch to leaving logging enabled.
As for sessions, I just used Yesod as an example, but it applies to all frameworks and other "middleware" as well, and this is something I am mixed on. Some platforms do not support any middleware at all, so should these be classified as "stripped" or "barebones" as well? What I'm getting at is: is this really a fair comparison? From a glance at the benchmark page, it is not apparent which frameworks have which configuration or feature if you're not familiar with the framework itself, and it can get really complicated. I think labeling the frameworks in terms of size is a huge step in the right direction, but my belief is that more information is needed.
A new "Fortunes" test was also added (implemented in 17 of the frameworks) that exercises server-side templates and collections.
With 57 total frameworks being tested, we have implemented some filtering to allow you to narrow your view to only those you care about.
As always, we'd really like to hear your questions, suggestions, and criticisms. And we hope you enjoy this latest round of data.
Also, when measuring latency, average and standard deviation are only relevant if the distribution is Gaussian. Which is unlikely.
Better to show percentile-based measurements, like 90% of all requests served in 5 ms and 99% of requests served in 15 ms.
See Gil Tene's talk "How NOT to Measure Latency" for more info. Also be sure you are not falling into the "coordinated omission" trap, where you end up measuring the latency wrong.
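For anyone who wants to compute these from raw samples, here is a minimal percentile sketch (plain Python with simulated latencies; this is not the project's actual tooling, and the nearest-rank method shown is only one of several percentile definitions):

```python
import random


def percentile(samples, p):
    """Nearest-rank percentile: value at or below which p% of samples fall."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1,
                   int(round(p / 100.0 * len(ordered))) - 1))
    return ordered[k]


random.seed(1)
# Simulated per-request latencies in ms: mostly fast, a few long stalls.
latencies = ([random.gauss(5, 1) for _ in range(990)]
             + [random.uniform(40, 60) for _ in range(10)])

print("p50 =", percentile(latencies, 50))
print("p90 =", percentile(latencies, 90))
print("p99 =", percentile(latencies, 99))
```

Reporting p50/p90/p99 like this surfaces the slow tail that an average of the same samples would smear away.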
Thanks for the feedback! We started the project with weighttp, then starting with Round 2 we switched to Wrk at the advice of other readers. Wrk provides latency measurements consisting of average, standard deviation, and maximum.
See the earlier conversation about standard deviation here: https://news.ycombinator.com/item?id=5455972
If we had distribution data available, we would aim to provide that in some form. And perhaps the author of Wrk could add that in time.
However, for the time being, I consider the matter somewhat academic. Not to be dismissive--I value your opinion--but I don't believe that would measurably impact my assessment of each framework's performance. Though, it would be fascinating to be able to validate my suspicion that Onion, being written in C, does not suffer even the tiny garbage collection pauses of the Java frameworks.
Perhaps you could upvote or something?
Thanks for all the great work in these benchmarks. A useful resource.
Technically not true. Knowledge of the second-order moment (variance) lets you uniquely identify other distributions like Poisson or uniform. Knowledge of even higher-order moments lets you fit more complicated statistical models.
Low variance is good, regardless of underlying distribution.
You might have one clump of fast responses when no GC occurs, another when some GC occurs, and a smaller clump where a stop-the-world full GC has occurred.
In such a case average is not meaningful.
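A small made-up example shows why (the numbers are hypothetical, chosen to mimic a GC-pause clump):

```python
# Two clumps of hypothetical latencies: 95 fast responses at 4 ms and
# 5 stop-the-world GC pauses at 200 ms.
fast = [4.0] * 95
gc_pauses = [200.0] * 5
latencies = fast + gc_pauses

mean = sum(latencies) / len(latencies)
median = sorted(latencies)[len(latencies) // 2]

print(mean)    # 13.8 ms -- a latency almost no request actually experienced
print(median)  # 4.0 ms -- what a typical request actually saw
```

The mean lands between the clumps, describing essentially none of the real requests, while the median (and higher percentiles) describe each clump directly.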
Mono Issue #1: since the vast majority of ASP.Net websites run on Windows, a Mono performance test, even if accurate, is going to be of dubious value.
Mono Issue #2: since Mono is nowhere near as polished as Microsoft's .Net implementation, the numbers won't really be meaningful.
Windows Issue #1: if you do the test on a different OS than every other test implementation, the results really won't be comparable in any fair way.
Microsoft Issue #1: I don't know if it still holds nowadays, but in the past the official EULA for .Net prohibited publishing benchmark results. Period.
I am a .Net developer, and as much as I like ASP.Net, I don't think the effort of adding a .Net implementation would really pay off.
I know that I wouldn't mind switching to a Windows + .Net environment if it proved to be much, much faster than what I'm using right now.
I have a friend who went to a Mono talk at MS MIX where a Mono developer was speaking. The Mono developer said that while Mono is a little slower than .NET (and he was talking a couple percent) Mono often ends up being faster on the same hardware because the Linux system calls the runtime uses are faster. There were a few ROFLs in the audience.
I also agree with you that investing in a .NET (Windows) test isn't good bang for the buck here.
Further, even if Mono itself is as fast as the .Net framework, IIS the web server is going to have totally different performance characteristics from whatever web server you are using on Linux.
I am glad you showed me the error of my ways, because I would have guessed .Net was somewhat faster than Mono, but it goes to show even more that comparing across operating systems is meaningless...
Now, I would love to get my hands on the numbers, don't get me wrong; I'm just saying that if I were running the project, I wouldn't go through the amount of effort required to get the .Net results.
Edit: yeah right Mono and w/e
"As with the previous question, we'd love to. We have heard tentative word from a reader/contributor that a pull request may be incoming soon that will include several .NET frameworks on Mono, which we assume will be as easy to include as any other pull request. One challenge we face is that the test infrastructure we've built assumes a Linux deployment that we can automate using ssh and Python. To do a proper .NET test on Windows Servers, we will need to work on adapting that platform to automate Windows Servers as well. Community assistance on this would be greatly appreciated."
We also want to test on .Net's native Windows platform. But we need to work on the testing platform we've built in order to automate a Windows server in the same way we presently automate a Linux server.
Benchmarks like this are designed to be the starting point of a discussion and an investigation, not anything meaningful in their own right. Boiling a framework down to one performance number ignores the many, many nuances of a framework.
What surprises me most is the difference between different frameworks. A few years ago the mantra seemed to be "Use Rails, Django or a similar full-stack framework. Speed of deployment trumps everything!" Over the last few years I've seen a shift as people are trying to get more performance from limited hardware. Personally I'm intrigued by how a fairly innocent decision early in the project (of what language/framework) may have profound performance implications in the long run.
For myself, I've been looking for a good functional-programming framework. Just looking at this gives me a good list of frameworks to start looking at. It feels to me that a framework that performs well is likely well engineered, so the ones that perform better will go at the front of my queue for investigations.
Part of that shift is also that other frameworks have learned and integrated a lot from Rails/Django. The productivity/time-to-launch gap isn't as significant as it used to be, so other factors like performance, compatibility with pre-existing infrastructure (eg for JVM-based frameworks), security, etc. are gaining more influence in the decision about what to use.
You're precisely right about how to put this data to use: as one point in a holistic decision making process. We address that in the Questions section of the site, in fact. That said, we are not reducing each framework to a single performance number. Our goal is to measure the performance of several key components of modern frameworks: database abstraction and connection pool performance, JSON serialization, list and collection functions, and server-side templates. We'd like to add even more computationally-intensive request types in future rounds.
So, no, we're not testing your (or anyone else's) specific application on each framework. But we are testing functions that your application is likely to use. You're still better off measuring the performance of your use-case on candidate frameworks before you start work, but perhaps you can first trim the field to a manageable number.
In the first round, we echoed your surprise at the spread--four orders of magnitude! I think the shifting winds of opinion come from the fact that today's high-performance languages, platforms, and frameworks are not necessarily more cumbersome to use for development than the old guard. As others have pointed out elsewhere in this thread, Go is not a terribly verbose language, and yet its performance is fantastic.
Has the era of sacrificing performance at the altar of developer efficiency ended? I'm not sure. But we have some data to add to the conversation.
I still don't find these benchmarks very useful. From the looks of the comments, a lot of you don't really either (even if you don't realize it).
For example, a lot of people in these comments want to correlate language speed with performance in these benchmarks, by arguing specific examples, but comparing almost any two frameworks/platforms in this "benchmark" is an apples to non-apples comparison, and the result is actually full of counter examples (faster languages performing more poorly). That should instantly tell you that this benchmark isn't telling you what you think it's telling you, and that you haven't really derived any value from it.
Perhaps the biggest reason I don't find value here is that every product here does wildly different things. It's like comparing wrenches to hammers to screwdrivers to 3D printers.
I also want to point out to people who say that this is a "comparison" of frameworks that it is emphatically not a comparison. What is the value of a framework? Is it speed? Atypically. And this "benchmark" tends to point at such cases as "being better" because they do better in this specific task. A framework/platform's value lies in features and abstractions. This does not compare those.
I will gladly build a "framework" in NodeJS that is only capable of doing the tasks in this benchmark as fast and with as little overhead as possible. You would NEVER use it in the real world, but it would be a beast at serializing JSON and making repeated database queries in an insecure fashion. But score here is the important factor, right?
1) If you see problems with a language you're an expert in, submit a pull request. I've never seen a benchmark done like this before, it gives everyone a chance to fix problems in their favorite framework/language.
2) It is a little bit of an unfair comparison between very low-feature frameworks and higher-level ones, but it gives you a good idea of what you're trading off on basic performance. For example, I thought our use of play1-java wasn't far off of servlet on basic tasks, but boy was I wrong, perhaps by 10x.
Should you read this list and pick the top thing on the chart? No. However, hard to argue this isn't interesting and useful information.
Otherwise I can see how someone would assume a simplified and misleading heuristic: "If I can process 1000 requests in 1 second, that means the server can handle 1000 requests/second. So if 1000 requests come in at once, they will all be processed in 1 second." Several things can happen: it could process them slower than one second, it could error out and die, or it could actually process them fast if it can scale across CPUs. That is where the gold is, if you ask me... Anyway, just my 2 cents.
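A toy single-queue model makes the point concrete (hypothetical numbers: one worker handling exactly one request per millisecond, i.e. 1000 req/s of throughput):

```python
# One worker, one request per millisecond: 1000 req/s of throughput.
service_time_ms = 1.0
burst = 1000  # requests arriving simultaneously

# With a single FIFO queue, request k waits behind the k requests ahead
# of it, so its completion time grows linearly with its queue position.
completion_times = [(k + 1) * service_time_ms for k in range(burst)]

print(completion_times[0])            # 1.0 ms for the first request
print(completion_times[-1])           # 1000.0 ms for the last request
print(sum(completion_times) / burst)  # 500.5 ms average wait
```

So "1000 req/s" does not mean "every request in a 1000-request burst finishes in 1 second-ish latency"; the tail of the queue waits the full second, and the average request waits half of it.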
Says someone (many someones) about every benchmark, ever. I've never seen a benchmark that yields universal praise, every one earning criticism from people who don't like the results.
What is the value of a framework? Is it speed?
This is clearly a benchmark of performance. Is that the single value of a framework? Of course it isn't. But you certainly shouldn't stick your head in the sand about it.
While for no or just one query it's slower than a lot of the other frameworks (due to PHP being slow to parse, start up, etc.), as soon as we have a lot of DB queries, the C interface to MySQL leaves the other frameworks in the dust.
The well-known PHP shortcomings aside, that's a nice example of optimizing for the things that matter most, especially for its common use cases (Wordpress, Drupal, etc.).
Disclaimer: mysqli does have async capabilities, but most people such as myself use PDO for its other benefits. And mysqli only works with MySQL.
With Servlet for example, a worker thread is chosen from Resin's thread pool and used to handle a request. The Servlet then executes 20 queries sequentially and returns the resulting list data structure. This is Servlet 3.0 but not using Servlet 3.0 async.
Async isn't making the top performers fast. Being fast is making them fast.
Some may issue queries in parallel and aggregate the results, blocking until everything is done. Others may run them sequentially, which is the simplest but slowest way.
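The difference between the two strategies is easy to sketch (Python/asyncio with a hypothetical `fake_query` standing in for a database round trip; none of the benchmarked implementations are actually written this way):

```python
import asyncio
import time


async def fake_query(i, delay=0.05):
    """Stand-in for one database round trip (hypothetical 50 ms delay)."""
    await asyncio.sleep(delay)
    return i


async def sequential(n):
    # Each query waits for the previous one: total time ~ n * delay.
    return [await fake_query(i) for i in range(n)]


async def parallel(n):
    # All queries in flight at once, aggregated at the end: ~ 1 * delay.
    return await asyncio.gather(*(fake_query(i) for i in range(n)))


start = time.perf_counter()
seq = asyncio.run(sequential(20))
seq_time = time.perf_counter() - start

start = time.perf_counter()
par = asyncio.run(parallel(20))
par_time = time.perf_counter() - start

print(f"sequential {seq_time:.2f}s, parallel {par_time:.2f}s")
```

Both return the same results, but the sequential version pays twenty round-trip latencies while the parallel one pays roughly one, which is why this implementation choice alone can swing a 20-query benchmark.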
That being said, above a certain scale and complexity level, you probably want the topology of your persistent data store hidden from your web request handlers anyway. For one thing, making requests to N backend shards from M frontend web workers starts to get bad when N and M are both large; for another, introducing really complex scatter-gather query logic into your request-handling pipeline can be a maintenance and debugging nightmare.
Introducing a proxy or data-abstraction service in between cuts down on the number of open connections and lets you change the data storage topology without updating frontend code.
Many people choose Ruby and figure, given that premature optimization is the root of all evil, they'll optimize later if needed.
That's like choosing between a farm tractor or a ferrari - and figuring if the tractor doesn't perform up to snuff, we'll add a spoiler (and given the 10x disparity between Java and Ruby in some of those graphs, if we throw out a 20mph top speed for a farm tractor, the ferrari analogy is actually rather spot on).
There are many good reasons to choose dynamic/interpreted languages - but always know you're giving up performance in exchange.
> That's like choosing between a farm tractor or a ferrari - and figuring if the tractor doesn't perform up to snuff, we'll add a spoiler
It's really not like that at all, because programming languages aren't like vehicles. In particular, with Ruby, one typical method of optimization is finding which bits of code are bottlenecks, and then optimizing those bottlenecks, often by replacing them with C (or, if the Ruby runtime being used is JRuby, Java).
Which I guess is like having your tractor turn into a Ferrari for the parts of work that involve going long distances on a road without towing something, but I think that kind of points out how bad even using the tractor/Ferrari analogy is.
At my job after benchmarking we've done things like break out computation heavy things into C/C++, and have been even eyeballing things like Go and the Lua/Nginx based OpenResty for small computationally heavy services.
In many cases this means rewriting what used to be a 3rd party library. The big question is usually around cost in time and if we want to have to maintain that knowledge for the long term. Most of the time it's cheaper to toss more servers at it - but for certain things - namely cases where latency is very important no amount of scaling out is going to make it faster.
It would be interesting to see the profile of some these benchmarks for the various frameworks to see where the bottleneck is.
Certainly they do it for Ruby apps in general. I don't think it's all that common for it to be a high-value proposition for web apps.
> Even if it happens to be part of some third-party library? And maintain a fork?
If it's an open-source third-party library that tends to get used in a way that is performance-critical, upstream will probably accept moving bottlenecks to (portable) C while maintaining the API, so it's unlikely that you'll need to take up responsibility for a fork.
> What if that performance-critical part is dependent on other parts in a non-trivial way?
If the call pattern is such that they are not part of the performance-critical part themselves, then the performance-critical part calls them through the regular conventions for calling Ruby from C.
If the call pattern is such that they are part of the performance-critical piece, well, I think the answer is obvious.
> It seems that an unanticipated replacement of some core functionality with a C library may involve a major rewrite
It might, but in the meantime you've got working code.
> and most Ruby teams may not have the expertise to do a good job maintaining a C code base any way.
If the team determines it needs expertise in a particular area that it doesn't currently have, then it should either develop that expertise or bring in people who have it. That's true whether it's particular domain expertise (e.g., building messaging systems) or particular technology expertise (e.g., C). That's part of the normal development of a team.
I don't think it's costly compared to available alternatives; I think it's generally an efficient alternative for the type of bottleneck that is actually related to implementation-language efficiency. I think, for most typical web apps, the bottlenecks are only rarely of that type, so that's generally not where the effort is going to be spent; but for the apps that do have bottlenecks of that type, it's quite an appropriate way of solving them.
> throw more hardware, write manually optimized Ruby, switch to a faster language/runtime
If writing manually optimized Ruby is an effective and cheaper solution, you aren't experiencing the class of bottlenecks that are related to implementation-language efficiency. Switching languages or runtimes for a component is a proper subset of the work of switching languages or runtimes for a project, so the latter isn't going to be less costly than the former. (It may, if language-related bottlenecks are pervasive, or if you have non-performance interests in the alternative language, have a bigger net payoff and be more cost-effective, but it won't be less costly. And it's inherently riskier to do all at once, since a component-wise transition gives you a faster cycle time in terms of realizing value even if you end up doing a full replacement in the end.)
> And if you have a complex application that utilizes many of Ruby's idioms to deal with the complexity, it's extremely unlikely that you can simply replace parts of it with C libraries without reorganizing in such a way to increase complexity.
I disagree. Anything you can do in Ruby you can do in API-equivalent C that can still call out to the exact same Ruby code for the functions that aren't being moved into C, so there is no reason at all for the kind of reorganization you suggest, particularly if you are building with loosely-coupled components in the first place.
If you are building a complex app and its all tightly coupled, you've got a big maintainability nightmare no matter what language you're using, and that has nothing to do with Ruby.
Yes, you can write Ruby in C, but it would be almost as slow as writing Ruby in Ruby. I don't really see the point of saying, you can do anything you can do in Ruby in C, it would be much more verbose and about as slow. The point is that true optimization may force you to do things that you can only do in C and there's no guarantee that this optimized version can be easily utilized from the rest of your Ruby code. This has nothing to do with tight-coupling - it's simply taking advantage of the language's abstraction facilities.
And no, having to write hand-tuned Ruby, as opposed to idiomatic Ruby, to get performance that can be had by writing, say, idiomatic Scala or Haskell is an indictment of slow implementations and prevents you from taking full advantage of the expressiveness provided by the language.
And that's before you get into things like your team may have to get bigger because you need a C/Ruby-extension expert, half the team not being able to understand a critical part of the code base (very few Ruby developers are reasonably competent in C), etc.
Again, the whole point is that Ruby's performance problems pose a real pain point. Yes, you can rewrite parts of it in C, yes you can mitigate by using gems written in C, yes, you can spend more time optimizing, yes you can throw more hardware. But all of those are costly and it's disingenuous to pretend that a problem doesn't exist simply because a workaround does.
None, on purpose. We want maximum portability, and so the Rails defaults are Ruby-only on purpose. Of course, it's easy to add gems that replace things that are written in C or Java, depending on what makes the most sense for your platform.
Why not - is that a limitation of OpenResty or of LuaJIT?
How would you turn on JIT compiler in OpenResty?
But to further add to the analogy, the tractor, the ferrari and the go-kart may all perform about the same if you're only traveling 1 inch.
Love me some analogies!
In other words, although interesting (and exceedingly well done) these benchmarks should have "surprised" no one. Not even the disparity between languages.
For all non-trivial apps, by the time you get 100 req/sec your bottleneck is very likely going to be your database.
This exercise aims to provide a "baseline" for performance across the variety of frameworks. By baseline we mean the starting point, from which any real-world application's performance can only get worse. We aim to know the upper bound being set on an application's performance per unit of hardware by each platform and framework.
But we also want to exercise some of the frameworks' components such as its JSON serializer and data-store/database mapping. While each test boils down to a measurement of the number of requests per second that can be processed by a single server, we are exercising a sample of the components provided by modern frameworks, so we believe it's a reasonable starting point.
So, yes, these benchmarks should not be the only factor in choosing a framework, but they do provide a possibly important data point (depending on the specific scenario).
Moore's law has made this sorta moot. Unless you're on Heroku, for a successful small-to-medium app, the dominant term in your hosting costs is going to be the salary of the engineer or sysadmin who tends to it.
(If you're on Heroku, then you start worrying about dynos because, with monitoring, you're paying $60 per "worker".)
This is to say, the cost in salary to properly shard a database probably outweighs a year or two of hosting for the extra two or three boxes you're spinning up; almost no one experiences explosive growth where you need to spin up dozens of new boxes overnight.
This benchmark is even less useful than alioth's shootout, I'm not sure why there is so much effort put into it :)
Case in point: "How We Went from 30 Servers to 2":
That doesn't negate the point though, language performance matters at certain scales.
The Go code size is pretty small, in fact it might be smaller than the Rails code... I'm still trying to find all the Ruby files, Go is in one file...
Mojolicious, Dancer and Kelp have set the bar for small code size for me. Not sure yet if there are smaller ones (note that there are no other files required for those apps, period)
In the same vein, Lua's OpenResty looks good, as do Tornado, Flask and Bottle (although you need to tease the raw/ORM methods apart to get an idea for the last two). And of course, Sinatra.
There are probably a lot more, especially for PHP, but I didn't feel like going through that list.
You've basically just listed Perl 3 times though. Particularly when the guts of the code in all 3 of those examples was Perl's standard database interface (the same DBI you'd use for CGI Perl or even standalone .pl scripts).
I do love Perl for the flexibility of its syntax and how concise the code can be. But for me the performance of Go won out. And while mod_perl* does make great gains in performance, it also makes the code a lot less portable (unlike Go). So I found myself porting my performance-critical webapps over to Go.
* I've not tried Mojolicious, Dancer nor Kelp so I couldn't comment on how they compare for performance.
> Particularly when the guts of the code in all 3 of those examples was Perl's standard database interface (the same DBI you'd use for CGI Perl or even standalone .pl scripts).
The benchmark page clearly tags which implementations use raw SQL access and which use an ORM; these all happen to use raw SQL. To my knowledge, none of them have a pre-bundled ORM, and I'm not sure whether the ORM-tagged implementations are only supposed to use a framework's pre-shipped ORM.
> But for me the performance of Go won out
I wasn't trying to imply they competed on that metric, I just wanted to give some examples of much simpler implementations. What one considers small is obviously relative.
> I've not tried Mojolicious, Dancer nor Kelp so I couldn't comment on how they compare for performance.
They all look to be in the bottom half of the full set of results, performance-wise. Mojolicious is quite a bit slower (relatively; they are all slow compared to Go) than the others, most likely because it uses its own internal, pure-Perl JSON module. There is a way to fall back to the optimized C-based JSON::XS module, but I'm not sure whether that would keep with the spirit of the benchmarks.
You missed my point. All of those examples you gave used the same core database framework, and as the test was primarily a database performance test, all three of those examples were essentially the same core Perl code.
Whether it's ORM or raw SQL is completely beside the point (though since we're on the topic, Perl's DBI basically works the same as Go's database interface - or rather, that should be the other way around, given their ages).
>I wasn't trying to imply they competed on that metric
Again, you missed my point. I wasn't suggesting that you were comparing the performance of the two. I was commenting on why I switched away from Perl to Go.
> I just wanted to give some examples of much simpler implementations.
Except you didn't. You gave AN example (singular). It was one language: Perl.
> They all look to be bottom-half of the full set of results, performance wise.
I wouldn't trust that kind of benchmark for comparisons of Perl frameworks as setting up a Perl environment isn't as simple as compiling a Go program. With Perl, you have a number of different ways you can hook the runtimes into the web server (CGI, Apache libs, etc), pure Perl and C libraries (which you also mentioned) that significantly affect both memory usage and runtime performance and a whole boatload of config ($ENVS in mod_perl, bespoke handlers, etc) that also affect performance.
The ironic thing with Perl is that despite scripts in the language being some of the most portable code in the POSIX community, running performance-critical Perl webapps leads to very unportable setups (which was the other reason I migrated my sites to Go).
This might sound critical, but I genuinely do love Perl. I'd say it was up there as one of my favourite languages (and over the years I've learned to develop in a great number of different languages). But sadly nothing in life is perfect.
I think we are talking past each other. I listed a lot of frameworks, including three in Python. I started with Perl and added a whole bunch more. I could, and should, have presented them better.
Personally I think the fact that they are using DBI is the inconsequential part. It takes up few lines of each example, and most of the other code is the specifics of the framework (although they are very similar, because they are all Sinatra clones, to varying degrees). What do you expect to be different in a non-DB-based test (I'm still unclear what point you are trying to make)? Their template systems are pretty simple to use as well.
> Again, you missed my point. I wasn't suggestion that you were comparing the performance of the two. I was commenting on why I switched away from Perl to Go.
That's fine, and a worthy conversation to have, I'm just trying to keep this on the topic of implementation size, since I think the performance side of the discussion is being handled well enough elsewhere.
> Except you didn't You gave AN example (singular). It was one language; Perl.
Actually I gave eight examples: three Perl, three Python, one Lua, and one Ruby. The fact that the three Perl implementations came first, and listed by themselves, is sort of an accident. I was really interested in how Mojolicious did, since that's my favorite at the moment; then I checked the other Perl implementations, and then I looked for others that might be good examples. I intended for them to be taken all together, even if that's not how it seemed.
> With Perl, you have a number of different ways you can hook the runtimes into the web server (CGI, Apache libs, etc), pure Perl and C libraries ...
> The ironic thing with Perl is despite scripts in the language being some of the most portable code in the POSIX community, running performance critical Perl webapps leads to very unportable set ups.
How recent is the data this opinion is based on? My understanding is that now most (new) Perl web projects are using PSGI as a common back-end making it extremely portable, and often using pure-perl servers for performance. There's some evidence they can significantly beat mod_perl2.
> This might sound critical, but I genuinely do love Perl. I'd say it was up there as one of my favourite languages (and over the years I've learn to develop in a great number of different languages). But sadly nothing in life is perfect.
I was really, _really_ trying to not make it a Perl vs Go thing. It's obvious I do have a preference though. I'm glad you like Perl, it does seem to fit the mindset of certain people well, and even if they don't stick with it, they remember it fondly. :)
I wasn't aware of PSGI nor the performance it has compared to mod_perl. That's probably one of the most interesting things I've read on here for a while (interesting in terms of it could have a direct impact on my business).
Thanks for that. :)
I wasn't offended, just sort of confused. ;)
> Thanks for that. :)
No problem! To tell the truth, I didn't really have a clue about real performance until I looked it up for that post. I use the hypnotoad (pure-Perl, preforking, non-blocking) server for Mojolicious for my projects, but those are mostly internal, so I didn't have to worry much about performance. I always figured I would look into it more when it mattered. I thought worst case I would deploy using PSGI on mod_perl, but I also knew from prior experience that you can get pretty good performance from a pure-Perl solution.
You realize that Go implements the new template test right? Your linked ones do not (at least the ones I spot checked).
Also, Go is statically typed = win
Actually I find Perl's type system to make the most sense for web work:
1) Any zero-length string or 0-valued int is classed as false, which is handy when checking the returns from query strings et al.
2) You can use eq for string comparison or == for numeric comparison, which is handy as you can read values from a query string and compare them against an int without having to do type conversion.
Don't get me wrong, I don't have anything against statically typed languages - in fact I normally prefer them. But the way Perl does type checking I find reduces the number of type problems when dealing with web development.
That all said, I much prefer working with structured types in Go than in Perl.
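For contrast, here's a minimal sketch of what that query-string comparison looks like in Go, where the conversion has to be explicit (the `matches` helper is hypothetical, not taken from any benchmark code):

```go
package main

import (
	"fmt"
	"strconv"
)

// matches reports whether a raw query-string value equals an expected int.
// Go requires an explicit conversion; in Perl, `$raw == 42` would coerce
// the string for you (with "" and "0" conveniently falsy).
func matches(raw string, want int) bool {
	n, err := strconv.Atoi(raw)
	if err != nil {
		return false // empty or non-numeric input simply doesn't match
	}
	return n == want
}

func main() {
	fmt.Println(matches("42", 42)) // true
	fmt.Println(matches("", 42))   // false
}
```

The error return is the trade-off: more ceremony than Perl's coercion, but malformed input can never silently compare equal.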
hello.go uses more lines defining variables and types than the entirety of many of the alternatives I posted. Obviously they will be a little longer if they implement the fortune handler, but I doubt that will really make much of a difference.
> Also, Go is statically typed = win
I'm not sure what that has to do with implementation size (which is the only thing I was addressing), but feel free to make a case.
One thing I am wondering is "what about concurrency level"?
Just because a server can handle 10x the number of requests when doing a single request at a time for 1000 requests doesn't necessarily mean it can also handle those 1000 requests at 10x performance when they all come in at once or in a short time period.
I saw some tests have "256 concurrency" - does that mean they are sending 256 requests concurrently? I want to see them play more with those numbers. Why not have 1024 or more? Then also play with the number of available CPUs and see which frameworks can auto-scale based on that. Some that can process sequential requests fast might fall flat when faced with slightly increased concurrency; in that respect these benchmarks are a bit misleading.
On the other hand, it is good to see latency. That is important. Now latency vs. level of concurrency would also be interesting.
On a side note, I'd really like to know why so few start-ups seem to be using Spring. It could just be a wrong impression. But from what I have seen, most start-ups use RoR or Django. My guess is that Spring is less flexible and less known outside big companies, where it is usually the default. It could also be that Spring works better with the waterfall model, whereas Django or RoR are better suited for explorative programming, and that fits the respective spheres better.
I've used spring mvc in an agile setting a couple times now, and it has worked fine. It doesn't tend to make developers all that happy, in my experience. If you're in an enterprise full of spring, starting up the next app with it can be attractive -- there likely already exists a bunch of tooling and knowledge around spring.
I wouldn't use spring directly if I were trying to build something quickly for a startup. I'd be more apt to reach for grails (which wraps spring), dropwizard, or any of the other rapid-development frameworks.
I love the language, but let's not get too carried away until the ecosystem grows. The reality is, if you're going to use Go for web dev, you're going to need to be prepared to do a whole lot of things on your own.
You must not be looking on GitHub... Yes, it's new, but in the majority of cases I have not had any trouble finding third-party libraries, and the standard ones are wide-ranging and excellent.
But at the moment unfortunately I don't think it's very mature. Support for interacting with the DB, arguably the most important part of a web application, is pretty lacking IMO for something that otherwise wants to be an end-to-end solution.
In one of the examples I noticed that all kinds of interaction with the DB was being done in the Controller, not the Model. Which just seems wrong to me (at first I thought it was a "Play" framework thing, since Revel is modeled on that, but Play uses Hibernate for an ORM in its models). Also, you'll have to roll your own support for interacting with the DB using, say, gorp: https://github.com/coopernurse/gorp
That being said, robfig seems like a really cool dude, and he was responsive on github when I needed some help. The documentation is pretty great too.
The OP, to which I was replying, said: "see the gorilla test which is a Go framework."
I asked about the Gorilla benchmark itself, since OP seemed to say that adding something like Gorilla would slow down the Go benchmarks, which I don't agree with.
But I also didn't realize that that person was saying that there was a gorilla test, so now I'm mostly confused.
How so? There are Clojure, Scala, and other perfectly good choices on the list. The max latency for Go is also very high compared to most others.
Avoiding preoptimization applies just as much to frameworks.
The first isn't true any more: the Java VM competes with native code on most benchmarks, and due to its ability to perform runtime optimizations, can occasionally outperform native code.
The second doesn't matter at all for web servers. The cost of starting up the web server is tertiary to uptime and performance. If the thing is going to run for 4 months without going down, who cares all that much if it takes 5 seconds or 5 ms to start up?
If it takes you 5s to start up your server, that's a lot of time you've added to each development iteration. Make a change, restart the server, wait 5s, see if it works/check debug output.
Nowadays, starting a Tomcat or TomEE JVM in debug mode with Eclipse gives you the ability to hot-swap probably 95% of your changes. It doesn't support adding completely new functions or changing declared fields. JRebel does support this, though.
As a matter of fact, if you're in a stack frame and you pause the execution pointer with a breakpoint, you can completely change the code of the function and the JVM will discard the current stack frame and then restart the function call. Essentially, you can rewrite your code while it's executing, without losing your stack.
> Make a change, restart the server, wait 5s, see if it works/check debug output.
Most of us aren't waiting 5 seconds before checking output. We just refresh the browser.
If even that's too much, there's JRebel which does full hot reloading of pretty much every piece of code you change.
I'm a ruby guy btw..
And don't get me started on the Azure Compute Emulator.
Although I hope to never write "public static void main" again (except ironically, of course), and I spend some time dabbling in Python/Ruby/obscure-language land, I'm really happy to see Clojure and Scala doing well here.
That being said, as a day-to-day Java web-developer, I cannot honestly remember the last time I wrote "public static void main".
Compared to the current crop of dynamic language interpreters, waaaay more engineering time and talent has been poured into optimizing the jvm.
People I think forget that Java was a more user friendly C++; the price you paid was somewhat slower apps, but that's OK because you write more robust apps more easily. Rinse and repeat for Ruby/Python/Your Lang Here.
The JVM indeed kicks some major butt.
They could lock at any time, for a perceptible few hundred milliseconds, but still had a nice average speed.
I've usually had pretty good luck with Slim, I'll have to try a version with & without redbean and see how big a difference it makes.
With Slim in particular I notice that the benchmarks list it as "Raw database connectivity" but in the code it looks like it's using the RedBean ORM. I'll look more at lunch; I'm probably just misreading something.
Although obviously ORM is more realistic, since if you're sophisticated enough to be using composer and a framework, you're probably using an ORM. I know the point of this benchmark is frameworks not ORMs, but it would be interesting to swap them out and see if there's a huge difference.
PHP can be used for long-running jobs, but it doesn't have very good garbage collection, and has no language-level concurrency and very little asynchronicity.
The problem with it is, one request can bring the whole thing down.
Which means that in most common web use cases (which are db-heavy), PHP is as fast as any of them, since all the slowdowns (initialization, the slow Zend engine, etc.) are dwarfed by the fast db handling.
If I filter to just show PHP, the 20 query test shows that the first framework is less than 50% the performance of raw PHP.
It looks like it's off by default in Redbean at least. No idea if this makes a big difference, I haven't had time to try it out.
I don't agree. Frameworks often prefer the "safe" option over "performance" by default. If you activate persistent connections by raw coding it, then you should also set the absolutely obvious database configuration flag for it in a framework.
Also no love for Yii.
"PHP 5.4.13 with APC, PHP-FPM, nginx"
We've had a lot of input from the PHP community about setting up the PHP tests properly, but if you have a suggestion for an improvement we'd appreciate it.
Memcache isn't used because none of the tests are caching database results. A later test will use caching.
We'd be happy to include Yii. Submit a pull request. :)
And since that day I've been wondering: why does NodeJs (=V8 JS engine in C) talking to MongoDB have higher response times and latency than Ringo (=Rhino JS engine on JVM) talking to MySQL. The only thing where Node beats us JVM guys seems to be the JSON response test.
Looking at the res/seq we got from round4. In order of concurrency (8, 16, 32, 64, 128, 256):
nodejs (mongodb raw)
12,541 22,073 26,761 26,461 28,316 28,856
ringojs (mysql raw)
13,190 23,556 27,758 30,697 31,382 31,997
Both look like they've got room to grow.
Until the project includes a WebSocket-enabled test or a test with forced idle time (e.g., waiting for an external service to provide a response), concurrency higher than 256 yields very little of interest, the reason being that we are fully saturating the server's CPU cores at 256 concurrency.
Increasing the client-side concurrency level simply means that the front-end web server (or built-in web server's socket listener thread) needs to maintain a small queue of requests to hand off to the application server's worker threads. It doesn't make the server any faster at completing those requests. I've written some more about this at my personal blog.
Caveat: Some frameworks appear to have locking or resource contention issues and do not saturate the CPU cores. We will attempt to capture CPU utilization stats in future rounds since this might be of interest to readers and framework maintainers. But increasing concurrency would not increase CPU utilization in these scenarios either.
> concurrency higher than 256 yields very little of interest. The reason being that we are fully saturating the server's CPU cores at 256 concurrency.
Well websocket connections are becoming more and more popular. Maybe that's a different benchmark.
But the level of concurrency is pretty important. It basically tells the story of what happens to a "slashdotted" server. If nothing crazy like that happens, then most servers might be OK, with maybe a little higher latency. It is when the shit hits the fan that different servers start separating from the herd. Some gracefully slow down, some scale smoothly across CPUs, some start throwing socket errors.
Who cares about these issues? Well, anyone who becomes successful. If there are no visitors and no customers and only a GET request here and there every 10 minutes, then those places could really just use any server. A simple Perl or Ruby one will do. Those that grow and see customers will be interested in what happens in cases like that. There is a traffic spike at the launch of a new product, so now there is a 200% increase in traffic for that one day, and then it tapers off.
Maybe we just come from different backgrounds and that's why the focus is on different metrics.
> It doesn't make the server any faster at completing those requests.
But I am not sure what story benchmarking the servers at an artificial level of concurrency tells us. Maybe it helps those that have a throttling/balancing proxy that always sets the number of connections to 256 at most and otherwise balances the rest out to other servers... And I am not sure that the heuristic "if it can handle 2,456 requests/second with a single connection at a time" can be extrapolated to imply it can "handle 2,456 concurrent connections in a single second".
Yes, WebSocket is a different test and we aim to include a WebSocket test in the future.
I understand what you're saying about the "Slashdot Effect," but I think you may be misunderstanding me.
Taken from the context of preparing for a Slashdot effect, the 256-concurrency test we are running against high-performance frameworks on our i7 hardware plays out like the world's worst case of Slashdotting. Think about it for a moment: Finagle is processing 232,000 JSON requests per second. It would be even higher if our gigabit Ethernet weren't limiting the test.
With requests being pulled off the web server's inbound queue and processed so quickly, do you think it would be easy to simulate and maintain 1,000, 5,000, or 10,000+ concurrency?
Conceptually, the load tool has the opposite goal of the web server. From the load tool's point of view, an ideal request is one that takes infinitely long. If the request takes a long time for the server to fulfill, the load tool can just keep the request's connection open and satisfy the user's concurrency requirement. Easy peasy. But as soon as the server fulfills the request, the load tool must snap to it and create another request ASAP to keep up its agreed-to concurrency level. Asking a load tool to maintain 1,000 (or worse) concurrency versus a web server completing requests at the rate of 232,000 per second is asking a lot. Wrk is up to the challenge, but gigabit Ethernet holds everything back. The Ethernet saturation means that even if you crank up concurrency against a high-performance web server, the results look basically the same. The web server simply doesn't perceive the concurrency target because gigabit Ethernet can't meet the demand.
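A quick back-of-the-envelope check with the figures above makes the point concrete. By Little's law, the mean time each connection spends on one request is concurrency divided by throughput (sketch only; the function name is mine, not wrk's):

```go
package main

import "fmt"

// meanServiceTimeMS applies Little's law: with C connections held open and
// R completed requests per second, each connection turns a request around
// in C/R seconds on average.
func meanServiceTimeMS(concurrency, requestsPerSec float64) float64 {
	return concurrency / requestsPerSec * 1000.0
}

func main() {
	// Round figures: ~232,000 JSON responses/sec at 256 concurrency.
	fmt.Printf("%.2f ms per request\n", meanServiceTimeMS(256, 232000)) // 1.10 ms per request
}
```

At roughly a millisecond per request per connection, the load tool has to replace every completed request almost instantly just to hold its concurrency target, which is why sustaining thousands of concurrent connections against such a server is so hard.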
As I wrote in the blog entry I cited earlier, if you start thinking about the idealized goal of a web server--to reduce all HTTP requests to zero milliseconds--it should become clearer why increasing concurrency beyond the CPU's saturation level doesn't actually do much except show the depth of the web server's inbound request queue. In other words, once we've saturated the web server's CPUs with busy worker threads, we can increase concurrency for only one goal: to determine at what rate we can get the server to reject requests with 500-series HTTP responses. For the JSON test on gigabit Ethernet, we find it's impossible to cause high-performance frameworks to return 500-series HTTP responses because the load tool simply cannot transmit requests fast enough to keep the server's request queue full.
A slightly less-performant framework--let's use Unfiltered as an example--is not running into the gigabit Ethernet wall but is still processing 165,000 JSON requests per second. Since the network is not limiting the test, the CPU cores are completely saturated. 100% utilization.
165,000 requests per second is way worse than being "slashdotted." Slashdot has many readers, but they can't generate that kind of request rate in their wildest dreams. Hacker News also has a great many readers, but no site with an audience as narrow as this could generate 165,000 requests per second from readers clicking on a news link. Not even an article about Tesla, Google, hackathons, lean startups, girl coders, 3D printers, web frameworks, and classic computing all wrapped into one could generate that kind of request rate from Hacker News readers. Being at the #1 spot on Hacker News will see a few dozen requests per second or so.
* Web servers maintain an inbound queue to hold requests that are to be handed off to worker threads or processes.
* If there are worker threads available, the web server will assign requests to worker threads immediately without queuing the request.
* If there are no worker threads available, the web server will put the request into its queue.
* If a worker thread becomes available and a request is in the queue, it will be assigned as above.
* If no worker thread is available, and the queue is full, the server will reject the request with a 500-series HTTP response.
* Worker threads are made available very quickly if requests are fulfilled very quickly.
* The server becomes starved for worker threads if requests are not fulfilled quickly enough to keep the inbound queue routinely flushed.
* Actual usage does not come in as 1,000 requests in a nanosecond followed by nothing, and then another burst of 1,000 requests in a nanosecond. Even if it did, because gigabit Ethernet is slow, the server's perception of that traffic would be 1,000 requests spread over several milliseconds.
* For (nearly) all platforms and frameworks below roughly 200,000 JSON responses per second, 256 concurrency causes the worker threads to completely saturate the server's CPU cores, busy fulfilling requests. In fact, in many frameworks' cases, even 128 and 256 concurrency are nearly identical--check the data tables view in the result page.
* Since the CPU cores are saturated, increasing concurrency can only demonstrate the limits of the server's inbound request queue. Doing so does not show anything of interest from a performance (completed requests per second; speed of computation) perspective. Once your CPU cores are saturated, your server is in dire risk of filling up its request queue. A hot-fix is quickly adjusting the queue size in the configuration and hoping that's enough to survive the traffic; the real fix is simply fulfilling requests faster.
* In practice, if your server can fulfill 200,000+ requests per second, your server will essentially never actually perceive concurrency over 256 anyway. Gigabit Ethernet simply can't transmit the requests rapidly enough.
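The queue behavior described in these bullets can be sketched as a toy model (names are illustrative; this is not the benchmark's actual server code):

```go
package main

import "fmt"

// A toy model of a web server's inbound request queue: a bounded channel
// stands in for the queue, and worker threads would drain it. When the
// queue is full, accept fails, which corresponds to the real server
// rejecting the request with a 500-series response.
type server struct {
	queue chan int // pending request IDs awaiting a worker
}

func (s *server) accept(id int) bool {
	select {
	case s.queue <- id: // room in the queue (or a worker waiting on it)
		return true
	default: // queue full: reject the request
		return false
	}
}

func main() {
	// Queue depth of 2, and no workers draining it.
	s := &server{queue: make(chan int, 2)}
	fmt.Println(s.accept(1), s.accept(2), s.accept(3)) // true true false
}
```

With no workers draining the channel, the third request is rejected immediately: the analogue of a 500-series response from a full inbound queue, and the only thing extreme concurrency can actually probe once the CPUs are saturated.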
I find that questions about very high concurrency (where the questioner is not asking about a WebSocket scenario wherein connections are held live but are mostly idle) are confusing high concurrency with a simpler matter: inability to fulfill requests rapidly enough to keep the web server's inbound queue flushed. That is a performance problem, plain and simple, and not a high-concurrency problem.
In other words, one may perceive a large number of live connections, and think, "this is a high concurrency situation," but what they are actually contending with is a side-effect of being slow.
We could test that out at some point.
This is highly respected work (bookmarked) and deserves all the upvotes HN can give.
In a previous round, I pasted the relevant code directly into the results view. I will likely do that again soon since it's convenient for the reader.
For the time being, I invite you to browse the Github repository and examine the test implementation source.
Again, thank you very much for the hard work. I think there are some revelations in there for some people, like that C framework.
Take a look at the new Fortunes test. That test lists rows from a database, sorts them, and then renders the list using a server-side template. It's only implemented in 17 of the 57 frameworks right now, but we hope to have better coverage on that in time as well.
Is there some reason that you use built in json serialization for some frameworks and not others?
There is also a lot of heterogeneity in the implementation of the multiple queries test. For instance, even if I only look at, say, Java frameworks, you seem to implement the exact same feature in very different ways between platforms. For instance, for servlets, you will store all of the results in a simple array and then write them out when you are done. Like so:
final World[] worlds = new World[count];
final Random random = ThreadLocalRandom.current();
try (Connection conn = source.getConnection();
     PreparedStatement statement = conn.prepareStatement(DB_QUERY /* result-set flags elided */)) {
  // Run the query the number of times requested.
  for (int i = 0; i < count; i++) {
    final int id = random.nextInt(DB_ROWS) + 1;
    statement.setInt(1, id);
    try (ResultSet results = statement.executeQuery()) {
      results.next();
      worlds[i] = new World(id, results.getInt("randomNumber"));
    }
  }
} catch (SQLException sqlex) {
  System.err.println("SQL Exception: " + sqlex);
}
try {
  // Write JSON encoded message to the response.
} catch (IOException ioe) {
  // do nothing
}
The Vert.x test, meanwhile, collects the asynchronous results into a thread-safe list:

private final HttpServerRequest req;
private final int queries;
private final List<Object> worlds = new CopyOnWriteArrayList<>();

public void handle(Message<JsonObject> reply) {
  final JsonObject body = reply.body;
  worlds.add(body);
  if (this.worlds.size() == this.queries) {
    // All queries have completed; send the response.
    try {
      // final JsonArray arr = new JsonArray(worlds);
      final String result = mapper.writeValueAsString(worlds);
      final int contentLength = result.getBytes(StandardCharsets.UTF_8).length;
      this.req.response.putHeader("Content-Type", "application/json; charset=UTF-8");
      // ... write the result to the response ...
    } catch (IOException e) {
      req.response.statusCode = 500;
    }
  }
}
The Onion C-based code is written in an even MORE efficient manner for the multiple queries test. It actually stores its results in JSON format from the outset! Like so:
snprintf(query, sizeof(query), "SELECT * FROM World WHERE id = %d", 1 + (rand() % 10000));
mysql_query(db, query);
MYSQL_RES *sqlres = mysql_store_result(db);
MYSQL_ROW row = mysql_fetch_row(sqlres);
json_object_object_add(obj, "randomNumber", json_object_new_int(atoi(row[1]))); /* row[1]: the randomNumber column */
const char *str = json_object_to_json_string(json);
Why couldn't the Vert.x test have been written more like this?

private final HttpServerRequest req;
private final int queries;
// INSTEAD OF:
// private final List<Object> worlds = new CopyOnWriteArrayList<>();
private final JsonArray worlds = new JsonArray();

public void handle(Message<JsonObject> reply) {
  final JsonObject body = reply.body;
  worlds.add(body);
  if (this.worlds.size() == this.queries) {
    // All queries have completed; send the response.
    try {
      // INSTEAD OF:
      // final String result = mapper.writeValueAsString(worlds);
      final String result = worlds.encode();
      final int contentLength = result.getBytes(StandardCharsets.UTF_8).length;
      this.req.response.putHeader("Content-Type", "application/json; charset=UTF-8");
      // ... write the result to the response ...
    } catch (IOException e) {
      req.response.statusCode = 500;
    }
  }
}
Is it the case here that some people have sent you test code optimized for their own frameworks?
If that is so, you should add some tests that would not be so amenable to optimization. I'm not picking on Onion here by the way. In fact, the argument could be made that Onion is not actually 'optimized', so much as just written correctly, and the other frameworks have tests written incorrectly. But I just wanted to know if you guys actually intended to use these different implementations for some reason that I am unaware of? Do they make the tests more fair somehow???
Thanks for taking the time to dig in and provide some feedback. As much as possible, we want each test to be representative of idiomatic production-grade usage of the framework or platform. Furthermore, we have solicited contributions from fans of frameworks and the frameworks' authors. A side objective is that the code double as an example of how best to use the framework or platform.
All of this means we fully expect that the implementation approaches will vary significantly.
The multiple query test has a client-provided count of queries, so in most Java cases, we create a fixed-size array to hold the results fetched from the database. I wrote the Servlet and Gemini tests, so I can confirm that behavior in those tests.
We are not Vert.x experts and we have not yet received a community contribution for the Vert.x test. However, it is our understanding that idiomatic Vert.x usage encourages the use of asynchronous queries. The question then is: how do we collect the results into a single List in a threadsafe manner? Is your JsonArray alternative threadsafe? Admittedly, using a CopyOnWriteArrayList gave us pause, but we are not (yet) aware of a better alternative.
The Onion test was contributed by a reader and admittedly its compliance with the specification we've created is perhaps a bit dubious. We want a JSON serializer to process an in-memory object into JSON. I'm not certain if the Onion implementation matches that expectation, but the test implementation nevertheless seemed sufficiently idiomatic for his platform.
We're certainly open to more opinions on that matter.
I agree, that approach would be best. I just was unsure why you didn't do it in Vert.x.
"...it is our understanding that idiomatic Vert.x usage encourages the use of asynchronous queries..."
Someone can correct me if I am wrong, but my understanding of Vert.x is that any query you send to the event bus is already asynchronous. There is no need for a developer to worry about threads at all when writing a vert.x handler. That handler will only ever be called from a single thread. So using a simple array is fine. Using the JsonArray is even better, because then it matches the Onion test idiomatically speaking. Which, I agree, is what you should be going for.
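That single-threaded guarantee is easy to model outside Vert.x, too. A toy Go sketch of the same idea (all results funneled to one goroutine, so the plain slice needs no locking; this is illustrative, not Vert.x code):

```go
package main

import "fmt"

// collect drains asynchronous results from one channel into a plain slice.
// Because only this goroutine ever touches the slice, no lock or concurrent
// collection (like CopyOnWriteArrayList) is needed -- the same property a
// Vert.x handler gets from always running on its single event-loop thread.
func collect(replies <-chan int) []int {
	var worlds []int
	for r := range replies {
		worlds = append(worlds, r)
	}
	return worlds
}

func main() {
	replies := make(chan int)
	go func() { // simulated async query callbacks arriving one by one
		for i := 1; i <= 3; i++ {
			replies <- i * 10
		}
		close(replies)
	}()
	fmt.Println(collect(replies)) // [10 20 30]
}
```

The channel serializes delivery exactly the way the event loop serializes handler invocations, so the accumulation itself can stay completely lock-free.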
"...The Onion test was contributed by a reader and admittedly its compliance with the specification we've created is perhaps a bit dubious. We want a JSON serializer to process an in-memory object into JSON..."
Please don't misunderstand: the Onion test does what you want it to do, and it does it in the correct idiomatic fashion. That's exactly how I would write the Onion test. I was just wondering why the other tests went out of their way to decode JSON and then re-encode JSON for each result. Onion only ever encodes to JSON once; other tests are encoding and decoding multiple times. I only pointed out Vert.x because it was the most egregious. I mean, in that case the answer from the persistor is already JSON. It is put in a non-JSON data structure... and then that data structure is encoded to JSON??? Just seemed weird.
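To illustrate what I mean (a toy sketch, not the actual persistor API; the row strings are made up), if each row already arrives serialized as JSON, the response array can be assembled by concatenation instead of parsing each row into an object and serializing it again:

```java
import java.util.List;

public class JsonPassThroughSketch {
    public static void main(String[] args) {
        // Pretend these arrived from the persistor already encoded as JSON.
        List<String> rows = List.of(
            "{\"id\":1,\"randomNumber\":42}",
            "{\"id\":2,\"randomNumber\":7}");

        // No decode/re-encode round trip: just join the pre-encoded objects.
        String response = "[" + String.join(",", rows) + "]";
        System.out.println(response);
    }
}
```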
EDIT: Just verified that there is no need for thread safe code in a Vert.x handler. (Gotta say... that is pretty slick)
On a connected note... man ... these tests are a VERY good way to learn more about these different frameworks!
We had not previously understood that there was no need for thread-safe behavior within a Vert.x handler. Removing that (apparently fictional) requirement allows us to use just a simple array. Out of curiosity, can you point me to where you found confirmation that handlers do not require thread safety?
Thanks again for your feedback!
Edit: spot checking Vert.x with a simple array does not appear to affect performance to a measurable degree.
Anyway, the relevant part of the manual is:
Also, I discovered you are running the equivalent of a single-threaded environment anyway (8 workers on an 8-core machine).
Any reason for that???
Maybe more confusion about Vert.x concurrency??? If so, here is the relevant documentation on worker verticles:
Long story short, you dedicated the entire machine (8 cores) to the equivalent of database connection management (the persistor). Very little of the machine (whatever is context-switched in, effectively) is dedicated to request handling. Try something a bit more fair, like 4 workers and 4 web request handlers.
Also, I was going through the Node.js tests and I had a question... do you guys do any clustering for the Node tests at all? Or are these results from the tests run on a single Node?
Sorry for all the questions, just want to make all of the tests do the same thing across all of the frameworks so that when you guys run it again, we can use that data here in a more meaningful fashion. For us, it is useful.
In all tests, the database test is allocating a greater amount of effort to database connection management (in totality: the handling of connections, statements, queries, and result sets) versus request handling. This is not unique to Vert.x. The reason some frameworks' database tests achieve nearly 50% as many requests per second on i7 versus the pure JSON test is simply that at the ~210k rps range for the JSON tests, we are running into a Gigabit Ethernet wall (which I have commented about elsewhere). If we had 10 GbE, the JSON test results on i7 would be even higher. (Also see comments elsewhere about our intent to normalize, to a degree, the response header requirements, since the variation observed is attributable to response headers.)
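Some back-of-envelope arithmetic (my rough numbers, ignoring Ethernet framing and TCP/IP overhead) shows why ~210k rps can hit that wall:

```java
public class GigabitWallSketch {
    public static void main(String[] args) {
        // 1 Gbps of link capacity is at most 125,000,000 bytes/second of payload.
        long linkBytesPerSec = 1_000_000_000L / 8;
        long requestsPerSec = 210_000;

        // Average bytes available per response before the link saturates.
        long bytesPerResponse = linkBytesPerSec / requestsPerSec;
        System.out.println(bytesPerResponse);
    }
}
```

At roughly 595 bytes per response, the HTTP headers plus a small JSON body are enough to saturate the link, so the network rather than the framework becomes the cap.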
Yes, the node.js tests are running with the cluster module.
Thanks for the comments. We have received great feedback in the previous rounds and this round received even more attention so there have been some more good questions. Unfortunately, there has also been some rehashing, which indicates we're not doing a great job of explaining to people how each environment is configured (linking to the repository only goes so far). That said, we also continue to receive some fantastic pull requests. Thanks to everyone who has helped out!
Unless the number of reads heavily outnumbers the number of writes, it is better to use something like Collections.synchronizedList(new ArrayList<String>()) instead of CopyOnWriteArrayList. The synchronized list does lock while reading as well as writing, but writes become much faster because they no longer copy the entire backing array.
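A small sketch of the trade-off: both lists stay consistent under concurrent writes, but CopyOnWriteArrayList copies its whole backing array on every `add`, while the synchronized wrapper only takes a lock, so it does far less work per write.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ListChoiceSketch {
    static int concurrentAdds(List<Integer> list, int threads, int addsPerThread)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < addsPerThread; i++) {
                    list.add(i); // thread-safe in both implementations
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return list.size();
    }

    public static void main(String[] args) throws InterruptedException {
        // Both end up with every element; they differ in per-write cost, not correctness.
        int cow = concurrentAdds(new CopyOnWriteArrayList<>(), 4, 1000);
        int sync = concurrentAdds(Collections.synchronizedList(new ArrayList<>()), 4, 1000);
        System.out.println(cow + " " + sync);
    }
}
```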
We're very happy to make it more correct and eke out whatever small gains can be had--plus make the code cleaner. But if there are readers looking to improve the Vert.x numbers, I think we're going to need contributions from someone with a deeper understanding of Vert.x tuning (or some more time to invest in becoming that!)
Did you use an ORM when it was available, or just raw queries?
This is in line with my previous question about more usage scenarios (and thank you for the fortunes test) that reflect how we typically use these frameworks.
My point is that for the same reason we use a framework instead of the raw language, we use an ORM instead of direct SQL.
It is stated in the benchmark results when a framework test uses an ORM or "raw" PDO/whatever for database requests.
Things like Doctrine are elegant and smart, but let's face it, they are slow. PHP is not Java. Hibernate may be fast on Java, but Doctrine is hardly fast...
Symfony, Laravel, and Silex share the same http-kernel and event-dispatcher components. Laravel and Silex, however, can use closures for controllers and filters/middleware; maybe that's why they are faster.
Classes are expensive in PHP, since PHP is not OO-centric and classes are merely an add-on. Bootstrapping Symfony means creating an insane number of objects. There are things that could be done about it. I suspect PHP frameworks are so slow because of the abuse of deep class hierarchies.
We welcome all pull requests, suggestions and criticisms.
I guess though if this is only testing JSON serialization, it may not make sense. Perhaps adding JAX-RS implementations like CXF, Jersey, RESTEasy, and RESTLet would be more appropriate.
The World Wide Wait benchmark, which tested quite a lot, showed quite favorable numbers for both JSF implementations.
I guess everyone is busy because of DConf 2013.
Otherwise I will remind them next week.
I guess if anything it speaks to what a solid piece of software the JVM is.
While true, it all depends which JVM one talks about, there are plenty to choose from, even native code compilers.
I just noticed something, the two major JSF2.0 implementations MyFaces and Mojarra are both missing.
BTW, if you like Node.js, you should probably look at Vert.x. I haven't used it, but it's a similar concept, it runs on the JVM, and it seems to spank Node.js.
But for quickly spinning up a few light service endpoints, Node can't be beat. Especially if you are using JSON-based persistence like MongoDB or CouchDB, using JSON all the way from database to the client is a huge win. I get tired of writing lots of JAXB POJOs to map my JSON objects to and from, especially early on in development when those definitions change rapidly. That's why I enjoy using Node, especially for "toy" projects. Less boilerplate and more productive quickly.
Side note: I find myself wishing Node had Annotations and AOP... one of Java's coolest (though oft-misused) features IMHO.
The ability to script Java is one of Ringo's killer features - for this benchmark, for example, we dropped in two jars (the JDBC MySQL connector & connection pooling from Apache Commons) and glued them together with 10 LOC of JS.
I have had a little less joy experimenting with both Clojurescript and Ember.js (with Clojure back end services): I eventually get things working, but at a huge time cost over writing non-rich clients just using Hiccup.
I wonder how the results would change if only one or two cores were enabled (using taskset or isolcpus)
Did you submit a pull request for it and it was denied?
I believe that the way these tests are set up slightly advantages Gemini, and more broadly Java, since they do not measure memory usage, or tasks that make memory usage critical, which is something the JVM sucks at.
Gemini is our in-house framework and there are two points to consider:
(a) We are obviously very familiar with Gemini and therefore know how to use it effectively. For example, we know that we prefer to deploy Gemini applications using the Caucho Resin application server because it has proven the quickest Java application server in our previous experience. Of course, the other Java Servlet-based frameworks also benefit from deployment on Resin in these tests.
(b) In our design of Gemini, we do keep an eye on performance. But as the data shows, there are faster options.
Although we included Gemini in these tests, we did so because we wanted to know how it stacked up against other frameworks that we routinely use on projects. See more information in response to an earlier question here: https://groups.google.com/d/msg/framework-benchmarks/p3PbUTg...
A few frameworks have a "stripped" version (just Django and Rails so far) to try to show the best that can be achieved when typical functionality is stripped out. Essentially optimizing for this test, which is interesting even if it isn't the point of these benchmarks. If you think Symfony2 would benefit from a separate "stripped" test then please consider a submitting a pull request with that.
Same with any of the other frameworks - some have been optimized by their fans, others are running in default configs.
Techempower should add a filter to show only frameworks that have been optimized.
That said, we have tried to not run anything in the "default" configs, but rather the "production deployment" configs if we could find documentation on that. Unfortunately there is a huge variability across frameworks in how good the "production deployment" documentation is.
I know some of these communities don't really frequent HN and Reddit, but they all frequent their mailing lists.
If you're interested, we'd love to have some help getting this accomplished.
Can you explain? I might be able to help; I've got some experience with uwsgi + bottle.
I encourage you to benchmark Rails 4 yourself and see if there is a measurable difference from the latest 3.2.x, to get a preview of the impact it will have. When Rails 4 is released we'll definitely want to upgrade to that.
Edit: pfalls is too fast for me, and just completed the upgrade to 3.2.13 and closed the issue. The next round will use Rails 3.2.13.