Nginx sucks at SSL, here's what to do instead (matt.io)
170 points by seldo on July 11, 2011 | 62 comments



Without any reference to the nginx options

  ssl_session_cache
  keepalive_timeout
  keepalive_requests
and whether the other webservers mentioned support such options and had them enabled in the tests, there is no way to know if this is alarming or just FUD. Also missing from this comparison is a vanilla Apache SSL benchmark (which will suck, but would serve as a reference point).
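For context, a reproducible comparison would need to pin these settings down explicitly. An illustrative nginx snippet (the values here are examples, not taken from the article):

```nginx
# Illustrative values only; a fair benchmark would document these explicitly
ssl_session_cache   shared:SSL:10m;  # cache SSL sessions across worker processes
ssl_session_timeout 10m;
keepalive_timeout   65;              # hold connections open between requests
keepalive_requests  100;             # max requests served per keep-alive connection
```

With the session cache off (the default) and no keep-alive, every benchmarked request pays a full SSL handshake, which dominates the numbers.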

Given nginx's track record, I'm prepared to give nginx the benefit of the doubt and assume that it's an invalid test.

That said, I'm going to run some benchmarks of my own.


Mr. randomstring, your test results before blekko launched were much faster than this guy's number. Want me to quote from your email?


When I see numbers this bad, I usually try to investigate before writing a post about it, and share the results of that investigation with the post.

I'll try to reproduce these results tomorrow, but if I had to guess, I'd say ssl_session_cache was left to its default (off) which means that every connection has to do the expensive SSL handshake.


Sigh. Sometimes posts like these make me want to go back to academia (it's not perfect, but they generally believe in this little thing called rigor).

From TFA:

I tested nginx as a proxy, serving static files, and serving nginx-generated redirects. I tried changing all the relevant ssl parameters I could find. All setups resulted in the same SSL performance from nginx. I even tried the setup on more than one server (the other server was quad-core; nginx got up to 75 requests per second).

So "all the relevant ssl parameters I could find", no details about what those involve, and the surprising result that it made no difference.

In the same situation, I might think I was doing something wrong...

And then this overarching statement:

Never let nginx listen for SSL connections itself.


Rigor is an interesting point. Do we prefer to have a flawed but slightly useful post now, or do we prefer to wait a month or two for a squeaky-clean post with all the issues worked out?


In two years, people will still be saying "nginx + ssl = bad" because of this post even though the problem may well be fully addressed. Google will continue to surface this article even though it may be totally wrong at some future date. That sucks.


If it were really that easy to spread this questionable message for years, it would be just as easy to spread other articles as well.

So it wouldn't take more than a few articles, like "nginx + ssl = works like a charm" or "nginx has better SSL support than Apache". It wouldn't matter whether those were actually correct; a single article of questionable quality would be sufficient.


... and here it is:

"nginx does not suck at ssl"

http://news.ycombinator.com/item?id=2759596


Why not do both? First a small article about the surprising phenomenon, announcing a more thorough analysis next week.

That way, it's possible to get some initial feedback and maybe even some good hints that help speed up the analysis. In the best case, the announced analysis could become a collaboration between multiple authors.


As Randy says: "The best way to get tech support online isn't to ask for help -- just visit a chat room and declare 'Linux sucks! It can't do X!'"


That reminds me of the old trick of asking a reasonable question, then getting a friend to give a wrong answer to it. The real answer is likely to be somewhere in the flood of corrections that follows.


If the config files had been posted, the reader could work out the issues on their own and everybody benefits.


I agree. What if we had all waited until 2008 for an academic to publish a 30-page paper about Ruby and Python and how they can be useful for building web apps?


Academics share and discuss findings in casual terms before formal publishing all the time. Two academics meeting over coffee aren't going to demand the rigor you get with a published article.


That's true, but the "damage" tends to be limited, because it's shared in person with a handful of people who understand the preliminary nature of the results, not potentially tens of thousands of people clicking on the front page of Hacker News, who see a very definitive-sounding statement ("Nginx sucks at SSL").

I do think academics overcorrect on this, and should share more early results, possibly via things like blog posts (this is slowly starting to happen). But erring in the opposite direction is also quite common among tech bloggers. In particular, if you're going to publish anything that looks vaguely like a benchmark, it might be worth taking at least a few days to check out possible problems before sending it out into the world (not months or anything, but a few days).


But... that would take work! And he only works four hours a week. </tonguecheek>


IIRC I get pretty awful AB performance with ssl_session_cache on or off.

That said, while RPS is low, even as I'm hammering it with ab it seems to have little problem staying responsive in my browser :S


Try adding the -k parameter to ab to use keep-alive requests and see if you notice an uptick. If you are generating a new session on each request, it doesn't matter whether you use SSL session caching in nginx or not.
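For example (the hostname is a placeholder):

```shell
# Without keep-alive: every request pays a full SSL handshake
ab -c 50 -n 5000 https://example.com/pixel.gif

# With -k, requests reuse connections, so the handshake cost amortizes
ab -k -c 50 -n 5000 https://example.com/pixel.gif
```

Comparing the two runs separates per-handshake cost from per-request cost.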


Most servers suck at SSL without some sort of caching. Also, properly configured nginx setups see a fair share of keep-alives.


Thanks! I didn't know nginx sucked at SSL. You may have increased our revenue. Many businesses like us have our conversion pages on SSL. Our front-end server is doing 2000 to 4000 http requests per second and we get over 3 million uniques on the main site where we sell stuff via SSL. If SSL is this slow, it probably impacts performance on our secure pages which affects revenue. Where do I send the beer?

On a 4-core Xeon E5410, using ab -c 50 -n 5000 on 64-bit Ubuntu 10.10 with kernel 2.6.35, I get:

For a 43 byte transparent gif image on regular HTTP:

Requests per second: 11703.19 [#/sec] (mean)

Same file via HTTPS with various ssl_session_cache params set:

  ssl_session_cache shared:SSL:10m;               Requests per second: 180.13 [#/sec] (mean)
  ssl_session_cache builtin:1000 shared:SSL:10m;  Requests per second: 183.53 [#/sec] (mean)
  ssl_session_cache builtin:1000;                 Requests per second: 182.63 [#/sec] (mean)
  (no ssl_session_cache)                          Requests per second: 184.67 [#/sec] (mean)

The cache probably has no effect because each 'ab' request is a new visitor. But I'd guess the first https pageview for any visitor is the most critical pageview of most funnels.


Choice of cipher, as well as OpenSSL version and features used, makes a difference too. See http://zombe.es/post/5183420528/accelerated-ssl for some examples.

Use "openssl speed -elapsed" to test performance on your system.


The post didn't mention if ssl_session_cache was enabled in the nginx config or not. In fact, I didn't see any configs posted. :(

Also, the article author apparently added support to stud (in his own fork) for X-Forwarded-For. I don't think this is required any longer, due to this fairly recent stud commit: https://github.com/bumptech/stud/commit/9d9b52b7d3ce90fa84c6...


I hear the same performance happens with or without the session cache enabled (for benchmarks). The http(s) benchmarking tools don't resume sessions; they simulate a horde of new clients who never come back or request other resources.

It would be interesting to see stud with a session cache too.


I'm reading over Matt's work carefully, but my initial inclination is not to merge the bulk of this into stud mainline. I'd rather keep stud simple and protocol-naive and have HAProxy do the HTTP work.


Which is to say indirectly that I think the right answer is for nginx (and daemons generally) to support the PROXY protocol, or some other agreed-upon standard for a naive upstream proxy to indicate host/port information.


Could someone link to a description of the PROXY protocol?



I really agree with this. I think keeping stud as simple as possible is a great goal.


I'm reading over Matt's work carefully

Thanks for the consideration!

initial inclination is not to merge the bulk of this into stud mainline

I agree. The HTTP stuff is still too integrated. ifdefs are ugly.

The solution is to do what showed up when I was 99% done working on XFF -- the nice PROXY protocol addition. We just need to get PROXY support into nginx now to obviate my XFF machinations.


I don't know what the limits are in the nginx HTTP parser Matt's using, so this is probably moot, but code that does things like "realloc(ptr, size + newsz)" or "malloc(size + 1)" expecting things to be fine gives me the howling fantods.


I don't know what the limits are in the nginx HTTP parser

You're correct in assuming the library enforces its own size limitations. It operates on the length of received SSL data, which is capped by the static receive buffer at 32k. Nice and tiny.

(Also, you are, of course, painfully correct about lack of bounds checking and lack of return value checking on the malloc/realloc calls. If I ever graduate the branch to production status, the six malloc calls and three realloc calls will be wrapped in proper checks.)


For what it's worth, don't check the retval of malloc/realloc/strdup; instead, rig them so they blow up if the allocation fails.


Why? Can't you just bail on whatever you are currently doing? How is the entire process compromised by a failed malloc? Resources are limited, sort of by definition, shouldn't good code be able to handle this possibility?


If malloc is rigged to explode when it fails, you can't accidentally forget to check; sometimes, malloc failures can end up being exploitable. It's not like most code does anything particularly smart when memory runs out.


An exception to this is library code, which should typically check for malloc failures and return an error to the client code.


Handling an out of memory error in any way other than terminating the entire process is very, very hard, because the effects of memory exhaustion are felt by all your threads at the same time, usually preceded by massive slowdown due to swapping (which will cause other symptoms if there are any real-time constraints on the process).

I'm not saying there are no cases where recovering from allocation errors would be possible, but it's not the general case. It's usually easier to treat any allocation error as fatal and ensure your programs don't run out of memory through other means.


I'm quite new to C programming. What's wrong with doing those things?


Numbers can't grow indefinitely; they wrap (usually at the size of the register).


"malloc(size+1)" is a sign you may have off-by-one errors in your code. If you need to store a string of size s, you need s+1 bytes allocated; the plus one is for the null terminator. If you want an array of t, you can either pass the size around with the array or null-terminate the array like a string, but then you can't store any null values in it.

Also, there's no bounds checking on size, so in certain conditions, such as a 2GB/4GB allocation, the addition can wrap around and you may end up allocating zero bytes or far less than you asked for.


It would be nice if Matt provided full details on the testbed, including the client. In a test scenario it is very important to understand what actually gets tested in the end. I liked that "Russia" tag too :)


For a different point of view, check out:

http://web.archive.org/web/20090619214443/http://www.o3magaz...

The author benchmarks nginx at 26,590 TPS on a quad-core 2.5 GHz AMD system.


I suspect that they are re-using connections in that benchmark. SSL connection setup is CPU intensive. Once a session is set up, an SSL connection uses only slightly more CPU time than an unencrypted connection.


That benchmark is using an SSL Accelerator card and nginx. Can honestly say I have seen very few people swing for SSL Accelerator cards.


The "open source SSL Accelerator" mentioned in the blog posting is a quad-core server running Linux and nginx.


whoops! my apologies -- the capitalized "SSL Accelerator" set my mind into thinking dedicated hardware device.


A friend of mine worked on some large scale SSL deployment, he wrote up the results of his tests here: http://zombe.es/post/5183420528/accelerated-ssl

He's concerned with the raw speed of the SSL calculations rather than requests per second, but if you're actually concerned about SSL speed and have enough traffic to justify optimizing it, it could be pretty useful.


No configs, no methodology, no graphs... Great "benchmark".


If only there were some way you could create those things that you want and he didn't care about doing!


Calling out a tool as having some terribly negative characteristic places the burden of proof on whoever does the calling.


I needed to use SSL on nginx and got great results from following a number of pieces of advice. I jotted down my notes here: http://auxbuss.com/blog/posts/2011_06_28_ssl_session_caching...

It made a significant performance difference to me.


I've seen good results with the "pound" front-end:

http://www.apsis.ch/pound/

I haven't done extensive benchmarking on it, but very knowledgeable people vouch for it.


Pound consumes an absurd amount of memory if you have lots of concurrent connections (due to thread stacks).


Interesting, thanks -- I'll watch for that. Until now I haven't paid much attention to it since I'm not responsible for that part of the configuration.


I'm not sure why he would be getting numbers that low. The only setup I have at the moment which would give useful numbers for SSL req/sec is a small single-core VM, running one nginx worker process, and that pumps out 135 new req/sec. Add a few cores and workers, put it on real hardware, and I don't see why this couldn't push well over 400 req/sec.

This is using nginx strictly as an ssl termination, where I need to do some header manipulation that I couldn't do in stunnel/stud.


Self reply, since I waited too long to edit:

I remembered I had an older 8 core server sitting unused at the moment. I configured nginx with 8 workers, and ran `ab` against it. From a single (VM) host, I can get 680 connections per second (maxed the cpu on the host running the test). From 4 hosts, each host got > 290 connections per sec, so I got nginx up to over 1190 new connections per second, and can likely push it further.

[EDIT] got it to peak at 1535 requests per second with 4 hosts testing.


Where's the benchmark code? I'd like to see how ucspi-ssl [1] performs.

[1] http://www.superscript.com/ucspi-ssl/sslserver.html


Not to rain on the parade here, but we handle several thousand connections per second on nginx + SSL per 8 core Westmere machine.

The article needs way more detail.


If anyone does benchmarking, please include LiteSpeed, as I'm curious. I suspect it's much faster than nginx at SSL. Even with the connection limit on the free version, I suspect it will still be feasible for testing.



Let me guess: it's not the nginx code that is slow, but the OpenSSL code? ^_^



Quoting:

    (on an 8 core server...)
    haproxy direct: 6,000 requests per second
    stunnel -> haproxy: 430 requests per second
    nginx (ssl) -> haproxy: 90 requests per second
Yet Matt Cutts tells us that SSL is not computationally expensive anymore. Based on these results it's still an order of magnitude slower.



