Pushing Nginx to its limit with Lua (cloudflare.com)
201 points by jgrahamc on Dec 8, 2012 | 57 comments

I'm really glad to see a CloudFlare post at the top of Hacker News; I've been following them and they have lots of interesting things to say. I switched from Amazon CloudFront to CloudFlare two months ago and couldn't be happier. My hosting bill was reduced by 95% and my website speed decreased only 13%, a trade-off my weekend project can afford in order to stay profitable.

Don't like that one bit. No reason your speed should decrease at all. That's quite different from other people's experience comparing CloudFlare to CloudFront. If you have a second, submit a ticket and we'll make sure you've got everything set up for maximum performance.

Thanks for your concern, Matthew; I'm not very worried about this. That 13% difference in speed is most probably due to the fact that Amazon has a CloudFront edge location in Madrid, the city most of my traffic comes from.

I haven't really measured the speed from other locations but I will do it and file the ticket to let you know about it.

Thanks for the follow-up. Madrid is likely to be one of the next cities we expand to (although latency to Paris is <30ms). Stay tuned.

What caused the reduction in your hosting bill? Was it Amazon's supposedly-poor cache hit ratio, or a CloudFlare-specific feature? Have you tried negotiating with a larger CDN?

CloudFlare typically reduces bandwidth usage and load by about 70%. That can translate into cost savings if your host (e.g., AWS) charges you for bandwidth.

> CloudFlare typically reduces bandwidth usage and load by about 70%.

How does it do this? I'm going to be flipping our content-service from IIS to CloudFront soon so I'm familiar with CloudFront on a theoretical level. Not clear on what you mean.

One big thing they do is cache static assets for you at their edge locations. The majority of the bandwidth costs imposed by static images, CSS and JavaScript (assuming appropriate cache settings) should be offloaded onto CloudFlare once you set it up.

This is what any CDN does, including CloudFront (although supposedly CloudFront's cache hit ratio is fairly low). The specific issue at hand is in what way CloudFlare (a CDN) saved more money than CloudFront (another CDN): neither "it typically saves money" (eastdakota's answer) nor "it is a CDN" (effectively your answer) answers that question: these are both nothing more than tautologies restated from the initial question, and do not help someone attempting to compare these two services. (Personally, I use CDNetworks, after having used EdgeCast for a while and evaluating Akamai for my use case; I personally see no reason why I would use CloudFlare, but am always curious.) (Is the issue simply that CloudFront is so expensive that you don't feel you are saving much money? Hence my question then about negotiating with a larger CDN, such as any of the ones I was working with.)

Well, yeah, but my point is that CloudFront does this too. The quote I selected made it sound like CloudFlare reduces bandwidth usage by 70% in comparison to CloudFront. So I'm curious how that works.

Simple: CloudFlare doesn't charge for bandwidth, CloudFront does.

See: http://www.cloudflare.com/plans

$3k/mo buys you a lot of bandwidth, though; even at CloudFront's somewhat-high-for-a-CDN pricing, that's 30TB of bandwidth; to see a 95% reduction in your hosting costs over CloudFront with $3k/mo unmetered bandwidth you'd have to be pushing 1.8PB of data. (edit: I originally said 600TB, but I had done the math wrong for the later discount brackets.)

Even if you were down at the $200/mo plan, that's 45TB/mo before you get to the "95% less expensive" point; I have tens of millions of users worldwide downloading megabytes of packages from me (while the Cydia ecosystem has tons of things much larger, I don't host those: I just have the core package), and I don't often go above 45TB/mo.

Is the idea here that CloudFlare is seriously giving you ludicrously unlimited amounts of bandwidth (and will not give you any crap about it) with a high cache-hit ratio even at their $20/mo plan? If so, I'm going to have to run some insane experiments with their service ;P. (Part of me isn't certain that I want them to hate me that much, though ;P.)

(edit:) Ok, I looked into this some, and this argument ("they don't charge for bandwidth") is just as false as one would expect given that it isn't feasible for them to price that way ;P. Their terms of service make it very clear that they are only designed for HTML, and that "caching of a disproportionate percentage of pictures, movies, audio files, or other non-HTML content, is prohibited" <- yes, even "pictures".

With this glaring restriction, there is really no way I can imagine any reasonably-normal company getting a 95% reduction in hosting costs over another CDN, even CloudFront: if you are pushing tens of terabytes of mostly-HTML content a month, you are doing something insanely awesome (and we've probably all heard of you ;P).

If it's web content, go right ahead. We have many very large sites using the free plan. From your use, it sounds like you're using a CDN for file distribution (i.e., sending out large package files), not traditional web content. CloudFlare isn't designed for that use case. We're also not set up for streaming content (e.g., if you're running a streaming server for video). In both those cases, you're likely better off with a traditional CDN. However, if you're using us for traditional web content, there are no bandwidth caps even on the free plan.

Aha. We're serving lots and lots of very large image files and PDFs. Thanks much.

We're running some e-commerce sites with 8-20k items through the $20/month plans and have never heard a complaint from Cloudflare. That said, any sites we 'care' about are running on their business or enterprise levels which are much higher than $20/month :P.

Right, which is why I started that evaluation at the top-end of the scale. How much data do you move a month?

Probably you won't see this reply, but if you do... we move a decent amount, but not a crazy amount. In the last 30 days it was around 3TB total (through Cloudflare... we only saw about 2/3 of that).

Can you elaborate on CloudFront's "poor cache hit ratio"? Wouldn't this just be up to the origin's use of cache headers (Cache-Control, Expires, etc.)? I suppose cold content could be LRU'd out, but the CloudFront docs advertised 1 day last I looked. PS: CloudFront customer access logs include the cache result (hit/miss/error), so you can see for yourself.

One of the main ways that CDNs compare with each other is how well they manage to actually cache content and not have to go back through to your origin server; I see a lot of developers go the path you are, saying "well, that's fixed, isn't it?", but if you think about the architecture of how you build a CDN, you rapidly see that that doesn't really make sense.

Yes: if you had a single server somewhere out in the cloud that was sitting in front of yours as a cache (which is how most developers seem to conceptualize this: such as running a copy of Varnish), it is easy to say "ok, I'll cache this for a day"; however, you first are going to run into limits on what can be stored on that box: there is only so much RAM, and it doesn't even make sense to cache everything anyway.

You can spill the content to disk, but now it might actually be slower than just asking my server for the content, as my server might have it in RAM. This is a major differentiating factor, and some CDNs will charge more for guaranteed access to RAM, whether for certain files, in certain regions, or for certain amounts of data, as opposed to getting spilled off to some slower disk. However, there is also only so much disk.

Even with an infinitely large and very fast disk, though, you don't have one server: in an ideal situation (Akamai), you have tens of thousands of servers at thousands of locations all over the world, and customers are going to hit the location closest to them to get the file. Now: what happens if that server doesn't have the file in question? That is where the architecture of how you build your CDN really starts to become important.

In the most naive case, you simply have your server contact back to the origin; but, that means as you add servers to your CDN, you will require more and more places that will need a copy to get cached. This isn't, in essence, a very scalable solution. Instead, you want to figure out a way to pool your cached storage among a bunch of servers... and somehow not make that so slow that you are better off asking my server for the original copy.

You then end up having bigger servers in each region that a front-line server can fall back on to get the cached copy, but that server is now going to become more and more of a centralized bottleneck, as more and more traffic will have to flow through it (as a ton of stuff always ends up not being currently warm in cache). It also eventually looks more and more like the centralized server straining for storage, and will have to evict items more often.

Some places might spend more time trying to specialize specific front-line servers for specific customers (some kind of hashing scheme), but then the CNAME'd DNS gets more complex and is less likely to be cached in whatever ISP this is, or you can attempt to distribute routing tables between your own servers for where to find content, etc.; this simply isn't a simple problem, and certainly doesn't come down to something as simple as "well, I told them to cache it, so they did: read my Cache-Control headers".

In some cases, you might even "well ahead of time" go ahead and download a file that you think might be valuable to other regions; you might also attempt to optimize for latency, and do prophylactic requests for things that clients haven't even asked for yet, so you can get them cached and ready for when they do (CDNetworks, for example, normally re-request files that are actively used when they are still 20% away from expiring, to make certain that the file never expires in their cache, which would cause a latency spike for the poor user who first requests it afterwards).


> ... Cotendo pre-fetches content into all of its POPs and keeps it there regardless of whether or not it’s been accessed recently. Akamai flushes objects out of cache if they haven’t been accessed recently. This means that you may see Akamai cache hit ratios that are only in the 70%-80% range, especially in trial evaluations, which is obviously going to have a big impact on performance. Akamai cache tuning can help some of those customers substantially drive up cache hits (for better performance, lower origin costs, etc.), although not necessarily enough; cache hit ratios have always been a competitive point that other rivals, like Mirror Image, have hammered on. It has always been a trade-off in CDN design — if you have a lot more POPs you get better edge performance, but now you also have a much more distributed cache and therefore lower likelihood of content being fresh in a particular POP.

Given all of this, what I've heard from people using CloudFront who have shopped around and know enough to pay attention to this kind of metric is that its cache-hit ratio is somewhat poor in comparison to other CDNs you might use. I am thereby curious if that's one of the things that is causing CloudFlare to come up much better than CloudFront, if it is a specific feature from CloudFlare that is helping, or if it is just that CloudFront is so expensive in general (bandwidth from CloudFront is ludicrously bad: even Akamai tends to hit you with initial quotes that are better than what CloudFront offers; but a 95% reduction in price seems "unbelievable").

I'm using nginx+lua as the backend to http://typing.io, and I've found the combination to be a fast and robust alternative to more full featured web stacks like Rails.

It would seem that this approach gives all the advantages of async with none of the drawbacks involved with writing callbacks. If someone added a type annotation extension to the language and parser, this would be a killer combination for large code bases. Lua is also a good match for "The Good Parts" of Javascript, so it should be possible to have all the code in one language.

Lua was never designed to be used for really large code basis. It's a lightweight scripting language on top of C: the entire runtime is about 20k lines of C, and it's designed to be easily embedded in C programs. Nginx is one popular host; World of Warcraft is another popular use.

It's really a pretty amazing language. When you embed it, you can choose exactly what functionality is exposed to the scripts you run, so it's pretty good for sandbox-type code (though imposing memory restrictions is much harder).
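A minimal sketch of that sandboxing idea: the host hands the untrusted chunk an environment table containing only a whitelist of functions. (The `untrusted` string and whitelist here are illustrative; Lua 5.1 uses loadstring/setfenv, while 5.2+ folds the environment into load.)

```lua
-- Whitelist: the only globals the untrusted code may see.
local sandbox_env = { math = math, string = string }

local untrusted = "return math.max(1, 2, 3)"

-- Compile the chunk with the restricted environment attached.
local chunk
if setfenv then
  -- Lua 5.1 API
  chunk = assert(loadstring(untrusted))
  setfenv(chunk, sandbox_env)
else
  -- Lua 5.2+ API: environment passed directly to load
  chunk = assert(load(untrusted, "sandbox", "t", sandbox_env))
end

print(chunk())  -- 3; calls to anything outside the whitelist would fail
```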

> Lua was never designed to be used for really large code basis

I'd certainly agree that "maintainable for large code bases" was probably never on Ierusalimschy & Co's language design requirements document, but you could say the same of Perl, Python, Ruby, Javascript, and most Lisps. None of those languages have any additional mechanisms traditionally associated with enforcing uniformity and consistency across large projects. This hasn't stopped people from building large successful projects using these languages.

As for your comment on the implementation size: I'm not sure which way you meant it (complimentary or pejorative) but I often find people react to this in exactly the opposite fashion I would. I see "self-ish/JS-ish/python-ish semantics in 20kloc of ANSI C? Sign me up!". I find some people see it and assume it to be a toy. It reminds me of the Bill Gates line about (paraphrasing) "measuring software's success by lines of code is like measuring an aircraft's success by its weight".

Playing devil's advocate here: Lua does indeed have some warts that make it less pleasant to use in a heterogeneous environment (which pretty much all large code bases are). Not so much due to deficiencies in Lua but due to the impedance mismatch versus other languages.

The most obvious issue that everyone stumbles across is the "counting from 1". It seems like a minor thing, but the context-switch remains a drag when you're dealing with complex data-structures in two languages and only one of them is Lua.

The impedance-mismatch becomes even more apparent when the table-abstraction meets serialization. The lack of distinction between an "array" and a "hash" is awesome when you're in a pure Lua-environment, but it becomes a real problem when you need to exchange data with languages that do depend on this distinction (e.g. if you feed Lua an empty "array" it will later serialize it back to an empty "hash").
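A quick illustration of that ambiguity, assuming the lua-cjson library that OpenResty bundles:

```lua
local cjson = require "cjson"  -- JSON library bundled with OpenResty

-- A populated sequence round-trips as a JSON array:
print(cjson.encode({1, 2, 3}))  --> [1,2,3]

-- But an empty Lua table is indistinguishable from an empty hash,
-- so by default it comes back out as a JSON object:
print(cjson.encode({}))         --> {}
```

Newer versions of lua-cjson add workarounds such as an explicit empty-array sentinel, but the default behavior above is exactly the impedance mismatch in question.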

The final issue that I can't resist mentioning here is not a language but a community/mindset one. Up to this day Lua doesn't have an established package manager akin to RubyGems, Pip, Maven, Leiningen etc. (Luarocks exists but is... well, I've yet to see someone actually using it)

This is a deadly sin in terms of mainstream adoption. It makes deployment a serious pain in the ass.

GoLang shows how a modern language is supposed to handle this (importing/bundling packages directly from urls). I keep hoping someone will add something similar to lua-core, but I'm sadly not very optimistic about it.

I think many of the driving people behind Lua just don't care about it becoming a mainstream language or not. They care about it shining as an embedded language (and it does!) - it's just a little bitter for those of us who would love to use it on a broader scope.

You raise some good points; I understand you're playing the devil's advocate but I thought a few would benefit from a friendly counter :)

> ... everyone stumbles across is the "counting from 1" ...

Fair enough :) I find this objection to be largely a matter of taste; it was never an issue for me [added in edit: even when interoperating with C and JS code]. People have made similar complaints about Matlab that I never found persuasive (there are other more persuasive criticisms of Matlab's language design). I think the core argument I'd make here is that if you're using Lua tables in a way that requires array-offset semantics for the index variable, you could probably step up a level of abstraction using ipairs/pairs and save yourself worrying about 1 vs 0.
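To illustrate the point about stepping up a level of abstraction (a small sketch):

```lua
local t = {"a", "b", "c"}

-- Explicit index arithmetic exposes the 1-based convention:
local by_index = {}
for i = 1, #t do
  by_index[#by_index + 1] = i .. "=" .. t[i]
end
-- by_index is {"1=a", "2=b", "3=c"}

-- ipairs abstracts the base away entirely; no 0-vs-1 bookkeeping:
local joined = ""
for _, v in ipairs(t) do
  joined = joined .. v
end
-- joined == "abc"
```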

> The impedance-mismatch becomes even more apparent when the table-abstraction meets serialization.

Lua tables naturally serialize to Lua table syntax (modulo cycles). This is in fact Lua's origin story (if Lua were a spiderman comic, it would be a story of a table description language being bitten by a radioactive register-based VM). At a technical level, how are Lua table literals any less successful a serialization format than JSON (i.e. JS object literals)? To put it another way: JSON doesn't map naturally to XML; should we then conclude that it has an impedance mismatch with respect to serialization?

> doesn't have an established package manager ...

<old guy hat> The idea that a language should have a package manager has always seemed... confusing... to me. C doesn't have a package manager; people still seem to be able to get the relevant packages when they need them through the OS's package manager. That, to me, seems the sane solution. I realize I may be in the minority. </old guy hat>

Having said that I agree that LuaRocks' comparative weakness relative to Ruby's gems limits adoption in mainstream programming applications. OTOH, it is vastly easier to get started embedding Lua in a host program than any of its competitors (this is in fact what drove me to try it in the first place). So it's not all friction on the deployment story.

> ... many of the driving people behind Lua just don't care about it becoming a mainstream language or not.

Yes, I think this is likely true. I don't think any of the core contributors care about it being "the next Python/Ruby/Perl". If I had to summarize the emergent aesthetic, it's that Lua is designed to be just a language with a large set of DIY practices around it, rather than a curated software ecosystem.

> C doesn't have a package manager; people still seem to be able to get the relevant packages when they need them through the OS's package manager.

This is fine with Linux, which has at least a few sane package management systems between the different distros. This goes out the window with OS X and Windows.

(This may just be an argument that anyone working on server-side software should be working inside a VM that matches your production environment. The Ruby community seems to have shown that people push back very hard on that.)

In the particular case of Nginx and Lua, the OpenResty package[1] is pretty much self contained except for libpcre.

Nginx+Lua is only the core part that powers this ecosystem. OpenResty comes with a lot of libraries for "usual web stuff", except maybe a default template system. I've been using this rather simple one [2]

In the full example[3], I installed it on Mac OS (and actually found a small problem with homebrew that should be fixed soon)

[1] http://openresty.org/#Download [2] http://code.google.com/p/slt/ [3] https://github.com/mtourne/nginx_log_by_lua/

The idea that a language should have a package manager has always seemed... confusing... to me.

Yes, it seems less than ideal, but in practice it has proven to be a significant advantage (not only) in cross-platform deployments.

The C toolchain has the benefits of being a compiled language (dynamic linking) and ubiquity (autoconf is a mess but sort of works pretty much everywhere). No other language has that; you cannot even rely on a recent version of your runtime being available on a given platform. And things get really hairy when you need multiple different versions on the same host.

The rubygems+rbenv approach just works really well, almost independently of the platform that you're dealing with. And once you become used to deployment being this easy your tolerance versus languages lacking this convenience declines rapidly.

> The most obvious issue that everyone stumbles across is the "counting from 1". It seems like a minor thing, but the context-switch remains a drag when you're dealing with complex data-structures in two languages and only one of them is Lua.

Not so much of a problem for me, because I found out I rarely need to use array indices on the same structure in two different languages.

> The impedance-mismatch becomes even more apparent when the table-abstraction meets serialization.

I don't know if you read the Lua mailing list but I have posted about that recently (http://lua-users.org/lists/lua-l/2012-11/msg00683.html) and gotten a reasonable answer (http://lua-users.org/lists/lua-l/2012-11/msg00691.html). I still think separate types for lists and maps (arrays and hashes) are good but the proposed solution is elegant (and a good example of loose coupling: you teach libraries your convention and not the other way around).

> Luarocks exists but is... well, I've yet to see someone actually using it

I can assure you the part of the community using Lua as a standalone language uses it. Embedded users don't really need a package manager anyway.

Things are moving on that front too: the Moonrocks project could become what Rubygems is to Ruby (http://rocks.moonscript.org/).

> (though, imposing memory restrictions is much harder).

I developed an Emergency GC patch [1] for Lua 5.1 (Lua 5.2 comes with this feature). With it, it is easy to set a per-Lua-state/script memory limit. The Emergency GC feature is needed to allow a custom allocator to force the GC to run when the script is at its memory limit. My EGC patch has been used by the eLua project [2] for years to help run Lua scripts on microcontrollers with as little as 64 KB of RAM.

Also Lua is fine for large projects [3].

1. http://lua-users.org/wiki/EmergencyGarbageCollector

2. http://www.eluaproject.net/

3. http://prosody.im/

Lua was never designed to be used for really large code basis.

Bases. That doesn't mean that it wouldn't be good for large code bases. Many languages end up getting used for purposes they weren't designed for.

Javascript was never really designed, but look how that ended up.

Javascript was written by one guy in a shockingly short amount of time. What a language was intended for and what it's capable of are two very different things.

IME Lua is neither worse nor better than (eg) Python for large code bases.

When I looked at their code I couldn't really find how they are dealing with the async stuff. Is it using coroutines or is it something else?

I used OpenResty as the core for an experimental HTTP routing service at my previous job. The idea was to maintain host:port pairs for HTTP services in Redis and use a small bit of Lua code to look up the right place to proxy a virtual host to. It worked well enough but never got deployed because we were concerned about the Redis SPOF. Nowadays it wouldn't be that big of a deal.

That said, the nginx/lua combo is wonderful to work with. I got the core of that thing working in just a few hours.

Why doesn't everyone try Apache Traffic Server with its Lua plugin, which has way better performance and many more capabilities?

(a) No evidence of better performance; Lua/Nginx is very fast too. (b) It's just a proxy, while Lua/Nginx/OpenResty is much more of a full web development environment. You can use it as a proxy, but that is not the main use case. (c) I don't think Traffic Server has more capabilities; as far as I can see it has fewer, as it is much more focused on being a proxy product, while Nginx is a full web server.

I think Lua/nginx and ATS are quite different beasts, solving different problems. Nice presentation on ATS - "Apache Traffic Server: More Than Just a Proxy" https://www.usenix.org/conference/lisa11/apache-traffic-serv...

That sounds interesting. Want to post your writeup of your experience with it?

> has way better performance

Do you have any performance comparison?

> many more capabilities

As...? Care to compare?

How about honoring the RFCs as a first comparison of capabilities. Example: nginx doesn't even honor Vary headers:


I've been using Openresty (with Lua and Redis support) to collect a large amount of data from GET requests for a video sharing site. Having nginx push data directly into Redis is super fast, lends easily to real-time metrics, and makes it easy to batch everything up and push it into MySQL at the end of the day. Now that Redis supports internal Lua scripts, you can also do custom atomic functions and other neat things.
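A minimal sketch of that collection pattern in modern OpenResty syntax, using the bundled lua-resty-redis client (the location name and Redis key scheme here are illustrative, not the poster's actual setup):

```nginx
location = /track {
    content_by_lua_block {
        local redis = require "resty.redis"   -- bundled with OpenResty
        local red = redis:new()
        red:set_timeout(100)  -- ms: fail fast so tracking never blocks

        local ok, err = red:connect("127.0.0.1", 6379)
        if not ok then
            ngx.log(ngx.ERR, "redis connect failed: ", err)
            return ngx.exit(ngx.HTTP_NO_CONTENT)
        end

        -- hypothetical per-video counter keyed on a GET argument
        red:incr("views:" .. (ngx.var.arg_video or "unknown"))

        red:set_keepalive(10000, 100)  -- return connection to the pool
        ngx.exit(ngx.HTTP_NO_CONTENT)
    }
}
```

Batching these counters out of Redis into MySQL can then run as a periodic job outside the request path.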

If you can get around the Redis SPOF, OpenResty + Redis is great for large-volume data collection. Thousands of requests/sec on an EC2 Small at < 10% load.

Sounds amazing and would love to step in immediately. Have some basic questions after reading briefly the post and the comments here.

1) What's the difference between OpenResty and the Nginx+Lua module (is the module the core of OpenResty)?

2) How does it compare to Node regarding ecosystem, performance and every day usage/maintainability (if it's comparable)?

I've only been playing with openresty for a couple of months. Here's what I've learned:

1 - openresty is nginx and many official and unofficial modules. From what I've asked the experts, there is supposed to be no difference in the source; openresty is just a single package of many modules with an easy to use ./config and make process.

2 - The nginx module and lua lib ecosystem seems to have most of the basics in order to roll a high level app framework, but only a half baked one exists thus far: https://github.com/appwilldev/moochine Also: https://github.com/antono/valum https://github.com/pintsized/lua-resty-rack

My impression is if you have the time to roll your own sinatra-like framework, it should be pretty straightforward.


I've written an authorization server for CloudFront using Python/Flask and have wondered whether just writing the same functionality via Lua could be a better/faster replacement. I'm just not clear on how to use Lua to make the calls to our API. This is a good pointer in the right direction though.

Nginx exposes multiple phases for processing an incoming HTTP request, mainly rewrite, access, content, header_filter, and log.

You can embed Lua at the access phase, and query your API from there. Some of the Lua libraries of the OpenResty package [1] might help you do that.
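A minimal sketch of that access-phase approach, using an internal subrequest to a hypothetical /auth location (the location names, upstream address, and token argument are all illustrative):

```nginx
location = /auth {
    internal;
    # hypothetical upstream that validates the request's token
    proxy_pass http://127.0.0.1:5000/check;
}

location / {
    access_by_lua_block {
        -- subrequest to the internal /auth location above;
        -- runs before the content phase ever starts
        local res = ngx.location.capture("/auth",
            { args = { token = ngx.var.arg_token } })
        if res.status ~= ngx.HTTP_OK then
            return ngx.exit(ngx.HTTP_FORBIDDEN)
        end
        -- fall through: request proceeds to proxy_pass below
    }
    proxy_pass http://backend;
}
```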

And here [2] you can find an example of OAuth support.

[1] http://openresty.org/#Components [2] http://seatgeek.com/blog/dev/oauth-support-for-nginx-with-lu...

Here is some example auth code for Amazon services in Lua for Nginx I wrote ages ago, which might help get an idea of what the code might look like https://gist.github.com/948423

It is very fast doing that type of stuff in Lua, and it all gets done in the same request context.

There is some new functionality in openresty since that example, so there are probably other choices of ways to implement it.

But what happens when people put big numbers in my Fibonacci calculator?

Lua implements TCO, so it'll do just fine. :P

If we are being pedantic, the naive Fibonacci implementation isn't tail recursive. (And any tail recursive implementation is unlikely to need TCO before the numbers get too large.)
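For illustration, an accumulator-passing variant that is tail recursive (a sketch; the classic two-branch recursion would not benefit from TCO):

```lua
-- Accumulator style: the recursive call is the last operation,
-- so Lua reuses the stack frame. Proper tail calls are guaranteed
-- by the language spec, not merely an optimization.
local function fib(n, a, b)
  a, b = a or 0, b or 1
  if n == 0 then return a end
  return fib(n - 1, b, a + b)  -- tail call: no stack growth
end

print(fib(10))  -- 55
```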

Sorry friend, it's time to be SUPER pedantic.

Silly Computer Science nerds spend a lot of time trying to figure out the most efficient implementation of a recursive Fibonacci number function. "Ooo, let's make sure to take advantage of tail-call optimization, oh, and memoization, aren't we fancy?"

Meanwhile, mathematicians look at the problem and then cock their heads to the side and say "uh, guys, that's way too much trouble, just use the closed form solution and calculate any value in constant time using a tiny number of floating point operations". Because Fib(n) = (phi^n - (1-phi)^n)/sqrt(5), where phi is the golden ratio.
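The closed form translates directly (a sketch in Lua; note the rounding step, and that 64-bit doubles stop being exact once fib(n) approaches 2^53, somewhere past n = 70):

```lua
local sqrt5 = math.sqrt(5)
local phi = (1 + sqrt5) / 2  -- golden ratio

-- Binet's formula; round to the nearest integer to absorb
-- the floating-point noise in phi^n.
local function fib(n)
  return math.floor((phi^n - (1 - phi)^n) / sqrt5 + 0.5)
end

print(fib(10))  -- 55
print(fib(20))  -- 6765
```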

No, if there are any computations happening, that's engineers and physicists, especially if there are approximations involved.

A mathematician looks at the recurrence formula and generalises it (e.g. f(n) = a f(n-1) + b f(n-2) or f(n) = f(n-1)*f(n-2) (solve this one, hint: xkcd) or ...) and investigates its properties. The actual values of the numbers are not of much interest.

And a^n can only be computed in constant time if one uses exp(n log(a)), and (I think) that the scale of the numbers can have a large impact on the accuracy of the result, and so for large n, one needs more operations within both exp and log to give the same (relative) accuracy.

Last time I checked, numerical analysis was still considered a field of mathematics. It's all about the approximations.

Pssh, it's not abstract enough, it can't be real mathematics. ;P

(But seriously: yes, agreed.)

And then the paranoid Computer Science nerd freaks out because of his mortal fear of floating-point errors and uses the alternate fast matrix exponentiation implementation that runs in just log(n) time.
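That alternative is a small amount of code too (a sketch; exact integer arithmetic, O(log n) matrix multiplications):

```lua
-- [[1,1],[1,0]]^n = [[fib(n+1), fib(n)], [fib(n), fib(n-1)]],
-- so fib(n) sits at position [1][2] of the result.
local function matmul(a, b)
  return {
    { a[1][1]*b[1][1] + a[1][2]*b[2][1], a[1][1]*b[1][2] + a[1][2]*b[2][2] },
    { a[2][1]*b[1][1] + a[2][2]*b[2][1], a[2][1]*b[1][2] + a[2][2]*b[2][2] },
  }
end

local function fib(n)
  local result = { {1, 0}, {0, 1} }  -- identity matrix
  local base   = { {1, 1}, {1, 0} }
  -- square-and-multiply: O(log n) matrix products
  while n > 0 do
    if n % 2 == 1 then result = matmul(result, base) end
    base = matmul(base, base)
    n = math.floor(n / 2)
  end
  return result[1][2]
end

print(fib(10))  -- 55
```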
