Reverse Proxy Performance – Varnish vs. Squid (Part 2) (deserialized.com)
20 points by prakash 2935 days ago | 12 comments

In other news: a Ferrari is faster than a four-door luxury sedan, hummingbird wings move faster than those of an eagle, and a racing horse is faster than a pair of pack mules hauling a wagon.

Varnish is cool, but it's not a replacement for Squid. It can address some aspects of the problem space, but nowhere near the majority of them.

This also fails to discuss the configuration of the proxies. The thing is...Squid operates on a huge number of platforms, and the default configuration favors compatibility and reliability over all else. Asynchronous IO, cacheability and expiry rules, memory and disk usage, etc. all make a big difference in performance, concurrency, and cache hit rates.

There are arguments to be made for using Varnish in some deployments, just as there are arguments to be made for using nginx or Lighttpd instead of Apache. But, there are also trade-offs.

> It can address some aspects of the problem space, but nowhere near the majority of them. [...] There are arguments to be made for using Varnish in some deployments

Since you, presumably, have some experience with Squid and/or Varnish and the 'problem space', could you perhaps discuss or opine more concretely on that? It can be surprisingly difficult to derive useful information from indignant metaphors about Ferraris and eagles.

Yes, I do have some experience in the area. I was a core Squid developer for several years, and my previous startup was a proxy caching appliance vendor. I have been aware of Varnish for years, and have experimented with it several times over those years.

Varnish has narrowed the feature gap quite a bit over the years, but it's still not particularly narrow, and Squid hasn't been standing still.

I will admit that I have never deployed Varnish in production, while my experience with Squid spans thousands of servers, so I know a lot more about Squid than Varnish. And, it's been three years since I worked in the field at all. Both products have evolved quite a bit in that time (including a major rewrite in C++ for Squid).

Nonetheless, after a quick perusal of the docs for Varnish, I can say that for pretty much everything Varnish can do, Squid generally has three or four different tactics built in for approaching the same problem. Caching policies, ACLs, source selection (whether load balancing, hash-based source selection, etc.), logging, authentication, encryption (Squid supports SSL on either side, Varnish doesn't support SSL in any form, as far as I know), other protocols (I once deployed Squid for a top three computer manufacturer across several worldwide offices to allow their customers to access formerly FTP only support resources via HTTP; not possible with Varnish), etc.

Actually a luxury sedan vs. a Ferrari isn't really appropriate...Squid is more of a Winnebago filled with goodies, while Varnish is a stripped down (but very, very fast) race car. Both have their place. If all you need is what Varnish can do, then you should use Varnish. My metaphors were not indignant, they were intended to be illustrative of the difference in design goals of the two projects. One seeks to solve a wide variety of web caching and website acceleration problems with excellent platform compatibility, good performance, good security, good reliability, and reasonable configuration. The other seeks to solve a very small set of website acceleration problems with excellent performance; performance is, and always has been, the primary motivation driving Varnish, and it is very successful by that metric.

My point was not to criticize Varnish, but to point out that it is a solution to a very specific set of problems, and that in my seven years of working in the website acceleration industry, I rarely came upon a deployment that would have been well-served by those particular features; if I had, I very likely would have deployed Varnish instead of Squid. I am agnostic when it comes to Open Source software, as long as the quality is high, and in this case quality is high for both projects. But most of the time, the customer needed several of those other features of Squid, and performance was only third or fourth on the priority list.

Have used both varnish and squid on production servers, currently serving up to 500K+ uniques daily using varnish.

Squid is a 'feature complete' proxy; it's got everything and the kitchen sink. That means that if all you want is a web-facing proxy without those bells and whistles, there is a good chance that varnish is faster (as it was in our case).

But varnish is not suitable for all deployments, and it can take some pretty scary tweaking to get maximum performance out of it.

> Have used both varnish and squid on production servers, currently serving up to 500K+ uniques daily using varnish.

To put this into perspective, this is only a couple hundred requests per second (I'm assuming a "unique" equates to ~40 item requests). Either Squid or Varnish can handle that load with ease from a single box (roughly ten years ago, my previous company built a 550MHz, 512MB, 1x7200 RPM IDE disk Squid-based appliance that was conservatively specced at 70 tps, and adding a second disk pushed it up to 110 tps).
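For reference, the back-of-the-envelope arithmetic behind "a couple hundred requests per second" can be sketched as below. The 40 requests per unique is the assumption stated above; spreading the load evenly over 24 hours is an extra simplification of this sketch, so real peaks would be higher than the average it prints.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rps(uniques_per_day: int, requests_per_unique: int) -> float:
    """Average requests per second, assuming load is spread evenly over the day."""
    return uniques_per_day * requests_per_unique / SECONDS_PER_DAY


# 500K uniques/day at an assumed ~40 item requests each.
rps = average_rps(500_000, 40)
print(f"{rps:.0f} requests/second on average")  # prints "231 requests/second on average"
```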

This is what I meant by performance not being the most important factor in most deployments of this type (though it certainly matters a lot; and faster is always better, since it results in better retention, more sales, more page views, etc.).

Correction, we are serving 12K requests per second across several machines. (4 right now, spreading things out over 6 in a little while).

Our users are rather longer on our site than you assume, and a typical page is 50 hits on the varnish caches.

OK, your uniques are more demanding than most deployments I've seen. So, performance is of significant importance to you, and it would save you some machines to go with Varnish over Squid (which tops out at a few hundred requests per second, maybe as much as a couple thousand depending on type of work and whether it can be served from RAM, due to various limitations in its poll-based architecture; then again, it looks like Adrian and Robert and Henrik have done quite a bit of work in the 2.7 and 3.1 branches, so it may be faster today).

We use Varnish at vg.no (an online newspaper). It's quite effective for unpersonalized content. We run 2 Varnish boxes, each of which sees ~2-3k requests/sec. The backend sees only ~30 req/sec, so that's a pretty good cache ratio.
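As a rough check on those numbers, the implied cache hit ratio can be sketched like this. It assumes the ~30 req/sec is the backend total across both boxes and that every cache miss reaches the backend exactly once; neither is stated in the comment, so treat the result as an estimate.

```python
def hit_ratio(frontend_rps: float, backend_rps: float) -> float:
    """Fraction of requests answered from cache, assuming each miss hits the backend once."""
    return 1.0 - backend_rps / frontend_rps


BOXES = 2
BACKEND_RPS = 30  # assumed to be the total across both boxes

# Low and high ends of the quoted ~2-3k requests/sec per box.
for per_box_rps in (2_000, 3_000):
    total_rps = BOXES * per_box_rps
    print(f"{total_rps} req/s in front: {hit_ratio(total_rps, BACKEND_RPS):.2%} served from cache")
```

Under those assumptions the ratio works out to somewhere between 99.25% and 99.5%.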

For personalized content ESI seems to be the way to go, but we haven't tried that yet.

What are the use cases for setting up reverse proxies for production web apps (of medium to small scale) versus using a CDN like Panther Express/CDNetworks?

(Genuinely curious)

There are lots of good reasons that go beyond content caching (which is the topic of the article).

Somewhat terse write-up here: http://joshua.schachter.org/2008/01/proxy.html

The basic idea is that there's a fair bit of useful logic that can be handled by a simpler, lighter-weight, more easily securable front-end system than the lumbering tank that is your app server. This includes basic load balancing/failover, partitioning (requests of type foo go to these servers, all the low-latency ajax stuff over there, static content somewhere else), babysitting slow clients, blackholing malformed requests, etc.
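To make the partitioning and blackholing ideas concrete, here is a minimal sketch of the routing decision only, not a working proxy. The pool names, path prefixes, and the `choose_pool` helper are all hypothetical and chosen purely for illustration.

```python
from typing import Optional

# Hypothetical path prefixes and backend pools, for illustration only.
ROUTES = [
    ("/static/", "static_pool"),      # static content somewhere else
    ("/ajax/", "low_latency_pool"),   # low-latency ajax requests
]
DEFAULT_POOL = "app_pool"


def choose_pool(request_line: str) -> Optional[str]:
    """Return the backend pool for an HTTP request line, or None to blackhole it."""
    parts = request_line.split(" ")
    # Very rough sanity check: "<METHOD> <path> HTTP/<version>".
    if len(parts) != 3 or not parts[1].startswith("/") or not parts[2].startswith("HTTP/"):
        return None  # malformed request never reaches the app servers
    path = parts[1]
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL


print(choose_pool("GET /static/logo.png HTTP/1.1"))  # static_pool
print(choose_pool("POST /ajax/vote HTTP/1.1"))       # low_latency_pool
print(choose_pool("garbage"))                        # None
```

A real front end would express rules like these in the proxy's own configuration; the point is just that the decision is cheap and needs no application code.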

define:medium. You might want a reverse proxy for production web apps if you have an application on a few servers (where it also fills the role of load balancer), if you want to be able to serve stale content when your app servers go down, if you want to use the awesomeness that is ESI, if you don't want latency between your edge and your app servers for quick flow-through, if you want sticky sessions (ewwwww!), et cetera.
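The "serve stale content if your app servers go down" case can be sketched as a tiny in-memory cache. This illustrates the idea only and is not how Squid or Varnish implement it; the `StaleOnErrorCache` class, its `fetch` callback, and the TTL handling are all assumptions of the sketch.

```python
import time
from typing import Callable, Dict, Tuple


class StaleOnErrorCache:
    """Serve fresh entries from memory; fall back to expired ones if the backend fails."""

    def __init__(self, fetch: Callable[[str], str], ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # hypothetical callback that contacts the app server
        self._ttl = ttl          # seconds an entry counts as fresh
        self._clock = clock      # injectable so the behaviour can be tested
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        cached = self._store.get(url)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # fresh hit, backend not contacted
        try:
            body = self._fetch(url)
        except Exception:
            if cached is not None:
                return cached[1]  # backend is down: serve the stale copy
            raise  # nothing cached, so the error has to propagate
        self._store[url] = (now, body)
        return body
```

The trade-off is that users may see out-of-date pages during an outage, which is usually preferable to an error page for unpersonalized content.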

CDNs are fantastic for serving up high volume static content (images, css, downloadables) but they are really not appropriate for dynamic web pages that are personalized to the viewer.

In other words, you can serve up just about anything that does not need up-to-the-minute customization using a CDN.
