
Apache Traffic Server - nikolay
https://trafficserver.apache.org/
======
lumanaughty
I work at a large company that uses ATS heavily (top 5 site). There have been
huge improvements in performance and functionality after this paper has been
written.

In his benchmarks he was running with a single volume configured for cache,
which would have a global lock on cache. If he partitioned the cache into
multiple volumes (something we do by default now) he would have had much lower
cache hit response times.
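For reference, splitting the cache into volumes is done in ATS's `volume.config`; a minimal sketch (the four-way split and percentages are illustrative, not production values):

```
# volume.config -- each volume gets its own lock, so cache
# operations no longer serialize on one global cache lock
volume=1 scheme=http size=25%
volume=2 scheme=http size=25%
volume=3 scheme=http size=25%
volume=4 scheme=http size=25%
```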

The majority of our cache hit response times in production are less than 1 ms.

In the benchmarks I have run, ATS has always been faster than Varnish and
NGiNX. If they weren't, I would have made changes to ATS to make it faster.

~~~
eggnet
It's obviously working well for you but I'm curious how you deal with the
circular buffer cache with ATS. For a large library that would seem to be an
immediate disqualifier. Or maybe I'm not understanding how that works.

I'm assuming as a top 5 site you probably deal with a large library, and
appear to be ok with ATS despite that. Comments?

~~~
lumanaughty
The Tornado Cache (FIFO) hasn't really been an issue as an eviction algorithm.
Most of our caches are sized to hold over a week's (most over a month's) worth
of objects in cache. Most objects/traffic is temporal in nature. The popular
images and videos are normally only popular for a certain time period.

We have looked at not evicting objects on disk if they are in the RAM cache,
which has an LRU-like eviction algorithm (really it is CLFUS). Doing this
would help avoid evicting really popular objects.

FIFO has advantages over LRU for disks. It is very efficient with writes since
they are all sequential. We use rotational disks when building out very large
second tier caches.
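The FIFO behavior described above can be sketched in a few lines of Python (a toy model, not ATS's actual on-disk implementation): new objects always go to the head of the ring, so writes stay sequential and the oldest writes are overwritten first, no matter how recently they were read.

```python
from collections import OrderedDict

class FifoRingCache:
    """Toy model of a circular (FIFO) cache: new objects overwrite
    the oldest ones, regardless of how recently they were read."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()  # insertion order == write order

    def put(self, key, value):
        if key in self.store:
            return  # already cached; FIFO does not reorder on write
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)  # evict the oldest write
        self.store[key] = value

    def get(self, key):
        return self.store.get(key)  # reads never affect eviction order
```

Unlike LRU, a `get` never promotes an object, which is why sizing the cache to hold a week's or month's worth of traffic matters so much here.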

There are other things to consider when looking at the cache in a proxy
server. How many bytes does the in-memory index take per object in cache (for
ATS, 10 bytes, and that is _extremely_ efficient)? Also, does the cache use the
filesystem and/or use sendfile for HTTP (like NGiNX), which can't be used
when serving HTTPS or HTTP/2? Netflix is experiencing this pain when moving to
HTTPS with NGiNX.
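To put that 10 bytes/object in perspective, a quick back-of-the-envelope sketch (the object count is a made-up example, not a figure from the thread):

```python
# Hypothetical cache holding 100 million objects on disk.
objects = 100_000_000
ats_bytes_per_object = 10  # per-object index cost quoted above for ATS

index_mb = objects * ats_bytes_per_object / 1024 / 1024
print(f"in-memory index: ~{index_mb:.0f} MB")  # ~954 MB for 100M objects
```

Even a very large disk cache stays indexable in RAM at that rate.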

Every proxy server has some advantages: ease of use, well-supported APIs,
flexible configuration, dynamically loadable modules, HTTP specification
compliance, HTTP/2 support, TLS support, performance, etc. It really depends
on what you are looking for when choosing a proxy server.

~~~
eggnet
Sounds like if you can size the cache large enough to hold enough data to keep
cache miss under control, ATS is solid. Thank you for sharing!

~~~
lhedstrom
Well, that's true for any cache. The choice of a simple eviction algorithm in
ATS is deliberate, and usually yields better cache efficiency than more
complex architectures.

Fwiw, it does support cache pinning, but that's rarely used nor necessary.

~~~
jjm
+1 In the long run efficiency is very high. I wonder if a set of test data
demonstrating this as a KPI could be added, beyond raw benchmark performance.

------
zerd
Comparison with Varnish from "Performance Evaluation of the Apache Traffic
Server and Varnish Reverse Proxies" (2012) [1]

 _... the results indicated that Apache Traffic Server reached better cache
hit rates and slightly better bandwidth throughput with the cost of higher
system and network resource usage. Varnish on the other hand managed to
response higher request rates with better response time, especially for the
cache hits. The findings in this thesis indicates that Varnish seems to be
more promising reverse proxy._

[1]
[https://www.duo.uio.no/handle/10852/34903](https://www.duo.uio.no/handle/10852/34903)

------
nikolay
By the way, Google's PageSpeed module, which is available for Apache [0] and
Nginx [1], is also available on ATS [2]!

[0]: [http://modpagespeed.com/](http://modpagespeed.com/)

[1]: [http://ngxpagespeed.com/](http://ngxpagespeed.com/)

[2]: [https://www.iispeed.com/pagespeed/products/ats-pagespeed](https://www.iispeed.com/pagespeed/products/ats-pagespeed)

------
bobfunk
Our CDN at Netlify ([https://www.netlify.com](https://www.netlify.com)) is
based on Traffic Server and its powerful plugin engine.

I've used Squid, Varnish, and nginx plenty, but Traffic Server beat them
in our benchmarks, and the built-in SSL termination + plugin API makes it
extremely powerful...

~~~
lhedstrom
Nice. How about adding Netlify to
[http://trafficserver.apache.org/users.html](http://trafficserver.apache.org/users.html)
? :)

------
exelius
Can someone explain how this is different than how most people use Apache
HTTPD these days? I'm not trying to be snarky -- I'm genuinely curious.

~~~
skuhn
As others have indicated, it's a proxy server rather than a general purpose
webserver. There's no code relation between the two servers; it's simply that
the team at Yahoo chose to pursue it as an Apache Foundation project when they
open sourced it.

ATS scales orders of magnitude better than Apache, due to its process model.
Whereas at Yahoo we would budget between 30 and 200 simultaneous connections per
Apache server (prefork), the proxy service which I ran using ATS was budgeted
for over 100,000 concurrent connections per machine.

It's significantly less featureful than Apache, but it does caching
substantially better than any other cache server commonly available (nginx,
apache, squid, varnish).

~~~
jimjag
"Apache, due to its process model."... "(prefork)"

Seems to me that the above is all based on reflections from decades ago w/
Apache httpd 1.3. Right now, all web servers can handle similar levels of
concurrency with the bottleneck being the network pipe itself.

ATS is a great platform; using it in combo w/ Apache httpd (2.4) allows a pure
open source implementation with all the power, speed, reliability one could
want, and protection against Open Core business models.

~~~
skuhn
Uh well, it wasn't "decades ago" but it was some old timey stuff. Yahoo used
Apache 1.3 as recently as 2012. They also disabled Keep-Alive on Apache, and
most properties would use the hardcoded default number of prefork processes
(32). It wasn't the smartest setup.

Nevertheless, I don't think that nginx / Apache / Varnish / haproxy / etc. are
able to handle similar concurrent connection levels as ATS without
significantly impacting 95th percentile latency due to their core
architectures.

~~~
jimjag
There's good info here:

    http://www.slideshare.net/bryan_call/choosing-a-proxy-server-apachecon-2014

~~~
jjm
Hey, nice seeing you here! Glad to see the project reach HN (finally). That
indeed is a very good slide deck.

Some features that I preach are:

      - good turnkey default values
      - lua support
      - config options galore (Bryan labels it a con, but if you want control it's perfect)
      - good logging
      - historically proven scalability on large smp, xxlarge memory, multi nic systems

Edit: forgot to add one more thing, though maybe not worthy of a bullet point.
If possible, a preference for physical rather than virtual hardware is where
I've seen performance with ATS shine. That is one reason you would want as
much config control as possible.

------
pavs
Anyone using this as a transparent forward proxy at decent traffic levels
(3-4 Gbps)? If so, what kind of hardware is needed to process 3-4 Gbps of
traffic?

------
blantonl
Any advantages to using this over a blended haproxy / varnish setup?

~~~
skuhn
Not really at small scale, but if you're building a large service there are
several advantages.

For one thing, not having to copy response data between processes improves
throughput. Since Varnish is so resistant to supporting SSL natively, you'll
always have to place something in front of it to use it with the modern web.
Whether it's haproxy, Apache or nginx, that's just one more thing to deal
with.

I have some other beefs with Varnish, but the most annoying one is the absence
of a persistent disk cache. If the Varnish process dies, there goes your disk
cache. Even though cache data is written out to disk, Varnish punted on saving
an index and re-using an old process's cache, so it writes the cache to an
unlinked file.

Imagine a bad code push or new traffic pattern that causes core dumps across
your entire service footprint -- and now it isn't just a problem of getting
the process back up and stable, you have also lost hundreds of terabytes of
cache data. Or something as simple as rolling out a new version. You can
architect around the problem, but why should you even have to?

ATS also (recently) supports Lua for plugins, which is way more powerful than
VCL. It is a finicky piece of software though, and there are a lot more sharp
edges that you're likely to cut yourself on during the initial honeymoon
period versus Varnish.

~~~
nikolay
Varnish (just like Nginx) is putting key features behind a paywall. That's one
of the reasons I personally want to consider Traffic Server. Plus, it's been
around for ages, has a great architecture, and a great track record as well.
All it needs is a little more awareness and that's why I keep posting it here.
:)

~~~
skuhn
I sympathize with the authors of Varnish and nginx; their software is used all
over the place, and they want to make a living at it. I just don't want to
support that kind of business model, and I'm never dealing with per-server
license compliance again.

I wish more companies would model themselves after Percona: charge for
support, custom engineering and on-call -- don't fork or paywall any code.

ATS suffers by comparison, since there is no "ATS Inc." to provide support and
engineering work. There's OmniTI, but I don't have first hand experience with
their service to say if it's worthwhile or not. They did get paid to write the
current ATS docs, so presumably they know what they're doing.

I wish ATS got more attention, but it is after all a bit of a niche product
hidden away in the Apache Foundation with a bunch of unrelated Java projects.
It's too fiddly for small scale use, and once you hit large scale you're
pretty much hiring someone from Yahoo or elsewhere that has experience running
and developing it (for example: I'd like to hire ATS people). Doesn't give it
a lot of opportunity to trickle into smaller shops and grow with their
service.

~~~
nikolay
There are right and wrong ways to monetize. Putting basic features behind a
paywall and per-server licensing as you pointed out is not something I can
live with. Even if I don't use these features, I feel I'm using a subpar
product, and this makes me look for alternatives such as ATS and H2O [0].

[0]: [https://h2o.examp1e.net/](https://h2o.examp1e.net/)

------
adrianpike
At a company I was at a while ago we used/abused ATS _very_ heavily - apart
from the pain of a comparatively obscure tool, it was great.

It took us some digging and work to get configured exactly right, especially
since we were using it fairly nonstandard - as a caching forward proxy to
external data sources.

Good stuff.

------
X-Istence
Apache Traffic Server is what is used in Comcast's CDN.

------
ksec
It seems strange to me that Nginx and Varnish are mostly getting the headlines
and not ATS; any idea why that may be?

~~~
lhedstrom
Likely because ATS is considered 'difficult' whereas the others are 'easy'.
I'd argue that if you are running a serious site, you need engineering
expertise regardless of which software you choose.

------
aayala
nginx ?

~~~
skuhn
Some people do use nginx as a caching server. CloudFlare, for example, is
built on top of it.

The reason to do so is because you want the rest of what nginx provides, not
to get its caching module. It is an extremely barebones solution that only
solves the most basic requirements. I can only presume that CloudFlare and
others have written their own caching modules for nginx.

A short list of annoyances:

1. No support for multiple disk devices. Files are written to a fixed temp
path and then renamed to their real destination. So you need to use RAID to
present the disks as one logical device, which is a wholly unnecessary expense
in a caching environment.

2. No support for purging in open source. This is an nginx plus feature,
which starts at $1900/server/year.

3. Because of the temp file / rename thing, support for streaming subsequent
requests off of the first request that is filling the cache is janky.
Subsequent requests have to acquire a lock.

4. No support for any fancier cache setups utilizing ICP / HTCP.
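For instance, nginx's cache takes exactly one path; a hedged sketch of the relevant directives (paths and sizes are illustrative), where the temp path must live on the same filesystem to avoid a cross-device copy on rename:

```
# nginx.conf fragment -- one cache path only; spanning multiple
# disks means putting RAID/LVM underneath it
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=edge:100m
                 max_size=500g inactive=7d;
proxy_temp_path  /var/cache/nginx/tmp;  # same filesystem as the cache path,
                                        # or every cache fill is written then copied
```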

~~~
nikolay
I was considering Varnish + Nginx at some point, but this gets much more
complicated than just using ATS.

~~~
skuhn
It makes sense, considering that nginx is typically ahead of the pack on
support for things like SPDY and H2. Plus if you use OpenResty with its Lua
functionality, you can do a lot of fancy things in nginx and reduce your
dependency on Varnish's VCL. And you have to have something in front of
Varnish to do SSL anyway.

VCL in particular is kind of a trap. Early on it can do what you need --
remove a header, set a header, basic branching. Then you want to do basic
arithmetic, or validity checking, or anything that isn't suitable for string
assignment or regex and you straight up can't do it. VCL makes me long for the
power of bash scripts.
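To illustrate, the kind of thing VCL handles well (header edits plus simple branching), in stock Varnish 4 syntax; the URL pattern is just an example:

```
sub vcl_recv {
    # Easy in VCL: strip cookies on static assets
    if (req.url ~ "\.(png|jpg|css|js)$") {
        unset req.http.Cookie;
    }
    # But arithmetic, validation, or anything beyond string
    # assignment and regex quickly hits a wall, as noted above.
}
```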

Ultimately though, it's really not a great solution to separate these concerns
between multiple applications. You're going to get bitten somewhere, even if
it's just the old ephemeral port exhaustion problem.

~~~
nikolay
That is true. I was particularly interested in tag-based cache purge and
although there are similar open-source Nginx modules [0] and [1], they still
don't have that out of the box.

[0]: [https://github.com/pintsized/ledge](https://github.com/pintsized/ledge)

[1]: [https://github.com/wandenberg/nginx-selective-cache-purge-
mo...](https://github.com/wandenberg/nginx-selective-cache-purge-module)

~~~
skuhn
Tag-based cache purges are something I would love to see in ATS. I think doing
it correctly would require a complete rejiggering of the cache storage though,
and that's not something to be undertaken lightly.

Storing externally in redis (for ledge) seems like the wrong approach to me.
Better to store metadata externally and generate the purge URLs based on that.
It's not ideal, but it's the best option I've come up with.
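The approach described, external metadata that fans a tag out to individual purge URLs, might look like this sketch (the `TagIndex` class and its methods are hypothetical, not an ATS or ledge API):

```python
from collections import defaultdict

class TagIndex:
    """Hypothetical external metadata store mapping tags to cached URLs,
    used to drive per-URL purges on a cache with no native tag support."""

    def __init__(self):
        self.urls_by_tag = defaultdict(set)

    def record(self, url, tags):
        # Called when a response is cached, with its surrogate tags.
        for tag in tags:
            self.urls_by_tag[tag].add(url)

    def purge_urls(self, tag):
        # Returns the URLs to issue individual PURGE requests for,
        # and forgets the tag once it has been purged.
        return sorted(self.urls_by_tag.pop(tag, set()))

idx = TagIndex()
idx.record("/img/1.jpg", ["gallery-42"])
idx.record("/img/2.jpg", ["gallery-42", "front-page"])
print(idx.purge_urls("gallery-42"))  # ['/img/1.jpg', '/img/2.jpg']
```

The purge fan-out is only as fresh as the metadata store, which is the "not ideal" part mentioned above.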

~~~
nikolay
It's an essential feature that is dragging me towards Varnish, even if this
comes with tons of negatives. The bans feature is just amazing.

And right or wrong, there are not that many implementations to choose from in
the Nginx land, unfortunately.

------
frik
Better overview:
[https://en.wikipedia.org/wiki/Traffic_Server](https://en.wikipedia.org/wiki/Traffic_Server)

Initially created by Inktomi, later bought by Yahoo!, it was open sourced and
brought to the Apache Foundation in 2009, partly because of Yahoo!'s good
experience with Hadoop in 2008/09.

