
How to Use Varnish Cache as Secured AWS S3 Gateway - wolfeel
http://info.varnish-software.com/blog/using-varnish-cache-secured-aws-s3-gateway
======
andrewguenther
I don't really buy this article. They claim you get "caching, more efficient
bandwidth utilization, centralized access with logging and security." S3 gives
you caching already. I fail to see how your bandwidth utilization improves; if
anything, you've introduced another hop (depending on where you're hosting
Varnish). Centralized access? I guess, if you're already using Varnish for
other things, but if you're at the point that you need Varnish, you probably
are aggregating multiple log sources already. Security? You're just adding
another possible exploitation layer. And on top of that you've just added
another box to maintain.

Not to rag on Varnish, I love it, I just don't really see this use case.
Personally, I would never route plain S3 data through Varnish.

~~~
reza_n
I wrote that blog post; it was intended to be a quick little guide for S3
integration. As for your point regarding caching and bandwidth utilization,
this actually comes up a bit, so it's a fairly valid point. A lot of the time,
software (and people) access the same S3 asset multiple times, all on the same
network. By going through a local cache, you avoid having to reach all the way
back to S3.
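
To make the local-cache argument above concrete, here is a minimal read-through cache sketch in Python. It is not how Varnish is implemented; `ReadThroughCache` and `fake_s3_get` are hypothetical names made up for illustration, and the "origin" is a stub rather than a real S3 request.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Tiny in-memory read-through cache keyed by object path (illustrative only)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        # fetch_from_origin is a hypothetical stand-in for a real S3 GET.
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.origin_requests = 0

    def get(self, key: str) -> bytes:
        if key not in self._store:
            # Cache miss: go to the origin once and remember the body.
            self.origin_requests += 1
            self._store[key] = self._fetch(key)
        # Cache hit (or freshly filled entry): served without touching the origin.
        return self._store[key]


def fake_s3_get(key: str) -> bytes:
    # Placeholder origin; a real one would make a network round trip to S3.
    return ("contents of " + key).encode()


cache = ReadThroughCache(fake_s3_get)
for _ in range(5):
    body = cache.get("/assets/logo.png")

# Five local reads of the same asset, counted against origin fetches.
print(cache.origin_requests)
```

A real cache would also need size limits, expiry, and invalidation, none of which this sketch attempts.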

~~~
coleifer
Lame of you not to link my post from yesterday about putting nginx in front of
s3. This post seems clearly to be in response to mine.

[http://charlesleifer.com/blog/nginx-a-caching-thumbnailing-r...](http://charlesleifer.com/blog/nginx-a-caching-thumbnailing-reverse-proxying-image-server-/)

------
phunge
We had a similar desire: secure, performant egress from S3. We ended up using
CloudFront. You can configure CloudFront to require signed URLs and to access
private buckets with a special S3 user
([http://docs.aws.amazon.com/AmazonCloudFront/latest/Developer...](http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/PrivateContent.html)).
Any app that holds the private key can sign URLs to download from CloudFront.
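
As a rough sketch of what "signing a URL" involves, the Python below builds a CloudFront canned policy and assembles the `Expires`, `Signature`, and `Key-Pair-Id` query parameters. The actual RSA signature is left to a caller-supplied `rsa_sign` function, which is an assumption of this sketch; the demo signer is a placeholder hash and does not produce a URL CloudFront would accept. The distribution host name and key pair ID are made up.

```python
import base64
import hashlib
import json
from typing import Callable


def cloudfront_b64(data: bytes) -> str:
    # CloudFront uses standard base64 with '+', '=', '/' swapped for '-', '_', '~'.
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")


def canned_policy(url: str, expires_epoch: int) -> str:
    # Canned policy: one resource plus an expiry time, serialized without whitespace.
    policy = {
        "Statement": [
            {
                "Resource": url,
                "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}},
            }
        ]
    }
    return json.dumps(policy, separators=(",", ":"))


def signed_url(
    url: str,
    expires_epoch: int,
    key_pair_id: str,
    rsa_sign: Callable[[bytes], bytes],
) -> str:
    # rsa_sign is assumed to sign the policy bytes with the CloudFront private key.
    signature = rsa_sign(canned_policy(url, expires_epoch).encode("utf-8"))
    separator = "&" if "?" in url else "?"
    return (
        f"{url}{separator}Expires={expires_epoch}"
        f"&Signature={cloudfront_b64(signature)}"
        f"&Key-Pair-Id={key_pair_id}"
    )


def placeholder_sign(payload: bytes) -> bytes:
    # NOT a real signature: stands in for an RSA signer so the sketch runs as-is.
    return hashlib.sha1(payload).digest()


example = signed_url(
    "https://d111111abcdef8.cloudfront.net/data/big-file.bin",  # hypothetical URL
    1700000000,
    "EXAMPLEKEYID",  # hypothetical key pair ID
    placeholder_sign,
)
print(example)
```

In practice an SDK helper or an RSA library would supply the signer; the point here is only the shape of the policy and the query string.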

It works great, but it really bugs me that we had to do that. The default
download speeds from our buckets on S3 are atrocious. We store big datafiles
in S3, and our development flow involves downloading them a lot. If Amazon had
an upgrade to S3 so downloads by chosen users weren't throttled or slow, we'd
pay for it in a heartbeat.

------
fideloper
That's really neat that you can do the authentication/authorization headers
within Varnish.

I'd still use CloudFront instead, which gives you these features plus the
advantage of having global edge servers, and support for TLS/SSL (not sure
about the HTTP/2 part).

Although the one advantage this has is the cache can stop you from having to
leave your external network to return a file (vs going to CloudFront or S3).
That can be useful.

Another commenter pointed out that Varnish won't support TLS either, which is
correct. The funny part about that is while they have long had the "why not
SSL?" FAQ page espousing the terribleness of OpenSSL, they DO offer that
feature on their paid product Varnish Plus. That not being a basic feature is
a tad crazy.

[1] Why no ssl? [https://www.varnish-cache.org/docs/4.1/phk/ssl.html](https://www.varnish-cache.org/docs/4.1/phk/ssl.html)

[2] [http://info.varnish-software.com/blog/how-implement-ssltls-v...](http://info.varnish-software.com/blog/how-implement-ssltls-varnish-plus)

------
gopalv
I wrote something similar for FB API access while I was at my last job, but
with zero caching.

This combined Varnish+Stunnel so that Varnish would handle the keep-alives and
stunnel could put the traffic back over SSL.

So there was a web-app (PHP) -> varnish:8081 (cheap-alive) -> stunnel:8080
(reverse ssl) -> api.fb:443.

VCL makes Varnish much more of a programming language than a configuration
system, but in total, skipping SSL negotiation in the PHP web-app and holding
onto sessions via stunnel got API calls down from the 800ms range to the 240ms
range.

------
bkeroack
No TLS and their stubborn refusal to even _acknowledge_ HTTP2 make this of
very limited usefulness.

~~~
andrewguenther
If anything, you're just adding a layer that removes features that S3 gives
you....cool.

------
vfulco
Per the comment about using CloudFront for "secure, performant egress from
S3", are there any other solutions that immediately come to mind? I'm working
in Shanghai, and AWS Beijing doesn't offer CloudFront as a service; separately,
everything has to be kept in-country for regulatory reasons. Ultimately I am
looking for non-expiring links to password-protected private folders (low
volume) to serve to clients, which I believe don't exist. I'm a little out of
my element SysOps-wise; I'm using the excellent SteveLTN https-portal Docker
image found on GitHub wrapped around JWilder's Nginx-proxy, with some Flask,
MongoDB, and Redis thrown into the mix (hopefully someday soon). TIA.

------
taylorbuley
The thing that's really challenging about Varnish and AWS is that AWS
typically uses CNAMEs and not IPs, whereas Varnish resolves backend domain
names once, at startup. This means you'll have trouble trying to set up a
Varnish cache in front of an LB whose addresses change.

~~~
reza_n
Starting with Varnish Cache 4.1, backends are dynamic and configurable from
VMODs. This means proper DNS support for backends. There is already a DNS
backend implementation here [0].

[0] [https://github.com/Dridi/libvmod-named](https://github.com/Dridi/libvmod-named)
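
To illustrate the difference between resolving a backend name once and re-resolving it periodically, here is a small Python sketch. It is not how Varnish or libvmod-named works internally; `RefreshingBackend`, the TTL value, and the host name are all assumptions made up for this example, and the demo injects a fake resolver and clock so it runs without network access.

```python
import socket
import time
from typing import Callable, List, Optional


def system_resolve(hostname: str) -> List[str]:
    # Default resolver: ask the OS for the current addresses behind the name.
    infos = socket.getaddrinfo(hostname, 80, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


class RefreshingBackend:
    """Re-resolves a host name once its cached answer is older than ttl_seconds."""

    def __init__(
        self,
        hostname: str,
        ttl_seconds: float = 30.0,
        resolve: Callable[[str], List[str]] = system_resolve,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._hostname = hostname
        self._ttl = ttl_seconds
        self._resolve = resolve
        self._clock = clock
        self._addresses: List[str] = []
        self._expires_at: Optional[float] = None

    def addresses(self) -> List[str]:
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # First use or stale entry: look the name up again.
            self._addresses = self._resolve(self._hostname)
            self._expires_at = now + self._ttl
        return self._addresses


# Demo with a fake resolver and clock: the load balancer "moves" between lookups.
answers = iter([["10.0.0.1"], ["10.0.0.2"]])
now = [0.0]
backend = RefreshingBackend(
    "lb.example.internal",  # hypothetical load balancer name
    ttl_seconds=30.0,
    resolve=lambda host: next(answers),
    clock=lambda: now[0],
)

first = backend.addresses()      # initial lookup
now[0] = 10.0
cached = backend.addresses()     # still within the TTL, no new lookup
now[0] = 31.0
refreshed = backend.addresses()  # TTL expired, picks up the new address
print(first, cached, refreshed)
```

A resolve-once design corresponds to never letting the cached entry expire, which is why a load balancer that changes IPs eventually becomes unreachable.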

------
b1twise
I think this is mostly a solution without a problem. Just to hit a few points:
S3 might (and I'm not sure I agree) be expensive. However, it gives you HA
and multiple copies of your files on the backend for safety. If you're serving
a lot of images, owning that hardware can be very costly. Also, when we tested
CloudFront the results were pretty good: easy to implement, and not much more
expensive than S3 alone. Our EC2 costs dwarf S3, and our site is very image
heavy.

------
atrudeau
Can you extend the post a bit more with information about cache size, eviction
policy, purging assets?

------
ran290
Is it possible to do the back end connection to S3 securely instead of over
insecure HTTP?

------
euphemize
This is an unfortunate title for this post. AWS released API Gateway sometime
last year, and this makes it sound like a caching layer behind API Gateway, to
access S3 files.

