Hacker News
3 Problems AWS Needs to Address (jacobelder.com)
136 points by aaronwhite on May 9, 2012 | 47 comments

The ability for S3 and CloudFront to properly handle GZIP-compressed files would further encourage the use of S3 + CloudFront for static websites. As a host, S3 + CloudFront has arbitrary scalability, performs well across the globe, and is pay-as-you-go.

With GZIP compression, bandwidth drops but more importantly load times can decrease significantly. "It takes several round trips between client and server before the two can communicate at the highest possible speed [and for broadband users] the number of round trips is the larger factor in determining the time required to load a web page"[2]. There was a graph depicting the non-linear impact file size increases have on load times but I can't find it... =[

In the Google article on compression, a 175% increase in a page's size (the non-GZIP version of Facebook.com) results in a 414% increase in load time on DSL. Load time does not increase linearly with file size, which is why GZIP compression is so important for performant websites!
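To get a feel for how much text assets shrink, here is a minimal sketch using Python's standard `gzip` module. The markup is a made-up, highly repetitive stand-in for a page, so it compresses far better than a real site would; treat the ratio as illustrative only.

```python
import gzip

# Hypothetical page body: repetitive markup, loosely like real HTML.
html = ('<li class="item"><a href="/page">A link</a></li>\n' * 500).encode("utf-8")

# gzip.compress produces the same format a server sends with
# "Content-Encoding: gzip".
compressed = gzip.compress(html)
ratio = len(compressed) / len(html)

print(f"original: {len(html)} bytes")
print(f"gzipped:  {len(compressed)} bytes")
print(f"ratio:    {ratio:.1%}")
```

Running the same measurement against your own HTML, CSS, and JavaScript files gives a more honest number than this toy input.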

[1]: http://aws.typepad.com/aws/2011/02/host-your-static-website-...

[2]: https://developers.google.com/speed/articles/use-compression

It's a little-known fact that CloudFront supports GZip just fine, so long as you're using pull from custom origin (like most people are).

You just need to configure your origin servers to serve GZip even to HTTP 1.0 (which is what CF requests will come as) and set the "Vary: Accept-Encoding" header to prevent users of old IE versions from having GZip'd content they don't support stuffed down their throats.

For example, this is my nginx configuration which serves both GZip'd and non-GZip'd versions of the same objects via CF. The second and third lines are the most important for correct AWS CF GZip distribution:

    gzip  on;
    gzip_vary on;
    gzip_http_version 1.0;
    gzip_comp_level 4;
    gzip_proxied any;
    gzip_types      text/plain text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript image/png;
    gzip_disable    "MSIE [1-6]\.";

Note that "image/png" is only in there because Google PageSpeed is very stupid and marks not GZipping PNG files as a "bug" because I can save "up to 1%" by employing GZip on PNGs.
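One way to check an origin against the two requirements described above (gzip served to HTTP/1.0 requests, plus a "Vary: Accept-Encoding" header) is sketched below in Python. The function names are hypothetical, and the sketch only builds the probe request and inspects headers you have already fetched; it does not talk to CloudFront or nginx itself.

```python
def build_http10_probe(host: str, path: str) -> bytes:
    """Build a raw HTTP/1.0 request that advertises gzip support."""
    return (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "Accept-Encoding: gzip\r\n"
        "\r\n"
    ).encode("ascii")


def origin_is_cdn_ready(response_headers: dict[str, str]) -> bool:
    """Return True if the response is gzipped and varies on Accept-Encoding."""
    # Header names are case-insensitive, so normalise before looking them up.
    headers = {k.lower(): v.lower() for k, v in response_headers.items()}
    gzipped = "gzip" in headers.get("content-encoding", "")
    varies = "accept-encoding" in headers.get("vary", "")
    return gzipped and varies


probe = build_http10_probe("origin.example.com", "/app.css")
ok = origin_is_cdn_ready({"Content-Encoding": "gzip", "Vary": "Accept-Encoding"})
print(ok)
```

Sending the probe bytes over a plain socket to the origin and feeding the parsed response headers into `origin_is_cdn_ready` would show whether `gzip_http_version 1.0` and `gzip_vary on` are taking effect.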

>gzipping format that uses deflate compression


This isn't reddit. Please reply with a helpful comment describing why this is/isn't the right way to do things.


Sorry, I made assumptions that people knew what gzip was. EDIT: That wasn't meant to be condescending, I apologise.

To clarify the original comment: I think it's rather pointless to gzip a PNG file, since PNG files use deflate compression, the same method gzip uses, and hence it has very little real benefit, if any.
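The point about compressing twice can be sketched with Python's `zlib` and `gzip` modules. The input here is seeded pseudo-random bytes standing in for image data, not a real PNG, so the exact sizes are only illustrative.

```python
import gzip
import random
import zlib

# Deterministic stand-in for raw image data (an assumption, not a real PNG).
rng = random.Random(0)
raw = bytes(rng.randrange(64) for _ in range(50_000))

# First pass: deflate, the compression method PNG uses internally.
once = zlib.compress(raw, 9)

# Second pass: gzip (also deflate) applied to already-deflated bytes.
twice = gzip.compress(once)

print(f"raw:            {len(raw)} bytes")
print(f"deflated once:  {len(once)} bytes")
print(f"then gzipped:   {len(twice)} bytes")
```

The first pass removes most of the redundancy, so the second pass has almost nothing left to find and mostly adds its own header and trailer.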

Your look of disapproval seems to have overlooked why he said he was using it. He's clearly aware of your point, and even mentioned the uselessness: "Up to 1% savings" while explaining why: Google Page Speed is stupid.

Yes, I was pointing out the stupidity of that further because you're compressing something twice.

Why do people always say "this is not reddit"? I never even heard of reddit till I saw those comments. And I think this observation made over and over again usually adds less to the discussion than the politically incorrect statements this comment is made in response to.

Also, I don't think your link relates to the comment. Your link is about choosing between gzip and deflate; the comment you found objectionable was about the irony of gzipping a file that is in a format that also supports deflate (so you might compress a compressed file, which I think can actually make the file larger).

Many on HN were once / still are redditors. Especially proggit. The type of response where you quote part of the parent, don't add any comment or commentary, isn't especially helpful.

Addressing your second point: I also thought the comment was a snarky reply, wondering why someone would use gzip instead of deflate. The parent comment addressed, specifically, why they were using gzip on PNG, which is why I assumed the 3 character response was focused on the compression type, rather than whether to use it on PNGs.

Missing support for Cross-Origin Resource Sharing headers is a big problem for some applications. For example, drawing images to a canvas from s3/cloudfront will unavoidably taint your canvas. (https://developer.mozilla.org/en/CORS_Enabled_Image)

Right now I'm proxying image requests to s3 through nginx, which is a terrible workaround.

The AWS forums have a topic on the issue started in 2009 (~200 replies so far...): https://forums.aws.amazon.com/thread.jspa?threadID=34281
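For context, a proxy like the one described above mostly exists to add one response header that S3 would not send at the time. A minimal sketch of that decision in Python follows; `cors_headers` and the allowed-origin set are hypothetical names, not part of any AWS or nginx API.

```python
from typing import Optional


def cors_headers(request_origin: Optional[str], allowed_origins: set[str]) -> dict[str, str]:
    """Return the extra response headers a proxy would add for a CORS request."""
    if request_origin is None or request_origin not in allowed_origins:
        # Not a cross-origin request we want to allow: add nothing.
        return {}
    return {
        # Echo the specific origin rather than "*" so caches stay correct.
        "Access-Control-Allow-Origin": request_origin,
        "Vary": "Origin",
    }


allowed = {"https://app.example.com"}
print(cors_headers("https://app.example.com", allowed))
print(cors_headers("https://evil.example.net", allowed))
```

With that header present (and the image requested with `crossorigin` set), the browser can draw the image to a canvas without tainting it.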

We do the same thing for the same reason. Luckily we were able to split off a lot of our other content serving (same content, not to canvas elements) to a CDN which is backed by our origin servers. It's a little crazy, but is the best we can do until there's better CORS and custom SSL for Cloudfront.

We're planning to migrate Pictos Server [1] away from S3 and onto Cloud Files for this exact reason. IE9 refuses to use webfonts loaded from a different origin unless CORS allows it, something we can't easily do without adding EC2 to our deployment.

[1] http://pictos.cc/server/

> S3 has eleven nines of durability.

The author will find, to his dismay, that durability is not the same thing as availability.

The complexity implied by anything "better" than three nines is a recipe for disaster.

In reality, neither you, nor Amazon, nor anyone else has any idea how durable S3 is. But if they _did_, it wouldn't matter because unexpected interactions, cascading failures, and SNAFU will keep it from ever being realized.

Much better to have more frequent, very boring failures than to have rare spectacular ones.

The author is proposing to serve his site entirely from S3, claiming it's better than using a couple of nginx boxes because S3 has eleven nines of durability.

Durability means you will get your data eventually (it will not be lost). Availability means you will get your data right now, which is probably what he really cares about in terms of serving live internet traffic.

Put another way: S3 not infrequently has availability hiccups (files are temporarily unavailable, resulting in a disruption of service), without taking durability hits (your files haven't been lost, you just can't see them right now).
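The two kinds of "nines" measure different things, which a little arithmetic makes concrete. This is a rough Python sketch: it treats availability nines as a fraction of the year and durability nines as an annual per-object loss probability, and the object count is an arbitrary example.

```python
HOURS_PER_YEAR = 365 * 24


def unavailable_hours_per_year(nines: int) -> float:
    """Hours per year a service may be unreachable at N nines of availability."""
    return HOURS_PER_YEAR * 10.0 ** -nines


def expected_objects_lost_per_year(nines: int, object_count: int) -> float:
    """Expected objects lost per year at N nines of durability."""
    return object_count * 10.0 ** -nines


# Three nines of availability: roughly 8.76 hours of downtime a year.
print(unavailable_hours_per_year(3))

# Eleven nines of durability over ten million objects: about 0.0001
# objects a year, i.e. one lost object every 10,000 years on average.
print(expected_objects_lost_per_year(11, 10_000_000))
```

A service can hit the second figure and still miss the first, since a file that is temporarily unreachable has not been lost.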

Unfortunately frequent boring failures do not preclude rare spectacular ones.

This is less "AWS" and more "S3/CloudFront".

There are many other product features that EC2/R53/ELB/etc. could use, but calling this AWS is a little too broad.

He uses other AWS services but these are all of his major gripes. So I think it's fair for him to say AWS.

Also AWS is an organization (part of a larger organization), but S3 is a product.

Actually, I would like to see S3 support custom SSL certificates. That would be an awesome addition to make S3 a great static page server.

Related: custom SSL for Cloudfront. It's a real show-stopper that I can't serve Cloudfront over SSL via a CNAME.

> You could break your CSS into multiple files, but this is in direct opposition to one of the tenants of website optimization: minimize the number of HTTP requests.

Am I missing something here? Your fonts were going to be in a separate file anyway, right?

I tweeted the same thing to that account and got no response. I'm glad you did. The Access-Control-Allow-Origin header has been a heavily requested feature since 2009: https://forums.aws.amazon.com/thread.jspa?threadID=34281&...

One example of how fundamental this is: you cannot currently perform a direct AJAX upload to an s3 bucket from a web application hosted on an ec2 instance.

There is a postMessage hack that will work with small files, and of course you can use a proxy, but you'd think it would be a common scenario to want to upload files directly to S3.

You can upload files directly to s3 from your website: http://aws.amazon.com/articles/1434

This only works standalone and not embedded in an AJAX application.

I only wish someone from the S3 team at Amazon would at least answer to this thread:


Hundreds of messages in this thread and no answer after 2 years and counting.

Well, "AJAX application" is too nebulous of a term to really be meaningful here. If you mean you can't use an XHR request, sure, but you can certainly construct the necessary <input> element and POST request dynamically with JavaScript.
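For reference, the form-based upload discussed above relies on a signed policy document that the server hands to the page. Below is a minimal Python sketch of the legacy (Signature Version 2) scheme: the policy JSON is base64-encoded and signed with HMAC-SHA1 using the secret key. The bucket name, key prefix, expiry, and secret are made-up placeholders, and newer AWS regions expect Signature Version 4 instead.

```python
import base64
import hashlib
import hmac
import json


def sign_post_policy(policy: dict, secret_key: str) -> tuple[str, str]:
    """Return (base64 policy, base64 HMAC-SHA1 signature) for an S3 POST form."""
    policy_b64 = base64.b64encode(json.dumps(policy).encode("utf-8")).decode("ascii")
    digest = hmac.new(
        secret_key.encode("utf-8"), policy_b64.encode("ascii"), hashlib.sha1
    ).digest()
    return policy_b64, base64.b64encode(digest).decode("ascii")


# Hypothetical policy: placeholder bucket, prefix, and expiry.
policy = {
    "expiration": "2012-06-01T00:00:00Z",
    "conditions": [
        {"bucket": "example-bucket"},
        ["starts-with", "$key", "uploads/"],
        {"acl": "private"},
    ],
}

policy_b64, signature = sign_post_policy(policy, "EXAMPLE-SECRET-KEY")
print(policy_b64)
print(signature)
```

The page then posts a multipart form to the bucket with `key`, `AWSAccessKeyId`, `acl`, `policy`, `signature`, and finally the `file` field, which is the part that JavaScript can assemble dynamically.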

"...Someone monitoring the @awscloud account opened a trouble ticket to my email address asking for clarification" support through twitter is going mainstream. It is like praying loud ad getting a response.

1.) AWS staff doing something is the very definition of not mainstream when it comes to stuff like Twitter; their customers are developers.

2.) AWS have always been awesome at responding to customer feedback in my experience

3.) But you're right, except change "is going" to "has gone". A friend of mine who works in SEO and social media (the good kind) says "In 2009 companies needed to have social media accounts, in 2010 they needed to put out content on them, in 2011 they needed to respond to customers through them" and he's right. The mentality of customers of Twitter/Facebook has, for the most part, moved from "holy hell, a company ACTUALLY SAW MY TWEET?" to "I tweeted about my problem an hour ago, where the fuck is my answer?".

"I tweeted about my problem an hour ago, where the fuck is my answer?".

Can anybody else verify this? To me, it seems ridiculous that anybody would expect to get support by posting something to a random website. Personally, I go to twitter.com about four times a year and type in the names of my products to do a quick vanity search about what people are saying about them. I've never seen anything like a support request (or even a complete coherent thought) in there. It just doesn't seem like something worth monitoring.

My product sites all have a contact page with an email address on it. If you want to contact me, that's how you do it.

Amazon has forums with dedicated representatives monitoring them. That's how you get in touch with them. I've never gone more than a few hours without a response from somebody who knows what they're talking about in there.

Sorry, I should have been clearer: it's not that people will tweet "I have X, Y, Z problems with product", it's that they will tweet "@company Help me I have this problem".

This is basically because plenty of companies are doing this (see https://twitter.com/#!/vodafoneuk for example), so people get used to it.

So really it's not so much posting to a random website, it's more using social media as a means to contact them.

But based on the trend, people have come to expect it as the norm thanks to companies that lead the way in doing it, and now those companies are leading the way in proactively reaching out to customers who tweet without addressing them directly. For example, 9 months ago I tweeted something like "As soon as my contract with Vodafone ends I'm moving to Orange", and both companies tweeted at me offering to help, so people could well come to expect that as the norm too.

> I've never seen anything like a support request (or even a complete coherent thought) in there.

This cracked me up. And likewise, I think we've seen one. Meanwhile we have nearly a quarter million email "tickets" to date.

I certainly get tweets in this vein for my app @expandrive and storage service @strongspace. Especially if something is affecting availability.

Lots of users realize that there is likely a faster response from twitter than support@whatever.com because the developer has some amount of face at stake with the dirty laundry in public.

But it's not really "in public" though, is it? I mean really, how many people would you expect to go to search.twitter.com and type in "expandrive" in the twelve or so hours that they cache that post? That's the only way anybody would know that your dirty laundry was airing, and then only if they could parse what the airer was trying to say.

If you really wanted to "expose" something in public, you'd put it up on a blog or someplace that's actually on the public facing internet. Not that it would get you any more chance of the company hearing about it, but at least other people might see it.

And, of course, if you run a company that simply doesn't respond to things on Twitter, the customer in question will hopefully learn that they can send you an email and get a fast response.

The lack of CORS support has been known by Amazon for years, but they still have chosen not to fix it. There's a long-running thread on their support forums somewhere where they start by saying they'll look into it. I believe this was years ago.

Cloudfront actually does support gzip encoding if you use Custom Origin, just not with S3.

Technically, CloudFront supports Accept-Encoding/Content-Encoding, and /not/ compression. If client and server supported ROT13 as an encoding, CloudFront would support that, too. CloudFront is neither compressing nor decompressing anything.

These issues have been known to Amazon and to serious AWS users for a long time. Why do you expect that this time they will actually do something? It will take more than a simple twitter response from the AWS team to believe that they actually will make changes to fix the situation...

We've been hosting our gzipped JavaScript via S3/CloudFront, and have had no problems serving to IE7:


Also SQS should accept utf-8 in message body rather than a restricted set of characters.

SOAP is the cause: http://www.w3.org/TR/REC-xml/#charsets

Fortunately it looks like AWS is starting to use JSON for newer APIs: http://docs.amazonwebservices.com/amazondynamodb/latest/deve...
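Assuming the SQS restriction matches the `Char` production in the XML 1.0 spec linked above, a message body can be checked before sending with a few range comparisons. This is a Python sketch; the function names are hypothetical and not part of any AWS SDK.

```python
def is_xml10_char(ch: str) -> bool:
    """True if ch falls in the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # most of the Basic Multilingual Plane
        or 0xE000 <= cp <= 0xFFFD      # skips surrogates, U+FFFE and U+FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )


def first_invalid_index(message: str) -> int:
    """Index of the first character XML 1.0 rejects, or -1 if all are fine."""
    for i, ch in enumerate(message):
        if not is_xml10_char(ch):
            return i
    return -1


print(first_invalid_index("héllo ✓"))
print(first_invalid_index("ab\x00c"))
```

Most ordinary UTF-8 text passes; the usual offenders are control characters such as NUL, which is why binary payloads tend to be base64-encoded first.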

The whole internet needs to work with utf8. Most of it does of course, but there are a few problems around.

You can use S3/Cloudfront for compressed assets as long as your main page is dynamic. It can just generate different URLs for assets based on whether the browser supports gzip or not. See bagcheck.com for an example.
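The URL-switching approach can be sketched as a small helper the dynamic page calls when rendering asset links. This Python sketch assumes a `.gz` suffix convention and a placeholder CDN hostname; neither comes from bagcheck.com, and the `Accept-Encoding` parsing is deliberately simple.

```python
def accepts_gzip(accept_encoding: str) -> bool:
    """True if the Accept-Encoding header lists gzip with a non-zero q-value."""
    for part in accept_encoding.split(","):
        token, _, params = part.strip().partition(";")
        if token.strip().lower() != "gzip":
            continue
        params = params.strip().lower()
        if params.startswith("q="):
            try:
                return float(params[2:]) > 0
            except ValueError:
                return False
        return True
    return False


def asset_url(base: str, path: str, accept_encoding: str) -> str:
    """Pick the pre-compressed object when the browser can decode it."""
    suffix = ".gz" if accepts_gzip(accept_encoding) else ""
    return f"{base}/{path}{suffix}"


print(asset_url("https://cdn.example.com", "app.js", "gzip, deflate"))
print(asset_url("https://cdn.example.com", "app.js", ""))
```

For this to work, both variants have to be uploaded ahead of time, with the compressed copy stored with `Content-Encoding: gzip` and the original content type in its metadata.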

These are valid points and the same ones I've encountered when using S3 and CloudFront. I am actually amazed that gzip encoding still isn't supported — people have been complaining about this for years.

On a related note, do not use S3 on the web, use CloudFront. S3's performance is highly variable and latency tends to be high. Serving files from S3 and not CloudFront is foolish and will slow your site down.

I'd like to see Micro instances available in Virtual Private Cloud.

In the forums, an Amazon rep promised it'd be available within 2011. No luck, though.

They also need websockets support over ELBs.

At OpDemand we're using WebSockets successfully through multiple ELBs. The trick is setting the listener to use TCP instead of HTTP. With TCP forwarding you lose X-Forwarded-For headers, cookie stickiness and a few other HTTP-specific features, but you can always spin up a separate listener for that.

Only three problems?
