They do lots of interesting things beyond gzip. They convert images to WebP and scale the image resolution to match the resolution of the host device. They also minify CSS and HTML. One thing I noticed and thought was quite cool: they add ETag headers to improve the client's ability to cache resources.
They rewrite the response headers to lowercase to make compression more effective, and of course everything is served over SPDY (except in special cases where that fails).
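To make the ETag point concrete, here is a minimal sketch of how ETag revalidation saves bytes. It is not Flywheel's actual scheme: the hash-based tag and the `make_etag` and `respond` helpers are my own assumptions for illustration.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    # Hypothetical tag format: a quoted, truncated SHA-256 of the body.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"etag": etag}  # lowercase header name, as described above
    if if_none_match == etag:
        # The client's cached copy is still valid: send no body at all.
        return 304, headers, b""
    return 200, headers, body


status, headers, payload = respond(b"<html>hello</html>")
revalidated = respond(b"<html>hello</html>", headers["etag"])
```

The second call returns a 304 with an empty payload, so a client holding a cached copy only pays for the headers.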
I was assuming that in this post-Web 2.0 internet, most webpages were heavy (like > 1 MB), but if I read Figure 3 from the paper correctly, about 90% of webpages are actually <= 500 KB.
Hm, considering metered data on many (most?) mobile plans, it makes sense to save data for the user. But my initial assumption that this would make things faster for most people turns out to be wrong, as demonstrated by the median (not average!) load time increasing by 6%, as you point out.
Median is a type of average, and in this case is by far the most important one.
Is the source code available so I can run the proxy on my own private server?
If not, here are some open source alternatives that I have used:
https://wiki.mozilla.org/Mobile/Janus (written in Node.js)
If anyone got motivated by the paper or feels like improving it, contributions are welcome.
> The majority of Flywheel code is written in Go, a fact we mention only to dispel any remaining notion that Go is not a robust, production-ready language and runtime environment.
Or they're affirming the fact that the system has been running successfully for a long time, which is a good advert for Go as a production ready runtime.
The Google proxy has been built into desktop Chrome for a while; this extension just enables it (and shows some statistics).
Interestingly, this extension is the only one Chrome allows to use the API (the dataReductionProxy permission in the manifest).
I was once somewhere with only an absolutely minimal wifi connection, with high latency and very low bandwidth (< 1 KB/s). Opera Mini was still slow but worked wonders, whereas a regular browser was basically unusable.
> The proxy service with the closest design to ours is Opera Turbo. Although Opera has not published the details of their optimizations or operation, we performed a point comparison of Flywheel and Turbo’s data reduction gains, and found that Flywheel provides comparable data reduction.
Flywheel does not MITM SSL connections; it does not proxy SSL at all. If you just mean "MITM" in a generic sense, the fact that you believe you have "data sovereignty" is interesting since we're talking about unencrypted HTTP here. Nearly all ISPs and mobile carriers undertake proxying and extensive analysis and manipulation of in-the-clear HTTP traffic. We agree that users need to opt into this feature -- since you have to trust Google to proxy your traffic, after all -- but it's important to keep in mind that many other parties on the path between you and a website already do transparently proxy your unencrypted traffic.
Regarding the Third Party Doctrine, the fewer third parties with access to the information, the more sovereign over the data the remaining parties are. But I agree with you that ISPs and mobile carriers are other companies who intercept, profile, sell and partner away plaintext information - and indeed in the case of at least some mobile carriers encrypted communications too.
I agree with the assessment that this is an opt in feature. I have spoken about the reasons I won't be opting in. (In my opinion it is a very bad trade.)
Definitely agreed that many other parties have access to the data. I disagree that this is an argument to add another.
[By the way, thank you very much for taking the time to speak on HN about Flywheel from your position. :)]
(I'm biased, but I'd personally rather have Google MITM me than a mobile carrier like Verizon or T-mobile.)
Flywheel ends up masquerading some of that traffic, so if for no other reason, it is atypical. I also perceive that it would be in Google's best interest not to abuse that privilege since advertisements are how they make their money. If they abuse that privilege, consumers will go elsewhere and they will lose their market advantage.
You need only look at the cookie tracking the telecoms are doing right now to see that their oligopoly gives them little incentive to respect consumers' privacy.
It's true they may not have much incentive to protect your privacy (besides perhaps competition from "better" companies and/or legislation).
But also keep in mind that Google has a huge incentive to breach your privacy, and has been taken to court over it numerous times.
I guess that on a satellite network (~600ms), latency would be greatly reduced, as there is only one (persistent) TCP connection to the proxy, with HTTP/2 (or SPDY) over that (using HTTP/2's multiplexing capabilities).
This one TCP connection will be fully established, with a full TCP window, basically the whole time. TLS will be fully negotiated and set up, leaving the client with an optimized tunnel to Google's servers.
Since most of the latency on the path is on the satellite leg, and Google's servers are on the other side of it, TCP handshakes and SSL setup to destination servers will occur on the low-latency side. Google will just push the optimized content over the tunnel to clients.
Looks like paradise.
TL;DR: Satellite clients won't open 4 connections to servers, wait for 4 TCP handshakes, and slowly open the TCP windows of those 4 connections for every site visited. They will open one connection to a Google server (on the other side of the satellite link) and let Google do the hard work. High-latency paradise!
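A back-of-the-envelope sketch of that argument, under loud assumptions: the 600 ms satellite round trip is the figure above, the 20 ms proxy-to-origin round trip is a number I made up, and the model ignores DNS, TCP slow start, bandwidth, and proxy processing time. `first_response_ms` is a hypothetical helper, not anything from the paper.

```python
def first_response_ms(sat_rtt_ms: int, origin_rtt_ms: int, via_proxy: bool,
                      setup_round_trips: int = 1) -> int:
    """Rough time until the first response arrives for one new origin.

    setup_round_trips counts round trips spent on connection setup
    (1 for a bare TCP handshake, more if TLS is negotiated as well).
    """
    if via_proxy:
        # The tunnel to the proxy is already open, so the request crosses
        # the satellite link once; setup plus the request to the origin
        # happen on the fast proxy-to-origin path.
        return sat_rtt_ms + (setup_round_trips + 1) * origin_rtt_ms
    # Direct: every setup round trip and the request itself cross the
    # whole path, satellite hop included.
    return (setup_round_trips + 1) * (sat_rtt_ms + origin_rtt_ms)


direct = first_response_ms(600, 20, via_proxy=False)
proxied = first_response_ms(600, 20, via_proxy=True)
direct_tls = first_response_ms(600, 20, via_proxy=False, setup_round_trips=3)
proxied_tls = first_response_ms(600, 20, via_proxy=True, setup_round_trips=3)
```

With these assumed numbers the direct path costs 1240 ms against 640 ms through the proxy, and the gap widens once TLS setup round trips are counted (2480 ms against 680 ms).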
I remember really appreciating how quickly pages rendered on that browser, compared to the stock chrome when I was using android.
so this is part of the google mobile strategy.
"Flywheel is integrated with the Chrome web browser and reduces the size of proxied web pages by 50% for a median user."
is there any implementation of this outside chrome, OS?
thx Matt, reading for background: 'Making the mobile web fast' ~ http://matt-welsh.blogspot.com.au/2011/05/what-im-working-on...
Since this is for Chrome only, I'm surprised they don't use the other possible compression format: zlib (confusingly named deflate in the RFC). It's 12 bytes smaller per response and uses a faster-to-compute Adler-32 checksum, compared to CRC-32 in gzip.
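The 12-byte figure can be checked with Python's standard `zlib` module, which can emit raw deflate, zlib, or gzip framing around the same compressed stream depending on `wbits`. This is only a sketch of the framing overhead; `framed_size` is a hypothetical helper and the payload is arbitrary.

```python
import zlib


def framed_size(payload: bytes, wbits: int, level: int = 6) -> int:
    """Compressed size of payload under the container selected by wbits."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, wbits)
    return len(compressor.compress(payload) + compressor.flush())


payload = b"<html><body>" + b"hello world " * 200 + b"</body></html>"

raw_size = framed_size(payload, -15)   # raw deflate, no header or trailer
zlib_size = framed_size(payload, 15)   # 2-byte header + 4-byte Adler-32
gzip_size = framed_size(payload, 31)   # 10-byte header + CRC-32 + length

saving = gzip_size - zlib_size
```

The deflate data is identical in all three cases; only the wrapper differs, so the saving is a fixed 12 bytes per response regardless of payload size.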
So really, it's a gzipping proxy written in Go. Hopefully they extend the protocol to better compression in future.
Interesting idea I guess.
You'd be surprised how many sites still do not enable gzip.
I still don't think that excuses not implementing it by now but I'd bet the explanation starts with some engineer having a bad week and not wanting to relive the experience.
1. Some major CDNs and caching proxies like nginx also haven't bothered to implement Vary but that doesn't matter for this particular scenario since they don't appear to be using any of them.
If you did not serve up this information on request even though you had access to it, your company WOULD be breaking the law - this has already been settled in court. It is my conclusion that, as law-abiding citizens and a law-abiding company, you would serve such requests. I also tangentially believe that if you felt the information would be useful to create a better product (and/or make more money) you would do this, as you are also compelled by law to maximize profit for shareholders and are incentivized to do so by financial compensation. I see no reason why your company would break the law to restrict the scope of this feature.
No need to speculate as appellate courts (and Google's own recent trials) have made the law quite clear.