This added a bit of complexity to a few code paths that now need to estimate the size of objects, data, and so forth, but I think it will pay off in the long run.
This blog post makes me wonder if we should implement compression in the replication link as a feature in the future, or at least binary encoding. In short, there is a plan for a binary version of the AOF / replication link that can save a lot of bandwidth.
Initially we didn't add compression to save money on bandwidth but to solve the network throughput issues between Amazon and Rackspace. However, it turns out that you can also save quite a lot on the bill, in our case roughly $700.
Regarding the binary encoding, it would be very nice to have. My first thought is that compression would work better for us than binary encoding, since our keys are quite long (50-100 bytes) and very regular in their patterns, so compression works very well in our case. This can probably be extrapolated to other cases where the keys are typically heavier than the values.
How much load would compression put on a Redis instance that talks to n clients, where each client sends/receives medium-sized JSON docs? And how much would be gained in terms of faster responses from Redis?
Also, if you're using ssh and you need maximum performance you should probably be using an arcfour cipher (`ssh -c arcfour`; alternatives are arcfour128 and arcfour256). I often see a 10x increase in bandwidth using these potentially less secure algorithms, along with slightly lower latency. If you need to save even more bandwidth, a UDP-based SSL VPN may improve things (depending on distance; long links are of course notoriously horrible), or SCTP patched into SSH as a compromise (I haven't played with this yet).
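As a sketch, assuming an OpenSSH build that still ships the arcfour ciphers (recent releases have removed them), a compressed tunnel could look like this; the user, host, and ports are placeholders:

```shell
# Forward local port 6379 to Redis on the remote box, with
# compression (-C) and the fast-but-weak arcfour256 cipher.
# tunnel@replica.example.com is a hypothetical endpoint.
ssh -C -c arcfour256 -N -f \
    -L 6379:127.0.0.1:6379 tunnel@replica.example.com
```

Clients then talk to localhost:6379 and the traffic crosses the wire compressed and encrypted.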
If you're not CPU bound, lzma might gain you bigger savings with marginally higher CPU if you need it down the road.
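For a rough feel of the trade-off (a sketch using a synthetic repetitive sample rather than real replication traffic), you can compare gzip and xz (LZMA) locally; xz usually squeezes out more at a higher CPU cost:

```shell
# Build a repetitive sample, loosely shaped like a replication
# stream with long regular keys, and compare compressed sizes.
sample=$(seq 1 5000 | sed 's/^/stats:app:response_time:/')
raw=$(printf '%s\n' "$sample" | wc -c)
g=$(printf '%s\n' "$sample" | gzip -9 | wc -c)
x=$(printf '%s\n' "$sample" | xz -9 | wc -c)
echo "raw=$raw gzip=$g xz=$x"
```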
(I'd also like to take this moment to point out to the Redis devs that 'optimization' of the protocol could have prevented this hack from being necessary until much later. And to the admins of the systems, that monitoring metrics of network bandwidth could have shown the immensity of the traffic for replication. But hindsight is 20/20; let's just remember these examples!)
Of course, this is based on the assumption that the OP was happy to run their redis sync over the Internet without any other form of encryption. Either the underlying protocol provides encryption or they aren't concerned about their traffic being snooped or MITM'd (albeit very unlikely between two large providers) and so adding extra encryption on top is just an unnecessary CPU overhead.
I've run plenty of (production) stuff over ssh tunnels, but I was always nervous about it and happy when the deployment was revamped to avoid the use of ssh tunnels completely.
I'd much rather each underlying application were able to provide tunable encryption and compression itself (using existing trusted libraries such as openssl) as part of its protocol. ssh tunnels are a kludge, not a viable production solution (IMHO).
I would never recommend anyone tunnel anything unencrypted over the internet, especially a database. Arcfour is incredibly fast, comparable to 'none' encryption. RFC4345 specifies the main known attack on arcfour is password auth, so using keys may keep it relatively safe (and using arcfour256 for the longest key).
It's best to assume developers will not implement encryption correctly no matter what library they link to. Always consider how highly vetted the application has been first, and how long it's been around without major issues.
If the redis protocol is so compressible, maybe this behavior should be integrated as a protocol option?
For the replication link it's easy to imagine a simple encoding: command names become small operation IDs, and the same goes for the length-prefixed fields, so the stream gets much more compact.
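As a toy illustration (the opcode and framing here are made up, not an actual Redis format), compare the RESP text framing of `SET mykey 123` with a hypothetical binary framing:

```shell
# RESP text framing: *3, then $len-prefixed SET, mykey, 123.
t=$(printf '*3\r\n$3\r\nSET\r\n$5\r\nmykey\r\n$3\r\n123\r\n' | wc -c)
# Hypothetical binary framing: a 1-byte opcode for SET, then
# 1-byte-length-prefixed key and value.
b=$(printf '\001\005mykey\003123' | wc -c)
echo "text=$t binary=$b"
```

Even on this tiny command the text framing is about 3x the size of the binary one, before any compression.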
However it's not bad that the current format is exactly the same as the AOF, and as the client protocol itself, so everything is uniform for now... and is forward/backward compatible without issues. But well, performance always has some price.
Since I'm tempted to set this up myself (not for Redis though), what is the alternative?
I've only used it for tunnelling some traffic out of my home network to a VPS but it's been rock solid for me over several years of frequent use.
I suspect (guess) that not having an idle connection helps the tunnel avoid random drops.
Have you simulated a network partition? How does redis handle the reconnect?
However we also use ssh tunnels for our continuous integration (jenkins) and the irc bots that we use for development (http://3scale.github.com/2012/06/29/irc-driven-development-p...).
In this setup the autossh goes AWOL every week or two. But, as the blog post mentions, the ssh tunnel runs between our HQ (fiber) and Amazon, which is not very reliable. Furthermore, there are a lot of idle periods, which seem to trigger most of the issues. We can't give more specifics since we just forcefully restart the daemon with a monit/munin combo.
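For reference, a typical persistent-tunnel invocation (ours differed; the user, host, and ports here are placeholders) pairs autossh's restart behavior with ssh-level keepalives:

```shell
# -M 0 disables autossh's extra monitor port and relies on ssh
# keepalives to detect a dead tunnel; when ssh exits, autossh
# reconnects automatically.
autossh -M 0 -f -N -C \
    -o "ServerAliveInterval 30" -o "ServerAliveCountMax 3" \
    -L 6379:127.0.0.1:6379 tunnel@master.example.com
```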
The outbound firewall of our company drops my outbound ssh connections if they are idle for just 5 minutes. Adding a ServerAliveInterval keepalive to my client-side ssh config fixed that.
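Presumably the setting in question was a client-side keepalive along these lines, with the interval comfortably under the firewall's 5-minute idle cutoff (exact values are a guess):

```
# ~/.ssh/config -- send a keepalive every 2 minutes so the
# firewall never sees the connection as idle.
Host *
    ServerAliveInterval 120
    ServerAliveCountMax 3
```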
A corresponding ClientAliveInterval setting in the sshd_config file also helps mask the problem [EDIT] if I ssh home from a machine at work that doesn't share my home directory ssh config file.
Now, their architecture diagrams show that developers connect directly to the API servers. So the origin servers must be scaled out to handle "10's or 100's of millions of API calls per day" anyway.
So 3scale provides some kind of external dashboard that you can point an event stream at and get analytics? That is, something like New Relic for APIs? Is that correct?
It's used for some very big APIs, but yes, the origin sees the traffic. The system can offload it though - there are agents for Varnish, for example, that you can put in front of the app, or integration with Akamai, in which case policies are enforced at CDN edge nodes.
If the API is handling high volumes you still have to plan for that - but the point here is that the management elements scale with it. You're not stuck with a black box in front of the API which might blow up at inconvenient moments.
I've used Redis once over an ssh tunnel (over a domestic internet connection that's anything but fast; it's a long story...). It is slow (as in agonizingly slow), even for a couple of queries. Probably not Redis's fault, still.
One thing you really want to do in this case (described in the article) is to carefully select an encryption algorithm that uses the least CPU or delivers the best data rate.
If your replication stream bandwidth continues to grow, the returns on compression will eventually diminish. As the cloud providers can't guarantee network throughput, it might be a good time to start planning to evacuate the cloud (or at least this part of the architecture).
If your business is growing that fast, this is a good problem to have. :)
However, getting out of the cloud does not help. The problem is high availability. To maintain 99.9% we cannot rely on a single data center, no matter what the claims about availability zones say. We have even seen network partitions, which are the worst of the worst. Once you go over the "Internet" there are no guarantees (unless you have dedicated lines).
I'm not sure whether you've analyzed the bandwidth of the replication stream, but my experience is that if you've got a good circuit provider, you should have adequate bandwidth for 30Mbps+ streams, except under extremely unusual conditions, even over the open Internet.
It seems that we have found these unusual conditions you mention :-) Before compression our replication was often lagging (whenever we fell below the required bandwidth), which is very dangerous, even if a minute later bandwidth goes back well above the threshold.
The case of replication is kind of a worst case: it's no good to have an average of 30 Mbps if the link stays below the threshold a substantial fraction of the time. For every hour at 75% of the required bandwidth, memory grew by 2 GB, but even one minute below the threshold is bad... if the replication queue builds up and there is a crash, consistency goes out the window. So it's better to over-dimension capacity.
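As a rough sanity check on those numbers (assuming the backlog grows at exactly the bandwidth deficit), 2 GB of backlog per hour works out to:

```shell
# 2 GB/hour of backlog, expressed as a bandwidth deficit:
# 2 GB * 8 bits/byte * 1024 MB/GB / 3600 s ~= 4.55 Mbit/s.
deficit=$(awk 'BEGIN { printf "%.2f", 2 * 8 * 1024 / 3600 }')
echo "deficit ~ ${deficit} Mbit/s"
```

In other words, the link was only a few Mbit/s short of what replication needed, and the backlog still piled up fast.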
We even did a quick test with Hetzner instead of Rackspace, it was worse. The average over 4 hours was about 15 Mbps.
The closest thing you can get to guaranteed bandwidth today is to rent (dark) fiber between multiple DCs yourself.