Moving persistent data out of Redis (githubengineering.com)
42 points by samlambert 1 hour ago | 10 comments

Wow, I'm learning today that GitHub used Redis for persistent data, now that they have moved away :-) Anyway, I'm very happy that Redis helped run such an important site. From the blog post it looks like moving certain things away from Redis was hard, even though they are very skilled with MySQL. That is a good thing from Redis's point of view, since it means Redis makes certain things easy to model. However, moving away was an important priority for them, so I'd like to know why they wanted to move away so badly and how Redis could be improved to serve its users better. If Redis had been better for their use case, I guess they could have avoided moving to MySQL. Unfortunately the blog post is short on details in that regard, perhaps because its author(s) are too polite to bash Redis after using it for a long time.

FWIW we went through a process very similar to the one documented here by GitHub (~3 months ago). It was entirely due to operational reasons and had nothing to do with shortcomings in Redis itself. MySQL was the master record for 99% of our data while Redis was the master record for the other 1% (as it happens, it was also a kind of activity stream). Having a single 'master' reference for our data reduced complexity to a degree that made it worth running a less computationally efficient setup. We also have nowhere near GitHub's volume, so we did not have to do such significant re-architecting to make unification possible.

We still use Redis for reading the activity streams and as an LRU cache for all sorts of data, but it is populated like all of our specialised slave-read systems (Elasticsearch, etc.) by replicating from the MySQL log.

Hope that helps!
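The replication pattern described in the comment above (MySQL as the single master record, with a cache filled from its log) could be sketched roughly as follows. This is only an illustration under stated assumptions: the event shape, the `LRUCache` class, and `apply_change` are hypothetical names, and a small in-memory class stands in for Redis so the example runs without a server or a real log reader.

```python
from collections import OrderedDict


class LRUCache:
    """In-memory stand-in (assumption) for a Redis instance used as an LRU cache."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)         # mark as most recently used
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used key

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)         # a read also counts as a use
        return self._data[key]

    def delete(self, key):
        self._data.pop(key, None)


def apply_change(cache, event):
    """Apply one row-change event (hypothetical shape) to the cache."""
    key = f"{event['table']}:{event['id']}"
    if event["op"] == "delete":
        cache.delete(key)
    else:  # "insert" or "update": the cache simply mirrors the latest row
        cache.set(key, event["row"])


# Hypothetical events, in the order a log consumer might emit them.
events = [
    {"op": "insert", "table": "events", "id": 1, "row": {"action": "push"}},
    {"op": "insert", "table": "events", "id": 2, "row": {"action": "fork"}},
    {"op": "update", "table": "events", "id": 1, "row": {"action": "force-push"}},
    {"op": "insert", "table": "events", "id": 3, "row": {"action": "star"}},
    {"op": "delete", "table": "events", "id": 3, "row": None},
]

cache = LRUCache(capacity=2)
for event in events:
    apply_change(cache, event)
```

The cache is never written to directly by the application in this sketch; it only ever reflects what the log says, so it can be discarded and rebuilt from MySQL at any time.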

Pretty cool that Redis was helping to host its own source code & development.

Reading more into this point in the post than I maybe should, but "Take advantage of our expertise operating MySQL." sounds like they have more engineers who are familiar and comfortable with MySQL than with Redis.

That's a valid reason indeed. So is technological consolidation: if I can do everything with a single DB / language / ..., I always tend to use that single thing.

Cool stuff. I found it especially interesting how they removed 30% of writes with new logic to compose some timelines of events in other timelines. It's a thought-provoking optimization that calls to mind graph partitioning.

For example, say you have 10 people in your organization with various permissions on repos. Some people (the CTO, let's say) can see every repo, while others might only be able to see some repos. Or you might have consultants, or open source projects which non-employees contribute to. Then you construct a graph where each node is a contributor that is connected to other contributors by the permissions they have on repos (or are the repos the nodes and the contributor permissions the connections?). Finally, you run a graph partitioning algorithm where the number of partitions is the number of unique timelines you have to write for an organization. Thinking about an organization with closer to 500 contributors, I can see how this could reduce the number of timelines by 30%.
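To make the idea concrete, here is a minimal sketch that swaps in something simpler than graph partitioning: grouping contributors whose sets of visible repos are identical, so that each group can share one timeline. This is an assumption about how such a reduction might work, not GitHub's actual method, and the user names, repo names, and `group_by_visibility` are hypothetical.

```python
from collections import defaultdict


def group_by_visibility(permissions):
    """Group users whose sets of visible repos are exactly the same.

    Each group can share one timeline, so the number of groups is the
    number of distinct timelines that would need to be written.
    """
    groups = defaultdict(list)
    for user, repos in permissions.items():
        groups[frozenset(repos)].append(user)  # frozenset is hashable
    return dict(groups)


# Hypothetical organization: user -> repos that user can see.
permissions = {
    "cto": {"api", "web", "infra"},
    "alice": {"api", "web"},
    "bob": {"api", "web"},
    "carol": {"web"},
    "consultant": {"web"},
}

groups = group_by_visibility(permissions)

# Fraction of per-user timelines avoided by sharing one per group.
saved = 1 - len(groups) / len(permissions)
```

In this toy organization five contributors collapse into three distinct visibility sets, so three timelines would be written instead of five. Real permission data is unlikely to group this cleanly, which is presumably where an approximate partitioning approach would come in.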

Working with large amounts of persistent data is hard. Limiting the architecture to a single database system (MySQL in this case) generally makes managing and scaling much easier, versus having to know/learn how to scale multiple systems independently.

Even if Redis was a better fit for some of their use cases, it is just much easier not to have an additional persistent database system to manage.

> We changed up how writing to and reading from Redis keys worked for [the organization] timeline before even thinking about MySQL ... This resulted in a dramatic 65% reduction of the write operations for this feature.

Interesting. Is there a comparison of overall performance between the intermediate design (w/ Redis) and what they ended up with?

I wonder if this will allow better scaling of GitHub Enterprise. We are pegging our usage; if we could, we would migrate everything to GitLab Enterprise (which we also have), which seems to have better scalability.

> which seems to have better scalability

Out of curiosity: how many users do you have, and what measurements are you using to determine scalability?

Disclaimer: I work at GitHub.

