How Discord Indexes Billions of Messages Using Elasticsearch (discordapp.com)
31 points by jhgg 2 hours ago | 5 comments





> Elasticsearch supports automatic shard rebalancing, which would let us add new nodes to the cluster, fulfilling the linearly scalable requirement out of the box.

I'm assuming they mean "I can add a new node to the cluster, and NEW SHARDS FROM NEW INDEXES can be distributed amongst the new nodes"... As far as I know, Elasticsearch can't automatically rebalance shard location or composition in response to cluster membership changes, right? I mean, the cluster reroute API will let you manually move a shard, but that's all I know of.



ES can indeed rebalance shards to new nodes as they join the cluster. More here: https://www.elastic.co/guide/en/elasticsearch/reference/curr...
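
For reference, here is a rough sketch of the knobs involved, assuming the standard cluster settings API (`PUT /_cluster/settings`). The helper name `rebalance_settings` is made up for illustration, and the snippet only builds the request body rather than talking to a cluster.

```python
import json


def rebalance_settings(enable="all", concurrent=2):
    """Build a body for PUT /_cluster/settings (hypothetical helper).

    enable: which shard types may be rebalanced
            ("all", "primaries", "replicas" or "none").
    concurrent: how many shard rebalances may run at once cluster-wide.
    """
    return {
        "persistent": {
            "cluster.routing.rebalance.enable": enable,
            "cluster.routing.allocation.cluster_concurrent_rebalance": concurrent,
        }
    }


# Print the JSON you would send as the request body.
print(json.dumps(rebalance_settings(), indent=2))
```

With rebalancing enabled, existing shards can be relocated onto nodes that join later, not just shards from newly created indexes.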



Huh, the more you know. Not sure how I've missed this. I'll definitely look for some discussions and use cases showing how this actually plays out.



I really enjoy Discord's blog. Their Cassandra write-up was excellent as well. A couple of thoughts and questions:

- Having many clusters and assigning messages to a specific cluster seems like an interesting solution.

- I'm curious how they managed to lazily index messages.

- Since only message, channel and server IDs are stored in ES, have there been any problems reindexing data after an index fails?



The first time you run a search in a server (or the first search after that server's index fails) triggers a full re-index of that server. Ctrl-F "Historical Index" in the blog post for more details! If you've never used search in a server, its messages are not indexed in real time until you search there for the first time. Both of these things make the system "lazy".

The worst case for an index failure is that the search query is delayed while the index rebuilds itself. We throttle the rate of historical indexing into ES to a safe level so that we don't degrade the performance of other components of the system.
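
To make the "lazy" part concrete, here is a toy sketch of the behaviour described above. It is not the actual implementation: the class `LazyServerIndex`, its in-memory dicts (standing in for the message store and for ES), and the `batch_size` / `pause_seconds` throttle are all assumptions made for illustration.

```python
import time


class LazyServerIndex:
    """Toy model: a server is indexed only once someone searches in it."""

    def __init__(self, history, batch_size=2, pause_seconds=0.0):
        self.history = history          # server_id -> all messages (stand-in for the message store)
        self.indexed = {}               # server_id -> indexed messages (stand-in for ES)
        self.batch_size = batch_size    # messages per historical-index batch
        self.pause_seconds = pause_seconds  # sleep between batches (the throttle)

    def on_message(self, server_id, message):
        self.history.setdefault(server_id, []).append(message)
        # Real-time indexing happens only for servers that already have an index.
        if server_id in self.indexed:
            self.indexed[server_id].append(message)

    def _historical_index(self, server_id):
        # Rebuild the whole index in throttled batches.
        docs = []
        messages = self.history.get(server_id, [])
        for start in range(0, len(messages), self.batch_size):
            docs.extend(messages[start:start + self.batch_size])
            time.sleep(self.pause_seconds)
        self.indexed[server_id] = docs

    def search(self, server_id, term):
        # First search (or first search after a failure) triggers a full re-index.
        if server_id not in self.indexed:
            self._historical_index(server_id)
        return [m for m in self.indexed[server_id] if term in m]

    def fail_index(self, server_id):
        # Simulate losing the index; the next search rebuilds it.
        self.indexed.pop(server_id, None)
```

In this sketch a failed index costs nothing until the next search, which is then delayed by the throttled rebuild.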




