How Discord Stores Billions of Messages Using Cassandra (discordapp.com)
438 points by jhgg on Jan 19, 2017 | 155 comments

These kinds of write-ups offer valuable insight into a popular project's requirements and decision-making, and are some of the most instructive resources one can find: these show not only the kinds of challenges one has to face at scale, but also how architectural choices are made.

It's far more valuable to understand why Discord uses Cassandra than to merely be aware they do.

Out of curiosity, did you consider HBase and Riak? Did you entertain going fully hosted with Bigtable? If so, what criteria resulted in Cassandra winning out?

Let me take a stab at that.

Riak is not a good model since it's more of a blob store and we wanted to simply range scan through messages rather than sharding blobs (Cassandra is REALLY good at this).

HBase would have been fine for this model, but the open source version of HBase has much lower adoption than Cassandra, so that was a big factor. We also don't care about consistency, and HBase is a CP database; we prefer AP for this use case. As far as using GCP's BigTable (HBase compat), we made this decision before we moved to GCP, but we are also not fans of platform lock-in. While BigTable has the same API as HBase, we would hate to go to a less widely adopted version where we would have a hard time getting community support if we decided to leave GCP.

Hope that helps.

> As far as using GCP's BigTable (HBase compat), we made this decision before we moved to GCP, but we are also not fans of using platform lock-in.

Did you consider GCP Datastore as well?

It has strong consistency for a single "entity group", but eventual consistency for queries on multiple entity groups.

So by storing data only relevant to a single user in an entity group, you can have strongly consistent, atomic transactions on that group (albeit limited to 1 tx/s), and at the same time do global queries on all user data with eventual consistency.

The pricing model does not fit our needs, and that is even more locked in than the BigTable variant.

I'm happy to hear you dropped it for non-technical reasons, since I'm asking because I've chosen Datastore for an app because I care less about vendor lock-in than ease of operation, and it fits my pricing model perfectly, due to the app in question receiving (Bitcoin) payments that are charged a fee on a per-request/payment basis.

Hint: if you have technical reasons for avoiding GCP Datastore I'd be very interested in hearing about them

Google Cloud is the least geo-distributed provider around. Which is a major problem if your use case has requirements around (a) latency and (b) data locality due to legal requirements.

In 2017 they will finally have datacenters in Sydney, London, Singapore, Frankfurt etc.

This is one area where Azure is leading with both Azure SQL and DocumentDB supporting geo-replication.

Nope, since Azure is extremely expensive and you also need several accounts for different regions, i.e. you can't create servers in Germany without a whole new account / credits / support.

This is about capabilities, not price. Azure Germany is the only one that requires a different account due to German legal issues. The rest of the datacenters are all connected from the same account.

That's because Azure in Germany is not offered by Microsoft but T-Systems. Microsoft just supplies the tech. For the other regions one account suffices (I'm not sure about China where they also use a partner).

> Riak is not a good model since its more a blob store and we wanted to simply range scan through messages rather than sharding blobs (Cassandra is REALLY good at this).

Can you tell us a little bit more, please? Range scans are done using secondary indexes (index by timestamp) in our system. I'm not sure I understood the part about blobs or the things specific to Cassandra. A reply would be highly appreciated.

Cassandra uses consistent hashing. A segment of data that is addressed by a key is called a partition, and it is found by the partition key. Partitions can contain just 1 "row" if you only use a single column as the key, or you can create a compound key with one part dedicated to finding the partition and the rest to finding several rows within that partition.

If you use a compound key (multiple rows), these rows are all stored in the same partition (which lives entirely on each node that owns or replicates that partition in the consistent hash ring), so scanning those rows is very fast and efficient.
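A minimal sketch of the model described above, assuming a toy three-node ring and an in-memory table. The names `Ring`, `Table`, `node-a` and `channel-1` are hypothetical, MD5 stands in for Cassandra's real partitioner, and replication and virtual nodes are left out. It only illustrates why rows sharing a partition key can be range scanned cheaply:

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string onto the ring. MD5 is a stand-in for the real partitioner."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class Ring:
    """Toy consistent hash ring: one token per node, no replication."""

    def __init__(self, nodes):
        self._ring = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    def owner(self, partition_key: str) -> str:
        # First node whose token is >= the key's token, wrapping around.
        i = bisect.bisect_left(self._tokens, token(partition_key)) % len(self._ring)
        return self._ring[i][1]


class Table:
    """Rows addressed by (partition key, clustering key)."""

    def __init__(self):
        # partition key -> list of (clustering key, value), kept sorted
        self.partitions = {}

    def insert(self, partition_key, clustering_key, value):
        rows = self.partitions.setdefault(partition_key, [])
        bisect.insort(rows, (clustering_key, value))

    def range_scan(self, partition_key, start, end):
        """Return rows with start <= clustering key < end from one partition."""
        rows = self.partitions.get(partition_key, [])
        lo = bisect.bisect_left(rows, (start,))
        hi = bisect.bisect_left(rows, (end,))
        return rows[lo:hi]


nodes = ["node-a", "node-b", "node-c"]
ring = Ring(nodes)
table = Table()
for message_id in range(10):
    table.insert("channel-1", message_id, f"message {message_id}")
table.insert("channel-2", 0, "other channel")

print(ring.owner("channel-1"))               # every row of channel-1 maps to this node
print(table.range_scan("channel-1", 3, 6))   # contiguous slice of one sorted partition
```

Because the rows of a partition are kept sorted by the clustering key, the scan is a slice of one list on one node rather than a scatter-gather across the cluster.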

Did you consider ScyllaDB (http://www.scylladb.com), a Cassandra-compatible DB written in C++ by the guys behind KVM?

> the open source version of HBase has much lower adoption than Cassandra so that was a big factor

Is this due to the availability of experienced developers or another factor?

PostgreSQL has a lower adoption rate than MySQL, but we chose it due to its suitability to the tasks at hand. As long as the adoption rate is not low enough to give concern about the longevity of a tool, I'm less concerned about it than other factors.

Well, relatively speaking Postgres might have lower adoption than MySQL, although I am not too sure about it. However, if you look at the absolute numbers, Postgres has huge adoption, even if it is smaller than MySQL's. So it doesn't really matter, as chances are you will be able to find an experienced developer. Can't say the same about HBase, etc., since there are significantly fewer projects requiring it compared to MySQL/Postgres.

> These kinds of write-ups offer valuable insight

I agree completely. It is frustrating that no decent books have been written regarding scaling architectures/strategies with current tooling. One has to scavenge various blog posts to try and discover ideas that might help solve their growth issues. I would love to see a book that covers scaling for app servers, RDBMSes, NoSql dbs, using queues/messaging effectively, etc. Failing that, I'd like to see something like Scalers at Work (a la Coders at Work) which would interview different devs who had to solve scaling issues.

Look for local meetups - in Seattle there's "Seattle Scalability" which is great for this sort of thing (and highscalability used to be great for this, too).

is discord in seattle?

No, but it's where I live, and I was using it as an example of a non-silicon-valley city where you can find scalability related meetups.

This isn't exactly what you want, but you might find both of the following books helpful.

With respect to using queues/messaging: http://www.enterpriseintegrationpatterns.com/

And with respect to understanding this stuff in general: http://dataintensive.net/

Not the parent, but the linked material is more foundational than the subject matter raised in the post. There is in fact an appreciable lack of good, battle-tested, non-secret, sometimes-but-not-necessarily anecdotal public info about the part of the design process where you have a working system doing fairly okay, but you know you're inches away from a very unpleasant wall. On fire [1].

It doesn't help that distributed systems are a dark art, that many open source and free-to-use tools that developers have access to gate the HA/clustering features behind steep pricing (though I sympathize it's one of the few effective ways to make money in open source), and that expertise with scaling is very often a competitive advantage.

[1] http://www.slideshare.net/iammutex/scaling-instagram/25-404i...

Totally fair. :) Parent just mentioned using queues/messaging effectively, and EIP is arguably the gold standard for that.


The first part is mainly about Erlang and the choices they made. But the last part is not at all specific to Erlang and walks you through all the decisions you have to make to build that type of architecture.

I've always subscribed to the HighScalability RSS feed: http://highscalability.com

I use Discord a fair amount, and something that annoys me about it is that everyone has their own server.

I realize this is a key part of the product, but the way I tend to use it is split into two modes:

- I hang out on a primary server with a few friends. We use it when we play games together.

- I get invited to someone else's server when I join up with them in a game.

The former use case is fine but the latter annoys me. I end up having N extra servers on my Discord client that I'll likely never use again. I get pings from their silly bot channels (seemingly even if I turn notifications off for that server/channel), and I show up in their member lists until I remove myself.

I wish there was a way to accept an invite as "temporary", so that it automatically goes away when I leave or shut down Discord. Maybe keep a history somewhere if I want to go back (and the invite is still valid).

Aside from that, it's a great product and really cleaned up the gamer-focused voice chat landscape. It confuses me that people will still use things like TeamSpeak or (god help you) Ventrilo when you can get a server on Discord for free with far better features.

Now that I posted this, I realize this has little to do with TFA. Sorry.

edit: formatting, apology

We have plans to make using temporary sessions in games a much better experience so look out for that in the future.

There's also a "Grant temporary membership" option when creating the invite that will automatically kick users when they disconnect unless a role has been assigned to them, but having that as an option when accepted would be cool.

"seemingly even if I turn notifications off for that server/channel"

This kills me the most. When I turn off notifications for a server, I do not want to see the red dot on the app icon in my Dock.

I believe "Server Mute" solves this.

It's possible to leave servers. I was also irritated by the exact behaviour you described, until I figured out how to leave servers. You can also mute all notifications from a server.

I convinced my friends to switch from Skype to Discord for this reason. I've had a few new groups every day and I would get calls all day long because if someone wanted to play they would call everyone.

I made a Discord group about a month ago and everyone I know is using it. If someone new wants to play, we add them to this group, so everyone is there.

Also we're not annoyed by calls anymore, as you only have to join the voice channel, instead of calling everyone.

Right click on the logo on the left pane and click "leave server".

Discord seems to me like it has a very polished user experience, and it's no surprise that users are trashing programs like Skype in favor of Discord when it is better in every area.

Discord seems to take security seriously, as they should, but I'm curious about their stance on privacy and openness. For example, I wonder if they would consider:

- Allowing end-to-end encryption to be used between users for private communications

- Allowing users to connect to Discord servers using IRC or other clients (or, at least having an API that easily allows this)[1]

- Allowing users to have better control over their own data, such as providing local/downloadable logs so that they can search or otherwise use logs themselves

Discord is definitely succeeding within the gaming market, but I'm curious what other markets they would like to take a stab at.

[1] I'm aware Discord has an API, but if I understand it correctly, normal users cannot easily use Discord from anything other than the official Discord apps, as this API is specifically for Discord 'bots'. I see there's a discord-irc bridge, but not much more than that. I may be incorrect on this.

- E2E/OTR encryption is something some of us are interested in, but due to the nature of our platform probably isn't going to happen anytime soon (we'd want to do it right, which requires time and effort).

- Some libraries support connecting through user accounts, and there are various third-party tools for "linking" chat rooms, incl. some client plugins for irssi and such. We don't officially support it, but it's definitely possible.

- Search is currently live on our alpha-testing client, and should be rolling out globally soon. It's also possible to save or log channels through the API fairly easily.

So technically that means that sysadmins at Discord can freely browse the billions of messages that are stored in your DB?

And if you are ever hacked, this whole chat database can be sucked up for free due to the lack of encryption?

I must be wrong, seriously. What did I miss? This can't be.

There's a difference between end-to-end/client-side encryption and secure/encrypted backend storage.

I don't think anyone's commented on the backend security situation (I'd hope they'd have messages encrypted at rest, but it doesn't seem that encryption has been a priority), just that they don't do E2E.

But with a chat app the "classic" behaviour is, as far as I know, to guarantee that each participant got all the messages they ought to.

So what are those billions of messages they store in the database? Is it only a very detailed cache of current conversations, or is it hardwired to PRISM or a commercial database? Why on earth should they store so many chat logs?

Or maybe I'm just not aware of the popularity of Discord, but billions of messages makes me wonder, because as a comparison that's roughly iMessage's worldwide volume per day.

So messages are probably stored longer than needed: how and why?

The point of our service is that chat is persistent. You can scroll back through time and read all the messages you sent. Users are free to delete whatever they sent whenever they wish, but for almost everyone persistent chat history is a huge feature. Also important to note that as of the numbers we released last July we receive around 40 million messages a day. The public stats released about iMessage suggest that 2 billion messages are sent per day.
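As a quick sanity check on the two figures quoted above (40 million messages a day for Discord, 2 billion a day for iMessage), the arithmetic can be sketched as follows. The assumption of a constant daily rate is mine, for illustration only:

```python
discord_per_day = 40_000_000        # figure quoted in the comment above
imessage_per_day = 2_000_000_000    # figure quoted in the comment above

# How many times larger the quoted iMessage volume is.
ratio = imessage_per_day / discord_per_day

# Days needed to accumulate one billion stored messages,
# assuming (hypothetically) that the daily rate stays constant.
days_per_billion = 1_000_000_000 / discord_per_day

print(ratio)             # 50.0
print(days_per_billion)  # 25.0
```

At that rate, "billions of messages" is a few months of persistent history rather than a single day of traffic.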

Can users at least opt out of persistent chat history? Or define a timeframe after which messages are deleted?

You are basically confirming that your company is storing a lot of personal data without user-specific encryption. This is pretty scary and I hope you have some improvement to this situation on your roadmap. If not, you are a "leak" away from a big problem.

Cool features are neat, but in 2016 privacy should not be seen as a secondary feature...

Thanks for the informative response. I will look into how difficult it is to connect to a server using a user account from an IRC client, as that would make the experience much nicer for users like me.

I'm curious about the logging API permissions - it seems kind of weird that I could potentially join someone's Discord server and then download logs of their conversations for the past year instantly after joining, but I suppose this is already possible by viewing history in the client?

EDIT: looking at the API on https://discordpy.readthedocs.io/en/latest/api.html, it seems you need permission for the channel logs, but that can't prevent someone from writing code to collect them manually, regardless of permissions?

Discord has a pretty in-depth permission system that allows per-channel/per-user setup.

If a server allows a user to view the message history (which basically means that when you enter the channel, you can see previous messages and scroll up), then yes, that user can write a bot to save all the messages. I don't really see what the issue is here.

That to me really is one of the main reasons I prefer Discord to IRC. It's the fact that you can join a channel from any device and see past conversation. But of course, if for security reasons you don't want that, you can very easily disable message history and have it act like IRC does.

The channel log permission only applies to logs before you joined. You can always scroll up to the point you joined the server.

It's pretty easy to set up the discord-irc bridge, assuming you're referring to bitlbee. I already use it to have an IRC interface for Facebook, Google Hangouts, etc., so Discord was just adding a plugin and configuring the account, which took about 10 minutes total.

> Discord seems to take security seriously

Any app that has voice turned on whenever it detects sound by default, without prompting the user on installation, doesn't take security seriously.

I mean, unless you expect a communications app, running in the background, to share the conversation you're having in your room, without telling you, with everyone in every channel, until you discover it in your user preferences.

(I'm going to assume you're going to misunderstand what the issue here is. It listens by default, like when you install, and you're not prompted that it's the default. Contrary to every other communications or microphone app in existence, save for ones that are designed to spy on people).

I don't think this is a "security" issue as much as it's a usability or privacy issue, and I don't think it's an example of Discord being evil.

For a start, it's not quite "on install", but after joining your first voice channel. The issue comes from the interaction of a series of reasonable steps that on the whole result in an unfortunate experience for some people. The problematic series:

* By default, Discord uses voice detection to determine when you're speaking, as opposed to push-to-talk. This feature makes perfect sense.

* By default, Discord configures itself to start up on login. This feature makes sense. (I personally immediately turn that option off, but I don't resent its inclusion.)

* When started, Discord rejoins any voice channel you were in when Discord was last exited. This feature also makes perfect sense. [Edit: Apparently this is no longer true, and Discord will only rejoin the channel if you were active within the past 5 minutes.]

Essentially the result of these design decisions in series is [edit: was] that if you install & use Discord, and fail to manually disconnect from your voice channel, next time you start your computer Discord will automatically join your last channel and broadcast any loud enough audio in the same room as your computer to the voice channel.

There are a few mitigating factors, too: Discord is pretty obviously open and on the screen when this happens, and it does show your active voice channel, and it does show an activity indicator when you're broadcasting.

> Essentially the result of these design decisions in series is that if you install & use Discord, and fail to manually disconnect from your voice channel, next time you start your computer Discord will automatically join your last channel and broadcast any loud enough audio in the same room as your computer to the voice channel.

It's worth noting that Discord no longer does this if you've been away from the voice channel for more than 5 minutes. The feature was intended to autoreconnect you when the app was restarted due to updates and such, not to cause people to accidentally broadcast themselves on system start.

Ah, excellent. I wasn't aware they'd added a timer to the channel reconnection. That's an elegant way of solving the problem without compromising the important part of the features.

What? No it doesn't?

The very first time, you need to very clearly UNMUTE yourself manually by pressing a button, even after you join a voice channel.

After that first time, joining a voice channel will enable your mic, and by default it also makes a clear sound. There is a feature that reconnects you if you were connected to a voice channel before you left (though it seems to be limited to 5m now).

But again, unless you 1. connect to a voice channel and 2. press the unmute button that first time, there is no listening happening...

You can verify all this on Chrome, since there, they have to specifically ask for your permission before accessing your microphone, so you know exactly when they do it. Open Discord on Chrome and play around, join a voice channel, unmute, and only then it will ask for access.

I would consider this more of a privacy issue than a security issue - but something that should nonetheless be communicated clearly to user. Maybe these terms don't seem worth differentiating, but I see privacy as "what info do they collect" and security as "how safe do they keep my info, after they collect it".

Voice activation is the default in all popular VOIP products. You have to specifically join a voice channel, and it will display that on the bottom left if you're in one. Mumble, TS, and especially Skype all use voice activation from initial setup. It's the norm, and people expect it and, more so, want it. I hate voice activation, personally, but with how Discord is used as a group chat in replacement of Skype, it makes sense to default to it.

> Any app that has voice turned on whenever it detects sound by default, without prompting the user on installation, doesn't take security seriously.

That's an opinion.

It's really interesting to see that you're using Cassandra for this. IIRC, Cassandra was created by Facebook for their messaging, and they realized that eventual consistency was a bad model for chat, so they moved to HBase instead. (source: http://highscalability.com/blog/2010/11/16/facebooks-new-rea...)

The tombstone issue was really interesting! Thanks for sharing.

They did make the original and the core model is the same, but Cassandra in 2017 is quite different from what they open sourced in features, usability and stability.

I guess everyone makes their own tradeoffs though. This has been working wonderfully for us.

You can have strong consistency in Cassandra - ING gave a talk at Cassandra summit 2015 on their multi-DC Strongly consistent use cases

Cassandra lets you choose - per query - how many replicas must ack the query. Strong consistency is just a query parameter away.

Need to be careful about the wording, "strong consistency". I dislike that datastax uses that in their documentation, because it's misleading and really confuses people. There is no commit protocol in place - Cassandra is still an AP db under the hood, so even having multiple replicas acknowledge doesn't mean the data is consistent. For that you need paxos or something similar. This becomes very obvious if you are doing updates from multiple sources to the same key.

You do realize that Cassandra implements paxos and has a CAS system right?

It is fascinating that more and more people are using Cassandra. DataStax believes they have fixed the problems with prior guarantee claims that were exposed by Jepsen. But there has been no official Jepsen testing since.

On the topic of looking at Scylla next, I wonder why the team did not just start out with it to begin with. Also, are there people here with experience running both? How is the performance? And what is the state of reliability?

The problems that Jepsen found were centered around the "transactions" feature that Cassandra added. We don't use these and don't need them since we don't need 100% consistency and prefer availability (for example we read at quorum to trigger read repair, but downgrade to single node reads if we need to).
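The "read at quorum, downgrade to a single node" behaviour mentioned above can be sketched roughly as follows. This is an illustration only, not Discord's code and not a real driver API: replicas are simulated as in-memory dicts, `None` marks a replica that is down, and all function names are hypothetical.

```python
class Unavailable(Exception):
    """Raised when too few replicas are alive for the requested level."""


def required_replicas(level: str, replication_factor: int) -> int:
    """Number of replicas that must answer for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")


def read(replicas, level):
    """Read from simulated replicas; each is {'value', 'ts'} or None if down."""
    alive = [r for r in replicas if r is not None]
    needed = required_replicas(level, len(replicas))
    if len(alive) < needed:
        raise Unavailable(f"{level} needs {needed} replicas, {len(alive)} alive")
    consulted = alive[:needed]
    # Among the replicas consulted, the newest timestamp wins.
    return max(consulted, key=lambda r: r["ts"])["value"]


def read_with_downgrade(replicas):
    """Prefer QUORUM; fall back to ONE when QUORUM cannot be satisfied."""
    try:
        return read(replicas, "QUORUM"), "QUORUM"
    except Unavailable:
        return read(replicas, "ONE"), "ONE"


stale = {"value": "hello", "ts": 100}
fresh = {"value": "hello (edited)", "ts": 105}

print(read_with_downgrade([stale, fresh, None]))  # one replica down: QUORUM still works
print(read_with_downgrade([stale, None, None]))   # two down: falls back to ONE
```

The second call returns the stale value, which is the trade being described: the read stays available at the price of possibly missing the latest write.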

Also ScyllaDB is a new product and it would be crazy to start off with it. We plan to run a long-term double write experiment before we are comfortable with using it as a primary data store.

The Jepsen tests were not completely centered around transactions. They also had to do with data loss when replicas go down and the pure "last-write-wins" approach. For those wanting more info around this, the original post is here:


Last-write-wins is a semantic we are okay with for this data, and dealing with one of our conflicts is outlined in the article.
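For readers unfamiliar with the term, last-write-wins can be sketched as a per-column merge that keeps whichever value carries the newer timestamp. This is a simplified illustration under my own assumptions: `Cell`, `merge` and `merge_rows` are hypothetical names, and ties are resolved arbitrarily here, which is not how a real database necessarily breaks them.

```python
from typing import Dict, NamedTuple, Optional


class Cell(NamedTuple):
    value: Optional[str]
    timestamp: int  # write time, e.g. in microseconds


def merge(a: Cell, b: Cell) -> Cell:
    """Keep the cell with the higher timestamp (ties: arbitrary in this sketch)."""
    return a if a.timestamp >= b.timestamp else b


def merge_rows(a: Dict[str, Cell], b: Dict[str, Cell]) -> Dict[str, Cell]:
    """Merge two replicas' copies of a row, column by column."""
    merged = dict(a)
    for column, cell in b.items():
        merged[column] = merge(merged[column], cell) if column in merged else cell
    return merged


replica_1 = {"content": Cell("hello", 100)}
replica_2 = {"content": Cell("hello, world", 105), "edited": Cell("true", 105)}

row = merge_rows(replica_1, replica_2)
print(row["content"].value)  # hello, world
```

The older write is dropped silently, with no error raised anywhere, which is why the semantic has to be acceptable for the data in question.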

I enjoyed your article and I do appreciate your transparency!

I find it fascinating that people still think Cassandra is some risky new tech - been running it in production since 2010, and the fact that people are still worried about it makes me snicker a bit.

Not everyone needs strong consistency; that's why those Jepsen reports are not relevant for all cases.

The whole idea behind the Jepsen reports is not that people need strong consistency. It is that products should tell you precisely what they guarantee and what they don't.

This is why Aphyr always tests claims.

We're running Scylla in production. Much easier to setup, no tuning necessary, great performance, no issues. It has a strong team behind it.

Missing some features that Cassandra has that should be fixed by v2. Read their site/blog for a good overview of the tech and progress.

Cassandra now runs Jepsen regularly. If you don't believe the issues are fixed, you can check for yourself: https://www.youtube.com/watch?v=OnG1FCr5WTI

> While Cassandra has schemas not unlike a relational database, they are cheap to alter and do not impose any temporary performance impact

In most relational databases, the schema is cheap to alter and does not impose a temporary performance impact.

In fact, all of their requirements (aside from linear scalability) could also be met with a relational database. Doing so would give you much more flexible querying for various reports, and it would reduce the engineering effort required for retrieval of data as they add more features (relational databases are really good at being queried for arbitrary constraints).

I think people tend to dismiss relational databases a bit too quickly these days.

a) I'm not aware of any relational database that can alter a schema in real time on a hot table with billions of records.

b) You were quite okay to just dismiss scalability there except that's the most important requirement for a company such as this. People don't just choose Cassandra lightly given how significant its tradeoffs are.

c) Most companies are offloading analytics/reporting workloads into Hadoop/Spark and then exporting the results back to their EDW. This allows for far more functionality and keeps your primary data store free from adhoc internal workloads.

d) Nobody dismisses relational databases quickly. In almost all cases they are the first choice because they are so well understood. The issue is that most of them do have issues with linear scalability, and the cost to support them is quite prohibitive, e.g. Teradata, Oracle.

Regarding a), a sharded PostgreSQL (i.e. with Citus Data) can easily accommodate that workflow with just a tiny bit of extra overhead.

Re c) this strongly depends on the usecase, I've seen companies use a) to avoid split-brain problems and having to manage two data pipelines to great success at similar scales. You might find https://www.youtube.com/watch?v=NVl9_6J1G60 interesting.

MemSQL can support online schema changes on big tables.

I sort of agree with threeseed and the GP comment, so upvotes to both of you.

1) "Altering schema" is vague. It was used vaguely in the article (although, given the clarity of the article, I suspect the authors knew exactly what they meant). Some alterations on relational database tables are fine, even when the table is hot and has billions of rows. Others are not.

Add a new column: fine. Index the new column: fine. Create a new index on a column with billions of rows: definitely not fine.

But the index plan described in the article was very specific about what they wanted. It doesn't sound like they had to add any new indices.

2) Mostly agree here. Linear scalability is a big deal here, and it's fucking hard to do well for most RDBMS systems. I slightly disagree, however, because the article explicitly states that the requirements are willing to trade C for A in CAP theorem. This is important. The hardest parts of linear scaling in RDBMS are enforcing C. Think transactional systems that absolutely must be consistent. Like your bank account. This isn't that, and the blog post clearly states it. Takes a lot of pressure off the relational database when it comes to scaling.

3) Strongly disagree. Most companies don't have the resources or manpower to do that. It takes a lot of time and a lot of effort. Hell, most companies don't even have an EDW. Let alone a pipeline from the OLTP server to Spark/Hadoop to the non-existent EDW.

4) We seem to run in different circles. Almost everyone I know dismisses relational databases without question. Mongo is the way to go. And I get called out as the resident old fart/luddite who insists on using postgres. Speaking of which, if the first things you think of with relational DBs are Teradata and Oracle, we are definitely operating in different contexts.

If your opinion is that relational databases are generally well understood by--and therefore often the first choice for--developers . . . I want to know where you work.

Because that's not a different context from where I am.

That's a different universe.

The reality is that storing and retrieving data is a hard problem, and there's no set answer that works for everyone in every situation. If you're building a new product from scratch, you should go with what your team knows, provided that the team knows enough to not put yourself in the situation where you're just losing data in a partition scenario (well-made point in the original article. Mongo is fine on one node. Scale it out, and you might as well write your data to /dev/null)

Almost any datastore will serve the needs of a new product until it needs to scale horizontally. Relational, NoSQL, Object store, whatever. When it comes to scaling linearly, you have to take factors into account.

1) Which part of CAP theorem are you willing to sacrifice? You always have to let go of one.

1a) If you want a CP system, you have no choice but to deal with scaling problems of relational databases. You must have transactional guarantees for this to work.

1b) If you need an AP system, you have choices, but the choices lean in favor of systems like Cassandra. It's just easier than setting up multi-node Postgres and doing sharding.

It's also worth pointing out that people very often dismiss vertical scaling too soon. Take a look at Joel Spolsky's articles about infrastructure at StackOverflow. You can do quite a lot with the available firepower of modern technology by just buying bigger and better hardware.

I'm not suggesting that going bigger would have been the right choice for Discord. But sometimes it can be the right choice.

If there's something I fundamentally disagree with about the article, it's this: trying to do everything in a single data store. I think--much like what you suggested above--that it's better to have separate systems for reading and writing. Since the use case is definitively AP, I can't see a reason not to have a transactional system in an RDBMS and a streaming pipeline to a cassandra cluster for reading.

Use the right tools for the right job, is basically my point.

> 4) We seem to run in different circles. Almost everyone I know dismisses relational databases without question. Mongo is the way to go. And I get called out as the resident old fart/luddite who insists on using postgres. Speaking of which, if the first things you think of with relational DBs are Teradata and Oracle, we are definitely operating in different contexts.

I suppose I run with more...sensible devs? I mean a lot of my co-workers are Millennial Hipster Rubyist types, and they'll pick a Postgres or MySQL database literally every time and never leave it with their cold dead hands. One team here even built their own queuing system on top of some Ruby and MySQL. (Please don't ask. They had...reasons but they basically reinvented Kafka.)

These same teams really try to avoid Redis, also.

Of course, these teams are writing REST APIs with very strict SLAs. Most of the time I see MongoDB and other "NoSQL" DBs used is when you have front end JS devs writing the Node backend code. >.>

> The hardest parts of linear scaling in RDBMS are enforcing C.

The hardest part of linear scaling in RDBMS is actually doing the scaling - it's "what do I do when I'm about to outgrow a master and need to add a bunch of capacity", and "what do I do when the master crashes". At Crowdstrike we would add 60-80 servers to a Cassandra cluster AT A TIME, no downtime, no application changes, no extra work on our side - just bootstrap them in, they copy their data, and they answer queries. The tooling to do that in an RDBMS world probably exists at FB/Tumblr/YouTube, and almost nowhere else.

> Think transactional systems that absolutely must be consistent. Like your bank account

Most banks use eventual consistency, with running ledgers reconciled over time.

> It takes a lot of time and a lot of effort. Hell, most companies don't even have an EDW. Let alone a pipeline from the OLTP server to Spark/Hadoop to the non-existent EDW.

In the Cassandra world, it's incredibly common to set up an extra pseudo-datacenter, which is only queried by analytics platforms (Spark et al). Much less work, and it doesn't impact the OLTP side.

> 1a) If you want a CP system, you have no choice but to deal with scaling problems of relational databases. You must have transactional guarantees for this to work.

This is fundamentally untrue - you can query cassandra with ConsistencyLevel:ALL and get strong consistency on every query (and UnavailableException anytime the network is partitioned or a single node is offline). Better still, you can read and write with ConsistencyLevel:Quorum and get strong consistency and still tolerate a single node failure in most common configs.
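The QUORUM claim follows from simple counting: with replication factor N, any read of R replicas must share a node with any write of W replicas whenever R + W > N. A minimal sketch that checks this by brute force; the function names are illustrative and not part of any Cassandra driver:

```python
from itertools import combinations


def quorum(replication_factor: int) -> int:
    """Size of a majority quorum for the given replication factor."""
    return replication_factor // 2 + 1


def always_overlap(n: int, r: int, w: int) -> bool:
    """True if every set of r read replicas shares a node with every set of w write replicas."""
    nodes = range(n)
    return all(
        set(readers) & set(writers)
        for readers in combinations(nodes, r)
        for writers in combinations(nodes, w)
    )


def tolerated_failures(n: int, level: int) -> int:
    """Number of replicas that can be down while `level` replicas can still respond."""
    return n - level
```

With N = 3, `quorum(3)` is 2, so QUORUM reads and writes always overlap and one replica can be down, while ALL (3 of 3) tolerates none.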

> Use the right tools for the right job, is basically my point.

And this is the real point, with the caveat that you need to know all the tools in order to choose the right one.

The fuck are you even talking about?

1) scaling is easy . . . oh Cassandra. Where you can't have C and don't care about P.

2) Let me tell you about banks. I used to work for banks. Banks do not use systems that are eventually consistent. Banks use systems--however old and outmoded--that are strongly consistent. Banks do not use systems that are eventually consistent except for ACH transfers. And that's not a database. That's a flat file.

3) There is no cassandra world that you speak of. This is utter bullshit.

4) No it's not untrue. Cow--as we call it on my team--absolutely sucks at C when you're talking about scaling horizontally.

Make up your mind. Is this good at single node guarantees or is it good at sharded guarantees?

Pick one.

We know for a fact that if you want CAP, you can't have all three. You can have AP or CP, but you can't have all of them. If you're arguing that you can have C and A, you have failed at P.

Maybe that's a thing you're willing to trade-off. But it doesn't in any way relate to my point.

My point, if you missed it, was this: if you want strong consistency, you need a relational database, and you need transactional guarantees.

That is hard to do, and no one does it well yet. You're just lying to people if you say otherwise.

I don't know what your background is, but I'm really encouraged by the fact that I've worked my whole career without having to deal with people that behave like you.

> 1) scaling is easy . . . oh Cassandra. Where you can't have C and don't care about P.

This isn't about teaching me the CAP theorem. I know the CAP theorem. I know the tradeoffs. I've built and managed systems that handle hundreds of billions of events a day, writing millions of writes a second into a thousand Cassandra nodes. You can have C, if you want it - you don't get transactions with rollbacks, but that doesn't mean you don't have consistency.

> 2) Let me tell you about banks. I used to work for banks. Banks do not use systems that are eventually consistent

All this time, I thought ING was a bank: https://www.youtube.com/watch?v=EiqdX23u_Mk

Also: http://highscalability.com/blog/2013/5/1/myth-eric-brewer-on...

> 3) There is no cassandra world that you speak of. This is utter bullshit.

I see, lame troll or wholly clueless. Guess I'm done.

Cassandra lets you tweak the C or A trade-off on each query by setting the consistency level. So yes, the same system can provide both guarantees, depending on which you need.

> if you want strong consistency, you need a relational database

You can have consistency without transactions.

Overall, it really sounds like you've never used Cassandra or don't know much about it, and are possibly confused about the CAP theorem.

> Banks do not use systems that are eventually consistent.

Got a citation for that, skipper?

Never mind, I found one: http://bfy.tw/9bcO

FYI Banking is one of the most eventually consistent systems. It's an overused example for consistency that tends to not be correct.

These people are still commenting on use cases that clearly don't apply to them? I realize the Lords of Data like to lock themselves in their Oracle-built towers, lock away your data and access, and then every once in a while issue a fiat...

But you'd think they'd read about CAP theorem and cloud architectures while they're hiding from everyone.

> I think people tend to dismiss relational databases a bit too quickly these days.

In too many cases, people conflate RDBMS with MySQL, where any schema mod is time-consuming on large tables, even when adding nullable columns with no other constraints.

I call it the "MySQL Effect", aka NoMySQL.

Love Discord. Most of my friends and I have switched over from using Mumble and it's been great.

I run a small Mumble host [1] and I've always thought of the idea of wrapping the Mumble client and server APIs to function like Discord/Slack as an open source alternative. Mumble is great and all, but the UI/UX appeal of Discord is so much better.

Keep up the great work!

Also, is this the same Stanislav of Guildwork? Ha, I remember when Guildwork was being formed back in the FFXI days.

[1] https://guildbit.com

I am :) glad people remember me from FFXI haha

Add me on Discord if you ever wanna chat, Stanislav#7943

Wildly biased Cassandra person, but I find this very well written and explained, and I'm especially happy that when you bumped into problems like wide partition and tombstone memory pressure, you didn't just throw up your hands, but you worked around it.

The wide partition memory problem should be fixed in 4.0, for what it's worth.

Discord missed an opportunity a year or two ago to become something like Slack for large companies. HipChat's perf is horrible and Slack couldn't scale to 20k+ users a year ago. Managing a Mattermost instance requires staff and is more outage prone.

It's really too bad that they didn't take advantage of it, since they were actually scalable compared to their competitors and had good voice chat. Slack has started becoming more scalable recently, so I don't know how much the opportunity is still there.

I think it makes sense for Discord to stick to its gaming niche, rather than trying to do a bunch of things poorly.

The other market is a bit more saturated, with Microsoft and a few others piling on top too. Whereas the Gaming market was completely lacking. There were a few clients which focused much more on voice (mumble, vent, ts), but nothing quite like Discord. Free and one-button to make a new server.

The only other contender I can think of is Raidcall, but that was a joke... Now there's Curse but they came too late to the market and were DoA, except the people they forced to use it by paying thousands.

We are larger than Slack

We've been using Discord a bunch at our company (HearthSim). We have a server for our user community, one for our open source org and one for our company. It's superb, works so much better than Slack ever could.

Are companies a market you're serious about? There is so much focus on gaming, it's hard to be sure. I mentioned to support recently one of our prime issues as a company is being limited to a single owner per server.

PS: Are you the same Stanislav Vishnevskiy I'm thinking about? I remember working on Guildwork with you!

I sure am!

Add me on Discord if you wanna chat Stanislav#7943

In what metrics? Number of users? Number of paying users? Valuation? Headcount?

Seconding this question. Can you share the equivalent discord numbers for the slack numbers at the link below?


We currently don't share exact metrics for all those stats, but we have shared a few press releases and blog posts which you can easily extrapolate from. :)

I'm pretty sure he's suggesting number of users. Looks like roughly 11 vs 4 million?

Definitely not paying users / valuation currently. Slack is pretty massive there.


I use Discord exclusively for my non-profit organizations and couldn't be happier (except lack of search bar, but that's coming soon).

Look again, search was deployed very recently :)

Edit: Appears it's still missing from some servers!

It's a phased deployment. Still not added to mine, sadly.

If you download their Discord Canary build then you'll have it.

>Hipchat's perf is horrible

I've noticed that Slack performs a lot worse than HipChat.

HipChat gets terrible in a 20k-person organization, in things like only sometimes delivering pushes to your phone or fetching chat room history when you open the app. I don't know how Slack performs in a similar situation.

If you're deleting often, I recommend running a full compact (after your repair) to free up space and rid yourself of those tombstones once and for all. Repairs without compacts make those SSTables grow and grow. It's amazing how much space a compact clears up.

I had to delete a shitload of data from Cassandra recently and it required dropping gc_grace_seconds to a very low value in order for the tombstoned records to be dropped during compaction (this was mentioned in the article)

Full compactions really shouldn't be needed - if the tombstones are problematic you may be able to data model around it.

Not surprised to see other companies facing issues with Cassandra and tombstones. Don't get me wrong, I understand the need for tombstones in a distributed system like Cassandra... It doesn't make it any less of a pain though :).

The tombstone problem described is due to misuse - probably from improper use of prepared statements. Looks like they worked around it well.

> The tombstone problem described is due to misuse.

My concerns with Cassandra are precisely here: it is easy to misuse.

There are a lot of constraints on the schema (more specifically on the design of partition & clustering keys). Each choice leads to many restrictions on what can be requested/updated/removed; and to different issues with tombstones and GC.

Discord's story is exactly what I experienced: a series of surprises, and even really bad surprises in production. In both cases, the story ended with an efficient system, but with far more engineering work and rework than initially planned.

This article is extremely similar to almost every other "OMG! We used Cassandra and it was nothing like a SQL database!" article by Netflix, Spotify, and so many more. The fact that every single one contains the same 6 or 7 self-inflicted issues is pretty funny to me. I mean, I thought we lived in the age of the Google Dev.

Cassandra does require you to know more of its internals than most other data stores. Unfortunately, the move to CQL and very SQL-like names for things that are nothing at ALL like their SQL counterparts is not helping.

Also, our own personal death by tombstone: A developer who didn't even know those existed checked in some logic that would write null into a column every time a thing succeeded.

After that passed QA and went into production, all hell broke loose with queries timing out everywhere. SUCH FUN.
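One way to guard against this kind of accident is to drop null-valued columns before the write is built, since a null written to a column acts as a delete and leaves a tombstone. A minimal sketch under assumptions: `build_insert` and the table and column names are hypothetical, and the `?` placeholders assume a prepared-statement style driver:

```python
from typing import Any, Dict, List, Tuple


def build_insert(table: str, row: Dict[str, Any]) -> Tuple[str, List[Any]]:
    """Build a parameterized INSERT that skips columns whose value is None."""
    # Table and column names are assumed to be trusted constants, not user input.
    present = {col: val for col, val in row.items() if val is not None}
    if not present:
        raise ValueError("no non-null columns to write")
    columns = ", ".join(present)
    placeholders = ", ".join("?" for _ in present)
    cql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    return cql, list(present.values())
```

Calling `build_insert("messages", {"id": 1, "content": "hi", "edited_at": None})` leaves `edited_at` out of the statement entirely, so nothing is written for that column.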

> My concerns with Cassandra are precisely here: it is easy to misuse.

You get this massively scalable (to thousands of nodes and tens of millions of operations per second) database for free, and all you have to do is have your developers read about it before they use it. Is that expecting too much?

Let me ask this, then:

It's an OSS project. Let's pretend I'm a committer or on the PMC, and I'd like to fix this in a way that works for you. We need a null write to act as a delete. We need tombstones to make sure things that are meant to be deleted stay deleted. We have to have all the tombstones we scan in heap on reads to reconcile what's deleted and what's not deleted within a partition on any given slice query.

What would you want to see changed to avoid the tombstone problem? There are dozens of blogs around that say "don't write null values if you don't want a tombstone to be created" (like this article, or [0]), but beyond that, do you expect to see errors in logs?

We've made unbound values in prepared statements no longer equivalent to null/delete [1].

What else would you expect an OSS project to do to protect you from abusing it? Serious question.

[0] http://thelastpickle.com/blog/2016/09/15/Null-bindings-on-pr...

[1] https://issues.apache.org/jira/browse/CASSANDRA-7304

These same surprises keep being written about for Cassandra. At this point reading a few blog posts and the documentation (especially about how the data is stored) will cover all the issues you might have in production.

Basically - do the research first.

I'm one of the people who nagged you on the redis post, and particularly expressed skepticism that such a transition would've been necessary. I haven't read this yet, but I just want to say thanks for actually following up to that thread and posting it. Looking forward to it!


EDIT: Just read the post, and while it provides a good perspective on Discord's rationale to introduce Cassandra in the first place and does a great job pointing out some unexpected pitfalls, it doesn't specifically respond to replacing Redis with Cassandra due to clustering difficulty, per the prior thread. [0] Redis is only specifically called out as something they "didn't want to use", which I guess is probably the most honest answer.

The bucket logic applied to Cassandra seems like it could've been applied to redis + a traditional permanent storage backend nearly as easily. The biggest downside here would be crossing the boundary for cold data, but that's a pretty common thing that we know lots of ways to address, right? And Cassandra effectively has to do the same thing anyway, it just abstracts it away.

Again, I'm left wondering what specific value Cassandra brings to the table that couldn't have been brought by applying equal-or-lesser effort to the system they already had.

I also found it amusing that they're already contemplating the need to transition to a datastore that runs on a non-garbage-collected platform.

[0] https://news.ycombinator.com/item?id=13368754

This post was about using Cassandra for message storage.

You are basically advocating for plugging 2 systems together, which out of the box don't provide elasticity. Or we could just use Cassandra. It is a simpler solution and Cassandra is not new tech. Aside from caching empty bucket information we have nothing sitting in front of Cassandra. It works great and the effort was minimal.

The Redis comment by jhgg was referring to our service which tracks read states for each user. We might write about that later, but it's not as interesting. The most interesting part of that experience was reusing memory in Go to avoid thrashing the GC.

We care about seamless elasticity for our services, which Redis doesn't provide out of the box except with Redis Cluster, which does not seem to be widely adopted and forces routing to clients.

> and forces routing to clients.

Both Cassandra and Redis Cluster will forward queries to the correct nodes and both use client drivers that learn the topology to properly address the right nodes when querying.

Ah, I see. Thank you for clarifying that this is not the service to which jhgg referred.

Obviously, I haven't addressed this problem in-depth and I don't really know enough about the specifics to criticize the decision directly. It's completely possible that Cassandra was the perfect fit here.

The previous thread was in the context of switching things up without strong technical motivation. I said that actually, it does seem easier to fix a redis deployment than to write a microservice architecture backed by Cassandra, and that I hope to hear more about a stable production ethos from the tech community as a whole. There are a lot of "moving on to $Sexy_New_Toy_Z" posts, but not a lot of "we solved a problem that affected scaling our systems, and instead of throwing the whole component away and starting over, we actually did the hard work of fixing and optimizing it: here's how".

To address your specific complaints.

>You are basically advocating for plugging 2 systems together, which out of the box don't provide elasticity.

I mean, again, without getting in-depth, I'm not advocating anything (I feel like I need a "This is not technical advice; if you have problems, consult a local technician" disclaimer :P).

However, storing lots of randomly-accessed messages and maintaining reasonable caches are not new problems. There are lots of potential solutions here.

And while Cassandra is not "new tech" in the JavaScript-framework-is-a-grandpa-after-6-months sense, it's certainly "new tech" in the "I'm the system of record for irreplaceable production data" sense.

Cassandra is also among the less-used of the NoSQL datastores, putting it in a minority of a minority for battle-tested deployments. You mention Netflix and some other big name using it in production as part of your belief that it's stable. This, I think, is part of the problem.

These big companies use these solutions because

a) they truly do have a scale that makes other solutions untenable (although probably not the case with Cassandra itself);

b) they can easily afford the complex computing and labor resources needed to run, repair, and maintain a large-scale cluster. Such burdens can be onerous on smaller companies (esp. labor cost);

c) when they need a patch or when something starts going awry, they can pay anyone who's willing to make the patch, their own team not excluded. Often the BDFLs/founders of such projects end up directly employed by these big companies that adopt their tech.

"Netflix [or any other big tech name] uses it so we know it's stable" is a giant fallacy, IMO.

None of this is to say that Cassandra isn't a good choice for this problem or any other specific problem, because again, as a drive-by third-party I don't know. But contrary to what the article states, it hardly seems like Cassandra was the only thing that could've possibly fit the bill. I bet it could be done well with a traditional SQL database (which, from the body of the post that identifies Discord as beginning on MongoDB and planning to move to something Cassandra-ish later on, it doesn't sound like was ever tried or considered).

It's kind of like reading an RFP that was written by a guy at a government agency that already knew they really wanted to hire their brother-in-law's firm. "Must have $EXTREMELY_SPECIFIC_FEATURE_X, because, well, we must! And it just so happens that this specific formulation can only be provided by $BIL_FIRM. What d'ya know."

>We care about seamless elasticity for our services, which Redis doesn't provide out of the box except with Redis Cluster, which does not seem to be widely adopted and forces routing to clients.

First, you just admitted that Redis actually does have that feature that you're saying it doesn't have. "Redis Cluster" and "Redis" are the same thing. "Redis Cluster" is part of normal redis and afaik, while it requires additional configuration, it will automatically shard.

In any case, while I have no numbers, I would wager that Redis Cluster is more widely used than Cassandra.

Cassandra doesn't require all the data to live in RAM. At this scale, you need disk-based data access.

Cassandra was literally designed for this class of problem, and redis wasn't, isn't, and never will be.

The bucketing they're doing is within a partition - they still get all data in a logical cluster, which gives them transparent linear scalability by adding nodes without having to reshard, and they get fault tolerance/HA, reasonable fsync behavior, and even cross-wan replication - things you'd never get with redis.

Serious question: how do you back up a Cassandra database of that size? Do you even back it up, or just rely on the sharding to prevent data loss?

Cassandra has a snapshot command that creates a directory by hard-linking the files that hold data (this is safe because Cassandra data files are immutable). Then you just upload them to your backup storage. This is obviously for recovery scenarios that are catastrophic.

Normally though, since the data is replicated on 3 nodes, you can technically lose a node completely and rebuild it from the other nodes.
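The reason a link-based snapshot is cheap and safe can be sketched in a few lines: a hard link is a second name for the same immutable file, so nothing is copied and the snapshot survives deletion of the original name. This is an illustration of the idea only, not Cassandra's implementation; `snapshot` and the directory layout are hypothetical, and both directories are assumed to be on the same filesystem:

```python
import os
from pathlib import Path


def snapshot(data_dir: Path, snapshot_dir: Path) -> list:
    """Hard-link every regular file in data_dir into a new snapshot_dir."""
    snapshot_dir.mkdir(parents=True, exist_ok=False)
    linked = []
    for src in sorted(data_dir.iterdir()):
        if src.is_file():
            dst = snapshot_dir / src.name
            os.link(src, dst)  # new name for the same inode; no data is copied
            linked.append(dst)
    return linked
```

The files under the snapshot directory can then be uploaded to backup storage at leisure, even if compaction later removes the originals.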

Cassandra natively supports multi-cluster replication - so you can run an entirely separate cluster that also has a copy of the entire dataset (which itself has configurable replication within a cluster) which can be used as an online fully-active backup.

We run 3 geo-distributed clusters with no offline backups because of this.

I love Discord and use it on a daily basis. One of my gaming group's main concerns is the voice latency compared to TS, Mumble or Ventrilo, but this is mainly due to the inability to host your own server.

One of the big missing features we would like to have in Discord is the ability to assign a special permission to our group leaders so they can communicate over voice chat with other group leaders in other channels (global voice chat).

When we play PVP MMOs and have 40+ users all in the same channel calling shots, it's impossible to coordinate properly.

What we normally do is split the group in 4, so 10 players in each of 4 different channels, and each group leader calls shots independently BUT can also communicate via voice chat with the other group leaders. Basically there's a global voice chat for group leaders that no one else can hear.

Other than that Discord is amazing!

For a bit more information about the tombstone issue from the perspective of the person who caused it: https://www.reddit.com/r/programming/comments/5oynbu/_/dcnxy...

I'm curious why would people use a closed source software, when you can use something like https://riot.im

Please let me know. I may be missing something.

Games vs general use. By targeting gamers Discord is able to customize the experience to make it better for games.

Kind of like Twitch vs Youtube even



The incredibly slick ease of use/user onboarding doesn't hurt either.

why do people always push matrix on ycombinator when other companies launch their own chat products?

when there is an equally good x in open source, why would anyone use closed sourced alternative?

Is it not a valid question?

If it is the marketing, service or support then let me know.

Discord is great but I have intermittent performance issues with it that make it almost unusable in comparison to Slack which never has any noticeable latency.

This should not be the case, we strive for Discord to be lightning fast at all times.

Please email me stanislav@discordapp.com I would love more details.

I can't really give you "more details," I was trying to use Discord with some people and every couple minutes or so the chat would freeze up and then flood with messages. It wasn't just me, it was everyone in the room's Discord doing it, and it really doesn't seem like a client-side bug.

This may or may not be your issue. But I found that the Windows Defender system would start going nuts on files related to Discord. This would cause all sorts of problems (though most of the files involved seemed to be more related to their auto-update system, which shouldn't have impacted the running process... or so I would have thought!).

No this was on Mac...

This is pretty good to hear honestly. Competition is always good for the consumer. This rivalry between Discord and Slack will only make things better for everyone.

> Having a large partition also means the data in it cannot be distributed around the cluster.

Why can't a large partition be distributed around the cluster?

Cassandra uses consistent hashing. A partition is a segment of data identified by the partition key to determine which node in the consistent hash ring owns that data.

You can't break down partitions any further because it's just a name for the smallest cohesive set of data owned by a hash key, so instead it's advisable to use more partitions with data modeling rather than making them huge.
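A toy hash ring makes the point concrete: the owner is a function of the partition key alone, so every row under one key lands on the same node no matter how large the partition grows. This is a simplified sketch, assuming one token per node and MD5 as the hash for illustration; the `Ring` class and node names are made up:

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a key to a position on the ring (MD5 here purely for illustration)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class Ring:
    def __init__(self, nodes):
        # One token per node keeps the sketch short; real clusters use many.
        self._ring = sorted((token(node), node) for node in nodes)
        self._tokens = [t for t, _ in self._ring]

    def owner(self, partition_key: str) -> str:
        """First node clockwise from the key's token, wrapping around the ring."""
        i = bisect.bisect_left(self._tokens, token(partition_key))
        return self._ring[i % len(self._ring)][1]
```

Adding a node to the ring only moves the keys that now fall to the new node, which is why spreading data over more, smaller partitions rebalances well.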

Does anyone know what protocol/transport Discord uses? XMPP, web sockets, JSON, etc?

Our realtime messaging is done over Websockets using either JSON (for web/non-native) or ETF (http://erlang.org/doc/apps/erts/erl_ext_dist.html). Almost all user-based actions are sent over our HTTP API using JSON.

Since Cassandra is eventually consistent, how do clients get a consistent sequence of messages?

Do you actually use Cassandra range queries to poll for new messages, or do clients use some kind of queue to get notified?

Say messages A, B, C are created in that sequence. But isn't it then possible that a client asking for new messages only gets A and C, and B only shows up a few milliseconds later, which would be missed unless the client actually asked for "everything since before A"? Or is that not possible?

We have a real time distribution system written in Erlang/Elixir that keeps the clients in sync. We do not poll for messages. That wouldn't work at our scale :)

What's an upsert?

Update if exists, otherwise insert.
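As a minimal sketch, with a plain dict standing in for a table (the `upsert` helper is illustrative only):

```python
def upsert(table: dict, key, columns: dict) -> None:
    """Update the row if `key` exists, otherwise insert it."""
    table.setdefault(key, {}).update(columns)
```

Calling it twice with the same key merges the columns into one row instead of failing or creating a duplicate.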

You're storing messages, how are you guaranteeing safety of those messages when it looks like one can seemingly just blast through your API calls to find messages when one isn't even on that server?

A few seconds of research would have revealed that their API requires an authentication token[1]


And a few additional seconds of research shows that they don't change the token unless you change your password, as opposed to having the token expire after every session.

Very bad security practice.

A few seconds of pen-testing tells me their token revocation is pretty shoddy. I'm now spying on my fiance's Discord chat.

Bravo to those that downvoted me without even bothering to use their brains (let alone test something.)

The normal token system revokes on password change, if you want to revoke and have extra security we offer MFA login which has unique tokens per login. If security is of importance to you then use MFA.

Why are you not revoking tokens after session end across the board? Token re-use is one of the faster-rising security breach factors nowadays.

Your first claim seems to be wrong - why wouldn't people downvote it?

As one of the project coders notes - "The normal token system revokes on password change"

Very bad security practice. Token revocation is hard, folks. If you aren't making that token expire every session then you're doing it wrong.

In fact, this was even discussed here on HN 1600+ days ago - http://homakov.blogspot.com/2012/07/saferweb-most-common-oau...


We make friends with owners of large public Discords and they had no problem with being included. The owner of that Discord even enjoyed reading the blog post.

You can also search past messages, and the blog post ends by saying the follow-up will be about search. Search is not rolled out to everyone yet but will be next week.


Neat. That seems like a major addition.

Can I download the full transcript of a channel for safekeeping in case things are deleted in the future?

We don't have an archive downloader, but you can use our API to scan over the history and save it.
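Scanning history this way usually comes down to a paging loop. The sketch below is generic and does not use any real Discord endpoint: `fetch_page` is a hypothetical callable you would supply, assumed to return messages newest first, each with an `id`, and to accept `before` and `limit` arguments:

```python
import json


def archive_channel(fetch_page, out, page_size=100):
    """Walk a channel's history backwards and write one JSON message per line."""
    before = None
    total = 0
    while True:
        page = fetch_page(before=before, limit=page_size)
        if not page:
            return total
        for message in page:
            out.write(json.dumps(message) + "\n")
        total += len(page)
        before = page[-1]["id"]  # oldest message seen so far
```

`out` can be any writable text file object; a real script would also need to respect the API's authentication and rate limits.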


Thanks for sharing. I guess I'm not quite in the target market. It seems like a very interesting evolution beyond prior closed chat systems like AIM, ICQ, etc.

They didn't do anything wrong. They just used a feature Discord provided. It was Discord who failed to anticipate a possible problem with the feature. It's not like they are blaming or publicly shaming someone.

Wow! This is an incredible article. I do research and development for systems like this at GUN, and this article nails a lot of important pieces, particularly their ability to jump to an old message quickly.

We built a prototype of a similar system that handled 100M+ messages a day for about $10, 2 minute screen cast here: https://www.youtube.com/watch?v=x_WqBuEA7s8 . However, this was without FTS or Mentions tagging, so I want to explore some thoughts here:

1. The bucketing approach is what we did as well, it is quite effective. However, warning to outsiders, this is only effective for append-only data (like chat apps, twitter, etc.) and not good for data that gets a lot of recurring updates.

2. The more indices you add, the more expensive it gets. If you are getting 100M+ messages a day, and you then want to update the FTS index and mentions index (user messages' index, hashtag index, etc.) you'll be doing significantly more writes. And you'll notice that those writes are updates to an index - this is the gotcha and will increase your cost.

3. Our system by default backs up / replicates to S3, which is something they mention they want to perhaps do in the future. This has huge perks to it, including price reductions, fault tolerance, and less DevOps - which is something they (and you) should value!
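The bucketing in point 1 can be sketched as a time window folded into the partition key, so each partition holds a bounded slice of a channel's append-only history. The 10-day window and all names below are assumptions for illustration, not figures taken from this thread:

```python
BUCKET_MS = 10 * 24 * 60 * 60 * 1000  # assumed 10-day window; tune to message volume


def bucket_for(timestamp_ms: int, bucket_ms: int = BUCKET_MS) -> int:
    """Bucket number for a message timestamp in milliseconds."""
    return timestamp_ms // bucket_ms


def buckets_newest_first(start_ms: int, end_ms: int, bucket_ms: int = BUCKET_MS) -> range:
    """Buckets covering [start_ms, end_ms], newest first, for paging back through history."""
    return range(bucket_for(end_ms, bucket_ms), bucket_for(start_ms, bucket_ms) - 1, -1)


def partition_key(channel_id: int, timestamp_ms: int) -> tuple:
    """(channel_id, bucket) pair that keeps each partition bounded in time."""
    return (channel_id, bucket_for(timestamp_ms))
```

Jumping to an old message then means computing its bucket directly rather than scanning everything in between, which is also why the scheme suits append-only data better than rows that are updated repeatedly.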

Their backend team is amazingly small. These guys and gals seem exceptionally talented and are making smart decisions. I'm looking forward to the future post on FTS!
