It's far more valuable to understand why Discord uses Cassandra than to merely be aware they do.
Out of curiosity, did you consider HBase and Riak? Did you entertain going fully hosted with Bigtable? If so, what criteria resulted in Cassandra winning out?
Riak is not a good model since it's more of a blob store, and we wanted to simply range scan through messages rather than shard blobs (Cassandra is REALLY good at this).
HBase would have been fine for this model, but the open source version of HBase has much lower adoption than Cassandra, so that was a big factor. We also don't need strong consistency, and HBase is a CP database; we prefer AP for this use case. As for using GCP's Bigtable (HBase-compatible), we made this decision before we moved to GCP, but we are also not fans of platform lock-in. While Bigtable has the same API as HBase, we would hate to end up on a less widely adopted version where we would have a hard time getting community support if we decided to leave GCP.
Hope that helps.
Did you consider GCP Datastore as well?
It has strong consistency for a single "entity group", but eventual consistency for queries on multiple entity groups.
So by storing data only relevant to a single user in an entity group, you can have strongly consistent, atomic transactions on that group (albeit limited to 1 tx/s), and at the same time do global queries on all user data with eventual consistency.
Hint: if you have technical reasons for avoiding GCP Datastore, I'd be very interested in hearing about them.
In 2017 they will finally have datacenters in Sydney, London, Singapore, Frankfurt etc.
Can you tell us a little bit more, please? Range scans are done using secondary indexes (indexed by timestamp) in our system. I'm not sure I understood the part about blobs, or the things specific to Cassandra. A reply is highly appreciated.
If you use a compound primary key (partition key plus clustering columns), the resulting rows are all stored in the same partition (which lives entirely on the nodes that own or replicate that partition in the consistent hash ring), so scanning those rows is very fast and efficient.
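To illustrate with a hypothetical CQL sketch (not Discord's actual schema; the table and column names are made up): putting the channel in the partition key and the message ID in a clustering column means every message for a channel lives in one partition, sorted, so a range scan is a single-partition read:

```sql
-- Hypothetical schema: channel_id is the partition key,
-- message_id is a clustering column. All rows for one channel
-- are colocated and kept sorted by message_id.
CREATE TABLE messages (
    channel_id bigint,
    message_id bigint,
    author_id  bigint,
    content    text,
    PRIMARY KEY (channel_id, message_id)
) WITH CLUSTERING ORDER BY (message_id DESC);

-- Paging backwards through history is an efficient
-- single-partition range scan:
SELECT * FROM messages
WHERE channel_id = ? AND message_id < ?
LIMIT 50;
```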
Is this due to the availability of experienced developers or another factor?
PostgreSQL has a lower adoption rate than MySQL, but we chose it due to its suitability to the tasks at hand. As long as the adoption rate is not low enough to give concern about the longevity of a tool, I'm less concerned about it than other factors.
I agree completely. It is frustrating that no decent books have been written regarding scaling architectures/strategies with current tooling. One has to scavenge various blog posts to try and discover ideas that might help solve their growth issues. I would love to see a book that covers scaling for app servers, RDBMSes, NoSql dbs, using queues/messaging effectively, etc. Failing that, I'd like to see something like Scalers at Work (a la Coders at Work) which would interview different devs who had to solve scaling issues.
With respect to using queues/messaging: http://www.enterpriseintegrationpatterns.com/
And with respect to understanding this stuff in general: http://dataintensive.net/
It doesn't help that distributed systems are a dark art, that many open source and free-to-use tools that developers have access to gate the HA/clustering features behind steep pricing (though I sympathize it's one of the few effective ways to make money in open source), and that expertise with scaling is very often a competitive advantage.
The first part is mainly about Erlang and the choices they made. But the last part is not at all specific to Erlang, and walks you through all the decisions involved in building that type of architecture.
I realize this is a key part of the product, but the way I tend to use it is split into two modes:
- I hang out on a primary server with a few friends. We use it when we play games together.
- I get invited to someone else's server when I join up with them in a game.
The former use case is fine but the latter annoys me. I end up having N extra servers on my Discord client that I'll likely never use again. I get pings from their silly bot channels (seemingly even if I turn notifications off for that server/channel), and I show up in their member lists until I remove myself.
I wish there was a way to accept an invite as "temporary", so that it automatically goes away when I leave or shut down Discord. Maybe keep a history somewhere if I want to go back (and the invite is still valid).
Aside from that, it's a great product and really cleaned up the gamer-focused voice chat landscape. It confuses me that people will still use things like TeamSpeak or (god help you) Ventrilo when you can get a server on Discord for free with far better features.
Now that I posted this, I realize this has little to do with TFA. Sorry.
edit: formatting, apology
This kills me the most. When I turn off notifications for a server, I do not want to see the red dot on the app icon in my Dock.
I made a Discord group about a month ago and everyone I know is using it. If someone new wants to play, we add them to this group, so everyone is there.
Also we're not annoyed by calls anymore, as you only have to join the voice channel, instead of calling everyone.
Discord seems to take security seriously, as they should, but I'm curious about their stance on privacy and openness.
For example, I wonder if they would consider:
- Allowing end-to-end encryption to be used between users for private communications
- Allowing users to connect to Discord servers using IRC or other clients (or, at least having an API that easily allows this)
- Allowing users to have better control over their own data, such as providing local/downloadable logs so that they can search or otherwise use logs themselves
Discord is definitely succeeding within the gaming market, but I'm curious what other markets they would like to take a stab at.
I'm aware Discord has an API, but if I understand it correctly, normal users cannot easily use Discord from anything other than the official Discord apps, as this API is specifically for Discord 'bots'. I see there's a discord-irc bridge, but not much more than that. I may be incorrect on this.
- Some libraries support connecting through user accounts, and there are various third-party tools for "linking" chat rooms, incl. some client plugins for irssi and such. We don't officially support it, but it's definitely possible.
- Search is currently live on our alpha-testing client, and should be rolling out globally soon. It's also possible to save or log channels through the API fairly easily.
And if you are ever hacked, this whole chat database can be sucked up for free due to the lack of encryption?
I must be wrong. Seriously, what did I miss? This can't be right.
I don't think anyone's commented on the backend security situation (I'd hope they'd have messages encrypted at rest, but it doesn't seem that encryption has been a priority), just that they don't do E2E.
Thus what are those billions of messages they store in the database? Is it only detailed cache data for current conversations, or is it hardwired into PRISM or a commercial database? Why on earth should they store so much chat log?
Or maybe I'm just not aware of the popularity of Discord, but billions of messages makes me wonder, because as a comparison that's roughly the worldwide daily volume of iMessage.
So messages are probably stored longer than needed: how and why?
You are basically confirming that your company is storing a lot of personal data without user-specific encryption. This is pretty scary, and I hope you have some improvements to this situation on your roadmap. If not, you're a "leak" away from a big problem.
Cool features are neat, but in 2016 privacy should not be seen as a secondary feature...
I'm curious about the logging API permissions - it seems kind of weird that I could potentially join someone's Discord server and then download logs of their conversations for the past year instantly after joining, but I suppose this is already possible by viewing history in the client?
EDIT: looking at the API on https://discordpy.readthedocs.io/en/latest/api.html, it seems you need a permission for the channel logs, but that can't prevent someone from writing code to collect them manually regardless of permissions, can it?
If a server allows a user to view the message history (which basically means that when you enter the channel, you can see previous messages and scroll up), then yes, that user can write a bot to save all the messages. I don't really see what the issue is here.
That to me really is one of the main reasons I prefer Discord to IRC. It's the fact that you can join a channel from any device and see past conversation. But of course, if for security reasons you don't want that, you can very easily disable message history and have it act like IRC does.
Any app that has voice turned on whenever it detects sound by default, without prompting the user on installation, doesn't take security seriously.
I mean, unless you expect a communications app, running in the background, to share the conversation you're having in your room, without telling you, with everyone in every channel, until you discover it in your user preferences.
(I'm going to assume you're going to misunderstand what the issue here is. It listens by default, like when you install, and you're not prompted that it's the default. Contrary to every other communications or microphone app in existence, save for ones that are designed to spy on people).
For a start, it's not quite "on install", but after joining your first voice channel. The issue comes from the interaction of a series of reasonable steps that on the whole result in an unfortunate experience for some people. The problematic series:
* By default, Discord uses voice detection to determine when you're speaking, as opposed to push-to-talk. This feature makes perfect sense.
* By default, Discord configures itself to start up on login. This feature makes sense. (I personally immediately turn that option off, but I don't resent its inclusion.)
* When started, Discord rejoins any voice channel you were in when Discord was last exited. This feature also makes perfect sense. [Edit: Apparently this is no longer true, and Discord will only rejoin the channel if you were active within the past 5 minutes.]
Essentially the result of these design decisions in series is [edit: was] that if you install & use Discord, and fail to manually disconnect from your voice channel, next time you start your computer Discord will automatically join your last channel and broadcast any loud enough audio in the same room as your computer to the voice channel.
There are a few mitigating factors, too: Discord is pretty obviously open and on the screen when this happens, and it does show your active voice channel, and it does show an activity indicator when you're broadcasting.
It's worth noting that Discord no longer does this if you've been away from the voice channel for more than 5 minutes. The feature was intended to autoreconnect you when the app was restarted due to updates and such, not to cause people to accidentally broadcast themselves on system start.
The very first time, you need to very clearly UNMUTE yourself manually by pressing a button, even after you join a voice channel.
After that first time, joining a voice channel will enable your mic, and by default it also makes a clear sound. There is a feature that reconnects you if you were connected to a voice channel before you left (though it seems to be limited to 5m now).
But again, unless you 1. connect to a voice channel and 2. press the unmute button that first time, there is no listening happening...
You can verify all this on Chrome, since there, they have to specifically ask for your permission before accessing your microphone, so you know exactly when they do it. Open Discord on Chrome and play around, join a voice channel, unmute, and only then it will ask for access.
That's an opinion.
The tombstone issue was really interesting! Thanks for sharing.
I guess everyone makes their own tradeoffs though. This has been working wonderfully for us.
Cassandra lets you choose - per query - how many replicas must ack the query. Strong consistency is just a query parameter away.
On the topic of looking at Scylla next, I wonder why the team did not just start with it to begin with. Also, are there people with experience running both? How is the performance? And what is the state of reliability?
Also ScyllaDB is a new product and it would be crazy to start off with it. We plan to run a long-term double write experiment before we are comfortable with using it as a primary data store.
This is why Aphyr always tests claims.
It's missing some features that Cassandra has, which should be fixed by v2. Read their site/blog for a good overview of the tech and progress.
In most relational databases, the schema is cheap to alter and does not impose a temporary performance impact.
In fact, all of their requirements (aside from linear scalability) could also be met with a relational database. Doing so would gain you much more flexible access to querying for various reports, and it would reduce the engineering effort required for retrieval of data as they add more features (relational databases are really good at being queried for arbitrary constraints).
I think people tend to dismiss relational databases a bit too quickly these days.
b) You were quite okay with just dismissing scalability there, except that's the most important requirement for a company such as this. People don't choose Cassandra lightly given how significant its tradeoffs are.
c) Most companies are offloading analytics/reporting workloads into Hadoop/Spark and then exporting the results back to their EDW. This allows for far more functionality and keeps your primary data store free from adhoc internal workloads.
d) Nobody dismisses relational databases quickly. In almost all cases they are the first choice because they are so well understood. The issue is that most of them do have issues with linear scalability, and the cost to support the ones that don't is quite prohibitive, e.g. Teradata, Oracle.
Re c) this strongly depends on the usecase, I've seen companies use a) to avoid split-brain problems and having to manage two data pipelines to great success at similar scales. You might find https://www.youtube.com/watch?v=NVl9_6J1G60 interesting.
1) Altering schema is vague. It was used vaguely in the article (although, given the clarity of the article, I suspect the authors knew exactly what they meant). Some alterations on relational database tables are fine, even when hot and have billions of rows. Others are not.
Add a new column: fine. Index the newly added column: fine. Create a new index on an existing column with billions of rows: definitely not fine.
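Concretely, in Postgres terms (a sketch; exact behavior varies by engine and version, and the table/column names are made up):

```sql
-- Cheap: adding a nullable column without a default is
-- essentially a metadata-only change, even on a huge table.
ALTER TABLE messages ADD COLUMN edited_at timestamptz;

-- Expensive: building an index must read every existing row.
-- CONCURRENTLY avoids blocking writes, but on billions of rows
-- it still runs for hours and can fail partway through.
CREATE INDEX CONCURRENTLY idx_messages_edited_at
    ON messages (edited_at);
```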
But the index plan described in the article was very specific about what they wanted. It doesn't sound like they had to add any new indices.
2) Mostly agree here. Linear scalability is a big deal here, and it's fucking hard to do well for most RDBMS systems. I slightly disagree, however, because the article explicitly states that the requirements are willing to trade C for A in CAP theorem. This is important. The hardest parts of linear scaling in RDBMS are enforcing C. Think transactional systems that absolutely must be consistent. Like your bank account. This isn't that, and the blog post clearly states it. Takes a lot of pressure off the relational database when it comes to scaling.
3) Strongly disagree. Most companies don't have the resources or manpower to do that. It takes a lot of time and a lot of effort. Hell, most companies don't even have an EDW. Let alone a pipeline from the OLTP server to Spark/Hadoop to the non-existent EDW.
4) We seem to run in different circles. Almost everyone I know dismisses relational databases without question. Mongo is the way to go. And I get called out as the resident old fart/luddite who insists on using postgres. Speaking of which, if the first things you think of with relational DBs are Teradata and Oracle, we are definitely operating in different contexts.
If your opinion is that relational databases are generally well understood by--and therefore often the first choice for--developers . . . I want to know where you work.
Because that's not a different context from where I am.
That's a different universe.
The reality is that storing and retrieving data is a hard problem, and there's no set answer that works for everyone in every situation. If you're building a new product from scratch, you should go with what your team knows, provided that the team knows enough to not put yourself in the situation where you're just losing data in a partition scenario (well-made point in the original article. Mongo is fine on one node. Scale it out, and you might as well write your data to /dev/null)
Almost any datastore will serve the needs of a new product until it needs to scale horizontally. Relational, NoSQL, Object store, whatever. When it comes to scaling linearly, you have to take factors into account.
1) Which part of CAP theorem are you willing to sacrifice? You always have to let go of one.
1a) If you want a CP system, you have no choice but to deal with scaling problems of relational databases. You must have transactional guarantees for this to work.
1b) If you need an AP system, you have choices, but the choices lean in favor of systems like cassandra. It's just easier than setting up multi-node postgres and doing sharding.
It's also worth pointing out that people very often dismiss vertical scaling too soon. Take a look at Joel Spolsky's articles about infrastructure at StackOverflow. You can do quite a lot with the available firepower of modern technology by just buying bigger and better hardware.
I'm not suggesting that going bigger would have been the right choice for Discord. But sometimes it can be the right choice.
If there's something I fundamentally disagree with about the article, it's this: trying to do everything in a single data store. I think--much like what you suggested above--that it's better to have separate systems for reading and writing. Since the use case is definitively AP, I can't see a reason not to have a transactional system in an RDBMS and a streaming pipeline to a cassandra cluster for reading.
Use the right tools for the right job, is basically my point.
I suppose I run with more...sensible devs? I mean a lot of my co-workers are Millennial Hipster Rubyist types, and they'll pick a Postgres or MySQL database literally every time and never leave it with their cold dead hands. One team here even built their own queuing system on top of some Ruby and MySQL. (Please don't ask. They had...reasons but they basically reinvented Kafka.)
These same teams really try to avoid Redis, also.
Of course, these teams are writing REST APIs with very strict SLAs. Most of the time I see MongoDB and other "NoSQL" DBs used is when you have front end JS devs writing the Node backend code. >.>
The hardest parts of linear scaling in RDBMS is actually doing the scaling - it's "what do I do when I'm about to outgrow a master and need to add a bunch of capacity", and "what do I do when the master crashes". At Crowdstrike we would add 60-80 servers to a cassandra cluster AT A TIME, no downtime, no application changes, no extra work on our side - just bootstrap them in, they copy their data, and they answer queries. The tooling to do that in an RDBMS world probably exists at FB/Tumblr/YouTube, and almost nowhere else.
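The "just bootstrap them in" property comes from consistent hashing: adding a node to the ring only reassigns the keys that now fall on the new node, roughly 1/N of the data, rather than forcing a full reshard. A toy sketch (illustrative only, no virtual nodes, nothing like Cassandra's real implementation):

```python
import bisect
import hashlib

def h(key: str) -> int:
    """Hash a string to a position on the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    """Toy consistent-hash ring (no virtual nodes, for brevity)."""
    def __init__(self, nodes):
        self.ring = sorted((h(n), n) for n in nodes)
        self.hashes = [hv for hv, _ in self.ring]

    def owner(self, key: str) -> str:
        """First node clockwise from the key's position."""
        i = bisect.bisect(self.hashes, h(key)) % len(self.ring)
        return self.ring[i][1]

keys = [f"key-{i}" for i in range(10000)]
before = Ring([f"node-{i}" for i in range(10)])
after = Ring([f"node-{i}" for i in range(11)])  # bootstrap one new node

# Only keys whose clockwise-first node became the new node move;
# every other key keeps its old owner.
moved = sum(before.owner(k) != after.owner(k) for k in keys)
print(f"{moved / len(keys):.1%} of keys moved")
```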
> Think transactional systems that absolutely must be consistent. Like your bank account
Most banks use eventual consistency, with running ledgers reconciled over time.
> It takes a lot of time and a lot of effort. Hell, most companies don't even have an EDW. Let alone a pipeline from the OLTP server to Spark/Hadoop to the non-existent EDW.
In the cassandra world, it's incredibly common to setup an extra pseudo-datacenter, which is only queried by analytics platforms (spark et al). Much less work, and doesn't impact OLTP side.
> 1a) If you want a CP system, you have no choice but to deal with scaling problems of relational databases. You must have transactional guarantees for this to work.
This is fundamentally untrue - you can query cassandra with ConsistencyLevel:ALL and get strong consistency on every query (and UnavailableException anytime the network is partitioned or a single node is offline). Better still, you can read and write with ConsistencyLevel:Quorum and get strong consistency and still tolerate a single node failure in most common configs.
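The arithmetic behind the quorum claim (standard quorum math, independent of any particular driver): with replication factor N, a read touching R replicas must intersect a write touching W replicas whenever R + W > N, so the read always sees the latest acked write:

```python
def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True if every read quorum must share a replica with every write quorum."""
    return r + w > n

N = 3                 # replication factor
QUORUM = N // 2 + 1   # 2 of 3 replicas

# QUORUM writes + QUORUM reads: strongly consistent, and the
# cluster still serves requests with one of the three replicas down.
print(quorums_overlap(N, QUORUM, QUORUM))  # True
# ONE/ONE trades that guarantee for latency (eventual consistency only).
print(quorums_overlap(N, 1, 1))            # False
```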
> Use the right tools for the right job, is basically my point.
And this is the real point, with the caveat that you need to know all the tools in order to choose the right one.
1) Scaling is easy . . . oh, Cassandra. Where you can't have C and don't care about P.
2) Let me tell you about banks. I used to work for banks. Banks do not use systems that are eventually consistent. Banks use systems--however old and outmoded--that are strongly consistent. The one exception is ACH transfers, and that's not a database. That's a flat file.
3) There is no cassandra world that you speak of. This is utter bullshit.
4) No, it's not untrue. Cow--as we call it on my team--absolutely sucks at C when you're talking about scaling horizontally.
Make up your mind. Is this good at single node guarantees or is it good at sharded guarantees?
We know for a fact that if you want CAP, you can't have all three. You can have AP or CP, but you can't have all of them. If you're arguing that you can have C and A, you have failed at P.
Maybe that's a thing you're willing to trade-off. But it doesn't in any way relate to my point.
My point, if you missed it, was this: if you want strong consistency, you need a relational database, and you need transactional guarantees.
That is hard to do, and no one does it well yet. You're just lying to people if you say otherwise.
> 1) scaling is easy . . . oh casandra. Where you can't have C and don't care about P.
This isn't about teaching me the CAP theorem. I know the CAP theorem. I know the tradeoffs. I've built and managed systems that handle hundreds of billions of events a day, writing millions of writes a second into a thousand cassandra nodes. You can have C, if you want it - you don't get transactions with rollbacks, but that doesn't mean you don't have consistency.
> 2) Let me tell you about banks. I used to work for banks. Banks do not use systems that are eventually consistent
All this time, I thought ING was a bank: https://www.youtube.com/watch?v=EiqdX23u_Mk
> 3) There is no cassandra world that you speak of. This is utter bullshit.
I see, lame troll or wholly clueless. Guess I'm done.
> if you want strong consistency, you need a relational database
You can have consistency without transactions.
Overall, it really sounds like you've never used or know much about Cassandra, and are possibly confused on the CAP theorem.
Got a citation for that, skipper?
Never mind, I found one: http://bfy.tw/9bcO
But you'd think they'd read about CAP theorem and cloud architectures while they're hiding from everyone.
In too many cases, people conflate RDBMS with MySQL, where any schema mod on large tables is time-consuming, even when adding nullable columns with no other constraints.
I call it the "MySQL Effect", aka NoMySQL.
I run a small Mumble host, and I've always thought about wrapping the Mumble client and server APIs to function like Discord/Slack as an open source alternative. Mumble is great and all, but the UI/UX appeal of Discord is so much better.
Keep up the great work!
Also, is this the same Stanislav of Guildwork? Ha, I remember when Guildwork was being formed back in the FFXI days.
Add me on Discord if you ever wanna chat, Stanislav#7943
The wide partition memory problem should be fixed in 4.0, for what it's worth.
It's really too bad that they didn't take advantage of it, since they were actually scalable compared to their competitors and had good voice chat. Slack has started becoming more scalable recently, so I don't know how much the opportunity is still there.
The other market is a bit more saturated, with Microsoft and a few others piling on top too. Whereas the Gaming market was completely lacking. There were a few clients which focused much more on voice (mumble, vent, ts), but nothing quite like Discord. Free and one-button to make a new server.
The only other contender I can think of is Raidcall, but that was a joke... Now there's Curse, but they came too late to the market and were DoA, except for the people they forced to use it by paying thousands.
Are companies a market you're serious about? There is so much focus on gaming, it's hard to be sure. I mentioned to support recently one of our prime issues as a company is being limited to a single owner per server.
PS: Are you the same Stanislav Vishnevskiy I'm thinking about? I remember working on Guildwork with you!
Add me on Discord if you wanna chat Stanislav#7943
Definitely not paying users / valuation currently. Slack is pretty massive there.
Edit: Appears it's still missing from some servers!
I've noticed that Slack performs a lot worse than HipChat.
My concerns with Cassandra are precisely here: it is easy to misuse.
There are a lot of constraints on the schema (more specifically on the design of partition & clustering keys).
Each choice leads to many restrictions on what can be requested/updated/removed;
and to different issues with tombstones and GC.
The Discord story is exactly what I experienced: a series of surprises, some of them really bad, in production.
In both cases, the story ended with an efficient system, but with far more engineering work and rework than initially planned.
Cassandra does require you to know more of its internals than most other data stores. Unfortunately, the move to CQL and very SQL-like names for things that are nothing at ALL like their SQL counterparts is not helping.
Also, our own personal death by tombstone: A developer who didn't even know those existed checked in some logic that would write null into a column every time a thing succeeded.
After that passed QA and went into production, all hell broke loose with queries timing out everywhere. SUCH FUN.
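A toy model of why that hurt (purely illustrative, nothing like Cassandra's actual storage code): writing null is a delete, a delete leaves a tombstone, and every subsequent read of that partition has to scan past the accumulated tombstones until compaction purges them:

```python
# Toy append-only partition: cells in write order, None = tombstone.
class Partition:
    def __init__(self):
        self.cells = []  # (key, value) pairs in write order

    def write(self, key, value):
        # Writing None behaves like a delete: it appends a tombstone.
        self.cells.append((key, value))

    def read(self, key):
        """Scan newest-to-oldest; also count tombstones touched."""
        scanned_tombstones = 0
        result, found = None, False
        for k, v in reversed(self.cells):
            if v is None:
                scanned_tombstones += 1
            if k == key and not found:
                result, found = v, True  # newest cell wins
        return result, scanned_tombstones

p = Partition()
p.write("status", "ok")
# A bug that writes null on every success piles up tombstones:
for _ in range(1000):
    p.write("last_error", None)

value, tombstones = p.read("status")
print(value, tombstones)  # the value is fine, but reads wade through 1000 tombstones
```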
You get this massively scalable (to thousands of nodes and tens of millions of operations per second) database for free, and all you have to do is have your developers read about it before they use it. Is that expecting too much?
Let me ask this, then:
It's an OSS project. Let's pretend I'm a committer or on the PMC, and I'd like to fix this in a way that works for you. We need a null write to act as a delete. We need tombstones to make sure things that are meant to be deleted stay deleted. We have to have all the tombstones we scan in heap on reads to reconcile what's deleted and what's not deleted within a partition on any given slice query.
What would you want to see changed to avoid the tombstone problem? There are dozens of blogs around that say "don't write null values if you don't want a tombstone to be created" (like this article), but beyond that, do you expect to see errors in logs?
We've made unbound values in prepared statements no longer equivalent to null/delete.
What else would you expect an OSS project to do to protect you from abusing it? Serious question.
Basically - do the research first.
EDIT: Just read the post, and while it provides a good perspective on Discord's rationale to introduce Cassandra in the first place and does a great job pointing out some unexpected pitfalls, it doesn't specifically respond to replacing Redis with Cassandra due to clustering difficulty, per the prior thread.  Redis is only specifically called out as something they "didn't want to use", which I guess is probably the most honest answer.
The bucket logic applied to Cassandra seems like it could've been applied to redis + a traditional permanent storage backend nearly as easily. The biggest downside here would be crossing the boundary for cold data, but that's a pretty common thing that we know lots of ways to address, right? And Cassandra effectively has to do the same thing anyway, it just abstracts it away.
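For reference, the kind of time bucketing the article describes can be sketched like this. The Discord epoch and the snowflake layout (timestamp in the bits above bit 22) are documented publicly; the 10-day bucket width here is an assumption:

```python
DISCORD_EPOCH_MS = 1420070400000      # 2015-01-01 UTC, per Discord's API docs
BUCKET_MS = 10 * 24 * 60 * 60 * 1000  # assumed fixed 10-day windows

def snowflake_timestamp_ms(snowflake: int) -> int:
    """Milliseconds since the Unix epoch, encoded in the ID's top bits."""
    return (snowflake >> 22) + DISCORD_EPOCH_MS

def bucket_for(snowflake: int) -> int:
    """Which fixed time window a message ID falls into."""
    return (snowflake_timestamp_ms(snowflake) - DISCORD_EPOCH_MS) // BUCKET_MS

# Two IDs minted close together land in the same bucket, so a
# (channel_id, bucket) partition key bounds how large any one
# partition can grow.
a = 123456 << 22           # hypothetical ID ~2 minutes after the epoch
b = (123456 + 1000) << 22  # one second later
print(bucket_for(a) == bucket_for(b))
```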
Again, I'm left wondering what specific value Cassandra brings to the table that couldn't have been brought by applying equal-or-lesser effort to the system they already had.
I also found it amusing that they're already contemplating the need to transition to a datastore that runs on a non-garbage-collected platform.
You are basically advocating for plugging 2 systems together, which out of the box don't provide elasticity. Or we could just use Cassandra. It is a simpler solution and Cassandra is not new tech. Aside from caching empty bucket information we have nothing sitting in front of Cassandra. It works great and the effort was minimal.
The Redis comment by jhgg was referring to our service which tracks read states for each user. We might write about that later, but it's not as interesting. The most interesting part of that experience was reusing memory in Go to avoid thrashing the GC.
We care about seamless elasticity for our services, which Redis doesn't provide out of the box except with Redis Cluster, which does not seem to be widely adopted and forces routing onto clients.
Both Cassandra and Redis Cluster will forward queries to the correct nodes and both use client drivers that learn the topology to properly address the right nodes when querying.
Obviously, I haven't addressed this problem in-depth and I don't really know enough about the specifics to criticize the decision directly. It's completely possible that Cassandra was the perfect fit here.
The previous thread was in the context of switching things up without strong technical motivation. I said that actually, it does seem easier to fix a redis deployment than to write a microservice architecture backed by Cassandra, and that I hope to hear more about a stable production ethos from the tech community as a whole. There are a lot of "moving on to $Sexy_New_Toy_Z" posts, but not a lot of "we solved a problem that affected scaling our systems, and instead of throwing the whole component away and starting over, we actually did the hard work of fixing and optimizing it: here's how".
To address your specific complaints.
>You are basically advocating for plugging 2 systems together, which out of the box don't provide elasticity.
I mean, again, without getting in-depth, I'm not advocating anything (I feel like I need a "This is not technical advice; if you have problems, consult a local technician" disclaimer :P).
However, storing lots of randomly-accessed messages and maintaining reasonable caches are not new problems. There are lots of potential solutions here.
Cassandra is also among the less-used of the NoSQL datastores, putting it in a minority of a minority for battle-tested deployments. You mention Netflix and some other big name using it in production as part of your belief that it's stable. This, I think, is part of the problem.
These big companies use these solutions because
a) they truly do have a scale that makes other solutions untenable (although probably not the case with Cassandra itself);
b) they can easily afford the complex computing and labor resources needed to run, repair, and maintain a large-scale cluster. Such burdens can be onerous on smaller companies (esp. labor cost);
c) when they need a patch or when something starts going awry, they can pay anyone who's willing to make the patch, their own team not excluded. Often the BDFLs/founders of such projects end up directly employed by these big companies that adopt their tech.
"Netflix [or any other big tech name] uses it so we know it's stable" is a giant fallacy, IMO.
None of this is to say that Cassandra isn't a good choice for this problem or any other specific problem, because again, as a drive-by third-party I don't know. But contrary to what the article states, it hardly seems like Cassandra was the only thing that could've possibly fit the bill. I bet it could be done well with a traditional SQL databases (which, from the body of the post that identifies Discord as beginning on MongoDB and planning to move to something Cassandra-ish later on, it doesn't sound like was ever tried or considered).
It's kind of like reading an RFP that was written by a guy at a government agency who already knew they really wanted to hire their brother-in-law's firm. "Must have $EXTREMELY_SPECIFIC_FEATURE_X, because, well, we must! And it just so happens that this specific formulation can only be provided by $BIL_FIRM. What d'ya know."
>We care about seamless elasticity for our services which Redis doesn't provide out of the box except with Redis Cluster which does not seem to be widely adopted and forces routing to clients.
First, you just admitted that Redis actually does have the feature you're saying it doesn't have. "Redis Cluster" and "Redis" are the same thing: "Redis Cluster" is part of normal Redis and, afaik, while it requires additional configuration, it will shard automatically.
In any case, while I have no numbers, I would wager that Redis Cluster is more widely used than Cassandra.
The bucketing they're doing is within a partition - they still get all data in a logical cluster, which gives them transparent linear scalability by adding nodes without having to reshard, and they get fault tolerance/HA, reasonable fsync behavior, and even cross-WAN replication - things you'd never get with Redis.
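As an illustration of that bucketing-within-a-partition idea, here's a minimal Python sketch of deriving a time bucket from a snowflake-style message ID, so that (channel_id, bucket) can serve as a compound partition key. The epoch handling and the 10-day window size are assumptions for the example, not confirmed details of Discord's schema:

```python
# Illustrative only: map a snowflake-style ID to a fixed time window.
BUCKET_MS = 10 * 24 * 60 * 60 * 1000  # assumed 10-day window, in milliseconds

def bucket_for(snowflake_id: int) -> int:
    # In a snowflake ID, the high 42 bits are milliseconds since some epoch;
    # shifting off the low 22 bits (worker/sequence) recovers the timestamp.
    timestamp_ms = snowflake_id >> 22
    return timestamp_ms // BUCKET_MS

# IDs minted in the same window land in the same partition; older history
# lives in earlier buckets that can be walked backward in order.
```

Fetching recent messages then becomes a range scan within one (channel_id, bucket) partition, stepping back a bucket at a time when a page spans a window boundary.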
Normally, though, since the data is replicated on 3 nodes, you can technically lose a node completely and rebuild it from the other nodes.
We run 3 geo-distributed clusters with no offline backups because of this.
One of the big missing features we would like to have in Discord is the ability to assign a special permission to our group leaders so they can communicate over voice chat with other group leaders in other channels (global voice chat).
When we play PVP MMOs and have 40+ users all in the same channel calling shots, it's impossible to coordinate properly.
What we normally do is split the group into 4, so 10 players in 4 different channels, and each group leader calls shots independently BUT can also communicate via voice chat with the other group leaders. Basically there's a global voice chat for group leaders that no one else can hear but them.
Other than that Discord is amazing!
Please let me know. I may be missing something.
Is it not a valid question?
If it is a question for marketing, service, or support then let me know.
Please email me firstname.lastname@example.org I would love more details.
Why can't a large partition be distributed around the cluster?
You can't break a partition down any further, because a partition is by definition the smallest cohesive set of data owned by a hash key. Instead, it's advisable to model your data into more, smaller partitions rather than letting any single one grow huge.
Do you actually use Cassandra range queries to poll for new messages, or do clients use some kind of queue to get notified?
Say messages A, B, C are created in that sequence. But isn't it then possible that a client asking for new messages only gets A and C, and B only shows up a few milliseconds later, which would be missed unless the client actually asked for "everything since before A"? Or is that not possible?
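For what it's worth, that race exists in any system where IDs are assigned before the write becomes visible; a common client-side mitigation is to re-scan a small overlap window and dedup. Here's a plain-Python sketch of the failure and the fix; the in-memory store and the poll function are toy stand-ins, not Discord's actual API:

```python
# Toy visibility model: B (id 2) becomes visible only after C (id 3).
visible = [("A", 1), ("C", 3)]
seen = set()

def poll(since_id):
    # Fetch everything with id > since_id, dedup against what we've seen.
    new = [(name, mid) for name, mid in visible if mid > since_id and name not in seen]
    seen.update(name for name, _ in new)
    return new

first = poll(0)            # sees A and C; a naive cursor would advance to 3
visible.append(("B", 2))   # B becomes visible late
late = poll(3)             # strict "after 3" polling misses B forever
caught = poll(0)           # overlapping re-scan plus dedup picks B up
```

The overlap window only needs to cover the maximum expected commit lag, so in practice it stays small.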
Very bad security practice.
Bravo to those that downvoted me without even bothering to use their brains (let alone test something.)
Very bad security practice. Token revocation is hard, folks. If you aren't making that token expire every session then you're doing it wrong.
In fact, this was even discussed here on HN 1600+ days ago - http://homakov.blogspot.com/2012/07/saferweb-most-common-oau...
You can also search past messages, and the blog post ends by saying the follow-up will be about search. Search is not rolled out to everyone yet, but will be next week.
Can I download the full transcript of a channel for safekeeping in case things are deleted in the future?
We built a prototype of a similar system that handled 100M+ messages a day for about $10; 2-minute screencast here: https://www.youtube.com/watch?v=x_WqBuEA7s8 . However, this was without FTS or mentions tagging, so I want to explore some thoughts here:
1. The bucketing approach is what we did as well; it is quite effective. However, a warning to outsiders: this is only effective for append-only data (like chat apps, Twitter, etc.) and not good for data that gets a lot of recurring updates.
2. The more indices you add, the more expensive it gets. If you are getting 100M+ messages a day and you then want to update the FTS index and mentions index (user messages' index, hashtag index, etc.), you'll be doing significantly more writes. And you'll notice that those writes are updates to an index - this is the gotcha and will increase your cost.
3. Our system by default backs up / replicates to S3, which is something they mention they want to perhaps do in the future. This has huge perks to it, including price reductions, fault tolerance, and less DevOps - which is something they (and you) should value!
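The write amplification in point 2 is easy to ballpark. The index count and replication factor below are assumptions for illustration, not the poster's actual numbers:

```python
messages_per_day = 100_000_000
secondary_indices = 3        # assumed: FTS, mentions, hashtags
replication_factor = 3       # assumed typical Cassandra RF

logical_writes = messages_per_day * (1 + secondary_indices)
physical_writes = logical_writes * replication_factor
# Every message fans out to 4 logical writes, 12 physical ones.
```

Under those assumptions, 100M messages become 400M logical writes and 1.2B physical writes per day, which is why each extra index moves the cost needle noticeably.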
Their backend team is amazingly small. These guys and gals seem exceptionally talented and to be making smart decisions. I'm looking forward to the future post on FTS!