EDIT: Just read the post, and while it provides a good perspective on Discord's rationale to introduce Cassandra in the first place and does a great job pointing out some unexpected pitfalls, it doesn't specifically respond to replacing Redis with Cassandra due to clustering difficulty, per the prior thread.  Redis is only specifically called out as something they "didn't want to use", which I guess is probably the most honest answer.
The bucket logic applied to Cassandra seems like it could've been applied to redis + a traditional permanent storage backend nearly as easily. The biggest downside here would be crossing the boundary for cold data, but that's a pretty common thing that we know lots of ways to address, right? And Cassandra effectively has to do the same thing anyway, it just abstracts it away.
Again, I'm left wondering what specific value Cassandra brings to the table that couldn't have been brought by applying equal-or-lesser effort to the system they already had.
I also found it amusing that they're already contemplating the need to transition to a datastore that runs on a non-garbage-collected platform.
You are basically advocating for plugging 2 systems together, which out of the box don't provide elasticity. Or we could just use Cassandra. It is a simpler solution and Cassandra is not new tech. Aside from caching empty bucket information we have nothing sitting in front of Cassandra. It works great and the effort was minimal.
The Redis comment by jhgg was referring to our service which tracks read states for each user. We might write about that later, but it's not as interesting. The most interesting about that experience was reusing memory in Go to avoid thrashing the GC.
We care about seamless elasticity for our services which Redis doesn't provide out of the box except with Redis Cluster which does not seem to be wildly adopted and forces routing to clients.
Both Cassandra and Redis Cluster will forward queries to the correct nodes and both use client drivers that learn the topology to properly address the right nodes when querying.
Obviously, I haven't addressed this problem in-depth and I don't really know enough about the specifics to criticize the decision directly. It's completely possible that Cassandra was the perfect fit here.
The previous thread was in the context of switching things up without strong technical motivation. I said that actually, it does seem easier to fix a redis deployment than to write a microservice architecture backed by Cassandra, and that I hope to hear more about a stable production ethos from the tech community as a whole. There are a lot "moving on to $Sexy_New_Toy_Z" posts, but not a lot of "we solved a problem that affected scaling our systems, and instead of throwing the whole component away and starting over, we actually did the hard work of fixing and optimizing it: here's how".
To address your specific complaints.
>You are basically advocating for plugging 2 systems together, which out of the box don't provide elasticity.
I mean, again, without getting in-depth, I'm not advocating anything (I feel like I need a "This is not technical advice; if you have problems, consult a local technician" disclaimer :P).
However, storing lots of randomly-accessed messages and maintaining reasonable caches are not new problems. There are lots of potential solutions here.
Cassandra is also among the less-used of the NoSQL datastores, putting it in a minority of a minority for battle-tested deployments. You mention Netflix and someone other big name using it in production as part of your belief that it's stable. This, I think, is part of the problem.
These big companies use these solutions because
a) they truly do have a scale that makes other solutions untenable (although probably not the case with Cassandra itself);
b) they can easily afford the complex computing and labor resources needed to run, repair, and maintain a large-scale cluster. Such burdens can be onerous on smaller companies (esp. labor cost);
c) when they need a patch or when something starts going awry, they can pay anyone whose willing to make the patch, their own team not excluded. Often the BDFLs/founders of such projects end up directly employed by these big companies that adopt their tech.
"Netflix [or any other big tech name] uses it so we know it's stable" is a giant misnomer, IMO.
None of this is to say that Cassandra isn't a good choice for this problem or any other specific problem, because again, as a drive-by third-party I don't know. But contrary to what the article states, it hardly seems like Cassandra was the only thing that could've possibly fit the bill. I bet it could be done well with a traditional SQL databases (which, from the body of the post that identifies Discord as beginning on MongoDB and planning to move to something Cassandra-ish later on, it doesn't sound like was ever tried or considered).
It's kind of like reading an RFP that was written by a a guy at a government agency that already knew they really wanted to hire their brother-in-law's firm. "Must have $EXTREMELY_SPECIFIC_FEATURE_X, because, well, we must! And it just so happens that this specific formulation can only be provided by $BIL_FIRM. What d'ya know."
>We care about seamless elasticity for our services which Redis doesn't provide out of the box except with Redis Cluster which does not seem to be wildly adopted and forces routing to clients.
First, you just admitted that Redis actually does have that feature that you're saying it doesn't have. "Redis Cluster" and "Redis" are the same thing. "Redis Cluster" is part of normal redis and afaik, while it requires additional configuration, it will automatically shard.
In any case, while I have no numbers, I would wager that Redis Cluster is more widely used than Cassandra.
The bucketing they're doing is within a partition - they still get all data in a logical cluster, which gives them transparent linear scalability by adding nodes without having to reshard, and they get fault tolerance/HA, reasonable fsync behavior, and even cross-wan replication - things you'd never get with redis.