RethinkDB joins the Linux Foundation: What Happens Next (rethinkdb.com)
Bryan Cantrill posted his thoughts on the CNCF's decision to donate RethinkDB to The Linux Foundation here: https://news.ycombinator.com/item?id=13579544

We wanted to share RethinkDB's next steps in our new home with The Linux Foundation.

We've also had a lot of folks ask if they can donate to support the project. Stripe has generously offered to match up to $25k in donations (which will help fund server costs and future development). You can learn how to contribute to the project with OSS contributions or donations here: https://rethinkdb.com/contribute

I don't have much to add, but as someone who has a lot of projects dependent on RethinkDB and also loves using it:

Thank you to everyone involved!

Is the Linux Foundation collecting quite a lot of "failed" projects and turning them to gold these days? I sometimes feel it is partly acting like a software goodwill store. Whatever that is, I hope RethinkDB will do well in the future.

I stayed away from Rethink in the past few years due to its uncertain future. Now I'm seriously interested. Looking forward to the next chapter of RethinkDB.

I wonder if they could have closed more deals if they'd promised to do something like this. Then again, maybe the board would have balked at that as a sign of no confidence.

This is fantastic news. My current project would not have been possible if it wasn't for RethinkDB. Very glad to see it moving forward!

What kind of workload is RethinkDB suitable for?

1. Transactional

2. Analytical

3. Operational

The tagline on the site describes the most obvious use case: "RethinkDB pushes JSON to your apps in realtime. When your app polls for data, it becomes slow, unscalable, and cumbersome to maintain."

In their FAQ, they call out these types of applications: Collaborative web and mobile apps, Streaming analytics apps, Multiplayer games, Realtime marketplaces, Connected devices

Sorry if this should be obvious but what is/are the killer feature/s of RethinkDB, what differentiates it from something like Redis or even CockroachDB?

Only RethinkDB gets you a) working changefeeds, where you can receive real-time changefeed updates to your queries, b) a well-implemented and Jepsen-proven distributed database.

As far as I know, there is no other solution which gets both things right.

I use it in PartsBox (https://partsbox.io/), a solution for keeping track of electronic components.

I am surprised more people aren't interested in changefeeds — the way I see it, it's the only way to implement multi-user webapps which update in real-time (as in: a change is made in one session and all other open sessions get the update immediately).

Well a) is provided by Mongo and is literally the reason why Meteor can do exactly what you described: multi-user webapps which update in real-time.

The oplog replication provided by Mongo is a very different mechanism than RethinkDB's change feeds. This is a pretty good overview of the differences: https://www.compose.com/articles/rethinking-changes-how-two-...

I couldn't find any documentation on change feeds for Mongo after a quick google search.

Could you post a link please?

It's called the "oplog": https://docs.mongodb.com/manual/core/replica-set-oplog/

While the oplog does provide some semblance of RethinkDB's changefeed, it's not nearly as powerful. With Rethink, you say you want a query, and Rethink will let you know about changes to that query. Mongo just says "hey, here are operations that were done", and leaves the reconstruction to you.
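
To make the distinction above concrete, here is a minimal in-memory sketch. It is a toy model, not RethinkDB's or MongoDB's actual API: `ToyTable`, `subscribe`, and `upsert` are hypothetical names invented for illustration, and the `old_val`/`new_val` event shape is only loosely modeled on what RethinkDB changefeeds deliver. The oplog-style list records every raw operation, while the changefeed-style subscription only fires when a write affects the result of the subscribed query.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Tuple

Doc = Dict[str, Any]
Predicate = Callable[[Doc], bool]
Callback = Callable[[Dict[str, Optional[Doc]]], None]


@dataclass
class ToyTable:
    """Hypothetical in-memory table; an illustration, not a real driver."""
    docs: Dict[str, Doc] = field(default_factory=dict)
    # Oplog style: a flat record of every operation, with no notion of queries.
    oplog: List[Dict[str, Any]] = field(default_factory=list)
    # Changefeed style: (query predicate, callback) pairs.
    feeds: List[Tuple[Predicate, Callback]] = field(default_factory=list)

    def subscribe(self, predicate: Predicate, callback: Callback) -> None:
        """Register interest in the documents matching `predicate`."""
        self.feeds.append((predicate, callback))

    def upsert(self, key: str, doc: Doc) -> None:
        old = self.docs.get(key)
        self.docs[key] = doc

        # The oplog records the raw operation; consumers must filter and
        # reconstruct query results themselves.
        op = "insert" if old is None else "update"
        self.oplog.append({"op": op, "key": key, "doc": doc})

        # The changefeed only notifies subscribers whose query result changed.
        for predicate, callback in self.feeds:
            old_match = old if old is not None and predicate(old) else None
            new_match = doc if predicate(doc) else None
            if old_match is not None or new_match is not None:
                callback({"old_val": old_match, "new_val": new_match})


# Usage: subscribe to "parts that are low on stock" (an assumed example query).
table = ToyTable()
events: List[Dict[str, Optional[Doc]]] = []
table.subscribe(lambda d: d["stock"] < 5, events.append)

table.upsert("r1", {"name": "resistor", "stock": 10})   # outside the query
table.upsert("r1", {"name": "resistor", "stock": 3})    # enters the query
table.upsert("c1", {"name": "capacitor", "stock": 50})  # outside the query
table.upsert("r1", {"name": "resistor", "stock": 8})    # leaves the query

print(len(table.oplog), "oplog entries;", len(events), "changefeed events")
```

In this sketch the oplog consumer sees all four writes and has to work out which ones matter, whereas the subscriber sees only the two writes that moved a document into or out of its query.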

RethinkDB is just a great document-centric database. It has guarantees similar to SQL, including server-side joins, while having a great replication/redundancy pattern (similar to Cassandra). It's probably best in a use case where you are planning on 3-15 servers for a cluster. If you need more than that, Cassandra may be better.

I like to think of it as MongoDB done right. Above and beyond better consistency models and a broader, more well thought out API, they have an admin interface that is second to none (well SQL Management Studio might be slightly better). It's definitely better than any other "NoSQL" database.

A couple of years ago I was considering it for a project, but at the time it was missing a feature the project required (geolocation indexes), so I wasn't able to use it then. I followed the development of that feature, its prerequisites, and the automatic master failover, and the engineering discipline and planning were far better than on pretty much any project I'd been exposed to. The team(s) and their energies were not wasted, and I really appreciate what they have done.

I was sad to see the company shutter, but very happy to see the project under LF, and hope that it really takes off from here. It would be a pretty natural fit as an RDS service under Amazon, and there are a few hosted options. Horizon also looks interesting compared to Firebase.

Another advantage over competitors is that streaming updates are in the box, not bolted on to oplog processing.

This might be a useful read: https://rethinkdb.com/faq/

It explains RethinkDB's ideal use cases, explains how to compare it to other databases, and details some of the differentiating features.

RethinkDB has passed jepsen testing.

*Fixed typo

I think you might mean the Jepsen tests: https://aphyr.com/posts/329-jepsen-rethinkdb-2-1-5

My weird and unrelated question is: if I donate software to an open source group like the Linux Foundation, can I write it off my taxes? And if so, how do I assess the value of it? RethinkDB probably has some legitimate market value...can the founders reflect that on their taxes?

It's not the founder's asset -- unless it has been released back to them in adjudicated liquidation proceedings or by contractual agreement. It is the company's asset, and would be reflected in the company's taxes.

Is this the best possible outcome given the circumstances?

Yes.

Honestly what needs to happen next is a serious effort to explain why or when rethink is better than mongo, cassandra, arango, aerospike, memsql, mysql, riak, or postgres, ++, not to mention all the TSDBs. On the event pushes I am unconvinced that message queues/computation graphs aren't superior, and that's another crowded space. When I last looked at it the advantages struck me as mostly incremental on the query language and decremental on performance. There are many excellent competitors in this space, most of which are well funded, and moving targets. Rethink doesn't seem to have a USP, or none that has been effectively communicated at least, IMO.

TBH, given the option... RethinkDB is probably the best case for anything that needs distribution/HA and automatic failover. SQL is a decent option, but you either pay a lot for HA, or you need to have a lot of domain knowledge or hire dedicated DBA support. Not that RethinkDB doesn't need some knowledge, but their admin interface is great.

The replication model is similar to Cassandra (ring + redundancy), while the master/slave model and failover has had a lot of work to make it bulletproof.

It will scale well from 3-15 nodes, then it starts to drop off as less than linear growth. But if you need more than that, then you're in a whole other league.

If you want search only, go for ElasticSearch. If you need much greater linear growth at the cost of application complexity, Cassandra. If you need fast memory access, then go for Redis. If you don't need bullet-proof automagic failover, or are willing to pay through the nose for it, go for SQL. If you are okay with a single system, go SQL. Otherwise, RethinkDB should probably be the first choice.

Don't get me wrong, I'll reach for SQL first in many cases... but RethinkDB if I have a choice and HA is a requirement. I also happen to prefer a document-centric model/approach.

Aerospike touches most of your points at much higher speed and scale. It is next gen redis, basically, with disk, with auto-sharding scale, with cross node queries. Cassandra is not difficult once you wrap your mind around column storage, and if you need that, no other storage style will do.

The 15-node thing is also a major achilles heel. Who wants to commit to a stack that incurs massive technical debt in the event of massive success? Imagine reengineering your db and your event pushes, at scale...

Aerospike doesn't offer many consistency guarantees. If you run it in a cluster on the cloud you are more than likely to see silent data loss [1].

It's not a fair comparison, RethinkDB is much safer. I'm sure, if you turn down the defaults on both read and write operations on RethinkDB you could scale it well past 15 nodes and with very high read and write throughput.

1: https://aphyr.com/posts/324-jepsen-aerospike

Comes down to jenkins for me. RethinkDB aced the jenkins tests.

Mongo has failed every jenkins test it's been put through, dunno about the status now though. Last I checked Mongo's default durability level was "data loss on power outage". Aerospike failed jenkins too, and not on small edge cases like Mongo, but with major dataloss.

Going by the problems Gitlab has recently with Postgres I wouldn't use that for a distributed database. Likely true for MySQL too.

* Yep. I meant to write Jepsen :)

I think you mean Jepsen? It's a slight exaggeration to say that RethinkDB aced it -- but they did very well[1] and (more importantly to me, honestly) Jepsen was used to find a subtle and nasty issue that was subsequently fixed.[2]

[1] https://aphyr.com/posts/329-jepsen-rethinkdb-2-1-5

[2] https://aphyr.com/posts/330-jepsen-rethinkdb-2-2-3-reconfigu...

I've been having a lot of trouble with Jenkins at work today. So when I thought Jepsen, I wrote Jenkins.

Still feel they aced it. No one passes Jepsen on their first try. But RethinkDB is the first to immediately fix the issue.

I wouldn't go as far as to call it a nasty issue. It would only happen if you got node failures while reconfiguring your cluster. And reconfiguring the cluster must be initiated by the admin, and it's not something that happens very often.

Passt find replace Jenkins Jensen call-me-maybe ;)

Jepsen

