RethinkDB Postmortem (github.com)
948 points by v3ss0n on Jan 17, 2017 | 267 comments



Fascinating read, albeit a sad one: I love RethinkDB, and use it personally every day in our particular production deployment.[1] But -- and to Slava's point -- we didn't/don't/wouldn't pay for it, and in that regard, we were part of the problem. That said, the constraint on that particular problem was that it had to be open source, and if it couldn't have been RethinkDB, it would have been some other open source database -- proprietary software is a non-option for that use case (or indeed, for any of our use cases).

So I feel Slava's pain (and feel somewhat culpable as a user of RethinkDB), but I don't entirely agree with the analysis. Yes, infrastructure is a brutal space because it has been so thoroughly disrupted by open source -- but the dominance of AWS shows that open source can absolutely be monetized (just that it must be monetized as a service). Indeed, right now, I would love to deploy a RethinkDB-based service -- and if the company still existed, we (and by "we" I mean "we Samsung", not "we Joyent"[2]) would potentially be in a serious business conversation about what a supported, large-scale deployment would look like. Actually, that's not entirely true, and it leads me to one thing that Slava didn't mention. While it kills me to bring it up, it does represent the single greatest impediment I have personally found to deep, mainstream, enterprise adoption of RethinkDB: its use of the AGPL. Anyone who has been part of a serious due diligence knows that the GPL alone is toxic to much of the enterprise world -- and the GPL is a buttoned-down corporate stooge compared to the virulent, ill-defined, never-tested AGPL. The AGPL is a complete non-starter for any purpose, and while I appreciate why the AGPL was selected by RethinkDB (and very much appreciate the explicit riders that RethinkDB put on it to make clear what it did and didn't apply to), the reality is that no one -- absolutely no one -- has built a business on AGPL software and it seems vanishingly unlikely that anyone ever will.

I still use RethinkDB and still intend to -- and I still harbor hope (crazily?) that whatever entity that owns the IP will see fit to relicense it to something that is palatable to the kind of people who care about their data. I actually believe that a business can be built on the technology -- which, to Slava's points, I have found to be delightfully sound in both design and implementation. And that RethinkDB is open source means -- for us, anyway -- it will continue to live on, despite the unfortunate demise of its corporate vessel (AGPL or no). That said, I would love to have a lot more company -- here's hoping that we see RethinkDB relicensed and a thriving business behind it!

[1] https://github.com/joyent/manta-thoth

[2] https://www.joyent.com/blog/samsung-acquires-joyent-a-ctos-p...


> the reality is that no one -- absolutely no one -- has built a business on AGPL software and it seems vanishingly unlikely that anyone ever will.

I strongly disagree with that. Take RavenDB, which is very much comparable to RethinkDB as a business: it is dual licensed under the AGPL, with a commercial license available [1]. This forces any business doing non-OSI approved OSS work to purchase a commercial license. I don't know too much about RethinkDB but I think they must have used a very similar licensing strategy?

From an outsider's perspective, RavenDB appears to be going strong and doing well commercially. They are hiring and expanding and have invested a lot into v.4.0, which I'm really excited about.

Interestingly, on the spectrum between MongoDB and RethinkDB in terms of "Wrong metrics of goodness" (Correctness, Simplicity of the Interface) they appear to me much closer to RethinkDB than MongoDB. In terms of stability, they may have been somewhat in the middle (e.g. not as bad as MongoDB, but probably not as good as RethinkDB). And they have moved very fast to add features and fix bugs in the product. It's a really nice DB to work with as a dev, I suppose the #1 reason they are not as successful as Mongo is that they're coming from the .NET niche...

[1] https://ayende.com/blog/4508/comments-on-ravendb-licensing


> ... the reality is that no one -- absolutely no one -- has built a business on AGPL software ...

MongoDB server is AGPL - client drivers are Apache. To be fair, the business side is based around larger scale deployment and management, and on proprietary add-ons.

I'm kind of surprised that approach wasn't mentioned for Rethinkdb - a free open core product, with paid extras dealing with pain points in larger deployments.


It is possible to buy a different non-AGPL license to MongoDB -- https://www.mongodb.com/community/licensing. With RethinkDB it seems completely impossible to do so. I begged and pleaded: http://sagemath.blogspot.com/2016/10/rethinkdb-must-relicens.... I just spent most of my time during the last two months rewriting SageMathCloud to use PostgreSQL instead of RethinkDB, with the catalyst for doing this being the AGPL licensing and some concrete enterprise customers needing a completely non-GPL'd stack for SageMathCloud. Coming out of this rewrite, and back to PostgreSQL (which I've used off and on for decades), I'm very impressed by PostgreSQL today, and the LISTEN/NOTIFY functionality is a solid building block on which to build something like changefeeds (thanks to the many HN comments on previous stories about RethinkDB for pointing this out!).
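For readers who haven't used LISTEN/NOTIFY, here is a minimal sketch of the kind of building block meant above, not SageMathCloud's actual code. It only generates the trigger DDL; the table name, channel name, and the assumption that the table has an `id` column are all hypothetical. `pg_notify`, `TG_OP`, and `json_build_object` are standard PostgreSQL (the last needs 9.4 or later).

```python
import re

# Only allow plain lowercase identifiers, since they are interpolated into DDL.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def changefeed_sql(table: str, channel: str) -> str:
    """Return DDL that makes PostgreSQL send a NOTIFY for every row change.

    Hypothetical helper: assumes `table` has an `id` column.
    """
    for name in (table, channel):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return f"""
CREATE OR REPLACE FUNCTION {table}_notify() RETURNS trigger AS $$
DECLARE
  rec RECORD;
BEGIN
  -- NEW is not available for DELETE, so pick the right row image.
  IF TG_OP = 'DELETE' THEN rec := OLD; ELSE rec := NEW; END IF;
  PERFORM pg_notify('{channel}',
                    json_build_object('op', TG_OP, 'id', rec.id)::text);
  RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER {table}_changefeed
AFTER INSERT OR UPDATE OR DELETE ON {table}
FOR EACH ROW EXECUTE PROCEDURE {table}_notify();
""".strip()


if __name__ == "__main__":
    print(changefeed_sql("accounts", "accounts_changes"))
```

A client would then run `LISTEN accounts_changes;` on its own connection and re-query the affected rows when a notification arrives. The payload is kept to an op and an id because NOTIFY payloads are size-limited (just under 8000 bytes by default), so this gives you a change signal rather than a full changefeed.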


While I like that idea (paid extras to deal with pain points), it puts developers of the base product in a weird position. They are incentivized to ignore user complaints on one codebase, as that's a business driver on the other codebase.

Nothing new, just a strange place to be in.


Re-licensing is being worked on by the interim leadership team. We are also waiting for the results. Hopefully it will be Apache or MIT licensed.

source : https://docs.google.com/document/d/1f8qODp7voKIqwioQ3si69q3-...


I wish more databases were developed with modular, reusable pieces. RocksDB is already a popular storage engine for anyone building a database today. Throw iterlib[1] (shameless plug) on top of it, you get a query API for free.

Now you get to innovate on top of these modules using reactive APIs, user space TCP, Fancy serialization, Standards compliant SQL etc.

Some of the current incumbents were written in the 90s when C++ was a different programming language, which hurts innovation.

[1] https://github.com/facebookincubator/iterlib


A hosted distributed SQL-ish database running on Joyent would be amazing! I checked the available packages on Joyent cloud many times not quite believing RethinkDB _wasn't_ on the list as it seemed such a great fit. Fast native containers with RethinkDB's sharding support would have been fun to have "precanned" support for.

Currently the pre-packaged databases on Joyent don't support both clustering and consistency. Is there anything in the works for this segment?!


I'm also interested to hear what the future, if any, looks like for RethinkDB. Would also be very curious to explore what can be done regarding the IP/licensing as part of that future (agree with the concerns mentioned here).


RethinkDB as a product is not dead yet; you can contribute to it. Re-licensing attempts are already in progress, but the ex-founders have to deal with legal issues for it, so it will take a long time.


Here's a developer tool using AGPL which seems to be doing okay: http://itextpdf.com/Pricing


I really appreciate the deep introspection in this post. We can all learn a lot from it. I'm one of the (possible minority of) HN users who don't care about startups and business, so for me the takeaway was about how choosing quality and correctness over speed works out in the real world. It seems that you really can't take the idealist approach to quality that RethinkDB took. The world just isn't willing to wait, and it's not willing to take the time to understand a system and to see how clearly it distinguishes itself from the alternatives. Our decisions are driven by speed. Our culture is "always be shipping." But part of the result of this is that all our software is riddled with bugs and usability issues. Even the most basic functionality of our computers fails all the time, all the way down the stack. It's depressing that a really solid, honest attempt to produce extremely high quality software like RethinkDB can't survive in part because of these things.


> The world just isn't willing to wait, and it's not willing to take the time to understand a system and to see how clearly it distinguishes itself from the alternatives.

This world is always willing to wait, it was the RethinkDB team and/or their investors who weren't. The world will wait for you in eternity; it's just not willing to provide you with capital during the wait.

Your statement would make sense if there were a dozen competitors in the database space. With hundreds of competitors, actually evaluating them all becomes more expensive than just choosing e.g. a managed SQL solution and then switching provider if it doesn't live up to its promises.

That's so much more flexible from a business perspective than evaluating a hundred different open source database systems, settling on one, and then having to rewrite your database code when the database you've chosen ceases development, and you're left as the maintainer of it.


I was a huge advocate of RethinkDB, absolutely love what Slava and the team were able to accomplish, but I am having a really hard time digesting that "Correctness", "Simplicity of the interface" and "Consistency" were the issue here. Example: Realm.

I can only imagine what the team at Realm is thinking after going through comments and the post-mortem. Realm was an open-source mobile first database for the past 3 years. It's free. It's the most correct, consistent, and simple mobile database that is running on millions of devices[1] (Yup fine, kind of my opinion but trust me mobile community will back this up). They ship FAST, they ship real USECASES. They just recently entered the cloud space with their Mobile Database Platform, and they are even experimenting with new things like a version built for Node.js.

I'm honestly not trying to make a specific point, but asking more questions... Is mobile different? Is it easier? Are mobile developers more willing to try new technology? Are they more willing to pay for technology? (Realm is free, haven't paid them a penny, so not sure about this one...)

My main point I guess is that it's not a terrible market, Realm made it work with VERY similar ideals and goals. So what else is going on here?

[1] https://realm.io/news/jp-simard-realm-core-database-engine/


I'm not familiar with Realm, but after a brief glance at their site:

Yes, mobile in this context is very different. Or rather, local is very different. Building a database that resides on a local device and is primarily accessed by a single user is an entirely different class of problem than one that is on a server. Usually most of the hard problems in building a database system involve dealing with multiple people accessing or modifying data at the same time. When you have a single user, that problem either disappears or is massively reduced.

As you said, Realm does appear to have a server offering now, but from the sounds of it, it hasn't really been proven yet, so I don't think it's as easy as saying they've succeeded where Rethink failed.


Absolutely, in no way am I saying their cloud offering is successful. What I am saying is they have shown you can be successful with the very same ideals and goals that the RethinkDB team had, with a very similar developer-minded audience. The biggest factor here is obviously the platforms their database runs on - Mobile (or local as you put it nicely) vs cloud.


It's in the Worse is Better essay. Strangely enough, the essay series the author drew on advised the opposite strategy because it wins in the market. Here's a simpler one from someone who developed a product in high-assurance security then worked to turn Microsoft security around:

https://blogs.microsoft.com/microsoftsecure/2007/08/23/the-e...

He details why they put marketing and shipping first with constant improvements on correctness or security. On top of this, I advise just embedding the benefits into the product & designing it for easy change so components can be replaced once big revenue comes in. As development pace slows, QA pace can increase on components stabilizing.


It's probably too early to say that Realm made it work.

They've raised a roughly similar amount of VC money as RethinkDB did. Their last raise was a few years ago, before Parse's shutdown, so I suspect that investors are more antsy about this space today.

The fact that Realm recently entered the cloud space sounds similar to RethinkDB's trajectory too... Of course I hope Realm makes it, it's a great product. But I'm worried that they're trapped in the VC-funded model like RethinkDB was.


> Is mobile different? Is it easier?

It's easier to get adoption for mobile, I'd think.

Database component designed for use by mobile applications: something you base an app around

Database server: something you base your business around


I don't actually think it is easier to get adoption. The major mobile platforms have SQLite built-in (under the hood in the case of Core Data). I think that's actually a strong barrier to overcome.

Furthermore, mobile is becoming in-demand or even core in both consumer and enterprise. For enterprise especially, though, you have to think of it as mobile + sync, not on-device only.

Turns out sync is the really hard part when you want to build something that's robust in the face of poor network connectivity.



Slava (coffeemug) is very brave to introspect publicly and, whenever I interacted with him, he behaved with the greatest integrity and kindness. Unfortunately, his hypothesis here that devtools are a tough market does not ring true. Making a business is always tough, making one that does something new is even tougher. There are many tools and database companies doing extraordinarily well, from Atlassian to Cloudera to Docker to Elastic. The difficulties that RethinkDB faced, as it pivoted from closed source MySQL DB backend, to open source independent database with a very cool update mechanism, were in my analysis much more mundane. Simply, RethinkDB was not easy. Here is a flavor of their thinking, when they rejected a REST HTTP interface: "The RethinkDB query language is so simple, expressive, and pleasant, that trying to make it available over HTTP would be quite difficult and a disservice to users."

Developers have a big advantage when making devtools in that they are their own target market. They are, but only at the start of the project before they become embroiled in it. Ironically this feeling of making it for oneself leads to a mistaken urge to thrust complexity onto the target. As an infra engineer myself, this is something I always try to keep in mind -- I worked on a modelling language at a large company and I credit its success to its absolutely straightforward and accessible syntax, lack of dependencies and straightforward integration into workflows. The temptation is always to make sure users are aware of the choices, force them out of bad habits and make them consider the tradeoffs (this Strange Loop talk from Peter Alvaro addresses the social treatment of users https://www.youtube.com/watch?v=R2Aa4PivG0g ). For RethinkDB, it was wrong to try to force people to learn a new language to use a DB, when there are just tons of interfaces already, and the new language has incremental benefits over say SQL.

However elegant the design of software, the deployment that is the largest or most demanding for it will always find problems. There's no point seriously trying to claim great reliability or quality for a nascent project. It's unbelievable. Instead you need to focus on getting it used in some friendly project that is willing to pay the cost of these teething problems for some particular feature. Once battle tested, reasonable tech leads can choose to use it, comfortable that they won't be the biggest deployment (e.g. git bootstrapped by its use in the Linux kernel). Alternatively, and less scrupulously, you can try to bootstrap by providing a very convenient experience at the start of a simple project for junior engineers - there are many NoSQL DBs that promise the world and are incredibly easy to set up (see the https://aphyr.com/tags/jepsen Jepsen blog for many sad examples, RethinkDB proudly did well in this test).

RethinkDB refused to make the market fit compromises necessary. Market fit is about timing. Don't read the Economist. Ship early and keep your users on board through whatever engineering compromise necessary.


I spent 1.5 years writing a ton of code using RethinkDB, and the last two months rewriting all the code (and more) using PostgreSQL. You are completely right, e.g., "For RethinkDB, it was wrong to try to force people to learn a new language to use a DB, when there are just tons of interfaces already, and the new language has incremental benefits over say SQL." rings very true. SQL is incredibly expressive and the tooling around it is very mature.


So I disagree with this. One of the things I loved about RethinkDB was the fact that it did away with SQL. The problem with SQL (in high transaction applications anyway), is that performance isn't necessarily reproducible, or determinable from the query. You are at the mercy of how the query optimizer decided to fetch your data. The more complicated your query, the less you can count on reproducible performance.

With RQL you could very clearly determine how rethinkdb was fetching your data, and you knew it was going to get your data in the same way every time the query ran.
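To make the optimizer point concrete, here is a small sketch using SQLite from the Python standard library as a stand-in (the comment above is about SQL databases in general, and SQLite's planner is far simpler than PostgreSQL's or MySQL's). The `users` table and index name are made up for illustration. The same query text gets a different access path once an index exists, and nothing in the query itself tells you which one you'll get.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the human-readable steps of SQLite's plan for `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The description is the last column in every SQLite version.
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

query = "SELECT * FROM users WHERE email = ?"
args = ("user42@example.com",)

before = plan(conn, query, args)  # no index yet: a full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(conn, query, args)   # same SQL text, now an index search

print("before:", before)
print("after: ", after)
```

The exact wording of the plan lines varies between SQLite versions, so treat the printed text as informational. In ReQL, by contrast, index use was spelled out in the query (e.g. `getAll(..., {index: ...})`), which is the predictability being described.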


Best comment in here.


Hey all, author here.

FYI, the post is still in its unpublished state. It's basically right, but I've been meaning to edit it for tone and rewrite the market failure section for clarity. Didn't get the chance to do that before it made it on HN, so keep in mind that you're reading a (late) draft.


Kudos on a magnificent post, and of course commiserations.

One thing resonated strongly with me. When we started out many years ago someone said to me that the worst market you could go after was to target developers. Case in point - every developer who's ever put up a web site has built or thought about building a CMS. There must be thousands upon thousands of them, virtually all with a price of $0. Who would want to sell a CMS product?

We heeded that advice and I often reflect on it during the sales process when we have the "let's just run your offering past the IT team" meeting. So often, the line of questioning from the smartypants dudes in the room is all about tripping us up, finding our weak spot (I'm a techy so I get it - I think like this myself).

The best you can normally come out of that meeting with is grudging acceptance from IT that a SaaS solution should be OK, given that the business is in a rush and all, and given that IT has other priorities. (To be fair it's much less like this these days, but a few years ago this was the norm).

By contrast, showing your product to a business person is a breath of fresh air. They don't want to build it themselves, they don't really care what's under the bonnet, they just want good, quick solutions to their actual problems, so they can go home at night and sit down to Netflix. HR people (our target market) are even nicer.

What's more, those business people want to pay for your solution. A quick way to kill the deal is to joyously tell them "it's free!". They need to understand how you get your buck at the end of the day. They'll never buy until they understand that. And the best (because it's the simplest) answer is "we've got the best product, so we charge good money for it".


IT people try and "trip you up" because we've been burnt so many times in the past with systems that don't do what is promised by the sales people, don't integrate well and who stiff you during the implementation phase.

HR people are some of the worst because everything is super secret and they don't want to get IT involved, then they go and buy a system that lets you insert a carriage return into the phone number field and produces CSV files that don't have primary keys for data extracts.


It hurts because I lived it so many times. The "you can't understand", "it's complicated", "it's a HR thing" is such bullshit.


Everyone walks on eggshells around HR for "compliance" stuff. Anecdotally, I've not seen a whole lot of compliant practices in the stuff HR wants. The high end HR solutions tend to enforce it, but there are always amazingly bad home grown tools sitting around to allow you to "pull a list of all our employees and all our data to a CSV" and then you find out the person converted that CSV to a google doc... (I've not seen anything that combined all these steps exactly this way, but I've seen each of these steps. The more legacy the tools, the worse.)


And yet a bunch of them chose MongoDB (over something else, not just over RethinkDB.) Was there some point in time when it was the best option for a lot of people?


They were a fairly early entry into modern document databases. They offer sharding/mirroring, and have a really simple, relatively nice interface. In the end, they're a good (enough) fit for a lot of use cases. While RethinkDB imho is absolutely better, market share entropy counts for a lot. ElasticSearch is a similar product in the end, with lower consistency guarantees, that is another good fit for a lot of use cases. Cassandra (C*) is also a good fit for similar use cases, though more difficult to work with, it's also more tune-able to specific needs.

It really depends on what you need... Every non-sql database tends to sacrifice something for some performance gain... RethinkDB is as close as I've seen to one without sacrifice.


Yes, there was a point it was clearly the best option. Qualitatively it was easy to understand. The mongo CLI looked exactly like MySQL. And its single node performance was decent.

CouchDB was super slow. Cassandra was amazing but a bit confusing. MongoDB made sense and worked.


These anecdotes are gold


First, thanks for your candor in this.

If you're open to edits, the $200k revenue/employee rule of thumb really doesn't make a lot of sense when you're comparing your three alternate business models (Hosted Rethink, DBaaS Rethink and "Hey, maybe PaaS on top of Rethink?").

Like you, I thought it would be way too hard to run a service (my own company was encouraged to do so, and we shied away from it for the same reason you stated). However, having seen it now from the inside for GCE for nearly 4 years, I can say it's much easier to "service-ize" yourself than it is to have built a product people want. And moreover, the shell scripts and operations stuff really scales (effectively, the hosted DB offering is the binary plus a bit of bash, while the other offerings require yet more "not done before" software).

Even more surprisingly, this statement:

> Managed hosting is essentially running the database for people on AWS so they don't have to. The alternative to using these services is setting up the database on AWS yourself. That's a pain, but it isn't actually that hard. So there is a very hard cap on how much managed database hosting services can charge.

turns out to just not be true in practice. People pay us (Cloud SQL) and AWS (RDS) a surprising amount of money to run MySQL in a VM for them. It turns out that people really like outsourcing even a small amount of pain, but that we're (as engineers) easily blindsided to "Um, why would I pay you XX% over the raw VM price? You're just running some shell scripts for me". Little things like automatically updating to the latest version, configuration replication, etc. are things that nobody will bat an eye at paying you for (I believe in the DB space in particular, it's because the overall cost of your DB is dwarfed by "All the rest", so it's a good trade of people time versus $$s).

I hope this doesn't come across as Monday-morning quarterbacking, but if I did mine over again I'd take the "Build a great piece of software, and even if it's open, service-ize it for folks" approach. It's true that you're competing with the Cloud providers, and that it seems like you wouldn't really be able to get people to pay you for it (and it's not your core competency), but I think it's the best option for this flavor of infrastructure software moving forward.


> If you're open to edits, the $200k revenue/employee rule of thumb really doesn't make a lot of sense when you're comparing your three alternate business models

Sure, but for many venture funded companies and even large enterprises top-line is the top metric you're focused on. If you're in a growth industry it's about top line revenue, capturing market share, and cost of underlying bits will eventually come down.

> turns out to just not be true in practice. People pay us (Cloud SQL) and AWS (RDS) a surprising amount of money to run MySQL in a VM for them.

For infrastructure providers to "service-ize" you can set cost as a much smaller markup on the cost of the raw infrastructure. Most large organizations (AWS/GCE) have a model where they give some credit for the underlying infrastructure back to the team that servicized things. This is quite a bit different from a smaller startup that still needs to make some reasonable margin but has significant cost of goods of the infrastructure itself.

At the same time I fully agree with the notion that you should build and run a service, but if the marketing or even product perspective of such a thing is "we installed a VM for you", you're entirely missing the point. Having built an as-a-service product a few times over, there is a lot of engineering and product work that goes into it. Turn key backups, point in time recovery, high availability, performance monitoring are all not just installing a VM. And it's very different from building a core database engine. While there can be many benefits from collaborating on them, they are very different skill sets and the notion of focus shouldn't be underweighted.


To back you up on this, at the last place I worked we ran our own MySQL master and slaves for years, with moving to RDS always on the back burner. Finally completing that migration and shutting down our own servers was a big achievement. Never looked back, and were happy to be paying AWS to take care of the DBs for us.


I might be wrong but I think you missed maybe the most important keyword: backup. You can't fix a missing/corrupted backup, at this point it's too late. So you need to get it right. It's not like, say, configuring nginx. If you get nginx wrong you just fix it. If you need a backup and it's not there you can't fix it, it's (likely) game over.

So outsourcing the database is a very logical thing to do: if you can trust a third party to have a reliable backup/recovery story, that's one thing that can kill your business that you don't have to worry about.


>>> People pay us (Cloud SQL) and AWS (RDS) a surprising amount of money to run MySQL in a VM for them.

As a happy RDS user, I totally agree. Surely, I could run MySQL and the result would be mostly fine, after some mistakes and lessons learned. But all the things such as upgrading, etc still take time, add a bit of stress, one more thing to care about. Could that time and energy be spent on more valuable (from business point of view) activities? Yes.

I have never regretted this decision.


This was a really interesting post. Thanks for writing it.

First with MySQL and then with MongoDB, a pattern is emerging. There's a popular open-source database. Its feature set is awesome and the ecosystem of software on top of it is good because lots of people are using it, but it's buggy and unreliable. So it's controversial. Since it's popular, it keeps being developed, and over time, it slowly gets more reliable.

To me this post is a reminder that, just because I can see clear flaws with a competing product, it doesn't mean those flaws are exploitable. Better to focus on what your own customers are demanding than on criticisms of your competitor.


The problem with MySQL early on was exactly its feature set. Fifteen years ago it was very far behind Postgres yet much more popular. I do not think it even had InnoDB/transactions at that point (or almost no one used it), definitely did not have user defined functions, procedures and many other features developers took for granted in commercial RDBMS.

Like MongoDB, it was popular mostly for being popular, which is a nice gig if you can get it.


Also it was fast on crappy hardware and easy to compile and set up. 80%. Worse is better. Unix.


That is not Unix. Unix is small, composable, correct and 80% of what you need (you write the rest). It doesn't mean 20% of quality was sacrificed.


The original "worse is better" example is precisely about Unix making a trade-off in which it gives up on "correctness"


Market timing, excitement (some of which later turns into hype), mindshare, and marketing make for a difficult combo to beat. The post-mortem addressed market timing and hinted at the rest, but is reluctant to call out MongoDB's marketing.

I can't help but wonder if the story would have been different if the RethinkDB devs had organized a bunch of codecamps and positioned their db as the 'default choice'. This is how Mongo gained a ton of mindshare, helped by other trends at the time, like the NoSQL wave, the Node.js boom, and the renaissance of client-side JS development.


> (...) Since it's popular, it keeps being developed, and over time, it slowly gets more reliable.

Or people simply don't care/are not willing to pay for alternatives. MongoDB looks as unreliable as it was on the first day, but somehow people don't seem to care. I hope it's because their use-cases can live with the unreliability, but I fear it's because "won't happen to me, just to others".


As a former startup founder, I totally empathize - it's always clearer with hindsight but it's important to honestly reflect so you can apply the lessons learned to the next turn of the ferris wheel.

> We set out to build a good database system, but users wanted a good way to do X.

If you replace "database system" with "[any product]," it encapsulates the reason why many startups fail. I was guilty of it too. I think it's a common pitfall for technical founders who tend to think in terms of solutions rather than problems.

If you have some free time, I would recommend the book What Customers Want by Anthony Ulwick, which offers a systematic approach to formulating a product that satisfies a market's underserved needs.


> I think it's a common pitfall for technical founders

There's a very closely related form that is common for non-technical founders which is "We're going to use technology to fix ____ industry"

A founder with experience in a particular industry may see how inefficient it is, and how much scope there is for improvement based on technological solutions, but businesses live and die by the question: "Are there enough people who will pay us to solve this problem for them?"

Superficially, it seems like "I can make your business run more efficiently" is a reasonable approximation for "I have something that you will pay money for", but it has a very similar value-gap to "I can provide you with a technically better database".

"More efficient" and "technically better" are both things that ought to be valuable to customers, but often aren't valuable enough to get them to hand over their money.


Nice write up! My outside personal perspective was that you needed to endure for longer. Things like Mongo rode the "Relational is a deadend, NoSQL to the rescue!" wave and gathered momentum before people started fully understanding the consequences of choosing Mongo. Now it seems there's a lot more caution being taken. Anecdotal accounts of people having success with things are taken with a grain of salt. People seem to be looking for more of a tidal wave of acceptance.

As an aside, lately it seems it's "OO is a deadend, Functional to the rescue!" ... so maybe you can leverage that wave instead... or deep learning... OverThinkDB maybe :)


The key to building developer tools is to build the ones that a developer needs but has no interest in creating. Backend for mobile, infrastructure for backend, etc.

Just don't create front end tools for front end developers, for example. Firebug had a donate button and we funneled all of it to volunteers in places where the currency was weak to make it at least seem material.


I don't think it's that simple. Which developer wants to build a reliable, realtime database?


Thanks for the post. One can feel your pain flowing through it.

I have a rather naive question: Why do you recommend "the Economist" as opposed to something like the HBR?


I've read the Economist for the last 20 years and the HBR for a few months before I dropped it.

The Economist offers insight, thoughtful analysis and writes about the world: literally but also politics, culture, and science. The obituary is surprisingly fascinating for instance. You may not always agree with its position, but you know that their position will be worth engaging with and thinking about.

No, I don't work for the Economist, I just appreciate thoughtful articles. Same goes for Private Eye, Foreign Affairs, New Yorker.


The best obituaries are in the Telegraph. I am still angry they have put them all behind a paywall :(


Aha interesting! Never read their obits. I find Evans-Pritchard a good read.


While I feel for you because failure hurts so much, congratulations on learning from your experience. I know this won't make you feel any better, but the next time you try things will be so much easier.

By the way the tone is fine given your experience.


Thanks for writing this, and writing it with candor.

What do you think of Spolsky's "Good software takes 10 years"?

https://www.joelonsoftware.com/2001/07/21/good-software-take...


It seems like you have a great ability to self-evaluate. Many of my friends and I, after reading your post, agree that we would be very willing to work in a team managed by people like you.


Sorry, I am the one who leaked it. @coffeemug, we greatly appreciate your effort, and RethinkDB is a well-architected database. We are going to keep using it and contributing to RethinkDB.


Did you ever consider partnering with a larger vendor?

Figuring out the right GTM is tough, and the timing was weird, but GOOG and MSFT must have been pretty desperate for a NoSQL solution in 2013-2014 to compete with DynamoDB. Firebase got scooped up in October 2014, but it wasn't until April 2015 that Azure announced DocumentDB.


Thank you for the detailed analysis & post. I am myself embarking on a startup journey, albeit in the world of consulting & I wanted to let you know that your in-depth writeup was extremely valuable & in many ways poignant to me. There were so many learnings in there, and from the way you bared it all, it feels like it must have been excruciatingly hard to let go of your company (& co-workers) in the end. Thanks again & I wish you all the very best for your next venture - wiser & better.


Thanks for all your work.

I understand that you're waiting for a legal framework to accept donations as an organization, but there was an article recently about an individual GIMP developer being funded on patreon. Is this something you would consider? I'd be happy to contribute to someone who's actively working on RethinkDB (of course without any backer rewards or similar expectations), and perhaps others would as well.


Kudos on the post and for sharing. I really enjoyed reading the candor and hearing from your experiences. I heard amazing things about RethinkDB and admire your team for what you did.


I hope my finished writing is as good as your draft.


Thanks for pointing that out, just noticed the "/_drafts/why-rethinkdb-failed.md".


Here's the part that resonated with me.

> we fought the losing battle of educating the market.

I've been at ten startups. At least half of them chose that battle, and that would be the less successful half. Sales cycles are fragile things. Anything that slows them down can be fatal. Clearly you need to explain how your offering is unique and special, but if you have to pause to explain basic concepts before they get it then you've probably lost their attention and their business.

The VP of marketing at one of my previous companies was generally an idiot and one of the chief drivers of that company's failure, but he introduced me to the concept of market vs. technical risk. A product can be easy to market but a challenge to build, or it can be the other way around. You can build a strategy around either. If it's hard in both ways, you're in trouble. Frankly, it sounds a bit like that's the trap RethinkDB fell into. They were doing things that were technically hard and hard to explain to users. That's how you get three years behind.

It's a sorry situation, but kudos to the RethinkDB folks for trying to do the right thing, mostly succeeding from an engineering perspective, and writing honestly about how that played out in a market of people who aren't equipped to appreciate it.


> if you have to pause to explain basic concepts before they get it then you've probably lost their attention and their business.

And this point of failure for many startups was beautifully represented/mocked in HBO's Silicon Valley series: https://www.youtube.com/watch?v=Lrv8i2X3gnI


> but he introduced me to the concept of market vs. technical risk

This bit alone provided the right keywords to help me express some of the ideas in the essays I have been writing. Thanks for posting.


Hello HN, sorry for the early leak. I posted it after discovering it on GitHub while looking into RethinkDB's progress.

RethinkDB as a product is not dead. The community and one of the ex-RethinkDB senior developers are still maintaining it. Since it is fully open source, we are welcoming contributors. Let's make RethinkDB great, together!

https://github.com/rethinkdb/rethinkdb

Here are the current RethinkDB plans: https://docs.google.com/document/d/1f8qODp7voKIqwioQ3si69q3-...

Most of the ex-RethinkDB developers are going to keep contributing to RethinkDB after things have settled down a bit.

https://docs.google.com/document/d/1c27S3Ij2WLB_JiUmpIGkbwJO...

Please try out RethinkDB, and if there are any questions, we are actively answering at: https://rethinkdb.slack.com/messages/general/

Please join https://rethinkdb.slack.com/messages/open-rethinkdb for OpenRethinkDB updates.

Thank you very much. RethinkDB is an excellent database. I see a lot of new users on the RethinkDB Slack. Before, the RethinkDB Slack was quite empty, but now it has a lot more active users after the RethinkDB shutdown.

@coffeemug, those who've tried RethinkDB are staying with RethinkDB. The product is not a failure, and the actual users value the standards you have set!

Many of us are going to stay with RethinkDB.


First of all: this is great, thank you so much!

Secondly, and at the risk of hijacking the OP: I think it would be a good thing if you could put a status update either on rethinkdb.com or ~the first paragraph of the readme on GitHub about what you are trying to accomplish related to the name/rights/ownership/license/... I check those about once a month to see if something has changed.

As a current user and someone evaluating future usage, this is causing the most uncertainty. I don't know how these things happen (it is probably complex). But I want to commit. Even if progress is slow. Bugfixes are important, but secondary if the project might somehow be significantly damaged trying to "escape" its current ownership.

As an example: you are trying to transfer ownership of the build tools? I understand that might take a while, but what if that somehow can't happen? Is recreating them realistic?

Basically a list of threats/risks would be nice to have! :)

EDIT: this is also relevant for contributors/donations/... any type of investment.

Thanks again!


I am just one member of the RethinkDB community. I am waiting to contribute donations to RethinkDB once it has a foundation up and running.

I don't think rethinkdb.com is under the control of the current leadership team. But it would be a very good idea to put the status updates in the README. I will submit a PR for it.

Development is still continuing, but slowly. It is close to release 2.4; @atnnn and @srh are working on it.

> Bugfixes are important, but secondary if the project might somehow be significantly damaged trying to "escape" its current ownership.

> As an example: you are trying to transfer ownership of the build tools? I understand that might take a while, but what if that somehow can't happen? Is recreating them realistic?

@atnnn is building and running his own build tool server and it's almost perfect. He will need help there: https://github.com/AtnNn/rethinkdb-nix https://thanos.atnnn.com/jobset/rethinkdb/next#tabs-evaluati...


also join us on https://rethinkdb.slack.com/messages/open-rethinkdb/ for more updated info


Haven't tried recently, but the only way to install RethinkDB on my machine (ubuntu) was running the docker image. Is it currently fixed again?


it can be easily installed on Ubuntu:

    source /etc/lsb-release && echo "deb http://download.rethinkdb.com/apt $DISTRIB_CODENAME main" | sudo tee /etc/apt/sources.list.d/rethinkdb.list
    wget -qO- https://download.rethinkdb.com/apt/pubkey.gpg | sudo apt-key add -
    sudo apt-get update
    sudo apt-get install rethinkdb


That's strange. I've run it on Ubuntu (using the .deb, I believe) since 2014.


Coffeemug: there's one other thing you missed in the postmortem. The early marketing (1st year) for RethinkDB was all about optimizing for SSD storage. Since this was largely nonsense, database consultants like me (who were in a position to push technically correct solutions) read about it, and dismissed RethinkDB from our thoughts forever. I keep up pretty well with the OSS DB field, and didn't realize what you were really trying to build until five years later. With the number of NoSQL and NewSQL DBs entering the market in 2007-2010, I simply didn't have time to re-evaluate any of them.

So that's another "take-home" from the post-mortem: be careful about your "first impression" on the market, because it can be really hard to change.


Why is optimizing for SSDs nonsense? Empirically this seems correct since RethinkDB themselves pivoted away from this but curious about the technical explanation.


coffeemug explained it himself in a podcast where he was interviewed about RethinkDB (I don't remember the name of the podcast). Basically it boiled down to the fact that all existing databases (at that time) could be tuned with very little effort to take advantage of SSDs, so that differentiating factor went out the window.


Exactly.


> RethinkDB themselves pivoted away from this

I believe the storage backend remained largely the same (ie current RethinkDB, IIRC, is still optimised for SSD). Maybe I'm mistaken.


It's the same storage backend (well, modified quite a bit) and optimizing for SSDs isn't really the goal anymore, especially since things are now stored in a file and not the whole block device, and O_DIRECT is turned off.


I agree, I thought of Rethink as an SSD product too, and I only started using it when I read the postmortem.


I don't know, my personal opinion was that RethinkDB had its head on straight, MongoDB is garbage and still neither is so much better than PostgreSQL that I will switch away from it.

Postgres is the default datastore (because schemas are awesome) for me, and I haven't had a use case yet where I needed something that Postgres wouldn't do. Maybe if you have a very specific need, you'd reach for another datastore (come to think of it, I have successfully used Cassandra for multi-DC deployments of a distributed read-only store), but that's not the norm for me.


I was truly hoping that RethinkDB would be the Mongo that Mongo could have been: a NoSQL database, but one with joins; a NoSQL database that is actually CP [0]; a database that comes out of the box with granular real-time updates (so you don't even need to worry at first about the extra moving parts of a Redis server or other queue). For a business starting from scratch with those needs, I'm making do with Mongo, but I was anxiously waiting for RethinkDB to get more battle-tested... alas, it wasn't meant to be. And I fear that we'll be stuck with Mongo for quite a while.

The truth is that NoSQL has great promise, but the folks with actual money to spare today are the ones who want a better SQL database, because they can afford the tradeoff of development time for reliability that SQL administration and migration provide. And people who solve the SQL pain points, such as the excellent folks at CockroachDB, and the various folks building replication layers on top of Postgres, can hopefully keep the market for database startups alive long enough for someone else to fix the NoSQL landscape as well.

[0] https://aphyr.com/posts/284-call-me-maybe-mongodb , where one finds that Mongo actually isn't guaranteed to be consistent during partitions. This actually did hit us in production, and we had to dig through a rollback file manually (shudder); fun times. On the other hand, see: https://aphyr.com/posts/330-jepsen-rethinkdb-2-2-3-reconfigu...


I think Rethink would be that, but I don't know, I've never found SQL databases that hard to manage. I've found NoSQL databases hard to manage, because, invariably, someone will be doing a write right when you're trying to do a data migration/transformation, and now you have a single record that looks wrong, and it's a bug waiting to happen on read, rather than on write (where it should).
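To make the read-versus-write point concrete, here is a minimal Python sketch. The `REQUIRED_FIELDS` schema, the helper names, and the sample records are all made up for illustration, and no particular database is implied. A store that checks shape on write rejects the stray record immediately, while a schemaless store accepts it and the failure only surfaces when something reads it later.

```python
# Hypothetical schema for illustration: field name -> required type.
REQUIRED_FIELDS = {"id": int, "email": str}


def validate(record):
    """Raise ValueError if the record does not match REQUIRED_FIELDS."""
    for field, expected in REQUIRED_FIELDS.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        if not isinstance(record[field], expected):
            raise ValueError(f"bad type for field: {field}")


def write_checked(store, record):
    """Schema-on-write: a malformed record fails here, at write time."""
    validate(record)
    store.append(record)


def write_unchecked(store, record):
    """Schemaless write: anything goes in, problems show up on read."""
    store.append(record)


def read_emails(store):
    """A reader that assumes every record has the new shape."""
    return [record["email"] for record in store]


# A writer still using the old shape ("mail") during a migration.
stale_record = {"id": 2, "mail": "b@example.com"}

checked, unchecked = [], []
write_checked(checked, {"id": 1, "email": "a@example.com"})
write_unchecked(unchecked, {"id": 1, "email": "a@example.com"})
write_unchecked(unchecked, stale_record)  # silently accepted

try:
    write_checked(checked, stale_record)
    rejected_on_write = False
except ValueError:
    rejected_on_write = True

try:
    read_emails(unchecked)
    failed_on_read = False
except KeyError:
    failed_on_read = True
```

In the checked store the bad write is the thing that fails; in the unchecked store the write succeeds and an unrelated reader breaks some time later.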

Rethink definitely had its use cases, I just never saw it as my primary data store.


That's been my experience as well. Postgres handles everything until you hit an edge case on some of your data (volume writes with scale out, performance critical advanced functionality, heavy BI work with huge datasets, etc).

When you hit those edge cases or specific needs, you implement the specialty solution. Most people never come close to those edge cases though. When they do, PG's foreign data wrappers make the adjustment as smooth as possible.


> heavy BI work with huge datasets

Because of single threaded query execution, lack of MPP, or something else?


I agree Postgres is pretty awesome, but I think the one thing it has going against it is the lack of a good GUI client. MySQL has way too many good options whereas Postgres is stuck on this front.


Please try Postage. It's free for all users, open source, comes in server and desktop distributions, and we listen to our users. You can get it at https://github.com/workflowproducts/postage

If it doesn't meet your needs please file an issue, our goal is to remove the no-good-GUI limitation that was imposed on postgres sixteen years ago.


What do you mean by "imposed"?


Imposed, as in, to put something in place on purpose.


Who imposed it? There's not exactly much central control over postgres in general, and just about none over the tooling built outside the core repo.


I've tried really hard to like Postgres, but coming from SQL Server land, the tooling is mind blowingly atrocious. Every few months I go on a google spree trying to find the magic tool to make it less painful, but everything is either just as awful as pgadmin or costs way too much.


Microsoft gets a lot of shit for various and mostly good reasons. But their tooling is seriously top notch. I would murder kittens to get tools like SSMS for postgres.

Hell Python Tools for Visual Studio almost makes me want to do python dev work on a PC.


We'd love to have your help on Postage. We want to be the Microsoft SQL Server Management Studio of PostgreSQL. Just file issues on GitHub for bugs and features, we'll do our best to keep up.


JetBrains DataGrip is also worth considering - another polished quality app from JetBrains: https://www.jetbrains.com/datagrip/


I've been meaning to try this. I'll check it out.


I worked with people in the past that refused to touch postgres because it doesn't have a UI they like. They insisted on using MySQL for that sole reason.


+1. The main reason I didn't jump on to Postgres decades ago was because the tooling was really bad. Third party tools for MySQL were streets ahead, even back then.

End result is that I jumped on the MySQL bandwagon, and even all the negative posts against it now and the lauding of Postgres in here all the time cannot make me change tracks. I just have too much data and IP invested in it now.


Highly recommend DBeaver http://dbeaver.jkiss.org/


Just downloaded it and it seems effective and well-polished. Thanks for the recommendation, MySQL Workbench has some killer bugs that made me stop using it.


Will try. Thanks


Postico is very nice. I've been using it (and its predecessor - PG Commander) for a few years. Worth the money I paid for it.

https://eggerapps.at/postico/


I developed and use(d) pgxplorer (github.com/pgxplorer). So, I can tell you this: the best out there is psql (CLI). There exists no use case where a GUI is better than psql.


Not even diagramming relationships between tables?


Those fall under db authoring/modeling tools. https://github.com/pgmodeler/pgmodeler is a good pick.


Postgres is horribly painful to cluster though, while RethinkDB clusters are trivially easy to set up.


Based on the postmortem, you weren't alone in thinking that way, it's just that all the people who did think so, didn't value the difference enough to actually pay for it. And, indeed, when something like Postgres is free...


Rethink, Mongo et al are scalable (at the expense of some ACID). PG is certainly a good DB (I use it in production) but it is at heart a traditional RDBMS - not designed for scalability. Apples vs Pears..


What do you mean PG isn't designed for scalability? Its SMP scaling is incredibly good, close to linear.


Postgres scales neither vertically nor horizontally.

v9.6 finally brought some parallel scans but is still limited in scope and far behind the commercial databases. Same with replication and failover, although there are decent 3rd party extensions to get it working.


The vast majority of us probably won't ever need that scalability anyway. Most places I've seen that needed something more scalable needed to just stop being so retarded with the tools they already had.


As for clustering in general... Scalability may not be important, but replication and failover should be. I believe every place that has existed long enough must've had its "ouch, our master DB host went down" moment.

I wasn't able to set it up properly some years ago and settled for a warm standby replica (streaming + WAL shipping) and manual failover. Automating it (and even automating the recovery when the old master is back online and healthy) was certainly possible - and I got the overall idea of how it should operate. But the effort required to set it up was just too big for me, so I decided it wasn't worth the hassle and went with the "uh, if it dies, we'll get alerted and switch to the backup server manually" scenario.

Having something along the lines of "you start a fresh server, tell it some other server address to peer with, and it does the magic (with exact guarantees and drawbacks noted in documentation)" would be really awesome. RethinkDB is just like that. PostgreSQL - at least 8.4 - wasn't, unless I've really missed something. I haven't yet checked newer versions' features in any detail, so not sure about 8.5/8.6.


> you start a fresh server, tell it some other server address to peer with, and it does the magic

This is the one thing I ask of every database software, yet we still don't really have it. 90% of problems could be solved if there was a focus on the basics like easy startup, config and clustering.


SQL Server is almost that easy, but still doesn't handle schema changes well.


How so? Aerospike, ScyllaDB, Rethink, MemSQL, Redis are the only databases that get close to this.

SQL Server availability groups require Windows Server Failover Clustering, which is not quick or easy.


You can try http://repmgr.org/ for automation.

It offers easy commands to perform failover and even has an option to configure automatic failover.

After reading about github issues[1], I am a bit cautious about having automatic failover though.

[1] https://github.com/blog/1261-github-availability-this-week


Thanks for the link, I don't think I've seen this one before.

As for automation... Things can always go wrong, sure. But I wonder how many times HA and automatic failover have saved the day at GitHub so that no outside observers had the faintest idea there was something failing in there.


Vertical scalability is important for everyone as it makes better use of hardware (lower costs or more performance).

Horizontal scalability is important for HA which every production environment would like or need.


I suppose the parent comment was more about scaling up for transactional workloads - the story there is a lot better (although there's still issues left we haven't tackled, but mostly on very big machines) than for analytics workloads. But yes, while we progressed in 9.6, there's still a lot of important things lacking to scale a larger fraction of analytics queries.


Don't have experience with Rethink, but Mongo scalable? That's a joke right?

Here's how it compares: http://www.datastax.com/wp-content/themes/datastax-2014-08/f...

The only values where it is higher than the rest are on page 13, except those are latencies, where lower is better.

This paper shows clearly that Mongo doesn't scale.

And even for single instance Mongo is slower than Postgres with JSON data: https://www.enterprisedb.com/postgres-plus-edb-blog/marc-lin...

There are actually add-ons to Postgres[1] that add MongoDB protocol compatibility, and even with that overhead Postgres is still faster.

And even such benchmarks don't tell the full story. For example, I worked at one company that used Mongo for regional mapping (map latitude/longitude to a ZIP code and map IP address to ZIP). The database on Mongo was using around 30GB of disk space (and RAM, because Mongo performed badly if it couldn't fit all data in RAM), mainly because Mongo was storing data without a schema and also had a limited number of types and indices. For example, to do IP to ZIP mapping they generated every possible IPv4 address as an integer; think how feasible that would be with IPv6.

With Postgres + ip4r (an extension that adds a type for IP ranges) + PostGIS (an extension that adds GEO capabilities: lookup by latitude/longitude, which Mongo has too, but PostGIS is way more powerful), things looked dramatically different.

After putting data using correct types and applying proper indices, all of that data took only ~600MB, which could fit in RAM on smallest AWS instance (Mongo required three beefy machines with large amount of RAM). Basically Postgres showed how trivial the problem really was when you store the data properly.
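The space saving comes from storing one row per address range instead of one row per address. Here is a rough Python sketch of that idea, not of how ip4r or Postgres implement it: the `RANGES` sample data is made up (documentation address blocks, invented ZIP codes), and a sorted list with binary search stands in for a proper range index. It assumes the ranges do not overlap.

```python
import bisect
import ipaddress

# Hypothetical sample data: (first address, last address, ZIP code).
# Documentation address ranges are used; the ZIP codes are made up.
RANGES = [
    ("192.0.2.0", "192.0.2.255", "94103"),
    ("198.51.100.0", "198.51.100.127", "10001"),
    ("2001:db8::", "2001:db8::ffff", "60601"),
]


def build_index(ranges):
    """Sort non-overlapping ranges by (IP version, start address)."""
    rows = []
    for first, last, zip_code in ranges:
        lo = ipaddress.ip_address(first)
        hi = ipaddress.ip_address(last)
        rows.append((lo.version, int(lo), int(hi), zip_code))
    rows.sort()
    keys = [(version, start) for version, start, _, _ in rows]
    return keys, rows


def lookup(index, address):
    """Return the ZIP code whose range contains address, or None."""
    keys, rows = index
    ip = ipaddress.ip_address(address)
    # Find the last range that starts at or before this address.
    pos = bisect.bisect_right(keys, (ip.version, int(ip))) - 1
    if pos < 0:
        return None
    version, start, end, zip_code = rows[pos]
    if version == ip.version and start <= int(ip) <= end:
        return zip_code
    return None


INDEX = build_index(RANGES)
```

Three rows cover 256 + 128 IPv4 addresses and 65,536 IPv6 addresses here, and the same structure works for IPv6 where enumerating every address is not an option.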

[1] https://github.com/torodb/torodb


This was an excellent read. No surprise given the author. I think there is a missing element about this in the post:

> Correctness. We made very strict guarantees, and fulfilled them religiously.

It is true that the user base did not see this as being as important as the RethinkDB team did, partly because there is a part of the development community that is very biased towards this aspect of correctness, to the point that you may literally never experience a problem with a system for years, but then a blog post is released showing that the system is not correct under certain assumptions, and you run away. So part of the tech community started to be very biased about the importance of correctness. But 99% of the developers actually running things were not.

Correctness is indeed important, and most developers do have less interest than they should in formal matters that have practical effects. However, I don't think this point can be simply reduced to developers' ignorance making them just want the DB to be fast in micro-benchmarks. Another problem is use cases: despite what the top 1% of people say, many folks have use cases where being absolutely correct all the time is not as important as raw speed. So the reason they often don't care is that they can actually avoid caring about correctness at all, as long as things kinda work, while they absolutely need to care about raw speed in order to get things running on a small amount of hardware.


For me, this was very important and I think I was very much aligned with the team's goals, which is probably why I like RethinkDB so much. The problem is, I wasn't (and still am not) in a position where I could have paid a bunch of money for it. A little, maybe, but not enough to make a real dent.

So I suspect that the people who cared about the same things that the team did are also the people who 1) either aren't the ones making the final business decision or 2) aren't in a position where they could or would pay much for it.


I'd say that for more than 90% of potential RethinkDB users there's no need for guaranteed correctness (the exceptions are in FinTech or similar areas).

But even in those cases, the operation can be retried or the operation could take longer.

If you're "Social startup of the year" it doesn't matter if one post appears to some followers 100ms later than it should.


In my opinion, you got it backwards. Fewer than 0.1% of software engineers are working on the "social startup of the year", whereas the other 99.9% of developers are working on software where consistency and correctness matter more than availability.

And for most of those developers, atomicity guarantees (the A from ACID) are very important, because without transactions if that one operation fails, before it can be repeated as you say, the system also has to rewind the previous operations that succeeded in order to leave the data in a consistent state and implementing this rewind by yourself is extremely challenging (take a look at the Command pattern sometime for a possible strategy).
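For anyone who hasn't had to write that rewind by hand, here is a small Python sketch of the Command-pattern approach mentioned above. The `Transfer` class, the account names, and the amounts are invented for illustration. Each step knows how to undo itself, and a failure part-way through triggers the undos in reverse order.

```python
class Transfer:
    """One reversible step: move `amount` between two accounts."""

    def __init__(self, accounts, source, target, amount):
        self.accounts = accounts
        self.source = source
        self.target = target
        self.amount = amount

    def execute(self):
        if self.accounts[self.source] < self.amount:
            raise ValueError(f"insufficient funds in {self.source}")
        self.accounts[self.source] -= self.amount
        self.accounts[self.target] += self.amount

    def undo(self):
        self.accounts[self.target] -= self.amount
        self.accounts[self.source] += self.amount


def run_all_or_nothing(commands):
    """Run commands in order; on failure, undo the completed ones in reverse."""
    done = []
    try:
        for command in commands:
            command.execute()
            done.append(command)
    except Exception:
        for command in reversed(done):
            command.undo()
        return False
    return True


accounts = {"alice": 100, "bob": 0, "carol": 0}
steps = [
    Transfer(accounts, "alice", "bob", 60),    # succeeds
    Transfer(accounts, "alice", "carol", 60),  # fails: only 40 left
]
ok = run_all_or_nothing(steps)
```

Even this toy version shows the gap: it does nothing about a crash in the middle of the rewind, and other readers can observe the intermediate state between the first transfer and its undo. Those are exactly the cases a database transaction handles for you.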

Or in other words, most developers need atomic guarantees / transactions as well and if the DB doesn't support transactions with the needed granularity for your app, it's easier to change the DB than implementing it yourself, which is why RDBMSs are still more popular than NoSQL systems, because they are more general purpose and made the right compromises for a majority of use-cases.


Thanks for your comment

I don't disagree that consistency and correctness are important, but it's the same as with uptime, where to get an order of magnitude 'more confidence' you have to spend more time/resources (because even if your DB did everything right, your hard drive could have had a glitch, etc)

There are ways of working around the lack of atomicity, for the cases where you really need that guarantee (I really wouldn't try "writing my own" generic transaction manager, but you can try relying on the atomic operations and limiting changes to small steps - and if you wrote information that is irrelevant now because the operation didn't finish that's fine - you can gc it later)

Now if the operations you need to do are very complex then it is probably better to keep using RDBMSs


> If you're "Social startup of the year" it doesn't matter if one post appear to some followers 100ms later than they should be.

That's latency, not correctness. I think even "social startup of the year" cares that the post shows up at all, is attributed to the correct author, isn't corrupted somewhere and so on. Data correctness/consistency is about making sure that the data isn't in an inconsistent state.


> That's latency, not correctness.

That's "eventual consistency" in replication


> People wanted RethinkDB to be fast on workloads they actually tried, rather than "real world" workloads we suggested.

> For example, they'd write quick scripts to measure how long it takes to insert ten thousand documents without ever reading them back.

> MongoDB mastered these workloads brilliantly, while we fought the losing battle of educating the market.

I was one of the people who did this and I'd like to provide you another takeaway:

Don't tell people what "real world" workloads are. They know better than you do. When I presented my performance testing scripts to you guys it was a replication of our real world workload. Trying to explain to someone that their workload isn't "real world" hurts the confidence in the system. It's better to either say that isn't a workload you are optimized for or that you are working on it.
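For reference, the kind of quick script being argued about looks roughly like the Python sketch below. An in-memory SQLite table is used purely as a stand-in document store so the script runs anywhere; the table layout and the 10,000 count are assumptions, and the timings say nothing about RethinkDB or MongoDB. It times an insert-only run next to a run that reads each document back.

```python
import json
import sqlite3
import time


def make_store():
    """Stand-in document store: an in-memory SQLite table of JSON strings."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
    return conn


def insert_only(conn, count):
    """The quick-script workload: write `count` documents, never read them."""
    start = time.perf_counter()
    for i in range(count):
        conn.execute(
            "INSERT INTO docs (id, body) VALUES (?, ?)",
            (i, json.dumps({"n": i})),
        )
    conn.commit()
    return time.perf_counter() - start


def insert_then_read(conn, count):
    """A mixed workload: read every document back right after writing it."""
    start = time.perf_counter()
    seen = 0
    for i in range(count):
        conn.execute(
            "INSERT INTO docs (id, body) VALUES (?, ?)",
            (i, json.dumps({"n": i})),
        )
        row = conn.execute("SELECT body FROM docs WHERE id = ?", (i,)).fetchone()
        if json.loads(row[0])["n"] == i:
            seen += 1
    conn.commit()
    return time.perf_counter() - start, seen


write_seconds = insert_only(make_store(), 10_000)
mixed_seconds, verified = insert_then_read(make_store(), 10_000)
print(f"insert only:      {write_seconds:.3f}s")
print(f"insert then read: {mixed_seconds:.3f}s ({verified} verified)")
```

Whether the first number or the second is the "real" one depends entirely on what the application does, which is the point: the person running the script is usually the one who knows.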

That being said, I'm now building a start-up and using RethinkDB because it is the best system out there! My performance requirements for a start-up are significantly different than the billion dollar survey company I used to work for ;) But I'd also like to commend you on the performance you've achieved with RethinkDB in general, it has come a long way and any company would be happy to hit the scale that RethinkDB has issues with!


Very nice to see people keep using RethinkDB for new products. We have one in production. We are also starting 2 more new products built upon RethinkDB: a chatroom and a realtime geolocation/map-based app. Please join the RethinkDB Slack! We are building the community back up.


There are bigger market forces that aren't being considered here, namely that the margins of the entire database-as-a-service business have been driven down to nearly zero by AWS. Why do you think Firebase had to sell out to Google? Why did Parse shut down? Simply because the big cloud vendors figured out that providing PaaS services at nearly the cost of the infrastructure will win them even more compute workloads, which is the real money maker (especially if you are reducing your compute pricing slower than Moore's law).


Right on the money, so to speak.

He says that the market was terrible because open source developer tools suffer from oversupply. I don't think that's what happened. Rather they chose to make a product for which the next best alternative (e.g. Mongo or Cassandra or MySQL or Postgresql) had almost zero cost. It doesn't matter how oversupplied the market is if someone is willing to give away a viable product for free. Perhaps "oversupply" is another way to say "someone was willing to sell at zero cost", I'm not sure.

Unless you can find a set of customers who see significant value in your product vs. the alternative zero-cost product, you're never going to make money.


> especially if you are reducing your compute pricing slower than Moore's law

I agree with most of your comment, but I take issue with that last part. Moore's law has no relation to the price of computing. It only makes predictions on the number of transistors in a CPU, not the performance of the CPU nor the price/performance ratio of the CPU. Even if it did, compute costs much more than the CPUs; memory, disk, power, cooling, networking, enclosures, and more all cost a significant amount relative to the silicon.

I know Google Cloud declared they were going to decrease prices at pace with Moore's law, but even if they can deliver on that it will be through innovations outside of the CPU.
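A back-of-the-envelope version of that argument, in Python. It grants, purely for the sake of argument, that the silicon part of the bill halves every two years (which Moore's law itself does not promise), and it holds everything else flat. The 30% silicon share is a made-up number, not a figure from any cloud provider.

```python
def relative_cost(years, silicon_share, halving_years=2.0):
    """Cost relative to today if only the silicon share gets cheaper.

    silicon_share is the fraction of today's cost that tracks transistor
    density; the rest (power, cooling, networking, ...) is held flat.
    """
    silicon = silicon_share * 0.5 ** (years / halving_years)
    everything_else = 1.0 - silicon_share
    return silicon + everything_else


# Hypothetical split: 30% of the cost of compute is silicon.
after_four_years = relative_cost(4, silicon_share=0.30)
pure_moore = relative_cost(4, silicon_share=1.0)
```

With these made-up numbers the total only falls to about 78% of today's cost after four years, versus 25% if the whole bill tracked the silicon, so the rest of the price drop has to come from somewhere other than the CPU.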


Actually, I think learning to understand how these (complex) market forces come into play is exactly why Slava added the recommendation to religiously read The Economist at the very end of the article for maximum effect.


Some of us are keeping the RethinkDB database and community alive. The code is open source, and development continues slowly. The next release is almost ready. It'll have new features and performance improvements.


Thank you very much @atnnn for continuing to keep RethinkDB alive.


I am using RethinkDB in PartsBox (https://partsbox.io/). I initially picked it because I wanted a JSON document database that has a correctly implemented distributed story. But later on it turned out that the realtime-push (changefeeds) functionality is actually fundamental for me. I used it to build an app where every open session gets instant updates.

Right now I don't know what other database I could use, because I found no other solution that supports this kind of functionality, especially in a distributed setting.

I agree with most of your analysis and I think that the most important part is about "worse is better": many of your potential customers won't understand or won't care about correctness or consistency.

That said, I really like RethinkDB (oh, it does have its warts, but overall it's great), I'm very thankful for it, and I hope it will continue as an open-source project.


I could be wrong but the only other database I've been able to find, with the ability to subscribe to events, is GunDB (http://gun.js.org/).

The only other option is to put something in front of a DB like FeathersJS, or deepstream.io

RethinkDB is a breath of fresh air and I really hope the community will be able to keep it alive.


Agreed, there is still a market for that imo. CouchDB has it, but it's so slow :( Right now if you want a good DB and a pubsub/notification/streaming capability, that means either increased latency (polling the DB in your application layer and handling the notifications there) or using another dedicated system (Kafka, Redis, etc) with all the added complexity that introduces :(
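A minimal sketch of the polling approach described above, assuming a hypothetical `events` table with an auto-incrementing `id`. It uses SQLite from the Python standard library only to stay self-contained; none of these names come from the thread. Each poll asks for rows newer than the last one seen, so notification latency can never be better than the polling interval.

```python
import sqlite3

def poll_new_rows(conn, last_seen_id):
    """Return (rows, new_last_seen_id) for rows inserted after last_seen_id."""
    rows = conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id",
        (last_seen_id,),
    ).fetchall()
    if rows:
        last_seen_id = rows[-1][0]
    return rows, last_seen_id

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)"
)

last_seen = 0
conn.execute("INSERT INTO events (payload) VALUES ('part added')")
conn.execute("INSERT INTO events (payload) VALUES ('stock updated')")

# The application layer would call this on a timer and fan out notifications.
first_batch, last_seen = poll_new_rows(conn, last_seen)
second_batch, last_seen = poll_new_rows(conn, last_seen)  # nothing new yet
```

A push-based feed removes the timer entirely, which is the difference being discussed here.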


It is a very different model, but have you tried Realm?

It is more of an object model than a straight up JSON store, but it has some of the best notification features I have tried. You can observe both object and query results in realtime, and since the dataset is replicated to you it has zero-latency local access.

They used to only support mobile, but now they also have a node.js version for server side use.

In my opinion Realm's observability is light years ahead of all the other offerings out there.


Yes, it looks really good — but it's a different kind of database (I think). From a quick glance, it's an embedded object database used mostly client-side, while what I need is a distributed server-side database.


> Right now I don't know what other database I could use, because I found no other solution that supports this kind of functionality, especially in a distributed setting.

Have you looked at Datomic? It's not free but I think it provides the functionality you are looking for.


Yes. In fact it would fit my app very well, because it's also written in Clojure, and I'm even using datascript on the frontend side.

But "$5000 per Year per System" is outside of my price range for the foreseeable future.


Same here, I cannot imagine what else I would use instead of RethinkDB + Horizon.


Firebase is also a realtime JSON database. I am quite satisfied with it so far.


I do find it truly sad how anti-intellectualism is pervading our industry and causing good products to die. Nowadays, a significant number of the hottest new technologies offer literally nothing of value beyond "low learning curve". It's my current belief that tech culture has lost nearly everything that once made us great.

Think about some examples.

MongoDB is nothing but the object databases of the 80's, rebranded. Its main selling point is "you don't need to learn SQL". Object databases failed and people at the time were willing to learn something new (SQL). Nowadays, who will do that?

Go's entire selling point is "faster than python, no need to learn anything new". In terms of language features it's again living in the 80's - Java got generics in 2004 and Java was way behind the times. In terms of speed it's worse than Java and Rust, maybe comparable to Haskell.

And NodeJS is nothing but "write server code, no need to learn a new language". Isomorphism simply didn't work. People think Node is fast because it can do async, but Java has been able to do threaded async for 15 years.

I wonder how much the industry could advance if we could simply persuade every developer to learn Java Postgres.


Intellectual is the wrong target, just as much as easy. Humans have a limited capacity to understand complex systems, and once they grow to a certain point, your ability to integrate and innovate on them is limited due to the overhead of the layers of incidental complexity. Easy is just shoving complexity under the rug. Simplicity, and building systems meant for our meager human intelligences, is better.

I am not terribly fond of Go, but it has a decent answer in auto-implemented interfaces and duck typing. It's not about dumbing down, it's just different, and easier on the compiler, which is a significant advantage.

And as for async in Java... ugh, how awful. We finally got something like promises and lambdas, but most APIs are still trapped in synchronous calls and using the JPA is a disaster of hidden thread-local state. What kind of stack are you using to make Java async work? Maybe I'm just missing something.


The thing is, the modern systems are often less complex. SQL databases are generally far simpler than MongoDB - you just need to learn relational algebra rather than blindly shoveling whatever JSON you got into the DB.

As for Java + async + threads, I use Akka and it's basically seamless. But Netty + core Java Executors is fine too, just more verbose.


> you just need to learn relational algebra rather than blindly shoveling whatever JSON you got into the DB.

No

SQL per se is not hard.

But the whole data modeling, creating tables, picking types is just needless bureaucracy.

Yes, I want to save a JSON snippet and be done with it. I don't want to create another table because I have a 1-N relationship. And while ORMs make this easier, it is still not painless.


Schema-less data always sounds good in the beginning until you actually try to use that data. Then you end up encoding all the validation and data munging logic into whatever process reads from it.

An internally-schema'd database just recognizes that this is absurd and lets you put the bottom three layers of validation logic inside the database, where you only have to do it once.
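To make "validation logic inside the database" concrete, here is a small sketch using SQLite from the Python standard library. The `parts` table and its constraints are assumptions for illustration, not anything from the thread. The point is that the rules are declared once, in the schema, and every writer gets them for free.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the constraints are the validation that lives in the database.
conn.execute(
    """
    CREATE TABLE parts (
        name     TEXT    NOT NULL,
        quantity INTEGER NOT NULL CHECK (quantity >= 0)
    )
    """
)

def try_insert(name, quantity):
    """Return True if the database accepted the row, False if it rejected it."""
    try:
        conn.execute(
            "INSERT INTO parts (name, quantity) VALUES (?, ?)", (name, quantity)
        )
        return True
    except sqlite3.IntegrityError:
        # Raised for both NOT NULL and CHECK constraint violations.
        return False

accepted = try_insert("resistor", 100)
rejected_negative = try_insert("capacitor", -5)
rejected_missing = try_insert(None, 1)
row_count = conn.execute("SELECT COUNT(*) FROM parts").fetchone()[0]
```

Readers of the table can then rely on `quantity` being present and non-negative without re-checking it in every consumer.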


> Then you end up encoding all the validation and data munging logic into whatever process reads from it.

Do you think that doesn't happen in regular DBs or that schemas don't evolve?

Love for static typing is basically Stockholm syndrome.

Your system will eventually end up with a fixed schema, but until you get to that place you don't need to "stop the world" to change it. "But PostgreSQL can add columns without downtime": yes it can, but what about your system?


> Love for static typing is basically Stockholm syndrome

Funny, that's sort of how I feel about my (long) time "loving" dynamic typing. Now I find it is such a relief to validate data once on the way in and completely trust it to be what it says it is thereafter. I found dynamic types required more bureaucracy (if I wanted to sleep at night) in the form of unit tests and precondition checks everywhere.

YMMV I suppose.


> I found dynamic types required more bureaucracy (if I wanted to sleep at night) in the form of unit tests and precondition checks everywhere.

This merits clarification

Most of programming in dynamic typing should be implicitly statically typed. Know what you're passing. Obey the implicit (duck-typed) interfaces

Basically, don't just check "if it's a list do that, if it's one element do something else, if it's a number do another thing" - this is a beginner's mistake (and I did those)

Besides that, unit tests are a good idea, and if you want "compile-time checking" in Python, use Pylint (it's a good idea to use it regardless of your situation).

Though it's not so much static typing that sucks, it's its implementation in the Java/C++ way. Go and Rust make it better.


> Most of programming in dynamic typing should be implicitly statically typed. Know what you're passing. Obey the implicit (duck-typed) interfaces

Yeah, I thought that too. After years of persistently seeing stupid (sometimes critical) bugs due to assuming the correct thing was being passed, I concluded that 1. the only sane thing to do is add explicit precondition checks that raise meaningful exceptions when assumptions are broken, and make sure those paths are exercised by unit tests, and that 2. I would much rather have that done statically and automatically by a compiler.

> Basically, don't just check "if it's a list do that, if it's one element do something else, if it's a number do another thing" - this is a beginner's mistake (and I did those)

I think this may be a misunderstanding of what I meant by "precondition checks" - I meant explicitly checking the assumptions being made by a method, most of which are type checks (foo.kind_of?) or (more commonly) interface checks (foo.respond_to?).
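For readers who don't write Ruby, a rough Python analogue of those two kinds of precondition check might look like the sketch below. The `total_length` function is made up for illustration; `isinstance` plays the role of `kind_of?` and `hasattr` the role of `respond_to?`.

```python
def total_length(items):
    """Sum len() of each item, failing early with a clear message if assumptions break."""
    # Type check, roughly what `foo.kind_of?` does in Ruby.
    if not isinstance(items, (list, tuple)):
        raise TypeError(
            f"items must be a list or tuple, got {type(items).__name__}"
        )
    total = 0
    for index, item in enumerate(items):
        # Interface check, roughly what `foo.respond_to?` does in Ruby.
        if not hasattr(item, "__len__"):
            raise TypeError(f"items[{index}] does not support len()")
        total += len(item)
    return total
```

Calling `total_length(["ab", [1, 2, 3]])` passes both checks, while `total_length("abc")` or `total_length([1])` fails immediately with a `TypeError` that names the broken assumption, instead of failing somewhere deeper in the call stack.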

> Though it's not so much static typing that sucks, it's its implementation in the Java/C++ way. Go and Rust make it better.

I don't have much problem with Java's implementation of static typing at all, and I don't think C++'s problem is its static typing implementation. I find Go's implementation a bit inflexible and hard to work with though.

Again, YMMV!


Akka doesn't solve missing asynchronous libraries. You still have to wrap JDBC requests in a threadpool, and are limited in how many database requests you can serve by pool size.
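The same constraint shows up in any runtime where a blocking driver is wrapped in a thread pool. Here is a sketch in Python rather than Java, with `time.sleep` standing in for a synchronous database call; `blocking_query` and `POOL_SIZE` are assumptions for illustration, not JDBC or Akka APIs. With a pool of 2 and 4 queries of 0.1 s each, the work has to run in two rounds.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 2  # assumed pool size, for illustration

def blocking_query(n):
    """Stand-in for a synchronous driver call (sleeps instead of hitting a database)."""
    time.sleep(0.1)
    return n * 2

async def run_queries(count):
    """Run `count` blocking queries through a bounded thread pool from async code."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        futures = [
            loop.run_in_executor(pool, blocking_query, n) for n in range(count)
        ]
        return await asyncio.gather(*futures)

start = time.monotonic()
results = asyncio.run(run_queries(4))
elapsed = time.monotonic() - start  # bounded below by (4 / POOL_SIZE) * 0.1 seconds
```

The calling code looks asynchronous, but concurrency is still capped by the number of worker threads, which is the limitation being pointed out.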


I'm going to fundamentally disagree with at least the target of Go. Go introduced a series of concurrency primitives with syntax that's made writing and reasoning about concurrent code a joy, something that's not the case in Java/C++/C/Python (maybe up till the late 3.X in the case of Python), and it did that with a language paradigm (interface driven) that's vastly different from Python's mixture of OOP and duck-typing.

And sorry, but "I wonder how much the industry could advance if we could simply persuade every developer to learn Java Postgres."?

Plenty of other people would say "I wonder how far along the industry would be if you people would actually learn how to manage your memory and write C".

Plenty of other people would say "I wonder how far the industry would have progressed if we let go of the harmful and outdated concept of imperative programming and embraced functional paradigms."

... and on it goes.

-----------------

Additionally, you say "I do find it truly sad how anti-intellectualism is pervading our industry" only to end your post with "why can't we abandon all our experimenting and learning and just stick to good old Java which obviously would solve all our problems", which is itself MASSIVELY anti-intellectual.


Go didn't "introduce" CSP or actor-like primitives. Scala had actors in 2006, and I think Clojure has had them for a similar amount of time as well. There have been libraries for it in Java and C++ also since the early 2000's and probably earlier.

> Plenty of other people would say "I wonder how far along the industry would be if you people would actually learn how to manage your memory and write C".

I don't see the industry gravitating towards bad re-implementations of C with less functionality and a flatter learning curve. The only "replacement" for C that I'm aware of is Rust, and Rust is most definitely not anti-intellectual in the same way I criticize Go.

Rust actually requires the developer to learn new concepts (borrow checker) which are not found in C, Python, Java, etc.

> "why can't we abandon all our experimenting and learning and just stick to good old Java which obviously would solve all our problems", which is itself MASSIVELY anti-intellectual.

I didn't say to end experiments. I said everyone should actually learn some fundamentals before jumping on low-learning-curve bandwagons which offer nothing new.


> Rust actually requires the developer to learn new concepts (borrow checker)

Well, they do not want a GC for their language, hence the borrow checker, while Go already has an excellent GC.

As opposed to learning new things, we need useful new things which couldn't be done before the fancy new language.


So what can you do in Go that you couldn't do (without a lot more difficulty) in Java or Haskell?


> "why can't we abandon all our experimenting and learning and just stick to good old Java which obviously would solve all our problems", which is itself MASSIVELY anti-intellectual.

That's a mis-characterization. There is a difference between experimenting and learning and pushing all sorts of questionable fashion trends into production.


While true, the "pushing" being done is very small compared to the "development" of said technologies. The exact words of the parent were:

> I wonder how much the industry could advance if we could simply persuade every developer to learn Java Postgres.

And that is preceded by what I can only describe as a bunch of undue mischaracterizations of several languages and technologies, all of which definitely seems to imply that "yeah, all that other stuff is just a bunch of fads, why don't we stick to the real deal and all write Java."

Now, I don't want to be unkind to Java here. It's true that Java is an incredibly powerful language and there are a wide variety of technologies and tools that are absolutely incredible that you can only get via the Java ecosystem. But that can be said of many languages, e.g. C++/C.

I think what the GP may have meant, and which I do agree with, is that our industry would progress if we as developers spent a bit more time learning about the context and history of current and past solutions. We'd probably re-invent the wheel a bit less often.

But I don't want to limit that recommendation to "learn Java Postgres". I think Java developers might benefit from examining the benefits of a language like Haskell, Python programmers may benefit from a language with static typing, such as C++/Rust/Go, and Unix lovers might learn a lot from examining the Windows NT kernel APIs, and so on. If we examine the things that aren't close to us, the things with which we're the least familiar, that's when I believe the most fruitful learning can occur.


I took the GP in a very figurative way as well. A lot of the learning being done at the moment, though, is simply framework churn; there are developers out there familiar with a million web and client side frameworks who would have been much better served learning SQL. Our apps would be better if we focused more on learning about usability and our users than on CSS compilers. As an industry I think we're underserving everyone right now: people have machines a thousand times more powerful than a couple of decades ago, but what they can do with them is almost identical.


I just want to nitpick about Node a little bit--as a Java developer and Postgres user myself--because there is more to Node than just async JavaScript and I think it gets a bad rap a little bit because of that. Async is one good quality of Node, but it also has this really great debugging integration with Chrome DevTools. And even really huge Node programs start fast--usually in less than a second--so you get this really tight development feedback loop. So you have this overall pleasant experience of a great module system where mostly everything out there is already going to be async (less likely in Java), a REPL, a fantastic debugger and profiler via the Chrome DevTools, and extremely quick process restarts. I think for some people these qualities outweigh the lack of static typing and JavaScript language quirks (a lot of which have been ironed out in ES6 and ES7 anyway).


Don't forget the proliferation of Electron so no one has to learn a new language and desktop toolsets.

Just for kicks I wrote a desktop todo list in GTK and C, tools I never had extensive knowledge of and hadn't touched recently. It was a breath of fresh air how simple it was compared to doing the same in modern frameworks. A single ~30 line function to create the GUI and the rest held together with a couple of function pointers. It was faster (to write and run) and simpler than anything I'd written in years. More responsive than the most responsive web framework too.


But is it cross-platform (Win32, Windows UWP, OS X, multiple linuxes, iOS, Android), and does most of that same code work on the web?

It's okay to bemoan Electron for being bloated but multiple balls have been dropped by OS and platform makers, GUI toolkit makers, language runtime makers, and that giant amount of missing glue in between, to get to where we are.

It's rather sad that Electron, out of all things, is the first thing to fulfill the holy grail of write-once, run anywhere, in a way that's good enough and palatable to the plurality of users' and devs' satisfaction.


Electron certainly isn't the first thing to give "write once, run anywhere", let alone good enough to satisfy users. I groan every time I download an app only to discover it's Electron based.

Java beat Electron to write-once-run-anywhere by a couple of decades, and still does in the sense that Electron requires you to make platform-specific builds of your app even if the core code is the same (Java gives you that option but doesn't require it, you can still distribute a jar or web start file).

Meanwhile, Electron has managed the feat of being even bloatier than Java is. The DOM was never designed for GUIs and the terrible performance of web and Electron based apps is a testament to that. At least Java apps tend to have meaningful menu bars and context menus.

Electron satisfies web devs who have either never written desktop apps in other frameworks, or did decades ago and think nothing improved there. They "satisfy" users in the sense that users are rarely offered any alternative so have to suck it up and tend to judge web apps against each other vs a well coded, tightly written desktop app (office suites being a notable exception).


> Java beat Electron to write-once-run-anywhere

No, not on the web.


You can run Electron apps on the web? How do you deal with things like file system access?


There is local storage. There is the server side.


Yes, I used GTK2 which is cross platform (GTK3 isn't). Would work fine on Win32 (the only Windows worth supporting), OS X and multiple linuxes. Won't work on Android or iOS, but IME that usually sacrifices the desktop app anyway. This case would have been fine though if it were possible, the app was responsive enough to work on a Nokia 6110.

The biggest takeaway though is how simple it was to build. If all platforms were that simple then having separate UI code for each platform really wouldn't be that big a deal.

At the moment I'm somewhat hopeful the libui project (https://github.com/andlabs/libui) will be successful though. It's learned the lessons from previous attempts.


Did you blog/gist that example, by any chance?


I half wrote a blog post/angry rant over Christmas. I'd post the draft/source here, but my C is very rusty and I want to at least check for memory leaks first (no valgrind on cygwin) ;)


> Go's entire selling point is "faster than python, no need to learn anything new".

To me, Go's selling points have been numerous:

- Faster than python

- Statically typed/no runtime bundling requirements

- Easy access to concurrency/multicore scaling

- Excellent tooling and integration with programming text editors

- Great documentation

- Excellent standard library which you can treat as a continuation of your own program

- Great design with pragmatic choices (this one is subjective, I agree)

In short, there's nothing else like that out there. Without Go, I'd have been stuck with JVM, Haskell or Erlang for most of my projects. And I dislike all of them, so Go has been a godsend.

Incidentally, I also happen to enjoy working with Postgres and prefer it to any other database I've worked with (SQL or not).


You haven't laid out any concrete selling points for Go over Haskell or Java beyond "I dislike all of them".

Can you come up with any intellectual rather than emotional reason for preferring Go?


I can open a standard library definition and quickly understand what it does. If I try to do that with Scala or Java, my eyes just glaze over. With Go I often don't notice when I venture outside of my own program and into 3rd party library code.


It sounds like you just didn't put in the work to learn Haskell. The docs are fantastic, but they do assume you are willing to get over the learning curve.

I don't think you are really disagreeing with me - Go does nothing new, it just has a lower learning curve.


That's the problem, there's no getting over the learning curve when it comes to languages like Scala or Haskell. While you do get more proficient with time, you never get to the state where the language gets out of the way and becomes your friend instead of something you have to wrestle most of the time to enjoy the benefits it offers (which are also offered by Go, BTW).

I was disagreeing with your original statement, Go is not just a faster python.


> That's the problem, there's no getting over the learning curve when it comes to languages like Scala or Haskell

For you, maybe. For plenty of others, that's clearly not true.

> While you do get more proficient with time, you never get to the state where the language gets out of the way and becomes your friend instead of something you have to wrestle most of the time to enjoy the benefits it offers (which are also offered by Go, BTW)

Go offers weaker type-safety benefits than Java (with the benefit of less type-safety ceremony), much less Scala or Haskell (against which it has less ceremony benefit, because Haskell and Scala have reduced ceremony with stronger type systems).

Go doesn't offer the benefits Haskell or Scala (or Rust) do. OTOH, some people find it more intuitive, and it does offer different benefits which may be more relevant to some use cases. The sweet spot for Go seems to be the place where, before Go, you might use Python but be upset about performance and might use C but be upset about boilerplate; Go improves over the weaknesses of either on that border, while preserving most (but not necessarily all) of their strengths.

> Go isn't just a faster Python.

Faster (and more parallel) Python, less obtrusive C, lower ceremony and more native Java -- it's not just any of these, but they sort of capture its primary strengths.


> That's the problem, there's no getting over the learning curve when it comes to languages like Scala or Haskell.

How did you come to this conclusion? Based on your own experiences? If so, why do you find it reasonable to assert your anecdotal truth generally?

> enjoy the benefits it offers (which are also offered by Go, BTW).

Are you seriously saying Go offers the entirety of features that Haskell offers? So what, you think people are just using Haskell for no real reason?


Thanks for providing an anecdote which 100% supports the claim I made: that all Go provides is an inferior language that appeals to people who don't wish to learn new things.


I agree for the most part, but what is important is productivity.

The Java ecosystem is just toxic waste: full of abandoned projects, over-engineering of everything, builds that crash because everyone uses arbitrary versioning schemes, code ring-fenced with licenses/patents, and developers with a C++ mindset.

Node.js is maybe built as hack on top of hack, with some ugly code, a lack of any abstraction, a lack of a sensible std library and debugging capabilities, minimal editor support and, finally, a package manager that cannot be trusted.

But I can build a CRUD API working with any database in under 1h in Node.js, whereas in Java I need to start by learning all the Maven quirks, write pointless JSON mappers and learn the stupid annotations that every library introduces.


> Go's entire selling point is "faster than python, no need to learn anything new". In terms of language features it's again living in the 80's - Java got generics in 2004 and Java was way behind the times. In terms of speed it's worse than Java and Rust, maybe comparable to Haskell.

This is not true. Go is generally slightly faster than Java (while using less memory) and slightly slower than Rust. Its garbage collector is slower than Java's default collector but has better latency; Java's GC can be tuned, though, and Go gives you better tools to bypass the GC partially or entirely.


It's not anti-intellectualism -- it's complete and utter lack of knowledge transfer, coupled with young people's spirit being dampened by established practices that are 50% reasonable, 50% bullshit.

This, piled on with subtle ageism, ensures that any useful lessons learned will only be learned by the time it's too late, and there's a new group of can-do people who feel stifled by the combined wisdom of the ages, so they break off and do their own thing.

I wrote about this here [1] with nicer language.

[1] https://news.ycombinator.com/item?id=13022926


Interesting idea, but perhaps it's also possible to look at it as the industry broadening. 40 years ago, you had to be pretty knowledgeable to get anything at all done with computers. These days, it's possible for more people to do things, and yet there's still a market for people who are deeply knowledgeable.

A lot of people don't care much about the tech, and just want to build something. If easier (for them) tech lets them, that's a win, no?

Of course... that's easy to say but harder to swallow when they end up building something, making some money, and then hiring a developer to clean up their hideous mess. I've been there too.


NoSQL's only selling point is you don't need to learn SQL? Hardly true.

Major Oracle shops aren't dumping them for Mongo/Couchbase/... because they're trying to avoid SQL. They do it for scalability, availability, performance, flexibility and more.

I don't see that the facts in the real world back you up here.

Not claiming NoSQL is a panacea. You're definitely making some trade-offs, and in many cases it's just the wrong tech. But it's absurd to say it's happening to avoid a learning curve.


I didn't say that was NoSQL's only selling point. I said that was MongoDB's only selling point.


> Isomorphism simply didn't work.

Care to elaborate? I'd really like to know your opinion as this is an important aspect of my developer-life.


Interesting perspective.

The article talks a lot about big companies that make lots of money, but as a user of databases I always saw the competition to RethinkDB being Postgres, MySQL and to a lesser extent ArangoDB and MongoDB.

The core problem for Rethink being that, well, I can get as much database as I want for free. I explored many/most of the new/nosql databases and in the end always found that Postgres was the best solution - you just shouldn't bet against Postgres it seems.

I really liked RethinkDB and thought it was well implemented but it just wasn't needed.

I do wonder if ArangoDB is going to go the same way.


Exactly this. RethinkDB's value proposition wasn't really that attractive compared to Postgres.

Replication and a nice UI, sure, those are nice. But Postgres has fantastic single-host performance, and Rethink is not yet in a position to compete there, except in some horizontally scaled use cases. Rethink doesn't even have transactions, making it potentially useless for some things that require strict atomicity over a complicated data model. In general, Postgres has such a richness of features that it's hard to compete with it; PostGIS alone is a "killer app" if you do any kind of GIS stuff.

For me, there weren't any such "killer app" features. The change streaming is nice, but it's not difficult to do that with Postgres either. That's it really: Every solution offered by Rethink could be countered with "that's nice, but trivial with Postgres".

The one exception being horizontal replication. I have yet to launch a project where Postgres couldn't scale out well enough, and I suspect that if I got to that point, I would reach for bigger/different guns such as Cassandra. For our apps we typically use Elasticsearch in front as a kind of read-only mirror of Postgres; the latter being the slow, serializable path, and the former the fast, slightly-out-of-sync path. Users don't know the difference.


Firstly, to the OP, that is an excellent post-mortem. Introspective, goes deep, generally confidence inspiring even though a part of it is literally "we were incompetent and didn't know it", well, lessons learned.

I'd like to add some colour, although I've never used RethinkDB so take it all with a large pile of salt.

lobster_johnson says:

> Rethink doesn't even have transactions

The post-mortem dwells heavily on how RethinkDB focused on correctness and was beaten by MongoDB which is, to put it mildly, not concerned about correctness.

But Rethink is a JSON DB that apparently doesn't have transactions and which on its web page says "Query JSON documents with Python, Ruby, Node.js or dozens of other languages".

Here are three technologies I do not strongly associate with correctness:

* JSON

* Schemaless databases

* Scripting languages

Rethink's own FAQ says "RethinkDB is not a good choice if you need full ACID support or strong schema enforcement".

So in the database market there are of course lots of users who do strongly value correctness, but they tend to use languages and technologies that are statically typed (Java/C#) and database engines that are ACID and provide schemas.

Very few of them want a database that doesn't provide ACID, doesn't provide schema enforcement and which appears to either not support standard industrial languages at all, or is at least distinctly uninterested in them.

In other words Rethink targeted the hipster startup market of companies building social networks, ride-hailing apps and other things where correctness is not important and low-correctness tech dominates, but tried to sell a product that was oriented around correctness. This seems like a market/product mismatch.


That's a good point. Competing with MongoDB is difficult because "correctness" alone is not a selling point. It's the "worse is better" phenomenon as a marketing challenge.

That said, I don't think lack of correctness itself is a design goal. I think it's accidental, or at least an intermediate state before we have something better.

Case in point: I'm currently working on an open-source (it will be released soon) data store that uses JSON to express "documents" of data, but also has schemas, joins and transactions. It uses PostgreSQL internally, but completely hides the underlying storage. The developers who are currently using it for actual work are pleasantly surprised about how nice it is to work with: You get to work with rich, structured data in its natural shape without needing some kind of ORM to mediate between an app's data model and the database's, and at the same time it's very strictly validated without being pushy (it has gradual typing: you can start out schemaless and then tighten the schema). When we designed it, we wanted to build something that reflected how people actually use a database, and I think we're achieving that. (I'm looking forward to making it public and sharing it on HN.)
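The "start out schemaless and then tighten the schema" idea can be sketched in a few lines. This is not the data store described above (nothing about its API is known from this comment); `validate` and the three example schemas are hypothetical. Only fields the schema names are checked, so an empty schema accepts everything and each added entry tightens it.

```python
def validate(doc, schema):
    """Check only the fields the schema mentions; unknown fields pass through untouched."""
    errors = []
    for field, expected_type in schema.items():
        if field not in doc:
            errors.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}")
    return errors

doc = {"title": "Postmortem", "points": "948"}

schemaless = {}                         # stage 1: accept anything
partial = {"title": str}                # stage 2: constrain one field
strict = {"title": str, "points": int}  # stage 3: tighten further

errors_by_stage = [validate(doc, s) for s in (schemaless, partial, strict)]
```

The same document passes the first two stages and is only rejected once `points` is declared to be an integer, which is roughly what gradual tightening buys you.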

So it's certainly possible to have the best of both worlds. It's just that the direction that the world moved in with "NoSQL" temporarily moved us off course for a little while. It started out as a natural reaction to scaling issues, and in that sense I think it might have been a necessary step while people figured out what they were doing. We're seeing the same evolution in the ways of people's thinking with regard to dynamically/statically typed languages, too.


As someone who has worked at a bootstrapped B2B FLOSS company in the ERP/database domain I think taking/relying on venture capital is a dangerous model in that domain. I feel like the best options are:

1) Don't build a company around your FLOSS project, accept it'll be free in all senses forever

2) Start with a customer that will finance you and build slowly (focus on profit not revenue)

That's obviously not very helpful, as finding that one customer is the really hard part, so it's usually a hybrid of expecting what you build to be free forever and then finding that one customer. Quote/Charge a lot. Nope, more than that. Now double the price and you're roughly there.

On the plus side you'll get very enthusiastic developers that tend to be intrinsically motivated which is a huge plus once you can build that company. I'd argue that this is one of the key competitive advantages of a "FLOSS company"

When it comes to new customers...Always be aware that you're not selling your cool tech but rather selling the solution to a specific problem (basically you're selling pain-easing medicine). I feel like it is beneficial to visualize a somewhat mean guy at the other end of the table who is only thinking this one line over and over "enough with the nerd speak, how does it help me and does it roughly cost what I expect it to cost". Yes we like to imagine that we can impress people with nifty tech details and shudder that the competition might be "that horrible Oracle database every developer hates". Hand in hand with this...charge a lot. The price serves as a signal. Just assume that evil guy you're talking to only cares about getting back to that golf course and won't even look at the number that closely as long as it's fitting his expectation. Yes that means charging too little can kill your deal.


Out of all the databases I tried throughout the years RethinkDB was the best. His bit about "worse is better" nails it I think. It was an elegant product targeted at the wrong group of people. ReQL alone was a masterful piece of design and it was great fun composing queries in Ruby instead of some weird JSON frankenstein DSL. At one point I was pushing 500k events per hour through a single node on a c4.xlarge instance with plenty of room to grow. I'm sure other databases would have handled the load just fine as well but I didn't really want to find out because RethinkDB was so much fun to play with. I was very sad when I heard they were shutting down.


The product still lives and development is continuing slowly. There are almost daily commits. We need more contributors! @atnnn (ex-RethinkDB developer) and the community are working on it. https://rethinkdb.slack.com/messages/open-rethinkdb


Nice reading and Kudos to the entire RethinkDB team for what they have done, especially the evangelization of the Reactive Model in the database. This inspired other vendors like OrientDB to do the same.

Running a company where a large part of the users are developers is very hard. The secret sauce is providing a good product and creating a business where some of the users would pay to have something more, like support and/or an Enterprise edition.

The truth is, AFAIK, no NoSQL company backed by VC is profitable today. Not even MongoDB, which has raised more than $300M and collects just $60M/year while spending much more to stay up and running.

Disclaimer: I'm the author of OrientDB.


> The secret sauce is providing a good product and creating a business where some of the users would pay to have something more, like support and/or an Enterprise edition.

I was wondering about this, as it's not explicit in the article: What is the business model that makes money for Docker and MongoDB? From MongoDB's website I gather they have some "Enterprise" things, but they want me to give them my personal data just to access a "datasheet" describing this. Docker's "Enterprise" offering seems to be a mix of support and hosting.

So is that it? Support, hosting, and donations from Big Business?

The article also says: "Thousands of people used RethinkDB, often in business contexts, but most were willing to pay less for the lifetime of usage than the price of a single Starbucks coffee", but I don't understand what those users would have paid for. What was the product being sold? All I can gather from the article is some cloudy hosty database-as-a-service thing that might have made money but never shipped.


Support. Notwithstanding more widely accepted benefits of support like a direct line to the product's experts and, in some cases, developers, big orgs are political tinderboxes and you're always one bad downtime away from an internal catastrophe, let alone an outage that affects customers. It's therefore often politically wise for decision-makers to purchase support even in cases where the risk analysis might show that internal talent could resolve or work around most issues.

This is a variation of the "no one ever got fired for buying IBM" trope, and is in fact a big moneymaker for the likes of IBM, Oracle, and Microsoft, even in situations where the standard notions of vendor lock-in may not even apply.


I don't get this: "Read The Economist religiously. It will make you better faster". Could anybody explain?


It's a bookend to the recurring point that building a business requires a good understanding of markets and how companies fight for their place in them.

The Economist covers a lot of macro-economic news and frames the world in terms of market forces.

I think Slava is saying the type of biz-savvy "personal development" he experienced (belatedly) could have been accelerated by more of this perspective.


It should come as a no-brainer: the backbone of any business is 'supply and demand'. While that may sound simple, it is becoming increasingly complex, and understanding the new world economics helps to grow the business. The Economist seems to offer a lot of insight there.


I was working in a company where we built a new backend system and had the opportunity to choose the technologies we'd use. I was constantly looking at rethinkdb, waiting for them to tell me "Yes, you can use this now." However, their blog posts and announcements constantly had the underlying message of being in some kind of alpha or beta state.

Admittedly, that was a few years back, so that could have been true then, but the feeling stuck with me and I just stopped taking it seriously.

The problem is that they got a lot of attention early on and expectations were huge and instead of building some hype around it, they were actually too honest. In hindsight I, as an engineer, would have probably done the same, so I think there are some valuable lessons there about marketing and human psychology. At least that part the people at MongoDB did brilliantly.


I'm surprised there are no comments here, this is the best thing I've seen on HN in a while.


I agree, let me take a stab at starting the conversation.

First of all, huge respect to Slava for this writeup. Having your startup fail is hard and it is not the time you want to blog about it. RethinkDB going broke was a sad thing to see for me, I can't imagine how it felt for him.

I think the analysis of the two root causes (hard market, focus on the wrong metrics) is accurate. It is very sad to see that in the end correctness doesn't win the day, not even for databases.

Since I run a startup too I can't help but apply his criteria to GitLab.

Good metrics to focus on:

1. Timely arrival => we try to ship great features every month on the 22nd, something that both our users https://twitter.com/PragTob/statuses/767777202045915136 and the parody account likely run by our competitors' employees agree on https://twitter.com/gitlabceohere/status/768440048802947073

2. Palpable speed => we're doing OK on self hosted, really bad on GitLab.com (fixes are in https://gitlab.com/gitlab-com/infrastructure/issues/947 ). If you look beyond latency to workflow, we're doing great in integrating various parts of the process https://about.gitlab.com/2016/11/14/idea-to-production/

3. A use case => we're making GitLab an integrated tool with a broad scope https://about.gitlab.com/direction/#scope A good way to develop software from idea to production.

I hope the above list doesn't come across as pretentious. I welcome pushback and the opportunity to talk more about the OP.

I think the OP's premise that it is hard to make money in the cloud is accurate. We're spending hundreds of thousands of dollars to make GitLab.com run and revenue takes a long time to grow. Selling on-premises software products has higher margins than a service.

BTW I appreciate the shoutout to GitLab as one of the five exceptions that are doing well in the open source developer market.


Thanks for your insights on this postmortem, Sytse. Love how you are leading GitLab.


I'll attempt to add to this. I might be stating things that are a bit controversial here.

One thing we are always transparent with users about is how our business model works. One of my favorite videos on the internet about this is [1]. We are very much based on closed source for making revenue. This has actually helped with our community a lot.

We are basically focused on closed source solutions for specific verticals. The only way to get clients to pay is to either make something more efficient or not open source/core it. For open source companies there is always a trade-off of community vs customers. You have to find a hard line and stick to it.

Community is great, but most of those people won't want to pay you. You can try to convert them, but that takes a lot of time and isn't the best path to market. We strive to provide a great baseline experience to our users, but there are a few things we have found in practice that help:

1. Features have customer names on them

2. There's a mass of community interest in a specific feature

You have to be careful about where time is spent. It's very easy to mistake "users" for "customers".

I wish Slava and co the best of luck and really appreciate their post mortem. One thing we've learned as an infra company is "be as close to the app as possible". In our case the bulk of what we focus on is banking/telco anomaly detection.

It's a straightforward revenue-making product that can benefit from deep learning. The great thing is we can grow into other use cases later on if we find something else interesting to work on.

[1]: https://www.youtube.com/watch?v=6h3RJhoqgK8


> 1. Features have customer names on them

I've seen this backfire at many commercial companies. Adding whatever features a company is willing to pay for will often pull a product in multiple, incompatible directions.


I'll give you that. In my case, a lot of what we do is look for numbers and how well it fits our main application area. You definitely have a valid point here though.

That actually goes hand in hand with "users look like customers". Also: "Not every customer is a good customer."


I'm still meditating on the advice given. Perhaps a moment of silence for a fallen comrade...


I'm the Series A investor in FoundationDB, met with RethinkDB really early on, and really appreciate your post.


Why not position yourself as the better mongodb? You explained pretty well in the post mortem (and to yourself in 2014) how much better rethinkdb is than mongo. All you had to do was aggressively market yourself as the better option by directly comparing yourself to it.

I remember when iPhone first launched, the first 30 min was spent just trashing then-current smartphones - the blackberrys.


I think he did mention this indirectly. Mongo performs better than Rethink on metrics that have nothing to do with what you truly want from a DB. He mentioned developers that would just whip up a script to throw thousands of writes at the database but never reads. Use cases that they would never see in real scenarios. First, you would need to educate your market that they need to care about the CAP theorem and what that is, that Rethink was damn good CP and that while Mongo might be fast, it isn't correct.


Why not call them out then? It should take only one or two blog articles to explain why mongodb benchmarks are bogus. I fail to believe it is hard to teach some database knowledge (CAP and others) to your potential customer when they are developers. Heck, it would even boost your SEO by a ton.


"All you had to do was aggressively market yourself as the better option by directly comparing yourself to it."

Lol. "It's not hard to win an Olympic medal in the 100m sprint. All you have to do is move your legs faster than those other guys!"


But they were and still are better than mongodb? They already achieved that?


Excellent reflection, but I think he forgot the 'elephant' in the room. Whenever I want to choose between NoSQL solutions, I always come back to PostgreSQL, which is just enough to fit my limited use cases.


As somebody squarely in your target audience (sr software engineer with major respect for databases done correctly and no respect for anything like mongo) let me give my perspective.

A) I have no idea what makes rethink db better than mysql off the top of my head (maybe stuff exists, but I don't know what it is. I also am risk-averse and don't really have any major complaints with mysql).

B) I understand there are a lot of database technologies out there (mysql is always improving, people speak highly of postgres, people keep mentioning this maria DB thing, etc) which leads to a decision fatigue. It's much in the same way that if I have to choose between learning 3 frameworks, I'll just use none and see which one dies first because I'm optimizing for not learning dying technologies.

So unfortunately it comes down to marketing, which is a disappointment to anybody who wants to believe the best products will market themselves and win out. Sorry, Betamax.


Nice article. I think he's underestimating the failure of the sales team. They needed to find the intersection of Oracle customers and customers that needed the unique OLTP features of their product. I would have tried to bring an ex-Oracle guy on, for example, someone who's sold into finance or healthcare.


Healthcare and finance firms run conservative IT departments and you'd be unlikely to get them as the first customer.


TBH I never heard about RethinkDB until they announced they were closing... it looks solid, but I never heard from them, probably because, from my POV, database choices get pushed by communities more than by "educated developers". For example, python devs like to push postgresql over anything else, nodejs devs like to push mongodb over anything else, php devs like to push mysql over anything else, aspnet/c#/paste-any-ms-language-on-win32 devs like to push sql server over anything else, and oracle was there before anything else (so pretty much everyone knows it). My point is that maybe RethinkDB needed a community niche that pushes the db rather than being the second choice for every community.


Man, I would have loved to pay for RethinkDB. Granted, the blame lies on me here, but I was disappointed to see the first point not acknowledge that RethinkDB didn't really ask for money. I mean, I don't know if they changed it after folding, but go to the site and tell me how to pay for it.

I would have paid a licensing fee to continue using it, but we've taken the time to switch over to Postgres now.

EDIT: Also, re: metrics of success, the examples listed as wrong there are _exactly_ why I enjoyed the product and was willing to invest time learning and implementing in a relatively new technology.


On the other hand, I understand why most developers wouldn't pay "even the price of a Starbucks coffee" for the product: because paying anything as a developer is hard. It requires making the case for why you need it to other people in the company. Developers aren't usually in a position to make financial decisions on a whim, especially those where the cost will probably grow in the future.


I think this is right on the money. As a vendor, you have to understand who your users are and who your customers are. In this space, they are different. Just look at the number of developer tools that are "free for non-commercial use". Developers don't pay for tools. Companies pay for tools that they are convinced they can't get any other way.

They didn't have a poor market; they didn't have any market at all. If I'm the CTO of XYZ, explain to me what I'm buying. I hate to keep coming back to it, but that's what Cygnus did so brilliantly: "You need an embedded development system. The existing solutions do nearly what you want. Pay us slightly less, and we will make GCC do exactly what you want." I know what I'm paying for.

"Build it and they will come" does not work in the open source space -- because if you have already built it, they don't need to pay you. And if you haven't, then they don't want it.

Now, if they had built a DB similar to Mongo and then sold a service that migrated people off of failing Mongo installations and onto RethinkDB, then they would be on to something. Tired of having your DB chew your data for breakfast? Want to have your reports finished the same day you ran them? For the price of a single developer, we will fix all your problems by migrating you to RethinkDB. That's something that will make money.

But who wants to charge for services? How will we get our multiples if we only charge once for our work? That's just folly!


I started a new project in RethinkDB over the weekend, and this postmortem makes me want to start my next one in it even more so. It really is exactly what I want in a database. Relational document storage, with a SQL-like language, and realtime change feeds. Who WOULDN'T want to use that?!


The database will never sell itself. You need use cases - even Oracle sells software solutions for ERP, CRM and SCM which come bundled with an Oracle database. Microsoft SQL Server is another example of bundling. So use cases come first, before correctness, consistency or excellent SQL.


Excellent retrospective. I know these are painful to write, so thank you.


I think the basic problem here is a lot of technology is heavily hyped and in use but 'not in use'. People are learning about them, trying them, even trying or using them in small internal deployments but are not buying them or supporting them in any way.

Those who take them into production are startups who are themselves trying to succeed, or large companies who have tons of engineering talent and could create the product if need be, but since it's open source and available they use the open source version without valuing it. Think of the recent article where GitHub, forget supporting Redis, has not even reached out to the developer once to thank him.

The demands of the community are high touch: even one or two weeks without commits and people start getting restive and raise 'alarms' on GitHub and here. If there is potential value that you are still figuring out how to capture, and a lot of attention, you could end up expending time dealing with the politics of a fork. On the other end, the purchase decisions of most companies bring a whole new set of requirements, from internal processes, approvals and policies to decision makers, and most open source companies do not initially have the resources or experience to manage the expensive and high touch sales and marketing process that most companies buying technology require.

If you have funding the pressure is even higher. There is a missing piece of how to translate support and usage of open source software for commercial reasons into some sort of revenue, even minimal, without friction. I think there should be a spotlight on companies using open source and their support and contribution back, and not just acquiring projects or hiring their developers, which is an 'influence play'.


The Economist comment came out of the blue. Why does the author believe that will help one improve?


It helps you view things through the lens of market forces. Most regular media is just pundits channeling some moral position based on the writer's politics.


The Economist is basically pundits channelling their personal politics these days too. Sadly. I used to enjoy reading it a lot more than I do these days.

I don't think The Economist is what to read for this situation though. A book on the history of Oracle or something might have been more appropriate.


Good ideas, but as an outsider that wanted to use it this was my takeaway:

If I look at google trends [0], mongodb starts trending in July 2009 (it existed earlier, but that is when it starts to rise); the rethinkdb equivalent is March 2015, and it never rises so steeply. That is six years.

Of all the people I've talked to, nobody knew rethinkdb, and it was not a problem of re-educating them on its merits. So what was needed was marketing, success stories and competitors' failure stories.

People did not know it existed. Maybe in the cognitively saturated NoSQL market we need more. Gone are the days we could predict developers' needs on our hunches or our interpretations of the feedback we are given from early adopters, because they are a biased sample.

Worse: we might need people in marketing and, god forbid, some MBAs.

[0]: https://www.google.com/trends/explore?date=all&q=rethinkdb,m...


I always liked the concept of rethink db and I tried to get started with it a couple of times but in the end I just wanted to get my project(s) out of the door as quickly as possible.

This to me explains the massive success and quick rise of things like redis, mongodb etc. Sure the defaults are insecure but they work and they work immediately.


As a user of databases, I've been shocked by MongoDB's popularity in light of its apparent disregard for correctness and reliability. I used it once to cache a processed form of some data that could be regenerated. One unclean shutdown and it lost that data. I tried using some of the provided recovery tools and failed to recover the data.

That was it. Every time MongoDB has come up in conversation since, I've said "Mongo eats your data" and recommended another solution, even though I'm pretty sure that particular bug was fixed long ago. You had one job, MongoDB.

I see this as equivalent to a company marketing a smartphone with an impressive feature set and a high rate of bursting into flames. Most owners would return their recalled phones and think twice about buying the next model from that manufacturer.


What was the actual go-to-market strategy? Selling to enterprises? They have the money to spend, but it seems like there was never an enterprise version of the software itself with features that are paid only. Those are the high-margin sales that database companies build on, and then perhaps offer cloud and managed service to capture more spend. Support plans can also be great revenue, although they're not a sales driver but rather just required for big companies and likely break even for smaller ones.

That being said, it's an interesting database and I'm still looking for a solid distributed OLTP document-database with SQL/joins and multi-cluster replication... maybe someone else will try again.


I was told (by their lead engineer) a few weeks before RethinkDB shut down that they did have an enterprise version of the software with features that are paid only. However, clearly, that product introduction was too little too late.


If your target market is enterprises who are willing and able to pay, isn't it paramount to get into some market analysis firm's ranking quickly as a "leader", like Gartner's Magic Quadrant?

On this very busy Quadrant (which ought to be a clue too) from October 2015 [1], I don't see RethinkDB at all, while many other niche players are present. This to me indicates that somehow, somewhere, not enough press was generated on RethinkDB, and it was largely unknown to the sorts of decision-makers who decide what products to evaluate down the road.

[1] http://imgur.com/a/PyR57


I really appreciate the time and effort to share such an analysis. I disagree with the perspective that users think the product is in the open source tools market. Open source tools market is where the company ends up as a consequence of the product not being compelling enough for a commercial use case.

5-6 years ago I thought wow this is a really complicated product (a better database?) and I hope they pull it off as a startup.

IMO the biggest issues are (a) a database is a complicated product and (b) there are many options that are good enough and stable. You have to do 10x better than the status quo and that is very hard to do in a space where products are complicated.


I fully agree about the developer tools space being very difficult. I have a popular open source project but was never able to monetize it (and I've had plenty of time and tried a few different things). I can fully relate to this article.

Having a popular OSS project helps me a bit when getting job opportunities but that's all... Even if you considered me as a single-person team and counted 'better job opportunities' as a return on my investment, I don't think the increased salary even comes close to getting me a ROI on time spent.

OSS is a terrible investment. It's like buying a lottery ticket.


"Unfortunately you're not in the market you think you're in -- you're in the market your users think you're in." This is so simple but easy to miss advice.


Say you are a developer building a product using React: when do you ask the business to buy support?

While you're building you assume you'll have to gain your own level of proficiency. So you don't know at what point you'll need support.

Having to go back and ask for an increased budget after it has shipped is a harder internal sale.

As much as I appreciate rethink's approach, I can see it as a challenging business to be in


This is the most useful postmortem I have ever read.

Thanks to the RethinkDB team and Slava for everything they've done over the years, and for making RethinkDB open source, I continue to use it in my projects.

While it might have ultimately led to your demise, thank you for valuing correctness, simplicity, and consistency. I've never seen another document store company get things so right.


It's hard to sell something as low level as a database. For a startup there needs to be a specific itch to scratch, an itch that many potential customers experience frequently.

As a systems-guy (more or less) I find that frustrating, but I have seen many times that this is so.

It's still worth having the courage to try, and these guys did that. It could have worked. Hats off.


For us, it's working very well and is stable!


Hosted db is the way to go. Just as a perspective - this is 2017. How many multi-AZ, failover tolerant, hosted postgresql offerings do you know of?

RDS, Heroku and Compose. Google and Azure still don't do hosted postgres.

So many people pay for DynamoDB. You could be an alternative.

The point is - rethinkdb still exists. You could still do it, via preorders on Kickstarter.


It's a much lower barrier to buy a managed database service from Amazon or Microsoft than from a start-up.

They have the credibility. You don't question if these companies really have the people to keep the stuff running. In many cases the customers also already have a billing relationship with them. Much less friction to just add one new service to the already quite big invoice vs buying from a different company.

Competing with AWS or Google is hard; they would still be making money on the infra even if they gave the managed database out for free.


Compose is a startup. Heroku was a startup when they started. In fact, if I have my timeline right, then Heroku managed postgres preceded RDS (http://www.zdnet.com/article/heroku-launches-cloud-postgres-... - 2011 versus https://en.wikipedia.org/wiki/Amazon_Relational_Database_Ser... - 2013)

Rethinkdb is an incredible architecture. If I was in the market for something like rethinkdb/dynamodb, I would completely go with something that is built by the original inventors.

Just like docker for example.


Heroku was acquired by Salesforce in 2010, Compose by IBM in 2015 though.


While I think @coffeemug's analysis is good, I think it glosses over the core issue: money.

rethinkdb's core issue IMO was that they took VC money, which did two things:

- mandated that success had to be a home run

- gave them a fixed runway

Sure, rethinkdb made some mistakes, but I think those were successfully corrected. The problem is that those corrections were too late and they ran out of runway, and the success looked more like a triple than a home run so they couldn't go back to VCs for more.

Of course, not taking VC would have slowed things down even more, but that's fine. The primary attribute many look for when choosing a database is trust, and trust takes years to earn.

mongodb vs rethinkdb is very comparable to mysql vs postgresql, and I think the final results will turn out very similar. mysql did very well and was the leader for a long time, but today postgresql is the clear favorite.


Thanks to Slava for writing an incredible retrospective. One thing I'd be curious to hear is whether the RethinkDB team feels they could have created the tool and potentially the company without taking on venture capital/investors (similar to PostgreSQL, which I don't think has taken on VC).


That would have been completely impossible.


I love the postmortem culture and it's great to see people writing postmortems even when they don't have to. It would be nice to get to a point where postmortem is something that occurs naturally, with people expecting it and getting confused and asking questions if it didn't.

One can even imagine a future where each court case is followed by a postmortem. Possible bullet points:

1. Was this an actual problem? Did anybody get hurt? If not, what's wrong with the law? How should we fix it?

2. What were the ultimate causes of the problem? Ok, A killed B, but why? Could it have been prevented?

3. What are the action items to prevent this exact scenario from happening again? Who's assigned to each action item? What's the deadline for the item?

