RethinkDB is shutting down (rethinkdb.com)
1674 points by neumino on Oct 6, 2016 | 441 comments

RethinkDB is one of the developer tools that we at Stripe most looked up to.[1] The team had so many good ideas and rigorous, creative thoughts around what a database could be and how it should work. I'm really bummed that it didn't work out for them and have enormous respect for the tenacity of their effort.

I'm also excited to have them join us here at Stripe. As we've gotten to know Mike, Slava, and the other Rethinks -- and shared some of our plans with them and gotten their feedback -- we've become pretty convinced that we can build some excellent products together in the future. I'm looking forward to learning from all of them.

[1] (And, for me, even before Stripe -- I started learning Haskell with Slava's Lisp-interpreter-in-Haskell tutorial back in 2006... http://www.defmacro.org/ramblings/lisp-in-haskell.html)

As a user and supporter of RethinkDB I hope (and expect!) that the engineering team joining Stripe is a sign that Stripe will be able to take part in further development of the product.

On a side note, for people unfamiliar with RethinkDB, this episode of The Changelog with Slava explains some of the history and choices behind RethinkDB; I found it really interesting. https://changelog.com/114/

Slava @ Rethink here.

The team at Stripe has been absolutely phenomenal throughout this process, they've gone above and beyond in finding high-impact projects for our team. We're brainstorming together how to transition RethinkDB to a self-sustaining open-source project, and Stripe is super-supportive of that too. If there is a way for RethinkDB to live on, we'll find it!

I can't describe how sad I am over this. RethinkDB is not just an excellent DB; it is a yardstick for how future DBs should look. What you accomplished there is an excellent balance of features and a great UI. We really need this project to go on, even if more slowly.

We use RethinkDB at CertSimple. It's always been a great DB with safe defaults, excellent documentation and it's always just worked. Trickier ReQL queries are intuitive. It deserved wider support than what it got.

It's depressing that a half-baked constantly problematic effort like MongoDB wins while this gets neglected. Unfortunately we still live in a world where marketing beats merit.

:) You nailed it. I think MongoDB was an interesting experiment and can be used, but RethinkDB shows you how it should be done right.

MySQL was for a long time the default choice over PgSQL, yet once developers matured, quality won. It just took a decade :)

It's always like this. Just look at Windows; if people cared about quality, no one would be complaining about Windows 10 giving them grief because Microsoft would be out of business or relegated to some business software niche, and we'd all be running some kind of Unix variant or maybe some descendant of BeOS or OS/2.

That is very good to hear. We just ported our backend to use it after evaluating many other options including managed databases.

It's really a top quality piece of software in every respect, combining much of what's good about nosql with relational features and robustness. The ease of clustering across data centers is just phenomenal.

Would still recommend it in spite of this bit of uncertainty.

Now that RethinkDB has no commercial ambitions, will you consider re-licensing under a more commercially friendly license?

It's licensed under the Affero GPLv3, but I don't think the AGPL requires you to open source your code if you're just using ReQL (the API/interface for RethinkDB).

GPL/AGPL requires you to share your code if you statically (or dynamically) link (which is a C language family concept) to GPL code, but I don't know if calling an API is considered "linking".

I don't think most companies really make any contributions or modifications to the databases they use that they absolutely do not want to share back to the community.

(If using ReQL constitutes "linking" under AGPLv3, then that's a very serious matter. Perhaps re-licensing under LGPLv3 would make sense then.)

But overall, I think it's fair that they used AGPL. I especially like that they opted for the Affero version, since companies that do make useful changes can't just hold onto them and not share them back.

I've worked on a project that used an AGPL component. There were so many doubts about what should be open-sourced and which licenses were possible that we dropped that component and invested our time in modifying another one.

After some while, that component changed its licensing model. It was too late.

Exactly, I call the AGPL the anxiety license. It creates so many problems in the mindset that it is VERY commercially unfriendly.

I wish you would just consider a 2 clause bsd license and end all debate and worry about licensing issues. But it's your code.

The GPL is of course completely commerce-friendly, as it permits anyone to resell the original or modifications to it. What you're asking for is a license friendly to proprietary software[1].

Why on earth would they want to enable someone to close his modifications to RethinkDB? How does that make the world a better place? How does that encourage the growth of RethinkDB (vice the proliferation of closed, proprietary forks of RethinkDB)?

[1] Which is to say, user-hostile software. Users should be free to use, modify & distribute software.

Proprietary software is not by definition "user-hostile," and hyperbole like this does much more harm than good to people like me who would like to see more open source and less proprietary software in the world simply as a matter of principle.

> Proprietary software is not by definition "user-hostile,"

Yes, it is: by definition it violates one of the Four Freedoms of users … which is hostile.

Just because your chosen religion says something is evil doesn't make it so. You're speaking tautologically.

Let's apply the Ferengi Rules of Acquisition or Sun Tzu's Art of War to software licenses!

The Four Freedoms are just someone's opinion, not some tangible fact.

I can see why you're sometimes labeled as "fanatics" and "zealots". You speak in religious-like absolute terms and use circular logic to "prove" you're correct. The GPL's biggest enemies are its most ardent supporters. Truly.

I disagree. Most of those are things that most users don't give a damn about in most cases (because they're meaningless unless you have certain unusual skills), which means they don't make for a sensible definition of "hostile".

Does The Pirate Bay support your statement or prove that it is wrong? Maybe it is just that "most users don't give a damn" about copyright, and as such don't care whether what they do is legal or illegal under current copyright law. In order to care about copyright licenses, users first need to care about what happens when they don't follow them.

So you care more about having the entire pie, than about how much pie you have?

If you have a license that "permits" commercial sales in a way that by design makes most business models completely unfeasible... guess what? You'll only get contributions from those with one of the handful of blessed business models. Which will work or not depending on what sorts of business models your project is suited for.

Copyleft works fine for things like the kernel that are complementary to tons of different expensive things and nobody cares about otherwise. Permissive works well for Postgres.

At the same time, you're arguing that you should be able to take the work they've done, add to it, but not give back, despite having received a HUGE base to start with.

RethinkDB is licensed AGPL, not GPL.

That's an unfortunate license choice. No one will use it.

Not sure about no one but anything with gpl in the license is surely a dead end for some of us.

Affero General Public License is much more restrictive than GPL. People are not comfortable using it in a commercial setting.

Will Stripe allow your team to continue working maybe even on a part-time (20% time or 50% time) basis on RethinkDB inside Stripe?

I secretly (not 'secret' anymore I guess) and irrationally trust Stripe to do good on RethinkDB.

Also, TIL: Slava is defmarco. Yeah defmarco, that's how I read it for a long time. I especially love this piece from defmarco http://www.defmacro.org/ramblings/fp.html

> Also, TIL: Slava is defmarco

Just another +1 to the "OMG it's him/her?" moments I regularly experience while reading HN. Thanks for mentioning.


I believe the RethinkDB team will be a great fit for Stripe. RethinkDB was literally a pleasure to work with. As someone who has had to endure the pain of setting up a production mongo environment, I was thrilled to see how easy RethinkDB was to provision and then configure (all in their beautiful dashboard).

I see a lot of similarities between Stripe's offering and RethinkDB, making a once painful process into one that is actually looked forward to when building a new product. I'm glad Stripe will have even more engineering firepower as they continue to succeed.

Sad that RethinkDB(.com) is no more, but happy they found a great company to join!

Patrick, I am curious why Stripe didn't just acquire RethinkDB completely? Seems like you could have gotten a bargain. Their investors get some money back, you'd get their excellent intellectual property, and finally the entire functioning RethinkDB team joins Stripe. Win-win right?

The IP is open source, and Stripe owes nothing to RethinkDB's investors.

Why buy something you can have for free?

Just a hypothetical - if you actually own the IP, you may be able to release future versions under non-free licenses. However I don't see Stripe in the premium-DB business.

yeah. RethinkDB is AGPL, which means that some people cannot use it (e.g. http://www.theregister.co.uk/2011/03/31/google_on_open_sourc...). While a very well grounded interpretation says that merely using the AGPL'd service via a public API doesn't force you to release the clients, some companies would prefer buying a commercial license rather than facing the legal risk.

> RethinkDB is AGPL, which means that some people cannot use it

That's completely false: anyone has the right to use it, for any purpose; that's what Freedom Zero is all about.

Some people do not want to both use it and also comply with its terms, but that is their choice; they can and may use it, but choose not to.

TL;DR: it's not always "them", there are also those working for "them", who might have no say in the matter.

cannot resist replying to nitpicks :-)

counterexample: let's imagine I'm an employee of a company which has a policy that forbids me from using any AGPL software for work and also forbids installing said software on my corporate laptop at all (even for personal evaluation or toy projects), or else, if caught, I might incur disciplinary action, who knows, possibly termination.

I guess that in this case you'll agree with me that the statement "some people cannot use it" is not quite false and especially not completely false. Yes, I could use it (as in no physical law forbids me), and you might say that it's me who freely chooses not to use it in order to avoid the repercussions, but I don't really think I have a choice here, do I?

(I didn't downvote you)

This is an extremely pedantic definition of "some people". As employees of the company doing work on their corporate laptop they can't do it, but that is because the company won't do it, and they are employees of the company.

Also: you shouldn't be using your company laptop for personal projects anyway... get your own laptop :/. So, likewise: "people who by premise are already doing something sketchy because they can't afford their own laptop and are pretending to borrow one under employment terms" is just an awkward place to start to define a reasonable, as opposed to pedantic, definition of "people".

A lot has been said about those cool companies that hire creative people who love to play with technology. A lot has been said about how some weekend projects have turned into useful things to be used within the company, sometimes even major projects.

Such a passionate developer, however, might find themselves in a situation where they're not allowed to use some tool X because of a restriction put up by their employer.

That restriction might be a minor annoyance, e.g. raise the barrier to entry because "doh, I need to go and grab my personal laptop for that? nah, let me use something else". Or it might be a deal breaker: I need to get some job done, I'd like to play with tool Y, use it to get the job done and learn new things while doing it.

This kind of people, those passionate developers, will complain about that.

They will complain to their employer for putting up such a restriction (and you'll usually not hear about those complaints here).

But they will also complain about why tool Y has chosen a license that their employer finds so problematic.

These people will not just stop complaining just because they shouldn't be wanting to play with things in the first place. That's what they do, and that's often why they are good at doing things.

People do complain when they have too many rules that hinder their ability to do stuff effectively. I do see that happening, quite a lot; and I can understand why and relate to it.

Do they have the right to complain? Well, that's another story.

If the tool is closed source and they don't want to buy a license, then sure they can complain but they will just shrug it off as "that's the way it is" and move on.

I believe that things start to be more blurred when you have an open-source tool, which suddenly you cannot use (I'm not saying extend and sell commercially as closed source!) just because of FUD around licensing.

That's a nonsense definition. In general language it's perfectly reasonable to say things like "vegans can't eat chicken". Responding "FALSE! They can, but they choose not to!" doesn't add anything.

"yeah. RethinkDB is AGPL, which means that some people cannot use it"

No, it means some people CHOOSE not to use it. There is nothing stopping them from deciding to use it, other than their own fears.

No, if I'm working at a company that has a blanket policy against using any (A)GPL software, then I can't use it.

What the lawyers decide in a company that has 10k+ employees has nothing to do with the fears of anyone but the lawyers and the top management. Everyone else is just following the rules.

This definition of "can't" means there is no such thing as "won't".

Ummm...no. Won't implies you have a choice. Can't implies that it's not an option.

Working for one of those large companies, you simply don't have an option.

And the people without choices vastly outnumber the people who do have choices.

Frankly, most developers are probably happy using whatever code, under whatever license, that's available. It's the people who run the business who typically make the decisions about what's OK and what's not.

So it's developers who are punished by GPL, by and large, because many can't use the code. It's not helping those developers, and nothing zealots say or do will convince the legal department at those companies that they should change their position on (A)GPL.

Sounds like good news for any competitor (if your company has any). The competitor can use (A)GPL when it's suitable and get products out earlier at lower cost. If development time is cheaper than doing per-case evaluation of a license, then something must be horribly wrong.

>Sounds like good news for any competitor (if your company has any).

Actually thinking of Amazon.com. They have competitors, sure. But they also have the budget to write their own entire stacks internally when the license doesn't fit. And it's not like their decision to avoid GPL software is hurting them in the market in any relevant way.

When something is MIT licensed, they'll use it, and they can and do contribute changes upstream. GPLv2 code requires serious hoop jumping to use, and GPLv3 software is verboten (the patent clause can't be adhered to by large companies with cross-licensing agreements: Many of these licensed patents can't be sublicensed, and so they can't be in compliance with the license at all).

So all GPL does is prevent companies from using and supporting the software. For every instance of an MIT project that ends up modified in proprietary code there are probably 100 that either use it verbatim or contribute changes back upstream. It just makes sense so that you don't need to keep maintaining an increasingly divergent fork.

When you are large enough and entrenched enough, you can make suboptimal decisions and still win the race in both market share and revenue. IE and Microsoft come to mind, and it took major failures and a long time before competitors started to gain ground.

And it would explain why, say, a gaming company can't have such rigid policies. Preventing a game from being delayed is worth both lawyer time and, where a license directly conflicts with the business model, emailing the author to ask for an exception. For example, I recall that LGPLv3, which has the same patent clause you describe as one that "can't be adhered to by large companies", is used by Blizzard in StarCraft 2. Blizzard is not Google in size, but they are not exactly a street vendor. One might also ask if they have much need to protect patents about XML parsing, or fonts, or whatever specialized functions are performed by the numerous libraries that Blizzard uses to build a game. The only thing they don't use is copyleft, as their core business model is designed around restricting copying in order to limit supply when selling copies.

Per-case evaluation makes sense when your product is time-sensitive and when there is a lot of competition. A game from Blizzard is treated almost the same by the market as a game by an indie studio (keyword: almost), and as such Blizzard can't rely on market share for protection. A bad, delayed, and rushed game is still bad and won't sell regardless of who made it (to name an example, the last Batman game). They must be agile, which means religious thinking about software licenses must be thrown out and per-case evaluation added to the process. If a library can be used within the business model, saves time and money, and is not your core ingredient in making your game stand out, it's almost always a good idea to use it.

>Blizzard is not Google in size

It's the Google/Amazon/IBM/Apple size where you end up with endemic cross-licensing patent agreements. That's where they can't adhere to it: They have a license to a thousand patents that protects them from being sued, but they don't have the right to sublicense those same patents. (L)GPLv3 requires they sublicense any patents they have to anyone who's sued, IIRC, so since they can't, they lose the ability to distribute their software.

I'd be somewhat surprised if Blizzard were a party to such an agreement. They might be party to a "patent-troll-don't-sue-me" agreement, I guess? No idea. But if they are, depending on the terms of their agreement, someone could potentially sue them over it based on LGPLv3 terms.

I'm not planning to, so I haven't done the research, to be sure. The risk scenario that comes to mind is a stretch, but say I buy a Blizzard game with LGPLv3 code in it, which grants me the right to be protected against patent lawsuits relevant to that code, and then I ship my own game that uses the same LGPLv3 code -- and I'm sued by a patent troll over a patent that Blizzard has licensed through some agreement. Guess what? I can now demand that Blizzard protect my use of the same LGPLv3 code by sublicensing me that patent. Which they (probably) don't have the right to do. So they either pay for my license or they have to stop distributing their LGPLv3 code.

For Amazon, which distributes tons of code for people to use in AWS and is a much larger, more collectable company, the scenario is proportionally worse.

It's all beside the point, though: In no case is it the line-level developer making the call, it's someone in management. Not everyone can (or even wants to) work for Blizzard or equivalent.

In particular, not everyone wants to work at literally half the compensation or less just so they can have full software freedom, whatever that means. The Google/Amazon/Apple compensation can be that much better than start-ups for top developers. I didn't realize this myself until I got a job at one of them.

And I'd love to be able to use RethinkDB wherever I end up next, without having to worry about whether the company legal department has a problem with (A|L)GPLv3. It's a battle I wouldn't even bother taking on in most cases; too much work when I could use something else and get back to doing real work.

No, it's that your company is CHOOSING not to use it.

In the same vein: you don't HAVE to pay taxes, you're CHOOSING to.

No. Stop trying to blame others for your company being silly.

There might still be copyrights, trademarks, domains (and even patents).

The copyrights and patents are already covered under the licenses they used (AGPL v3 and Apache v2). A search for RETHINKDB on the USPTO trademark search engine turns up no results.

Coming from a failed startup myself, there is probably a bit of debt associated with the company at this point. In our final throes of life, we raised additional capital (angel/VC/etc) via convertible note to sustain the business a bit further. This, in addition to back rent, unpaid bills, etc, leaves quite a pile of financial liability that any acquiring company would likely assume. Easier to let the company die and hire the staff as a separate operation.

Asset transfers are common even in successful startups for the same reasons.

Just out of interest, are you using Haskell at Stripe? If so, at what scale?

The comments section here can be brutal and God knows I've been responsible for some of it, which makes seeing comments like this all the more heartening.

No need to ask God. Just pull up the history. ;)

It's great to hear that the Rethink crew will have a home at Stripe. Seems like the perfect fit. I hope you consider putting resources behind the project. We've absolutely loved using it at Lumi.

"we can build some excellent products together in the future."

So in other words you're not going to continue to work on Horizon and Rethink, or at least if you do they are secondary priorities? Ugh.

Slava did the right thing; his first responsibility is to his employees, not his users, but this really sucks for us users.

For us, it would probably have been better to shut down abruptly. The community would have scrambled. Some employees would find work in companies that used RethinkDB, and maintenance would have continued the same way that Apache was developed ~20 years ago. Some would have grabbed support contracts from users. And some would have moved on.

Oh, that's why it got shut down. Thanks

Why does nobody seem to have any introspection on why RethinkDB failed? Clearly there are some major problems that people are ignoring. If my favorite DB (I must mention Kx Systems once a month) folded, I could give you a laundry list of issues where things went sideways, but all I see is glowing praise and comments about the best tech not always winning (KDB knocks the socks off of everything, but I sure can give you a list of places it fails).

This isn't meant to be harsh, but these are times to learn, not simply pat each other on the back.

If I had to speculate, I'd say that they spent a long time in development before monetizing, longer than investors were willing to entertain. It's hard for a B2B company to raise a Series B without a thoroughly proven revenue engine.

I don't know how this could have been fixed though. Databases are hard to develop and it's a tough market to crack. Enterprises aren't going to buy an incomplete product, especially not a database, which is arguably the most critical component in the entire stack. Anyone investing would have to know that this would be a years-long effort to build a solid product before the first dollar.

Perhaps RethinkDB could have shortened their development time by not pivoting around (originally SSD optimizations, later realtime; Horizon is also a bit of a mini-pivot), but I don't know by how much.

Just looking at some of the RethinkDB and ReQL stuff, I certainly wouldn't have used it. Two things hit me immediately:

- bad performance

- a dearth of types (literally only a single numeric type that is a 64-bit float, so that eliminates entire categories that rely on integer/fixed-precision exact arithmetic). Also, time series are seriously hurt by that decision. I've seen DBs have to move to 64-bit longs because of that issue alone. Having a pseudo-type layered on top is just going to tie up CPU cycles in the encode/decodes that need to happen. (Plus milliseconds aren't enough in a nanosecond world now.)

This whole "let's do everything with as few types as possible" I hope is a fad that will just die quickly in the DB world.

This is with about 20 minutes of looking around and just listing a few things. It looks like RethinkDB was written by web devs for web devs and I don't think that can compete in the database world. You might get the web devs onboard, but the database and HPC people at the firm will probably look at it and go "how quaint, when can we get Oracle/Postgres/Informix/etc up and running?"

When you work with dynamically typed languages, the typeless nature of RethinkDB is awesome. MongoDB follows the same design.

Functionally, to me, RethinkDB was perfect. ReQL is one of the best query languages ever - It's so easy to learn and remember. I haven't operated RethinkDB at scale in production (so I can't say much about performance) but I was pretty impressed with its scalability features during testing.

I'm really more leaning towards the idea that this is purely a monetization failure. They've been going at it for 7 years - There must be a good reason why investors kept it going for so long - I think it's because of the product.

I think they did identify a good monetization strategy in the end but maybe it was too late - They dug themselves into a niche that had great long term growth potential but they didn't have the resources to wait it out any longer.

> literally only a single numeric type that is a 64-bit float

After dealing with JavaScript and Lua, I am ready to call this a complete anti-pattern. To be a good language, it must support at least one size each of machine floats and ints. To be really good, it should be possible for me to choose any size of machine-supported floats and ints. To be great, it should also support rationals, fixed-point and complex numbers out of the box.

Giving me floats but not ints just doesn't cut it. It works, in a kind of shoddy way, but … it's tasteless.

If you don't provide me with bitwise operations (earlier versions of Lua, I'm looking at you), then you don't get to call yourself a real language.

For a database, though, I suppose one could always store integers as their string representation. But please, no language ever do this again ever.

This anti-pattern -- floats as the sole numeric type -- goes all the way back to BASIC in 1964.

It's just as bad an idea now as it was then.

The only concrete problem anybody's mentioned with floats is that they only have 53 bits of precision, and some people need their integers to go up to 64 bits, or more.

Here's two actual concrete problems: efficiency and type safety. Indexing an array with "1.5" doesn't seem awesome, nor does using 8 bytes for a double where a 1 byte int would do fine.

A double with integral value in 127..127 already takes 2 bytes to store in rethinkdb (the first being a tag distinguishing between array, object, string, etc), compared to some random double's 9 bytes. The type safety advantages of distinguishing doubles from integers are pretty minimal in a database, because you'd need a schema for that anyway, and the benefits of that far outweigh the add-on benefits of having type checked query logic.

What are you going to do for currency amounts if you do not have ints? Strings with numbers in them?

Floating point numbers handle integer values in the range -2^53..2^53 just fine.
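The claim above can be checked with a short sketch. It assumes Python, whose `float` is an IEEE 754 double; the helper name `survives_double` is made up for illustration and is not part of RethinkDB or any driver.

```python
# Python's float is an IEEE 754 double: 53 bits of integer precision.
LIMIT = 2 ** 53


def survives_double(n: int) -> bool:
    """Return True if integer n round-trips through a 64-bit float unchanged."""
    return int(float(n)) == n


# Every integer in -2^53..2^53 is represented exactly.
print(survives_double(LIMIT))
print(survives_double(-LIMIT))

# One step past the boundary, the value is rounded to a neighbouring double.
print(survives_double(LIMIT + 1))
```

Inside the range the round trip is lossless; the first odd integer past 2^53 is not, which is where a double-only numeric type starts to lose integer values.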

You can definitely get around the currency issue by scaling yourself, but things like the timestamp issues (fractional seconds since the epoch with millisecond precision) are a little more problematic. You basically have to roll your own datetime format and lose any db support.
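A rough sketch of both halves of that point, assuming Python floats stand in for the database's 64-bit doubles. The amounts and timestamps are made-up illustration values, and `round_trips` is a hypothetical helper.

```python
def round_trips(n: int) -> bool:
    """True if integer n is stored exactly in a 64-bit float."""
    return int(float(n)) == n


# Currency: scale to whole cents so every stored value is an integer.
price_cents = 1999  # 19.99 in some currency
tax_cents = 160     # 1.60
total_cents = float(price_cents) + float(tax_cents)
print(total_cents == 2159.0)   # integer-valued doubles add exactly
print(0.1 + 0.2 == 0.3)        # unscaled fractions do not

# Timestamps: counts since the Unix epoch at two resolutions.
epoch_ms = 1_475_712_000_123            # milliseconds, well below 2^53
epoch_ns = 1_475_712_000_123_456_789    # nanoseconds, far above 2^53
print(round_trips(epoch_ms))
print(round_trips(epoch_ns))
```

Scaled cents and millisecond counts stay exact, while a nanosecond count of the same instant does not fit in 53 bits, which is the case where you end up rolling your own format.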

Databases are about data after all. Why not have a rich way to describe it?

I think so too. But I don't think, say, not having a 32-bit integer type, and 16-bit and 8-bit integers, is a big problem, if you've got doubles. Maybe it's a nice-to-have. A 64 bit type or bigint would add real value.

I think as far as number types, you NEED (in the sense that you should be considered seriously broken even if you can hobble along without) to have:

- a 64-bit float

- a 64-bit int

everything else can be emulated in code from there and you can play all the encoding games you want to save space when you don't have to use it. This covers 99.999% of the use cases you reasonably see.

From there, I would argue, you should also have:

- a bit and byte

- 32-bit ints

- 32-bit float

- 128-bit int (once you add UUIDs you might as well make them full numerics citizens).

If I were doing a modern database, I'd have the kitchen sink of types: full numerics from 8 to 128 bits, both signed and unsigned ints, and all supported hardware float sizes. I'd probably even have a 512-bit AVX type just to see what people would use it for.

Why do you NEED a 64 bit int? Why not 32 bit? You're not storing pointers in a database. (And then you can implement your 32 bit int in terms of the 53 bits of int-capacity in a double.)

Because I want to store something that is a 64bit int? I mean this is really a weird question. There are a lot of things that require this starting with timestamps:


Which you can store as two 32 bit integers, or one 53 bit integer. Nothing requires native 64 bit.
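The two-integer approach could look something like the sketch below, written in Python under the assumption of an unsigned 64-bit value. The function names `split_u64` and `join_u64` are hypothetical and do not come from any RethinkDB driver.

```python
from typing import Tuple

MASK32 = 0xFFFFFFFF


def split_u64(value: int) -> Tuple[int, int]:
    """Split an unsigned 64-bit integer into (high, low) 32-bit halves."""
    if not 0 <= value < 2 ** 64:
        raise ValueError("value does not fit in 64 unsigned bits")
    return value >> 32, value & MASK32


def join_u64(high: int, low: int) -> int:
    """Rebuild the 64-bit integer from its two 32-bit halves."""
    return (high << 32) | low


# Each half is below 2^32, so it is exactly representable as a double.
nanos = 1_475_712_000_123_456_789
high, low = split_u64(nanos)
print(join_u64(high, low) == nanos)
```

Comparing `(high, low)` pairs lexicographically gives the same ordering as comparing the original values, so sorting still works, but arithmetic and aggregation have to be done by hand, which is the cost the other side of this argument is pointing at.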

64 bit is nicer than 32 bit, but it's scarcely more necessary than 128 bit numbers.

datetimes, aggregations, etc. pretty much anything non-toy in the modern world requires 64 bit int.

like I said. read NEED as in you can certainly hobble around without it (pick an appropriate offset and read your 53 bits in relation to it), but like a fracture in a leg, it's still considered broken.

What exactly do you mean by aggregations here?

If you want a good timestamp you probably want a timespec, and that won't fit in 64 bits anyway.

I don't see why it's easy to make 128+ bit types be composite, but you can't have 64 bit types be composite.

Your database driver can take care of how the bits are packed on the wire. It really shouldn't be a big concern.

A query optimizer might do a lot better with a native integer type than a user-defined one cobbled out of int32's. It knows all the mathematical properties of the type and can add things in any order, knows that x < y implies x + 1 <= y, etc.

probably easier to have this convo in email (check my profile). HN isn't quite set up for back and forth. I'm more than willing to see your point about a 53-bit int. I think dev experience can be made more difficult as long as you get compensation in other areas. Email me a "hi" and I'll respond, but it might have to wait until later in the day.
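As a small illustration of the kind of property meant here, the following Python sketch checks the rule "x < y implies x + 1 <= y" for integers and for doubles. The helper name is invented for the example and says nothing about how any real optimizer is implemented.

```python
def successor_rule_holds(x, y) -> bool:
    """Check the rule 'x < y implies x + 1 <= y' for one pair of values."""
    return (not x < y) or (x + 1 <= y)


# For integers the rule holds for every pair in the sampled range.
ints_ok = all(
    successor_rule_holds(x, y)
    for x in range(-50, 50)
    for y in range(-50, 50)
)
print(ints_ok)

# For doubles it fails as soon as a value sits between two integers.
print(successor_rule_holds(1.0, 1.5))
```

An optimizer that knows a column is a true integer can rewrite `x < y` as `x + 1 <= y` safely; with a double-only numeric type it cannot assume that.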

Edit - I actually meant for this to be in reply to jnordwick's comment further down thread (the one with all the rhetorical questions), but apparently I suck at clicking the right things.

You seem a bit worked up, and it's hard not to read your comments in this thread in a tone which suggests you feel like you've been robbed of something.

These people don't owe us anything. When a company decides to open up and publish a postmortem it's a great and wonderful thing, but it's in no way obligatory. Even if their technology doesn't hold a candle to your expectations or the market's needs, it's still not without merit. If you want to learn, they've already given plenty for you to learn from. Rather than demanding more from them, be thankful for what they have already contributed.

I think his cynicism is warranted. This is a forum, and one that is subject to trends and favorites. Anytime RethinkDB pops up on HN (or stripe for that matter) it seems to be met with unchecked praise.

And yet here we are with a failed product. I think it's OK to ask why and consider its shortcomings. Obviously there was something, and I don't think it's fair to completely gloss over the technology side of things and blame it on "marketing".

> marketing

I think RethinkDB's marketing was excellent. The unchecked praise you mention is likely in no small part also a result of this.

However I don't think that means the technology was the problem. There are far more broken and misarchitected pieces of technology that are financially successful.

What exactly went wrong is speculative until we see the insights Slava promised, but it seems the failure was entirely financial: they had a great product, they brilliantly marketed it, but that didn't translate into sustainable revenue.

Could always go for the Lisp Plea and say that the technology is so amazing that the hoi polloi simply can't fathom its amazingness.

Exactly this. He says people need to 'man up', when in fact he needs to 'human up'. No surprise to see that he's involved in low latency fintech.

> - bad performance

What scenarios were you finding you had bad performance?

I was finding I could get comparable performance to MySQL for straight by-key CRUD activity, which is all I'd use a document store for (particularly once you take into account MySQL replicating to multiple nodes and the related failover setup).

> This is with about 20 minutes of looking around and just listing a few things. It looks like RethinkDB was written by web devs for web devs and I don't think that can compete in the database world. You might get the web devs onboard, but the database and HPC people at the firm will probably look at it and go "how quaint when can we get Oracle/Postgres/Informix/etc up and running?"

Have you ever tried to fully automate the failover/replication/etc for those? It is a substantial pain point that usually requires an experienced DB person with that database to do it well.

There are plenty of CRUD datasets where RethinkDB would be good enough and wouldn't require the dedicated ops people. That is the pain point I'd say RethinkDB was focused on solving.


I'd say the real reason for the failure is more on the business model end: the intersection of orgs without dedicated ops people AND willing to pay for support was simply too small. RethinkDB was built (imo) for small organizations that simply didn't have a dedicated ops team, just 5-10 developers + marketing/business.

That could just be how I feel because it's the situation I'm in (less than 10 IT supporting a 200+ person org, and our sysadmins deal almost entirely with end user issues).

Well, to each his own, but I don't think the product was the problem. Lots of people love RethinkDB. The comments here are proof of that.

This is the exact kind of comment I was warning against.

The "it surely can't be the technology!" comment when yes, Shirley, it really can be. Even the best database systems I know (orders of magnitude quantitatively better than RethinkDB) have areas of technical weakness. And not just some areas, but usually gaping valleys. Nothing does everything well, and if it tries to, it usually does everything below average.

Time to man up. Performance really was crappy? That only having a float for your ONLY number type really was kind of dumb? Having no native datetime was bad? Saddling timezones on the pseudo datetime was a brainfart? Milliseconds only was shortsighted? Horizon was a bad idea too soon? Document databases are the new object databases - forever 5 years away? Aggregates were poorly done? Nobody really knew how to do stationary time series properly (I read through the GitHub issues regarding this and it was laughable that these people called themselves database experts)? and on and on.

I'm not saying all of this is true. I'm saying let's be straight and not always blame sales and marketing or dumb consumers or C-level execs or everybody but our own crappy code.

Having worked for a successful(ish) database vendor, I can say that user stupidity (ignorance) is a major factor in database software success.

Literally no users, and almost no employees at the vendors, I would bet (if my experience is not unique) have any idea of the qualities of the software they use/peddle, its semantics or guarantees, or its performance characteristics or how to use it correctly.

This makes the technology secondary; what is most important is how well you carve out mindshare. Whether that is even primarily down to "good" marketing is something I doubt. In the modern saturated database market, it is entirely unclear to me how you win sufficient mindshare and trust to obtain a wide enough (paying) userbase.

I have no idea if RethinkDB was "good" or not, as I did not take the time to investigate; I have little interest in "document stores". It actually seemed to me they had some pretty solid engineering foundations; the kind of thing that the industry should value highly, but due to poor information is unable to price into their purchasing decision, and as a result is missing in many modern (especially OSS) database offerings today.

Get 'em, Jason!

Loving a product is somewhat orthogonal. To be successful an open source DB company needs some large customers paying to make that model work.

I dislike cassandra immensely, but I still use it every day.

> I dislike cassandra immensely, but I still use it every day.

I feel your pain. There are very few robust database systems that handle recovery well enough at scale, and for those without the deep pockets to keep a DBA on hand there are very few choices.

For small/mid-scale robust systems CockroachDB is promising but still needs a couple of years of development, if it can survive without a monetization strategy long enough.

I am not sure I am quite ready for that much adventure...


Cassandra is at least the devil I know, although it seems to find a new way to fail me every six months or so.

Why do you dislike Cassandra?

Why do you assume performance would be bad? It's got indexes. Obviously if you never create an index or never profile your queries it might get sluggish but that's the same in SQL.

The dearth of types can be good or bad depending on your use case. I agree that for some use cases it's a bad thing.

I like many things about SQL and still prefer it to RethinkDB in some ways, but there are no SQL databases that offer replication or HA fail over configurations that don't require incredible time investments to configure correctly. This includes commercial DBs as far as I can tell. If you're too small to have a DBA, HA SQL is out... unless you use a somewhat expensive and also restrictive cloud-managed solution. The latter would have been our second choice behind RethinkDB.

> Horizon is a bit of a mini-pivot

When Horizon was announced, that's when I stopped even considering RethinkDB for new projects (even if it was a good fit, feature-wise). Call it a "business smell", but I'm not surprised to hear they've wound the business down.

Good call, I missed that aspect of horizon. Though if they could've transitioned to a managed db / app service provider it might've provided a way to keep going. Makes me think the mini-pivot was "a dollar short, a day late", and that perhaps managed PaaS could provide good revenue models for open source infrastructure.

He literally said he'd be following up with lessons learned. Now is the time for pats on the back.

Communities are supposed to be supportive, too. Pity this thread is the only time I've seen that on HN as of late!

Those of us who have had failed startups can tell you that pats on the back and community support help. The Hype Machine would have you believe that failure is opportunity, and that you get a chip on your shoulder when you fail; but failure means failing (grade F in the US) and it hurts. A lot.

They mentioned that they will be releasing more post-mortems later. For now, I'm comfortable with supporting them and hearing about their learning later.

The RethinkDB team built an amazing product that many people appreciated - as noted by the comments here- and for that they have my praise and admiration. The fact that the product will continue without the business is also noteworthy. Well done all

Slava mentioned in his blog post that he would be writing some lessons learned over the upcoming months. The fact that this is an open source project that should continue to live on helps with the reception, too. It's sad news to see the company shut down, but it's positive news to know that we can still keep the project alive.

There are a lot of existing document-oriented databases like RethinkDB, such as MongoDB, CouchDB, RavenDB, etc., and they all have the same network bandwidth problem for geo-replication: http://acmsocc.github.io/2015/posters/socc15posters-final52....

I guess those other document-oriented DBMSs just had better early-stage shilling to get funding before RethinkDB did, and by the time Rethink shopped around there were already enough competitors that investment in document-oriented DBMSs had dried up.

The current state of the art is an in-memory DBMS that takes advantage of soon-to-be-available (2017) non-volatile memory. CMU is developing such a DBMS, http://pelotondb.org/, and there's also MemSQL and SAP HANA.

I fully expect a write-up on why RethinkDB failed in the coming months; we're going to have to deal with the practical consequences first.

> KDB knocks the socks off everything else

...but the licence is 250k dollars a year so big fail for startups.

Very true, this is why they are going after the mature businesses where 250K investment has a huge return by having the advantages of KDB+.

KDB is not really a database as much as it is an RPN calculator with a lot of memory.

That's the salary of a single employee. Surely they can afford a database on the payroll.

Many early stage startups can't afford even a single salary. By the time enough cash is on the table to consider those kinds of purchasing decisions, the initial round of architecture decisions has typically been made and cemented into the initial version of the software.

E.g. I've worked on more than one startup where the first angel investment wasn't on the table before 6-12 months into development, and where that first round in some cases was below $250k

On top of that you also have the issue of finding someone who knows it, and the associated staffing risk that comes with that (yes, I'm sure you can always find someone, but at what price? There are places I can go where I couldn't throw a stone in any direction without hitting someone who "knows" Postgres or MySQL or both sufficiently well to be an acceptable tradeoff).

In many startups the tech choices end up being made not just based on what fits and what is affordable, but also based on what you can find affordable people to work with (including e.g. co-founders or other people willing to do initial work for equity) - sometimes that can lead to niche tech getting used. But far more often it means picking from a small set of the most common alternatives.

So then the startup needs to get by with what it's got and earn money until they can afford it. Just like everyone else does with literally everything else.

Yes, and that is exactly the point: this means databases like Postgres or MySQL get entrenched over options like KDB that cost a lot to even get started with. By the time they could afford the license, the cost of switching has risen dramatically.

After that comment I'd bet good money you've never started a company or worked full time at an early stage startup.

Recommend you take one of your ideas and sketch out a back of the napkin first year plan. I bet it doesn't include using 250k of investor's seed money for a database.

More like the salary of two or three employees.

Please do introduce me to your generous VC. At current low interest rates, building a stack that is reliant on KDB will cause a large valuation hit because you're effectively locked into a -250k/year cashflow in perpetuity. At a 5% discount rate, for example, that's more than 5m USD of inflexible negative NPV right up front. Look, there are fintech applications where this will be seen as fine (the right tool for the job might be the big difference between success and failure), but you'll admit that it's a big ask for the less conventional "disruptor" style businesses which may not have big-ticket upfront funding.
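
The arithmetic behind that figure can be checked with the standard perpetuity formula, present value = annual payment / discount rate. The sketch below is only a back-of-the-napkin illustration; the function name and the `pay_upfront` switch are assumptions of this example, and the 250k and 5% inputs are the ones quoted in the thread.

```python
def perpetuity_pv(annual_cost: float, discount_rate: float, pay_upfront: bool = False) -> float:
    """Present value of paying annual_cost every year forever.

    With pay_upfront=False the first payment falls one year from now
    (PV = C / r). With pay_upfront=True the first payment is due today,
    which adds one undiscounted payment (PV = C + C / r).
    """
    if discount_rate <= 0:
        raise ValueError("discount rate must be positive")
    pv = annual_cost / discount_rate
    return pv + annual_cost if pay_upfront else pv


licence_per_year = 250_000
rate = 0.05

end_of_year = perpetuity_pv(licence_per_year, rate)       # roughly 5.0m
paid_today = perpetuity_pv(licence_per_year, rate, True)  # roughly 5.25m
```

With the licence fee due at the start of each year the liability comes out above 5m, which matches the "more than 5m USD" claim; a higher discount rate shrinks the number quickly.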

You're right. I did not think this through.

It isn't that you didn't think it completely through, just that there are a lot of variables that go into any hiring or tech decision. Take this for instance (but it generalizes to many others): does the DB reduce your need for developers? Does it open new market segments or new areas to target?

Startups are hard (I've failed a couple of times and had it work a couple of times, definitely not always through my own effort).

My current view on this is that RethinkDB didn't rethink enough. They solved a problem that was too small and didn't have much money in it. They might be great devs, but they just didn't solve a problem that needed to be solved.

I can only speculate, as a user who has played with RethinkDB (and who has been programming database software since the Tandem NonStop days), but to me it seems RethinkDB did not solve a problem that companies with databases and money needed solved.

I think they needed to develop just one feature which cannot be done by Oracle and that is that.

The other problem could be that NoSQL hype just died out :(

I don't agree with some of what you said, but at least you are taking a chance to try and figure out what went wrong. You have been bookmarked :)

> Why does nobody seem to have any introspection on why RethinkDB failed?

Because nobody used it.

It seems that unfortunately RethinkDB the company was architected in such a way that the success of their product, in terms of performance and developer experience, led to a decrease in revenue.

This shutdown therefore goes a long way to say how talented and ethically correct the team was, something extremely evident in how they put correctness and reliability in front of performance.

In short, RethinkDB is a very solid piece of software that does well where (many) other NoSQLs fall short, that is:

  * easy HA and automatic failover
  * easy auto sharding
  * rational query language
  * ease of administration and awareness (webUI)
  * realtime capabilities
  * performs well on the Jepsen test!

Now, what could have they done differently to stay afloat? What avenues do we have to fund such great projects, whose point is being OSS? (I mean, one of the selling points of RethinkDB is that one can trust it in this land of NoSQLs that promise but don't deliver, and this is in part thanks to their open development processes)

Please don't make excuses for their failure. "Developer experience" and "performance" are never reasons for failure. What does lead to failure is

- lack of mass adoption - they didn't solve a problem everyone had

- lack of nitch adoption - they didn't solve a problem a few people had badly

I can tell you 4 features that I view as more important than change subscription

- can run in any ecosystem - pouchdb/couchdb and waterline are examples of single apis in different environments. Even being able to use the database 100% in memory as a redis alternative would be nice.

- supports transactions across shards natively - this is a difficult but important feature that mongodb is missing

- supports document id transaction and table transaction - if I'm only updating two documents, I have no interest in making many financial transactions wait for a table to be finished being used when I only care about 2 documents in that table.

- "sync" feature on a table - pouchdb supports this for syncing a client with a server

RethinkDB failed because, despite its sweet name, it wasn't solving problems that were important enough.

RethinkDB didn't fail because of any of those things. As far as we can tell without an actual post mortem, it failed because RethinkDB -- the company -- didn't make enough money.

Going by the public information on their website, the only way the company was making money from its product was via support and training.

The problem with making money only with training and support is that it falls flat when the product and documentation are so well done barely anyone considers these things important enough to spend money on them.

Additionally the website doesn't really provide much information on these services other than a contact form. If you're not already in touch with RethinkDB devs, you're probably more likely to ask a local contractor or specialised training company for help instead of reaching out to RethinkDB.

By all appearances, RethinkDB has been pretty successful as an open source project. It has a strong and active community, it is well represented at meetups, conferences and podcasts, it has 16k stars on GitHub (MongoDB has only 10k, MariaDB and MySQL have only 1k each).

But the company was not able to turn any of those numbers into revenue. Downloads, GitHub stars and Twitter mentions don't pay to keep the lights on. A successful open source project does not equal a successful commercial enterprise.

Oh no, more stars on GitHub than MariaDB? Must be popular then! /s

In all seriousness the OP nailed the issue. The company is not successful because of the "issues" with the product. There is a very direct correlation there, stop trying to make it look like the failure of the company did not have anything to do with the product and that the product is perfect/popular/whatever.

I knew someone would make fun of GitHub stars eventually, which is why I was hesitant to mention them.

Yes, RethinkDB wasn't more popular than MongoDB or MariaDB. Of course not. But if you followed the JS ecosystem or Hacker News or podcasts or any of the usual suspects it would have seemed that RethinkDB was quite successful for its size.

People talked about RethinkDB. Whenever someone talked about realtime or streaming data, RethinkDB would come up. That RethinkDB felt so ubiquitous despite its low rank in Google trends or even on StackOverflow speaks to how well the team knew how to make themselves heard.

I'm not saying RethinkDB itself is without fail. The most overhyped feature is realtime change feeds and those don't even work for all types of queries (especially aggregates and joins just aren't supported for that). Based on the documentation it also doesn't really sound like change feeds can work at scale.

But RethinkDB certainly managed to catch people's attention. They just couldn't turn that attention into revenue. And considering you actually have to go out of your way to find out how to give them money when looking at their website, that's not surprising.

Even if RethinkDB had been the best thing since sliced bread, it seems like they would have still failed as a company because of this. To put it into online marketing terms: they certainly had the traffic, but they couldn't get any conversions.

I strongly disagree. I think it failed because it failed to gain popularity for social reasons. It solved a lot of very important correctness problems. On the other hand MongoDB is successful despite having many serious problems. The initial versions were so bad that it had no right ever becoming popular.

Unfortunately, popularity doesn't imply merit and merit doesn't imply popularity.

I'm not saying it needed any of these features, but I view these features as more important than what Rethink had to offer. What Rethink did offer was redis + new query lang + persistence. I don't think popularity was the issue.

I think a very important "selling" point that is not emphasized enough is how it addresses upfront, with intellectual honesty, in the "architecture" section of the docs[0], many important questions that other databases sometimes hide away to sweeten the deal, that is:

  * How does it position itself against the CAP theorem?
  * Will it ever lose my data during normal operation? (i.e. what are the consistency guarantees)
  * What happens when 1...N nodes go down?
  * What are the requirements for a well performing cluster?
  * What are the limitations in terms of document size, number of documents, etc.?

The point I'm trying to make is that after reading that document, one can quickly and completely understand if RethinkDB suits one's needs or not, without "surprises" down the road.

0: https://rethinkdb.com/docs/architecture/

I agree with the comment about distributed transactions - basically, Mongo had the first mover advantage and Rethink (albeit a great platform and easy to setup / administer) needed a big differentiator. That's actually why I'm using cockroach right now, they picked a great problem to solve (cross dc / cross continent transactions) that enables a ton of developer use cases.

Do you mean 'niche' rather than 'nitch'?

Thank you. My heart nearly stopped there.

Does RethinkDB not support transactions? I thought it did.

"RethinkDB operations are never atomic across multiple keys. For this reason, RethinkDB cannot be considered an ACID database."

source: https://rethinkdb.com/docs/consistency/

However, certain operations on a single key, like incrementing a counter, are atomic.
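
To illustrate the distinction quoted above, here is a toy in-memory sketch in Python. It is not RethinkDB's API or implementation; `ToyStore` and the helper functions are hypothetical names invented for this example. It shows a store where each single-key update is atomic, yet a two-key "transfer" still exposes an intermediate state because nothing ties the two updates together.

```python
import threading
from collections import defaultdict


class ToyStore:
    """In-memory stand-in for a store that only offers single-key atomic updates."""

    def __init__(self):
        self._data = {}
        self._locks = defaultdict(threading.Lock)
        self._locks_guard = threading.Lock()

    def _lock_for(self, key):
        # Guard the lock table itself so two threads never create two locks for one key.
        with self._locks_guard:
            return self._locks[key]

    def update(self, key, fn, default=0):
        """Atomically replace the value at key with fn(old_value)."""
        with self._lock_for(key):
            self._data[key] = fn(self._data.get(key, default))
            return self._data[key]

    def get(self, key, default=0):
        return self._data.get(key, default)


def hammer_counter(store, key, threads=4, per_thread=1000):
    """Increment one key from several threads; no update is lost."""
    def work():
        for _ in range(per_thread):
            store.update(key, lambda v: v + 1)

    pool = [threading.Thread(target=work) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return store.get(key)


def transfer_without_transaction(store, src, dst, amount):
    """Two separate single-key updates; returns the total a reader could see in between."""
    store.update(src, lambda v: v - amount)
    total_seen_midway = store.get(src) + store.get(dst)
    store.update(dst, lambda v: v + amount)
    return total_seen_midway
```

The counter case is the "incrementing a counter" situation from the docs quote. The transfer case is the kind of multi-key operation that needs a real transaction if readers must never observe the money missing.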

I wish they had added transactions. It seems like such an important feature.

ACID-correct distributed transactions create very, very painful performance problems because you have to lock across multiple physical machines separated by network connection(s).

Or y'know, you just don't do them correctly and accept a substantial rate of quietly corrupted data.

There isn't really a good option unless you have absurd amounts of money compared to your transaction volume.
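
The locking cost described above can be sketched with a toy two-phase commit, one classic way to commit atomically across machines. This is an in-process Python illustration only: there is no network, no timeout handling and no crash recovery, the `Participant` class and `two_phase_commit` function are invented for this example, and it is not a description of how F1, Spanner or any other real system works.

```python
class Participant:
    """One 'machine' holding part of the data."""

    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)
        self.locked = set()   # keys currently locked by some prepared transaction
        self.staged = {}      # txid -> writes waiting for the commit decision

    def prepare(self, txid, writes):
        """Phase 1: lock the keys and stage the writes, then vote yes (True) or no (False)."""
        if any(key in self.locked for key in writes):
            return False
        self.locked.update(writes)
        self.staged[txid] = dict(writes)
        return True

    def commit(self, txid):
        """Phase 2 (success): apply the staged writes and release the locks."""
        writes = self.staged.pop(txid)
        self.data.update(writes)
        self.locked.difference_update(writes)

    def abort(self, txid):
        """Phase 2 (failure): drop the staged writes and release the locks."""
        writes = self.staged.pop(txid, {})
        self.locked.difference_update(writes)


def two_phase_commit(txid, plan):
    """plan maps Participant -> {key: value}. Returns True if every node committed."""
    prepared = []
    for node, writes in plan.items():
        if node.prepare(txid, writes):
            prepared.append(node)
        else:
            # One "no" vote aborts the whole transaction everywhere.
            for other in prepared:
                other.abort(txid)
            return False
    for node in prepared:
        node.commit(txid)
    return True
```

Note that every key stays locked from the prepare call until the final commit or abort. With real machines each of those steps is at least one network round trip, which is where the performance pain in the comment above comes from.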

ACID correct distributed transactions are possible. Google's F1 paper shows how: https://static.googleusercontent.com/media/research.google.c...

Very, very painful problems are very, very valuable.

They are possible, but they are not possible to make performant compared to the alternative.

Google falls under the "absurd amount of money" camp.

> Protocol Buffers have performance implications for query processing. First, we always have to fetch entire Protocol Buffer columns from Spanner, even when we are only interested in a small subset of fields.

There are a number of performance costs documented in the paper. Simply because you can solve it via money doesn't mean it works for all situations.

Maybe I'm reading this[1] wrong, but it doesn't sound like it does.


> what could have they done differently to stay afloat?

Make a private (non-OSS) version optimized for business (B2B), do external consulting jobs and pretty much ask for donations from the big players/users. This is a common problem for almost every OSS project, and so far the only one that is doing this is OpenSSL.

What are your suggestions for this "optimized for business"?

Off the top of my head:

1) A stable "enterprise grade" version which receives fixes and minor performance updates for at least 5 years, maybe even charging extra for another 2-5 year round (just like Microsoft does).

2) Reports: businesses love reports. A way to generate reports straight from the DB (like Oracle and SQL Server) is a killer feature most of the time.

3) Customer Support.

Sharding is fundamentally a performance optimisation. If you can instead use a faster database and not have to shard or at least not as much, that saves you a lot of hassle. By having a smaller system you can in practice also reduce your failure rates.

I dunno about that. Is it ethically correct to get people to work for a company whose prospects for remaining viable are not steady?

Absolutely it is. As long as you're transparent with your employees about the opportunity and the risks.

I just want to personally thank Slava for his exceptional work over the past few years - at every step he and his team have been classy and professional, and brought real creativity and intelligence to the world of databases and startups.

It's clear to me that Rethink is the model for future databases - it's just that DBs have a long gestation, as no-one wants to risk their data until the code is aged like a fine wine. It's an important long-term technology play - just the kind we need to improve things for all of us.

In two or three years I think they would be making money. I think this is a failure of capitalism, or of imagination in our HN/SV community.

To those several of us who have the power to write a check, please consider doing so, Rethink have been relentlessly building the future [ and you will make money ]

I'm not sure you can chalk this up to a failure of capitalism. Postgres is a tough database to contend with and it's mature and open source. People like to hate on mysql/mariadb, but it also has a massive open source community.

As a company choosing open source technologies, the popularity is important. Otherwise you run into issues finding answers to common problems and you end up paying through the nose for support.

> Postgres is a tough database to contend with and it's mature and open source.

Presumably, that's in part why CockroachDB decided to be more-or-less Postgres-compatible on the wire, and embraced SQL.

> CockroachDB decided to be more-or-less Postgres-compatible on the wire, and embraced SQL.

I was about to post that a quick grep of the CockroachDB source turned up no references to postgres, libpq or anything obvious to back up that claim, but then I noticed my workspace was out of date, so after a git pull/git grep I now see that you are correct. Commits to add postgres-compatible parsing and wire protocol appear to have landed late last year. Why wasn't I informed! So thank you for this - time for me to take another look at it.

They're also trying to make the information catalog compatible-ish as well, so that existing web-style frameworks that parse the information schema to determine datatypes and whatnot work out of the box.

You can give TiDB a try. TiDB is a NewSQL database inspired by Google Spanner and F1. It supports the best features of both traditional RDBMS and NoSQL. Check it out here: https://github.com/pingcap/tidb

Interesting! First I've heard of it.

I'm really excited to see where cockroachDB is headed.

Wow. I was seriously rooting for them. I'm looking forward to Slava's posts about their challenges on the business side. There's a part of me that's angry that MongoDB - which some would say has made an objectively worse product - has succeeded.

My initial thought is that MongoDB has done a way, way better job at SEO. The number of blog posts about RethinkDB pales in comparison to Mongo. I wonder if they got beat on sales as well? Not sure.

Mongo also got there "first". I mean, Mongo isn't much compared to Rethink, but they're both in the "not a traditional SQL database" camp and Mongo came before and had all the hype first.

I wouldn't be surprised if people a) didn't use Rethink because they're happy with Mongo and its popularity b) hated Mongo and saw Rethink as too similar/didn't research it enough.

I'm not sure if anyone else ran into this, but I evaluated RethinkDB for three separate projects over the years, and there was always that one missing feature. I don't remember exactly what they were, but I think the first time it was secondary indexes, the second was full text search and the third was geo indexes.

Went with Postgres for all three.

When Windows Phone 7 came out, without background tasks, MS was quick to point out that the iPhone didn't originally have it either. It was a silly argument; neither Apple nor Google were selling phones without that feature at that time.

I'm not saying Rethink did anything wrong. And by bringing up MS, I don't want anyone to think Rethink had the same sense of entitlement as Microsoft.

Your v1 has to compete against your competitors' current version, not their 5-year-old v1. RethinkDB had a lot of powerful features, but it also lacked a lot of features, which for me, and I have to suspect many others, made it an impossible choice.

(Also, I hated the query syntax. I was willing to fight through it, but I often wondered if that was turning people off)

I'm genuinely surprised at the number of people who vetted RethinkDB and are now really concerned with this news. With the number of solutions out there to accomplish the same or more RethinkDB almost seems like a novelty choice. Surely the decision to make it business critical came with the understanding and acceptance that the company might go bust? I'm not sure what was accomplishable with RethinkDB that wasn't with any number of other alternatives or combinations of alternatives.

RethinkDB really managed to build quite a lot of positive feelings in me on the back of not very much technology. But what technology was there seemed very robust. Just kind of incomplete.

My situation was very similar to latch's above. I vetted it earlier this year, after having developed warm fuzzy feelings for it last year at another company. We wound up going with Postgres + Solr because A) we've used them before, B) they performed a lot better than Rethink, and C) Rethink's compensating features (distribution) weren't worth the tradeoff.

I thought Rethink seemed like Mongo done right. Both have a simple document storage model. Rethink embraced "relations" a lot better, and seemed interested in borrowing some good ideas from relational theory. Having a uniform query language was a good idea; the "builder" style was an odd choice but whatever. Where Mongo ignored and failed to address Aphyr's concerns directly (and antirez seemed to emit a cloud of interminable semantics debates) Rethink actually reached out to Aphyr for testing and rapidly took action based on his recommendations. They accepted the criticism readily and went to some effort to be transparent about what they could and could not do. Failover was not a 1.0 feature, so they didn't hose it totally. The admin interface was beautiful. You really felt like you had a simple, powerful tool that you could understand.

Performance wasn't great. This was the showstopper for me. But to be honest, I probably wasn't going to give them a dollar even if it was a lot better. I'm not paying for Postgres or Solr either. I don't know how you bootstrap a database business. The Mongo/MySQL approach of "make garbage, monetize, hope someday you can refactor it into shape" looks obviously wrong stacked up against Postgres, which always took the academic approach of "first make it right, then make it perform." But performance is a feature and it takes a long time to make a database right, let alone perform. I think they took the right approach. It's just a long process, and it may not be compatible with startup culture.

Mongo may have shipped non-ACID-compliant garbage, but MySQL just shipped with fewer features than Postgres.

I don't agree. To take a microscopic view of just one of them, consider TIMESTAMP. It was identical to DATETIME, except that if you touch that row, the _first_ TIMESTAMP column gets automagically updated to the current time. This isn't the kind of thing that happens by accident, someone decided that this was desirable and went out of their way to write code to make it happen. The feature caused a great deal of confusion and probably a significant amount of data loss. It's more than merely non-standard or in some sense "less" than Postgres, it's actually inimical to the model of what's going on, in several senses: column "ordering," a data type with behavior. This kind of "here's a bucket of special-cases" sloppiness is thematic for both MySQL and Mongo.

Oh, come on, MyISAM didn't even support transactions.

Here is a blog post mentioning the service offerings around MongoDB as another factor: http://blog.dripstat.com/mongodb-vs-rethinkdb-why-we-had-to-...

(via this HN submission: https://news.ycombinator.com/item?id=12650033)

My sense is that the market for NoSQL databases is large enough that RethinkDB could have focused on a particular customer segment and grown from there. So even if Mongo was the dominant player in NoSQL land, it seems at least possible that Rethink could stick it out and win.

Perhaps this is what RethinkDB's strategy was all along with targeting real-time applications. I wonder if they were just... too early?

Well, CouchDB got there first but it didn't help them much. I think you're right that it was mostly about social media marketing.

Early Couch had scaling problems. We built a product around it and had a terrible time keeping it running at scale. Ruined Erlang for me, too. Not because of the language, but because of the problems with the build tools.

I think Couch never caught on because its very different querying paradigm created enough friction to block mass adoption (which has only been fixed by adding a Mongo-like DSL in the 2.0 release).

Still, the market for companies willing to pay for databases in NoSQL is absolutely tiny. If those NoSQL products want to grab a share of the SQL world which does have money, they are competing with very serious vendors. So the market is consolidating. It's just the way it goes.

MongoDB has received a huge amount of venture funding: $311M vs $12.2M for RethinkDB. That makes a big difference in how far you can grow as a business if you use the money even half-way rationally.

Given that the market for document DBMS is pretty small to begin with, it's tough to beat a 25x difference in budget unless you have a product that is a complete game-changer. It will be interesting to see Slava's analysis. Meanwhile I wish the team and RethinkDB users well. DBMS start-ups are damn tough to pull off.
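For anyone who wants to check the 25x figure, here is the arithmetic as a quick Python sketch. The two totals are the ones quoted above; the variable names are just mine.

```python
# Total venture funding quoted above, in millions of USD
mongodb_funding_musd = 311.0
rethinkdb_funding_musd = 12.2

# How many times larger MongoDB's total is
ratio = mongodb_funding_musd / rethinkdb_funding_musd

print(f"MongoDB raised roughly {ratio:.1f}x what RethinkDB raised")
```

So "25x" is a fair rounding of those two totals.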

I originally thought funding would play a big role, but the early, 2-3 year period of funding for both is comparable:

RethinkDB, 2009-11 = $4.2MM

MongoDB, 2008-2009 = $4.9MM

It's true that MongoDB quickly closed on a $6.5MM Series C in Dec of 2010, one year after their Series B. For Rethink, it took them another 2 years to get the last $8MM. There's no question that MongoDB was "better" at raising more money, faster.

I'm just thinking out loud here, but I wonder if MongoDB made a conscious tradeoff to sacrifice good tech to have better marketing & sales early on.

Crunchbase links here:

- https://www.crunchbase.com/organization/mongodb-inc/funding-...

- https://www.crunchbase.com/organization/rethinkdb/funding-ro...

No, still a different league: if 50% went to dev, 10 devs << 20. For example, 1-2 parallel teams could focus purely on monetization: managed hosting, weird enterprise features, sales engineering, etc. Compounding returns on that turn into more funding, and history unfolds.

Exactly. It's sad that a technically superior product can't make it while something like Mongo can.

I hope this isn't true. Having worked at/on multiple competitors I have nothing but respect and admiration for the work the RethinkDB team has done to make a great database and development platform.

This was real technology! I'm truly sad that the environment is such that great work like this can't continue to be funded.

Thanks for showing everyone how to write amazing documentation, caring about the fundamentals, and for the incredibly snazzy admin panel.

Everyone at RethinkDB was incredible to work with.

> Thanks for showing everyone how to write amazing documentation

Thank you! (Really. I'm RethinkDB's documentation writer/editor.)

I have always wondered why there isn't a tech company that just does kick ass documentation for companies. Seems like a small group of people who make kick butt stuff in this area.

Really, you did have fantastic documentation, as the original said. Kudos.

Thank you. And...hmm. That's an interesting question.

The Rethink docs truly are incredible.

Do you have any guides or books you recommend for writing good documentation? What software did you use to produce Rethink?

Sorry to see Rethink go.

The documentation is all maintained in Markdown and built with Jekyll, so it's pretty straightforward. There were a few custom Jekyll tags. For writing, I used the venerable Mac editor BBEdit; obviously any text editor can do a fine job here, but there are some subtleties in BBEdit I liked. Its "open file by name" command can open multiple files with the same name at once, for instance, and its find/replace functionality goes above and beyond the call of duty.

For books/guides, it's hard to say. Find documentation you like a lot and think about what makes it good--the organization/taxonomy is really important to pay attention to, as well as the tone (formal, conversational, weird, etc.). As shocking as this might be around here, the _Microsoft Manual of Style_ is useful as a specifically technical guide, and it's good to have a relatively recent edition of the _Chicago Manual of Style_ kicking around as a reference. And just being familiar with standard grammar and punctuation rules is important. A lot of people aren't. (A lot of people who aren't still think they are.)

Hahaha. That is surprising! Thanks for the helpful advice.

There is a company, a YC one in fact: http://readme.io/

It doesn't look like readme.io actually writes the docs; they just make tools for writing docs.


Well, Slava himself posted the (sad) news so it can't be any more true than that.

Ever wondered what salespeople do in a software company? This is what happens when salespeople fail to do enough of their thing.

Sales are essential for a for-profit company.

Not sales. Marketing. In today's world having someone push a new database for a hefty price is not going to work. I would need to be able to test it against mongo and decide myself that it is better... but how would I know it's even an option?

Back in 2014 I was consulting for FoundationDB on marketing and customer acquisition. After FoundationDB was acquired by Apple, I got introduced to RethinkDB and was excited about the prospect of helping market another promising DB company.

Unfortunately, after a short while I stopped hearing from them. It doesn't look as though they ever brought in any marketing help since then.

Is there a firm line between sales and marketing these days? Is sales just the high-touch end of the marketing continuum?

> Is sales just the high-touch end of the marketing continuum?

IMO, no. Sales and marketing are pretty different things.

Marketing isn't "sales without people", and sales isn't "marketing done in person". They target different problems, and you may need one, the other, or both to be convincing, depending on the product or service.

You build a great product. Now what? How do you get users?

The next step is marketing. Unless your target market is so tiny you or your sales team can do a direct outreach.

Gaining users and customers is extremely difficult, often unpredictable and potentially expensive. While a bad product/technology/service can set you back there is no guarantee the best will get you success. Users can be unpredictable and the adoption curve for anything new is long.

Sometimes marketing is earlier in the market identification and product building stage so you know the market you are going after, gaps, segments and the value proposition to customers or users. This is product strategy. You can completely botch this step and expend effort on a product/service that is not clearly differentiated and has no market case.

B2C (business to consumer) marketing typically uses large budgets to reach a wide potential customer base using traditional advertising channels, and now online and social channels. Even 'lucking' out with hype, viral marketing and rapid adoption requires someone or a team to fuel this hype.

B2B (business to business) marketing is more hands-on and about building a predictable pipeline for your sales team to target and close. In some B2B businesses the sales cycle can be hugely long and expensive, with the cost per lead and opportunity measured in thousands of dollars.

Marketing is about figuring out your customer's needs. Sales is about promoting your products.

Both are done to sell more stuff.

Marketing has always been a bit vague. Some definitions of marketing subsume the entire purpose of a company; for example, one definition is "identifying, anticipating and satisfying customer requirements profitably" - this could be used as the definition of a business enterprise.

IMO marketing is all about getting a good fit between what the company sells and what the market outside the company wants. It could be creating demand, so that the market changes to fit what the company creates; or it could be shaping the company's offerings to what the market is asking for; or it could be advertising, to make the market more aware of how the company's current offerings are a good solution for what the market is asking for. One way or another, it's about bridging the gap.

Sales is the process of completing profitable transactions. It's typically high-touch, one to one, and involves taking prospects, qualifying them, figuring out what they need and how the company's current offerings solve that problem in a personalized way, and managing the process to completion, and potentially further to upselling in the future. It's a pipelined process and is ultimately a numbers game, where the odds of proceeding to completion depend on both the value proposition and the salesperson's skill in communicating it.

Marketing is more about making the customers think they need your product.

The marketing strategy was the product itself, given that it was free, as well as the open development process.

And I do think that strategy was working to some degree; at least on channels like HN where we frequently saw posts related to RethinkDB.

The people that use HN aren't the people that sign checks for enterprise support contracts.

Guess what - database vendors like Oracle, MongoDB (to a much smaller extent) that get a bad rep on HN actually sign deals for real money.

> The marketing strategy was the product itself

And we see how that worked out. Pricing is an important component of marketing, but it's not a marketing strategy in itself.

So, maybe we should find a way to develop good technology that doesn't involve for-profit companies?

Nonprofit companies still need to make money to keep the lights on. That means targeting technologies that can attract lots of donations. Databases are a very mature technology, so the number of companies running into issues with them is vanishingly small. Who is going to donate to a nonprofit spending its time developing a solution to a problem almost nobody has?

I don't know, I feel like lots of people donate to universities to spend their time solving problems that almost nobody has. Or maybe they don't, but somehow that money still exists. How does that work?

Why are we assuming that everything has to be done by companies? What if we establish a National Science Foundation for fundamental CS research?

If you’re not getting paid by your users, then you often get something cool, but maybe not something you want to go to production with.

> Or maybe they don't, but somehow that money still exists. How does that work?

This isn't that difficult of a question.

1) Universities get money from the students who pay tuition and fees for their education. 2) Universities get money from wealthy benefactors who donate their money to a cause they believe in. 3) Universities get money from for-profit companies that have sufficient commercial success to invest into research and such for what they hope will be later commercial benefit. 4) Universities get money from people who confiscated the money from other people, under threat of force and who might not have given it otherwise, and give it to a cause they believe in.

Not necessarily in that order, by whatever measure you may choose. Note that in no case did the wealth magically just appear. The wealth to consume was produced somewhere by someone.

Sounds like you're proposing 4, since you imply that the "we" that is not the companies won't be trying to create the wealth you want to use for your cause.

There's plenty of fundamental research being done. Research doesn't yield products.

Building roads/bridges is probably the better analogy. Infrastructure-type public goods never seem to come from donations, only taxes.

Universities are still businesses. Grants, tuition, other fees, donations, endowments, etc all generate plenty of cash.

Easier said than done.

Very sad to hear, but hopefully the software will continue to be developed in an open source format.

Keep this in mind when you invest in a certain technology: some organizations, especially nonprofits (for example, the Apache Software Foundation, Python Software Foundation, the new Node Foundation) are probably going to support and develop their software for extended periods of time relative to, say, a startup or for-profit (Parse, MongoDB and RethinkDB immediately come to mind).

This is why the key product should be open-source. Else it's too scary to invest in it: what if the company behind it folds, and you end up in a dead end?

Only certain super-behemoths like Microsoft or Apple can afford to have their infrastructure products being closed-source.

How many people in the world can work on a large DB, even with source?

I am all for open source, but it doesn't make things like this as easy as some make it out to be. For example, with the Linux source, how long would it take me to fix a video driver bug? Perhaps a year?

His point is a little more subtle than that. When you're choosing a database engine to build your company on, knowing that it's open source provides some level of guarantee that someone will continue to maintain it if the company/developer folds, even if that someone is not you.

Therefore, if you're the one building the database, open-sourcing it de-risks your product to your potential customers, to some extent.

To expand on this point, if you look at Linux development, multiple companies are contributing to the project! The product exists outside any single organisation.

Of course closed source products can also be collaborations, but open source + open development practices can make this a smooth and natural result.

If the DBMS's users wouldn't chip in enough money to cover the cost of developing it before the company went under, it seems hard to believe that they'll suddenly decide it's cost effective to pay for having it developed post-failure. If anything, it just implies that they've become so tightly coupled to the product that the DBMS vendor's failure represents a sudden existential crisis.

I think that, more important than whether or not the DBMS is open source, you should consider how critical the DB is. If you can't live without it, stick with a DB that you can be virtually 100% sure isn't going anywhere soon. Save your experimentation with new products for the small and less-important stuff, and make sure the small stuff doesn't reach some critical mass before the DB vendor does.

Someone? You shouldn't pick an opensource project hoping someone motivated picks it up soon after it's abandoned. The primary company going down is usually it, especially when the project hasn't even picked up enough community yet.

I hope it wouldn't take you a year to hire someone qualified to fix it for you.

If open source only works when you pay for it, it's just an abstraction layer around ordinary commercial software, with the same pressures. In fact, most of the people qualified to fix Linux video drivers are already employed full-time at a graphics card manufacturer (with its sales pressures) or a commercial redistributor (with its sales pressures). You could pay one of those companies for a support contract. But other than that, if someone else qualified exists, it's only worth their time to take these contracts if there's enough sustainable business to keep up-to-date with how Linux graphics drivers work; chances are there isn't, and they're spending their time in full-time employment doing something else entirely. So, you're at the same place where you started: the problem is complicated enough that there aren't enough reliable sales, and so it only gets done by large companies that can afford to fund the work because they already have enough sales of enough things.

(Full disclosure: I am one of the people who could have been qualified; one of the jobs I was considering a few years back was a graphics-driver-hacker position at Red Hat. I happened to instead choose full-time employment doing various, mostly userspace Linux stuff for a startup. When they ended their incredible journey, I would have loved to make a living taking contracts like the ones you suggest - and I still wish I could - but there aren't enough of them and they aren't reliable enough to make it better than just taking a full-time job with a profitable company.)

With ordinary proprietary software, there's only one team in the world you could possibly hire for maintenance, and if they're too busy or have a conflict of interest you're just screwed. Competition among maintainers should make them more efficient and scale better.


But that assumes there is a ready supply of people that can fix a certain part, and want to take contracting work. For example, let's say I have an ATI XYZ in my laptop. It has a driver bug and won't work. ATI driver is open source. Who do I hire to fix it?

Except that's not quite true for ordinary proprietary software - there are a number of people who make a living supporting Windows or Active Directory or Exchange or whatever who don't work for Microsoft. There is a difference of degree, in that third parties who support open-source code (e.g. Oracle supporting RHEL) have access to the source, and third parties who support proprietary code usually don't (although it's possible to get a MS read-only source license!) and only have access to compiled binaries. But that's not a huge difference. There are people who are very good at tracing and disassembling compiled binaries, or simply at understanding how the system works even in the absence of code, and they have thriving businesses. Sysinternals, for instance, was a separate business acquired by Microsoft.

(Note that I am carefully using the phrase "open source," not "free software". Part of the free software ethos is that you can realistically modify your code as needed. I do happen to think that the current free software movement isn't very good at delivering on this promise.)

> This is why the key product should be open-source. Else it's too scary to invest in it: what if the company behind it folds, and you end up in a dead end?

The way this is dealt with for closed-source software is source-code escrow. The support contracts stipulate that the company that built the software set aside the code with some stable third party, and that it be available under specified conditions, such as the builders going out of business.

It's not as good as having the source freely available, but then it's dealing with an extreme contingency, anyway. Very, very few users of a piece of software have the expertise to build it from source and then debug it if they run into problems. So they're probably screwed either way if the builders go belly up.

> Only certain super-behemoths like Microsoft or Apple can afford to have their infrastructure products being closed-source.

The alternative is to buy truly critical software only from companies you trust a lot. XYZ Valley Startup is unlikely to be around in 20 years, but Oracle almost certainly will, as will Microsoft.

It works the opposite way too.

If a project is open source then who is committed to helping you support and maintain it? You essentially need to have a team to do it.

If there's a company behind it, it gives you assurances.

Especially for a small company that needs to focus on building a product rather than committing features and bug fixes to their database

You can have an open source product with a company behind it, they're not incompatible.

Right. My point was more that being open source on its own isn't always enough. Having a company behind it makes a difference to many people - more than the distinction between open vs closed.

I was able to sell our company on Cassandra because there is a Datastax there to support it when the shit hits the fan.

Of course the entire point of this particular article is that they are pretty difficult to make profitable

You cannot draw that conclusion from the data at hand. There's plenty of closed source product companies that shut down because they didn't manage to make a profitable business as well.

Red Hat's market cap is almost $15B and they're worth $2B. Definitely not incompatible.

This is like saying "the widget sells for $15 and the price is $2".

I assume you mean they have $2bn in annual revenue?

Yeah.. whoops. Thanks for pointing that out.

This is honestly quite depressing for me to hear. I've always liked the team and the fantastic product they created. A couple of years ago, when I was working on a DB product myself, I met a few of their team and I was just blown away by how nice and welcoming they were, even to someone developing a product that could potentially compete with theirs.

Later on, when I was working on an NLP startup earlier this year, I opted to use RethinkDB because I had seen how clean, smooth, and fast its internals were. When I had a hiccup with running a cluster in the cloud and tweeted about it, Mike and others from the RethinkDB team instantly reached out to me and helped me resolve the issue.

Hi Haneef :D I was one of the peeps you met at OSCON. Hope to see you around and on Twitter still <3

A shame that a company like mongo can exist while a company like rethink folds.

I believe this is because the Node.js community still recommends MongoDB as the perfect database engine for their projects. There is even a name for their stack, MEAN (MongoDB + Express.js + Angular.js + Node.js), which is kind of ironic: given the rapid evolution of the JavaScript ecosystem, an acronym like that will make no sense in a couple of months, since people now prefer React-like libraries and frameworks. Besides the JavaScript community, I don't know of any other that enjoys using MongoDB (which, to be fair, is not that bad for what it offers, but you know) or ships it by default in bundle installers, which are preferred by many young and/or newbie developers.

Don't laugh -- there's now a thing called MERN, which is basically MEAN with Angular replaced by React.

The entire sell of MongoDB in the Node community seems to be "it speaks JavaScript and you can store objects in it". Which of course fails to take into account any of the strengths and weaknesses of MongoDB (in the latter case especially the lack of cross-collection transactions or fast joins -- which in my experience account for 90% of the problems Node developers face with MongoDB).

The main reason MongoDB is popular in the Node community is that both reached peak hype around the same time. So it made perfect sense to bundle both of the "hottest" technologies together. Except by the point Angular 1 reached peak hype MongoDB was already facing criticism, so MEAN mostly seems to bask in its afterglow.

If you had asked me last month, my prediction would have been a RethinkDB+Node+React (or RethinkDB+Node+Angular2) stack popping up to challenge MEAN, though the RethinkDB+Node bit seems to have already been addressed by Rethink's own Horizon.

Could be worse - might be ANGER - Angular.js + Node.js + Grunt + Elixir + RethinkDB. That might even live up to its name! Though I guess you could always go full tilt towards insanity and choose MongoDB + Angular.js + Node.js + Grunt + Express.js - MANGE.

> "it speaks JavaScript and you can store objects in it"

They should just switch to ES clearly.

FWIW I've been in the Node/JS community for a few years and I've honestly rarely seen MEAN or Mongo be mentioned outside of blogs and corporate Hackathons (where Mongo is usually a sponsor). Maybe companies use it (I rarely pay attention to what people do with Node at companies) but I'll bet it's because of "hype" and its apparent popularity. Most JS devs I follow or talk to use a SQL database and dislike Mongo.

I'm somewhat frequently exposed to absolute beginners and MEAN does occasionally pop up as one of the various flavours of "I picked a thing before learning the language, now what?".

Other popular contenders are similar "full stack" or "full featured" solutions like Sails, Meteor and Total.js.

This is generally a result of beginners to Node trying to find something that does everything instead of spending some time learning the basics with a library like Express and going on from there.

Regarding rapid change, I've just come off 90 minutes of troubleshooting, whose end result was "upgrade npm to latest". I've never before used a package manager which had to be so often upgraded to solve issues.

We tried with the REEP stack. RethinkDB, Elixir, Elm, Phoenix.

How did that work out?

I should rephrase. We, the community of people using Elm/Elixir/Phoenix/RethinkDB, tried to get the name REEP to catch on.

As far as the stack itself goes, it's wonderful.

Why not PEER? It's catchier, a common English word, and even kinda fitting given RethinkDB's distribution capabilities. And it has much nicer connotations than the MEAN stack.

This is my stack.

TERP Stack : The most adventurous stack for

InTREPid EnTERPreneurs

Tornado RethinkDB Emberjs Python.


MongoDB is very popular in the Big Data and IOT spaces.

In most enterprises around 2/3 of the data in the lake is unstructured and comes from multiple systems we don't control or have visibility into. So it just gets dumped into HDFS until someone creates the relevant HCatalog schemas. Having a database like MongoDB at least allows you to extract and access some of the data in an automated fashion e.g. JSON reflection. It's especially useful for adhoc and lightweight model scoring and feature engineering.

Who says Mongo won't go eventually as well?

MongoDB is like the Microsoft of operating systems. Well promoted and easy to get going with. Hopefully they use the money they have to rewrite MongoDB to be better.

Okay...but if Rethink couldn't make their business self-sustaining, why would Mongo when there are a plethora of other open source databases/datastores that can do Mongo's job better than Mongo?

EDIT: We'll just wait for Mongo to burn through all their cash then.

Because having the better product (often) doesn't matter: see Windows vs Apple in the early days.

Mongo has a strong community, which translates into a plethora of articles/blog-posts/tutorials/etc that secures its niche. From there it's possible to make money with things like consulting fees to help fix the problems it created in the first place.

It's all about execution. Case and point: I know what (purportedly) makes MongoDB special. I have no idea what RethinkDB is supposed to do, other than store data. Yes, I could check their site, but the point is I'm already familiar with Mongo's reputation. All of this contributes to a lesser product winning over a better one.

At my last workplace, I finally asked: "Since you guys bitch about mongo so much, why do we use it? Why not something else?". Response: "It's not actually that bad, it does the job".

Half the crap people speak about Mongo seems to be from either the early days, or from people who just like to complain. It certainly has its issues, but there doesn't seem to be much community will to replace it with $superdupernosqldb.

Despite some of its crappiness, MongoDB actually has a bunch of features that other databases (such as PG) don't, such as easy-to-set-up sharding and replication.

We run Mongo side by side with PG, and even though the PG is just replicated and not also sharded like our Mongo cluster is, Mongo was easier to configure.

I think MongoDB's biggest problem is that it was marketed as the general purpose DB in the Node.js community when actually it is a specialised tool. So now you have thousands of developers wondering why they're on this weirdly behaving document store when they could've scaled fine on something more generic like PG.

More on topic: RethinkDB was aiming to be the best of both worlds, generic like PG yet document oriented and horizontal scaling like Mongo. I think their problem is that they didn't manage to pierce the communities. It would have gone better if they had evangelists for every major platform that just produced a stream of blog posts, small open source projects and tools and crashed random meet ups.

I think Mongo lives in the "we probably shouldn't have chosen this, but it feels scarier to try to tear it out than to make do" place.

Also not the point.

I thought I was supporting your argument about network effects and will to change.

Well, real streaming queries, not just oplog tails for starters... a document database with the ability to do on-server joins. There's a few other things, but those are the two biggest ones over most others in the same mold... All that applies to Mongo, applies to rethink.

Oh, yeah... and the ability to do shard + redundancy, where with Mongo it's either-or, or you have to do both. So scaling works a bit better. The admin tooling for Rethink is better than anything I've really used in non-SQL databases. And frankly even better than other DB admin tools, including SQL-based ones.

It's just a really great, stable database with really good scaling for the majority of use cases.

That really wasn't the point.

*case in point


Business is about more than just the product, even if that product is better than the competitors.

MongoDB is killing it in the Big Data space right now with excellent integrations with Spark and Hadoop, strong relationships with Hortonworks and Cloudera and really good enterprise support. I know many Fortune 500 companies and equivalents overseas who are customers.

But please enlighten us poor fools which open source databases can compete with MongoDB in this space. It sure as hell isn't PostgreSQL.

But are people paying for it, if MongoDB is "killing it"? If not, no business model.

Yes. Enterprise companies are paying for it, many of which you would know, e.g. eBay, LinkedIn, Adobe, etc.

In Big Data specifically when we are spending tens of millions on Hadoop/Spark/NVidia clusters buying a few extra licenses for MongoDB is nothing. And what makes Big Data even more compelling for database vendors is that often by law or for privacy reasons we are forced to run everything inhouse i.e. strictly no cloud.

You are a gem.

>MongoDB is like the Microsoft of operating systems.

I'm not a Windows fan, but that's harsh. http://cryto.net/~joepie91/blog/2015/07/19/why-you-should-ne...

That post is utter garbage. Seriously.

Most of the points are for issues that were fixed years ago, and then you get classics like "forces the poor habit of implicit schemas in nearly all usecases", as if some developer is going to somehow stumble on the fact that it is a schemaless database.

It's like trying to say "Linux is shit" because of issues found in 1.0 of the kernel.

Normally when things are citing @aphyr, I tend to give them some weight, but even aphyr's post is rather old. Thanks for the heads up, I'll do more investigation before posting the link again.

Most of those points are fearmongering, outdated, opinions, or improperly cited. Every database has problems. Mongo has more than most. But I honestly think people dislike it so vehemently primarily because they've monetized so strongly.

They largely did with version 3.0 when they switched to WiredTiger.


It's made a massive difference to performance and stability.

Sounds like it hasn't helped with consistency


No, a storage engine change doesn't help with consistency.

This is the JIRA tracking that Jepsen stale read issue: https://jira.mongodb.org/browse/SERVER-17975

Yeah, I'm glad to see they reopened it, because it was kind of a dick move to mark it as working as designed - https://jira.mongodb.org/browse/SERVER-17975?focusedCommentI...

For a product that creates so many job opportunities globally and didn't cost you $0.01, I would like to ask: how shameful is it?

