I'm also excited to have them join us here at Stripe. As we've gotten to know Mike, Slava, and the other Rethinks -- and shared some of our plans with them and gotten their feedback -- we've become pretty convinced that we can build some excellent products together in the future. I'm looking forward to learning from all of them.
 (And, for me, even before Stripe -- I started learning Haskell with Slava's Lisp-interpreter-in-Haskell tutorial back in 2006... http://www.defmacro.org/ramblings/lisp-in-haskell.html)
On a side note, for people unfamiliar with RethinkDB, this episode of The Changelog with Slava explains some of the history and choices behind RethinkDB; I found it really interesting. https://changelog.com/114/
The team at Stripe has been absolutely phenomenal throughout this process; they've gone above and beyond in finding high-impact projects for our team. We're brainstorming together how to transition RethinkDB to a self-sustaining open-source project, and Stripe is super-supportive of that too. If there is a way for RethinkDB to live on, we'll find it!
MySQL was for a long time the default choice over PgSQL, yet once developers matured, quality won. It just took a decade :)
It's really a top quality piece of software in every respect, combining much of what's good about nosql with relational features and robustness. The ease of clustering across data centers is just phenomenal.
Would still recommend it in spite of this bit of uncertainty.
GPL/AGPL requires you to share your code if you statically (or dynamically) link (which is a C language family concept) to GPL code, but I don't know if calling an API is considered "linking".
I don't think most companies really make any contributions or modifications to the databases they use that they absolutely do not want to share back to the community.
(If using ReQL constitutes "linking" under AGPLv3, then that's a very serious matter. Perhaps re-licensing under LGPLv3 would make sense then.)
But overall, I think it's fair that they used AGPL. I especially like that they opted for the Affero version, since companies that do make useful changes can't just hold onto them and not share them back.
After some while, that component changed its licensing model. It was too late.
Why on earth would they want to enable someone to close his modifications to RethinkDB? How does that make the world a better place? How does that encourage the growth of RethinkDB (vice the proliferation of closed, proprietary forks of RethinkDB)?
 Which is to say, user-hostile software. Users should be free to use, modify & distribute software.
Yes, it is: by definition it violates one of the Four Freedoms of users … which is hostile.
The Four Freedoms are just someone's opinion, not some tangible fact.
If you have a license that "permits" commercial sales in a way that by design makes most business models completely unfeasible... guess what? You'll only get contributions from those with one of the handful of blessed business models. Which will work or not depending on what sorts of businesses models your project is suited for.
Copyleft works fine for things like the kernel that are complementary to tons of different expensive things and nobody cares about otherwise. Permissive works well for Postgres.
Also, TIL: Slava is defmacro. Yeah, "defmarco", that's how I read it for a long time. I especially love this piece from defmacro http://www.defmacro.org/ramblings/fp.html
Just another +1 to the "OMG it's him/her?" moments I regularly experience while reading HN. Thanks for mentioning it.
I see a lot of similarities between Stripe's offering and RethinkDB, making a once painful process into one that is actually looked forward to when building a new product. I'm glad Stripe will have even more engineering firepower as they continue to succeed.
Sad that RethinkDB(.com) is no more, but happy they found a great company to join!
Why buy something you can have for free?
That's completely false: anyone has the right to use it, for any purpose; that's what Freedom Zero is all about.
Some people do not want to both use it and also comply with its terms, but that is their choice; they can and may use it, but choose not to.
can't resist replying to nitpicks :-)
counterexample: let's imagine I'm an employee of a company with a policy that forbids me from using any AGPL software for work, and that also forbids installing said software on my corporate laptop at all (even for personal evaluation or toy projects); if caught, I might face disciplinary action, who knows, possibly termination.
I guess in this case you'll agree with me that the statement "some people cannot use it" is not quite false, and especially not completely false. Yes, I could use it (no physical law forbids me), and you might even say that it's me who freely chooses not to use it to avoid the repercussions, but I don't really have a choice here, do I?
(I didn't downvote you)
Also: you shouldn't be using your company laptop for personal projects anyway... get your own laptop :/. So, likewise: "people who by premise are already doing something sketchy because they can't afford their own laptop and are pretending to borrow one under employment terms" is just an awkward place from which to start defining a reasonable, as opposed to pedantic, definition of "people".
Such a passionate developer, however, might find herself in a situation where she's not allowed to use some tool X because of a restriction imposed by her employer.
That restriction might be a minor annoyance, e.g. it raises the barrier to entry ("doh, I need to go and grab my personal laptop for that? nah, let me use something else"). Or it might be a deal breaker: I need to get some job done, I'd like to play with tool Y, use it to get the job done, and learn new things while doing it.
These people, those passionate developers, will complain about that.
They will complain to their employer for putting up such a restriction (and you'll usually not hear about those complaints here).
But they will also complain about why tool Y has chosen a license that their employer finds so problematic.
These people will not stop complaining just because they shouldn't have wanted to play with things in the first place. That's what they do, and that's often why they are good at doing things.
People do complain when they have too many rules that hinder their ability to do stuff effectively. I do see that happening, quite a lot; and I can understand why and relate to it.
Do they have the right to complain? Well, that's another story.
If the tool is closed source and they don't want to buy a license, then sure they can complain but they will just shrug it off as "that's the way it is" and move on.
I believe that things start to be more blurred when you have an open-source tool, which suddenly you cannot use (I'm not saying extend and sell commercially as closed source!) just because of FUD around licensing.
No, it means some people CHOOSE not to use it. There is nothing stopping them from deciding to use it, other than their own fears.
What the lawyers decide in a company that has 10k+ employees has nothing to do with the fears of anyone but the lawyers and the top management. Everyone else is just following the rules.
Working for one of those large companies, you simply don't have an option.
And the people without choices vastly outnumber the people who do have choices.
Frankly, most developers are probably happy using whatever code, under whatever license, that's available. It's the people who run the business who typically make the decisions about what's OK and what's not.
So it's developers who are punished by GPL, by and large, because many can't use the code. It's not helping those developers, and nothing zealots say or do will convince the legal department at those companies that they should change their position on (A)GPL.
Actually thinking of Amazon.com. They have competitors, sure. But they also have the budget to write their own entire stacks internally when the license doesn't fit. And it's not like their decision to avoid GPL software is hurting them in the market in any relevant way.
When something is MIT licensed, they'll use it, and they can and do contribute changes upstream. GPLv2 code requires serious hoop jumping to use, and GPLv3 software is verboten (the patent clause can't be adhered to by large companies with cross-licensing agreements: Many of these licensed patents can't be sublicensed, and so they can't be in compliance with the license at all).
So all GPL does is prevent companies from using and supporting the software. For every instance of an MIT project that ends up modified in proprietary code there are probably 100 that either use it verbatim or contribute changes back upstream. It just makes sense so that you don't need to keep maintaining an increasingly divergent fork.
And it would explain why, say, a gaming company can't have such rigid policies. Preventing a game from getting delayed is worth both lawyer time and, when a license directly conflicts with the business model, mailing the author to ask for an exception. For example, I recall that LGPLv3, which carries the same patent clause you describe as one "large companies can't adhere to", is used by Blizzard in StarCraft 2. Blizzard is not Google in size, but they're not exactly a street vendor either. One might also ask how much need they have to protect patents about XML parsing, or fonts, or whatever specialized functions the numerous libraries Blizzard uses to build a game provide. The only thing they don't use is copyleft, since their core business model is built around restricting copying in order to limit supply when selling copies.
Per-case evaluation makes sense when your product is time-sensitive and there is a lot of competition. A game from Blizzard is treated almost the same by the market as a game by an indie studio (keyword: almost), and as such they can't rely on market share to protect them. A bad, delayed, rushed game is still bad and won't sell regardless of who made it (to name an example, the last Batman game). They must be agile, which means religious thinking about software licenses must be thrown out and per-case evaluation added to the process. If a library can be used within the business model, saves time and money, and is not the core ingredient that makes your game stand out, it's almost always a good idea to use it.
It's the Google/Amazon/IBM/Apple size where you end up with endemic cross-licensing patent agreements. That's where they can't adhere to it: They have a license to a thousand patents that protects them from being sued, but they don't have the right to sublicense those same patents. (L)GPLv3 requires they sublicense any patents they have to anyone who's sued, IIRC, so since they can't, they lose the ability to distribute their software.
I'd be somewhat surprised if Blizzard were a party to such an agreement. They might be party to a "patent-troll-don't-sue-me" agreement, I guess? No idea. But if they are, depending on the terms of their agreement, someone could potentially sue them over it based on LGPLv3 terms.
I'm not planning to, so I haven't done the research, to be sure. The risk scenario that comes to mind is a stretch, but say I buy a Blizzard game with LGPLv3 code in it, which grants me the right to be protected against patent lawsuits relevant to that code, and then I ship my own game that uses the same LGPLv3 code -- and I'm sued by a patent troll over a patent that Blizzard has licensed through some agreement. Guess what? I can now demand that Blizzard protect my use of the same LGPLv3 code by sublicensing me that patent. Which they (probably) don't have the right to do. So they either pay for my license or they have to stop distributing their LGPLv3 code.
For Amazon, who distribute tons of code for people to use in AWS, and the fact that they're a much larger, more collectable company, the scenario is proportionally worse.
It's all beside the point, though: In no case is it the line-level developer making the call, it's someone in management. Not everyone can (or even wants to) work for Blizzard or equivalent.
In particular, not everyone wants to work at literally half the compensation or less just so they can have full software freedom, whatever that means. The Google/Amazon/Apple compensation can be that much better than start-ups for top developers. I didn't realize this myself until I got a job at one of them.
And I'd love to be able to use RethinkDB where ever I end up next, without having to worry about whether the company legal department has a problem with (A|L)GPLv3. It's a battle I wouldn't even bother taking on in most cases; too much work when I could use something else and get back to doing real work.
So in other words you're not going to continue to work on Horizon and Rethink, or at least if you do they are secondary priorities? Ugh.
Slava did the right thing; his first responsibility is to his employees, not his users, but this really sucks for us users.
For us, it would probably have been better to shut down abruptly. The community would have scrambled. Some employees would find work in companies that used RethinkDB, and maintenance would have continued the same way that Apache was developed ~20 years ago. Some would have grabbed support contracts from users. And some would have moved on.
This isn't meant to be harsh, but these are times to learn, not simply pat each other on the back.
I don't know how this could have been fixed though. Databases are hard to develop and it's a tough market to crack. Enterprises aren't going to buy an incomplete product, especially not a database, which is arguably the most critical component in the entire stack. Anyone investing would have to know that this would be a years-long effort to build a solid product before the first dollar.
Perhaps RethinkDB could have shortened their development time by not pivoting around (originally SSD optimizations, later realtime. also Horizon is a bit of a mini-pivot), but I don't know by how much.
- bad performance
- a dearth of types: literally only a single numeric type, a 64-bit float, which eliminates entire categories of applications that rely on integer/fixed-precision exact arithmetic. Time series are seriously hurt by that decision too; I've seen DBs have to move to 64-bit longs because of that issue alone. Having a pseudo-type layered on top just ties up CPU cycles in the encodes/decodes that need to happen. (Plus milliseconds aren't enough in a nanosecond world now.)
This whole "let's do everything with as few types as possible" I hope is a fad that will just die quickly in the DB world.
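The float-only pain point is easy to demonstrate with plain Node.js, since JS numbers are exactly the IEEE-754 doubles in question:

```javascript
// JS numbers are IEEE-754 doubles, the same representation a float-only
// database uses for every number: integers are exact only up to 2^53.
const maxSafe = Number.MAX_SAFE_INTEGER;   // 2^53 - 1
console.log(maxSafe + 1 === maxSafe + 2);  // true: two distinct integers collapse
console.log(0.1 + 0.2 === 0.3);            // false: classic decimal rounding error
```

Any ID, counter, or fixed-precision money amount above 2^53 silently degrades this way, which is why stores with only a double type end up layering string or pseudo-int encodings on top.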
This is with about 20 minutes of looking around and just listing a few things. It looks like RethinkDB was written by web devs for web devs and I don't think that can compete in the database world. You might get the web devs onboard, but the database and HPC people at the firm will probably look at it and go "how quaint when can we get Oracle/Postgres/Informix/etc up and running?"
Functionally, to me, RethinkDB was perfect. ReQL is one of the best query languages ever - It's so easy to learn and remember.
I haven't operated RethinkDB at scale in production (so I can't say much about performance) but I was pretty impressed with its scalability features during testing.
I'm really more leaning towards the idea that this is purely a monetization failure. They've been going at it for 7 years - There must be a good reason why investors kept it going for so long - I think it's because of the product.
I think they did identify a good monetization strategy in the end but maybe it was too late - They dug themselves into a niche that had great long term growth potential but they didn't have the resources to wait it out any longer.
Giving me floats but not ints just doesn't cut it. It works, in a kind of shoddy way, but … it's tasteless.
If you don't provide me with bitwise operations (earlier versions of Lua, I'm looking at you), then you don't get to call yourself a real language.
For a database, though, I suppose one could always store integers as their string representation. But please, no language should ever do this again.
It's just as bad an idea now as it was then.
Databases are about data after all. Why not have a rich way to describe it?
- a 64-bit float
- a 64-bit int
everything else can be emulated in code from there and you can play all the encoding games you want to save space when you don't have to use it. This covers 99.999% of the use cases you reasonably see.
From there, I would argue, you should also have:
- a bit and byte
- 32-bit ints
- 32-bit float
- 128-bit int (once you add UUIDs you might as well make them full numeric citizens).
If I were doing a modern database, I'd have the kitchen sink of types: full numerics from 8 to 128 bits, both signed and unsigned ints, and all supported hardware float sizes. I'd probably even have a 512-bit AVX type just to see what people would use it for.
64 bit is nicer than 32 bit, but it's scarcely more necessary than 128 bit numbers.
Like I said, read NEED as: you can certainly hobble around without it (pick an appropriate offset and read your 53 bits in relation to it), but like a fracture in a leg, it's still considered broken.
If you want a good timestamp you probably want a timespec, and that won't fit in 64 bits anyway.
I don't see why it's easy to make 128+ bit types be composite, but you can't have 64 bit types be composite.
Your database driver can take care of how the bits are packed on the wire. It really shouldn't be a big concern.
You seem a bit worked up, and it's hard not to read your comments in this thread in a tone which suggests you feel like you've been robbed of something.
These people don't owe us anything. When a company decides to open up and publish a postmortem it's a great and wonderful thing, but it's in no way obligatory. Even if their technology doesn't hold a candle to your expectations or the market's needs, it's still not without merit. If you want to learn, they've already given plenty for you to learn from. Rather than demanding more from them, be thankful for what they have already contributed.
And yet here we are with a failed product. I think it's OK to ask why and consider its shortfallings. Obviously there was something, and I don't think it's fair to completely coast over the technology side of things and blame it on "marketing".
I think RethinkDB's marketing was excellent. The unchecked praise you mention is likely in no small part also a result of this.
However I don't think that means the technology was the problem. There are far more broken and misarchitected pieces of technology that are financially successful.
What exactly went wrong is speculative until we see the insights Slava promised, but it seems the failure was entirely financial: they had a great product, they brilliantly marketed it, but that didn't translate into sustainable revenue.
What scenarios were you finding bad performance in?
I was finding I could get performance comparable to MySQL for straight by-key CRUD activity, which is all I'd use a document store for (particularly once you take into account MySQL replicating to multiple nodes and the related failover setup).
> This is with about 20 minutes of looking around and just listing a few things. It looks like RethinkDB was written by web devs for web devs and I don't think that can compete in the database world. You might get the web devs onboard, but the database and HPC people at the firm will probably look at it and go "how quaint when can we get Oracle/Postgres/Informix/etc up and running?"
Have you ever tried to fully automate the failover/replication/etc for those? It is a substantial pain point that usually requires an experienced DB person with that database to do it well.
There are plenty of CRUD datasets where RethinkDB would be good enough and wouldn't require the dedicated ops people. That is the pain point I'd say RethinkDB was focused on solving.
I'd say the real reason for the failure is more on the business-model end: the intersection of orgs without dedicated ops people AND willing to pay for support was simply too small. RethinkDB was built (imo) for small organizations that simply didn't have a dedicated ops team and were just 5-10 developers plus marketing/business.
That could just be how I feel because it's the situation I'm in (fewer than 10 IT staff supporting a 200+ person org, and our sysadmins deal almost entirely with end-user issues).
The "it surely can't be the technology!" comment when yes, Shirley, it really can be. Even the best database systems I know (orders of magnitude quantitatively better than RethinkDB) have areas of technical weakness. And not just some areas, but usually gaping valleys. Nothing does everything well, and anything that tries usually does everything below average.
Time to man up. Performance really was crappy? Having a float as your ONLY number type really was kind of dumb? Having no native datetime was bad? Saddling timezones onto the pseudo-datetime was a brainfart? Milliseconds only was shortsighted? Horizon was a bad idea too soon? Document databases are the new object databases - forever 5 years away? Aggregates were poorly done? Nobody really knew how to do stationary time series properly (I read through the GitHub issues on this and it was laughable that these people called themselves database experts)? And on and on.
I'm not saying all of this is true. I'm saying let's be straight and not always blame sales and marketing or dumb consumers or C-level execs or everybody but our own crappy code.
Literally no users, and almost no employees at the vendors, I would bet (if my experience is not unique) have any idea of the qualities of the software they use/peddle, its semantics or guarantees, or its performance characteristics or how to use it correctly.
This makes the technology secondary; what is most important is how well you carve out mindshare. Whether that is even primarily down to "good" marketing is something I doubt. In the modern saturated database market, it is entirely unclear to me how you win sufficient mindshare and trust to obtain a wide enough (paying) userbase.
I have no idea if RethinkDB was "good" or not, as I did not take the time to investigate; I have little interest in "document stores". It actually seemed to me they had some pretty solid engineering foundations; the kind of thing that the industry should value highly, but due to poor information is unable to price into their purchasing decision, and as a result is missing in many modern (especially OSS) database offerings today.
I dislike cassandra immensely, but I still use it every day.
I feel your pain. There are very few robust database systems that handle recovery well enough at scale, and for those without the deep pockets to keep a DBA on hand there are very few choices.
For small/mid-scale robust systems, CockroachDB is promising but still needs a couple of years of development, if it can survive without a monetization strategy long enough.
Cassandra is at least the devil I know, although it seems to find a new way to fail me every six months or so.
The dearth of types can be good or bad depending on your use case. I agree that for some use cases it's a bad thing.
I like many things about SQL and still prefer it to RethinkDB in some ways, but there are no SQL databases that offer replication or HA failover configurations that don't require incredible time investments to configure correctly. This includes commercial DBs as far as I can tell. If you're too small to have a DBA, HA SQL is out... unless you use a somewhat expensive and also restrictive cloud-managed solution. The latter would have been our second choice behind RethinkDB.
When Horizon was announced, that's when I stopped even considering RethinkDB for new projects (even if it was a good fit, feature-wise). Call it a "business smell", but I'm not surprised to hear they've wound the business down.
Communities are supposed to be supportive, too. Pity this thread is the only time I've seen that on HN as of late!
They mentioned that they will be releasing more post-mortems later. For now, I'm comfortable with supporting them and hearing about their learning later.
The RethinkDB team built an amazing product that many people appreciated - as noted by the comments here- and for that they have my praise and admiration. The fact that the product will continue without the business is also noteworthy. Well done all
I guess those other document-oriented DBMSs just had better early-stage shilling to get funding before RethinkDB did, and by the time Rethink shopped around there were already enough competitors that investment in document-oriented DBMSs had dried up.
The current state of the art is an in-memory DBMS that takes advantage of soon-to-be-available (2017) non-volatile memory. CMU is developing such a DBMS http://pelotondb.org/ and there are also MemSQL and SAP HANA.
...but the licence is 250k dollars a year so big fail for startups.
E.g. I've worked at more than one startup where the first angel investment wasn't on the table until 6-12 months into development, and where that first round in some cases was below $250k.
On top of that you also have the issue of finding someone who knows it, and the associated staffing risk that comes with that (yes, I'm sure you can always find someone, but at what price? There are places I can go where I couldn't throw a stone in any direction without hitting someone who "knows" Postgres or MySQL or both sufficiently well to be an acceptable tradeoff).
In many startups the tech choices end up being made not just based on what fits and what is affordable, but also based on what you can find affordable people to work with (including e.g. co-founders or other people willing to do initial work for equity) - sometimes that can lead to niche tech getting used. But far more often it means picking from a small set of the most common alternatives.
I recommend you take one of your ideas and sketch out a back-of-the-napkin first-year plan. I bet it doesn't include spending 250k of investors' seed money on a database.
Startups are hard (I've failed a couple of times and had it work a couple of times - definitely not always through my own effort).
My current view on this is that RethinkDB didn't rethink enough. They solved a problem without much money involved, and one that was too small. They might be great devs, but they just didn't solve a problem that needed to be solved.
I think they needed to develop just one feature that couldn't be done by Oracle, and that would have been that.
The other problem could be that NoSQL hype just died out :(
Because nobody used it.
This shutdown therefore goes a long way toward showing how talented and ethically upright the team was, something extremely evident in how they put correctness and reliability ahead of performance.
In short, RethinkDB is a very solid piece of software that does well where (many) other NoSQLs fall short, that is:
* easy HA and automatic failover
* easy auto sharding
* rational query language
* ease of administration and awareness (webUI)
* realtime capabilities
* performs well on the Jepsen test!
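The realtime capability mentioned above is RethinkDB's changefeeds: instead of polling, clients subscribe and get pushed `old_val`/`new_val` pairs on every write. A toy in-memory sketch of the idea (not the real driver, which does this server-side over a cursor):

```javascript
// Minimal push-based changefeed: every write notifies subscribers with
// the old and new versions of the document, instead of clients polling.
const table = new Map();
const subscribers = [];
const changes = cb => subscribers.push(cb);

function upsert(id, doc) {
  const old_val = table.has(id) ? table.get(id) : null;
  table.set(id, doc);
  subscribers.forEach(cb => cb({ old_val, new_val: doc }));
}

const seen = [];
changes(c => seen.push(c));
upsert(1, { score: 10 });
upsert(1, { score: 15 });
console.log(seen.length);      // 2
console.log(seen[1].old_val);  // { score: 10 }
```

The `{ old_val, new_val }` shape mirrors what RethinkDB's actual `changes()` command delivers; everything else here is a simplified stand-in.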
- lack of mass adoption - they didn't solve a problem everyone had
- lack of niche adoption - they didn't solve a problem a few people had badly
I can tell you 4 features that I view as more important than change subscriptions:
- can run in any ecosystem - pouchdb/couchdb and waterline are examples of single APIs in different environments. Even being able to use the database 100% in memory as a redis alternative would be nice.
- supports transactions across shards natively - this is a difficult but important feature that mongodb is missing
- supports per-document and per-table transactions - if I'm only updating two documents, I have no interest in making many financial transactions wait for a whole table to be free when I only care about 2 documents in that table.
- "sync" feature on a table - pouchdb supports this for syncing a client with a server
RethinkDB failed because, despite its sweet name, it wasn't solving problems that were all that important.
Going by the public information on their website, the only way the company was making money from its product was via support and training.
The problem with making money only on training and support is that it falls flat when the product and documentation are so well done that barely anyone considers these things important enough to spend money on.
Additionally the website doesn't really provide much information on these services other than a contact form. If you're not already in touch with RethinkDB devs, you're probably more likely to ask a local contractor or specialised training company for help instead of reaching out to RethinkDB.
By all appearances, RethinkDB has been pretty successful as an open source project. It has a strong and active community, it is well represented at meetups, conferences and podcasts, it has 16k stars on GitHub (MongoDB has only 10k, MariaDB and MySQL have only 1k each).
But the company was not able to turn any of those numbers into revenue. Downloads, GitHub stars and Twitter mentions don't pay to keep the lights on. A successful open source project does not equal a successful commercial enterprise.
In all seriousness the OP nailed the issue. The company is not successful because of the "issues" with the product. There is a very direct correlation there, stop trying to make it look like the failure of the company did not have anything to do with the product and that the product is perfect/popular/whatever.
Yes, RethinkDB wasn't more popular than MongoDB or MariaDB. Of course not. But if you followed the JS ecosystem or Hacker News or podcasts or any of the usual suspects it would have seemed that RethinkDB was quite successful for its size.
People talked about RethinkDB. Whenever someone talked about realtime or streaming data, RethinkDB would come up. That RethinkDB felt so ubiquitous despite its low rank in Google trends or even on StackOverflow speaks to how well the team knew how to make themselves heard.
I'm not saying RethinkDB itself is without fault. The most overhyped feature is realtime change feeds, and those don't even work for all types of queries (aggregates and joins in particular just aren't supported). Based on the documentation, it also doesn't really sound like change feeds can work at scale.
But RethinkDB certainly managed to catch people's attention. They just couldn't turn that attention into revenue. And considering you actually have to go out of your way to find out how to give them money when looking at their website, that's not surprising.
Even if RethinkDB had been the best thing since sliced bread, it seems like they would have still failed as a company because of this. To put it into online marketing terms: they certainly had the traffic, but they couldn't get any conversions.
Unfortunately, popularity doesn't imply merit and merit doesn't imply popularity.
* How does it position itself against the CAP theorem?
* Will it ever lose my data during normal operation? (i.e. what are the consistency guarantees)
* What happens when 1...N nodes go down?
* What are the requirements for a well performing cluster?
* What are the limitations in terms of document size, number of documents, etc.?
However, certain operations on a single key, like incrementing a counter, are atomic.
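The classic failure that per-key atomic operations avoid is the lost update. A minimal illustration of two clients doing read-modify-write against the same key:

```javascript
// Two clients each read the counter, then write value+1 back:
// without server-side atomicity, one increment is silently lost.
let stored = 0;
const readA = stored;      // client A reads 0
const readB = stored;      // client B reads 0, before A writes back
stored = readA + 1;        // A writes 1
stored = readB + 1;        // B writes 1, clobbering A's increment
console.log(stored);       // 1, not the expected 2
```

An atomic increment executed on the server (rather than in the client) makes the read and write a single indivisible step, so interleavings like this can't occur.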
Or y'know, you just don't do them correctly and accept a substantial rate of quietly corrupted data.
There isn't really a good option unless you have absurd amounts of money compared to your transaction volume.
Very, very painful problems are very, very valuable.
Google falls under the "absurd amount of money" camp.
> Protocol Buffers have performance implications for query processing. First, we always have to fetch entire Protocol Buffer columns from Spanner, even when we are only interested in a small subset of fields.
There are a number of performance costs documented in the paper. Simply because you can solve it via money doesn't mean it works for all situations.
Make a private (non-OSS) version optimized for business (B2B) use, do external consulting work, and ask for donations from the big players/users. This is a common problem for almost every OSS project, and so far the only one doing this is OpenSSL.
1) A stable "enterprise grade" version which receives fixes and minor performance updates for at least 5 years, maybe even charging extra for another 2-5 year round (just like Microsoft does).
2) Reports: businesses love reports. A way to generate reports straight from the DB (like Oracle and SQL Server) is a killer feature most of the time.
3) Customer Support.
It's clear to me that Rethink is the model for future databases - it's just that DBs have a long gestation, as no one wants to risk their data until the code is aged like a fine wine. It's an important long-term technology play - just the kind we need to improve things for all of us.
In two or three years I think they would have been making money; I think this is a failure of capitalism, or of imagination in our HN/SV community.
To those several of us who have the power to write a check, please consider doing so; Rethink has been relentlessly building the future [ and you will make money ].
As a company choosing open source technologies, the popularity is important. Otherwise you run into issues finding answers to common problems and you end up paying through the nose for support.
Presumably, that's in part why CockroachDB decided to be more-or-less Postgres-compatible on the wire, and embraced SQL.
I was about to post that a quick grep of the CockroachDB source turned up no references to postgres, libpq, or anything obvious to back up that claim, but then I noticed my workspace was out of date. After a git pull/git grep, I now see that you are correct: commits adding postgres-compatible parsing and the wire protocol appear to have landed late last year. Why wasn't I informed! So thank you for this - time for me to take another look at it.
My initial thought is that MongoDB has done a way, way better job at SEO. The number of blog posts about RethinkDB pales in comparison to Mongo. I wonder if they got beat on sales as well? Not sure.
I wouldn't be surprised if people a) didn't use Rethink because they're happy with Mongo and its popularity b) hated Mongo and saw Rethink as too similar/didn't research it enough.
Went with Postgres for all three.
When Windows Phone 7 came out, without background tasks, MS was quick to point out that the iPhone didn't originally have them either. It was a silly argument; neither Apple nor Google was selling phones without that feature at that time.
I'm not saying Rethink did anything wrong. And by bringing up MS, I don't want anyone to think Rethink had the same sense of entitlement as Microsoft.
Your v1 has to compete against your competitors current version, not their 5-year old v1. RethinkDB had a lot of powerful features, but it also lacked a lot of features that for me, and I have to suspect many others, made it an impossible choice.
(Also, I hated the query syntax. I was willing to fight through it, but I often wondered if that was turning people off)
My situation was very similar to latch's above. I vetted it earlier this year, after having developed warm fuzzy feelings for it last year at another company. We wound up going with Postgres + Solr because A) we've used them before, B) they performed a lot better than Rethink, and C) Rethink's compensating features (distribution) weren't worth the tradeoff.
I thought Rethink seemed like Mongo done right. Both have a simple document storage model. Rethink embraced "relations" a lot better, and seemed interested in borrowing some good ideas from relational theory. Having a uniform query language was a good idea; the "builder" style was an odd choice but whatever. Where Mongo ignored and failed to address Aphyr's concerns directly (and antirez seemed to emit a cloud of interminable semantics debates) Rethink actually reached out to Aphyr for testing and rapidly took action based on his recommendations. They accepted the criticism readily and went to some effort to be transparent about what they could and could not do. Failover was not a 1.0 feature, so they didn't hose it totally. The admin interface was beautiful. You really felt like you had a simple, powerful tool that you could understand.
Performance wasn't great. This was the showstopper for me. But to be honest, I probably wasn't going to give them a dollar even if it was a lot better. I'm not paying for Postgres or Solr either. I don't know how you bootstrap a database business. The Mongo/MySQL approach of "make garbage, monetize, hope someday you can refactor it into shape" looks obviously wrong stacked up against Postgres, which always took the academic approach of "first make it right, then make it perform." But performance is a feature and it takes a long time to make a database right, let alone perform. I think they took the right approach. It's just a long process, and it may not be compatible with startup culture.
(via this HN submission: https://news.ycombinator.com/item?id=12650033)
Perhaps this is what RethinkDB's strategy was all along with targeting real-time applications. I wonder if they were just... too early?
Given that the market for document DBMS is pretty small to begin with it's tough to beat a 25x difference in budget unless you have a product that is a complete game-changer. It will be interesting to see Slava's analysis. Meanwhile I wish the team and RethinkDB users well. DBMS start-ups are damn tough to pull off.
RethinkDB, 2009-11 = $4.2MM
MongoDB, 2008-2009 = $4.9MM
It's true that MongoDB quickly closed on a $6.5MM Series C in Dec of 2010, one year after their Series B. For Rethink, it took them another 2 years to get the last $8MM. There's no question that MongoDB was "better" at raising more money, faster.
I'm just thinking out loud here, but I wonder if MongoDB made a conscious tradeoff to sacrifice good tech to have better marketing & sales early on.
Crunchbase links here:
This was real technology! I'm truly sad that the environment is such that great work like this can't continue to be funded.
Thanks for showing everyone how to write amazing documentation, caring about the fundamentals, and for the incredibly snazzy admin panel.
Thanks for showing everyone how to write amazing documentation
Thank you! (Really. I'm RethinkDB's documentation writer/editor.)
Really, you did have fantastic documentation, as the original said. Kudos.
Do you have any guides or books you recommend for writing good documentation? What software did you use to produce Rethink?
Sorry to see Rethink go.
For books/guides, it's hard to say. Find documentation you like a lot and think about what makes it good--the organization/taxonomy is really important to pay attention to, as well as the tone (formal, conversational, weird, etc.). As shocking as this might be around here, the _Microsoft Manual of Style_ is useful as a specifically technical guide, and it's good to have a relatively recent edition of the _Chicago Manual of Style_ kicking around as a reference. And just being familiar with standard grammar and punctuation rules is important. A lot of people aren't. (A lot of people who aren't still think they are.)
Sales are essential for a for-profit company.
Unfortunately, after a short while I stopped hearing from them. It doesn't look as though they ever brought in any marketing help since then.
IMO, no. Sales and marketing are pretty different things.
Marketing isn't "sales without people", and sales isn't "marketing done in person". They target different problems, and you may need one, the other, or both to be convincing, depending on the product or service.
The next step is marketing. Unless your target market is so tiny you or your sales team can do a direct outreach.
Gaining users and customers is extremely difficult, often unpredictable, and potentially expensive. While a bad product/technology/service can set you back, there is no guarantee the best will bring you success. Users can be unpredictable, and the adoption curve for anything new is long.
Sometimes marketing happens earlier, in the market identification and product building stage, so you know the market you are going after, the gaps, the segments, and the value proposition to customers or users. This is product strategy. You can completely botch this step and expend effort on a product/service that is not clearly differentiated and has no market case.
B2C (business to consumer) marketing typically uses large budgets to reach a wide potential customer base through traditional advertising channels, and now online and social channels. Even 'lucking' out with hype, viral marketing, and rapid adoption requires someone or a team to fuel that hype.
B2B (business to business) marketing is more hands-on and about building a predictable pipeline for your sales team to target and close. In some B2B businesses the sales cycle can be hugely long and expensive, with the cost per lead and opportunity measured in thousands of dollars.
Both are done to sell more stuff.
IMO marketing is all about getting a good fit between what the company sells and what the market outside the company wants. It could be creating demand, so that the market changes to fit what the company creates; or it could be shaping the company's offerings to what the market is asking for; or it could be advertising, to make the market more aware of how the company's current offerings are a good solution for what the market is asking for. One way or another, it's about bridging the gap.
Sales is the process of completing profitable transactions. It's typically high-touch, one to one, and involves taking prospects, qualifying them, figuring out what they need and how the company's current offerings solve that problem in a personalized way, and managing the process to completion, and potentially further to upselling in the future. It's a pipelined process and is ultimately a numbers game, where the odds of proceeding to completion depend on both the value proposition and the salesperson's skill in communicating it.
And I do think that strategy was working to some degree; at least on channels like HN where we frequently saw posts related to RethinkDB.
Guess what - database vendors like Oracle, MongoDB (to a much smaller extent) that get a bad rep on HN actually sign deals for real money.
And we see how that worked out. Pricing is an important component of marketing, but it's not a marketing strategy in itself.
Why are we assuming that everything has to be done by companies? What if we establish a National Science Foundation for fundamental CS research?
This isn't that difficult of a question.
1) Universities get money from the students who pay tuition and fees for their education.
2) Universities get money from wealthy benefactors who donate their money to a cause they believe in.
3) Universities get money from for-profit companies that have sufficient commercial success to invest into research and such for what they hope will be later commercial benefit.
4) Universities get money from people who confiscated it, under threat of force, from other people (who might not have given it otherwise) and give it to a cause they believe in.
Not necessarily in that order, by whatever measure you may choose. Note that in no case did the wealth magically just appear. The wealth to consume was produced somewhere, by someone.
Sounds like you're proposing 4, since you imply that the "we" that is not the companies won't be the ones creating the wealth you want to use for your cause.
Keep this in mind when you invest in a certain technology: some organizations, especially nonprofits (for example, the Apache Software Foundation, Python Software Foundation, the new Node Foundation) are probably going to support and develop their software for extended periods of time relative to, say, a startup or for-profit (Parse, MongoDB and RethinkDB immediately come to mind).
Only certain super-behemoths like Microsoft or Apple can afford to have their infrastructure products being closed-source.
I am all for open source, but it doesn't make things like this as easy as some make it out to be. For example, with the Linux source, how long would it take me to fix a video driver bug? Perhaps a year?
Therefore, if you're the one building the database, open-sourcing it de-risks your product to your potential customers, to some extent.
Of course closed source products can also be collaborations, but open source + open development practices can make this a smooth and natural result.
I think that, more important than whether or not the DBMS is open source, you should consider how critical the DB is. If you can't live without it, stick with a DB that you can be virtually 100% sure isn't going anywhere soon. Save your experimentation with new products for the small and less important stuff, and make sure the small stuff doesn't reach some critical mass before the DB vendor does.
(Full disclosure: I am one of the people who could have been qualified; one of the jobs I was considering a few years back was a graphics-driver-hacker position at Red Hat. I happened to instead choose full-time employment doing various, mostly userspace Linux stuff for a startup. When they ended their incredible journey, I would have loved to make a living taking contracts like the ones you suggest - and I still wish I could - but there aren't enough of them and they aren't reliable enough to make it better than just taking a full-time job with a profitable company.)
But that assumes there is a ready supply of people that can fix a certain part, and want to take contracting work. For example, let's say I have an ATI XYZ in my laptop. It has a driver bug and won't work. ATI driver is open source. Who do I hire to fix it?
(Note that I am carefully using the phrase "open source," not "free software". Part of the free software ethos is that you can realistically modify your code as needed. I do happen to think that the current free software movement isn't very good at delivering on this promise.)
The way this is dealt with for closed-source software is source-code escrow. The support contracts stipulate that the company that built the software set aside the code with some stable third party, and that it be available under specified conditions, such as the builders going out of business.
It's not as good as having the source freely available, but then it's dealing with an extreme contingency, anyway. Very, very few users of a piece of software have the expertise to build it from source and then debug it if they run into problems. So they're probably screwed either way if the builders go belly up.
> Only certain super-behemoths like Microsoft or Apple can afford to have their infrastructure products being closed-source.
The alternative is to buy truly critical software only from companies you trust a lot. XYZ Valley Startup is unlikely to be around in 20 years, but Oracle almost certainly will, as will Microsoft.
If a project is open source then who is committed to helping you support and maintain it? You essentially need to have a team to do it.
If there's a company behind it, it gives you assurances.
Especially for a small company that needs to focus on building a product rather than committing features and bug fixes to their database
I was able to sell our company on Cassandra because there is a Datastax there to support it when the shit hits the fan.
I assume you mean they have $2bn in annual revenue?
Later on, when I was working on an NLP startup earlier this year, I opted to use RethinkDB because I had seen how clean, smooth, and fast its internals were. When I had a hiccup running a cluster in the cloud and tweeted about it, Mike and others from the RethinkDB team instantly reached out to me and helped me resolve the issue.
The main reason MongoDB is popular in the Node community is that both reached peak hype around the same time. So it made perfect sense to bundle both of the "hottest" technologies together. Except by the point Angular 1 reached peak hype MongoDB was already facing criticism, so MEAN mostly seems to bask in its afterglow.
If you had asked me last month, my prediction would have been a RethinkDB+Node+React (or RethinkDB+Node+Angular2) stack popping up to challenge MEAN, though the RethinkDB+Node bit seems to have already been addressed by Rethink's own Horizon.
They should just switch to ES clearly.
Other popular contenders are similar "full stack" or "full featured" solutions like Sails, Meteor and Total.js.
This is generally a result of beginners to Node trying to find something that does everything instead of spending some time learning the basics with a library like Express and going on from there.
As far as the stack itself goes, it's wonderful.
TERP Stack: the most adventurous stack - Tornado, RethinkDB, Emberjs, Python.
In most enterprises around 2/3 of the data in the lake is unstructured and comes from multiple systems we don't control or have visibility into. So it just gets dumped into HDFS until someone creates the relevant HCatalog schemas. Having a database like MongoDB at least allows you to extract and access some of the data in an automated fashion e.g. JSON reflection. It's especially useful for adhoc and lightweight model scoring and feature engineering.
EDIT: We'll just wait for Mongo to burn through all their cash then.
Mongo has a strong community, which translates into a plethora of articles/blog-posts/tutorials/etc that secures its niche. From there it's possible to make money with things like consulting fees to help fix the problems it created in the first place.
It's all about execution. Case in point: I know what (purportedly) makes MongoDB special. I have no idea what RethinkDB is supposed to do, other than store data. Yes, I could check their site, but the point is I'm already familiar with Mongo's reputation. All of this contributes to a lesser product winning over a better one.
Half the crap people say about Mongo seems to be from either the early days, or from people who just like to complain. It certainly has its issues, but there doesn't seem to be much community will to replace it with $superdupernosqldb.
We run Mongo side by side with PG, and even though the PG is just replicated and not also sharded like our Mongo cluster is, Mongo was easier to configure.
I think MongoDB's biggest problem is that it was marketed as the general purpose DB in the Node.js community when actually it is a specialised tool. So now you have thousands of developers wondering why they're on this weirdly behaving document store when they could've scaled fine on something more generic like PG.
More on topic: RethinkDB was aiming to be the best of both worlds, generic like PG yet document oriented and horizontal scaling like Mongo. I think their problem is that they didn't manage to pierce the communities. It would have gone better if they had evangelists for every major platform that just produced a stream of blog posts, small open source projects and tools and crashed random meet ups.
Oh, yeah... and the ability to do sharding + redundancy together, whereas with Mongo it's either/or, or you have to set up both separately. So scaling works a bit better. The admin tooling for Rethink is better than anything I've really used among non-SQL databases. And frankly even better than most DB admin tools, including SQL-based ones.
It's just a really great, stable database with really good scaling for the majority of use cases.
But please enlighten us poor fools which open source databases can compete with MongoDB in this space. It sure as hell isn't PostgreSQL.
In Big Data specifically when we are spending tens of millions on Hadoop/Spark/NVidia clusters buying a few extra licenses for MongoDB is nothing. And what makes Big Data even more compelling for database vendors is that often by law or for privacy reasons we are forced to run everything inhouse i.e. strictly no cloud.
I'm not a Windows fan, but that's harsh. http://cryto.net/~joepie91/blog/2015/07/19/why-you-should-ne...
Most of the points are about issues that were fixed years ago, and then you get classics like "forces the poor habit of implicit schemas in nearly all usecases", as if some developer is going to somehow fail to notice that it's a schemaless database.
It's like trying to say "Linux is shit" because of issues found in 1.0 of the kernel.
It's made a massive difference to performance and stability.
This is the JIRA tracking that Jepsen stale read issue: