The problem wasn't that we (and presumably others) didn't plan for open-core/cloud. We did, but there are structural problems in the market that prevent this from working.
Open-core didn't work because the space is so crowded with high-quality options that you have to give away an enormous amount of functionality for free to get adoption. Given how complex distributed database products are, by the time you get to building a commercial edition you're many years in and very short on cash.
Cloud didn't work because AWS/GCloud have enormous moats of pricing and brand recognition. They drive margins down to epsilon, and if your product sees meaningful adoption in the industry they launch their own service and take all your customers.
Personally I believe the latter. I don't see how they're going to get people to pay for Cockroach with all the options that are already out there. I think they might have even more difficulty than RethinkDB did because their interface is SQL, which means that migrating to things like Postgres or RDS is a lot easier.
Contrast CockroachDB with some of the other nascent open-source-to-successful-businesses:
Influx - Solves time-series application challenges. (biz problem)
Citus - Solves a scale out technology problem with an already established DB platform. (tech problem)
Confluent - Solves scaling issues with an already established data streaming database. (tech problem)
Cloudera - Solves scaling issues with an already established big data platform. (tech problem)
Elastic - Solves search application challenges. (biz problem)
Cockroach - Solves scaling issues with SQL databases by offering an alternative DB. (??)
Unless Cockroach can position itself as either solving a technology challenge with an already adopted DB solution or solving a business problem, it's going to be very difficult to achieve profitability.
Dunno what that anecdote adds, but... that was my experience.
I've been working on a series of blog posts since October around the subject of databases and their future, and one post was intended to be my thoughts on RethinkDB. The leak of your postmortem that was revealed on Tuesday has made me reconsider releasing it, but your comment above makes me feel obligated to share a few thoughts.
1. The database market is NOT closed. In fact, we are in a database boom. Since 2009 (the year RethinkDB was founded), there have been over 100 production grade databases released in the market. These span document stores, Key/Value, time series, MPP, relational, in-memory, and the ever increasing "multi model databases."
2. Since 2009, over $600 MILLION (publicly announced) has been invested in these database companies (RethinkDB represents $12.2M, or about 2%). That's aside from money invested in the bigger established databases.
3. Almost all of the companies that have raised funding in this period generate revenue from one or more of the following areas:
a) exclusive hosting (meaning AWS et al. do not offer this product)
b) multi-node/cluster support
c) product enhancements
d) enterprise support
Looking at each of the above revenue paths as executed by RethinkDB:
a) RethinkDB never offered a hosted solution. Compose offered a hosted solution in October of 2014.
b) RethinkDB didn't support true high availability until the 2.1 release in August 2015. It was released as open source and to my knowledge was not monetized.
c/d) I've heard that an enterprise version of RethinkDB was offered near the end. Enterprise Support is, empirically, a bad approach for a venture backed company. I don't know that RethinkDB ever took this avenue seriously. Correct me if I am wrong.
A model that is not popular among RECENT databases but is popular among traditional databases is a standard licensing model (e.g. Oracle, Microsoft SQL Server). Even these are becoming rarer with the advent of (a), exclusive hosting, but never underestimate the licensing market.
Again, this is complete conjecture, but I believe RethinkDB failed for a few reasons:
1) not pursuing one of the above revenue models early enough. This has serious effects on the order of the feature enhancements (for instance, the HA released in 2015 could have been released earlier at a premium or to help facilitate a hosted solution).
2) incorrect priority of enhancements:
2a) general database performance never reached the point it needed to. RethinkDB struggled with both write and read performance well into 2015. There was no clear value add in this area compared to many write or read focused databases released around this time.
2b) lack of (proper) High Availability for too long.
2c) ReQL was not necessary - most developers use ORMs when interacting with SQL. When you venture into analytical queries, we actually seem to make great effort to provide SQL: look at the number of projects or companies that exist to bring SQL to databases and filesystems that don't support it (Hive, Pig, Slam Data, etc).
2d) push notifications. This has not been demonstrated to be a clear market need yet. There are a small handful of companies that are promoting development stacks around this, but no database company is doing the same.
2e) lack of focus. What was RethinkDB REALLY good at? It pushed ReQL and joins at first, but it lacked HA until 2015 and struggled with high write or read loads into 2015. It then started to focus on real-time notifications. Again, there just aren't many databases focusing on these areas.
My final thought is that RethinkDB didn't raise enough capital. Perhaps this is because of previous points, but without capital, the above can't be corrected. RethinkDB actually raised far less money than basically any other venture backed company in this space during this time.
Again, I've never run a database company so my thoughts are just from an outsider. However, I am the founder of a company that provides database integration products so I monitor this industry like a hawk. I simply don't agree that the database market has been "captured."
I expect to see even bigger growth in databases in the future. I'm happy to share my thoughts about what types of databases are working and where the market needs solutions. Additionally, companies are increasingly relying on third-party cloud services for data they previously captured themselves. Anything from payment processing to order fulfillment to traffic analytics is now being handled by someone else.
Have you examined emerging databases like Tarantool https://tarantool.org/, GunDB http://gundb.io, TiDB https://github.com/pingcap/tidb, ClickHouse https://clickhouse.yandex/ ?
It would be great to read some deep and independent analysis of them too.
We have been working on general-purpose resharding for over 3 years, but have yet to release it to the open source community: it's very hard to do it well.
But our customers get a sharding scheme that best suits their business needs, including fully automatic shard management and data re-balancing. I submitted a talk about the technology and know-how behind this to Percona Live 2017: https://www.percona.com/live/17/sessions/best-practices-appl...
This means that even with a low switching cost they still haven't created that burning desire for people to adopt the product and that's generally the harder part of the equation. I do think Cockroach is creating that in other ways with their Geo replication and sharding capabilities to name a few. But no one is switching to Cockroach because it's SQL so why not. However we know they're creating a burning desire for people to switch off their product by charging them money since people would always rather not spend money. The switching cost has to act as the counterbalance to that desire. The lower the switching cost the less you'll be able to charge people. This can be a very hard thing to solve after the fact and companies resort to all sorts of contrived things to try to get people locked in to products that don't inherently have strong lock-in.
First off, I really appreciated your frank blog post on the RethinkDB post mortem. The distillation of years of experience is incredibly valuable for us, and I'm sure for many others.
I agree that the database market is crowded with solid offerings. However, I believe that differentiating features do still matter and there will be tremendous growth in the database market for the foreseeable future. In your blog post you listed the metrics of goodness which you optimized for, perhaps incorrectly. You indeed had amazing execution on those original metrics. We have paid RethinkDB the compliment of doing our best to emulate the standards set with simplicity and consistency, in particular. You are also correct that the alternate metrics including timely arrival, palpable speed, and a use case are probably better ones to optimize for in an entrepreneurial setting.
We have been optimizing from the start for a still-small use case, but one which is likely to become a top of mind concern for every major enterprise over the next five years: building global, "multihomed" services. This is something Google has pioneered over the past decade, but which remains an elusive challenge for most everyone else. For an interesting read, check out https://static.googleusercontent.com/media/research.google.c... (tl;dr here: http://highscalability.com/blog/2016/2/23/googles-transition...)
You mention AWS/GCloud as existential risks for a cloud DBaaS offering. I would take that a step further and cite them as the biggest risk to all database companies. We must compete with them both by pushing the boundaries of what the database can accomplish, and by aggressively driving an anti-vendor-lockin message: embracing a proprietary cloud DBaaS offering is an unacceptable risk if there are non-proprietary alternatives.
Yes, very similar to Windows, Android etc, when owners of the platform learn which product goes well, and then make it themselves.
I have appreciated the snippets of mentorship that I have gotten from you and Mike (I've interacted with Mike more) - I'm Mark from the GUN team. Here are my thoughts:
RethinkDB's shutdown spells doom for Cockroach. However, I do disagree with you Slava, that the DB market is impossible.
Rethink and Cockroach are both Master-Slave, and I think you hit the nail on the head that that is an impossible market to try and compete in. However, it does not represent the entire DB market (albeit, it is the overwhelming majority).
The market take over is going to happen with P2P/decentralized databases (Cassandra, mine http://gunDB.io/ , even things like IPFS, etc.) because Master-Slave databases have a limit of how large they can scale and shard. Up to another 5B people are coming online into 2020, so the demand alone is going to reshape the industry towards the growing Master-Master databases. Cockroach is in the wrong place.
My company is going through an inflection point, so I'll presumably be one of the guinea pigs. If I'm right, we'll be able to keep all of our technology completely MIT/ZLIB/Apache2 Open Source, yet still grow healthily and fast enough for our VC backing. Why? The inevitable shift to P2P systems is going to require the existing monoliths and governments to get on board and they need experts who have designed those systems. (I'm already seeing this happen with some of our customers and potential clients).
Startups will be able to reap all the benefits for free, even if they get Pokemon Go level hyper growth. Why? Services with hundred million plus users will be the norm, not the enterprise. And their services will become more robust with more users on it. However the dinosaurs, governments, centralized services, etc. will still pay handily in order to keep spying on their users. Unfortunately, users will still use these services because those companies actually make a profit which they re-invest in conveniences that keep users around.
That will be the divide (and has always been) between free and paying DB customers. None of this CCL stuff.
1. CockroachDB Community License (CCL) might sound like a Community Edition, and that normally refers to the open source version instead of the proprietary one.
2. It is hard to quantify the difference between a startup and an established company. We put the difference for GitLab at 100 people that can potentially use our software.
3. I see the CTO commenting here https://news.ycombinator.com/item?id=13438863 that they will never move features from open source to the CCL. Maybe they can consider publishing a set of promises to the community. We did that at about.gitlab.com/about/#stewardship
"At first I tried to make my argument the way that Stallman made his: on the merits. I would explain how freedom to share would lead to greater innovation at lower cost, greater economies of scale through more open standards, etc., and people would universally respond "It's a great idea, but it will never work, because nobody is going to pay money for free software." After two years of polishing my rhetoric, refining my arguments, and delivering my messages to people who paid for me to fly all over the world, I never got farther than "It's a great idea, but . . .," when I had my second insight: if everybody thinks it's a great idea, it probably is, and if nobody thinks it will work, I'll have no competition!"
I still find it interesting how many people dismiss Cygnus's business model out of hand when entering the open source market. (Cygnus was acquired by Red Hat for $600 million and Michael Tiemann is still VP of Open Source development IIRC) What is interesting to me is that I've never heard of anyone else even trying it. No successes. No failures. As Michael Tiemann said, no competition. And Red Hat enjoys that competitive advantage even today.
I highly recommend reading that chapter for an alternative view on how to approach open source development.
Well, as I understand it, part of the reason is that contrary to their origin story, Cygnus didn't really follow the "Cygnus Business Model" either, and anyone trying similar tactics since then has had to deal with much greater visibility.
MongoDB offers a commercial version of their product with enterprise features (encryption at rest, LDAP auth, etc) and support - MongoDB Enterprise.
Additionally they also offer managed, cloud hosted MongoDB deployments - MongoDB Atlas.
Over the last few years the valuation of MongoDB, Inc. has been slashed by institutional investors such as Fidelity and BlackRock. While they haven't had mass layoffs or some other negative corporate event, they have clearly had some difficulty making their (and apparently your) business model work.
Do you agree that this is a fair comparison? And what do you think makes CockroachLabs more likely to succeed with this business model than MongoDB?
MongoDB Inc did have its valuation reduced by some institutional investors. Hard to say whether that was premature or what the impetus was behind their decision. MongoDB is an incredibly well-adopted product that has gotten considerably more capable over the years. I would argue they've had good success with this business model, as building a $1.6B business is a huge accomplishment whether you've got an OSS business model or not.
It would be fair to ask whether they've done the balancing act as well as they might have. They've certainly knocked OSS adoption out of the park. On the other hand, I've heard anecdotally that they waited a long time before introducing enterprise features.
Regardless, I view MongoDB Inc. as a big - and still growing - success, and consider much of what they've accomplished to be worthy of emulation.
"I don't know much about one of our significant competitors' business model." doesn't sound like something you would want to hear from someone just announcing their business model.
In the same vein, the next to be slashed will probably be Docker.
what it probably won't do is make plenty of money for investors.
Is this a dig at InfluxDB for removing clustering from their open source version?
They announced that when they have clustering, if they ever do, it will only be in a paid edition.
That's open for another debate entirely: Advertising and promoting features that don't exist and won't in any near future.
> The first is a fully-distributed, incremental capability for quickly and consistently backing up and restoring large databases using configurable storage sinks (e.g. S3 or GCS). The same functionality, but non-distributed, will be available for free to all users.
I appreciate that you're trying to write a good database and build a business, but what do you mean by "startup"?
If a database can't guarantee it can make backups, why would a startup attempt to use it in the first place?
If you can't incrementally back it up, you can't really afford to run it in production in a cluster that has a large dataset. If you don't have a large dataset, you don't need cockroach db (first law of distributed objects, etc).
Maybe you'd be better off designing features for clients with specific requirements and very deep pockets.
There are plenty of big customers locked into Oracle. If you gave them a backwards compatible but scalable database for half the price, they would happily cut their costs in half, and make cockroachlabs very rich.
And then Oracle would offer to buy your company.
We are a similar play (open core, etc). I commented in the Rethink thread as well.
A few lessons on the business model we ran in to:
1. Cloud is commodity/experimental. It could be the future though and there's no reason not to have an offering at least. We are launching a partnership with Microsoft on Azure to experiment with cloud for their Hadoop offering. We found it was bring-your-own-license and very similar to on prem. It seems like a no-brainer to at least try this.
2. We are on prem first. Enterprise customers need "pay to blame" kind of like insurance. We have closed source features we license. Support is secondary and comes only with a license.
3. Bundling and minimum purchasing is key. You need to validate a customer has a budget.
1 thing that I notice database companies don't cover, which kind of surprises me (Cloudera, Red Hat, etc. do), is training and the "services swamp". 1 way to qualify customers and increase runway is a "small" amount of support that you charge a high rate for, which in turn drives licensing revenue on top. A lot of the companies that have actually stood the test of time started like this.
Where we deviate:
1. Building a horizontal platform but focusing on 1 product at a time. In our case, we have a generic platform we can license for teams that just need "something better". We focus on a use case such as fraud/network intrusion/money laundering and sell that. This is very similar to the Oracle model: "Bundle a database with the app."
2. Any services we end up doing we put aside to turn in to a product down the road. In our case, we observe and learn the patterns and reapply those as "templates". In our case we accumulate expertise, make a bit of money, and monetize later. Many SAAS companies in our space keep the customer data because they need to build better models. I'm not quite sure what the analog in the database market here is.
We just keep the "lessons learned". The key here will be scaling that. In our case, we do that by focusing on 1 vertical use case and owning that. We also minimize engineering time on the platform.
In short: We have a dual licensing model where we have an "app" that's easy to sell in the market and get paid to explore the market (while focusing on/prioritizing 1 use case for more scalable revenue), and for teams that just need a better platform, we can just offer that and make some fairly passive licensing revenue.
1 other neat anecdote: for all the people raving on HN about what Google's doing with AI, the problems people had 50 years ago are still the same ones people pay for today. The database market may be similar. It might not be bad to learn from your predecessors but maybe just update the business model a bit (e.g. cloud offerings). At the end of the day people pay for use cases over everything else. Think about the reason they care about HA, not "HA is cool, they obviously want to pay for that."