Eliot Horowitz (HN: @ehwizard), MongoDB’s current and founding CTO, reached out after my last post - and spent two hours providing feedback last week in Palo Alto. It was an expansive discussion, and Eliot was reflective and eager to understand the perspectives I had heard. He noted how much it mattered to him what HN thought.
I left with tremendous empathy for the challenges of building a venture-backed database company - while we also disagreed in a few key areas. As engineers, I hope we continue to have thoughtful, if spirited, discussions like the one Eliot and I had - where we are both open to being wrong and want to understand differing viewpoints.
In the interest of length, I didn’t write the whole interview into my series, but wanted to share key parts of our discussion on HN:
- NoSQL: I felt that NoSQL was overhyped in the early 2010s - and that 10gen’s marketing claims were overblown. Eliot argued that many of NoSQL’s benefits have been realized and that 10gen’s early marketing accurately reflected the changes to come. For example, while he doesn’t love the term NoSQL because of its imprecision, he feels that both the JSON-like data structure and horizontal scaling are here to stay and were in fact the key changes NoSQL led to (not the SQL DSL). In that sense, he argues that Amazon RDS is a form of NoSQL today - and that NoSQL had a powerful impact on the roadmap of existing SQL databases.
- Data Loss: I have generally stayed away from talking about the controversial examples of MongoDB data loss in the earliest days (there are several HN posts that note this). I know personally of only one team that lost data, but it’s always hard to tell whether this was a database issue - or their own mistake. I did ask Eliot explicitly about this, and he said that these cases were exceedingly rare.
- Defaults: I feel like it’s playing with fire to set bad defaults in a database - with numerous data breaches due to 10gen’s early decisions on authentication, remote login, and encryption (see, for example, https://snyk.io/blog/mongodb-hack-and-secure-defaults/ ). For auth, Eliot argues that developers need to take responsibility for exposing MongoDB on public servers - and that the SLA for a self-hosted instance is different from that of a managed instance (at minimum, I have issues with users having their data exposed to the world through no fault of their own). He disagreed with 10gen’s decision to turn on auth by default in later self-hosted versions once MongoDB ignored remote connections by default (but thought this was the right choice for the managed Atlas service). (Before 2014, however, the default behavior was no auth and accepting all remote connections, see https://snyk.io/blog/mongodb-hack-and-secure-defaults/ ; Eliot notes that the change took a while because changing the default would have caused issues for existing customers.)
I do have concerns when 10gen explicitly targets junior developers (to be fair, he could never have predicted the growth of Node and frontend engineers’ interest in the backend). What he says would have made sense, say, 20 years ago, but with 25% of new software engineers coming from coding bootcamps with non-engineering backgrounds, I worry that defaults matter more than ever in dev tools (and even seasoned engineers may get this wrong if they’re coming from a database with different defaults). We discussed analogies like seat belt lights versus the responsibility of passengers to know better. He also argued that waiting to get all this right - not just auth - would impact database innovation, while I think there’s a balance that gets us a lot of the low-hanging fruit (like security).
- Mistakes with MongoDB: Eliot felt that certain posts (such as Sarah Mei’s popular HN post about Diaspora: https://news.ycombinator.com/item?id=6712703 ) misunderstood how to architect for MongoDB. Regardless of the particulars, I think that new technology is especially susceptible to issues like this until the community develops broader knowledge. The failure states are well known in many relational databases, and there is a broad base of knowledge about how best to architect relational data models. As startup engineers, we have to weigh the tradeoffs of new vs old tools, with some new tech being game-changing for startups (see PG’s much-cited post about the benefits of new languages like Python: http://www.paulgraham.com/gh.html ) while many others are hyped new tech that damages startup productivity.
- SLAs are hard in databases: Early products make choices that customers get used to (unacknowledged saves), and you can’t just quickly change the default behavior, especially in a database. 10gen’s early customers loved not waiting for writes, and only later did 10gen realize that this was an issue for others (the default is very different in nearly every other database). To migrate their first customers without dislocation, they had to hold off on changing the default behavior for longer than they wanted to. In Eliot’s view, their competitors would unfairly argue that this default was a way to juice benchmarks, hoping to stoke anger at MongoDB and cut into its growth. (I tend to have sympathy for 10gen’s perspective, with the controversial, mistaken benchmark that I referred to in part 1 looking to me like an honest mistake.)
Generally, he felt that the issues with MongoDB were few and far between - and that the benefits of MongoDB were game-changing for so many startups. Many of them would not have survived without MongoDB.
I’ll add some more notes from the interview in part 3, where I go into 10gen’s early marketing. I also let Eliot know that I would be happy to share/publicize any broader responses/critiques he writes, so that we can have a thoughtful debate that benefits others (and he can point out issues in my arguments).
(Apologies in advance if I’ve made mistakes in representing Eliot’s views)
This does seem like a sticky situation, but a potential solution springs to mind. Maybe there's a reason this would have been infeasible, but why not introduce a "MongoDB Legacy" product line which would be a fork with unsafe defaults, secondary to the main product/branch with safer defaults? That way the old customers would have a clear upgrade path every release, just like the customers on the safe product, at the small expense of 10gen having to cut 2 releases each time and mind the diffs around options and defaults. Maybe this would have been more expedient than waiting until version 2.6 of MongoDB?
RDS is literally hosted RDBMS SQL databases.