
MongoDB has successfully played the 'hype first, features later' strategy. Now it is well on the way to being a decent swiss-army-knife database.

The RethinkDB retrospective[0] contains a lot of insight into how MongoDB succeeded despite being vastly inferior on a technical level back when it first launched. I have to admit a certain respect for how successfully they executed their strategy.

Choice quote:

Every time MongoDB shipped a new release and people congratulated them on making improvements, I felt pangs of resentment. They’d announce they fixed the BKL, but really they’d get the granularity level down from a database to a collection. They’d add more operations, but instead of a composable interface that fits with the rest of the system, they’d simply bolt on one-off commands. They’d make sharding improvements, but it was obvious they were unwilling or unable to make even rudimentary data consistency guarantees.

But over time I learned to appreciate the wisdom of the crowds. MongoDB turned regular developers into heroes when people needed it, not years after the fact. It made data storage fast, and let people ship products quickly. And over time, MongoDB grew up. One by one, they fixed the issues with the architecture, and now it is an excellent product. It may not be as beautiful as we would have wanted, but it does the job, and it does it well.

[0] http://www.defmacro.org/2017/01/18/why-rethinkdb-failed.html




> MongoDB has successfully played the 'hype first, features later' strategy. Now it is well on the way to being a decent swiss-army-knife database.

I have no idea how capable MongoDB is these days, as I haven't used Mongo in years (and even then it was not for long).

However, I do not know any developers who, after living through the "hype first, features later" strategy, have been left with a positive enough opinion of MongoDB to ever want to use it again.


Epic had a post-mortem blog post that mentioned in passing that the unsolvable issues they had with MongoDB had stumped all the experts they could find. https://news.ycombinator.com/item?id=16340462 I kind of assumed the fix was going to be a rewrite on Postgres or MySQL.


- That's not what they said.

- You think people replace a MongoDB cluster with a single Postgres instance? You should really use HA clusters in real life and stop reading reddit / HN and the hype behind PG. With 3.5M+ CCU no one would use an architecture with a single master / slave (that's what PG is).

MongoDB / MySQL get bad press from people who never used them in real life and just repeat what they read online.

I could tell you horror stories about PG not having an official replication system until 2011, when PG 9.0 landed.


> I could tell you horror stories about PG not having an official replication system until 2011, when PG 9.0 landed.

I could tell you a horror story that happened to me just a few weeks ago, where MariaDB corrupted data out of nowhere due to a bug[1]. This happened multiple times and cost us multiple hours of work (including the service being down) each time, until we realized the issue wasn't hardware but a software bug.

If you ask me, I'll take PostgreSQL's approach of not having replication before 2011 over MySQL's approach of still corrupting data. Data is usually the most valuable asset a company has.

[1] https://jira.mariadb.org/browse/MDEV-10977


> I could tell you horror stories about PG not having an official replication system until 2011, when PG 9.0 landed.

And I would reiterate that just because something isn't in mainline doesn't mean it's not possible. Did you know that Pg didn't have native partitioning until Pg 10? Somehow we managed to do partitioning before then.

I don't buy the argument that you need to ship broken features just to have them; Pg doesn't include a feature in base until it's a /good/ solution that is well engineered and has appropriate toggles. That is not a horror story.

> - You think people replace a MongoDB cluster with a single Postgres instance? You should really use HA clusters in real life and stop reading reddit / HN and the hype behind PG. With 3.5M+ CCU no one would use an architecture with a single master / slave (that's what PG is).

I shipped a game which had similar CCUs (within an order of magnitude), and I can confirm that you can't do it with one PostgreSQL machine. Actually, you could, but we chose to fsync() constantly and remove the RAID cache to prevent corruption from ever happening. But you can also shard on top of your database solution.


I feel I need to quote the post mortem back at you so you can point out where I misread it.

"Our top focus right now is to ensure service availability. Our next steps are below: Identify and resolve the root cause of our DB performance issues. We’ve flown Mongo experts on-site to analyze our DB and usage, as well as provide real-time support during heavy load on weekends."

How does that disagree with my post?


You're implying they couldn't fix MongoDB or reached its limit, which is false. In the current (HN) post they said they fixed it. I'm pretty sure they didn't have much experience with DBs in the first place, hence why they asked for help.

Nowhere in the original post do they mention issues related to MongoDB itself; it was probably bad design on their side.


OK, I should have been clearer about my interpretation. I read "flying in experts" as meaning they flew in experts from MongoDB and stumped them, which had me thinking maybe the problem wasn't solvable. Earlier in this thread one of the engineers from MongoDB says Epic solved the issue but hadn't updated the blog, so I was wrong about that.


You can cluster with postgres & mysql, but you have to implement the clustering / sharding logic yourself.

Nowadays I would just use redis & cassandra if you need something beyond a collection of postgres instances. Most projects do not.


For clustering MySQL, take a look at Vitess. It was developed at YouTube and recently adopted into CNCF:

https://www.cncf.io/blog/2018/02/05/cncf-host-vitess/

(Disclosure: I'm executive director of CNCF.)


> mentioned in passing they had stumped all the experts they could find to look at unsolveable issues they had with MongoDB

That's not really what the article is saying, unless we interpret the following text differently.

"We’ve flown Mongo experts on-site to analyze our DB and usage, as well as provide real-time support during heavy load on weekends." -> "We have started to look into the problem together with experts" and not "Experts have tried and failed".


Yes, come over to the actually greener lawn of Postgres. It's amazing how much more powerful than MySQL it has become, with recursive CTEs https://www.postgresql.org/docs/current/static/queries-with.... and some rather amazing JSON support https://www.postgresql.org/docs/current/static/functions-agg...
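For readers who haven't seen one, a recursive CTE walks a hierarchy in a single query. A minimal sketch using Python's bundled SQLite, which supports the same WITH RECURSIVE syntax Postgres uses (the table and data here are made up):

```python
import sqlite3

# In-memory database with a tiny parent/child hierarchy (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE categories (id INTEGER, parent_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO categories VALUES (?, ?, ?)",
    [(1, None, "electronics"), (2, 1, "phones"), (3, 2, "smartphones")],
)

# Walk the hierarchy from the root down: the kind of query that needs a
# recursive CTE rather than a fixed number of self-joins.
rows = conn.execute("""
    WITH RECURSIVE tree(id, name, depth) AS (
        SELECT id, name, 0 FROM categories WHERE parent_id IS NULL
        UNION ALL
        SELECT c.id, c.name, t.depth + 1
        FROM categories c JOIN tree t ON c.parent_id = t.id
    )
    SELECT name, depth FROM tree ORDER BY depth
""").fetchall()

print(rows)  # [('electronics', 0), ('phones', 1), ('smartphones', 2)]
```

The same query body runs unchanged against Postgres; only the connection setup differs.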


JSON has been great in MySQL for a few years now since 5.7, and recursive CTEs are coming in the next couple months with 8.0. I don't think you can make a wrong choice between the two these days, but choosing Mongo over either is almost always the wrong decision.

https://www.infoworld.com/article/3228154/sql/whats-new-in-m...


This is actually what I think the biggest power of JSONB is.

I can, for example, use jsonb_agg() and get a hierarchical response for 1:N joins. It returns a JSON value even though neither of the columns contains JSON.

Previously in that scenario I would either need to make more than one query or get a response that has a lot of data repeated.
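What that aggregation buys you can be sketched in plain Python (hypothetical author/book data): the flat rows are what a 1:N join normally returns, with the parent data repeated, and the grouped shape is roughly what jsonb_agg lets the database hand back directly:

```python
from collections import defaultdict

# Rows as a flat 1:N join returns them: the author name repeats per book.
join_rows = [
    ("Tolkien", "The Hobbit"),
    ("Tolkien", "The Silmarillion"),
    ("Le Guin", "The Dispossessed"),
]

# The shape jsonb_agg produces server-side: one row per parent, with the
# child rows collected into a JSON array.
grouped = defaultdict(list)
for author, book in join_rows:
    grouped[author].append(book)

nested = [{"author": a, "books": b} for a, b in grouped.items()]
print(nested)
# [{'author': 'Tolkien', 'books': ['The Hobbit', 'The Silmarillion']},
#  {'author': 'Le Guin', 'books': ['The Dispossessed']}]
```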


Another way of reading that post is that MongoDB is being used for some of the highest throughput concurrent workloads out there... and those are always hard to optimize. Doing a lift and shift for a "grass is greener" alternate solution is not a clear cut path to victory at all... but it's certainly a giant science project to contemplate.


Yes, I kind of wish MongoDB had come out and said what they are doing to help Epic Games, whether this release is something to address that issue, or what their plan/thoughts are on the most newsworthy usage of MongoDB in a while.


I am one of the team of MongoDB engineers working with Epic on this issue, and I can assure you that the situation is under control and we have everything in place to scale this application to much higher numbers. However, we're not publishing details about our support cases, especially while they are in progress. That is something for Epic to decide, and I do assume they will eventually say in public just how well MongoDB is, in fact, performing for them.


Thanks for responding. I look forward to Epic updating their post mortem.


Not sure if you are aware but what you are asking for never happens.

It is not professional or appropriate for vendors to be revealing (a) that clients are having issues and need support and (b) the specific workings of technologies or processes within the client's business.


Funny to list MySQL here which preceded MongoDB with cheap/fast now, correctness later.


There's a whole 'nother generation of devs coming through who have never been burned by MongoDB though. Obviously they will be eventually, but by then another generation will come along to repeat the cycle.


I love deriding mongodb as much as the next dev that hasn't used it much; but I'll just note that while I'd still be hard pressed to prefer mysql over postgres - there was a long period where mysql was put to tasks it was ill suited for, especially prior to around version 4.x.

So while "hype first" might reap a deservedly abundant and bitter harvest of developer hatred - it doesn't preclude evolving into a genuinely useful product...


Both true, although I can completely understand why devs went with MySQL over PostgreSQL at the time. During the same period that MySQL was drawing seemingly endless criticism for generally poor RDBMS behavior (3.x and 4.x), PostgreSQL was notorious for poor performance due to insanely undersized default settings. Out of the box it was sized to run in something like 10 MB of RAM, which was just unrealistic.

I also remember it had a lot of quirks and missing features prior to v8. I assume it was leftover cruft from Ingres, but I remember PostgreSQL v6 and v7 being unreasonably complicated to get configured just because the defaults were so far from reality.

One thing you can say about PostgreSQL, though, is that its developers don't rest on their heels. Every major release packs in a ton of new features. They've gone from being fairly low or middling on the feature set to being pretty near the top. Even point releases have me saying, "Wow, that's really nice to have."


At the time, out-of-the-box Postgres probably wasn't the (wrong) competition; Sybase, MS SQL, and Oracle were...

Or maybe professionally managed Postgres was, especially in the MySQL 3.x days and earlier.


I used it about 3 years ago and my first thought was "How broken will multi-document ACID transactions be?"

I still want to like MongoDB, I still miss its style of query vs SQL, but I'd have a hard time advocating its use again...

Sometimes it's tempting to use it for projects that I know will remain small, but even then it's not worth the overhead of standing up a different DB when I have a perfectly good SQL server I can muddle through already.
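For anyone who hasn't used it, the query style in question is nested filter documents instead of SQL strings. A toy matcher, purely as a sketch of the idea (this is not pymongo; only equality and two comparison operators are implemented):

```python
def matches(doc, query):
    """Evaluate a tiny subset of Mongo-style filter documents against a dict.
    Supports plain equality plus the $gt / $lt operators; everything else
    in the real query language is omitted here."""
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):  # operator form, e.g. {"$gt": 21}
            for op, operand in cond.items():
                if op == "$gt" and not (value is not None and value > operand):
                    return False
                if op == "$lt" and not (value is not None and value < operand):
                    return False
        elif value != cond:  # plain equality, e.g. {"name": "ana"}
            return False
    return True

users = [{"name": "ana", "age": 34}, {"name": "bo", "age": 19}]
adults = [u for u in users if matches(u, {"age": {"$gt": 21}})]
print([u["name"] for u in adults])  # ['ana']
```

The appeal is that the filter is an ordinary data structure you can build and compose in code, rather than a string you have to assemble.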


Check out Rethink. It seems to be what you're after.


It's probably better to go with Cockroach or TiDB. RethinkDB has problems of its own.


We are in the process of evaluating CockroachDB vs Rethink internally and we've found CockroachDB to perform very poorly without obvious disk or CPU issues. I'm curious if you've seen different especially as it relates to Rethink.


I didn't do comparisons. But RethinkDB has straightforward issues like a slow QL implementation using a lot of CPU and a lot of disk space usage. Change feeds have a few scaling issues if you want a lot of them. I don't know that it has mysterious kinds of excessive resource usage. I'm a dev of RethinkDB, not an end user, so I might be seeing the worst side of it. I haven't used Cockroach or TiDB.


TiDB seems to have a pretty active community, judging by its repo (https://github.com/pingcap/tidb). Also, saw this thread on TiDB v MySQL recently that's pretty detailed (https://www.quora.com/How-does-TiDB-compare-with-MySQL). Looks like a good option worth trying out.


Can you clarify on those problems?


Performance, mostly. Too much CPU and disk usage. Change feeds don't scale well.


What problems (outside I guess the current status of the project as a whole)?


I work in a Danish municipality. Traditionally we've built everything on SQL because it's the world we function in, but we adopted the MEAN stack as a proof of concept a few years back and Mongo has been growing ever since.

It does require building and maintaining schemas in a different manner, but when you do that, it's pretty great to work with, especially when we're doing design driven development that consists of a lot of prototyping.

I'm a fan, but I'm a manager on business development and digitisation, so I may be a little sheltered from whatever annoyances it may cause in operations.


I work on a MongoDB installation in a Danish municipality. From the technical side MongoDB has been great to work with.


I am curious why a municipality needs custom software. I mean, the scandinavian countries had standardised paper forms for most municipal tasks (population register, ledgers etc) already in the 17-18th century, and those were used nationwide, or at least throughout a single province. Why can't the same be done with software?


Well there are 98 municipalities and 98 ways to operate in a thousand different ways.

I've worked on quite a few multi-municipality open source projects, like handling employee reimbursements for driving.

Basically I drive x kilometers for a meeting, I get paid x and the taxman gets the report. Simple stuff.

Well, among the 6 parties involved there were 6 ways to interpret tax laws, 4 different agreements with unions on what rates to pay, 3 different payment systems with 3 very different ways of getting the reported data from a flat file to a REST interface, at least one political decision to overrule tax laws for a certain set of employees, several different ideas on how to host it and do single sign-on, and 4 different ways to obtain employee data.

That's for a simple system with basically one function. We have more than 350 IT systems.

Another example is in automation. We have scanner software and we have an archiving system. They both have APIs, but the APIs speak very different languages. This meant that our local scanner people were tasked with distribution after they scanned things, a task taking several hours each week, because putting files into many different areas of an archive sucks. What we did was ask the scanning company to build a QR reader into their software, and then we made a piece of software that put the archive recipient addresses into QR codes. We also made a MOX agent that accepts the output of our scanning software and loads it into the archive through the API. So now the distribution process is automated.
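The routing step could be sketched roughly like this (all names and paths here are hypothetical; the real MOX agent and archive API are obviously far more involved):

```python
# Map QR payloads printed on cover sheets to archive destinations.
# These addresses are made up; a real agent would resolve them via the
# archive's API rather than a hardcoded table.
ARCHIVE_DESTINATIONS = {
    "case-files": "/archive/case-files",
    "invoices": "/archive/finance/invoices",
}

def route_scanned_document(qr_payload: str, filename: str) -> str:
    """Return the archive path for a scanned file based on its QR payload."""
    # Unknown codes fall through to a manual-review queue for a human.
    destination = ARCHIVE_DESTINATIONS.get(qr_payload, "/archive/manual-review")
    return f"{destination}/{filename}"

print(route_scanned_document("invoices", "scan-0042.pdf"))
# /archive/finance/invoices/scan-0042.pdf
```

The point is just that once the recipient address rides along in the QR code, the distribution decision becomes a table lookup instead of hours of manual filing.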

You can certainly run a municipality without developers, using standard software and outside hires, it’s just really expensive.


Would it be fair to say that the political entity one step above the municipalities (whatever that is in Denmark) is not doing its job? I mean not doing its job of standardising the things that could be common between municipalities. Some things will of course have to differ, but a lot of the differences are probably just not-invented-here. It sounds like the legislative environment is too complex, and you have to work around it with a ton of software.

Could it even be the case that computer systems have somewhat removed the incentive for the administration to rationalise the various systems? With just manual labor and typewriters all of that would have been very expensive, but with a server hall and a medium-size IT team it kind of works out.

Perhaps digitalisation having only come halfway is a factor - you mention scanning, but by now the so-called "paper-free office" that was a buzzword in the 1990s should be here already. Or is it perhaps just another sign that the IT industry overall is still very immature, and this will sort itself out with time?


I think it's too complicated to blame anyone, really. I mean, we're working on standardising as much as possible, but it's often impossible because business practices are just so different. Big standard products often fall extremely short, or end up as complete failures, because you can't jam people into boxes on an enterprise scale, especially not when the people who build the systems have next to no domain knowledge and the people who write the contracts have no technical knowledge. :)

I guess our government should work on writing laws that are more friendly to digitisation and stop expecting IT to fix business practices that don't really make sense in the first place. There has been a genuine movement toward that, but it's slow because none of our top politicians or bureaucrats are from technical fields, and they operate on such a high strategic level that they're often rather far from the daily challenges in a daycare institution.

Local political leadership and bureaucracy could certainly do more to focus on cooperation, standardisation and digital transformation, and they actually do, but political views differ and they change every 4 years, and the truth is that there just isn't any voter interest in IT unless it goes wrong.

We're trying to build national standards; we've had a set of architectural standards called Rammearkitekturen for a few years now, but getting them implemented is slow. For one thing, they're made by municipalities, and our structure of government is split in three - municipalities, counties and the state - and each branch has its own ideas, leading to bureaucracy and political differences. Some want us to use EU standards, others want us to build our own, and even if we decided, there are different sets of EU standards as well as different sets of Danish standards.

I personally think the best we can do is use whatever national standards are in favour, build smaller applications on them with open APIs, and run everything as SaaS on infrastructure such as AWS or Azure. I also think we should do a lot more work on business development, modifying business practices before we throw IT at something.

But it's complicated, and it's on a giant scale where even minor changes take years to implement.


I used it quite happily circa 2009 and at a different company in 2014. In both cases it was being added to systems that already had mature functionality built atop a RDBMS. In the first case it was used to store events that had started to overwhelm the main RDBMS with write volume. (Originally a system with one database as the monolithic data store.) Probably Kafka would have been even better for this use case, had Kafka been available at the time. But MongoDB did the job very well. I did a prototype in Cassandra too before settling on MongoDB, but MongoDB had much better docs, drivers, and single-node read performance at the time.

The second time I used MongoDB to automatically track templated email bodies that were being delivered through a third party mail platform. We had dozens of recurring templates and many more one-off templates for different curated campaigns. If somebody complained that a link or image or token was wrong in their email, we wanted to be able to look back at the history to see if the problem was in the template data or potentially a client issue on their side. Most of the queries were ad-hoc and not very performance-sensitive. This was where a flexible JSON document format came in handy. Modern Postgres would have worked well for it too, but that wasn't available in the company at the time. With MongoDB I got good flexibility, adequate speed, and I avoided reinventing wheels by not trying to shoehorn the data into another MySQL table. I was able to solve a customer support pain point in less than a week and the system has worked well for nearly 4 years now.

I'd be really frustrated if I had to use MongoDB as my only data store. I would guess that much of the hate for it comes from people who were forced into that position, or maybe from people who didn't take its documented limitations seriously enough before productionizing its use.


I don't know much about MongoDB. I am mostly a client-side developer after all.

But every time I see a team transitioning from Mongo to something else, they transition to a relational database. Maybe the problem is not with MongoDB, but that their data is relational after all?

Personally, I'd take a relational db over NoSQL for most of my needs, but all these stories don't really say anything about how Mongo compares to other NoSQL databases.


That's because nearly all data is relational. At first it seems like you don't need a relational database; in fact, NoSQL seems easier to use.

As your data grows, though, you realize that your application becomes more and more complex. A single query might translate to multiple queries to the database, you need to handle scenarios where fields might not exist, etc.

With relational data you might have more work at front, but then the database solves many of the problems for you.

As another person said, when you're using databases like MongoDB you're going back in time and reliving history: databases before Codd invented the relational model looked a lot like this, for example [1].

Also, the whole NoSQL thing seems to be cyclical; we had XML databases in the early 2000s[2].

[1] https://en.wikipedia.org/wiki/Hierarchical_database_model

[2] https://en.wikipedia.org/wiki/XML_database
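The "fields might not exist" overhead mentioned above looks like this in application code (hypothetical order documents): every read has to defend against structure that a relational schema would have guaranteed.

```python
# Two documents in the same collection; schemaless storage means the second
# one simply lacks the fields the first one has.
orders = [
    {"id": 1, "customer": {"name": "Ana", "vip": True}, "total": 120},
    {"id": 2, "total": 40},  # no customer sub-document at all
]

# Every read defends against missing fields -- work that NOT NULL columns
# and foreign keys would have done for you in a relational schema.
results = []
for order in orders:
    customer = order.get("customer") or {}
    name = customer.get("name", "<unknown>")
    vip = customer.get("vip", False)
    results.append((order["id"], name, vip))

print(results)
# [(1, 'Ana', True), (2, '<unknown>', False)]
```

Multiply this defensive pattern across every query path and the "schema" hasn't disappeared; it has just moved into the application code.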


Or maybe it's useful to start off with a low-overhead, easy-to-implement DB like Mongo, and then as you grow larger, spin off the uses it doesn't serve well to other, more specialized and complicated DBs?

One of the biggest problems with relational DBs is that once you decide on a schema, if it's the wrong one, you're going to be in a lot of pain. That makes a NoSQL DB a great fit for an early-stage product where you are still figuring out what your product needs to do and contain. Once you have some more experience with it, and a better understanding of your data, it's far easier to build the correct relationships.


> Or maybe it's useful to get started off with a low overhead, easy to implement DB like Mongo and then as you grow larger, spin off uses that it doesn't serve well to other more specialized and complicated DBs?

Not really, converting to relational data is quite a bit of work.

Actually, the reverse is the correct approach. You start with normalized data; when there's a bottleneck you start denormalizing it, and if that's still not enough you move a /subset/ of the data to a NoSQL database.

> One of the biggest problem with relational DBs is that once you decide on a schema, if it's the wrong one, you're gonna be in a lot of pain.

Not really; in my experience all migrations were done through SQL. Also, if multiple people (who understand relational databases) come up with a schema, they'll pretty much arrive at the same normalized result.


> every time I see a team transitioning from Mongo to something else, they transition to a relational database

They are repeating the discoveries that people made in the 1970s about storing data in flat files vs. relational models.

Know history or be doomed to repeat it and all that.


Yep, this is me. I would require some pretty amazing reasons to even consider using Mongo again, especially now that all the relational databases I trust support JSON column types.


> However, I do not know any developers who, after living through the "hype first, features later" strategy, have been left with a positive enough opinion of MongoDB to ever want to use it again.

A new crop of developers is, always, just a year away though. I feel future adoption depends a lot on how well-suited the tools are to younger devs. That's where MongoDB found its initial audience!


MongoDB has successfully played the 'hype first, features later' strategy. Now it is well on the way to being a decent swiss-army-knife database.

I was going to say that I won't believe that it is on its way to being a decent database until after an article appears on https://aphyr.com/tags/jepsen saying that MongoDB actually delivers on what it claims.

So I looked for the most recent analysis of MongoDB and found https://jepsen.io/analyses/mongodb-3-4-0-rc3. I still want to see verification of the latest release, and hear battle stories from it in production. But I'm provisionally optimistic that a lot of the glaring "it is a pile of shit that doesn't work when the chips are down" issues are now addressed.

That said, I bet that it will be many years before most people who got burned by MongoDB ever rethink their attitudes about it. Once burned, twice shy. And it really was an overhyped steaming pile of shit for a very long time.


I have used MongoDB in production for a number of Fortune 100 sized companies. It has always been a unique database that was ideal for scenarios where your data model was document-oriented.

> was an overhyped steaming pile of shit for a very long time

No it wasn't. This is something you heard from people who never really used it. It had its faults but it was never a pile of shit nor was it substantially worse than other databases.


This is something you heard from people who never really used it

I used it at a previous job. The project was to move a multi-terabyte dataset from an Oracle box (24 CPUs, 24G RAM, SAN) to a MongoDB cluster (10 boxes, each with 48 cores, 96G RAM and internal SSDs). MongoDB couldn't perform for shit, and it couldn't stay up in a usable state for more than a few hours at a time. This was with 20x the processors and 40x the memory of the system it was replacing. It's a complete joke of a product, sold on the basis of outright lies about what it could actually do. Having been that badly burned, I consider it an act of selfless public service to warn people off it.

If you're just using it for a personal blog that gets 10 views a day, sure it might be barely adequate for that. But I'd still use Postgres.


See, this is the sort of crazy thing I used to see people do, and I'd wonder why they had problems. MongoDB is a document database. You can't just take relational database tables, move them across and expect it to behave the same. And frankly, I don't feel sympathy for bad engineering practice. You don't do system migrations without fully testing and understanding all of the systems.

But for those of us who had document-oriented data models, it allowed for performance that was orders of magnitude faster than any SQL database.


If you never tracked what happened in production, it may have worked most of the time well enough that you never saw how bad it was.

But read https://aphyr.com/posts/284-call-me-maybe-mongodb for an idea of how the promises in MongoDB's documentation compared to the reality of the software under stress. And it wasn't just hypothetical: there are plenty of horror stories floating around from people who ran into those problems in production, for use cases that were supposed to be a fit for MongoDB.

And the performance argument didn't hold water either. As benchmarks like https://www.enterprisedb.com/node/3441 showed, decent relational databases consistently beat MongoDB on the same hardware. Yes, lots of people rewrote bad relational models and saw performance improve. But apples to apples, writing an application against a relational database in the same way you would against MongoDB resulted in a win for the relational database.

So yes, there were lots of people saying exactly what you are saying now. But the ones who actually tested their systems and ran performance tests came to a very, very different conclusion.


Again: I have been personally involved in the deployment and support of MongoDB clusters for very large datasets at very large companies. It does work if you use it for the right task, i.e. highly nested data, not relational data. And let's be clear: if MongoDB were unusable, the company wouldn't still be here, as successful as they are.

That EnterpriseDB link is completely ridiculous. Firstly, it predates WiredTiger, which replaced the entire storage layer. Secondly, doing one-for-one comparisons with relational systems doesn't make sense. MongoDB is a document database; compare it with other document databases.


The EnterpriseDB benchmark is just a hit piece. See https://newbiedba.wordpress.com/2017/05/26/thoughts-on-postg...


From your link, go to https://newbiedba.wordpress.com/2017/11/27/thoughts-on-postg... for the followup he wrote after those benchmarks. When he ran his benchmarks, he indeed got better throughput on MongoDB. But the 99th-percentile performance was massively worse - in fact, slow enough to be unacceptable, to the extent that he concluded you'd be better off using PostgreSQL.

And he's right. As pages like http://latencytipoftheday.blogspot.com/2014/06/latencytipoft... make clear, a single page load involves a lot of calls back to the application. Users will notice the occasional slow load surprisingly quickly, and it is worth a lot to get rid of them.
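The tail-latency point is easy to quantify. If each backend call independently has a 1% chance of landing in the slow 99th-percentile tail, the chance that a page load hits the tail at least once grows quickly with the number of calls per page:

```python
# Probability that a page load hits at least one 99th-percentile-slow call,
# assuming n independent backend calls, each with probability p_slow of
# being slow.
def p_page_hits_tail(n_calls: int, p_slow: float = 0.01) -> float:
    return 1 - (1 - p_slow) ** n_calls

for n in (1, 10, 100):
    print(n, round(p_page_hits_tail(n), 3))
# 1 0.01
# 10 0.096
# 100 0.634
```

So with 100 backend calls per page, roughly two thirds of page loads experience 99th-percentile latency, which is why a bad tail matters more than a good median.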

So even your chosen source agrees. A relational database is not orders of magnitude slower. In fact, a relational database is probably a better fit.


> It is benchmarking PostgreSQL psql against the MongoDB JavaScript shell (SpiderMonkey/V8). That's not a very fair benchmark, is it?

Isn't a JavaScript engine used whenever you interact with MongoDB?


Goodness, no.


See, this is the sort of crazy thing I used to see people do, and I'd wonder why they had problems.

Financial time-series data is exactly one of the use cases Mongo claimed to be for. It seems you're the one who can't tell good engineering practice from bad. And yes, they also pitched themselves as a direct replacement for Oracle. That was highly disingenuous.


> No it wasn't. This is something you heard from people who never really used it. It had its faults but it was never a pile of shit nor was it substantially worse than other databases.

This is FUD. I have used MongoDB; I even have a certification in MongoDB.

Unless you know precisely what you're doing, it's very easy to burn yourself. And Mongo markets itself as "easy to use out of the box", which is not a good thing to do.

I consider MySQL's defaults to be unsafe (as in, it used to corrupt data silently), but it's a godsend compared to the data consistency in MongoDB.

There are countless promises it fails to deliver on too, and I will not, ever, recommend it for a project. However, in recent months I've heard it got better. This means I will stop deriding developers who use it now, but it does not mean I will realistically allow its use in the environments I work in. I tend to care about data consistency in those.


There is a trend I noticed where I work.

Most people that “get it right” the first time around do not get any recognition whatsoever.

It is the people who screw up and release with big flaws that the customer then pressures the company about who are heralded as heroes and bacon-savers when they fix those flaws. After 3 years and as many releases.


The "squeaky wheel gets the grease" syndrome.


That is true in life in general, not just the workplace.

Nobody cares about people who are healthy all their life. But someone who suddenly realizes they need to eat better and exercise, and then does, is applauded. They are defended, too, if they go back to their old ways. And so on...


>Most people that “get it right” the first time around do not get any recognition whatsoever.

I've heard similar complaints before. And I get it, too: at a glance, that person is playing the "superhero" by saving the project. But good management will insist on root-causing failures, and that's where this will unravel. If it's a recurring problem, you should bring it up with management.


My biggest gripe as a "lateral manager" (I don't manage engineers, I manage products) is that I see those things happen all the time, and I spend time coaching developers to interact effectively with their managers as much as I can. It's frustrating when I see people that should know better (because I know they heard me) not taking notes about serious issues they want to discuss with their superiors, not knowing how to escalate issues that threaten the well being of the product or the team but that their direct superior doesn't believe are urgent etc...

Developers complain about management but tend to forget that managers are people just like everyone else, and we need to apply some skill to our interactions if we are to get the results we desire.


This completely squares with my experiences as well — a lot of instances of complaints about management are hollow because developers aren't managing upward correctly. Their followup on their issues is missing, or non-actionable.

Do you have any resources you've found helpful improving your skill at this?


>But, good management will insist on root causing failures where this will unravel.

You can have management that understand tech who will get to the bottom of the problem and you can have management who don't understand tech. They won't.

Management who don't understand tech will either keep somebody on hand who they know and trust who does understand tech (e.g. a consultant) or, more likely, they'll just keep rewarding the faux superheroes who keep screwing up and bailing themselves out.


I'd say that good management needs to understand how their subordinates think and operate, even if they haven't played their exact role (e.g. engineer). The best managers that I've worked with, both lateral (e.g. PM) and direct (e.g. EM), take the time to get familiar with engineering processes if they don't know about them already and speak their language.


>It is the people that screw up, release with big flaws that the customer then pressures the company about, that are heralded as heroes and bacon savers when the fix those flaws. After 3 years and as many releases.

There's going to be a ton of survivorship bias even with them. It just goes to show that big marketing budgets are a competitive advantage that can outweigh not actually being any good.

I'd seriously like somebody with a passing knowledge of data integrity who believes the tech industry is meritocratic to explain what they think the success of mongo is all about.


Reminds me of this old blog post: http://webchick.net/embrace-the-chaos


On the other hand, PostgreSQL is a very good example of a successful implementation of the opposite strategy, that is, "correctness first".

And since PostgreSQL fills that niche very well (correctness + real ACID + extensibility + decent performance), maybe it was really PostgreSQL who killed RethinkDB?


If you're playing the long game and not looking to make a profit that's fine, but PostgreSQL as a company would have been doomed a long time ago. You have to keep in mind the timelines of the business and what they need to do to keep the lights on.

MongoDB has identified a real pain point: many developers don't like to use SQL to interface with a transactional database. I'm not going into the merits of SQL vs. NoSQL; I'm just stating that it's clear there's a need, or they wouldn't have gotten any traction.

Now they are maturing the product to the point it might be a safe bet for some use cases, it remains to be seen if their approach to product development will pay dividends or the reputation they have created for themselves has created a time bomb that will eventually kill them.


"PostgreSQL as a company would have been doomed a long time ago"

PG has astonishing feature throughput. With each yearly release, they add 1-3 wow features, 6-10 major features, and countless smaller features still worthy of the release notes.

That's really, really impressive for any database, commercial or otherwise.

There's a perception that postgres is slow to add features because sometimes the feature latency is high. The reason for that is they build a solid foundation first, and slowly build multiple major features on top of that foundation. Consider replication:

1. Write-ahead log (WAL)

2. WAL archiving

3. Warm standby

4. Hot standby + streaming replication

5. Synchronous replication

6. Logical decoding of WAL

7. Logical replication

That's a lot of engineering work there, but they delivered value to users at each stage along the way. And during this time, they did a ton of other stuff -- did you notice that we got parallel query along the way? And logical table partitioning came along too, which means the parallel query can now do partition-wise parallel joins.

Not to mention all of the SQL features and tons and tons of other stuff.

Postgres has kept the lights on for a lot of companies for a long time. I absolutely reject the idea that good engineering is at odds with business success.
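The foundation of that whole progression, the write-ahead log, can be sketched in miniature. This is a toy illustration of the idea only (log changes durably before applying them, so state can always be rebuilt by replay), not how Postgres actually implements it:

```python
# Toy write-ahead log: every change is appended to a log before being
# applied, so the full state can always be reconstructed by replaying
# the log -- which is also what crash recovery and replicas rely on.

class ToyWAL:
    def __init__(self):
        self.log = []    # stand-in for on-disk WAL segments
        self.state = {}  # the "database"

    def put(self, key, value):
        self.log.append(("put", key, value))  # log first (write-ahead)
        self.state[key] = value               # then apply

    def replay(self):
        """Rebuild state from the log alone, as recovery or a
        streaming replica would."""
        state = {}
        for op, key, value in self.log:
            if op == "put":
                state[key] = value
        return state

wal = ToyWAL()
wal.put("a", 1)
wal.put("a", 2)
assert wal.replay() == wal.state  # a replica catches up by replaying
```

Each later stage in the list (archiving, standbys, logical decoding) is another consumer of this same log, which is why building it solidly first paid off.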


Postgres has kept the lights on for a lot of companies for a long time. I absolutely reject the idea that good engineering is at odds with business success.

I don't think they're at odds, per se, but having been around through the original dotcom bubble, PostgreSQL (or "Postgres95," as I'm pretty sure it was still called when I was introduced to it!) was mostly known to, well, database nerds for at least the first decade of its life. One person's "solid foundation" is another person's "technically correct but practically crawling" -- a perception that, rightly or wrongly, PostgreSQL fought against for a very long time. And I think that's what OP was trying to get at: if PostgreSQL was being developed primarily by a single VC-funded company, they just might not have had the luxury to spend years building that solid foundation.

(I'll allow that as an ex-RethinkDBer, I may have some bias here: I loved many things about the product, but it's hard not to suspect we should have focused on speed and, y'know, revenue earlier than we did.)


VC-backed startups are not the entire business world. Some businesses don't consider correctness a "technicality".


And those have been using Oracle, or Sybase or IBM for 3 decades.


MongoDB supports 1), 2), 6), and 7). Not sure what 3 and 4 are, but you can just add a new node and new data will be copied over, with no need to restore from a snapshot; you can restore from a snapshot too, which shortens the time until the replica becomes available.

Not sure what you mean by 5) though.

Anyway, replication is a strong point of MongoDB with the oplog, and I don't think Postgres can beat it.


PostgreSQL as a company: https://www.citusdata.com


There are several companies that are leveraging PostgreSQL for their own businesses, but that doesn't seem to me to be a rebuttal of the OP's assertion that PostgreSQL couldn't survive as a company itself. Citus Data is not "PostgreSQL as a company," it is "a company that exists because PostgreSQL already existed."


Well, the majority of key Postgres contributors work for 2ndQuadrant, EnterpriseDB, Crunchy Data, Citus, etc. It basically means PostgreSQL is a distributed company, and it would survive fine; being distributed, it seems able to innovate faster and is more resilient.


> It basically means PostgreSQL is a distributed company

I would argue that it means that different companies using PostgreSQL help fund PostgreSQL development. That's not the same thing as being a single company. It's a model which clearly works very well for PostgreSQL, but it doesn't really give us good data on whether the "single company doing closed source development" (e.g., Oracle) and "single company driving the bulk of open source development" (e.g., MongoDB) models would have worked as well for them.


There seem to be two points in this comment: one about the development of PostgreSQL, the other about its usability.

PostgreSQL remains one of the most mysteriously difficult common DBMSs to set up, which is unfortunate, but since the advent of MongoDB it has adopted all the ease-of-use features that are warranted from it. Developing a quick-and-dirty product prototype on Postgres is a breeze, and bootstrapping constraints and data integrity onto it afterwards is trivial. I am really not seeing any reason to start a new app on MongoDB exclusively at this point: start off in a strong DBMS like Postgres, and if you end up needing MongoDB-style document storage you can always branch to it later. Using it initially is a case of premature optimization; there is no need for it.


I find it easier to set up than MySQL, but with package managers today, both are a breeze. Can you explain why it is "mysteriously difficult"?


> MongoDB has identified a real pain point: many developers don't like to use SQL to interface with a transactional database.

The pain point relates less to SQL but more to the RDBMS and the rigid schema. SQL is spreading and may become a ubiquitous query language.


The problem isn’t the schema, it’s that you must have exactly one at all times. Sometimes you need zero, sometimes you need many. Having a fixed schema in production reduces unpredictability and provides optimization opportunities. The journey to get to that fixed schema, however, generally benefits from more flexibility.


It's linked in the RethinkDB essay, but it's always worth explicitly calling out the "Worse Is Better" essay:

http://dreamsongs.com/RiseOfWorseIsBetter.html

Ignore its lessons at your peril.

Your job isn't to build an engineering masterpiece. Your job, as pg says, is to build something people want.


I agree with the lessons in Worse Is Better, but I don't think that the author properly understood what he was observing. The result was a confused and confusing essay.

The way I understand it is that what is "good" depends on how you measure it. When we measure technical quality, we get one answer. When we measure suitability for wide adoption, we get a different answer.

We tend to idealize technical quality, but popularity is what matters more. And once something is widely enough adopted, the technical inferiority tends to be fixable.


Shipping early and working on stability later may work for something like a video game, but not for a database my system depends on, thanks.


I think many high-profile games over the last few years have proven this does not work. Gamers are finally fighting back with their wallets.


If you're in nation-wide healthcare, that strategy quickly becomes unacceptable.


I was badly burned by Mongo hype back in the day, and as a result I won’t touch it with a 10-foot pole for the rest of my life, no matter how many times people say “No, really, it’s good now”. Falling for that was how I got into trouble in the first place. I know a lot of other devs like this.

If they can be successful despite us, more power to 'em I suppose. I'm a little annoyed that their path to success was built on the flaming wreckage of so many products that fell apart because of Mongo, by using us as their beta testers instead of building a non-shitty product, and I'm at least going to get this comment in so we aren't completely forgotten amid the congratulations.


I agree here, but I'd go further: build things people want, not the idealistic someday version. Shipping fast, scalable software quickly is itself a priority feature for a lot of people (I'm not saying Rethink didn't do this, but they prioritised correctness and sharding, features fewer people need). For most apps built with Mongo, missing transaction support isn't a problem (until it is).


Both approaches have downsides.

The TCP/IP stack was built and used while the OSI model was being designed, and it won all the mindshare. Perhaps it would have been better to have separate presentation and session layers, but we don't; the application layer handles that stuff. It works well enough.

OTOH, this quote is wise:

> It is easier to optimize correct code than to correct optimized code (Bill Harlan)

I think this is doubly true for databases; at least with obfuscated code, you can recover the underlying meaning with work and exploration.

Losing or corrupting data is the worst thing a database can do. Given "this will be correct and hopefully we can scale it" vs "this will be fast and hopefully we can keep it correct", I'd choose the former for any "source of truth" data every time.

There are tricks for speeding up queries: indexes, caching (including materialized views), sharding, read replicas, etc.

There are no tricks for recovering data you lost.


> I think this is doubly true for databases; at least with obfuscated code, you can recover the underlying meaning with work and exploration.

True for databases, but not true for businesses.

> Losing or corrupting data is the worst thing a database can do.

Clearly people building simple crud websites with slick JS features didn’t agree otherwise Mongo would be gone and Rethink would be worth hundreds of millions of dollars.


>Clearly people building simple crud websites with slick JS features didn’t agree

I doubt it's that they didn't agree, it's more likely that the thought simply never occurred to them.

Mongo's marketing is directed with laser like focus on the beginner developer seeking out tutorials to build a website, etc. Questions about data consistency simply never arise in that context.

Later on that developer who was gently guided towards using mongo by all of the slick marketing will likely try to defend their decision when somebody attacks it ("their data consistency problems aren't that bad" or "data consistency isn't that important"), but that's something else.


>Clearly people building simple crud websites with slick JS features didn’t agree otherwise Mongo would be gone

Popularity is not always a good measure of what ideas are good ones.


> build things people want, not the idealistic future some day version where we eventually get to a priority feature

FWIW, I don't think this was what happened. RethinkDB started out as an SSD optimized database, and quickly repositioned itself (due to "is this what people want") to something more generally useful, and was one of the most feature-rich databases at the time, I thought.

MongoDB however got first mover advantage and a bunch of cash that comes with it. They could afford to invest heavily in developer evangelism. Then they bought WiredTiger. If I sound bitter, I am a bit - not that Mongo did well in the end, but that RethinkDB went the way it did.


Clearly the evidence bears out the success of that strategy, but it's hard not to summarize it as, "Apparently a lot of developers, applications, and users don't need a database that works." But I don't know what that's really an indictment of exactly.


> I have to admit them a certain respect for executing their strategy so successfully.

But be certain not to conflate your respect as a business strategist with your judgment as a mindful developer. To speak clearly: by systematically playing on a weak spot of ours [1], they have used countless small teams as a stepping stone to sell their business contracts to large players, while hurting many of these small teams with an (at the time) inappropriate product for their needs. And despite these huge costs, they still made a product that is inferior to one designed properly.

As a community (both as startups and as a developer community) we should resent these tactics and try to find ways to protect ourselves against players that abuse the common good of mindshare. And lest you say that this is the price you have to pay to get a product like MongoDB at all in harsh business environments: we could also lobby for open-source funds organized like research funds, producing fundamental technology that benefits everyone. Not every technology fits the model of for-profit startup innovation.

[1] Our community has very little defenses against marketing that comes from our midst, aiming to produce the (false) impression that a disproportionate amount of our fellows have evaluated the product and found it to be excellent. See https://www.nemil.com/mongo/3.html for a discussion about MongoDb specifically (HN thread: https://news.ycombinator.com/item?id=15124306 )


With all its problems, I built a MEAN (MongoDB, Express, Angular, Node) app from zero knowledge to production 2 years ago far faster than this React, Apollo, GraphQL, and Postgres app I'm building from zero knowledge.


Speed isn't always a great thing. If it's 2x faster to build but costs 10x in support and maintenance after the fact, and eventually you need to migrate to Postgres anyway for ACID features and stability, then the time/money lost outweighs the benefits.

Build things the right way first, even if it does take longer. That said, I use RDBMSs (MySQL or Postgres) all the time with an ORM, and the ORM does most of the heavy lifting (Laravel/Eloquent in my case), so I still develop pretty rapidly. I'm sure if you use pg+react on multiple projects, eventually your speed to launch will increase.


It is honestly a nightmare. I had to decide which framework to invest in learning with very limited funds. At the time, the big choices were Angular, which was established and backed by Google, and React, which was still very new with a much smaller community. I went with Angular, and by the time I learned everything I needed, everyone wanted to hire React developers. Running out of money, I ended up selling all my belongings, moving to a new city, and doing Backbone.js development. I've been working on learning React for the last several months without earning money, and it is more difficult to learn than either Angular or Backbone because it isn't as opinionated, driving me to learn each tool just to decide which is best. My mind craves structure, whereas most developers have 2 years of React experience on me. I figure if I had waited 6 months to learn JavaScript frameworks, React would have been the better choice and I would have been far better off. In a way the MEAN stack screwed me.


The flip side of 'hype first, features later' is that if you are a user who is burnt by mongodb (or another solution) you'll recommend against using it for a long time. So there's a knife edge to balance on--hype enough, but not so much that too many people get burned.


Yes, and Mongo helped us a lot to get things off the ground, but we've been moving everything to Postgres with JSON for some time now.


While MongoDB might be a decent document store, I found that Elasticsearch is better at this job (as a secondary datastore). Its aggregation capabilities are just far better than MongoDB's, with the added bonus of being really good for all kinds of searches.


I also quoted Rethink's post in "The Marketing Behind MongoDB" in part 3 of my series on MongoDB:

> I sympathize with RethinkDB's team — they did what thoughtful engineers are trained to do. Engineering purity and humility is a tiny part of building a sustainable, venture-backed company.

https://www.nemil.com/mongo/


Despite their claims to the contrary, RethinkDB also released and claimed 'ready for production use' a version of the product that was pretty broken. I used it heavily and hit numerous serious bugs. The RethinkDB devs did do a very good job of tracking down and fixing them. Software is hard.


From that post mortem:

It was unfathomable to us why people would choose a system that barely does the thing it’s supposed to do (store data), has a big kernel lock, throws away errors at random, implements single node features that stop working when you shard, has a barely working sharding system despite it being one of the core features of the product, provides essentially no correctness guarantees, and exposes a hodge-podge of interfaces that have no discernible consistency or unity of vision.

I mean... that's unfathomable to me too. He explains it later, that " MongoDB turned regular developers into heroes when people needed it"

I have a hard time understanding why devs choose / chose MongoDB. Postgres with JSON columns gets you so far, why would you go with MongoDB, given the issues it's had?
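To illustrate the point: the document-store pattern works in any relational database with JSON support. A miniature sketch using SQLite's built-in JSON functions (standing in here for Postgres's jsonb, which offers the same idea with richer operators and GIN indexing):

```python
import sqlite3

# Document-style storage in a plain relational table. SQLite's
# json_extract stands in for Postgres's jsonb -> / ->> operators;
# the pattern is the same.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute(
    "INSERT INTO docs (body) VALUES (?)",
    ('{"user": "alice", "tags": ["db", "json"]}',),
)

# Query inside the stored document.
row = conn.execute(
    "SELECT json_extract(body, '$.user') FROM docs"
).fetchone()
print(row[0])  # alice
```

You get schemaless documents where you want them, plus ordinary columns, constraints, and transactions everywhere else.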


> I have a hard time understanding why devs choose / chose MongoDB. Postgres with JSON columns gets you so far, why would you go with MongoDB, given the issues it's had?

Jsonb is a pretty recent addition to postgres, when compared with the MongoDB timeline. And even today postgres still doesn't have the replication/failover story that made MongoDB pretty compelling. I know, it's coming, whatever, but the point is that there was a time where if you wanted a json store that could stay alive through network issues, MongoDB was one of the only choices available, and postgres simply didn't have what was needed.


The problem with that thinking is the assumption that the replication mattered. I'd argue it didn't; it was essentially a scam that people fell for. Who cares about failover when you're losing data to a bad implementation? Who cares about replication when you can get the same performance from a performant database on a single node?

Did MongoDB truly allow anyone to horizontally scale? Most places that need massive horizontal scaling use something like MySQL, as far as I know.


The thing I don't understand about mongodb is that it makes a tradeoff for scalability.

The secret ingredient in the horizontal scaling sauce is giving up inter-node ACID transactions.

Nothing prevents you from making the same tradeoff with mysql or postgresql.


Indeed, that's how youtube, twitter, and facebook use MySQL, among others.


I loved working with RethinkDB, and the changefeed stuff was awesome. It gave me relational documents, which is all I wanted for most projects. Bummed that project has been basically slowed to nothing.


I'm still mourning RethinkDB. It's supposed to still be alive but the release cycle, or lack thereof, says otherwise.


Agreed. It was a complete piece of crap when it was around version 1.4-1.6 but it's pretty good now!


A database which utterly fails the Jepsen test should not be considered for production. It might be good enough for a cache, but trusting it with real data is reckless.


They have since passed the Jepsen test to be fair. But before then I fully agree, why people trusted MongoDB with their critical data is beyond me.



