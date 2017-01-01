
MongoDB 3.4.0-rc3 (jepsen.io)
MongoDB receives a fair amount of criticism here but the company is a fantastic place to work. I'm proud that I was able to learn and grow as a developer alongside all of those who have been trying (and succeeding) to make a great database.

The team at MongoDB really cares a lot about making the best database product possible. I knew it when I was there and still think so after I've left.

When building a company, I care more about culture than anything else. It's the people that make the company great.

When choosing a database, caring is only one aspect of it. It's necessary, but not sufficient.

MongoDB as a product evolved tremendously in the last few years. All the improvements introduced with WiredTiger and other major releases have seriously improved the performance and reliability of the product. It's not a perfect solution, but the majority of criticism is something that was written about version 1.8.

>"MongoDB receives a fair amount of criticism here but the company is a fantastic place to work"

None of the criticisms I have ever seen or heard about MongoDB were related to Mongo Inc and their office culture but rather their product.

Bigger news is that Jepsen tests are now part of the MongoDB continuous integration suite: https://evergreen.mongodb.com/build/mongodb_mongo_master_ubu...

Open and available for everyone to see, for every build of MongoDB. Is there another database that has this much transparency? (for every build)

Given their start point (a product unfit for public consumption) that is the absolute minimum they need to do.

I at least will never trust Mongo for anything but a toy project. There are so many better options out there, options whose technical capabilities are as good as Mongo's marketing.

I think if you look back objectively, there are very few database platforms that were absolutely "fit for public consumption" right out of the box. Look at all the SQL Server shops out there (mine included) that won't even roll out a new version of SQL Server until it hits SP 1 at a minimum... For MongoDB, if you look forward based on what they are doing now rather than at how early adopters may have had a sub-optimal experience way back when, you'll see a mature product that is consistently improving and is demonstrably reliable. Can you give an example of another option you are referring to?

In the same NoSQL space as MongoDB? I can't remember even one that fully passed a Jepsen test. Can you post some links to the better NoSQL options?

And big news for me is the existence of that Evergreen tool as an open source project. Looks great! Is anyone else using it? Is there a comparison to Jenkins available?

It's been a long way from the "Call Me Maybe: MongoDB" post from years back. Aphyr/Kyle took them to task in so many ways for playing fast and loose with data integrity, and rightly so. MongoDB could have said, "that guy's full of BS, ignore him," but instead they did the smart thing and paid Kyle to help solve the problem.

n.b. I can't find the original "Call Me Maybe" post, but this later one [1] is similar.

[1]: https://aphyr.com/posts/284-jepsen-mongodb

Call Me Maybe post here (updated in Jan 2017): https://aphyr.com/posts/322-call-me-maybe-mongodb-stale-read...

HN Discussion: https://news.ycombinator.com/item?id=9417773

MongoDB 3.4 passes the rigorous and tough Jepsen test. Jepsen designs tests to make databases fail in terms of data consistency, correctness, and safety... MongoDB 3.4 passed through their newest tests.

I think that this really shows how mature of a Database MongoDB is.

While you are correct that 3.4 now passes, I feel like your interpretation here is a bit optimistic. It's not like I tested Mongo and it passed out of the box--it failed Jepsen tests, and not just with a read anomaly--it lost majority-acknowledged inserts. The v0 protocol still fails--it's fundamentally broken. The v1 bugs are fixed now, but that's a consequence of our collaboration.

I'd say the marker of maturity here is that MongoDB has put significant time and effort into correctness: they take clock skew and network partitions as serious failure modes, they've redesigned their replication protocol, added options for stronger reads, and invested in their own correctness test suite, and Jepsen tests, as a part of their CI process.

I'd like to use the occasion to thank you for the service you provide to the community. The tests themselves, the collaboration with vendors, the competition it fosters and the in-depth technical write-ups I think push the whole ecosystem forward, which everyone eventually benefits from. Much obliged.

Note it’s not really “The Jepsen Test” as the tests are all a bit custom and different. Sometimes they’re looking to validate different claims.

For example, Kyle integrated some serializability tests into the VoltDB Jepsen testing that wouldn’t apply to Mongo.

> I think that this really shows how mature of a Database MongoDB is.

Or that it took this long for them to pass basic proficiency tests. How do other database systems fare with these tests?

If you look at past Jepsen tests, mostly they fail, and sometimes they fail spectacularly. This is one of those things where if you don't have an automated test suite to continually verify correctness you will end up with bugs.

You can check out the other databases he has tested here: http://jepsen.io/analyses

That being said, none of the big players in sql are there, so you can't size it up against postgres or mysql.

Postgres has been analyzed, I'm not sure why it's not on that list.

https://aphyr.com/posts/282-call-me-maybe-postgres

Isn't that because they are not distributed systems?

Galera was tested - enabling MySQL to behave like a distributed database

> In addition, we show that the new v1 replication protocol has multiple bugs, allowing data loss in all versions up to MongoDB 3.2.11 and 3.4.0-rc4. While the v0 protocol remains broken, patches for v1 are available in MongoDB 3.2.12 and 3.4.0, and now pass the expanded Jepsen test suite.

Wait, is it me not understanding what the abstract is trying to say?

It's saying it was fixed in the final 3.4.0 release (or, maybe, that there's at least a patch available) for the v1 protocol, but that the v0 protocol is unfixable - so, 3.4.0 + V1 protocol and it's solid.
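To make that reading of the abstract concrete, here is a minimal Python sketch of the rule as I understand it: v0 is never considered safe, and v1 is considered fixed from 3.2.12 on the 3.2 line and from the final 3.4.0 onward. The function names `parse_version` and `has_v1_fixes` are hypothetical, and treating every release candidate and every 3.3.x build as unpatched is my own assumption, not something the analysis states.

```python
import re


def parse_version(version):
    """Split '3.4.0-rc4' into ((3, 4, 0), True); the flag marks a release candidate."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(-rc\d+)?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    numbers = tuple(int(part) for part in match.group(1, 2, 3))
    return numbers, match.group(4) is not None


def has_v1_fixes(version, protocol_version):
    """Return True if this version/protocol pair is covered by the fixes in the abstract."""
    numbers, is_release_candidate = parse_version(version)
    if protocol_version != 1:
        # The abstract says the v0 protocol remains broken.
        return False
    if is_release_candidate:
        # Assumption: treat every release candidate as unpatched.
        return False
    if numbers[:2] == (3, 2):
        # Patches for the 3.2 line landed in 3.2.12.
        return numbers >= (3, 2, 12)
    # Assumption: anything older than the final 3.4.0 (including 3.3.x) is unpatched.
    return numbers >= (3, 4, 0)
```

For example, `has_v1_fixes("3.2.11", 1)` is false while `has_v1_fixes("3.2.12", 1)` is true, which matches the "3.4.0 + v1 protocol" summary above.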

Giving aphyr some money to do this makes me think much more of Mongo now; I wouldn't previously have considered it for serious use but this is an excellent demonstration of intent.

I can only think that the article was a bit late and the new MongoDB version has the last word. Beyond this, the article is extremely valuable and it is mandatory to upgrade your MongoDB.

This analysis was a collaboration with MongoDB over the past three months. MongoDB was able to fix these issues in 3.4.0 because I found them in rc3 and rc4.

Sorry, my fault.

It's great that they're serious and committing resources to passing these tests. They only just did, though, after nearly a decade of development. That's terrific, but it doesn't really show 'how mature of a database MongoDB is'.

I think this is the 10th year of MongoDB's existence. This year they seem to have not just a popular product but a good one. It will be interesting to see whether the quality-to-popularity ratio is similar for most non-traditional databases.

I hadn't noticed this before, but aphyr has an elaborate and articulate ethics policy: http://jepsen.io/ethics

I like it.

It's really good. I feel like a lot of security consulting shops could benefit from it as well.

So a client can get the analysis done and leave it unpublished if they don't like the results? Maybe that's standard practice but it doesn't seem great for users.

I agree. In my ideal world I publish everything immediately.

There's no "standard practice" that I know of--very few people are doing this kind of work. Behind every one of these analyses is weeks of contract negotiation where I try to convince assorted lawyers & CFOs to go along with my weird, idealistic ethical policies. And conversely, those lawyers and CFOs do their best to balance the desire for correctness with their duty to protect the company. This policy outlines how we compromise.

Originally I did offer a client veto, because clients weren't willing to sign without some assurance of control over the outcome. My current standard contract for analyses actually drops that client veto: I have final say on the analysis content and publication. There's still a grace period to allow the vendor to fix bugs and get things in order. I'm adopting this as the standard going forward, but it's been a long road to get there.

That said, I do perform private consulting--usually for in-house systems, sometimes for clients that can't afford a full analysis, and sometimes as a precursor to more involved work. That means that yes, vendors may be aware of bugs and I won't have told you about them--but I promise that my public work remains honest and forthcoming.

Where does the policy say that? I see:

> "Once I've prepared a written analysis, the client may defer publication by up to three months. This gives the client a chance to understand the faults I've identified, attempt to fix bugs, update documentation, and deploy fixes to users."

So, a client can get the analysis done and attempt to defer publication for three months, after which time it'll be published. No?

Isn't their policy just responsible disclosure? If you find a security bug, shouldn't the company have a chance to fix it before it goes public?

This is great news. Historically I've accepted the risk of data loss and coded checks when needed. I will never rely on my database, regardless of which one I am using, for complete data consistency. It is, however, nice to see strides being made towards even better robustness. Go MongoDB!

> I will never rely on my database, regardless of which one I am using, for complete data consistency.

I can't imagine developing any software that involves relationships between entities that does not have data consistency. Check constraints, foreign keys, and data type validation all provide a minimum sanity level of the underlying data that allows your mind to focus on more important things. Otherwise your entire application is going to be littered with those same sanity checks.

I'm not saying that all apps need that type of data store or that there isn't room in this world for NoSQL stores, I mean specifically that complicated interdependencies and validation checks lend themselves well to the relational model.
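As a rough illustration of the point about constraints, here is a small sketch using Python's built-in `sqlite3` module. The `customers` and `orders` tables and the helper names `build_demo_db` and `try_insert` are made up for the example, and it assumes a SQLite build with foreign key support (the default in standard builds). SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, so the sketch sets it explicitly.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a foreign key and a check constraint."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign key enforcement off unless this pragma is set.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customers(id),"
        " quantity INTEGER NOT NULL CHECK (quantity > 0))"
    )
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'alice')")
    return conn


def try_insert(conn, customer_id, quantity):
    """Attempt to insert an order; return False if the database rejects it."""
    try:
        conn.execute(
            "INSERT INTO orders (customer_id, quantity) VALUES (?, ?)",
            (customer_id, quantity),
        )
        return True
    except sqlite3.IntegrityError:
        # Raised for both the foreign key and the CHECK violation.
        return False
```

With this schema, an order for an unknown customer or with a non-positive quantity is rejected by the database itself, so the application does not need to repeat those checks at every call site.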

I think maybe the doubt is based on cases like https://aphyr.com/posts/282-jepsen-postgres, which all systems are subject to.

Why would anyone use MongoDB when RethinkDB is available?

