And if there is a differently-licensed _implementation_ of the protocol, one would assume that it is possible to build on (or just use directly) that implementation without any reliance on this specific specification document.
In other words: IANAL but I don't think this will accomplish what they want (which is Amazon re-implementing a wire-compatible-with-MongoDB database)
As an example: I (probably) can't scan and redistribute an entire cookbook, but I can bake cookies from a recipe in the cookbook and sell them.
It does require you manage outside contributions to the open source part carefully and get approval for those contributions to land in the not-open-source part.
The basic purpose of a protocol is interoperability, yet the restrictive license works directly against that.
I also wonder about the reach of this kind of license. It's a license on the published work. So suppose I read this, whole-heartedly agree to the terms of the license, and use it to create a MongoDB Interop framework. I include attribution and release it under BSD.
Am I good?
I did not use the work for a commercial purpose and provided attribution, etc.
Now suppose some commercial cloud provider picks up my framework and uses it to implement MongoDB interop. Aren't they good too? They followed my license. They may not even be aware of the protocol specification, much less have viewed or used it in any way.
So what exactly has the protocol specification license accomplished?
(I am not a lawyer, so I'm sure I don't understand, but I'd like to.)
Note that I'm not addressing whether the wire protocol's license affects server/client implementations — that's being discussed in a different top-level thread — I'm only addressing whether a difference could exist between server/client.
(I am not your lawyer, please do not treat social media comments as legal advice, etc.)
In that case, in my scenario I would not be good releasing my library as BSD.
To my mind, writing a library according to a specification is not a derivative work of the specification. (I think a derivative work would be something like an annotated version of the specification, or a new specification that adopted parts of the original specification.) But like I say, I'm no expert on this so could very well be wrong.
Like NodeJS and many web technologies, almost everything I read about it years ago turned out to be hype and not based on facts (eg, nodejs being faster for large numbers of requests than traditional backends is not true, but it was widely repeated on the first 10 pages of google search results)
edit: very well said, thanks for all the replies!
The web-scale meme came from them actually trying to position themselves like that, and it partially worked, given that the product is not only still around but the MEAN/MERN stack remains prevalent, at least in (tech) pop culture and social media.
To this day my opinion is that Mongo is a snake oil PR firm with a "database" and I have not yet seen anything that has convinced me otherwise.
During which, data is naturally hierarchical and schema-less (ie., not defined, changing).
Mongo makes it easy just to "save your dictionary"
It's really just an "sqlite for hashmaps", and should have remained this.
WiredTiger, the default storage engine, is LSM trees with B-trees for indexes.
Additionally, Junior developers do not care about data integrity (the customer's problem), or operational complexity (the SRE's problem).
Finally, Junior developers consider software to be much like bread. Older software is bad (such as SQL relational databases), and newer software is good (NoSQL).
Combine all three of these flawed systems of thinking, and you have MongoDB adoption.
... also before Kafka, the de facto queue broker was ActiveMQ, which was a pain to deploy, so many people used Mongo as a queue.
Their aggregation pipelines are pretty pleasant to use. You basically get a bunch of data transformation code that you can avoid writing and you can work with your data without it going out on the wire.
I would probably be able to do most of that in SQL and I’m sure someone who really knows SQL can do everything it can do but I suspect the query involved will be significantly gnarlier. :)
Not sure how the two would compare performance-wise, but I do know that you can’t use “$lookup” (Mongo equivalent of joins) with sharding, which strikes me as quite unfortunate for scalability.
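For anyone who has not seen one, here is a rough sketch of what a pipeline looks like next to the hand-written transformation it replaces. The `$match`, `$group` and `$sum` stages are real MongoDB operators, but the `orders` data, the field names, and the `totals_by_customer` helper are made up for illustration, and the pipeline is only shown as data here, not run against a server.

```python
from collections import defaultdict

# Hypothetical sample documents (made-up fields, for illustration only).
orders = [
    {"customer": "a", "status": "paid", "amount": 10},
    {"customer": "b", "status": "paid", "amount": 5},
    {"customer": "a", "status": "refunded", "amount": 10},
    {"customer": "a", "status": "paid", "amount": 7},
]

# The pipeline as you would hand it to the server: filter, then group and sum.
pipeline = [
    {"$match": {"status": "paid"}},
    {"$group": {"_id": "$customer", "total": {"$sum": "$amount"}}},
]


def totals_by_customer(docs):
    """Hand-written equivalent of the pipeline above, run client-side."""
    totals = defaultdict(int)
    for doc in docs:
        if doc["status"] == "paid":                   # the $match stage
            totals[doc["customer"]] += doc["amount"]  # the $group/$sum stage
    return dict(totals)


result = totals_by_customer(orders)
```

The hand-written version also has to pull every document over the wire first, which is the part the pipeline avoids.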
There are a few instances where I've really preferred using a document database. Things like logs are often really useful to have in a document database like MongoDB. In one collection I can stick all kinds of logs. The schema of all the different fields in the document is really relevant to what kind of things I'm logging. If I were to have a column for every field I might have in the log message, each row would have a ton of null columns. If I wanted to add a new field to easily search on, I might have to add a new column. A "schemaless" design can then be very useful. Of course this is true of pretty much any document database and not necessarily only MongoDB, and these days some relational databases have decent JSON document database functionality as well. There are always certain kinds of tradeoffs to be made when choosing one database technology over another.
Another thing I've found useful with MongoDB in particular is GridFS, which has been pretty neat to use. These days there are other systems of handling binary blob storage, but as mentioned earlier there are always tradeoffs with choosing any particular technology.
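To put a number on the "ton of null columns" point, here is a small sketch with made-up log documents. It flattens them into the one-column-per-field table a relational design would need and counts the empty cells. The field names and sample events are assumptions for illustration only.

```python
# Hypothetical log documents; each kind of event carries its own fields.
logs = [
    {"level": "info", "msg": "login", "user_id": 42},
    {"level": "error", "msg": "timeout", "service": "billing", "elapsed_ms": 3000},
    {"level": "info", "msg": "upload", "user_id": 7, "bytes": 1024},
]

# A one-column-per-field table needs the union of every field ever logged.
columns = sorted({key for doc in logs for key in doc})

# Flatten each document into a fixed-width row, filling gaps with None (NULL).
rows = [[doc.get(col) for col in columns] for doc in logs]

null_cells = sum(cell is None for row in rows for cell in row)
total_cells = len(rows) * len(columns)
```

With only three events, 7 of the 18 cells are already NULL, and every new kind of event adds another mostly-empty column.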
It had a flexible data model as there is no predefined schema. The selling point was rapid prototyping.
The JSON-based documents meant developers could work in objects without an ORM. If I remember, Mongo called this "fixing the data impedance mismatch." Sigh.
It had decent performance as long as your working set fit in RAM as it was basically a mmap()'d file.
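For context, this is roughly what mapping a data file into memory looks like, using Python's standard `mmap` module. It is only a sketch of the general technique, not MongoDB's storage code; the file name and the record contents are made up.

```python
import mmap
import os
import tempfile

# Create a small hypothetical data file to map.
path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(b"hello world, this is a record")

with open(path, "r+b") as f:
    # Map the whole file; reads and writes now go through the OS page cache.
    with mmap.mmap(f.fileno(), 0) as mm:
        first_word = mm[0:5]   # served from RAM once the page is resident
        mm[0:5] = b"HELLO"     # in-place write to the mapped page
        mm.flush()             # ask the OS to write the dirty page to disk

with open(path, "rb") as f:
    on_disk = f.read()
```

Access is fast while the touched pages stay resident; once the working set outgrows RAM the OS starts paging to disk, which is the caveat above.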
It was easy to get up and running.
It was in the right place at the right time during the original NoSQL hype cycle.
MongoDB has automatic sharding, which is very important for horizontal scaling of writes. For MySQL you have to do manual sharding which is extremely hard.
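To make "manual sharding" concrete, here is a toy sketch of hash-based routing, the bookkeeping an application ends up owning when the database does not do it. This is not MongoDB's or MySQL's actual algorithm; the shard names and the `shard_for` helper are hypothetical.

```python
import hashlib

# Hypothetical shard names; in practice these would be connection targets.
SHARDS = ["shard0", "shard1", "shard2"]


def shard_for(key, shards=SHARDS):
    """Pick a shard from a stable hash of the shard key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


# Every read and write for a given key must be routed to the same shard.
placement = {user: shard_for(user) for user in ["alice", "bob", "carol", "dave"]}

# Adding a shard changes the modulus, so keys can map elsewhere and their
# data then has to be migrated by hand.
moved = [u for u in placement if shard_for(u, SHARDS + ["shard3"]) != placement[u]]
```

The routing itself is the easy part. Moving the data listed in `moved` without downtime is where the "extremely hard" comes in.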
And finally, document oriented databases are schemaless, which means there is no downtime when you add/remove fields.
Schemaless / document databases have a place and a purpose, but nine times out of ten the data we're dealing with day to day has a known and rigid structure that changes infrequently.
And you still need to do joins to enrich data from different collections.
I want Postgres back. :(
That's true. The strictness of SQL lets you use SQL as the integration point. Without it you'd have to write a service on top as the integration layer.
Exactly. There is always a schema, there has to be. It's just a question of whether it can easily be seen or if it is buried in a thousand places in the code.
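A small sketch of the difference, with made-up field names and a hypothetical `validate_user` helper: the same expectations exist either way, but in one case they are written down once and in the other they are implied by whatever code happens to read the record.

```python
# Implicit schema: each reader quietly assumes fields and types.
def display_name(user):
    return user["first_name"] + " " + user["last_name"]


# Explicit schema: the same assumptions, stated once and checked on write.
USER_SCHEMA = {"first_name": str, "last_name": str, "age": int}


def validate_user(user):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, expected in USER_SCHEMA.items():
        if field not in user:
            problems.append(f"missing {field}")
        elif not isinstance(user[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    return problems


good = {"first_name": "Ada", "last_name": "Lovelace", "age": 36}
bad = {"first_name": "Ada", "age": "36"}
```

Without the check, `bad` gets stored happily and only fails later, inside `display_name` or wherever else the missing field is first touched.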
Please define internet scale. Joining 10s of billions of rows using clickhouse here.
Vitess and PlanetScale?
That is not something specific to document databases; it is an approach taken by NoSQL datastores. It's known as denormalization, which has its own share of issues that then need to be addressed.
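A toy illustration of the trade-off, using plain Python dicts with made-up data: the denormalized form reads without a join, but a change to the copied data has to be applied everywhere it was copied.

```python
# Normalized: the author's name lives in exactly one place.
authors = {1: {"name": "Ann"}}
posts = [{"title": "p1", "author_id": 1}, {"title": "p2", "author_id": 1}]

# Denormalized: the name is copied into every post, so reads need no join.
posts_denorm = [
    {"title": "p1", "author": {"id": 1, "name": "Ann"}},
    {"title": "p2", "author": {"id": 1, "name": "Ann"}},
]


def rename_author(author_id, new_name):
    """Rename an author in both layouts; return the number of copies updated."""
    authors[author_id]["name"] = new_name   # normalized: a single write
    writes = 0
    for post in posts_denorm:               # denormalized: one write per copy
        if post["author"]["id"] == author_id:
            post["author"]["name"] = new_name
            writes += 1
    return writes


copies_updated = rename_author(1, "Anne")
```

Miss one copy, or crash halfway through the loop, and the data disagrees with itself, which is one of the issues that then need to be addressed.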
MongoDB was nice for fast iteration on new projects, i.e. no alter table scripts required, but in the end it wasn't really an improvement. The data in the DB became hard to manage since different records in a collection could have different fields, due to no consistency or enforcement. Mongoose helped some with that, but then the schema just lived in the application layer instead of the DB. I also found myself needing to do multiple queries to do "joins" across multiple collections, so it ended up being less efficient. There was also the default setting of write and forget, i.e. to not wait for a confirmation that the data actually was written. Who writes a database like that? Performance focused but not data focused. Anyway, I liked the idea of having JS on the client / server (Node.js) / DB, but after some time I came back to relational DBs.
We lost face because of that, although finally the client understood that it was all bs. Stay away from MongoDB and if you have to contact them make sure they sign a strict NDA and don't reveal any details to them.
This is pretty terrible.
My reading of the license implies it covers _the document_, like a book or a song could be licensed that you cannot modify and redistribute the results, or maybe not redistribute at all: not what you make with the information you gain from reading it.
https://wiki.creativecommons.org/wiki/NonCommercial_interpre... makes me think the license is about _copyright_, not about what you make with the document. The page also has the explanation about "Explanations of NC do not modify the CC license".
Even if the NC clarification somehow held true, then I still don't see how it would prevent someone from writing e.g. an MIT-licensed library implementing the interface (the non-commercial use bit) and then someone else just takes it and uses it for whatever, within the limitations of the MIT license.
At least that's my understanding. Am I missing something here? It's fairly reasonable for a company to licence their docs with a non-commercial licence like this.
Lawyer repellent (I hope): The quote comes from https://github.com/mongodb/docs/commit/50e48200cde7e2eaffdc6... , and anyone who receives this comment may copy it as they like.
Because it's already in source form and isn't "compiled in" into a derived form I don't think it has the same potency from an infectious license perspective as GPL does. If you bundle this document as part of your commercial database I think they can't do diddly squat as long as you make clear that the document itself is still under the CC sharealike license.
I'm pretty sure the authors really want to prevent competing implementations that make money for someone else, and the wording expresses their intent. I believe that copyright on the document won't let them stop a company that can afford decent lawyers from implementing the spec, but it certainly doesn't make me any more likely to touch anything related.
> You may not use or adapt this material for any commercial purpose, such as to create a commercial database or database-as-a-service offering.
It'll be exciting to see Oracle vs Google turn into Mongo vs AWS, which is clearly where this is eventually going.
edit: I mean, it’s not that I strongly care, but why not explain if you disagree? How exactly is a specification not a form of documentation?
The relevant definition for spec would probably be:
> a detailed description of the design and materials used to make something.
That sounds an awful lot like documentation.
Maybe you are implying that you can’t read it and then make your own implementation of that reading, but that’s not true. IANAL, but I’d be willing to bet my lawful ass it means you can’t reproduce the document for your own commercial implementation. Not that you can’t reference it. I don’t think “derivative” is that strong.
Even if you subscribe to the belief that the protobufs are eligible for protection under copyright law, which is fair since it seems to be the current understanding even if it is relatively new, I still don’t understand how creating compatible specs without reversing is illicit. It’s fair use, no? I thought interoperability was a valid claim for fair use.
They have great marketing and appear to solve problems for some customers, but they also seem to cause major problems for customers.
So I'll just keep advocating to stay away from it.
Every single company I've ever worked for was crushed by the unreliability of mongo. Their ultra-expensive consulting is also a ripoff - in one case the guy came, suggested a bunch of stuff w.r.t. changing up our queries and indexes, left, then a day later the database exploded and we had to roll back everything he suggested. We tried again piecemeal, which eventually led to the same thing happening again. Eventually we spent the cash to train the engineers and admins to be able to do the tuning ourselves - which ended up being completely different than the garbage the consultant suggested. Let me emphasize that this consultant was from the MongoDB company - not some third party. Completely incompetent company at all levels.
Mongodb exists to extort money from idiots on their cloud offering where costs balloon out of control. It was easily 40% of our monthly infrastructure bill and what we got out of it definitely did not reflect that.
We refused to pay them. Don't ever give Atlas money.
What I do not agree on is that their Atlas product is bad. It has a very nice and helpful dashboard. Atlas is quite solid when used from Google Cloud (downtime/slow performance just once in 2 years) and their consultant was very helpful. Not really fast response times, but it was not urgent. Also their offering is not super expensive, but we only have 50-100 GB of data. Consultancy was good value for money.
Don't give Atlas money, because you probably want a relational database for relational data and Elastic (or Postgres if small scale) for search or statistics. MongoDB also sucks with scaling: it suffers from the same issues as normal databases, except that they don't call it a problem. Which means you, the developer, should fix it. It also sucks with scaling because expertise on using Mongo is scarce.
Almost done at this client. Learned a lot about mongo. Would never recommend it to another client. Would recommend running managed sql databases by a cloud provider if their offering was as good as atlas.
That said, I think that the wire protocol is probably ~fine for a schemaless document store if that's what you want. I know that Apple implements MongoDB with FoundationDB, so that there are much stronger guarantees behind it, but they can still use MongoDB drivers in various languages, and that seems reasonable.
Older discussion for an older version of mongodb:
Mongo is just a poorly designed piece of software you shouldn't trust for any mission critical service, unless you are willing to dedicate a lot of resources to constantly put out fires.
Investors are often oblivious to these details. If the company is not doing as well as expected due to crippling tech debt, high turnover rates, dumb decisions, etc. the CEO will come up with an excuse (since COVID, this is easier than ever).
MongoDB is an immature product, yes.
But it is good at some things.
As long as you learn what things it is good at and what things it is not good at, it can be quite a viable solution, depending on your problem.
Learn and plan accordingly.
My experience is that the things that mongo is "good at", there are competing products that are just as good. Mongo downsides - such as not giving a crap if you lose data - make it a non starter when there are so many better products that actually protect your data.
So what is your experience? I have stated mine.
> My experience is that the things that mongo is "good at", there are competing products that are just as good.
So what does that mean? Nothing.
Every product is a set of compromises. The one that is suitable doesn't need to be perfect in every (or any) respect. It just needs to have the set of compromises that suits your project.
> Mongo downsides - such as not giving a crap if you lose data - make it a non starter when there are so many better products that actually protect your data.
Databases do not (typically) "protect" data. There are some databases that make it impossible to remove data once stored, but in general if you count on your database to prevent data loss, you are just waiting for a junior engineer to make a blunder and remove half your database, whether it is MongoDB or Oracle.
What all this means is that you cannot count on any database to prevent data loss and you have to organize some kind of way to protect your data. This usually means some kind of backup, snapshotted replica, redo log, etc.
This is a laugh. There are decades of work and research put into databases and making sure that what you put in comes out correctly and consistently. Things which mongo referred to as "not web scale" for years.
Obviously you can lose data when you make stupid mistakes or God decides to smite you - that's not what I was referring to. Pretty good strawman though.
Atlas works fine and it's reliable, on the consultant thing I can't comment, but MongoDB as it is today is a good DB.
 https://www.postgresql.org/about/donate/ ("PostgreSQL is an affiliated project of Software in the Public Interest. Funds donated to PostgreSQL are used to sponsor general PostgreSQL efforts. These funds are managed by the Fund raising group.")
 https://www.computerweekly.com/news/252455700/AWS-pushes-Mon... ("AWS pushes MongoDB compatible alternative as licences change")
It does not restrict you from building applications that derive knowledge from MySQL internals documentation.
Like, if Amazon wants to spend millions of dollars writing their own version of MongoDB that is drop-in compatible for existing MongoDB applications, that is great for customers. You, the customer, now have two choices, and options when things don't work out with one implementation. When you use license agreements to restrict that ability, I think it's a statement about the quality of your product -- you think Amazon can do it better than you, so you're going to use the legal system to prevent them from trying. As a database user, that is a signal for me to stay away. It means that when I'm having a scaling emergency there is only one way to fix it -- give most of my revenue to one company. That's scary.
Yes, Amazon did almost kill Elasticsearch. I sometimes wonder if the set of circumstances translates 1:1 to other companies. The products have similar names -- AWS has the Elastic Compute Cloud, and then there's this Elastic Search. That's not their product? Nope, just an unfortunate naming choice. And, Elasticsearch was particularly difficult to run at the time, so a hosted option was clearly valuable. And finally, Elasticsearch had a lot of company-killing problems: basically not doing what it said it did. https://aphyr.com/posts/317-call-me-maybe-elasticsearch At the time, Elasticsearch was losing acknowledged writes. That's a company killer if your only product is a database.
If you were intending to create a profitable business, and really wanted your product to be foss, and Amazon might just fork your project, honestly, what are your options? I'm honestly curious, it seems like such a difficult situation
> GNU Affero General Public License is a modified version of the ordinary GNU GPL version 3. It has one added requirement: if you run a modified program on a server and let other users communicate with it there, your server must also allow them to download the source code corresponding to the modified version running there.
> The purpose of the GNU Affero GPL is to prevent a problem that affects developers of free programs that are often used on servers.