FerretDB: A truly open-source MongoDB alternative (github.com/ferretdb)
136 points by kiyanwang on Dec 8, 2021 | 110 comments



See also this discussion when FerretDB was still called MangoDB: https://news.ycombinator.com/item?id=29071623


That was a good name, but too close to MongoDB. FerretDB does seem to follow the rule that databases need to have terrible names.


We indeed had a hard time finding a good name, especially with the situation around domain names nowadays. An infinite amount of time can be spent on these things. However, we decided that we would rather focus on things which are more likely to determine the overall success of FerretDB - the vision and the execution. We quite like ferrets, though!


The idiom "ferret away" means to store something in a secret place, and the idiom "ferret out" means to find something by careful searching. Seems like a pretty good name for a database.


This is exactly why we considered the name after all.


Also Ferrets are smart and nimble and make great pets :)


But in California they are illegal to own without a permit.


Fun little critters. They're cute, but stink even with their scent glands removed. Vicious even when domesticated, too. Our 60lb Rhodesian Ridgeback cross was scared of ours, while it tried to "play" with her. Bloodthirsty. One day my brother's pet ferret escaped from the house, went over to the neighbours' garage, killed 8 or 9 kittens with a bite to the back of the neck, and then just left them. Eventually it escaped one last time... when an owl caught it. Cycle of life.

How's that for off topic?!


Databases always stink. There is no way around it! So why not embrace it? :)


At least it’s better than cockroach.


Don't forget clients too - looking at you DBeaver


Now we have FerretDB, CockroachDB. I wonder what animal they'll name a DB after next.




MongoDB is an even more terrible name in my country. I know they named it that because it’s humongous but in my country, “mongo” is a derogatory term that is equally offensive as calling someone a retard and is derived from the word “mongolid”.

I remember when I grew up we used to call each other mongo as a joke and not meaning anything bad about it, but then on one occasion a teacher overheard us and told us that she was sad to hear people use the word that way because she had a grandchild with Down’s syndrome. I think this was one of the first times I learned that words said to one person can be hurtful to someone else even if they were not the person you said it to. And after that I’ve tried to be more conscious about which words I use in public.


Frankly, I'd be less offended by someone calling me a retard than I would if someone suggested that I use MongoDB


Dang, what a quick wit!


[flagged]


Every time I hear “mongodb” I think of that guy from Blazing Saddles that punches horses.

My employer thought it would be cheaper to use MongoDB than to hire people who understood SQL. Then free turned into a subscription, a support plan got added in, and now it’s a six-figure expense per year. Plus it turns out that developers who don’t understand SQL are really unbelievably bad at everything else database related, so all the MongoDB queries end up being linear scans, and the apps fall over when the dataset grows beyond core memory size, because the app developers like to read the entire collection into RAM, then iterate through it manually and pick the records they want.

It’s a huge rewrite when they can no longer do that.


[flagged]


"spaz" is indeed short for "spastic", but it's not typically seen as an ableist slur like it is in the UK. Presumably, the word had been out of use as a term for people with cerebral palsy long enough in the US that it passed through the "Euphemism treadmill"[1], and is no longer associated with actual disability (similar to "imbecile" or "moron").

[1] https://en.wikipedia.org/wiki/Euphemism#Lifespan


BTW I find it telling, that the entire thread has been flagged and killed. If you tell someone in Germany unfamiliar with the NoSQL space that you plan on using "MongoDB" they'll look at you like you said you plan to use "R*tardDB".

It's a genuine real-life problem and the general category of problems is especially well-known in the marketing space. It's probably more widely known in Europe than the US (where English-only is historically the norm) but there are a few examples that occasionally get brought up on HN (e.g. the island "Laputa" being named differently in Spanish editions of works featuring it, although the unfortunate implication possibly being intentional in that case).

Outside the database space there's also Wix, who after problems getting traction in Germany tried to "own" the implication of their name by using "Ich wixe" ("I jerk off" but misspelled) or "Ich bin ein Wixer" ("I'm a wanker" but misspelled) in their German ads. I don't see any company being able to pull this off with a name that sounds like an ableist slur though.

I'm not saying Americans are (intentionally) ableist when they say "spaz" but it's definitely a word that can provoke very different reactions when used in an international context. I don't know enough about the etymology to say whether it's like "fanny" (which refers to two very different body parts depending on which country you're from) or "c*nt" (which pretty much means the same thing regardless of where you're from but can either be a friendly taunt or extreme insult because of cultural differences) but it's certainly something to be aware of.

This isn't the kind of problem you'd want to have with the name of your product is all I'm saying. I guess I'm just happy FerretDB didn't go with SpazDB.


Explaining the Throttle option to my German colleagues is sometimes awkwardly funny (in German, "Trottel" = stupid person)


Is "th" pronounced like a "t" in German? (Or is the German "t" pronounced like the English "th"?)


German doesn't have a "th" sound (okay, technically it's more correct to say German has neither of the two "th" sounds as this is an important difference when pointing out that English only has one of the two German "ch" sounds and only in some variants of English and almost exclusively in Scottish words but I digress).

But based on my experience it's a coin toss whether German speakers who don't know enough English to be able to make an attempt at pronouncing it will go for "trottel" or "srottel" in that case, with the latter being even more likely if they go by ear rather than reading the word spelled out.


The first.


  > “mongo” is a derogatory term that is equally offensive as calling someone a retard
Git and the Gimp are just as bad in English.


Those aren't really comparable, though. "Git" is just an insult:

> git, noun. British Slang. a foolish or contemptible person.

"Gimp" can be an ableist slur for someone with a walking impairment in US and Canadian English but also to someone engaging in BDSM (hence "gimp suit"). The GIMP project has also seen a fork in part due to conflicting opinions about whether the name should be changed (tho I think more people were aware of the BDSM meaning and felt uncomfortable with having sexual innuendos in professional software).

"Mongo" (in German, and presumably Dutch also, anyway) specifically is used to mock cognitive disability and is derived from the historical medical term "mongoloid" which in itself was already racist. Unlike git and GIMP it's also extremely likely none of the people involved in picking the name were aware of its unfortunate meaning since the name sounds innocuous in English.


> "Gimp" can be an ableist slur for someone with a walking impairment in US and Canadian English but also to someone engaging in BDSM (hence "gimp suit").

The sexual term's etymology is the ableist slur. The gimp suit gimps you. It's the same thing.


Sure but that's one layer removed. The adjective/verb is ableist, the noun refers to a specific sexual fetish but derives from the former, the product name is (in my experience) mostly understood as a reference to the noun. So ableism -> sexual reference -> product name.

Whereas in countries where "mongo" or some version thereof is a slur, for MongoDB it's racism -> ableism -> product name or even racism + ableism -> product name

The difference is subtle but IMO what makes it worse is that while "gimping" is a dismissive reference to disability, "mongo" is a slur directly mocking a person with a (real or alleged) disability. As I said, you wouldn't want to call your product "Ret*rdDB" either.

But I think arguing over which is worse misses the point: it's a good idea to check for foreign language implications when picking a name and you should avoid relying on juvenile puns.


Hah, I wasn't taking FerretDB seriously because it mentioned it was previously MangoDB. I was only familiar with https://github.com/dcramer/mangodb.


This has been attempted before:

https://github.com/torodb/server

I emailed the project authors a while back and they said that unfortunately it had been abandoned. A shame. I hope the FerretDB people can pull it off.


ToroDB founder here.

Thank you for mentioning this. Unfortunately, yes, ToroDB is no longer being developed. I still believe it's a fantastic idea, and provides significant value. But when it was being built, 5 years ago, the NoSQL (as in "abandon SQL") state of mind was too strong, and the value proposition was not well understood.

I moved to work on what's always been my passion and preference: Postgres, Postgres, Postgres. For those interested, StackGres[1] is what's now my company's focus.

Things may be different today with ToroDB. The technical foundations and ideas are still there. If there were significant interest from entities that would like to contribute to its development, it could be considered.

I wish good luck to FerretDB. The task ahead is not easy: the MongoDB protocol is very simple, but the API is terribly complex and full of nuances. Getting a simple PoC up and running is very simple. Getting from there to a production-quality state with notable compatibility is very hard.

[1]: https://stackgres.io


Thank you for the wishes (FerretDB co-founder here), and also for your guidance before we started off on the project. We understand it is not easy, but it needs to be done, so we are dead set on delivering on this.


Yes, good luck! :)


The last point you made was exactly the problem when I attempted an experiment to migrate a heavy-IO MongoDB workload to DocumentDB - the latter imploded like a dying star after flipping the switch (under the hood it seems to be running a postgres/aurora-like storage layer). Implementation details matter, wiredtiger being one of the largest details.


I may be mistaken, but I think ToroDB never tried to be a MongoDB replacement, but rather something you use as a MongoDB replica to get your data into PostgreSQL so you can query it with SQL for analytical projects?


It was both. There were two separate pieces of software based on the same underlying technology:

* ToroDB Stampede[1]: MongoDB replica, converting on-the-fly documents to relational structures. Targeting OLAP, as data normalization made queries from some % faster to 2-3 orders of magnitude faster.

* ToroDB Server[2]: what DocumentDB is or FerretDB is planning to be. It was less developed than Stampede, certainly.

[1]: https://github.com/torodb/stampede/

[2]: https://github.com/torodb/server

(edit: formatting)


Perhaps I'm just being naive, but I still fail to understand the animosity towards the SSPL license.


The wording in sspl is awkward in its vagueness of its scope, specifically this clause:

> “Service Source Code” means the Corresponding Source for the Program or the modified version, and the Corresponding Source for all programs that you use to make the Program or modified version available as a service, including, without limitation, management software, user interfaces, application program interfaces, automation software, monitoring software, backup software, storage software and hosting software, all such that a user could run an instance of the service using the Service Source Code you make available

For example, providing SSPL software running on AWS as a service is probably/possibly license infringing, because you are not able to provide source code for all the stuff you are using.

In practice probably even the most well-intentioned service provider can easily be trapped by that clause.


For instance, does it mean you can only run on an open source OS if you are providing it as a service? I guess not...?

Personally, as an open source user, I want to be able to pay whomever I want to host a product I'm using. The SSPL seems intended to prevent this. It's intended to prevent cloud providers from offering it as a service; if it didn't do so successfully under the exact terms, they'd change the terms to do so, because that's the goal. Whereas in fact, as a user, I want the freedom to pay whomever I want to host it for me. If it can effectively only be self-hosted (whatever that means!) or hosted by officially licensed vendors, that's not what I'm looking for in open source. I don't want hosting-provider lock-in.

So, while we could legalistically look at the exact terms, I'd rather just have a license that is not designed to discourage/limit/prevent one of the things I want to do with the software, which is of course why we choose open source in the first place.


You can still host mongoDB anywhere you'd like and manage it yourself. What they would like to prevent was AWS and the like offering "MongoDB as a Service", because that's how they make money and fund development of the product.

If you enjoy freedom in open source and avoid lock-in, you will probably be hosting Mongo on an EC2 instance, for example. SSPL provisions don't apply to that.


I would like the freedom to choose to use MongoDB from a MongoDB as a service offering, not being limited to only certain licensed service-providers. That is the kind of freedom I choose open source for.


It would be better if the license straight up forbid offering it as saas instead of hiding that behind conditions that are practically impossible to comply with completely.


This is not practical for many organizations which would want choice of vendors rather than do it inhouse or be hostage to MongoDB Inc.


I'm wary of it because "make the functionality of the Program or a modified version available to third parties as a service" is vague.

If I have a web app that persists data in MongoDB and lets users query it in a complicated way[0], but doesn't provide an outright MongoDB-as-a-service implementation, it's still arguably making its functionality available. I don't trust them to enforce the edge cases fairly.

[0] For example, a custom report builder for an inventory management system, or a query builder for a CRM


MongoDB's page says this "includes, without limitation, enabling third parties to interact with the functionality of the Program ... remotely through a computer network". That would include all (web) apps with a mongodb connection. It then adds "offering a service the value of which entirely or primarily derives from the value of the Program". That's the vague bit for me.


This sounds like the least arguable edge case possible even if they were interested in chasing it (which they wouldn't be).


They wouldn't be... until they go under.

Then the vultures will be happy to shake folks down to pay for the licensed version or face a lawsuit. Merit doesn't really matter; they have a claim and fighting it will cost you. They'll just bank on you preferring to pay for the license.

They may not go after the mom and pops, but they'll hit every Fortune 1000 (and probably whether they use MongoDB or not).


Also unlikely. Vultures go for easy pickings.


Although I admit that it's unlikely they will sue you for a product that isn't just a database, you shouldn't build your product on the (perceived) goodwill of the MongoDB developers.


Meanwhile, half the world's start-ups base their entire tech stack on the goodwill of Amazon and Google not to pull the rug from under them in hundreds of legal ways.


Using your definition of "goodwill" means basically everything on the Internet is based on the goodwill of some other entity. I think that paying for a service pretty clearly moves it out of the "goodwill" category.


FAANG are pretty explicit about hating AGPL and SSPL coz they kind of target them. That leads to a lot of second order animosity.


It makes no sense to use FAANG in this context.

It's the cloud providers that are the target of the license, specifically AWS.


One of the "A"s in "FAANG" is the same "A" in "AWS", so it does indeed make some sense.


And Google (GCP).


But not Microsoft, to which Google is a distant third place as far as the cloud goes.


Plus, isn't it MAANA now? (Meta and Alphabet)


My take is simple: DBaaS is the leading way databases are going to be consumed in the future, and SSPL ensures MongoDB is akin to proprietary databases in this deployment mode. Same as with Oracle: while other cloud vendors can offer MongoDB, they have to do it under the thumb of MongoDB Inc.

Open source, on the other hand, ensures you have a choice of vendors.


correct me if i am wrong but imo sspl is a superset of AGPL, because i have read somewhere that agpl fails if kept behind an api. you don't need to publish source and all that.

the problem with non-sspl licenses is, lets say with bsd, that you give the downstream developer the right to decide if your work is part of software which gives source to users when the concern of "Free software" is that end users MUST be given source.

when it comes to cloud providers, even if this agpl license is true, google already bans AGPL software but not aws but i'm not sure but that means they dictate to the end user with vendor lock-in and stuff.

anyways SSPL aims to accommodate even this loophole by making sure if you are a provider, you have to provide source code to ALL software you use. this means, for an end user you can't be forced into a small free software carrot but still subject to rest of closed source.

i'd say this is a win-win for end users, intermediaries and developers don't matter when it comes to freedoms of end users


In the words of Matthew Garrett "There are many AGPL projects where you literally can't pay someone money to avoid the AGPL. There are zero SSPL projects in the same position."

https://twitter.com/mjg59/status/1354698533094318082


I would assume that this is a good thing.


"All rights reserved" is a subset of AGPL. We're just not sure how far from AGPL the SSPL actually falls, because it is worded ambiguously.

In practice the original claimed aim of the license does not matter that much.

Thankfully there are other licenses similar to AGPL, like BSL.


The SSPL license is intended to stop cloud providers from offering it as a service. People will have to use MongoDB Inc.'s Atlas for that.


Or you can implement the MongoDB API like AWS did with DocumentDB.

SSPL license is more intended to stop the less sophisticated hosting providers.


Isn't that just MongoDB frozen in time before it was re-licensed under SSPL? Or is it actually a completely new query/storage engine that is MongoDB client-compatible?

In any case, I don't really like any of Amazon's homemade databases after the disaster that SimpleDB turned out to be.


DocumentDB is an entirely separate implementation built on Amazon's internal tech which just speaks the MongoDB wire protocol. If they use any pre-SSPL MongoDB code it's just incidental bits and pieces.


Side note: From my experiences with DocumentDB, during the fires I had to put out trying to migrate a large MongoDB workload to it, it looks like it's Postgres or Aurora-Postgres under the hood (based solely on duck-typing around features, identifier constraints, storage limitations, billing, etc.)


My issue is more with the software than the license - I'm comfortable running postgres at scale, less so mongodb. This will be great for people like me and widen the range of software that I can use off the shelf.


I don’t understand, your problem isn’t with the license but the software, but you don’t explain why you prefer FerretDB’s software? Is it more reliable than MongoDB?


It doesn't make open source function as free labor for billion dollar companies. Look at the top contributors to OSI and look up that time a Facebook lawyer badmouthed anything but the most liberal licenses at FOSDEM.


Is there a suite of conformance tests and/or benchmarks for MongoDB?

A blackbox test suite that tests common query types and could be run against mongo itself (similar to pgbench, TPC-C, etc) and can optionally test for correctness/speed would go a long way towards making it easier to trust this project with workloads.


This is an early-stage project, so you're unlikely to be able to run any real MongoDB applications with it yet. Benchmarks are yet to come.

100% compatibility is not really a focus; real application use cases are. You can read the FerretDB CEO's take on compatibility here:

https://www.ferretdb.io/2021/12/07/mongodb-compatibility-wha...


How is it a drop-in replacement if 100% compatibility isn't a focus?


The linked blog post tried to explain it the best we could.


The MongoDB test suite used to test compatibility is here:

https://github.com/mongodb-developer/service-tests


Thanks for the pointer, couldn’t find this myself!


How does FerretDB represent documents in Postgres? JSONB, or something custom?

How does it handle the mismatch between jsonb and bson? In particular:

* more scalar types, including blobs. One hack I could think of is using ISO 8859-1 encoding instead of UTF-8, which can act as a pseudo-blob.

* preserved field order
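
To make the ISO 8859-1 idea above concrete, here is a minimal Python sketch. It only illustrates the round trip through a JSON string; the helper names `blob_to_text` and `text_to_blob` are made up for this example and are not part of FerretDB or any driver.

```python
import json


def blob_to_text(blob: bytes) -> str:
    # ISO 8859-1 (latin-1) maps every byte 0x00-0xFF to the code point with
    # the same number, so decoding never fails and is fully reversible.
    return blob.decode("latin-1")


def text_to_blob(text: str) -> bytes:
    # Inverse mapping; raises UnicodeEncodeError if the string contains
    # code points above U+00FF (i.e. it was not produced by blob_to_text).
    return text.encode("latin-1")


# Round-trip every possible byte value through a JSON document.
blob = bytes(range(256))
as_json = json.dumps({"payload": blob_to_text(blob)})
restored = text_to_blob(json.loads(as_json)["payload"])
assert restored == blob
```

Two caveats with this hack: PostgreSQL's jsonb type rejects the `\u0000` escape, so a blob containing a NUL byte would not survive there as-is, and every byte at or above 0x80 takes two bytes once the string is stored as UTF-8.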


FerretDB co-founder here.

The order of fields is maintained by adding a special field with keys in the original order. For example, `{z: 1, a: {y:2, x:3}}` is stored as `{"a":{"x":3,"y":2,"$k":["y","x"]},"z":1,"$k":["z","a"]}`.

Additional types are stored as objects with special fields too. For example, binary values are stored as `{"$b": "<base 64 string>", "s": <subtype number>}`. The full mapping is there: https://github.com/FerretDB/FerretDB/blob/b7e8240607e043a858...
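
A rough Python sketch of the scheme described above, for readers who want to see the round trip. This is not FerretDB's code: the `encode`/`decode` helpers are hypothetical, the binary subtype is hard-coded to 0 as an assumption, and `sort_keys=True` merely stands in for jsonb not preserving the original key order.

```python
import base64
import json


def encode(value):
    """Convert a document into a JSON-safe form that records key order."""
    if isinstance(value, dict):
        out = {key: encode(val) for key, val in value.items()}
        out["$k"] = list(value.keys())  # remember the original field order
        return out
    if isinstance(value, list):
        return [encode(item) for item in value]
    if isinstance(value, bytes):
        # Binary values become a tagged object; subtype 0 is an assumption.
        return {"$b": base64.b64encode(value).decode("ascii"), "s": 0}
    return value


def decode(value):
    """Rebuild the original document, restoring field order from "$k"."""
    if isinstance(value, dict):
        if "$b" in value:
            return base64.b64decode(value["$b"])
        return {key: decode(value[key]) for key in value["$k"]}
    if isinstance(value, list):
        return [decode(item) for item in value]
    return value


doc = {"z": 1, "a": {"y": 2, "x": 3}}
# sort_keys=True simulates a store that reorders object keys.
stored = json.dumps(encode(doc), sort_keys=True)
restored = decode(json.loads(stored))
assert list(restored) == ["z", "a"]
assert list(restored["a"]) == ["y", "x"]
```

The sketch ignores the case where a user document legitimately contains a `$k` or `$b` field; a real implementation would need to handle or forbid that.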


FerretDB is very early in its development phase, so things can change. Right now it uses JSONB but enhances objects with additional metadata, such as field order.


I always find those who decide to use a newly created database in their projects very brave.


I typically agree but this is based on Postgres which is nearly always a solid bet.

Postgres jsonb works well enough for me - but this looks like a good alternative for Mongo people.


It's of course the translation that can contain bugs. Dumb error in a date conversion? Your accounting no longer works.


That and significant differences in performance for more complex operations like the aggregation framework


Curious about that. I recently made something similar, but for SQLite w/JSON. The backend can use a local sqlite3 file instead of mongodb, while using the same queries. It can only translate the queries we actually use, but it's quite doable to translate $lookup, $group and $project to their SQL equivalents (which were the inspiration for the aggregation in the first place, IIRC). Initial evaluation shows that the performance is quite good, and the translations look decent, so it might be possible to get some mileage out of ferretdb.
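
As an illustration of the kind of translation described above, here is a small Python sketch covering only the simpler case of find-style filters (not `$lookup`, `$group` or `$project`). It is not the commenter's code: the function name `filter_to_sql`, the `docs` table and the `body` column are assumptions made for this example. The only SQLite feature it relies on is the `json_extract` function.

```python
# Mongo comparison operators mapped to their SQL counterparts.
_OPERATORS = {
    "$eq": "=",
    "$ne": "!=",
    "$gt": ">",
    "$gte": ">=",
    "$lt": "<",
    "$lte": "<=",
}


def filter_to_sql(filter_doc, table="docs", column="body"):
    """Translate a flat Mongo-style filter into a parameterized SQLite query.

    Returns (sql, params). Table and column names are hypothetical and are
    interpolated directly, so they must come from trusted code, not users.
    """
    clauses, params = [], []
    for field, condition in filter_doc.items():
        # {"status": "A"} is shorthand for {"status": {"$eq": "A"}}.
        if not isinstance(condition, dict):
            condition = {"$eq": condition}
        for operator, operand in condition.items():
            if operator not in _OPERATORS:
                raise ValueError(f"unsupported operator: {operator}")
            clauses.append(f"json_extract({column}, ?) {_OPERATORS[operator]} ?")
            # The JSON path and the operand are both bound as parameters.
            params.extend(["$." + field, operand])
    where = " AND ".join(clauses) or "1"
    return f"SELECT {column} FROM {table} WHERE {where}", params


sql, params = filter_to_sql({"qty": {"$gt": 5}, "status": "A"})
```

The resulting pair can be passed to `sqlite3.Connection.execute(sql, params)`. Semantics will not match MongoDB exactly; for example, `$ne` in MongoDB also matches documents where the field is missing, while the SQL comparison against NULL does not.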


Yeah, seems like a mix between an ORM and a database; not sure how much they can let Postgres solve. But I agree, when I look into it, it's a bit less scary than a completely new database.


PostgreSQL isn't a "newly created database"


> FerretDB is an open-source proxy, converting the MongoDB wire protocol queries to SQL - using PostgreSQL as a database engine.

Based on this description I'd agree that FerretDB isn't a database itself. However, the conversion from the MongoDB wire protocol to SQL queries could have bugs, data resiliency could be an issue if you need to guarantee writes, there are no guarantees of ongoing support, etc.

New DBs are always welcome, but to use a brand new one in production would be very.... bold.


The humor is strong in you fellow Jedi


I'd be really curious to see if you can use the underlying Postgres to do effective relational queries. This could be an interesting solution for migrating a mongo database to a Postgres database. Or at least having more relational queries for certain things.


Interesting, but why not just use vanilla Postgresql? JSON support is awesome now.


JSON support in PostgreSQL is indeed extremely powerful; however, it is not as simple.

MongoDB provides "native" document persistence for programming languages, whereas with PostgreSQL JSON you have to live in relational database SQL land.


1. Postgres's json is limited to standard json. MongoDB supports additional scalar types, like blobs and dates.

2. MongoDB's drivers are designed to handle documents. If you use postgres directly, you have to build that part yourself (effectively a kind of ORM, including a query builder)



FerretDB is not Percona's project, though I (Peter Zaitsev) am involved in both companies, as CEO at Percona and Board Member/Advisor at FerretDB.

AWS is not funding FerretDB either.


I think it’s quite amusing being an ex mongo engineer that the wire protocol and query language is kind of becoming a de facto industry standard with several major products emulating it and now ferret db.


Yep. Well, isn't this how many standards come about?

I think MongoDB did a great job figuring out developer experience for Document Database


Coming soon: AWS Ferret, a fully managed AWS alternative to FerretDB;


Frankly, I hoped for years for things to be the other way around, with AWS making DocumentDB open source and keeping a hosted version on steroids powered by Aurora PostgreSQL.

Having said that, I think there is a lot of value in FerretDB for cloud vendors, who will be able to use this project to offer a MongoDB-compatible DBaaS experience.


It seems to me to be a misrepresentation to call this a MongoDB alternative.


Why? The goal of the project is that instead of MongoDB you use FerretDB + PostgreSQL (available as a packaged solution).


Elasticsearch is the general purpose JSON store that really scales.


Right up until it falls over in a huge mess because Elasticsearch was never designed or intended to be used as a general purpose JSON store. It's a search engine.

Or to quote Kyle from the Jepsen tests, "I would not use this as a system of record".


With no write guarantees the last time I checked.


I would advise against storing data in ES. But it is a nice store for read models among other things.


Why is that?


It doesn't seem to support the HTTP protocol. So instead of a 60+ MB MongoDB binary, you need Postgres (that's 3 times that size), plus FerretDB, plus a backend. 3 instances to run instead of one, that's a huge trade-off.


True, but you also get PostgreSQL instead of Mongo, which is easily worth another 120 MB.

I’m not sure what you mean by backend though? You need that regardless right?


I'm front-end, so no. not really. just a few lambda here and there when the authorizations are too complex. not building big apps though


You can’t go directly from the frontend to Mongo, unless you want to publicly expose your instance.


postgres + postgres_setup + FerretDB = something like MongoDB?

Could you add more details on "why someone would need to choose this over other DBs" from a product PoV or even the DX PoV?



