The MongoDB hack and the importance of secure defaults (snyk.io)
282 points by tkadlec on Jan 11, 2017 | 214 comments

When our sysadmin set up our Mongo cluster, he firewalled out all IPs except our production systems, turned on authentication, set things up to ensure we used SSL, and configured backups.

He didn't do this because he's an amazing sysadmin. He did it because he's competent, knows how to put a service on the Internet, and RTFM.

I mean. If you can connect to something and use it without having to authenticate yourself, wouldn't it naturally cross your mind to check that others can't do the same? It's just common sense.
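
To make that check concrete, here is a minimal sketch in Python, assuming you run it from a machine outside your trusted network. The function name `port_is_open` is made up for illustration; 27017 is MongoDB's default port, and the host you pass is whatever your server's public address is.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or unresolvable: treat as closed.
        return False


# Hypothetical usage, run from OUTSIDE the network that hosts the database:
#   port_is_open("your-db-host.example", 27017)
# True means anyone on that network path can reach the port.
```

A True result only says the port is reachable; whether the service behind it then asks for credentials is a separate check.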

These kinds of articles about "attacks" are annoying. Let's be absolutely clear: there is no "attack" or "hack" involved. No real company was affected by this. There is no operating system bug involved. No software vulnerability. No zero-day exploit. No social engineering angle. There's no real problem.

This is simply first-time developers with no server experience, deploying their first project, who don't understand the first thing about having a server on the internet. Publicly exposing an open port for software that is innately designed to run on a private network has nothing to do with... whatever you want to call all these "exposé" pieces about "attacks".

You could write an article to explain the introductory concepts of system administration to newcomers to the industry. But to even mention the word "attack" when the only thing involved is an open port is... sigh.

I feel for newcomers who need to learn. I really do; the amount of information one needs to absorb to be even remotely competent is vast, and takes at least a few years to pick up (and then another decade or two to fine tune that knowledge). But this is not a situation where the maintainers of software packages are to blame for not educating their users, or insinuating that their products are not "secure out of the box". It has nothing to do with individual software packages, and everything to do with the very core aspects of having a computer connected to a public network.

I'm afraid your suggestion that "No real company was affected by this" isn't borne out by the evidence I've seen or the experiences I've had in security.

Real companies with real data make some pretty elementary mistakes with regard to security. I'm a security tester, and the number of times I've gained access, using things like default creds, to systems deployed by real companies who've really paid money for an external security reviewer is quite high.

It's tempting to think that this is just an education issue and that once people know how to do security well things will get better but personally, my opinion after 16 years in security is that this isn't the case.

Effort spent on security is a trade-off with other things and in many cases people make the choice (either unconsciously or deliberately) not to prioritise it.

I first started using Mongo in 2013/4. When I deployed my app, the first thing I did was to change the default port and add authentication, as the manual recommended. I'm an accountant who's a hobbyist developer. I knew very little about security then, but I read the manual.

The insecure defaults were an issue sure, but anyone installing a piece of software in production without at least reading up on config options needs to find another job.

Meanwhile, the FTC sued D-Link over insecure defaults in their cameras.

We need good and consistent rules about this, and "well I was giving it away for free" isn't as clear a boundary as people will think it is.

The FTC sued over insecure defaults because it said "secure" on the box.

IP cameras from China have a number of issues with `calling home` ad hoc. Needless to say, even after looking at their kernels, I tend to keep them completely on their own subnet, away from the net.

So MongoDB should state 'insecure out of the box' on their home page? Just kidding, sorry.

I don't think it's far-fetched. I'd prefer a disclaimer: "For the convenience of easy testing the defaults are NOT secure. If you intend to use it in production, read the manual about secure configuration".

I'll comment without knowledge about the D-Link issue that you mention.

I would imagine that if someone were to sue MongoDB Inc. today about the issue in the article, their first defence would be clear documentation that explained/recommended production guidelines.

I don't know if D-Link had similar, but then again legal systems sometimes produce weird results that don't seem rational.

I agree that we need good and consistent rules. In addition to that, we need DevOps/SysAdmins/SREs who are responsible enough to know what they are doing, and who carefully read documentation instead of "quickly deploying" only to come back a year later writing soppy "don't use MongoDB or XYZ because we didn't read the manual" posts. :)

MongoDB is on 3.4 now, so I would rhetorically wonder why some people/companies are still on <=2.6. If the data that is being ransomed is that important, it'll be a good lesson to those DB maintainers to upgrade and secure their stack.

There is a huge difference between consumer facing and developer facing products when it comes to needing security out of the box. This isn't even close to a real issue.

> configured backups.

This part seems to be glossed over but is a HUGE issue.

It sounds like several companies have tried to pay the ransom with varying levels of success [0] ... why are they not just restoring from backup? I can only assume they don't have backups. (!)

What is their DR plan if the server dies? Or someone accidentally pushes code that messes up the contents of the DB? Or someone tries to drop the development database but oops: they didn't notice they were connected to the production server?

Even if you're using a hosted service, what if they go down? Get hacked? Lock you out because of a billing dispute/TOS violation/DMCA takedown/accident? Hire bad sysadmins who don't do backups (correctly)?

Not having backups of your data is inexcusable and just reeks of utter incompetence, and has nothing to do with configuration defaults or documentation.

[0] https://krebsonsecurity.com/2017/01/extortionists-wipe-thous...

For small data volumes, MongoDB Inc. will actually back things up for you for $2.50/GB/mo storage (free bandwidth), and it's a live backup with multiple historical versions (an agent on your database machine tails the oplog, so you're at most a couple seconds out of date). And their free monitoring tier is great alongside it. I'm truly surprised that they don't sell it harder; it took me years of using Mongo before I knew this service existed.


If you're also hosting your data with them though, it's not a complete backup solution. You're still subject to essentially a single point of failure: if they are hacked, or suspend your account, get a DMCA notice against you, or are shut down by the SEC[1], both your live database and backups are subject to loss.

You still need to do external backups. You may have a lot of trust in the provider and do these less frequently, but you should still do them.

[1] Had this happen to me once in the early 2000's: company I worked for had a dedicated server at a colo facility. After several days of them not responding to phone/email/etc, their answering machine was changed to a message saying the SEC had seized all assets and had all the owners 'under investigation' or something like that. We had external backups, but immediately took the latest stuff and got everything migrated to a new system in a new facility. Server stayed up for a few weeks after that, but then suddenly their whole IP space went offline. We never did get our server back.

I used to use Cloud Manager until they started charging a fee of I think $99 per server per month. I pay $150 for my hardware per month, so I stopped using their backup and monitoring tools.

Other than the cost, I recommend it for people who can afford it. Wonderful service that I was happy with for a long time.

Did the same, switched to Ansible for the Mongo deploy/management, and just use the free monitoring part of Cloud Manager.

Check out the ransoms being charged. It's .2 btc, or about $150 at current btc levels (which I acknowledge are unstable). That's pretty cheap. This is probably cheaper than having a single sysadmin spend a few hours restoring from backup just in raw labor time, not to mention everything else. Factor in the cost of downtime and the likelihood that you're going to lose some hours of user data since your last backup occurred, and the ransom is easily worthwhile to pay and get the server back.
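
As a rough worked version of that arithmetic, here is a small Python sketch. The 0.2 BTC and $150 figures come from the comment above; the $50/hour labor cost is purely a hypothetical number picked for illustration, and downtime, lost data, and the chance of not getting anything back are all ignored.

```python
RANSOM_BTC = 0.2    # from the comment above
RANSOM_USD = 150.0  # from the comment above ("about $150")
HYPOTHETICAL_HOURLY_COST_USD = 50.0  # assumption, not a figure from the thread


def implied_btc_price(ransom_usd: float, ransom_btc: float) -> float:
    """USD price of one bitcoin implied by the quoted ransom."""
    return ransom_usd / ransom_btc


def break_even_hours(ransom_usd: float, hourly_cost_usd: float) -> float:
    """Hours of labor whose cost equals the ransom."""
    return ransom_usd / hourly_cost_usd


price = implied_btc_price(RANSOM_USD, RANSOM_BTC)
hours = break_even_hours(RANSOM_USD, HYPOTHETICAL_HOURLY_COST_USD)
print(f"implied BTC price: about ${price:,.0f}")
print(f"restore must take under {hours:.1f} h to beat the ransom on labor alone")
```

Change the hourly figure to your own numbers; the comparison only covers raw labor time, which is the narrow claim being made here.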

I'm obviously not going to defend companies that don't have current backups (though this is practically everyone), and the importance of backups is always a great thing to emphasize, but in this case, the best option is to pay the ransom and get your stuff back.

You should read the article I linked though. After taking the data and replacing it with a README that says where to pay the ransom, the server is still left unsecured, so someone else comes along, takes down that README and puts their own up with their own bitcoin address. Presumably if you pay them, you'll get back the next ransom note that says who to pay next, and so on, and as long as no one screwed up and everyone was honest about the ransom, after you've paid several people you'd eventually get your data back.

Well, you have a sysadmin, that's the point. Developers aren't sysadmins. Should they have a deep knowledge of network security? You bet they should.

Now I'm sorry, but MongoDB Inc. IS at fault as well, for not forcing developers to create credentials upon installation from the beginning. Any vendor that doesn't do that with its product isn't serious about security. Take WordPress: imagine it didn't force its users to create an admin account upon deployment. Everybody would be mad about it. But somehow MongoDB got a pass all these years? Bullshit.

I hope this hack will permanently damage the MongoDB brand.

That's the same logic that says people who get fired from their jobs should be homeless and starve, or that those who get sick without insurance should be denied care. We don't do that in our society.

There's a moral distinction between culpability and impact. There are profoundly stupid things that people do, yet still need protection from.

Those DB admins were incompetent by lots of measures, but their data still has value and its seizure is a public harm. It's the job of the rest of us (in this case, MongoDB's developers) to take reasonable steps to minimize the chance of that happening.

Secure defaults are a very reasonable precaution. MongoDB fucked up.

I'm no Mongo fan, but that's a false equivalence. You don't need to use Mongo in order to survive.

And having your data stolen is preferable to starving. The point is that, morally: X is bad, and X being all Y's fault doesn't imply that Y should be unprotected from the consequences of X.

We help each other out in this society. So in this case if you're a database developer with a good handle on deployment security, you don't put an insecure-by-default product in the hands of people who aren't. I genuinely can't understand why people are arguing to the contrary.

Cars can be dangerous and everybody should read the manual before using one, but it doesn't mean they are sold in an unsafe state where the user has to configure something first, otherwise it'll kill everybody.

Even knives are sold in some kind of packaging that prevents them from cutting before the packaging is removed.

I agree, I don't think your job is done just because you wrote somewhere "pay attention to this".

Thoughts like this are, and will forever be, the one dividing thing between a more ops-y person versus a more dev-y person.

But that's fine. At our place, our mysql cookbook is maturing. This mysql cookbook makes it really easy to say "hey, we need a mysql master with these 2 databases". The only security overhead consists of generating 4 passwords in that case (and we're planning to automate that).

Once you've done that, you get a mysql master with passwords, restricted root access, a local firewall, backups and all kinds of good practices. It's secure because our cookbooks are good, and people use our cookbooks because they are easier than handling mysql directly.

And that's a kind of cooperation I'm really happy about. Devs want services, and we provide easy ways to set up services in decently secure ways.

ops-y + dev-y, isn't that supposed to evolve into DevOps? I know the term is a bit overloaded.

Do the community chef, ansible and puppet cookbooks use secure defaults for MongoDB? Asking because I have never used MongoDB.

> ops-y + dev-y, isn't that supposed to evolve into DevOps? I know the term is a bit overloaded.

Yes, but I've grown to dislike that term, though it's my current job title. There's a number of people who're titled with DevOps and they're yelling about Docker this and CD that, and AWS/RDS those.

That crowd is way too excited about some tools and some solutions in specific use cases, and they sometimes tend to drown out the real value of config management and close cooperation of devs, ops and other involved people with their own specific focus - for a very large number of use cases.

> Do the community chef, ansible and puppet cookbooks use secure defaults for MongoDB? Asking because I have never used MongoDB.

I'm a chef guy. First google result for "mongodb cookbook" doesn't set authentication, but makes it easy to enable required authentication. Second cookbook result doesn't enable authorization by default.

This makes sense though. If a community cookbook manages mongodb, that cookbook is supposed to support all use cases of mongodb and it usually tries to mimic the default use case of the application in question. To maintain that, the chef cookbooks for mongodb don't enforce authorization.

However, if I was supposed to implement a mongodb cookbook to use in my place, I'd intentionally fail the chef run if authentication is disabled and stop mongodb in that case. This would be trivial in both cookbooks I looked at.
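
To sketch what "fail the run if authentication is disabled" could look like outside of chef, here is a minimal Python version. It assumes the mongod config has already been parsed into a dict and that the relevant setting is `security.authorization` with the value `enabled`, as in MongoDB's YAML config format; the function and exception names are hypothetical, and this is not what either cookbook actually does.

```python
class InsecureConfigError(Exception):
    """Raised when a deployment would start mongod without access control."""


def require_auth(mongod_config: dict) -> None:
    """Abort the deployment unless access control is turned on.

    `mongod_config` is assumed to be the already-parsed config file,
    e.g. {"security": {"authorization": "enabled"}}.
    """
    # A missing or empty "security" section counts as "not enabled".
    security = mongod_config.get("security") or {}
    if security.get("authorization") != "enabled":
        raise InsecureConfigError(
            "security.authorization is not 'enabled'; refusing to deploy"
        )


# Hypothetical usage inside a deploy script:
#   require_auth(parsed_config)   # raises before mongod is ever started
```

The design choice is to fail closed: anything other than an explicit "enabled" stops the run, so a typo or a missing section can't silently produce an open database.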

>Yes, but I've grown to dislike that term, though it's my current job title. There's a number of people who're titled with DevOps and they're yelling about Docker this and CD that, and AWS/RDS those.

This is not exclusive to DevOps, though DevOps is a hot buzzword right now so lots of posers are flooding in. Most people, no matter their station or the prestige of their company, have no idea what they're doing. Knowing this is one of the most important things for dealing with the corporate world (I think I'm missing a couple of the other most important things, though...).

Different people try to cover this up in different ways. One popular way is acting like they always know about new cutting-edge tools sponsored by AppAmaGooBookSoft and that the next thing will finally be the Silver Bullet we've all been waiting for.

This impulse has brought us the prominence of node.js, MongoDB, the proliferation of "big data", and many other vastly-overdeployed niche products that have been ravenously and shamelessly misused by incompetent people trying to fake it through their careers. Our standards for this are, frankly, sad. It must have to do with a combination of a cultural bias against non-junior engineers outside of Java/.NET-land and putting completely unqualified MBAs in charge of tech (and yes, this is also applicable in startup world via VC proxies) -- but I digress.

Been hearing that song and dance re: docker and k8s for the last year at least, and boy is it ever tiring. Docker and k8s are both very niche, very immature products that greatly complicate system administration. They are missing features that you rely on and that you will have to implement horrifying hacks to work around. Why are you doing this to yourself, again? Oh, because AWS is too expensive and you want to consolidate (fake reason btw, real reason "because it makes people think I'm smart")? Yeah, about that...

It's fun to throw together a lab for experimentation, maybe hook up a weekend project, but no sane Real Company is going to be moving its stuff to all-containerized k8s any time in the next 2 years.

My current project at work? Converting all of our environments to docker/k8s...

"ops-y person versus a more dev-y"

There are devs who are unaware of the concept of authenticating access to an application?

> When our sysadmin

Yeah, well, we're in the middle of a trend to act as though dedicated systems people are bad, pains in the arse, you shouldn't need or want them.

Wait, then who is supposed to administer our systems? Do people think developers are up to the task!?

What administration? If something happens to the system, just toss it out and spin up a new one.

Come on, man, this is 2017. Get with the program.


Oh god. That's a crazy thought process. People use VPSs more than physical servers these days. But someone still has to spin them up and configure them and deploy stuff to them. Seems insane for that to be a job for devs.

It depends on the developers, mate. Properly experienced ones are often up to the task, while the latest fad-chasing, buzzword-inducing cool kids probably aren't ... it's no coincidence they are the demographic that put Mongo on the map :)

I'm a developer. I can do some sys admin stuff at a pinch. But it's not my area of expertise at all. I mean you might as well say since experienced sysadmins do a lot of scripting they might as well develop as well.

this is why we have botnets running on security cameras.

you can't just demand the user be smarter, you put sane defaults on there so they actually have to go to at least a small amount of effort to shoot themselves in the foot.

In the new, DevOps-centered world, doing all that is unnecessary and only serves to slow down the launching of $product.

Besides, you should be using containers for everything anyways. If something happens to it, just throw it out and spin up a new one.


The article's argument is that users have no common sense and don't think about what they're doing, so we should reinforce this same behavior.

> you may decide that not binding a database to local connections would be a reasonable default...But assumptions like that are dangerous because when not if someone does something that breaks that assumption, they’re now vulnerable to attacks—and they may not even know it. Understanding of the potential security risks presented by these defaults is not exactly mainstream knowledge.

I wouldn't call this reinforcing behavior...I'd call it protecting your users. Common sense is something that comes with experience, and MongoDB is popular enough that it's likely to be in many web devs' first projects in production. Is it their fault they haven't developed this "common sense"? Yes and no. On the other hand, is vulnerable-by-default ever a good idea?

Never trust the users.

sysadmin - that title is becoming rare. Would you expect the same competency from a devops, or a site reliability engineer? ( Not trolling, real question; as a sysadmin myself I do see that priorities differ in many cases when a dev or a manager wants to see a service running and you have until yesterday. )

It is becoming rare because management, who ultimately hire and fire people, want things done faster. They think that making devs responsible for ops stuff, combined with automation and new technology, will magically make it all good.

I am a former Sr. Systems Administrator, and Sr. Systems Engineer. My current title is Sr. Devops Engineer.

Sr. Site Reliability Engineer here. (Itself a complex title, varying by employer.) I have taken to identifying myself as an Operations Engineer for this very reason. I'm an engineer focused on operations. Tada. You want some operating and I do some engineering to get there, and all of that is clearly printed on my tin. Some might disagree with my reduction, but I feel that it fits well what I do.

Pretty much everybody who knows their way around a compiler and administers systems could probably take this title and be fine, and then you're not suffering under the yoke of devops or whatever. Because when I hear "developer," I can't help but think of a feature engineer on the product side for some reason. Same with that word creeping into "devops," and that leads to things like "full stack" engineering -- creating the impression that only a select few work with or consider "the whole stack," which is a scary subliminal message. Even if you're a developer simply changing the colors on the Facebook feed or something, you should always be thinking about the full scope of things and how your changes will be operated. (Your operations nerds will love you if you think about us ahead of time.)

As an aside, systems administrator with no software skills is a totally and completely fine career, with a number of very smart people (and friends, still proudly announcing every year of uptime in texts), and in a lot of these discussions I see them denigrated. Just remember that, not saying anyone is doing it.

I'm not sure the way titles are being thrown around in this thread is entirely helpful. opsy vs devsy, sysadmin vs devops? What exactly is a "devops"?

I would expect anyone responsible for something to have competency in it or find someone who is to help. Titles are irrelevant.

Next time you cross a bridge, make sure it will handle your weight ... And think about everything that can possibly go wrong, what materials are used, and how old it is, etc.

Different. This isn't some deep, underlying fault that requires a lot of specialized knowledge to recognize, like a bridge that has the wrong type of struts or whatever. If the program doesn't make you put in a password ... it probably doesn't require a password.

Refusing to deploy a process into production without setting a password is more like refusing to drive over a bridge that's obviously missing chunks and crumbling -- that is, it should be obvious, even to the untrained eye, that there is a massive structural issue that makes it unsafe.

I will grant that some junior people may assume that local connections are coming in over a trusted socket and that TCP connections are required to hit an authentication process, but even this is kind of a stretch, and no one should be leaving production deployments for real companies in the hands of someone that inexperienced.

I will also grant that guilt-by-association is probably not appropriate here. There are a lot of good people at companies that have grossly incompetent and/or disorganized managerial superstructures that may have caused something like this to end up exposed to the internet. I have personally worked at multiple places where the daily comedy of errors we called "work" could've resulted in an unpassworded Mongo ending up on the public network.

Any way you slice it, driving over what is a clearly destroyed, ruined bridge is someone's fault, somewhere (and per regular management practice, the person least responsible for the fault will probably take the fall :) ). Let's just not pretend that this is an ordinary complication or oversight in the course of sysadmin.

The problem is that once you work with something every day for years it becomes common sense. You should always assume your users have zero experience and don't RTFM. Messing with configurations does require a lot of specialized knowledge. Not everyone has the knowledge of a Sysadmin.

"Messing with configurations" enough to add a password requires no more specialized knowledge than installing a database server. Personal responsibility has to come into play at some point here. We're talking about real companies who said "We're going to put all of our data into this system and not set a password", not a fast food worker trying to pick up Rails development in his/her free time.

Furthermore, running "apt-get install" usually isn't enough to get something exposed onto the public network, even for home users, whose routers block all incoming ports by default. Someone has to go in and explicitly open the traffic before something like Mongo gets exposed; there's no reason that labs, developer machines, etc., would be publicly accessible.

Mongo is hardly alone here; comparable services like MySQL, PostgreSQL, and Redis also install without a password by default. For MySQL, this is perhaps less noticeable because Debian/Ubuntu (and probably some other distros) include a prompt to allow the user to set a new root password as a post-install script. For PostgreSQL, the default mode uses IDENT authentication, i.e., it's accessed by dropping to the "postgres" user ... and, like all other system users, including root, this user is passwordless by default.

For Redis, it's passwordless by default and I'm sure there are many installations that have this same "vulnerability" in redis-land (there are also many installations that have real vulnerabilities, because Ubuntu's redis-server package in 14.04 hasn't been patched), because people rarely think about redis auth either. Just since last year, there is a feature called "protected mode" that requires the user to explicitly disable a flag if they intend to run a server that is a) bound to a non-loopback interface and b) not passworded (but it doesn't actually require someone to set a password). That's a cool feature, but as users don't RTFM, I'm sure they either flip protected mode to false without understanding what it does or just give up on redis and install MongoDB instead. :)
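
Here is a rough Python model of the protected-mode behavior as described in this comment: when the flag is on and no password is set, only loopback clients get served. This is a simplified sketch for illustration, not Redis's actual implementation, and real Redis applies further conditions that are ignored here. The function name and parameters are made up.

```python
import ipaddress


def accept_connection(client_ip: str, protected_mode: bool, has_password: bool) -> bool:
    """Decide whether a client may talk to the server (simplified model)."""
    # With the flag off, or with a password configured, protected mode
    # places no restriction on where clients connect from.
    if not protected_mode or has_password:
        return True
    # Otherwise only clients on the loopback interface are served.
    return ipaddress.ip_address(client_ip).is_loopback


# Hypothetical usage:
#   accept_connection("203.0.113.7", protected_mode=True, has_password=False)
#   accept_connection("127.0.0.1", protected_mode=True, has_password=False)
```

Note that the model reproduces the caveat from the comment: flipping the flag to false lets everyone in without ever setting a password.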

The point here is that pretty much everything you can install on a server is passwordless by default. I don't think the system of someone who never thinks about passwords would last very long at all. This is NOT a MongoDB-specific anomaly. Passwordless is the typical case for most services.

A feature like Redis's protected mode is a nice bonus, but it's not fair to lay responsibility for this at the feet of Mongo or to call it a "vulnerability", as the OP does. The responsibility lies with people who say "I'm going to put all of my company's data in here and push it live without thinking even a tiny bit about security."

Do a lot of people do that? Yes. Should MongoDB take more steps to protect such people from themselves? Maybe. But this is a very basic, very routine thing, which is easy to look up and rectify -- not some obscure anomaly buried under 600 pages of documentation or requiring a special compile-time flag. Mongo doesn't deserve the heat for people who never set up a password.

MongoDB has learned the lesson and will no longer listen on a public IP by default ... For a while it was common for ISPs to install a router on the customer side, basically creating a firewall because of NAT. But now it's usually up to the customer whether to install a router or not. And most people do not run a dedicated firewall. And with IPv6 rolling out there's really no need to have NAT as there are plenty of IP addresses. With IPv6 it's also possible for one interface to listen to many IPs. You cannot assume something will only run on a private network! As a security practice you should assume all networked machines are connected (and accessible) to/from the Internet because 99% of the time they are!

Having something listening on a public IP without any password protection is irresponsible whether it's a database or security camera. Last time I installed MySQL I had to both set a root password and specify the socket to listen on, and it took five seconds and no documentation reading was needed, so it is possible to both be user friendly and have good security. With MySQL you also have to explicitly set which IP a user can connect from.

Secure should always be the default! Making it not secure should require messing with config files, not the other way around. Most people will just stick to the defaults. There are plenty of examples where you do not need a password, for example opening your fridge or turning on your TV, both being connected to the Internet. Being able to connect to it from your friend's place should ring a bell, but I don't think most people try that. Just like people don't regularly port-scan their networks.
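
As a small illustration of "assume it's reachable", here is a Python sketch that classifies a service's bind address using only the standard `ipaddress` module. The function name and the four labels are hypothetical, chosen for this example.

```python
import ipaddress


def bind_exposure(bind_ip: str) -> str:
    """Roughly classify how exposed a listening address is."""
    addr = ipaddress.ip_address(bind_ip)
    # 0.0.0.0 or :: means "listen on every interface the machine has".
    if addr.is_unspecified:
        return "all-interfaces"
    # Loopback is only reachable from the machine itself.
    if addr.is_loopback:
        return "loopback"
    # Private ranges are not routed on the public Internet, but that says
    # nothing about who else shares the network or how it is forwarded.
    if addr.is_private:
        return "private"
    return "public"


# Hypothetical usage:
#   bind_exposure("0.0.0.0")    # listening everywhere
#   bind_exposure("127.0.0.1")  # local only
```

Anything other than "loopback" deserves a password and a firewall rule; "private" is not the same as "safe".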

>MongoDB has learned the lesson and will no longer listen on a public IP by default ... For a while it was common for ISPs to install a router on the customer side, basically creating a firewall because of NAT. But now it's usually up to the customer whether to install a router or not. And most people do not run a dedicated firewall. And with IPv6 rolling out there's really no need to have NAT as there are plenty of IP addresses. With IPv6 it's also possible for one interface to listen to many IPs. You cannot assume something will only run on a private network! As a security practice you should assume all networked machines are connected (and accessible) to/from the Internet because 99% of the time they are!

This is not the case in the United States at all. In fact, it's the opposite. ISPs used to only offer modems that were intended to plug directly into one's computer, which would bridge to the computer's ethernet interface and expose it directly to the internet. Third-party routers would be placed in the middle to allow multiple computers to connect and they almost always include a firewall that has everything blocked by default.

Recently, ISPs have been bundling their modems into routers and providing customers with a single device that facilitates multiple computer access, wifi, etc. Every time I've encountered such a thing, it has had a firewall that blocked everything (except maybe a port for remote tech support) by default. This is by far the most common.

IPv6, as always, is a pipe dream, but when it becomes real, yes, we will not have the assumption of NAT to fall back on -- though it's pretty safe to assume that consumer-level hardware will continue to block all ports by default.

>Last time I installed MySQL I had to both set a root password and specify the socket to listen on, and it took five seconds and no documentation reading was needed, so it is possible to both be user friendly and have good security. With MySQL you also have to explicitly set which IP a user can connect from.

This is not a feature of MySQL, but a feature of the distribution you installed it on. It is true that MySQL's configs don't listen on all interfaces by default. IP-based security generally means that every user is wildcarded, and MySQL auto-generates a wildcard root account without a password iirc.

>There are plenty of examples where you do not need a password, for example opening your fridge or turning on your TV, both being connected to the Internet.

MongoDB is not the equivalent of a fridge or TV. MongoDB is the equivalent of a power saw. If you can't be arsed to take basic precautions while using it, you really shouldn't be anywhere near it.

And while many TVs and fridges may have an annoying built-in wifi feature now, most people do _not_ think of these in the same context as server software. :)

I have never used MongoDB so I admit I'm talking blind here, but can someone explain how/why a piece of highly popular software gets to version 2.6 allowing unsecured remote connections by default? Further to that, is that the type of thinking you want in the development process of something as critical as a database engine? It just seems amazing to me that it got so far before the community in general pushed back that this was really bad design. Hopefully someone with MongoDB history can explain if this was a long-term sticking point, etc. Thanks.

If this is not addressed at a _very_ early state of a software, it will stick, because people get used to it and you drive people away by backwards-incompatible changes.

Especially if it's software targeting people who want things easy & fast, like Mongo did at the beginning.

See reasons[1] for WordPress needing to be PHP 5.2 compatible, which had been out of support for 6 years.

[1]: https://www.rarst.net/wordpress/technical-responsibility/

At a guess, it became popular through easy setup. That includes not having to configure login.

That's why so many other systems are insecure. Security nearly always increases friction.


I once had a conversation with someone who started an open source project about the "10K meter" view. I strongly encouraged secure defaults. He asserted that when getting started, grabbing mindshare requires insecure-out-of-the-box. They could fix that later.

Later never came; the project died (Not due to security). But I'm pretty sure something like this would have happened if it had taken off, because later doesn't come until something like this happens.

I don't know how wide-spread this thinking is, but it slots in well with a lot of the decisions I see made by other startup types. There seems to be a lot of conventional wisdom about the slimy-ish shortcuts and deceptions, bad design decisions, and so on that are "OK" to make to get off the ground. I'm not going to comment on the necessity of that sort of bad behavior from a business context; I think the answer is complex, and kind of appalling. But on the engineering side, it seems to usually lead to nasty things coming back to bite you, and it is notable that it does so after you've gotten attention and people depend on you.

I still think the XP guys had this figured out. You scale with your customers and stop trying to write it perfect the first time. I'm a huge proponent of never making the same mistake twice but there is just not enough time in the day for perfect.

Instead you write it in a way that it has the potential to be great.

Customers that have been using your software for a year will have different problems when they have been using it for four.

On one project we essentially had a goal of doubling the data capacity of our (on premises) software every year to stay ahead of our customers' data collection rate. So my goal was 10% per month. We kept on that treadmill for over two years. Yes, a couple of those months involved some pretty deep changes to get to a better info architecture. But most of it was constant application of the campsite rule.

That said, in that same project we added security and permissions after the fact, and it was one of the most frustrating things I've ever done. A constant and unending stream of bad news from QA about some other spot we missed.

Sounds a lot like translating a web app after it was released. In a company I was working at, we made a commitment to call out strings without gettext as a bug, regardless of whether it was really going to be used for a single country/language. It really takes no effort to do it the first time, but adding that stuff later on is major bs.

The reason it's hard to bolt-on after the fact is that languages conflate the concept of "string" and "user-facing text". If "user-facing text" had a different type or method or whatever, it would be trivial to find all references in your codebase and consolidate them into gettext stuff. But instead, every inline dynamic sql, every dictionary key, every http parsing looks like user-facing text.
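To make the "different type" idea concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `UserText` wrapper, the `render` function and the tiny in-memory `CATALOG` stand in for whatever a real project would do with gettext. The point is only that user-facing text gets its own type, so it can be found and translated at one boundary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserText:
    """Marks a string as user-facing, distinct from keys, SQL, headers, etc."""
    msgid: str


# Hypothetical catalog; a real project would load translations via gettext.
CATALOG = {"de": {"File not found": "Datei nicht gefunden"}}


def render(text: UserText, lang: str) -> str:
    """Translate at the display boundary; fall back to the original msgid."""
    if not isinstance(text, UserText):
        raise TypeError("only UserText may be shown to users")
    return CATALOG.get(lang, {}).get(text.msgid, text.msgid)


NOT_FOUND = UserText("File not found")  # user-facing, translatable
CACHE_KEY = "file_not_found"            # plain str, never translated
```

With this split, searching the codebase for `UserText(` lists every translatable string, and passing a bare `str` to `render` fails loudly instead of silently shipping untranslated text.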

I've found that you can grease the wheels a lot by explaining to management that using the localization features of the language means that stories don't get held up by arguments about terminology. That's been sufficient to get them to agree to localization on three products, two of which almost immediately got surprise customers (or new owners) for our "always english only" application that suddenly wanted it in 3 languages in 3 months. :/

But for now, if I use this technique in the code, and you business guys want to start calling files 'assets', or clients 'customers' (or vice versa), then you guys can go fight it out in a room and just tell me what you want afterward.

Oh, we did that too (and somehow I ended up being in charge of both. Whoopie!) But I decided to cut myself off at two sob stories :)

Security, concurrency, and localization are three things you can't just tack on later. Anyone 'test infected' would add automated testing to that list too. They all involve a long game of whack-a-mole and they're IME bad for morale and productivity.

It's really, really easy to pass things off for later and then lose track of them. What's worse is that, once some task is "lost," it's even easier to assume that it's already been completed. And when that happens, we can completely block it off from thought as we concentrate on other tasks unless some external stimulus causes us to think about it again.

It's not a perfect comparison, but there are some similarities to the tragic cases of children being left in hot cars in the summer. Dozens each year, largely by parents who aren't abusive, neglectful, or what most would consider to be "bad parents."[0] Instead, it's what's called a prospective memory failure: habitual memory processes override prospective memory, which by its very nature involves additional cognitive mechanisms. There are a couple of PM theories, but the multi-process model suggests that PM cues can trigger a prospective intention even without an active monitoring process. Miss or interrupt the cue, and the intention isn't retrieved and followed through on. The same kind of failure shows up in airplane crashes when pre-flight tasks are interrupted and steps are missed,[2] fatal anesthesia incidents,[3] fires because gas appliances were left on, etc. A number of industries have been putting more effort into hedging against PM failures over the past few years through better checklists and other aids and precautions precisely because the consequences can be so tragic.

Getting back to the topic at hand, the issue with dumb shortcuts like insecure defaults isn't that they're bad on their own but that the way our brains work makes it less likely that we'll go back and fix all aspects of those shortcuts later before a problem arises. Documenting your shortcuts, setting reminders, and other actions can help minimize that likelihood, but how likely is it that you'll do so when you're in the middle of taking an ill-advised shortcut in the first place? At that point, you might as well spend the time to do it right.

It's the same danger that comes with hotfixes. Even when you have no other choice, you've put yourself in a position that could have unseen danger later on. Having a policy in place (checklists?) for dealing with something like that can help you avoid PM failures by forcing you to go back and retroactively apply your normal procedures to the patch in question. Style guides and other guidelines help as well in everyday circumstances, even though they can be a bit annoying at times. Because no matter how you look at it, mistakes will be made.

Insecure defaults and stupid shortcuts make it more likely by stacking the deck against the people working on a project.

0. http://www.cnn.com/2016/07/25/health/hot-car-deaths-explaine...

1. http://www.psych.wustl.edu/learning/McDaniel_Lab/Prospective...

2. http://www.psychologicalscience.org/news/releases/when-we-fo...

3. http://anesthesiology.duke.edu/?page_id=826043

IIRC on Debian it is by default not listening externally. So ideally on Ubuntu as well. Though that is Debian / Ubuntu specific, elsewhere they likely ship with default configs. Still no default authentication, but still better than just letting anyone externally connect to it.

But if a substantial use case requires it to listen remotely, you're just delaying disaster. Listening remotely should have required an authentication setup, or an additional configuration directive like "imbeingatotalidiotandamopeningmydatatuptoanymaluciousactors": true
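A rough Python sketch of that rule, to show how little code it takes. The function name `check_bind` and the `allow_insecure_remote` flag are made up for illustration; this is not how MongoDB or any real server implements it.

```python
import ipaddress


def check_bind(bind_ip: str, auth_enabled: bool,
               allow_insecure_remote: bool = False) -> None:
    """Refuse to listen on a non-loopback address unless auth is on
    or the operator has explicitly opted in to the insecure setup."""
    is_local = ipaddress.ip_address(bind_ip).is_loopback
    if is_local or auth_enabled or allow_insecure_remote:
        return
    raise ValueError(
        f"refusing to listen on {bind_ip} without authentication; "
        "enable auth or set allow_insecure_remote explicitly"
    )
```

Called at startup, `check_bind("127.0.0.1", auth_enabled=False)` passes, while `check_bind("0.0.0.0", auth_enabled=False)` raises before the socket is ever opened.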

Yeah though configuring login is not something that takes more time than to set it up.

In other disciplines they call this pandering. I'll never know why some of us see it as virtue.

It's easy to see why having a lot of users is desirable.

Pandering tends to be profitable in other disciplines too.

I know a couple where it's -really- profitable.

But most of us chose not to do that for a living.

MongoDB was easy to set up and new. It got popular among developers who always jump to the latest trend and don't care about quality, maintenance or long term support.

Internally everything is screwed, from the architecture to the security settings, to the reliability of data storage (or lack thereof).

MongoDB is the school case of 1) Worse/bad software can succeed despite being bad. 2) Don't trust hype. 3) Developers in general have low standards.

This is also the case for Elasticsearch. In fact, to get a proper username/password/ACL setup, you need to pay for professional support.

IIRC, there was also a time when Elasticsearch used multicast by default to detect other ES nodes on the network and to add them to the cluster automatically. So any node joining the same network would join a cluster and have shards allocated to it.

Early J2EE frameworks did that too. Super fun when your dev box clusters with a coworker and undoes your regression testing because half of the tests run on an older version of the code!

It seemingly still does; I'm just getting started with Elasticsearch & your description aligns with what I've read in the introduction guide

Just FYI, multicast discovery was moved to a plugin in v2.0 [1], deprecated in v2.2 [2] and removed entirely in v5.0 [3]

- [1] https://www.elastic.co/guide/en/elasticsearch/reference/2.0/...

- [2] https://www.elastic.co/guide/en/elasticsearch/plugins/2.2/di...

- [3] https://www.elastic.co/guide/en/elasticsearch/plugins/5.1/di...

Because trivialities like requiring your users to set a password on their database are barriers to adoption.

Move fast, and break things - you're not responsible for the damage your product does. Our industry both optimizes for, and rewards this kind of behaviour.

We usually can't see the same problems happening in front of our noses. Why shouldn't we discuss why docker images run as root by default then?

I think we absolutely should discuss this - the number of images I see on Docker Hub that run complex network services as the root user really concerns me. The security measures implemented by Docker reduce the risk but absolutely do not eliminate it, you're taking a big risk there.

This issue, along with its related issues with permissions on persistent storage, is one of the major reasons I'm hoping to move away from Docker as soon as possible.

What Docker security measures do you mean? Actually, I was wondering recently what isolation Docker provides that a basic chroot-jail doesn't.

Re root permissions, I guess if you're running a typical network daemon (Apache, say), then requiring Docker to run as root is no worse than running the service on the host; root would be a requisite anyway for the daemon to open privileged ports/gain logging permissions etc. before dropping to "nobody" or what have you.

Docker drops capabilities of root by default. It doesn't drop everything but better than not dropping at all.

If you are not using docker, you can manually add the specific privilege to open low-numbered ports without running the program in question as root.

at v2.6 the product was only about 4 years old and underwent a lot of change during that time. (For comparison Postgres version 3 was released in 1991)

I think the main guilty parties at mongodb were/are in marketing.

technically sophisticated users understood the immaturity of the product and the tradeoffs that came with its architecture.

however it was sometimes marketed as a general purpose data store, or as an alternative to much more mature relational data stores, which was and still is an unfair comparison.

Even technically sophisticated users should be able to expect secure defaults.

Semantic counterargument: If you put a database into production without understanding its network and auth configuration then you are by definition not "technically sophisticated"

This. So much this.

By this standard, every programmer who puts an app into production without understanding assembly is not technically sophisticated.

We could expect every programmer to have domain knowledge of databases, even when it's not normally needed to operate one.

Or we could just make databases not acknowledge writes that haven't happened, not listen on all ports, not hide information about data loss in non-clustered setups deep inside the documentation.

Counterargument: many people who are good at a particular topic enjoy feeling superior to people who don't have their knowledge.

at the time these versions of mongodb were released, it was only a few years old, and DID require considerable domain knowledge to run in production. This does not seem unreasonable for new immature technologies.

unfortunately, I don't think this was always communicated effectively by the explosion of mongodb marketing and hype.

> DID require considerable domain knowledge to run in production.

What do you mean? If you mean it took knowledge to run per se: it didn't. Unzip and run the service, it's up. It wasn't any more complex than any other piece of software.

Or do you mean it did require considerable knowledge to avoid data loss on software marked as 'stable'? Yes, that's the problem, and yes, that's unreasonable.

Sophisticated users understand when to adopt a product. MongoDB had all the red flags; a sophisticated user would have done a trial and banned it with a company-wide memo after the first week, by which point they'd already be on the 3rd disaster.

Sophisticated users without setting up authentication and backup for their valuable data?

Yes, you could expect. But you would check.

In production deployments, unsecured remote connections might be fine, because security would be handled at the network layer (e.g. by deploying MongoDB on a VLAN which only was accessible by your MongoDB clients which were on separate servers).

Before VMs and the cloud you'd buy a server with three or more network cards in it and you ended up with physically separate networks just because you could and it staved off network saturation. That it simplified your security architecture was the cherry on top.

But Mongo pretty much rejected that world view (web scale!) so it's a little odd that they designed a system that only works in it.

> In production deployments, unsecured remote connections might be fine

No. Absolutely not

Unsecured local connections yes, anything from outside should be explicitly allowed and secured (like port 80/443). Rate-limited, fail2banned, fuzzed during testing, etc

"I opened the ports on the firewall and gave the user access, why doesn't it work!? This system is stupid."

"I changed my password on one system and then on the other, but somehow my account got locked up in that brief time between changeover, how the hell do you unlock an account? This system is stupid."

"We're scaling up and we're running into outdated rate-limiting settings on every service we use, some of them even having multiple layers of rate-limiting within a single service, and each setting requires hours of debugging to figure out which setting is taking down the whole project. This system is stupid."

There's a reason Mongo got adoption while "secure" systems did not.

To be honest, the issue is that a lot of secure systems make things more convoluted, unintuitive and/or frustrating than they need to be, then call the user stupid when they can't set it up.

I agree. You always want to have at least two mistakes between you and getting harmed (more if you are handling anything valuable).

If you only have a VLAN separating you from harm, it only takes 1 bad switch configuration or any other device on the VLAN to accidentally be running a proxy or router.

A firewall, and a VLAN configuration. If you are allowing white listed access through your frontend firewall, then you need to screw up both.

This is a really bad practice to follow. You can't rely on the network to be secure.

That's true in many cases, but if the network is solely for database access and the only servers with access to it would be configured with DB credentials anyway, then the path to compromise is equivalent (get into a server with db credentials/network access).

But why bother though? If somehow that network defense comes down because of maintenance or whatnot, you have just tainted your most precious asset out of wanting to save a couple of minutes for securing the connection.

This problem was caused by broken 'thinking' like that.

This is not unique to MongoDB, however; Elasticsearch is similar. The line they seem to toe is "this is made to be run behind a firewall." I agree it's unacceptable, but it's unfortunately not unique to MongoDB.

That's not a fair comparison. ElasticSearch doesn't have authentication in the free edition. You gotta pay to get the authentication plugin.

ElasticSearch => Dropped authentication (i.e. an important aspect for security) to get a chance at monetization.

MongoDB => Dropped everything, including your data. YOLO!

Elasticsearch never had authentication. So they never dropped it from the free version. I feel like it is an appropriate comparison. The general response to Elasticsearch's lack of security was "you're supposed to run it behind a firewall".

Elasticsearch got all the way to version 1.4 with no authentication as part of the design. Between 1.4.1 and 1.5 release candidates is when they announced Shield as part of their enterprise offering.

Here are the release dates and announcements:



Because it absolutely wasn't, and still isn't, necessary. Full stop. ElasticSearch speaks HTTP. Why would I want it to reimplement HTTP authentication when I can place it behind a reverse proxy that's already integrated with my company's AAA and has been doing nothing but HTTP for oh, three decades or so? Separation of concerns.

This is how this goes, and I've seen the pattern followed with several infrastructure services, even here in this HN thread right now:

1) A bug report comes in to AwesomeDB. "You do not have authentication support. This is bad. I'm going to yell about it on social media and make fun of you until you can accept a username and password, because everything in existence should write its own authentication code. Even MySQL takes passwords, dude, get with 2017."

2) A rough version is implemented by an intern, because authorization is way less sexy than hyperdimensional bloom filter quantum hash tables to speed up random reads. The intern then departs back to beer pong and nobody at AwesomeDB now understands the security infrastructure of their product.

3A) A mid-level executive, seeing the announcement, instructs me that I no longer have to keep AwesomeDB behind my 'stupid' reverse proxy + ACL solution and we can finally use native authentication and be really secure!

3B) Alternatively and more realistically, someone at Qualys adds "authentication is not turned on in AwesomeDB" to their scanning product and I get nastygrams from a contractor clicking "Scan" and copying and pasting the results for each of my fifty AwesomeDB instances.

4) During testing, I capture the credentials in plain text in about three minutes because it's a database. Why on earth would it terminate TLS? If we are down the 3B path above: Ha! Your move, security team!

5) Another bug report is filed against AwesomeDB to terminate TLS. Equal shaming is performed, because every service in the world needs to link in OpenSSL and terminate TLS, because that's never been a problem or backfired in any way. Oh yeah, and everyone has a different mechanism for handling certificates, and most don't bother to consider revocation.

(Just like the broader point I'm making about auth, stop adding TLS support to your systems. Please. If you speak TCP I can wrap you in something uniform that does a much better job anyway. You know how I know your integration is shitty? Because OpenSSL is hopelessly undocumented and the APIs even more so. You Googled and read a guide. How about instead, I can drop a pretty awesome TLS listener in front of you, potentially even in hardware! Whoo!)

6) TLS support is added to AwesomeDB. Cool! Now we can really be secure! (But nobody implements client certificates, so we still have to do the username/password shared secret business. And alternate DH parameters? What are those? They weren't covered in the guide, so AwesomeDB hardcoded a couple.) At this point, most people would put credentials directly in their repo. I'd sit around making the Vault argument, but nobody would listen. We have now traded my ACLs and authentication stack for security on our private GitHub repos. I set to work automating changing all of our infrastructure passwords every time an intern leaves the company. We still secure the network, of course, but now there's special strings of characters in existence to work around a competent network security strategy that already existed.

7) As a growing company we introduce LDAP and Kerberos. Oh wait, AwesomeDB and its home brewed authentication stack do not support LDAP because what startup needs LDAP?

8) Another bug report is filed against AwesomeDB for "LDAP support." A bunch of startup engineers spend several months trying to understand the guts of LDAP (I feel your pain) and, worse, the cornucopia of vendor shenanigans in that space, and set the bug report to "holy shit, someday." A rough version that works with one exact schema and hopelessly avoids being just right for my needs might ship, or might not.

9) I redeploy my perfectly functional reverse proxy solution which has supported LDAP since the mid 90s and wire it up to Google OAuth for bonus points (getting rid of passwords for all corporate systems overnight! Neat!) while half the world pointlessly fights for every infrastructure service that exists to basically link in an operating system's worth of encryption and authentication support just to be a document store or search server.

Given the choice I would very much rather secure my network or use well-established software to secure an infrastructure service. Asking every database vendor in existence to reimplement this stuff is a bad idea. People who blindly advocate for this sort of work, then paper over it with "you can't trust the network" without really supporting that statement, just generate busy work for companies that write infrastructure services. If you can't trust your network, what business do you have operating a production service that maintains user data? Do you really think having ElasticSearch do its own HTTP basic is more secure than any other solution? They at least had the JVM/netty/whatever to fall back on, where a lot of these infrastructure services have had to write their own authentication.

There is a middle ground here and a lot of advocacy, both on HN and elsewhere, that misses the forest for the trees. Of course, none of this applies to MongoDB, because they were like sure, the world needs another bespoke binary protocol. You absolutely should be keeping your data on a secured network, though, and opaque strings of characters for system A to log in to system B might seem like defense in depth, but they end up in your source code, another system, or the working set of your executing process, and they are routinely captured by active adversaries. So.

Having a mode where an infrastructure service assumes a connection is legitimate and just services the client is important. Some of us have a little more confidence in our security strategy than others and feel that we know what we are doing. Wiring that mode up to a bind, by default, is the bug.

Oh yeah, almost forgot:

10) All that custom authentication in AwesomeDB that you had an intern write becomes a CVE factory and starts reflecting on the security of the product. Womp womp.

>"Why would I want it to reimplement HTTP authentication when I can place it behind a reverse proxy that's already integrated with my company's AAA and has been doing nothing but HTTP for oh, three decades or so?"

Because getting RBAC right for individual indexes in Elasticsearch is not necessarily trivial. For example, I have multiple groups and users. Let's say I want to give some users access to these indexes but not these other ones, and this group can access these indexes read-only, and this group can add or delete but not access the admin endpoints, etc. Just "putting it behind your company's proxy" doesn't work out of the box. And integrating this on Elasticsearch client nodes (nodes where node.data is set to false and node.master is set to false) would have been very logical.

Also, not every company has AAA with full LDAP integration, e.g. any early-stage startup that consists of a small handful of people wearing many hats, some of which are not their area of expertise. Lots of these types of companies are also using Elasticsearch for full text indexing.

There's a difference between username/password gatekeeping security of an infrastructure service, which my post was quite clearly discussing, and fine-grained RBAC in the data plane. One is itself data within the database, while the other is gatekeeping to the database; they have vastly different impetuses, design constraints, and security strategy.

Conflating the two is wrong and, yes, many services themselves conflate the two which reduces the flexibility of both. Why not let my system do the AA and tell the backend what user it is so that RBAC can be implemented? I haven't used Shield but I'm guessing this is difficult to separate; in a number of other systems it is.
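As a sketch of that split, assume the reverse proxy has already authenticated the caller and forwards the identity in a trusted header, and the backend only decides what that identity may do. The header name `X-Authenticated-User`, the `PERMISSIONS` table and the `authorize` function are all invented for illustration; this is not Shield's or Elasticsearch's API.

```python
# Hypothetical permission table: user -> index -> allowed actions.
PERMISSIONS = {
    "alice": {"logs": {"read", "write"}},
    "bob": {"logs": {"read"}},
}

# Assumed header name; set by the proxy after it has authenticated the caller.
# The backend must only be reachable through that proxy for this to be safe.
TRUSTED_HEADER = "X-Authenticated-User"


def authorize(headers: dict, index: str, action: str) -> bool:
    """RBAC only: the proxy did the gatekeeping, we check fine-grained access."""
    user = headers.get(TRUSTED_HEADER)
    if user is None:
        return False  # request did not come through the authenticating proxy
    return action in PERMISSIONS.get(user, {}).get(index, set())
```

Gatekeeping (who are you?) stays in the proxy, and the data-plane check (what may you touch?) stays next to the data, so either half can be swapped out without touching the other.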

The key to your office is not nearly the same as the key to each drawer of your filing cabinet. Each can be shared independently and serve different purposes. You might decide to never use your office key, for example, because of the armed linebacker in your lobby. In my experience, most of the effort to make an infrastructure service speak authentication is intended for gatekeeping and RBAC comes later. ElasticSearch obviously went all the way with it. This whole conversation about infrastructure authorization tends to make people forget what the actual threat vectors are, which is really the subtext of my thesis here.

Every company has at least a subset of AAA, and hand waving it away as "eh, early startup stuff" is also dangerous. Putting stuff behind Google SSO is nice for startups these days and, boom, AA-sorta-A.

This is a most excellent summary of how the technologies du jour seep their way into an organization!

Out of interest how did you handle the Google OAuth integration? Some kind of JWT module for httpd/nginx that validates the token, and acts on the subject?

Exactly that, yes. It's not the software I was talking about in my essay, but I've used Openresty to do the JWT lifting in about 100 lines of lovely, readable Lua. Made a nice page with a "Sign on with Google" button from their docs, served it statically from nginx, and had it mostly working in a day. I wouldn't write it again, though; I wasn't confident in its complete security owing to there being far more clever people in the world than me, and I'd do that specific task differently this time around.

Openresty/nginx is like an operations Leatherman. Very handy.

It's an entirely fair comparison. Just because they're hiding a feature from you to make you cough up money doesn't mean that a newbie won't be burned in exactly the same way.

You can set up the latest MySQL with a blank root password and open to the world as well. The lack of good security options is a 1.x-ism; there are plenty of security mechanisms now: https://docs.mongodb.com/v3.2/administration/security-checkl...

It is much harder to do this with MySQL.

The default is now a randomly generated password. You have to overwrite it with --initialize-insecure if you want to setup with a blank password.

I don't buy that. Just did an apt-get install mysql-server and it let me install with a blank password (Ubuntu 16.04 MySQL Server 5.7)

It is most likely appearing to work without a password because it is using socket authentication (vs tcp). See: http://mysqlserverteam.com/secure-by-default-in-mysql-5-7/

If you want to double check, try 'mysql --protocol=tcp'.

That's ubuntu screwing things up, not mysql. When you run the init manually -- mysqld --initialize -- it does require a password.

Since we're just talking about defaults, you'd get an equally unsecured MySQL install on the latest and most popular AWS AMI today, as you would with old MongoDB installs.

Most likely this configuration is due to the traditional mongo use-case, which is to have data replicated between many nodes. This requires each participating node to be able to communicate with every other node.

Elasticsearch is still this way, and actually tries to make you pay for Shield, their way of securing it.

Realistically, insecure defaults are part of the reason Mongo was adopted in the first place. Other databases are a hellscape of configuration, meanwhile Mongo is starting to get work done from the moment it's installed.

Security and usability are always at odds. If you make security a pain, people will give up on security. Mongo is the ultimate manifestation of that principle. Developers were driven to Mongo by opaque and painful admin tasks.

A "hellscape of configuration"?

`apt-get install mysql-server` makes you enter a "root" password out of the box.

`apt-get install postgresql` creates a local user `postgres` with access to the local database.

Both bind to only localhost by default.

Those sound like easy, reasonable defaults to me. (And this isn't exclusive to Debian-likes: RHEL almost certainly shares the exact same benefits)

I don't know, to me, these sound like things you have to do for those dinosaur databases, along with their antiquated notions of "authentication" and "actually persisting data you confirmed". Who has time for that?

`apt-get install postgresql` creates a local user `postgres` with access to the local database.

Does it!? I'm sure I've had to do that myself.


Yes, but there is such a thing as minimum viable security. You know, that magical place where you're one step above absolute zero.

> Security and usability are always at odds.

No they are not. Start the server for the first time? ask users to create an admin account, then display a notice saying they didn't configure SSL or something else properly. Problem solved.

Each time the server is started and the setup isn't secure enough, display a message. That's how you do it.
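Something like the following Python sketch is all that takes. The config keys (`admin_user`, `tls_enabled`) and the function name are hypothetical, not taken from any real database.

```python
from typing import List


def startup_warnings(config: dict) -> List[str]:
    """Return one human-readable notice per insecure setting found."""
    warnings = []
    if not config.get("admin_user"):
        warnings.append("No admin account configured: create one before going live.")
    if not config.get("tls_enabled", False):
        warnings.append("TLS is not enabled: connections are unencrypted.")
    return warnings


def start_server(config: dict) -> None:
    """Print the notices on every start, then carry on booting."""
    for message in startup_warnings(config):
        print(f"WARNING: {message}")
    # ... actual server startup would go here ...
```

The server still starts, so nothing about getting going is harder, but the operator sees the notice on every boot until the setup is fixed.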

Too bad.

If you put something on the internet, it's on you to keep it secure.

Just making it bind to localhost and not to every interface would have zero impact on out-of-the-box usability and would prevent 99% of these drive-by hackings.

> Security and usability are always at odds.

Absolutely not. How does having to log in affect usability of a system itself in any way?

Security and convenience are often (but not always!) at odds.

Microsoft took a rash of shit some time ago (15 years?) for shipping MS Proxy Server with every port open by default. From the POV of an employee at the time, it took them a disappointingly long time to stop doing that.

Since then, I've learned to not assume that products are secure-by-default. At the same time, I kind of thought we learned our lesson and cut that shit out lo these many years later. Add a line to a text config file that's probably buried eight directories down in a hierarchy that's owned by root? (I'm just hyperbolically guessing for effect; I generally avoid Mongo.) Do it, or you're hacked? And it's been this way for years? Come on.

Do it, or you're hacked? And it's been this way for years? Come on.

It's not though. And it never really has been. A simple approach to security is to only expose the absolute minimum to the internet - you close everything and then open one thing at a time until your service works. Had those 30,000 MongoDB instances been sitting on IP addresses behind routers that only allowed local traffic, or specific IP addresses, then they would have been a lot harder to steal data from. The fact that they were willing to give up data to an unauthorised user would still be a problem, but it'd be a local network problem rather than something anyone with a port scanner could do.
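If you want to see what "anyone with a port scanner" sees, a check like this Python sketch is enough. The function name is made up; 27017 is MongoDB's default port, and the hostname in the usage note is a placeholder.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed.
        return False
```

Run `port_is_open("db.example.com", 27017)` from a machine outside your network. If it returns True, the database port is exposed to the internet, whatever the auth settings are.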

This is not a case of discovering some arcane setting you have to change to make sure your data is secure. This hack was a case of people putting services on open connections without bothering to follow even basic practises. It's bad that MongoDB gave up data without auth details, but your infrastructure can be designed to block an attacker even when they have security details so the fact that MongoDB had this problem is never going to be the whole story.

Years ago Oracle had two admin accounts out of the box, but half the docs only talked about one. So a whole lot of people discovered that they had an admin account with a default password hanging out in production. Those were some awkward conversations.

In fact, this setting has defaulted to localhost only for years. MongoDB 2.6, which made this the default, shipped on April 8, 2014.

Exchange 5.0 was an open mail relay and not possible to lock down - you actually had to pay for Exchange 5.5 in order to get relaying control.

But that's bad old NT4-era Microsoft. Not 201x MongoDB.

Microsoft is a different beast. They ship things based on the principle of least surprise. There are incredible features in newer versions of SQL Server that aren't turned on by default when upgrading, even though no one would want to not have them on.

To be fair, SQL Server is probably the best product they've ever built. IMO, the only good reason to use Oracle was that it runs on Linux. With SQL Server on Linux, I'm sure they'll gain more market share over the next years.

As much as I like using open source alternatives like Postgres, SQL Server for me is by far the best database system, from standpoints of performance, features and tools.

Can you give some examples?

Read isolation is probably the biggest stand-out in terms of "Wow this is way faster why would anyone* not use this?"

* Almost anyone, there are definitely potential pitfalls.

That sounds like a truly awful rash.

No matter what your defaults are, if a sysadmin doesn't understand that some software is OK to connect to a public-internet addressable interface while other software is not, their shit is going to get hacked (or, as in this case, accessed).

This is the equivalent of leaving your data in a file cabinet in your lobby and being surprised someone took it. The answer isn't to add self-locking little office locks to the file cabinets by default, it's to educate folks that this is a file cabinet that's designed to be inside the office, not in a publicly accessible place.

Network oriented software created after 2000 has zero excuse for allowing unauthenticated writes from the general internet by default.

Tell that to redis.

No, seriously. Please. I've been banging that drum for years.

Also, MySQL [1]. Are you sure that you're right and all these major db projects are wrong?

1. note the default value of "bind to *": http://dev.mysql.com/doc/refman/5.7/en/server-options.html#o...

Well, mysql was started before 2000, so it gets a pass. But I'm also pretty sure mysql doesn't allow unauthenticated writes, so it doesn't need one?

At least MySQL can speak TLS.

Redis doesn't listen on localhost by default? I've deployed countless redis instances and that's news to me. Did Debian/Ubuntu change the default default?

What about the "this is the most common setting used by our clients"? Is that not a good "excuse" for being the default?

Here's a semi-sane default for people who want the simple case to just work:

- require authentication for remote network connections

- allow r/w from any local connection, tcp or unix

Random devs installing mongo on their workstation will Just Work, but won't expose customer data to the internet.
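The two rules above fit in a few lines. This is a sketch only: the function name and the `"unix"` placeholder for Unix-socket peers are assumptions made for illustration, not anything a real database exposes.

```python
import ipaddress


def requires_auth(peer: str) -> bool:
    """Loopback peers are trusted; everything else must authenticate.

    Unix-socket connections have no IP peer, so this sketch represents
    them with the placeholder string "unix" (an assumption).
    """
    if peer == "unix":
        return False
    return not ipaddress.ip_address(peer).is_loopback


if __name__ == "__main__":
    for peer in ("unix", "127.0.0.1", "::1", "10.0.0.5", "203.0.113.7"):
        print(peer, "-> auth required:", requires_auth(peer))
```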

No, it's the equivalent of selling a car that doesn't require the key to be present to start by default.

If the software is insecure out of the box, that's crap vendor behavior.

I think it is more like the equivalent of a refrigerator without a lock. You aren't supposed to keep a refrigerator outside your locked home. You can and do see people fit refrigerators with locks for that purpose, but when you do that there are tons of other things you will need to do, and I don't think it is optimal for the refrigerator people to try to solve them. (That said, as the issue here is "localhost or all interfaces", I don't think keys or locks are good analogies: more like, when they come and install your refrigerator, they should probably install it inside, as that's the place it should be used, but surprisingly if you don't tell them this ahead of time they install it outside, so you can get access to it more quickly from your car, which is insane.)

There is plenty of heavy industrial machinery that doesn't have keys - the canonical example is a jumbo jet.

The most common MongoDB versions that are being exploited with the ransomware come with secure defaults!


Originally, I thought the problem would disappear once MongoDB changed the default bind address from all interfaces to "localhost". However, that has had no impact on the number of exposed MongoDB instances on the Internet. Bad defaults are definitely a problem, but in this case that isn't what's happened.

Also note that the most popular location where these instances are hosted is on AWS where you explicitly have to open up the firewall.

Just a guess, but those people probably updated their installation with an existing (insecure-by-default) configuration in place. Very little software like this will mess with an existing configuration, so that upgrading doesn't appear to break anything. That's how they'd be on a secure-by-default version but still be insecure.

Yeah, I could definitely see that happening though the overall number of exposed MongoDBs has also increased. And there are Docker images that come w/ insecure defaults ala:


One of my concerns is that people are piling onto MongoDB because "it's web scale" and ignoring potentially systemic security issues that aren't specific to MongoDB. The same issue exists for Riak, Cassandra, Memcache etc.:


While we should hope that server software is secure by default, you shouldn't rely on it either. A good rule of thumb is: Distrust until verified

If you're running any servers yourself, there's no reason to open up network access to the world. At most, the specific ports you want to expose, to the specific IP ranges you want to allow, should be whitelisted. A default of everything to everywhere is insane.
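As a rough model of "specific ports to specific IP ranges", here is a default-deny check in Python. The `ALLOW` table and its address ranges are invented examples (documentation and private ranges); in practice this logic lives in your firewall or security group, not in application code.

```python
import ipaddress

# Hypothetical allowlist: port -> networks allowed to reach it.
ALLOW = {
    22: [ipaddress.ip_network("198.51.100.0/24")],   # office range (example)
    27017: [ipaddress.ip_network("10.0.1.0/24")],    # app servers (example)
}


def is_allowed(src: str, port: int) -> bool:
    """Default deny: only listed ports from listed ranges get through."""
    addr = ipaddress.ip_address(src)
    return any(addr in net for net in ALLOW.get(port, []))


if __name__ == "__main__":
    print(is_allowed("10.0.1.15", 27017))    # app server reaching the database
    print(is_allowed("203.0.113.9", 27017))  # arbitrary internet host
```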

Even with SSH key-based authentication (vs., say, passwords) I'd consider it inept to expose all your infrastructure publicly. Set up a VPC or whatever the local equivalent is for your hosting environment and proxy SSH access via a bastion host. If you whitelist the inbound addresses to the bastion host (let's say to restrict it to just your office) then you've also eliminated the vast majority of auth log spam too.

Unfortunately the people that would understand and implement things like this are the same set of people that wouldn't have publicly exposed MongoDB instances in the first place.

I find it interesting that there's no firewall with a default deny rule between these exposed mongodb installs and the Internet. All I can think is that most of them are on cloud services which are directly exposed. It reminds me of the fiasco with all of the directly-connected vulnerable network cameras on the Internet.

I think the problem is developers running apt-get install mongodb and assuming all other considerations, like a firewall, are somehow magically taken care of, then patting themselves on the back for not needing a sysadmin.

Why shouldn't these be taken care of?

Both Windows and UNIX/Linux programs have had interactive, guided installers, graphical or text-based, for years. Package managers support post-install scripts.

Organizational division-of-responsibility issues are not for applications to solve, but having sane defaults and performing user-supplied configuration is. This could've been handled with:

> Do you want to allow MongoDB to be accessible externally? (WARNING: this means ANYONE on the internet can connect. You will be asked to set up a username and password to protect your database. Don't do this if you're unsure or if your organization has to approve access.)

> Now that your MongoDB is open to the Internet, choose a username and a password to protect it from intruders.
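The decision logic behind prompts like these is small. Here is a sketch, assuming the installer has already collected the answers; `configure_network` and the setting keys are hypothetical, and password hashing and storage are left out on purpose.

```python
def configure_network(expose: bool, username: str = "", password: str = "") -> dict:
    """Translate installer answers into settings (keys are illustrative)."""
    if not expose:
        # Safe default: local connections only, no credentials needed.
        return {"bind_address": "127.0.0.1", "auth_enabled": False}
    if not username or not password:
        # Refuse to open the service up without credentials.
        raise ValueError("external access requires a username and a password")
    return {"bind_address": "0.0.0.0", "auth_enabled": True, "admin_user": username}


if __name__ == "__main__":
    print(configure_network(expose=False))
    print(configure_network(expose=True, username="admin", password="example-only"))
```

The point of the design is that the insecure combination (external access, no credentials) cannot be produced at all, rather than merely being discouraged.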

I think it's more of a case of assuming that the database wouldn't expose itself to the internet at large, because why would it? It's like buying a new car, and checking for holes in the gas tank before rolling off the lot.

Your car analogy fails because safety is highly regulated by the government, and the manufacturer can be sued for their failings.

When it comes to server security, you are responsible, and should never make assumptions. If you don't know how, or simply can't be bothered to check, then you shouldn't be administering a server.

Also a firewall is a most basic requirement, even something simple like UFW would do.

To make this analogy a bit more accurate: with this car, instead of filling the gas tank by opening the gas tank door and removing a gas cap, you just put the pump into a hole in the side of the car/tank. Then you act like you had no idea gas was sloshing out of that hole.

You didn't do anything special to connect to MongoDB, and are able to connect to it from your other servers. Why would you expect the same wouldn't apply to everyone?
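Checking this takes a few lines. The sketch below only tests whether a TCP port accepts connections; `port_is_open` is a made-up helper name, and the check only tells you something about exposure if you run it from a machine outside your own network (27017 is MongoDB's default port).

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `port_is_open("db.example.com", 27017)` run from a laptop on a coffee-shop network should come back `False` (the hostname here is a placeholder).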

I will say that people who ran MongoDB on their app server get somewhat of a pass on this. I could imagine that they would find the idea of MongoDB running open to the world with no auth to be so nuts that they assumed it was only allowing connections from the local machine.

On the subject of apt-get, Ubuntu (and most distros) have the firewall disabled by default because out of the box installs have no listening ports. This makes it very easy for the uninformed to have insecure systems. https://wiki.ubuntu.com/SecurityTeam/FAQ#UFW

This is absolutely the problem. There are high-horse developers who've made 4 webapps using PHP and don't know anything about the system they're deploying on -- but they're rockstars disrupting the world!

They leave mysqld bound to all interfaces because they don't know any better. They SSH as root because they don't know any better. They have a default WordPress install with the config db sitting in webroot.

But hey, their website works and might one day make them some money.

Everyone can't know everything, be realistic.

Ah yes, everything is absolutely what I was getting at. This line of thinking is what allows these problems to happen due to pure laziness or not caring.

When you build a product or service, you should own it end-to-end OR simply let it be known that you are unable or unwilling to do so.

If you take it seriously, you'll know what can sink your project/business -- and stuff like leaving your entire database publicly available with no auth is one of them.

It's a requirement for a one-person shop. Or they have to get expert advice.

Sure, but everyone is naive at some point. You can't expect one person to make ZERO mistakes. There is a huge number of steps involved from conception to implementation. I've done many, and still forget things.

I don't mean to defend Mongo here, but as a counter argument to the broad principle: Ubuntu doesn't ship with iptables blocking all incoming connections by default. Should it?

Honestly, in this day and age, it probably should. As well as whitelisting software that is allowed to make outgoing connections. At least for server installations. Network security is still a complete joke.

Well, if you're on a laptop, Linux may ship with no Wifi card support and thus no internet whatsoever. Guess that's even more secure!

The modern consensus is that it should ask you what class of network you're connected to and configure it accordingly, as more permissive rules are better for home networks and the opposite for public networks.

But also, if Ubuntu didn't run network services by default, there would be no need to secure them with iptables. So defaults affect defaults.


MongoDB has a history of using insane defaults - not secure, and willing to lose your data.

I wouldn't blame MongoDB.

Any machine connected to the Internet should start with a configuration equivalent to:

    ufw default deny incoming
    ufw allow ssh
    ufw enable

And then start punching very specific holes as needed.

I had a situation where their cloud backup product was not backing up data and not issuing alerts when it wasn't working.

I gave them full access to logs, etc. and the support response was basically that they couldn't determine why it wasn't working or alerting, but it works for them, so I should migrate to their hosted solution.

I sometimes think of the potential company-ending cluster-fudge that could have caused, while I stare wide-eyed at the ceiling at night.

To play devil's advocate for a second, isn't Mongo following the unix philosophy here? Small, composable tools instead of bolting on redundant features to everything?

Opening/closing ports is the firewall's job. Mongo's job is storing and retrieving data (now, how poorly it does at that is another conversation).

Having both Mongo and the Firewall handle ports means you do everything twice, which violates OnceAndOnlyOnce.

If that's the case, it should be behind something like inetd, instead of being able to daemonize itself ;)

Yes, it is. Apparently there's a time and a place to be following the Unix philosophy, and this is not that.

I stumbled upon this (awesome) article that made me realize tons of things: http://www.ranum.com/security/computer_security/editorials/d...

I'm not a security expert (far from it), but I hope that I understand the importance of security well enough to learn a bit about it and implement it as much as I can.

Secure defaults is now maybe the first concept I'm trying to explain to people in my company.

Defaults are stupid and dangerous.

First off, there's the constant problem of default passwords. The big IoT compromises lately were due to default passwords, and ever since they became popular they've been a bane to security everywhere. Wifi APs used to use defaults, but luckily most come with a randomly assigned password now (except those provided by an ISP, occasionally). All software that matters should not use default passwords.

Second, a "default" is not intended to be an operational setting. When developing software, a default is literally a placeholder setting so that your software won't immediately crash if a configuration value is missing. Some software is effectively meaningless without a configuration, but if its defaults are obscenely permissive, a user can still take advantage of it, which is how MongoDB is set up.

Finally, there is effectively no such thing as a secure default, because something isn't really secure until it's been vetted through a process. Pretending your software is secure when you have literally not given it a second thought is just setting you up for eventual disaster. People should be actively thinking about their security if they care about it; defaults just leave them blissfully ignorant.

In my ideal world we would require software to no longer have defaults. Demand that users be aware of the configuration and running of their software so they understand what's involved with how it runs. This would teach them more about the software they're using, which would enable them to take better advantage of it, as well as make it more secure. If you don't have time to do that you probably shouldn't be using it.

It's pretty simple, folks: RTFM. If you are running MongoDB on your laptop to build an MVP, sure, you can run it unsecured, no worries. When you go to production, you go secure. My hope is that no DBA worth his under-appreciated skills would drop a totally open DB on the public internet. That said, it obviously happens... Read the directions, have a plan, and follow the checklist: https://docs.mongodb.com/manual/administration/security-chec...

Everyone has a different security model. In my case all my DB servers live behind an API layer on the internal network, and the DMZ web layer talks to the API. That makes keeping things secure MUCH easier...

The v2.6 open access issue in MongoDB is no new revelation, and has been rehashed over and over again. It's worth noting that this post comes from a company that profits from convincing its customers that the products they are using are insecure.

I'll say this is even worse because whether or not to bind to localhost, all interfaces, or something in between is often highly dependent on how the software works - which you're less likely to know when just trying it for the first time. Even someone security conscious would have to pause and do some investigations to find out what mongo expected in terms of port and interface bindings. If the default was to bind to all interfaces, my first assumption would be they require that.

This was 100% preventable; MongoDB comes with great resources for securing your data. I pity whoever puts ANY database on the internet without taking proper security measures.

Here's some documentation to get you started.




This really should have been titled "the importance of good DBAs..."

This isn't a MongoDB problem. Redis has already been targeted, and memcached and Elastic will be next. For more info: ise.binaryedge.io or http://blog.binaryedge.io/2015/08/10/data-technologies-and-s...

I just had to deal with an insecure default in our Hadoop cluster.

Hadoop, by default, enables WebHDFS, which provides read+write access to HDFS over HTTP. The service runs on name node using the same port which displays the Hadoop Jobs Web UI.

Now, it is very easy to know nothing about WebHDFS and, every day, log in to the server, submit jobs, and track their status on the Web UI (which is public). But then anyone can send a POST request pretending to be any user and Hadoop will happily execute the command (including deleting all files).

Maybe Hadoop developers believe that all Hadoop installations are behind a firewall (ours was, but an exception was made for Jobs Web UI as it seemed to be a read-only status page for running jobs) and/or have Kerberos integrated with Hadoop for authentication.

But that being said, enabling HDFS access over HTTP by default seems to be a debatable choice. Also, if you have a small Hadoop cluster running, do set the "dfs.webhdfs.enabled" property to false in hdfs-site.xml.
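If you want to audit this rather than eyeball it, a small script can read the setting out of hdfs-site.xml. This is a sketch that assumes the usual `<configuration><property><name>/<value>` layout of Hadoop config files; the function name is made up.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def webhdfs_enabled(hdfs_site_xml: str) -> Optional[bool]:
    """Return the configured dfs.webhdfs.enabled value, or None if unset."""
    root = ET.fromstring(hdfs_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "dfs.webhdfs.enabled":
            return prop.findtext("value", "").strip().lower() == "true"
    # Not set explicitly, so the cluster falls back to Hadoop's default.
    return None


if __name__ == "__main__":
    sample = """<configuration>
      <property>
        <name>dfs.webhdfs.enabled</name>
        <value>false</value>
      </property>
    </configuration>"""
    print(webhdfs_enabled(sample))
```

A result of `None` deserves the same attention as `True`, since per the comment above the default is enabled.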

I cannot believe that so many users install MongoDB with only the default settings on servers that are accessible via the internet.

That is fundamentally wrong application design. The application connects to the database, so the database must not be publicly accessible via the internet.

I mean, if you push something into a production environment, you must think a bit about the tools you are using and how they are configured for security.

On a development system, I don't want to setup users or auth-mechanisms.

There are so many resources from MongoDB on how you can secure your MongoDB database:

- Security Architecture White Paper: https://www.mongodb.com/collateral/mongodb-security-architec...

- Security Best Practices blog post: https://www.mongodb.com/blog/post/how-to-avoid-a-malicious-a...

There is even a checklist which you can work through: https://docs.mongodb.com/manual/administration/security-chec...

And for those, who don't want to read anything ;-), there is a comprehensive Video-Course on the MongoDB-university: https://university.mongodb.com/courses/M310/about

If I enter a car and don't use the seat-belt or switch on the light at night (defaults are off), then I cannot blame the car-manufacturer. If I use something important (here the database in a production environment), then I have to learn the ropes and read the material specifically on security topics.

While we are at this, this checklist is worth looking at: https://github.com/FallibleInc/security-guide-for-developers...

Postgres' defaults barely let you connect to the DB. You don't hear stories like this about Postgres.

Well, sure but then people just open the flood gates because it's such a pain in the ass to change postgres access policies.

I guess secure by default means there aren't likely to be massive hacks like this, but it does not mean people are running securely.

Is it just that no one's running exploit scanners for Postgres then?

But back when I was starting out you did get a hell of a lot of people choosing mysql over postgres partly based on the fact that the latter was easier to connect to out of the box, and/or filling the support channels with connection questions that are already answered in an installation guide in the documentation.

Convenience "sells" even at the expense of security (and in this case, actually working properly at all!) and if you don't make it as easy as possible be prepared to support people who refuse to read the instructions.

pg_hba.conf is one of the reasons it took me so long to learn Postgres. Of course, now that I know it, I don't look back - but making the first step in learning a database "understand this weird model of AF_UNIX username == dbname trust" delayed the learning process by years.

FWIW, I'm a postgresql committer these days, and pg_hba.conf was probably the bit I was uncomfortable with for the longest (user level, not code level).

Speaking of defaults, if you install Mongo on EC2 using a default instance of Ubuntu using the default engine, Wired Tiger, you will get ceaseless server crashes. Wired Tiger is NOT STABLE on the default volume - ext4. You have to use xfs.

This was only recently added as a startup check.

Education, experience, diligence. I'd expect that of anyone basing their livelihood on delivering production systems.

Secure out of the box is nice, except when the boxing is done by others and delivered to under-informed, inexperienced people. It's too easy to npm yourself a "full stack" or just pick a Docker container with full stack on it, then place it on public inter-web... there are dangers to be had, and others can save you. It's up to you.

And yeah, the click-bait titles including the word "hack" are laughable.

I have recently been working with MongoDB and wanted to start the service with auth enabled, with defined users and security. Here is how I did it: https://medium.com/gitignore/automatically-start-mongodb-in-...

I found another piece of useful information from mlab, so I will share it.


The news is too late for me. Two of our development servers are already compromised. Anyway, I put those servers behind NAT so they are inaccessible from the internet. I also added admin access credentials. The news should have been spread without any delay.

It's worth noting this isn't unique to MongoDB. The "Marked" npm package, with its 2 million downloads, doesn't sanitize input by default. "st", another popular package, allows directory listing by default. Quite a few of those...

If I were a mongo dev, I wouldn't want to write authentication and have to manage crypto.

Default bind to localhost. If you want to handle external connections, the network layer provides your security.

The MongoDB hack and the importance of hiring competent people.

A bit off-topic but related: how secure are the defaults of an AWS RDS MySQL (created by a BeanStalk environment, if it matters)?

Not defaulting to 'localhost' (or the loopback address) is pretty bad in general.

Why would someone still use v2.6? There isn't much trouble to upgrade.

Based on the article at least 30.000 internet-accessible installations.

The same people that run an unsecured database that's accessible over the internet.

Who, when wanting to write a secure piece of software, thinks: "I know! I'll use JavaScript!"? Security was always going to be an afterthought at best. There are legitimate reasons for writing things in JavaScript, and none of them apply to a database.

You got it wrong, MongoDB is written in C++. I guess it's easier to jump on the JS hate bandwagon than look it up for yourself.

What does this have to do with Javascript?

MongoDB is written in Javascript.

In part. According to the Github repo, it's 75% C++, 18% JS.


The JS is from docs and examples, not from core MongoDB code.

So https://en.wikipedia.org/wiki/MongoDB is wrong? Well, guess I trusted Wikipedia a bit too much.

I'm confused, what about that page are you claiming to be right or wrong? The code is here for all to see: https://github.com/mongodb/mongo/tree/master/src/mongo

Thanks for the further clarification!

That's ridiculously incorrect.
