Over 30% of Official Images in Docker Hub Contain Security Vulnerabilities (banyanops.com)
222 points by mikagrml 911 days ago | 77 comments

"Containers have revolutionized software development by providing a very efficient path to take software written by developers and run it in production in a matter of minutes or hours, rather than days or months using traditional approaches."

FUD. The technology of deployment does not change 'minutes or hours' into 'days or months' - it's management red tape that does that. In fact, in my use case, Docker takes a similar time to build as a normal package (.deb) using an up-to-date base image, but is actually slower to deploy, since now my servers have to download a stupidly large container with build-essential (npm doesn't really survive without it), python (because npm maintainers use python frequently), and graphicsmagick (for the in-house app), instead of 'just the app' that's in a normal package.

If your environment is simple enough that you don't have to be concerned with testing in 'staging' against staging databases or similar, then you're definitely not saving 'days', because your env just isn't that complicated.

"The technology of deployment does not change 'minutes or hours' into 'days or months'"

I wouldn't say that's true. We're transitioning into multiple languages, and want to have an environment that will allow future languages to be added as required. Building a generic infrastructure to run containers lets us run everything on the same base platform. Otherwise, we'd need to tailor the images and configuration for the individual language type. When a new language is introduced, it can take 'days or months' to get everything working well.

That's not to say Docker doesn't require the same attention to security as other options. This seems to me akin to running a downloaded base VM image without first doing updates.

Or, you could do what HPC has been doing for years and separate the config from the machine.

What do I mean by that? shared drives.

Seriously, install python$ver plus dependencies into /mnt/bin and add it to your path. You now have a single (optionally read-only) source for each binary version.

This means that you can have many versions of the same software, all compiled in a different way. But because they are in the path, they can be transparently managed. It also means that much of the config management is now in one place, making joining nodes super simple.
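
A minimal sketch of the shared-drive idea, assuming a hypothetical layout of `<share>/<tool>-<version>/bin` on the mounted share (the comment above only says "/mnt/bin", so the directory naming and the `build_path` helper are illustrative, not anything the poster described):

```python
import os


def build_path(share_root, wanted, base_path):
    """Prepend the bin directories of the requested tool versions to a PATH string.

    share_root: mount point of the shared drive (assumed layout:
                <share_root>/<tool>-<version>/bin)
    wanted:     mapping such as {"python": "2.7.9"}
    base_path:  the existing PATH value
    """
    entries = []
    for tool, version in sorted(wanted.items()):
        bin_dir = os.path.join(share_root, f"{tool}-{version}", "bin")
        if not os.path.isdir(bin_dir):
            # Fail loudly instead of silently falling back to another version.
            raise FileNotFoundError(f"{tool} {version} not found under {share_root}")
        entries.append(bin_dir)
    return os.pathsep.join(entries + [base_path])
```

A node would then export the returned string as PATH at login; picking a different version is a change to the `wanted` mapping rather than to the machine.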

We do this at my company and it is a fucking nightmare. Why? Because there are like 4 different operating environments and there isn't an official way to do installations and you also have to manage site installations of various packages for each version of each language dependency. And god forbid some environment variable is pointing to the wrong version of something because it's not just as simple as setting PATH and LD_LIBRARY_PATH when every thing and its mother tries to set its own fucking environment variables all pointing to wherever they think they were compiled at.

No, it is much much MUCH better to actually have an application build with its dependencies and deploy with its dependencies. And you know how you fix issues with security patches? You have a real build system that rebuilds your binaries and you redeploy regularly.

Otherwise, we'd need to tailor the images and configuration for the individual language type. When a new language is introduced, it can take 'days or months' to get everything working well.

Yes, and now you have to tailor the distribution in the container to the new language. Of course, the impact is smaller than changing one system that contains everything.

However, this problem was long-solved before containers (as in OS-level virtualization) as well in virtualization (Xen, KVM, etc.). (Of course, FreeBSD had containers for ages, but they were largely ignored.)

Not surprised at all.

And here we have the prime example, why the Docker-model of building and distributing containers is horrible when it comes to security and maintenance.

Bundling dependencies for production environments has always been and always will be a terrible idea.

This sounds like an oversimplification, though:

> Bundling dependencies for production environments has always been and always will be a terrible idea.

We're considering Docker currently -- not for the distribution model at all, since we'd only ever use our own internally built & maintained images -- but as a clean way to break apart dependencies, and make it possible to run a diverse multiple-server-type environment (production) in miniature (development, demo, UAT).

I quite like the idea of something that may occupy multiple VMs or dedicated servers in production be able to run as a lightweight app in a dev environment, with exactly the same dependencies in place -- that's quite useful.

If this kind of use case is also a terrible idea, I'm interested to hear more -- we're just now tinkering with the idea, and haven't yet moved from theory to practice.

My own concerns revolve around how easy it will be to keep updated on RHEL patches, for example -- apparently we should be able to keep both host and app dependencies updated without much trouble, but it adds more complexity to the maintenance cycle (it seems).

> My own concerns revolve around how easy it will be to keep updated on RHEL patches, for example -- apparently we should be able to keep both host and app dependencies updated without much trouble, but it adds more complexity to the maintenance cycle (it seems).

That's about the "problem" with Docker – it's deceptively easy to roll out everything as its own containerized app. Updating? Not so much.

It turns Docker from a magical silver bullet into a slightly fancier way to handle reproducible deployments. Using it this way is fine, but not what Docker is marketed as by many.

Actually it's pretty easy; I just did it yesterday for my PostgreSQL container.

Debian/Ubuntu example: sudo docker exec -it my_pgsql_container_name /bin/sh -c "apt-get update; apt-get -qqy upgrade; apt-get clean"

And what happens when you launch a new container from the same image? You need to run apt-get/yum again. Or rebuild the image.

That's why you keep everything with state in a separate volume container. Attach volume to built image and that's it.

You can, if you want, mount your root as readonly so you're not tempted to modify it. Then it behaves like a Live CD.
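
As a rough sketch of what that combination looks like, the helper below only assembles a `docker run` command line with a read-only root filesystem and one writable volume for state. The function name, container name, and paths are made up for illustration, and a real service usually needs a few more writable locations (temp and pid directories, for instance) than this shows:

```python
def docker_run_args(image, name, data_volume, container_path):
    """Return an argv list for a container with a read-only root and one data volume."""
    return [
        "docker", "run", "-d",
        "--name", name,
        "--read-only",                            # root filesystem cannot be modified
        "-v", f"{data_volume}:{container_path}",  # state lives outside the image
        image,
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch has no side effects.
    print(" ".join(docker_run_args("my-pgsql-image", "my_pgsql", "/srv/pgdata", "/var/lib/postgresql/data")))
```

Because the state is kept in the volume, replacing the container with one started from a rebuilt image does not lose it.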

Mount data, logs, configuration, eventual extensions in the data container?

For pg, there might be some migration needed when jumping from a major version to the next. Which requires both versions installed, on Debian at least.

>Mount data, logs, configuration, eventual extensions in the data container?

Many programs have their state represented as files that are stable across versions. If you have a cluster of the same image with different states it's more efficient to move volume containers across a network. Easier to backup/upgrade too.

pg is going to give you those problems whether you are using Docker or not.

and that defeats the whole selling point of docker, which is no forward config, containers do not change once shipped.

Worse, doing this breaks your guarantee that all environments deployed from this image will be consistent. You'll have to deploy some config management software (Puppet/Chef/Salt/Ansible) to stay on top of these changes.

Check out Project Atomic http://www.projectatomic.io/. Or its downstream project RHEL Atomic Host. The whole update process for the host is much simpler. Read more about it here: http://rhelblog.redhat.com/2015/04/01/red-hat-enterprise-lin...

Note: I am not related to Redhat, but we are considering Docker, too. And we are evaluating how would Atomic fit in our infrastructure.

Basically, you're thinking of building a custom PaaS.

I'd just use an existing one. PaaSes require an enormous amount of work to make them featuresome and robust. That's all work you're spending that isn't user-facing value.

I've worked on Cloud Foundry and so obviously I think it's the bee's knees. You might prefer OpenShift.

If you're happy in the public cloud, you can host on Heroku, Pivotal Web Services (my employer's Cloud Foundry instance) or on Bluemix (IBM's Cloud Foundry instance).

First, I'd like to point out that you cannot have a miniature version of production and you cannot reduce maintenance complexity. It violates the fundamental laws of nature. No matter how small, you still have the same number of moving parts, so it's effectively the same when it comes to actually operating and maintaining it.

But lucky for you, Docker provides some ways to run commands on an existing image, like the RHEL patching/updating tools. It should be possible to update an image's files using RHEL's patches, as long as the whole RHEL install is there in the images.

As far as breaking apart these sets of files into disparate dependencies: again, it's totally possible, but it does not simplify nor reduce your maintenance complexity.

Now, some really stupid people would recommend you compile applications from source and deploy them on top of RHEL, and basically build all your deps from scratch. You don't want to do that because a large company has already done that for you and put it into a nice little package called an "rpm". You take these RPMs and you find a simple way to unpack them on the filesystem, make a Docker image out of them, label/version them, and keep them in your Docker image hub. Now you have your RHEL patches as individual Docker images and can deploy them willy-nilly.

(This is, of course, exactly the same as maintenance on systems without Docker, and your dev & production environments would be the same with or without Docker, but Docker does make a handy wrapper for deploying and running individual instances with different dependencies)

Do you believe that locally built "homegrown" deploys on average are going to be better or worse than these images?

Because I know what I'd bet on.

I'd bet on homegrown - the quality of the official docker images is pretty low when it comes to application ones. Images for OSes are fine. Application images are often not updated when a new version of the application is available until you send a pull request on GitHub. I can do better than that myself.

Also, official images are not production ready, they are apparently intended for development purposes. Take the Django image as an example. The server it runs on start is not Gunicorn, or uWSGI, or Apache. It is the development server of Django. I can do better than that myself.

I don't think that is a problem with Docker - the application. If Docker - the company - does not have the resources to properly maintain so many official images then it shouldn't try to.

Given how hard it is to find hires, even ones coming from good development jobs elsewhere, who understand even the very basics of security, I think you're being overly influenced by your own skills.

You may very well do better than that yourself. I don't doubt that large proportions of HN users would do better.

But how many will?

Consider that the quality of the official docker images is an illustration of the quality of images from people who are above average invested in this.

Look at some of the unofficial images, and you will find incredible dreck very quickly.

Now imagine the set of users of images that have not even tried to build their own images yet, and imagine they were asked to put together their own replacements for the official images to use...

Homegrown every time, sadly.


Reason being, you can more easily deal with silly things like goofy hosts, goofy networks, possible lack of internet connections, bad host OS support, etc.

The normal downsides of doing it yourself of course apply.

As I noted elsewhere, I don't doubt that many of us could, and I'd expect HN users to do better on average than developers overall, given the number of security conscious people here.

That's not what I'm questioning, though; I'm questioning whether homegrown images on average are going to do better. Look at the non-official images, and see how much nonsense is in there.

If you know you can do better, by all means do. For many of us that is the best option. And I absolutely wish there was more focus on more secure practices for the official images too. But I still think the official images are likely to be better than what most developers would cook up.

Doesn't mean it's good. Just better than the (terrifying) alternative.

I see it as the opposite. If the maintainer of the container put some effort in, everyone could have a secure version of their software with minimal effort.

The trick is to get people to care about their security. In theory, this is what open source is about. Why not assemble a taskforce to go and secure these containers?

The problem is that "secure version" is a constantly moving target; the taskforce would need to go around once a month (or whenever there's an urgent vulnerability discovered) and update apps that needed it.

If Docker apps were somehow integrated with maintained Linux repos, this could be possible by default -- e.g., all Docker images built on Debian stable dependencies would have their internal dependencies auto-upgraded with each Debian stable sub-release, and possibly be flagged as "needs human intervention" on major releases.
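
To make the proposed policy concrete, here is a small sketch under the assumption that releases can be compared as plain dotted numbers such as "8.1" and "8.2" (real distro and package versions are messier). The function name and the three labels are hypothetical:

```python
def classify_upgrade(current, candidate):
    """Decide what to do when an image's base release differs from the newest one."""
    cur = tuple(int(part) for part in current.split("."))
    new = tuple(int(part) for part in candidate.split("."))
    if new <= cur:
        return "up-to-date"
    if new[0] > cur[0]:
        # Major release change: dependencies may break, so ask a person.
        return "needs human intervention"
    # Same major release, newer sub-release: safe to rebuild automatically.
    return "auto-upgrade"
```

For example, `classify_upgrade("8.1", "8.2")` returns "auto-upgrade", while `classify_upgrade("8.2", "9.0")` returns "needs human intervention".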

Have there been efforts to do anything like this? I'm new to the Docker world....

There needs to be, though, otherwise a "secure app" is always a temporary creation.

>The trick is to get people to care about their security.

It's 2015. If security isn't a priority for a project, then that project is just incompetent. That may sound harsh, but are we really talking about security as optional with internet-facing services? This is what happens when devs build their own systems without the experience of being a sysadmin. There's a lot of kitchen sink and duct tape "does it work? Yes, then we're done," mentalities at play here. Not enough people are worrying about maintainability and upgradability.

Heck, most of these things ship with everything running as root. It's like we've regressed to the 90s with Docker and Docker-like technologies.

> Bundling dependencies for production environments has always been and always will be a terrible idea.

If you are not bundling dependencies how do you rollback a deploy that migrated to a new version of a dependency? If you rollback your code, you also have to do something to rollback the dependency.

For Python, I currently rebuild a virtualenv from scratch on each deploy, but it just feels like a poor solution. Docker containers seem like an interesting way to package these dependencies in a way that is portable, where a deploy is just pushing a new version of the Docker container. Is there a Better Way(tm) that doesn't involve me needing to deal with building OS packages for all of my virtualenv dependencies?

(I'll note that several dependencies have C extensions, and are thus not pure Python -- e.g. `itsdangerous` depends on `pycrypto`, which has extensions.)

If you're basing deploys on image builds, you're not "rolling back" anything, but are building a new host (or host image) based on the correct dependencies.

That process relies on your platform's own dependency-resolution system, and I hope you're using something sane such as Debian/Ubuntu, or are building from source via Gentoo. RPM distros can work but tend to be far flakier.

Start with a base install, have a package for your own source which specifies deps, including if necessary _maximum_ version numbers for deps, and build the target image. Once that's built, you can generally deploy that directly rather than re-build for each deployed host.
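
A minimal sketch of the "deps with maximum version numbers" check, assuming plain dotted version strings (real Debian or RPM version strings need the package manager's own comparison rules). The package names and the `check_deps` helper are hypothetical:

```python
def parse_version(text):
    """Turn '1.4.2' into (1, 4, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def check_deps(installed, constraints):
    """Return a list of violations of (minimum, maximum) version bounds.

    installed:   {"libfoo": "1.4.2", ...}
    constraints: {"libfoo": ("1.2", "1.9"), ...}; either bound may be None
    """
    problems = []
    for name, (minimum, maximum) in sorted(constraints.items()):
        have = installed.get(name)
        if have is None:
            problems.append(f"{name}: missing")
        elif minimum is not None and parse_version(have) < parse_version(minimum):
            problems.append(f"{name}: {have} is older than {minimum}")
        elif maximum is not None and parse_version(have) > parse_version(maximum):
            problems.append(f"{name}: {have} is newer than {maximum}")
    return problems
```

Running a check like this before the image is frozen is one way to catch a dependency that drifted past its declared maximum.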

Packaging and image preparation _aren't_ tasks which can be abstracted away entirely. It's this point which the containers craze founders on the reefs of reality. Yes, packaging software properly is a pain. But not packaging it properly is an even bigger pain.

I think that the "real" problem here is that Python's package management and apt/yum don't really interface well. I've built .debs for Python packages before, and it was a huge pain in the ass, even with the scripts and automations that I was able to find for it.

It's 'simple' for me to build a virtualenv in a directory with `pip install -r requirements.txt` in my source repo, but everything I've read about making those virtualenvs portable (even moving them between directories on the same server you built them on) is that it is a path fraught with peril.

The 'real' problem is that you shouldn't have to install dependencies in OS-level locations for an application-level product.

In other words, the app should be able to bundle dependencies without having to use a crazy opaque container system, and those dependencies should be easily auditable.

This is the case for Java, where dependencies are 1) bundled with the application, 2) declared explicitly, 3) signed, 4) centrally managed with maven repository software.

In Python, you get similar things with the exception of "bundled with the application" and IIRC "signed" only happens when uploading to the Python Package Index.

I think you can with the latest version of Pip (7)


He's talking about exactly what I am currently doing:

> It's now feasible to build a new virtualenv on every deploy. The virtualenv can be considered immutable. That is, once it is created, it will never be modified. No more concerns about legacy cruft causing issues with the build.

> This also opens the door to saving previous builds for quick rollbacks in the event of a bad deploy. Rolling back could be as simple as moving a symlink and reloading the Python services.

This is exactly what I do now: a new virtualenv from scratch on each deploy in the same directory with all other build artifacts (so that each deploy is in a self-contained, timestamped directory that is swapped out with a 'current' symlink). I just bite the bullet on the additional time it takes to deploy.
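
The symlink swap described here can be sketched as follows. This is an illustration under assumptions, not the poster's actual deploy script: release directories are assumed to have timestamped names that sort chronologically, and the `current` link is assumed to live outside the releases directory.

```python
import os


def activate(releases_dir, release_name, current_link):
    """Point current_link at releases_dir/release_name without a window where it is missing."""
    target = os.path.join(releases_dir, release_name)
    if not os.path.isdir(target):
        raise FileNotFoundError(target)
    tmp_link = current_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target, tmp_link)
    # Renaming over the old symlink replaces it in a single step on POSIX systems.
    os.replace(tmp_link, current_link)


def rollback(releases_dir, current_link):
    """Switch to the release that sorts just before the active one and return its name."""
    releases = sorted(
        name for name in os.listdir(releases_dir)
        if os.path.isdir(os.path.join(releases_dir, name))
    )
    active = os.path.basename(os.path.realpath(current_link))
    index = releases.index(active)
    if index == 0:
        raise RuntimeError("no earlier release to roll back to")
    activate(releases_dir, releases[index - 1], current_link)
    return releases[index - 1]
```

After either call the Python services still need a reload to pick up the new directory, as the quoted post says.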

The part of this blog post that affects me is that upgrading to pip 7 would speed up my deploy times.

This part seems interesting:

> Another possibility is building your wheels in a central location prior to deployment. As long as your build server (or container) matches the OS and architecture of the application servers, you can build the wheels once and distribute them as a tarball (see Armin Ronacher's platter project) or using your own PyPI server. In this scenario, you are guaranteed the packages are an exact match across all your servers. You can also avoid installing build tools and development headers on all your servers because the wheels are pre-compiled.

I've looked at platter a bit, but I haven't really digested what will be needed to migrate to that point, and he doesn't really expand on it.

I have a few contentions with the study.

First, if you look at their own analysis the number drops from 30% to 23% when limited to only the latest tagged images in the official repository. I'd expect to see a higher rate of vulnerabilities in previous versions...that's why you rebuild. Find me a linux admin that would accept their OS is vulnerable if you're citing old, unpatched versions.

Second, they seem to virtually _all_ be package vulnerabilities. These would, ostensibly, reach parity with whatever the target distro is by simply updating packages on a rebuild.

Finally, I think one would be hard pressed to lay any vulnerabilities traced to updated, current packages at the feet of docker. That fault would seem to lie squarely with distro package maintainers.

So, two simple rules would seem to bring the security of container deployment in line with standard bare metal deployment (by the metrics applied in this research):

1. Don't use old shit

2. Rebuild your selected docker container to ensure packages are up to date. Why? See rule #1.

I thought the point of using docker containers was that they were pre-packaged apps. Not so you had to continually rebuild the container with your own updated packages. Doesn't having to rebuild the container to fix security vulns defeat one of the major reasons to have versioned docker images released for use? You could very well end up breaking dependencies.

You're sort of combining two things: 1) Docker makes it super simple for anyone to package software and run it 2) Dockerhub makes it simple to share software that you have packaged with other people.

Personally, my biggest gripe with Dockerhub is that a Dockerfile should be required in order to upload to the hub, and it should show the Dockerfile that produced each version. The fact that people can create fundamentally unreproducible binaries is nasty (there's also the issue of not specifying versions in the apt/yum steps used in the Dockerfiles, but that's just a general problem with the way package management software is designed).
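
For the unpinned-version issue, a rough lint along these lines could flag the offending steps. It is a sketch that only understands simple `RUN apt-get install` lines with `name=version` pins; the function name is made up and it is not a full Dockerfile parser:

```python
import re


def unpinned_apt_packages(dockerfile_text):
    """Return package names from `apt-get install` steps that lack `=version`."""
    # Join backslash-continued lines so each RUN step is one logical line.
    text = re.sub(r"\\\s*\n", " ", dockerfile_text)
    unpinned = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped.upper().startswith("RUN"):
            continue
        # A RUN step may chain several shell commands with && or ;
        for command in re.split(r"&&|;", stripped[3:]):
            words = command.split()
            if "apt-get" not in words or "install" not in words:
                continue
            for word in words[words.index("install") + 1:]:
                if word.startswith("-"):
                    continue  # an option such as -y
                if "=" not in word:
                    unpinned.append(word)
    return unpinned
```

Pinning does not make a build reproducible on its own, since old versions can disappear from the mirrors, but it at least makes the intent visible in the Dockerfile.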

None of that's a problem with Docker itself though.

Ahh got it. I have only really used lxc, so not super familiar with docker other than it being a container tech. Thanks for the explanation :D

I would say the primary benefit of docker is that you can build once, run the same everywhere.

E.g. you have a consistent, reproducible application environment which _should_ be vetted through a gauntlet of continuous integration, testing, etc. that once created will run identically on any host running docker.

If you have a "trusted source" to do all the grunt work for you, fine. But docker's promise isn't guaranteeing a trusted source. It's providing a consistent, invariant application target from developer laptop -> production host.

Just like Java. We've seen how this one ends :)

Well, sort of. Java was never really like Docker, and in fact always struggled architecturally to provide a good container abstraction for applications. The "servlet container" idea was (and is) a failure. Java never had the equivalent of the Docker daemon, and it only (relatively) recently got something like Dockerhub via Maven--and Maven repos aren't integrated with the JVM or the (non-existent) Java daemon.

Great points!

Just to clarify, our article was not meant to blame any particular party, but rather to provide awareness of the security vulnerabilities that exist even in the latest official images on Docker Hub.

As you point out, this study specifically focused on the OS package vulnerabilities -- including application-level packages and/or other types of vulnerabilities would increase the percentage of vulnerable images.

As we also mention in the article, rebuilding is a great way to solve some of the problems. However, rebuilding comes at a cost -- the overhead of redeploying the container infrastructure, managing audit trails, potential instability introduced to developer applications, etc. These need to be balanced against the benefits of rebuilding constantly.

My primary contention with your post is that docker doesn't provide a package-manager-like way of finding out whether or not you're running older images. Everyone has their own homegrown way of doing it.

1. Don't use old shit

2. Docker should provide a way to tell you you're not running the latest tagged image so you stop running old shit

3. Don't use base images whose maintainers can't be bothered to rebuild when security updates hit

Well, all docker containers are hashed and can be version tagged. If you do a pull and run the 'latest' tag, it'll always be the HEAD of the commit hash.

This is assuming you want to trust some 3rd party with the maintenance and security of your production environment.

Docker containers are, usually, just operating systems running a single logical application service. I don't think Docker promises a free Sys Admin. ;)

My complaint is primarily that there's no mechanism to let you know "hey, there's an update to this" in the same way as apt, yum, and other systems do.

It's not about trusting a 3rd party with the maintenance and security of your production environment as much as it is "Docker should provide a way to let the people handling the maintenance of your production environment to know shit may be happening". Rebuilding from the 'latest' tag is great. If you know you have to rebuild, and that there's an update available.
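
The kind of "there's an update to this" check being asked for could look roughly like the sketch below. It assumes you have already collected, by whatever means, the image digest each running container was started from and the digest the registry currently serves for each tag; gathering those is deliberately left out, and all names are hypothetical:

```python
def stale_containers(running, latest):
    """List containers whose image digest differs from the newest digest for their tag.

    running: {container_name: (repo_tag, digest_in_use)}
    latest:  {repo_tag: newest_digest}
    """
    stale = []
    for container, (tag, digest) in sorted(running.items()):
        newest = latest.get(tag)
        # Unknown tags are skipped rather than reported, to avoid false alarms.
        if newest is not None and newest != digest:
            stale.append(container)
    return stale
```

This is the homegrown approach the comment complains about; the point is that nothing built in does the comparison for you.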

Does AWS do this with their AMIs? Everything you listed can be applied in virtually the same way with VM images, and there are community-based AMIs with all sorts of vulnerabilities and non-updated code; people just know not to use them, or build their own.

Well, no. Everything I listed can be applied in virtually the same way to openstack images or AMIs or whatever... except that the intended use case of those includes regularly updating packages, which docker does not.

So if you rebuild your docker containers every time you deploy, and you deploy daily, security updates should happen on a daily basis. Correct?


And if you have a continuous integration environment building and validating artifacts on every developer commit with a regular, vetted release cycle that catches any regression bugs...

Well, now you're on the right track.

This isn't great, but it's not quite as terrible as it's being made out to be for the official packages. The Mercurial bug is only relevant if you're using Mercurial with user-supplied input on your production servers. Unlikely if you're not BitBucket. http://chargen.matasano.com/chargen/2015/3/17/this-new-vulne... is a good read on the subject.

The libtasn1 bug seems to be only relevant if you're using GnuTLS. Again, not great but not the most widely used library either.

Cutting those two out cuts the number of vulnerable images in half and there's probably a few more rarely used programs with security issues further down the tail. Again, this isn't great, but it's not quite as terrible as the authors are making it to be.

The user-supplied packages, on the other hand, seem to be quite a bit worse.

The take-home message is that you need to have a strategy for deploying updates. It's true that not all bugs are exploitable but there's a long history of people being catastrophically wrong in that kind of conclusion.

More importantly, however, you want updates to be a routine frequent thing so you don't train people to ignore them or let the backlog build up to the point where the size itself becomes a deterrent to updating because too many things will change. If you install updates regularly, you keep changes smaller and keep the focus on the tight reaction time which you'll need for serious vulnerabilities.

One of the authors here. I'd like to second this take-home message. The core of our work was to bring to the forefront that package management using containers is important and we need to have sound operations management/security practices in place.

We think Docker, and containers in general, is a great way to deploy software -- the speed and agility is so much better than traditional approaches. This also means that we should have sound security practices in place from the very beginning, or else we could easily end up with insecure images floating around in several places (dev laptops to public cloud).

> the speed and agility is so much better than traditional approaches

Complete agreement here – Docker's strong points are exactly the things which make patch deployment easier than in legacy environments. Hopefully we'll start seeing orchestration tools which really streamline the rebuild/partial deploy/monitor error rates/deploy more cycle when updates are available.

It would be interesting to see if Docker could develop an integrated security scanner, checking the package lists of each image, and email out consumers of those images when security vulnerabilities come out.
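
A toy version of such a scanner, to show the shape of the matching step. Everything here is an assumption for illustration: the data structures, the exact-version matching (a real scanner would compare version ranges), and the idea that subscribers are plain email addresses. It only builds the list of notifications and sends nothing:

```python
def notifications(image_packages, advisories, subscribers):
    """Work out who should be told about which advisories.

    image_packages: {image: {package: version}}
    advisories:     [(package, vulnerable_version, advisory_id), ...]
    subscribers:    {image: [email, ...]}
    Returns a list of (email, image, [advisory_id, ...]) tuples.
    """
    notes = []
    for image, packages in sorted(image_packages.items()):
        hits = sorted(
            advisory_id
            for package, bad_version, advisory_id in advisories
            if packages.get(package) == bad_version
        )
        if not hits:
            continue
        for email in subscribers.get(image, []):
            notes.append((email, image, hits))
    return notes
```

The central-place advantage mentioned in the next comment shows up here: the package lists are known per image, so one advisory feed can be matched against all of them at once.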

If Docker Hub is a monetization strategy, I think a lot of people might be willing to pay for that -- though it's weird, because that's a problem golden images themselves created, so maybe it's not fair -- and the world would be better if security info was always free. Tracking security updates is hard if you use a lot of deps, anyway, this has the benefit of being a central place that can check these things. Most developers shipping software definitely do not track security history for most of their components, and this is a huge opportunity.

Problem gets harder when people get things from outside package managers and vendor stuff though -- which does not help.

I owe Red Hat for a large part of the way I think about things, and I do think the world would be better if package managers were used more extensively for exactly the reason of tracking vendor security. I also realize not everybody can package everything and do like to vendor deps (or similarly use language specific package managers often installed in arbitrary locations) or put them together however (random internet tarballs), and this ironically is why things like Docker also exist too.

The immutable systems movement is good, but something to clean up security practices would be a huge plus to avoid the comparisons to regression back to "golden images". Using random base images vs distro base images makes it worse, but using stale distro images is itself a thing.

A bit overstated. Their definition of security-vulnerable == has a package which is vulnerable.

However, merely having some packages with vulnerabilities may not be enough. E.g. you have a security issue in the package manager (apt), but you never use it after building the image. Or even Shellshock is a non-issue if you don't use CGI scripts and don't have ssh access.

In Virtual Machines this problem also exists. I guess it is more about how often you update your software than Docker itself.

Tell me about it!

Gotta love those security experts that your company hires when they say to you "your app has a security issue right here" and I say "alright then prove it, hack it, let's see if there really is a security issue" and they can't do it.

If I don't want to worry about deployment, there's Heroku. If I don't want to worry about testing, there's Circle CI. If I don't want to worry about scaling, there's AWS EC2. If I don't want to worry about security, there's... nothing. Because it's not a real product. At least not real in the way databases, deployment, testing and scaling are.

So when people say "programmers don't care about security" I honestly don't understand what they mean since I've never seen a secure app. It's like there's this mob of believers that want to convince you security is the salvation. OK, teach me by showing. Show me a bunch of secure apps and we'll learn from it. But those don't exist, so no one ever learns, but that doesn't keep "security experts" from blaming programmers building real things in the real world for not caring about their imaginary friend.

I'll believe security experts care when they create a service and sell it for money to people like me.

Security Guy: Hey, bank, it seems like your vault is accessible via some old sewage tunnels.

Bank: So what? Nobody knows about those tunnels.

Security Guy: But someone who finds them, like me, but with less morals, could rob you.

Bank: Prove it. Rob the vault.

Security Guy: ..... ?

Finding a vulnerability isn't the same thing as exploiting one, and a lack of exploitation doesn't imply a lack of vulnerability. You also have to consider that a small portion of vulnerabilities are actually exploitable, but it's a very hard problem to find out which ones are and which ones aren't. Exploiting a single vulnerability is typically harder, in fact, than patching a dozen of them (for example, you can easily start using a secure version of strcpy(), but exploiting it requires an attacker to smash the stack or ROP their way into full execution).

The bottom line is that you're not only naive if you believe what you just said, but you're doing a huge disservice to anybody who uses any code that you may write.

Security Guy: Hey, bank, it seems like your vault is accessible via some old sewage tunnels. But fret not: I, as a security expert who goes around making sure places such as banks and schools can't be accessed by entrances other than the designated ones, have a solution for you. Just put your safe inside this chroot building. What this does is make sure only sewage goes through sewage pipes (not people). So all you have to do is purchase this solution and we will guarantee that no one will come into your bank through sewage pipes.

Why does that never happen? Why are security experts always consultants and they never have a product to sell?

Naive is a person that thinks just because they are a security expert, programmers will care. No amount of shaming will change that. If you're a security expert your job is to make this so easy that I almost don't think about it. Like I almost don't think about databases, deployment, testing, scaling. Getting on your high horse and begging programmers changes nothing.

Just look at RSpec. All of a sudden everyone wants to write tests because it's fun and easy and looks sort of like English. Now we don't have to care much about tests, we just write them and RSpec runs them, collects and reports errors, formats them nicely, tells me the path and the line number where each error occurred, etc. Now imagine you're a "testing expert" and there's no RSpec and you keep yelling at programmers to change their ways, to write and maintain tests, and so on. No one would do it (like few did before the recent craze). So please, learn from that lesson, round up some peers, and contribute to your damn field by letting me forget about it.

So a structural engineer shouldn't worry about the structural integrity of his buildings, only that they stand up under ideal conditions? A car manufacturer shouldn't worry about crash-testing or other safety concerns, only that their car moves?


Like it or not, we're stuck on Von Neumann architecture, and as a result, data can be treated as code and vice-versa. The consequence of this is that, under certain circumstances, data can be carefully crafted to act as code, and can be executed in an unforeseen context. As a software engineer, it is your job to take precautions when developing software. Precautions that prevent this execution. Security people do the best they can to make it easy to develop safely, but all of that is useless if the developers ignore it. And, because security vulnerabilities are a manipulation of context-and-program-specific control flow, there's not a way to encapsulate all security measures in a way that is transparent. It's just not possible. Only developers know the specifics of their software, and only developers can protect certain edge cases. If you assert otherwise, you have a fundamental misunderstanding of the systems that you work with, and you need to re-evaluate your education before continuing to work in the industry (assuming you do). This isn't an opinion. This is a fact.

Lastly, us "security experts" do contribute to our field. Security is one of the hard problems in computer science - far harder than whatever you're doing that lets you "not think about databases, deployment, testing, scaling" - and there's a lot of solutions that have been engineered to deal with software that has been created by people like you. There's static code analysis tools, which can detect bugs in code before it is even compiled. There's memory analyzers that can detect dozens of different classes of memory-related bugs by just watching your software run. There's memory allocators and garbage collectors that can prevent issues with use-after-free and other heap-related exploitation bugs at run-time. There's data execution prevention and buffer execution prevention that, at run-time, help prevent code from being executed from data pages. There's EMET and other real-time exploit detection tools that exist outside of your software and can still prevent exploitation. That's not even an exhaustive list. There are literally hundreds of tools out there that make finding and fixing security bugs easy, but those tools can't patch your code for you. That's why there are consultants, code auditors, and penetration-testers that can give advice on how to fix bugs, find bugs where automated tools fail, and even coach developers into writing more secure code; because having smart, security aware developers is one of the major ways to defend against security bugs.

> As a software engineer, it is your job to take precautions when developing software.

On other people's software as well? Why was it not PostgreSQL's (random example) job to make sure their software rejects invalid input? All it would take is for them to use a typed language (given that the type system in Haskell, for instance, is enough to prevent SQL injection). So tell me, when does it become my job to patch whatever database code I choose because no database ever has concerned itself (it seems) with solving this for everyone else in one fell swoop (so we didn't have to think about it anymore for all these decades of dealing with SQL injection in every language that implements a database driver)?

Before the first million programmers had to write the same damn code to clean the input to give to these databases, the database coder should have fixed it themselves. But you weren't there to chastise him so we didn't get it.

Maybe the "mere mortal" programmers like me would be more excited about security if the industry standard software was also secure (we would want to mimic it, and keep it all secure, and not introduce security problems). No security expert has fixed the SQL injection problem where it should be fixed, but they do charge by the hour to fix it in every company that uses a database.

That's a horrible example. SQL injection IS the fault of the programmer, not SQL itself. SQL injection is achieved by adding extra code to a query, which is only possible when a programmer allows inputs that can contain code to be concatenated directly into a query. Here's an example:

    query = "SELECT * FROM USERS WHERE NAME = '" + userinput + "'";
This input can be given:

    ' OR 1=1--
To make the application show the entire list of users. If this programmer used parameter binding, which is supported by PostgreSQL, MySQL, SQLite, and any other SQL platform you can think of, then SQL injection wouldn't be an issue. They could simply do something like this:

    statement = prepare(query, "USER", userinput)
Just because you don't know the right way to do something securely, doesn't mean it's not there. But you're right, no security expert fixed this problem. It was fixed by the library designers of these SQL platforms. Security experts just charge you by the hour to teach you that you're unfamiliar with the existing security mechanisms inside of these platforms.
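To make the difference concrete, here is a minimal runnable sketch using Python's built-in sqlite3 module. The table, the column, and the two sample users are made up for illustration; only the query shape and the hostile input come from the comment above.

```python
import sqlite3

# In-memory database with a hypothetical users table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

userinput = "' OR 1=1--"

# Unsafe: the input is spliced into the SQL text, so its quote closes the
# string literal and the OR clause becomes part of the query.
unsafe_query = "SELECT name FROM users WHERE name = '" + userinput + "'"
unsafe_rows = conn.execute(unsafe_query).fetchall()

# Safe: the input is bound as a value and is never parsed as SQL.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (userinput,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated query matches every row, while the bound query looks for a user literally named `' OR 1=1--` and finds none.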

Also, just to be pedantic, I'll point out that a type system wouldn't change how SQL injects currently work, lol, no clue how you think that's the case, but I wouldn't put it past you at this point.

I've programmed for a while now. I think I've heard vaguely of parameterized statements. :)

Just to be pedantic, I'll point out that maybe your C and C++ "type" system wouldn't change how SQL injects currently work, lol, but the one I use can avoid not just SQL injection but XSS attacks: http://www.yesodweb.com/page/about

I'll say it again, you're wasting your time staying in that small rickety photocopy room called C/C++. But I wouldn't put it past you at this point. Whatever that means, hahah.

I'm sorry, I thought we were talking about security? Are you leaking the other thread into here just so you can feel like you won both, instead of neither?

And I never said anything about any C/C++ type system doing anything? But okay.

Back to the topic: if you've heard of them, why did you insist that SQL is inherently insecure? Did you forget they existed, or did you just think I wouldn't notice? Are you that cocky?

I really hope your employer one day recognizes your incompetence and fires you, because the software world is plagued with enough bugs without people like you purposely and gladly laying out a red carpet for them to walk in on. I can't continue to argue with what is either a relentless geyser of misinformation or a brilliant troll, so I'm done. Maybe one day you'll come to your senses, but I doubt it.

The way a strong type system solves this SQL injection problem (despite your saying it's impossible and ignoring my having shown you wrong) is by automatically escaping arguments before binding them to parameters.

Well guess what, you don't need pre-compiled statements to benefit from this feature - all you need is the hoisting aspect of it. In other words, if SQL drivers did not offer the unsafe function exec_query that takes the whole query as a string and returns a result, and instead they only exposed a hoisted version of that function that takes a list of arguments and a placeholder query as a string...

  exec_query ["john", 12] "SELECT ... WHERE... = $1 AND ... = $2"
Then there is no SQL injection problem, as the SQL database driver would always automatically escape the arguments before binding the parameters.

So if only SQL database drivers did not offer the raw-string exec_query, but instead forced the user to provide the whole query string in one go with placeholders, plus a separate list of arguments, then the driver would be able to enforce security at the proper software layer, rather than in every program that interacts with a database.
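A rough sketch of that "placeholders only" interface, again with Python's sqlite3. The name exec_query and its argument order mirror the example above; the people table is hypothetical, and sqlite3 spells placeholders as ? rather than $1. With sqlite3 the arguments are bound as values rather than escaped into the query text. Nothing in this sketch stops a caller from building the placeholder string by concatenation; ruling that out is the part that would need type-level enforcement.

```python
import sqlite3
from typing import Any, Sequence


def exec_query(conn: sqlite3.Connection, args: Sequence[Any], query: str) -> list:
    """Run `query`, binding `args` to its ? placeholders in order."""
    return conn.execute(query, tuple(args)).fetchall()


# Hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
exec_query(conn, ["john", 12], "INSERT INTO people VALUES (?, ?)")

select = "SELECT name FROM people WHERE name = ? AND age = ?"
found = exec_query(conn, ["john", 12], select)
hostile = exec_query(conn, ["' OR 1=1--", 12], select)
```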

It might be interesting to have a gate on publishing images that explicitly runs tests for major known vulnerabilities. You could at minimum flag images as "known vulnerable", or reject publishing attempts.

The flag might make sense on a new vulnerability, and it could be applied automatically. Imagine [Tag: Heartbleed - Untested] when the vulnerability happened, then as the automated process rolls through the images [Tag: Heartbleed - vulnerable] [Tag: Heartbleed - no vulnerability detected]. Future images are required to pass first.
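One way the tagging and gating could look, as a minimal sketch. Every name here (ScanStatus, tag, may_publish) is hypothetical and not part of Docker Hub; the tag text follows the examples above.

```python
from enum import Enum


class ScanStatus(Enum):
    UNTESTED = "Untested"
    VULNERABLE = "vulnerable"
    NOT_DETECTED = "no vulnerability detected"


def tag(vuln: str, status: ScanStatus) -> str:
    """Render the label shown next to an image."""
    return f"[Tag: {vuln} - {status.value}]"


def may_publish(results: dict[str, ScanStatus], required: list[str]) -> bool:
    """Allow publishing only if every required check ran and found nothing."""
    return all(
        results.get(vuln, ScanStatus.UNTESTED) is ScanStatus.NOT_DETECTED
        for vuln in required
    )


required = ["Heartbleed"]
new_image = {}  # nothing scanned yet, so Heartbleed counts as untested
scanned_image = {"Heartbleed": ScanStatus.NOT_DETECTED}
```

Treating a missing result as untested means a newly announced vulnerability blocks publishing until the scan has actually run.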

We have to be careful with widely distributed images.

Based on their definition of vulnerable, the Ubuntu 12.04LTS installation image is also vulnerable. I think this is only news to anyone that hasn't set up a fresh install of Windows. I remember some presentation from the Honeynet project circa 1999 about how a new win98 installation, without updating service packs, took less than N (N<24) hours until compromise. Still, I guess it is worth reminding people to not trust official containers without first applying security updates, and maybe never trusting unofficial containers, depending on your project.

Great article. We'll need a better integration of security tracking and handling in our containerized infrastructure soon.

You have to be a little bit careful when it comes to version numbers and matching them to security issues. Most Linux distributions, for example, backport security patches to older releases without changing the upstream version number.

E.g. Ubuntu 14.04LTS comes with Apache 2.4.7-1ubuntu4.4, which one might parse as 2.4.7, a version with multiple security issues.

The article references distribution-specific vulnerability ratings, so I assume they also matched those versions correctly.

Study co-author here. We did observe that it's essential to be careful about comparing package version numbers on a per-distro basis, and there are some tricky cases such as the one you pointed out, and rpm epoch numbers as another example. I believe we handled them correctly in the study.
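To illustrate why the raw upstream number is not enough, here is a small sketch that splits a Debian/Ubuntu-style version string into its epoch, upstream version, and distribution revision. This is not the study's code; the function name is made up, and it only splits the string without implementing the full comparison rules.

```python
def split_debian_version(version: str) -> tuple[int, str, str]:
    """Split '[epoch:]upstream[-revision]' into (epoch, upstream, revision)."""
    # The epoch, if present, is everything before the first colon.
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    # The revision, if present, is everything after the last hyphen.
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


apache = split_debian_version("2.4.7-1ubuntu4.4")
with_epoch = split_debian_version("1:2.4.7-1")
```

A scanner that keeps only the upstream part sees 2.4.7 in both cases and loses the revision (here 1ubuntu4.4), which is where backported fixes show up.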

An issue I have with both official docker hub images and dockerfiles provided by software developers is that almost always they run their software inside the container as root.

I wonder what would happen if you attempted the same study with AWS AMIs on the official images. Get the latest versions, don't update your distro, and see how many vulnerabilities you get. How often does AWS really rebuild their official AMIs?

Ultimately keeping your OS completely up to date is on you, not Docker, not Amazon, you. VMs suffer from the exact same problems as Docker containers.

Edit: Also, security issues with using community AMIs are already well known, should be no surprise the same applies to Docker community images.

They should do the same study for VM images on Vagrant Cloud (aka Hashicorp Atlas), or any other repository of binary software/images built by untrusted third parties.

I thought it was obvious that public images on Docker Hub were to be used for experimentation only--even in that case I only use the "official" Docker images in the library namespace. Anyone using Docker for serious purposes should build their own or at least vet the pre-built images.

Docker Hub as a build service doesn't make it very easy to update older images; you can set manual triggers to rebuild the current image if the FROM container changes, but that's not automatic. Other dependencies are not very easy either, as you only get one FROM; everything else probably comes from git repos, packages, language packaging tools or tarfiles, which obviously need checking for updates.

docker hub: the petri dish of choice for malware

Docker's biggest strength is also its biggest weakness, IMO. They made lots of changes to the default capabilities (Linux capabilities) to improve security. But the underlying problem of fixing old bugs in images remains, and on top of that an image's contents are often a disorganized mess, coming straight from the developer as a black box (more or less) into production environments (yeah, when has that ever been a good idea?).

Docker IMO creates a "never touch a running system" attitude. The "running system" in this case is the Docker image, which nobody dares touch after the developer has left the company (or the developers themselves have no idea anymore what it contained 3 weeks later).

Also, setting up containers in a secure way is even more work than not using Docker in the first place (ever had to look seriously into SELinux? Not something you do casually on the side, as it's massively complex).

So the justification that "by using docker we save time on deployment" is a farce. I guess it creates new jobs though for container specialists.

to paraphrase Theo de Raadt:

“You are absolutely deluded, if not stupid, if you think that a worldwide collection of software engineers who can’t write operating systems or applications without security holes, can then turn around and suddenly write virtualization layers without security holes.”

EDIT: is it still possible in Docker/LXD to access /proc/sys/kernel/panic or /sys/class/thermal/cooling_device0/cur_state ? And how about consuming all the entropy of the host via /dev/random ?

I'm confused.

Looking at the top vulnerability CVE-2014-9462 in mercurial.

It affects mercurial clients that access crafted repositories as far as I understand.


Even if I use mercurial in my Docker image to get my app and not prepackage it (which is what I do), and I know this is about public images, how is this a "high" vulnerability? I don't deny it's one; I would just like to learn why it is classified high if e.g. I use Docker for my HAProxy.

As a workaround, update/rebuild your containers more often and deploy more often.

So, what? Everything is vulnerable.. You're not restricted by official images, just create your custom image that is not vulnerable ;)
