Please do not blame the technology. This problem existed long before containers and will exist long after they are gone. This is people trusting unknown anonymous third parties to build things that will run in their datacenter.
> If you didn't build the container..
> If you didn't build the package on Debian..
> If you didn't verify the source code when compiling by hand..
If there is a commercially supported version of Debian, then the same would apply.
If I pull in RPMs, containers, or VM images built by Joe Random, I am legally on my own. My customers will be furious when they find out I have done this, and the court will say that I did not make a reasonable effort to protect my customers.
No. Read the license terms. For every Linux distro, there is a clear statement that the software is provided as is and that the distributor is in no way responsible for whatever happens with it. Very standard. So there is absolutely no legal standing and therefore no obligation.
That doesn’t exist with containers pulled from joevandyk/postgresql.
No, I don't believe that's the case.
> Linux distro vendors provide a contractual relationship with their customer base that provide SLA's around patching security defects and bugs.
I don't think many - if any - GNU/Linux distro vendors provide anything like that.
RHEL may - it's been a while since I've read a RH contract - but most distributions, as noted by parent, make it quite clear in the licence agreement that everything is provided as is and is sold without any warranty or assurance of suitability etc.
> They also enforce policies around uptake of new third party code.
Is third party code here the same as 'upstream' in the first take? 99% of most distributions' code is 'third party' or 'upstream' in the sense that it comes from people other than the distribution maintainers.
> They also do extensive patching of all of their packages to mitigate the vulnerabilities that upstream providers do not patch.
I know Debian does this, and I trust them with the process. I'm not a big fan of RedHat, but I also know they have an excellent reputation on this front.
It doesn't change the fact that licences clearly put responsibility on the user not the distributor.
For CentOS (or any other open source software) you may have that confidence but you have no contract :)
Now, do Red Hat/Debian package maintainers do detailed security reviews on all the software they distribute? I don't know, but the odds say it's not likely, as they don't employ (to the best of my knowledge) the number of code review professionals that would be required to do that.
And of course as soon as you venture off into other repos (npm, rubygems, CPAN, nuget etc) you're entirely on your own.
For sure, things like npm, gems, cpan, pear, pip, etc... are basically back to square one with Joe Random. Each of those things can be pulled into a code repo, built internally and turned into RPM packages. I agree that the effort to code diff review these things is quite large. It is likely still a smaller effort than rewriting all of this code from scratch.
That said, even if it is less effort, it seems extremely unlikely that anyone is doing actual code reviews on the software being packaged up into all the Linux repos out there. Even automated static analysis over that volume of code (as error ridden as that would be) just isn't practical.
That's not to say they're not more trusted than npm et al: the developer can't push directly to the repo, so an attacker's life is more complex.
Although that said it does introduce a new possibility, that of the malicious/compromised package maintainer...
To date, I have yet to see a software contract that absorbs any legal culpability. Not even high 3-comma annual renewals. The way culpability is usually handled is a clause demanding information security and/or data privacy insurance in client-vendor Master Services Agreements. If your experience with reading business contracts is different, and you've seen actual absorption of legal risk, then please tell some war stories of the contexts, as I'm always up for learning how others do business.
I am referring to after you have been breached, your data has been lost and your CEO and CFO are standing before the judge. The judge will make some punitive decisions based on what effort you can show you made to protect your customers.
If your devs are grabbing every shiny gidget widget from Joe Random and you did not make attempts, as a company, to protect your investors and uphold your fiduciary responsibilities, then the hammer will come down much harder.
This doesn't happen often, and you more commonly see lower-level line staff or managers standing in court because the high-level executives simply point to the written company policies their legal team advised be put in place that forbid such wanton behavior. An indictment, not to speak of a prosecution, then has to clear the far higher bar of showing that such high-level executives deliberately, consciously structured incentives such that meeting such policies was outright impossible.
Issuing a policy that demands any such conflicts be raised immediately to management neatly short-circuits such prosecution strategies. Unless the executives are particularly careless or brazen, it is worth more to the prosecution to go after lower-level staff.
I believe that it helps if legal precedent can be set such that management is held more accountable for behavioral structuring through incentives and selective policy enforcement.
Also, it's sort of weird how often people conflate these two things. There's this idea that home-rolling is naturally safer, and it's simply not true.
Everyone doing anything with software is relying on layers someone else built, and we should keep doing that. Layers I handle myself are layers that I know aren't malicious, but that doesn't mean they're secure. The risk of malice doesn't just trade against convenience, but against the risk of error. Using somebody else's programming language, compiler, crypto, and so on doesn't just save time, it avoids the inevitable disasters of having amateurs do those things.
We live in a world where top secret documents are regularly leaked by people accidentally making S3 buckets public. I'm not at all convinced that vulnerable containers are a bigger risk than what the same people would have put up without containers.
This argument tires me. Every time some smug developer asks me if I have personally vetted all of gcc, with the implicit understanding that if I haven't we might as well run some pseudonymous binaries off of docker hub, I extend the same offer to them: Get a piece of malware inside gcc and I will gladly donate a month's pay to a charity of choice.
Sometimes I have to follow through on the argument by asking whether they will do the same if I get malware on docker hub (or npm or whatever), but the discussion is mostly over by then. Suffice to say, so far nobody has taken me up on it.
The point is, that there's a world of difference between some random guy on github and institutions such as Red Hat or Debian or the Linux kernel itself. Popular packages with well functioning maintainers on Debian will be absolutely fine, but you probably shouldn't run some really obscure package just because some "helpful" guy on Stack Overflow pointed to it, and you certainly shouldn't base your production on some unheard of distribution just because the new hire absolutely swears by it.
Absolutely. There are some famously settled issues - don't write your own crypto, you'll screw it up, do write your own user tracking, third parties will inevitably farm data - but generally there's a decision to be made. And it's not the same answer for everybody; there's a reason Google and Facebook do in-house authentication services, which everyone else borrows.
I've seen the "containers let clueless people go live" claim before, but I'm not really convinced. Containerization offers most of its returns when we talk about scaling, rapid deployment, and multiple applications. If you just want to put up an insecure REST service with no authentication, it seems at least as easy to coax some .NET or Java servlet horror online without any containerization at all.
The examples in the article of containerized learning environments are a bit of a different case, granted. A space specifically designed to let strangers spin up new instances and run untrusted code would usually take a fair bit of thought, but containers have made it much easier to do without any foresight.
At a certain point it does come down to trust. From the position of a potential attacker, you can't just upload a randomly built thing to the official CentOS or Debian repositories and expect it to be made available to the rubes.
Very different than people downloading and trusting random Docker images.
I'd say there is a difference of using official Docker images (from the software vendor) vs images from a random person.
Official images exist for most popular packages, under a separate namespace and usually have checksums published etc.
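For what it's worth, checking a download against a published checksum is only a few lines. A minimal sketch in Python, assuming the publisher gives you a hex SHA-256 string for the artifact; the function names `sha256_of_file` and `matches_published_checksum` are made up for illustration, and this only proves the file matches what was published, not that the publisher is trustworthy:

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large images do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """Return True only if the file's digest equals the published checksum.

    `published_hex` is assumed to be the hex string copied from the
    publisher's checksum file; whitespace and letter case are normalized.
    """
    actual = sha256_of_file(path)
    expected = published_hex.strip().lower()
    return hmac.compare_digest(actual, expected)
```

You would call `matches_published_checksum("image.tar", "<hex from the vendor's page>")` and refuse to load the file on `False`. The checksum has to come from a channel you trust more than the download itself, otherwise it verifies nothing.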
It would be pretty difficult to sneak a covert Monero miner into an officially approved mainline Debian package.
However there is a sense in which this is a problem with container tech, in that there is no container equivalent of `deb http://deb.debian.org/debian stretch main` (yet!).
This is a statement about the maturity of the ecosystem, rather than a criticism of the technology itself, as you say. But I think that it's meaningful to say that this is a problem that containers currently have, that Debian (or other Linux distro) packages don't face to the same extent.
That's what the Docker standard library is
Red Hat, Canonical, Pivotal (I work for Pivotal) all provide this kind of assurance and it's a lot of our bread and butter income to do so.
In particular, Cloud Foundry uses buildpacks, providing runtime binaries with a chain of custody from sourcecode to the buildpack uploaded to the installation.
Buildpacks make this overall problem a lot easier, actually. You don't need to track the underlying base OS image or manage the runtime binaries. The platform does it for you. But you will still be responsible for tracking more immediate dependencies (NPM, Maven etc), which is a poorly-solved problem currently.
Assuming a default Docker engine install, and no options passed as part of the run, an attacker could DoS the box most likely, and may be able to intercept traffic on the Docker bridge network (although that's not a trivial thing to pull off), but they're unlikely (absent an exploitable Linux kernel flaw) to be able to easily compromise the underlying host.
Now we are entering the everything as a service era which includes software engineering. Instead of designing a solution you download someone else's, duct tape on a few packages, tweak variables and publish it.
You can also blame the breakneck pace demanded by today's tech sector, where everything needed to be deployed yesterday in order to beat the other guy to the Silicon Valley millionaire finish line.
You can say it's a mix of impatience and laziness.
The rapid just got rapider today with easy packages, frameworks and containers. The prototypes just became "web scale" and are run way beyond their intended lifespan, just like any first system is.
And computers were once accommodated with the space they deserved in large rooms.
Now we are entering the everything as miniature. Want more RAM? Throw out the whole thing, because it can't be fiddled.
You could say it's a mixture of impatience and abject cheapness.
Development of some technologies seems to happen on some weird curve like this:
[ASCII plot; vertical axis: end-user control / flexibility / repairability; horizontal axis: power / money-making capability]
Things start as prototypes - hard to sell, hard to use, hard to tweak on user end. Over time, they become more flexible - think of e.g. PCs with interchangeable components. This is the golden era for end-users. They can fix stuff themselves, they can replace or upgrade components. The technology reaches maturity, and the only way from there is downhill - as businesses find new ways to fuck users over, both purposefully and incidentally. Make things smaller. More integrated. Sealed. Protected. The end result is the ultimate money-maker - a black box you lease and can only use as long as you're paying for the subscription.
The key is whether you see your data, or your customers' data, as a precious thing that needs care and protection.
If you do, you make the best effort to select the right software, understand how it works, deploy it correctly and securely, etc. If you don't, you just slap together something that sort of works from the easiest-to-obtain parts, and concentrate on other things.
A lot of people don't care too much about their own personal data; some of them are developers, product managers, even CEOs.
For an example of how this existed long before Linux containers:
There are third party RPM and APT package repositories that have existed for a very long time. The packages are not vetted by a company and there is no legal culpability for anything nefarious being contained within. People use these packages at their own peril and it is assumed they have mitigating controls to reduce risk.
GitHub is community contributed code and there is no enforceable legal contract between the developer and the consumer. The same thing applies. Use at your own peril and have mitigating controls (code diff reviews, static analysis, legal review, etc). This is especially true for all those projects under the MIT license.
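As a rough illustration of the "code diff review" control, here is a minimal Python sketch using the standard library's `difflib`. It assumes you keep the last version of a dependency that someone actually reviewed and want to see only what changed in the candidate update; `review_diff` and the labels are hypothetical names, and a real process would run this over whole source trees rather than single strings:

```python
import difflib


def review_diff(reviewed_source, candidate_source, name="dependency"):
    """Return unified-diff lines between reviewed and candidate source text.

    An empty list means the candidate is identical to what was reviewed.
    """
    old_lines = reviewed_source.splitlines(keepends=True)
    new_lines = candidate_source.splitlines(keepends=True)
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile=name + " (reviewed)",
            tofile=name + " (candidate)",
        )
    )
```

Printing `"".join(review_diff(old_text, new_text))` gives a reviewer just the delta to read, which is the whole point: the effort scales with what changed, not with the size of the package.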