
If you didn't build the container, then you are putting all your trust and your company's private bits in the hands of Joe Random.

Please do not blame the technology. This problem existed long before containers and will exist long after they are gone. This is people trusting unknown anonymous third parties to build things that will run in their datacenter.

Isn't that always true though? This is just one additional layer of trust. Sure, there are reasonable layers we should care about, but you're rarely, if ever, going to be doing everything and trust everything.


> If you didn't build the container..

> If you didn't build the package on Debian..

> If you didn't verify the source code when compiling by hand..


I think it is about legal culpability. If I am running CentOS in my datacenter, there is some degree of confidence that the packages were rebuilt by the CentOS team, a few members of which are Redhat employees. Redhat have an obligation to make reasonable effort to keep malicious code out of their supported software.

If there is a commercially supported version of Debian, then the same would apply.

If I pull in RPMs, containers, or VM images built by Joe Random, I am legally on my own. My customers will be furious when they find out I have done this and the court will say that I did not make reasonable effort to protect my customers.

> Redhat have an obligation to make reasonable effort to keep malicious code out of their supported software.

No. Read the license terms. For every Linux distro, there is a clear statement that the software is provided as is, and that they are in no way responsible for whatever happens with it. Very standard. So absolutely no legal standing and therefore no obligation.

There’s a social and economic understanding that Redhat works hard to keep malicious code out of their distributions.

That doesn’t exist with containers pulled from joevandyk/postgresql.

That is specific to the Linux code itself which is taken from upstream. Linux distro vendors provide a contractual relationship with their customer base that provides SLAs around patching security defects and bugs. They also enforce policies around uptake of new third party code. They also do extensive patching of all of their packages to mitigate the vulnerabilities that upstream providers do not patch. There is much more to this that would take a blog entry to explain.

> That is specific to the Linux code itself which is taken from upstream.

No, I don't believe that's the case.

> Linux distro vendors provide a contractual relationship with their customer base that provides SLAs around patching security defects and bugs.

I don't think many - if any - GNU/Linux distro vendors provide anything like that.

RHEL may - it's been a while since I've read a RH contract - but most distributions, as noted by parent, make it quite clear in the licence agreement that everything is provided as is and is sold without any warranty or assurance of suitability etc.

> They also enforce policies around uptake of new third party code.

Is third party code here the same as 'upstream' in the first quote? 99% of most distributions' code is 'third party' or 'upstream' in the sense that it comes from people other than the distribution maintainers.

> They also do extensive patching of all of their packages to mitigate the vulnerabilities that upstream providers do not patch.

I know Debian does this, and I trust them with the process. I'm not a big fan of RedHat, but I also know they have an excellent reputation on this front.

It doesn't change the fact that licences clearly put responsibility on the user not the distributor.

For commercial software, there may be some level of legal liability, but it would depend entirely on your contract, and I'd imagine if you look at most standard contracts, they disclaim all such liability.

For CentOS (or any other open source software) you may have that confidence but you have no contract :)

Now do Redhat/Debian package maintainers do detailed security reviews on all the software they distribute... I don't know but the odds would say it's not likely as they don't employ (to the best of my knowledge) the number of code review professionals that would be required to do that.

And of course as soon as you venture off into other repos (npm, rubygems, CPAN, nuget etc) you're entirely on your own.

I agree, I am riding on the backs of people using RHEL. There is a direct contractual relationship between those companies and Redhat. In my case, I am relying on the other companies having that relationship and I can still say some effort is being made to validate the supported packages. While I can not sue anyone, I can say that I am using an OS that has some degree of code validation and feature set stability.

For sure, things like npm, gems, cpan, pear, pip, etc. are basically back to square one with Joe Random. Each of those things can be pulled into a code repo, built internally and turned into RPM packages. I agree that the effort to code diff review these things is quite large. It is likely still a smaller effort than rewriting all of this code from scratch.
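For what it's worth, the "code diff review" step for a vendored package can start as small as the sketch below. It only uses Python's standard `difflib`; the function name `review_diff` and the file labels are made up for illustration, and a real review would walk a whole source tree rather than two strings.

```python
import difflib


def review_diff(reviewed_source: str, incoming_source: str, name: str = "package.py") -> str:
    """Return a unified diff between the last reviewed copy and the incoming copy.

    An empty string means the incoming source is identical to what was reviewed.
    """
    diff = difflib.unified_diff(
        reviewed_source.splitlines(keepends=True),
        incoming_source.splitlines(keepends=True),
        fromfile=f"{name} (reviewed)",
        tofile=f"{name} (incoming)",
    )
    return "".join(diff)


if __name__ == "__main__":
    # Hypothetical example: the upstream release adds one new import.
    old = "def handler(req):\n    return req\n"
    new = "import os\n\ndef handler(req):\n    return req\n"
    print(review_diff(old, new))
```

The point is only that a reviewer reads the delta against a known-reviewed copy, not the whole package again on every update.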

As to code review effort being lower than writing, sure, in most cases (although finding well hidden backdoors is likely harder than writing software).

That said, even at less effort, it seems extremely unlikely that anyone is doing actual code reviews on the software being packaged up into all the Linux repos out there. Even automated static analysis over that volume of code (as error-ridden as that would be) just isn't practical.

That's not to say they're not more trusted than npm et al, as the developer can't push directly to the repo, so an attacker's life is more complex.

Although that said it does introduce a new possibility, that of the malicious/compromised package maintainer...

> although finding well hidden backdoors is likely harder than writing software

Very likely:


Are you basing your assertions on a discussion with an attorney, or better yet, a written legal opinion, or is this your interpretation as a lay person?

To date, I have yet to see a software contract that absorbs any legal culpability. Not even high 3-comma annual renewals. The way culpability is usually handled is a clause demanding information security and/or data privacy insurance in client-vendor Master Services Agreements. If your experience with reading business contracts is different, and you've seen actual absorption of legal risk, then please tell some war stories of the contexts, as I'm always up for learning how others do business.

I am not a lawyer and this is not legal advice.

I am referring to after you have been breached, your data has been lost and your CEO and CFO are standing before the judge. The judge will make some punitive decisions based on what effort you can show you made to protect your customers.

If your devs are grabbing every shiny gidget widget from Joe Random and you did not make attempts, as a company, to protect your investors and uphold your fiduciary responsibilities, then the hammer will come down much harder.

> ...your CEO and CFO are standing before the judge.

This doesn't happen often, and you more commonly see lower-level line staff or managers standing in court because the high-level executives simply point to the written company policies their legal team advised be put in place that forbid such wanton behavior. An indictment, not to speak of a prosecution, then has to clear the far higher bar of showing that such high-level executives deliberately, consciously structured incentives such that meeting such policies was outright impossible.

Issuing a policy that demands any such conflicts be raised immediately to management neatly short-circuits such prosecution strategies. Unless the executives are particularly careless or brazen, it is worth more to the prosecution to go after lower-level staff.

I believe that it helps if legal precedent can be set such that management is held more accountable for behavioral structuring through incentives and selective policy enforcement.

> to be doing everything and trust everything

Also, it's sort of weird how often people conflate these two things. There's this idea that home-rolling is naturally safer, and it's simply not true.

Everyone doing anything with software is relying on layers someone else built, and we should keep doing that. Layers I handle myself are layers that I know aren't malicious, but that doesn't mean they're secure. The risk of malice doesn't just trade against convenience, but against the risk of error. Using somebody else's programming language, compiler, crypto, and so on doesn't just save time, it avoids the inevitable disasters of having amateurs do those things.

We live in a world where top secret documents are regularly leaked by people accidentally making S3 buckets public. I'm not at all convinced that vulnerable containers are a bigger risk than what the same people would have put up without containers.

There's this idea that as long as everything is not rigorously proven secure, we might as well grab binaries off file sharing sites and run them in production.

This argument tires me. Every time some smug developer asks me if I have personally vetted all of gcc, with the implicit understanding that if I haven't we might as well run some pseudonymous binaries off of docker hub, I extend the same offer to them: Get a piece of malware inside gcc and I will gladly donate a month's pay to a charity of choice.

Sometimes I have to follow through on the argument by asking whether they will do the same if I get malware onto docker hub (or npm or whatever), but the discussion is mostly over by then. Suffice to say, so far nobody has taken me up on it.

The point is, that there's a world of difference between some random guy on github and institutions such as Red Hat or Debian or the Linux kernel itself. Popular packages with well functioning maintainers on Debian will be absolutely fine, but you probably shouldn't run some really obscure package just because some "helpful" guy on Stack Overflow pointed to it, and you certainly shouldn't base your production on some unheard of distribution just because the new hire absolutely swears by it.

Right. All-or-nothing thinking is the bane of analysis, and philosophy in general.

The difference is that Docker has centered their momentum on the transclusion of untrusted/unverified images. It's true that executing random untrusted code has been a major problem since people got internet connections (although most HN denizens like to fancy themselves as too smart for that, so this story is undoubtedly uncomfortable for them), but when Docker makes it a core part of the value proposition, it's worth calling out.

Very true, but doesn't that make this basically a cost-benefit calculation, with risk-of-malicious-code vs. risk-of-reinventing-the-wheel(badly)? I assume the critics would say that container tooling makes it easier for reckless amateurs to put things up when they otherwise might not have managed to deploy at all without them...

> basically a cost-benefit calculation

Absolutely. There are some famously settled issues - don't write your own crypto, you'll screw it up, do write your own user tracking, third parties will inevitably farm data - but generally there's a decision to be made. And it's not the same answer for everybody; there's a reason Google and Facebook do in-house authentication services, which everyone else borrows.

I've seen the "containers let clueless people go live" claim before, but I'm not really convinced. Containerization offers most of its returns when we talk about scaling, rapid deployment, and multiple applications. If you just want to put up an insecure REST service with no authentication, it seems at least as easy to coax some .NET or Java servlet horror online without any containerization at all.

The examples in the article of containerized learning environments are a bit of a different case, granted. A space specifically designed to let strangers spin up new instances and run untrusted code would usually take a fair bit of thought, but containers have made it much easier to do without any foresight.

I don’t think either offers much assurance unless there’s good test coverage, mocking, stubbing, fuzzing, property testing and so on to ensure the code is solid. Trust but verify (automatically).

Reproducible / Deterministic builds are a more realistic solution to this trust question.
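A rough sketch of the check that reproducible builds make possible: if two parties build the same source independently, the artifacts should be bit-identical, so comparing digests is enough. The builder names and byte strings below are placeholders, not real build outputs.

```python
import hashlib
from typing import Mapping


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of an artifact as a hex string."""
    return hashlib.sha256(data).hexdigest()


def builds_agree(artifacts: Mapping[str, bytes]) -> bool:
    """True when every builder produced a bit-identical artifact.

    An empty mapping returns False: with no builds there is nothing to agree on.
    """
    digests = {sha256_hex(blob) for blob in artifacts.values()}
    return len(digests) == 1


if __name__ == "__main__":
    # Hypothetical builders; in practice these bytes would be read from the built files.
    artifacts = {
        "distro-builder": b"example artifact bytes",
        "independent-rebuilder": b"example artifact bytes",
    }
    print(builds_agree(artifacts))
```

This doesn't tell you the source is benign, only that the binary you got corresponds to the source everyone else can read.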


It is. One procedural solution is increased rigor, i.e., formal methods (à la seL4) and unit/integration testing to prove isolation properties. I still don’t understand how Linux or Docker get a free pass, being so popular and complex while lacking basic assurances of automated, repeatable, demonstrable quality.

It comes down to history, long term track record of reliability, and responsibility. The number of times that actual malicious software has made it into an official Debian apt repository is very, very low. The people who build the .deb packages and make them available (with appropriate GPG keys and hashes) keep things secure and are generally very trustworthy individuals.


At a certain point it does come down to trust. From the position of a potential attacker, you can't just upload a randomly built thing to the official CentOS or Debian repositories and expect it to be made available to the rubes.

Very different than people downloading and trusting random Docker images.

> Very different than people downloading and trusting random Docker images.

I'd say there is a difference between using official Docker images (from the software vendor) and images from a random person.

Official images exist for most popular packages, under a separate namespace and usually have checksums published etc.
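One way to lean on those checksums is to refer to an image by content digest instead of a mutable tag; Docker accepts references of the form `name@sha256:<64 hex chars>`. Below is a minimal sketch of a check a deploy script could run. The function name is hypothetical and the regex only covers the sha256 form.

```python
import re

# Matches a trailing "@sha256:<64 lowercase hex characters>" in an image reference.
_DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned_by_digest(image_ref: str) -> bool:
    """True if the reference names exact image content rather than a movable tag."""
    return _DIGEST_RE.search(image_ref) is not None


if __name__ == "__main__":
    fake_digest = "a" * 64  # placeholder, not a real image digest
    for ref in ("postgres:latest", "joevandyk/postgresql", f"postgres@sha256:{fake_digest}"):
        print(ref, is_pinned_by_digest(ref))
```

Pinning doesn't make the image trustworthy by itself; it just guarantees you keep running the same bytes you (hopefully) looked at once.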

It's true that this problem is not unique to container tech -- it's a problem that every packaging ecosystem faces. Who polices what packages are available? And how many eyes are on these packages, to make sure that they are safe?

It would be pretty difficult to sneak a covert Monero miner into an officially approved mainline Debian package.

However there is a sense in which this is a problem with container tech, in that there is no container equivalent of `deb http://deb.debian.org/debian stretch main` (yet!).

This is a statement about the maturity of the ecosystem, rather than a criticism of the technology itself, as you say. But I think that it's meaningful to say that this is a problem that containers currently have, that Debian (or other Linux distro) packages don't face to the same extent.

> However there is a sense in which this is a problem with container tech, in that there is no container equivalent of `deb http://deb.debian.org/debian stretch main` (yet!).

That's what the Docker standard library is

> This is people trusting unknown anonymous third parties to build things that will run in their datacenter.

Red Hat, Canonical, Pivotal (I work for Pivotal) all provide this kind of assurance and it's a lot of our bread and butter income to do so.

In particular, Cloud Foundry uses buildpacks, providing runtime binaries with a chain of custody from sourcecode to the buildpack uploaded to the installation.

Buildpacks make this overall problem a lot easier, actually. You don't need to track the underlying base OS image or manage the runtime binaries. The platform does it for you. But you will still be responsible for tracking more immediate dependencies (NPM, Maven etc), which is a poorly-solved problem currently.

Exactly, a similar issue exists with nearly all package manager tools - you’re putting a lot of trust in a lot of people you don’t know.

`docker run some/container` is basically equivalent to curling a shell script and piping it straight to bash, isn't it?

Not really. Docker (and similar containerization technologies) provides a restricted environment for the downloaded code to execute in (by default; it is possible for users to remove the restrictions).

Assuming a default Docker engine install, and no options passed as part of the run, an attacker could DoS the box most likely, and may be able to intercept traffic on the Docker bridge network (although that's not a trivial thing to pull off), but they're unlikely (absent an exploitable Linux kernel flaw) to be able to easily compromise the underlying host.
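To make the "no options passed as part of the run" caveat concrete, here is a hedged sketch that scans a `docker run` argument list for a few options that are known to loosen the default isolation (`--privileged`, `--cap-add`, host namespaces, mounting the Docker socket). The list is deliberately short and not exhaustive, and the function name is invented for this example.

```python
from typing import List

# Namespace options that share the host's namespace when set to "host".
HOST_NAMESPACE_FLAGS = {"--net", "--network", "--pid", "--ipc"}


def weakened_isolation(argv: List[str]) -> List[str]:
    """Return a description of each argument that relaxes Docker's default restrictions."""
    findings = []
    for i, arg in enumerate(argv):
        # Accept both "--flag=value" and "--flag value" spellings.
        flag, sep, value = arg.partition("=")
        if not sep and i + 1 < len(argv):
            value = argv[i + 1]
        if flag == "--privileged":
            findings.append("--privileged")
        elif flag == "--cap-add":
            findings.append(f"--cap-add={value}")
        elif flag in HOST_NAMESPACE_FLAGS and value == "host":
            findings.append(f"{flag}=host")
        elif flag in ("-v", "--volume") and value.startswith("/var/run/docker.sock"):
            findings.append("docker socket mounted")
    return findings


if __name__ == "__main__":
    print(weakened_isolation(["run", "some/container"]))
    print(weakened_isolation(["run", "--privileged", "--net=host", "some/container"]))
```

An empty result only means none of these particular options were found; it says nothing about kernel bugs or what the image itself does inside the container.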

Couldn't have said it better myself.

The blame with npm, DockerHub, etc. basically boils down to "it makes it too easy to share and run software".

There used to be this thing called software engineering. You are presented a problem and you or a team come up with possible solutions choosing carefully the right tools and components for the job.

Now we are entering the everything as a service era which includes software engineering. Instead of designing a solution you download someone else's, duct tape on a few packages, tweak variables and publish it.

You can also blame the breakneck pace demanded by today's tech sector where everything needed to be deployed yesterday in order to beat the other guy to the silicon valley millionaire finish line.

You can say it's a mix of impatience and laziness.

This effect isn't new, even in ye olde days of waterfall. Back then it was called rapid prototyping and first system syndrome.

The rapid just got rapider today with easy packages, frameworks and containers. The prototypes just became "web scale" and are run way beyond their intended lifespan, just like any first system is.

> There used to be this thing called software engineering.

And computers were once accommodated with the space they deserved, in large rooms.

Now we are entering the everything as miniature. Want more RAM? Throw out the whole thing, because it can't be fiddled.

You could say it's a mixture of impatience and abject cheapness.

> Now we are entering the everything as miniature. Want more RAM? Throw out the whole thing, because it can't be fiddled.

Development of some technologies seems to happen on some weird curve like this:

  end-user control / flexibility / repairability
  |                 ........
  |               ..        ..
  |             ..            ..
  |          ...                ...
  |      ....                      .......
  | .....                                 ....
                 power / money-making capability
(Not really sure about the Y axis label; I'm having trouble expressing the quality I'm thinking of.)

Things start as prototypes - hard to sell, hard to use, hard to tweak on user end. Over time, they become more flexible - think of e.g. PCs with interchangeable components. This is the golden era for end-users. They can fix stuff themselves, they can replace or upgrade components. The technology reaches maturity, and the only way from there is downhill - as businesses find new ways to fuck users over, both purposefully and incidentally. Make things smaller. More integrated. Sealed. Protected. The end result is the ultimate money-maker - a black box you lease and can only use as long as you're paying for the subscription.

Hardware may be cheap or expensive, it does not make a lot of difference.

The key is whether you see your data, or your customers' data, as a precious thing that needs care and protection.

If you do, you make the best effort to select the right software, understand how it works, deploy it correctly and securely, etc. If you don't, you just slap together something that sort of works from the easiest-to-obtain parts, and concentrate on other things.

A lot of people don't care too much about their own personal data; some of them are developers, product managers, even CEOs.

I'm not sure what distinction you're trying to make? Build it yourself vs reuse? Old-fashioned file-sharing reuse vs modern network-based package management? Sounds like you're mocking anything that isn't homegrown or acquired via floppy disk, but I want to give you the benefit of the doubt...

I would suggest that depends on how they present the community artifacts. If they provide a good deal of obvious disclaimers that the artifacts are built by Joe Random and "Use at your own peril", etc, then they are probably covered. If not, then they are guilty of conditioning people with bad behavior.

For an example of how this existed long before Linux containers:

There are third party RPM and APT package repositories that have existed for a very long time. The packages are not vetted by a company and there is no legal culpability for anything nefarious being contained within. People use these packages at their own peril and it is assumed they have mitigating controls to reduce risk.

GitHub is community contributed code and there is no enforceable legal contract between the developer and the consumer. The same thing applies. Use at your own peril and have mitigating controls (code diff reviews, static analysis, legal review, etc.). This is especially true for all those projects under the MIT license.
