I am the author of baseimage-docker (http://phusion.github.io/baseimage-docker/)...

shykes · on June 26, 2014

I don't think your base image misses the point of Docker. Different people use Docker for different purposes, that is normal and a fundamental goal of Docker.

I do have criticism for your communication around that base image, starting with the link-bait blog post "you're using Docker wrong". Your message is that anybody not using Docker your way (full-blown init process, sshd, embedded syslog) is doing it wrong. That is not only incorrect, it contradicts Docker's philosophy of allowing and supporting more than one usage pattern.

My other criticism is that you point out a known Docker bug (the pid1 issue) and use it as a selling point for your image, without concerning yourself with reporting the bug let alone contributing to a fix. Meanwhile many people hit the same pid1 bug and have reported, suggested possible fixes, or contributed code to help implement that fix. If you want to be taken seriously in the Docker community, my recommendation is that you consider doing the same.

mwcampbell · on June 26, 2014

> I don't think your base image misses the point of Docker. Different people use Docker for different purposes, that is normal and a fundamental goal of Docker.

Far be it from me to tell you how you should run your own project, but it seems to me that if Docker is going to live up to the shipping container metaphor, then it needs to be at least somewhat opinionated. In particular, you've previously explained that Docker is supposed to provide a standard way of separating concerns between development and operations. If this is going to work in practice, then it seems to me that there needs to be agreement on conventions like:

* Logs go to stdout/stderr, not to the container filesystem or even a volume.

* Configuration settings are provided on container startup through environment variables.

* Related to the above, occasional configuration changes are made by starting a new container with new variables, not by editing a config file inside the existing container.

* The container's main process needs to cleanly shut down the main service in response to SIGTERM.

* No SSH in the container, unless the container is providing an SSH-based service, e.g. a gitolite container.

So if I'm right about what the conventions are or should be, then Phusion's base image is indeed misguided.
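To make the list concrete, here is a minimal Python sketch of a service that follows the first four conventions: settings come from environment variables, logs go to stdout, and the main process shuts down cleanly on SIGTERM (the signal `docker stop` sends first). The variable names `APP_PORT` and `APP_LOG_LEVEL` and all function names are hypothetical, chosen only for illustration.

```python
import logging
import os
import signal
import sys
import threading


def load_config(environ=os.environ):
    """Read settings from environment variables (names are hypothetical)."""
    return {
        "port": int(environ.get("APP_PORT", "8080")),
        "log_level": environ.get("APP_LOG_LEVEL", "INFO"),
    }


def make_logger(level, stream=sys.stdout):
    """Log to stdout (or a given stream), never to a file in the container."""
    logger = logging.getLogger("app")
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False
    return logger


def run(environ=os.environ, stream=sys.stdout):
    """Serve until SIGTERM arrives, then return a clean exit code."""
    config = load_config(environ)
    log = make_logger(config["log_level"], stream)
    stop = threading.Event()
    # Installing a handler must happen in the main thread.
    signal.signal(signal.SIGTERM, lambda signum, frame: stop.set())
    log.info("serving on port %d", config["port"])
    while not stop.wait(timeout=1.0):
        pass  # placeholder for the service's real work
    log.info("SIGTERM received, shutting down cleanly")
    return 0
```

A container entrypoint would call `sys.exit(run())`. Changing a setting then means starting a new container with a different `APP_PORT`, not editing a file inside the running one.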

FooBarWidget · on June 26, 2014

Apart from the SSH thing, Baseimage-docker very much complies with those conventions.

- In Baseimage-docker, Runit is configured to have all services log to stdout/stderr. In Passenger-docker, the Nginx error logs are redirected to stdout/stderr. We actively encourage services to log to stdout/stderr.

- Baseimage-docker provides easy mechanisms for allowing multiple processes to access the environment variables that were passed to the container.

- Baseimage-docker's custom init process was designed precisely to allow graceful termination through SIGTERM. It even terminates all processes in the container upon receiving SIGTERM.

Baseimage-docker does not mean that the Docker conventions are thrown out of the door.
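For readers unfamiliar with what an init process does on SIGTERM, here is a minimal Python sketch of the general idea. It is not Baseimage-docker's actual init: it only forwards SIGTERM to one main service and reaps every child that exits, whereas the real one is described above as terminating all processes in the container. The function name `run_as_init` is hypothetical.

```python
import os
import signal


def run_as_init(argv):
    """Run argv as the main service, forward SIGTERM to it, reap all children."""
    main_pid = os.fork()
    if main_pid == 0:
        # Child: become the main service; exit with 127 if exec fails.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)

    def forward(signum, frame):
        # Pass the stop request on to the main service, if it still exists.
        try:
            os.kill(main_pid, signum)
        except ProcessLookupError:
            pass

    signal.signal(signal.SIGTERM, forward)

    exit_code = 1
    while True:
        try:
            # Reaps any child, including orphans re-parented to PID 1.
            pid, status = os.wait()
        except ChildProcessError:
            break  # no children left
        if pid == main_pid:
            if os.WIFEXITED(status):
                exit_code = os.WEXITSTATUS(status)
            else:
                exit_code = 128 + os.WTERMSIG(status)
    return exit_code
```

The sketch keeps waiting until no children remain, so anything that was re-parented to it is reaped before it returns the main service's exit code.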

FooBarWidget · on June 26, 2014

Hi Shykes, glad to see you replying. Your point about communication is fair enough. I will take a look at how the communication can be improved. However, let me stress that the message is not "you're using Docker wrong unless you're using it our way". I see how it can be read like that, but the real message is much more technical, complicated and nuanced. The message is fourfold:

1. Your Unix system is wrong unless it conforms to certain technical requirements.

2. Explanation of the requirements.

3. One possible solution that satisfies these requirements: Baseimage-docker.

4. Does your image already satisfy the requirements? Great. If not, you can implement these requirements yourself, but why bother when you can grab Baseimage-docker? And oh, it happens to contain some useful stuff that is not strictly necessary but that lots of people want.

As you can see, such a complicated message becomes waaay too long and hard to explain to most people. It probably only makes sense if you've contributed to the Linux kernel, or read an operating systems book. If I explained it in a way that's too technical and nuanced, 99% of the people will fall asleep after reading 1 paragraph. So the message was simplified. I apologize if the simplified message has offended you, and I am continuing to finetune the message.

As for the PID 1 issue: I genuinely thought you guys didn't include a PID 1 on purpose, because running one isn't that hard. Last time I talked to Jerome, he had the opinion that, if software couldn't deal with zombie processes existing on the system, it's a bug in the software. With that response in mind, I thought that the Docker team did not recognize the PID 1 issue as really an issue. So please do not mistake the lack of a bug report for malice.
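For context on what a zombie process is, here is a small Python sketch, assuming Linux and Python 3. A child that has exited keeps its process table entry until its parent waits on it; when nothing in a container does that waiting, the entries pile up. The helper names are hypothetical.

```python
import os


def pid_exists(pid):
    """True if the PID is still in the process table (signal 0 only probes)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    return True


def zombie_demo():
    """Return (exists_before_reaping, exists_after_reaping) for an exited child."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child exits immediately
    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    before = pid_exists(pid)  # the exited child is now a zombie
    os.waitpid(pid, 0)        # reaping removes the process table entry
    after = pid_exists(pid)
    return before, after
```

On Linux this should return `(True, False)`: the exited child still occupies a PID until something calls `waitpid` on it, which is the job an init process takes on for orphaned children.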

Later on, you told me that you guys are working on this, and I was glad to hear that.

I get the feeling that you feel bitter about the fact that I chose to write Baseimage-docker instead of contributing a PID 1 to Docker. Please understand that I did not do this out of any adversarial intentions. My Go skills are minimal and I am busy enough with other stuff. This, combined with the fact that at the time I thought the PID 1 issue was simply not recognized, led me to write Baseimage-docker. I would like to stress that I look forward to friendly relationships with you, with the Docker team and with the community.

ithkuil · on June 26, 2014

Do you have any links to relevant discussion, documentation and/or code related to an (official?) PID 1 process for/by Docker? I'm not able to find it quickly and I thought it might be useful if you could share, given that you clearly have some context. Thanks!

FooBarWidget · on June 26, 2014

I don't know what the Docker team are working on, but this is the PID 1 process we use in Baseimage-docker: https://github.com/phusion/baseimage-docker/blob/master/imag... It's a custom system we wrote specifically for use inside Docker.

tinco · on June 26, 2014

The link-bait blog post title is not "you're using Docker wrong" but it is "your docker image might be broken". In my opinion there is a definite difference in arrogance and link-baitiness there.

The only occurrence of the word wrong in the whole post is in the sentence "What might be wrong with it?". That sounds more like healthy criticism than 'incorrect' contradiction of Docker's philosophy to me.

About the pid1 thing, I do not think Foobarwidget saw that as a Docker bug, but as a bug of Docker containers. Doesn't it make sense to release a Docker container with a proper init process then?

npsimons · on June 26, 2014

This, to me, sounds like the much more reasonable and sound approach to containers, as opposed to the "SSH bad!" article. As someone who doesn't use containers or virtualization heavily, but maintains multiple systems (servers, desktops, phones, etc), I can unequivocally say that SSH is practically a hard requirement. And just having it as the way to get to a container seems like a no-brainer, in that you don't have to learn yet another tool, that might not work, and even if it does, won't cover all the use cases of SSH. I use SSH for remote admin and automated backups of everything (phones, tablets, servers, desktops, etc). Adding another thing to be backed up via SSH is easy, no matter if it's virtual or not.

EDIT: I do really like the separation of concerns and modularity that are brought about by the approach advanced in the article. But I would argue that the arguments against SSH apply many times more strongly against alternatives to it: security upgrades? You're going to have to do that much more often with whatever you use to replace SSH. SSH has a proven track record for security and authentication, it's well known, lightweight, and generally doesn't break on its own.

23david · on June 26, 2014

Thanks for your efforts in creating/maintaining baseimage-docker. It saves me the effort of maintaining my own base image, and has given me some interesting ideas to try for my docker images.

I've now introduced and put docker-based infrastructure projects into production environments at 2 different companies, and IMO having sshd in the containers has made it much easier and familiar for techops/devops teams to get started with docker.

Docker-attach is a much more limited solution, and I think that introducing another tool like nsenter is a non-starter since it just adds more complexity with additional tooling and dependencies. Another tool when ssh works? The additional cpu/ram use isn't a big deal, and for security as long as I secure sshd and my keys/password properly (not storing them in my image, for example...), no worries.

Docker logging is also limited compared to tried and true linux logging utils.

Docker process supervision is still a bit immature and unreliable. I'll keep trying the built-in solutions, but I have everything working fine now without needing to wait for subsequent Docker releases.

Docker is a really convenient wrapper around a bunch of standard Linux tools, and IMO that has been its power. The weaknesses in Docker have been where it tries to build its own replacement for existing and mature solutions (logging, supervision, networking, etc.).

A lot of the functionality of libcontainer, libchan, libswarm seems to be done by existing tools. Why reinvent the wheel? Are the existing project maintainers unwilling to take pull requests?

vidarh · on June 26, 2014

> Microservices Are Not A Free Lunch

I don't buy that article at all. There's a lot of strawmen there where things are suddenly needed for micro-services while they apparently aren't when running the exact same things inside a single VM.

You can architect a microservices based system in ways that add operational complexity, but if you do it should be because there are substantial benefits to be had that way.

But you can equally well take that monolithic VM that seems to be what that article is assuming the alternative is, run Docker inside it, and run each service in a Docker container inside it, and still start to realise substantial benefits; not least because it makes it easier to grow out of the single VM as/when needed by making dependencies much more explicit and allowing each service's software dependencies to evolve separately.

I agree you don't have to use Docker only that way, but the more I've played with Docker the finer-grained I end up making things...

benjaminwootton · on June 26, 2014

I'm the author of that article! Nice to see it discussed here even if you don't buy it :-)

I'm not sure I understand your point. My article concludes that MicroServices do indeed bring substantial benefits on a longer time horizon. However, they undeniably add operational overhead because your monolithic app explodes in terms of number of processes.

This is true whether you deploy to 1 virtual machine or 100. It's still 100 distinct processes that tend to communicate asynchronously.

The article doesn't mention the word Docker, but we actually subsequently found that Docker was the missing link and the thing that made MicroServices viable. When your abstraction layer becomes the container then the operational complexity is tamed.

Edit - Here are two articles I also wrote which describe how Docker enables MicroServices if you didn't catch them. These came after the No Free Lunch article. MicroServices with Docker, particularly polyglot MicroServices, would be painful beyond belief without Docker.

http://contino.co.uk/use-docker-continuously-deliver-microse...

http://contino.co.uk/use-docker-continuous-delivery-part-2/

vidarh · on June 26, 2014

> your monolithic app explodes in terms of number of processes.

There's no reason it has to, is my point.

You may have a point in instances where you start with an application that is actually already wrapping a ton of unrelated functionality together in a single process - I didn't really think about it in terms of that scenario. But then I'd argue that an increase in the number of processes will be an operational godsend over having to try to track down problems in a monolithic mess. And even then you're not forced into splitting things up into tons of little pieces in one go.

The scenario I was thinking of, on the basis of the discussion here regarding baseimage-docker, is splitting up services that consist of a bunch of interrelated processes. E.g. Nginx + Apache + Memcached + the actual application + various cron jobs + Postgres is a typical example. Starting by splitting that up into separate Docker containers for each existing process group doesn't introduce any new application processes.

You can then gradually break up the actual application if justified/needed.

benjaminwootton · on June 26, 2014

That isn't what MicroServices are about.

MicroServices are generally about an application architecture where you break your application into very finely grained interacting services.

Your eCommerce store web application for instance becomes a shopping cart service, a stock service, a category service, a login service, a user profile service etc. You can take it even more fine grained so your user profile service becomes a user profile update service and a user profile rendering service.

The point is, one application becomes tens or hundreds of distinct and distributed processes. This is a massive rise in development and operational complexity before you separate out your web server, database etc.

http://martinfowler.com/articles/microservices.html

FYI We are heavy production users of MicroServices within Docker and would put tools such as NGINX, MemCached, Postgres into Docker images as a matter of course so they can be built, deployed and versioned in the same way as our services.

vidarh · on June 26, 2014

(whoever downvoted this and my earlier comment: it's extremely bad form on HN to downvote something because you disagree with it; post a comment instead)

Now you are nitpicking to the extreme.

> The point is, one application becomes tens or hundreds of distinct and distributed processes.

And this is a process that starts from the moment you start splitting up the major services that typically already live in separate processes. It's a false dichotomy to treat your custom code and the full stack separately in this respect.

I explained that I agree that you had a point when you get to a level where the application has been split up to a great extent. I also pointed out why that was not the case I was addressing - rather pointing to your article as justification for baseimage-docker. In fact, if you go "the whole hog", I'd argue the argument for baseimage-docker becomes substantially worse.

It's a false dichotomy to contrast "fully monolithic app" and "true microservices". In real life it is a sliding scale where most larger systems will already consist of multiple cooperating services, whether or not you wrote them yourself. For every person who thinks they are doing micro-services, I bet I can find someone who would argue they should have split it up more (or less).

The more pragmatic point is to split up to the extent your operational setup handles well. From my point of view, 10 services or 200 per application makes pretty much no difference in the setup we're deploying at work, for example.

I'd be happy to discuss microservices with you in more detail (I'm in London too), but that's an entirely different discussion from the comment I was making earlier.

> This is a massive rise in development and operational complexity before you separate out your web server, database etc.

It's a massive rise in development and operational complexity if you're set up to do monolithic or near monolithic apps. Once you're set up to handle reasonably distributed/service oriented apps, additional increases in numbers of services quickly cease to be an issue. On the contrary, I'd argue that for many setups, splitting up your application further reduces complexity because it forces making dependencies more explicit and allows for easier probing and monitoring. I know I much prefer doing ops for apps that lean towards micro-services than monolithic apps (part of my responsibility is ops for a private multi-tenant cloud for various relatively large CRM applications we do).

ckuttruff · on June 26, 2014

Couldn't agree more with this sentiment. I've started using docker to replace vagrant/VMs for my test workflow. I use it to spin up a container, provision our server code via chef, and test out various changes.

Is chef my favorite tool? No. Could I be using docker in a more optimal manner? Sure. But the reality is that I wanted the simplest possible path to integrate docker within our workflow, and it already saves a ton of resources. I run linux on a machine with 4G of ram, so believe me, utilizing containers for testing infrastructure changes is a huge improvement.

So docker as a lightweight VM is most definitely a valid use case, IMHO