Yay, the first Docker post I can understand. I just installed and used Docker today to get a simple web server running on a VPS (which is also a first for me, as I've always used managed services like Heroku).
However, some aspects of Docker leave me with concerns that it may not be the tool for me. I tried installing the official node image and it downloaded hundreds of megabytes of other images (probably over a GB). Not that much of a problem on my VPS but absolutely unusable anywhere else in my corner of the world.
Looks like I'll have to create my own images and use my own private local registry to make use of Docker outside of my VPS.
Yeah, those are layers in the image. That image [1] (727MB) is built on buildpack-deps [2] (695MB) which is itself built on the base jessie (154MB).
It would be pretty straightforward to adapt those Dockerfiles to create an image which includes only the dependencies for building node. It would end up looking a lot like 'node:slim' [3] (288MB).
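To make the layering concrete, here is a small arithmetic sketch using only the sizes quoted above. It assumes those figures are cumulative totals (each image's size includes its parent's), which is how the numbers read here; the dictionary keys are just labels for this example, not exact tags.

```python
# Sizes in MB as quoted in the comment above.
# Assumption: each figure is a cumulative total that includes the parent image.
sizes_mb = {"jessie": 154, "buildpack-deps": 695, "node": 727, "node:slim": 288}

# The chain described above: node is built on buildpack-deps, which is built on jessie.
chain = ["jessie", "buildpack-deps", "node"]


def layer_deltas(chain, sizes):
    """Return how many MB each image adds on top of its parent."""
    deltas = {}
    parent_total = 0
    for name in chain:
        deltas[name] = sizes[name] - parent_total
        parent_total = sizes[name]
    return deltas


deltas = layer_deltas(chain, sizes_mb)
saving = sizes_mb["node"] - sizes_mb["node:slim"]

print(deltas)  # {'jessie': 154, 'buildpack-deps': 541, 'node': 32}
print(saving)  # 439
```

Under that assumption, node itself only adds about 32MB; most of the download is the 541MB of build dependencies, which is what the slim variant leaves out.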
Ideally, Docker will eventually have the functionality to more easily strip out transient requirements like build dependencies from the final images.
I can't say I've used many of the community images. We generally just use the base OS images and install packages as needed. The Dockerfile makes it pretty easy to do.
Great presentation, it was very easy to follow and understand.
Quick question though: is isolation between documents the only reason for using Docker containers for LaTeX compilation? Isn't there a performance hit versus running the compilation directly on worker machines, perhaps with some sort of folder-based isolation (workers only compile files in folders that the user has permissions to)?
Isolation is definitely one of the main benefits for us. Compared to e.g. a chroot, docker also lets us disable networking and restrict memory etc. for the process in the container. It's another important layer of security.
Another benefit is that the Dockerfile also makes it a lot easier to manage installation of all the LaTeX packages, fonts and various scientific software that we have installed.
The overheads seem to be very low: less than 100ms of extra startup and teardown time, and no significant difference in runtime speed.
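For anyone wondering what the restrictions mentioned above (no networking, a memory cap) look like in practice, here is a minimal sketch that only builds the `docker run` command line. The image name `example/latex`, the 512m limit and the helper name are assumptions made for illustration, not the setup from the slides; `--rm`, `--network none` and `--memory` are standard `docker run` flags.

```python
import shlex


def sandboxed_run_argv(image, command, memory="512m"):
    """Build a `docker run` argv for a short-lived, restricted container.

    Hypothetical helper: the defaults here are illustrative only.
    """
    return [
        "docker", "run",
        "--rm",               # remove the container when the job exits
        "--network", "none",  # no network access inside the container
        "--memory", memory,   # hard memory limit for the container
        image,
        *command,
    ]


# Hypothetical image name and compile command.
argv = sandboxed_run_argv("example/latex", ["pdflatex", "main.tex"])
print(" ".join(shlex.quote(arg) for arg in argv))
```

Building the list separately from executing it makes the options easy to inspect; the same list could be handed to `subprocess.run(argv)` on a machine with Docker installed.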
> Each compile job gets its own short-lived container
Just curious, how do you handle input and output? Do you prepare a volume with the input and then grab the output from there as well, or does the program inside the container fetch the input from somewhere first, run exec latex and then upload the output elsewhere?
Running untrusted programs as root in a docker container is definitely not recommended (and indeed you should be very wary of running anything as root). In the slides, I didn't have time to go into how to run processes in containers without root privileges, but docker does provide features for doing so, and it's definitely worth using them.
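As a rough illustration of the non-root point, one option is to pass an explicit user to `docker run` and drop Linux capabilities. This is a sketch under assumptions: the uid/gid of 1000, the image name and the helper name are made up for the example, and it is not necessarily what the slides' author does. `--user` and `--cap-drop` are standard `docker run` flags, and a `USER` instruction in the Dockerfile is another way to get a similar effect.

```python
def unprivileged_run_argv(image, command, uid=1000, gid=1000):
    """Build a `docker run` argv that runs the process as a non-root user.

    Hypothetical helper: the uid/gid defaults are illustrative only.
    """
    return [
        "docker", "run",
        "--rm",                     # clean up the container afterwards
        "--user", f"{uid}:{gid}",   # run as this uid:gid instead of root
        "--cap-drop", "ALL",        # drop all Linux capabilities
        image,
        *command,
    ]


# Hypothetical image name; `id` just prints the user the process runs as.
argv = unprivileged_run_argv("example/latex", ["id"])
print(argv)
```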
Would you please consider modifying or extending this presentation to include other basics? You did a great job at making Docker finally understandable as a whole. I'd love to see more of your writing on this topic.
Thanks! I will see what I can do to add more. Note that the source for the slides is on github [1], so anyone who'd like to pitch in can do so, and that would be most welcome.
I disagree with running RVM in a docker container. The point of RVM is to manage multiple versions of ruby and a container is only meant to have a single responsibility so running multiple versions of ruby is not advised.
I think you're better off installing the required version with Ruby Install [1] rather than adding the complexity of a version manager.