This is interesting, but means very little to me as a heavy user of Docker.
I don't care about boot time. I care about 1) build time, and 2) ship time. In Docker, they are both fast for the same reason.
At the top of your Dockerfile you install all the heavy dependencies which take forever to download and build. As you get further down the file, you run the steps which tend to change more often. Typically the last step is installing the code itself, which changes with every release.
Because Docker is smart enough to cache each step as a layer, you don't pay the cost for rebuilding those dependencies each time. And yet, you get a bit-for-bit exact copy of your filesystem each time, as if you had installed everything from scratch. Then you can ship that image out to staging and production after you're done locally--and it only has to ship the layers that changed!
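As a rough sketch of that ordering (the base image, packages, and paths here are purely illustrative, not from any particular project):

    # Slow-changing layers first: these stay cached across rebuilds
    FROM ubuntu:16.04
    RUN apt-get update && apt-get install -y build-essential python3 python3-pip
    COPY requirements.txt /app/requirements.txt
    RUN pip3 install -r /app/requirements.txt
    # Fast-changing layer last: only this gets rebuilt and shipped on a typical release
    COPY . /app
    CMD ["python3", "/app/main.py"]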
So this article somewhat misses the point. Docker (and its ilk) is still the best tool for fast-moving engineering teams--even if the boot time were much worse than what they measured.
You described the layering features of Docker images: you can upgrade the upper layers without touching the lower ones. It helps, sometimes.
The reality always seemed a bit more complex to me: once you ship, you'll have to start to maintain your software. You will definitely want to upgrade the lower layers (base OS? JVM? Runtime of the day?). In that case the statically-allocated cache hierarchy of Docker layers will be of little utility.
On the bit-for-bit reproducibility: I have my doubts, too. Are we sure that all the Dockerfiles out there can be executed at arbitrary times from arbitrary machines and always generate the same content hash?
Downloading a pre-built image from a registry does not count as reproducibility, to me.
Obviously Docker is a tool, and as such you can use it with wildly varying degrees of competence. I am just skeptical that using a specific tool will magically free ourselves from having to care about software quality.
I think my issue with your comment is the same one I have with the original article: it focuses on some technical details which don't impact most teams in a major way.
> You will definitely want to upgrade the lower layers (base OS? JVM? Runtime of the day?). In that case the statically-allocated cache hierarchy of Docker layers will be of little utility.
My team deploys to production multiple times per day. We upgrade the OS or the language version every couple months. So the layered structure is optimized for our typical workflow.
> Are we sure that all the Dockerfiles out there can be executed at arbitrary times from arbitrary machines and always generate the same content hash?
Again, the content hash doesn't affect my team. It's "bit-for-bit" enough that, in my experience, if the tests pass locally they pass in production. If that weird library is installed correctly in the local container, it's installed correctly in the production container. That's what matters.
> skeptical that using a specific tool will magically free ourselves from having to care about software quality
I never said this and I'm not sure what you mean. At the end of the day I think the benefits of the "Docker system" are obvious to anyone who ships multiple times a day, especially when working on big software systems that are used by lots of people. There are other approaches too, but personally I haven't seen a VM-based solution which offers as good a workflow.
The faster startup time of VMs is cool and I appreciate the work put into the paper... I'm just saying that it doesn't seem to matter in the bigger picture.
> The faster startup time of VMs is cool and I appreciate the work put into the paper... I'm just saying that it doesn't seem to matter in the bigger picture.
I imagine this kind of analysis is aimed at use cases like, say, AWS Lambda, where you're launching containers/VMs on something like a per-request basis.
"Downloading a pre-built image from a registry does not count as reproducibility, to me."
Completely disagree... On my team, the lead devs build the docker images, push them to a private repo, and everyone else pulls that exact image. Bringing up dev environments is almost instant. If a lead dev adds a dependency that breaks the build, everyone else is fine. They will fix the build and push it up when ready.
It's not reproducible because you didn't produce (i.e. build) anything, you just downloaded (i.e. copied) it.
Calling what you did "reproducible" would be like calling a scientific paper "reproducible" because you're able to copy the table of results at the end into another paper.
I think we are using two different meanings of "reproducible".
To me, saying that a build is reproducible means that anyone is able to independently build a bit-for-bit exact copy of an artifact given only its description (source code + build scripts, Dockerfile, ...), more or less in the sense of [0].
By this definition, even content-addressability is not enough.
Well, I think you disagree on the meaning of the word "artifact." GP cares that all production deployments are identical to the development environment that went through system tests. To ops, a reproducible build is one that has been uploaded to the registry, because it can be ported over to the production cluster and it will be bit-for-bit identical.
It is not necessary to reproduce the same build bit-for-bit because you have kept a copy and distributed it.
You are not rebuilding the same thing twice because that is an inefficient use of resources.
Nobody really cares that the timestamps are different when the builds started at different times of day; it is evident, therefore, that you will not produce bit-for-bit identical builds as long as your filesystem includes these kinds of metadata and details.
If you built it again, it is to make a change, so there is no point in insisting on bit-for-bit reproducibility in your build process (unless you are actually trying to optimize for immutable infrastructure, in which case more power to you, and it might be a particularly desirable trait to have. Not for nothing!)
> Nobody really cares that the timestamps are different when the builds started at different times of day; it is evident, therefore, that you will not produce bit-for-bit identical builds as long as your filesystem includes these kinds of metadata and details.
Do you think different timestamps are all you need to worry about? How about the fact that some source tarballs you depend on may have moved location? Or that your Dockerfile contains 'apt-get upgrade' but some upgraded package breaks your build where it didn't before? Or that your Dockerfile curls some binary from the Internet, but that server now uses an SSL cipher suite that is incompatible with your OpenSSL version? All problems that I have encountered.
I was not speaking in terms of formal correctness with respect to the definition of reproducible builds; I was speaking of practical implementation for a production deployment. All of the issues you mentioned are (at least temporarily) resolvable by keeping a mirror of the builds, so that you can reproduce the production environment without rebuilding if it goes away.
I'm just defending the (apparently ops) person who was going by the original dictionary definition of "reproducible" which predates the "reproducible builds" definition of reproducible.
If my manager asks me to make sure that my environment is reproducible, I assure you she is not talking about the formal definition that is being used here. I'm not saying that one shouldn't care about these things, but I am saying that many won't care about them.
If you're doing a new build and the build results in upgrading to a newer version of a package, then that is a new build. It won't match. If you're doing a rebuild as an exercise to see if the output matches the previous build, then you're concerned about different things than the ops person will be.
If my production instances are deleted by mistake, I'll be redeploying from the backup images, I won't be rebuilding my sources from scratch unless the backups also failed.
I agree that it is a sort of problem that people usually don't build from scratch, it's just not one of the kinds of problems that my manager will ever be likely to ask me to solve.
No, you're right, I read the "reproducible builds" page and it's a good formal definition for the term.
I just think that if you ask ten average devops people on the street what "reproducible" means in the context of their devops jobs, at least eight of them are not going to know this definition, or insist on a bit-for-bit artifact built directly from pure source code in the sense of http://reproducible-builds.org
We're going to think you mean "has a BDR plan in place." Or maybe I've underestimated the reach of this concept.
I'd agree with zackify here too, unless I'm misunderstanding what was meant by "reproducibility". How is this different from say, pulling an external dependency from a repo manager, like we do all the time when building software?
The ability to easily deploy pre-built Docker images from a registry is one of my favorite features of a Docker workflow, especially the time that can be saved when deploying components of a software stack to a local development environment. I find I have to deal with significantly fewer installation issues if the developer can just run a Docker Compose file or similar on their machine to get going.
You are right, giobox: pulling an external dependency without guaranteeing its content impairs reproducibility (assuming anyone else in the world uses this word in the same sense I do. At this point I am starting to think that I am wrong).
Let me give a short example, limiting to the Dockerfile format:
RUN wget https://somehost/something.tar.gz
is not reproducible: there is no guarantee that everyone is going to end up with the same container contents.
RUN wget https://somehost/something.tar.gz && echo "123456  something.tar.gz" | md5sum -c
is reproducible, even though there is an external dependency. You can rest assured that the RUN statement will either produce the same result everyone else gets, or an error.
Similar considerations can be made for other statements (FROM something:latest, RUN apt install something, ...).
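For instance, a sketch of what pinning those down might look like (the digest and version strings are placeholders, not real values):

    # Pin the base image by digest rather than a mutable tag (placeholder digest)
    FROM ubuntu@sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
    # Pin the package version rather than taking whatever apt resolves today (placeholder version)
    RUN apt-get update && apt-get install -y something=1.2.3-1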
But, as I was saying, maybe my personal opinion of what is "reproducible" is a bit too strict.
I agree that using a registry & compose is very useful (and personally, I do it all the time). It simply does not fit my definition of reproducibility.
The article is about using and optimizing VMs and containers at runtime, not at development time.
Whether Docker provides development benefits compared to other approaches is orthogonal to the runtime performance concerns being discussed in the article.
People use containers and VMs for multiple purposes. Some people might only care about development speed; other people might care a lot about runtime performance and will find this article relevant. The article's point is that, if you need a lightweight runtime isolated environment, VMs are viable, competitive with containers, and have unique isolation benefits.
> Some people might only care about development speed; other people might care a lot about runtime performance and will find this article relevant.
You're exactly right, and I'm simply saying I don't find it relevant for any of the work my team does. Others might, but my feeling is that the workflow of actually developing on these things is what has caused a shift from VMs to containers for so many teams.
I'd love to see a solution which combines the workflow benefits of Docker with the stronger isolation of VMs.
Definitely. There are different use cases even within the same shop. One of the major problems with getting containers adopted where I'm at is that the ops folks want to have a production story for containers before we start learning how to use them in development.
Many of the concerns are orthogonal! We are just using them on the app-dev side of the fence to manage our configuration drift so that regardless of what app you're trying to start, you can do it alongside of whatever other apps you're already running. We use minikube with the ingress addon to help us fit some moderately complex constellations of servers onto our laptops, without having to know Ansible or learn to configure Nginx (or need to twiddle the Nginx config every time we're changing to work inside of a different context, or know a lot about Kube, or requisition additional EC2 nodes, or really anything after setting up Deis other than "git push deis".)
Before they will call this development configuration supported and allow us to take advantage of cloud resources (S3 buckets to address our scaling and other concerns, basically so we can run Deis in minikube without starting from scratch again every day...) the enterprise architects want to see a plan to run "a multi-AZ scalable environment to run containers."
It was at that point in the conversation that I realized, we were having two totally separate conversations in a single thread. There is so much overhead to get things into production in our organization that we don't even want to broach the topic of putting containers into production use, but in the minds of InfoSec and the architects, they see it as inevitable that if we're using containers in development, we're also going to use them in production.
> This is interesting, but means very little to me as a heavy user of Docker.
I'm quite certain that the security issues with containers mean quite a bit to you as a heavy user of Docker. The very first line of the article establishes this baseline: VMs enjoy better isolation properties. Those properties used to entail higher overheads, but not so much now.
> Can we have the improved isolation of VMs, with the efficiency of containers?
It's the "efficiency of containers" part I'm disagreeing with. They define "efficiency" as "start up time", and that doesn't affect my team's efficiency.
We choose to trade slightly worse isolation for much higher efficiency in our daily workflows (the level of isolation is acceptable for our use case, given that we have multiple other layers of security).
Containers are not about safety or lightness IMO. Rather, the reasons I prefer them are better insight and usability. Since the same kernel is running multiple applications, it has insight into how much RAM has been requested, and how much CPU is being used, and can thus be more efficient in packing processes onto nodes. If you try to build this with unikernels, you will end up turning your hypervisor into a de facto kernel. As it is, hypervisors simply preallocate resources and in pretty much every case must overprovision for peak usage, which requires more unused capacity.
As for usability, I mean that you get many of the benefits of VMs (namespaces and resource constraints), but still with the added benefit of the kernel introspection.
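For instance, these are the sorts of cgroup-backed constraints meant here (the image name is hypothetical):

    # Limits the shared kernel can both enforce and report on
    docker run -d --memory=512m --cpu-shares=512 myservice:latest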
Under that definition, then, VMs would not be lighter than containers, because they require overprovisioning and don't (by deliberate design) dynamically communicate memory or CPU usage to the hypervisor.
Linux supports hot-adding memory. Since it can do that as a guest, so can any VM running it. Overprovisioning might be required, but it's only required because of software limitations, not hardware ones. I would venture a guess that this could be solved.
Hot-adding memory is essentially the equivalent of the mmap kernel syscall, but there's no ability for the VM to return memory that it no longer needs, as you have with the munmap syscall. So you may start out without overprovisioning, but eventually you will hit peak hours, redistribute your VMs, and be overprovisioned again unless you reboot machines after peak hours.
Even if you did add to Linux the ability to compact memory, dynamically stop using parts of the hardware memory, and signal the hardware layer that they're unused, you're again just reimplementing a feature that standard kernels have provided for decades. Not to mention this feature seems basically coupled to virtualized hardware.
The kernel can unmap memory by having a special "ballooning" driver that "allocates" memory pages, which are then returned to the hypervisor. I'm pretty sure every sane hypervisor has supported this for ages.
You can take memory from the VM. I've just tested it on KVM, dropping the current memory allocation for a vm, and reraising it. Shows up in free in the vm instantly.
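For reference, a sketch of that kind of test with libvirt's command-line tooling (the domain name is a placeholder; sizes are in KiB):

    # Lower the guest's balloon target to 1 GiB, then raise it back to 4 GiB
    virsh setmem myvm 1048576 --live
    virsh setmem myvm 4194304 --live
    # Query the balloon driver for the guest's current memory statistics
    virsh dommemstat myvm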
Well it sounds like that is possible, but is there any way for the VM to indicate to the hypervisor that there is unused memory? Or for that fact, that it needs more? That's the critical missing piece, unless there's an API I'm unaware of.
Putting aside deployment, Docker has been a sheer joy for local development. It provides a complete wrapper for the voodoo and tribal knowledge required to run packaged apps.
Gone are the days of building a VM from scratch, following instructions for a previous version of the OS, installing 10 dependencies, installing the app, cursing when it fails half way through, installing another missing dependency, installing the app again, bouncing the server, and then having it all fall apart again because the app isn't registered to run as a service at startup (because you missed that step).
All of this was possible before Docker through Vagrantfiles and the like, but Docker made it much cleaner and significantly less resource intensive (i.e. you no longer need a VM per item).
This is why you do config management. Here, our Vagrant builds are exactly like our development and production servers. It's literally the same config to run them both, with just variables to differentiate as needed.
Your argument falls down when it's not your development-time build, but some third-party's. Docker allows you to basically treat "other people's daemons" the way most programming runtimes treat "other people's libraries": as something you can just toss into a dependency spec file and have "pulled in" to your program during build.
With Vagrant, I still need to figure out how to install e.g. Postgres, or Redis, or whatever, into the Vagrant VM (usually using salt/chef/ansible/etc.), or—much more rarely, because this isn't "idiomatic"—I'll find a Vagrantfile for someone else's virtual Postgres appliance VM. Which I then need to integrate into my virtual-network architecture as its own separate VM with its own lifecycle management et al.
With Docker, my app's image just depends on the postgres:9.6 Docker image, and so when I start my app, a copy of Postgres is downloaded and started as well. My app "owns" this container: the config management for what version of Postgres is running is done by my app (just like a library dependency), rather than the resulting container being thought of as a separate "piece" of my system to be managed by the ops team. The ops team manages my service, which contains Postgres as a component. Same as if I had built a VM that contained both my app and Postgres.
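A minimal sketch of how that dependency might be declared with Docker Compose (service names and the build context are illustrative):

    # docker-compose.yml
    version: '2'
    services:
      app:
        build: .
        depends_on:
          - db
      db:
        image: postgres:9.6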
In short, containers (and especially the higher-level abstraction of container "services" or "pods") allow you to build and plan your service as a set of separate black-boxes, rather than owning their build processes; but then allow you to run your service as a unit and upgrade it as a unit, rather than each build-time component resulting in a separate runtime consideration.
---
And yes, all this could be accomplished just as well with VMs. There's nothing specifically about containers that makes them able to do things that VMs cannot. It's just how the two ecosystems have evolved.
If you:
• built an equivalent of Docker's "public registry" and "private URL-accessible registries" for VMs—maybe by generalizing AWS's AMI registry
• ensured that all common VM software could start VMs using a spec that pointed at an AMI URN/URL, downloading and caching AMIs from wherever in the process
• encouraged a public stance toward having a Vagrantfile in your project root that will—with no configuration—build a public-usable virtual appliance out of your project
• added automatic-build-from-Vagrantfile support to your public AMI registry
• encouraged an 'immutable infrastructure' approach to the resulting Vagrant VMs, such that they'd put all their state into hypervisor-provided stateful mounts (of SAN disks or NFS or what-have-you)
• enforced a common semantics for services/pods of VMs in hypervisors
...then you could treat VMs the way you treat containers, using the VM-management client tooling to push out a version-upgrade of your service, that would replace various VMs with VMs based on a newer AMI—without losing any state.
It's all a matter of changing the culture and idioms. That doesn't mean it's easy.
Docker doesn't imply not having config management. In fact, for the parts of our infrastructure that were using Puppet to build VMs, we just run Puppet as part of `docker build` instead. We're still using config management, and our prod, staging and dev environments run the exact same blobs.
Sometimes; and tools like Ansible Container can help have the best of both worlds.
But for every time I see someone using actual configuration management and best practices for a reproducible Docker architecture, I see a hundred cases of brittle, convoluted shell scripts or, worse, entire application definitions in a 500+ line Dockerfile (which is neither configuration management nor a script; at least scripts are somewhat more portable and maintainable!)
Instead of writing a build script to put everything together from a solid foundation, Docker implicitly encourages you to perform a succession of random undocumented changes on the original foundation endlessly. Also known as "tribal knowledge." And this is all somehow "cleaner" than build scripts. "Putting aside deployment" means discouraging reproducibility and encouraging book burning.
No, putting aside deployment means he was talking about the benefits of Docker for local development instead of the benefits of Docker for deployment. As to your talk of a build script, this is what the Dockerfile is for, and it is heavily documented.
Packer has a Docker builder which constructs Docker images without a Dockerfile; I found it useful for quickly converting between creating AMIs and Docker images for EC2.
I think it depends on the project, no? You first specify your dependencies, then provide a way to use the dependencies to build the end project.
If you don't have to specify the dependencies or the build process to the user, then you can get away with the build process not being well defined. Docker is a win for 'easy installation', but I wonder if it degrades the quality of projects which rely on it?
> Gone are the days of building a VM from scratch, following instructions for a previous version of the OS, installing 10 dependencies, installing the app, cursing when it fails half way through, installing another missing dependency, installing the app again, bouncing the server, and then having it all fall apart again because the app isn't registered to run as a service at startup (because you missed that step).
Weird, you wouldn't have to deal with Nginx and services for local development.
What you describe doesn't sound like local development... But rather acceptance/manual integration or product/feature review/approval
But in places with good automation there are tools or a pipeline that automatically spin up instances (potentially containers) with your branch/feature and automatically link them to other services and expose them
From this point of view, docker can be useful for deployments... But it's mostly useless for development, since it doesn't help with keeping prod/Dev environments identical to each other
> What you describe doesn't sound like local development... But rather acceptance/manual integration or product/feature review/approval
> ...
> But it's mostly useless for development, since it doesn't help with keeping prod/Dev environments identical to each other
If you're working on a feature that integrates with LDAP, you need an LDAP server accessible locally. Docker lets you spin one up near instantly. The alternative I'm describing is to install the LDAP server locally or in a VM. Both are definitely options, but they're significantly more intrusive or involved (or both).
Ditto for a Postgres DB, a Redis server, or any other external resource needed by your app. It's not just for integration tests. You need them available to develop against and Docker provides a consistent way for everybody to spin up those resources quickly.
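For example, the sort of one-liners meant here (image tags, ports and passwords are illustrative):

    # Throwaway dev dependencies; remove them with `docker rm -f` when done
    docker run -d --name dev-postgres -p 5432:5432 -e POSTGRES_PASSWORD=dev postgres:9.6
    docker run -d --name dev-redis -p 6379:6379 redis:3.2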
Docker for Windows/macOS run containers inside of a Linux VM, so nothing changes there. There's also Windows Server Containers which are what they sound like.
Or CI. Having easy environments for any OS or software tool imaginable makes it quick and simple to test my applications across 10 OSes and OS versions as needed.
Now the hype seems to be wearing off and people are realizing that it's just a way of treating Linux installs like giant static binaries plus a whole lot of unnecessary complexity.
"Docker is fantastic! You can run anything you want, easily. Except, uh, don't try to run a database or anything like that. Oh and here are a dozen different flow charts and tables explaining how to handle networking. And uh make sure you still do all of the same VM provisioning for security since containers don't give you any. Simple!"
Containers are designed to be non-persistent - if an instance of a container goes down you should be able to replace it with another instance with no loss in state. Most people want their databases to be persistent (unless it's just temporary local storage for your process), so containers are a bad idea here. What you want is some sort of external storage to store state. This also enables better horizontal scaling.
I would say this might be a description of Docker or Docker containers specifically, but does not characterize Linux containers generally.
Linux container technology, by which I mean the Linux kernel APIs used by Kubernetes, CoreOS rkt, and Docker, does not have an opinion about whether the processes that run as containers are ephemeral or persistent. Containers can be just as persistent as the host they're running on.
For example, consider a Linux distribution using systemd. Systemd might launch regular processes upon system startup, and it might also launch containers where both types of processes are equally persistent on the parent host. See the documentation for systemd-nspawn [1] and machinectl [2]. The Arch Linux wiki has more details about how to use that distribution with nspawn [3]. Containers managed in this way are just like other system processes aside from being containerized.
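A brief sketch of that usage (the container name and directory are placeholders):

    # Boot a container directly from a directory tree
    systemd-nspawn -D /var/lib/machines/mycontainer -b
    # Or, for a tree registered under /var/lib/machines, manage it like any other unit
    machinectl start mycontainer
    machinectl status mycontainer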
Docker allows you to specify mounting directories on the host system. If you wish to have persistence of the data - it's simply the matter of specifying that mounting option.
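For example (the host path is illustrative; /var/lib/postgresql/data is the data directory used by the official postgres image):

    # Bind-mount a host directory so the database files outlive any one container
    docker run -d --name db -v /srv/pgdata:/var/lib/postgresql/data postgres:9.6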
Sure, but that's not a database running in Docker, it's "external storage to store state". Which is the right way of doing things if you're ok with your data persisting on the server.
I'm more curious, why would you run a DB in a container?
Containers seem really great for running inside schedulers in order to squeeze every last bit of performance out your servers while also making it easy to move things around as needed. To me this is things like app servers, web servers, load balancers, queue workers, microservices, etc.
I don't think of a database needing to be moved around, and I want its performance profile to have room for growth when needed (in other words, give me the beefiest server and only run the DB on it). Also, a database doesn't generally have 15 dependencies you have to hand compile/install like many apps do.
There seems to be a lot of operational complexity with doing things like "ok, here's our database, but the data is not in the container, and if the DB does move, we have to take the data with it." How long does the move take? Wouldn't it be better to just have another DB standing by to swap in immediately?
I guess my point is (and I'm actually, genuinely asking as someone who hasn't spent much time in the container world): what operational benefit does having a DB in a container give?
> I'm more curious, why would you run a DB in a container?
I develop different PHP apps - some are, uh, fossilized and don't work with MySQL 5.7 or later, some are on a more modern stack and can work with the latest and greatest stuff. Other apps I am working on run on pgsql. Also, I am on OS X.
All of this means it's either a Vagrant VM, which I always forget to shut down and which breaks every couple of months, or Docker, where I can do a simple docker ps -aq | xargs docker stop and everything is pretty much reproducible.
And no, running the DBs native on OS X is a pure PITA - good luck getting different versions of MySQL and PHP to peacefully coexist.
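For what it's worth, this is the kind of side-by-side arrangement Docker makes trivial (ports and passwords are illustrative):

    # Two MySQL versions running side by side on different host ports
    docker run -d --name mysql56 -p 3306:3306 -e MYSQL_ROOT_PASSWORD=dev mysql:5.6
    docker run -d --name mysql57 -p 3307:3306 -e MYSQL_ROOT_PASSWORD=dev mysql:5.7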
On production servers the story is different but I can tailor a production server environment to the exact specs the software requires.
If getting the software you need to actually run is painful enough that you end up running Linux in a VM with a wrapper, how is that easier than just running Linux in a VM, or running Linux natively?
Hah. I know very little about containers beyond the unshare command and have been wondering how they avoid reintroducing all the problems of static binaries.
Makes me think of stali [0], the static linux distro that avoids all dynamic linking.
I personally think unikernels will overshadow containers in a few years. In my mind they offer a model that is much more secure and speedy. I give them a few years because unikernels still need to figure out better tooling; debugging, producing and deploying them is currently difficult.
It seems pretty unlikely that unikernels will ever replace containers.
Containers allow you to run almost all of the past decades' software with little or no modification (and most modifications are done in bash). A unikernel is more like a new OS that applications need to be ported to, without the full POSIX API (e.g. no subprocesses) - it's probably easier to port a multiprocess server (e.g. most SMTP servers) from Linux to the NT API than to port it to a unikernel.
A few years is a lifetime in tech; why not learn what is new and see how if it's applicable to you and improves things now? You'll come out with more experience, can throw it on your resume, and very likely be more prepared for the next tech to come along.
As a senior sysadmin who has had to deal with all the crap devs throw at me, I skipped the container craze to focus on VM security and other issues. Give the devs a nice powerful boxen to use (many hypervisors have APIs these days) so they can do the provisioning themselves.
Without all the added complexity that extra tooling like Docker requires.
Real world benchmarks I ran indicated perf differences were negligible, so devs seem to mostly like docker cause they don't have to do sysadmin stuff... but I can do all the same things in a VM, plus much more.
A big shoutout to the people at Proxmox while I'm at it.
Also a sysadmin, I've inevitably found that the environment needs/expects a number of services, and it just feels like docker is ill-suited to these use cases. Frequently even the devs themselves want multiple services and can't tolerate stateless systems anyway, but at the very least we have to deal with logging and the like.
I'm not saying it can't be done, but I've never found a developer who had a particular desire to implement containers, and even if they did, their concerns don't represent the entirety of what's involved in running infrastructure.
I feel like the increase in devs, in general and in business, has made them more likely to follow bandwagons and fads than they used to be, which just makes our job harder.
Forget the word cloud. The cloud exists on servers. What you are talking about is servers. I mean build a nice beefy 2 cpu supermicro rackmountable 2-4u server with a shitton of RAM and fast disks and a duplicate for HA which is then backed up properly. It is a dev and a sysadmins playground paradise. I do prefer to keep prod and dev on separate physical machines though for an added layer of separation.
One more thing. People seem to think colocation isn't worth it anymore... well let me tell you, they are very often wrong! Colo is a great solution depending on the problem. You get the closeness to the core backbone nodes, the security of a DC, and these days often very competitive prices. You can get security cages, full racks, or half racks, and sometimes the DC has even smaller options to keep the cost down as well.
If thats not an option, you can do dedicated servers in a DC. The difference being in a colo the hardware is yours, in a dedicated setup the hardware is part of an offering by the provider. I have done this at many places as a stopgap until the colo server arrives (rackspace for example).
I would really suggest for real dev where you are compiling often and doing many things that stress systems, stay away from VPS's. It's not nice to other users and you don't get the perf you need/want anyway.
This kind of stuff is what makes me excited about the 32-core AMD CPUs on the way. This time I won't have to do a quad-CPU system to get 64 cores, just 2!
One thing I've been wanting is a VM that would contain a "minimized" version of an operating system, plus a single program and that program's dependencies.
I've looked into using Docker to do something like the above, but it didn't seem straightforward a few years ago. LightVM seems like it could be an option in the future.
SmartOS's LX branded zones seem kind of cool to me, and may be of interest to you. Basically, SmartOS is an OpenSolaris variant that uses Solaris Zones for Docker images, but they've implemented a Linux syscall table so you can create Linux-"compatible" containers from a Solaris kernel (and also get full access to ZFS, etc). You can see a talk here[1] where Bryan Cantrill goes over the reasons you might want to use containers, SmartOS and LX branded zones. Warning though, he's an extremely engaging speaker, so not only are you likely to watch all of that 40-minute talk, you might feel like binging on some of his other talks, so here you go.[2]
1: https://www.youtube.com/watch?v=coFIEH3vXPw
The perplexing thing to me about this paper is that "containers" are launched merely through a particular type of clone(2) semantics (and clone(2) implements fork(2) under the hood on Linux). So the process of launching a new container should take around the same time as it takes to perform a fork. The fact that it doesn't in their analysis is surprising.
So the question for me is, what is the paper measuring, exactly, when they talk about Docker's relative performance? Are they including Docker API overhead in their instrumentation? If so, that sounds like an apples-to-oranges comparison.
To start a container, Docker does a lot of expensive setup, like preparing the network stack, volumes, logging, etc. One can trivially see how expensive that is just by timing the command line and then dropping the expensive Docker options. On my laptop:
$ time docker run --rm ubuntu true
real 0m0.641s
user 0m0.036s
sys 0m0.018s
$ time docker run --rm --net none --log-driver none --read-only ubuntu true
real 0m0.399s
user 0m0.039s
sys 0m0.018s
Compare that with the unshare utility, which one can use as a Docker replacement for simple containers, with options that create all of the new namespaces the utility supports, including user namespaces.
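A sketch of what such an invocation might look like (the exact command and timings from the original comparison are not reproduced here):

    # Create new mount, UTS, IPC, network, PID and user namespaces, then run a no-op
    time unshare --fork --pid --mount --uts --ipc --net --user --map-root-user true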
AFAICT ClearContainers only supports running your current kernel and virtualized hardware in their VM. That assumption gives them some massive speedups.
LightVM supports running arbitrary kernels, and can be really fast when running a very light weight micro kernel.
I agree with your point about "supposedly" lightweight, and it's not clear how much that weight really costs the kernel's user.
But I'm pretty reluctant to agree that "last year's Linux with a random bunch of patches" is the ultimate solution, or even the best solution available today.
The one really interesting thing that VMs give you is an abstract driver model. It might not be the absolute best or highest-performance, but if a kernel or unikernel is designed only to run in a VM environment, on a server, then the giant mass of buggy drivers just goes away. It may even be possible in some circumstances to run with just a VM network device and nothing else.
I personally think there is some fruitful work to be done around the kernel interface to allow better scheduling of I/O in highly thread-parallel environments.
But I have to imagine there are some radically improved security architectures for server-in-a-VM that don't involve a Linux userland, firewall, package manager, etc.
Does it have any ability to use USB hardware? Something I looked into in the past was using LXC to build a multi-USB-WiFi-adapter router that could connect to multiple networks which might be using the same RFC 1918 netblocks. Unfortunately, LXC had issues with "bridging" USB hardware into the container.
Quote:
"Jitsu - or Just-in-Time Summoning of Unikernels - is a prototype DNS server that can boot virtual machines on demand. When Jitsu receives a DNS query, a virtual machine is booted automatically before the query response is sent back to the client. If the virtual machine is a unikernel, it can boot in milliseconds and be available as soon as the client receives the response. To the client it will look like it was on the whole time."