How does Docker affect energy consumption? (arxiv.org)
174 points by rbanffy on May 5, 2017 | 86 comments


Doesn't this article fail to consider the bigger, practical picture, i.e. overall resource efficiency/footprint in multi-component architectures?

Containers allow us to condense workloads into a single OS runtime, while preserving isolation, where otherwise the same workloads would have spanned multiple machines or VMs, each with its own overhead and slack (unused resources).

Example: consider you need to deploy not just a single instance of Wordpress, Redis, Postgres, etc., but a complex application consisting of many of those components.

You can either choose to (a) deploy each on a different machine [incurring the overhead of an OS + unused resources]; (b) in different VMs [incurring the cost of an OS each, but being able to share a resource pool]; or (c) in containers, sharing the OS and incurring the overhead of the container daemon.

I would love to see an article that takes these points into account.


> Containers allow us to condense workloads into a single OS runtime, while preserving isolation, where otherwise the same workloads would have spanned multiple machines or VMs, each with its own overhead and slack (unused resources).

I'm starting to believe that the increased density provided by containerization is, in practice, a myth. Partly because orchestration tools bring their own overhead (compare all the proxies and filesystem/network overlays with multiple virtualized kernels), but also because containerization goes hand-in-hand with microservices, thus increasing the number of components (oftentimes for no real reason other than to be hip).

If you're running 20-30 containers on a beefy VM, you're not really condensing anything. You're just moving from running hundreds of small VMs on a single server to running far fewer, larger VMs.


> If you're running 20-30 containers on a beefy VM, you're not really condensing anything. You're just moving from running hundreds of small VMs on a single server to running far fewer, larger VMs.

Of course you are condensing. Those small VMs would have been running a full OS runtime each.

Assuming you relocate those workloads onto a single machine (1 process = 1 container), you no longer run 20-30 copies of an OS, just a single one.

And the container engine takes the place of the hypervisor.


> Of course you are condensing. Those small VMs would have been running a full OS runtime each.

That's something, but probably not as much as you think, since hypervisors can share identical pages (e.g. the Linux kernel) across guests, and the base footprint of a server Linux install is not that high as a percentage of the private data most applications use. Unless you're running a ton of unnecessary services on those guests, or have an application which uses almost no RAM, you're talking about a fairly modest percentage savings, even before you factor in all of the things you might be running for container management and other overhead on that side.

The other thing to remember is that this works both ways: containers are great for being able to upgrade one component independently, but that means you might have a dozen different versions of a common shared library, because not all of your containers use the same base image & version; and with Docker, your storage driver might actually force shared libraries to be duplicated across all processes anyway.


> since hypervisors can share identical pages (e.g. the Linux kernel)

With ASLR, I'm not sure the gains are that substantial.


The parent doesn't mean that the hypervisor will merge identical MMU->physical page mappings (like a Copy-on-Write process fork would); they mean that VM pages' underlying host virtual pages literally get periodically hashed for their current content by a background process on the dom0 and merged when they are found to have identical hashes. The underlying virtual page is then made copy-on-write.

Or, to put that another way: the host memory for most modern hypervisors consists of a heap of "new" pages, and then a generational garbage collector that moves said pages, if still alive, into a content-addressable "old" store.

As such, if two VMs each have a process that

1. calls malloc() 1000 times to get 1000 1-page buffers randomly spaced through their memory, the mappings different for each VM; and then

2. uses a fixed PRNG seed to generate random data [but the same random data] to fill those pages;

then those two processes' pages will still get collapsed together for a 50% savings.
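For the curious, the host-side mechanics on Linux are KSM (kernel same-page merging): anonymous memory only participates once it has been madvise()d as MADV_MERGEABLE, which is what QEMU/KVM does for guest RAM. A minimal sketch, assuming a kernel with CONFIG_KSM and ksmd enabled (echo 1 > /sys/kernel/mm/ksm/run):

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Sketch: make 1000 anonymous pages eligible for KSM merging. */
    int main(void) {
        long page = sysconf(_SC_PAGESIZE);
        size_t len = 1000 * (size_t)page;

        char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED) { perror("mmap"); return 1; }

        /* Deterministic fill: two processes running this produce
         * byte-identical pages, like the fixed-PRNG example above. */
        srand(42);
        for (size_t i = 0; i < len; i++) buf[i] = (char)rand();

        /* Tell ksmd these pages may be hashed and merged copy-on-write. */
        if (madvise(buf, len, MADV_MERGEABLE) != 0) perror("madvise");

        pause();  /* stay alive; watch /sys/kernel/mm/ksm/pages_sharing grow */
        return 0;
    }

Run two instances side by side and pages_sharing climbs as ksmd finds the identical content.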


I'm not sure that e.g. OpenVZ or LXC would have that much larger a footprint in RAM, even if they do have a full OS runtime. A few more daemons running, that would be it.

Everything else is mapped onto the single Linux kernel, and many pages are shared as a result; I think that even libs can be shared across containers, so if you had 2 identical versions of glibc in 2 containers, only 1 would be loaded and used.


Doesn't LXC also mean containers? Like, the C literally stands for "Container."


LXC (with LXD) or OpenVZ containers typically ship a full OS in the container. Docker is different in that it typically has only a few processes per container.


"Full OS" is confusing here. Let's be explicit:

1. "application containers" are effectively a single process [though that can fork more] with some kernel process-struct fields set to nonzero values, indicating that the kernel should present this process a different view of its environment.

2. "virtual machines" are the processor providing a separate virtualized view of the CPU, on which is then booted another virtualized kernel, which brings up with it virtualized OS services and eventually an app.

3. Between them, "OS containers" are a hybrid: they start up all the userland virtualized OS services that a VM does, but they do so on top of a kernel that's not actually a fresh, separate kernel, but rather one that has been told (through setting tons of containerization process flags) to present to this group of processes a view of the world in which it looks like a fresh kernel in a newly-started VM.

"OS containers" are basically a raw optimization over VMs by asking one kernel to pretend to be multiple kernels, and to manage one pool of memory instead of having multiple pools of memory. Anything you can do with raw VMs, you should (in theory, given good inter-container isolation+quota logic) be able to do with OS containers as well.


Containers are an abstraction that exists using cgroups and namespaces for isolation. They use the host's kernel; it's not virtualized. Containers are only limited by the capabilities of namespaces and cgroups, unlike VMs.

You might be dismissing microservices too quickly. They do have overhead, but so does any layer of abstraction; the benefit, though, is clear separation of responsibilities between services, and resiliency (swarms, clusters, etc.). Both can be achieved with VMs, but VMs weren't built with these goals in mind.


Using VMs to isolate single processes is like owning multiple toasters, and buying a different house to plug each toaster in.


Nah, that is separate physical machines. For VMs it's more like a multitenant toaster colo, with every toaster in its own asbestos cage, but sharing power and network^Wbread. For containers, it's like putting them all in one house but putting a fuse and RCD on every toaster. A traditional server is building a custom house each time and the toasters are all plugged into the same socket.

I'm not sure the toaster analogy will gain mass acceptance.


What if you're hosting toasters owned by people who might be rude teenagers who want to set any house with a rival toaster in it on fire?

When Docker can safely protect a Minecraft server in one container from a local DoS attack coming from a bot running in a sibling container, I'll reconsider using VMs. :P


Resiliency


Well, at least for our company we definitely see the gains, and we absolutely don't see it as a myth. Previous project, without Docker, to support all of our dev and QA environments: 9 VMs that we paid for. Most recent project, with Docker: 2 VMs we pay for, which include all dev and QA environments, with natively installed Nginx to hide the port differences.


It really does seem like we can't win. Try to get more control in one place, and things sprawl out of control somewhere else.

Containers bundle up libs and binaries in a single package, only for an "app" to become reliant on a zoo of containers, each doing one little part of the whole.

Makes one wonder if the stack is made of rabbits rather than turtles...


But you can argue the overheads are birthing pains.


> Containers allow us to condense workloads into a single OS runtime, while preserving isolation, where otherwise the same workloads would have spanned multiple machines or VMs, each with its own overhead and slack (unused resources).

Ultimately there's no reason why containerization can't be pushed right down to the language level. Consider .NET's AppDomains, or even further, a capability-secure programming language which isolates at the object level with zero overhead over ordinary languages.


> no reason why containerization can't be pushed right down to the language level

I started my reply by listing a plethora of reasons why this wouldn't work. (Though, for one, it can work: it does work, right now, in Erlang.) But they all came down to that it seems like you're missing some of the problems that containerization solves. You can wrap up almost any service -- regardless of language or versions or runtimes or how it interacts with the filesystem or what versions of libraries it depends on or what global configs it expects or anything -- and ship it as a self-contained normalized service that can run right alongside any number of other self-contained normalized services that require their own global configs and libraries etc., even if they're incompatible, and whoever is deploying them doesn't even need to care.

Everyone has their own language and toolset that they're comfortable with and productive in. That will never change. Containerization abstracts over all of them and normalizes their deployment. You can never get that with a solution at the language level.


> Though, for one, it can work: it does work, right now, in Erlang.

Eh? Erlang has no equivalent to cgroups/namespaces, or even an equivalent of non-UID-0 code execution on its VM. There is in fact no isolation mechanism in the Erlang VM; all code is "privileged." Untrusted multitenant code execution is a pipe-dream for now, unless you graft on another sandbox inside Erlang, ala CouchDB's V8 C-port [and more recently luerl] sandboxes.

(I've been very much considering contributing code for "non-privileged Erlang processes" and "Erlang process namespaces"—adding things like "namespace outboxes" that will crash their own virtual nodes rather than flood peers—but it's not there right now.)


Well, no, it doesn't have privilege isolation. But it does have isolation from a fault tolerance perspective which I admit is only a portion of the purpose of isolation.

But if you're deliberately running malicious code even cgroups/namespaces won't save you from some attacks. Timing and cache attacks can be done without breaking out of the jail.


Not deliberately running malicious code, no; but deliberately running user-supplied code, yes. A properly isolated container system (whether at the OS or the runtime level) allows one system to serve as host for the programmatic equivalent of members of two rival gangs (mutually-untrustworthy processes), without anyone "getting shot."


Separating "malicious code" from "user-supplied code" is a distinction without a difference.

I would say one member having their private keys stolen[1] is a "fatal shot".

[1]: https://news.ycombinator.com/item?id=11891579


For many classes of software, sure. But there are plenty of reasons to isolate the entire OS. And Linux containers (not docker per se) are very inexpensive. Being able to compose using off the shelf packages and binaries and use userspace effectively provides a wonderful space to build solutions. I can use any language I want, any tools I want.


d) Deploy them directly to the same bare-metal non-virtual machine, without a VM or container.

Yes, you can get cross-app deployment conflicts (packaging, etc.), and it limits your cloud deployment options, but it is definitely another option, and it has lower overhead than any of the first three.


That would defeat a major goal of isolation: security.

It also creates fragility. What happens if a buggy process ramps up to 100% CPU? It affects all the others.

Although to solve these issues you could use cgroups and namespaces... Aaaand we're back to containers again.
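For reference, the cgroups half of that needs nothing Docker-specific. A rough sketch against the v1 cpu controller, assuming it's mounted at /sys/fs/cgroup/cpu (the "capped" group name is just an example); this is essentially what container runtimes do under the hood:

    #include <errno.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Sketch: cap the current process at half a core via cgroup v1.
     * Run as root. */
    static int write_file(const char *path, const char *val) {
        FILE *f = fopen(path, "w");
        if (!f) { perror(path); return -1; }
        fprintf(f, "%s", val);
        return fclose(f);
    }

    int main(void) {
        /* mkdir creates the cgroup; the kernel populates its control files. */
        if (mkdir("/sys/fs/cgroup/cpu/capped", 0755) && errno != EEXIST)
            perror("mkdir");

        /* 50ms of CPU per 100ms period = at most 50% of one core. */
        write_file("/sys/fs/cgroup/cpu/capped/cpu.cfs_period_us", "100000");
        write_file("/sys/fs/cgroup/cpu/capped/cpu.cfs_quota_us", "50000");

        /* Move ourselves (and future children) into the cgroup. */
        char pid[32];
        snprintf(pid, sizeof pid, "%d", (int)getpid());
        write_file("/sys/fs/cgroup/cpu/capped/tasks", pid);

        for (;;) { }  /* spin: top shows this pinned at ~50%, not 100% */
    }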


Agreed on both points. Either I missed your note about isolation on the first read or you edited.


On packaging, can't this be solved by static linking or something like Nix?


Honestly, I'd very much like to see more deep studies into container overhead - not just from a kernel standpoint, but actual tit-for-tat measures for storage and networking overhead that spanned the different filesystem options (I know that aufs is old hat, but I've used overlay and btrfs too) and, probably most importantly, the networking overhead.

I've had slowdowns on the order of 30% in terms of requests per second on some of my stuff inside Swarm as compared to running processes outside containers (arguably with four or five moving parts, and entirely anecdotal, but still enough to give me pause), and I'd really like to understand how to shave this yak.


It really depends on what is being used. There are studies showing the overhead for containers is minimal, CPU-wise.

However, you might see docker-proxy eating a lot of CPU due to increased traffic. Or dockerd adding overhead to collect statistics, manage log buffers, etc.

My point is, pure containers using basic kernel features don't seem to add a lot of overhead. It's the sugar on top that sometimes is the problem, but that will vary depending on the container runtime, the app being contained, extra features that were enabled, etc.

I've seen CouchDB perform just fine in a container when using Docker's "bridge" network and grind to a halt when using Docker's "host" network. It seemed to dislike the situation and was doing way more syscalls than usual. Just to show that the app might not like a certain environment.

Even with some overhead, it's a trade-off I'm willing to accept considering the advantages in the development workflow and managing the infrastructure. Most of what I've seen are rough edges that will get ironed out with time.


Yeah, of course. On the network overhead aspect, and as an extra data point, I'm using macvlan with great success on the Raspberry Pi (https://github.com/rcarmo/docker-plex-armhf/blob/master/Make...), but on cloud services that's pretty much impossible...

(I've also investigated the Ubuntu fan technique to lower overhead, but am looking for a more definitive solution)


> However, you might see docker-proxy eating a lot of CPU due to increased traffic.

You can say that again. I tried running a Docker container with a large range of exposed ports once (because SIP)... Each port got its own docker-proxy, and the system basically suffocated. Not a pretty sight. :)


FYI: VMware has 0% CPU overhead.

CPU instructions are executed as-is. Running on VMware, ESX, Xen, Docker, or LXD makes no difference.

It's a different story with network and storage access. The impact of containerization/virtualization there is variable (a 5 to 95% performance drop).


Usually true. That said, some instructions and hardware are emulated or virtualised. (E.g. APIC, clocks, privileged instructions)

The main overhead is in IO though. Closer to 10% with KVM virtio than 95%.


And this is why friends don't let friends run databases in containers.


Don't Virtio and devicemapper have different performance characteristics, though?


If any of them give near-native I/O, I would love to see it. I haven't tested anything like that myself yet.


Docker is great CPU-wise, unless you're running on macOS. It still likes to go nuts every now and then.


Eyeballing the graphs, the tl;dr looks like: yes, by about 10%, and sometimes just because a job takes longer.

Another thing to note is that Redis and Postgres are I/O-heavy, and many production environments (esp. in AWS) will choose not to dockerise I/O-heavy stuff.

I wish they had included a statement like that so I didn't have to stick my neck out and give an eyeball estimate, but I suppose this kind of statement is harder to defend and the data they provided was more nuanced.

With that said, power is largely divorced from cost in actual operation except at extreme scale.

This is because purchasable and billable units of compute are usually not utilised to 100% capacity, both in cloud and bare-metal situations. Another way to say this is: most people essentially prepay for more "power budget" than they actually use.

Since a primary use of docker (esp via kube) is workload consolidation, it's hard to know what real impact this has on the world.


It would be awesome if the comparisons would include tests of Linux in a VM. Does anyone have such data? (bare Linux vs. Docker vs. VM)


I did a test of "bare-metal Linux" vs. "containers on bare-metal Linux" for our product. In this case it is just 2 processes: a "search component" and an "analytics and logging component". Under heavy load, the search component uses a lot of disk reads, CPU and network, while the logging component uses a lot of disk writes.

The comparison was done on

1) Ubuntu Server 16.04, with both processes running as they usually do (search with higher priority)

2) CoreOS, with both processes each running in a separate rkt container (search with higher priority)

I saw no change in CPU / Network / Disk access metrics and my throughput remained the same.

Please note though, in my case I do not have a huge number of microservices, as is the general usage. Also, I use host networking. I also had no need for orchestration services like Kubernetes, Swarm, etc.

TL;DR: No change between running the product in container vs. no-container mode, with host networking, minimal containers and no orchestration.


Phoronix used to run some. Typically below 10% for disk-heavy loads when KVM and virtio are used, a bit more if not. I have no data on network-heavy loads.

Generally containers vs VMs is a wash... except when considering security.

There used to be some issues with power saving in Xen that increased power use when idle, but those were fixed around 4.4 or so.


I would be interested in seeing VM hypervisors (Xen, QEMU etc) and also power consumption for rkt.


I wonder what the numbers look like for microservices, or the modern "burn down and rebuild everything" deployment methods.

Fleet management tools easily create hundreds or thousands of new virtual machines and run them through a complex npm bootstrap, when all you wanted to do was edit a file in /etc and restart node.


Fleet management tools usually guide you not to have any files in /etc, instead injecting the configuration from outside the container, often stored in Kafka or equivalent. So you shouldn't have to recreate the container image, just stop the containers and re-launch.


I really hope you are not running npm commands during container start rather than at build time?


I would be interested in the overhead of systemd. Amazon decided to drop systemd in favor of sysv/upstart init. I guess this was based on energy consumption, but I could not find any info online.


Probably more likely due to the cost of breaking compatibility. It's probably the most popular rolling release distribution out there, and I imagine a large part of that is because they'll do anything they can to avoid breaking compatibility.


I am not sure. In theory it is easy to convert init.d scripts to systemd services programmatically. I see energy/CPU consumption as a much bigger problem with it. We did some performance testing, and systemd was using way more CPU time than sysv/upstart under similar circumstances. I guess the difference is huge when you apply it to 2M computers (this is how many Amazon used to have):

https://www.bloomberg.com/news/2014-11-14/5-numbers-that-ill...


How did you compare the two? Systemd replaces more components than sysv alone, like rsyslog; was that taken into account?


Does it matter? If one were to find out that it uses more energy because of those components, would that change anything regarding the decision of using it or not because of that higher energy usage?


It matters that you're measuring equivalent things. If systemd replaces three components, you need to compare its energy consumption with those three components, not just with sysv/upstart.


Not OP, but if you are measuring the complete system with similar functionality then the comparison would be fair.


Exactly, the sum of what I need (init + logging + ntp) should be close to using just systemd.


The article mentions that most of the increased energy consumption was due to the performance of I/O system calls. This was a bit surprising to me, since Docker shouldn't have much of an impact on I/O system calls unless you're writing to the container's filesystem, which is copy-on-write. The only reason you should be writing to this filesystem is if you want to incorporate the data you're using into a later Docker image. For high-I/O tests like the ones they're doing, you should use volumes. Unfortunately, I didn't see the actual `docker run` invocations they used to run the tests, so I can't know for sure if this was what they were doing. It's just a suspicion.
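If anyone wants to check this themselves, here's a crude sketch of the kind of write() microbenchmark involved (paths and sizes are just examples): run it once against a path on the container's copy-on-write filesystem and once against a bind-mounted volume, and compare:

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>

    /* Sketch: time ~400 MB of 4 KiB write()s to the given path, e.g.
     * /tmp/bench (storage driver) vs. /data/bench (bind-mounted volume). */
    int main(int argc, char **argv) {
        const char *path = argc > 1 ? argv[1] : "bench.dat";
        char buf[4096];
        memset(buf, 'x', sizeof buf);

        int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < 100000; i++)
            if (write(fd, buf, sizeof buf) < 0) { perror("write"); return 1; }
        fsync(fd);  /* flush so we measure real I/O, not just page cache */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("%.2f MB/s\n", 100000 * 4096 / secs / 1e6);
        close(fd);
        return 0;
    }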


Wouldn't the copy-on-write nature of the filesystem mean that reads are going to be more heavily impacted?


> We compare the energy consumption of various scenarios run on bare-metal Linux—that is, the applications are running on one kernel, without any virtualization at all—in contrast to Docker-managed containers, using “off-the-shelf” Docker images

Are Docker containers at least more power-efficient than virtual machines?


This surprised me:

> "Results: In all cases, there was a statistically significant (t-test and Wilcoxon p<0.05) increase in energy consumption when running tests in Docker, mostly due to the performance of I/O system calls."


I would guess that the process change that usually comes with Docker has more of an impact.

Since most people adopt the "ephemeral" notion for containers, there's probably a much higher rate of wiping and recreating bits, especially during dev and test.

I assume, though, that the new process was designed to create efficiency in some other area that is more significant than any increase in energy consumption.


Running multiple containers on a single core or hyper-thread (which seems to be the primary way containers are used) will obviously involve a lot of non-cooperative context switching, and will therefore have a very significant overhead compared to a solution that takes advantage of cooperative multitasking and cooperative context switching.


Note The Register's take on this:

http://www.theregister.co.uk/2017/05/05/docker_docks_wallets...

which I posted separately a little earlier.


They just say that "the energy trade-off between containerization and enabling TLS/SSL is comparable for PostgreSQL" and quote the lead on this research, Abram Hindle: "If your expenses are people, then Docker is probably worth the hit".


It would have been more useful if the paper had compared Docker vs. virtual machines instead of bare metal. It's pretty much intuitive that running Docker on bare metal might incur a slight overhead in terms of power consumption.


It would be more interesting to see an "apples to apples" comparison such as "Docker vs Mesos Containerizer vs CRI-O" or "Xen vs QEMU vs KVM". But I'm glad to see this level of analysis!


I made no effort to read the paper and went straight for the pretty pictures. The basic idea seems to be that you can save a few percentage points on your energy usage by running directly on Linux without Docker.


Not only that, but Docker's "containers are cheap" mentality also means that there are more calls being made in the first place, and those are being made less efficiently.


The whole idea of Docker is to make it possible to better saturate single machines. How is that not more energy efficient than solutions before Docker?


Conclusion: 2 watts higher power consumption with Docker.


This is just a paper presented for the sake of a paper presentation, like the myriad of papers presented. Nothing to see here; we should move on.


What do you mean?


It's common sense to understand that when you run an extra application (Docker) on one machine and compare it with another machine which doesn't have it, everything else remaining the same, the machine which runs the extra application consumes more energy. You don't need a paper or a Watts Up meter to prove that. That is the reason I said it's a paper presented for the sake of it and nothing more.


Well, common sense won't tell you the amount of extra energy used, or which specific aspects of the extra application are particularly responsible for it, like this paper does.


Why not measure just processor load?


Power consumption is more complex than processor load: you have to think about how P-state (cpufreq) and C-state (cpuidle) selection is affected by the workload, and how the scheduler distributes load across cores.


There are other factors in power consumption: the type of instructions used (e.g. cpuburn generates more heat than regular 100% CPU usage), memory accesses, cache hits/misses, ...
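(If you do want energy rather than load, recent Intel chips expose cumulative energy counters via RAPL, so a first approximation doesn't even need a Watts Up meter. A rough sketch, assuming the powercap sysfs node /sys/class/powercap/intel-rapl:0 exists — i.e. a Sandy Bridge-or-later CPU with the intel_rapl driver loaded; root may be required:)

    #include <stdio.h>
    #include <unistd.h>

    /* Sketch: estimate package power from the cumulative RAPL
     * energy counter (microjoules). Ignores counter wraparound. */
    static long long read_uj(void) {
        FILE *f = fopen("/sys/class/powercap/intel-rapl:0/energy_uj", "r");
        long long uj = -1;
        if (f) { fscanf(f, "%lld", &uj); fclose(f); }
        return uj;
    }

    int main(void) {
        long long before = read_uj();
        sleep(1);                  /* run your workload here instead */
        long long after = read_uj();
        if (before < 0 || after < 0) { fprintf(stderr, "no RAPL?\n"); return 1; }
        /* delta in microjoules over 1 second = watts */
        printf("package power: %.2f W\n", (after - before) / 1e6);
        return 0;
    }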


While an interesting observation and research area, I'd be curious how much of these effects are offset by the better utilization that containers enable. I'd bet that the reduction in idling server capacity much more than equalizes these effects. However, this would probably be hard to measure.


Would be interesting to see a comparison running WordPress, Redis, Postgres, etc. on Linux: bare metal vs. VirtualBox vs. VMware vs. KVM (QEMU) vs. Xen vs. Docker, etc., with CPU, memory, I/O r/w and energy consumption (all measured from the host system) on Intel server hardware with VT enabled. (Optionally also on AMD, ARM and POWER server hardware.)


Who'd have thought that an additional layer of abstraction would have a small impact on performance. Nothing else in computing works that way.


Great, you've made a hypothesis using your intuition. It seems like it's probably true. The science part is then checking that it actually is, rather than just assuming, which is what this paper does.


No hypothesis or intuition at play. I've read some of the many articles covering the performance impact of using Docker on various metrics. Useful metrics, unlike an electric bill.


Huh? Electric bill is the #1 useful metric for any workload at scale. If you're small, maybe human administration costs dominate that, but otherwise it's virtually guaranteed to be the primary operating cost.


I guess so, but if it matters that means you have hardware and I hate having to manage hardware. So I guess that colors my opinion on the matter.


Your criticism would make sense if the title of the article was "Does Docker affect energy consumption?".

However, the article is titled "How does Docker affect energy consumption?"

It is about an experiment which aims to quantify the impact on performance, which is quite interesting.


Sort of related: MongoDB seems to use a fair bit of energy as well on Ubuntu Linux.


That's not related at all.





