date 2014-07-14
Changing the base layer itself at a minimum requires people to upgrade, and now you don't have that advantage of the preexisting audience anymore. If your improvement requires a breaking change, then you're in a real pickle. So people will stack on more and more until it becomes more or less unbearable.
However, a minor improvement or bugfix should be done in the component that is responsible for it. Creating another layer instead of fixing the problem is just creating more problems.
But, unless you are willing to abandon Linux/Unix, this is kinda where you are left.
Common Lisp is my go-to example of this. It's possible to extend the language to support entirely new paradigms without breaking it, because of the power of the macro system.
You can't take an AWS VM and fire it up on DigitalOcean without trial by fire. VMs should be portable, but they're not because cloud providers don't want them to be.
For a while everyone was locked into AWS, but now that there are other options some companies want to hedge their bets, or even run parts of their workload locally.
Since the VM space is all proprietary now people are moving one level up and building a consistent environment above it. I fully expect docker to become proprietary in a couple years and the cycle will begin again.
You can't?
I have never used DO, but I have deployed Linux and FreeBSD systems on EC2 and moving them to bare metal was just a tar command away ... you can even pipeline 'dd' over ssh if you want to be fancy ...
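For anyone who hasn't seen the tar-over-a-pipe trick, here is a rough sketch of it. `copy_tree` is a made-up helper name, and the host and device names in the comments are placeholders, not anything from the comment above. The ssh variants are left as comments so nothing touches a remote machine.

```shell
#!/usr/bin/env bash
# Minimal sketch of copying a filesystem tree through a tar pipe.
# copy_tree is a hypothetical helper, not part of any tool mentioned here.
copy_tree() {
    local src="$1" dest="$2"
    mkdir -p "$dest" || return 1
    # The first tar writes an archive of $src to stdout; the second unpacks
    # it into $dest, with -p keeping the recorded permissions.
    tar -C "$src" -cf - . | tar -C "$dest" -xpf -
}

# Across machines the same pipe goes through ssh (newhost and the paths
# are placeholders):
#   tar -C / --one-file-system -cpf - . | ssh root@newhost 'tar -C /mnt/target -xpf -'
# The "fancy" dd variant streams a whole block device instead:
#   dd if=/dev/xvda | ssh root@newhost 'dd of=/dev/sda'
```

Whether the result actually boots on the new machine still depends on drivers, bootloader and network config, which this does not address.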
Modern package management systems like APT spend a lot of effort installing and removing files, and they don't do it completely; any file created by a program after it is installed will not be tracked.
You could accomplish the same thing in other ways (as Apple's sandboxing tech does), of course.
Well, there /is/ another way to do it.
STATIC LINK ALL THE THINGS
Which would work if licenses and copyrights didn't exist.
We're at the point in the hype cycle where it starts getting fashionable to dismiss that as overkill, but the reality for most of us out there is that most software is way more complicated than a single executable and containers make it easier to deploy complicated software.
Managing internal dependencies (like libraries) is another concern entirely. But containers are good for that, too.
I don't think it would.
Dynamic linking allows a library to be patched once and have the patch apply to all the programs using it. If every program was statically linked, you would have to update each one individually.
Not to mention the waste of space.
I'm guessing much of that is moot these days, but IMHO it's still something to aim for.
The benefit of that goes away with containers anyway, you don't share libraries, every instance gets its own install.
On the other hand, deploying stuff is dead easy now. You just tell the hoster "deploy this Docker package, expose port X as HTTP, and put an SSL offloader in the front" and that's it, no more "we need <insert long list> to deploy this" or countless hours spent with the hoster on how to get weird-framework-x to behave correctly.
If it's actually useful we'll find someplace for it to live (or just in containers if that's all that's needed). If not, finding that out was cheap.
I just sometimes wonder if the cart hasn't gotten in front of the horse on the whole container front. The fact that the term "bare iron" has been hijacked to mean "not under virtualization" made me start to think that something is seriously fucked up.
I don't like Docker, but I think it does some differential compression with the layers when you are modifying the image. So you don't have to re-do this install from scratch.
You may also run into issues with user IDs and various system config files in the chroot. Configuring the service with flags, env vars, and config files is a bit of a pain.
Docker is essentially a glorified chroot... I've essentially tried to rebuild it, and it is unfortunately a lot of work.
Heads up, you're talking about a different thing. You want to run these things
as normal operations, but gunnihinn was talking about deploying an application
to check if it is of any value. Chroot is just enough to make a mess as the
application's developer instructed in INSTALL.txt without the need to worry
about cleaning up afterwards.
And by the way, you seem to be confusing containers and Docker.
And I'm not confusing containers and Docker, I'm just speaking a bit imprecisely. In my experience, conversations about "containers" are rarely about raw containers, but rather some specific containerization scheme and tools (e.g. docker). I suspect everyone in this entire subthread means "docker containers" when they say "containers."
Yeah, some newbie forgot to mount-bind a necessary directory like /proc or
/dev, didn't provide sensible /etc/resolv.conf, or messed up host's and
chroot's paths, either in request or in configuration. Nothing that would
render chroot unviable. Is this what you meant?
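For reference, the setup steps listed there (bind /dev, mount /proc, provide a resolv.conf) come down to something like the sketch below. It assumes Linux, root privileges and an already-unpacked root filesystem; `run` and `enter_chroot` are hypothetical helper names, and `DRY_RUN=1` only prints the commands instead of executing them.

```shell
#!/usr/bin/env bash
# Sketch of the chroot preparation steps named above. Assumes Linux, root
# privileges and an unpacked root filesystem at the path given as $1.
# run and enter_chroot are hypothetical helper names.

# With DRY_RUN=1 the commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

enter_chroot() {
    local root="$1"
    run mount -t proc proc "$root/proc" || return 1              # /proc for ps and friends
    run mount --bind /dev "$root/dev" || return 1                # device nodes
    run cp /etc/resolv.conf "$root/etc/resolv.conf" || return 1  # DNS inside the chroot
    run chroot "$root" /bin/sh
    # Clean up the mounts once the shell exits.
    run umount "$root/dev"
    run umount "$root/proc"
}
```

Something like `DRY_RUN=1 enter_chroot /srv/app` shows the sequence without touching the system (the path is just an example).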
There are several capabilities in that list to address your problem. Since your question is concerned about memory you can search for "memory" in that page and note the exact capabilities you will need.
OpenBSD has even more features (https://en.wikipedia.org/wiki/OpenBSD_security_features) alongside the capability model.
My gripe is that instead of learning how to properly use their tools and platforms people have started looking for golden hammers like docker and now we have a mess like RancherOS. If someone can explain to me how isolating the system services in docker containers (which by the way are not isolated since they are privileged) does anything above and beyond what capabilities and the security features in OpenBSD provide then I'll concede the point. My guess is there are no advantages and people are jumping on bandwagons they know nothing about and since the previous generation of tools was not utilized to its full extent neither will all the new docker hotness. People are hammering square pegs into round holes.
My point was more about the "developer UX" of needing to isolate things that way. Containers have the semantics of isolating a group of processes, but not performing any internal isolation between the processes in the container. This is almost always what you want: you want "your app" to be able to have multiple processes, and to not have any security boundary between the parts of "your app", just between "your app" and "other apps" or the OS. In other words, you want to have a single "process" (from the OS security subsystem's perspective) whose threads happen to be composed of multiple PIDs, multiple binaries, and multiple virtual-memory mappings. You want to fork(2) + exec(2) without creating a new security context in the process.
Sandboxing would make perfect sense if single processes were always the granularity that "apps" existed at. Sometimes they are. Sometimes they're not, and people do complex things with IPC-capability-objects. And sometimes, when they're not, that fact pushes people toward avoiding multi-process architectures in favor of monolithic apps that reimplement functionality that already exists in some other program in themselves, in order to bring that functionality into their platform/runtime so it can live in their process.
Containers let people avoid this decision, by just applying things like capabilities at the container level, rather than at the process level.
> If someone can explain to me how isolating the system services in docker containers (which by the way are not isolated since they are privileged) does anything above and beyond what capabilities and the security features in OpenBSD provide then I'll concede the point.
Completely apart from the above, my understanding of Rancher is that it's the Docker part of "Docker container", not the container part, that provides the benefit there. Docker is a packaging and service-management system; that its packages use containers is frequently beside the point. Rancher's system services are Docker images (i.e. Docker "packages"), and so you use Docker tooling to create, distribute, manage and upgrade them. If your own application on such a system is managed through Docker, this provides a neat solution to unifying your operations—you just do everything through the docker(1) command.
That is a complex mix of services running on a machine, some sharing the same flavor of VM, and you're allocating fractions of the total available resource capability to different components. If you cannot make hard allocations that are enforced by the kernel you cannot accurately reason about how the system will perform under load, and if one of your missions is to get your total system utilization to be quite high, you have to run things at the edge.
Agreed. What I find so odd is that this problem was simply and elegantly solved in BSD with 'jail' and everyone went about their business.
I do not understand why (what appears to be) the linux answer to 'jail' is so complicated and fraught and the subject of so much discussion.
I am not sure that containers and their build scripts represent the $huge_profit_potential that people think they do ...
I think there's interesting ideas of environments in other operating systems, but I'm not seeing how you'd make things better by flattening containers into the OS, per se.
Making existing programs compatible with such a paradigm probably wouldn't be any more work than e.g. adding SELinux/AppArmor support.
Now of course the correct way to solve this would be (a) to make both programs use the same version of python-somelib, (b) make upgrades happen atomically, (c) for Python modules to actually have some API backwards compatibility. But in the absence of doing the right thing you can use a container to effectively static-link everything instead. And worry about the security/bloat/management another time.
The main advantage of docker, as far as my understanding goes, is more of a prebuilt system configuration thing. Need a database? Load the prebuilt PostgreSQL image onto the respective machine.
One thing I think should get more use in general is that Dockerfiles are essentially completely reproducible scripts[1]. Too many companies I've seen still use Word documents full of manual steps one can easily get wrong for all their machine setups (especially in the Windows world). If you want to test something quickly, you're bogged down for a day.
Now it's all docker rather than packages, ansible (which leaves no trace of what it's doing on the target machine) rather than a for I in `cat hosts`. Fine, but where's the benefit?
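For readers who never did it that way, the hosts-file loop mentioned above looks roughly like this. It is a sketch only: `run_on_hosts` is a hypothetical name, and the hosts file is assumed to hold one hostname per line. `DRY_RUN=1` prints what would be run instead of connecting.

```shell
#!/usr/bin/env bash
# Sketch of the classic "loop over a hosts file" pattern.
# run_on_hosts is a hypothetical helper; the hosts file is assumed to
# contain one hostname per line.
run_on_hosts() {
    local hosts_file="$1"
    shift
    local host
    while IFS= read -r host; do
        # Skip blank lines and comment lines.
        case "$host" in
            ''|'#'*) continue ;;
        esac
        if [ "${DRY_RUN:-0}" = 1 ]; then
            printf 'ssh %s %s\n' "$host" "$*"
        else
            # -n keeps ssh from swallowing the rest of the hosts file on stdin.
            ssh -n "$host" "$@"
        fi
    done < "$hosts_file"
}
```

Reading the file with `while read` rather than expanding `cat hosts` avoids word-splitting surprises, at the cost of needing `ssh -n`.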
I can see, however, a benefit in encapsulating different services in different containers, as this potentially gives you some control over them (available disk space, network usage etc.) * . On top of that, I imagine starting out on a "machine agnostic" approach can be rather useful if you have to change your network landscape further down the line: If your database already is configured as if it were running on its own server, there's certainly a bunch of unintended coupling effects you can avoid.
That said, I can't see Docker being the silver bullet it gets hyped up to be sometimes. But that goes for most new and shiny things in the tech space...
* And yes, that's already feasible without containers. Docker's approach to this seems to provide a way to do it in a much more automated way than most alternatives though.
I think the main advantage is that it standardizes the interface around the application image/container. This allows powerful abstractions to be written once rather than requiring a bespoke implementation for each way that the application is structured. Imagine writing the equivalent of Kubernetes around some hacked-together allows-multiple-versions-of-a-dependency solution. It would be a nightmare. But because the Docker image/container interfaces are codified, you can build powerful logic around those boundaries without needing to understand what's inside those images/containers. Dynamically shifting load, recovering from failures, automatic deployments and scaling are all much easier when you don't have to worry about what language the application is written in or how the application is structured.
The point isn't really to "run more Docker". It's to eliminate as much operating system overhead as possible, so that nearly every CPU cycle and byte of memory usage is dedicated to your containers.
For example why run a network supervision daemon if dockerd or equivalent handle all the important complex pieces of networking via container orchestration? Why have a local package manager, or system port mapper.
Personally I prefer running SmartOS and Triton containers. Their system seems much more stable than any of the Linux containers and/or systemd setups I've tried. It sticks a bit more to traditional unix design which makes sense to me. Items like the caching layer for containers build on ZFS snapshots, a well tested file system layer, rather than ad hoc userland tools. Triton also runs all of the orchestration layers in separate zones (containers) like RancherOS is trying to build. But each component is a simple(ish) service, it's easy to `zfs list` and check on a container's file system or fix it or backup, etc. Same with SVC or VM machine management which both have small simple tools that do one thing pretty well.
To that note, docker has been moving towards breaking out and using smaller daemons, hasn't it? If that continues it might turn out more modular in the end wherein RancherOS would end up being more modular than systemd Linux setups. Imho, that'd be great.
By all means continue to run RancherOS and let me know how that goes when you're managing a few hundred to maybe upwards of a thousand VMs and then layering a container orchestration system with the underlying VMs coming and going on an on-demand basis. I remember thinking "I really wish I had more of this docker stuff in the OS itself. Because dealing with all the caching, volume mounting, and instability in userspace is so much fun".
I'll await your report because clearly my experience with these systems and all the ways they fail is too combative for your taste. There are a few things they don't tell you on the brochure when you're drinking the kool-aid.
I might give it a crack based on that. If you are hell bent on running Docker then an equivalent of the Ubuntu minimal install seems a good way to start.
If this idea has similar pragmatic advantages over the would-be best solution, then I'm game.
yep, when you want a distributed filesystem building it on top of a good and already working local filesystem is a pretty robust and cheap approach.
>smell like a system on top of a system to fix something that could be fixed in the system.
container layer is basically a distributed OS. Making a distributed OS by "fixing in the system" is pretty much a non-starter ... or a huge academia project.
$ cgcreate -g memory,cpu:groupname/foo
$ cgexec -g memory,cpu:groupname/foo bash
It's the bare basics that libvirt, Docker et al. are based on anyway. So if you want to run just one process per "container" it seems rather logical to keep it simple and use the cgroups commands directly. (Similarly on Windows, just using Sandboxie is simple enough. Or do it like Android and execute every app/process as a different user.)
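The cgcreate/cgexec pair above only creates the group and puts a shell in it. Actually capping anything takes a `cgset` step in between. A rough sketch, assuming the libcgroup tools, root privileges and cgroup v1 controller files (`memory.limit_in_bytes`, `cpu.shares`); `run` and `limited_shell` are hypothetical helper names, and `DRY_RUN=1` only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: one process in its own cgroup using the libcgroup tools.
# Assumes cgroup v1 controller files and root privileges.
# run and limited_shell are hypothetical helper names.

# With DRY_RUN=1 the commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

limited_shell() {
    local group="$1" mem="$2" shares="$3"
    run cgcreate -g "memory,cpu:$group" || return 1
    # Set a memory cap and a relative CPU weight on the new group.
    run cgset -r "memory.limit_in_bytes=$mem" "$group" || return 1
    run cgset -r "cpu.shares=$shares" "$group" || return 1
    # Start a shell inside the group; its children inherit the limits.
    run cgexec -g "memory,cpu:$group" bash
}
```

On a cgroup v2 system the controller file names differ, so treat the two `cgset` lines as v1-specific.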
Each process is placed in a hierarchy so you end up with system-httpd which makes it easy to assign or limit resources based on the slice.
Red Hat covers it extensively in their performance and tuning class.
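As a rough idea of what limiting resources per unit or slice looks like in practice, here is a sketch. It assumes a systemd host on the unified (v2) cgroup hierarchy, where the `MemoryMax=` property applies. `run`, `limit_unit` and `run_in_slice` are hypothetical helper names, and `DRY_RUN=1` only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: resource limits through systemd's cgroup hierarchy.
# Assumes a systemd host with the unified (v2) cgroup hierarchy.
# run, limit_unit and run_in_slice are hypothetical helper names.

# With DRY_RUN=1 the commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Cap an existing unit's memory and CPU time.
limit_unit() {
    local unit="$1" mem="$2" cpu="$3"
    run systemctl set-property "$unit" "MemoryMax=$mem" "CPUQuota=$cpu"
}

# Start a one-off command inside a named slice with a memory cap.
run_in_slice() {
    local slice="$1" mem="$2"
    shift 2
    run systemd-run --slice="$slice" -p "MemoryMax=$mem" "$@"
}
```

For example, `limit_unit httpd.service 512M 50%` (the unit name is only an illustration), and `systemd-cgls` prints the resulting hierarchy.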
Sadly, "treated as an afterthought" is all too often still applicable, as well; this also being a disease of Linux doco that it hasn't wholly shaken off. The culture of updating the doco in lockstep when the software changes hasn't really taken a firm root, alas.
Just one example of such doco problems is a systemd issue where the doco does not tell the issue raiser that the entire basis for the issue is wrong. Users have to resort to finding commentary hidden in the source code. Raised as a documentation issue, it requests a documentation change to warn users of something that is not in fact the case at all. Ironically, the true doco issue is actually that it is deficient, and the correct doco change would be to move the commentary into the manual where users can easily see it.
Genuinely curious, how is it "infinitely better"? I'm considering potentially switching away from Docker on my production boxes to something else, but I've mainly only heard about rkt when researching.
How and why?
You're being condescending. I like Docker and if anything you make me not want to try systemd-nspawn with this attitude.
"Go on, try it" doesn't help me in any way. I believe that Docker has more attention, you might call it hype. I'd say "eyes" instead. We have here a way shown to run everything in Docker, parent link of thread (RancherOS.) That's great, I already went ahead and tried it. I'm still waiting to be convinced that I should try nspawn instead.
Without your help, I won't even know what OS Distro I can download to try it, let alone why it's better.
[1] shows me how to run systemd-nspawn by itself.
[2] shows me that it's basically just like chroot when it comes to a user experience.
When do I get to the part that's better than the entire ecosystem of schedulers and orchestration tools that has sprung up built on and around Docker? Are all of those companies wrong? (Are you trying to tell me it's all just hype and I should put everything into the hands of one competent sysadmin that manages nspawn and systemd?) I could be convinced of that, but I just don't see anyone doing that. I guess that's actually what was meant by cargo cult.
This all really just makes me want to go out and spend some more time looking at Rkt instead. We're all not even remotely convinced that this is better. Where is the mantl.io built on systemd-nspawn?
Edit: I am still going to upvote you because you went to the trouble of going through my post history.
I just read the ArchWiki page on systemd-nspawn[1] and I fail to see how it is any better by the way. It just looks way harder to use (Docker images vs packages, scripts and per-distro instructions; docker create, docker start, docker ps, docker logs vs pacstrap, systemd-nspawn, machinectl, journalctl) and honestly not very different technically. systemd-nspawn just looks like a less user-friendly Docker to me.
[1]: https://wiki.archlinux.org/index.php/Systemd-nspawn
You're seeing condescension where there is none. I'm just pointing out facts. It's okay, Docker runs on hype, and apparently so do you. But then, I can't expect Red Hat to invest into advertising for a core system component, because developers ought to be aware of it.
nspawn also offers faster startup time, better integration with cgroups and chroot jails, etc.
Well, I'm fine with journalctl and machinectl as they're part of systemd. I'm not really fine with having to install respectively arch-install-scripts, debootstrap+debian-archive-keyring, debootstrap+ubuntu-archive-keyring to run an Arch, Debian or Ubuntu container. What if I want to run something like CentOS or Alpine?
>But then, I can't expect Red Hat to invest into advertising for a core system component, because developers ought to be aware of it.
That's why Docker has the market. systemd is huge and scary, developers see it as a sysadmin only component. You cannot expect developers to know systemd without explaining it to them in a way they can understand.
>nspawn also offers faster startup time
Is Docker slow? Starting a container is usually instantaneous. Maybe the engine? For me it's managed with systemd and its weird socket binding, it's pretty fast too. Fast is good but I can't remember thinking "wow docker is slow"
>better integration with cgroups and chroot jails
How? Why do I need this better integration?
- - -
I'm convinced there are not a lot of things Docker cannot do in comparison to systemd-nspawn. On the contrary, with systemd-nspawn:
- how do I spawn a container remotely?
- how do I share my "images"? is there an easy way to bundle the app I want to isolate? something at least kinda portable between Linuxes, so no .deb/.rpm
- can I include a file to my source code and tell my users something like "run docker build, then docker run and you're good to go"?
- my sysadmins just gave me the rights to run the docker command (we configured the user namespace so that I'm not indirectly root on the host), would it be that easy for them with nspawn?
- say I want a specific dependency, redis for example. Can I do something as simple as `docker run -p6379:6379 -v/data/redis:/data --name redis redis` or would I have to manually install the redis in the nspawn?
It's about kicking the question of static vs dynamic can up the stack, as now you have something that is a hodgepodge of dynamic bindings that seems to behave like something static as long as you do not look behind the curtains. Oh, and don't mind the turtles...
More like a virus.
I think the advent of Docker Swarm probably put a crimp on development and use of Rancher (the app). To me the way forward is Docker's own clustering tools, and the ease of standing up a cluster of Atom processors at www.packet.net where they install (as an option) RancherOS is very attractive.
Like, for example, containerize Skype, so that it can't read my home. Or contain Firefox to just read `~/.mozilla` and `~/downloads`.
For binary blobs I don't trust that much, I'd really value this.
For FLOSS stuff, it still provides protection from bugs.
I want to isolate data, not libraries. Libraries are there to be shared.
It needs to build on top of what we have, otherwise adoption will never take place.
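One way this kind of data-only isolation could look with existing tooling is bubblewrap (`bwrap`). This is a sketch under assumptions, not a tested Firefox sandbox: `build_sandbox` and `SANDBOX_ARGV` are hypothetical names, and a real GUI program would most likely need more exposed (display socket, fonts, D-Bus). The function only builds the command line so it can be inspected before running.

```shell
#!/usr/bin/env bash
# Sketch: hide $HOME from a program except for two directories, using
# bubblewrap. Assumes bwrap is installed. build_sandbox and SANDBOX_ARGV
# are hypothetical names introduced for this example.
build_sandbox() {
    local home="$1"
    shift
    SANDBOX_ARGV=(
        bwrap
        --ro-bind / /            # whole system visible read-only, libraries stay shared
        --dev /dev
        --proc /proc
        --tmpfs "$home"          # empty throwaway home on top of the real one
        --bind "$home/.mozilla" "$home/.mozilla"    # writable profile
        --bind "$home/downloads" "$home/downloads"  # writable downloads
        --unshare-pid
        "$@"                     # the program to run and its arguments
    )
}
```

Usage would be along the lines of `build_sandbox "$HOME" firefox && "${SANDBOX_ARGV[@]}"`. The libraries stay shared with the host, only the data is fenced off.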
how many layers of abstraction are necessary, and why?
Obviously virtualizing serves a valuable purpose.
Making development more accessible is great.
Simplistic dev services like this mean reliance on others' infrastructure, and being bound to cloud.
Doesn't seem forward thinking.
Can you imagine if Google had decided to run their search app on Microsoft servers?
I still strongly dislike "containers". It's not worth the complexity or instability. Two thumbs way down!
Does it though? I use CoreOS without containers (for the nice auto-updates/reboots), and it works really well with just systemd services. I'm aware the branding sells it this way (esp. the marketing rebrand as Container Linux or whatever), but does it run any containers as part of the base system? I've found CoreOS with containers not very reliable, and CoreOS without containers extremely reliable.
Since I use Go on servers which has pretty much zero dependencies, what I'd really like to see is the operating system reduced to a reliable set of device drivers (apologies to Andreessen), cloud config, auto-update and a process supervisor. That's it.
Even CoreOS comes with far too much junk - lots of binary utils that I don't need, and I'd prefer a much simpler supervisor than systemd. Nothing else required, not even containers - I can isolate on the server level instead when I need multiple instances, virtual servers are cheap.
CoreOS is the closest I've seen to this goal, the containers stuff I just ignored after a few tests with docker because unless you are running hundreds of processes, the tradeoff is not worth it IMO. Docker (the company) is not trustworthy enough to own this ecosystem, and Docker (the technology) is simply not reliable enough.
The OS for servers (and maybe even desktops) should be an essential but reliable component at the bottom of the chain, instead of constantly bundling more stuff and trying to move up the stack. Unfortunately there's no money in that.
I sat down one day to try to write down what would make Linux containers/orchestration usable and good, and realized after about 20 minutes that I was describing FreeBSD jails almost to a T. The sample configuration format I theorized is very close to the real one.
However, I think that there's good reason for actual deployments of containerized systems to remain niche, as it did until the VCs started dumping hundreds of millions into the current Docker hype-cycle, and the big non-Amazons jumped on board as a mechanism to try to get an advantage over AWS.
What people really want are true VMs nearly as lightweight and efficient as containerized systems. In fact, I think many people wrongly believe that's what containerized systems are.
Like what QubesOS is trying to do?
We have a server that receives the logs from our kubernetes cluster via fluentd and parses/transforms them before shipping them out to a hosted log search backend thingy. This host has 5 Docker containers running fluentd receivers.
This works OK most of the time, but in some cases, particularly cases when the log volume is high and/or when a bug causes excessive writes to stdout/stderr (the container does have the appropriate log driver size setting configured at the Docker level), the container will cease to function. It cannot be accessed or controlled. docker-swarm will try but it cannot manipulate it. You can force kill the container in Docker, but then you can't bring the service/container back up because something doesn't get cleaned up right on Docker's insides. You have to restart the Docker daemon and then restart all of the containers with docker-swarm to get back to a good state. Due to https://github.com/moby/moby/issues/8795 , you also must manually run `conntrack -F` after restarting the Docker daemon (something that took some substantial debug/troubleshooting time to figure out).
We've had this happen on that server 3 times over the last month. That's ONE example. There are many more!
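The host-side part of that recovery can be scripted roughly as below. It assumes the Docker daemon runs as a systemd unit named `docker` (an assumption, adjust for your init system) and that the `conntrack` tool is installed. `run` and `recover_docker` are hypothetical helper names, and `DRY_RUN=1` only prints the commands. Restarting the containers afterwards is left out because it depends on how they are scheduled.

```shell
#!/usr/bin/env bash
# Sketch of the daemon-restart part of the recovery described above.
# Assumes a systemd unit called "docker" and the conntrack CLI.
# run and recover_docker are hypothetical helper names.

# With DRY_RUN=1 the commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

recover_docker() {
    # Restart the wedged daemon first.
    run systemctl restart docker || return 1
    # Then flush the connection tracking table (see the moby issue
    # linked above for why this is needed after a daemon restart).
    run conntrack -F || return 1
}
```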
Containers are a VC-fueled fad. There are huge labor/complexity costs associated and relatively small gains. You're entering a sub-world with a bunch of layers to reimplement things for a containerized world, whereas the standard solutions have existed and worked well for many years, and the only reason not to use them is that the container platform doesn't accommodate them.
And what's the benefit? You get to run every application as a 120MB Docker image? You get to pay for space in a Docker Registry? Ostensibly you can fit a lot more applications onto a single machine (and correspondingly cut the ridiculous cloud costs that many companies pay because it's too hard to hire a couple of hardware jockeys or rent a server from a local colo), but you can also do this just fine without Docker.
Google is pushing containers hard because it's part of their strategy to challenge Amazon Cloud, not because it benefits the consumer.
There is also nothing keeping you using Docker for containers. LXC also works great and it has no runtime, so you have none of the stability issues you can get with Docker. Though I must say Docker has improved a lot and I think it will stabilize and _it_ won't be an issue (not as sure about Kubernetes).
But I still don't think containers are what most people want. People need/want ultra-lightweight VMs with atomized state. NixOS looks promising but I haven't used it yet. It seems to give you a way to deterministically reason about your system without just shipping around blobs of the whole userland. You can also approximate this on other systems with a good scripting toolkit like Ansible.
NixOS does look interesting and I've considered playing with it for personal projects, but IMO it is still too fringe for use at work where you need both technical usefulness and a general consensus that it is appropriate (i.e. mindshare).
Kubernetes has the concept of an "ingress" controller because it has established itself as the sole router for all traffic in the cluster. We already have systems to route traffic and determine "ingress" behind a single point (NAT). Kubernetes also manages all addressing internally, but we have technologies for that (DHCP et al). Kubernetes requires additional configuration to specify data persistence behavior, but we have many interfaces and technologies for that.
VMs would be able to plug into the existing infrastructure instead of demanding that everything be done the Kubernetes way. It reduces complexity because it allows you to reuse your existing infrastructure, and doesn't lock you in to a superstructure of redundant configuration.
kube is very ostentatious software in this way, and it makes sense that they'd throw compatibility and pluggability to the wind, because the strategic value is not in giving an orchestration platform to AWS users, but rather to encourage people to move to a platform where Kubernetes is a "native"-style experience.
As for orchestration, people were orchestrating highly-available components before Kubernetes and its ilk. Tools like Ansible were pretty successful at doing this. I have personally yet to find executing a `kubectl` command less painful than executing an Ansible playbook over an inventory file -- the only benefit would be faster response time for the individual commands, though you'd still need a scripting layer like Ansible if you wanted to chain them to be effective.
Disclosure: I work on a cloud platform, Cloud Foundry, on behalf of Pivotal.
Nonsense. You can run your private Docker registry or if you want to support stuff like authentication and access control use Sonatype Nexus. Both open source.
> but you can also do this just fine without Docker
Not as easily. You'd need to use VMs with all their associated costs (especially if you use VMware) to provide proper isolation, and the hosting department usually will have day-long phone calls with the devs to get the software to reproducibly run, and god forbid there's an upgrade in an OS library. No problem there with Docker, as the environment to the software is complete and consistent (and if done right, immutable).
Some people did not know how to do any server management before kubernetes became a big deal, so they think kubernetes is the only way to do it. For the rest of us, I don't think there's a lot of value brought by this ecosystem.
- LinuxKit seems designed to be a piece that can be used to build a Linux distro but isn't a full distro out of the box like RancherOS
- As far as I know LinuxKit is still based on Alpine whereas RancherOS is custom and doesn't have much of a host filesystem
- LinuxKit is based on containerd and RancherOS is still based on Docker (though this is likely to change soon)
We're definitely interested in collaborating with LinuxKit since we do have similar goals. It's probably a good idea for us to write a more detailed blog post comparing the two since we've been getting this question pretty often lately.
We considered using Alpine as the base image for our system containers for a time. They're still built using Buildroot currently, though we're playing around with another project that we might use instead.
Anyway, it seems its design has inspired some people recently.
I haven't used RancherOS but CoreOS works mostly-fine. However, I would avoid using these things altogether because containerization sucks.
What people want is a mainframe where a lump of code is guaranteed to run the same every time, regardless of the machine state, and if something goes wrong with the underlying layer it self-heals (or automatically punts to a new machine, state intact).
What we have currently is a guarantee that the container will run, and once running will be terminated at an unknown time.
Mix in distributed systems (hard), atomic state handling (also hard) and scheduling (can be hard), and it's not much fun trying to be productive on anything other than a basic website.
> it seemed logical and also it would really be bad if somebody did docker rm -f $(docker ps -qa) and deleted the entire OS
or are you asking why anyone would want a 'docker-os', which has everything but the docker daemon as a container?
I saw that line, but since if you want to run docker at scale you'd have each execution node under the tight control of a scheduler, it seemed like a small edge case.
As I said, for the ignorant such as myself, why would I choose this over CoreOS?
I guess this is mostly so you don't accidentally delete OS-containers (like ntpd) when trying to delete all your containers.