Show HN: Bocker – Docker implemented in 100 lines of bash (github.com/p8952)
662 points by p8952 on July 21, 2015 | 87 comments

Just playing with this in a VM with an attached btrfs volume, a complete revelation. 96 lines! And it's actually pretty functional. This takes keeping it simple to a whole new level.

The Wheezy image I use with LXC worked well enough, the minimal alpine image not so well, apk complaining about its database.

User namespace support would be nice; then we could play with unprivileged containers.

And overlayfs would be a nifty alternative to btrfs; it's in kernel 3.18, and 4.0 adds support for multiple lower layers. But this btrfs implementation is cool too. Cgroups support will be somewhere on that list too.
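For anyone curious what the overlayfs route might look like, here is a rough sketch of assembling the mount options from a list of layers. The function names and directory names are made up for illustration, and the real mount needs root, so this only prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the overlay mount options string.
# Usage: overlay_opts UPPER WORK LOWER [LOWER...]
# Multiple lower layers are joined with colons, top-most layer first.
overlay_opts() {
  local upper=$1 work=$2
  shift 2
  local IFS=:
  printf 'lowerdir=%s,upperdir=%s,workdir=%s\n' "$*" "$upper" "$work"
}

# Print (do not run) the mount command; applying it for real requires root.
# Usage: overlay_mount_cmd MERGED UPPER WORK LOWER [LOWER...]
overlay_mount_cmd() {
  local merged=$1
  shift
  printf 'mount -t overlay overlay -o %s %s\n' "$(overlay_opts "$@")" "$merged"
}
```

Calling `overlay_mount_cmd /mnt/merged upper work layer2 layer1` prints a single mount line with `lowerdir=layer2:layer1`, which is the multiple-lower-layer form.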

Cgroups and namespaces are in the kernel. The general Linux ecosystem for networking, storage and distributed systems is already extensive. The possibilities are endless.

So now LXC, Docker, Rkt and Nspawn have Bocker for company.

Yes they do. Nice to learn about even more projects than I thought existed.

Using btrfs subvolumes as the image format, that's a nice touch. On the same road as the hypothetical systemd packaging system (not that I'm very enthusiastic about that).

The network, PID and mount namespaces are the ones unshared, plus a private /proc.
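A minimal dry-run sketch of that combination, assuming util-linux `unshare` and iproute2 `ip netns`. This is not the script's exact invocation; the function names are hypothetical, and the commands are only printed because running them needs root.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: print an unshare(1) invocation that would
# start a command in new mount and PID namespaces with a private /proc.
# --fork makes the command run as a child, so it becomes PID 1 in the
# new PID namespace; --mount-proc mounts a fresh /proc for it.
isolated_cmd() {
  printf 'unshare --fork --mount --pid --mount-proc %s\n' "$*"
}

# The network namespace here is a named one managed by ip-netns(8).
# Usage: in_netns_cmd NETNS COMMAND [ARGS...]
in_netns_cmd() {
  local ns=$1
  shift
  printf 'ip netns exec %s %s\n' "$ns" "$(isolated_cmd "$@")"
}
```

For display purposes the arguments are joined with spaces, so this is for reading, not for piping back into a shell.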

I like tools like this because they're reality checks on how the basics of Linux containers are just a few essential system calls, and particularly that they're limited.

Thank you.

One of the things I found really interesting here is how much could be done with just basic userland tools, and how old some of those tools are.

Docker was released in 2013, but support for kernel namespacing has been around since ~2007. That's quite a long time for such a great feature to go mainstream.

Yes. The basic technology has been there for a while. And the Docker source code has some eyebrow-raising parts to it.

However, I've stated this in other threads: Docker isn't about containment. It's really about the packaging system. I don't think this technology demo gets that.

If Docker is about packaging, then it's one of the worst package management systems I've ever used. I use Docker for the abstraction over namespaces and cgroups mainly, and get frustrated with the layers of disk images, bad caching system, poor security story, and the weak Dockerfile DSL.

Perhaps the parent poster was wryly indicating that he doesn't think that much of Docker. Certainly I think both of you are correct: Docker is about packaging, and it absolutely sucks at that. The only reason that that is not obvious is that Docker is piggybacking on the relatively excellent and well-developed package management of distributions like Debian and Fedora.

Debian and Fedora do packaging better than Docker, relatively speaking, but they still have major issues that have led to "solutions" like Docker, Chef/Omnibus, etc. They install packages globally, which doesn't allow for having multiple versions of the same software/library; they don't allow for unprivileged package management, so users are at the mercy of the sysadmin; there are no transactional upgrades and rollbacks for package updates gone bad; and builds aren't reproducible (Debian is doing great work to fix this, though), to name the most important issues.

I work on the GNU Guix project, which can do all of these things. Additionally, with Guix, I have access to a configuration management system that can create disk images, VMs, and (in a future release) containers (replace Chef/Puppet/Docker), a tool for quickly creating isolated dev environments without polluting the rest of the system (replaces virtualenv and Vagrant when combined with a VM/container), and more.

I'm convinced that more featureful package managers can and will solve a lot of our software deployment problems, and I'm also convinced that simply layering more tools on top of a shaky foundation isn't going to work well in the long term.

> Debian and Fedora do packaging better than Docker, relatively speaking, but they still have major issues that have led to "solutions" like Docker, Chef/Omnibus, etc.

I get what you're saying, but the way you've phrased it makes it seem like it wasn't intentional when in fact before immutable git-style packages were discovered, you were forced to choose between packaging that works well for developers/ops and packaging that works well for end users.

Debian is the best example we have of the latter, but it's a mistake to say they did a bad job at making ops-friendly packaging. They are solving a different, mutually-exclusive (until recently) problem.

With a bunch more elbow grease and polish, the nix/guix approach allows us to have the best of both worlds, but this is a very new development; arguably it isn't even "there" yet.

Debian and Fedora do it better, yes. But it's not quite as easy to get started. However once you are at a certain size, both solutions are horrible. (Docker and RPM). Especially when you need to target more than one Fedora / CentOS / RHEL / etc... Also editing Spec files is quite horrible.

> The only reason that that is not obvious is that Docker is piggybacking on the relatively excellent and well-developed package management of distributions like Debian and Fedora.


Just because a package manager has a broad user base does not make it excellent nor well-developed. pacman[1] user base is far smaller, but (IMHO) it's a much more refined package manager than apt or rpm.

I agree 100%. While the tech in Docker is great, it alone can't be easily monetized.

The real value comes from things like DockerHub, and getting people to buy into the whole ecosystem.

Which is probably why people were concerned about Docker's expansion into the clustering and orchestration markets, even if from a business perspective those are their only real holdouts to avoid commodification. The base Docker is easy to replace if the project gets out of hand; the services around it are trickier.

There are several tech streams converging there. A bigger chunk is the space that Mesos and Kubernetes occupy.

And to be fair, I suspect the Docker folks were thinking not so much about clustering and orchestration as: (1) clustering and orchestration still sucks; (2) people want as good an experience using Docker across multiple nodes as they get on one; (3) let's make clustering and orchestration less sucky and use the 'Docker Way'[1]

[1] 'Docker Way' is a pointer to the fuzzy, difficult-to-verbalize thing that Docker enables, namely in packaging.

> clustering and orchestration still sucks

I don't see that changing until we get proper single-system imaging, location transparency, process and IPC migration, process checkpointing and RPC-based servers for representing network and local resources as objects (be they file-based or other) in our mainstream systems.

These things only really caught on in the HPC and scientific computing spaces, where you've had distributed task managers and workload balancers like HTCondor and MOSIX for decades. They've also been research interests in systems like Amoeba and Sprite, but sans that, not much.

The likes of Mesos, the Mesosphere ecosystem with Marathon and Chronos, and Docker Swarm bring only the primitive parts of the whole picture. Some other stuff they can half-ass by (ab)using file system features like subvolumes, but overall I don't see them improving on all the suck.

I ran an OpenMOSIX cluster as a hobby. The alternative was Beowulf (the meme of the day was "Imagine a Beowulf cluster of these things").

It seems the mainstream server industry has moved to more isolation rather than more interconnectedness, which is probably better for most public-facing systems.

Isolation is orthogonal to what I listed. MOSIX has decent sandboxing. I don't know about Linux-PMI or OpenMOSIX, though. They died off years ago anyway.

Two things I'd love to see in the image format:

  (1) Open standards (and I'm hearing things moving in that direction)
  (2) Content-addressability, so that images can be stored on IPFS.

Point (2) really plays the "packaging" rather than the "containment" aspect. I'm not really thinking about Docker Hub or any proprietary services like that.
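To make the content-addressability idea concrete, here is a small sketch that stores a file under the SHA-256 of its contents. It assumes coreutils `sha256sum`; the store directory and function names are invented for illustration, and IPFS uses its own identifier format, so this only shows the principle.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: name a blob by the hash of its contents.
content_id() {
  sha256sum "$1" | cut -d' ' -f1
}

# Usage: store_blob FILE STORE_DIR
# Prints the content id under which the file was stored.
store_blob() {
  local src=$1 store=$2 id
  id=$(content_id "$src") || return 1
  mkdir -p "$store"
  # Identical content always maps to the same name, so re-adding is a no-op.
  if [ ! -e "$store/$id" ]; then
    cp "$src" "$store/$id"
  fi
  printf '%s\n' "$id"
}
```

Two layers with identical bytes end up as one file in the store, which is the deduplication property a content-addressed image format would get for free.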

There's a project put out by Chef (formerly Opscode) called Omnibus. It allows you to build a monolithic package, complete with all the library dependencies and such. Chef Server is distributed as one of those monolithic Omnibus packages. What had happened was that various library dependencies would cause problems with the various systems that needed to come together. It was easier to specify the precise version of the components needed. (But it also put the onus of security fixes on Chef.)

That is the real problem that Docker solves. Packaging. It enables a kind of shift in thinking that's difficult to put into words. People say "light-weight containers" or whatever, but none of that really nails the conceptual shift that Docker enables. In about five years, it'll become obvious the way 'cloud' is obvious now, and was non-obvious back in 2005.

Omnibus is a step backwards. Every monolithic Omnibus package has its own copy of each dependency, so you end up with duplicated binaries. You can no longer patch a library system-wide, you have to figure out which Omnibus packages have the library and rebuild it with the patched version. Package management was invented to deduplicate files across the system, and people seem to have given up on that.

You say that Docker solves this problem, but it doesn't really. Sure, it creates isolated runtime environments that avoid the clashes you described, but it only further obscures the problem of system-wide deduplication of dependencies. The real solution here is better package managers, such as GNU Guix, that can easily handle multiple programs on the same machine that require different versions of the same dependencies whilst also deduplicating common files system-wide. Once such a foundation is in place, a container system no longer needs to deal with disk images, it can just bind-mount the needed software builds from the host into the container thereby deduplicating files across all containers, too.

Omnibus and Docker are papering over problems, not solving them.

It turns out that the kernel is smart enough to deduplicate (in memory) the same version of a shared library across VM boundaries, so I don't really see why we need to make packaging handle this, especially if this can be applied to containers (if it isn't already). Duplicates of the same file on disk are not a big deal, and can be handled by a good file system that does deduplication. I don't see why we necessarily want to do all of this in a package manager.

It's actually a good thing that we have these duplicates from a packaging perspective, because you completely remove the host requirements altogether, and can focus on what an app needs, and if you want to take advantage of deduplication (on disk, or in memory), then you can let another subsystem resolve that for you.

To patch something system-wide, you simply use a common base image; it's orthogonal to containers. Just because you can have different versions of a dependency doesn't mean you need to. The main advantage of the container having its own version is that you can independently upgrade components without worrying that a change will affect another application.

You might argue that this is a security concern, but I'd argue that it's more secure to have an easily updatable application than an easy way to update a particular library across all applications. In the latter case, we already know what happens, people don't update the library nearly as often as they should, because it could break other applications which might rely on that particular version. At least in the first case we can upgrade with confidence, meaning we actually do the upgrades.

This means your security depends on the app maintainer, which is a terrible place to be in. I don't want to have to wait for the latest image of 100 apps and hope they didn't break anything else just to deal with an openssl vulnerability.

If your system consists of 100 apps, you have a bigger problem, and are likely a shop big enough to deal with it.

I'm working on a production deployment of a CoreOS+Docker system for a client now, and the entire system consists of about a dozen container-images, most of which have small, largely non-overlapping dependencies.

Only two have a substantial number of dependencies.

This is a large part of what excites people about Docker and the like: It gives us dependency isolation that often results in drastically reducing the actual dependencies.

None of this e.g. requires statically linked binaries, so no, you don't have to wait for the latest image of 100 apps. You need to wait for the latest package of whatever library is an issue, at which point you rebuild your images, if necessary overriding the package in question for them.

One of the touted benefits of containers is shipping images to people with your software. That means as a customer you can't rebuild the image yourself.

It's exactly like statically linked binaries.

A lot of the cgroups and namespaces functionality took time to mature and stabilize. User namespaces, for instance, were only available with 3.8. Cgroups and namespaces still don't play well with each other.

Cgroups was initially added by some folks from Google in 2007. A lot of the early work on Linux containers was done by Daniel Lezcano and Serge Hallyn of the LXC project, supported by IBM. It was initially a kernel patch and userland tools. You can still see it on the IBM website. It was merged in 2.6.32.

Then around 2012 the LXC project started being supported by Ubuntu and Stephane Graber of Ubuntu continued the work with Serge Hallyn. LXC was of course focused on OS containers and they didn't really market themselves.

Around 2013 when LXC was finally becoming usable by end users, Docker who were probably using it in their previous avatar in dotcloud as a PAAS platform, took it as a base, modified the container OS's init to run single apps, removed storage persistence, and built it with aufs layers, and took it to market aggressively.

But if you look beyond the PAAS centric use case, OS containers are simpler to use, offer near seamless migration of VM workloads, more flexibility in storage and networking scenarios and are more easily used with the ecosystem of apps and tools with a normal multi-process OS environment.

The ability to gain the advantages of containers without needing to re engineer how you deploy applications is an incredible value proposition.

LXC is mature, pretty advanced and simpler to use than Docker, but a lot of users and media have got the impression that it's 'low level' or difficult to use.

The Docker, PAAS and microservices folks are the only ones really messaging and going out there to gain adoption, and there is an unfortunate conflation of containers with Docker and a monoculture developing. The 'Open Container Standard' is an example. Shouldn't that be 'Open App Container Standard'?

App containers are a constrained OS environment and add complexity, and the various Docker specific solutions being developed for everything from networking to storage is evidence of the additional complexity. There is obviously a huge devops PAAS case here that people see value in. And the sheer amount of money and engineering deployed means something good has to come out of it. But containers cannot be just about PAAS.

I run Flockport that provides an app store [1] based on OS containers that are as easy to use as Docker hub and extensive documentation [2] on using containers so do give it a look.



Systemd-nspawn is way easier to use than LXC imo in that it replicates the simplicity of chroot with the power of cgroups. The security story is unfinished though.

Not really, Nspawn is extremely promising and is developing fast. Systemd 220 adds support for user namespaces so you can run nspawn containers as non root users.

But containers need minimal OS templates, networking and a way to configure it properly, storage support for things like cloning and snapshots, a way to configure cgroups, and management, and those are still not available beyond some basic machinectl commands, and neither is the documentation. Nspawn is going to be a very strong solution, especially given systemd is now there by default on most mainstream distros, but it's not there yet.

User namespaces while letting non root users run containers brings with it a whole bunch of problems on accessing host resources like mounting file systems, networking devices etc that LXC has faced and addressed.

I have an article up on using nspawn containers here [1]. There are a lot of wild misconceptions floating around about LXC. It is actually pretty mature and easy to use, has supported user namespaces since 2013, has advanced networking and storage support for things like cloning and snapshots with btrfs, zfs, overlayfs, LVM thin, aufs, a nice set of tools to manage containers, and a wide choice of minimal container OS templates.

We have a lightweight boot2lxc VM image based on Alpine Linux for those who want to give it a go [2]

[1] https://www.flockport.com/a-quick-look-at-systemd-nspawn-con...

[2] https://www.flockport.com/start/

What do you mean that "The security story is unfinished though."?

It's openly pointed out in the docs that it's intended to prevent unintentional system alterations, not stop an actively hostile program - i.e. there's not a lot of confidence from the devs in its isolation levels yet.

User namespaces actually weren't completely done until late 2013 (Ubuntu didn't have them enabled until 13.10 or 14.04 because it didn't work with XFS).

When the abstractions are right, things fall out almost for free. Kudos for attempting this, made me smile from ear to ear.

also that they don't use a root daemon and random hard to debug, hard to verify things.

it feels like most of them build large products around things to justify making money, while i like the simple, elegant, easy to verify little pieces of software (some call it "the unix way")

unshare and iproute are pretty decent for ex.

Agreed. That was pretty good. A dash of bash with 3 teaspoons of butter: https://en.wikipedia.org/wiki/Btrfs#Subvolumes_and_snapshots

Here's a proof-of-concept implementation of "docker pull" in bash (YMMV, I think it has broken since I wrote it): https://gist.github.com/tlrobinson/c85dca269f4405ad4201

Looks good!

I'd marked 'docker pull' as out of scope because I thought it would be fairly hard to interface with their API from bash, looks like I was wrong.

The only hard bit is parsing the JSON, which I use jq for (http://stedolan.github.io/jq/)
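As a tiny illustration of the jq part only: the JSON below is a made-up shape, not what any real registry returns, and the function name is hypothetical. It assumes jq is installed.

```shell
#!/usr/bin/env bash
# Hypothetical response shape; a real registry's JSON will differ.
sample='[{"id":"aaa111"},{"id":"bbb222"}]'

# Read JSON on stdin and print one id per line.
# -r emits raw strings, without the surrounding JSON quotes.
layer_ids() {
  jq -r '.[].id'
}

# Usage (needs jq installed):
#   printf '%s' "$sample" | layer_ids
```

From there a plain `while read -r id` loop can fetch each layer, which keeps the rest of the script in ordinary bash.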

I think the v2 API requires hitting an auth endpoint too.

There are some images still in registry v1; I reckon it's a problem like that, where it gives you another URL.

When I first saw the HN title, I was so stunned. Weird, it's not my tool https://github.com/icy/bocker ;)

The author of "bocker" (not my bocker) has a great idea. I will learn from the script. Docker is not magic anymore!!

Dokku written in bash?

(I am one of the maintainers of dokku, which is written in bash).

I think he's suggesting creating a similar project to dokku, but instead of using Docker, use Bocker.

If installing via apt-get, that means creating a debian package that provides lxc-docker-1.6.2 (docker 1.7 broke backwards compatibility in output of docker ps). Otherwise, just patching the [Makefile](https://github.com/progrium/dokku/blob/master/Makefile#L85) to check for bocker (linked to `/usr/local/bin/docker` of course) would suffice.

Though I don't think that is what he is asking for.

$10M per LoC

Nice work. Great to see the advanced features of BTRFS put to use.

Thank you.

Very interesting work, p8952. Can you elaborate more on what features made btrfs a good fit for the project?

There's also https://github.com/docker/dockerlite . But not sure how current it is.

not sure if you ever saw this - https://www.phoronix.com/scan.php?page=news_item&px=CoreOS-B...

it might be interesting to see a version of your script using overlayfs

Interesting, it was my understanding that Docker was going in the other direction. Moving off AuFS towards Btrfs due to issues getting AuFS patches into the mainline kernel.

I'll have to look into CoreOS's reasons for going with Ext4.

Docker 1.6+ works with overlayfs on my debian box (in production). I thought AUFS was already deprecated

I've a more or less similar script that uses overlayfs. More less than more... since it uses machined really. But it's easy to grab commands from there and put them in bocker for ex. You just need to replace the btrfs commands by the overlayfs mount, it's almost nothing :)

https://github.com/gdestuynder/mctl/ See also this terrible draft https://www.insecure.ws/linux/systemd_nspawn.html

Add support for GPG-signed `btrfs pull` and `btrfs push` and I'm totally sold! I've been working on something similar to this but on top of systemd-nspawn, which already does some stuff for you.

systemd-nspawn is nice because I run systemd in all my containers and thus allows me to easily do logging etc.

I don't really dig the docker-microservices mantra that much. I just use them as glorified VMs I guess.

(And yes, you should run an init system in your containers [0])

[0] - https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zomb...


The emperor is wearing fewer clothes.

That script is the best description of Docker I've read.

I'm missing something: what actually "executes" the command here? `echo "$2" > "$btrfs_path/$uuid/$uuid.cmd"`

Is something watching for .cmd? Is this some behavior of util-linux (for which, my few seconds couldn't find solid documentation on)?

If my dry read of the code is correct, that line is for the "bocker ps" routine to be able to print the command later.

The run command itself is executed next, inside of ip netns exec.

Ahhh, no.. I was looking at line 57 and somehow missed the actual call to $2 on 59. Clear as a bell now
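For anyone else who trips over this, a simplified sketch of the bookkeeping pattern: the command string is saved to a file purely so a later listing can display it. This is not the script's actual code; a plain directory stands in for the btrfs path and the function names are made up.

```shell
#!/usr/bin/env bash
# Usage: record_cmd ROOT ID COMMAND
# Saves the command text next to the container; nothing executes this file.
record_cmd() {
  mkdir -p "$1/$2"
  printf '%s\n' "$3" > "$1/$2/$2.cmd"
}

# Usage: list_containers ROOT
# Prints "ID<TAB>COMMAND" for every directory that has a .cmd file.
list_containers() {
  local dir id
  for dir in "$1"/*/; do
    [ -d "$dir" ] || continue
    id=$(basename "$dir")
    if [ -f "$dir$id.cmd" ]; then
      printf '%s\t%s\n' "$id" "$(cat "$dir$id.cmd")"
    fi
  done
}
```

The actual execution is a separate step; the file only feeds the listing.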

Nice. As I just started playing with zfsonlinux, I'm tempted to "port" this from btrfs to zfs... Should allow for migrating "images"/snapshots with zfs send...

As an entry level developer - how does someone go about re-writing "x" in 100 lines of "x"?

Is there a certain process that goes into developing something like this, and why is this a popular thing to do? (writing existing software in fewer lines)

1) Knowledge of the space is good, but can be learned while implementing. You'll need to know the language pretty well, in order to keep the line count down.

2) It's fun

Hey, I just submitted a pull request for the "exec" command :)

Persistent data structure FTW.

This is incredible.

Holy crap! I've been keeping up with the hype, yet having never used Docker and never needed it, I can't help but become more skeptical now that I know that its features aren't more complex than a little bit of bash.

People give bash a hard time, but things like this really give me that warm, fuzzy feeling.


Why would it need to be more complex than "a little bit of bash" if it fulfils the task intended?

Surely that is better than a convoluted, 1,000,000,000 LoC application that no one outside a handful of core developers understands? Right?

I think you and logicrime are saying the same thing.

to be honest most of the work is done by iproute and unshare in this case. unshare basically calls clone() (syscall) with the correct namespace-related parameters, iproute deals with veth.
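Roughly what the veth side looks like with iproute2, as a dry run that only prints the commands. The namespace and interface names are illustrative, the function name is hypothetical, and applying the plan needs root.

```shell
#!/usr/bin/env bash
# Hypothetical dry run: print the iproute2 commands that create a veth
# pair and move one end into a named network namespace.
# Usage: veth_plan NETNS HOST_IF PEER_IF
veth_plan() {
  local ns=$1 host_if=$2 peer_if=$3
  cat <<EOF
ip netns add $ns
ip link add dev $host_if type veth peer name $peer_if
ip link set dev $host_if up
ip link set $peer_if netns $ns
ip netns exec $ns ip link set dev lo up
ip netns exec $ns ip link set dev $peer_if up
EOF
}
```

Addressing and bridging are left out on purpose; this only covers creating the pair and handing one end to the namespace.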

bash is quite a neat language (for what it does) but has horrendous syntax.

I honestly thought docker was just a little script when I saw it first, judging on the functionality.

You published it? You should have gotten a "micro-docker"-like hashtag trend going and then pitch your idea to VCs. The main "Lighter than Docker" startup would be valued at around 5-7 billion right now.

Just a quick interjection:

Something that one smart developer can do in 100 lines on any interpreter is never worth billions of dollars.

Someone who can do it, on the other hand, is certainly worth hundreds of thousands of dollars, annually.

Meanwhile, so many people continue to marvel at what can be done with an interpreter and a Turing-complete language. Yet, the last thing we need is yet another Turing-complete language.

Unfortunately, the problem with Turing machines, virtual or otherwise, is that they're so good at faithfully reduplicating themselves...

"Something that one smart developer can do in 100 lines on any interpreter is never worth billions of dollars."

100 lines maybe not, but Docker is pretty lightweight glue riding on existing technology that did most of the heavy lifting. The valuation IS lopsided because it sort of did a "name grab" around the underpinning technology (sort of like "AJAX" was "XMLHttpRequest"), packaged it in a way that made it somewhat more useful and, more importantly, talked about it in a way people could understand. Mostly, giving it a name and describing some common practices was what happened.

The original idea was definitely clever though, and it is getting people who didn't adopt immutable systems before to start to understand immutable systems, even if the future is not actually going to be Docker. Yet, Docker is getting the press versus the higher level software that needs to exist to drive Docker.

While this makes it very hard for other projects to get VC attention, I think that's maybe a good thing for them if they don't know it - you want to bootstrap if you can anyway, and I hope many of them do.

While this isn't a super robust implementation or anything, I think it's important to show that Docker is more or less glue around existing concepts, and that there's still room to make better things.

Don't get me wrong, immutable systems are GREAT. Docker deserves points for getting people to understand them, and the ideas of private registries and Dockerfiles (though also not original) are good parts. Microservices? Not really necessary for most people, that's more of a human problem. It sort of conflates immutable with microservices and makes you hop out of the norm to do the basics, but ok, sure, that's somewhat like a celebrity actor advocating for environmental causes. Still a good thing.

But is it a giant leap forward? Not as much as people think, compared to AMI builds, and you see folks running Docker on top of EC2 in cases (yes, they boot faster - but AMIs gave you immutable and things like Packer allowed redistributable images; stacking images is kind of sketchy if you don't control them all). But it's enough to make people use them, and that's a win, and someday the management software for it may be smart enough to make it feel as easy and powerful as that (fingers crossed for ECS?).

The 100 liner at least has the advantage of reminding people when people say "Docker is great" they mostly mean "I like this immutable systems thing and describing systems in text", and the other properties of Docker, and reminds people that if they can do better and try to make a better thing, they should also still try.

Interesting work. And 10% of those lines are simply closing braces which can be collapsed to the previous line, and half a dozen lines can be reclaimed from the help function...
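If you want to check that kind of figure on any script, here is a quick sketch. The function name is made up, and "closing brace line" is assumed to mean a line containing only `}` plus optional whitespace.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "BRACE_LINES TOTAL_LINES" for a script file.
brace_share() {
  local file=$1 braces total
  # grep -c prints the number of matching lines (0 if none match).
  braces=$(grep -c '^[[:space:]]*}[[:space:]]*$' "$file")
  total=$(($(wc -l < "$file")))
  printf '%d %d\n' "$braces" "$total"
}
```

Dividing the first number by the second gives the share of lines that could be folded into the line above.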

I thought puppet/chef were the pit of the devops ridicule. Then I not only saw this, but also positive reactions to a readable code in which you have :

    echo 'nameserver' > "$btrfs_path/$uuid"/etc/resolv.conf

This is wrong on so many levels that I don't know where to begin.

Sadly that is pretty much emulating how docker does it. To the best of my knowledge the '' and '' name-servers are kind of hard-coded in a lot of containers (all?)

Can you share why you think puppet/chef are the "pit of the devops ridicule"?

Having implemented it across 10k+ servers in 12 datacenters I'd say he means Puppet is overly complex for what it provides. Kind of like Docker. I think people attribute the same over-complexity to Chef.

Having now used Salt and starting to play with Ansible I'm growing an extreme dislike for Puppet and the weeks of my life I can never recover dealing with things that Salt has made so much easier.

i don't think this part is needed at all, it just means the base image sux basically, so this is some patching up, which, well, is fine for demo purposes

Wow, Docker in 100 lines! It runs as root? Oh. It is written in bash? Oh. It needs a ton of manual setup? Oh. It doesn't actually implement the package format which is most of the point of Docker? Oh. So is it that easy to reimplement Docker? Despite the obvious snarky intent, it appears not.

@Shykes, is that your alter-ego account?

No, that's a dumb joke. Actually I don't really even use Docker personally.

HN's standard middlebrow dismissal of Docker is to claim that it's nothing except LXC, which really misses the point entirely. But at least it lets people pretend they are smarter, which is what really counts.

They are rightfully dismissive of Docker because it's just the current cycle of trendy abstractions. It's a barely passable solution to a bigger problem.

"We're running many services on a single machine. But this is complicated and difficult to update and maintain."

"We took our machine, ran a virtualization platform on it, and split each service into its own VM. But this comes at the cost of increased resource usage."

"Instead of separate VMs we created a container format to decrease the overhead while retaining many of the benefits of virtualization. But this is still resource heavy, and insecure as the containers will rarely see updates."

"So we created 'lightweight' containers which are a very thin wrapper around the base OS so that containers can take advantage of updated shared libraries to mitigate the security problems, and further decrease the overhead."

"We're running many services on a single machine..."

The cycle will eventually come around and we'll be in a better place having learned that what was really needed was improvements to the base OS, package management, more robust MAC policy, and name-spacing rather than containers.

> and name-spacing rather than containers.

Containers are name-spacing.

No, they are isolation, a much stronger proposition.

The problem is that the buzz will often entirely miss the point also. If the reader understands the underlying CS concept it is better to focus on the benefits of the product as a particular implementation of a concept, rather than attribute all the benefits to the actual product itself.

If anyone wants to earn some huge brownie points with me (and who wouldn't), you could implement a PaaS on top of Joyent's Triton system[1]. Purely in terms of the cost structure you could offer with such a PaaS, this could be a Heroku killer. Huge bonus points if it's totally open source!

1. https://github.com/joyent/sdc-docker
