
A Container Is a Function Call - ingve
https://glyph.twistedmatrix.com/2016/08/defcontainer.html
======
gijzelaerr
Interesting. I've been working on something related: kliko, a non-intrusive
Docker 'extension'. Kliko is a specification that formalizes file-based (no
network yet) input/output flow for containers. It makes it possible to have a
generic API for containers, so you can automatically create user interfaces
for an application or chain applications together in a pipeline.

[https://github.com/gijzelaerr/kliko](https://github.com/gijzelaerr/kliko)

Here is an example kliko file for a container defining the input and output:

[https://github.com/kernsuite-docker/lwimager/blob/master/kliko.yml](https://github.com/kernsuite-docker/lwimager/blob/master/kliko.yml)

~~~
valarauca1
So one process writes to a flat file, then another process reads the flat
file? Wouldn't named pipes be more useful here?

Also, it feels like you are just building a serialization format on top of
stdin/stdout.

~~~
gijzelaerr
There are docs available on the website, but I'm not sure I'm explaining the
concept properly.

The assumption is that you have containers that operate on input files and
generate output files. The behavior of the container depends on the given
parameters, which are defined in the kliko file. The user (or runner) supplies
these parameters at runtime.

To illustrate: we use it for creating pipelines in radio astronomy, where we
operate on datasets of gigabytes or more. Most of these tools are file-based;
they read files in and write files out. It is all quite complex and old
software, so Docker is ideal for encapsulating this complexity. A scientist
can easily recombine the various containers and play with the arguments. By
splitting input and output, the containers effectively become 'functional':
no side effects, and the results can be cached if the parameters are the same.
The intermediate temporary volumes can be memory-based to speed things up. We
use stdout for logging.
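That caching property can be made concrete: if a container run is a pure function of (image, parameters, input files), its result can be keyed by a digest of all three. A minimal sketch in Python; `run_container` is a hypothetical stand-in for whatever actually invokes `docker run`:

```python
import hashlib
import json

_cache = {}  # digest -> result; a real runner would persist this to disk

def cache_key(image, parameters, input_digests):
    """Digest of everything the run depends on: image, parameters, inputs."""
    blob = json.dumps(
        {"image": image, "parameters": parameters, "inputs": sorted(input_digests)},
        sort_keys=True,
    )
    return hashlib.sha256(blob.encode()).hexdigest()

def run_cached(image, parameters, input_digests, run_container):
    """Invoke the container only if this exact call has never been seen."""
    key = cache_key(image, parameters, input_digests)
    if key not in _cache:
        _cache[key] = run_container(image, parameters, input_digests)
    return _cache[key]
```

With this shape, two runs with identical image, parameters, and inputs invoke the container once; the second call is served from cache.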

~~~
ymt123
If I'm understanding correctly that sounds similar to Pachyderm
([http://www.pachyderm.io](http://www.pachyderm.io))

~~~
IanCal
Pachyderm looks quite cool, but I think it's lacking a quick-start and a
simple way of running things locally.

I grabbed the repo, clicked through a few links in the docs, and hit a 404. I
searched on Google and found a link to a simple way of running it locally, but
that doesn't work with the new version. Then I followed the instructions and
hit a problem installing something to do with k8s and mapped paths, and the
fix printed in the console doesn't work.

I understand that this is a personal complaint, and others might not care at
all about having it set up locally because it solves the big problems so well,
but I just want to _try_ it, at least locally.

------
mbrock
I appreciate the basic idea, but I think the identification of containers with
"function calls" seems more confusing than helpful. It's strange to think of,
say, a running Redis instance as a "function call."

~~~
monkmartinez
I disagree.

I think this article nails some of the basic issues people have with Docker.
That is, your app is usually not entirely self-contained. One generally needs
a db, a cache db, some networking, something else, and a dash of CI.

To run a containerized app ecosystem, you either type a load of crap into the
`docker run` statement or you annotate the heck out of the compose
file/dockerfile. Either way, you pray it works... I know that I have to wade
into logs and `docker exec bin/sh` because of some kind of silliness a lot
more often than I would like.

The article is correct, imo, in pointing out that there should be a better way
to wire up basic containerized ecosystems.

~~~
mbrock
Note that my only complaint is terminological, because I think "function call"
signifies something completely different from "launching a long-lasting
service".

~~~
jdmichal
Yes. If starting a Docker instance is a "function call", then what is the
analog for making an HTTP request to a Docker-hosted server? The analogy was
fine for describing the idea, and perhaps it even holds if the Docker image is
a processing step instead of a service.

But if the Docker image is a service... We have a set of terminology (and
technology!) already for services -- let's use those, rather than continue to
force the analogy all the way down.

------
steego
Wouldn't a better mental model be an object-as-a-process, à la Smalltalk or
Erlang?

I personally think the object analogy is far more useful. Rather than
reinventing your own container composition system, you could simply adopt a
dependency injection approach. It would be nice to see container-objects
publish interfaces in a language- and transport-independent way that layers
interfaces, endpoints, and transport mechanisms.

~~~
felixgallo
I agree, the actor model is a significantly more usable metaphor for
containers than functions. When you start thinking about supervisor trees, you
start heading towards Kubernetes, which is interesting.

~~~
pkinsky
I think we're getting hung up on 'function' here: the article is really about
the benefits of type systems (either runtime or compile time) as applied to
the argument and result types of functions. The actor implementations I've
used (mostly just Akka) lack strong types, but I think treating containerized
apps as actors that can only send or receive specific message types would fix
the problems the article brings up.
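As a sketch of what "actors that can only send or receive specific message types" could look like, here is a hypothetical wrapper that rejects any message outside a declared inbox type. This is illustrative Python, not Akka:

```python
from dataclasses import dataclass

@dataclass
class Resize:
    """The only message type this hypothetical container-actor accepts."""
    path: str
    width: int
    height: int

class TypedActor:
    """Deliver a message to the handler only if its type is in the declared inbox."""
    def __init__(self, accepts, handler):
        self.accepts = accepts
        self.handler = handler

    def tell(self, message):
        if not isinstance(message, self.accepts):
            raise TypeError(f"{type(message).__name__} is not an accepted message type")
        return self.handler(message)

# A "container" that only understands Resize messages.
resizer = TypedActor(Resize, lambda m: (m.width, m.height))
```

Sending it a raw string raises immediately, instead of failing somewhere inside the container at runtime, which is the fix the article is after.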

------
jackweirdy
The format the author proposes for a new Dockerfile is semantically similar to
how Juju charms describe services.

    
    
        name: vanilla
        summary: Vanilla is an open-source, pluggable, themeable, multi-lingual forum.
        maintainer: Your Name <your@email.tld>
        description: |
          Vanilla is designed to deploy and grow small communities to scale.
          This charm deploys Vanilla Forums as outlined by the Vanilla Forums installation guide.
        tags:
          - social
        provides:
          website:
            interface: http
        requires:
          database:
            interface: mysql
    

[https://jujucharms.com](https://jujucharms.com)

~~~
mintplant
> $ juju deploy haproxy

> $ juju deploy mediawiki

> $ juju add-relation haproxy mediawiki

> $ juju deploy mariadb

> $ juju add-relation mediawiki mariadb

> $ juju deploy memcached

> $ juju add-relation mediawiki memcached

> $ juju expose haproxy

That's really cool! Why isn't this talked about more? Bias against Canonical?

~~~
jackweirdy
From having tried a few times on a Mac: it's really hard to get started. The
local provider is LXC-based, so it can only run on Linux. I'd love it if it
ran on Docker or VirtualBox.

~~~
SmurfJuggler
I didn't know Juju existed until now, but I've been working on something kind
of similar that is a lot less hassle to get up and running (vagrant up, done).
I'm just one guy who doesn't have a hell of a lot of free time, though, so it
will be intended purely for creating local dev environments, at least
initially (and will come with a "do not run production services on this or you
will die" style warning).

It takes some steps towards resolving some of the issues in the article, as
well as a number of other headaches I've encountered when trying to bend
Docker to my will and build a development environment I can use every day for
everything, without having to mess around with basic plumbing.

I can post a Show HN about it in the days ahead if there's any interest. It's
not anywhere near where I want it to be (least of all in terms of code
quality), but the amount of time I can devote to coding is about to drop from
"near zero" to "really REALLY near zero" for a while, and it's very usable and
handy, so it might be worth just tossing it out there as-is and coming back to
it at a later date.

------
rdtsc
Wonder why he never mentions Nix/NixOps or Guix.

If we talk about purely functional configuration and runtime changes, those
seem like the projects to focus on.

~~~
viraptor
I read this as a specific proposal for a future Docker, or some overlay
description format for it. Also, the author is talking about runtime checks,
while Nix/Guix provide an install/configuration description.

Nix/Guix solve some cool things, but don't do isolation (as in process
isolation). They could (namespaces live in the kernel, after all), but it's
not a first-class thing.

~~~
arianvanp
We can start Nix expressions in isolated containers with a single command!
[https://nixos.org/releases/nixos/14.12/nixos-14.12.374.61adf9e/manual/ch-containers.html](https://nixos.org/releases/nixos/14.12/nixos-14.12.374.61adf9e/manual/ch-containers.html)

~~~
viraptor
That's cool! I see there are a lot of fun things already in it. But it also
seems like something the author didn't want: NixOS containers are explicitly
whole-system according to the documentation ("This command will return as soon
as the container has booted and has reached multi-user.target."), rather than
a single app.

~~~
k__
Doesn't Nix have nix-shell, which is like this?

------
tetron
Common Workflow Language [http://commonwl.org](http://commonwl.org) is a spec
(with multiple implementations) for wrapping command line tools (which may run
inside a Docker container) as functional units.

------
brandonbloom
Taken to its logical extreme, you get Algebraic Effect Handlers:
[http://math.andrej.com/2012/03/08/programming-with-algebraic-effects-and-handlers/](http://math.andrej.com/2012/03/08/programming-with-algebraic-effects-and-handlers/)

The idea is a generalization of try/catch in which all effects are
accomplished through handler blocks that receive a continuation. Just like
when you make a kernel call, your program is a continuation given to the
kernel. Usually the kernel calls you back once, but some syscalls return twice
(fork) or not at all (abort) by manipulating processes.

Coupled with reusable handler block bundles, you can wrap a "container" around
any expression in your entire program.
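The handler-plus-continuation shape can be approximated in Python with generators: the program yields effect requests, the handler decides what value to resume it with, and the generator's remainder plays the role of the continuation. A toy sketch only; it doesn't capture multi-shot continuations like fork:

```python
def program():
    # Each `yield` is an effect request; whatever the handler sends
    # back is the value the "call" resumes with.
    name = yield ("ask", "name")
    greeting = yield ("ask", "greeting")
    return f"{greeting}, {name}!"

def handle(gen, handler):
    """Drive the generator, answering every effect with handler(effect)."""
    try:
        effect = next(gen)
        while True:
            effect = gen.send(handler(effect))
    except StopIteration as stop:
        return stop.value

answers = {"name": "world", "greeting": "hello"}
result = handle(program(), lambda effect: answers[effect[1]])
# result == "hello, world!"
```

Swapping in a different handler changes how the same program's effects are performed, which is the "wrap a container around any expression" idea in miniature.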

------
MichaelBurge
I usually think of containers as an expanded chroot jail.

They even serve a similar purpose: during the 32-bit to 64-bit switchover, I'd
occasionally install 32-bit libs into a chroot jail to more easily compile
software that depended on being 32-bit, without polluting the global space
with 32-bit packages.

~~~
mafribe

> containers as an expanded chroot jail

Could you point me to a succinct description of exactly the semantics of
containers as an expanded chroot jail? I've been looking, but have so far not
found anything.

~~~
geofft
The term "container" (on Linux) refers to the high-level combination of two
concrete low-level interfaces: namespaces and control groups (cgroups).

I like to think of namespaces as converting certain global variables in the
kernel into local variables for each process. These local variables are then
inherited by child processes. chroots are the simplest example, although they
predate namespaces. Ordinarily you'd think of a system as having a single root
directory; somewhere in the kernel is a global variable `DIR *root`. But in
fact, each process has its own root directory pointer in the process
structure. Most of those pointers have the same value in every process, but if
you run chroot, you change that pointer for the current process and all its
children.
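That "global variable becomes a per-process local" picture can be modelled in a few lines. This is a hypothetical simulation of the inheritance semantics, not kernel code:

```python
from dataclasses import dataclass, field

@dataclass
class Proc:
    """A process record; `root` is the per-process copy of the 'global'."""
    root: str = "/"
    children: list = field(default_factory=list)

    def fork(self):
        child = Proc(root=self.root)  # children inherit the current pointer
        self.children.append(child)
        return child

    def chroot(self, path):
        self.root = path  # changes this process (and future forks) only

init = Proc()
jail = init.fork()
jail.chroot("/srv/jail")
grandchild = jail.fork()
# init.root is still "/"; jail and grandchild both see "/srv/jail"
```

The point of the model: nothing global changed, only one process record and whatever it forks afterwards.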

The list of possible namespaces is in the clone(2) and unshare(2) manpages
(`man 2 clone` and `man 2 unshare`); look for the options starting with
CLONE_NEW*. They all change some pointer in the process structure: either to a
substructure of the original (like chroot does), to a deep copy of the
structure, or to a new, empty structure.

CLONE_NEWIPC changes the pointer for routing System V IPC. CLONE_NEWNET
changes the pointer for the list of network devices to a new structure with
just a loopback interface. CLONE_NEWNS copies the mount table instead of
keeping a pointer to it, so your process can unmount filesystems without
affecting the rest of the system, or vice versa. CLONE_NEWPID changes the
pointer to pid 1 / the process ID table to point to yourself (effectively a
chroot for process IDs). CLONE_NEWUSER changes the interpretation of user IDs,
so UID 0 in your process can be a non-zero UID in the outside system.
CLONE_NEWUTS creates a copy of the structure containing the machine's
hostname, instead of keeping a pointer to the global structure.

cgroups are resource control. They let you say that a certain process tree
gets some maximum amount of RAM, CPU shares, and so forth. This is useful for
making containers perform the way you want, but doesn't really affect their
semantics.

~~~
cesnja
But do cgroups really allow setting a maximum on CPU shares? `man 7 cgroups`
says this about the cpu subsystem:

> Cgroups can be guaranteed a minimum number of "CPU shares" when a system is
> busy. This does not limit a cgroup's CPU usage if the CPUs are not busy.

Granted, my English isn't the best, but this doesn't seem to indicate any
throttling.

~~~
geofft
I'm not very familiar with cgroups, but the documentation mentioned in that
manpage ([https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt](https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt))
talks about throttling and maximum CPU usage.

I'd guess it's just a matter of how you view it: putting a minimum number of
CPU shares on the root cgroup is the same as putting a maximum on the rest of
the cgroups, right? But maybe one or the other piece of documentation is
wrong.

~~~
cesnja
Both the manpage and the documentation are probably right, though I'd expect
CPU throttling to be significant enough to at least be mentioned.

I'm not so sure about your guess, because that apparently only works when the
CPU is fully utilized. I'm not so sure about my reading comprehension either.

------
PeterisP
We use a swagger file as the "function signature" for each dockerized module,
and yes, the function analogy is quite appropriate if you modularize well: a
sloppy component is like a subroutine grabbing at global variables all over
the place, even if it's in a Docker VM, but a nice component does work like a
function that just happens to sit somewhere beyond a network.
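The "function signature" idea boils down to validating a call's parameters against a declared schema before it ever reaches the module. A stripped-down illustration with a hypothetical schema format, not actual Swagger/OpenAPI:

```python
def validate(signature, params):
    """Collect errors for params that are missing or of the wrong type."""
    errors = []
    for name, expected_type in signature.items():
        if name not in params:
            errors.append(f"missing: {name}")
        elif not isinstance(params[name], expected_type):
            errors.append(f"wrong type: {name}")
    return errors

# Hypothetical "signature" for one dockerized module.
signature = {"dataset": str, "iterations": int}
ok = validate(signature, {"dataset": "obs.ms", "iterations": 3})  # []
bad = validate(signature, {"dataset": "obs.ms"})                  # ["missing: iterations"]
```

A real swagger file adds formats, ranges, and response schemas, but the payoff is the same: a sloppy call fails at the boundary with a named error instead of somewhere inside the component.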

------
viraptor
I've got a feeling that what the author describes already exists, just not at
the same level. And those solutions have one problem in common: not everybody
wants to use them.

Specifically: Puppet, Chef, Salt, Juju, Heat, and others will happily deploy
your application (container or not) and provide you with a generic interface
for it. Docker is just one of the tiny building blocks here. They'll either
check the model before doing anything or check that it works by trying. Types
or not, the system for implementing those interfaces exists.

But every month there's another configuration / orchestration / deployment
system coming out. Some will use it, but it will only worsen the
fragmentation. I feel like it would either have to be built into Docker and
enforced at that level, or it would be yet another system that 99% of
operators don't use.

------
kozikow
In my current project, I have my own mini-infrastructure, where each machine
has a "watchdog" process that picks up items from a global queue and calls
`docker run` with the item it picked up. Docker knows which program to start
via ENTRYPOINT, and arguments to `docker run` are passed to this program. Each
docker image gets its own queue and autoscaling instance group.

It works well for data processing tasks; my docker images are crawlers,
indexers, or analytics code in Python or R. Deployment is quite simple: just
push a docker image, and it will be picked up on the next `docker run`. Images
can add items to global queues, and for any bigger data they write to a shared
database.
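A watchdog loop like that is roughly: pop an item, shell out to `docker run <image> <item>`, repeat. A sketch under assumptions: the queue contents, the image name, and the pluggable `runner` parameter are hypothetical details added so the docker call can be stubbed:

```python
import queue
import subprocess

def watchdog(work_queue, image, runner=subprocess.run):
    """Drain the queue, launching one container run per item."""
    exit_codes = []
    while True:
        try:
            item = work_queue.get_nowait()
        except queue.Empty:
            return exit_codes
        # The image's ENTRYPOINT decides which program receives `item`.
        proc = runner(["docker", "run", image, item],
                      capture_output=True, text=True)
        exit_codes.append(proc.returncode)
```

In the real setup the loop would block on the queue and feed autoscaling; here it drains and returns so the behavior is easy to exercise with a stubbed runner.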

------
dkural
We've been doing this for a while in the genomics space; check out "Common
Workflow Language":
[https://github.com/common-workflow-language/common-workflow-language](https://github.com/common-workflow-language/common-workflow-language)

Each container is described by what types of things it takes as inputs and
what it outputs, i.e. as functions.

In practice we've built hundreds of these and composed them in endless variety
for a large range of genomics tasks, processing petabytes of data at Seven
Bridges Genomics.

------
sly010
For thinking in terms of parametrizing containers, I find the libcontainer
spec [1] and runc way more useful than what Docker has to offer.

[1]
[https://github.com/opencontainers/runc/tree/master/libcontai...](https://github.com/opencontainers/runc/tree/master/libcontainer)

------
pmb
Type-checked container interfaces are the reason protocol buffers exist.

------
7952
At a low level, isn't this about how you deal with foreign functions? You have
the same issues of typing, validation, and concurrency that you do with
something like ctypes in Python.

~~~
davesque
Yeah, it seems like the question of data marshaling, except in a more
distributed and fault-tolerant sense. Maybe I'm misunderstanding this whole
discussion?

------
jadbox
I think you're looking for Swagger. It's been around for a while now, works
well, and has a good ecosystem of tools around it.

------
kreisquadratur
Most examples seem to be concerned with I/O boundary crossing ("why ports, why
volumes, why envs"). Is that the reason a type system is mentioned: to provide
safety, e.g. a firewall-like DSL, through documentation? More emphasis could
be put on at least two other values: discoverability and compatibility.

------
chewbacha
This is how I've been describing Docker to people and also how I've been using
it. It's why, when I need a specific version of Elasticsearch or Postgres, I
just run the tagged service.

It turns any binary into a more portable executable.

Great post!

~~~
bogomipz
What is a "tagged service"? Could you elaborate? I'm assuming this has nothing
to do with docker tags?

~~~
chewbacha
Oops, I meant tagged images. If you look at the common images for Postgres,
Redis, and Elasticsearch, they tag the images with the version of the service
that's installed.

[https://hub.docker.com/_/postgres/](https://hub.docker.com/_/postgres/)

------
blorgle
Sounds like zerovm? [http://www.zerovm.org/](http://www.zerovm.org/)

~~~
duaneb
Not at all; that has more to do with deterministic execution than useful
abstraction.

------
stcredzero
The crux of the matter:

 _large programs written in assembler in the 1960s included exactly this sort
of documentation convention: huge front-matter comments in English prose._

 _That is the current state of the container ecosystem. We are at the “late
’60s assembly language” stage of orchestration development. It would be a huge
technological leap forward to be able to communicate our intent structurally._

Upon reading this, my first thought is: Please don't say XML!

------
lil1729
Or in other words, how to "parameterise" a container. An interesting idea.

------
gsmethells
I always thought Docker leaned more towards "process as a platform".

