
Immutable Infrastructure and Disposable Components - nahname
http://chadfowler.com/blog/2013/06/23/immutable-deployments/
======
mitchellh
I just want to note that Packer ([http://www.packer.io](http://www.packer.io))
fits perfectly into the model of immutable infrastructure. Packer is an open
source tool for automatically creating machine images (perhaps for multiple
platforms).

The idea of quickly and easily creating these master images in a way that
doesn't slow down agility to change infrastructure is crucial, and Packer
enables that.
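For readers unfamiliar with it: a Packer template is a JSON file listing
builders (where to create the image) and provisioners (how to configure it).
The sketch below is hypothetical, not from the article — the AMI ID, region,
and package name are placeholders:

```json
{
  "builders": [{
    "type": "amazon-ebs",
    "region": "us-east-1",
    "source_ami": "ami-xxxxxxxx",
    "instance_type": "m1.small",
    "ami_name": "myapp-{{timestamp}}"
  }],
  "provisioners": [{
    "type": "shell",
    "inline": [
      "sudo apt-get update",
      "sudo apt-get install -y nginx"
    ]
  }]
}
```

Running `packer build template.json` produces a new, immutable machine image;
a deploy becomes "boot the new image, discard the old one".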

Disclaimer: I wrote Packer.

~~~
edwintorok
How does it compare to DisNix?
[http://nixos.org/disnix/](http://nixos.org/disnix/)

~~~
mitchellh
As far as I know, Disnix is just for Nix, while Packer creates machine images
for multiple platforms. If you were building a machine image for Nix, perhaps
it would make sense to use it. But for an Ubuntu machine, you can't really.
Unless I'm mistaken.

------
beat
Conceptually, it's good. But in practice, it's just an analogy. Immutability
in programming languages is enforced at compile time and run time. Something
like:

    val x = 1
    x = 2   // error: reassignment to val

is an actual error, and the compiler/runtime will give a "you can't do that".

Immutable infrastructure, on the other hand, would require read-only
enforcement on all aspects of configuration. It can be done, but not easily,
and someone _always_ has root. So what you're really talking about is the
ability to reconstruct an environment programmatically from scratch, with no
manual intervention, which is laudable but not exactly immutable.

This is the sort of thing that makes me appreciate the 12 Factor App idea
([http://www.12factor.net](http://www.12factor.net)). Rather than trying to
make configuration immutable, make it impossible. Don't rely on the
existence/continuity of a filesystem at all. Don't use configuration files.
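In 12-factor style, configuration lives in environment variables rather than
in files on disk, so a fresh instance gets its config injected at boot and the
filesystem carries nothing to mutate. A minimal sketch (the variable names
here are made up for illustration):

```shell
# Read all config from the environment, with dev defaults; no config file exists.
DATABASE_URL="${DATABASE_URL:-postgres://localhost:5432/dev}"
PORT="${PORT:-8080}"
echo "listening on port $PORT, database at $DATABASE_URL"
```

The same image then runs in production simply by exporting different values
before launch.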

~~~
dustingetz
> someone always has root

same in programming... unsafePerformIO, dropping into native code or assembly,
etc. In 2013 the hardware exposes a mutable interface.

~~~
beat
I think that's fundamentally different. Breaking immutability in a language by
stepping outside the language is still programmatic. Breaking it via
administration is a human decision.

In programming, it's a conscious decision to break immutability (assuming a
language that supports it). In administration, it's a conscious decision to
enforce it.

~~~
tel
And it's really bad practice (at least in Haskell) to truly break
immutability. Uses of unsafePerformIO are strongly urged to be
"observationally immutable, pure, safe, transparent".

------
antientropic
The idea of applying purely functional programming concepts to deployment is
not new, see the Nix package manager and the NixOS Linux distribution built on
top of it ([http://nixos.org/](http://nixos.org/)). This paper explicitly
makes the link with functional programming:
[http://nixos.org/~eelco/pubs/nixos-jfp-final.pdf](http://nixos.org/~eelco/pubs/nixos-jfp-final.pdf)

------
mwcampbell
I see that Wunderlist uses EC2. So I speculate that they create a fresh
machine image for each new revision of the application, and also use immutable
machine images for bits of infrastructure such as the database.

But how do we put the idea of immutable infrastructure into practice on bare-
metal servers? One option would be to set up one's own virtualization
infrastructure, such as Xen or KVM. But that would undermine the I/O
performance and resource consolidation advantages of running on bare metal.
Docker looks promising; we just need to discover the best practices for
running specific kinds of services (e.g. web applications) on top of it.

~~~
Ixiaus
Jails have been around for a while in FreeBSD and they are often used _for
this exact purpose_; the concept has only really entered the mainstream due
to widespread usage of EC2/OpenStack and now Docker (which is basically
FreeBSD Jails + ezjail, but for Linux and a little different on some other
fronts).

Nothing beats a bare metal machine with FreeBSD + ZFS pool + ezjail; the
really cool thing about ZFS is that you can build a "base jail" then just use
ZFS to make snapshots of the base jail when creating new jails. So with that
you can create versioned base jails that are running an upgraded ports tree,
or more generally, an upgraded world build!
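The snapshot/clone workflow described above can be sketched in a couple of
commands (the pool and dataset names here are hypothetical):

```shell
# Build the base jail once, then mark each known-good state with a snapshot.
zfs snapshot tank/jails/base@worldv1
# New jails are cheap copy-on-write clones of that snapshot:
zfs clone tank/jails/base@worldv1 tank/jails/app-v2
```

After an upgraded world build, a new snapshot gives you a new versioned base
to clone from, while existing jails keep running on the old one.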

It also makes _transition_ between application versions easy because you can
run a separate database jail(s) (maybe different versions of the db?) and have
a jail for app v1 running while a jail for app v2 is running at the same time.

What I would love to see is a tool for abstracting this deployment pattern on
top of ezjail. I've been considering doing this myself but my time is
stretched too thin :(

~~~
shykes
> _What I would love to see is a tool for abstracting this deployment pattern
> on top of ezjail. I've been considering doing this myself but my time is
> stretched too thin :(_

This is basically what we're doing with Docker. We started with lxc but it can
and will be ported to other OS virtualization backends, including jails and
zones.

Create a new base image from a running container:

    CONTAINER_A=$(docker run -d ubuntu apt-get install curl)
    docker wait $CONTAINER_A
    docker commit $CONTAINER_A shykes/my-ubuntu-with-curl
    CONTAINER_B=$(docker run -d shykes/my-ubuntu-with-curl curl --help)

Share your new image for the rest of the world to enjoy:

    docker push shykes/my-ubuntu-with-curl

Transition between application versions:

    V1=$(docker run -d shykes/myapp:v1)
    V2=$(docker run -d shykes/myapp:v2)
    docker stop $V1

Transition between _database_ versions (sharing of persistent data):

    V1=$(docker run -d shykes/mydb:v1)
    V2=$(docker run -d -volumes-from=$V1 shykes/mydb:v2)
    docker stop $V1

------
DannoHung
[http://docker.io](http://docker.io) is how I'd do this as soon as they have a
stable version for production. Pick a common (or at least mostly common) base
OS for every server, configure your entire system on top of an LXC container,
badda bing, badda boom. Quick, low-profile, repeatable, redistributable, etc
etc etc.

~~~
incision
Much the same thought I have.

I've been doing various things toward the same goal for a long time, but
Docker helps on every count (quick, redistributable etc).

------
krosaen
Related piece from another Fowler:

[http://martinfowler.com/bliki/SnowflakeServer.html](http://martinfowler.com/bliki/SnowflakeServer.html)

~~~
KevinEldon
And also PhoenixServer:
[http://martinfowler.com/bliki/PhoenixServer.html](http://martinfowler.com/bliki/PhoenixServer.html)

~~~
mwcampbell
I like the concept. But where I work, our infrastructure is currently hosted
on leased dedicated servers, where (last time I checked) OS reloads require
manual intervention from the hosting provider and cost something. So I guess
the burning down and rebuilding would have to be done on a layer above the
base OS, e.g. Docker containers.

------
hashtree
Been doing this for years at all levels, and it really works. It allows me, as
a single dev, to spend less than five hours (amortized) a week on a decent-
sized infrastructure to support my business. This includes building multiple
racks of custom servers coloed at multiple DCs, networking, private clouds,
server admin, hardware maintenance, upgrades, deploys, etc... and it saves me
$100k+ per year over AWS and having to hire another person.

~~~
raphinou
Wondering how you manage your database server. Care to share your approach?

~~~
hashtree
I make use of polyglot persistence. Graph, columnar, document, relational,
key-value... each used when they are the right tool for the job. Each scaling
horizontally. It gets easy to blow them away. :)

------
simonsarris
This is essentially what I do with my own server, but largely because I'm too
dim to trust myself to upgrade things well.

That being said, he's abstracting the problem a little bit:

> If you absolutely know a system has been created via automation and never
> changed since the moment of creation, most of the problems I describe above
> disappear. Need to upgrade? No problem. Build a new, upgraded system and
> throw the old one away. New app revision? Same thing. Build a server (or
> image) with a new revision and throw away the old ones.

The key here is _created via automation_. You get a choice, I think:

* You can spend time/energy/worry/entropy on making sure the server is up to date

* You can spend time/energy/worry/entropy on making sure the automated process is up to date

The second one is a little easier, to be sure, though depending on the setup
it could still involve a lot of research, and it's certainly more expensive.

But hey, if you're in a position where trading some _money_ to buy some _time_
is an option and you have the cash to burn, I'd buy the time at every chance.

------
awj
We've been doing this for ages where I work. New EC2 AMIs are built inside a
disk image. We can pretty easily use all of the standard tools through chroot,
so installing packages is relatively simple and flexible.

Instances are ready for use immediately after booting, and deploying new code
is as simple as rebuilding the image and booting it. We can also easily scale
up most forms of servers by simply booting more and adding them to the proxy
as they come up.

The only downside here is that firefighting still kind of remains a two-step
process. Even after investing a decent bit of time optimizing the process, it
still takes 5-10 minutes to build and upload an image. From there it takes
another 5-30 minutes to get EC2 to boot it, so there's a strong temptation to
tweak things on the server. I've taken to patching, testing, and committing
the fix on the tag for that release so I can simply apply a diff to get the
server up to date. Not perfect, but it goes a long way to prevent regressions.
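The "apply a diff" trick is just plain diff/patch; a toy sketch, with scratch
files standing in for the release and the running server:

```shell
# Work in a scratch directory; 'config' stands in for a file on the live server.
cd "$(mktemp -d)"
printf 'version=1\n' > config
printf 'version=2\n' > config.new
# diff exits non-zero when the files differ, so don't let that abort the script
diff -u config config.new > fix.patch || true
# Ship fix.patch to the server and apply it to bring the file up to date
patch config < fix.patch
cat config
```

Committing the same patch on the release tag means the next full image build
already contains the fix, so the hand-applied change can't drift.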

~~~
mwcampbell
Do you mount the root file system read-only in these images? That seems like
it would be a great way to lock things down further, if it's practical.

~~~
awj
No, mostly because it's inconvenient. Making the root filesystem read-only
means having to track down everything that tries to write to it and fix mounts
for them.

After a few years of working with this, I kind of wish we'd bitten that bullet
at the start. The AMIs we produce have a root filesystem that's only like 2
gigs so it goes across the network faster. 2 gigs isn't a lot of space when
something decides it wants to log or write out temp files in weird places.

------
ckdarby
So, basically, this article doesn't provide any insight or technical
background on how any of this was achieved.

Just promotes the author for conference speaking.

Well played, good sir, you got on the front page of HN. Well played.

~~~
brudgers
Thought experiments such as the author's are the insight.

One of the distinguishing features of a thought experiment is a lack of
implementation details in the domain to which the analogy is applied.

Better not to become bogged down trying to build a faster than light
locomotive.

------
bumeye
This idea of immutable infrastructure reminds me of NixOS [1], a Linux
distribution where nearly the whole filesystem is immutable.

Definitely worth a look. I'm not sure if it's fit for production usage, but I
think it's the future of operating systems :).

[1] [http://nixos.org/nixos/](http://nixos.org/nixos/)

~~~
edwintorok
It has some features to deploy to a cluster as well:
[http://nixos.org/disnix/](http://nixos.org/disnix/)

And I find its 'nixos-rebuild build-vm' command quite handy to test changes.

On the other hand, immutability of the OS image doesn't mean you can easily
roll back after an upgrade, especially if you've been running the new version
for a while, because you might still have mutable state (see below).

Sure if you store all your configuration in Nix, then it is possible to
rebuild the old configuration as it was, but your mutable state is not managed
by Nix.

One such example is database files. You've upgraded to the new configuration,
and perhaps even had a script to auto-upgrade from your old schema. Now you
want to go back: what should happen to the database? Should it be entirely
rolled back (from backups) to an old snapshot? But then you lose all the new
data. Should you roll back the schema change? Again, what happens to the new
data that doesn't fit the old schema?

Another example of mutable state is remote nodes that you depend on for a
particular service. You can roll back your local machine to an old version,
but will it still be able to talk to your new remote nodes?

It seems that the only way to handle this is to make sure you are always at
least 1 major version backwards compatible, so you can freely upgrade or roll
back machines that are part of a cluster.

~~~
edwintorok
Apparently there is also NixOps for network/cluster deployment:
[http://hydra.nixos.org/build/5426864/download/1/manual/manua...](http://hydra.nixos.org/build/5426864/download/1/manual/manual.html)

------
oliao
I think the point made here is quite interesting: instead of relying on tools
(Puppet, Chef, ...) to apply "deltas", always generate the whole system from
scratch. This has a couple of advantages:

* You make sure everything is automated

* You can always trust that the resulting system works, since it is rebuilt from scratch every time

~~~
brudgers
Immutable hardware requires [an analogous] garbage collector.

Most of the next machine would have the same state as the old one:

    ; Machine -> Machine
    ; Makes the next machine
    (define (next-machine m)
      (make-machine
        (machine-name m)
        (machine-cpu m)
        (machine-storage m)
        (update-os CURRENT-OS)
        (machine-applications m)))

------
Roboprog
Throw a little outsourcing and offshoring into the mix, with people on offset,
disjoint schedules, and this confusion gets even deeper.

I love the idea of building servers on the fly from an image under some kind
of revision control.

~~~
lmm
Netflix does it right IMO: they have an automated process that builds a
complete AMI from source control.
[http://www.infoq.com/news/2013/06/netflix](http://www.infoq.com/news/2013/06/netflix)

------
tieTYT
I think I must be missing something obvious here, but how can you do this if
you use something with state like a database? How do you create a production
system from scratch without losing all the data?

------
JulianMorrison
So you deploy a new physical server to the rack? Bundle the OS image with the
deploy? Ghost it with the OS image before deploying? The article is light on
explaining the "how" part.

------
AsymetricCom
The level of discussion here is about one level separated from buzzword bingo.

