
12 Fractured Apps - kelseyhightower
https://medium.com/@kelseyhightower/12-fractured-apps-1080c73d481c
======
andrewguenther
The article starts off by talking about trying to run legacy applications and
then just casually tells you to modify the source code to solve all your
problems. Not everything you want to run in a container is something which you
want to modify.

Also, I really don't see the problem with the volume mounting approach. It may
look ugly on the command-line, but when you're using an orchestration tool, it
is actually quite painless and solves a lot of the issues mentioned in the
article.

I completely agree that one should avoid running a custom entrypoint. They are
often written and then forgotten, and can lead to really nasty bugs.

Lastly (this is more a criticism of Docker than of the article): writing to
stdout is all well and good, but Docker does a terrible job handling it. There
is no way to truncate logs written to stdout from a Docker container, so
Docker just holds onto the entirety of the log contents until the container is
removed. For long-running applications, this makes logging to stdout a
deal-breaker. In a move absolutely contrary to my last point, I commonly use a
custom entrypoint purely to handle logging. It passes all arguments to the
application and then redirects stdout and stderr to cronolog, which writes to
a volume mounted in my log-pulling container.
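
A minimal sketch of such an entrypoint, assuming cronolog is installed in the
image and /logs is the volume shared with the log-pulling container (paths and
filename pattern are illustrative):

    #!/bin/sh
    # Pass all arguments through to the application and pipe its stdout and
    # stderr into cronolog, which writes dated log files onto the shared
    # volume.
    "$@" 2>&1 | cronolog /logs/app.%Y%m%d.log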

~~~
ploxiln
I created this logrotate rule to rotate the docker json log files:

    /var/lib/docker/containers/*/*-json.log {
        copytruncate
        rotate 2
        size 40M
        missingok
        nocreate
        nocompress
    }

But for most services I still mount a log dir and redirect stdout and stderr
to a file manually, since neither the json log format nor that directory
structure are particularly convenient for my purposes.

More recent Docker releases have added other logging drivers, as well as
"max-size" and "max-file" options for the "json-file" logging driver, which
sound appropriate, but that came after I worked around the problem and I
haven't upgraded yet.
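
On a newer Docker, that would presumably look something like this (the size
and file-count values here are arbitrary):

    docker run --log-driver=json-file \
        --log-opt max-size=40m \
        --log-opt max-file=3 \
        my-image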

------
Weizilla
Two things I never understood about using environment variables: how do you
version control the changes, and how do you manage these variables when you
have more than just a handful of them?

In this example, it's only six simple values but what happens when you have 10
or 20? Or you have 10 applications with six values each that need to be
deployed to four different environments?

Or what if multiple teams are making concurrent changes? What if some
application starts failing due to a recent variable change and you want to
revert it or track down who made the change?

I feel like once your application grows to more than just a few simple
values, you end up creating a big file to populate these values, and you're
back to using configuration files.

~~~
deedubaya
Using environment variables for everything is wrong too. API keys and other
sensitive information should be in environment vars. Non-private information
should definitely be in config files.

If you need the flexibility of environment variables for a semi-configurable,
non-sensitive setting, use them to override a sensible default.
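
A minimal Go sketch of that override-a-sensible-default pattern (the setting
names are just placeholders):

    package main

    import (
        "fmt"
        "os"
    )

    // getenvDefault returns the environment variable's value, or the given
    // default when it is unset or empty.
    func getenvDefault(key, def string) string {
        if v := os.Getenv(key); v != "" {
            return v
        }
        return def
    }

    func main() {
        // Non-secret, semi-configurable settings with sensible defaults;
        // the platform can override them via the environment.
        listenAddr := getenvDefault("LISTEN_ADDR", ":8080")
        logLevel := getenvDefault("LOG_LEVEL", "info")
        fmt.Println("listening on", listenAddr, "with log level", logLevel)
    }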

~~~
andybak
Can someone explain to me why env vars > filesystem for secrets? They seem
equivalent in most ways that actually matter.

In general, 12-factor gets my hackles up because it comes across as
dictatorial without explaining why. Even when I'm wrong I like to be gently
convinced rather than hit over the head with a rule book. Can someone point me
to an extensive source that clearly justifies each factor? Ideally with an
actual debate about each point (as this often surfaces the strongest parts of
the case for something).

~~~
nzoschke
I have a tremendous amount of experience with the 12 Factor book, having
worked at Heroku for 6 years. I am also working on an open source 12 Factor
platform called Convox.

One reason the factors are presented as prescriptive is that apps that don't
follow them won't work on Heroku.

Is there a specific factor you'd like to deep dive into?

I'll pick one to start: Environment.

There are many ways you can set and read configuration for an app: env vars,
config files, config tools like Chef or Puppet, or a config database like
ZooKeeper or etcd. If we are talking about config like a database URL, you
could also use a service discovery system.

Env represents the simplest contract between your app and whatever platform is
running it (the OS, Docker, Heroku, ECS).

If the platform can update env and restart the processes to get the new
settings, no other config management is necessary.

It's UNIX, it's simple, and it helps you bootstrap any more specialized config
management if you need it (set ZOOKEEPER_URL or CHEF_SERVER_URL).

So ENV feels like a factor to become very prescriptive about.

The biggest debate I can see is whether ENV is sufficient to build our
microservices on, or whether service discovery "magic" is necessary too, i.e.
ZooKeeper, Airbnb SmartStack, or Docker ambassador containers.

For the vast majority of apps, ENV is sufficient.

I personally still build my more complex apps around ENV and avoid at all
costs needing a service discovery system. The added complexity and operational
overhead isn't worth it to me.

I have a strong hunch that service discovery won't become an app development
pattern that everyone uses until a managed platform (like Heroku) offers it.
Perhaps this is where Docker, Swarm and Tutum are headed.

------
nzoschke
Thank you for this guide Kelsey.

I worked with countless apps and developers at Heroku on getting their apps
running well on the platform. There was always one great mystery: why not
build our apps a bit differently (dare I say better) to work in the cloud?

The database connection pattern is spot on. For any network resource, try to
connect and, if there is a problem, retry with backoff.

Also log the connection error events so that a monitoring tool can alert on
them.
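
A rough sketch of that connect-and-retry pattern in Go, assuming a
DATABASE_URL env var and a Postgres driver (both are illustrative assumptions,
not details from the article):

    package main

    import (
        "database/sql"
        "log"
        "os"
        "time"

        _ "github.com/lib/pq" // hypothetical choice of driver
    )

    func main() {
        db, err := sql.Open("postgres", os.Getenv("DATABASE_URL"))
        if err != nil {
            log.Fatal(err)
        }

        // Retry the initial connection with capped exponential backoff
        // instead of crashing, and log each failure so a monitoring tool
        // can alert on it.
        backoff := time.Second
        for {
            if err := db.Ping(); err == nil {
                break
            } else {
                log.Printf("database not ready: %v; retrying in %s", err, backoff)
            }
            time.Sleep(backoff)
            if backoff < 30*time.Second {
                backoff *= 2
            }
        }
        log.Println("database connection established")
    }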

I've seen apps with the absolute worst behavior around this error, which will
happen. The worst is crashing the app in a way that triggers restart
thrashing.

We had to build tons of complex restart back off logic into the Heroku
platform to handle this.

I often wish app frameworks made this easier. I think most devs don't do
these things because it is a chore for a problem that only happens
occasionally.

But what if Rails baked this into ActiveRecord?

At one point Rails only logged to files on disk. We came together to add
stdout logging to the framework.

~~~
kelseyhightower
You're welcome.

You have managed to capture the spirit of the post. The goal was to highlight
areas where developers can take action, and how improving even the little
things can go a long way to improving the entire system -- even the one you
can't see.

In the early parts of my career I would often take pride in building complex
systems to accommodate misbehaving applications. Throw in some fancy Nagios
alerts and a sleep-depriving on-call rotation; I looked like a hero.

Then I learned how to write code.

This was the turning point in my career (2006). I was now brave enough to
modify "legacy" production applications to take advantage of "new"
infrastructure features like service discovery (use DNS records instead of
IPs), and logging directly to syslog (asynchronously with proper ring
buffers).

I was willing to learn any language too: Python, PHP, or Java, it did not
matter because it allowed me to take action and contribute at the heart of the
application.

I'm not saying platforms that also handle "misbehaving" applications or
complex failure scenarios are unnecessary. I just consider those platforms as
extra layers of protection, not a free pass to ignore building applications
that take responsibility for reliability from startup to shutdown.

------
merb
Everything written in this article is easily done without Docker, just with
cgroups / systemd. Docker makes these things way harder to do, especially
since packaging isn't so hard anymore. It's mostly dynamic languages that are
harder to package; when I think about Java, Go, or other compiled languages,
you could mostly just create a single file which you could version.

------
jsnk
This is an aside, but somewhat related.

Where are you supposed to store secret environment variables like a database
password or API key/secret pair, etc. (say, on an Ubuntu server)? Is storing
them in something like ~/.profile or ~/.bashrc and doing `export
SECRET_KEY='plaintext_secret'` on the server enough, or should they be treated
in an even more secretive manner?

~~~
Perceptes
This is a problem that several new tools are being built to address, e.g.
Vault from HashiCorp and Keywhiz from Square. Storing the secrets unencrypted
on disk on the host system is not a huge improvement over having them in the
application by default. Ideally you want a system to store them securely that
allows them to be extracted and decrypted only using credentials and policies
you control. They should only ever exist in memory (which is why Keywhiz uses
FUSE, for example.) Some container orchestration tools like Kubernetes also
include their own mechanism for securely storing and retrieving secrets and
making them available to applications.

------
allan_s
How do you manage things like the database schema (other than by switching to
a schema-less database :))? Is it your software that is supposed to create it
if it doesn't exist?

In case the database is pristine, OK, I see: I do a "create table if not
exists".

But if the database is at version N and I want to go to version N+1, what do
I do? I do know about database migration tools, but how do they integrate into
your "pure 12factor" deployment, given that when you deploy you need to have
at least this order:

1. bring up the database
2. run the migration script
3. bring up the application

and the article was advocating doing things in a way that you don't need a
"you MUST first run this, then that" ordering.

~~~
spotman
One way to think about this, which maybe overlaps with the theme of this
article but also stands on its own with specific regard to database
modifications, is that you often need the ability to have multiple versions of
everything alive at one point in time.

So maybe that's a schema A and a schema B, or maybe you have applied schema
B, which only app version 1.1 is optimized for, but version 1.0 is what is in
production immediately following your database migration. So you can't make
changes in schema B that would immediately render app 1.0 broken, which means
you need to avoid boxing yourself into a corner with assumptions about the
future as much as possible.

Ultimately, if downtime is not an option, you end up writing these
capabilities in at every layer. Whether it's an API endpoint or code talking
to a database, you often have to make carefully thought-out changes
incrementally to ensure that things can all operate simultaneously, and often
this ends up with metadata about the versions of everything as an option for
taking different code paths.

This article touches on this in the way that it suggests making your app deal
with both an available database and one that is not available. The same goes
for a field in a schema or a payload. To make your code less brittle, instruct
it what to do in both cases.

------
davidbanham
At no point does the author elaborate on why failing to start if the
environment isn't sane is a bad thing. All my software checks for the things
it expects to be in place, then bails hard and fast if they aren't.

It's then up to the init daemon to attempt to restart that process, and up to
the monitoring and orchestration tools to ensure that the environment returns
to sanity over time.
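
A minimal Go sketch of that kind of fail-fast check (the variable names are
hypothetical):

    package main

    import (
        "log"
        "os"
    )

    func main() {
        // Bail hard and fast if required settings are missing, and let the
        // init daemon or orchestrator handle restarts.
        for _, key := range []string{"DATABASE_URL", "API_KEY"} {
            if os.Getenv(key) == "" {
                log.Fatalf("required environment variable %s is not set", key)
            }
        }
        log.Println("environment looks sane, starting")
    }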

~~~
ozim
I think the author just does not have experience outside of what he does.
Maybe his systems can fall back to sane defaults. But what is a sane default
if you have to communicate with a third-party server and your system is
worthless when the connection is not there? You have to have an IP/domain name
configured.

With cars, if something is wrong, then in some cases you can still start and
even drive, but the user gets a warning. If something is really wrong, the car
will not start.

So I think what the author suggests is at least asking for trouble.

Almost everywhere, as quoted:

"Everything in this post is about improving the deployment process for your
applications, specifically those running in a Docker container, but these
ideas should apply almost anywhere."

~~~
lwf
The author is a she.

~~~
parasubvert
Kelsey isn't a she :)

~~~
lwf
Oh, welp. That's what I get for gendering names. My apologies, to both you and
Kelsey.

------
kazagistar
There is one thing that really confuses me about 12 factor...

It suggests that you provide the locations of backing services in the config.
This seems insane, since it means that you cannot move any backing services.
Do they expect you to restart when you switch backing services? Do they expect
you to run all your internal traffic through load balancers?

We provide each container with the address of the local service discovery
server (Consul) and it finds what it needs itself, when it needs it. I assume
everyone using this kind of setup in production is doing something similar?

~~~
sagichmal
"Providing the location of the backing services" doesn't mean the physical
address, but the logical address, managed by an e.g. load balancer, capable of
abstracting over physical changes as necessary. Consul is one way to do it;
DNS is another, and there are many more.

------
jacques_chester
> _As you can see there’s nothing special here, but if you look closely you
> can see this application will only startup under specific conditions, which
> we’ll call the happy path. If the configuration file or working directory is
> missing, or the database is not available during startup, the above
> application will fail to start._

Why not let a PaaS do this for you? Heroku, Cloud Foundry, OpenShift or the
others I've yet to learn about?

Disclaimer: I work for Pivotal, which donates the majority of engineering
effort to Cloud Foundry.

~~~
cpitman
I agree with the general idea: let an already existing product handle
problems that everyone has. The last part is still important though; if we are
going to build "distributed" apps, then we _need_ to handle those dependencies
failing or being briefly unavailable.

And the reality is that every app becomes distributed as soon as it has a
database or client. A ton of legacy applications make the implicit assumption
that the network is reliable, and fall over hard when it isn't true anymore.

~~~
jacques_chester
I agree that legacy apps break 12 factor rules. That's why the 12 factor app
is "a thing" in the first place. An app which follows them can survive being
killed and restarted.

I've worked on some legacy migrations and it's usually a process of
incrementally chiselling out services and cleaning up hard-coded assumptions.
Tedious but usually doable.

------
kimi
Whaleware was created to address some of these issues with Docker. Default
configuration, a definite application init phase, plus internal monitoring and
lifecycle reporting.
[https://github.com/l3nz/whaleware](https://github.com/l3nz/whaleware)

------
akramhussein
If you haven't seen Kelsey give a talk, it's well worth watching - he's a
really funny and intelligent guy. He gave one on Nomad and Kubernetes that's
good.

