Twelve Factors of Web Application Development (12factor.net)
250 points by Jd on Nov 22, 2011 | 37 comments

This is a very excellent list, and reads almost exactly how I've documented my personal standards and best practices. There are a few things that I feel are left out, and should definitely be just as important:

1. Testing. Having automated tests that your developers are ideally writing first makes a world of difference. Throw it into your continuous integration system, and nurture a culture around it. If your tools and language allow it, having a lint or style checker as part of your build process is also very handy.

2. Development / Debug Mode. There should be a switch that you can flip when working from localhost, development, and staging to immediately obtain tons of data to aid you in debugging your application. Django-Debug-Toolbar is the holy grail when it comes to using an out of the box solution. Seeing breakdowns of queries called, time spent doing various functions, quick log lookups, etc is essential.

3. Production Error Tracking. If you're still relying on grep'ing logs or being sent 5xx error emails as your only form of error control you're in the stone ages. Go use a tool or service like Sentry to turn this into a centralized system.

4. Issue Tracking. Along the same lines as version control, this should be a no-brainer. A centralized resource for keeping track of any formalized bugs, errors, or planning future features.

5. Analytics and Statistics. Can you tell me right now the delta of 4xx and 5xx errors in your application between your two most recent deploys? How healthy is your app over time? How many sign-ups, returning users, login failures, and purchases have been made? Tracking key business metrics and actionable events will really give you fantastic insight into your product. It's also a great way to catch critical system errors before they take down your site for hours on end, via tracked spikes in latency, cache misses, queue size, etc.
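To make the deploy-to-deploy question in point 5 concrete, here is a minimal Python sketch. The status-code samples and the function names are made up for illustration; a real setup would pull these numbers from logs or a metrics service.

```python
from collections import Counter

def error_counts(status_codes):
    """Count responses per error class ("4xx", "5xx") in a list of HTTP status codes."""
    counts = Counter()
    for code in status_codes:
        if 400 <= code < 500:
            counts["4xx"] += 1
        elif 500 <= code < 600:
            counts["5xx"] += 1
    return counts

def error_delta(previous_deploy, current_deploy):
    """Return how much each error class grew (positive) or shrank (negative)."""
    before = error_counts(previous_deploy)
    after = error_counts(current_deploy)
    return {cls: after[cls] - before[cls] for cls in ("4xx", "5xx")}

# Hypothetical samples of status codes seen after each deploy.
previous = [200, 200, 404, 500, 200]
current = [200, 404, 404, 500, 500, 503]
print(error_delta(previous, current))
```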

Issue trackers are mixed blessings. We deliberately chose not to work out of one at my current workplace for various reasons. Instead, if we encounter a bug or a user reports one, we deal with it on the spot or we add a task for it in Scrumy, which we use with weekly sprints. If something's not high priority enough to go in there, we just drop it instead of amassing a backlog that's kept out of sight. It's not a problem if we forget about a low priority bug, since it's probably not worth fixing if we're not often reminded of its presence by re-encountering it. Of course, this doesn't work equally well for all teams and products. But I wouldn't call an issue tracker an absolute no-brainer.

It sounds like Scrumy is your issue tracker.

It is only insofar as issues are recorded as tasks to be completed that week. Adding a task to Scrumy means that we're making a commitment to work on it within that week. Some things carry over into the next week, but we don't allow it to become a rolling backlog. So I think it's unfair to call it an issue tracker.

Production Error Tracking

Concur. It is an immensely useful tool, simple as it may be to set up. There are services that do it for you (getexceptional), but it literally takes only a few hours to throw your own together.

Issue tracking is probably moving too much into management territory for this list, which is why I think they left it out.

It's probably for the same reasons that they don't touch upon server management/configuration, but I have personally found puppet (or other similar tool) very useful. It's basically source control for your server configuration - once you start using it, it seems reckless to not have.

"The twelve-factor app stores config in environment variables (often shortened to env vars or env). Env vars are easy to change between deploys without changing any code; unlike config files, there is little chance of them being checked into the code repo accidentally"

OK guess I'm a moron here but....surely the suggestion isn't that a huge set of env vars are entered at the console manually - they have to be in some shell script or file which produces them...which is then...the config file ! Someone help me see what I'm missing here.

Unless the idea is, "well it's a shell script! that's nothing like a config file!" in which case I'll be searching for the HN downmod button I can't wait to have someday.

`envdir` [1] is a common pattern for maintaining environment with a directory of key/value files.

If you use `foreman` [2] to manage application process formations, it will source a '.env' file before running.

You could call either pattern a "config file", but the important part is that it's actual Unix environment variables, and there is a safe and secure place to store the variables outside of the code repository.
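For anyone who hasn't seen the envdir layout, here is a simplified Python sketch of the idea: one file per variable, with the file name as the key and the first line as the value. It is not a reimplementation of `envdir` (for example, real envdir removes a variable when its file is empty, which this sketch does not do), and the variable names in the demo are made up.

```python
import tempfile
from pathlib import Path

def load_envdir(directory):
    """Simplified envdir-style loader: file name is the variable, first line is the value."""
    env = {}
    for entry in sorted(Path(directory).iterdir()):
        if entry.is_file():
            lines = entry.read_text().splitlines()
            # Like envdir, trailing spaces and tabs are dropped from the value.
            env[entry.name] = lines[0].rstrip(" \t") if lines else ""
    return env

# Demo with a throwaway directory; the variable names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "API_PASSWORD").write_text("abc123\n")
    Path(tmp, "DATABASE_URL").write_text("postgres://localhost/app\n")
    demo_env = load_envdir(tmp)

print(demo_env)
```

The resulting dict can be passed as the environment of a child process, which is roughly what `envdir d child` does.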

Then... you can take a huge leap forward when running on Heroku. Heroku has an API to set environment variables:

  $ heroku config:add API_PASSWORD=abc123

That value will be present in the runtime environment for the life of your application. So you do enter them manually -- but only one time when setting up an app.

The really clever part is that 3rd party add-on providers can also set environment variables on your app with a similar API. So if your database is an add-on:

  $ heroku addons:add redistogo
  $ heroku config
  REDISTOGO_URL => redis://user:pass@host-1:9492/
  $ heroku addons:upgrade redistogo:medium
  $ heroku config
  REDISTOGO_URL => redis://user:pass@host-2:9133/

If settings are in config files, you would have to deploy new code to change settings, and you wouldn't have a way for your hosting platform to help manage settings.
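On the application side, consuming such a variable might look like the Python sketch below. The function name is hypothetical, and it only uses the standard library to pull the pieces out of the URL; which Redis client you hand them to is up to you.

```python
import os
from urllib.parse import urlparse

def redis_settings(environ=os.environ):
    """Split the add-on URL from the environment into connection parameters."""
    url = urlparse(environ["REDISTOGO_URL"])
    return {"host": url.hostname, "port": url.port, "password": url.password}

# Demo using the example value shown above instead of the real environment.
settings = redis_settings({"REDISTOGO_URL": "redis://user:pass@host-1:9492/"})
print(settings)
```

Because the app reads the URL at startup, an upgrade that changes the host and port needs a restart but no code change.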

[1] http://cr.yp.to/daemontools/envdir.html

[2] http://ddollar.github.com/foreman/

In a well-run environment there are generally special tools that can provision a new node, and these kinds of environment configurations are generally set up there. Some people do manage it in source control and just have a branch for each environment. Both work, but one provides an avenue to offload the task from a developer onto an infrastructure-type role.

The choice of a config file and/or environment variables seems more sensible.

I personally am not a fan of config files at all. A lot of this comes down to preference and experience, but I have seen the same problem repeated over and over: environment configuration is put in config files, someone checks the config in with the project, someone else checks it out, and then it gets installed in an environment. It is considered good practice to separate configuration data from the code entirely, so that a commit to the source code, config files included, cannot blow up an environment. Many recommend, as this author does, putting the config in environment variables, so that it is wholly separated from the application and an update to the application cannot blow up an environment. I agree with the conclusion and, through experience, have found it to be the best solution as well. That being said, using a tool like Puppet to automate the process and turn it over to infrastructure is pretty simple. Some groups don't like to use tooling; for them, keeping an environment configuration script in a separate repository is just as valid a process and will reduce the failure points.

I've arrived at the same conclusion recently, at least as far as keeping production configuration out of config files and putting them in environment variables instead. I'm curious, what is your take on configuration for the development and test environments?

I'd prefer that a project can be set up for development as quickly as possible, so my current approach is to check in default configurations for the development environment that are overridden with environment variables in production.
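A minimal Python sketch of that approach, assuming made-up setting names and development defaults: the defaults live in the repo, and any variable present in the environment wins.

```python
import os

# Development defaults that are safe to check in (hypothetical values).
DEFAULTS = {
    "DATABASE_URL": "postgres://localhost/myapp_dev",
    "CACHE_URL": "memcached://localhost:11211",
}

def load_config(environ=os.environ):
    """Use the environment's value when set, otherwise the checked-in default."""
    return {key: environ.get(key, default) for key, default in DEFAULTS.items()}

# Demo: production sets only DATABASE_URL, so CACHE_URL falls back to the default.
production_like = {"DATABASE_URL": "postgres://db.example.com/myapp"}
config = load_config(production_like)
print(config)
```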

It depends on the technology that you use, but many runtime environments allow the IDE to provide environment-like configuration in the project properties. I tend to prefer that approach over specialized config files because the project files are wholly separate from the code base. For example, the JVM allows you to pass flags to it on initialization, and NetBeans and Eclipse provide an interface to manage those flags from a development perspective. To the JVM they look no different than environment variables, so in the absence of the project files it transparently gets that configuration from the environment. I am a fan of that solution over specialized config files that are developed by the development team.

Two questions. 1) Do you do that for everything? Including database and memcached connection settings, cache settings, basically everything that's configurable about a web app? I'm not an IDE user nor familiar with Java, so that's a serious question. 2) How do new developers get an environment set up quickly?

1) Yep, everything. The code can hit an environment and run based on that environment; no configuration is provided with the code. That being said, how you implement it can be pretty flexible: you could separate everything out into separate variables, or have just one variable with an XML string as its value and parse that string to get the configuration. The point is to have the configuration injected into the application by the environment. What those variables look like is the prerogative of the development team.
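The single-variable variant could look something like this in Python. The variable name `APP_CONFIG` and the XML layout are assumptions for the sake of the example, not anything prescribed by the article or the comment above.

```python
import os
import xml.etree.ElementTree as ET

def config_from_env(environ=os.environ, var="APP_CONFIG"):
    """Parse one XML-valued variable into a dict of {element name: attributes}."""
    root = ET.fromstring(environ[var])
    return {child.tag: dict(child.attrib) for child in root}

# Demo with a hypothetical value instead of the real environment.
sample = {
    "APP_CONFIG": '<config>'
                  '<database host="db.internal" port="5432"/>'
                  '<cache host="cache.internal" port="11211"/>'
                  '</config>'
}
config = config_from_env(sample)
print(config["database"]["host"])
```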

2) There are several options. One is a developer ("gold CD") virtual machine, where the VM is set up and configured from up-to-date configuration and an automated script. Another is an environment script that sets up the development variables on the developer's workstation (just make sure it is stored in version control independent of the code). Each works well; it just depends on preference. There are a lot of ways to set it up, so try a few and see which one best fits your development culture.

Thanks. I'm definitely on board with #1. I'm still trying to figure out what will work best for me for #2. It's interesting (and useful) to hear how others solve it.

yeah, I'm not a fan of using environment variables. Using symlinks has always seemed more stable to me.

"If the app needs to shell out to a system tool, that tool should be vendored into the app"

I have never seen this done before. Is this really a best practice? I mean at some point your app has to depend on certain tools in place by the OS, right?

I was a little skeptical about this piece of advice too. How would one go about doing this? Let's say the binary in question is curl. Do you vendor binaries for each operating system that you might deploy to? Or do you vendor the source code and compile it for your target system in the build step? Does that mean GCC is a dependency? I feel like it's turtles all the way down. What about things like Ruby gems that require native compilation? Do you vendor the libraries that you need in order to compile the gems? For example, the pg gem requires PostgreSQL development files installed in order to compile.

I think it depends on the tool in question. If your application uses the tool heavily, in customized or unusual ways, it might be worth packaging it with your application.

Examples could be varnish and nginx and postgresql.

You probably don't need to package postfix.

Abbreviated version:

(1) One codebase tracked in revision control, many deploys (running instances, typically including a production and one or more staging sites)

(2) Explicitly declare and isolate dependencies (e.g. a Gemfile for Ruby)

(3) Store config in the environment NOT code, "A litmus test for whether an app has all config correctly factored out of the code is whether the codebase could be made open source at any moment, without compromising any credentials."

(4) Treat any service the app consumes over the network as attached resources, a 12-factor app "should be able to swap out a local MySQL database with one managed by a third party (such as Amazon RDS) without any changes to the app’s code."

(5) Strictly separate build, release, and run stages.

(6) Execute the app as one or more stateless processes. "The twelve-factor app never assumes that anything cached in memory or on disk will be available on a future request or job."

(7) Export services via port binding. "The twelve-factor app is completely self-contained and does not rely on runtime injection of a webserver into the execution environment to create a web-facing service. The web app exports HTTP as a service by binding to a port, and listening to requests coming in on that port."

(8) Scale out via the process model. "processes are a first class citizen... the share-nothing, horizontally partitionable nature of twelve-factor app processes means that adding more concurrency is a simple and reliable operation."

(9) Maximize robustness with fast startup and graceful shutdown. "Twelve-factor app’s processes are disposable... should strive to minimize startup time...shut down gracefully when they receive a SIGTERM signal from the process manager...should also be robust against sudden death"

(10) Keep development, staging, and production as similar as possible. "Twelve-factor app is designed for continuous deployment by keeping the gap between development and production small... resist[ing] the urge to use different backing services between development and production"

(11) Treat logs as event streams. Twelve-factor app "never concerns itself with routing or storage of its output stream"

(12) Run admin/management tasks as one-off processes.

The site is down for me right now so I'll plead ignorance about #4.

How does one go about implementing #4? I can abstract away a REST API call for Avatars, but how does one do that for a database? Or is this just a fancy way of saying (in Java-speak), use Interfaces.

[Edit: the site is back up and I was able to read it. The premise reads differently than how the parent wrote it. I read it mainly as, the sysadmin should be able to switch to a new host, change the configuration file, bounce the application, and everything should work.]

A great article, even if a bit self serving: written by Heroku people, and it is relatively easy to develop and deploy Twelve Factor Apps using Heroku services.

I like that the article was written as a response to observing problems deploying real apps. I just emailed this link to two of my customers, just in case they did not see it. There are too many good points in the article to comment on.

This looks like a very good list. I wish the author had made it available as a single page instead of 13 pages, so that I could save it to Instapaper more easily.

In "II. Dependencies", how could one approach this in PHP?

For example, I would build an app with a dependency on Smarty templates. The main app is in a git repo, but Smarty is required to be also installed on the system. The config file of the app defines the location of Smarty on the system.

I use Phing to download whatever version I want during build time and to place it in a lib/vendors/ directory. So, for example I use pieces of Zend and the Google AdWords PHP SDK with my recent project, so in a build.settings file, I have:

  google_adwords_version=2.6.4
  zend_framework_version=1.11.6

Then, Phing will read in that file and convert those to properties. From there, I can wget the versions of those packages, untar them, and place them in lib/vendors/.

This makes it very simple to test with a newer version. Just update the file and rebuild and a newer version will be downloaded and installed.

12factor.net is not connecting for me right now. Google cache says it is this: http://twelve-factor.herokuapp.com/

It's working for me. The DNS propagation has made it to the midwest of the USA. Beyond that I can offer nothing. :-)

Just in case anyone else was put off by the horrible web fonts and absurd claim in the very first sentence, I'll just add another comment that on balance this is an interesting piece if you stick with it.

Is there a link to a pdf version or a single page version?

Point 3 - Config: The claim is that you shouldn't have a config file that you don't check into source control, like Rails' config/database.yml or the common Django settings_local.py. Instead you should use environment variables. Um. I see the point, but I also see two problems with this.

1) I'm certainly not going to set env variables with database passwords and such by hand. I'm going to have a script to do it for me. I might even call it, oh, I dunno, settings_local.py. In which case, I'm right back at square one, needing to not check this into source control. In other words, how do you avoid having the database passwords in SOME file, SOMEWHERE? Does it "fix" anything if it's named "fabfile.py" instead of "settings_local.py"? I can't see how.

2) How do I deploy multiple apps to the same server? Let's say I've got a linode (or EC2 instance, or whatever) running a test, dev, and production deployment of three different apps. With config files, I just have a different settings_local.py file in each deployments project directory. With environment variables...what, do I use prefixes? app1_test_dbpass = 'foo'; app2_prod_dbpass = 'bar', and so on? Except they say NOT to group stuff together into environments by name. So uh...how do you manage name collisions?

Basically, they seem to have identified a real problem, then proposed a solution that doesn't fix it and doesn't work. What am I missing?

I recently switched from various settings.yml files to a bash script. I can't believe I hadn't made the move sooner! Such a huuuge improvement.

A few good reasons:

1) Language agnostic -- I can use common configuration for Ruby, Node.js, Makefiles, Bash scripts, etc. Without having to find/use a parser for my config file settings, nor worrying about executing a Ruby script from a Python process, or whatever.

2) Real programming constructs -- I can easily test a boolean condition, interpolate variables, and otherwise minimize copy/paste between configuration sections.

3) Available at bootstrap -- Don't need to install Ruby or Python or anything like that. The very first scripts I run can use the environment without compromise. Particularly useful because machines are bootstrapped with Makefiles, which play well with the environment.

I modeled my app env script on the behavior of the `env` command (in fact, I delegate to it). See an example script in this gist:


Note the examples. By delegating to `env`, you get a few features for free:

1) Executing any process within the environment

2) Passing additional environment variables on the command line

3) Printing of environments (great for diffing!)
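The same idea of running a process inside a given environment can be sketched in Python. Each process gets its own copy of the environment, so two apps on one machine can use the same variable name with different values and no prefixes. The helper name and the `DB_PASS` variable are made up for the example.

```python
import os
import subprocess
import sys

def run_with_env(command, extra_env):
    """Run a command with the current environment plus extra variables, like `env K=V cmd`."""
    env = dict(os.environ)
    env.update(extra_env)
    result = subprocess.run(command, env=env, capture_output=True, text=True, check=True)
    return result.stdout

# The same variable name carries different values in two child processes.
show = [sys.executable, "-c", "import os; print(os.environ['DB_PASS'])"]
print(run_with_env(show, {"DB_PASS": "foo"}), end="")
print(run_with_env(show, {"DB_PASS": "bar"}), end="")
```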

My preferred approach to this is:

1. Have a script that defines the environment variables your app uses, and name that script ~/.bashrc. User accounts are cheap and provide an excellent way to have isolated environments.

2. Check an example copy into the project itself. The application does not source this script itself. This documents the dependencies of the app.

Disclaimer: I've never deployed on anything that wasn't unix-like.

I use chef and runit's chpst (like envdir) to set the environment variables.

The production / staging environment chef configuration is stored in a tightly-locked down git repository that can only be accessed from production.

Chef reads the production configuration information, puts the environment variables into envdir-compliant files in a directory, runit uses chpst/envdir to start the process with the correct environment settings... and that's pretty much it.

The main thing is that your config lives somewhere different than the code. They can both be in source control somewhere. Though I generally keep a localhost config with my code.

Just define an env variable or something that contains the location of settings.py, in a directory that's managed by something like chef/puppet.
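A rough Python sketch of that last suggestion, assuming a hypothetical variable named `APP_SETTINGS_PATH` that points at a settings file kept outside the code repo:

```python
import importlib.util
import os
import tempfile

def load_settings(environ=os.environ, var="APP_SETTINGS_PATH"):
    """Import the settings module from the file path stored in an environment variable."""
    path = environ[var]
    spec = importlib.util.spec_from_file_location("app_settings", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

# Demo: write a throwaway settings.py and point the variable at it.
with tempfile.TemporaryDirectory() as tmp:
    settings_path = os.path.join(tmp, "settings.py")
    with open(settings_path, "w") as f:
        f.write("DATABASE_NAME = 'example_db'\n")
    settings = load_settings({"APP_SETTINGS_PATH": settings_path})

print(settings.DATABASE_NAME)
```

The app only knows the variable name; where the file lives and what it contains is left to whatever manages the machine.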

Yeah, that part didn't make much sense to me. Say I have 5 different projects, which when developing may need 5 different sets of environment settings on my local computer.

So obviously I need something that contains and applies these settings. What makes this any less likely for me to accidentally commit that than the files they're objecting to?

Point 7 - Port Binding: I'm interested in hearing anyone else's views on this. How would you even do this for a Django app? And is this really better than just using WSGI? Is anyone using Tornado with Django in production? Google showed a couple proof of concept demos, but nothing serious.

At the moment, I have some apps deployed with nginx > uwsgi > django, taking advantage of the fact that nginx has built in support for uwsgi these days. This breaks rule 7, since my app isn't just binding to a port - but would I really be better off by using nginx > tornado > django?

Django works well under gunicorn HTTP server: http://gunicorn.org/
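For a feel of what "binding to a port" means for a WSGI app, here is a minimal sketch using only the Python standard library. The `PORT` variable name and the default of 8000 are assumptions, and `wsgiref` is a development-grade server; in production something like gunicorn would play this role.

```python
import os
from wsgiref.simple_server import make_server

def app(environ, start_response):
    """Tiny WSGI app; a Django project exposes a callable with this same interface."""
    body = b"hello from a self-contained process\n"
    headers = [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]

def serve(environ=os.environ):
    """Bind to the port named in the environment and serve HTTP directly."""
    port = int(environ.get("PORT", "8000"))
    with make_server("", port, app) as httpd:
        httpd.serve_forever()

# Call serve() to start listening; it blocks until interrupted.
```

A front end such as nginx can still sit in front of this and proxy to the port; the difference is that the app process speaks HTTP itself.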

