How to Deploy Software (zachholman.com)
552 points by jmduke on Mar 1, 2016 | 169 comments

One thing this glosses over is that you should have something monitoring your production systems to make sure they are running correctly.

To start with, get something to monitor errors/exceptions and email you. To name a few services:



https://github.com/errbit/errbit (can be hosted on Heroku for free)

Also make sure that you have accessible logs that log useful information (timestamps, the user making the request, unique request ID). Then use syslog or a SaaS service to aggregate logs from all servers in one place, and keep them for as long as you can.
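For what that looks like in practice, here's a minimal stdlib-only Python sketch of per-request log context (the field names and the `app.request` logger name are illustrative, not from any particular framework):

```python
import logging
import uuid

# Attach a unique request ID and the requesting user to every log record
# via a filter, so logs aggregated from many servers can be correlated
# back to a single request.
class RequestContextFilter(logging.Filter):
    def __init__(self, request_id, user):
        super().__init__()
        self.request_id = request_id
        self.user = user

    def filter(self, record):
        record.request_id = self.request_id
        record.user = self.user
        return True

def make_request_logger(user):
    logger = logging.getLogger("app.request")
    handler = logging.StreamHandler()  # swap for SysLogHandler to aggregate
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(request_id)s user=%(user)s %(levelname)s %(message)s"))
    handler.addFilter(RequestContextFilter(uuid.uuid4().hex, user))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger

logger = make_request_logger("alice")
logger.info("GET /dashboard 200")
```

Swapping `StreamHandler` for `SysLogHandler` (or a SaaS agent) is the aggregation step; the timestamp, user, and request ID come along for free.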

Great point - a small note, though, for those of us who don't like using other people's hosted services and have some of our own infrastructure to host: use Zabbix, Nagios, Cacti, Spiceworks, Munin, PRTG, etc. Lots of nice options. I know everyone loves cloud services these days, but I feel absolutely no temptation to put any of my business infrastructure in other people's hands when I don't absolutely have to.

I use a combination of Zabbix and a very aggressive Smokeping mesh deployed with Docker (10 pings per minute from all separate DCs to all other DCs) to monitor our worldwide nodes and be able to account for routing problems on the backbone rather than on our nodes themselves. Very handy tool, particularly if you work with latency-sensitive applications. I've been surprised that some of the datacenters we use appear not to have access to similar...

I wouldn't describe Nagios as a 'nice' option. Every time I have to install or reconfigure it, it opens the wounds afresh (and I only have a relatively small deployment) :)

Hard to have a discussion about open source monitoring without mentioning OpenNMS. Best monitoring back end by far.

Go for Sensu, you won't look back, and it has therapeutic powers when it comes to monitoring ;)

https://getsentry.com/ is another option (my favourite) for exception monitoring.

It's also open-source https://github.com/getsentry/sentry

We use it for GitLab.com and love Sentry.

One of my clients got Graylog up and running pretty quickly.

Logs are only one part of monitoring though, and it's easy to miss the wood for the trees if you're drinking from the logs firehose. Some metrics monitoring is also vital, naturally I recommend http://prometheus.io as I'm one of the core developers :) https://blog.raintank.io/logs-and-metrics-and-graphs-oh-my/ discusses this a bit more.
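To illustrate the logs-vs-metrics distinction: logs record individual events, while metrics keep cheap aggregates you can alert on. A toy in-process counter in Python (a real setup would use a client library such as prometheus_client; everything here is illustrative):

```python
from collections import Counter

# A toy metrics registry: named counters with label sets, kept as
# aggregates rather than one record per event.
class Metrics:
    def __init__(self):
        self.counters = Counter()

    def inc(self, name, labels=()):
        self.counters[(name, tuple(labels))] += 1

    def get(self, name, labels=()):
        return self.counters[(name, tuple(labels))]

metrics = Metrics()
for status in (200, 200, 500, 200):
    metrics.inc("http_requests_total", labels=(("status", status),))

# Alert on the error *rate* crossing a threshold, instead of grepping
# the logs firehose for individual 500s.
errors = metrics.get("http_requests_total", (("status", 500),))
total = sum(v for (name, _), v in metrics.counters.items()
            if name == "http_requests_total")
assert errors / total == 0.25
```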

I've used https://opbeat.com/ for a small side project and been fairly happy - they track exceptions, releases and performance.

Another error tracking system for your list: https://bugsnag.com/

It has been rock solid for us with no event limits and cheap pricing.

(not an employee, just a fan)

Same here, we just started using it and are happy with it!

The ELK stack is also good but will require some config - you'll need to pay for a subscription for alerts, though there are free alternatives like Yelp's ElastAlert on GitHub.

Personally I found the EHK (ES, Heka, Kibana) stack much nicer than ELK. I especially like that each node can run Heka locally and ship directly to ES without having some intermediary server we have to maintain. On top of that, Heka is great for general event streams and can grow quite a bit more into your infrastructure as a general tool than logstash (at least last time I used logstash, which was a few years ago). Heka can also monitor and send you alerts via email ("Hey this server got 3 exceptions in the past 5 minutes!") on certain conditions.

We do use ES as a core part of our infrastructure, so our only real barrier was setting up Heka.

> keep them for as long as you can.

Careful, you can run into Data Protection legal issues with that approach.

Data is a liability as well as an asset.

The operator of Pinboard says that companies should treat personal data like radioactive waste: don't keep it around at all, or keep as little as possible, and keep what you do keep secure and away from everything else.

It's a great way to think of it.

Shameless plug, but my co-founder and I are about to launch our powerful logging service this month. One of our features is session-specific logs and the ability to send alerts at user-defined thresholds.

We'd love to get some feedback on our product when we launch to beta, so if you're interested, please sign up for our email list at http://logdebug.com

Your website is still new, I see, so I imagine it will be more explanatory about how your service works. ;)

One thing about the plans that made me cringe was that "SSL" encryption was not available to the Startup plan. I feel like we're entering an era where HTTPS isn't really something that should be offered as a perk, ya know?

One suggestion: remove the "Most Popular" banner above Small Business on your pricing page. Since your site is plastered with "under construction" notices, it's a turnoff to see such false advertising.

Sure, good tip.

Though the pedantic butt-head in me wants to point out that out of all of our paying customers, they're all using that level of service. :) So technically...

So keep it. Ditch the under construction stuff instead if you have customers.

Let us know when it's actually out. This is a big pain of mine right now, but it's frustrating that I have to sign up to be notified. That pain might cease to exist by the time you're actually ready.

I'll vouch for Airbrake. It does an excellent job of identifying patterns and grouping similar errors, tracking resolution across deploys, and recording request context to help you figure out exactly what was going on, all while avoiding spamming you with error messages.

I haven't had much of a need to look at logs since I started using Airbrake three years ago. For the price, it's really hard to beat the value when you compare the price to various log aggregation systems.

A great list of resources - glad I ran into this post! I have rolled my own log scraping and exception monitoring and notification, but some of these services do a lot more.

I have been using https://www.pingdom.com for uptime monitoring and some basic perf/latency tracking. It works well and doesn't cost a lot, plus it can hit my services from multiple regions.

Let me start with a disclaimer that I work for Atatus. Front-end monitoring is the other end of the circle that complements production monitoring, and that's what we've done with Atatus: we monitor the front-end for both errors and performance issues - https://www.atatus.com/

> timestamps, the user making the request, unique request ID […] keep them for as long as you can

That's a sure way to go to jail.

Don't store which IP requested what for more than necessary — in many countries, there are limits of a few months on how long you may store metadata.

Can you elaborate? I find this incredibly hard to believe.

If it's true, you'd think it would be widely-known and publicized in those countries (whichever ones they are) but I can't find any reference to such a policy existing anywhere.

Germany (and probably other countries in the EU as well) considers IP addresses PII, and you have to have consent to store them. Short-term logs might be justifiable as a technical necessity (but the lawyers I've heard talk about it didn't want to put a "safe" time frame on that), but certainly not "keep as long as possible".

That's the reason Google Analytics, Piwik and others have an option to blank out parts of the IP address.
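For the curious, that kind of anonymization is a few lines with the Python stdlib. Masking the last IPv4 octet matches what Google Analytics does; the /48 mask for IPv6 is a common convention here, not a spec:

```python
import ipaddress

def anonymize_ip(ip):
    """Zero out the host bits of an IP before logging it: the last
    octet for IPv4 (/24), everything past /48 for IPv6."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48
    net = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(net.network_address)

assert anonymize_ip("203.0.113.42") == "203.0.113.0"
```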

Jail is a bit dramatic though (at least here), unless you do something bad with them fines + maybe damages would be the maximum, and it is quite unlikely you'll actually get hit. But if you are ever in a situation where this would come to light (= already in legal trouble), it could make things a lot more complicated, especially if you process data for other companies.

It falls under personal data about someone. Many (but not all) EU countries consider IP addresses to be personal information, and hence "store it forever for no real reason" is illegal. You need to have a legitimate, legal reason to store PI, you need the person's unambiguous freely given consent, etc.

There is currently a court case as to whether dynamic IP addresses from ISPs count as personal information. But static IP addresses probably count. And all the other information (user X logged in at time Y and sent a message to other user Z) would be PI too.

It's commonly known in business. I've heard of it repeatedly, usually in terms of "here's what our retention policy has to look like for European customers."

> That's a sure way to go to jail.

Not in (any of) the United States.

Reminds me of that scene in Snatch where Dennis Farina says "yeah, don't go to London."

Don't make any plans to have a physical EU footprint and you needn't care what the EU thinks of your internal policies. EU customers can take it or leave it. Their choice. It's not a warm stance to take, but you've probably got a dozen other balls in the air that you're juggling. EU data compliance probably isn't one you're worried about as a small startup.

On the other hand, if you comply with EU data policies, your customers might choose your app over another app that doesn’t

(especially european users, and businesses)

Of course. Such are the decisions and tradeoffs that any young startup needs to make. That's why I said that they could take it or leave it.

Too late to edit, but the actual quote was, "Yeah, don't go to England."

And that's why Safe Harbour was found illegal, and why it's probably illegal for EU companies to use those US services.

It could still be good practice to minimize all data when it doesn't serve a purpose, and I can't think of a reason not to delete the last byte of an IP address (except possibly in an internal setting where all clients are from the same subnet).

Personally, I don't see much of a problem when my IP is stored somewhere since you'd need a reference, but others have apparently thought more about it and want it to be anonymized.

(I've now come up with a scenario: if you have access to multiple such sets of data, you could connect accounts from different systems. I.e. when the youporn logs leak, someone could tie the data to your HN account etc.)

> (I've now come up with a scenario: if you have access to multiple such sets of data, you could connect accounts from different systems. I.e. when the youporn logs leak, someone could tie the data to your HN account etc.)

Combine that with leaks of a social network where you have real names and IP addresses, and you could actually identify the name, job, address, etc of everyone who watched a specific clip of porn.

That’s very bad.

Metadata. Not even once.

syslog and SEC are a good place to start before deciding to ship all your log data to a 3rd party.


You are a raygun employee? If yes, please disclose that if you "recommend" a product.

Same username, Technical Marketer at Raygun


Ouch. Definitely staying the hell away from those people.

Agreed. It's pretty spammy otherwise, especially with one comment in nearly two years and just a few Raygun-specific posts.

You should see what the Nylas team does on here, or the ABP guys.

Lots of spam, never announcing they're promoting their own product (until you call them out on it, then they admit it, at least), and both like to lie.

I really enjoyed this article. As an industry, when it comes to something essential like source control, we seem to have converged to a common set of practices and workflows. Deployment is arguably just as important, but I think the practices are very different on different teams. This article is like a more practical version of Continuous Delivery.

Three areas that I think would have been worth including:

1. Pre-production.

You deploy to test & other pre-prod environments more often than prod. They should use the same scripts/tools/processes as production deployments, only with different permissions.

2. Configuration.

Test and production environments will always have different config settings, so no team will ever be able to deploy to more than one environment without encountering this problem. I think there's still an open question around whether those configuration settings should live in the same source control as the code, in a different source control repository, or a dedicated system. Source control systems and sensitive values (passwords, API keys, etc.) don't always mix.

3. Build your binaries once.

The article is more focused on dynamic languages, but for compiled languages I think this is important. If you branch, compile, deploy to test, test it, get the all-clear, then compile again and deploy to production, there's a lot of opportunity for differences between what you tested and what goes to production to sneak in.

In fact, even for dynamic languages this might be a valuable practice. What if the JS minifier on one build server is different from another's, and the deployed script ends up different in production from what was tested?

Disclaimer: I'm the founder of Octopus Deploy, and these practices might be biased towards enterprisey .NET/on-premises deployments rather than cloud hosted, dynamic language projects.

> 1. Pre-production

It interests me in how many combinations of pre-prod environments exist, as I (naively) expected this to be standardized, but it's not (e.g. https://en.wikipedia.org/wiki/Deployment_environment#Environ... ). I can see the need for two vs four environments for handling different stages of the life-cycle depending on business needs (and budget). Addressing how to choose the right set of environments for a given company/product would be useful.

I also want to say that your product is phenomenal and it has improved the quality of our shop's build and deploy pipeline by an order of magnitude. Thank you.

1. Pre-production

Pre-production should test production in a non-production environment. Specifically, if a system connects to, say, SalesForce, and you've been using a local sandboxed instance of SalesForce in dev and UAT, the pre-prod should connect to the real, live SalesForce. I have a pre-prod environment for my current project, and it's useless because org policy states that only prod environments can connect to live instances of anything. Completely invalidates the reason for having pre-prod.

2. Configuration ...will always have different config settings...

Yes, aside from pre-prod. Pre-prod and prod MUST be mirror images, INCLUDING config settings. Identical. The same in every way except for host names.

We use Octopus Deploy (great software, btw); most people use the configuration stuff mostly for connection strings. Everything else is the same.

> 3. Build your binaries once.

Assuming your builds are solidly reproducible like they should be, how do such differences "sneak in"?

Granted, truly reproducible builds in the first place are Really Hard(tm).

We put an explanation here (hope it's OK to post the link):


A real world example of this is that when .NET 4.5 shipped, the compilers in .NET 4.5 for 4.0 code produced different output than the .NET 4.0 compilers would have. So installing a system-level update on Windows on a build server would mean Test and Production got different results:


Also, releases move through environments at different times:

Monday: 1.0 goes to Test

Tuesday: 1.1 goes to Test

Wednesday: 1.0 goes to Prod

Between 1.0 and 1.1, perhaps you updated Node, or went to Python3 - so your build server had to be updated. Reproducing Monday's build is going to be more difficult the more time that goes by.

As you said, truly reproducible builds are hard. Why not just zip the artifacts and use the same files, instead of rebuilding?
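A sketch of that idea in Python: build once, record a checksum, then promote the exact same artifact through each environment, refusing to deploy if it has changed since it was tested. Paths and names here are illustrative:

```python
import hashlib
import shutil
from pathlib import Path

def record_checksum(artifact: Path) -> str:
    """Record a SHA-256 alongside the artifact right after the build."""
    digest = hashlib.sha256(artifact.read_bytes()).hexdigest()
    artifact.with_suffix(artifact.suffix + ".sha256").write_text(digest)
    return digest

def promote(artifact: Path, dest_dir: Path) -> Path:
    """Copy the tested artifact to the next environment, verifying it is
    byte-for-byte what was built - never rebuild between environments."""
    expected = artifact.with_suffix(artifact.suffix + ".sha256").read_text()
    actual = hashlib.sha256(artifact.read_bytes()).hexdigest()
    if actual != expected:
        raise ValueError("artifact changed since it was tested; refusing to deploy")
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(artifact, dest_dir))
```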

Docker for CI will help. Update your base image when you upgrade Node/Python.

In theory, feature flags seem like a good idea. Until you reach a point where too many flags become difficult to test in an exponential tree of combinations. Also, it demands tight discipline to make sure each new flag properly isolates some new feature... Has anyone had success with this idea and been able to 'tame' an explosion of flags in their codebase? Really curious.

I can't speak to them now, but it happened a number of times while I was at GitHub. We'd have a number of things staff-shipped for weeks or months (occasionally a year!) and it was just... there... and nothing happened much to it.

As you might expect, it wasn't really the fault of feature flags but rather indicative of a process problem. We'd either have to prod the team or person in charge of it, verify that work was actively pushing things forward to ship, and so on.

At various times we'd have, for lack of a better phrase, the "No (wo)man", who would come in and say "no" to a lot of things. One of the best ways to achieve this is to see that nothing has happened on a feature flag for a while and then send a pull request to remove the feature flag (and the feature entirely). This got people out of the woodwork who said waiiittttt a minute, let me just finish that up; or, if no one vehemently disagreed with the pull, then you could actually just remove it entirely. But it did take some explicit reflection on whether the flags were defensible to remain flags.

On my team of about 10 devs, we create / destroy about 5-8 feature flags a month.

The only thing you have to remember is: it's either going to production, or it's getting dropped.

There is an expiry date built into the feature flags framework (90 days), and either they need to be explicitly cleaned up and deprecated, or extended with a valid reason attached.

Also, we code review every commit. If any diff is coding around/through a feature flag, the first question is always: "Can you clean up that experiment first?"

Sounds very familiar. Are you in Seattle?

Something like that, yeah

Having worked with feature flags "at scale" (whatever that means :), here's some advice that I've learned from working with hundreds of developers across tens of teams and given to others. I'm leaving out experimental analysis (A/B testing, multi-armed bandit) because it just adds "...but consider how this affects your metrics".

  * Always be developing against the current running features. No brainer.
  * Design things so that they integrate feature flags, not work around them. This usually means pushing feature flag determination to more generic/common code.
  * Separate backend/frontend changes into separate feature flags when possible. Turn on backend changes early and often to better measure your feature's impact.
  * Give individual features their own flag, but also have a global flag that manages the entire experience. This makes it easier to manage your gradual dial up as well as shut off problematic features that would otherwise mess up the launch.
  * Be diligent about removing feature flags once they're turned on. Schedule it into sprint time, reward teams that remove them, make it a management mandate, whatever. Just get rid of them once they're no longer needed.
  * Invest in monitoring around your services that (ideally) can correlate failures with features. You should turn on features over the course of a few hours/days to mitigate customer impact in the event of failures and gain data about performance at 50/50.
I think the answer to your specific question of "testing every combination" is that you can't, easily. But by keeping the number of feature flags that are inactive low (< 150 is very liberal) for a given service, having everyone develop against the current running features + dev overrides, and using gradual dial up with integrated monitoring to catch poor interactions when the impact is small, you'll have mitigated a lot of your concerns.
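The "global flag plus per-feature flags, dialed up gradually" advice above can be sketched in a few lines. Everything here is hypothetical (flag names, the percentage scheme); the one real trick is hashing user+flag so each flag dials up an independent, deterministic bucket:

```python
import hashlib

# Flag registry: a parent flag gates the whole experience, child flags
# (dotted names) gate individual features, each with its own dial-up %.
FLAGS = {
    "new_checkout": {"enabled": True, "percent": 50},
    "new_checkout.apple_pay": {"enabled": True, "percent": 100},
}

def bucket(user_id: str, flag: str) -> int:
    # Deterministic 0-99 bucket per (user, flag) pair, so a user stays
    # in or out of a given rollout across requests.
    h = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(h, 16) % 100

def is_enabled(flag: str, user_id: str) -> bool:
    parent = flag.split(".")[0]
    for f in {parent, flag}:
        cfg = FLAGS.get(f)
        if not cfg or not cfg["enabled"] or bucket(user_id, f) >= cfg["percent"]:
            return False
    return True
```

Flipping `FLAGS["new_checkout"]["enabled"]` to `False` is the kill switch: every child feature goes dark with it, which is exactly the point of the global flag.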

I've had good results with them. We never had more than 5-7 flags at any given time. We would flag things by environment (dev/staging/prod) and/or by user (everyone/beta users/internal team).

Flags were there because of two main reasons:

* Our app was deployed more frequently than a service we depended on (e.g. every commit for app, nightly for service in staging, every 2 weeks for service in prod, etc)

* We wanted to have it rolled out to a subset of users for testing

We did not run into exponential trees of combinations because we rarely had two different flags interacting. Maybe it was a happy accident of trying to make work parallelizable or maybe because our feature flags never lasted more than a month or so.

The code was intentionally dumb and the flags were stored directly in the source code (not in a database table or another config file or something). Simple, stupid calls to `FeatureFlipper.isFooReportEnabled()`. We did not test this class because each method was a simple boolean check that the current user appeared in a list or ENV != prod.

We stubbed out `FeatureFlipper` when using it throughout the app. Stub the feature to be enabled, check behavior. Repeat for disabled.

Most of the features were simply hidden at the view level. For users, if there is no button, it doesn't exist. We didn't particularly care if the user would "guess" the URL of a feature-flagged page -- not worth the effort.

Doing a "full rollout" of the feature was a non-event. Just delete the method on FeatureFlipper and go fix the compile errors :)
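My guess at what that intentionally dumb flipper looks like, rendered in Python (the thread's `FeatureFlipper.isFooReportEnabled()` becomes snake_case here; user lists and env names are made up):

```python
import os

# Flags live directly in source: no database, no config file. Each
# check is a trivial env test or list membership, so there's nothing
# to unit-test and nothing to go stale.
BETA_USERS = {"alice", "bob"}
INTERNAL_TEAM = {"carol"}

class FeatureFlipper:
    @staticmethod
    def is_foo_report_enabled(user):
        # Everyone outside prod sees it; beta users see it everywhere.
        return os.environ.get("ENV") != "prod" or user in BETA_USERS

    @staticmethod
    def is_new_billing_enabled(user):
        return user in INTERNAL_TEAM
```

"Full rollout" is then literally deleting `is_foo_report_enabled` and fixing whatever breaks at the call sites.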

I'm interested that you're having a problem with this, but you don't give much detail.

Why do you have so many feature flags at a single time? Do you have a team of 20-40 people or more? Are your features taking months to create? Are you trying to create a feature flag for every new line of code before deploying it?

A lot of times code can be deployed without using a feature flag, because it's small enough, maybe you are overlooking those cases?

Testing flag combinations for one. Also please re-read my post carefully. The details are there.

> Also please re-read my post carefully. The details are there

Clearly there is something wrong with your project that you aren't mentioning. Plenty of other projects do flags without problems. Your team has issues.

GitHub now has a build (we have about 12 builds per github push) that runs all the tests with all feature flags enabled. This doesn't hit problems that only appear in the matrix of some enabled, some disabled, but in practice has worked pretty well and found a few minor issues.

> Deploying major new features to production should be as easy as starting a flamewar on Hacker News about spaces versus tabs

Great writing. Spaces all the way.

Elastic tabstops are clearly the best, but until we have good editor support I'm sticking with spaces.


For now, these decisions will be made for me by our gofmt overlords.

Seriously? Ugh. Tabs FTW.

I've actually thought about this a lot (for some reason).

Spaces are fixed. Tabs are adjustable via most user interfaces.

So, in theory, you'd think tabs are superior because everyone can have their own amount of spacing and that's that.

Therefore tabs > spaces

(I should probably leave now)

Until you want to align code on non-tab boundaries.

True...damn it.

But guys, this is what is being proved. Stop. Don't. Come back.

Would someone mind explaining the argument for spaces over tabs? Possibly naive young dev here.

How many spaces do you like to indent your code? Some people like 3 space indentations, some like 8 spaces, and some like 5. It's very personal (but anyone who doesn't use 3 is a reeky brazen-faced mammet).

A simple solution is to use tabs, because then every developer can set the tab distance however they like, and they are happy. It breaks down when you have code that is aligned beyond the indent, like this:

  if(a && b &&
     c && d &&
     e && f   ) {
In that case, increasing (or decreasing) the tab distance will ruin the alignment.

The best solution is to use tabs up to the point of indentation and then spaces thereafter, but a lot of code editors don't support that, so in practice it's hard to implement, and people use spaces to preserve their formatting when it gets uploaded to GitHub.

Mixing sounds like the worst option. It's what has caused horrendous-looking code in the past, when you open someone else's code and the two are mixed.

Python and spaces for me.

> Python and spaces for me.

Sounds like you are a very open-minded individual. Not.

I say this from experience, not narrow mindedness.

Open a file where it is all spaces, and it looks the same on every machine. Open a file with a mix of spaces and tabs, and it often turns out an absolute mess.

OMG! 3 spaces? EVEN number? 2 or 4. Never even.

You sir, are a reeky, brazen faced mammet!

Short version: spaces don't break over time.

Longer version: first understand that there is no tabs vs. spaces. There is only tabs + spaces vs. only spaces. (Because not all indentation lines up with tab stops, and someone may wish to line up assignments, lists, etc.) Only one developer in the history of your project who uses another tab stop standard is then enough to mess up your indentation. And that's just one way to mess it up; in a sufficiently large project some creative developer will find another.

Spaces just work and ensure your guidelines are followed. The only downside is a few wasted bytes. It might have been a religious debate in a long distant past when someone actually counted bytes, but today it's mostly young developers who don't know better who engage in it (with a few exceptions). Linus uses spaces. OpenSSL changed to spaces as part of cleaning up their codebase. It's the default behaviour of GNU indent. It is a good idea.

Right, but how many spaces?


Spaces always have the same width, so it's harder to screw up formatting and easier to enforce stuff like maximum line width.

Tabs width can be set to whatever you want, so everyone can use the spacing they prefer.

Spaces work better with my preferred IDE setup. Therefore spaces are inherently and objectively superior to tabs.

I personally like the idea of tabs better, but prefer spaces because it makes navigating the text less awkward when traversing tabs. It may sound stupid, but for my own stuff it's just more comfortable.

I take umbrage!!! (oh wait, I am a spaces person too)

Ah, but how many?

I like a mix of spaces and tabs, depending on whether my line number is divisible by 3 or not.

2 1/2. Obviously you're using an editor that can do half-spaces?

Of course! I simply type one in with a half A-press.

For those who don't know the reference: https://www.youtube.com/watch?v=kpk2tdsPh0A. It's definitely in the hacker spirit.

En Space: U+2002

2 or 4

Tabs are obviously superior, since every developer can visually resize them to be as small or as big as they wish. Personally, I can't work with small indentation. My eyes have a hard time following long straight lines.

Does anyone else work on systems that take 3 hours to back up the DB, an hour to deploy, 1 hour to start up, and a few hours for users to check out functionality before business opens on Monday? Not to mention the federal regulations about what paperwork is required and who can even access production. Maybe I'm on the wrong site.

Hah, we (hopefully!) get an update window every 18 months! And standing up a new production-like test environment starts with another $100K invoice...

And thank goodness Microsoft hard-coded in a 10 hour delay to ensure the KDS root key of a domain is propagated, even if I create it before I create additional domain controllers...

Such fun!

Off topic, but to create an immediately effective KDS root key, just set the effective time ten hours in the past. You can validate propagation by looking for the 4004 event in the KDS event log. This is probably not a good idea in production, but is useful when building/rebuilding a lab.

  Add-KdsRootKey -EffectiveTime ((Get-Date).AddHours(-10))
See https://technet.microsoft.com/en-us/library/jj128430.aspx


No federal regulations or 3 hours to backup the DB, but in my previous job we had to build a deb package on a build server, then copy the package to the repo server, and finally ssh into the test/preprod/prod server to do the deployment (using sudo apt-get update). And I didn't have access to the production servers..

All of that could be automated.

Hear, hear. I don't have any access to production. (Interestingly, that's just one more reason to fully automate stuff.)

The post is interesting, but it doesn't mention two major difficulties: zero downtime deploys and database migrations.

Yeah; I tried to stay away from really low-level aspects (just because they're hard to generalize across languages), and also just because the damn thing was so long already, ha. :)

As far as database migrations are concerned, GitHub (and others, of course) takes the perspective of migrating before the code that uses those migrations goes out. In other words, as @herge says in a sibling comment, the code that gets pushed needs to support two branches of code and two branches of data simultaneously. It's certainly some extra work (and can be pretty gnarly depending on the scope of the migration), but once you get to a certain point it's kind of the only way to do no-downtime migrations.

There's many possibilities to help with the actual migration process, depending on what database you're using. With MySQL, for example, you can do something like the process in lhm: https://github.com/soundcloud/lhm

Zero-downtime deploys aren't super difficult in Rubyland anymore (many have written how they achieve it in Unicorn, for example), although I'm not as familiar with how other folk do it across other languages and platforms these days.

+1 -- treating the database and the backend as two separate services with their own release lifecycles and APIs is "obvious"...

...well, once someone says it out loud. :)

After that, of course it makes sense that they would have a need for an API compatibility window across at least two versions. It's exactly the same issues as supporting a backend and client side where you can't instantaneously force an update to all clients. With your DB, you're in control of the version, but you're certainly not in control of making it "instantaneously", so the same rules apply as when you're waiting for some curmudgeon user to update: API versioning and a support window.

Now, if only we had an automagic way to make it less painful for a project to support two different versions of an API from the same codebase....

We use django migrations, and I still haven't found a way to do zero downtime deploys and migrations short of doing the following:

If we are trying to deploy migration #1, first deploy a version B of the code that supports the db both before and after migration #1 (but provides the same set of features as before), maybe with the help of a feature flag that is set in the migration. Do the migration. Then deploy a version C of the code which removes the feature check above. All this requires 2 different versions of the code and a lot of process just to ship one migration, and it gets combinatorially worse if you have more than one migration to deploy.

(I think I'm just expanding on what you say you already do - but I'd already written it out by the time I realised that this was exactly what you are doing, so I'm going to just leave it).

If you can, always make migrations backwards compatible with the previous version of the code, so they don't need to be rolled back if the code needs to be rolled back. Having a good migration rollback procedure is nice too, but usually unnecessary if testing has gone well.

If you need to add a new field to the model then always create it as nullable, whether or not you use a default value. That'll allow this particular database migration to run on previous versions of the code. You can test this by generating the migration SQL and running it on the database for your test/dev environment which is running your current production code.

Once the new code and migration is running in production and you're satisfied with it, immediately create a new change which makes the previous migration more strict (remove nullable, use a data migration or default value for existing nulls). Test, stage, deploy that.
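The two phases above ("expand" with a nullable column, then backfill and tighten) can be sketched end to end; table and column names here are hypothetical, and SQLite is used only so the sketch is self-contained.

```python
import sqlite3

# Hedged sketch of the "add nullable, then tighten" migration sequence.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users (name) VALUES ('alice'), ('bob')")

# Phase 1 (ships with version X): add the column as nullable, so the
# still-running version X-1 code, which never writes it, keeps working.
db.execute("ALTER TABLE users ADD COLUMN plan TEXT")  # nullable by default

# Phase 2 (ships once all nodes run version X): backfill a default for
# existing rows, then tighten the constraint. SQLite can't alter a column
# in place, so the tightening step is shown as the Postgres equivalent.
db.execute("UPDATE users SET plan = 'free' WHERE plan IS NULL")
# Postgres: ALTER TABLE users ALTER COLUMN plan SET NOT NULL;

remaining_nulls = db.execute(
    "SELECT COUNT(*) FROM users WHERE plan IS NULL").fetchone()[0]
print(remaining_nulls)  # 0
```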

Having backwards compatible migrations leads into zero downtime deploys also. Once the data migration has run, your app servers can be running version X or X-1. Before you push new code to a node, do a graceful shutdown (allowing queued requests to complete) or remove the node from the cluster (haproxy socket api for example), update the code, bring the node back into the cluster.
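The rolling-update loop above might be sketched like this, with the load-balancer interactions stubbed out (a real setup could drive drain/enable through the haproxy socket API):

```python
# Hedged sketch of a rolling deploy: drain a node, update it, re-enable it,
# so versions X and X-1 briefly coexist behind the load balancer.
def rolling_deploy(nodes, drain, update, enable):
    for node in nodes:
        drain(node)   # stop routing new requests; let in-flight ones finish
        update(node)  # pull the new code and restart the app server
        enable(node)  # put the node back into the pool

# Usage with trivial stubs that just record the order of operations:
log = []
rolling_deploy(
    ["web1", "web2"],
    drain=lambda n: log.append(("drain", n)),
    update=lambda n: log.append(("update", n)),
    enable=lambda n: log.append(("enable", n)),
)
print(log[0], log[-1])  # ('drain', 'web1') ('enable', 'web2')
```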


- not all database migrations can be backward compatible, but most can be made so by breaking your change into two or more changes. First - make the change without strict integrity checks. Second, enforce integrity checks and provide defaults.

- zero downtime deploys requires multiple nodes (counter example would be welcome, but I can't think of one).

I wish there were a way to "pause" incoming requests in web servers. Most deployments (migrations + code) take less than a few seconds, and I'd be fine with some users having to wait 2 seconds for a request to finish over their request hitting a 500 (due to inconsistent code/database) or a 503 (putting the site into maintenance mode).

Usually we aren't deploying a schema change that's really huge so we just go for it and let the application crash for those users who happen to hit a place where the code/schema are out of sync.

Zero (no crash) downtime deployments seem like too much effort for too little gain.

Agreed. Seems like a case of optimization for technical completeness rather than a business need.

Basecamp apparently uses an nginx/openresty plugin to pause requests during a deployment: https://github.com/basecamp/intermission

haproxy allows re-dispatching failed requests some number of times. If you have an extremely brief outage due to a deploy, redispatching failed requests 3 times may be sufficient. I imagine other load balancers have similar functionality.
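As a rough illustration, the relevant haproxy directives are `retries` and `option redispatch`; the numbers here are illustrative, not a recommendation:

```
defaults
  mode http
  retries 3            # retry a failed connection attempt up to 3 times
  option redispatch    # allow retries to go to a different backend server
```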

One approach which can work pretty well in many apps is having something like a CDN or Varnish which can serve stale content when the backend is unreachable. That allows the code you need to bootstrap your app to be served as long as your edge cache is running, and your JavaScript can do some sort of retry+backoff for failed requests or even do things like check for image error states to trigger a reload.

Have your load balancer drain connections to a web server. When it has no requests in progress, deploy to it. When that's done, move on to the next web server.

This doesn't solve the problem when there is a DB migration.

This is covered by herge's parent comment. Deploy code that can work with either the old or new DB structure before you perform the migration.

Folks from Braintree gave a talk about how they did this. They'd queue up all the incoming requests in Redis, then replay them once the site was back up.

There is, sort of. If you're hosting on Linux, you can use iptables to artificially delay responses.

That and some other methods are described in a post on the Yelp engineering blog[1].

1: http://engineeringblog.yelp.com/2015/04/true-zero-downtime-h...

I don't think that's quite what raziel2p wants. Won't that still allow the web server to receive and process the request, but just delay its response?

Load Balancers.

> and a lot of process just to ship out one migration.

I would have agreed with you before working at GitHub, but the end result is that deploying is so easy that 3 deploys does not feel like "a lot of process." On most of our applications I can do this in 15 minutes or less.

Yep - unfortunately that seems to be the only way to handle it though.

For the part regarding database migrations, you can find here [1] an interesting podcast about SQL database (schema) evolution. Among the other things they talk about how to have multiple schema revision coexisting at the same time. While hopefully most of us just need to support at most two revisions at the same time, it gives a number of interesting ideas.

[1] http://www.se-radio.net/2012/06/episode-186-martin-fowler-an...

We have something similar. But I don't like branches; I prefer single-trunk development (I generally agree with Martin Fowler) + feature switches to isolate WIP features. We store all built binaries, so we just roll back to the previous binary, which is a single button click for us.

Curious... Why don't you like branches?


It's a controversial article though.

I'm pushing for toggles on my team; I haven't made a single feature branch in this repo, resulting in no merge conflicts (yay).

So far I'm the only one regularly doing it without feature branches, running with the idea that just because you can branch cheaply doesn't mean you should. Of course toggles are technical debt to be managed, but so are branches.

I've found it's good practice to mention which toggles are available in the README with their defaults (this could be generated)... they should be tracked and removed ASAP. I read a newer article that breaks them down into categories: http://www.infoq.com/news/2016/02/featuretoggles. Toggles over branches are showing value as we run different variants on staging without having to redeploy different builds, instead just changing a launch variable. It's especially clear with 2+ WIP features. We're using environ with Clojure, which doesn't have any fancy runtime toggling, but that'd be another thing to look at.

The tech debt associated with branches disappears with the completion of the merge request. The tech debt with feature flags doesn't disappear until someone gets around to removing the (now) dead code.

Personally, I use both, based on what is best for the feature. Why not, after all?

I love branches in theory, but the main problem in practice has been around CI. It's a nightmare trying to manage a staging environment with feature branches involved... given better tools/a better build pipeline I'd go back. I agree, toggles are basically program branches compiled in... analogous to the goto vs. if/else control structures.

But day-to-day, stakeholders want to test feature X that they've heard is going well but it's not stable enough for develop? Okay, let's [engineering time and $$$]. vs. adding FEATURE_NEWDB=true to the upstart script. We've already got automation engineers helping us out, but until the deployment problem is solved toggles are more practical in our case.

How hard is it to set your CI to run off a separate branch? Or to run CI for all branches which are checked into a repo (a common option I've seen implemented)?

Nobody said it here yet, so I just wanted to mention that the design of this blog is really nice. I love the font, links, blockquotes and chapter title images.

It also rendered equally nicely on my android phone and my desktop browser.

I saw that the author has a GitHub repo for an older blog style for Jekyll, but I'd like to see a similar thing for this one.


Thanks! Was thinking of open sourcing this soon as a one-off. I'll see what I can do. :)

It broke down halfway through on Android Firefox =(. Reader mode forever.

The font rendered with iceweasel on debian 8 is illegible.

"How to deploy web software".

The software world > web servers.

(Sorry to be picky, but some people seem to assume that all software is developed for the web these days whereas the web world is just a significant & vocal minority).

I think the text works fine for any kind of server software.

As he discussed, feature flags also work for downloadable software (desktop/mobile) - the multiple deployments obviously don't make that much sense in that case though.

A great deal of software is web based these days. You install apps on your phone or your desktop. You deploy to a server.

True, but it's still a 'sizeable minority'. As far as number of deployments goes, the embedded software world massively outweighs server, client, mobile and desktop.

Unfortunately (or fortunately), we in the embedded software world tend to be less vocal than the others.

I wouldn't have thought deploy was the correct word for embedded.

If it's one piece of software, I would think of it as an install. If it's coordinating multiple pieces, it's a deployment.

Thanks for the writeup - very helpful. It's always good to get a view of how others are solving the same problems oneself has.

That said, the article does come off a bit as trying to be authoritative, while not leaving enough room for possibilities where alternative approaches may have merit as well (i.e. "this is how to do it" vs. "what worked well for us, ymmv"). Newbies that read this article will think that the principles described are the canonical way and may even try to apply them in scenarios where alternatives would prove superior.

Other than that, a lot of good advice, well done!

I've been arguing for using Git Flow. Reading the post, I have to say the stand our lead takes against Git Flow and in favour of very cosy CI is perhaps stronger than I realised.

He argues for pushing about as often as possible. With our small team that's very doable; every push gets tested and linted by the 'blue' or 'green'. You're supposed to only push passing code, which you easily can by running the tests and lint locally. So instead of all the pain points mentioned in the post, you write passing code, pull and rebase on other passing code, and then push. Little code review, no worries about hasty reverting, few/early conflicts keeping us from tripping each other up or writing incompatible features.

The reason I argue for Git Flow? Our tree is an absolute mess. Most often it's a single chain of linearly scrambled features. In other words, removing one feature would be hard and require a bunch of legwork, not a couple of Git commands.

If anyone strongly feels there's a better way for a small team than lightning fast CI let me know!

I wish native Nodejs deployment was a solved problem, but there really is no comprehensive and universally used tool for deploying Nodejs using Nodejs. ShipIt, mentioned in the article, is barely a year old, with a short feature list and a short list of users. PM2 (Keymetrics) is not bad but is buggy; also they seem a bit overwhelmed at the moment. Flightplan is decent, but the syntax is more awkward than ShipIt's. Every other common language has a stable deployment tool besides Nodejs.

I ended up going with Distelli, it's a SaaS but it's fantastic. These days deploys often involve more than just one app or language, and I really prefer a tool that can ship anything. Also, having a GUI to see deployment statuses is invaluable. With those requirements none of the Nodejs tools can stand up to the other, more mature utilities. And rather than have to write all my deploy logic in another language, I just purchase the service.

What issues have you run into with PM2? I'm running it in production myself, so I'm curious.

I've really enjoyed working with PM2 in production for the last 4 months... although our ramp up has been slow.

Thanks @doublerebel. For those interested Distelli is at https://www.distelli.com

Disclaimer: I'm the founder at @distelli

I'd be interested in hearing people's thoughts on deploying feature branches to production before merging them. I've generally followed more of a git-flow approach [1]. This seems to have the advantage that multiple feature branches can be grouped and deployed together - thus avoiding the problem in the article of the deploy queue becoming a bottleneck.

[1] http://nvie.com/posts/a-successful-git-branching-model/#crea...

At my old team, when we did the migration from SVN to Git, I set up the deployment workflow with mandatory pull requests for everything. And I included a small deployment tool in the QA system where every developer could just click a checkbox next to their branch, and that branch would be added to the QA system (which has a copy of the production data for extended testing).

Behind the scenes, it just does an octopus merge of all selected branches into master. Since the codebase was reasonably large, we almost never encountered problems with merge conflicts.

> multiple feature branches can be grouped and deployed together

I would respectfully argue that this is not a desirable feature.

Although you might save some time by deploying multiple branches at once, you muddy the waters of what to roll back and how.

I think a better idea is to make deploys easy, quick, and revertible so that you can deploy early and deploy often, and in the event of a rollback, you can rollback just the broken feature.

That's true, but it would seem like there would be other disadvantages too.

For example, when do you do testing? If we test as soon as the pull request is opened, we know that master is going to change a lot between now and when we finally deploy this code so the tests might not be valid.

If, on the other hand, we wait to test just before we deploy to live, we risk locking up the queue for too long. This might not be an issue if your tests only take a couple of minutes to run, but if you have lots of integration tests (like Facebook, for example [1]) then it could become a big issue.

Is the solution to this, you just accept that the codebase you're testing won't exactly be the same as what's deployed to production and the risk that comes with that?

[1] https://developers.facebooklive.com/videos/561/big-code-deve...

Our preferred deployment method at Honeybadger is to (almost) always merge to master before deploying. We will deploy a feature branch when we want it to be a little easier to rollback to a known good state (by deploying master) for changes that we are nervous about. Those deployments are rare, though, as we have an almost-production environment (it talks to all the production services, but no customers use it) for doing one last smoke test before unleashing code on customers. :)

Why worry about what is on master if you save your build artifacts? If you need to go back to the previous behavior, just redeploy the previous production build output.

Your code is written in golang. You've been compiling it with golang 1.5.1. Then golang 1.5.3 comes out with critical security fixes for their TLS code.

That is why you care what is on master: because you need to rebuild if your runtime changes.

We deploy straight from master, but tag each build.

If you need to rebuild a specific version, it's as easy as checking out the tag.

This feels like the correct strategy to me (and it's what we do too). Deploying many branches to production seems like a nightmare at any scale.

Debugging. Some bug only happens on customer X's installation. You need to know which version customer X is running so you can reproduce the bug in house and know what other impact it might have had.

Interesting article. Frank and honest. Good to read the experiences of others.

To get the disclaimer out of the way I'm a co-founder of Vamos Deploy. Our product addresses many of the deployment problems that have been discussed here so I thought I'd mention it. We are looking for feedback on the product and an early adopter or two - https://vamosdeploy.com

I'd like to cover some techie details here. Vamos Deploy encapsulates an application with its dependencies and runtime config so it can be deployed as one unit to any number of machines irrespective of the OS (well, Linux and Windows at the moment). This encapsulation is achieved by configuring a 'grid' with all the application package versions, library/runtime dependencies, runtime property values and local repository names (hostnames, usually). When the grid is deployed (all via CLI), the respective local repos get updated. You can have multiple grids on a host (in a local repo), thus enabling multiple, differently configured, encapsulated applications that don't conflict. It avoids duplication by the grid sharing the underlying application packages and libraries in the local repo. There is an audit log of all actions for traceability and transparency. A simple ownership model prevents non-prod code getting into production and restricts who can deploy to production. It can be combined quite easily with any config management tool for release orchestration. You don't need RPM/Deb packages or to deal with Yum repos. We have concentrated on making it easy to learn and use so max benefit can be attained quickly.

I'll stop there. Be interested in anyones feedback here or https://vamosdeploy.com#contact for a chat.

The title of the article was "How to Deploy Software", but almost all of the advice only works for server side software where you have total control over the deployment environment.

I'd be much more interested to learn about how people develop mobile and web apps, where feature flags are far less useful as you need to push the entire app to the AppStore, so your iteration time is much slower.

I also recommend Zach's presentation https://speakerdeck.com/holman/how-github-uses-github-to-bui... [selfpromo:] When we were thinking about deploying our frontend builds, we got inspired by Ember's CLI Deploy pipeline (http://blog.firstiwaslike.com/deploying-ember-cli-apps/) and we've built something similar for our Webpack based app (https://github.com/productboard/webpack-deploy). Together with the Git flow methodology, we basically removed all friction from deploying new versions. Would love to hear your thoughts!

I don't know if I agree with the branch on every deploy particularly if you have a small team and use Mercurial where named branches live forever. I wish the article discussed dependency management more.

Instead of branching, we tag every deploy and use dependency management heavily (i.e. maven, npm, etc). That is, the project that gets deployed never really has any branches but is composed of lots of smaller projects, each in their own repository, which may have branches but have to be released.

This approach cuts build time and improves coupling/cohesion, as well as facilitating a possible transition to OSS for useful components (those that do not provide a competitive advantage and are not proprietary).

I have seen way too many projects that have this giant monolithic source tree (particularly PHP projects) and thus have to rely on branching much more heavily. I firmly believe this is the wrong approach.

> When you're ready to deploy your new code, you should always deploy your branch before merging. Always.

Does anyone actually do this? This seems counterproductive - what if there are multiple branches?

Assuming you have a product/release branch and many others, then I'm guessing the author means you should merge product into your own branch and deploy that, before merging your branch back into product (and deploying it).

That should work fine with multiple branches in most cases, so long as you have a system to stop anyone else deploying their branch while yours is running.

I've worked places where we will always deploy a feature request to an ephemeral environment. I find quite a lot of bugs are actually caught by this.

But I've never deployed a feature branch into production like the blog post suggests. I had my own questions about this lower down in the comments.

I use deploynaut as my deploying tool, and I have to say it's made the process much smoother. Previously I'd simply use git to update a code base or sql workbench/pgadmin to update a database.

The body text in this article is illegibly thin, please consider moving to weight 600 so that people can read your text. You've worked hard to write it, now it is time for people to read it. :-)

I didn't notice, but my NoScript blocked the web font and the default font is more legible with that color. In both cases changing the color from rgb(100, 100, 121) to rgb(80, 80, 100) is enough to improve the readability of the text. font-weight: 600 seems a little extreme.

For some perhaps the color of the text is the only complaint, but for me the font is also illegibly thin [1]. I got around it by disabling the web font, because just setting the font weight to 600 didn't fix the odd shapes [2].

The bold version of the font is also available as a separate font family (AvenirNextLTW01-Bold). It looks much more like a "normal" font weight and is incredibly readable [3].

[1]: http://i.imgur.com/uVHXptR.png [2]: http://i.imgur.com/IXwR6EU.png [3]: http://i.imgur.com/xgz7EtR.png

I noticed that thin fonts are not so thin on high DPI screens. Example: that font is more readable on my 9" 2500x1600 (approx) tablet than on my 1080p 15" laptop. Maybe they are designing for retina Macs.

http://spinnaker.io : deployment software made by netflix (king of daily prod pushes), google, and the community for aws, gce, azure, etc

shoutout to sublime for development

I just double-click a command file on my desktop and I'm done.
