
BlueGreenDeployment (2010) - colund
http://martinfowler.com/bliki/BlueGreenDeployment.html
======
GFischer
The most interesting thing for me was the link to
[http://continuousdelivery.com/](http://continuousdelivery.com/) and the
diagrams summarizing some of the concepts in the book (they're CC licensed):

[http://continuousdelivery.com/wp-content/uploads/2014/02/01_...](http://continuousdelivery.com/wp-content/uploads/2014/02/01_CD_the_idea_low-res.jpg)

[http://continuousdelivery.com/wp-content/uploads/2014/02/02_...](http://continuousdelivery.com/wp-content/uploads/2014/02/02_CD_test_strategy_low-res.jpg)

[http://continuousdelivery.com/wp-content/uploads/2014/02/03_...](http://continuousdelivery.com/wp-content/uploads/2014/02/03_CD_automated_acceptance_test_low-res.jpg)

[http://continuousdelivery.com/wp-content/uploads/2014/02/04_...](http://continuousdelivery.com/wp-content/uploads/2014/02/04_CD_managing_data_low-res.jpg)

------
boobsbr
I work at a big 3-letter multinational corporation, and we still compile and
deploy manually. No continuous integration or, at least, automated testing. No
build tool whatsoever, or a repo to store artifacts. Everything is compiled on
the dev's workstation and sent by email.

We do have a disaster recovery server, but it sits outdated for weeks after a
deployment to production. When something goes wrong, the switch is neither
automatic nor transparent to the users: we have to email EVERYONE on the
users' list asking them to please use the DR server's address. There is no
router or load balancer.

Crazy.

~~~
osipov
In a big 3-letter multinational corporation your experience with the build
process might not be representative of the corporation at large.

~~~
boobsbr
You're right. But I've asked around and I haven't found any projects that use
an automated build and test process.

------
madeofpalk
Blue/green deployments are easy to achieve with something like AWS Elastic
Beanstalk, given how easy it is to create a new stack and use its 'Swap CNAME'
feature (it also forces you to have your entire environment set up in code, so
it's quickly duplicable for each deploy).
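
A minimal sketch of that swap via the AWS CLI, assuming two existing Beanstalk
environments named "myapp-blue" and "myapp-green" (the names are placeholders,
not from this thread):

    # Swap the CNAMEs of the two environments, flipping production traffic
    # from the currently live environment to the freshly deployed one.
    aws elasticbeanstalk swap-environment-cnames \
        --source-environment-name myapp-blue \
        --destination-environment-name myapp-green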

However, now that Elastic Beanstalk supports incremental rollouts I don't see
as much value in the approach.

We use Blue/Green deployments (with AWS EB) at Mi9 to deploy all our
high-traffic websites.

~~~
dorfsmay
Same here; we went to blue/green deploys when the magic in Beanstalk broke and
caused an outage (we used to use Beanstalk deploys, where it does all the
switching for you).

It is very sad and unfortunate that Amazon decided to limit the minimum number
of instances to 1 in VPCs. In the old classic AWS you could set the minimum
number of instances to 0, which allowed you to completely shut down an
environment without destroying it, effectively letting you switch back to it
in a few seconds.

------
patsplat
Database is the sticky part of this strategy.

If Blue is in production and Green is on deck, how are updates to the Blue
database pushed to Green?

~~~
robinwarren
I believe (in practice if not in theory) it is common to handle DB schema
updates separately and to run both systems off the same database. This way the
data only lives in one location, but you get to run two slightly different app
servers/frontends on top of it. You then have to be more careful with database
changes. This sounds like a pain in the arse, but a lot of people seem to run
with it without much additional overhead, even though it means scheduling your
DB changes to go out before your dependent code changes.

~~~
anton-107
So does this effectively mean you should only create tables and add columns in
your DB schema migrations (no updating or removing)?

~~~
lmm
You can remove a column, but you first update the application to stop reading
from it (and test this in a test environment). "Updating" is best done by
adding a new column, writing to the new column in parallel, back-populating
it, switching reads to the new column, and then dropping the old column. It's
a bit cumbersome, but it works and it's safe.
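
A hypothetical sketch of that "update" sequence, assuming Postgres, a `users`
table, and a rename from `name` to `full_name` (all names here are
illustrative, not from the thread):

    # 1. Expand: add the new column (old code keeps working and ignores it).
    psql "$DB_URL" -c "ALTER TABLE users ADD COLUMN full_name text;"
    # 2. Deploy application code that writes to both name and full_name.
    # 3. Back-populate rows written before step 2.
    psql "$DB_URL" -c "UPDATE users SET full_name = name WHERE full_name IS NULL;"
    # 4. Deploy application code that reads only full_name.
    # 5. Contract: drop the old column once nothing reads or writes it.
    psql "$DB_URL" -c "ALTER TABLE users DROP COLUMN name;"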

------
beat
I had the pleasure of working on a three node blue-green system at a Fortune
100 company (yes, enterprise can do some things right). Each cluster could be
in either of two production modes or a staging mode, all switched with a
simple command line request. The request created touchfiles that managed the
load balancer behavior. Instantaneous switching, yeah!

On the other hand, the database was more or less a cache. It was very much
availability over consistency, in CAP terms. If we had to keep transactional
data, it would have been harder.

edit: When I started there and they described the architecture to me, I said
"Cool, you're doing blue green deployment!", and the response was "What's blue
green deployment?"

------
baliex
The idea reminds me of double-buffering. I wonder if there are other parallels
that can be drawn (pun not intended) from the similarities.

~~~
TeMPOraL
This is exactly double buffering, and I think it would be better to call it
that than "blue/green deployment": "blue" and "green" suggest there is some
meaningful difference between the two environments and that it matters whether
the current "production" is "blue" or "green", whereas this really is just two
environments and a pointer that switches between them.

------
geerlingguy
Betterment just posted an interesting set of slides from their #AnsibleFest
presentation yesterday on 'Cyan' deployments powered by Ansible; kind of an
evolution of the traditional 'Blue/Green' style described in Fowler's post:
[http://www.slideshare.net/AlanNorton1/cyansible](http://www.slideshare.net/AlanNorton1/cyansible)

------
joshribakoff
The way I do it is to rsync my code to a path such as "app-v1.0.0"

I then create symlinks such as "live" & "beta":

"live" -> "app-v1.0.0"

I rsync the code from a CI server or local host after running bower/composer
and testing; this way I deploy a "snapshot" of the entire codebase.

I find a lot of people just casually deploy their code with git and
subsequently run "bower install", tolerating downtime in between, with no
solid rollback mechanism in place. I hear Capistrano works in a similar way
but is git-based instead of rsync-based. To me it's really important that the
updates happen atomically and roll back atomically, as the result of running a
single command.

Simple example of a "deploy.sh" I would use in a new project:

    gulp production && rsync --update-after --delay-updates --exclude-from="rsync_exclude.txt" -avr ./ server-www:/var/www/html/
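
A rough sketch of the symlink flip itself, assuming the releases live under
/var/www and the web server serves /var/www/live (paths and version numbers
are illustrative):

    # Build the new symlink beside the old one, then rename it over "live".
    # The rename is atomic on the same filesystem, so requests never see a
    # missing or half-updated path; rolling back is the same two commands
    # pointing at the previous release directory.
    ln -sfn /var/www/app-v1.0.1 /var/www/live.tmp
    mv -T /var/www/live.tmp /var/www/live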

------
lmz
Netflix does it with two auto scaling groups:
[http://techblog.netflix.com/2013/08/deploying-netflix-api.ht...](http://techblog.netflix.com/2013/08/deploying-netflix-api.html)

------
ing33k
This can be implemented easily using Ansible:
[http://docs.ansible.com/guide_rolling_upgrade.html](http://docs.ansible.com/guide_rolling_upgrade.html)

BTW, the article is 4 years old.

~~~
mattzito
This actually doesn't describe a blue/green environment - it describes having
a single environment that has code deployed in a rolling fashion. Which is
useful, but not this use case.

It's also worth noting that this use case is actually pretty hard to model in
Ansible without external tools or the like. You basically need to run four
jobs:

- deploy new code to the inactive environment

- switch the persona from inactive to active

- reconfigure the load balancer to send traffic to the newly active environment

- switch the formerly active persona to inactive

Within some or all of those individual tasks you might want to apply these
rolling techniques. But since rollouts may not be idempotent and this process
isn't transactional, you have to be very careful indeed when coding this out.
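
For concreteness, a rough shell sketch of what that sequence amounts to
outside Ansible, assuming an AWS application load balancer with "blue" and
"green" target groups (the ARNs and the `deploy_to` helper are hypothetical
placeholders):

    # 1. Deploy the new build to whichever environment is currently idle.
    deploy_to green v2.3.0
    # 2./3. Flip the active persona by pointing the listener at green.
    aws elbv2 modify-listener \
        --listener-arn "$LISTENER_ARN" \
        --default-actions Type=forward,TargetGroupArn="$GREEN_TG_ARN"
    # 4. Record that blue is now the idle environment for the next rollout.
    echo green > active_environment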

------
dorfsmay
Some companies will also activate the new path for only a small percentage of
traffic and monitor the logs so they can detect errors that were missed in
testing.

~~~
neilellis
Yes, I really like that strategy, though I haven't done it myself, as it's a
little (not a lot!) too complicated for a small company.

------
kaylarose
We use this at the (very large) corp I work at, with very large, complex,
high-traffic/revenue sites. Each component in the architecture has its own
"light/dark" deployment (in each environment), so any piece can be
staged/tested/deployed/rolled back with zero downtime. It has worked fairly
well, with the majority of hiccups occurring during the ramp-up on the
process.

------
hharnisch
We've used this technique at Respondly for the past year and a half with a
Meteor stack. It's saved a lot of headaches and has greatly minimized
downtime. It does have some gotchas, though, when pushing certain types of
changes - mostly around message-formatting changes in shared queueing systems.

------
allan_s
(looking for comments)

After reading the article (some months ago), I decided to implement this in
our company. We mostly use Symfony2 and PostgreSQL, and I came up with this:
[https://github.com/allan-simon/ansible-docker-symfony2-vagra...](https://github.com/allan-simon/ansible-docker-symfony2-vagrant/)

Basically I have:

    
    
              ______nginx__phpfpm+code  (blue)
             /                        \
       front nginx                      database 
             \______nginx__phpfpm+code/  (green)
    
    

The front nginx plays the role of "switch" between blue and green, and the
database is shared.
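
A hypothetical way to implement that switch, assuming the blue and green
upstreams each have their own conf file and the front nginx includes a
symlinked `active.conf` (all paths are illustrative):

    # Point the include at the green upstream, check the config, and reload
    # nginx without dropping in-flight connections.
    ln -sfn /etc/nginx/upstreams/green.conf /etc/nginx/upstreams/active.conf
    nginx -t && nginx -s reload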

So for the database problem, I solve it by telling our devs to be aware of
it, which translates into a two-step process for database upgrades. (We use
Doctrine's ORM with Doctrine Migrations for migrations, so when I talk about
getters/setters I mean the entities'.)

    
    
* If we add a column which must be NOT NULL, we add it with a default value
first, so that during the few seconds the old code is still online while the
database is already updated, an insert will not fail. Then, in a second
migration, we drop the default value and include the SQL statement that
handles the existing data (see the psql sketch below).

* If we drop a column, we first ship a version of the application that removes
the code using that column (or makes it return a default value), and only the
second deployment drops the column, so that version N-1 is already made not to
care about it.

And so on; the base logic is "the database at version N+1 should still work
with version N of the code". I like to picture it as crossing a small river by
always keeping one foot on the ground, rather than jumping with both feet at
once.
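
A hypothetical psql version of the first case above, assuming Postgres and an
`orders` table (names are placeholders):

    # Deploy N: old code is still live for a few seconds, so the new
    # NOT NULL column needs a default for its inserts to keep succeeding.
    psql "$DB_URL" -c "ALTER TABLE orders ADD COLUMN status text NOT NULL DEFAULT 'new';"
    # Deploy N+1: the code now always writes status, so the default can go;
    # this migration is also where any SQL that fixes up older rows would run.
    psql "$DB_URL" -c "ALTER TABLE orders ALTER COLUMN status DROP DEFAULT;"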

I have to admit that, fortunately, 99% of our deployments are not about
database changes, so this "need to do it carefully" only comes up once in a
while and does not affect our productivity (especially with the gain of being
able to do CI).

One last thing: I've only used this technique on small to medium websites,
nothing with HUGE traffic, so I don't know if it's 100% zero downtime; my test
(running several `while true; do curl WEBSITE; done` loops) showed no traffic
lost.

I just wanted to share this to get opinions from more experienced people,
while telling folks at "normal companies" that it's possible to achieve this
without "the cloud", with old-school servers hosted in a DC or at the
customer's office.

Edit: of course we still have a dev and a staging environment on which we
validate code before deploying to production.

~~~
lmm
Yeah, that's the way to do it. Provided nginx is working properly it should be
zero downtime.

------
danesparza
That's from 5 years ago.

Also: Your database changes should be entirely backwards compatible and should
happen in one location (not on its own slice).

------
kaghaffa
What's a good tool to automate the router configuration to point to the new
deployment?

