
Continuous Deployment  - cwan
http://www.avc.com/a_vc/2011/02/continuous-deployment.html
======
mrkurt
One of the great benefits of continuous deployment is that it really helps you
home in on a useful testing philosophy.

We've been doing continuous deployment at Ars for about 1.5 years. We would
generally write tests for our apps before that, but they were somewhat
directionless (and awfully easy to blow off for a "minor" change). Once our
tests became crucial, we did a much better job of picking relevant tests and
feeling their importance.

Many people visibly shudder when we tell them about our deploys. What they
don't consider, though, is that manually ensuring a given release is "good" is
less reliable than letting a well instructed computer system do it. When you
make changes, run your tests, and then eyeball things to make sure
everything's cool, you're really just doing unstructured integration testing.
You're likely to miss regressions, unintentional bugs in seemingly unrelated
systems, etc.

~~~
sghael
It's great to see ideas like CI trickle into the startup world at large, to
the point where we have VCs blogging about it.

The thing that's amazing to me is that anyone would choose to NOT work like
this. If you are a web-based startup and cannot do multiple deployments a day,
or you don't empower every last person on your technical team to do a
production deploy, you are at a serious competitive disadvantage.

~~~
neeleshs
There is a difference between CI and Continuous deployment. CI can just mean
you continuously build/run tests/deploy to staging (maybe). The article talks
about continuous deployment to production.

------
fleaflicker
_I asked how to roll back the changes. He said "we don't roll back, we fix the
code."_

Not the best idea. I don't debug as well under the extreme pressure of my site
being broken with the clock ticking.

~~~
mrkurt
You'd probably find it easier when the deployed changes were extremely small.
Our last automated deploy was a single line change. Most are bigger, but not
huge.

Also, while they don't do full rollbacks, I suspect more than one fix has been
"remove the offending code until we can figure out what's wrong".

~~~
rm-rf
I've seen single line changes cause data loss, corruption, system outages,
remote root exploits...

I'm not sure that the number of lines makes the change more or less risky.

~~~
ichverstehe
And how exactly does this relate to the way they deploy their code? As far as
I can tell, they actually review code before marking it ready for deploy. Those
kinds of changes would be an issue with "usual" large batch deploys as well.

------
clutchski
"At Etsy, they push out code about 25 times per day. It has worked out very
well for Etsy and has led to [...] improved morale."

I am in the opposite situation at work. My company has scheduled, monolithic,
all hands on deck releases once a month. It's an insane legacy policy from a
time before our dev team had scripted releases and automated tests running
against every commit in development. We've solved the technical issues of
continuous deployment, but socially, we're stuck in the dark ages. It's a huge
morale killer. We've had several great developers rally to change this policy,
get stonewalled, and eventually leave.

~~~
sghael
Without knowing the details of this company, here is my theory:

With established companies, with established revenue and legacy
product/process, management thinking is probably something along the lines of
"if it ain't broke, don't fix it". They don't --pardon my French-- give a shit
about morale if the profit is rolling in. If they didn't start with CD, they
aren't going to do it now.

So to champion any change, you're going to have to make a case around greater
revenue/margins, better (less) customer support, more defensibility in the
marketplace, or something else that gets the suits all geeked.

Complementary part of this theory: Startups need every advantage they can get
since they have the weaker market position when they begin. So they're more
willing / required to be innovative about every last bit of their process,
including deployments.

Other advice -- start slow. Prove the process on a smaller part of the
product.

------
jwedgwood
We adopted continuous deployment at my last company, and it was a huge win for
us. It resulted in less downtime, reduced the cognitive load on our
developers, and let us turn around changes and bug fixes faster, which just made
everyone happier. Here is the approach we took --

We got continuous integration working first by setting up an automated test
server. We used cerberus, a trivially simple Ruby gem that polls a repo,
grabs code, runs tests, and reports the result. You could
install this anywhere, even an old Mac mini if you wanted. We spun up a low
end server for it. We wrote great tests, got decent coverage, and made
adjustments in our automated testing strategy to increase our confidence.
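
For anyone sketching the same setup, the heart of a poller like cerberus is tiny. Here is a hedged Python approximation of the idea (the repo path, test command, and reporting are my assumptions, not cerberus's actual internals):

```python
import subprocess
import time


def repo_head(repo_dir):
    """Return the commit hash currently checked out in repo_dir."""
    out = subprocess.run(["git", "rev-parse", "HEAD"], cwd=repo_dir,
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()


def should_run(last_seen, current):
    """Only rerun the suite when the head commit has changed."""
    return current != last_seen


def poll_forever(repo_dir, test_cmd, interval=60):
    """Pull, test, and report in a loop -- the whole CI server."""
    last_seen = None
    while True:
        subprocess.run(["git", "pull", "--ff-only"], cwd=repo_dir, check=True)
        head = repo_head(repo_dir)
        if should_run(last_seen, head):
            ok = subprocess.run(test_cmd, cwd=repo_dir).returncode == 0
            # real reporting would email or ping a dashboard here
            print(f"{head[:7]}: {'PASS' if ok else 'FAIL'}")
            last_seen = head
        time.sleep(interval)
```

Nothing about this needs beefy hardware, which is why an old Mac mini works fine.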

Then we worked on zero-downtime deployment and rollback. This was actually the
hardest part for us. With regard to schema changes, if we had to hammer our
database (new index on a big table) then we needed to take the site down, but
otherwise our strategy was to always add to the schema, wait for the dust to
settle, and then remove things later. This worked for us, but we had a
relatively simple schema.

I haven't quite figured out how an ajax-heavy site would pull this off. That
seems like a hard problem since you need to detect the changes and refresh
your javascript code.

We then combined these two to get automated deployment to a staging server. We
could have written monitoring code at this point, but we decided to punt on
that, relying on email notification for crashes and watching our performance
dashboard.

And finally, we set it up to deploy to production, and it just worked, and we
never looked back. It was the most productive and pleasant setup I've ever
worked in.

~~~
jamesjyu
Regarding ajax heavy applications, I have been faced with that particular
problem. A few of my sites are javascript heavy apps, with long flows between
page reloads. If I ever change my server API in a way that would break the
javascript, I need to signal to the client that it needs to refresh. (I keep
track of version numbers for the server code, and whenever that is bumped, it
means the client is out of sync)
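
A minimal version of that handshake, sketched in Python (the version string format and the header name are invented for illustration):

```python
# Bumped on any deploy that breaks the JavaScript-facing API.
DEPLOY_VERSION = "2011.02.13-3"


def needs_refresh(client_version, server_version=DEPLOY_VERSION):
    """The client echoes back the version it was served with (say, in an
    X-App-Version header on each ajax call); a mismatch means its
    JavaScript may be out of sync with the server API."""
    return client_version != server_version
```

On a mismatch, the server can return a sentinel response that the client-side code turns into the "please refresh" lightbox, rather than failing mysteriously.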

This can be kinda.. awkward. What I've opted for is either a lightbox or some
kind of message saying we need to refresh the page. But that is not ideal.

Has anyone dealt with this issue before? With javascript heavy apps,
development is more like that of a traditional desktop app or a mobile app,
which has to deal with the client/server interface in a non-trivial manner.

~~~
kylemathews
I haven't directly but I just happen to be staring at Pivotal Tracker which
handles it pretty nicely. They do as you say and push a little lightbox which
says "A system change has occurred requiring a refresh of this page. Please
click 'OK' to reload Tracker."

~~~
jamesjyu
And Gmail would have the same problem. But to date, I have never seen Gmail
ask to refresh..

~~~
kylemathews
I think how Gmail handles the problem is they keep multiple instances of the
server-side software running, one for each version of the API. From my
experience, whenever GMail rolls out a new feature, I don't see it until I do
a refresh and frequently that's what GMail tells me I need to do to see the
new feature.
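
If Gmail really works that way, the server-side shape would be something like this sketch (the handler names and version strings are made up for illustration):

```python
def handle_v1(request):
    return {"api": "v1", "data": request}


def handle_v2(request):
    return {"api": "v2", "data": request}


# One handler kept alive per client version still in the wild.
HANDLERS = {"v1": handle_v1, "v2": handle_v2}
LATEST = "v2"


def route(request, client_version):
    """Serve each page-load with the API it was built against; an old
    version is retired only once no clients report it anymore."""
    handler = HANDLERS.get(client_version, HANDLERS[LATEST])
    return handler(request)
```

The cost is running multiple code paths at once, which is why new features only appear after a refresh picks up the latest client.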

------
leftnode
It's great seeing VCs get involved on this level with their portfolio
companies. I imagine it makes explaining when things go wrong easier.

~~~
fredwilson
yes, and it is fun to be able to "go into the factory and see how things are
made"

i had so much fun

------
arrel
Reminds me of the Calacanis article about Facebook's developer culture -
continuous deployment not only makes updates faster, it democratizes the
process and gives every developer the power to make things better. Good for
the product, and good for the team.
[http://launch.is/blog/launch002-what-i-learned-from-zuckerbe...](http://launch.is/blog/launch002-what-i-learned-from-zuckerbergs-mistakes.html)

~~~
sajidnizami
TBH if you are a startup and have downtimes, people don't trust you. I know I
won't go to a site if it failed on me on the second click because somebody was
_adding features_.

~~~
mrkurt
If you are a startup and _don't_ have downtimes, you're either building
something trivial or a god.

We've actually had less downtime from breakage since we started doing automated
deploys than we did beforehand. It's partially the result of good testing,
partially because the changes are just so much smaller than they used to be,
and partially because pushing to our master git branch is now serious
business.

~~~
sajidnizami
Imagine this scenario: peak traffic time in the day, and the site goes down
because of a deployment; an investor sees it and threatens to pull money out.
Ad networks that you work with see a report showing that during peak hours
their advertisements weren't served, causing major trouble. I work for one of
the biggest sites in my region and take my word for it, uptime is essential
and not something left to chance or gods.

Continuous deployment ensures delivery of features with a turnaround time fast
enough for a startup's changing business model, and it can't be stopped either.

So uptime and the continuous deployment model have to happen at the same time.

Deployments should always be automated and revertible. If you think you can
run a healthy startup while merely tolerating mistakes like these, perhaps you
have an easier place to work at :)

~~~
jasonlotito
Your scenario has nothing to do with CD. In fact, if you aren't using
aspects of CD, your scenario is even more dangerous.

------
allspaw
Just getting to this now, but there's a slight misquote in Fred's post. :) I
(and I think Kellan said it at the same time) said "...we don't roll _back_ ,
we roll _forward_..."

We do that because it's simply faster and easier to roll forward than to roll
back the entire deploy. Rolling forward means taking advantage of the
percentage rampups of new code paths, feature and config flags to turn things
off/on, and even reverting the 5 line change is simpler than rolling it all
back.

So no, I'm not suggesting we let bugs in prod languish while they are debugged
for a long while.
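
The percentage-rampup mechanism mentioned above can be sketched as deterministic user bucketing. This is an illustrative Python sketch, not Etsy's actual implementation; the flag table and percentages are assumptions:

```python
import hashlib

# feature -> percent of users who currently see it
FLAGS = {"new_checkout": 5}


def enabled(feature, user_id):
    """Bucket each (feature, user) pair into 0-99 deterministically.
    Ramping up is just raising the percentage; rolling forward past a
    bad code path is setting it back to 0 -- no redeploy needed."""
    pct = FLAGS.get(feature, 0)
    digest = hashlib.sha1(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < pct
```

Because the bucketing is deterministic, a given user sees consistent behavior across requests while the flag ramps from 1% to 100%.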

------
ajju
Anyone have good pointers on the best tools for continuous deployment of
Django? I know there are tools like Capistrano, Fabric and Django Evolution
out there but if someone has first hand experience using some of these it
would be good to learn about.

~~~
zeemonkee
I use Fabric at work, with South for migrations.

One useful technique I've found is to have Fabric deploy to a "staging"
install on the target server (we still use SVN). It runs unit tests in that
install, and if the tests fail, it stops deployment. If they pass, fabric will
then complete deployment to the production server (and run tests again for
good measure).

Of course that depends on having good testing practices and coverage, but it
helps reduce the number of stupid last-minute mistakes that creep into the
repo. Fabric is also useful in numerous repetitive small tasks that you need
to run without going through the rigamarole of SSH.

Generally though we avoid running South through Fabric - changes to data
schema are run very carefully after backing up etc. South is great, but any
migration tool should be used with due care.

~~~
ajju
Interesting. Somehow, I had never heard of South. Thanks!

For others like me, from <http://south.aeracode.org/docs/about.html>

_South brings migrations to Django applications. Its main objectives are to
provide a simple, stable and database-independent migration layer to prevent
all the hassle schema changes over time bring to your Django applications._

------
jfm3
You read about these "20 deploys a day" type situations, and it sounds great,
and I'm sure it makes VCs all warm and creamy inside. But they're talking
about small changes, and not all deploys are small. You can't incrementally
change a database engine, for example.

What I would be more impressed by is if they could run tests against a full
load of real user input, and have useful/reliable metrics come out the other
end. There's no reason for the deploy to fail if you have a real mechanism for
testing the deploy. I've yet to hear of a shop of the size of an Etsy that
does real production/load testing.

~~~
steveklabnik
IMVU. [http://timothyfitz.wordpress.com/2009/02/10/continuous-deplo...](http://timothyfitz.wordpress.com/2009/02/10/continuous-deployment-at-imvu-doing-the-impossible-fifty-times-a-day/)

~~~
jfm3
"[...] 4.4 machine hours of automated tests to be exact. Over an hour of these
tests are [...] Selenium. The rest of the time is [...] unit tests."

Impressive. But this is not a production test, where you tee the actual input
from users into the system. Nor is it a load test, where you measure whether or
not the site is performant under real load. For that matter, there's nothing
here that indicates you debug your operations (provision/deploy/backup) at all.
My comment about not seeing such an operation stands.

~~~
steveklabnik
Read a bit farther.

> Load average, cpu usage, php errors and dies and more are sampled by the
> push script, as a basis line. ... A minute later the push script again
> samples data across the cluster and if there has been a statistically
> significant regression then the revision is automatically rolled back.
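
The sampling-and-compare step described in that quote reduces to something like this threshold check. It is a simplification: the metric names and the flat 20% tolerance are my assumptions, and a real "statistically significant" test would be more careful than a fixed ratio:

```python
def regressed(baseline, current, tolerance=0.20):
    """Compare post-push cluster samples against the pre-push baseline;
    any metric that worsens by more than `tolerance` (as a fraction of
    its baseline) flags the deploy for automatic rollback."""
    for name, before in baseline.items():
        after = current.get(name, before)
        if before == 0:
            if after > 0:       # any errors where there were none
                return True
        elif (after - before) / before > tolerance:
            return True
    return False
```

The push script would call this a minute after deploy and trigger the rollback when it returns True.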

~~~
jfm3
Precisely so, but that's not testing, it's repair. You've just rolled back
from a production problem automatically. The user still saw the problem.

Sorry if this seems rude. You're definitely way ahead of the game, but you're
not addressing the thing that I'm talking about.

~~~
steveklabnik
None taken. I don't work for IMVU.

------
btipling
That's how we roll at Cloudkick. We ship the code. We're going to get all of
Rackspace to do this too if we can.

~~~
joshhart
Will you really be able to pull that off? I imagine certain parts of the code
base you could use CI, but Rackspace servers cannot have downtime. If I were a
Rackspace customer, I wouldn't want innovation - I'd want my boxes to never go
down. And I believe that means a rigorous QA process.

~~~
AngryParsley
It's not always true that development is a trade-off between fast feature
development and stability. If you're building a monitoring system (like
Cloudkick), more features means more stability for whatever you're monitoring.

------
sajidnizami
Interesting stuff. I've been doing something similar with my sites for over
two years now. It's more of a workflow than just a system, really.

1. A staging server where the new changes are first tested

2. A production server which gets the updated and verified working code

3. In case of any trouble (which hasn't happened in quite some time, btw) the
old solution backup is kept with a date and time and can be reverted just as
easily as code is updated from the staging server

You can actually use a load balancer to roll over the servers during
deployment too, which takes the factor of downtime away to an extent. But this
can't be done with PowerShell or any shell script, and a network admin has to
be present at the time of deployment, which makes it difficult.

I have had next to no downtime on my servers. Last I remember my sites going
down was during a DNS shuffle well over 6 months ago, never because of some
developer or sys admin screw up. Yes I do have more than ten deployments on a
single site in a day.

~~~
mnutt
We use Capistrano[0] to automate releases in the same manner. It keeps
multiple copies of the site in timestamped directories and has a 'current'
symlink which it just points to the latest. If you ever need to roll back,
just point the symlink to a different timestamp.

Similarly, for our rails apps we use unicorn[1] to do no-downtime deploys.
When we roll out new code, unicorn brings up new workers while the old workers
are still running. The old workers are not sent any new requests, and once
they finish their in-progress requests, they are killed.

[0] <https://github.com/capistrano/capistrano/wiki> While most of the docs
talk about rails, capistrano isn't ruby-specific. We use it to deploy node.js
apps as well.

[1] <http://unicorn.bogomips.org/> A forking ruby http server

~~~
scotth
Do you also have timestamped databases? How do you deal with schema changes
between versions?

~~~
mnutt
Most of our deploys don't have a schema change, and often when adding tables
or columns the migration can be run before the new code is deployed. When
dropping or renaming columns you'd usually need to first deploy code that is
able to gracefully handle the migration.

However, sometimes it's not worth the extra code complexity and we just take
the site down and migrate.

------
markneub
So what do you do when you want to make a change that isn't a logical
evolution of your existing codebase? Say, for example, you have an ecommerce
site, and you're switching payment processors to use one with an entirely
different API. Surely you can't just push a change like this straight to your
production server.

~~~
sokoloff
Make the code change conditional and push it out turned off, then use a
cookie (or query string or URL) to force some users into the new path...

If you don't already have an A/B or multi-variate framework in place, first
push a no-effect change to the old code so that it is now conditional on XX.

IMO, adding a new payment provider is a logical evolution. If your intent is
to turn the old one off, I still believe it's worth the complexity to run them
both until you're sure.
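
Concretely, the switch might look like this sketch, with both providers live at once. The provider classes, percentage, and override mechanism here are hypothetical stubs, not a real payment integration:

```python
class Provider:
    """Stub standing in for a real payment-processor client."""
    def __init__(self, name):
        self.name = name

    def charge(self, user_id, amount_cents):
        # the real API integration lives here; stubbed for the sketch
        return {"provider": self.name, "user": user_id, "cents": amount_cents}


old_provider = Provider("legacy")
new_provider = Provider("replacement")

ROLLOUT_PCT = 5  # start tiny; raise as the new processor proves itself


def pick_provider(user_id, forced=None):
    """A cookie / query-string override wins, so the team can exercise
    the new path on demand; everyone else is bucketed by user id."""
    if forced == "new":
        return new_provider
    if forced == "old":
        return old_provider
    return new_provider if user_id % 100 < ROLLOUT_PCT else old_provider
```

Turning the old processor off is then just raising the percentage to 100 and, much later, deleting the dead branch.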

------
brown9-2
Continuous Delivery is a great book about how to build such deployment systems
and infrastructure (I'm only about a third of the way through it):

<http://continuousdelivery.com/2010/02/continuous-delivery/>

------
ollysb
Shameless plug -- we're building hosted continuous integration and deployment
for heroku. We'll be opening up our beta real soon, email hn@zenslap.me to get
an invite. More information at <http://zenslap.me>.

------
babar
Does anyone have experience trying this in a more enterprise-friendly product?
We'd have a struggle figuring out how to communicate these changes to our
customers on a daily basis, not to mention the absolute horror if something
changed/broke right before or during a crucial production run. There are a
lot of things I want to implement from these ideas, but the examples I see are
usually high-volume consumer websites, so I am curious if people in other
areas have success with this.

------
laujen
That's pretty cool. This could work in mobile, but Apple's review process
kills it there, and the monitoring part is hard. App stores, though, make this
possible. I wish Apple would allow a post-review path for trusted developers,
those who have established a track record of successful releases.

~~~
ichverstehe
If a mobile app wanted to be updated every single week, I would get annoyed at
some point. The point is that web applications can do this in a seamless
manner.

~~~
alextgordon
There's no technical reason why that should be so. Google Chrome doesn't make
you click a button to update it, or even tell you that it _has_ updated. It
just does it.

~~~
rfergie
Mobile is different because people have to pay for bandwidth

~~~
alextgordon
Not a problem with Google's binary delta based updater.

[http://blog.chromium.org/2009/07/smaller-is-faster-and-safer...](http://blog.chromium.org/2009/07/smaller-is-faster-and-safer-too.html)

~~~
spitfire
Unless you live in Canada.

------
makeee
I don't even roll out changes, really; I edit the live production code, one
small change at a time. I always felt this was kind of dumb, and I know it
brings my site down for a few seconds to a minute here and there (about 1k
visitors are on the site at any given time). But I've always felt this allows
me to get 5x more work done than I would within a structured rollout system
with version control, etc., and it has given me a great sense of instant
satisfaction that motivates me to keep working.

~~~
simonw
You don't use version control?

I'd seriously recommend learning git (or svn). After a few days learning curve
it won't slow you down in the slightest, and it will have a huge effect on the
quality of your codebase since you won't need to keep old code around and
you'll always be able to figure out what you changed, why and when.

------
tzs
Is anyone doing this at a site where serious money is involved? For instance,
that 4 minutes of downtime described in the article would be a loss of
something like $120k at Amazon.

~~~
MartinCron
Looking at the opportunity cost lost by four minutes of downtime without
comparing it to the increased cost of _not_ doing continuous deployment
doesn't seem like a fair comparison.

Also, with huge monolithic deployments, the risk of much longer downtime is
increased. If you can turn deployments into non-events, you don't have as many
catastrophic problems.

------
code_duck
Etsy exhibits an ever-changing array of glitches and bugs which are difficult
to pin down, monitor and understand. These are exhibited in both the web
interface and the API.

I think that their plan to 'push code' 25 times a day or whatever plays a role
in that. And really, I don't think having their dogs, VCs and first day
employees publish changes to the site helps.

I think the issue is which tests they have written - their tests aren't
catching a few important details here and there.

------
karterk
I wonder if they have some kind of automated test suite just to make sure
everything works fine. Having worked on large codebases, it's almost
impossible to make sure everything is still alright manually, after a new
deploy.

~~~
mcfunley
Yes, there are tons of unit tests that are run (voluntarily) before commits
and (automatically) on staging pushes.

And then once code is live we have many system and business-level monitors in
place, so we know almost immediately if anything's wrong. More info about that
here:

<http://codeascraft.etsy.com/2010/12/08/track-every-release/>

~~~
gfodor
One thing to add is our test wizards have also conjured up a system we call
the "try server", which allows the execution of all the test suites
asynchronously on fast machines before you commit. So, you don't have to wait
for your laptop or machine you're working on to run the tests, you just kick
it off and get an email with your results in a few minutes. This makes it so
it's painless to be sure you'll never commit a red build.

~~~
tomazmuraus
Yeah, I was thinking about doing a similar thing here at Cloudkick.

Currently, our test suite is not that large and it takes around 6-8 minutes to
complete, but with testing every minute counts (testing is generally not that
fun and a slow test suite just makes it more painful).

We have two types of tests:

- Twisted tests - these run asynchronously and finish pretty fast, so they are
generally not that problematic

- Django tests - Django tests don't run in parallel, so they are pretty slow.
Recently, I was playing around with the Django test runner and I have made
some modifications to it to run the tests in parallel. Now the Django tests
finish around 50% faster.

The only problem with this solution is that it is a "hack" and it requires
some modifications to the Django core (I guess I should play more with the
nose parallel test runner).

We also use some other "tricks" which make tests run faster - for example,
MySQL data directory on a test server is stored on a ram disk.

~~~
gfodor
Coupled with slow tests is the fact that unless you have something like the
try server it ties up your machine while the tests are running. Being able to
just kick it off and continue working on the next feature or bug goes a long
way to reducing the pain.

------
koski
There is a great example of continuous deployment, good tools and methods
here: <http://vimeo.com/14830327>.

------
rs
_Big changes create big problems. Little changes create little problems_

The real problem comes when little changes (unknowingly) create big problems.
