
How Balanced Automates Testing and Continuously Deploys - mahmoudimus
http://blog.balancedpayments.com/balanced-payments-operations-automated-testing-continuous-deployment-jenkins/
======
mahmoudimus
If these kinds of problems interest you and you’re looking for a real
challenge, contact us! We’re always looking for sharp, talented individuals
who can make an impact.

NmQ2ZjYzMmU3Mzc0NmU2NTZkNzk2MTcwNjQ2NTYzNmU2MTZjNjE2MjQwNjU2MzZlNjU3MjY1NjY2NjY5NjQ2MTY1NmI2MTZkNmY3NDc0NmU2MTc3Njk=

~~~
juskrey
No offence (and I'm also using your excellent service), but.

My statistical background has always made me wonder: are the people who post
rebuses on job pages actually searching for rebus solvers, or do they just
need their job done?

And have they ever run a practical coding experiment comparing applicants who
love rebuses with those who don't?

~~~
argarg
It took me exactly 20 seconds to get the email address behind this ... and I'm
not a professional rebus solver. I just happen to know what a base64- and
hex-encoded string looks like.

~~~
lifeisstillgood
Dammit - I thought I had decoded gibberish for 5 minutes there! Could not
work out what I had done wrong.

When you say 20 seconds, did you reach for a bash command? I'd be interested
in knowing which.

~~~
solarkennedy
Here is some bash kinda:

    curl -s http://blog.balancedpayments.com/balanced-payments-operations-automated-testing-continuous-deployment-jenkins/ \
      | html2text | grep -P '[^.*]=$' | head -n 1 \
      | base64 -d | sed 's/../0x& /g' | xxd -r -c 100 | rev
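
A more direct version, pasting the string in and skipping the HTML scraping
entirely (same trick as above: base64-decode, hex-decode, then reverse):

    # base64 -> hex -> raw bytes -> reversed line
    echo 'NmQ2ZjYzMmU3Mzc0NmU2NTZkNzk2MTcwNjQ2NTYzNmU2MTZjNjE2MjQwNjU2MzZlNjU3MjY1NjY2NjY5NjQ2MTY1NmI2MTZkNmY3NDc0NmU2MTc3Njk=' \
      | base64 -d | xxd -r -p | rev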

~~~
mahmoudimus
You should apply. I'm looking for people like you :)

~~~
64bittechie
iwanttomakeadifference@balancedpayments.com

I really wouldn't hire someone who wants to use grep to parse out html[1] :P

Then again, I would not like to work for a place where engineers don't
understand this ;)

[1] <http://stackoverflow.com/a/1732454/366152>

------
pulledpork
Any light on how to roll back once a change has been deployed? If it's
version-controlled you can check out a previous version, but do you automate
that?

~~~
msherry
Hi there. I'm the author of this post.

We use git, so it's easy to revert commits, or push an update with '-f' that
resets to an earlier version. Once we push this new commit (that simply puts
us back to an earlier state) to our release branch, it's picked up by the
testing system just like any other commit and pushed out.
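
Concretely, that looks something like the following (the SHAs are
placeholders, and "release" stands in for whatever the release branch is
called):

    # create a new commit that undoes the bad one; CI picks it up as usual
    git revert <bad-commit-sha>
    git push origin release

    # or, the '-f' route: rewrite the branch back to a known-good state
    git reset --hard <known-good-sha>
    git push -f origin release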

~~~
pulledpork
That seems like a slow process if you're "oh shit, rollback, rollback,
rollback!"

~~~
msherry
If we're already at the "Oh shit" stage, then we might be happy sacrificing
test coverage to go back to known-good code, in which case we would run our
fabric `deploy` task and have our rollback completed in ~30 seconds. The idea
behind this process is that we never have "oh shit" moments like that :)

(Please see my other response to a similar question, as well)

------
wahnfrieden
We used a similar continuous deployment process for Canvas, minus the staging
server (now that we're doing iPad software, things have changed). A nice next
step is to deploy to a single server at first and only roll out to the rest of
the servers once that one has been audited - if that server starts failing,
divert traffic from it until it gets back to a healthy state.

~~~
mjallday
This is a great idea. Can you share any insight into how to monitor this via
an automated process as part of a deploy and how to finish scheduling the
remaining deploy once it has been verified?

I'm guessing we could use Jenkins' join plugin for this and have a job that
just waits for x minutes, but time does not necessarily correlate to a feature
actually being used.

~~~
wahnfrieden
I'm still doing that step manually as I figure out how I want to automate it.
An easy way would be to emulate your load balancer's health check (or
interface with your load balancer directly, if you can). In our case that's
AWS's ELB, which hits a /ping endpoint on each instance (handled reasonably
deep in our stack) and takes the instance out of service after enough
consecutive failures.
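
A rough sketch of that emulation, in the spirit of the bash upthread (the
/ping endpoint is ours; the host name, retry counts, and timings here are
made up):

    # poll the canary the way the ELB would; bail after 3 consecutive failures
    HOST="http://canary.internal"
    FAILS=0
    for i in $(seq 1 30); do
      if curl -fsS --max-time 5 "$HOST/ping" > /dev/null; then
        FAILS=0                      # a success resets the streak
      else
        FAILS=$((FAILS + 1))
      fi
      if [ "$FAILS" -ge 3 ]; then
        echo "canary unhealthy; divert traffic and halt the rollout" >&2
        exit 1
      fi
      sleep 10
    done
    echo "canary looks healthy; continue the rollout"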

It helps to look at what your past failures have been when going live in
production to know what to audit during incremental rollout. In our case,
/ping has stopped responding before, but some other common cases were
increased 500 response rates or severely decreased performance across average
or peak response times. These can be used as metrics when we automate. It
doesn't help much for infrequently-used features, but I think the main idea is
to prevent all of your instances from going down or going haywire at once,
which is unlikely to be caused by such features even if you rolled out
everywhere simultaneously.
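
As a strawman for the 500-rate check, something like this against the
canary's access log could gate the rest of the rollout (the log path, sample
size, and threshold are all made up; it assumes the status code is field 9,
as in the common log format):

    # abort if 5% or more of the last 1000 canary requests returned a 5xx
    LOG=/var/log/nginx/access.log
    TOTAL=$(tail -n 1000 "$LOG" | wc -l)
    ERRORS=$(tail -n 1000 "$LOG" | awk '$9 ~ /^5/' | wc -l)
    if [ "$TOTAL" -gt 0 ] && [ $((ERRORS * 100 / TOTAL)) -ge 5 ]; then
      echo "5xx rate >= 5% on the canary; aborting rollout" >&2
      exit 1
    fi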

I'd be interested to hear if you get that working with Jenkins to handle
incremental rollout; I haven't tried yet. I'm not even sure how this would be
done in Fabric, but that might be another option (have Jenkins call Fabric and
block on its completion).
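
If Fabric's per-host selection works the way I think it does, the
orchestration might be as simple as the line below (the host names, task
name, and check script are hypothetical):

    # deploy to one box, audit it, then fan out to the rest
    fab -H web1 deploy && ./check_canary.sh && fab -H web2,web3,web4 deploy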

~~~
mahmoudimus
wahnfrieden,

Can you reach out via support @ balancedpayments.com? Come hang out on our IRC
- irc.freenode.net #balanced.

Would love to talk to you more!

------
smackay
The article mentions it takes 10 minutes to do a release, so if you have to
revert to an earlier version you could be looking at significant (for a
payments processor) downtime. I presume you have a strategy for rolling back
directly on the servers?

~~~
msherry
The whole idea behind this testing infrastructure is so that we're fairly
certain that any deploy isn't going to leave us scrambling to revert to an
earlier version. Since we've implemented it, we haven't had to do that yet --
it's caught all manner of issues large and small that we're glad never made it
into production.

The 10 minutes is to run the full suite of tests. In the case of an emergency,
if we needed to get code out and wanted to bypass tests (e.g. reverting to an
earlier version), we could run our deploy task manually, in parallel on all
machines, and be done in about 30 seconds. This is obviously a nuclear option
that we'd hope to never have to use, but if we were reverting to known-good
code after a botched deploy, it might be our best bet.
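
Roughly, that manual path is our normal Fabric deploy task run by hand;
something like the below, though the exact invocation here is illustrative
rather than a transcript of our real one (Fabric 1.x's -P flag runs a task
against all hosts in parallel):

    git checkout <known-good-sha>   # the known-good state to go back to
    fab -P deploy                   # the deploy task, all machines at once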

This process is a safety net. If we want to work without a net, of course we
have more options -- this is the tradeoff that we've made.

EDIT: small clarification

------
tieTYT
950 unit tests? That sounds like a very small number - I was expecting 5000+.
Why so few? What makes your team decide to write a unit test?

~~~
msherry
Hi there. Why would you expect 5000+ tests, or anything more than 100? It
seems like the number of tests that is the "right" number would be very
dependent on the size of the codebase, which wasn't mentioned in the article.

We write tests when new code is added, or a bug is fixed and we want to make
sure it doesn't reappear. Hopefully we add new tests without prompting,
because that's the right thing to do, but our coverage enforcement will goad
us into writing tests if we forget. Ideally, new code should be testable by
only a few tests -- if it requires more, then it's probably too complicated
and should be refactored.
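
The coverage gate itself can be as simple as failing the build below a
threshold; a minimal sketch with coverage.py (the threshold and test runner
here are illustrative, not necessarily our exact setup):

    # run the suite under coverage, then fail the job if coverage dips too low
    coverage run -m pytest && coverage report --fail-under=90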

~~~
masklinn
> Hi there. Why would you expect 5000+ tests, or anything more than 100?

It might be a factor of testing methodology and style: many people consider
each assert a test, or hold that each test should only assert a single thing,
so the number of tests grows large. I was also surprised at a "mere" 950 unit
tests (our web frontend has maybe 10% test coverage and reports almost 700
tests, but that's because the test runner counts each assertion as a test, not
each test case, of which we have 200).

~~~
msherry
Ahh, I see. Right, 950+ is the number of unit tests we have, each of which
contains a number of assertions. If we count those, it's well into the
thousands, so I guess the original question wasn't that far off the mark.

