Hacker News new | past | comments | ask | show | jobs | submit login

The article mentions it takes 10 minutes to do a release so if you have revert to an earlier version you could be looking at significant (for a payments processor) downtime. I presume you have a strategy for rolling back directly on the servers?



The whole idea behind this testing infrastructure is so that we're fairly certain that any deploy isn't going to leave us scrambling to revert to an earlier version. Since we've implemented it, we haven't had to do that yet -- it's caught all manner of issues large and small that we're glad never made it into production.

The 10 minutes is to run the full suite of tests. In the case of an emergency, if we needed to get code out and wanted to bypass tests (e.g. reverting to an earlier version), we could run our deploy task manually, in parallel on all machines, and be done in about 30 seconds. This is obviously a nuclear option that we'd hope to never have to use, but if we were reverting to known-good code after a botched deploy, it might be our best bet.

This process is a safety net. If we want to work without a net, of course we have more options -- this is the tradeoff that we've made.

EDIT: small clarification




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: