

Ask HN: How do you improve your attention to detail? - Evgeny

Simple example: you wrote brilliant code, passed all tests, and installed it into the production environment, but you configured the wrong date (or forgot to configure some other parameter) and your release was an epic fail.

It seems to happen to me often, and it does not seem to be directly related to programming ability: the code does what it's supposed to do. But since the environment I'm working in interacts directly with clients in real time, every time anything is released with a wrong configuration, it's only picked up when a client does something, gets an error back, and complains. That's quite frustrating.

I'm trying to think of a way to 'fix' it. It must have something to do with the ability to keep a lot of small details in memory and/or to picture in my mind all the possible small differences between the "production" and "test" environments.

Any advice?
======
aamar
You've observed something important: deployment is a frequent source of
errors. It is a discipline unto itself which deserves understanding, study,
coding, and infrastructure. Here is my system:

First, here are the specific types of solutions, from best/most-difficult to
least-efficient/most-achievable:

1\. Automation, e.g. have the date set automatically by your deployment
script. Make your scripts smart enough to set things correctly.

2\. Poka-yoke (<http://en.wikipedia.org/wiki/Poka-yoke>): e.g. have the deploy
script refuse to push out code if a configuration variable is unchanged,
unless there is a comment on that config line overriding the check.

3\. Checklist: Write down a set of procedures/checks to do when deploying.
Make a copy of this list on deploy, and check off each item as you do it. (See
also:
[http://www.newyorker.com/reporting/2007/12/10/071210fa_fact_...](http://www.newyorker.com/reporting/2007/12/10/071210fa_fact_gawande))
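
To make item 2 concrete, here is a minimal sketch of a poka-yoke check, under several assumptions that are not from the comment above: the config is a plain `key = value` text file, the keys that must change on every release are listed by hand, and a hypothetical `# deploy-ok` comment on a line is the override. A deploy script could call `unchanged_settings` and refuse to continue if it returns anything.

```python
# Hypothetical marker a human adds to a config line to say "yes, I meant this".
OVERRIDE_MARKER = "# deploy-ok"


def parse_config(text):
    """Parse 'key = value' lines into {key: (value, overridden)}."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Drop any trailing comment before looking for key and value.
        body = line.split("#", 1)[0]
        if "=" not in body:
            continue  # blank line, pure comment, or not a setting
        key, value = body.split("=", 1)
        settings[key.strip()] = (value.strip(), OVERRIDE_MARKER in line)
    return settings


def unchanged_settings(previous_text, candidate_text, must_change):
    """Return the keys in must_change whose value is the same as in the
    previous release and whose line carries no override marker."""
    previous = parse_config(previous_text)
    candidate = parse_config(candidate_text)
    blocked = []
    for key in must_change:
        if key not in previous or key not in candidate:
            continue  # nothing to compare against
        value, overridden = candidate[key]
        if value == previous[key][0] and not overridden:
            blocked.append(key)
    return sorted(blocked)


if __name__ == "__main__":
    old = "release_date = 2010-01-01\nhost = prod.example.com\n"
    new = "release_date = 2010-01-01\nhost = prod.example.com\n"
    stale = unchanged_settings(old, new, {"release_date"})
    if stale:
        raise SystemExit("Refusing to deploy, unchanged settings: " + ", ".join(stale))
```

The point of the override marker is that skipping the check takes a deliberate, visible edit rather than a moment of forgetfulness.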

You'll note that all of those are geared at reducing reliance on your memory,
rather than improving your memory. Next, here is the overall process:

\- ("5 whys") When you have a problem in deployment, write down what went
wrong. Find 5 ways that problem could have been prevented.

\- Address at least two of those with solutions from the set above. For
example, you might automate part of the problem and add a poka-yoke in another
script. Or if you're ambitious, you may be able to have two fully automated
solutions, an automated deployment script and an automated test which checks
that that automation is working. Not as good (but sometimes easiest) is to add
it to two different checklists, filled out by different people or at different
times.

\- Implement solutions even when something _almost_ went wrong but didn't.

\- Periodically review the overall process; refactor and otherwise improve it.
In particular, push solutions up the chain, e.g. replace a checklist item with
a much better automated solution.

Additional things to consider, specific to deployment:

\- Deployment frameworks (e.g. Chef) can assist with automation.

\- Deploying to a staging environment first can flush out many issues.

\- Deploy first to a small subset of servers/users, following with deployment
to all users X hours later.

\- Get the advice and support of an experienced NetOps or sysadmin person.
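
The "small subset of users first" idea can be sketched as follows. This is one common way to pick a stable subset (hashing a user identifier into a bucket), not something the comment prescribes; the function name, the string user IDs, and the percentage parameter are all assumptions made for illustration.

```python
import hashlib


def in_first_wave(user_id, percent):
    """Return True if this user falls in the first `percent` of users.

    The bucket is derived from a hash of the user ID, so the same user
    always gets the same answer, and raising `percent` only ever adds
    users to the wave.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # 0..99
    return bucket < percent


if __name__ == "__main__":
    # Hypothetical usage: serve the new release to about 5% of users first,
    # then raise the percentage to 100 some hours later if nothing breaks.
    for user in ["alice", "bob", "carol"]:
        release = "new" if in_first_wave(user, 5) else "old"
        print(user, "->", release)
```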

Edit: shouldn't be so negative about checklists; they're often useful.

~~~
thisrod
I use checklists, and they work.

Scientists and engineers have a technique for complicated mathematics, which
might work for editing sendmail.cf. Do it twice, and compare your answers.
This works best when the "you" is plural, but repeating your own work the next
day is better than nothing.
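
For configuration, "do it twice and compare" might look like the sketch below. It assumes each attempt has already been loaded into a dictionary of settings, and that no setting legitimately has the value `None`; both are assumptions for the example, not part of the technique itself.

```python
def compare_answers(first, second):
    """Return {key: (first_value, second_value)} for every key on which two
    independently prepared configurations disagree.

    A key missing from one side is reported with None on that side.
    """
    disagreements = {}
    for key in sorted(set(first) | set(second)):
        a = first.get(key)
        b = second.get(key)
        if a != b:
            disagreements[key] = (a, b)
    return disagreements


if __name__ == "__main__":
    # Hypothetical settings, written out by two people (or on two days).
    mine = {"release_date": "2010-02-01", "timeout_s": 30}
    yours = {"release_date": "2010-02-10", "timeout_s": 30, "region": "eu"}
    for key, (a, b) in compare_answers(mine, yours).items():
        print(f"{key}: {a!r} vs {b!r}")
```

An empty result does not prove the configuration is right, only that both attempts made the same choices, which is why it works best when the two attempts are genuinely independent.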

------
ScottBurson
I think the only thing you can do is to make the process less sensitive to
your ability to remember everything. You, like everyone else, are not a
computer. There are two basic approaches: automation and testing. Automate as
much as possible of the process of installing your code into the production
environment, and then make it a practice, the moment new code is
installed, to fire up a browser and bang on it yourself.
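
Part of that post-install check can itself be scripted, as a supplement to banging on it by hand rather than a replacement. The sketch below is an assumption-laden illustration: `fetch` is a hypothetical function you supply that requests a path from the production site and returns `(status_code, body)`, and the paths and expected strings are made up.

```python
def run_smoke_checks(fetch, checks):
    """Run a list of post-deploy checks and return a list of failure messages.

    fetch(path) must return (status_code, body_text).
    checks is a list of (path, expected_status, expected_substring).
    """
    failures = []
    for path, expected_status, expected_text in checks:
        try:
            status, body = fetch(path)
        except Exception as exc:  # a failed request is itself a finding
            failures.append(f"{path}: request failed ({exc})")
            continue
        if status != expected_status:
            failures.append(f"{path}: expected status {expected_status}, got {status}")
        elif expected_text not in body:
            failures.append(f"{path}: response is missing {expected_text!r}")
    return failures


if __name__ == "__main__":
    # Stand-in for a real HTTP client, so the sketch runs without a network.
    fake_site = {"/health": (200, "ok"), "/quote": (500, "config error")}

    def fetch(path):
        return fake_site[path]

    problems = run_smoke_checks(fetch, [("/health", 200, "ok"), ("/quote", 200, "price")])
    for problem in problems:
        print("SMOKE CHECK FAILED:", problem)
```

Run right after a release, a check like this surfaces a bad configuration before a client does.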

