
Manual Work is a Bug – Always be automating (2018) - pcr910303
https://queue.acm.org/detail.cfm?id=3197520
======
onion2k
This article needs a giant warning on it. Automating is _brilliant_ , but an
automated process is a liability. If no one is regularly checking the output
of something that's been automated then you can't know if it's broken, and
that could be catastrophic. Every story of "the backups had failed", "payments
had been missed for months", or "inventory wasn't where it was supposed to be"
is the story of an automated process no one was checking.

If you automate _anything_ you need robust error reporting (which is not an
email someone will ignore).

~~~
LeonM
> which is not an email someone will ignore

Every developer should take this advice to heart. If a customer asks for 'a
daily email report', you should strongly advise against it.

\- Most reports the customer asks for will be meaningless vanity metrics
anyway

\- If you only do error reporting (not sending an email if everything goes
smoothly), you won't notice it if email is not being delivered.

\- If you report everything (not just the errors), people will stop reading
them

\- A mailbox history is not a log

\- you'll get a request just about every week to change the recipient list of
the report.

\- Who reads the reports on weekends?

I could go on with arguments for a while, but the short story is this: email
is not suitable for logging and reporting. In fact: email is not suitable for
a lot of things. But customers will ask you anyway, because that's the tool
they know (if you have a hammer, all problems will look like nails).

My advice:

\- Log everything to disk when possible (syslog, pipe)

\- use a logging aggregator to filter and archive (fluentd, ELK, graylog)

\- use an exception tracker for exceptions (sentry, raygun)

\- make incidents actionable by creating tickets automatically (jira, zendesk,
slack)

\- if needed, use an incident response service (pagerduty, opsgenie)
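
To make the "log everything, escalate only what matters" split concrete, here
is a minimal sketch using Python's standard logging module. `TicketHandler`,
`build_logger` and the `create_ticket` callable are hypothetical names; a real
setup would hand the dict to whatever tracker client you actually use.

```python
import logging


class TicketHandler(logging.Handler):
    """Turn ERROR-and-above log records into ticket dicts.

    `create_ticket` is a hypothetical callable standing in for a real
    tracker client; here it only receives a plain dict.
    """

    def __init__(self, create_ticket):
        super().__init__(level=logging.ERROR)
        self.create_ticket = create_ticket

    def emit(self, record):
        self.create_ticket({
            "title": f"[{record.levelname}] {record.name}: {record.getMessage()}",
            "source": f"{record.pathname}:{record.lineno}",
        })


def build_logger(name, log_path, create_ticket):
    """Log everything to a file and escalate only errors to tickets."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep the example independent of root handlers
    logger.addHandler(logging.FileHandler(log_path))
    logger.addHandler(TicketHandler(create_ticket))
    return logger
```

With this, `log.info(...)` only lands in the file, while `log.error(...)` lands
in the file and also produces one ticket.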

~~~
mr__y
>make incidents actionable by creating tickets automatically

not really a serious concern, but what happens when ticket automation goes
down? If all the problems are auto-reported, in a form of tickets or tasks,
then if that somehow breaks down it might take a moment before anyone notices,
especially if reported exceptions are rather rare. If it is normal to have
tens of tickets every day, then problems with ticket creation will be
noticeable almost instantly; if they are very rare, then not so much.
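
One common way around this is a dead man's switch: have the ticket pipeline
push a synthetic test ticket on a schedule, and let an independent check
complain when that heartbeat goes quiet. A rough sketch, with the class name
and the 24-hour window being my own assumptions:

```python
from datetime import datetime, timedelta, timezone


class HeartbeatMonitor:
    """Dead man's switch: silence, not an error message, triggers the alarm."""

    def __init__(self, max_silence):
        self.max_silence = max_silence
        self.last_seen = None  # no heartbeat received yet

    def beat(self, when):
        """Record that the pipeline delivered its synthetic test ticket."""
        self.last_seen = when

    def is_overdue(self, now):
        """True if no heartbeat ever arrived, or the last one is too old."""
        if self.last_seen is None:
            return True
        return now - self.last_seen > self.max_silence


# Example wiring (hypothetical schedule): expect one test ticket per day.
monitor = HeartbeatMonitor(max_silence=timedelta(hours=24))
monitor.beat(datetime.now(timezone.utc))
```

The check itself still has to run somewhere independent of the thing it
watches, otherwise both go down together.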

The second issue with automated exception tracking is that you lose the "huh,
this is weird" mechanism that works when actual humans go through logs or
reports. While any tool will of course be orders of magnitude faster and also
probably more accurate, by relying solely on such automation an opportunity to
notice some "weird"/not-typical entries or rare/unexpected sequences of those
might be missed. Then again in most cases - I guess - simple statistical
analysis might be a good substitute. And that can be automated. (edit:
formatting)
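
As a rough idea of what that "simple statistical analysis" could look like:
collapse the numbers in each log line so similar lines share a pattern, count
the patterns, and surface the ones that almost never occur. The function names
and the 1% threshold are assumptions for illustration only.

```python
import re
from collections import Counter


def normalise(line):
    """Collapse numbers so 'took 31 ms' and 'took 48 ms' count as one pattern."""
    return re.sub(r"\d+", "<n>", line.strip())


def rare_patterns(lines, max_share=0.01):
    """Return (pattern, count) pairs making up at most `max_share` of all lines."""
    counts = Counter(normalise(line) for line in lines if line.strip())
    total = sum(counts.values())
    rare = [(p, c) for p, c in counts.items() if c / total <= max_share]
    return sorted(rare, key=lambda pair: pair[1])  # rarest first
```

This only catches rare line shapes, not unexpected sequences of common ones,
so it is a partial substitute for a human skimming the log at best.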

~~~
SteveSmith16384
The problem is, for every automated reporting system, you need another system
to check that that system is working. And so on, all the way down.

~~~
bryanlarsen
Don't let the perfect be the enemy of the good. 99.9XXXX reliability is good
enough. Eventually you have enough nines that your risks are things like
"nuclear war", "dinosaur-killer-sized asteroid hitting the earth", et cetera.

~~~
mr__y
Agreed, there is absolutely no point going further after reaching a certain
reliability level. However, one thing is eliminating risks, the other is
limiting the consequences of said risks. I strongly prefer 99.9% reliability
where that 0.1% means some insignificant problem over 99.99% reliability where
the remaining 0.01% means total disaster. My point is that doing "too much"
automation gives diminishing returns (which is not bad in itself), but might
also disproportionately increase the consequences of that 0.xxxx1%
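
To put numbers on it, treat the expected cost of a run as the probability of
failure times the cost of that failure. The figures below are made up purely
to illustrate the comparison.

```python
def expected_loss(reliability, cost_of_failure):
    """Expected cost per run: probability of failure times its impact."""
    return (1 - reliability) * cost_of_failure


# Hypothetical figures, chosen only to illustrate the comparison.
minor = expected_loss(0.999, 100)            # fails 0.1% of the time, cheap to fix
disaster = expected_loss(0.9999, 1_000_000)  # fails 0.01% of the time, catastrophic

print(f"99.9% with minor failures:    {minor:.2f} per run")
print(f"99.99% with disaster failures: {disaster:.2f} per run")
```

With these assumed costs the "more reliable" system is roughly a thousand
times more expensive in expectation, which is the whole point.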

------
wwweston
I've been thinking about writing an essay about the flip side for a while, but
maybe it only needs to be a toss-off HN comment, so...

In political/economic debates, people like to say: "if you tax something, you
get less of it." And in turn "if you subsidize something, you get more of it."

You could think of automation as a labor subsidy. Or, on a rough level, as a
tax on attention. That's not strictly correct, but it gets you in the right
frame of mind: automation is, among other things, _designed_ to achieve labor
with diminished (or even absent) attention.

The problem is that diminished attention can have overlooked tradeoffs. If you
know those tradeoffs, you may be able to automate around them. If you're not
paying attention, you probably can't. If you've automated away attention, you
might not notice until you've hit a problem at the scale of your automation.

Automation _is_ valuable, absolutely. You probably want more of it. I do. And
yet...

~~~
superhuzza
Relevant

[https://www.researchgate.net/publication/49400362_Attention_...](https://www.researchgate.net/publication/49400362_Attention_and_automation_New_perspectives_on_mental_underload_and_performance)

------
stereolambda
I particularly love the suggestion to just document what you're doing as the
first step to proper automation. It's not that much work or cognitive effort.
Sometimes a decent description of your procedure is all you will need and it
tends to prove very valuable later. It can also be gradually improved. I was
doing it often already, and this will remind me to persist.

~~~
mikepurvis
This is very true. I stand up a server every now and then and just define the
setup as a single-file Ansible playbook— pretty much no matter what, it'll be
some combination of packages to install, users to create, config files to
modify, etc. Why write it down when I can just define it as code and check it
in somewhere?

But then the IT department comes along and hates it; they make a wiki page for
each server with a list of instructions for what was done. Their approach is
more flexible, as not every tool/step can be easily or robustly automated, and
if you can just screenshot the config webpage and paste it into your wiki,
that might be just as good as (and a lot faster than) spending an hour trying
to bypass that step.

Plus, their approach assumes aggressive use of VM-level tools. Where I'm like
"oh, I need N build slaves? Let me add them to my inventory file and re-run",
they're like "N build slaves? Sure, I'll set up the first one manually and
then clone the machine for the others."

I know there are real actual advantages to version controlled configuration,
but there is something pragmatic about just doing the thing and keeping human-
readable notes as you do it.

------
james_s_tayler
Reading this makes me realise on my team I'm the automation guru. That's my
culture through and through. The rest of my team's culture is the opposite. I
always want us to play a little more defense, so that we can over time get a
lot more done. They always push back that "we're too busy to do it". Yet I
continue to automate and things continue to improve. Sucks because we could be
significantly more effective if it were the team culture.

~~~
pytester
This may be a healthy dynamic. There's always a risk both of over-automating
and under-automating and people tend to sway one way or the other.

~~~
james_s_tayler
I sort of feel it's the inverse. The last place I worked was heavily into
automation and they had the one manager who always cautioned about relying on
too much automation. I think that was a healthier dynamic tbh.

Too much vs. not enough are just different sets of problems.

Too much = you risk no one on the team being able to troubleshoot when things
go wrong because no one has ever had to run through the process themselves
before so no one understands it.

Too little = a non-trivial percentage of your bandwidth is eaten up by trivial
things you keep doing over and over again that are in the vein of servicing
requests rather than solving problems.

In practice I find I'd rather have the first problem and I can think of
solutions to its problems that I'm happy enough with.

------
Smaug123
I've recently been on a learn-about-economics spree, so I have a handy lens
nearby with which to view this advice: you must not allow your gains to grow
merely linearly (by constantly learning but not automating). The economic
imperative is to allow gains to compound (by automating). Manual workflows
cannot compound; automated workflows can.

Sam Altman covers the same sentiment in
[http://blog.samaltman.com/how-to-be-successful](http://blog.samaltman.com/how-to-be-successful).

~~~
treerock
in theory[1], yes.

[1]:[https://www.xkcd.com/1319/](https://www.xkcd.com/1319/)

(I may have been doing a lot of automation recently because, I think, bosses
have paid a lot of money for an automation tool and feel the need to use it,
and am therefore feeling a little cynical about the whole thing)

~~~
mikepurvis
Ha, I expected a link to this one:
[https://xkcd.com/1205/](https://xkcd.com/1205/)

------
PinkMilkshake
I've struggled with this. I'm part of a smaller local faculty IT and our
systems are created by a central IT.

Some problems I struggle with:

1\. How do I automate in an environment of so many disparate systems. Some
things are done through web interfaces, some RDP'ing to an AD server, email,
spreadsheets sitting on shared drives, etc.

2\. How do I handle credentials. Central IT won't create generic accounts for
us, even for testing, so all automation would have to be done through my
credentials. How do I set up an automated system that uses my credentials
safely?

3\. Where do I run this system from? From my own laptop, a box under my desk,
somewhere in the cloud where I'd have to beg security to allow it to do
anything non-trivial? And how do I pass this on to the next person?

I've looked into RPA type solutions (expensive), headless browsers (so
unreliable), Microsoft Flow (limited and confusing).

I'm at a loss but I did recently find TagUI which looks very interesting
[https://github.com/kelaberetiv/TagUI](https://github.com/kelaberetiv/TagUI)

~~~
AnIdiotOnTheNet
> How do I automate in an environment of so many disparate systems.

1\. Using disparate tools if need be. Fortunately we have PowerShell these
days which covers a whole lot of stuff including being ok at various web-based
interactions. Anymore, if it can't be done with PowerShell (in a Windows
environment that is) then I think it's probably too fragile to leave to
automation anyway... ok, that's not entirely true. There is at least one
important production system in our environment I have to automate using
AutoHotKey and that works pretty well.

2\. I have no idea why anyone would think not allowing generic accounts for
this sort of thing would be a good idea, but under your own account shouldn't
be a big deal as long as you leave appropriate documentation about what needs
to happen in case you're hit by a bus. Many sins are forgivable if there is
adequate documentation.

3\. I usually just run automation scripts on the system I'm scripting. If I'm
dropped into an environment and asked to figure out how some system is
automated, where's the first place I'm going to look?
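
On point 2, whatever account you end up using, it helps to keep the password
out of the script itself. A minimal sketch in Python (not PowerShell, just to
show the shape): read the credentials at run time and fall back to a prompt.
`AUTOMATION_USER` and `AUTOMATION_PASSWORD` are hypothetical variable names.

```python
import getpass
import os


def load_credentials(env=os.environ, prompt=getpass.getpass):
    """Read credentials at run time instead of storing them in the script.

    AUTOMATION_USER and AUTOMATION_PASSWORD are hypothetical variable names.
    """
    user = env.get("AUTOMATION_USER") or getpass.getuser()
    password = env.get("AUTOMATION_PASSWORD")
    if password is None:
        # Fall back to an interactive prompt so nothing is written to disk.
        password = prompt(f"Password for {user}: ")
    return user, password
```

This only keeps the secret out of the script and out of version control; it
does not protect it from anyone who can read the process environment.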

------
bryanlarsen
A better way to do step 2 in the process: "Do-Nothing Scripting". Discussed on
HN recently:
[https://news.ycombinator.com/item?id=20495739](https://news.ycombinator.com/item?id=20495739)
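
For anyone who hasn't seen the idea: a do-nothing script automates nothing at
first. It just walks you through the documented steps one at a time, and each
step is a function you can later replace with real code. A rough sketch with
made-up steps (the account and group example is hypothetical):

```python
def step_create_account(ctx):
    return f"Create an account named {ctx['username']} in the admin console."


def step_add_to_group(ctx):
    return f"Add {ctx['username']} to the group {ctx['group']}."


def step_notify(ctx):
    return f"Email {ctx['username']} their first-login instructions."


STEPS = [step_create_account, step_add_to_group, step_notify]


def run(ctx, steps=STEPS, wait=input, show=print):
    """Walk the operator through each step; nothing is automated yet."""
    for number, step in enumerate(steps, start=1):
        show(f"Step {number}/{len(steps)}: {step(ctx)}")
        wait("Press Enter when done... ")
    show("All steps complete.")
```

Calling `run({"username": "jdoe", "group": "staff"})` prints each instruction
and waits for Enter. Automating a step later means changing one function to do
the work instead of describing it.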

------
hcarvalhoalves
Automation gives you consistency. If you automate the incorrect process,
you’ll get it done consistently wrong.

It still is a win because you’ll have a better chance of diagnosing and
improving something that fails the same way as opposed to some manual process
that has been done in a dozen different incorrect ways and you learn nothing
from mistakes because there are no automated controls to feed back into.

Most people and organizations have a strong negative gut reaction when
automation does something wrong but are okay when a manual process yields
mistakes because we largely accept humans as fallible (at most, someone gets
fired, and more bureaucracy is introduced), but the long-term economics of
automation look ultimately better.

I’ve seen time and time again apparently innocent manual processes that, when
not recognized as a liability, rapidly turn into entire departments (and
entire management hierarchies), at which point it’s impossible to improve,
because changing culture is harder than changing code. Having enough instances
of that is what creates corporate whales without competitiveness.

------
rukenshia
I never heard any excitement from people when we told them we would test some
automation with their request, but otherwise I love being reminded of this. We
tend to slack on that in our team, for all the reasons mentioned in this post.

What we have found helped us quite a bit is, after writing the initial runbook
for a task, to write a second script (usually pretty short, maybe two dozen
lines) that asks you questions and then generates customised steps. We then
later use these functions in the actual automation.
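
A sketch of what such a script might look like, assuming a made-up server
setup task (the questions and steps are hypothetical, not our actual runbook).
The point is that `generate_steps` is a plain function, so the later
automation can call it without the interactive part.

```python
QUESTIONS = {
    "hostname": "Hostname of the new server? ",
    "needs_db": "Does it need a database (y/n)? ",
}


def ask(questions=QUESTIONS, read=input):
    """Collect one answer per question, keyed by the question's name."""
    return {key: read(text).strip() for key, text in questions.items()}


def generate_steps(answers):
    """Build the customised checklist from the answers."""
    host = answers["hostname"]
    steps = [f"Create a VM named {host}.", f"Add {host} to monitoring."]
    if answers["needs_db"].lower().startswith("y"):
        # Optional step goes between VM creation and monitoring.
        steps.insert(1, f"Provision a database for {host}.")
    return steps
```

Interactive use is then just `for s in generate_steps(ask()): print(s)`.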

~~~
mdaniel
I don't know if you already saw the other HN post on the front page about
this, or just coincidence, but:

> Do-nothing scripting: the key to gradual automation

[https://news.ycombinator.com/item?id=20495739](https://news.ycombinator.com/item?id=20495739)

------
wideasleep1
Always improve, but manual work is the craft that gets you to that
improvement. Once the wheel has been created, no need to recreate it, but
brakes might be nice, too.

------
broth
I would argue not all manual work is a bug nor does all manual work need to be
automated. For example, if you have a manual task that needs to be done once
per quarter, is it really worth investing time and money to automate it? In
contrast, if you have a manual task that you do multiple times per day, then
perhaps that's worth investing resources to automate that task.
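
A back-of-the-envelope version of that comparison, counting only time saved.
The function names and the example figures are assumptions for illustration.

```python
def time_saved_hours(minutes_per_run, runs_per_year, years):
    """Total manual time the task would consume over the period, in hours."""
    return minutes_per_run * runs_per_year * years / 60


def worth_automating(minutes_per_run, runs_per_year, years, hours_to_automate):
    """True when the manual time saved exceeds the time spent automating."""
    return time_saved_hours(minutes_per_run, runs_per_year, years) > hours_to_automate


# Hypothetical tasks over a five-year horizon, 40 hours of automation work each.
quarterly = worth_automating(30, 4, 5, hours_to_automate=40)   # 30 min, 4x a year
daily = worth_automating(5, 3 * 250, 5, hours_to_automate=40)  # 5 min, 3x per workday
```

Under these assumptions the quarterly task saves 10 hours and the daily one
312.5, so only the second clears a 40-hour investment. It ignores things like
fewer mistakes, which can justify automating a rare task anyway.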

------
majkinetor
I applaud this sentiment. Automate all the things.

~~~
ozim
Automate things that you sufficiently understand. Do it manually the first 20
times, and then, when you know what it really does, automate.

------
znpy
This article is short-sighted at best. I am a system administrator who sadly
cannot do much automation.

The reason is simple: I manage systems not installed by me, where problems are
often one-off issues and, more importantly, every system is installed a bit
differently (differently enough that there is no portability for automation
between systems).

Automation is awesome. But sometimes it just cannot be done, because of
constraints put in place by others.

------
pmlnr
[https://xkcd.com/1205/](https://xkcd.com/1205/)

~~~
james_s_tayler
Better yet - Terence Parr's famous quote - "Why program by hand in five days
what you can spend (twenty) five years of your life automating?"

