
Invent More, Toil Less (2016) [pdf] - DyslexicAtheist
https://www.usenix.org/system/files/login/articles/login_fall16_08_beyer.pdf
======
spectre256
The Google SRE book has an excellent description of toil and, unexpectedly,
discusses that some amount of toil is beneficial.

In short, the authors of that section claim that it's not really possible for
the SREs at Google to spend all their time solving novel problems through
automation.

This makes sense: constantly solving new problems is hard. It takes lots of
time and mental energy, and the outcome is inherently uncertain.

Google found that some amount of toil (roughly defined as repetitive tasks
that are not particularly challenging and do not solve long term problems) is
essential for the health of their engineers. Toil is boring yes, but can be
relaxing, and as work that is inherently easier to accomplish, can help keep
confidence that working on solving unknown problems can deplete.

I would have expected that Google would have absolutely minimal toil, given
that they are leaders in the automation space, but if they've found that some
amount of easier work is necessary, then it's probably true for anyone.

~~~
gav
> Toil is boring yes, but can be relaxing, and as work that is inherently
> easier to accomplish, can help keep confidence that working on solving
> unknown problems can deplete.

I think it is important for all roles to have a cadence that mixes easy tasks
in with the challenging ones so that every day doesn't seem like a drag.

For example, I prefer to start my day by knocking out a couple of easy bug
fixes before diving into some challenging development.

~~~
spectre256
I'm exactly the same way. If I can get a few things checked off my todo list
early, I'm nearly guaranteed to at least feel like I had a good day.

~~~
nubbins
That seems like a good strategy, but maybe we should also work towards a
mindset of enjoying the process of working on hard problems, even when it
seems fruitless. Whenever I feel I’m spinning my wheels, I remind myself that’s
how I learned what I know, and my best breakthroughs usually come when I think
it’s a dead end.

------
jacques_chester
This is a good example of the "capability trap" dynamic[0][1]: there is always
short-term pressure that crowds out long-term capability building. The longer
you neglect the capability-building, the worse your capability gets, the
higher the pressure.

The only way out is to acknowledge that you will take the hit in a "worse
before better" phase.

[0] http://web.mit.edu/nelsonr/www/Repenning=Sterman_CMR_su01_.pdf

[1] https://www.systemdynamics.org/assets/conferences/2017/proceed/papers/P1325.pdf

~~~
cableshaft
Could you go more into detail about the 'taking a hit in the worse before
better' way of getting out of it? My department is drowning in endless toil
(from the parent article) right now, and I'd like to figure out some way to
get out of it besides just leaving the company like so many others have
(although I might end up doing that too, but maybe I can course correct the
department just a little bit before doing so).

~~~
jacques_chester
> _Could you go more into detail about the 'taking a hit in the worse before
> better' way of getting out of it?_

The case studies in the first paper are worth reading, but essentially you
make it clear to everyone that you _are_ going to take a hit on your primary
production output in order to restore capability:

> _Policy analysis showed that escaping the capability trap necessarily meant
> performance would deteriorate before it could improve: While continuing to
> repair breakdowns, the organization has to invest additional resources in
> planned maintenance, training and part quality, raising costs. Most
> importantly, increasing planned maintenance reduces uptime in the short run
> because operable equipment must be taken off-line for the planned
> maintenance to be done. Only later, as the Reinvestment loop begins to work
> in the virtuous direction, does the breakdown rate drop._

The reason that it's called the capability _trap_ is that nobody wants to
accept things getting _worse_. Everyone wants the improvements to be (1) free
and (2) monotonic. But once you get stuck in the trap it's (1) expensive and
(2) initially backwards. You slow primary production to make improvements and
all the skeptics can see is that the numbers are worse, and didn't you promise
to improve it? But the way out means going backwards on the primary metrics
before you can free up capacity to improve capability.

And it doesn't even need skeptics to be hard. Even with all the good
intentions in the world, it's very hard to avoid the temptation to prioritise
your primary output over everything else. Sure, we should improve our CI/CD
... once we ship this feature. Sure, we should automate our disaster recovery
... but put out this fire first. Yes, we should create systems to manage the
fleet better ... but we can't let you refuse anyone's requests, we have a
business to run. Yes, of course we need reserve capacity to deal with
uncertainty, but don't you dare run anything less than maximum utilisation ...
and also don't you dare say we're at maximum utilisation when I give you more
to do.
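
To make the "worse before better" shape concrete, here is a toy sketch. It is not the model from either paper; the update rule and every number (`decay`, `gain`, the 20% investment) are made-up assumptions chosen only to show the shape of the curve.

```python
def simulate(invest_fraction, periods=20, capability=1.0, decay=0.03, gain=0.25):
    """Toy model (assumed, not from the papers): each period, a fraction of
    time goes to capability-building and the rest to primary output."""
    outputs = []
    for _ in range(periods):
        # Primary output is what is left after investing, scaled by capability.
        outputs.append((1.0 - invest_fraction) * capability)
        # Capability erodes on its own and is rebuilt by investment.
        capability = capability * (1.0 - decay) + gain * invest_fraction
    return outputs


firefighting = simulate(invest_fraction=0.0)  # all time on primary output
reinvesting = simulate(invest_fraction=0.2)   # 20% of time on capability

# First period in which the reinvesting team out-produces the firefighting one.
crossover = next(
    t for t, (r, f) in enumerate(zip(reinvesting, firefighting)) if r > f
)

print(f"period 0: firefighting={firefighting[0]:.2f}, reinvesting={reinvesting[0]:.2f}")
print(f"reinvesting overtakes firefighting in period {crossover}")
```

Under these assumptions the reinvesting team starts out with lower output and only pulls ahead some periods later, which is the stretch where the skeptics see nothing but worse numbers.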

------
jeremyperson
Love it. "Work with enduring value leaves a service permanently better,
whereas toil is “running fast to stay in the same place.” Therefore, as a
service grows, unchecked toil can quickly spiral to fill 100% of everyone’s
time."

~~~
mlthoughts2018
It depends on the point of view of management. In a lot of companies, managers
don’t care about tech debt much and see “toil” as the rightful majority mode
of work that engineers ought to focus on. They may simply lack the capacity to
understand what an investment in paying down tech debt to free engineer time
could buy them, or they may view that process as too risky, or they just may
not care.

There are a lot of managerial reasons why this happens. It’s one of the
biggest lines of questioning I pursue in job interviews, to try to understand
what point of view people have elsewhere in the food chain about how (or _if_)
engineers add value to the business.

~~~
nostrademons
There are two sides to this:

On one hand, good management should realize that automating repetitive tasks
and paying down technical debt is how you add to a tech company's capital
stock, and _is the only reason why tech company valuations tend to go
parabolic_. If you're just trading money for labor and labor for customers'
money, you have a consulting service business. These can be profitable, but
you don't build an enduring asset this way, and the valuations that business
owners usually think about when they decide to start a tech company are those
that come from owning a monopoly asset that you can sell to multiple customers
for virtually no additional cost.

On the other hand, the part that engineers are usually blind to is the
business context that the company operates in. Tech moves fast. It's not
uncommon for basically all of a company's founding assumptions to be obsolete
3 years later. New tech platforms are available; customers want different
things; a new competitor has just demoed a killer feature. All of these have
the possibility to render large swaths of a product's feature set obsolete.
There's no sense investing engineering time in a codebase that's about to be
killed anyway; just rack up the technical debt and declare bankruptcy. Worse,
management often can't _tell_ engineering about many of these external
realities without killing the product: would you continue working for a
company that said "Our competitive position is untenable. Pump out as many
features as possible before anyone external notices so we can sell the
company" or "We need you to keep the lights on for our existing customers
while this secret division over there builds the next generation of the
product"?

It's often good to ask questions about the company's overall strategy and
competitive position (and to do research on this yourself) before joining.
While the owners usually won't tell you everything, it'll build trust if they
can tell you some things. They should be able to think of these issues in
terms of trade-offs: "What's technical debt?" is the wrong answer, but so is
"We care deeply about technical debt and give our engineers as much time as
they want for code cleanup" (the latter is a bullshit answer, designed to make
the hire). Similarly, it builds trust if engineers can also recognize the
business tradeoffs and accept that sometimes the right answer is to hack
something together and ship it so the customers can get value while management
figures out the next strategic move.

------
empath75
If you’re bored with your work, it’s a sign that you should be automating
something, or else someone else will do it for you and you’ll be out of a job.

~~~
platz
There are a limited (and contested) number of non-boring jobs. Eventually
someone has to take out the garbage and clean the toilets.

~~~
hhs
Speaking of that, I wonder when there will be robots (i.e., with good
software) that will help take out the garbage, clean the toilets, and possibly
fix plumbing?

~~~
platz
Lol, plumbing especially will never be automated

~~~
mbrubeck
Plumbing _is_ automation. Without plumbing, human labor is used to haul water
from wells and cart away human waste.

~~~
tomjakubowski
I think there's some confusion here between plumbing as in the trade and
plumbing as in the infrastructure.

------
z3t4
As someone doing both dev and ops I automate and refactor away any issues
(some small yet important issues might require large rewrites in the
software). But I also like to carry out manual tasks for users as it gives
them a sense of service and I get to _talk to users_.

------
ildari
Is there any tool to count toil?

~~~
motakuk
We consider working on alerts to be toil and measure it in amixr.io
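
If you just want a rough number without a tool, a minimal sketch is to tag logged time as toil or not and compute the share. This is not how amixr.io or any other product works; `TimeEntry`, `toil_fraction`, and the sample hours are hypothetical names and data made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class TimeEntry:
    """One logged block of work (hypothetical schema)."""
    engineer: str
    hours: float
    is_toil: bool  # tagged by whoever logs the time


def toil_fraction(entries):
    """Return the share of logged hours tagged as toil, or 0.0 if nothing is logged."""
    total = sum(e.hours for e in entries)
    toil = sum(e.hours for e in entries if e.is_toil)
    return toil / total if total else 0.0


# Made-up sample week for two engineers.
entries = [
    TimeEntry("alice", 6.0, True),
    TimeEntry("alice", 2.0, False),
    TimeEntry("bob", 3.0, True),
    TimeEntry("bob", 5.0, False),
]

fraction = toil_fraction(entries)
print(f"toil share of logged time: {fraction:.0%}")
```

The hard part is agreeing on what gets tagged as toil, not the arithmetic.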

