
I forgot how to manage a server - Mojah
https://ma.ttias.be/i-forgot-how-to-manage-a-server/
======
pritambaral
The three pitfalls TFA mentions are the reason why I enforce the following
rules around operations automation at my company:

1\. Only what can be done manually should be automated.

Before automating something, a sysadmin must be able to manually carry out the
same operation. If they do not understand how it works, without using someone
else's scripts, they are ineligible to administrate it.

2\. What has been automated should always be doable manually.

After something has been automated, a new sysadmin should be able to read the
automation code and re-do the steps manually. Maybe not at the full scale of
repeating actions manually on every server, but enough that if the automation
tools are irrecoverably lost they can be rebuilt from knowledge.

This also serves as a test for adequate documentation in the automation code.
A new sysadmin should not have trouble figuring out why a certain thing is
done a certain way in the scripts.

3\. What happens, even when automated, should not be hidden from the sysadmin

Even in full auto, what's going on is always visible, so a sysadmin can jump
in (i.e., go full-manual) at any step. This was one of our earliest rules,
because we (me & my teammates) wanted to be able to debug without friction if
anything went wrong. Also, it looks cool(er).
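
For illustration, rule 3 could look something like the following minimal Python sketch. It is not our actual tooling: `run_visible` and its parameters are hypothetical names invented here. Every command is echoed before it runs, and whatever did not complete is handed back so a sysadmin can take over by hand.

```python
import shlex
import subprocess
from typing import Callable, Iterable, List, Optional


def run_visible(
    steps: Iterable[List[str]],
    execute: Callable[[List[str]], int] = lambda argv: subprocess.run(argv).returncode,
    log: Callable[[str], None] = print,
    confirm: Optional[Callable[[List[str]], bool]] = None,
) -> List[List[str]]:
    """Run each step, echoing the exact command first.

    Returns the steps that did not complete (the failed or declined step
    included), so they can be finished manually.
    """
    pending = [list(step) for step in steps]
    while pending:
        argv = pending[0]
        log("+ " + shlex.join(argv))  # show the command before it runs
        if confirm is not None and not confirm(argv):
            log("stopped before this step; the rest is left for manual execution")
            break
        status = execute(argv)
        if status != 0:
            log(f"step failed with exit status {status}")
            break
        pending.pop(0)  # only drop a step once it has succeeded
    return pending
```

Passing a `confirm` callback that asks on the terminal gives a step-by-step mode; leaving it out runs everything but still prints each command.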

\----

As a guiding principle behind how we automate operations, we assert that no
amount of automation is a replacement for a sysadmin's skills; it replaces only
manual, error-prone effort.

~~~
jasonlotito
So, nothing here prevents "forgetting how to manage a server" like the article
describes. This is literally how automation generally works. You are doing
something manual. You know how to do it manually. You automate it so you don't
have to do it manually. You never do it manually again.

Until you have to 10 years later for whatever reason, and suddenly, you
forget. And suddenly you have to look things up.

~~~
zrobotics
Which is why I think there should still be some manually-managed servers
remaining (possibly something like testing). What I didn't see mentioned is
that when the automation fails for whatever reason, typically it is a five-
alarm situation where minutes count. Having to search through man pages or
stack exchange is wasting valuable time during one of the few situations where
time is super-critical.

~~~
pritambaral
> Which is why I think there should still be some manually-managed servers
> remaining (possibly something like testing).

All operations during development are carried out manually, unless already
supported by automation (and until, of course). When automating the new
operations, a sysadmin must also conduct the steps manually, and also take
care of considerations a developer might not have been aware of.

> What I didn't see mentioned is that when the automation fails for whatever
> reason, typically it is a five-alarm situation where minutes count.

Generally, this is easily handled by redundancy, HA, and snapshot rollbacks.
Of course anything that's a potential five-alarm situation has redundancy,
right? :P

------
ryu2k2
He didn't forget how to manage a web server. He knew his problems, how to look
for solutions and how to apply them. That's an important skillset too.
Especially since it enables one to keep up with change. You're not a
professional sysadmin if you only mindlessly keep using the same
configurations that were taught to you.

~~~
chousuke
I found it a bit puzzling too. Does someone actually bother memorizing
iptables syntax? Sure, if you use it daily for a long time, some of it will
stick, but it seems completely pointless to waste time explicitly learning it.

I find it's better to save my brain power for learning things like what I need
to take into account when configuring a database for example, or what
variables affect network performance and what options I have for tuning, etc.

When I reach for tools to solve problems, the most I usually remember is "I
can use X to solve this class of problems" and the how is available from the
manual or with a Google search.

~~~
Aloha
I'm sure someone does - I mean I have a friend whom I can hand a network
diagram with specified cisco devices, and he can write cisco configs by hand,
from memory - whereas me, I struggle to recall any of the regex I learned the
last time I needed regex.

~~~
afturner
oh for real fuck regex. I need to just sit down for an hour and learn it, but
I never do :( it's like Vim..

~~~
nurettin
I don't even remember when or why I learned it. Was it 10 years ago? 15? 20?
Was it the xkcd.com/208 ? Was it curiosity? All I know is I've been using it
ever since and solved so many problems with it and yes, there are places where
finding the right expression becomes a problem in itself, but online regex
testers largely solve that problem by letting you visually test your regex.
(see rubular, pythex)

People have huge reservations about parsing non-regular languages (like xml)
with regular expressions because xml could have an attribute with a string
with another xml in it or something, but the whole point is moot because regex
is mainly used for partial matches and the possibility of having a string
inside the html that could match whatever you are looking for is minuscule and
will probably show up in tests anyway.

~~~
Aloha
If you design your regex for the XML you're using it on, this shouldn't be an
issue - I use it frequently to munge config files as needed for work stuff - we
use XML or XML-like files for all sorts of stuff.
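
As a rough sketch of that kind of partial match (Python; the `<server>` element, its attributes and the `set_port` helper are all made up for illustration, and the pattern assumes `name` appears before `port` inside the tag):

```python
import re

# Hypothetical config fragment; element and attribute names are invented.
CONFIG = '''<config>
  <server name="web01" port="8080"/>
  <server name="db01" port="5432"/>
</config>'''


def set_port(text: str, server: str, port: int) -> str:
    """Rewrite the port attribute of the <server> tag with the given name."""
    # [^>]*? keeps the match inside a single tag, so this stays a partial
    # match on known input rather than an attempt to parse XML in general.
    pattern = re.compile(
        r'(<server\b[^>]*?\bname="' + re.escape(server) + r'"[^>]*?\bport=")\d+(")'
    )
    # A function replacement avoids ambiguity between group references and
    # the digits of the new port number.
    return pattern.sub(lambda m: f"{m.group(1)}{port}{m.group(2)}", text)
```

Calling `set_port(CONFIG, "web01", 9090)` changes only the `web01` line; a name that does not occur leaves the text untouched.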

------
nurettin
Things he feels bad about are strange. I only remember how to allow root with
key auth because I searched for it yesterday. "Not remembering how" is not the
problem at all. "Not knowing what" is.

~~~
dkarl
_Needing_ to remember is a sign of mismanagement, because if you have to
remember it or look it up in third-party docs you might pick a different
method than you did last time and create a subtle problem that might be time-
consuming to debug. If anyone recalls the old days of having a handful of
expensive servers that were configured at different times, often by different
people, this was a common source of inconvenience and outages. The right way
is to have it automated or at least recorded, for example in his presumably
version-controlled config management system.

Of course this creates the pitfalls of automation, but I think it's
unavoidable if you desire reliability and efficiency. Bypassing automation and
typing in commands from memory or from man files, even if it's a one-off
experiment with a fresh server, is going in the wrong direction to address the
pitfalls of automation. If you're worried about not understanding the
configuration commands your automation uses, a better way would be to call a
meeting with some junior engineers (or a rubber duck) and talk through the
configuration process line-by-line.

~~~
nurettin
We are talking about converting multiple languages of expression (say, iptables
rules and sshd config) into one (say, yaml or json) right? The whole point is
to homogenize those languages into a single one so the configurations can be
managed via a single program. (say chef, ansible)

And in the process of doing this, people stop remembering the original
languages and you are now stuck in config language land where you still have
to look up syntax on how to do things anyway.

So from your perspective, the problem is not "not remembering how" or "not
knowing how" it is "not organising processes" am I getting it correct here?
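
To make the "many languages into one" idea concrete, here is a toy Python sketch. The `HOST` layout and both render functions are invented for this example and are not how Chef or Ansible actually model things; only the emitted sshd_config keywords and the iptables command form are real.

```python
from typing import Dict, List

# Hypothetical host description in the "single language"; the key names
# ("ssh", "firewall", "allow_tcp_ports") are assumptions of this sketch.
HOST = {
    "ssh": {"PermitRootLogin": "prohibit-password", "PasswordAuthentication": "no"},
    "firewall": {"allow_tcp_ports": [22, 443]},
}


def render_sshd_config(ssh: Dict[str, str]) -> str:
    """Translate the ssh section into sshd_config 'Keyword value' lines."""
    return "".join(f"{keyword} {value}\n" for keyword, value in ssh.items())


def render_iptables(firewall: Dict[str, List[int]]) -> List[str]:
    """Translate the firewall section into iptables commands, one per port."""
    return [
        f"iptables -A INPUT -p tcp --dport {port} -j ACCEPT"
        for port in firewall["allow_tcp_ports"]
    ]
```

The person editing `HOST` never types the iptables or sshd_config syntax, which is exactly how the original languages fade from memory.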

~~~
dkarl
> So from your perspective, the problem is not "not remembering how" or "not
> knowing how" it is "not organising processes" am I getting it correct here?

From my perspective, he doesn't have a problem at all. I would even dispute
that he doesn't know the "how." If his servers are configured via chef, then
chef _is_ the how. Going deeper and looking at the OS-specific mechanisms chef
uses is necessary for debugging when the behavior doesn't seem to match the
documentation, just like if you're doing it by hand, iptables is the how, and
the system calls or library calls it makes are what you'd look at if you
couldn't understand its behavior from the man page and other docs.

In other words, you shouldn't worry if you're forgetting something because you
never need to use it. When you need to use it, then you will learn it again
and count it as an advantage that you did know it at some point in the past.

------
altendo
The thing is, memorization itself should -not- be the primary skill of a
sysadmin.

A sysadmin should know how to troubleshoot, how to use tools, and (more
importantly) how to learn how to use the tools.

I don't want to discount the value of someone who still remembers how to
handle seemingly arcane tasks like iptables. Experience is worth its weight in
gold, and memory definitely contributes to its value, but there's more to it
than that. It is the capacity and willingness to learn in any situation,
however, where a talented sysadmin really shines.

------
marsrover
I think it’s ok to forget syntax and the like as long as you still understand
the concepts.

If this guy had to work on a server for a week without any config manager, I’m
sure he’d be right back at home.

As opposed to a guy who has never touched iptables. That guy is in it for the
long haul.

Maybe a better title is “I forgot the commands (and some directory locations) I
use to manage a server.”

------
teekert
I don't know what chef or puppet or that sort of script looks like, but my
administration scripts are littered with: #Here I do A because B didn't work
and we need it for C. Or in fstab: # This is the disk I bought together with
father. Or in crontab: # This syntax looks weird but otherwise it does not
select the correct python env.

(I'm just a home-server admin ;) but that also means significant stuff only
happens every 3-5 years at a new LTS or with some major problem, so notes to
self are important.)

Such notes to self not only help when things go wrong, they also make for easy
repetition of stuff years later. When I reinstall, I first back up the entire
old root with all configs and thus all my notes (I could limit myself to /etc
probably).

~~~
pbhjpbhj
I'm primarily a home admin, I keep all `history`, and have a little search
script (grep really), that helps a lot.
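
Something along these lines, as a Python sketch rather than the actual script (the function names are made up, and the history file location is whatever your shell uses, e.g. `~/.bash_history` for bash):

```python
import re
from pathlib import Path
from typing import Iterable, List


def search_history(lines: Iterable[str], pattern: str) -> List[str]:
    """Return the unique commands matching pattern, oldest first."""
    regex = re.compile(pattern, re.IGNORECASE)
    seen = set()
    hits = []
    for line in lines:
        command = line.strip()
        if command and command not in seen and regex.search(command):
            seen.add(command)
            hits.append(command)
    return hits


def search_history_file(path: Path, pattern: str) -> List[str]:
    """Search a saved history file; the path is supplied by the caller."""
    # errors="replace" because shell history can contain stray bytes
    with path.open(encoding="utf-8", errors="replace") as handle:
        return search_history(handle, pattern)
```

For example, `search_history_file(Path.home() / ".bash_history", "ufw")` lists every distinct `ufw` command that was ever typed.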

~~~
teekert
Yeh, I guess the time gaps are so large that you have to. Still, I guess from
time to time one also needs to update their orchestration scripts... Tbh,
about the ssh, I know it's in /etc, then I hit ssh, tab tab / tab tab, ah there
is sshd_config, let's search for "root". I also don't really know it by heart.

------
exelius
I would simply argue that traditional “system administration” is now the
responsibility of development teams rather than ops teams (yeah yeah DevOps,
but the “ops” part has changed).

Ops today is largely responsible for maintaining build / deploy pipelines,
orchestration systems, and ensuring SLAs are met via SRE activities.

Most functioning DevOps teams I’ve worked with recently have added a more
generalist role for a person who is a mile wide and a foot deep. It’s more of
a hybrid sysadmin / development skill set ranging across base OS and package
management, logging, scripting / automation, networking, access control,
security, a dozen programming languages and whatever ITIL / EA platforms you
have to interface with. These folks are a godsend in issue resolution as they
know where the skeletons are buried. They also can pinpoint your top 5 tech
debt issues off the top of their head.

The best version of this person also has some BA skills — they work really
well as a demand management / intake person because they usually understand
the end-to-end architecture — especially the code behind the integration
interfaces — better than anyone on the team. They allow developers to focus on
code, ops people to focus on production ops, and architects to focus at the
right level of abstraction.

------
lmm
This seems no different from forgetting how to write assembly by hand because
you've been using C for so long.

~~~
pritambaral
C is a standard, approved by standards bodies, and with multiple
implementations on multiple platforms pledged to be supported for ages.

Someone's Puppet and Ansible scripts for some tasks are no comparison to C the
standard, and do not come with the breadth, stability, or support of C.

~~~
ivanbakel
What does that have to do with the analogy? Nobody compared C to Puppet.

The point was that it's common to forget about details on a different level of
abstraction to the one you normally operate. Doesn't matter if it's an ANSI
standard or a hand-rolled DSL for a very niche use case.

~~~
pritambaral
I meant, C replacing assembly in the skill sets of the vast majority of us is
still okay because it's widely supported, unlike many — as you put it — hand-
rolled DSLs replacing few standard ways of configuring things.

~~~
lmm
There was never any standard though. sshd_config is one hand-rolled DSL,
smb.conf is another, httpd.conf is another...

~~~
pritambaral
But there is only one sshd_config with the exact same manual across all the
servers across all companies that use it (distro and version differences
notwithstanding). If you know what the PasswordAuthentication option in
sshd_config means at one company, you know what it means at any other company,
unlike each one's way of maintaining the file.
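
A small Python sketch of why that knowledge transfers: the file format itself is simple and documented. The sample text and `effective_value` are invented for illustration; the two rules it relies on (keywords are case-insensitive, and the first value obtained for a keyword wins) come from the sshd_config manual. Match blocks and Include directives are deliberately ignored here, so on a real server `sshd -T` is the better way to see the effective settings.

```python
from typing import Optional

# Sample text for illustration only; not taken from any real server.
SAMPLE = """\
# Authentication
PermitRootLogin prohibit-password
passwordauthentication no
PasswordAuthentication yes
"""


def effective_value(config_text: str, keyword: str) -> Optional[str]:
    """Return the first value set for keyword, or None if it is not set.

    Simplification (an assumption of this sketch): parsing stops at the
    first Match line, and Include directives are not followed.
    """
    wanted = keyword.lower()  # keywords are case-insensitive
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no settings
        parts = line.split(None, 1)
        name = parts[0].lower()
        if name == "match":
            break
        if name == wanted:
            return parts[1].strip() if len(parts) > 1 else ""
    return None
```

With the sample above, `PasswordAuthentication` resolves to `no`, because the later `yes` line comes second and is ignored.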

------
mruts
I’m not a sysadmin, but I’m remarkably bad at remembering boring things I
don’t care about. I generally forget things like how to create systemd units
or sshd config options a couple minutes after completing the task.

I don’t think it’s a big problem nowadays. As long as you know how to quickly
get the information you need, it’s probably not that much more inefficient.

~~~
terlisimo
Agreed.

90% of stuff I don't know off the top of my head I do like this:

1) connect to a server/repository which already uses that in some form

2) copy/paste relevant parts.

3) add new parts by using existing parts as a reference

4) fill in holes with man page/docs/internet search engine

If I had to start from scratch I'd probably fail the basic syntax with pretty
much anything.

The important thing is to know that you forgot implementation details, but
know where to look them up.

[https://www.xkcd.com/1168/](https://www.xkcd.com/1168/)

------
mauvehaus
"If you cannot already do the machine’s job by hand, the machine will outwit
you."

Raney Nelson [http://www.daedtoolworks.com/lounge-against-the-machine-
daed...](http://www.daedtoolworks.com/lounge-against-the-machine-daedlab-
commandments-1-and-2/)

------
redwood
Most of us have forgotten how to write machine level code too

~~~
lostgame
Fortunately writing home brew for old consoles keeps my wits up about this.

~~~
_asummers
To be clear, you mean home brew as in beer, not the package manager, correct?

~~~
dark_ph0enix
S/He means writing his/her own video games/software for old consoles [1]

[1][https://en.wikipedia.org/wiki/Homebrew_(video_games)](https://en.wikipedia.org/wiki/Homebrew_\(video_games\))

------
rstuart4133
I wonder if it is something to do with puppet. I've never used it (I wrote my
own thingy before puppet existed), but I gather puppet comes with a whole pile
of configuration thingies out of the box. For example, maybe you don't have to
know how to edit the sudoers file to prevent password prompting because puppet
has a puppet way of doing it that edits the config file for you. I could well
imagine if you did things by editing puppet files for a few years you might
forget the underlying config file syntax, or even what config file ends up
being edited.

If that's true he'll pick it up soon enough. The muscle memory will start
flowing again.

If he wants something to panic over try moving to k8s. It effectively includes
a puppet-like thingy done a completely different way, and worse it builds a
very different computing model. Use that for 10 years, then try to re-adjust
your thinking back to a single server model that you maintain by editing
config files and you will be in for a rough ride.

------
opan
It could be handy to write down some of this stuff you don't do often in a big
org file (or plain text) for later reference. I find that some commands for
irssi (tui irc client) are hard to remember, so I started writing a local
irssiguide.txt that I look at or add to sometimes. I tend to add a new server
or channel to autojoin right away, so then if I have a new thing to add 6
months later I can't recall the exact syntax. I've been meaning to follow my
own advice in other areas. Taking notes as you learn things so that you can
relearn them later, or optionally show them to a friend, seems very useful. I
also sometimes get stuck on something and end up giving up and forgetting what
the problem was, meaning coming back to it later is that much harder. I'd like
to get into the habit of writing down problems I have as well. e.g. figuring
out how to mount the storage of an android phone easily

------
inflatableDodo
I rarely try and remember much more than there being a thing that does a
thing, and then skim the reference to get the syntax correct when I need it.
Things go in if I am using them regularly and go away again when I don't and I
don't worry about it or think that I have forgotten how to do a thing. I also
know that learning something again when it has disappeared almost completely,
often leads to a deeper understanding of the thing, possibly because new
connections have to be made before what is left of the original network gets
lit up again, so even complete bewilderment at something I used to do
regularly doesn't faze me all that much, I just get back on it in the
knowledge that I will discover things I missed the first time round.

------
naringas
this is why I don't focus on remembering how to do stuff

I focus on how to quickly learn (or re-learn) how to do it.

------
projectramo
What it used to mean to say someone was good at math was that they were very
good at quickly solving complicated math in their head. One documentary I saw
claimed that this is what they looked for at Oxford and Cambridge for their
math programs.

As soon as the calculator was invented (okay, and made cheap and portable),
this skill was useless. Then you had to set up problems so that the calculator
could solve them. (Similarly with the computer).

This is a natural and good progression. All the best practices should be
learned and mastered by people building the tool, and you should just be
proficient at the tool.

------
aNoob7000
This is an honest question for sysadmins. What do you do if you go to a
company that doesn't have Puppet or Chef? What do you say in an interview?

~~~
Hamuko
"I can improve the server architecture by automating a lot of the common
maintenance tasks with Puppet or Chef."

------
tw1010
I've forgotten how to write assembly, but that doesn't mean I can't make the
computer do the same things with higher order abstractions.

~~~
rgoulter
Right. But while I think the important point is about abstractions, it's still
valuable to understand-about (even if not "remember in detail how to
manipulate") the next lower level abstraction.

Partly this is to fix things when the higher abstraction breaks.

But TFA's point is to manipulate the system without bringing in the cost of
the higher abstraction.

------
module0000
I've gone through similar scenarios... you hire employees with fantastic
experience which includes config management. Everything is optimal until low
level problems occur. A good example of this is software raid failing, and the
best playbook/manifest writer doesn't know where to begin (without google).

The solution is _really really simple_. Don't hire people without an RHCE, or
that haven't had one in the recent past. Feel free to substitute the RHCE for
a comparable certification, which are few and far between.

~~~
eropple
RHCEs, etc. don't strike me as adequate signaling. I've worked with RHCEs that
didn't have much in the way of RHEL administrative chops. I've worked with
people without them (and don't have one myself) who have solid downstack and
upstack proficiency.

This is one that I genuinely don't know how to solve; I do not know how you
evaluate somebody whose skillset you don't fully understand. But
certifications aren't the answer.

~~~
module0000
> I've worked with RHCEs that didn't have much in the way of RHEL
> administrative chops

I don't understand how this is possible. Did you validate their RHCE at
redhat.com? It's not a multiple choice test, it's real systems (not connected
to the internet, no google for you), and you don't pass if you can't
configure/repair them and the services required. The requirements aren't toy
examples, they tend to be complicated and involve things you want to avoid
due to difficulty. Most people (>50%) fail the exam, according to redhat's
statistics.

~~~
eropple
I didn't; I didn't hire them. Maybe they were lying. I sorta doubt it--the two
I'm thinking about seemed like decent people, just in over their heads. But
it's possible.

------
Svoka
Strange how by "server" the author means a virtual Droplet on DigitalOcean, not
an actual metal server.

~~~
the_fury
I'm not entirely sure why that's strange. It's been a common term for an
instance or VM for as long as I can remember.

------
lqet
> I had to Google the correct SSH config syntax to allow root logins, but only
> via public keys. I had to Google for iptables rule syntax and using ufw to
> manage them. I forgot where I had to place the configs for supervisor for
> running jobs, let alone how to write the config.

Are these actually things sysadmins know _without_ googling or peeking into
manpages?

------
wglb
Tools are good and useful, but don't forget the fundamentals.

------
xaduha
NixOS is much saner than your usual run of the mill distros.

------
therealmarv
Seems a good use case for going serverless (I know about vendor lock in).

~~~
pmlnr
Serverless email servers! Serverless DNS servers! Serverless web servers!

I believe you should see a problem there.

~~~
Cthulhu_
Serverless is a misnomer for this kind of thing; it was "-as-a-service" not long
ago. Nobody manages their own email or DNS servers anymore. Webservers, maybe,
but that too can be handled by a few hundred different services depending on
what is needed.

~~~
pmlnr
> Nobody manages their own email or DNS servers anymore

Thankfully you are quite wrong there; lots of us keep running the services
that were designed to be decentralized.

------
nulbyte
> I had to Google...

This is how I do it too, except replace Google with DuckDuckGo.

