
Ask HN: How do I learn how to become a good sysadmin? - nstart
Earlier today I read an article on the decline of proper sysadmins in the age of containers. I thought I was doing a decent job of being a sysadmin+developer at the place I work, but looking at the problems containers can cause I couldn't help but wonder how one should go about becoming a decent sysadmin.

The main scenario is the one where you as an individual might on your own (or with a couple of friends) start some tiny SaaS product, or build an open source product that you want to keep secure.

At the most basic level, when firing up a new server I follow the how to harden your server guides and install fail2ban, disable root login, enable ssh only login etc.

Things I don't fully understand: user permissions or rather, how to use permissions to ensure good security. I'm also unsure about logging. I'm essentially clueless about where to look to understand what goes into being a good sysadmin.

Any advice would be awesome. Thanks.
======
protomyth
Start with two mantras:

1\. I will know exactly what every command or script I run on a system I
control is supposed to do - no exceptions. If I don't and am just following
instructions, I really need to learn what it means and why. If you need to,
set up a test system and snapshot it before and after to see how things work.

2\. I will document a lot. Imagine some poor person showing up after you have
won the lottery (think happy thoughts, but watch out for buses just the same).
Don't just blindly put down the steps, put the why down. If I cannot write why I am
doing something then I need to think about it more.

The rest just flows from those. Learn to program, be a tool builder, find the
best way to learn and dive in, solve problems, and insist on consistent,
repeatable, backed-up, secure systems.

Do remember though: all your successes will be hidden in the darkness and all
your failures will be shown in the full light of day. It's not a fun gig at
times.

~~~
TheCowboy
This is good, and I'd add on to the part about the successes. Learn to
document your successes and be able to verbally communicate why anything you
do is important or useful to less technical users.

If you don't have a good boss who can see that you are good at what you do,
you will have to be able to speak up if you want to be paid what you are
worth.

You want to be a step up from a computer janitor who needs to be told what to
do, to being someone who delivers value to the business and helps people get
their jobs done, and can anticipate problems before they occur.

~~~
lighthawk
> Learn to document your successes and be able to verbally communicate why
> anything you do is important or useful to less technical users.

And on the software engineer/developer side of things the same applies. This
is why whenever I am given a self-assessment or asked to help with a review of
myself, I go back through my git log, email, etc. looking for what I've done
instead of just attempting to summarize based on memory. Then I keep a
personal copy of my self-assessment. That way, I have a record of what I did,
and so does my company. Wikis, file servers, and other document repositories
change, and when you switch jobs, you have that available to look at to update
your resume. If your company doesn't make you do at least annual and hopefully
quarterly self-assessments, you should do it on your own.

------
crypt1d
Sorry to disagree with a lot of posts here.

>At the most basic level, when firing up a new server I follow the how to
harden your server guides and install fail2ban, disable root login, enable ssh
only login etc.

This is simply wrong. There are two things wrong with this approach:

1\. If I understood correctly, you are essentially following random guides on
the internet about setting up your security rig. Not too far from downloading
untrusted binaries from the Internet.

2\. It shows that you do not understand the core issue and what the main
purpose of such tools is, how they function, etc.

This is a bad approach because it does not scale well. Security should be
built from ground up, not as icing on top. You have to interact with
developers during the build cycle and be the paranoid one. So, for example,
when somebody mentions FTP transfers you yell at them, instead of finding a
workaround (eg, setting up a box with FTP but filtering IP addresses with
iptables or whatever).

Security is an architectural issue, not the issue of which tools to install.

That being said about security, being a good sysadmin for me also means
striving for simplicity, documenting and standardizing everything and being
meticulous.

How do you learn it? From experience. Listen to your senior colleagues and
learn by doing. Also, RTFM (but seriously though, those tech notes are
important).

Disclaimer: Used to be a team leader of a large UNIX/Linux team at the big
blue. Now I do DevOps.

~~~
nstart
You are right that following blog posts blindly is a bad idea. And I feel
guilty that I didn't invest time to learn why I commented those lines in IP
conf (it's bad enough that I don't remember which files I changed). So I've
definitely got a place to start working on already.

The mentorship part is a lot more difficult. Good sysadmins are really
difficult to find in Sri Lanka. I've worked with a few companies so far. Some
examples.

1) One company I worked for didn't even have a policy of hashed passwords and
protection against SQL injection. They were developing major enterprise
software.

2) I worked as an internal systems developer for a non IT team within another
company. This was the one place I could have learnt the most at but the IT
team was this very opaque "don't tell people what exactly we do" kind of team.

3) One last example. This other company that I worked at, the sysadmin was
pretty good in keeping stuff up and running, but a lot of it was copy paste
scripts. I got what I could out of the person but I couldn't pull out much.

Where I'm from, the main cyber security body of the country gave blank looks
when asked about heartbleed at a conference held recently after the whole
thing exploded.

All that to just sum up why I turned to HN to seek out advice as to what
resources I should look at. A lot of threads on the net seem to veer more
towards "be a good communicator" and "know your system". While necessary, it's
a little too abstract for someone trying to find out what gaps exist and which
ones need filling ASAP.

Thanks a lot for the advice. I'll probably start reading up on all those files
I had to edit when hardening the server. That should provide a good starting
point.

~~~
bsbechtel
You should reach out to @arunoda
([https://twitter.com/arunoda](https://twitter.com/arunoda)), founder of
Meteor Hacks. He's one of the leading voices in deployment architecture for
the open source meteor.js project, and based in Sri Lanka.

~~~
nstart
I have actually. He's one of my heroes in SL. He works close to where I'll be
moving to soon (it's all in one IT park). Guy is very very humble. Very nice
to talk to him. Also one of the few all in believers of TDD :D

------
jpgvm
At the end of the day it's important to find a mentor or join a bigger company
where you can study under a more senior sysadmin.

These days things are getting more complicated, to be a good sysadmin you also
have to be a good developer. Usually as good as, and sometimes better than, the
people that write the apps that you will run, maintain and love long after
they have decided they have something more shiny they could be working on.

You need to be able to disassemble and fix programs built by other
developers, diagnose issues in many runtimes, understand kernel subsystems and
the various issues you can run into in kernel land.

You need in-depth knowledge of C, system calls and the behaviour of hypervisors
and hardware. Don't believe that running on EC2 and not programming in C gives
you the luxury of glossing over these fundamentals.

You will also need networking knowledge, even if you don't intend to run your
own networking equipment you will still need to understand things like the TCP
handshake, what an skb is and why that's important, understand the
differences between select() and epoll().

Learning to be a good sysadmin is relatively easy, learning how to be a great
sysadmin takes a good 10,000 hours.

~~~
mvanvoorden
I've been a sysadmin for 16 years, I have worked in big and small companies and I
don't agree with a great part of this post.

I'm not a developer, I have never needed to disassemble or fix programs built
by others, never needed to understand kernel subsystems or anything else
kernel related (except maybe how to replace a broken driver/module). I know
nothing of C, I know just the basics of system calls and I've never heard of
skb or select() and epoll().

I don't even like developing software, which is the reason I once joined the
sysadmin team at the company I worked for and fell in love with this job.

I see how these skills are a nice extra, but I understand that knowing too
much also creates the problem of having to work harder and stay more often
after hours. And saying no isn't an option because nobody else knows how to
handle the problems that may arise.

To me, a good sysadmin knows when to say no and spends the least effort in
fixing errors in code. The developers should provide good programs or it
doesn't get installed in production. Keep strict boundaries in what you do and
don't to prevent being abused by your employer. If you are often the last one
to leave because you're doing work that someone else should be doing, or are asked
to be on standby after hours while being the only sysadmin, step on the brakes
before this eventually burns you out or makes you hate your job (or even
hobby).

Also, knowing all of what parent describes + all the 'normal' sysadmin
knowledge is (eventually) impossible to keep up and will also take too much of
your free time. You will regret this later in life.

~~~
lamontcg
You are literally the problem with system administration.

The attitude that you don't need to know how to program or understand the
architecture of the systems that you run on is a highly privileged attitude
that you are a practitioner who can remain ignorant of their tools.

The good news is that kids these days are being raised on SREs and there's a
billion people in India who will take your job and won't have an attitude
about learning how to program. Ultimately, you are going to be a curious
dinosaur. You are a unique product of the late 90s Internet bubble where the
need for system administrators expanded so much faster than the available
supply and anyone halfway technically minded got hired for really good
salaries and put in charge of servers.

That is going to change and evolve.

And I spent 5-10 years learning all that knowledge back in the 90s and it's 15
years later and I can state with confidence that there are no regrets. If
anything my only regret is that I didn't learn how to really practice
"software development" as opposed to "programming" even earlier. Going
forwards, I expect that more and more the lines between systems and software are
going to blur and _MY_ advice to kids these days would be that they will only
be able to be ignorant of software development practices at their own peril.

And I started doing PC Tech work in the late 80s, became a Unix admin in the
mid-90s, managed the configuration management system and base configuration of
a site that grew to 30,000 linux servers in the 200Xs, and then after 15 years
switched to being a Software Developer. I'm fully confident based on
experience that you are offering absolutely terrible advice to someone who is
just learning, and you are out of touch with the direction of your own field.

~~~
dang
> You are literally the problem with system administration.

No personal attacks on HN, please.

------
falcolas
To use permissions for security, follow the principle of least access. If a
user or program doesn't need access to something, don't give it to them. User
permissions are the first tier of this; AppArmor and SELinux are the next (and
correspondingly more complex) tier.

For example, for a web stack which communicates exclusively over the network
stack, run your entire stack as individual non-root users. Nginx can run as the
nginx user, django/rails/node as a separate user which has no access to the
nginx configs. MySQL/PostgreSQL/Mongo as yet another user, which can't access
the previous two.

By default, config files should not be owned by the process reading them, they
should be part of the group (if they owned them, they could be re-written by
that process).
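
A minimal sketch of the ownership rule above, assuming only the classic Unix permission bits matter (ACLs, SELinux and AppArmor are ignored) and that the process is not root. The function names and the uid/gid numbers in the usage note are hypothetical, chosen only for illustration.

```python
import os
import stat


def _allowed(st_mode, st_uid, st_gid, proc_uid, proc_gids,
             user_bit, group_bit, other_bit):
    """Apply the classic Unix check: owner bits if the process owns the file,
    otherwise group bits if it is in the file's group, otherwise other bits."""
    if proc_uid == st_uid:
        return bool(st_mode & user_bit)
    if st_gid in proc_gids:
        return bool(st_mode & group_bit)
    return bool(st_mode & other_bit)


def process_can_read(st_mode, st_uid, st_gid, proc_uid, proc_gids):
    """True if a non-root process with these ids could read the file."""
    return _allowed(st_mode, st_uid, st_gid, proc_uid, proc_gids,
                    stat.S_IRUSR, stat.S_IRGRP, stat.S_IROTH)


def process_can_write(st_mode, st_uid, st_gid, proc_uid, proc_gids):
    """True if a non-root process with these ids could write the file.
    Note: write access to the containing directory would still let the
    process replace the file; that is not checked here."""
    return _allowed(st_mode, st_uid, st_gid, proc_uid, proc_gids,
                    stat.S_IWUSR, stat.S_IWGRP, stat.S_IWOTH)


def check_config(path, proc_uid, proc_gids):
    """Return (readable, writable) for a real file on disk."""
    st = os.stat(path)
    args = (st.st_mode, st.st_uid, st.st_gid, proc_uid, proc_gids)
    return process_can_read(*args), process_can_write(*args)
```

With a config owned by root (uid 0), group 101, mode `0o640`, a service running as uid 101 in group 101 can read the file but not rewrite it. If the same file were owned by uid 101, `process_can_write` would return `True`, which is the situation the paragraph above warns about.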

Logging, you have a few options. Syslog is simple, and re-directable, which
can help increase security. Writing to normal log files is perfectly
acceptable as well, just be sure to write to a program specific directory and
set up a logrotate config.
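
As a rough sketch of the "normal log files plus logrotate" option from the application side, Python's standard `logging.handlers.WatchedFileHandler` reopens its file when an external tool such as logrotate moves it away (Unix only). The `build_logger` helper, the log directory and the format string are assumptions for illustration, not anything prescribed above.

```python
import logging
import logging.handlers
import os


def build_logger(name, log_dir):
    """Log to <log_dir>/<name>.log, a program-specific directory.
    WatchedFileHandler notices when the file has been rotated by an
    external tool and reopens it, so no restart is needed."""
    os.makedirs(log_dir, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding duplicate handlers on repeat calls
        handler = logging.handlers.WatchedFileHandler(
            os.path.join(log_dir, name + ".log"))
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
    return logger
```

The syslog route would use `logging.handlers.SysLogHandler` instead; the rotation itself still lives in a logrotate config, which is not shown here.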

Also, set up monitoring and notification on everything you possibly can.
First, know when something has gone wrong before you're notified by your
customers. Then figure out how to know something will go wrong before it
actually goes wrong (like running out of disk space - this happens more often
than you might imagine).

There are plenty of articles on hardening linux, and sysadmin best practices
on the web - I've only outlined a few here. When acting in the sysadmin role,
be skeptical, paranoid, and value service uptime above everything. With this
attitude, you will be more prone to automate server setup, limit user access,
and less inclined to just throwing things out there because they're shiny.

The real fun is when you design and develop a way to safely and securely let
your fellow developers just throw code over the fence, because then they can
do their job in a way that suits them, without compromising your servers.

~~~
dozzie
> Also, set up monitoring and notification on everything you possibly can.

Notifying about everything is a very, very bad idea. You'll drown under
notifications that do not matter and you will certainly miss those that do
matter. There should be as few notifications as possible.

~~~
falcolas
I used to believe this. I was aggravated by the pager, and would happily
acknowledge problems in Nagios just to make the damned thing shut up. More and
more pages would come in, the volume increasing each day as "non-issues" piled
up.

The third time this behavior caused downtime for my clients, I wised up and
took my own advice of "uptime is king". I took the "every alert is actionable"
to heart, and took a few moments to realize that the action can also be
against the monitoring.

Alert for disk at 80% on a 2TB volume? Action: Verify growth in Graphite isn't
out of the ordinary, and increase the warning threshold to give you about 3-4
weeks of notice that you might need to get bigger disks.
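
The threshold arithmetic here can be sketched as follows, assuming roughly linear growth and a growth rate you have already read off your graphs. The function names and the sample numbers in the usage note are hypothetical.

```python
def days_until_full(total_bytes, used_bytes, growth_bytes_per_day):
    """Days of headroom left at the current (assumed linear) growth rate."""
    if growth_bytes_per_day <= 0:
        return float("inf")  # not growing, so it never fills up
    return (total_bytes - used_bytes) / growth_bytes_per_day


def warning_threshold_percent(total_bytes, growth_bytes_per_day, notice_days):
    """Usage percentage at which a warning gives about notice_days of lead
    time before the volume is full."""
    if growth_bytes_per_day <= 0:
        return 100.0
    headroom = growth_bytes_per_day * notice_days
    return max(0.0, 100.0 * (total_bytes - headroom) / total_bytes)
```

For a 2 TB volume (2,000,000,000,000 bytes) growing by 5 GB a day, `warning_threshold_percent(2_000_000_000_000, 5_000_000_000, 28)` gives 93.0, so a warning at 93% leaves about four weeks; at 80% used, `days_until_full` reports 80 days of headroom.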

Alert for an excessive number of 404 responses? Browse the nginx logs and
identify someone trying to hack your corporate-mandated WordPress install.
Verify they aren't getting any traction, and add exceptions to your 404's so
you don't alert on known (and non-whitelisted) endpoints they're hitting.

Alert for memory at 80%? What's consuming it; do I have a memory leak and need
to restart something? If all is well, and MySQL is just being greedy, up the
alert to 90%.

Disk capacity warning at 3am? Put some hours around the warning notification,
and add a separate "things are growing out of control" alert which doesn't
have hours.

API endpoint is not responding again, but it's not the system at fault? Add
the API developer to the notifications for that alert and remove yourself for
a week or two.

After an admittedly harrying week of this, you gain two things. One: an
operational understanding of what is going on in your system. Two: an alerting
system which is tuned to your use case, and which lets you know when you have
real problems. Remember - uptime is king.

------
jaimebuelta
I am not a sysadmin, so there is probably better specific advice.

But, from the point of view of a developer, the thing that I appreciate the
most in a fellow sysadmin is being calm and methodical at all times. Organised.
Having a plan and knowing what to do. Being ready for disasters.

Do we need to update a security patch on every single server? Ok, list of
servers, start with server one, finish with the last one, don't let any one
fall behind.

A server suddenly catches fire? No problem, remove it from the load balancer,
get a fire extinguisher, remove it, order another one, recover from backup.

Is there a problem on production? What could be wrong? Check logs, think a
little, then try to fix it. Do a postmortem and come up with improvements. Try not
to be bitten again by the same thing.

In my opinion, the core of good sysadmin work is to minimise risks and errors in
the stability of the system. Mistakes can (and will) happen, but the aim is to
make them only once. I think a big component of it is learning from battle
stories.

Processes and servers can fail, but the whole system should hold up.

~~~
nostalgiac
Learning to be calm and methodical as a system administrator is something that
comes with experience. It's a great way to tell experienced and non-
experienced sysadmins apart.

As you said, if a server catches fire, the newbie sysadmin will freak out and
start running around in circles screaming. The experienced sysadmin has seen
it 10x before and knows it's not a big issue - removes, extinguishes and
replaces it.

------
joatca
I was once asked at an interview to give my three most important rules of
sysadmin. This is what I said:

1\. Never enter a command if you don't understand what it does.

2\. Never make a change on a production system if you don't know how to undo
it.

3\. Get everything else wrong if you have to, but get the backups right.

I realize this paints with a broad brush but it's a good baseline.
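
Rule 2 can be sketched for the simplest case, a single config file, assuming a timestamped copy next to the original is an acceptable undo path. The helper names are hypothetical; `shutil.copy2` preserves the mode and timestamps but not ownership, and version control or configuration management would do this job more thoroughly.

```python
import shutil
import time


def backup_then_edit(path, new_text):
    """Copy the file aside before overwriting it; return the backup path."""
    backup = "{}.{}.bak".format(path, time.strftime("%Y%m%d-%H%M%S"))
    shutil.copy2(path, backup)  # keeps permission bits and timestamps
    with open(path, "w") as f:
        f.write(new_text)
    return backup


def undo(path, backup):
    """Restore the file from the backup taken by backup_then_edit."""
    shutil.copy2(backup, path)
```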

------
bluedino
Automation, documentation, and uninstallation.

A good sysadmin automates everything he can. It saves time, makes everything
uniform, and reduces errors (it can multiply errors but at least they're all
the same error).

A good sysadmin documents everything he does. Code, configurations,
everything. You want it to be easy for not just yourself to figure out what
you did, but your colleagues, customers, or your replacement.

A good sysadmin uninstalls programs that are no longer needed. He doesn't
leave 50 old or unused versions of scripts lying around. Not just to save
disk space or reduce system resources, but for security and to avoid
confusion.

------
cyberrodent
[http://www.opsschool.org/en/latest/](http://www.opsschool.org/en/latest/)

~~~
dccoolgai
Wow, I didn't know about this. Great resource, thanks for providing.

------
daxfohl
Not on Hacker News! Most of the people here (including myself) are developer
hacks that _think_ they can do sysadmin, which is probably worse than not
knowing it at all.

~~~
nstart
I dunno. The replies I've got have so far pointed me in a fairly consistent
direction that can't possibly leave me worse than where I am right now. What I
do from there will be up to me (and anyone else who benefits from this thread)
I guess.

------
mvanvoorden
F*cking up a lot is a good start ;)

When you don't understand things like user permissions, best is to make up
some scenarios and write them down on paper. It makes these things easier to
grasp, when you can strike through impossible options or for instance
visualize routes or outcomes.

Visualizing on paper also works very well for debugging networking. For
instance when a packet coming from the internet arrives at your gateway, how
does it travel from there to its final destination and how does it travel
back? Write down at every hop the port/subnet/netmask/gateway for both the
incoming and (if applicable) the outgoing interface. Works great for finding
connection issues.
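
The per-hop question on paper ("is the destination on this interface's subnet, or does it go to the gateway?") can also be checked with Python's standard `ipaddress` module. This is a sketch that assumes a single interface with one default gateway and no other routes; the function names are hypothetical and the addresses in the usage note come from the documentation ranges.

```python
import ipaddress


def next_hop(interface_cidr, gateway, destination):
    """Where a host with this interface sends a packet for destination:
    'direct' if it is on the local subnet, otherwise via the gateway."""
    iface = ipaddress.ip_interface(interface_cidr)
    dest = ipaddress.ip_address(destination)
    if dest in iface.network:
        return "direct"
    return "via " + gateway


def gateway_on_subnet(interface_cidr, gateway):
    """A gateway outside the interface's own subnet is a common cause of
    'no route' problems, so check it explicitly."""
    iface = ipaddress.ip_interface(interface_cidr)
    return ipaddress.ip_address(gateway) in iface.network
```

For an interface `192.0.2.10/24` with gateway `192.0.2.1`, a packet to `192.0.2.77` is delivered directly, while one to `198.51.100.5` goes via the gateway. Running the same check with the addresses written down at each hop shows where the return path breaks.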

Document your infrastructure. Documenting in detail can show design flaws that
weren't visible before. It also helps a lot to gain more insight in how
everything is connected and how it works together.

------
SpaceInvader
One resource I always recommend is the FreeBSD handbook, can be found here:
[https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/](https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/)

~~~
nstart
Thanks. Saving it to the reading list.

------
snorkel
Just because Hadoop's install process sucks that doesn't make you a bad
sysadmin. Back in the day when we were building our kernels most of us weren't
verifying download package sigs either. Containers aren't making us dumber
either.

A good sys admin just knows all of the working parts of a server and knows
about the latest tools. The only way to get better is hands on practice. Try
setting up your own Hadoop cluster from source and run a job through it, then
put Hive on it, then attempt to containerize a node ... which might not even work
but learning why that won't work becomes useful knowledge. Just stumble
through it and learn.

------
mtalantikite
Sysadmin is (unfortunately) a role that is on the decline, so from a pure
employability perspective I'd suggest you focus more on the dev side.

As for the skills, I'd suggest running a Linux distro as your personal,
everyday machine, not just a server you log into on AWS or DO every so often
to configure (which, also -- don't do that. You don't want snowflakes in your
environment). It'll force you to learn a lot about how the system actually
works.

Try out a distro that doesn't hold your hand so much -- for me it was Gentoo
in the early 2000s and Slackware before that. Always read the man pages. Learn
all the tools for performance profiling and get used to reading your logs.
Spend a lot of time learning how networking works -- maybe start with really
understanding iptables which will lead you into lots of other parts of the
networking stack. Read "The Design and Implementation of the FreeBSD OS" if
you're interested in Unix beyond Linux.

Ultimately it's a skill that you need to learn by doing. Just don't stop the
dev side of your life because as things continue to be automated and
abstracted away there are going to be fewer and fewer positions as a generic
sysadmin.

~~~
ocdtrekkie
Why would you say that the need for Sysadmins is on the decline? If anything,
I should think it would be greater than ever. I'm very curious here.

~~~
mtalantikite
It's not that sysadmins are going to disappear, it's just that with IaaS and
the automation tooling that's been developed in the past decade teams don't
need to be nearly as large. The role has also changed.

A few people can manage a deployment of a thousand server instances now fairly
easily (I've been on teams like that). A decade ago you'd be renting colo
space, racking/stacking yourself, managing your networks, swapping dead
hardware, and managing all the software that goes on top (I've also been on a
team like that). You'd need a large team dedicated to just ops and
sysadmining.

Hiring today is different. A sysadmin didn't necessarily need to know how to
code beyond some scripting with bash or perl. These days in order to manage
the complexity of large cloud systems you probably should be a solid developer
in addition to having a deep knowledge of systems. Or if you're a small
startup you'll probably have your devs work additionally on your
infrastructure or use a PaaS.

~~~
davidgerard
> It's not that sysadmins are going to disappear, it's just that with IaaS and
> the automation tooling that's been developed in the past decade teams don't
> need to be nearly as large. The role has also changed.

The sysadmin's role is to automate themselves out of a job: you should not
need to do anything twice.

For some reason, the job never disappears and new stuff keeps coming along.

(I do know one guy who successfully automated most of this job. He got bored
and got a new job.)

------
ephemer1c
I've been doing Linux for 16 years, no big deal.

Most valuable advice I can offer is: learn an editor and use it for
everything.

OpenSSH is boss for all networking issues.

Netfilter (iptables) once understood is used daily.

And something I call "The 2s Complement": know two of everything, two distros
(RPM and .deb), two shells (bash and zsh), two MTAs, two... you get the
picture.

In every environment you will be able to perform then.

Use zsh with GRML completion, saves so much time.

For hacking, read all Phrack issues.

Know shell scripting and at least one other interpreted language like Python,
Perl, PHP, Ruby.

Don't waste time on desktops, changing looks etc. I wasted so much time early
on customising everything, great fun but no ROI whatsoever.

But the most important thing is... how to think! That, my friend, takes a
long time and only you can construct your own effective thought processes and
algorithms.

Build terse mnemonics to aid in command options.

Oh yes, keep everything you code, script etc. for future reference and
learning and improving.

Learn to code and then learn to think in code when sysadmining systems.
Infrastructure as code.

Another thing, build scripts, configurations, solutions etc. as objects and
reuse those objects, eg. a rsyslog configuration stanza for a UDP input that
is portable and can be reused.

Sorry for waffling on I can think of so much more...

Start a blog and copy/paste your ideas and interesting work configs, code,
scripts etc. there.

------
pjungwir
If you want to learn to "think like a sysadmin", the book _Time Management for
System Administrators_ by Limoncelli is short, funny, and excellent---and
covers far more than just time management. That will give you some sense of
what to care about and why. It's very useful even as a dev or devops person.

If you are really serious about being a professional sysadmin, Limoncelli's
other book is considered outstanding---but it's huge and I haven't read it
yet. Again, it's a mix of non-technical goals and technical solutions. The
Nemeth book is how I started (though I'm not a sysadmin) and it's also quite
good, but more purely technical.

I wouldn't say to a beginner, "Know what every command does," but try to learn
a new thing every day. Keep notes. I like to write private man pages
([https://github.com/pjungwir/manpj](https://github.com/pjungwir/manpj)), but
do what works for you.

I think learning sysadmin skills mostly just takes time, so try to make each
task double as a learning opportunity. Sometimes this requires a lot of
digging, or reading, or "debugging". Here I probably agree with @protomyth:
the holy grail is _understanding_. If you strive to achieve that (and don't
just go with the first thing that appears to work) you will become better and
better.

While you're learning the nitty-gritty tech stuff, also try to keep in mind
the different priorities of a sysadmin over a developer. Both care about
reducing their own effort and annoyances. A sysadmin wants stability (no 3
a.m. downtime), automation (easy to deploy/update/scale), transparency
(monitoring, logging), auditability (logging), recoverability (failover,
backups), controllability (runbooks), security. Maybe a real sysadmin could
chime in with what I'm leaving out. :-)

~~~
davidgerard
> Time Management for System Administrators by Limoncelli

YES. Best comment here.

------
SimpleUser245
All of the mentioned points are very good (and applicable) to performing the
job properly. But for me it is a matter of curiosity. If I need a function to
work, I want to know how it works. If I need an application installed, I want
to know its underlying pieces, how it communicates, what dependencies it has,
etc. At some point you have to draw the line of course, as it is impossible to
know everything, but that line is generally easy to find. For example, I could
put you to sleep talking about how TLS handshakes work, but I have no interest or
desire in learning the math behind any used algorithms (although I do keep
abreast of security issues with ciphers, etc.). Learning new options (say the
proliferation of systemd in linux land) is always good, even if you ultimately
decide not to implement it at this point in time. So just keep learning,
always find new ways to do things, audit your own systems, etc.

------
dozzie
How to become a good sysadmin? Understand how your system works and do
everything to not interfere with it. This simple principle is the most
important for a sysadmin, all the rest is an implication of it.

How to use logging (shipping, processing), how to install software (packaging
systems), how one can separate services in different ways, how to
communicate with systems, how to work with resources, how to tell what's going
on in OS (what's running, what uses what, what subsystems are under how much
load) are all derivatives of "know your system and work with it instead of
around/against it".

------
mauvm
The opposite approach might also teach you a lot: try to hack your own server
(even better with the help of someone who knows a decent amount of hacking).
This won't ensure you'll fully secure your server, since you can't easily
reach the level of "real" hackers, but will give you a decent understanding of
how a hacker might approach attacking your system.

I myself am also a "pretend to be" sysadmin. Not hardcore in any way. However
I use Docker for encapsulating almost all the software components, basic
security (no-password ssh logins etc.), and most importantly: logging.

------
Arubis
There's a lot of good advice already in this thread, and I'd rather not just
repeat it all in new phrasing. So, read it and weigh it and take what you
like.

And then, when you're done, fire up a console window and unplug your mouse and
put it somewhere that's really annoying to get to. Live with this for a week.

This will force you to live your life in a terminal, which means that all
those little tasks become commands and scripts and configuration files. You
will not have a GUI and you will have to understand how to make stuff work
anyway.

Trust me, you will learn fast once you don't have a choice about it :)

------
rotten
You have to be organized and thorough. You are going to do a lot of stuff no
one appreciates or cares about unless you don't do it. (such as backups)

Organized and thorough and don't forget backups. You can never have too many
backups.

cd /bin and cd /sbin and cd /usr/sbin. Do you know what every command in those
directories does? No? Learn them. On some systems /sbin and /usr/sbin have
different files in them. Why? Try to figure that out on your own.
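
A small sketch for the /sbin versus /usr/sbin exercise, assuming you just want the names that appear in one directory and not the other. The function name is hypothetical. On distributions that have merged /usr, /sbin is a symlink to /usr/sbin and both lists come back empty, which is itself part of the answer.

```python
import os


def only_in_each(dir_a, dir_b):
    """Return (names only in dir_a, names only in dir_b), both sorted."""
    a = set(os.listdir(dir_a))
    b = set(os.listdir(dir_b))
    return sorted(a - b), sorted(b - a)
```

Calling `only_in_each("/sbin", "/usr/sbin")` gives you a reading list: every name in either result is a command whose man page is worth opening.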

cd $MANPATH, do you see more man pages that you didn't know existed? Read
them.

Signs of a poor administrator: System clocks are all different; not sure if
the backups work; don't know who everyone with an account on the system
is; don't know what the system does; system way out of date on patches; large
garbage files floating around; inconsistent and incomplete monitoring; don't
know where the server is; etc...

You don't have to be a top-notch system ENGINEER to be a good system
ADMINISTRATOR. It helps (a lot), but I think attention to detail,
thoroughness, and organization are core skills. Also, you are the interface
between the machine and the human world. You. Be prepared to deal with people.
Developers can get away with living with their heads buried in code. A System
Administrator cannot. 90% of the problems a System Administrator faces in
their job are not the technology, but the people trying to use it and manage it
and mess with it and own it.

------
davidgerard
A lot of the comments here are about technical detail. But knowing the
technical details is just a prerequisite.

Half the job is having a sense for technologies.

* You will have projects dropped on you that you literally never heard of 24 hours before, and you will be _expected_ to be able to support them. ... and, with some experience, you will.

* People will come to you with _great new technologies!_ ... that you'll get a sense of disaster about. You will need to articulate your concerns in a manner that doesn't piss off everyone invested in the bad idea.

* Be humble, don't turn into an expert beginner. (Look that up.)

* Find other sysadmins to drink and bitch about work with. Your geekosphere will keep you balanced, help your career and be a _vital_ source of info.

Politics is the other half of the job, even though you're expected to be a
consummate techie. Treat every email as a press communication, it'll be picked
over like one.

* This is not a technology festival, it's an organisation that does something. Always look from that angle.

* The BOFH stories are _stories_. Nobody, anywhere, actually wants to deal with the grumpy BOFH, in any circumstance.

I have been a sysadmin for 15 years. I fully expect to be gainfully employed
until age 100 if I want to, because _even in the future, nothing works_.

------
kokey
Experienced sysadmins know that it's very likely for a successful application
to be in place for well over 5 years. They will know this from having been the
person who has had to take over such systems from others, several times.

They will know that any proprietary code will have to be supportable even
after the developers are gone. The systems themselves will have to be
supportable by new sysadmins. The system will have to be able to receive
security patches and fixes. They will know all this will be needed for
multiple systems in the same company. They will know that many promising
technologies will come and go. They will know how much it sucks taking over
systems where these things weren't thought through. They will remember systems
they have left for others which did not have these things considered.
They will remember abandoning systems themselves that became unmanageable
because these things weren't considered.

Inexperienced developers and system administrators treat servers like new
desktops, with desktop focussed operating systems, new software being added to
it all the time, changes being made in an increasingly unrepeatable manner,
and the whole thing being replaced every 2 years.

------
Naery
I'm a Windows Server Administrator, have been for a very long time. And, if
you'll permit it, I'll say I'm a damn good one. I think there are two things
that have gotten me to where I am today: 1) Incessantly asking why, and 2)
voracious reading and research in my lab.

I acquired some servers a while back, then got some switches, then a router,
all enterprise-grade gear, and I set up some virtual servers, vswitches,
etc... Basically, I made a ridiculously complicated home lab. To do so, I
followed tutorials online, but at every step of the way, I asked "why is this
necessary, what does this do, why do I do it this way". That gave me a great
foundational understanding.

Then, I realized that certifications aren't restricted to a certain group of
people, _anyone_ can take them, so I started studying for certifications. The
idea was, these large governing bodies think these are the kinds of things
that experts should know, so if I learn these things, I should be an expert.
Right? Not quite, but reading up on each of the features, etc, that are
covered in the certification exams really expanded my horizons. Then of
course, I started asking "Why would I use that?" and things just snowballed. Hear
about something, ask why it is what it is and why people need/want it, then
read up on it.

My home lab has been one of my greatest learning tools. Everything I've read
about I have set up, installed, configured in my home lab. Then, I usually ask
my brother (who also does IT) to come break something in the lab. He of course
doesn't tell me what it is, and I have to go figure out what my "junior admin"
did wrong and repair it.

So, I guess the TL;DR version is this: You need three things: 1) A home lab.
2) The question Why. 3) Tons and tons of reading.

------
rlonstein
There's a lot of good information in this thread, but I can add this: join an
SA organization and/or special interest group:

    
    
    Usenix, https://www.usenix.org/
    LOPSA, https://www.lopsa.org/
    

Then read the journals, even the articles that don't interest you now, and
follow the email lists.

~~~
lwhalen
I'd like to add that LOPSA also has a mentorship program, free for all
members.

------
6t6t6
Maybe my advice is more abstract than what you are hoping for, but this is
what I learnt after 15 years of being a sysadmin.

\- The good sysadmin is not the one who does difficult things, but the one who
makes things easy. If you find yourself wanting to recompile a Linux
kernel, you are probably taking a really bad approach to the problem.

\- Document, document, document. Think about the bus factor. If a junior
sysadmin is able to rebuild and manage your infrastructure using your
documentation, you are doing things the right way.

\- Never take a step forward without a backup plan. If you are going to make
changes on the production servers, always have a plan B in case something goes
wrong. Running an `apt-get upgrade` and hoping that everything will be OK is
not a good policy.

\- Always remember that your job is to serve the other departments, so they
can do awesome things. If too many people in the company know your name, you
are doing something wrong.

------
mobiplayer
Aside from the technical side, a good sysadmin knows that the business has
priority over how cool this or that technology is.

So every time you fix something, you're not fixing a computer, you are fixing
a piece or tool of a business process. Changing your mindset will help you
prioritize the really important things.

------
mordae
> ... being a good sysadmin ... means striving for simplicity, documenting and
> standardizing everything and being meticulous.

This. And you learn by _not_ doing this and still having to maintain an
infrastructure. After a while you will either start, hang yourself or get
fired.

------
hobarrera
A lot of comments here already address how to get into the learning side, and
the theoretical side.

If you also want to get your hands dirty and learn some more that way (eg:
experience), get some VPS with BSD (my preference is OpenBSD) or some really
bare linux distro and set up your own email server (eg: OpenSMTPd), IM server,
etc there. You'll learn plenty on the way: about servers themselves, how
emailing works, management, etc, etc.

I learnt huge amounts doing this years ago, and have an extremely detailed
understanding of how email, anti-spam, validation, etc work.

Don't do this "instead of" grabbing books though, do it "on top of",
especially if you want to make a career out of it in future.

------
WestCoastJustin
Here's my brain dump on what goes into being a well rounded Sysadmin:

[https://sysadmincasts.com/episodes/25-bits-sysadmins-should-know](https://sysadmincasts.com/episodes/25-bits-sysadmins-should-know)

------
smutticus
Let's all take a moment to lament the sorry state of man pages in Linux.

------
antod
Lots of good advice about concrete skills mentioned here already - eg
scripting, monitoring, documenting, low level OS knowledge etc.

I reckon a good sysadmin also has a hint of nagging paranoia or slight sense
of impending doom and is never complacent about how well things are currently
running or how secure they seem.

On top of this they need a good instinct for anticipating problems and
evaluating risks so they can proactively fix problems before they arise.

Oh yeah - learn how DNS really works.

------
elwin
There's a lot of good advice already, but here's one tip:

Learn how to read documentation. Consult man pages and official documentation
before resorting to random people on websites. This is a skill that requires
practice, because a lot of the material is mediocre. Some writers give
overviews, some give examples, some list every feature. You may be more
comfortable with one kind, but learn how to digest each one and extract the
knowledge you need.

------
Ologn
The article mentions certain things, but _every_ piece of outside software run
is a potential security problem. Every piece of inside software is a potential
security problem for that matter. The authors don't even have to be malicious,
just careless. How many remote holes did Drupal, Joomla and Wordpress have? (A
lot)

Yes, the article is right that you should not just grab some random compiled
binary and throw it on your production server. It mentions Debian. I suspect
Red Hat and Suse have better solutions, as their customers demand it. Of
course they may not have an officially blessed package of something that was
released last month.

How to be a good sysadmin? For big installations there are production servers,
staging servers, development servers, and then often some unofficial
development servers. You control access to the production server, the
procedure to do releases is formalized. You update server firmware, OS updates
and package security updates. Do it regularly on staging, QA it, then do it on
production.

Most security break-ins I have seen are because a non-sysadmin, non-security
person is doing something they're not supposed to. They're running an
unauthorized server on their desktop not set up by the sysadmins, with a
glaring security hole. Or an outside consultant is careless about how they
connect to your systems, and someone breaks in through their account.

Maybe you're a sysadmin at a web site and you notice scripts trying to hack
web usernames and passwords. Your workload is high, and you bring this to the
attention of the head developers and management. No one cares, the business
logic management wants implemented in the short term is very high, there is no
time or budget for security. So you can either end your normal work at 6:15 PM
and stay another hour at work each day fixing the problem, or ignore it and go
home like everyone else.

I knew some people who were on the early tiger teams for the big accounting
firms. They told me their success rate was 100% - they managed to get in to
the company systems every time. They also mentioned they were at a
disadvantage, as they had to remain within the law (beyond the blessing of
management to probe security), while others doing so would not.

As for logging - syslog calls from programs go to syslogd. From there messages
can be sent to various places, including /var/log. You can tune facilities and
logging levels in the syslog configuration file. Under systemd it might be
different. Do you understand what I said in this paragraph? Good, you now know
more than 95% of the Unix sysadmins I've interviewed over the past 20 years. I
wish I was kidding.
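To make "facilities and logging levels" concrete: every syslog message carries a priority number computed as facility code times 8 plus severity code, and a classic syslog.conf selector such as `auth.warning` matches that facility at the named level or anything more severe. The sketch below uses the standard facility and severity numbering; the function names are illustrative, and the selector matcher is deliberately simplified (it ignores `=`, `!`, `none` and comma-separated lists that real syslog daemons accept).

```python
# Standard syslog facility and severity codes.
FACILITIES = {
    "kern": 0, "user": 1, "mail": 2, "daemon": 3, "auth": 4, "syslog": 5,
    "lpr": 6, "news": 7, "uucp": 8, "cron": 9, "authpriv": 10, "ftp": 11,
    **{f"local{i}": 16 + i for i in range(8)},
}
SEVERITIES = {
    "emerg": 0, "alert": 1, "crit": 2, "err": 3,
    "warning": 4, "notice": 5, "info": 6, "debug": 7,
}


def encode_pri(facility, severity):
    """Return the numeric priority sent in the <PRI> part of a message."""
    return FACILITIES[facility] * 8 + SEVERITIES[severity]


def decode_pri(pri):
    """Split a numeric priority back into (facility, severity) names."""
    facility_code, severity_code = divmod(pri, 8)
    facility = next(n for n, c in FACILITIES.items() if c == facility_code)
    severity = next(n for n, c in SEVERITIES.items() if c == severity_code)
    return facility, severity


def selector_matches(selector, facility, severity):
    """Simplified 'facility.level' match: that level or more severe.

    Lower severity numbers are MORE severe, hence the <= comparison.
    """
    sel_facility, sel_level = selector.split(".", 1)
    facility_ok = sel_facility in ("*", facility)
    return facility_ok and SEVERITIES[severity] <= SEVERITIES[sel_level]


print(encode_pri("auth", "warning"))                   # 4 * 8 + 4 = 36
print(selector_matches("auth.warning", "auth", "err"))   # True
print(selector_matches("auth.warning", "auth", "info"))  # False
```

So a rule like `auth.warning  /var/log/auth.log` would keep warnings and errors from the auth facility but drop its info and debug chatter.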

------
rogeryu
I think you are doing a decent job. However when you want to do the same for a
big(ger) company, or a bank or multinational like Shell or IBM, the
expectations go up. Then your setup will not be enough. I work like you, try
to learn, every day. I'm not a professional sysadmin and I don't think I would
get a job if I applied for one, but for lack of the real thing, I'm doing the
sysadmin job.

------
gauravgupta
I would recommend following Hackr's System Administration section. It's been
my starting point for most sysadmin learning I have had so far (even though I
am a software developer) -
[http://hackr.io/tutorials/system-administration](http://hackr.io/tutorials/system-administration)

------
ehershey
The content at opsschool.org has the potential to be a great resource -
[http://www.opsschool.org/en/latest/security_101.html](http://www.opsschool.org/en/latest/security_101.html)
\- but there's not a lot of "there" there. Maybe you'll like it more than I
do.

------
joshbaptiste
The best sysadmins I have seen understand the operating system from top to
bottom. The easiest way to understand an OS without being an OS developer is
using tracing tools to solve problems, such as DTrace under FreeBSD or Solaris
(Linux also has many tracing tools). I would also recommend watching the CS 162
series from Berkeley on YouTube.

------
emodendroket
It's a different discipline and you don't really need to be an expert in that
_and_ development; you should pick one. That's not to say you should be
totally ignorant of administration details, but, really, you could devote all
of your time to it if you wanted.

------
digitalsushi
My greybeard ISP unix boss in the late 90's taught me that if it's worth
getting sued over, it's worth sending your /var/log/auth.log to a printer that
can print line-by-line (a line printer).

------
daurnimator
FWIW, I'm part of an 'online hackerspace' 'hashbang.sh' where we try and teach
people to be better sysadmins.

[https://hashbang.sh/#!](https://hashbang.sh/#!)

------
dschiptsov
Learn how to make a basic autotools project (configure.ac, Makefile.am, etc)

Learn how FreeBSD's ports system (building packages from sources) works and why
it is so.

Then learn what the Red Hat Package Manager (RPM) is and how to make a source rpm.

Get enlightened.)

------
skywhopper
As a long-time sysadmin, my best advice for learning is to give yourself the
opportunity to break things. Constantly ask yourself questions about how
things work and then see if you can answer them. I wouldn't even begin to
start worrying about containers until you're more comfortable with the basics.

Some questions and projects to get you started:

Do you know how to start and stop services, how to tell what's running, how to
see what network activity is happening?

Try installing Apache or Nginx on an EC2 instance, then see if you can change
how it works. Change where it logs to, or change what port it listens on, or
change what directory it pulls files from. Can you change where it reads its
configuration from? What happens if you give it a bad configuration? Where
does it log errors? Can you break it and then fix it?

Spin up a new EC2 instance and try to see if you can lock yourself out of it
or otherwise disable it. What happens if you kill the SSH daemon? Or change
the port? Or delete your user's entry in /etc/shadow? What happens if you run
"rm -rf /"? What do all the fields in /etc/passwd do? What if you delete your
home directory? Or "export PATH="? What happens when you delete files that a
running program is using? What happens when you fill up the disk? When you run
too many processes? Can you make the thing crash? When something gets screwed
up, do you understand why?
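For the /etc/passwd question: each line has seven colon-separated fields. A minimal sketch of pulling one apart is below; the sample line and the `PasswdEntry` / `parse_passwd_line` names are made up for illustration.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    name: str      # login name
    password: str  # usually "x", meaning the hash lives in /etc/shadow
    uid: int       # numeric user ID
    gid: int       # numeric primary group ID
    gecos: str     # comment field, typically the full name
    home: str      # home directory
    shell: str     # login shell


def parse_passwd_line(line):
    """Split one /etc/passwd line into its seven named fields."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    name, password, uid, gid, gecos, home, shell = fields
    return PasswdEntry(name, password, int(uid), int(gid), gecos, home, shell)


# Hypothetical entry, not taken from a real system.
entry = parse_passwd_line("alice:x:1000:1000:Alice Example:/home/alice:/bin/bash")
print(entry.name, entry.uid, entry.home, entry.shell)
```

On a real system Python's standard `pwd` module (for example `pwd.getpwnam("root")`) reads the same information for you; parsing by hand is just a way to see what each field is.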

Run "ls /bin" and see how many of those commands you can explain. Pick
one you don't know how to use but which you've heard of and try to figure out
how to use it. Look at the man page, run the command with "-?" or "--help".
Play around with it till you feel comfortable. Then pick another one tomorrow
and do the same.
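If choosing "another one tomorrow" is the hard part, a few lines of script can do the choosing. This is only a sketch: `list_commands` and `pick_command` are hypothetical helper names, and treating "any execute bit set" as "is a command" is a simplifying assumption.

```python
import random
from pathlib import Path


def list_commands(directory):
    """Return sorted names of regular files with any execute bit set."""
    names = []
    for entry in Path(directory).iterdir():
        # is_file() follows symlinks, so symlinked commands count too;
        # broken symlinks report False and are skipped.
        if entry.is_file() and entry.stat().st_mode & 0o111:
            names.append(entry.name)
    return sorted(names)


def pick_command(directory, rng=None):
    """Pick one command name at random to study today.

    Raises IndexError if the directory contains no executables.
    """
    rng = rng or random.Random()
    return rng.choice(list_commands(directory))


if __name__ == "__main__":
    print("Today's command:", pick_command("/bin"))
```

Then read its man page and try it out on a machine you can afford to break.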

Run "ls /etc" and pick a file and try to find out what it's for. See if you
can do something interesting by changing the contents of the file. You might
need to reboot or restart a service. Tomorrow pick something else from /etc.

Do the same with /usr/bin, /usr/lib, /var/lib. Figure out where the files in
/var/log come from and how you can write to them and how you can change their
names and how many there are.

Set up two instances and see if you can get them to talk to each other. On EC2
each host has two network interfaces. Do you know how to find both IPs? Can
you set up MySQL on one, and connect to it from the other? Once you get that
to work, can you block it? Spin up a third instance and see if you can figure
out how to make MySQL accessible to one and not the other. Once you figure out
one way to do it, figure out another way. Can you get a service to listen on
one IP and not the other? Both?
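The "one IP and not the other" question comes down to which address a service binds to. A minimal sketch with Python's standard `socket` module follows; `open_listener` is a hypothetical helper, and port 0 simply asks the kernel for any free port.

```python
import socket


def open_listener(bind_address, port=0):
    """Open a TCP listening socket bound to one specific local address.

    Binding to "127.0.0.1" accepts only connections made to loopback.
    Binding to "0.0.0.0" accepts connections on every IPv4 address the
    host has. Binding to one interface's own IP limits it to that IP.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((bind_address, port))
    sock.listen()
    return sock


if __name__ == "__main__":
    listener = open_listener("127.0.0.1")
    print("listening on", listener.getsockname())
    listener.close()
```

Most daemons expose the same choice as a configuration option for the listen or bind address, which is one of the "ways to do it" alongside firewall rules and EC2 security groups.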

What's the difference between UDP and TCP? Set up NFS. Set up a RAID. Break a
RAID, rebuild it. Do you understand crontabs? syslog? Can you send mail to
yourself?

Once you're feeling more comfortable with the environment, and you start
actually fulfilling a sysadmin role, your basic philosophy should be:

* Expect anything and everything to fail.

* Trust nothing, even your own software and machines.

* Grant the least possible access to make things work. Developers and vendors will always ask for more, and if you are not pushing back against their requests, you are probably giving them too much.

* Always have a plan for rolling back a change.

* When something breaks and you fix it, don't stop working until you understand what went wrong and you take some steps to avoid it in the future.

* When you have a task that takes more than one step to complete that you do more than once, write a script to do it for you.
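As an illustration of "always have a plan for rolling back a change", here is a minimal sketch that copies a config file aside before overwriting it and restores the copy if a check fails. The `change_with_rollback` name, the `.bak` suffix and the `validate` callback are assumptions for the example, not a standard tool.

```python
import shutil
from pathlib import Path


def change_with_rollback(config_path, new_text, validate):
    """Write new_text to config_path, restoring the old file if validate fails.

    validate is any callable that takes the path and returns True when the
    new configuration is acceptable. An exception counts as a failure.
    Returns True if the change was kept, False if it was rolled back.
    """
    config_path = Path(config_path)
    backup_path = config_path.with_name(config_path.name + ".bak")
    shutil.copy2(config_path, backup_path)  # keep content and metadata
    config_path.write_text(new_text)
    try:
        ok = bool(validate(config_path))
    except Exception:
        ok = False
    if not ok:
        shutil.copy2(backup_path, config_path)  # roll back
    return ok
```

In practice `validate` might run the service's own config test and check the exit code. The point is that the rollback path exists and is exercised before you need it at 3 AM.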

That should get you started...

~~~
nstart
That... Was brilliant. My weekend plans seem to be pretty sorted. Thanks a lot
for this. I wish I could reply with something more than just thanks but
honestly, there's so much here. Just.. Thank you :)

~~~
skywhopper
Glad you got some good ideas. When you're just learning, if you get stuck on
something and get frustrated, just drop it and move on to something else.
Sometimes you just need to come back later. Sometimes you'll figure out the
solution while trying to solve or learn something else. Don't get hung up on
any particular problem at first.

Ultimately, good luck and have fun!

------
captn3m0
[https://serversforhackers.com/](https://serversforhackers.com/) launched
finally a while back, and I thought I'd share it here.

------
denysonique
The first step is: Install Gentoo

~~~
denysonique
Some of you may want to down vote this comment, however only those who ever
installed Gentoo know the deep meaning behind that sentence.

~~~
Fourkeys
Then perhaps you could elaborate on the sentence, seeing as the thread is
about helping the learning process of a sysadmin. Posting something that only
someone experienced in it would "understand" is entirely unhelpful and the
likely reason for any down votes.

------
mightymaike
Putting in a lot of hours. Especially when it comes down to debugging problems
created by end users. Making edits on servers that are in production. Every
sysadmin was fooling around in the beginning.

------
FabianBeiner
I'd suggest to crash servers while trying. :)

------
collyw
Don't take advice from Ubuntu forums.

------
cymetica
Learn to automate. Script.

------
hackuser
You'll learn the technical concepts and rubrics eventually on your own; here
is what I'd be thinking about to start:

1) It's great that you ask and seek to learn to do it right; that's a first
step many don't get past.

2) There are big differences between hacking on something at home and
professional system administration. Most important is cost: Your time is
expensive but system downtime can be incredibly costly. [1] You need to
anticipate and prevent the problem in the first place (you are the expert!),
have resources prepared in case of failure (including expert knowledge of the
system), and resolve it quickly. Also, you are paid as an expert to get
results that boost the bottom line. You can't spend a day fiddling around with
something and gratifying your curiosity.

3) There is a very wide range of knowledge and skill among sysadmins. You
don't need a license to do it; anyone can print "System Administrator" on
their business card. There are many, many ignorant hacks; lots of decent ones
who don't think beyond what they are told; and few true experts. Who you
surround yourself with will determine, to a great extent, where you fall in
that range. You probably will adopt their standards and you will learn their
way of doing things; you can spend your time learning either the knowledge and
techniques of the hacks or those of the experts.

4) Invest the time and effort to learn core technologies [2] exhaustively and
to learn best practices. Never shy away from difficult technical material;
push yourself to find the best sources and develop skill to understand complex
material. Most of what's on the Internet is bullsh-t from and for amateur
hackers, good enough for your home server. You can spend your career without
understanding much of what you are doing -- there's enough work out there for
the hacks too. Find the very best sources and people (see #3), and take the
time to learn from them. Learn something once and it pays off forever.

5) Learn to solve problems. Worry less about learning techniques (e.g., arcane
details of command line switches); anyone can look those up. The real
challenge is staring at a screen, seeing something that you have no idea about
(and which is not in books or on the Internet), and finding a way to solve it
(with all the time and other pressures mentioned above). To do that, you need
a model in your mind, a deep understanding, of the technologies and systems
involved (see #4).

6) Put yourself in situations where you will be in command of the situation
and with time in reserve for making major enhancements, learning more, and for
unexpected crises; not where you are struggling to keep up with a flood of
problems, frantically treading water (or slowly drowning). Also choose
situations where you are prepared to handle the worst-case scenarios: When the
sh-t hits the fan -- when all your plans and normal operations go to hell --
people will look to you to save them. Be their hero. (See #5.)

It can be a very intellectually stimulating and gratifying job. In every
field, good people and true experts are hard to find. Make yourself into one
and you will be in good shape.

\---------

[1] Consider the cost of 1,000 people not working (avg hourly rate * 1,000 *
hours), hours of orders missed, facilities shut down because your system is
the bottleneck, deadlines missed, angry customers, embarrassment to the
business and its executives, etc.
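The labor part of that formula is easy to put numbers on. A tiny worked example, where the $40 hourly rate and 3 hour outage are made-up figures chosen only to show the arithmetic:

```python
def downtime_labor_cost(people, avg_hourly_rate, hours):
    """Idle labor cost of an outage: avg hourly rate * people * hours."""
    return people * avg_hourly_rate * hours


# Hypothetical numbers: 1,000 people at $40/hour, idle for 3 hours.
print(downtime_labor_cost(1000, 40, 3))  # 120000
```

That is $120,000 before counting any of the missed orders, missed deadlines or angry customers listed above.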

[2] What the core technologies are for you will vary. You can't know
everything. Beware of investing time learning about technologies that change
quickly. TCP/IP probably will stick around for a while; the app-of-the-week,
maybe not.

