

Things You Should Be Doing With Your Server, But Probably Aren't - luminousbit
http://www.roundhousesupport.com/blog/9-things-you-should-be-doing-with-your-server-but-probably-arent

======
mmt
The part about log rotation strikes me as an anachronism, which, sadly, is
still applicable due to naive default configurations.

Disk _space_ is preposterously cheap, and even high-volume, Internet-scale
text logs are small and low-bandwidth[1].

The problem isn't that disks fill up, but, rather, that logs are ever written
somewhere that, if full, affects anything but the logging. The solution of a
separate filesystem for logs has been around perhaps longer than the OP
himself, but it only makes sense for a few standalone servers.

Otherwise, the solution of remote syslog has been around for almost as long.
If your logs are critical, you could even use two (for twice the price). Even
the days of questionable reliability and small message-size limits[2] are
pretty long gone with syslog-ng.
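As a rough sketch of how little config the remote approach takes in syslog-ng
(the hostname and source name here are made-up examples; check your version's
docs for exact driver names):

```
# client side: forward everything to a central log host over TCP
# (loghost.example.com and s_src are placeholders)
destination d_remote { tcp("loghost.example.com" port(514)); };
log { source(s_src); destination(d_remote); };
```

Adding a second destination block pointed at a second log host is all the
"twice the price" option requires.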

This kind of thing, otherwise tedious minutiae, is second nature to sysadmins.
Hire one.

[1] It's the disk _bandwidth_ or throughput which is still expensive

[2] That's on the order of 500 characters, old text pager lengths, not the
newfangled 140 character sms/twitter cheapness. Hey, you kids get off my lawn!

~~~
theBobMcCormick
Log rotation isn't just about disk space (although why you'd want to waste
gigs of disk space storing old log messages locally is beyond me); it also
helps segment logs for easier and quicker searching and reporting, either
manually or with scripts.

Not to mention it's pretty much automatic these days. It's not like it's
something you've got to go out of your way to configure.
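For reference, a stock logrotate stanza looks roughly like this (the path,
counts, and options are an assumed example, not any particular distro's
default):

```
# /etc/logrotate.d/example -- hypothetical app logs
/var/log/example/*.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}
```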

~~~
mmt
_Log rotation isn't just about disk space_

I disagree, since, otherwise, there's nothing rotary. If one has effectively
infinite disk space, nothing ever gets "rotated out," for example.

Certainly, log segmentation provides benefits beyond being a prerequisite to
rotation, but, as you point out, it's a common default configuration. An
inbuilt feature of syslog-ng is log file naming based on date, obviating any
post-processing.
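Concretely, that inbuilt feature is just macro expansion in the destination
path (a minimal sketch; the directory layout is an assumption):

```
# syslog-ng expands macros per message, so files segment by date on their own
destination d_dated {
    file("/var/log/$HOST/$YEAR-$MONTH-$DAY.log");
};
```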

------
hopeless
This is a great reminder why I moved to Heroku ;-)

(the points re. backup and monitoring do still apply, but at least it's only
for your app and not the whole server)

------
truebosko
Does anyone have any thoughts on using config software like Puppet or Chef?
I've been having to deploy a few more servers lately and this is exactly what
I may need, curious to hear of experiences.

~~~
WALoeIII
Do it, and most importantly: don't "cheat." Make sure your machine is 100%
described by your tool. The closer you are to 100% without actually being at
100%, the more likely you are to forget that single tweak.

Chef vs Puppet is like Emacs vs Vi. I use Puppet because it was the only game
in town when I started, but I know a bunch of the Chef guys (hi Adam!) and
they know what's up. The tools have different approaches but are rapidly
converging on the same feature set; pick a simple role (say a webserver) and
do the entire config in both tools and you'll know almost instantly which one
matches your mindset better. I like to think Chef is a bit closer to the
shell (things happen in order and you can rely on that), whereas Puppet is a
bit more abstract and doesn't require you to micro-manage it as much. The
golden truth is somewhere in the middle; Puppet, for example, just added some
features to make ordering more consistent, since people have really struggled
with that in the past.

Chef is pure Ruby, so if you know Ruby you'll have a nice head-start. Puppet
has the Puppet DSL, which, if you are a sysadmin, jibes well with config-style
syntaxes. Puppet just released a pure Ruby DSL as well, but it doesn't have
100% feature parity with the Puppet Language.
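To give a flavor of the Puppet DSL's declarative, config-style feel (a hedged
sketch; the package/service names are generic examples):

```
# Puppet DSL: declare desired state; Puppet works out how to get there
package { 'nginx':
  ensure => installed,
}
service { 'nginx':
  ensure  => running,
  enable  => true,
  require => Package['nginx'],
}
```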

Provisioning and Bootstrapping machines is still a black art. I use a Ruby
script that sets some environment variables, then launches a first run of
Puppet; after that, Puppet takes over and manages. There is no silver bullet
(yet) for getting a machine into your Puppet/Chef cluster, though there are a
lot of innovative approaches that look very promising. Depending on your
environment (EC2 vs, say, Slicehost) I'm sure you can find a blog post that
will get you going in the right direction. Here is mine:
<http://blog.onehub.com/posts/coordinating-the-onehub-cloud>. This is already
a bit out of date; we're migrating from iClassify to the officially blessed
Puppet Dashboard (<http://www.puppetlabs.com/puppet/related-projects/dashboard/>);
we just need to put some more time in to finish.

Testing sucks. You will need to come up with some repeatable environment you
can run your configuration against. I use VMWare and just continuously restore
to a snapshot. There has been some good development here, especially with
Vagrant (<http://vagrantup.com/>). This lets you write a shell script (or rake
task in my case) that boots an instance, runs the config and then pauses for
you to inspect. From here you can push another button and 'retry' the whole
thing or destroy the VM and start over. It is annoying to get this going, and
you'll be tempted to just start on one of your machines, but that is a bad
idea. This tool is going to run as root, and after the first day it is going
to run with minimal supervision. You need to be sure it's doing what you
expect.
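The cycle described above boils down to a few commands (a sketch only; it
assumes Vagrant is installed and a Vagrantfile already points at your base box
and provisioner):

```
vagrant up          # boot a clean VM and run the config against it
# ...inspect the result over SSH, then either retry or...
vagrant destroy -f  # throw the VM away; the next run starts from scratch
```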

~~~
mmt
_Provisioning and Bootstrapping machines is still a black art._

At the risk of appearing to build a strawman, I'm purposefully taking this
quote slightly out of context, because it strikes me as a far more general
belief, one used to justify these CM systems as a solution. Herein lies the
danger of circular reasoning, or of a self-imposed problem.

Since I'm an open-minded sysadmin, I always keep an eye on the likes of
Puppet[1], but I have continued to reject them. Much of it has to do with
philosophy: use what I can that's tried and true. Very nearly all the right
pieces for provisioning and configuration management already exist[2].

 _Make sure your machine is 100% described by your tool._

This, sadly, smacks of perfectionism, which is known as the enemy of Good
Enough, yet I agree that these tools demand it.

The vast majority of this kind of description is already done by the OS via
the package system and init/upstart[3]. To duplicate this kind of description
with a separate tool is, to me, incomprehensible.

What's more, for at least the past 5 years, brand name server hardware[4] has
had, in the BMC, without any special add-on cards or option keys, enough IPMI-
over-LAN support that one can, over the same physical ethernet as one of the
regular network interfaces, set the next boot to be PXE and trigger a reset.
From that point, a fully functioning server can be up in 5-10 minutes[5].
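Concretely, the trigger is a pair of ipmitool invocations (shown as a sketch;
the BMC hostname and credentials are placeholders):

```
# tell the BMC the next boot should be PXE, then reset the box
ipmitool -I lanplus -H bmc.example.com -U admin -P secret chassis bootdev pxe
ipmitool -I lanplus -H bmc.example.com -U admin -P secret chassis power reset
```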

With those kinds of provisioning times, why would I want to bother with
something that requires the extra step of "black art" bootstrapping[6]? At
most extreme, to make a configuration change on a running system, I'd just
need to trigger the installation of a new package version on the relevant
systems.

The best part of such a scheme is that I don't need to make any further
customization choices, like puppet vs. chef. All the infrastructure I need
(DHCP, DNS, TFTP, kickstart or debian-installer, local mirror/repo) is a Good
Idea to have anyway, and it's all standard. I would expect any moderately
experienced sysadmin to be able to debug all those pieces, without learning a
DSL or a particular system's quirks. One also benefits from years of evolution
of such tools, including "free" redundancy and pre-existing plugins for
monitoring.
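As a sketch of how little glue the DHCP+TFTP side takes, dnsmasq alone can
cover it (the subnet range and tftp root here are assumptions for
illustration):

```
# dnsmasq: hand out addresses and point PXE clients at the bootloader
dhcp-range=192.168.1.100,192.168.1.200,12h
dhcp-boot=pxelinux.0
enable-tftp
tftp-root=/srv/tftp
```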

The only thing that's left is some kind of higher-level templating, which can
be added as a wrapper around all of the standard things. So far, the only tool
I've found that doesn't want to take over everything all at once (and works
fine with incremental takeover/integration of the underlying tools) is Cobbler
[7].

Not all problems can be solved with (custom) software.

[1] It was this month's BayLISA topic.

[2] Growing up with parents in the semiconductor industry, my exposures to
Unix (and VMS, TOPS, and VM/CMS, none of which "stuck") and Lego were around
the same time, so there's a deep-seated analogy there.

[3] which are, of course, configured by files which can be contained in
packages, so, really, just the package system.

[4] that is, any rackmount server which can be ordered with a cable management
arm. That there is such a differentiating factor belies the notion of
"commodity" hardware. I find it to be merely a euphemism for "lowest common
denominator" hardware.

[5] I've observed this scale easily, with no slowdown, to 30 clients against
one sub-$1k boot/repo server.

[6] "Because we use cloud providers" is a weak answer, since, besides being a
self-imposed problem with other unique issues, it gets remarkably expensive
beyond a few dozen (if that) instances.

[7] When I last dove into it a couple years ago, it was clearly focused on
kickstart and the Redhat/Fedora world, with Debian/Ubuntu barely an
afterthought.

~~~
Goladus
_With those kinds of provisioning times, why would I want to bother with
something that requires the extra step of "black art" bootstrapping[6]? At
most extreme, to make a configuration change on a running system, I'd just
need to trigger the installation of a new package version on the relevant
systems._

In my experience, "making a configuration change on a running system" is not
an extreme case. It happens all the time, and 5-10 minutes of downtime for
reboot just to add a comment to an apache configuration file is insanity.
Especially if there's a problem and you need to rollback.

Frankly, if you are booting machines in 5-10 minutes with all the packages
they need, you've almost entirely solved the bootstrapping problem anyway.
There are just a few security bits left.

edit: I did a quick check of our subversion repository; it looks like our
commits per month are in the 80-100 range. Systems get reconfigured, at least
in minor ways, a LOT. Installing a new version of a package for every single
change would be a huge amount of overhead, far more than the one-time overhead
of bootstrapping cfengine. In fact we could do it that way, if we wanted, but
we don't.

~~~
mmt
 _It happens all the time, and 5-10 minutes of downtime for reboot just to add a comment to an apache configuration file is insanity._

Agreed, since that comment doesn't warrant any deployment at all, but that's a
strawman.

 _Especially if there's a problem and you need to rollback_

This is, of course, a philosophical difference. How can one be sure that the
rollback actually results in the previous state? For me, the certainty
outweighs the speed increase.

 _Installing a new version of a package for every single change would be a
huge amount of overhead_

My reaction to this is that you must be doing something vastly different to
install a package. Unless you're talking about an environment with just a
handful of servers (in which case, why even bother with CM?), the overhead of
building a package of text files would be less than that of checking them out
from version control.
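For what it's worth, staging a package of text files really is just a few
commands on a Debian-family system (the package name, version, and file
contents below are made-up examples):

```shell
# stage a trivial versioned package of config files for dpkg-deb
mkdir -p myapp-config_1.0.2/etc/myapp myapp-config_1.0.2/DEBIAN
echo 'loglevel = info' > myapp-config_1.0.2/etc/myapp/app.conf
cat > myapp-config_1.0.2/DEBIAN/control <<'EOF'
Package: myapp-config
Version: 1.0.2
Architecture: all
Maintainer: ops@example.com
Description: site configuration files
EOF
# then, on a box with dpkg-deb: dpkg-deb --build myapp-config_1.0.2
```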

------
Herald_MJ
Backups are all well and good, but really they aren't going to save you unless
you are also doing frequent _test restores_. Only when you've acted out your
worst-case scenario (with a set of virtual machines and a recent set of
backups) and managed to restore the system without problem do you really know
that your backups are adequate.

~~~
steveklabnik
... which is why the article mentions them directly after backups.

~~~
Herald_MJ
... no, the article mentions checking the backups are being made, that the
processes you put in place are actually working. This says nothing about how
useful and effective your backups actually are in a worst-case scenario. For
this, you need to be doing test restores, which aren't mentioned in the
article.

------
lazyant
Alternative server security checklist: <http://watsec.com/article/50>

------
snitko
Nice checklist. I will use it sometime.

