

Confessions of an IT pro: My biggest professional blunders - edw519
http://articles.techrepublic.com.com/5100-10878_11-6093189.html

======
bayareaguy
_#4: Taking backups for granted._

Everywhere I've ever worked I've encountered some variation of this. One
recent customer had some MySQL problems which took down the sole MediaWiki
server holding all their IT documentation. Not only did they have no standby,
not a single backup or dump had ever been taken.

These days what I find even more astonishing is how even experienced
professionals make the mistake of thinking they don't need separate backups
because their system has some integral disk redundancy (e.g.
<http://www.joyent.com/joyeurblog/2008/01/22/bingodisk-and-strongspace-what-happened/>).

~~~
gaius
Another true story: A while ago I was DBA on a very important database (as in,
the company would likely go out of business if it was lost). The way it was
backed up was to quiesce the standby, take a snapshot of its filesystems on
the SAN, then send those to tape while the standby was reenabled. Sounds good.
But what had happened was that the sysadmin, to speed it up, was running
several parallel backups, each one of which had a hard-coded list of files.

So as soon as one new datafile was added to the database (I estimate, a week
after that regime was put in place) all subsequent backups were invalid. We
were in that situation for over a year _and everyone believed that the
database was being properly backed up_. We would even occasionally restore a
file (picked from the backupset, hmm) and block-verify it. I only discovered
it when I was upgrading the system and I fixed it _real_ quick (dynamically
generating n equally-sized lists of files is trivial!). Hard coding the list
of files was stupid, sure, and betrayed a fundamental lack of understanding
about how the system worked - but it was also stupid that no-one ever checked,
and that betrayed a lack of understanding of how organizations work.

The lesson I took away from that was, you have to have people who have
visibility of things end-to-end.
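
For what it's worth, the "dynamically generating n equally-sized lists of
files" part can be sketched in a few lines. This is only an illustration, not
the fix that was actually deployed: the function name, the `file_sizes`
mapping and the datafile names are all hypothetical, and it assumes the file
list and sizes are read fresh from the database or filesystem at backup time
rather than hard-coded.

```python
import heapq
from typing import Dict, List


def split_into_balanced_lists(file_sizes: Dict[str, int], n: int) -> List[List[str]]:
    """Partition files into n lists of roughly equal total size (greedy)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Heap entries are (total_bytes_so_far, list_index), so the list with the
    # smallest running total is always popped first.
    heap = [(0, i) for i in range(n)]
    heapq.heapify(heap)
    lists: List[List[str]] = [[] for _ in range(n)]
    # Place the largest files first so the small ones can even things out.
    for path, size in sorted(file_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        total, idx = heapq.heappop(heap)
        lists[idx].append(path)
        heapq.heappush(heap, (total + size, idx))
    return lists


# Hypothetical datafiles and sizes; in practice these would be discovered at
# backup time so that a newly added datafile is picked up automatically.
sizes = {"a.dbf": 10, "b.dbf": 8, "c.dbf": 7, "d.dbf": 5, "e.dbf": 2}
for i, chunk in enumerate(split_into_balanced_lists(sizes, 2)):
    print(i, chunk)
```

The greedy split is not guaranteed to be optimal, but because every file in
the input ends up in exactly one list, nothing can silently fall out of the
backup set the way it did with the hard-coded lists.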

~~~
sbhat7
This is why we blacklist databases/tables that are not to be backed up and
back up everything else. This ensures that any new database is automatically
included in the backup.
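
A minimal sketch of that idea, assuming the list of databases is queried from
the live server on every run (for MySQL, something like `SHOW DATABASES`). The
function and database names here are made up for illustration.

```python
from typing import Iterable, List, Set


def databases_to_back_up(all_databases: Iterable[str], excluded: Set[str]) -> List[str]:
    """Return every database that is not explicitly excluded, sorted by name."""
    return sorted(db for db in all_databases if db not in excluded)


# Hypothetical names. "new_app" was never mentioned in any config, but it is
# backed up anyway because only the exclusions are listed explicitly.
discovered = ["wiki", "scratch", "billing", "new_app"]
print(databases_to_back_up(discovered, excluded={"scratch"}))
```

The point of the design is that forgetting to update the config fails safe:
the worst case is backing up something you didn't need.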

------
projectileboy
A meta-lesson of many of these lessons was "if you can avoid it, don't work
with difficult people at crummy jobs". Lesson #8 in particular smelled of an
environment where people were actively seeking to be offended and unhappy (the
smell is oh-so-depressingly familiar).

~~~
gaius
Agreed. A couple of companies ago I knew someone who complained to HR that her
manager was bullying her. What happened? Well, he knew she was afraid of
heights, but still relocated her desk (along with everyone else in her team)
from the 1st to the 5th floor...

------
spudlyo
Accountability, or confessing when you make a mistake, is the mark of a
professional. I didn't learn that lesson right away, but I'm glad I finally
did. Your manager or customers might not be happy that you made the mistake,
but they'll always respect and appreciate you for being accountable.

~~~
jlees
Yeah, I don't want to begin to imagine how awkward some of those situations
could have become had the author _not_ just said 'sorry, I screwed up'.

~~~
spudlyo
Getting caught trying to cover up a big mistake is usually a resume-changing
event. As trite as it may sound, the cover-up is usually worse than the crime.

~~~
billswift
Especially when it's not a crime, just a mistake.

------
dmfdmf
The backup thing is critical... I once had a client whose tape backup drive
died. He was out of town, I was busy with other clients and it was a Friday. I
said let's deal with it on Monday. Monday comes around and the secretary calls
me and says the server is locked up. I walk her through a hard reboot... then
hard disk DOA, as in "no bootable drive found", no partition, no files. My
heart sank. Even worse, the tape drive that died had not written a usable
backup in weeks, even though the logs reported no errors. The best tape we had
was 2 months old, not much use but better than nothing. Fortunately, the data
recovery firm was able to get the data off the old hard disk but it cost the
company about $3K.

Another good rule... if you inherit a new customer you need to "trust but
verify" the previous tech's work. I warn my new clients about the added costs
of checking and learning a new setup. I learned this one the hard way when a
new client had a decent tape backup program in place but I never did a test
restore. One day they needed some deleted files restored and were shocked to
learn that the previous tech had inadvertently selected the option to back up
the directory structure only! No files in piles of tapes, just folders. Yikes!
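
A test restore doesn't have to be elaborate. As a rough sketch, assuming you
have already restored a backup into a scratch directory with whatever tool the
site uses, something like the following compares it against the live data and
would catch a folders-only backup. The function names and directory layout are
hypothetical, and files that changed since the backup ran will legitimately
show up as differing.

```python
import hashlib
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """SHA-256 of a file's contents, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> List[str]:
    """Compare a test restore against the source; return a list of problems."""
    problems: List[str] = []
    source_files = sorted(p for p in source_dir.rglob("*") if p.is_file())
    if not source_files:
        problems.append("source contains no files to compare")
    for src in source_files:
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            # A directory-structure-only backup ends up here for every file.
            problems.append(f"missing from restore: {rel.as_posix()}")
        elif file_digest(src) != file_digest(restored):
            problems.append(f"content differs: {rel.as_posix()}")
    return problems
```

An empty list back from `verify_restore` is the only result that counts as a
pass; a clean log from the backup program is not evidence of anything.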

------
keefe
Ouch to the OT one!

The worst one I did: With this product, the VP of product distributes product
keys. I had recently had a HDD crash and reinstalled, and I had customer-facing
deadlines. He was not around and I needed to make some progress, so I fiddled
the keycheck code and bypassed it. I finished my work and committed the whole
workspace... including the hack... fortunately this was caught in QA.

Now, that is not the worst impact on my life... in my earlier years I made
too much of a point of getting my real opinion across.

------
auston
How does this .com.com work?

(I am being serious here) Can I get a .com.com?

~~~
philwelch
CNET bought com.com a long time ago and uses it as a publicity stunt.

~~~
BobbyH
CNET used to do something sneaky whereby they used domain.com.com on many of
their domains to inflate their unique visitor counts. That way, when somebody
typed <http://mistake.com.com>, they'd be counted as a unique visitor to the
com.com domain, which hosted all CNET sites. This way, they could tell advertisers
that they reached 99% of the Internet via their com.com domain. It was really
annoying to type in <http://www.cnet.com> and be redirected to
<http://cnet.com.com>.

Apparently, they've stopped doing this. Now, typing in, say,
<http://mistake.com.com> ends up doing a search for "mistake" on CNET's search
engine, search.com.

------
edw519
_I have shot myself in the foot on more than one occasion by failing to
document a procedure or configuration I was sure I would remember._

Boy does that ever strike a nerve. Mainly in my own code.

I'll write a function with just a little bit of complexity, get it working
perfectly, and build all kinds of stuff on top of it. Three months later I
need to add something to it, but I don't understand what I wrote.

I'm the biggest stickler on variable naming and self-documenting code and yet,
I still keep doing it to myself. Oh well, always room for improvement.

~~~
billswift
Not just in IT either. Any time you are doing anything you might need or want
to refer back to, TAKE NOTES.

"Someone who wants to appear clever relies on memory, someone who wants to get
things done writes things down."

