
Something is burning in the server room; how can I quickly identify what it is? - usea
http://serverfault.com/questions/496139/something-is-burning-in-the-server-room-how-can-i-quickly-identify-what-it-is/
======
yardie
You can tell from the responses who has a real DR plan and who is just
winging it. The DR-backed commenters can switch to site B with nary a worry.
Everyone else is trying to justify keeping the server room running while
something is burning. To them a misplaced backhoe is a bigger problem than,
you know, a server burning.

I love the sanity check part. But really, you're keeping someone on hand to
drag your ass out after you pass out while you sniff up toxic fumes.

~~~
growse
A million times this. If turning a whole datacenter off is catastrophic for
your business, you've not done your risk management properly. Single events
that take down datacenters happen. Not being prepared for them is
unforgivable.

~~~
rdl
Even with DR, it has a cost. Especially if you're a colo provider, powering
off your whole facility (or, ideally, at least a room) to find smoke is going
to have cost, even if all your customers have DR plans.

In most datacenters I've seen, I'd probably be willing to do a run through
with IR cam/temp probe, or just visual inspection, with a handheld 1211,
especially if I had a respirator, if it were just "smell of smoke". Clear view
and path to two exits, someone at the EPO switch, etc.

The "big scary things" are battery plant and generator plant, and any kind of
subfloor or ductwork. As long as the fire isn't in any of those, it's far less
of a big deal. I probably wouldn't EPO a room for a server on fire, either --
just kill the rack, which takes slightly longer.

I've been in places where "smell of smoke" was a fucknozzle smoking a cigar or
burning leaves outdoors outside an air intake, and another where it was a
smoker's coat being put on an air handler.

~~~
yardie
The great thing about DR plans is... when implemented correctly no one should
have to risk their life to avoid using it. He didn't say that one of his servers
was running a little hot (which happens, a lot). He said there was smoke and
the acrid smell of something burning. Which means that one of his components
actually got hot enough to ignite.

If you're not ready to use your DR plan it probably means your DR plan is
inadequate to begin with. Why the hell do fire drills? Even cruise ships do
drills. God forbid they pull their passengers away from that very important
game of Texas Hold'em.

 _I probably wouldn't EPO a room for a server on fire, either -- just kill the
rack, which takes slightly longer._

You fail to understand how fires start or why they spread. I mean why the hell
do datacenters spend millions of dollars on fire suppression when an IR cam and
a handheld extinguisher are just as good, right?

~~~
rdl
Essentially no one does "EPO drills" on their datacenters. Particularly in
multi-tenant environments like commercial colocation centers. It's quite
reasonable for your DR plan to involve a $200k+ cost per EPO pull or DR
failover. Your business should have DR provisions, and you should test the DR
plan, but it's probably not reasonable (or legal) to do a full test involving
dumping agent, rapid power off, etc.

The fire suppression exists for two reasons. One, is to get code exemptions to
be allowed to run wiring in ways which would otherwise require licensed
electricians to do every wiring job, and prohibit people from being in the
facility. Two is to detect small fires early, and to prevent their spread, as
well as to protect facilities from catastrophic facility-wide fire.

Servers are just not that high a fire risk, particularly when de-energized.
Generally inside a self-contained metal chassis, less than 100 pounds each,
metal/plastic, etc. The power supply is the most likely component to start a
fire, and contains a max of maybe 250g of capacitors and other components. The
risk of one server catching on fire is low, and the risk of it rapidly
spreading to anything else is low, so yes, I'd be comfortable pulling a single
burning server out of a de-energized rack.

Also, in big or purpose-built facilities, those components _most_ likely to be
fire risks (batteries/power handlers, and generators) are in separate rooms,
separated by firewalls from the datacenter. A fire in the _battery room_ is
going to be dealt with by sealing that room and powering it off, dumping
suppression agent, and bringing out the FD immediately.

Life safety is much more important than business continuity, but a lot of
people have jobs where they accept a non-zero risk of physical harm to do
their jobs. It's certainly not reasonable to demand a datacenter tech go into
a burning building to rescue a database server or something, but approximately
zero datacenter staff I know would have a problem with assuming the level of
risk I would to find problems. (it's probably a bigger deal for employers to
actually discourage risk-taking by employees, particularly when it's risk-
taking to save themselves effort, like single-person racking large UPSes or
very large servers, etc.)

~~~
yardie
I think we are aiming at the same thing here. A proper, multi-tenant
datacenter will have separate zones for generators, UPS, electrical, and
climatisation. The actual chance of a fire starting and spreading in this type
of configuration is low and this is the environment I prefer to work in. I've
also worked in server rooms in 100+ year old buildings which did double duty
as storage/broom closet. The original post was closer to this since they had
racked UPSes next to their servers and network equipment. It apparently caused
enough smoke to fill a server room and make the poster nauseous, which makes
me wonder what air handling capacity they have. It's this type of "datacenter"
where you have to worry about your life.

------
patio11
You ask the fire marshal what was burning, about an hour or so after you've
Big Red Button'ed the server room.

This story is giving the Japanese engineer in me apoplexy.

~~~
elektronaut
Acting quick and getting a fire under control often makes the difference
between an emergency and a disaster. But only after you've made sure the
building is being evacuated, and the fire department has been alerted.

Don't be afraid to call the emergency number. They'll know what to do and walk
you through it.

Under no circumstance should you enter a room filled with smoke. Smoke
inhalation is incredibly dangerous.

~~~
rmc
_Under no circumstance should you enter a room filled with smoke. Smoke
inhalation is incredibly dangerous._

To reiterate, a lot (most?) of the people who die in fires actually die from
smoke inhalation rather than from getting burned by the fire/flames.

~~~
niels_olson
Hi. I have worked in a burn unit. Inhalational injury is usually not the cause
of death. In fact, of the people I saw, only one died of
bronchoscopy-confirmed inhalational injury, and he was an obese smoker with
minimal residual lung volume to begin with.

~~~
saraid216
This seems to imply that most people who die _are_ burn victims? (It feels
like this is a stupid question, but I'm not sure of the answer so eh. Asking
anyways.)

Would people at risk of an inhalation injury actually pass through you often?
You're in a burn unit, so I'd assume that means you mostly get burn victims,
and inhalation injuries would be pointed somewhere else?

~~~
niels_olson
> imply that most people who die are burn victims

Inhalation injury is a subset of burn trauma. The flow control is "Ambulance
inbound from fire" -> ER calls trauma alert -> trauma team meets the
ambulance(s) at the ER door. Those with inhalation injury are sorted from
there.

------
rdl
IR/thermal imaging cameras are SO USEFUL. I had a fire (bathroom fan caught
fire due to being 45y old, knocked it down and extinguished it myself, but was
worried about extension in the ceiling/duct).

Oakland FD came out and used their IR camera to check the heat from the
ceilings nearby. Hilariously they found a hot water pipe (running between
bathroom and kitchen) and almost axed the ceiling open (turning $1500 in
damage into $3k+), but their captain was smart and figured it out from another
angle.

Really tempted to hack an EOS 5Dm3 into an IR camera next. Not so much for
fires as night vision, but it would be useful for fires too. I'm not sure how
useful an IR camera is at detecting heat, since things which aren't yet on
fire are not quite so infrared, though.

I usually use a Fluke IR temp meter when cooking and to find hot wires/etc. in
the datacenter, though.

~~~
Udo
They are useful but they do have limitations. I'm not sure it would have
necessarily detected the faulty battery in this case. Last year, there was a
fire at my house and the FD searched for the source for three hours. With
thermal imaging and everything. It was inside the walls, no open flame, just a
lot of smoke and no clear readings on the imager. That was pretty frightening.
(However when they finally did find it, they put it out in a couple of
minutes.)

~~~
rdl
I'm glad it motivated me to get ABC Dry and Halon 1211 extinguishers for both
rooms and the car, at least.

In a "real" datacenter, you should have smoke sensors which would map where
heat/smoke is coming from (since you have controlled airflow, it should be
obvious which rack or small group of racks was the source -- it doesn't just
exhaust into the whole room). But it's pretty clear this wasn't a "real"
datacenter by their lack of protocols for handling fire, it was some office
server thing.

~~~
Udo
I'm just saying these things can quickly get more complicated than expected. I
too had extinguishers handy at the time but of course I didn't know what to
use them on (and neither did the FD for three hours). These electrical fires
can be tricky to debug, especially when different kinds of barriers come into
play.

> But it's pretty clear this wasn't a "real" datacenter by their lack of
> protocols for handling fire, it was some office server thing.

Probably. What's the right protocol though? In this case, it was apparently
clear that something minor was amiss, nothing that would justify shutting down
the whole thing. In any case, flooding the room with inert gas would probably
not have made much of a difference, as it looks like the battery was never
actually burning.

~~~
rdl
House construction is insane, anyway -- they're full of random stuff, and
there are plenty of non-accessible void spaces. Datacenters are at least
generally nice and open, so finding a weird residential hidden fire _should_
be a lot easier. (which is why datacenters get permission to run their wiring
the way they do, etc., because they have so much other safety)

The right thing to do in a real datacenter is to check which of your ~hundreds
of laser VESDA sensors first tripped, and investigate in that area :)
Presumably you have floor air supply, ceiling air return, so the first thing
to trip should be a ceiling sensor near your fire. If no floor sensors trip, I
wouldn't be super afraid to go in there, and if it's only a small number of
them it's not a big fire.
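
A rough sketch of that "which sensor tripped first" check, in Python. Everything
here is hypothetical: the `SensorTrip` record, the `triage` helper, and the
`small_count` threshold are made up for illustration and do not come from any
real VESDA console. It only encodes the rule of thumb above (first trip locates
the source, floor trips and trip count hint at size).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorTrip:
    sensor_id: str     # hypothetical sensor label
    level: str         # "ceiling" or "floor"
    rack_row: str      # where the sensor sits
    tripped_at: float  # seconds since the first alarm


def triage(trips, small_count=3):
    """Return (first_trip, advice) following the heuristic in the comment.

    small_count is an assumed cutoff for "a small number" of sensors.
    """
    if not trips:
        return None, "no sensors tripped"
    # The earliest trip is the best guess at where the source is.
    first = min(trips, key=lambda t: t.tripped_at)
    if any(t.level == "floor" for t in trips):
        advice = "floor sensors tripped: stay out"
    elif len(trips) <= small_count:
        advice = "few ceiling sensors only: investigate near first trip"
    else:
        advice = "many sensors tripped: treat as a larger fire"
    return first, advice


trips = [
    SensorTrip("c7", "ceiling", "row-B", 12.0),
    SensorTrip("c3", "ceiling", "row-A", 4.5),
]
first, advice = triage(trips)
print(first.rack_row, "-", advice)
```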

You don't want the dry pipe to go off for sure, and you don't want the FM-200
either, but the consoles should be reporting the smoke alarm to you way before
a human would smell it "filling the whole room", and they don't generally
discharge either for very small events (at least everywhere I've seen).

In an office (some open plan, some cubicles, some conference rooms and
offices, etc.), with a few racks of equipment, and maybe some lab space, it's
a lot more similar to the scary hidden residential fire problem. :( Your risks
in trying to uncover the problem are actually higher than in the datacenter
because then you don't have the amazing gas system and a dry pipe backup to
save you if it turns into a big fire while you're there, and it's not as
designed for easy egress, and probably doesn't even have real EPO. I wonder if
there's a firefighter on HN who would know the real answer to this case.

~~~
n2dasun
FM200 systems do go off for small events at times. I managed a team of
datacenter facility specialists up until last year, and we'd seen issues like:
FM200 dumps because underfloor smoke detectors notice smoke from a CRAC
condensate pump (pretty low risk) smoking its winding, FM200 dumps when a
quick refrigerant discharge (technician error) looks like smoke to the
detector, and false positives at smoke heads due to a dirty area under the
raised floor, combined with air flow irregularities.

I definitely agree that I'd be more concerned about a house fire, but the rule
that we enforced to our people and the vendors, as well as the vendors working
for us (not to mention the guidance that we received from our customers) was
that nothing in that datacenter is worth potentially losing anyone's life.
That having been said, I have Toucan Sam'd in a datacenter to try and find the
source of an odd odor before, but never alone, and only to find out what to
secure power to. I wouldn't sit there and try to fight it with a fire
extinguisher.

~~~
rdl
The only "accidental discharges" I've seen were related to construction dust
in an underfloor. And yes, they suck :(

In general the purpose of a handheld extinguisher is to fight tiny fires as
well as to help you escape a bigger fire. The thing I'd be most afraid of
would be someone walking around trying to find a small fire, only to discover
a big fire, have egress blocked, and need to figure out a solution. Or, coming
across an actual person who is on fire or otherwise in danger (even if you'd
expect virtually no personal risk for property, I think most people would
accept substantial personal risk to save a person, particularly a coworker).

------
ck2
Maybe we need a diy thermal sensor that plugs into an android or iphone
device?

Oh wow, it exists:

[http://www.instructables.com/id/Thermal-Imaging-Phone-
Camera...](http://www.instructables.com/id/Thermal-Imaging-Phone-Camera/)

<http://www.robhopeless.com/search/label/Thermal%20Imaging>

[http://www.kickstarter.com/projects/andyrawson/ir-blue-
therm...](http://www.kickstarter.com/projects/andyrawson/ir-blue-thermal-
imaging-smartphone-accessory)

<http://rh-workshop-llc.myshopify.com/>

Costs only $150 to make?

Open-source: <https://github.com/RHWorkshop/>

~~~
aw3c2
Wow, I wonder what the range of that is. I.e. if I can use it to analyse the
thermal loss of houses.

~~~
walshemj
Why do you need that? From my thermofluids classes, you should be able to
calculate the heat loss for a given delta T by taking a ballpark figure for
the materials, i.e. so many square meters of brick, glass, etc.
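
A minimal sketch of that back-of-envelope estimate, using the steady-state
conduction formula Q = sum(U_i * A_i) * delta T. The U-values below are
illustrative ballpark assumptions, not measured figures, and the
`heat_loss_watts` helper and material names are made up for this example.

```python
# Assumed ballpark U-values in W/(m^2*K); real values depend on construction.
ASSUMED_U_VALUES = {
    "brick_wall": 2.0,
    "single_glazing": 5.0,
    "double_glazing": 2.8,
}


def heat_loss_watts(areas_m2, delta_t_kelvin, u_values=ASSUMED_U_VALUES):
    """Estimate conductive heat loss: sum of U * A per material, times delta T."""
    return sum(u_values[m] * a for m, a in areas_m2.items()) * delta_t_kelvin


# 100 m^2 of brick and 10 m^2 of single glazing at a 20 K difference.
loss = heat_loss_watts({"brick_wall": 100.0, "single_glazing": 10.0}, 20.0)
print(f"{loss:.0f} W")  # 5000 W
```

This only covers conduction through known materials; it says nothing about
drafts or construction flaws.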

~~~
wiml
You use it when you don't know the material (because you're surveying an old
house) or you don't know the delta-T (because you're looking for a hidden
fire, grow op, etc).

~~~
NickNameNick
Or you're trying to track down heat loss caused by construction flaws,
especially leaks and drafts caused by cracks or gaps. Also leaks caused by
improperly installed power points. Missing or settled insulation is also
common.

------
whizzkid
I couldn't answer the question there because of the reputation thing. (I just
signed up.)

Here is a slightly different approach:

If you are in the room and smell burning, that means something has already
happened and you are dealing with its result and possible side effects, which
probably gives you enough time before shutting everything down or getting out
of the room. Your chance of not being harmed in this situation is high, at
least for 5-10 minutes.

In this case, a thermal check would not help you much, since burned hardware
is most probably not functioning anymore and might be colder than the regular
servers. The other option is that it is still working but not yet on fire, so
its heat is not much different from usual.

Now, smell is your only evidence.

I am hoping you guys have air conditioning in the server room. Put it on the
max level so that the smell will not be so strong everywhere. This can help
you identify where the smell is coming from. Before you check the smell, get
out of the server room and breathe as much fresh air as you can, so that your
sense of smell will be sharper when you get back to the server room. Having a
colleague with you makes the process faster.

This would be my first reaction to this kind of situation.

It is of course costly to turn off the whole system, but don't forget that it
is not more important than you!

------
jeremyjh
I'm not sure if everyone responding really read this all that carefully. There
was absolutely no mention of smoke in this question. There was a "smell". If
you drop an entire datacenter, you are easily looking at $100K+ in damages
just to reset the room. So, getting a buddy and taking some time to look for
the problem until it is found or until you actually are seeing smoke or other
specific danger seems like a pretty reasonable course of action.

~~~
DanBC
X times out of Y that's going to be a reasonable course of action. But one
time someone will die, and at that point (because it's rare and we freak out
at rare dangers) people will be up in arms about it, and about how stupid and
irresponsible it is to not hit the button.

~~~
johngalt
Panic can be dangerous as well. A halon dump can asphyxiate someone who
doesn't reach the oxygen mask or an exit in time.

Dropping a whole server room without seeing any smoke or fire is silly. Do you
pull a fire alarm if someone smokes a cigarette indoors?

------
darwinGod
For some more context, this was the OP's previous question on Server Fault.

[http://serverfault.com/questions/420877/ive-inherited-a-
rats...](http://serverfault.com/questions/420877/ive-inherited-a-rats-nest-of-
cabling-what-now)

Doesn't that change the entire question!

~~~
astrodust
If there was a burning smell in _that_, yikes, hit the Big Red Button and
turn in your resignation.

~~~
micro-ram
Seeing that rats nest I would have let it burn!

------
blantonl
It is pretty simple. You call the fire department.

They have a TIC (thermal imaging camera) that can detect heat/overheating
sources pretty quickly. Plus, it's kind of nice to have them on hand in case a
smell progresses to a fire.

~~~
micro-ram
I agree. Even if you call just to alert them that something is not right. Let
them roll one truck just to have someone on hand in case someone gets burnt or
shocked.

------
jahabrewer
It's been a while since I've been in IT, but don't failing UPS batteries put
off fumes that destroy your lungs?

A friend of a friend was a hero and shut down his datacenter cleanly/recovered
some hard drives during a situation like this. He got severe lung damage (not
from fire).

------
JimmaDaRustla
Lead acid batteries will have a rotten smell after shorting out - more so in
wet ones, but SLA will smell the same.

If you wait to the point that an SLA smells, it has probably expanded and
caused internal damage to your UPS/server rack, albeit minor if you can manage
to get it out without dismantling anything.

------
chris_wot
Just an idea: the server room should be segmented into smaller areas with
isolated power circuits. Using the sniff test, if you are truly concerned that
you are about to have a fire, then it's only responsible to start an orderly
powerdown to prevent equipment loss, and more importantly, prevent injuries to
_people_.

If you start shutting down areas of the datacentre that appear to be closest
to the smoke, then you will have a better chance of locating the issue in the
fastest possible timeframe, with minimum disruption.

On top of this, if you then have critical infrastructure that you must keep
running, then you keep your failover servers in different areas and failover
to that equipment.
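
As a rough sketch of that ordering in Python: walk the zones from nearest to
the smell outward, failing over first where a standby zone is still powered.
The `Zone` record, the distances, and the `shutdown_plan` helper are all
hypothetical, and in practice you would stop as soon as the source is found
rather than run the whole list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Zone:
    name: str
    distance_to_smell_m: float           # assumed rough estimate
    failover_zone: Optional[str] = None  # where critical services can move


def shutdown_plan(zones):
    """Return ordered steps, nearest zone to the smell first."""
    steps = []
    powered_down = set()
    for z in sorted(zones, key=lambda z: z.distance_to_smell_m):
        # Only fail over if the standby zone has not already been shut down.
        if z.failover_zone is not None and z.failover_zone not in powered_down:
            steps.append(f"fail over {z.name} -> {z.failover_zone}")
        steps.append(f"power down {z.name}")
        powered_down.add(z.name)
    return steps


plan = shutdown_plan([
    Zone("B", 5.0),
    Zone("C", 15.0, failover_zone="A"),
    Zone("A", 2.0, failover_zone="C"),
])
for step in plan:
    print(step)
```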

I'm not a server or datacentre guy in any way, but doesn't this seem sensible?

------
johngalt
'hmmm... probably UPS battery venting' _click_ 'yup'

When UPS batteries vent it has a distinctive odor. It's very pungent and
sulfuric, but it doesn't smell like a fire or melting silicon. Any experienced
operations guy has smelled it before.

Additionally in most fire suppression systems the Big Red Button is the
_abort_ button. A well designed room will dump itself when it detects smoke
after a short evacuation alarm. It's precisely designed to keep people from
screwing around with a real fire. They must make the active decision to _stop_
fire suppression rather than _start_ fire suppression.

------
peterwwillis
HVAC. Has no noticeable smoke (it'd probably be outside the building anyway)
but pumps a burning smell into the room when the motors start dying or aren't
oiled right.

Don't hit the big red button just because you smell burning.

------
protomyth
I've had this happen to me once. No alerts and all boxes were up, but there
was a smell in the room. I went machine by machine and UPS by UPS and nothing
was wrong or burning(1).

Next day we find out the breaker panel next door had a short that blew out
several breakers. Smell was vented into the server room.

So, not always your room, could be something else just as or more dangerous.

1) shut down all machines, unplug all UPSes, open every case

------
iSnow
What kind of server room is this which is not equipped with smoke detectors?

~~~
gambiting
If it's just the "melting plastic" kind of smoke, it won't trigger smoke
detectors. And I believe that his battery wasn't actually on fire - if it was,
then yes, smoke detectors would have triggered.

~~~
HeyLaughingBoy
Oh yes, it will.

I made my 12 year-old read the story (along with pictures) of a girl his age
who was trapped in a burning house after he set off the smoke alarms at
midnight by melting bits of plastic in his room.

Hopefully he learned something that night.

~~~
eridius
How did setting off smoke detectors cause a fire?

~~~
HeyLaughingBoy
Poor sentence structure on my part.

My son set off the smoke detectors by melting plastic in his bedroom with a
cigarette lighter.

To demonstrate the danger of what could come from this, I had him read a news
report of a girl who was screwing around with fire and ended up being badly
burned over most of her body.

------
ChuckMcM
I spent a couple of summers interning at IBM and one of the things they taught
you in orientation was the sound of "imminent halon dump" (the alarm that said
Halon was about to be used in the machine room). The instructions were, hold
your breath and make for an exit _immediately._ Failure to do so would lead to
asphyxiation.

------
dattaway
Lead based batteries often have a usable life of 3-5 years. Chances are, the
others of that vintage are already dry and have already failed. Then they will
rupture, often with smoke as their series connected brothers try to push
electrons.
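
If you track install dates, flagging batteries of that vintage is a few lines
of Python. This is only a sketch: the `batteries_past_service_life` helper, the
UPS names, and the dates are invented for illustration, and the 3-year cutoff
is just the low end of the range above.

```python
from datetime import date


def batteries_past_service_life(install_dates, today, max_age_years=3.0):
    """Return names at or past max_age_years, oldest first."""
    ages = {
        name: (today - installed).days / 365.25
        for name, installed in install_dates.items()
    }
    overdue = [name for name, age in ages.items() if age >= max_age_years]
    return sorted(overdue, key=lambda name: ages[name], reverse=True)


# Hypothetical inventory.
installed = {
    "ups-1": date(2009, 6, 1),
    "ups-2": date(2012, 1, 15),
    "ups-3": date(2008, 3, 1),
}
overdue = batteries_past_service_life(installed, today=date(2013, 4, 1))
print(overdue)
```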

~~~
micro-ram
I have seen plenty of UPS batteries swell up so big they can't be removed
without disassembly. The only indication was the failed self test. OP did say
it was the UPS in the rack next to his production DB.

Don't forget Capacitor Plague. I still see it regularly.

<https://en.wikipedia.org/wiki/Capacitor_plague>

Have a plan, be safe.

------
FireBeyond
If it's that important, buy a TIC (thermal imaging camera). They can be had
for under $10,000 and will show you actual hotspots. Walk through, sweeping
every item.

------
squozzer
Temperature indicator stickers.

~~~
stevenrace
At $5+/ea, that quickly adds up, and they are difficult to parse in large
numbers under duress, whereas an IR camera is rather effective and perhaps
cheaper.
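
The break-even is quick arithmetic. A small Python sketch, taking the $5
sticker price from above and the two camera prices mentioned elsewhere in the
thread (about $150 for the DIY phone accessory, under $10,000 for a TIC); the
`break_even_stickers` helper is just for illustration.

```python
import math


def break_even_stickers(camera_cost, sticker_cost=5.0):
    """Smallest sticker count whose total cost meets or exceeds the camera's."""
    return math.ceil(camera_cost / sticker_cost)


diy_break_even = break_even_stickers(150.0)
tic_break_even = break_even_stickers(10_000.0)
print(diy_break_even)  # 30
print(tic_break_even)  # 2000
```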

------
lobster45
Thanks for this story. I am ordering a fire extinguisher for our server room
now.

------
snarfy
It helps to prioritize what you look for. 9 times out of 10 it's power
related.

------
Qantourisc
Get in an electronics "expert"; they know what component smells like what :)

------
hp50g
Infra red camera!

------
lotsofcows
Can of deodorant.

