Investigation report on the OVH data centre fire in Strasbourg on 2021-03-10 (lafibre.info)
60 points by speedgoose on June 8, 2022 | 35 comments



> Could a water leak on an electronic board of an inverter be the cause of the beginning of the disaster?

The report is in French, but you can look at the pictures or use an online translator.

https://deepl.com usually provides much more convincing French-English translations than Google Translate does.


Note that "more convincing" isn't necessarily a good thing; the deception of fluency can totally be used to convince you of something wrong.


I've had good experiences with DeepL, but that's certainly a good point. Translating definitely has an interpretation aspect to it. I'd be curious whether anyone has tried tricky texts on different translators and how they fared. I'm not enough of a linguist myself to know the different categories of tricks (so as to give a fair assessment rather than things randomly coming to mind), or even what the jargon for such things would be so I could look them up.


Well, this went around my circles the other day: https://twitter.com/Xythar/status/1405660710382706694

It turns out that the "tricks" don't necessarily have to get very sophisticated, because, well, the target user of machine translation services can't understand the source text.

(I mean, the fr-en language pair is probably better off than ja-en, but I don't know enough French to know whether it can be trusted!)


> in truth it's like 10 times more likely to just make complete shit up if it doesn't understand the source

I use this to learn German. Instead of translating a text EN->DE and not practicing any writing skills, I'll write my best attempt at German and see if it's understandable by running it through the DE->EN or DE->NL translator. (In cases where I care about the quality, I'll then patch up the English/Dutch if necessary, run it NL/EN->DE, and use that version.)
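A minimal sketch of that round-trip check, assuming the official deepl Python client and an API key (the auth key and the example sentences are placeholders for illustration):

  # Round-trip check: write German, translate DE->EN to see if it's
  # understandable, patch the English if needed, then go EN->DE for the
  # final version. Uses the official "deepl" client (pip install deepl);
  # the auth key and sentences below are placeholders.
  import deepl

  translator = deepl.Translator("YOUR_AUTH_KEY")

  draft_de = "Ich habe gestern ein interessanter Artikel gelesen."   # my attempt
  back_en = translator.translate_text(draft_de, source_lang="DE", target_lang="EN-GB")
  print(back_en.text)   # check whether the meaning survived

  fixed_en = "I read an interesting article yesterday."              # patched up by hand
  final_de = translator.translate_text(fixed_en, source_lang="EN", target_lang="DE")
  print(final_de.text)  # version to actually use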

For this purpose, I'm glad that it makes a best guess at what my broken German must mean, and it usually does a fair job (easy to say because I know what I meant, so no validation issues there). Of course this is not great for every use-case, for example it would be better if it additionally displayed confidence (e.g. slightly graying out subsentences below 80% confidence), but it also has advantages to make a good guess at the meaning of the source.

And when I put in something unintelligible, unintelligible stuff comes out. It's not really just making something up, at least not in my experience with EN/DE/NL. No idea what happened there with that Japanese example.


That reminds me of a gem I found recently. Google Translate translates this Lithuanian sentence from Wikipedia:

Adomas Bernardas Mickevičius (lenk. Adam Bernard Mickiewicz, 1798 m. gruodžio 24 d. Zaosėje, netoli Naugarduko – 1855 m. lapkričio 26 d. Konstantinopolyje, Osmanų imperija) – iš istorinės Lietuvos Didžiosios Kunigaikštystės kilęs, lenkų kalba rašęs poetas, dramaturgas, eseistas

as:

Adam Bernard Mickiewicz (Polish: Adam Bernard Mickiewicz, December 24, 1798 in Zaos, near Novgorod, Constantinople, November 26, 1855, Ottoman Empire) - a Polish-born poet, playwright, essayist

Remove the last word (eseistas) and suddenly "Polish-born poet, playwright" turns into "a poet and playwright from the historical Grand Duchy of Lithuania who wrote in Polish". It should be noted that Mickiewicz's nationality is somewhat of a controversial topic.
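If anyone wants to poke at this themselves, here's a rough sketch using the Cloud Translation API (google-cloud-translate, v2 client); it needs credentials and won't necessarily return the same output as the translate.google.com web UI, so treat it as illustrative only:

  # Translate the Wikipedia sentence twice: once as-is, once with the last
  # word ("eseistas") removed, to compare how the output changes.
  from google.cloud import translate_v2 as translate

  client = translate.Client()  # requires Google Cloud credentials

  sentence = (
      "Adomas Bernardas Mickevičius (lenk. Adam Bernard Mickiewicz, "
      "1798 m. gruodžio 24 d. Zaosėje, netoli Naugarduko – 1855 m. "
      "lapkričio 26 d. Konstantinopolyje, Osmanų imperija) – iš istorinės "
      "Lietuvos Didžiosios Kunigaikštystės kilęs, lenkų kalba rašęs "
      "poetas, dramaturgas, eseistas"
  )

  for text in (sentence, sentence.replace(", eseistas", "")):
      result = client.translate(text, source_language="lt", target_language="en")
      print(result["translatedText"])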


As an aside, there seems to have been a dramatic change in Google Translate’s underlying approach several years ago: before the change, the fr>en and zh>en translations were reasonable (if stilted and occasionally ungrammatical), while both de>en and ja>en absolutely sucked (they just made no sense whatsoever over groups of more than three or so words); I suspect the difficulty was the “global” transformations needed to translate SOV to SVO word order. It would be very interesting to know what they did (there are old-style statistical approaches that involve learning pairs of corresponding syntax trees, but I didn’t get the impression they were practical? might have been an artifact of the original DoD research being focused on zh and ru, which are both SVO—though ar is VSO and fa is SOV...).


I'm not sure what to make of this.


I'm not sure why all the complaints and concern, seems like a fairly successful server migration to the cloud to me.


Took me a moment, but I got it in the end.


Interesting read (Google translated).

>> Safety lessons in building design

>> In the field of building design, we will retain two safety lessons.

>> First of all, the requirements applicable to battery charging rooms, when they are located inside a building, require a sufficient degree of fire resistance to prevent fire from propagating to the rest of the building. The existing regulations already seem complete to us, and the OVH accident does not call their technical relevance into question.

>> However, two configurations, in the current state of the regulations, deserve particular attention:

>> - When the batteries used are not likely to generate hydrogen during charging (while lead-acid batteries are currently the main energy-storage technology in data centres, lithium technology offers a more competitive alternative which is becoming more widespread);

>> - Or when these charging rooms are located outdoors.

>> On the first point, the BEA-RI considers that the requirement concerning constructive provisions should also apply to the other battery technologies for which electrical failure and thermal runaway cannot be physically ruled out. This type of failure can lead to major fires and justifies specific construction measures.

>> On the second point (outdoor charging rooms), the BEA-RI recalls the recommendations issued in its report MTE-BEARI-2021-004 on the battery container fire in Perles-et-Castelet (Ariège, 09)

>> Finally, the report points out that protecting the battery room is not sufficient, given the outbreak of fire at the level of the inverter:


Used to build and work in a couple of DCs in the 2000s era... can confirm the battery room was legit the most terrifying room we had (multistory buildings, so they were inside). Something about the stored potential energy and risk of physical hazard always tripped my spidey senses.

ATS panels also..



Also: power for large phone switch boards. Those battery banks are very impressive.


Why should battery rooms be inside buildings? Backup generators aren’t.


Some facilities keep generators either on the technical floor, or on the roof.

AFAICT putting out relatively small hydrocarbon fires is done efficiently with water and foaming agents that cut off the air supply. As soon as the generator is not producing electricity, these measures can be used.

Batteries though cannot be switched off, and using water around serious electrical currents is a very, very dangerous idea. It's already much harder to put out a fire around lead-acid batteries, and nigh impossible with Lithium-based batteries.

This is where you need actual physical firewalls which can withstand fire for a long time and not let it through.


At this site I think the generators are outside, but there are lots of facilities that keep their generators inside the main structure.


Interesting — I was about to say “I’ve never seen one inside” but realize what I really should say is, “I only notice them when they are outdoors”


My old company had flywheel generators - since those didn't generate pollution, they were all located inside.


Flywheel generators or UPS? AFAIK, flywheels are only temporary energy stores while diesel gensets are spun up to peak power.


DRUPS - Dynamic rotary UPS. It provides enough energy for your ride-through period until your gensets are up and the power quality is good.


They are also used as frequency/power regulators (or for high power phase conversion, unlikely in a data center)


Flywheel while the diesel gennies spun up


At a place I worked years ago ours was inside, in a fireproofed and soundproofed room, with a giant intake vent to the outside and exhaust chimney to the roof.


Seems counter-intuitive because they are also used in vehicles, but lead-acid batteries last much longer in an air-conditioned environment.


1. Lead-acid batteries need to be kept between -20C and +50C for charging or discharging.

2. Especially in hot climates, inverters in the UPS need to be kept cool.

3. I assume that by keeping the UPS & batteries inside the building, you minimize transmission loss/voltage drops so you don't have to oversize your conductors.


Translation of the main points (of the linked article, not the full 43-page BEA-RI report):

Fire started simultaneously in a battery and in a power inverter. It doesn't look like there's any comment on why/how it started in either, or how/why it started in both at the same time.

Batteries were lead-acid, not Lithium-Ion (lead-acid is supposed to be less of a fire-hazard).

There's timestamped video of the start of the fire. The alarm was triggered in less than a minute. Someone was at the initial fire location within 2 minutes. The building was evacuated within 4 minutes. The fire department was called within 7 minutes of the fire starting, and was on site 17 minutes later. The power company was called within 17 minutes of the 1st fire alarm, but couldn't cut power from the substation because of fire risk (and the delay in getting approval from OVH, which owned the substation). The upstream substation was shut down 80 minutes after the fire started.

The fire-engulfed building still had power for another 98 minutes. The fire was considered under control 3h after that (6h after it started), and extinguished another 3h later.

There was significant concern that the building would collapse.

There were no fire sprinklers or other fire-extinguishing mechanism in the building itself. The water supply was insufficient (a single fire hydrant delivering 16k gallons/h). No water reserve or pumping mechanism either (the Rhine was super close).

Only after 2h were the firefighters able to throw water at the blaze. Before that, there was still electrical power on site.

Recommendations:

Battery storage already has some regulation WRT fire hazard; charging equipment doesn't. Apply the same rules to both.

Improve the ability to cut power from a site.

There's a bunch of questions that pop into my mind:

Concerning the construction, how common is it to have 1h fire resistance? In the report, they say the floors are made out of wood with a treatment to provide a 1h firewall rating, and the "internal structure" has 1h of "fire stability". That sounds awfully low to me, especially considering that the fire took 6h to control and 10h to clear.

What's "R-5" concrete structure ?

The power is provided through 2 redundant 20kV AC lines. Then, the power can follow one of 3 paths:

  directly fed into the hardware if it's "clean" enough

  corrected to be cleaner before being fed into the hardware.

  converted to DC before being fed into the batteries.

If the batteries are needed, the power is first converted back into AC before hitting the servers, which will convert it back to DC.

My question: why don't we convert all the power to DC in a central location before feeding DC power to the servers? I would expect significant cost savings from skipping the DC->AC conversion, no need for AC "cleaning", and the ability to move the AC->DC conversion out of the servers (increasing density, and removing some heat from right next to the servers). I'm sure it's not a new "idea", I'm just curious as to why it's not a good one.

On the report itself, it looks like the presence of batteries makes electrical cutoff much harder. If you cut the main power lines, the batteries and power generators take over, so your data centre is still powered.


> Why don't we convert all the power to DC in a central location before feeding DC power to the servers ?

The DC consumed by the server is 12V. AC is 240V (in Europe). That means about 20 times the current for the same power, so you need much thicker wires. And running 240V DC doesn't win you anything, because converting that to 12V DC is more trouble than converting 240V AC to 12V DC.

That doesn't mean it never happens, but it's more typical where distances are short. The network equipment your ISP has in a box somewhere on your street is often fed through AC for normal operation, but connected to the battery backup via a DC line. There, the cable runs are short and carry low-ish current, and the efficiency gain from not doing DC->AC->DC matters.
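Some back-of-the-envelope numbers on that scaling (the load, run length and conductor size below are illustrative assumptions, not figures from the report or the comment above):

  # Naive I^2*R cable loss when delivering the same power at different
  # voltages through the same conductor. Assumed: 3 kW load, 10 m one-way
  # run (20 m loop), 2.5 mm^2 copper. Illustrative numbers only.
  RHO_CU = 1.68e-8          # copper resistivity, ohm*m
  POWER = 3_000             # watts delivered to the load
  LOOP_LEN = 20             # metres of conductor, out and back
  AREA = 2.5e-6             # conductor cross-section, m^2

  resistance = RHO_CU * LOOP_LEN / AREA    # ~0.13 ohm

  for volts in (240, 48, 12):
      current = POWER / volts              # load current
      loss = current ** 2 * resistance     # I^2*R dissipated in the cable
      print(f"{volts:>3} V: {current:5.1f} A, cable loss ~{loss:6.0f} W "
            f"({100 * loss / POWER:.0f}% of the load)")

  # At 12 V the "loss" exceeds the load itself, i.e. this cable simply
  # cannot deliver 3 kW at 12 V; you'd need a vastly larger cross-section.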


This is why the DC-powered servers in the Open Compute Project rack design are all fed from something like 277VAC (or even 480VAC) into a centralized power supply in the rack, which creates 48VDC that is then fed from the center of the rack to a bunch of 1U-sized servers. Any particular 48VDC set of conductors is no longer than about 160 cm.

Individual servers then have high-efficiency 48VDC-to-12VDC power supplies to create power usable by their motherboards.

Of course, you don't want to run 12VDC cables around anywhere.

https://www.google.com/search?client=firefox-b-d&q=open+comp...
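Rough numbers for why such a short 48VDC run is fine (rack power, run length and busbar cross-section below are assumptions for illustration, not OCP spec values):

  # I*R drop and loss over a short 48 V distribution run inside a rack.
  # Assumed: 10 kW rack, 1.6 m run each way, 70 mm^2 copper bus.
  RHO_CU = 1.68e-8        # copper resistivity, ohm*m
  RACK_POWER = 10_000     # watts
  BUS_VOLTS = 48
  LOOP_LEN = 2 * 1.6      # metres, out and back
  AREA = 70e-6            # cross-section, m^2

  current = RACK_POWER / BUS_VOLTS          # ~208 A on the 48 V bus
  resistance = RHO_CU * LOOP_LEN / AREA     # ~0.8 milliohm
  drop = current * resistance               # ~0.16 V
  loss = current ** 2 * resistance          # ~33 W
  print(f"{current:.0f} A, {drop * 1000:.0f} mV drop, {loss:.0f} W loss "
        f"({100 * loss / RACK_POWER:.2f}% of rack power)")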


>240V DC doesn't win you anything because converting that to 12V DC is more trouble than converting 240V AC to 12V DC

It is true that it doesn't win you anything, but it's not more difficult. In fact, you can connect a DC voltage to an ordinary switching power supply just fine.


> The building was evacuated within 4 minutes.

> Concerning the construction, how common is it to have 1h fire resistance?

Fire rating requirements are for human safety and not really property safety. In this case, 1 hour rating was more than sufficient for everyone to exit.

The type of use/occupancy of the building generally sets the requirements; I don't know the requirements for data centers. 1 hour is kind of the start of enhanced fire ratings; some things might not have a requirement of even 1 hour, others may have more. A large enough fire rating (again, depending on use) may be considered a firewall, which can be used to treat each side as a separate building for fire planning purposes. A smaller, but still large, rating may be needed to partition spaces with different needs. Higher ratings are common for stairwells, and I wouldn't be surprised if they were required for the battery room and other areas of concentrated risk.

> Why don't we convert all the power to DC in a central location before feeding DC power to the servers ?

Some datacenters do run servers on DC, typically phone companies, and typically -48VDC. This has pros and cons. You may simplify power conversion from batteries, and the ac -> dc input step will likely clean the power, too. But 48v is much lower than common AC voltages, and lower voltage means more current for the same power, which means more loss in transmission and larger conductors required and more heat generated as well.

You still need individual power supplies on each computer, as you need to go from -48v to +12v (and maybe some others, depending). Historically, DC/DC voltage conversion was much less efficient than AC/DC, but that's less true today.

Well designed AC switching can happen at the zero crossing, which can reduce arcing, which is sometimes useful.
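To put the "larger conductors" point in numbers, here's a quick sketch of how the required copper cross-section scales with distribution voltage (the load, run length, current density and drop budget are illustrative assumptions):

  # Conductor cross-section needed to deliver the same power at different
  # voltages, sized either for a fixed current density (thermal limit) or
  # for a fixed 2% voltage drop. Assumed: 5 kW load, 20 m loop of copper.
  RHO_CU = 1.68e-8   # copper resistivity, ohm*m
  POWER = 5_000      # watts delivered
  LOOP_LEN = 20      # metres, out and back
  J_MAX = 5e6        # A/m^2, a typical-ish design current density
  MAX_DROP = 0.02    # allowed drop, as a fraction of supply voltage

  for volts in (230, 48, 12):
      amps = POWER / volts
      area_thermal = amps / J_MAX                                # same current density
      area_drop = RHO_CU * LOOP_LEN * amps / (MAX_DROP * volts)  # same % drop
      print(f"{volts:>3} V: {amps:5.0f} A, "
            f"{area_thermal * 1e6:6.1f} mm^2 (thermal limit), "
            f"{area_drop * 1e6:7.1f} mm^2 (2% drop)")

  # Area scales with 1/V for a thermal limit, but with 1/V^2 for a fixed
  # percentage drop, which is why low-voltage DC runs are kept short.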


I can't imagine the phone call to Germany: "Hi, it's the clown neighbor again, we are sending you smoke, probably laced with lead. Sweet Dreams."


Well, that's fair, they send us their coal fumes all year.


It'd be funny if it weren't so tragic. Germany really should have shut down coal (I know it's easier said than done and there's a significant industry around it, but that's all the more reason to start with it and not postpone) before nuclear.


home of the famous BAGGER 288

https://en.wikipedia.org/wiki/Bagger_288



