
Trello is running on a diesel generator and may go down at any time - guiambros
http://status.fogcreek.com/2012/10/fog-creek-services-update2.html
======
spolsky
Let me update this: we've been working all day on moving Trello to AWS, a
project which is now underway. If all goes well (fingers crossed, etc) we'll
be up and running on AWS very soon. There are still other services in the
flooded New York data center (FogBugz, Kiln, Copilot) and we are making plans
to physically move servers to a new data center as quickly as possible if the
flooded data center falls over, but a physical move of 5 racks could take a
whole day so I hope that doesn't happen.

Separately, Stack Exchange, which is in the same data center, is running off
of its hot backup in Oregon.

We would have liked to have completely redundant data centers for FogBugz on
Demand and Kiln on Demand to avoid even a few hours of downtime, but because
those services rely heavily on giving every customer their own SQL database,
there is almost no reasonable way to get fast failover to a different data
center. We can do it with Stack Exchange because there are only a couple of
hundred databases. We've been building a SAN solution which will make it
possible to hot swap out to another datacenter, one day, but that project is
not complete.

~~~
drewcrawford
> but because those services rely heavily on giving every customer their own
> SQL database, there is almost no reasonable way to get fast failover to a
> different data center

Why did you say you could do fast failover to LA in 2007? [1] It was the case
then that every customer had their own SQL database too.

I'm not trying to be snarky, your team is obviously putting forth some heroic
effort. I'm just an affected customer that is curious why the question seems
to have two mutually exclusive answers.

[1]
[http://webcache.googleusercontent.com/search?q=cache:lHEK939...](http://webcache.googleusercontent.com/search?q=cache:lHEK939AKiEJ:www.joelonsoftware.com/items/2007/07/09.html+&cd=5&hl=en&ct=clnk&gl=us&client=safari)

~~~
guiambros
I'd be curious to see if there's any update to FBOD architecture. The idea of
using self-hosted servers running SQL Server and IIS already sounded eccentric
5 years ago, but given the number of options today, it'd be hard to still
justify such a monolithic solution.

In all fairness, Joel recognized in the article that the initial decision was
based on their previous experience with Microsoft -- and AWS was still
incipient in 2007 -- so it made sense at the time. I doubt this is still the
case. Even less if you consider the costs of Sandy - duplicated servers,
migration, hours lost, unhappy clients, etc.

------
ahi
According to this post, the datacenter people are hauling 55 gallon drums up
17 flights of stairs. Diesel is ~7 pounds/gallon so a 55 gallon drum is ~385
pounds not including the container. Ouch.
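
As a rough check of that arithmetic, here is a minimal sketch. The density of 0.832 kg/L is an assumed typical value for diesel (it varies with grade and temperature), and `diesel_weight_lb` is just an illustrative helper name.

```python
# Rough weight check for a drum of diesel, fuel only (container excluded).
LITERS_PER_US_GALLON = 3.785411784  # exact by definition
LB_PER_KG = 2.20462262
DIESEL_KG_PER_LITER = 0.832         # assumption: typical density, varies by grade

def diesel_weight_lb(gallons: float) -> float:
    """Return the weight in pounds of the given volume of diesel."""
    return gallons * LITERS_PER_US_GALLON * DIESEL_KG_PER_LITER * LB_PER_KG

# A 55 gallon drum, plus a 5 gallon bucket for comparison.
for gallons in (55, 5):
    print(f"{gallons} gal of diesel: about {diesel_weight_lb(gallons):.0f} lb")
```

With that assumed density the drum comes out slightly under the ~385 pound estimate, so ~7 pounds/gallon is a fair rule of thumb.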

~~~
spolsky
Last we heard it's a bucket brigade doing 5 gallons at a time. Fog Creek
president Michael Pryor is participating but mostly it seems to be the crew
from Peer 1 networks doing the hard work; we don't want to take too much
credit.

~~~
ahi
Bellevue Hospital had (has?) the same kind of operation going on.

------
guiambros
My surprise is that a respected team and high profile company would make such
a basic mistake of relying on a single datacenter, within a single
availability zone -- in lower Manhattan. And knowing since last week that
Sandy was coming, and this was a real threat.

And I'm not even talking about Trello (which is still just a hobby for Joel &
team), but this also brought down Fog Creek and all their commercial services
and paying customers.

The positive side of Sandy (if there's any) is that people will really take
more seriously the idea of "expect the best; plan for the worst". And Amazon
will likely see a spike of new customers in the next few days.

------
dctoedt
I'm baffled that the powers that be in NYC didn't learn from Houston's
experience with Tropical Storm Allison in June 2001. All the hospitals at the
Texas Medical Center had their basements and ground floors flooded; there went
their generators [1]. (I was part of a large group of volunteers from all over
the city that helped to evacuate patients for transfer; it was spooky seeing a
grand piano floating in a below-grade-level lobby of one of the hospitals.)

My then-employer's building had its basement flooded; we were on floors 19
through 25, but our electricity and phones and Internet were gone --- and
this, three weeks before the end of Q2. Our developers and IT people hauled
computers down the stairs, and we moved to temporary space for several weeks,
but we still missed, that is, we didn't achieve the sales and revenue targets
that we had forecast for analysts.

I'm given to understand that because of the lessons learned from Allison,
Houston isn't quite as vulnerable to flooding any more.

[1] <http://en.wikipedia.org/wiki/Tropical_Storm_Allison#Texas>

~~~
spolsky
The bottom line is that people just don't plan for 100 year storms. I suspect
that's because the number of things that go wrong once every 100 years is so
large and the cost of handling each one is so disproportionate to the risk
that it's not cost effective. Think about it this way: under what
circumstances would you double your hardware budget to avoid 1 day of downtime
every 100 years?
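
To make that trade-off concrete, here is a minimal break-even sketch. The dollar figures in the usage lines are purely hypothetical, and the function names are illustrative; only the "1 day every 100 years" framing comes from the comment above.

```python
def expected_annual_loss(cost_per_downtime_day: float,
                         downtime_days_per_event: float = 1.0,
                         years_between_events: float = 100.0) -> float:
    """Average yearly cost of a rare outage, spread over its return period."""
    return cost_per_downtime_day * downtime_days_per_event / years_between_events

def redundancy_pays_off(extra_annual_cost: float,
                        cost_per_downtime_day: float) -> bool:
    """True only if the expected yearly loss exceeds the yearly cost of avoiding it."""
    return expected_annual_loss(cost_per_downtime_day) > extra_annual_cost

# Hypothetical numbers: a day of downtime costs $1,000,000 and a second
# set of hardware costs $200,000 per year.
print(expected_annual_loss(1_000_000))          # expected loss per year
print(redundancy_pays_off(200_000, 1_000_000))  # is doubling the budget worth it?
```

Under these made-up numbers the duplicate hardware costs far more per year than the outage it prevents. The sketch ignores reputation damage and the fact that the same redundancy also covers more frequent failures.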

~~~
ChuckMcM
This is exactly right. The truth is you can spend infinite money on various
scenarios, and that cost has to be paid. The manufacturer side of this is
'warranty', you guarantee that something won't break in 90 days, a year, 5
years, what have you and you price to that probability.

Another risk in New York that will take data centers offline: earthquake. It
will happen at some point, but the chance of it happening in the next 100
years is small (contrasted to Santa Clara where we're much more likely to get an
earthquake than a flood). So you could pay to have your Manhattan high rise
put onto base isolators and proofed against an earthquake, but what size
earthquake? Magnitude 3? Magnitude 6? Magnitude 9? And then if your building
is still standing brightly after the Magnitude 9 earthquake are your fiber
optics still there? How about the network tie point? Did it fall into a hole
in the ground? So the cost to make all of Manhattan resistant to a 9.0
earthquake?

It's impressive that they are carrying up the fuel. I might be inclined to see
if I could tractor in a 12kW generator to run one elevator. Sure you'd be
burning fuel at both ends but it would be easier on the crew hauling the
petrol.

~~~
spolsky
Due to flooding in the basement and diesel fuel spilled everywhere, they
wouldn't power up the elevators even if they could. The building's entire
electrical system is under salt water/diesel mixture.

~~~
dbecker
Your team's dedication is amazing.

Having been lucky enough not to be placed in the situation you are in, I doubt
I could have gone to such lengths to get everything back online so quickly. I
admire everything you guys are doing.

Thanks.

------
gecko
Trello has switched to AWS as of this posting. The Trello team are insane, and
I'm very proud to be part of the same company that they are.

------
ISL
Also, pumps must always be located at the bottom to pull liquid uphill by more
than ~30'. Storing large tanks of diesel fuel on the 17th floor (or higher),
is a heavy and perhaps dangerous situation.

An always-submerged in-tank pump, not unlike that of a water well, powered via
a sealed line from the generators above could, in principle, avoid this
problem.

I hope the generators are able to power some sort of mechanical lifting
arrangement. Manually hauling the drums is mighty hard.

~~~
jlgreco
I remember it being 30' for water, but it should be a bit more for diesel
(since diesel is not as dense as water). Mercury, an obviously much denser
fluid, can only be pulled up by a vacuum a much shorter distance (around 760
millimeters iirc).

Still, the extra height isn't going to buy you much.

~~~
mhp
When I was carrying buckets of diesel up the stairs, I did notice that it was
a lot lighter than water. Still not great after 17 flights, but a lot lighter
than a bucket full of water.

~~~
ISL
I should've remembered the density difference...

According to omniscient Wikipedia, diesel is 83% the density of water, so the
33 feet that water can be lifted via suction becomes ~40'.
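
The limit being discussed is the height of a fluid column that one atmosphere can support, h = P / (ρ g). A minimal sketch, assuming standard sea-level pressure and a perfect vacuum at the pump (real pumps manage less); the densities are assumed round values and `max_suction_lift_m` is an illustrative name:

```python
ATMOSPHERE_PA = 101_325.0   # standard atmosphere, in pascals
GRAVITY_M_S2 = 9.80665      # standard gravity
FEET_PER_METER = 3.28084

def max_suction_lift_m(density_kg_m3: float) -> float:
    """Theoretical maximum suction lift in meters: h = P / (rho * g)."""
    return ATMOSPHERE_PA / (density_kg_m3 * GRAVITY_M_S2)

# Assumed densities in kg/m^3; diesel is taken as 83% of water.
fluids = {"water": 1000.0, "diesel": 830.0, "mercury": 13_595.0}

for name, density in fluids.items():
    lift_m = max_suction_lift_m(density)
    print(f"{name}: {lift_m:.2f} m ({lift_m * FEET_PER_METER:.1f} ft)")
```

The mercury case reproduces the familiar 760 mm barometer column, and the diesel figure is simply the water figure divided by 0.83.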

The Trieste used the buoyancy of (incompressible) gasoline to enable a return
trip from the bottom of the Mariana Trench.

Strong work lifting all that fuel!

------
casca
While this is certainly a very rare event, I'm very glad that we've gone with
redundant self-hosted solutions. Trello is great and is highly respected by
the startup crowd, but to be held hostage to someone else's data decisions
does not work for me once you get beyond the most basic level of a functioning
company.

~~~
atesti
Indeed! For me all hosted solutions are automatically dead. Too many startups
just disappear so quickly that I would never do business with them. (And while
FogCreek hasn't disappeared so far, at least WebPutty is gone)

Given that this is an unpopular view around here anyway, I want to add another
thought:

If your company has its own webpage at datacenter A, including payment from
datacenter/service B with CDN C and Twitter-integration D, using hosted
FogBugz for support mails, hosting its CSS-stylesheets on TRELLO which was
backed by Google AppEngine, and using email provider E...

Why don't you have downtime at least once a quarter???

How can a system that relies on so many components and other companies be
reliable? With all those AWS outages? I don't understand this.

If you host most of your stuff yourself, either everything works, or nothing.
But not all goes down at the same time.

~~~
tghw
Just to be clear, WebPutty is not gone. It's operational for another few
months on Fog Creek's dime[1]. After that (or now, if you'd like) you can
easily host it yourself on App Engine, probably for free, now that it's open
source[2].

[1] <https://www.webputty.net/> [2] <https://github.com/fogcreek/webputty>

~~~
atesti
If a service one once used is retired, open sourcing it is of course the very
best outcome. Thanks for that!!!

However (regarding AppEngine based solutions), as far as I know, if AppEngine
goes down, there is no way to have any backup, is there? Only Google can host
AppEngine apps.

------
jsight
This story reminds me a little of this:
<http://en.wikipedia.org/wiki/Interdictor_(blog)>

It turns out that many backup power plans have not been designed with long-term
flooding in mind.

------
ljlolel
Trello database can't be that big. Why not just copy it, zip it up, and
transfer it over to the west coast?

------
tzs
I can't tell for sure, but a little Googling seems to indicate that there is
natural gas available in at least some New York neighborhoods.

Why aren't data centers built where natural gas is available? A natural gas
powered generator could run from the gas lines, which should be more reliable
than fuel that has to be trucked in.

~~~
Spooky23
Natural gas and steam are pervasive in NYC. I think they are used less often
because you cannot control delivery and running big gas pipes in a retrofitted
building is expensive.

Keep in mind that Manhattan has NEVER flooded before.

------
lutze
"The superhuman folks at the data center are hauling 55 gallon drums of diesel
fuel up 17 flights of stairs."

I can't for the life of me think why someone would put a backup generator 17
floors up, these guys don't know you can buy mains cables longer than 4 feet?

If it absolutely has to be 17 floors up, buy a fucking pump.

Geeks. Sometimes we're so dumb it hurts.

~~~
rexreed
From [http://status.fogcreek.com/2012/10/fog-creek-services-update...](http://status.fogcreek.com/2012/10/fog-creek-services-updates.html):

"Here's the physical situation:

The generators are on a high floor in the building and the pumps supplying the
generators with fuel are submerged. The best option at this point is for
people to physically lug diesel up over a dozen floors, or make other
arrangements for pumping fuel to that high floor. "

Yes, a generator not in the basement / ground level makes sense. When a flood
happens, it usually happens from the ground floor up ;) Thus why generators
are at the upper levels.

Even in this case, using a pump solution makes little sense. Assuming you have
electricity to power the pump (I suppose you could power it with the
electricity from the remaining fuel), a 17 floor pipe containing fuel would
probably be heavier (and more dangerous) than a long mains power cord. And it
would make no more sense to have a 17 floor cable powering a pump at the
bottom of the run than it would to have just the pipe of fuel running the
length of 17 floors with the pump at the top of the run.

Time for a winch? I don't think Home Depot stocks a 17 story pipe, and
relocating the generator probably will take more time than just winching /
carrying up the barrels.

~~~
jlgreco
Generators higher up makes sense, but _17_ stories high? I suppose real-estate
concerns limit where you can put them though.

~~~
mhp
In this case, I think you are correct. Need lots of exhaust so have to be on
the roof. One Police Plaza just tore the side of their building off to make
room for generator exhaust. It's a warzone in lower Manhattan right now.

~~~
jlgreco
Wow, that is pretty nuts. Did they install temporary generators, or did they
run into unanticipated ventilation issues with existing generators?

------
alh
Why would you host in New York, how dumb? Everyone knows that when aliens
attack, disaster strikes, time for nuclear armageddon or it's the end of the
world, that it all starts in New York. I've seen the movies so there!

~~~
dbecker
What will I possibly do if I have to go without Trello during a nuclear
armageddon?

