

95% Data-Center Cooling Energy Reduction Thanks to Fluid-Submerged Servers - MikeCapone
http://www.treehugger.com/files/2011/04/9-percent-data-center-cooling-energy-reduction-fluid-submerged-servers-mineral-oil.php

======
maxharris
If this is such a great idea, why aren't Google, Amazon, Facebook, Rackspace,
etc. using it? They're really smart folks, and they have teams doing nothing
but work on this stuff. I suspect that they've already considered this and
rejected it because of some cost not considered by the oil-immersion folks.

~~~
lsc
google and facebook have requirements for data center reliability that are,
ah, quite a bit different from those of most of us further down the food
chain. My understanding is that google has some data centers that simply run
on outside air, with no provision (other than just shutting the whole damn
thing down and routing traffic elsewhere) for days when the outside air is
too hot to use.

When you own the application stack, and when you are willing and able to
handle failures at that level, there are all sorts of optimizations you can
do; optimizations that are not possible if you have a bunch of individual
boxes that can't go down.

Even amazon's SLA tends to focus more on the ability to bring up new nodes
than on the longevity of your existing nodes.

As for rackspace, you have a point; they are in a space where uptime on
individual servers matters. However, judging from their prices and their
scale, they have industry-leading levels of margin, and they get this margin,
in part, through having a good reputation. I bet that the reliability hit
they'd take trying something new is large enough to scare them away from
trying much of anything radical in this regard. Every sysadmin knows that you
take a reliability hit whenever you change anything, even if it's a good idea
in the long term, and how long that reliability hit lasts depends largely on
the complexity of the change.

Now, I'm not saying that oil immersion is the right answer; we've seen it
talked about, tried, and we've seen it fail many times; it's got lots of
problems. Someone has to implement it well, and then make the required parts
standard, before it will catch on; that's almost as difficult as implementing
it well in the first place.

But the fact that google and facebook don't use it does not mean it's not
useful for people with different data center requirements.

~~~
maxharris
Great post! I thought about all of the companies in my list as having the
exact same general problem, and I see now that's not true.

Could you elaborate on what the problems are? They are probably interesting.

~~~
lsc
I don't have any special knowledge of google or facebook. (I worked at Yahoo
for a while as a sysadmin, but that was some time ago.) I have maintained web
applications from the hardware up, and I have run co-location and VPS
businesses.

But hosting hardware (or virtual servers) that other people run stuff on is a
fundamentally different thing from running an application soup to nuts.

Like I was saying, if you own the app, there are all sorts of nice failover
things you can do that make the failure of one physical bit of hardware (or
even a whole data center) not as big of a deal. This means you can put less
effort into keeping your servers and datacenters up, and just keep more spare
capacity around instead.
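
A minimal sketch of the kind of application-level failover being described
here; the hostnames and the /health path are made up for illustration. When
the application owns the retry logic, losing one box (or one whole site) just
means the client falls through to the next replica in the list.

    import urllib.request

    # Hypothetical replica endpoints; a real deployment would discover these
    # from configuration or service discovery rather than hard-coding them.
    REPLICAS = [
        "http://app-dc1.example.com/health",
        "http://app-dc2.example.com/health",
        "http://app-dc3.example.com/health",
    ]

    def fetch_from_any(replicas, timeout=2):
        """Try each replica in order; the first one that answers wins."""
        last_error = None
        for url in replicas:
            try:
                with urllib.request.urlopen(url, timeout=timeout) as resp:
                    return resp.read()
            except OSError as exc:   # covers connection errors and timeouts
                last_error = exc     # note the failure, move on to the next box
        raise RuntimeError("all replicas failed; last error: %r" % last_error)

    if __name__ == "__main__":
        print(fetch_from_any(REPLICAS))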

If you are renting hardware (or virtual hardware) for someone else to run an
application on, transparently dealing with hardware failure becomes /much/
more difficult. I mean, especially with virtual hardware it's not entirely
impossible to move people off a failed server, but all the methods I know of
for doing so are extremely expensive, either in terms of infrastructure,
licences, or complexity. (and usually all three)

I mean, at yahoo, at one point I fucked up and took out a data center. Now, I
was on pager duty, sleep-deprived and generally not performing well, but with a few
commands, I was able to redirect all traffic across the country to another
data center. Sure, my mistake cost the company quite a lot in terms of
additional latency, but it was better than being down, right? It would be
nearly impossible to do the same with a co-location or VPS company.
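
The thread doesn't say how Yahoo actually did this, but one common way to
redirect traffic "with a few commands" is to keep the public records on a
short TTL and swap them to the surviving data center with a dynamic DNS
update. A sketch using the dnspython library, with an invented zone,
nameserver, and addresses:

    import dns.query    # pip install dnspython
    import dns.update

    # The zone, nameserver, and addresses below are invented for illustration;
    # this also assumes the authoritative nameserver accepts dynamic updates.
    ZONE = "example.com."
    NAMESERVER = "198.51.100.1"         # authoritative server for the zone
    FAILOVER_ADDRESS = "203.0.113.10"   # front end in the surviving data center

    update = dns.update.Update(ZONE)
    update.replace("www", 60, "A", FAILOVER_ADDRESS)  # short TTL keeps failback quick

    response = dns.query.tcp(update, NAMESERVER, timeout=5)
    print(response.rcode())  # 0 (NOERROR): www now resolves to the other site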

------
spitfire
Cray did this first, back in the 1980s. They also did it cooler than these
people will: the Cray-2 had a cooling tower that used Fluorinert ("artificial
blood") as coolant.

<http://en.wikipedia.org/wiki/Cray-2>

------
kakali
It seems a downside would be that this uses more space. The racks are mounted
horizontally. Convert an old data center to this and they'd only be able to
install roughly half as many machines as before.

~~~
lsc
Eh, square footage is /really cheap/ some places. Right now, it'd be much
easier for me to find cheap square footage than cheap wattage.

Unfortunately, there is something of an inverse correlation between the cost
of square footage and the cost of bandwidth. Especially at the small end of
the scale, bandwidth is /extremely cheap/ here in the Caltrain corridor of
Silicon Valley, while square footage is /extremely expensive/, so you are
partly correct.

The thing of it is, though, there's fiber along nearly all railroad rights of
way. Now, getting that fiber off the railroad right of way and into your data
center can be a rather large capital expense, and when you get out in the
boonies there are often fewer players in the secondary markets; less
competition means you pay more per unit of bandwidth. But the ongoing cost,
especially if you use a lot of bandwidth and can bypass most of the
middlemen, might not be that high.

Even in Silicon Valley, where bandwidth is cheap and square footage is
expensive, power is even more expensive. With a very small increase in
capital cost I could make my own servers 8x more space efficient, but at
current power costs most data centers won't let me go any denser than I am
now, so there's no point. (Depending on the cooling system the data center
uses, it can be harder to deal with one really hot rack than with two racks
that are each half as hot.) So really, giving me half the space and twice the
power would work out just fine.
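
Some illustrative arithmetic (the figures are invented, not from the thread):
when the facility caps per-rack power, the rack count is set by total draw,
not by how densely the servers can physically be packed.

    # All figures are invented for illustration; the point is the constraint,
    # not the numbers.
    servers = 160
    watts_per_server = 250              # assumed average draw per server
    rack_power_cap_w = 5_000            # assumed per-rack power/cooling cap

    total_power_w = servers * watts_per_server        # 40,000 W either way
    racks_needed = total_power_w / rack_power_cap_w   # set by the power cap,
                                                      # not by physical density
    print(racks_needed)   # -> 8.0, no matter how many servers fit per rack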

------
jacques_chester
As with all things in life, there are tradeoffs.

1. Fire hazard can be higher, depending on the cooling fluid.

2. OHS regulations to do with handling the cooling fluid.

3. Congratulations, you are now a plumber!

4. No more hot-swapping of components.

5. You may need exotic parts that don't break in the cooling fluid.

6. Custom equipment may be very expensive.

7. What happens if you can't get another batch of cooling fluid?

Liquid-cooled systems, including full immersion, are not new -- I believe the
Cray-2 was the first system to do this, and it was a maintenance nightmare.

