
Google joins Open Compute Project to drive standards in IT infrastructure - rey12rey
https://cloudplatform.googleblog.com/2016/03/Google-joins-Open-Compute-Project-to-drive-standards-in-IT-infrastructure.html?m=1
======
Animats
Finally, racks go metric.

The 19-inch rack is one of the oldest standards in computing. ENIAC used
19-inch racks. Open Compute, though, uses wider, metric racks. 19-inch rack
gear can be mounted in an Open Rack with an adapter.

The Open Compute spec says that shelves of IT gear are provided with 12 VDC
power. There's power conversion in the base of the rack. Facebook has
standards for distribution to the racks at 54VDC, 277VAC 3Ø, and 230 VAC
(Eurovolt). Apparently Google wants to add 48VDC, which was the old standard
for telephone central offices.

Facebook's choice of 54VDC distribution is strange. Anyone know why they
picked that number?

~~~
jauer
54VDC and 48VDC are roughly the same thing in telco land. E.g. I hooked up a
Juniper MX960 last week with "-48 VDC Nominal" power supplies. The operating
voltage range is -40 to -72VDC.

A "48V" rectifier will output 54VDC to the power bus (feeding batteries and
systems) since it's battery float voltage. If the rectifier goes offline (loss
of AC power) the batteries will power the bus keeping systems online.
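
Rough back-of-envelope on where those numbers come from, assuming a typical
24-cell lead-acid string (exact float voltage varies with chemistry and
temperature):

    # "48V" is the nominal name of the plant; 54V is the float voltage the
    # rectifier actually holds. Assumes a 24-cell lead-acid battery string.
    cells = 24
    nominal_v_per_cell = 2.0     # lead-acid nominal voltage per cell
    float_v_per_cell = 2.25      # typical float-charge voltage per cell

    nominal_bus = cells * nominal_v_per_cell   # 48.0 V
    float_bus = cells * float_v_per_cell       # 54.0 V
    print(nominal_bus, float_bus)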

I'm curious what the actual difference in practice is between Facebook's 54VDC
and Google's 48VDC.

------
wyldfire
> energy efficient and more cost effective ... engaging the industry to
> identify better disk solutions for cloud based applications

My pet issue w/IT infrastructure is the management modules. Finding a server
w/a management module that works every time is nigh impossible. Do Google and
Facebook design their own or do they somehow just work around their quirks?

~~~
GauntletWizard
Google and Facebook don't bother with management modules; they're more likely
to rely on switched power for each machine, with significant work put into
automatic netboot reprovisioning when things go wrong on these systems. In the
event that fails, they swap machines out for diagnosis... if they bother with
the diagnosis part.
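
Something like this hypothetical remediation loop (every name here is a
made-up placeholder; neither company has published its actual tooling):

    # Sketch: power-cycle a sick machine via a switched PDU, wait for netboot
    # reprovisioning to bring it back, otherwise queue it for a physical swap.
    import time

    def power_cycle(pdu_host: str, outlet: int) -> None:
        # In practice this would hit the PDU's SNMP or HTTP interface.
        pass

    def is_healthy(host: str) -> bool:
        # In practice: does the machine answer on its service port again?
        return False

    def remediate(host: str, pdu_host: str, outlet: int, timeout: int = 600) -> str:
        power_cycle(pdu_host, outlet)
        deadline = time.time() + timeout
        while time.time() < deadline:
            if is_healthy(host):
                return "reprovisioned"     # netboot brought it back
            time.sleep(10)
        return "swap"                      # pull the box, maybe diagnose later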

~~~
haneefmubarak
Ahh yes, the tried and true "turn it off and on again" philosophy...

All kidding aside, if it's a random rare issue (some bit flipped somewhere),
chances are a power cycle will fix it, since everything gets reinitialized at
boot. If it's anything more serious, the issue will likely persist, in which
case taking the machine out of production is likely the best immediate course
of action.

------
atomic77
I haven't followed this project too closely, but it seems interesting that it
has taken this long for Google to join (and that Amazon is conspicuously
absent). Anyone able to speculate on why now?

~~~
payne92
Pure speculation: Google was the first to build very high density server
farms. As such, they had various proprietary designs and approaches, and
became heavily invested.

And then it took them a while to decide that leveraging the open source
ecosystem would be better than maintaining those various proprietary elements.

~~~
ChuckMcM
I don't think it is speculation; Google has always treated its infrastructure
work as a competitive advantage: super top secret stuff, siloed so that even
other employees don't know about it.

And to your second point, it is interesting that OCP (some of it perhaps
influenced by people who had worked at Google and later worked at Facebook)
appears to have minimized some of the advantages Google had, and now they find
themselves possibly falling behind. One of the nice things, and bad things,
about open projects like this is that you get a lot more resources applied to
making things better than one company can muster.

For similar reasons I wonder if ARM will displace x86, if only because there
are probably 15 to 20 independent teams of smart people with ARM architecture
licenses making better ARM processors, and perhaps 4 such teams at Intel
making improvements in the x86 architecture. At some point "open" seems to
eventually overwhelm "walled garden," although I'm totally open to
counterexamples where that hasn't been the case.

~~~
jldugger
From what I've seen, the challenge ARM faces in the datacenter is support for
ECC RAM.

~~~
ChuckMcM
I agree with that; I would love to see an ARM server chip with a full ECC
memory path. However, I note that ECC memory subsystems are "well understood"
from a hardware perspective, so one of the teams working on ARM processors
will no doubt apply that to their version and we'll see if there is more
demand than just the two of us :-)

------
godzillabrennus
This is great news and just another nail in the coffin of what Wired calls the
Fucked By Cloud vendors:
http://www.wired.com/2015/10/meet-walking-dead-hp-cisco-dell-emc-ibm-oracle/

~~~
tw04
He fails to explain why "Amazon" is the future. He starts off by insinuating
it's because traditional vendors are more expensive... I guess I'd ask to see
the raw data he's using.

I've run the numbers; AWS isn't cheaper _AT ALL_ unless you're talking
bursting workloads that run for less than a month, or a company that only
needs one or two servers but still needs the reliability of a larger
environment. Anything outside of that is cheaper to do on-prem 9 times out of
10.
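
A back-of-envelope version of that comparison for a steady 24/7 workload; the
prices below are purely illustrative placeholders, not real AWS or vendor
quotes:

    # Illustrative only: steady 24/7 workload, made-up prices.
    hours_per_month = 730
    aws_hourly = 0.50                        # hypothetical on-demand rate
    aws_monthly = aws_hourly * hours_per_month             # ~$365/month

    server_capex = 6000                      # hypothetical server price
    amortization_months = 36
    colo_power_monthly = 150                 # hypothetical colo + power + hands
    onprem_monthly = server_capex / amortization_months + colo_power_monthly
    # ~$317/month

    print(aws_monthly, onprem_monthly)
    # For bursty workloads that run only a fraction of the month, the AWS
    # line scales down while the on-prem line doesn't - hence the "unless"
    # above.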

Either he's got his head in the sand, has bought into the "cloud is cheaper"
hype, or he has other justifications he fails to list.

~~~
hc000
Well, Netflix just migrated all their servers to Amazon EC2. I'm sure it was
cheaper for them.

~~~
pilsetnieks
And Spotify moved to Google Cloud. However, these are isolated cases that
don't reflect the average experience. For one, they most likely got a much,
much better deal because of sheer volume. It still doesn't change the reality
for companies with a handful, or even a few tens or hundreds, of servers.

------
spacecowboy_lon
Interesting that it looks like they are dumping 12V and going back to the
telecoms-standard 48V :-)

------
wilhil
And, as someone who is not an employee of a multi-billion-pound company, how
can I get involved?!

I ask every time, and this project is amazing, but it feels like it's just for
the big guys!

~~~
pas
You can buy OCP-compatible stuff, promote it, join the mailing list, blog
about it, and so on. Otherwise, you can't. It's basically the club of kids who
build this kind of hardware stuff (do R&D on infrastructure).

When Facebook started OCP it seemed like an awkward initiative; after all, who
would manufacture OCP-compatible stuff when you can't find anything OCP-
compatible at all? (It needs custom racks, for shame! Madness!) But slowly it
turned out that there are multiple groups working on custom designs, trying to
get out of the world of half-assed firmware, crazy sales calls and useless
support vectors associated with traditional vendors.

So initially it only made sense if you had at least a few thousand servers and
a team to work on that extra few percent of efficiency. Nowadays, it seems
like a much more diverse club. And especially after the network stuff got
opened up - the timing with SDN was right anyhow - they seem doomed to
succeed.

~~~
skuhn
The equipment is available from various vendors, but the biggest hurdle is
finding a datacenter facility that can handle OCP gear. Retail colocation is
pretty much 19" racks only and 120V/208V power only. Even small scale
wholesale datacenter deployments are tough to do if you want to go whole hog
on OCP specs.

You can still get OCP hardware in standard 19" form factors for these
environments, but I don't personally see much point in it. You're seriously
limiting your vendor and hardware selections, and if you aren't seeing the
purported power efficiency gains then it's not worthwhile. The allure of
eliminating the pain of classic vendors is still there, but there are paths
through the Dell / HP / Supermicro maze that are tolerable without burning it
down and going to OCP.

There are other challenges with OCP's overall design in standard facilities
beyond the server cabinet level too. Their (optional) battery cabinet design
-- where you save on the facility cost by omitting the UPS bank in favor of
battery cabinets for every triplet rack -- is explicitly forbidden at
basically every top tier datacenter provider that I'm aware of. No batteries
allowed on the floor because of the fire suppression system. You can
definitely get a provider to build you a room where they'll let you have
batteries on the floor, but that is a multi-megawatt sort of problem, not a
50kW sort of problem.

~~~
pas
Yes, you need at least a cage, so hundreds/thousands of servers minimum. And
that's where power conversion gains might mean you get ~10-20 servers'
electricity for "free".

But that means you need to basically run your own DC operations. Cables,
batteries, monitoring, hot spares, and good luck finding a DC that even allows
you to speak about putting fire-hazardous stuff "there".

Yet it can be done. Money talks, hence the seemingly amazing success of OCP
out of nothing.

... ah, I should have read your last paragraph before writing anything :)

