
Inside CERN's multi-megawatt data center - PaulHoule
https://home.cern/about/computing
======
walrus01
It is interesting to see that they apparently built this as a traditional
raised-floor environment without hot/cold aisle separation. Looks like
something from 20 years ago. When building datacenters at the multi-megawatt
scale (example: 10-12 kW thermal per 44U cabinet, equivalent to two 208V 30A
circuits per cab), it's far more efficient to put everything straight on
concrete slab, build hot and cold aisle separation, and run fiber and power
trays entirely overhead.
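
Back-of-the-envelope on those circuit numbers (a quick sketch; the 80%
continuous-load derating is my assumption):

    volts, amps, circuits = 208, 30, 2
    peak_kw = volts * amps * circuits / 1000   # 12.48 kW at full breaker rating
    derated_kw = peak_kw * 0.8                 # ~10 kW at 80% continuous load
    print(f"{derated_kw:.1f}-{peak_kw:.1f} kW per cabinet")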

If I had to guess, the PM on this was somebody who'd built traditional
supercomputer datacenter environments 15-20 years ago for some entries in the
top 50 of the TOP500 list, and scaled up what they knew how to build.

edit: Looking at the photos in more detail, it looks like they've retrofitted
hot/cold separation into a traditional, 20-years-ago-style raised-floor
environment. The fact that there are zero overhead ladder racks carrying AC
power (or -48VDC cabling) and zero overhead fiber trays means that everything
is under the floor... Very costly and labor-intensive compared to modern
methods of building a datacenter. It also means that it's not designed to be
changed or modified very often, if ever, and it's a huge forklift upgrade if
the power and fiber layer 1 topology ever needs to change.

[https://home.cern/sites/home.web.cern.ch/files/image/about_s...](https://home.cern/sites/home.web.cern.ch/files/image/about_section_page/2013/01/cern-servers.jpg)

~~~
PaulHoule
They built it in 2002, which was almost 15 years ago.

~~~
Angostura
And probably designed it quite a few years before that.

------
milesward
"Some 6000 changes in the database are performed every second."

Things have changed a lot in 15 years!

[https://cloud.google.com/bigtable/docs/performance](https://cloud.google.com/bigtable/docs/performance)
10k QPS per node...

[https://cloud.google.com/bigtable/pdf/FISConsolidatedAuditTr...](https://cloud.google.com/bigtable/pdf/FISConsolidatedAuditTrail.pdf)
"2.7 Million FIX messages processed and inserted per second"

Disclosure: I work at Google Cloud.

~~~
6nf
Yeah, they talk about using 30 petabytes of data per year. Backblaze can fit
that in half a dozen racks these days. I can't even imagine how much Google
can squeeze into a modern data center. Storage sure moves quickly!
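
Rough sketch of where "half a dozen racks" comes from, assuming 2017-era
60-drive 4U storage pods with 8 TB drives and about ten pods per rack (those
specifics are my assumptions, not Backblaze's published numbers):

    pods_per_rack, drives_per_pod, tb_per_drive = 10, 60, 8  # assumed figures
    pb_per_rack = pods_per_rack * drives_per_pod * tb_per_drive / 1000
    print(pb_per_rack, 30 / pb_per_rack)  # 4.8 PB/rack -> ~6 racks for 30 PB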

~~~
VodkaHaze
I wonder when the fixed costs of upgrading will be less than the operating
costs of inefficient 15-year-old hardware.

~~~
Create
Hardware is regularly replaced, old kit being passed on to poorer nations
(obviously they'll foot the energy bills). It's a regular Intel shop, mostly
with Dell stickers. Truly run-of-the-mill current COTS, what you would expect
anywhere else.

------
discodave
> The Grid runs more than two million jobs per day. At peak rates, 10
> gigabytes of data may be transferred from its servers every second.

So that would be 80 gigabits per second, or the networking capacity of 4 hosts
with 20 Gbit connections (but let's say you had 3x4=12 hosts to be safe).

On AWS you can get m4.16xlarge, p2.16xlarge, x1.32xlarge or r4.16xlarge hosts
with 20 Gbit networking.

Those hosts can be had for 10-50k per year. So that works out to be 120-600k
dollars per year... which doesn't seem like that much in the scheme of things.
120k is less than one developer in the Bay Area!

Of course there would be other costs like networking and storage, but overall
it seems like they could save $$$ in the cloud.
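
The arithmetic behind that range, as a quick sketch (the per-host prices are
just the rough figures quoted above, not real AWS quotes):

    import math

    gbit_per_s = 10 * 8                     # 10 GB/s peak -> 80 Gbit/s
    hosts = math.ceil(gbit_per_s / 20)      # 4 hosts at 20 Gbit each
    hosts_padded = hosts * 3                # 12 hosts with the 3x safety margin
    for yearly_usd in (10_000, 50_000):
        print(hosts_padded * yearly_usd)    # 120000 and 600000 USD per year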

~~~
captainmuon
I don't know the exact numbers, but it is regularly evaluated, and they found
that it won't save money moving to the cloud (yet).

Apparently, grid jobs have different IO/CPU/memory characteristics from
typical cloud applications. My jobs tend to use a lot of CPU and bandwidth,
but are mostly IO-bound. A friend did a very, very CPU-intensive analysis for
his PhD, and they estimated that it would cost $30 million to run it on AWS.
I'm not sure where they got that number from, so take it with a grain of salt,
but even if they are an order of magnitude off, it is still prohibitive.

Another issue we are facing is RAM consumption. Many scientists are not
trained programmers, so there are a lot of memory leaks. It didn't use to
matter: an analysis program ran for only a few hours and was single-threaded
anyway. Now we are moving to multi-threading, and we have been using
multi-processing anyway... And as I understand it, on modern hardware the
RAM-to-CPU ratio is getting lower and lower. If your job or thread needs 8 GB
of RAM, you can't run many of them on a 32-core CPU...
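
For instance (the 128 GiB per node is just an assumed figure):

    cores, node_gib, job_gib = 32, 128, 8        # assumed node size
    jobs_that_fit = node_gib // job_gib          # 16 jobs before RAM runs out
    print(jobs_that_fit, cores - jobs_that_fit)  # 16 jobs, 16 cores left idle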

So yeah, I think the main issue is weird resource usage patterns. Not that it
would be impossible. Grid computing is basically just a weird, parallelly
evolved version of cloud computing, after all.

~~~
Coding_Cat
For the ALICE experiment most jobs are I/O-limited, although throughput should
be improved by Run 3 via backend changes (I hope so at least; I'm currently
doing my MSc thesis on exactly that...).

Simulations are compute-heavy, but analysis tasks vary; given the speed of a
modern processor, future analysis is still likely to be I/O- and
bandwidth-limited. The RAM-to-core ratio on the test server I have access to
is 128 GiB to 20 cores, so 6.4 GiB per core.

But for (future) analysis the plan is to use, or at least try to use, shared
memory and to group users' tasks together based on the required data (they
speak of an analysis train with wagons). So a large part of the RAM
requirement, namely the backing data, should be shared among the cores. I
think ideally there will be a top-level scheduler on each node which tries to
minimize the required bandwidth (but that is outside the scope of my work, so
who knows how it will be implemented). With a few "best practices" it should
be possible for most analyses to consume a 'reasonable' amount of memory at
any given time in that case.
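
A minimal sketch of that shared-backing-data idea, assuming Python workers on
a single node; the event array and the trivial sum() "analysis" are stand-ins
I made up, not anything from the actual ALICE framework:

    import numpy as np
    from multiprocessing import Pool, shared_memory

    def wagon(args):
        # Each "wagon" (one analysis task) attaches to the same shared block
        # instead of holding its own private copy of the backing data.
        shm_name, shape, dtype = args
        shm = shared_memory.SharedMemory(name=shm_name)
        events = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
        result = float(events.sum())   # stand-in for a real analysis task
        del events                     # drop the view before closing the block
        shm.close()
        return result

    if __name__ == "__main__":
        events = np.random.rand(100_000, 16)  # stand-in for shared event data
        shm = shared_memory.SharedMemory(create=True, size=events.nbytes)
        np.ndarray(events.shape, events.dtype, buffer=shm.buf)[:] = events
        args = (shm.name, events.shape, events.dtype)
        with Pool(processes=4) as pool:       # four wagons on one node
            print(pool.map(wagon, [args] * 4))
        shm.close()
        shm.unlink()                          # free the shared block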

------
yigitdemirag
These pictures are taken on one floor, but just one floor below in this
building there is another computer grid, similar to this one but a bit
smaller. On that grid, if I remember correctly, more advanced CPU
architectures (Intel Xeon with AVX-512, etc.) are operating. On the very same
floor there are also robotic arms that fetch offline/long-term-stored
hard drives per user request, which always seemed pretty amazing to me.
(Disclaimer: I was an intern at CERN.)

~~~
batbomb
Those aren't hard drives; it's HPSS (more specifically, a robotic tape
library).

------
spaceboy
What a curious TLD:

    .cern

I want one of those

~~~
Coding_Cat
Go work for CERN, become a member of the personnel, and then meet these strict
requirements:
[http://nic.cern/registration-policy/](http://nic.cern/registration-policy/)
Shouldn't be too hard ;)

Shame they don't let users have a homepage at USERNAME.cern, the way
universities usually give you a homepage at uni.edu/~USER/index.html (or
something).

------
peter303
From the people who invented the World Wide Web and gave it away free to the
world.

------
bouvin
Awesome. I was a summer student there in 1993, and had the opportunity to
visit their data center at the time. What impressed me most was the huge tape
robot in the basement.

~~~
yigitdemirag
It was still there :) I was also a summer student, in 2014. I wonder how far
back the summer studentship goes...

------
peter303
Each collision event impinges on thousands of various sensors which record
energy, charge, and geometry. A tiny fraction of collisions are deemed
potentially interesting and recorded for future analysis. Later a physicist
could propose a particle model and search the records for candidates.

------
rodionos
Lots of Sun machines, including in the database racks. Oracle I assume.

~~~
pjmlp
When I interned there, my room had a pile of dead Sun workstations, maybe
something like a Sun-3; I don't remember the exact model.

------
quakeguy
Stunning amounts I can't wrap my head around!

