CERN's Exabyte Data Center (westerndigital.com)
58 points by kungfudoi 4 months ago | 23 comments



Not much tech content here. They racked a dense array of drives in a JBOD. There's no description of the ingestion pipeline, the system for distributing data across the array, the file system, error correction -- nothing.

An exabyte of disk is definitely eyebrow-raising, but companies like LinkedIn had an exabyte of storage in 2021 just in their HDFS clusters.

https://www.linkedin.com/blog/engineering/open-source/the-ex...


It's hard for me to grok that LinkedIn (or any company, really) has more than 1 exabyte of analytical data.


That's about 1GB per user. If you imagine them storing every interaction anyone has with the site-- every file upload, every page load, every mouse movement-- and throw in some duplication-- it's not completely impossible.
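
Rough numbers, as a hedged Python sketch (the member count is just an assumed order of magnitude, not LinkedIn's actual figure):

    # Per-user share of an exabyte, assuming ~1 billion accounts.
    total_bytes = 1e18   # 1 EB (decimal)
    users = 1e9          # assumed order-of-magnitude member count
    per_user_gb = total_bytes / users / 1e9
    print(f"~{per_user_gb:.0f} GB per user")  # ~1 GB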


Do these logs ever get pruned? Is it worth knowing that 10 years ago, Johnny took 1.5 seconds to click the upvote button? Or is it just easier to keep it, imagining some what-if value extraction?


Isn't 1GB per user absurdly large?


Makes you wonder how much Facebook and Google have per user.


Not for Gmail or Photos, but it does seem that way for LinkedIn, probably, unless they like storing uncompressed logs.


If I remember correctly, JBOD means that if a drive fails, the data on that drive is lost. With that many disks you'd expect an annualized failure rate of roughly 1.7% (going by the popular Backblaze drive statistics).

Isn't it a bit harsh to lose 1.7% of your data every year? Why not a dead simple RAID that can tolerate the loss of a disk without data loss?


Redundancy is handled at a higher layer. They are probably using Ceph on top.


A JBOD is exactly what the initialism says: Just a Bunch of Disks. They're simply enclosures that allow a large number of disks to be hooked up to a server. Any RAID or redundancy would be done on the software side, with Ceph or something similar, as another reply states.
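
As a toy illustration of redundancy being handled in software above the JBOD: a single XOR parity block, RAID-5 style, lets any one lost disk be rebuilt from the survivors. A minimal Python sketch, not how Ceph actually encodes data (Ceph typically uses replication or Reed-Solomon erasure codes):

    from functools import reduce

    def parity(blocks):
        """XOR the blocks together column by column to form one parity block."""
        return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

    # Three "disks" worth of data plus one parity disk.
    disks = [b"AAAA", b"BBBB", b"CCCC"]
    p = parity(disks)

    # Simulate losing disk 1: rebuild its contents from the survivors plus parity.
    rebuilt = parity([disks[0], disks[2], p])
    assert rebuilt == disks[1]
    print("rebuilt:", rebuilt)  # b'BBBB'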


More information on the software side of CERN storage can be had here: https://indico.mathrice.fr/event/143/contributions/174/attac...


CERN has had some interesting storage technology, at least in the past. When I was at Seagate, we were trying to pitch our Kinetic drives (basically a 3.5" HDD with an Ethernet port that speaks a key/value protocol instead of SATA), and CERN was one of the large purchasers of these.

Their datacenter is used for a lot more than just LHC results, too: Zenodo, the open-science repository, also lives in their storage.


The proposed Square Kilometre Array (SKA) will need an exabyte a day: https://arstechnica.com/science/2012/04/future-telescope-arr...
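
Taking the headline figure at face value, a quick Python back-of-the-envelope for what an exabyte per day means as a sustained rate:

    # An exabyte per day as a sustained ingest rate.
    bytes_per_day = 1e18
    seconds_per_day = 86_400
    print(f"{bytes_per_day / seconds_per_day / 1e12:.1f} TB/s sustained")  # ~11.6 TB/s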


That is brain-melting. Just storing it already seems impossible. How do they run analyses over that much data?

Is there ever a point where they can safely delete this data? Or do they have to keep it forever?


If it's anything like the LHC, they will very aggressively prune the raw data, only keeping the parts that are deemed to be of some interest.


Looks like a marketing piece publicizing that CERN is using WD HDD products at scale with no technical details. To make matters worse, the WD product links don’t even work!


CERN publicizing their HGST Ultrastar use [0] in 2013 was what got me started buying their drives, and I never had issues with them. HGST is now part of Western Digital [1].

The last time I had to buy disks I switched to Seagate Exos X and figured I'd keep buying them. I think it was one of the Backblaze Drive Stats reports that made me switch. I like the drives.

So CERN is now running such an advertising campaign again:

> When Bonfillou shared the requirements from the next generation collider, the team suggested testing the company’s new series of JBODs (Just a Bunch of Drives), the Ultrastar hybrid storage platforms.

Since the project is expected to start in 2029, and it will take at least another five years or so for CERN to collect drive stats, that's a long time to wait.

Does anyone here know if WD's Ultrastar drives are still as good as back then, when HGST was HGST? Was it just a brand change, with everything else -- the R&D team, design, production -- staying the same and separate from WD?

[0] https://rog.asus.com/articles/news/hgst-ships-6tb-ultrastar-...

[1] https://www.westerndigital.com/en-us/products/internal-drive...


Turns out the LHC isn't just useful for discovering the Higgs Boson - it also works for advertising hard drives!


A lot of distributed systems innovations came out of the High Energy Physics space.

Nvidia itself was basically bankrolled by DoE labs for much of its early existence, MPI was an HPC project driven by Oak Ridge, the WWW was a side project at CERN to simplify information retrieval, etc.

High Energy Physics is a very data- and computationally heavy problem that lends itself nicely to HPC, and a lot of the innovations in CS subfields like bioinformatics, machine learning, and theory were enabled by this research.


How is all this data not compressible? E.g., XML compresses 20:1 or better using just gzip. Is all this physics data indistinguishable from white noise? Are there incentives to not compress? Don't take the big data out of big science. Then it's no longer big science and we cannot impress the masses.

> 'How's the spacecraft doing?' 'I dunno. All this equipment is just used to measure TV ratings.'
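
On the white-noise question: compressibility comes down to entropy. A quick Python/zlib sanity check with toy data (nothing to do with the actual detector formats):

    import os
    import zlib

    structured = b"<event><hit ch='42' adc='17'/></event>" * 10_000  # repetitive, XML-like
    noise = os.urandom(len(structured))                              # high-entropy bytes

    for name, data in [("structured", structured), ("noise", noise)]:
        ratio = len(data) / len(zlib.compress(data, 9))
        print(f"{name}: {ratio:.1f}:1")
    # Repetitive markup compresses enormously; random bytes barely compress at all.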


What makes you think it's not compressed? (Or that the data is stored as XML?)

There are very sophisticated compression systems throughout each experiment's data acquisition pipeline. For example, this paper [1] describes the ALICE experiment's system for Run 3, involving FPGAs and GPUs to handle 3.5 TB/s from all the detectors. This one [2] outlines how HL-LHC & CMS use neural networks to fine-tune compression algorithms on a per-detector basis.

Not to mention your standard data files are ROOT TFiles with TTrees, which store arrays of compressed objects.

It's all pretty neat.

[1] https://arxiv.org/pdf/2106.03636

[2] https://arxiv.org/pdf/2105.01683


Link 1 shows ALICE compresses down to less than 100 GB/s, from an uncompressed 3.5 TB/s.

The article has a sub-headline:

> A petabyte per second

The math doesn't work out for me unless there are roughly 300 ALICE equivalents at CERN (1 PB/s ÷ 3.5 TB/s); I think there are about 3.


The 'uncompressed' stream has already been winnowed down substantially: there's a lot of processing that happens on the detectors themselves to decide what data is worth even sending off the board. The math for the raw detectors is 100 million channels of data (not sure how many per detector, but there are a lot of them stacked around the collision point) sampling at 40 MHz (which is how often the bunches of accelerator particles cross). Even with just 2 bits per sample, that's 1 PB/s. But most of that is obviously uninteresting and so doesn't even get transmitted.
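
Putting those numbers into a quick Python back-of-the-envelope (channel count, crossing rate, and bits per sample are the figures from the comment above, not official specs; the 3.5 TB/s and <100 GB/s are from the ALICE paper cited earlier):

    # Raw detector rate vs. what actually leaves the readout chain.
    channels = 100e6           # ~100 million detector channels (figure from the comment)
    crossing_rate_hz = 40e6    # 40 MHz bunch-crossing rate
    bits_per_sample = 2        # assumed minimal sample size

    raw_bps = channels * crossing_rate_hz * bits_per_sample / 8
    print(f"raw: {raw_bps / 1e15:.1f} PB/s")                    # ~1.0 PB/s

    readout_bps = 3.5e12       # ~3.5 TB/s after on-detector selection
    compressed_bps = 100e9     # <100 GB/s after the compression chain
    print(f"kept after readout: {readout_bps / raw_bps:.2%}")   # ~0.35%
    print(f"written out: {compressed_bps / raw_bps:.3%}")       # ~0.010%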



