
CERN pushes storage limits as it probes secrets of universe - hugorodgerbrown
http://news.idg.no/cw/art.cfm?id=FF726AD5-1A64-6A71-CE987454D9028BDF
======
jessriedel
The filter is known as the "trigger". The trigger has several levels, so that
the data rate is reduced by an order of magnitude or more at each level. The
lowest levels are done in hardware for speed. The upper levels are in software
for flexibility.
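
As a rough, illustrative sketch of that cascade (the rates and the ~1 MB event size below are ballpark assumptions of mine, not numbers from the article):

    # Rough sketch of the cascading rate reduction through trigger levels.
    # These rates and the ~1 MB event size are assumed ballpark figures,
    # not numbers taken from the article.
    EVENT_SIZE_MB = 1.0

    trigger_levels = [
        ("bunch crossings (input)",        40_000_000),  # ~40 MHz
        ("Level-1 trigger (hardware)",        100_000),  # ~100 kHz
        ("High-Level Trigger (software)",       1_000),  # ~1 kHz to storage
    ]

    for name, rate_hz in trigger_levels:
        data_gb_per_s = rate_hz * EVENT_SIZE_MB / 1e3
        print(f"{name:32s} {rate_hz:>12,} Hz  ~{data_gb_per_s:,.1f} GB/s")

Each level only has to look at what the previous one let through, which is what makes the hardware/software split workable.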

Designing the trigger is extremely complex, since the various detectors within
each experiment have greatly varying response times. Only data from the fast
detectors is available to the low-level trigger.

In addition, the speed of light is a real barrier for the lower levels of the
trigger; by the time the debris from a collision reaches the outer layers of
the experiment (this is usually where the muon chamber is), there has already
been another collision at the center.

(The speed of light is about 1 ft/nanosecond, the radius of the muon chamber
in CMS is about 25 ft, and the time between bunch crossings is about 25
nanoseconds.)
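
In other words, as a trivial back-of-envelope check with the numbers above:

    # Back-of-envelope check using the numbers quoted above.
    speed_of_light_ft_per_ns = 1.0   # ~1 ft per nanosecond
    muon_chamber_radius_ft   = 25.0  # rough CMS outer radius
    bunch_spacing_ns         = 25.0  # design time between bunch crossings

    transit_time_ns = muon_chamber_radius_ft / speed_of_light_ft_per_ns
    print(transit_time_ns)                      # 25.0 ns to reach the muon chambers
    print(transit_time_ns >= bunch_spacing_ns)  # True: the next crossing has already happened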

The design of the trigger is a very important and often contentious process. A
bad trigger will throw out important physics events, and trade-offs can favor
one physics search (e.g. the Higgs) over another (e.g. supersymmetry).

------
Xk
Alright; I'm confused.

First they say that they "generate around 1 petabyte of data per second"

Then they say "ATLAS produces up to 320M bytes per second, followed by CMS
with 220M Bps. The data from ALICE amounts to 100M Bps and LHCb produces 50M
Bps." only that sums up to 690M Bps ... definitely not 1 petabyte per second.
(That is, assuming that 1M Bps means 1 million bytes per second, or just under
1 megabytes per second.)

And then, later on, they talk about a different mode in which "more data is
produced by the four experiments, about 1.25G Bps in total." which is still
not 1 petabyte per second.

What's going on?

~~~
AretNCarlsen
I used to be the sysadmin for a high energy physics lab as we prepared for the
ATLAS experiment to come online. (It was a long wait, following helium
explosions and such.) The reason you see so many different numbers is that
they cannot possibly record the full flow of information. CERN has a very
large buffer into which the collision sensor data is fed initially; that
buffer is analyzed in real time to determine which chunks of data are likely
to contain significant information. Those chunks are kept, and the rest are
discarded.
This bothered a lot of people, since they are probably throwing away
interesting scientific data, but they are limited by current storage
technology.
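
That filtering is also why the numbers in the article look inconsistent. As a rough sketch using the article's own figures:

    # Rough reduction factor implied by the article's own figures:
    # ~1 PB/s generated in the detectors vs. ~1.25 GB/s actually recorded.
    generated_bytes_per_s = 1e15     # "around 1 petabyte of data per second"
    recorded_bytes_per_s  = 1.25e9   # "about 1.25G Bps in total"

    reduction_factor = generated_bytes_per_s / recorded_bytes_per_s
    print(f"only ~1 part in {reduction_factor:,.0f} of the raw data is kept")  # ~800,000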

Further preliminary analysis is performed on the retained data, broadly
categorizing the energy and other characteristics of the collision. That
allows individual physics groups around the world to download only the data
that is likely to pertain to their specific research, e.g. the Higgs boson,
multiple dimensions, etc.
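
A toy illustration of what that categorization looks like (the field names, thresholds, and tags here are invented for the example, not the actual ATLAS scheme):

    # Hypothetical illustration of coarse event tagging: each retained event
    # gets broad tags so analysis groups can fetch only the subsets they care
    # about. Field names and thresholds are invented for this example.
    events = [
        {"id": 1, "total_energy_gev": 240.0, "n_muons": 2, "missing_et_gev": 12.0},
        {"id": 2, "total_energy_gev": 980.0, "n_muons": 0, "missing_et_gev": 310.0},
    ]

    def tag(event):
        tags = []
        if event["n_muons"] >= 2:
            tags.append("dimuon")            # e.g. of interest to a Higgs search
        if event["missing_et_gev"] > 100:
            tags.append("large-missing-ET")  # e.g. of interest to a SUSY search
        return tags

    for e in events:
        print(e["id"], tag(e))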

There was some talk of transferring data via BitTorrent, or perhaps a custom
protocol involving fountain codes, but that never got off the ground. Instead,
the Russians were working on a custom peer-to-peer system with a monolithic,
centralized set of indices, a system which is hopefully working better than it
used to.

P.S. - Here's a hummingbird-speed video of building our prototype fileserver
node for local physics analysis of ATLAS data [before I learned about electric
screwdrivers]: <http://www.youtube.com/watch?v=8y6MpPNqxmw>

------
shaggy
There was a very good and very detailed talk given by Tony Cass from CERN at
the LISA 2010 conference. The talk gives a much more in-depth look at the
environment at the LHC. The link below has the audio, video and slides from
the talk. Look for "The LHC Computing Challenge: Preparation, Reality, and
Future Outlook"

<http://www.usenix.org/events/lisa10/tech/>

------
jevinskie
How long do these experiments run? 1.25GB/s doesn't seem all that bad if the
experiment only runs for seconds to minutes at a time.

~~~
Create
Generally you would fill the machine, then circulate "stable" beam until it
wears out (loses luminosity), then dump what remains and start again. This is
called a fill, and it can last several hours. A fill contains several runs;
some experiments automatically change runs every hour. A run generally
corresponds to a given filter (trigger) setting and detector configuration.
You run 24/7, but the filling cycle (waiting for ramp-up, stable beam, etc.)
gives you "idle" overhead. Furthermore, you are "lucky" to reach flat-top
stable beam at all: magnets can quench, power supplies will trip, etc., giving
unscheduled downtime. Then there is scheduled downtime (like just recently),
which can last years :)

So all in all, you would get a few dozen weeks of real operations a year,
which would include some stable beam (if you aren't into beam gas studies or
cosmics).
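
For a rough sense of volume at the ~1.25 GB/s quoted in the article (the fill length, weeks of running, and duty factor below are my own assumptions, not figures from the thread):

    # Back-of-envelope data volume at the ~1.25 GB/s quoted in the article.
    # Fill length, weeks of running, and duty factor are rough assumptions.
    rate_gb_per_s  = 1.25
    fill_hours     = 10    # a fill "can last several hours"
    weeks_per_year = 30    # "a few dozen weeks of real operations a year"
    duty_factor    = 0.3   # assumed fraction of that time spent in stable beam

    per_fill_tb = rate_gb_per_s * fill_hours * 3600 / 1e3
    per_year_pb = rate_gb_per_s * weeks_per_year * 7 * 24 * 3600 * duty_factor / 1e6
    print(f"~{per_fill_tb:.0f} TB per fill, ~{per_year_pb:.0f} PB per year")

So even a single multi-hour fill adds up to tens of terabytes.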

~~~
AretNCarlsen
Do you know what the actual average uptime has been per operation? I have
never seen that number.

~~~
Create
To be fair, the metric for benchmarking is delivered vs. recorded luminosity:
the machine delivers, and the detector records. It is this efficiency that
funding agencies are shown in the periodic reports.

Strictly speaking, (down)time can be "irrelevant", in the sense that with
higher luminosity you get more data (the LHC can catch up with the Tevatron
easily). You can have five-nines uptime with a single bunch circulating, or
with lots of bunches of particles (the beam is not continuous; it comes in
trains of bunches). So one thing is that you also go for as many bunches as
possible...

But the periodic reports are on cdsweb, because of the public funding
agencies, i.e. for the machine itself, setting an upper bound: "Downtime
statistics over the 2010 run" -- Chamonix 2011 Workshop on LHC Performance,
Chamonix, France, 24-28 Jan 2011, pp. 70-74. Then the DAQ of your experiment
of choice comes on top of this...
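
As a minimal sketch of that benchmark (the luminosity numbers are invented for illustration):

    # The benchmark described above: recorded / delivered integrated luminosity.
    # The numbers are invented for illustration (in inverse picobarns).
    delivered_pb_inv = 48.1
    recorded_pb_inv  = 45.0

    efficiency = recorded_pb_inv / delivered_pb_inv
    print(f"data-taking efficiency: {efficiency:.1%}")  # ~93.6%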

