
CERN Open Data Portal: Explore more than 2 PB of open data from particle physics - lelf
http://opendata.cern.ch/
======
hi41
What software is being used to return the results? I tried it and the
response feels very quick. 12 PB is such a huge amount of data.

~~~
dingalapadum
First of all, it's 2 PB, not 12. But more importantly, the search doesn't go
through those 2 PB. The search goes through the different 'experiments' (or
whatever you call them), and the dataset for one of those experiments may
easily be hundreds of GB. Your hits are 'experiments', not the content of the
large datasets obtained during those experiments. This reduces the size of the
search index by several orders of magnitude compared to the datasets
themselves. So if, for instance, each dataset were 10 GB on average, you'd
'only' be going through roughly 200,000 entries. So let's make a very
conservative estimate of the lower and upper bounds, say something like 2,000
to 2,000,000 entries (although 2 million datasets/experiments would be A LOT,
like 500 experiments each day since 2008). Anyway, that being said, the search
feels snappy indeed, which is nice. I agree with the other comment that ES is
a good guess.
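
To sanity-check that arithmetic, here is a minimal sketch. It assumes decimal
units (1 PB = 10**15 bytes) and that each dataset is exactly one search hit.
The 10 GB and 100 GB averages are illustrative guesses, not figures from the
portal. Under those assumptions, 2 PB at 10 GB per dataset comes to about
200,000 entries.

```python
# Back-of-envelope check; every input is an assumption taken from the
# comment above, not a figure published by the portal.
GB = 10**9   # decimal gigabyte, in bytes
PB = 10**15  # decimal petabyte, in bytes

def estimated_entries(total_bytes: int, avg_dataset_bytes: int) -> int:
    """Number of index entries if each dataset is one search hit."""
    return total_bytes // avg_dataset_bytes

total = 2 * PB
entries_at_10gb = estimated_entries(total, 10 * GB)
entries_at_100gb = estimated_entries(total, 100 * GB)

# How long it would take to accumulate 2 million entries at 500 per day.
years_for_2m = 2_000_000 / 500 / 365.25

print(entries_at_10gb)         # 200000
print(entries_at_100gb)        # 20000
print(round(years_for_2m, 1))  # 11.0
```

Either way the index holds thousands to hundreds of thousands of records,
which is small for any search engine.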

------
samstave
ELI5: what should anyone be looking for, how, and with what tools?

Also explain why they don't already have an API/tool/integration with *.edu.
Looker, Tableau, Wolfram?

~~~
goldenbeet
Looks like the site has some resources for helping to get started.

I've also got a separate resource from a Meetup talk I went to a while back.
The speaker is an ML engineer who looked into some LHC datasets and posted a
writeup of her talk here:
https://lavanya.ai/2019/05/31/searching-for-dark-matter/

------
marksbrown
Geant4 should have been open source a decade ago.

~~~
rkwasny
What do you mean? Geant4 has always been open source. I did my bachelor's
thesis 15 years ago using it.

~~~
sdwa
Now, FLUKA on the other hand... no idea what their deal is.

~~~
wbl
That it's very useful for studying ultra-small, highly metallic supernovas
is probably part of the issue.

