
ROOT has some features that are unique and powerful.

It’s used in particle physics today mostly because it allows performant out-of-core, on-disk data processing.

With frameworks like Python pandas, you always end up having to manually partition your data if it doesn’t fit in memory. And of course, it’s C++, so by default the data analysis code is pretty performant. This makes a difference when you can iterate your analysis in one hour instead of 20.
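To make the "manually partition your data" point concrete, here is a minimal sketch of what that usually looks like in pandas: reading a file in fixed-size chunks and combining partial results by hand. The column names, the tiny in-memory CSV, and the `chunked_mean` helper are all made up for illustration; only `pd.read_csv(..., chunksize=...)` is an actual pandas feature.

```python
import io

import pandas as pd

# Hypothetical stand-in for a CSV file that is too large to load at once.
csv_data = io.StringIO(
    "event,energy\n"
    "1,10.0\n"
    "2,20.0\n"
    "3,30.0\n"
    "4,40.0\n"
    "5,50.0\n"
)


def chunked_mean(source, column, chunksize):
    """Mean of one column, holding only `chunksize` rows in memory at a time."""
    total = 0.0
    count = 0
    # With chunksize set, read_csv returns an iterator of DataFrames
    # instead of one big DataFrame.
    for chunk in pd.read_csv(source, chunksize=chunksize):
        total += chunk[column].sum()
        count += len(chunk)
    return total / count


mean_energy = chunked_mean(csv_data, "energy", chunksize=2)
print(mean_energy)
```

This works for a simple sum or mean, but anything that needs to see rows across chunks (sorting, joins, group-bys with many keys) means writing the merge logic yourself, which is the bookkeeping the comment is complaining about.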

That being said, when I last worked with it, ROOT was a scrambled mess with terrible interfaces and way too many fringe features, e.g. around plotting, that are better handled by Python nowadays. It even has a C++ command line!!!

I wrote a blog post back then about how I thought it could be fixed: https://www.konstantinschubert.com/2016/06/18/root8-what-roo...

Let's be honest: it's used today because it was used yesterday, and there is a lot of useful legacy code. Not many people like plotting with ROOT, or faffing about with memory allocation.

The reason it started getting used is that in the 1990s, when the current generation of experiments was starting up, C++ was hot and Fortran was not. PAW was old and written in Fortran, so the young ones wanted to work with the new, hip ROOT instead[1].

[1]: https://www.quora.com/Why-does-CERN-use-ROOT/answer/Mario-Al...

Back when I was at HLT, I remember many talking about ROOT but we didn't use it much in TDAQ.

Oh man, that’s too hardcore for me. I bow to your superior inside knowledge and ask for enlightenment on the meaning of HLT & TDAQ.

It is a Google search away. :)

No it isn't.

Not to mention that Pandas and half of the data-crunching tools in Python are actually C/C++ tools with a Python interface anyway.

> With frameworks like Python pandas, you always end up having to manually partition your data if it doesn’t fit in memory.

"Pandas Docs > Pandas Ecosystem > Out of Core" lists a number of solutions for working with datasets that don't fit into RAM: Blaze, Dask, Dask-ML (dask-distributed; Scikit-Learn, XGBoost, TensorFlow), Koalas, Odo, Ray, Vaex https://pandas-docs.github.io/pandas-docs-travis/ecosystem.h...

The dask API is very similar to the pandas API.

Are there any plans for ROOT to gain support for Apache Parquet, and/or Apache Arrow zero-copy reads and SIMD support, and/or https://RAPIDS.ai (Arrow, numba, Dask, pandas, scikit-learn, XGBoost, Spark, CUDA-X GPU acceleration, HPC)? https://arrow.apache.org/

Around 2014 we used ROOT to build predictive models for lending. It was introduced to the company by some physicists. It was good and powerful, but man, was it messy.

Later we briefly moved to R and finally settled into Python and friends.
