Then, since ROOT violates three basic principles of OO (encapsulation, inheritance, virtuality), we are compelled to conclude that ROOT can't be considered OO software. ROOT is a bright example of people having jumped to C++ but totally missed the point of OO. At least it will probably stay in the history of software because of that.
What could be the improvements in a ROOT major revision ?
o at least fix the name ! Is it ROOT, Root, root ? (Hell, we are pretty sure that any Bazaar model software has at least converged on that !)
o have then a correct namespacing of classes and libs.
o restore encapsulation (then get rid of the g pointers).
o revisit the inheritances. At least have a good histogram class. And arrange the storage area to be stable (then "fix" the TTree). And please, have an introspection class that looks like an introspection class.
o use pure abstract interfaces to separate domains. And stick strongly to the idea to have them pure.
o etc, etc, etc, etc, etc, etc, etc, etc, etc, etc,...
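To make the encapsulation and "pure abstract interface" points a bit more concrete, here is a minimal sketch. It is in Python rather than C++ purely to keep it short, and every name in it (`Histogram`, `Storage`, `FixedBinHistogram`, `InMemoryStorage`, `run_analysis`) is hypothetical; none of them is a ROOT class. The histogram knows nothing about storage, and the storage is passed in explicitly instead of being reached through a global pointer.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List


class Histogram(ABC):
    """Pure interface for the histogram domain: no storage, no drawing."""

    @abstractmethod
    def fill(self, x: float) -> None:
        """Add one entry at value x."""

    @abstractmethod
    def counts(self) -> List[int]:
        """Return the per-bin counts."""


class Storage(ABC):
    """Pure interface for the persistency domain."""

    @abstractmethod
    def write(self, name: str, histogram: Histogram) -> None:
        """Persist the histogram contents under the given name."""

    @abstractmethod
    def read(self, name: str) -> List[int]:
        """Return the counts previously stored under the given name."""


class FixedBinHistogram(Histogram):
    """One concrete histogram; its state is private to the instance."""

    def __init__(self, nbins: int, low: float, high: float) -> None:
        if nbins <= 0 or high <= low:
            raise ValueError("need nbins > 0 and high > low")
        self._low = low
        self._high = high
        self._width = (high - low) / nbins
        self._counts = [0] * nbins

    def fill(self, x: float) -> None:
        if self._low <= x < self._high:  # out-of-range entries are dropped
            index = int((x - self._low) / self._width)
            # Guard against floating-point rounding at the upper edge.
            self._counts[min(index, len(self._counts) - 1)] += 1

    def counts(self) -> List[int]:
        return list(self._counts)  # return a copy, not the internal list


class InMemoryStorage(Storage):
    """One concrete storage backend; a file-based one could replace it."""

    def __init__(self) -> None:
        self._data: Dict[str, List[int]] = {}

    def write(self, name: str, histogram: Histogram) -> None:
        self._data[name] = histogram.counts()

    def read(self, name: str) -> List[int]:
        return list(self._data[name])


def run_analysis(values: Iterable[float], histogram: Histogram,
                 storage: Storage, name: str) -> None:
    """Depends only on the two interfaces, never on a global."""
    for value in values:
        histogram.fill(value)
    storage.write(name, histogram)


storage = InMemoryStorage()
run_analysis([0.5, 1.5, 1.7, 9.0], FixedBinHistogram(4, 0.0, 4.0),
             storage, "energy")
print(storage.read("energy"))
```

Because `run_analysis` only sees the two interfaces, either side can be swapped out without touching the other.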
Before beginning, I should point out that these are simply my own views and that I hold no animosity against the developers; their design simply doesn't work for me. Presumably there are many people "out there" who think ROOT an excellent piece of software. In complete honesty, though, I have yet to meet any of them. In fact, I've never had any complaints that this article misrepresents ROOT, and I've had a fair bit of "fan mail", not to mention discussions with well-respected developers and physicists who hold precisely the same views :-)
It was okay for a time, but that time has long passed.
There are better options. Don't use it unless you are in HEP.
The problem is that ROOT still has a few very specialized features that its users still need and you can't get elsewhere. And there are a ton of legacy analysis tools built on top of it that are difficult to port because of how ROOT is. And a lot of its more extensive users are comfortable with it and have no motive to change (they're busy with being scientists).
I don't know anybody who actually likes ROOT, but it also won't be going away any time soon.
I think there are a few objectively neat features of ROOT:
* Versioned persistency of C++ objects deriving from the TObject base class;
* Script-like execution of C++ and a C++ REPL based on clang; and
* Dynamic bindings of the C++ classes to Python.
There's an accompanying, but independently developed, file access protocol for reading and writing ROOT files over a network, too.
On the other (subjective) hand, ROOT is regarded as a pain to use by ‘analysts’, the people who use ROOT to make the results that go into physics papers. There are already some good, old-but-still-valid critiques [5, 6], so I won't say too much, but I think a large part of the problem comes from two things:
1. ROOT tries its best to do everything that a particle physicist might want to do. This encompasses a very wide range of things, and this has led to ROOT having a very large, often intractable codebase that cannot be modularised.
2. It has failed to keep up with contemporary coding techniques and analysis methods. Most of the PhD students I know use the Python interface to ROOT, and yet the ROOT developers are planning to drop Python support for the next major version (ROOT 7, which is expected in 2018). Those that do use C++ aren't able to use even C++11 effectively with ROOT, as its interfaces aren't compatible.
Luckily, I'm confident that analysts will move to a better way. I've been very encouraged by the astrophysics and machine learning communities in particular, who are using Python to do low- and high-level analysis on large datasets, as we do in particle physics, and are producing fantastic results. Tools like pandas, matplotlib, and scikit-learn are an absolute joy to use in comparison with ROOT, and the communities within the Python ecosystem are wonderful: they foster very open code development, and value readable, well-documented, fast code.
I don't need ROOT to get any better, because I think the future is already here.
* HEP stores about 0.5 exabytes of data in ROOT format; that's almost exclusively serialized objects that do not know anything about TObject.
* No way will the python binding be dropped. I wonder where you got that rumor from. About one third of our users is using it.
* HEP is limited by CPU resources, which is part of the reason why HEP decided to use a close-to-bare-metal language for the number crunching part.
* We just made the use of python and R multivariate analysis tools with ROOT data more straightforward.
* We have people from genomics etc coming to ask for help, because they cannot find a system that scales as well as ROOT does.
And then we have a different perception of the direction out there. I see that Hadoop was nice but slow, Spark is nice but slow, so now things are moving to C++, see e.g. ScyllaDB. There is no reason for us to move away from it, but every reason to make it more usable.
And yes, I agree that this is an issue. But many physicists do not.
* Physicists still don't like pyroot interfaces, otherwise rootpy wouldn't exist.
* astropy is proof that you can be performant and user friendly. Julia is proof that you don't even need a C++ library underneath.
* Saying ROOT scales well is weird; it is true that ROOT and the ROOT IO/ROOT files are efficient, but additional services have helped it scale (dCache, XRootD, batch farm/grid/DIRAC, etc...)
* Not sure what the ScyllaDB tangent has to do with anything. There are scalable open source RDBMS options out there too, like CitusDB and Greenplum, which support UDFs. Hadoop and Spark with HDFS are still great for certain applications and as general data analysis tools, but it's tricky to really get them to perform well without HDFS, and the grid model of computing doesn't lend itself well to that paradigm.
* I've heard the C++ interpreter is much better with Cling (if that's you, I applaud your effort!) CINT was a gun that fired in both directions for every grad student I ever had to help.
* XRootD has little to do with ROOT anymore other than it also implements the original root protocol.
* ROOT is not modular. It is both an application and a collection of libraries and somewhat of a VM. That does make some things convenient, but it also makes some things extremely hard.
There are many reasons to move away from ROOT, and the astrophysics community is a prime example of that!
Speed is always a concern, but I don't think it dictates that C++ should be the primary ‘user-facing’ interface. Numpy is fast, but it doesn't sacrifice a nice API to achieve it.
Personally, a big difference is that a lot of the Python packages feel fast to use and, most importantly, to write. ROOT can be fast to execute, no question, but I feel like I'm fighting against it (and I'm sorry that's very vague and qualitative).
It would be very interesting to hear more about the genomics use-case, and how they evaluated the other options.
There are serious bugs in RooFit which haven't been fixed in years. Wouter Verkerke has abandoned it (from what I can tell). Lorenzo Moneta is fixing the worst potholes, but it seems he has no authority or no time to tackle the misleading interface and the broken scaffolding of RooFit.
Maybe ROOT7 will be a chance to take ownership of RooFit again.
Nice to see it on HN.
So also not that much use of PAW as well.
But PAW was earlier, in the KLOE experiment for my graduation thesis.
Look, ROOT is a very complex framework for data gathering and analysis built by physicists, and it shows every step of the way. The bugs are everywhere, and it does really weird things like setting global variables when you analyze some piece of data, changing your results for all subsequent analyses (this particular bug cost me about 2 weeks).
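That failure mode is easy to reproduce in miniature. The following is a toy sketch, not ROOT code: the `_current_scale` global and both `analyze_*` functions are made up for illustration. It only shows how a call that silently mutates shared state changes the result of every later call, and how passing the setting explicitly avoids that.

```python
# Hidden module-level state, standing in for a framework-wide global.
_current_scale = 1.0


def analyze_with_global(values, scale=None):
    """Return the scaled mean; silently remembers `scale` for later calls."""
    global _current_scale
    if scale is not None:
        _current_scale = scale  # side effect: leaks into all later calls
    return _current_scale * sum(values) / len(values)


def analyze_explicit(values, scale=1.0):
    """Same computation, but the result depends only on the arguments."""
    return scale * sum(values) / len(values)


data = [1.0, 2.0, 3.0]

first = analyze_with_global(data)        # uses the default scale
analyze_with_global([10.0], scale=2.0)   # an unrelated analysis in between
second = analyze_with_global(data)       # same input, but the scale has changed

print(first, second)                     # the two results differ
print(analyze_explicit(data), analyze_explicit(data))  # these two agree
```

Nothing in the second call to `analyze_with_global(data)` hints that its answer depends on what ran before it, which is what makes this kind of bug so expensive to track down.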
And in the end, there isn't really any point in using ROOT.
- Data gathering can be done with a simple CSV (binary if you wish), a more advanced SQL database, or in the realm of research with the venerable HDF5 format.
- Data analysis in C++ or any compiled language just doesn't make much sense. You can use Python or R. The libraries to read and treat data are optimized and will make the process much less error prone and probably faster in the end.
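As a minimal sketch of the first point, using only Python's standard library: the snippet below writes a few made-up measurements as CSV text, reads them back, loads them into an in-memory SQLite database, and runs a simple aggregate query. The column names and values are invented for illustration; HDF5 needs a third-party package such as h5py and is not shown.

```python
import csv
import io
import sqlite3

# Made-up measurements: (run label, energy).
rows = [("run1", 0.5), ("run1", 1.5), ("run2", 4.0)]

# Write the data as CSV into an in-memory text buffer
# (a real script would open a file instead).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["run", "energy"])
writer.writerows(rows)

# Read it back, converting the energy column to float.
buffer.seek(0)
records = [(r["run"], float(r["energy"])) for r in csv.DictReader(buffer)]

# Load the same records into SQLite and aggregate per run.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (run TEXT, energy REAL)")
conn.executemany("INSERT INTO hits VALUES (?, ?)", records)
mean_by_run = dict(
    conn.execute("SELECT run, AVG(energy) FROM hits GROUP BY run")
)
conn.close()

print(mean_by_run)
```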
Seriously, don't make the same mistakes as I did just because some older people in your lab use ROOT and you feel compelled to do it as well. There are much better tools for the job and I regret not searching for them before wasting about 6 months of my PhD thesis trying to integrate ROOT in my research workflow.
I partially agree: don't use it as a framework, but do use its libraries, they are good!
I will say that it is nice that it has almost any math function you will need. I know people who get super frustrated when they can't find a Landau distribution in whatever language/library they are using and then just go back to ROOT at the end of the day.
I'd rather use a programming language with a REPL that gives me the option to compile to native code, instead of being forced to write extensions in another language.
Plenty to choose from, doesn't need to be C++.
I think that HNers that bash J2EE and JEE designs never had the "pleasure" to enjoy mid-90's C++ OO frameworks.