The fact that someone runs statistics calculations in Fortran (a language whose most recent standard is dated 2008) should be shocking only to the recent generation of coders who have become "code hipsters": the guys looking for the Next Big Thing™ in programming languages.
These are the guys trying to cram Python into every possible goddamned use case, whether that be embedded systems or computational fluid dynamics. Whether it's the right tool for the job or not. The same guys who are implementing a Ruby interpreter in Haskell, running on a VM written in Rust. Why? Because damn it, their language of choice is shiny and new, and therefore it must be the best.
Nope. You use the best tool for the job. If you're already familiar with a language and that language is perfectly well-suited for the task at hand (in this case mathematics) then you use that language.
Fortran is old, which means it's seen a lot of development. The compiler optimizations are very, very efficient and Fortran code runs extremely fast. In fact, Fortran was explicitly designed for running mathematics algorithms on computers.
tl;dr Fortran is actually a great choice for this application and I hate hipsters.
Fixed that for you.
Yes, Fortran is probably the right tool for the job in this case, but I also think that creative experimentation with programming languages, the invention of new languages, and new implementations of and enhancements to existing languages are good and important, and not just a "hipster" thing. All modern programming languages and their implementations have warts and deficiencies, and building a new and improved one requires a level of skill and dedication that goes far beyond hipsterism.
The work of stats guru Jeff Sagarin.
Fortran can run very fast. If all you need to do is numerical calculations, that might be enough. But that's never all you need to do: you need to preprocess your input data, or deserialize it, normalize it, join it, or transform it based on some RDBMS data; then you do your calculations; and then you need to graph it, serialize it, etc. Most of those tasks are somewhere between excruciating and impossible in Fortran, however modern a dialect you use.
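To make the glue-work point concrete, here's a minimal sketch of that preprocessing step in stdlib Python (the CSV text, column name, and z-score normalization are all hypothetical stand-ins, not anyone's actual pipeline):

```python
import csv
import io
import statistics

# Hypothetical raw input: a tiny CSV of team scores (stand-in for real data).
raw = "team,score\nA,71\nB,64\nC,83\n"

rows = list(csv.DictReader(io.StringIO(raw)))
scores = [float(r["score"]) for r in rows]

# Normalize to z-scores before handing the clean array off to the numeric
# kernel -- which, in this thread's scenario, would be compiled Fortran.
mean = statistics.mean(scores)
stdev = statistics.stdev(scores)
zscores = [(s - mean) / stdev for s in scores]

print(zscores)
```

A few lines of parsing and normalization like this are exactly the surrounding chores being described; the heavy arithmetic itself is a separate concern.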
Have I disputed this fact somewhere?
>The truth is, for a lot of numeric computing, you're better off using python, which calls fortran or c code to do the heavy lifting.
Proof that Python can't get the job done in that particular area. Nevertheless there remain Python zealots who will tell you that it's the best for everything.
>Fortran can run very fast. If all you need to do is numerical calculations, that might be enough.
Actually it sounds like that is all he needs to do. Reading and spitting out a .csv file from Fortran is trivial, despite its clunky IO syntax.
Yes, when you were complaining that "These are the guys trying to cram Python into every possible goddamned use case".
Proof that Python can't get the job done in that particular area.
Python can't run at all without C code. Fortunately, Python interfaces really well with C and C++ and Fortran. The fact that Python lets you be way more productive using Fortran tools without having to know or suffer from Fortran's glaring deficiencies seems like a huge plus to me.
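The interop point is easy to demonstrate with nothing but the standard library: a minimal ctypes sketch calling the C math library's `sqrt` directly (the Fortran story, via tools like f2py, uses the same kind of FFI machinery; the `libm.so.6` fallback name is a Linux-specific assumption):

```python
import ctypes
import ctypes.util
import math

# Load the C math library; fall back to the common Linux soname if
# find_library can't resolve it on this system.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")

# Declare the C signature so ctypes marshals doubles correctly.
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

print(libm.sqrt(2.0))  # matches math.sqrt(2.0)
```

No extension module, no build step: that low-friction bridge to compiled code is the "huge plus" being claimed.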
Reading and spitting out a .csv file from Fortran is trivial, despite its clunky IO syntax.
Not really. People who do serious analysis need to make graphs. They need to push data back into SQL databases. They need to do all sorts of interactive analyses. They need to present their results and analytics to other people, including reviewers. They need to debug their analysis. All of that stuff is much easier with python than fortran.
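As a sketch of the "push data back into SQL and poke at it" workflow, here's the whole round trip in stdlib Python (the CSV content, table name, and win-percentage query are invented for illustration):

```python
import csv
import io
import sqlite3

# Hypothetical input data -- a stand-in for real analysis output.
raw = "team,wins,losses\nDuke,28,4\nUNC,25,7\n"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (team TEXT, wins INT, losses INT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [(r["team"], int(r["wins"]), int(r["losses"]))
     for r in csv.DictReader(io.StringIO(raw))],
)

# The kind of ad-hoc query that is painful to bolt onto a pure Fortran pipeline.
best = conn.execute(
    "SELECT team FROM records ORDER BY wins * 1.0 / (wins + losses) DESC LIMIT 1"
).fetchone()[0]
print(best)
```

Parsing, a database, and an interactive query in a dozen lines is the contrast being drawn with Fortran's IO story.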
Which is where the aforementioned CSV files come in handy. Heard of GNUplot?
In large-scale number crunching, like climate models or numerical weather prediction, the typical case is that the input data is conceptually a regular 2D or 3D grid, stored in binary format files (like NetCDF or HDF), as that is more efficient and saves space.
Then the heavy lifting number crunching code runs on a cluster as a batch job, reads in the data, crunches the numbers, and writes results out again in NetCDF or HDF files.
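A minimal stand-in for that binary-grid pattern, using only Python's stdlib `array` module (real codes would use NetCDF/HDF, which carry metadata, units, and coordinates that raw binary lacks; the grid shape and values here are invented):

```python
import array
import os
import tempfile

ny, nx = 3, 4  # a tiny regular grid (real models use far larger ones)

# Write the grid as a flat block of doubles -- conceptually what the
# crunching code reads, minus NetCDF/HDF's self-describing metadata.
grid = array.array("d", (float(j * nx + i) for j in range(ny) for i in range(nx)))
path = os.path.join(tempfile.mkdtemp(), "grid.bin")
with open(path, "wb") as f:
    grid.tofile(f)

# Read it back; with raw binary the shape (ny, nx) must be known out of
# band, which is exactly the bookkeeping NetCDF/HDF handle for you.
loaded = array.array("d")
with open(path, "rb") as f:
    loaded.fromfile(f, ny * nx)

print(loaded == grid)  # True
```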
The output files are then downloaded to a desktop PC, and graphing is done with Matlab (Python is also getting more popular) or, especially in meteorology, with dedicated meteorology graphing software.
The binary-format input and output is probably about as efficient as it can be. Also, heavy number-crunching scientists probably don't have much use for relational databases.
On the subject of being the best language for the job, Fortran does not use 10x more lines than more modern languages. In fact, in many cases it uses less. When I was working on real-time signal processing algorithms we used to prototype algorithms in Fortran, then implement them in C. I also used to use Python with SciPy and NumPy for analysis, as well as Matlab. Matlab's huge toolbox library notwithstanding, the Fortran implementations were usually shorter, simpler, and easier to understand than any of the other languages. It was only bias and preconceived notions that kept it from being used in the real system. I would have loved to see Fortran subroutines running the algorithms underneath the C distributed framework.
That is, until someone in management and the customer team decided to have the scientists write their algorithms directly in the real-time C. But that is a rant for another day.
As for the first point: a) I touched on the fact that Fortran is a modern language, and b) unless you've got examples showing two side-by-side implementations of the same algorithm with the Fortran version being 10x longer, it sounds like you're just another of the aforementioned hipsters, averse to something simply because it is not new and shiny.
Fortran is like a fast, typed, compiled Matlab (with a smaller standard library, but a lot of numerical libraries available all over the internet).
But yes, for exploratory data analysis, something interactive like NumPy + Matplotlib, Matlab, or R might be better.
But NumPy + matplotlib is kind of painful to set up for interactive use at first, R the language is wonky, and Matlab costs a lot of money (and the free alternative, Octave, has much more limited graphics, which are kind of ugly, too).
Of course, Fortran has no graphing capabilities, so one needs some tool for that anyway.
This was written in 1983: http://www.pbm.com/~lindahl/real.programmers.html
It contains among other things the sentiment "Unix is a glorified video game."
Most of the things you are using right now used to be hipster.
Old language != Bad language
"SI: Is it true you still code in Fortran?
JS: Yeah, what’s wrong with that? It’s a good language. Fortran is real good for doing mathematics and running it real quickly. I’m not doing Photoshopping or anything like that. I’m just running numbers."
I don't think it's stupid or old, but it's certainly interesting: in the world where hopping from one newest technology to another is normal, somebody sticking to what they are used to is noteworthy. Some other examples are Bob Staake using Photoshop 3.0 (http://www.bobstaake.com/pixfix/films.shtml) or OpenBSD using CVS (hehe) and writing their own version of it.
Maybe that's what the submitter thought.
(Yes, Fortran is still updated, but it's certainly not the cutting edge tech. Or maybe it will be, after this HN submission.)
In a way, array slicing syntax has been with us since Algol 68, so it's not cutting edge. But still, Wikipedia only lists Algol, Matlab, R, Fortran, Sinclair Basic, Ada, Perl, Python and then coming to modern times, D and Go, as having that feature.
So it kind of feels like cutting edge in times when I'm lucky enough to be writing in a language that supports it.
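For what it's worth, the slicing being discussed looks like this in Python (note the index convention differs: Fortran array sections are 1-based and inclusive, Python slices are 0-based and half-open, so a Fortran `a(2:5)` corresponds to `a[1:5]` here):

```python
a = list(range(10))  # 0..9

print(a[1:5])        # [1, 2, 3, 4] -- roughly Fortran's a(2:5)
print(a[::2])        # [0, 2, 4, 6, 8] -- strided slice, like a(1:10:2)
print(a[::-1][:3])   # [9, 8, 7] -- reverse, then take the first three
```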
On the assumption that you're talking about programmers, I suspect it isn't. I suspect the majority of the world's programmers stay with the same few tools for years and years, sometimes (but not always) upgrading a version when they're forced to.
My question about using Fortran as the main infrastructure is that this does not sound like a big data problem. It sounds more like an exploratory data analysis and model development problem. In which case a REPL-based environment like Python, Matlab, or R would have a lot to offer: interactivity, integrated plotting, modeling toolboxes.
People in my group do scientific computing with Fortran, large datasets, and big iron, and it works great in that domain. As people in this thread have noted, there is significant support for automated parallelization. I used to snicker at these dinosaurs (their own self-deprecating term), but I have noticed their publication record seems to be piling up faster than mine. I have therefore gotten over it, and hope that the dinosaurs are not snickering back at me.
There is a lot of scientific and industrial code (in astronomy, aeronautics, ...) that is in Fortran. And I'm not talking about legacy stuff that is only in maintenance mode. These codes are still evolving and get new functionalities. They're not in Fortran only for legacy reasons (though that plays some part). They're in Fortran because Fortran is still the right tool for the job.
Scientific computing, where IO is pretty much never the bottleneck, is a completely different world from web development. Python/NumPy is usable for some tasks, but only because it delegates a lot of stuff to underlying Fortran code.
Numpy's (and matplotlib's, actually) core is written in C.
Scipy is a different story, however. A lot of scipy's internals are written in Fortran.
That having been said, Fortran is a _very_ nice language for a significant portion of scientific programming tasks. IO in Fortran is rather painful, though, IMO. Tools like f2py (which still doesn't really support fully modern Fortran, unfortunately) make it very easy to combine Fortran and Python, for whatever it's worth.
Not all number crunching is the same.
It was interesting that the NCAA doesn't include scores in their methodology, just wins and losses. Is that because they don't want to officially encourage teams to run up the score? It doesn't stop teams from doing that, but perhaps that's because they are trying to impress other audiences.
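To see why a wins-only rating can't reward running up the score, here's a toy comparison. This is NOT the NCAA's or Sagarin's actual formula (neither is given in the thread); the games, teams, and both rating rules are invented purely to illustrate the difference:

```python
# games: (winner, loser, margin of victory) -- all made up for illustration
games = [("A", "B", 30), ("B", "C", 2), ("A", "C", 1)]
teams = {"A", "B", "C"}

# Wins-only rating: win fraction, blind to margin.
wins = {t: 0 for t in teams}
played = {t: 0 for t in teams}
for w, l, _ in games:
    wins[w] += 1
    played[w] += 1
    played[l] += 1
win_pct = {t: wins[t] / played[t] for t in teams}

# Margin-aware rating: average signed margin, which DOES reward blowouts.
margin_sum = {t: 0 for t in teams}
for w, l, m in games:
    margin_sum[w] += m
    margin_sum[l] -= m
avg_margin = {t: margin_sum[t] / played[t] for t in teams}

print(win_pct)     # A leads either way...
print(avg_margin)  # ...but the 30-point blowout drags B below winless C here
```

In this contrived data, B beat C head-to-head yet rates below C on average margin, purely because of one lopsided loss: exactly the kind of score-chasing incentive a wins-only methodology avoids creating.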
It's parallelisable - if that's a word.
Admittedly the stuff I have seen is all written without knowledge of the maxim "write your code as if the next person to fix your code is a mad axe murderer who knows where you live."
But they still unlock the secrets of the universe with it, even with unusual naming conventions.