Multiprocessing in Python with Fortran and OpenMP (admin-magazine.com)
73 points by rbanffy 11 months ago | 23 comments

OpenMP is great and all, but you're going to have an awfully hard time beating numpy on either speed or correctness, since it's using LAPACK under the hood already.

They're very different tools. Your system's LAPACK is probably crazily optimized (e.g. ATLAS or MKL) [1], so if you can reduce your problem to LAPACK subroutines, that's probably the way to go. If you have lots of cores and your problem doesn't neatly reduce to LAPACK, OpenMP will probably help you. In either case, you're well-served spending some quality time with a profiler, and possibly fighting your compiler's optimizer.

[1] In case you're not familiar with it, ATLAS runs a bunch of small programs to figure out what approach is fastest on your specific machine, trying different instructions and memory access patterns for different array/matrix sizes. Cool stuff.
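
To make the empirical-tuning idea concrete, here is a minimal sketch in pure Python. It only illustrates the approach (time several variants on the machine at hand, keep the fastest) and is not how ATLAS itself is implemented. The names `blocked_sum` and `pick_fastest_block` and the candidate block sizes are made up for the example.

```python
import timeit


def blocked_sum(values, block):
    """Sum `values` in chunks of `block` elements (hypothetical kernel)."""
    total = 0.0
    for start in range(0, len(values), block):
        total += sum(values[start:start + block])
    return total


def pick_fastest_block(values, candidates, repeats=3):
    """Time each candidate block size on this machine and return the fastest."""
    timings = {}
    for block in candidates:
        # Keep the best of several runs to reduce timing noise.
        timings[block] = min(
            timeit.repeat(lambda: blocked_sum(values, block), number=5, repeat=repeats)
        )
    best = min(timings, key=timings.get)
    return best, timings


data = [float(i) for i in range(20_000)]
candidates = [16, 256, 4096]
best_block, timings = pick_fastest_block(data, candidates)
print("fastest block size on this machine:", best_block)
```

The winner can differ from machine to machine and even from run to run, which is the point: the choice is measured rather than assumed.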

False. OpenMP does shared-memory parallelism and has nothing to do with LAPACK. numpy is completely serial.

Numpy (or rather, BLAS) is running many threads in parallel. The amount of concurrency is actually configurable, and is something well worth optimizing for multiprocessing workloads.
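
A minimal sketch of one common way to cap BLAS threading: set the thread-count environment variables before NumPy is first imported. Which variable actually applies depends on the BLAS your NumPy was built against (OpenBLAS, MKL, or an OpenMP-based build), so this sets all three. The helper name `limit_blas_threads` is made up for the example.

```python
import os


def limit_blas_threads(n):
    """Set the usual BLAS/OpenMP thread-count variables to `n`.

    Call this before NumPy (or anything else that loads BLAS) is imported,
    because the libraries typically read these variables when they load.
    """
    names = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")
    for name in names:
        os.environ[name] = str(n)
    return {name: os.environ[name] for name in names}


# Example: one BLAS thread per worker when many processes already run in parallel.
settings = limit_blas_threads(1)
print(settings)
```

With multiprocessing, N worker processes each running an N-thread BLAS can oversubscribe the cores, so one thread per worker is a reasonable starting point to measure from.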

Numpy can be compiled against a multi-threaded BLAS, for example MKL.

I've seen OpenMP-based, non-BLAS implementations beat out MKL, on Intel hardware no less. We emailed the MKL folks and they just shrugged. :) Don't presume that MKL is the lower limit on performance.

Anaconda Python is a great option for exactly that. MKL gives great multicore performance. Many packages, e.g. scikit-learn, dask and others, lean on numpy for performance, and are thus much faster.

Yes it can, in which case you embed OpenMP parallelism in low-level library calls, which is another way to accomplish what the article suggests.

But it's still a fine-grained parallelism so numpy can match plain Fortran performance in special cases (and not the other way around).

At this point in time, Fortran is basically a domain-specific language, analogous to using regular expressions for text processing. It is a nightmare to write full applications in it. The f2py approach to Fortran is undervalued IMHO, as Fortran is brilliant for certain types of simulation, and much easier to write and optimize than C/C++.

A number of languages most naturally live in the Domain-Specific Language (DSL) world, and only left it because, through some period of history, some groups didn't have much else. Fortran was a general-purpose language for a while, because it was the only language other than BASIC or assembly available at some sites, and it got roped into doing precisely the kind of text-manipulation stuff it's historically been bad at. If someone is writing a text adventure in Fortran, you know they don't have any other option.

Cobol is a DSL for record-oriented batch processing, which is extremely useful for some business tasks and not very useful for most of the rest of the programming world. RPG (Report Program Generator) is, if anything, even more restricted than Cobol, but it never "escaped" to the extent Cobol did so it never got the same kind of derision heaped upon it.

C is most naturally a DSL for OSes and what used to be called "systems programming": Software that has to run at machine speed, and can sacrifice usability and debuggability for pure speed. It was never a good choice for most of what got written in it, but, again, C compilers were available when other development tools were not, and, on a slow enough computer, all programming becomes "systems programming".

C compilers only became generally available outside expensive UNIX workstations around the mid-80s; until then there were dialects like Small-C for CP/M systems, where C was just yet another language.

Systems programming languages have existed since at least 1961, with ESPOL being one of the first ones. IBM, for example, did their RISC research with PL/8.

As for pure speed, that is the result of 40 years of research in taking advantage of UB.

"Oh, it was quite a while ago. I kind of stopped when C came out. That was a big blow. We were making so much good progress on optimizations and transformations. We were getting rid of just one nice problem after another. When C came out, at one of the SIGPLAN compiler conferences, there was a debate between Steve Johnson from Bell Labs, who was supporting C, and one of our people, Bill Harrison, who was working on a project that I had at that time supporting automatic optimization...The nubbin of the debate was Steve's defense of not having to build optimizers anymore because the programmer would take care of it. That it was really a programmer's issue....

Seibel: Do you think C is a reasonable language if they had restricted its use to operating-system kernels?

Allen: Oh, yeah. That would have been fine. And, in fact, you need to have something like that, something where experts can really fine-tune without big bottlenecks because those are key problems to solve. By 1960, we had a long list of amazing languages: Lisp, APL, Fortran, COBOL, Algol 60. These are higher-level than C. We have seriously regressed, since C developed. C has destroyed our ability to advance the state of the art in automatic optimization, automatic parallelization, automatic mapping of a high-level language to the machine. This is one of the reasons compilers are ... basically not taught much anymore in the colleges and universities."

Fran Allen interview, Excerpted from: Peter Seibel. Coders at Work: Reflections on the Craft of Programming

RATFOR was always an interesting way of doing text with Fortran.

It was a pre-processor for Fortran 66 written by Brian Kernighan.

Frankly, FORTRAN is not that bad. It's not worse than PL/SQL or Ada. Yes, all are verbose, but that's not the issue, although I wish some of these would get a revision of their syntax, no question.

Often the issue is the lack of IDEs, code quality tools, libraries, support for string encoding schemes, linters and co., everything that is taken for granted in modern languages with a broader appeal.

Actually, newer programmers are rediscovering these languages because they actually scale very well as an alternative to C++.

For PL/SQL I find Oracle's own SQL Developer quite good.

Ada, there are quite a few nice IDEs, most Eclipse based, but of course commercial. The free one from AdaCore, GPS, is kind of OKish.

For Fortran, most commercial compilers have Visual Studio plugins on Windows, and Eclipse ones on UNIX systems.

Not to mention the oodles of solutions for super specific scientific computation tasks that exist in Fortran. I'm learning about numerical integration and stuff for solving differential equations, and there's so much stuff where someone essentially figured out how to do this years ago and wrote a .f file for it.

As someone with a budding interest in this, are you finding this stuff on the web? Any specific sites or books where you're finding these implementations?

Netlib. Some good ones:

* LAPACK (BLAS is there too) http://www.netlib.org/lapack/explore-html/

* QUADPACK (quadrature integration) http://www.netlib.org/quadpack/

* MINPACK (nonlinear solvers) http://netlib.org/minpack/index.html

* ODEPACK (ordinary differential equations) http://netlib.org/odepack/index.html

Most of the textbooks on numerical methods are not very approachable, but Carl Meyer's Matrix Analysis and Applied Linear Algebra text is a good start, as are Gilbert Strang's books.

Note that SciPy wraps all of these in nice Python wrappers. (That's basically its reason for being.)
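
As an illustration, here is a minimal sketch that assumes SciPy and NumPy are installed. Per the SciPy documentation, `scipy.integrate.quad` uses QUADPACK, `scipy.optimize.fsolve` wraps MINPACK's hybrd/hybrj routines, and `scipy.integrate.odeint` uses LSODA from ODEPACK. The test problems are arbitrary and chosen only because their answers are easy to check.

```python
import math

import numpy as np
from scipy.integrate import odeint, quad
from scipy.optimize import fsolve

# QUADPACK: integrate sin(x) from 0 to pi (the exact value is 2).
area, abs_err = quad(math.sin, 0.0, math.pi)

# MINPACK: solve cos(x) = x, starting the search at x = 1.
root = fsolve(lambda x: np.cos(x) - x, 1.0)[0]

# ODEPACK (LSODA): solve y' = -y with y(0) = 1 and read off y(1) = exp(-1).
solution = odeint(lambda y, t: -y, 1.0, [0.0, 1.0])
y_at_1 = solution[-1, 0]

print(area, root, y_at_1)
```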

There are a few places that use FORTRAN extensively. The two I am aware of are weather simulation and ESDU [1].

[1] https://www.esdu.com/cgi-bin/ps.pl?sess=unlicensed_117111319...

Hairer's DOP853 is a differential equations solver that a lot of people seem to be enamored with:


In this case someone took the old Fortran code and modernized it a bit, yet again in Fortran:


Not sure that "nightmare" is how I would have described it, having worked on a large telco system that used Fortran 77.

Do you have any examples?

The newer Fortrans have much better character/text handling. I agree the FORMAT statements could be tricky.

Fortran 90 came with MATLAB-like syntax for arrays: slicing, operators that work on entire arrays, etc. Much more high-level than C, and often faster, but safer and easier: no pointers needed.
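
For readers who know Python better than Fortran, here is a rough sketch of the same array style written in NumPy, with the Fortran 90 equivalent in each comment. It is an analogy rather than Fortran code, it assumes NumPy is installed, and the values are arbitrary.

```python
import numpy as np

a = np.array([1.0, -2.0, 3.0, -4.0, 5.0])
b = np.arange(1.0, 6.0)            # 1, 2, 3, 4, 5

c = a + 2.0 * b                    # Fortran: c = a + 2.0 * b
middle = a[1:4]                    # Fortran: a(2:4), 1-based and inclusive
clipped = np.where(a > 0, a, 0.0)  # Fortran: merge(a, 0.0, a > 0)
total = a.sum()                    # Fortran: sum(a)

print(c, middle, clipped, total)
```

In both languages there are no explicit loops or pointers: the operators and intrinsics act on whole arrays or slices.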

Fortran 77 is just a nightmare, like you said. Not many people recommend C++ styles from the 80s either.
