Flang – A Fortran compiler targeting LLVM (github.com)
176 points by swills on May 18, 2017 | 103 comments



Can we use it along with emscripten to compile fortran programs to js?


This is the most Hacker News comment I have seen in years and I love it


Me of fifteen years ago would love this for lab classes on, e.g., classical molecular dynamics.

The computational load of these simulations is, in modern terms, not large, so Emscripten-compiled simulations would be plenty fast enough, and deployment of web pages in a computer-lab environment (or for people working from their dorms!) is way easier than the alternatives.

So there are real, non-trivial applications. You wouldn't want to be running ABINIT or whatever in your web browser, but GROMACS on some small system? Not unreasonable.


For classroom study - wouldn't straight javascript suffice for basic classical molecular dynamics? That approach would also make it much easier for students to play around with different update algorithms, temperature control, etc.


That would involve

a) writing yourself an engine and that's actual work

b) that engine not being one which is used in real research

So "it depends" – for introductory teaching yes, for training researchers less so.


Teachers can utilize Jupyter notebooks today.


Yep, I will definitely try to compile my Rust MD code with Emscripten and give it a look! There are quite a few JS molecule visualizers, so the outcome can be pretty nice for teaching.


Funny.... but:

In a past job we had some Ada code in our codebase that was just awful looking. It was machine translated from Fortran to Ada. From what I could tell, it was just a few files that were translated and then fixed up and checked in before they gave up on that conversion project.

We nicknamed such code "ada-tran".

Thus I feel the newly translated code should be called "Forscript".


To my understanding, Emscripten thankfully does not attempt to keep the JS sane for human consumption at all. It's basically a higher level assembly language that targets a browser. What Emscripten does is more compiling to JS, not translating to it.

> we nicknamed such code "ada-tran"

If Fortran -> Ada is ada-tran, then wouldn't Fortran -> Javascript be Javascran? :)


It better not be Forscript.


Transcript?


I asked the same question two years ago on another LLVM Fortran project.

https://news.ycombinator.com/item?id=10574243


This is the same LLVM Fortran project as discussed two years ago in the thread you linked. Two years ago was just an announcement, now they are publishing some open source code (which might or might not actually work, for some definition of "work", I haven't tried it).

So the backstory is that Nvidia bought PGI, a company which produced a C/C++/Fortran compiler suite which was moderately commonly used in HPC. Then they took the Fortran frontend from the PGI suite and bolted it to LLVM and open sourced it as flang.


Why would you want to do that?


So that you can demo your numerical algorithm in a web page because it's the only way 99% of people would ever notice your work?


As if the target demographic actually wants to see web demos of numerical algorithms?


I am very much the target demographic and I can definitely see a use. If nothing else an interactive web demo would make a great teaching tool.


Yes indeed. Perhaps you are not the target demographic, but there are a lot of algorithms which can be interactive and have a visualization for example

http://journals.plos.org/plosone/article?id=10.1371/journal....


There are many useful computational models written in Fortran that would be far more useful if they could be run in a browser, simply because telling users to compile them on their own is impossible unless they're already familiar with the toolkit themselves.

For example, it would be incredibly useful if you could run the US EPA's various dispersion models right in your browser.

https://www3.epa.gov/scram001/dispersion_prefrec.htm


I've heard a horror story of someone doing signal processing in node.js. It'd be nice to use a well tested Fortran FFT routine - there aren't many .js libs for doing numerics AFAIK (disclaimer: not a js programmer)


Exactly. At my first job we brought our FFT routines in from NAG: https://www.nag.co.uk/


Because 90% of HN only knows JavaScript so everything must in some way involve it.


legacy numeric applications / routines?


Except for some obscure edge case, it wouldn't really be a great idea to run them on a JS vm... FORTRAN code still in use is computation-heavy stuff for use cases where speed makes a big difference, the performance hit of a VM wouldn't be acceptable.


There is a lot of battle-tested Fortran out there. If you're for one reason or another in a situation where you have to deploy to the web platform and you need some numerical algorithms, I can imagine it could make sense to pick some routines from netlib instead of reinventing the wheel.


Using python would possibly make more sense here


The style of netlib is isolated standalone routines, easy to recompile to js. Numpy would be far more complicated.


python numeric code using numpy and the like ultimately depends on some fortran libs underneath


Numpy has no Fortran whatsoever


The numpy.linalg module depends on LAPACK, which is Fortran.

Core numpy depends on BLAS, for which both Fortran and non-Fortran implementations are available.


Go to the source, it's called lapacklite and it's C code translated from Fortran

https://github.com/numpy/numpy/tree/master/numpy/linalg


Hmm, seems you're right. Though that seems to be some fallback thing which is used if the real thing isn't found during the build. At least on Ubuntu 16.04 the lapack_lite module links against the real (Fortran) lapack library:

  % ldd /usr/lib/python3/dist-packages/numpy/linalg/lapack_lite.cpython-35m-x86_64-linux-gnu.so|grep -E 'lapack|fortran'
	liblapack.so.3 => /usr/lib/liblapack.so.3 (0x00007fb619ce7000)
	libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 (0x00007f5ae0911000)


Why not keep the numerical code on the server?


If you can run the processing on the client side, then you can scale up much, much easier and provide whatever service much cheaper, or even free. Eg, you can provision fewer servers, no need to build a processing queue or load balancing setup, if the input is provided by the client then you use less bandwidth by avoiding a round trip on that, etc.


Not sure it makes sense for computationally expensive code to be running in the user's browser.


Cluster-scale calculations? Probably not. If we could /easily/ port FORTRAN code to JavaScript then you could do this [1] kind of thing with a lot less manual re-implementing.

[1] http://physics.weber.edu/schroeder/fluids/


Hmm, I guess it depends on the application whether that's a good idea or not.

Or rather, before flang + emscripten/wasm/whatever, there was no reasonable way to run Fortran in a browser; now there is. Application developers can then choose whatever is most important for their application.

Disclaimer: my web dev experience predates the "modern web" with SPA's, fast javascript runtimes, html5 etc.


Never thought I'd live to see the day people would be talking about compiling fortran to javascript to run in a web browser. Just mind blowing.


It would be useful if you specifically wanted to run it in JS environment, rather than for the speed

Seems like most numeric/math code has some Fortran libraries underneath.

It'd be handy to have a route to just compile them to JS rather than having to write from scratch


Well, for example, we could run a 1980s-era climate model written in Fortran in a desktop web browser; it would run at a reasonable speed.


Ah modern technology: using 1000x faster hardware to run 30 year old code at the same speed, but in a browser!


That is what I think every time I see WebGL demos that most of the time remind me of Voodoo or Amiga demos, but hey it is on the browser!


You have accelerated graphics on any modern browser, on any OS, on any CPU architecture. I'll rag on JS as much as the next person but you have to admit 1) that's a pretty big feat 2) it's not an apples to apples comparison.


This is not true, because I have a couple of devices with perfectly fine working OpenGL ES 3.x GPUs where the browsers either don't support any form of WebGL or the FPS is a single-digit number.

Really, the 2nd coming of VRML.


Thirty years ago it would have been run on supercomputers.


That's what I would define as a "some obscure edge case", see my comment above.


The thing is that pretty much all Fortran is "some obscure edge case". Otherwise it'd be written in something that's not Fortran.

Another example, I know the guy who maintains NIST's fluid properties solver (back in the day he worked at a subcontractor for NASA characterizing superfluids in fuel tanks). It's ancient Fortran compiled to a windows .dll and wrapped in VB in an excel file. It's super battle tested (being used for nearly 30 years), and it'd be awesome to be exposed in the browser.


I quite frequently use LAPACK (http://www.netlib.org/lapack/), which I would hardly define as an "edge case".


Everyone who does any stats uses LAPACK, it's just that most of them aren't aware of it.


This was a serious question? Oh geez. What kind of world has the web wrought?


...and then your Javascript code will run on node.js, and the instance will be Dockerized for ease of deployment to a VPS. How many layers of abstractions are there?


It's turtles all the way down.


And they all live in a VM anyway.


and also nginx reverse proxy m8, with apache compiling instances to be deployed automatically to scale using latest gcc7 with 03 optimization for robustness against DDOS


Yes, as someone who used to hack Fortran for a living, I was having a hard time finding a use case for converting Fortran to JS.


To reap the performance benefits of v8 and wasm for your numerical fortran routines? /S


the matrix.



In fact, Flang has now been added: https://mohd-akram.github.io/languages/?source=Fortran


Probably. The biggest stumbling block will be inline assembly.


One other nice benefit to this project is determining name mangling schemes when linking to other Fortran projects using a pure LLVM toolchain.

Say we want to use BLAS and LAPACK in a C/C++ project that uses clang as the compiler. All we have to do is write the appropriate prototype for the LAPACK function in C and then link to BLAS/LAPACK. That said, there are four different name mangling schemes that the BLAS/LAPACK library could use, namely the combinations of lower or upper case, with or without a trailing underscore, e.g. gemm, GEMM, gemm_, and GEMM_. Both Autotools and CMake will determine the mangling scheme for us, but they need a Fortran compiler to determine that. Right now, that's almost always GFortran. With this project, it could ostensibly be Flang, which would eliminate the need for GCC to determine the mangling if a pure LLVM toolchain is desired.
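
A minimal C sketch of the four schemes listed above, for illustration only. The `MANGLE_*`, `FORTRAN_SYMBOL` and `dgemm_symbol_is` names are made up for this example, and picking lower case plus trailing underscore is an assumption; in a real build the equivalent macro would come from the Autotools or CMake check.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical macros, one per mangling scheme. Each takes the routine
   name in lower and upper case and yields the symbol the linker sees. */
#define MANGLE_LOWER(name, NAME)            name
#define MANGLE_UPPER(name, NAME)            NAME
#define MANGLE_LOWER_UNDERSCORE(name, NAME) name##_
#define MANGLE_UPPER_UNDERSCORE(name, NAME) NAME##_

/* Select the scheme the Fortran library was built with.
   Assumed here: lower case plus trailing underscore. */
#define FORTRAN_SYMBOL(name, NAME) MANGLE_LOWER_UNDERSCORE(name, NAME)

/* Two-step stringification so the macro is expanded before quoting. */
#define STRINGIFY_(x) #x
#define STRINGIFY(x)  STRINGIFY_(x)

/* Returns 1 if `candidate` is the symbol name DGEMM would be linked as
   under the selected scheme, 0 otherwise. */
static int dgemm_symbol_is(const char *candidate)
{
    assert(candidate != NULL);
    return strcmp(candidate, STRINGIFY(FORTRAN_SYMBOL(dgemm, DGEMM))) == 0;
}
```

Changing the scheme selected in `FORTRAN_SYMBOL` changes every prototype written through it, which is roughly the job the build-system check automates.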


Modern LAPACK already comes with an official C binding, LAPACKE: http://www.netlib.org/lapack/lapacke.html

The C ABI is pretty stable, so if you're using LAPACKE you shouldn't need to worry about which Fortran compiler was used to compile LAPACK.

Other than that, AFAIK the most common name mangling these days is lowercasing + trailing underscore. That being said, for "modern Fortran" with modules, OOP etc. the issues are vastly more complex and AFAICS there is no commonly agreed upon ABI.


Do you mean a proprietary toolchain?


Judging by the copyright notices in the files, this is probably the PGI-Fortran-on-LLVM compiler that was announced a year or two ago - https://www.llnl.gov/news/nnsa-national-labs-team-nvidia-dev...

EDIT: Confirmed - https://developer.arm.com/-/media/developer/developers/hpc/f...


Ok, so the only reason anyone's using Fortran is for crazy speeds, what's the codegen like compared to gfortran?


Actually, modern Fortran is a pretty decent numeric programming language, comparable (if not superior) to MATLAB, and much faster than it. The only problem is that finding documentation on modern programming idioms and styles is complicated among MEGATONS of legacy code lying around literally from Cold War times (still perfectly working, of course, so if you urgently need to write a Monte Carlo solver for neutron transport equations — you got it covered).


I'm really interested in Fortran because I love super down-to-the-metal optimisation stuff, but I have no meaningful use-case for it. Where would you recommend someone start if they wanted to get into modern Fortran?


Fortran isn't really used much for down-to-the-metal optimisation. It's not at all built for the low-level tricks common in C, but rather expects you to program at a higher level (notably with first-class arrays), after which the compiler will (hopefully) do its magic. Although in practice, most Fortran code is not particularly optimised (or optimised for long-dead machines), and could be much much faster.

So, if you want to do down-to-the-metal optimisation for Fortran, you want to write a Fortran compiler, not a Fortran program.


Most Fortran programmers are scientists or engineers. So you could become a computational scientist. Materials science (density-functional theory, computational chemistry), climate/weather modeling, computational mechanics, fluid dynamics, etc. are all fields where Fortran usage is strong.

If you're not interested in science, I guess you could contribute to GCC or LLVM and try to make Fortran benchmarks run faster.


To add to some of the answers, Fortran's strength is that a scientist can write an algorithm for some numerical routine fairly easily and the output code will be highly optimized.

This is possible because you don't have to deal with all the weird edge cases you have in C/C++ like aliasing. The language is fairly simple so it can be optimized to death and it's easy to add things like automatic vectorization, etc.


I'm sure you meant to include scare quotes, as in "easy" :-)


Compared to C++ / C, even old Fortran is surprisingly pleasant to use when you want to implement simple numerical algorithms; it feels a bit like a barebones MATLAB.


Any time you want to do a big numerical computation fast. I suggest that, instead of looking at old fortran code, you make sure you are learning to use array fortran, which is supported by gfortran. You can do operations on arrays directly (makes the code concise), and get multi-processor parallelism with very little effort.


> array fortran

You mean Coarray Fortran, which can be a substitute for OpenMP, but simpler.

https://en.wikipedia.org/wiki/Coarray_Fortran


Due to its simplicity, its support for atomics, and some traditional Fortran features (modules, F9x object-based programming, built-in array support), Coarray Fortran already makes it possible to go beyond its own limits and develop more sophisticated parallel logic codes: https://github.com/MichaelSiehl/Atomic_Subroutines--How_the_...


Yes, that's the term I should have used. Thank you.


Tie in machine learning somehow and Fortran can come back in style.


Unfortunately, the single existing Fortran GPU compiler/library is very expensive.


Which may be exactly why Nvidia is creating a Fortran compiler for LLVM, which can target CUDA.


Am I the only one who thinks that Forlang would have been a better name? ;)


http://www.phoronix.com/scan.php?page=news_item&px=LLVM-NVID...: "Flang is to Fortran as Clang is to C/C++."

This is a NVIDIA project.


Apparently unrelated to Flang, the 2013 LLVM GSoC project to implement a Fortran frontend.

https://github.com/hyp/flang

The name re-use was a bit confusing.


Now how about a COBOL version? :-)


I would love to see a modernized COBOL in the same way they have modernized Fortran. It'd be an instant hit in enterprise IT environments.


Is it sensible to implement the math-heavy part of a C application in Fortran and call it via FFI? Or is -Ofast enough in most cases, without the costs of FFI (plural, because performance may be one of them)?


Not for speed reasons alone. It's certainly possible to tune C code to have roughly equivalent performance to Fortran.

The real appeal of using Fortran is that for beginners it's a much easier language to pick up than C (far fewer footguns, for one), and that working with (dense) multidimensional arrays is very nice, roughly on par with high-level environments like numpy, R, or MATLAB.
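
To give a feel for what the call boundary looks like from the C side, here is a small sketch. `scale_vector_` is a hypothetical routine, written in C here only so the example is self-contained; in a real project its body would be a Fortran subroutine. The trailing underscore and the pass-everything-by-pointer convention are assumptions that match common compiler defaults, not a guarantee; Fortran 2003's `bind(C)` removes the guesswork.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a Fortran subroutine such as
 *     subroutine scale_vector(n, alpha, x)
 * Fortran passes arguments by reference, so the C view takes pointers
 * even for scalars. The symbol name and convention are assumed. */
void scale_vector_(const int *n, const double *alpha, double *x)
{
    assert(n != NULL && alpha != NULL && x != NULL);
    for (int i = 0; i < *n; i++)
        x[i] *= *alpha;
}

/* Thin C wrapper so callers can pass scalars by value. */
static void scale_vector(int n, double alpha, double *x)
{
    scale_vector_(&n, &alpha, x);
}
```

The call itself is an ordinary function call, so its overhead is usually small next to the numerical work done inside the routine.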


To expand on this, I think the footguns you're talking about are specifically performance related.

Just an example: How do you specify a fast readonly array in C? If I recall correctly it's

restrict const * const int my_array[n]

In Fortran?

integer(4), intent(in) :: my_array(n)

Not even starting with array ops, multidimensionals, array slicing etc., that's already a big win in readability. Having a pass-by-reference language by default is refreshing for HPC purposes.

That being said there are disadvantages compared to C. Biggest one IMO is the lack of inline declarations, which especially makes privatizing things in OpenMP and OpenACC code a bit more tricky, i.e. it needs to be explicit.


> How do you specify a fast readonly array in C?

Generally in the function declaration rather than the variable declaration, but to declare a non-aliasing pointer to array of constant ints would be

  const int (*restrict my_array)[N];

or

  int const (*restrict my_array)[N];

and the other two variations thereof.
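
In a function declaration, the same intent can be written with C99 array-parameter syntax, which reads a little closer to the Fortran line quoted above. A minimal sketch; `sum_readonly` is a hypothetical name chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

/* `const int my_array[restrict n]` is adjusted by the compiler to
   `const int *restrict my_array`: the elements are read-only through
   this parameter, and the caller promises that no other pointer used
   in the function refers to the same memory. */
static long sum_readonly(size_t n, const int my_array[restrict n])
{
    long total = 0;

    assert(my_array != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += my_array[i];
    return total;
}
```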


Gentle reminder to future language designers: whatever else you decide to take from C, please, please don't take declarator syntax. This thread showcases why.


Actually, one of the reasons that Fortran is fast is not that the code you write in it is inherently faster than C. It is more that the idiomatic way of programming it is fast. This is not necessarily the case in C, where it is quite natural to use pointer structures, while in Fortran you would in most cases use large arrays that consist of contiguously allocated memory instead.

It is of course possible to do that in C as well, but it is not as natural. You can use pointer structures in modern Fortran as well (as in F90 and later). Again, the difference is that idiomatic Fortran often leads to faster code.
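
As a rough sketch of what the Fortran-like layout looks like when done by hand in C: one allocation, indexed column-major as Fortran stores its arrays. The `Matrix` type and the `matrix_*` helpers are hypothetical names for this example, not from any library.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    size_t rows, cols;
    double *data;   /* rows * cols elements in a single block */
} Matrix;

/* Allocates a zero-initialized matrix in one contiguous block. */
static Matrix matrix_new(size_t rows, size_t cols)
{
    Matrix m = { rows, cols, calloc(rows * cols, sizeof(double)) };
    assert(m.data != NULL);
    return m;
}

/* Element (i, j), zero-based, column-major: each column is contiguous. */
static double *matrix_at(Matrix *m, size_t i, size_t j)
{
    assert(i < m->rows && j < m->cols);
    return &m->data[i + j * m->rows];
}

/* Walks the whole block linearly, with no pointer chasing. */
static double matrix_sum(const Matrix *m)
{
    double total = 0.0;
    for (size_t k = 0; k < m->rows * m->cols; k++)
        total += m->data[k];
    return total;
}

static void matrix_free(Matrix *m)
{
    free(m->data);
    m->data = NULL;
}
```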

One real difference is that Fortran does not allow aliased memory in arrays unless that is specifically declared using the "equivalence" statement, while in C and C++ aliasing is the default assumption. This allows for some optimizations that can't be done safely in C. In C99 and later, you can use the "restrict" keyword to state the opposite, namely that an array is not aliased, but usually you won't do that outside of tight loops, if at all.
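
A minimal sketch of the "restrict" case in a tight loop, under the assumption that the caller really does pass non-overlapping memory (the compiler does not check this). `scale_by` is a hypothetical name used only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Without restrict, a store to x[i] might modify *factor, so the
   compiler would have to assume it can change on every iteration.
   With restrict on both pointers it is allowed to load *factor once
   and is freer to vectorize the loop. */
static void scale_by(size_t n, double *restrict x,
                     const double *restrict factor)
{
    assert(x != NULL && factor != NULL);
    for (size_t i = 0; i < n; i++)
        x[i] *= *factor;
}
```

Passing overlapping pointers, for example `scale_by(n, x, &x[0])`, breaks the promise and is undefined behavior, which is part of why the keyword tends to stay confined to hot loops.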


That's what I did in grad school. C front end for datafile reading, writing, and data munging. Fortran for the heavy lifting and solver libraries.


How easy is it to do OpenMP from LLVM front-ends, at the moment? Fortran without OpenMP would be a bit limiting.

Edit: It looks like Clang 3.8.0 supports OpenMP, so this should be doable.


What advantages does this have over gfortran? And if it's performance, why not just pony up for Intel Fortran?


> why not just pony up for the intel fortran

I used to work in a government research lab. We mostly used GFortran instead of Intel for political stability reasons. Your budget can change drastically from year-to-year (even if there's no change at the congressional level) because every dollar has to go through the entire bureaucracy before it makes it to your cubicle. The budget wars are constant at every level in the org pyramid.

GFortran is free, and always will be. Whether we can pony up for Intel is completely unknown 5 or 50 years from now. And a lot of people in the Fortran world are ultimately funded by Uncle Sam.


So you're counting your power bill / cost per GFLOP when you do your budgeting? And exactly how many licences would you need? You'd be building on a server, not on everyone's desktop PC, right?


Not OP, but in these types of scenarios power/utilities probably come out of someone else's budget, with at least one layer of competing bureaucracy walling anyone off from asking that question, let alone investigating it and factoring it into the decision-making process.


Exactly. Anything that comes out of somebody else's budget is "free". :-/


Yeah, you're probably right: the Laundry Files IRL :-)


Based on the front-end of a mature, commercial Fortran compiler, see - https://developer.arm.com/-/media/developer/developers/hpc/f...

I know quite a few Fortran programmers, most of whom will avoid Gfortran because it lacks the features and performance of the commercial alternatives. When you consider that a lot of these people write code for large scale computer clusters and/or supercomputers, the cost of something like Intel Fortran pales in comparison to the cost of the hardware. I suspect this is a big part of why Gfortran is not of the same caliber as GCC


That's the impression I got: Intel pays for itself through more efficient use of the compute budget.


It's Apache-licensed instead of GPL; for some people, that's an advantage.


This is actually an Nvidia project, so theoretically it'll be used to target GPUs.

https://www.phoronix.com/scan.php?page=news_item&px=LLVM-NVI...


The ability to compile to LLVM IR, for one. Yeah, you could use dragonegg (and I have), but dragonegg is unmaintained and the last working versions are pretty old (LLVM 3.3 and gcc 4.8).



