Is Fortran easier to optimize than C for heavy calculations?

mschaef · 2023-09-21T16:01:12

Back in school (mid-90's), I made a point of taking a FORTRAN introductory class to go along with all the other work I was doing in Pascal, C++, Lisp, etc.

For most of the class, FORTRAN felt mostly like a penalty box. It was full of behaviors and limitations that dated straight back to the early 1960's, if not earlier. It was at least easy to make it do the things it could do.

Where my opinion of FORTRAN changed was in the last time that class met. By that point, we 'knew' the language and the instructor was on to more advanced topics, including optimization. He made a point of how an optimizing FORTRAN compiler could re-nest three nested loops for the purpose of automatically improving memory access patterns. Vector Cray machines (which implemented common looping patterns in hardware) shipped with FORTRAN compilers that could take normal-looking loops and turn them into code that optimized for the hardware.

I don't know if it's still the case (thanks to restrict and better compilers), but he made a compelling point at the time for the simplicity and restrictions of the language leading to more options for optimization.

dralley · 2023-09-21T21:01:24

>I don't know if it's still the case (thanks to restrict and better compilers)

Nobody (really) uses restrict, certainly not at scale.

This is especially obvious because when Rust came along (which can effectively use restrict in most places, much like Fortran) it uncovered a lot of serious bugs in LLVM's implementation for restrict. That whole situation only really stabilized in the past year or so.

Maybe GCC does a better job as a result of having gfortran.

8bitsrule · 2023-09-22T04:47:00

"as of 2022 programs have been written in Fortran for over six decades and there is a vast body of Fortran software in daily use throughout the scientific and engineering communities....

It is the primary language for some of the most intensive super-computing tasks, such as in astronomy, climate modeling, computational chemistry, computational economics, computational fluid dynamics, computational physics.... Since the early 2000s, many of the widely used support libraries have also been implemented in C and more recently, in C++... For this reason, facilities for inter-operation with C [1] were added to Fortran 2003 and enhanced by the ISO/IEC technical specification 29113, which was incorporated into Fortran 2018... " [0]

[0] https://en.wikipedia.org/wiki/Fortran#Science_and_engineerin...

[1] https://en.wikipedia.org/wiki/Foreign_function_interface

jcranmer · 2023-09-21T16:01:36

Fortran has a lot more (N-dimensional) array logic intrinsic to the language than C or C++ does, and in general, Fortran has a fair number of "little" things that can add up for optimization:

* Fortran arrays do not alias by default. (This is the most well-known, and is indeed repeated in the answers on StackOverflow)

* Fortran N-D array indexes are UB if they go out-of-bounds in a way that isn't true for most other languages. In C, if you have a declaration `int x[2][3];`, the expression `x[0][4]` is a legal way to access one of the elements--it's only UB to go outside the bounds of x as a whole, not individual rows. In Fortran, the equivalent expression (well, modified because Fortran is column-major) is illegal.

* Fortran has built in array expressions--you can express a vector addition just by adding the arrays themselves. This makes it easier to autovectorize.

LegionMammal978 · 2023-09-21T21:10:12

> In C, if you have a declaration `int x[2][3];`, the expression `x[0][4]` is a legal way to access one of the elements--it's only UB to go outside the bounds of x as a whole, not individual rows.

Even though multidimensional arrays are laid out contiguously in C/C++, it doesn't follow that indexing can traverse from one row to the next. By C17 6.5.6/8, pointer-integer addition can only create pointers to other elements (or one past the last element) of the same array object that the original pointer came from. However, an int[2][3] array has two int[3] arrays as elements, and each of those arrays has three int objects as elements. The int objects are not all elements of the same array (the standard's definition of an array element, 6.2.5/20, doesn't apply recursively), so you aren't allowed to index between them with expressions like (x[0] + 4).

Compilers seem to be pretty lenient about this in practice, though, only optimizing out out-of-bounds accesses when they exit the complete object. But at least Clang's UBSan will complain if you try to index between rows [0].

[0] https://godbolt.org/z/GYosP39Mc

KeplerBoy · 2023-09-21T16:40:07

> In C, if you have a declaration `int x[2][3];`, the expression `x[0][4]` is a legal way to access one of the elements

Is it really? Does the C-compiler guarantee that your 6 ints will be stored in a contiguous memory area?

jcranmer · 2023-09-21T17:04:29

Yes. A[i] in C is exactly equivalent to *(A + i). (Incidentally, this means that 2[x] is a legal expression in C).

KeplerBoy · 2023-09-21T17:14:55

Of course, that part is clear. The question is, if the two rows are guaranteed to be contiguous in memory. If using heap allocated memory, one could easily construct a counterexample (i'm not sure how the stack allocated array behaves, i guess it works because it's the sane choice, but not strictly guaranteed):

int *x; int numRows = 2; int numCols = 3;

x = (int *)malloc(numRows * sizeof(int ));

for (int i = 0; i < numRows; i++) { x[i] = (int )malloc(numCols * sizeof(int)); }

// This could segfault x[0][4] = 42;

jcranmer · 2023-09-21T17:36:03

Yes, it's a multidimensional array, everything is required to be consecutively allocated. (See 6.5.2.1 for more details). `int[2][3]` is 6 integer objects consecutively allocated (as 3 arrays of 2 arrays of integers), not 3 pointers to arrays of 2 integers.

KeplerBoy · 2023-09-21T17:49:22

Thanks for pointing to the actual section. Good to know.

Someone · 2023-09-22T00:01:32

> (Incidentally, this means that 2[x] is a legal expression in C)

Even weirder, for that declaration

    int x[2][3];

I think the following are legal expressions:

    0[1[x]]
    1[x][0]

Or is there something in the language that makes that break parsing?

pklausler · 2023-09-21T16:22:36

Fortran's dummy arguments can (generally) be assumed in the scope of a subprogram to not alias, whether arrays or not. The burden is on the programmer of calls to procedures to not associate any of its dummy arguments that are modified during the call with data that might be associated with any other dummy argument.

theodorethomas · 2023-09-21T16:30:43

People mention the no-aliasing, the compilers, the intrinsics, the libraries, and the expressivity but one aspect of the difference is ignored and it is this: C is a language for specifying behaviour of hardware (time sequence of detailed states), Fortran is a language for specifying computation of values. Fortran abstracts far more of the hardware than C and consequently, a Fortran compiler can benefit from quantum processors, mind-readers or time-machines, should they ever be invented.

A Fortran program that reads its input, calculates and finally writes out its output does not have to execute any particular instruction at all. As long as the answer is "AS IF" it had done the user-specified computation, the Fortran compiler has done its job. In between I/O, it submerges into the ineffable like a Cold War SSBN.

C is about the instruments, the players and the conductor, Fortran is about the music.

dTal · 2023-09-21T17:21:47

Maybe this was once true, but the hardware that C was designed for specifying the behavior of was a PDP-11. Nowadays you are programming an abstract C-ish virtual machine that provides certain semantic guarantees that don't necessarily map terribly well to the physical hardware. For example, if you write to a memory address, and then read from the memory address in the "next instruction", you expect the change to be immediate, even though the code is actually running on a pipeline that could be dozens of instructions deep, with several layers of cache between the core and system memory. So in a sense there's not really a qualitative difference between C and Fortran - they are both for specifying a sequence of operations on an abstract machine, relying on the compiler to implement that machine - and indeed modern optimizing C compilers actually provide very few guarantees about specific assembly instructions to be executed, happily rewriting or omitting code so long as it executes "as if" it ran on the C virtual machine.

See "C is not a low-level language" - https://queue.acm.org/detail.cfm?id=3212479

theodorethomas · 2023-09-21T18:05:20

I don't see how C can match Fortran's abstraction level and still reliably control hardware that uses memory-mapped I/O.

C, as an operating system implementation language, is trying to do something fundamentally different than Fortran.

You live by memory address, you die by memory address.

PhilipRoman · 2023-09-21T18:06:54

> For example, if you write to a memory address, and then read from the memory address in the "next instruction", you expect the change to be immediate

This would also be true for assembly, hardly a high level language

lmm · 2023-09-21T22:36:21

On modern CPUs assembly is a high level language (or rather, it's a language that doesn't have any of the advantages of traditional low-level languages, even if it also lacks the advantages of traditional high-level languages. Much like C)

AnimalMuppet · 2023-09-21T18:07:53

> you expect the change to be immediate, even though the code is actually running on a pipeline that could be dozens of instructions deep, with several layers of cache between the core and system memory.

I expect the hardware to handle cache coherency in that situation. What the compiler does should be irrelevant.

dTal · 2023-09-21T21:01:53

Right, but the point is that the hardware is still "meeting you halfway" to present the appearance of something which isn't actually happening. Those pointers in C aren't really "memory addresses" at all, they're keys in a key-value store managed by the hardware to preset the illusion of flat, contiguous memory, as mandated by the C programming model.

So maybe it's accurate to say that C is "more compatible" with real hardware, in the sense that its abstract machine is more isomorphic to what's really happening than Fortran's is. But it's not exactly "closer to hardware" in the way we might be tempted to think; it's more of a lingua franca that your processor happens to speak.

If you're still tempted to consider C "close to hardware", consider that you can compile the same code for a Z80 and a Threadripper. What hardware exactly are you controlling that's common to both?

AnimalMuppet · 2023-09-21T21:23:25

> as mandated by the C programming model.

As PhilipRoman said, this is also true of assembly (or any other programming language model[1]).

> If you're still tempted to consider C "close to hardware", consider that you can compile the same code for a Z80 and a Threadripper. What hardware exactly are you controlling that's common to both?

In both of them I can write to a memory-mapped I/O device, if it has one. I can write a custom memory allocator for a pool that I'm managing myself. I can't do either of those in Fortran or Javascript.

[1] Why does it have to be true of any other programming language model? Well, maybe I exaggerate slightly. But can you show me a (single threaded) programming language where "a = 1" does not mean that on the next line, a will be 1?

kmstout · 2023-09-21T22:35:07

> But can you show me a (single threaded) programming language where "a = 1" does not mean that on the next line, a will be 1?

MIPS I.

https://en.wikipedia.org/wiki/Delay_slot#Load_delay_slot

gpderetta · 2023-09-21T22:04:07

>But can you show me a (single threaded) programming language where "a = 1" does not mean that on the next line, a will be 1

Generally agree with your point, but just to play the devil's advocate, in a CPU with exposed pipeline and no interlocks, setting a register to a value doesn't guarantee that a following instruction reading from that register will see the last value written.

teleforce · 2023-09-22T00:43:57

With D language it can be the music, instruments, player and the conductor all at once [1].

Fun facts, Walter the original author of D language wrote his popular Empire game in Fortran [2]. Some of the ideas that make Fortran fast is incorporated into D language design and this makes D is as easy if not easier to optimize than Fortran [1].

[1]Numeric age for D: Mir GLAS is faster than OpenBLAS and Eigen:

http://blog.mir.dlang.io/glas/benchmark/openblas/2016/09/23/...

[2]A Talk With Computer Gaming Pioneer Walter Bright About Empire:

https://madned.substack.com/p/a-talk-with-computer-gaming-pi...

NikkiA · 2023-09-24T02:18:15

Oddly, when people mention Empire, I think of Peter Langston's Empire rather than Walter's.

coliveira · 2023-09-21T21:35:16

Yes, you're correct. C was created to control a CPU, it is a low level language with a comfortable syntax. C abstracts the hardware. But Fortran has nothing to do with hardware, it is just a notation for computing matrix algorithms. Fortran can be thought as a primitive APL. You can do all kinds of optimizations in Fortran that you cannot do in C, because it doesn't care or know about the underlying hardware.

mhh__ · 2023-09-21T20:11:11

That was maybe true for C in the seventies but there's practically no difference anymore e.g. C has an as if rule too.

theodorethomas · 2023-09-22T10:45:07

The point is, it shouldn't.

https://news.ycombinator.com/item?id=30022022 How ISO C became unusable for operating systems development

andy99 · 2023-09-21T10:59:26

In terms of being easier (for the programmer, not the compiler) one thing worth mentioning is that Fortran already has optimized intrinsics for matmul and dot_product. I was recently doing a small comparison between C and Fortran for matrix-vector multiplication and was surprised how fast the matmul is "out of the box" vs handwritten loops. Personally I also think that for code that is linear algebra heavy Fortran is easier to work with because of the intrinsics - it's analogous to numpy, whereas C takes more effort, mainly keeping track of sizes of things.

https://gist.github.com/rbitr/3b86154f78a0f0832e8bd171615236...

physicsguy · 2023-09-21T11:45:43

True for small matrices but above certain size you wouldn’t want to use these methods or loops anyway, and they’re not optimised if your matrix is sparse either. So practically you end up using a library and you’re back to basically being equivalent with C.

andy99 · 2023-09-21T12:55:45

It's possible to link in an external BLAS library that will be used by the intrinsics instead. See https://fortran-lang.discourse.group/t/compiler-option-matmu...

But yes I agree, at some point it's always going to be back to something more customized.

microtherion · 2023-09-21T19:55:29

And this is where the performance starts converging again, because BLAS is also available from C. So if you can express your computation in terms of BLAS primitives, it does not matter much from what language you're calling them.

kergonath · 2023-09-21T22:14:48

Yes, but you retain Fortran’s expressiveness. You can write a matrix product A = x * B, regardless of whether the matrix multiplication is handled by a compiler’s multiplication function, any flavour of BLAS, or ScaLAPACK over MPI. I don’t think Fortran has that much of an edge in performance compared to decently optimised C, but the Fortran code to get there is much more readable and easier to grasp. And with much fewer footguns (think allocatables versus pointer arithmetic).

physicsguy · 2023-09-22T06:53:59

… until you hit sparsity which you almost always have to switch approach. Generally you do not want to construct a dense matrix for a problem where it’s not dense!

lmm · 2023-09-21T23:03:20

If your goal is an expressive, safe, low-footgun language for expressing BLAS computations, you'd probably use NumPy rather than Fortran.

kergonath · 2023-09-22T12:36:07

NumPy is very Fortran-like, it's probably not the best example. Sure, it avoids some Fortran issues, but it also inherits some baggage from Python. Personally, working with both regularly, I don't really find it more expressive or safer than the common Fortran equivalents. And in Fortran there's less mental overhead about whether a variable is a list or a NumPy array, which does not matter until it does. Ditto for row/column-major order. Overall, in terms of ergonomics and ease of use, both are fairly similar.

leereeves · 2023-09-21T12:47:41

The Fortran matmul intrinsic doesn't use an optimal algorithm?

kergonath · 2023-09-21T22:17:51

The compiler is responsible for the implementation of the matmul intrinsic. Most of them have a “decent for small matrices and easy to inline” version by default and an option to convert it to call to BLAS under certain conditions (like larger matrices). Then, the performance is that of your BLAS library. The assumption (quite good in my experience) is that people use the intrinsic naturally because it is easy and natural, but turn to BLAS calls when performance becomes important. Which is not the case for the vast majority of matrix multiplication, even in high performance computing.

There is no universal optimal algorithm.

duped · 2023-09-21T15:56:57

Even theoretically this is only possible if the dimensions and sparseness are known at compile time, which can be pretty heavy assumptions.

jcranmer · 2023-09-21T20:56:37

The implementation of every BLAS implementation I'm aware of basically looks like this:

  if (dims in range A)
    gemm_implA
  else if (dims in range B)
    gemm_implB
  else if (dims in range C)
    gemm_implC
  // several more implementations...

The immediate call function is invariably a switch over several implementations, and by the time you reach that call, you have actual knowledge of dimensions.

duped · 2023-09-22T03:52:31

The branch there can be expensive for small matrices.

Alifatisk · 2023-09-21T10:36:15

This might be a bit irrelevant but,

my professor who teaches computer programming (who previously worked as a nuclear physicist) loved mentioning Fortran whenever possible.

He truly enjoy the language and believes it is well suited for heavy computation. He said its used a lot for weather forecasting.

He is quite old now so I don’t know how true it is today, but he made me more interested in Fortran and I might give it a shot instead of C sometime soon.

nonameiguess · 2023-09-21T12:54:39

I haven't talked to him in a few years, but used to be good friends with a meteorology researcher at Purdue and Fortran was virtually all he knew, it's so heavily used. When I first started out in GEOINT processing, a lot of our code (NRO code) at the library level was still Fortran, but that was mostly older libraries. Anything written after 2005 or so was predominantly C++. I don't really keep up to date with the constellation status since I haven't worked there in a while now, but there were still a few orbital platforms older than that back around 2017 or so and the code for handling them was still largely Fortran. We never used Fortran directly, though, only bindings for C++, though I did have to learn a tiny bit of Fortran to hack on those once.

green-salt · 2023-09-21T21:04:48

This was the case for an atmospheric research facility's HPC cluster which was pretty much running all fortran for the calculations at least back in 2010.

kergonath · 2023-09-21T22:27:20

It is still very much used for weather forecasting (ECMWF has components in it), as well as materials physics, quantum chemistry, and things like finite elements and nuclear fuel performance codes.

It is much, much better than C for all these applications. Nobody in the field seriously considers C; all the non-Fortran codes are in C++. Both the Fortran and C++ bits have Python wrappers anyway.

dkarl · 2023-09-21T16:42:41

Professors will do weird things if they get a chance. I know of a junior college where the intro to programming course used Fortran77 in the late 2000s. A lot of kids signed up for it because they needed a STEM elective and didn't want to take math or science.

Joel_Mckay · 2023-09-21T11:16:27

[flagged]

therealcamino · 2023-09-21T12:20:35

"Most academic institutions have been sued many times due to the volume of nonsense they feed kids."

I've never heard of that happening, let alone many times to most institutions.

Joel_Mckay · 2023-09-21T12:28:58

[flagged]

jcranmer · 2023-09-21T12:47:54

> Part of the settlements is an NDA.

If someone is sued, that creates a court record. Even if the terms of the settlement are not disclosed, that there was a settlement is lodged in the court record, and publicly accessible. Even if court records are sealed, the existence of the case is not, and initial complaints are rarely sealed.

In short, if suing universities because of what they're teaching is routine, you should be able to point to court cases. If you can't find them, then it's not because it's secret--secret court cases don't exist--it's because they don't exist.

tptacek · 2023-09-21T13:59:10

In the spirit of generosity, you might conclude that the parent commenter simply has their wires crossed, and has cases involving for-profit schools like University of Phoenix confused with real universities. If you didn't know much about higher education, you might think UPhoenix was one; it's in the name.

Joel_Mckay · 2023-09-21T14:14:17

Nope, but thank you for your concern. The local class action case was with a local school that privatized, lost accreditation, and eventually became a trade school. The other University case I alluded to was covered in the news, and was notable indeed,

I'm just being careful here. Have a great day =)

jcranmer · 2023-09-21T14:59:28

And you're still refusing to tell us any names because we're somehow lazy for not being able to correctly guess what locality you're in, what time you're talking about, and search local media history for that timeframe to uncover the case you're referring to?

Joel_Mckay · 2023-09-22T00:39:42

While I am risk averse around potentially litigious content, there are plenty of publicly accessible recent examples to examine at ones leisure. Note the two scandals I was alluding to are not on these lists, but follow the familiar theme.

https://www.youtube.com/watch?v=8Sp-VFBbjpE

Try being a little more civil when asking for clarification. =)

Canada

https://www.cbc.ca/news/canada/edmonton/cdi-college-lawsuit-...

https://www.cbc.ca/news/canada/manitoba/jackie-healey-sues-r...

https://www.cbc.ca/news/canada/british-columbia/former-stude...

https://www.cbc.ca/news/canada/calgary/calgary-okotoks-fligh...

https://www.cbc.ca/news/canada/toronto/ex-students-sue-toron...

https://www.cbc.ca/news/canada/saskatchewan/watch-student-su...

https://www.cbc.ca/news/canada/london/former-phd-student-get...

US

https://sites.ed.gov/naciqi/files/2018/05/NACIQI-Enclosure-1...

https://www.nbcnews.com/news/us-news/students-25-universitie...

https://www.cnbc.com/2022/01/11/lawsuit-claims-colleges-over...

For-profit institutions funded by predatory loans:

https://www.cnn.com/2022/01/13/politics/navient-student-loan...

dang · 2023-09-23T18:29:49

You posted really abusively in this thread, not only duplicating this comment several times but also deleting the text in other posts in a way that destroyed the context of replies and confused readers.

We ban accounts that do these things, so please don't do them again. Fortunately it doesn't look like your account has a pattern of this, so it should be easy to avoid in the future.

Joel_Mckay · 2023-09-21T13:00:32

[flagged]

jcranmer · 2023-09-21T13:07:02

Give me three case numbers of cases where former students successfully sued the institution (cases that were dismissed don't count--just because a case was filed does not mean it is meritorious). Those case numbers should make it easy for anyone to find the complete docket of the case on PACER (if it's federal court) or applicable state courts (well, to the degree to which states make their dockets easily accessible).

You made the assertion; it's your job, not my job, to provide evidence for your assertion, and I've put you on notice as to the evidence that is satisfactory. If you can't provide the evidence, I have no reason to believe your assertion.

Joel_Mckay · 2023-09-21T13:14:31

jcranmer · 2023-09-21T13:26:49

> You would be fairly surprised who showed up besides trump u in the states.

Okay then, who showed up? You imply you know who. It shouldn't take any effort to name anybody, if you already know them.

Joel_Mckay · 2023-09-21T13:51:01

tptacek · 2023-09-21T13:56:30

POC||GTFO, as the kids say.

bananapub · 2023-09-21T12:47:50

so your weird unfounded assertion is also explicitly unprovable? awesome, thanks for sharing.

Joel_Mckay · 2023-09-21T12:53:33

In general lexis nexus holds the records of former students suing for BS employment promises.

You would be fairly surprised who showed up besides trump u in the states.

It was a lot more common than most like to admit. =)

__mharrison__ · 2023-09-21T16:32:11

Can someone enlighten me as to what "T" means? First time I've seen it.

sbierwagen · 2023-09-21T23:55:31

HN won't let you delete a comment if it has replies. The workaround, if you're still in the edit window, is to delete all the text instead. Doing this is considered rude.

__mharrison__ · 2023-09-22T01:45:50

Thanks.

jcranmer · 2023-09-21T17:07:21

Not sure, but the individual previously had a different comment that was edited to be just "T".

Joel_Mckay · 2023-09-21T22:47:41

redandblack · 2023-09-22T02:15:59

I worked at an Airline in late 80s working on numerical optimization code in fortan / IBM mainframes - integer optimization with maybe 10k rows and 10m+ columns. I remember they having specialized vector processors to run those optimizations. We used to run those schedule optimizers daily until it was time to hand it over to the next planning department. The firm also had a in-house team from IBM for help with it.

Anyway the urban myth in the IT dept was, they had a custom fortan compiler written in early 70s that saved the firm from buying 2 additional IBM mainframes with the compiler writer compensated well

saagarjha · 2023-09-21T09:53:09

The difference is mostly not relevant these days. Yes, restrict is important, but anyone who is optimizing numerical computing hard is probably already using it judiciously, and for them what is more important is what is familiar to them and makes them most productive.

gnufx · 2023-09-21T10:46:36

Unfortunately I see rather few people doing serious optimizing knowledgeably at all in research computing. (I know how and where it's done well, of course.) That has global effects on productivity of users, though some of the biggest gains can come from more global optimization, like in the use of MPI.

pjmlp · 2023-09-21T10:00:25

The keyword is judiciously, without triggering UB, and I bet most don't.

pklausler · 2023-09-21T16:41:03

Subroutine and function "dummy arguments" are basically restricted references. When analyzing data dependences in the body of a subprogram, a compiler doesn't have to worry about writes to one dummy argument affecting any reads to or writes from another dummy argument. The burden is on the programmer of a procedure's caller to ensure that any dummy argument that is modified during the call does not alias with any other dummy argument. This is the big one.

Fortran >= 90 has ALLOCATABLE variables (and now components), which are dynamically managed objects that are akin to std::unique_ptr<> or std::vector<> smart objects. A compiler can assume that an ALLOCATABLE object is free from aliasing, unlike a POINTER.

Fortran does have POINTERs, but they can only point to objects that have been explicitly tagged with the TARGET attribute. Objects that are not TARGETs are safe from aliasing with POINTERs. Fortran does still not have the concept of a pointer to immutable data.

Do any of these still yield performance advantages over C/C++? Probably not.

pipo234 · 2023-09-21T09:33:08

> The performance difference comes from the fact that Fortran says aliasing is not allowed [...]

This is one of the legacies modern C++ is still wrestling with today, supposedly solved by Rust, Zig, Val and the likes.

zik · 2023-09-21T10:28:39

It's true that there's no "restrict" inthe C++ standard, however all the major C++ compilers have extensions for it ("__restrict__") and have had for a long time.

Conscat · 2023-09-21T15:17:49

https://gcc.gnu.org/onlinedocs/gcc/Loop-Specific-Pragmas.htm...

The #pragma GCC ivdep annotation also enables this in loops. C23 also has the [[unsequenced]] attribute, which supports pointer arguments for optimizations like these, unlike [[gnu::const]].

patrec · 2023-09-21T13:18:38

Except that it doesn't work and is basically WONTFIX since no C or C++ programmer ever uses it.

This is only very slightly exaggerated.

Blackthorn · 2023-09-21T16:06:44

We use it all the time in audio code. How do you get the idea that it doesn't work?

patrec · 2023-09-21T17:04:32

Look at the gyrations the Rust compiler team has been going through for years to be able to actually use LLVM's noalias (which is a very desirable optimization in Rust where the borrow checker statically guarantees absence of aliasing in a lot of code).

I haven't kept up with the current state of things, but if you can finally liberally use noalias in LLVM without having to fear miscompilations, you probably have solely the increased significance of Rust to thank for that.

Blackthorn · 2023-09-21T17:26:54

We use gcc so honestly I'm not sure how the difference goes. I'm glad for Rust making the whole ecosystem more usable regardless.

kergonath · 2023-09-21T22:31:47

GCC had to spend some time getting things correctly because of gfortran.

mkj · 2023-09-22T02:09:00

It worked fine in Intel compiler. We used it a fair bit to sort out hot paths in C++ for high performance computing (where Fortran was the other main language). Now Intel have moved to LLVM as their compiler base I wonder if they've put some work into making it work there?

Conscat · 2023-09-21T15:14:28

I've never had issues with it, personally. In what way does it not work?

lmkg · 2023-09-21T16:04:47

Rust has aliasing guarantees. In theory, Rust code can pass noalias annotations to LLVM and become more optimized.

In practice, it took years for Rust access these optimizations because the noalias annotations kept hitting codegen bugs in LLVM. It was an area of the compiler that had not seen much active use before Rust, and they were causing a lot of dormant bugs to become issues.

The feedback cycle was also very long because LLVM didn't fix the problems quickly. I didn't pay close attention so I don't know if this is prioritization or because those fixes required major surgery, but Rust sometimes had to wait for major version bumps in LLVM before seeing if "this time was the last time."

Measter · 2023-09-21T17:24:56

Not just in theory. Rust does now mark almost all references as `noalias`, the exception being `&T` where `T` contains an `UnsafeCell`. The equivalent safe Rust function signature[1] would have all four references marked `noalias`, even though the `input` and `matrix` slices could alias.

How much the optimizer can take advantage of that is another matter, due to what you said. Doing a quick translation to more idiomatic Rust[2], it does hoist the accesses to `matrix` out of the hot loop, and does also seem to be a bit less moving stuff around compared to the C version, which I think is putting the `matrix` values in the right places, which Rust did before the loop.

[1] fn transform(output: &mut [f32], input: &[f32], matrix: &[f32], n: &i32);

[2] https://godbolt.org/z/5PPn3eh19

Conscat · 2023-09-21T19:06:04

That's interesting. It also looks like with the `__restrict` specifier (https://godbolt.org/z/cvaYoc6zh), the clang code is similar regarding the matrix multiplication. The body of the vectorized loop itself looks identical, `.LBB0_4` in clang and `.LBB1_8` in rustc.

stephencanon · 2023-09-21T23:17:26

Swift ran into similar difficulties with noalias. Happily the bad cases I was tracking that inhibited optimizations were resolved in the last year or so. There’s probably still some lurking corner cases, but the overall situation is pretty good now.

vlovich123 · 2023-09-21T16:16:41

Yup. But it is true now though so hopefully they’ve added the test cases to LLVM to make sure things continue to work correctly in the face of more and more optimizations being thrown at code.

nwallin · 2023-09-22T17:21:32

That was, for the most part, just a problem with LLVM governance and development priorities. When the bugs were found in LLVM, GCC was examined and found to have a few similar bugs. GCC fixed these issues relatively quickly, but LLVM sat on them for years.

Normal C/C++ developers doing normal C/C++ development in GCC probably never saw any of those bugs.

zik · 2023-09-26T04:22:08

That's utterly incorrect. It's used thousands of times in the linux kernel.

_flux · 2023-09-21T11:39:17

Hmm, I thought basically only Rust solves it. How do Zig and Val solve it?

Conscat · 2023-09-21T15:15:09

Multiple-aliasing cannot exist in a language without reference semantics, such as Val (or whatever its new name is).

dralley · 2023-09-22T00:25:27

Zig doesn't solve it. I don't remember where but I know Andrew has mentioned that directly.

fsckboy · 2023-09-21T10:18:55

fortran is call by reference, it's all aliasing, you can even in some implementations change the value of literals

c is call by value, you have to at least try to alias

bregma · 2023-09-21T10:59:25

It took me days one time to figure out why 0 had adopted the value four.

The technicality is that Fortran is call by descriptor (not by reference, not by value). That's a powerful difference but a sharp two-edged sword.

kaba0 · 2023-09-21T15:49:00

Could you please explain how is it differs from the other two? Never looked into Fortran.

kergonath · 2023-09-21T22:35:19

The Fortran standard does not specify anything about passing by references or value. Instead, it specifies the expected results. The compilers do it however they want or can, which means that they can pass pointers, values, copies, or pointers to copies depending on the situation.

Sometimes (most of the time, to be fair), the compiler can work out the best solution. Sometimes, the developer has to jump through some hoops to avoid unnecessary copies.

gnufx · 2023-09-21T10:57:54

Fortran does not call by reference, and the term aliasing isn't used by the standard. It calls by what's usually called value-return or copy-in/copy-out, according to the "storage association" rules in the standard.

While it may be true that the rules aren't generally checked by compilers, Toolpack did it for pure Fortran77 in the 1980s, though I don't know how comprehensive it actually was.

gpderetta · 2023-09-21T10:41:40

my understanding is that in fortran, if two arguments alias each other, it is UB. So it is not that fortran doesn't have aliasing, it is just that the optimizer assume that it doesn't (but don't quote me on this!).

Bostonian · 2023-09-21T12:49:20

Arguments are allowed to alias if they are both inputs that will not be changed in the function -- technically if they are both "intent(in)".

pklausler · 2023-09-21T16:24:37

INTENT(IN) is not required to allow aliasing of unmodified dummy arguments.

kergonath · 2023-09-21T22:38:44

Yes, but in practice no compiler checks whether an argument with no INTENT is modified or not, or could alias with something else or not. So the compilers tend to trust the programmer. With INTENTS, it’s obvious and much easier to check.

pklausler · 2023-09-21T22:51:48

Dummy argument aliasing has nothing to do with INTENT.

TheRealKing · 2023-09-23T16:35:39

I believe the NAG Fortran compiler does check for some aliasing scenarios if not all. The argument intent is irrelevant.

bregma · 2023-09-21T10:56:38

Fortran is column-major and C is row-major. That difference alone makes vectorization in Fortran much much more frequently used.

defrost · 2023-09-21T11:07:38

As a 60 year old C | Fortran coder since ~1980 or so, that difference makes near zero practical difference to numerical computing.

Pretty much any vector|matrix operations will see row against column (or colum against row) operations in equal proportion - what you gain on the swings you lose on the roundabouts whichever way you choose to orientate storage.

Indeed, given large sparse arrays, language specific default array storage methods might not even be used - and here C is (arguably) more flexible at building task specific data structures - a column(?) of pointers to row fragments (perhaps).

The big advantage early FORTRAN had was baked in anti alias assumptions and a truckload of pre existing relatively accessible NIST | NASA grade quality numerical algorithms that had been crawled over, eyeballed, tested, and pounded in simulations by many qualified applied scientists.

AnimalMuppet · 2023-09-21T12:56:47

> The big advantage early FORTRAN had was baked in anti alias assumptions and a truckload of pre existing relatively accessible NIST | NASA grade quality numerical algorithms that had been crawled over, eyeballed, tested, and pounded in simulations by many qualified applied scientists.

Yeah. That Fortran library that has a solver for sparse complex matrices that won't produce nonsense if you give it a stiff problem, that has 50 years of people pounding on it to make sure that it's rock solid? I'm not sure that there's a C/C++ equivalent for it. I absolutely cannot write an equivalent. Yes, I can write a matrix solver that takes complex data. No, I can't write one that is as efficient, as numerically stable, or as bug-free as what already exists in Fortran.

bregma · 2023-09-21T16:28:45

As a 60 year old C and Fortran coder since ~1980 or so, and who has spent the better part of the last decade supporting compilers and their internals, I say it makes a big difference when it comes to autovectorization optimizations. But that's my experience based on most users being naive and know-enough-to-be-dangerous programmers, and your experience may differ.

defrost · 2023-09-22T02:39:05

That's the thing though, isn't it - it's not the storage order that makes a difference, it's the built in's for handling array | vector multiplication etc.

They keep the fingerpokkens off the blinkenlights.

It's a whole other ballgame when the problem domain demands near raw pipelined handling of big data streams that don't align with any type of default internal Fortran data storage.

The classic example would be late 1980s | 1990s processing of satellite image data from ground stations - the pipelines were fat (relative to the hardware of the day) and the data arrived in BIL (Band interleaved) and|or not BIL, and all manner of jumbled formats that made sense on the aquisition side.

The tools written then, back in the day, had to gulp in big end | little end data, 8 bit, 12 bit, 24 bit, and 32 bit data, and to performantly reproject and geomontage on the fly ended up with the same forms of optimizations duplicated across a crazy circus of incoming raw storage combinations.

Whether tackled via C or FORTRAN just simply reading in data and hoping it matched native FORTRAN array structures wasn't an option.

aap_ · 2023-09-21T11:45:29

This is down to convention though. You decide how you interpret your data in C.

bregma · 2023-09-21T16:29:48

And the compiler determines how to implement your decisions.