
C, Fortran, and single-character strings - ingve
https://lwn.net/SubscriberLink/791393/90b4a7adf99d95a8/
======
ryl00
I work on a large, multi-platform, legacy codebase of C/C++ and Fortran, and
the FFI between the two sides has always been an area where care must be taken
to ensure proper interop.

Being multi-platform, multi-compiler through the years has helped us ferret
out FFI problems like the one described here, as our Windows Fortran compiler
used to be Compaq Visual Fortran, which interleaved character strings with
their length argument (as opposed to gfortran, which puts all such string
lengths at the very end of the argument list). With Compaq Visual Fortran (and
Intel Fortran with the right options set), forgetting to specify a string
length argument on the C side would almost invariably lead to immediate
crashes (unless, of course, the sole string argument was at the end).
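
To make the two layouts concrete, here is a minimal sketch in C. The routine name `TOTLEN`, both C function names, and the use of `size_t` for the hidden lengths are assumptions made for illustration (the actual length type depends on the compiler and its version), and the bodies are C stand-ins so the sketch compiles without a Fortran compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C-side view of a Fortran routine
 *     SUBROUTINE TOTLEN(A, B, N)
 *     CHARACTER*(*) A, B
 *     INTEGER N
 * The bodies below are C stand-ins, not real Fortran object code. */

/* Trailing convention (gfortran style): all hidden string lengths are
 * appended after the declared arguments. */
void totlen_trailing(const char *a, const char *b, int *n,
                     size_t a_len, size_t b_len)
{
    assert(a != NULL && b != NULL && n != NULL);
    *n = (int)(a_len + b_len);
}

/* Interleaved convention (as described for Compaq Visual Fortran): each
 * hidden length immediately follows the string it belongs to. */
void totlen_interleaved(const char *a, size_t a_len,
                        const char *b, size_t b_len, int *n)
{
    assert(a != NULL && b != NULL && n != NULL);
    *n = (int)(a_len + b_len);
}
```

With the interleaved layout, leaving out `a_len` shifts every later argument by one slot, so the callee reads a pointer where it expects a length and fails quickly. With the trailing layout, the same omission only leaves the last slots unset, which is why the mistake can go unnoticed for a long time.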

~~~
ChickeNES
Compaq had their own Fortran compiler?

~~~
EvanAnderson
It was DEC's and came to Compaq by way of the acquisition.

------
greglindahl
The discussion is probably pretty confusing for most folks not familiar with
Fortran.

Fortran compilers can set their own ABI and usually do for everything after
Fortran 77. For Fortran 77 and earlier, which includes CHARACTER*1, most
compilers use the same ABI that the Bell Labs f77 preprocessor used, which
lives on as f2c in the modern era. This compatibility is common enough that
many packages written in C that want to be called from Fortran have this ABI
directly embedded in their source code.

The problem is that the current gcc gfortran front-end doesn't do this for
CHARACTER*1 strings, and there's C code expecting that extra string
length argument. And if they want to change it, any object code with
CHARACTER*1 arguments in it needs to be recompiled.

~~~
AnssiH
> The problem is that the current gcc gfortran front-end doesn't do this for
> CHARACTER*1 strings, and there's C code expecting that extra string length
> argument.

Is it? My reading of the article is that gcc gfortran _expects_ the length
argument for CHARACTER*1 to be provided as usual, but there is a lot of C
code that does _not_ provide that when calling Fortran functions.

~~~
acqq
That's how I read it too. The existing C code which is not passing the
expected parameter was a result of "hey I've tried it without and it worked"
approach. It "worked" at the moment, and only for that specific compiler, but
once the compiler started to rely on the "fact" that the parameter is "by
definition there" some of the previously "working" programs started to break.

And this outcome is in fact nothing special to C and Fortran. I can just as
well write a shell script (or any other) which "expects" that there is a
parameter and if I don't use it and I later start to depend on it I'm sure
there will be some other scripts calling that one and passing something else
than what was "defined" as expected to be passed.

On that topic, everybody should read ryl00's comment here. Having more than
one really different target is the only practically effective way to immunize
against these kinds of dependencies. For me, it's one of the good arguments
against "monoculture" software targets.

------
eridius
If the breakage is ultimately caused by compilers like GCC omitting the length
for single-character strings, why doesn't this discuss the fix of simply
changing GCC to start putting the length there for single-character strings?

~~~
0xffff2
I've never used FORTRAN, so I'm just going off of my reading of the article.
It looks to me like the correct requirement is that the C source code itself
include the string length, and that developers have not been doing so.

~~~
eridius
Oh I see. I was thinking that calls to Fortran functions were actually
recognizable by the compiler, but you're right, the code snippet right at the
top shows a manually-inserted strlen(s) in the C code.

------
kazinator
Hmm. Unless my understanding is off, these tail calls will work right if there
happens to be a word in the stack where the missing argument is supposed to
be, and that word has a value of 1 (correct string length). Because that looks
indistinguishable from the correct argument having been passed.

So that suggests a run-time solution like this:

1. The function examines the string length argument word. If that word is 1,
everything is cool; the function proceeds, the helper function is tail called
and so forth.

2. If, on entry, that word is not 1, then the function calls itself (with a
real non-tail call that allocates a new argument slot). It passes itself the
missing 1 value properly. When that nested call returns, it also returns.

3. This nested invocation issued in (2) now sees a value of 1 in that
argument since it is correctly passed, and so it proceeds as given in step
(1). When all that tail-call-ology finally executes a proper return somewhere,
it will return to this nested invocation of that function, which will pop out
and return to the C caller, as described in (2).

No need for a compiler switch to turn off tail calls. Tail calls work among
functions that obey the ABI (like all Fortran-Fortran calls). When an ABI
violation is detected, then we get an extra frame. (Hopefully there aren't
tail calling loops that involve cycling between Fortran and broken C.)

The broken C can be gradually fixed. The workaround can be deprecated and
removed when that happens. Correct code doesn't trigger the workaround
behavior in the invoked functions and also doesn't suffer the performance hit
of the extra nested call. That creates an incentive to fix the C code.
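
The three steps can be modeled in plain C, with one big caveat: portable C cannot read an argument that was never passed, so this sketch simulates the "missing" length by having the broken caller pass a garbage value explicitly. The names `routine`, `helper`, and `extra_frames` are hypothetical, and a real implementation would live in the compiler's generated code, not in source like this.

```c
#include <stddef.h>

/* Counts how often the workaround fired (for illustration only). */
int extra_frames = 0;

/* Hypothetical helper that the routine would normally tail-call. */
static int helper(const char *flag, size_t flag_len)
{
    return (flag_len == 1 && flag[0] == 'U') ? 1 : 0;
}

/* Model of a routine taking one CHARACTER*1 argument plus the word where
 * the hidden length is supposed to be. */
int routine(const char *flag, size_t len_word)
{
    if (len_word != 1) {
        /* Step 2: the length word is not 1, so assume the caller broke the
         * ABI. Re-enter with the length passed properly. In generated code
         * this would have to be a genuine non-tail call; plain C cannot
         * force that, so this is only a model. */
        int result;
        extra_frames++;
        result = routine(flag, 1);
        return result;
    }
    /* Steps 1 and 3: the length word is 1, proceed to the (tail) call. */
    return helper(flag, len_word);
}
```

A correct caller, `routine("U", 1)`, never takes the extra branch, so only the broken callers pay for the nested call. A broken caller whose stack word happens to hold 1 is indistinguishable from a correct one, as noted at the start of the comment.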

~~~
comex
I’m not sure that doing that check in every function that might be called from
C would actually improve performance compared to just turning off tail calls.

~~~
kazinator
Well, not every function that might be called from C; every function which
takes this funny string argument that is known to be of length 1 that might be
called from C.

------
uxp100
The article quoted someone saying "OUCH. So, basically, people have been
depending on C undefined behavior for ages..."

This doesn't actually have to do with C undefined behavior at all, right? I
think I get what is happening but that comment is throwing me off.

~~~
quietbritishjim
As I understand it, some C programs omit the final argument to a function.
That function does not actually read the value of the argument, but it is
still undefined behaviour for it not to be passed in.

~~~
wahern
Yes and no. This is a great example of why some of the rhetoric about undefined
behavior in C is misleading.

This is fundamentally an FFI issue. The C compiler is relying on the function
signature declared, explicitly or implicitly, by the person writing the
binding code. But this declared signature is _wrong_ , and because it's wrong
C says that the behavior when invoking the routine is undefined--undefined
behavior.

Other languages would have the _exact_ same result. If you told Rust that the
FFI prototype for a FORTRAN routine was foo(pointer-to-char) instead of
foo(pointer-to-char, size_t), you'd get the exact same broken runtime behavior
as with the C code. But Rust never defines this as "undefined behavior".
Rust's behavior is no less undefined in the practical sense; it just doesn't
formalize the concept in its specification, such as it is. Like most languages
it side steps the issue because it's a thorny area for a specification,
especially thorny now that the literal phrase "undefined behavior" elicits
reactionary flames from the peanut gallery. If you say nothing you're not
inviting criticism, nor are you burdened with managing yet another dimension
of consistency in your specification.

There are problems with "undefined behavior" in C, how it works in the
standard and especially how compilers use it to shift blame for insecure
semantics. The problems are just nuanced and technical and can't be
constructively discussed outside specific contexts. And especially in
discussions comparing C to different languages, these problems get conflated
with orthogonal issues like type safety.

EDIT: For context here are some relevant citations to C11 itself:

C11 (N1570) 3.4.3p1: "undefined behavior [is] behavior, upon use of a
nonportable or erroneous program construct or of erroneous data, for which
this International Standard imposes no requirements"

C11 (N1570) 3.4.3p2: "NOTE Possible undefined behavior ranges from ignoring the
situation completely with unpredictable results, to behaving during
translation or program execution in a documented manner characteristic of the
environment (with or without the issuance of a diagnostic message), to
terminating a translation or execution (with the issuance of a diagnostic
message)."

C11 (N1570) 4p2: "If a ‘‘shall’’ or ‘‘shall not’’ requirement that appears
outside of a constraint or runtime-constraint is violated, the behavior is
undefined. Undefined behavior is otherwise indicated in this International
Standard by the words ‘‘undefined behavior’’ or by the omission of any
explicit definition of behavior. There is no difference in emphasis among
these three; they all describe ‘‘behavior that is undefined’’."

~~~
uxp100
I guess my confusion is if writing a signature for a FORTRAN function wrong is
undefined behavior.

I wasn't thinking that having a function prototype incompatible with the
external fortran code would be undefined, just, uh, wrong in a normal sort of
way.

~~~
shakna
How could it be wrong in a 'normal sort of way', though?

The type signature is the rule of law that allows you to normally pass safe
pieces of memory between two systems. If you define it badly, what exactly is
supposed to catch this and tell you that it's wrong?

Is the compiler supposed to speak both languages, and be able to disassemble
whatever object code it's given when linking to attempt to determine if the
signature is valid?

~~~
uxp100
By generating c code that correctly sets memory to call a function with a
particular ABI with the given signature. Then it's the normal sort of wrong.

Since the spec doesn't speak about FFI at all it seems, I think this is
implementation defined behavior, not undefined behavior.

~~~
shakna
> By generating c code that correctly sets memory to call a function with a
> particular ABI with the given signature.

And it does. The signature is wrong, however, and so the resulting call will
be wrong.

Since the standard doesn't speak about FFI, and the invalid call is actually
in another language's memory, I'd say we're in completely undefined territory
here.

To try and say that more clearly, I'd still call it undefined behaviour if an
invalid type signature in Java calling Go resulted in something going wrong.
Java can't know how Go is supposed to react to a fudge in its internal memory
semantics.

~~~
uxp100
Ok, I understand what you mean. In the C spec undefined behavior has a
specific meaning that I don’t think applies to this case.

------
verisimilitudes
I'm horribly amused by this article. You constantly see C programmers working
around that mess of a language like this and so constantly see these glaring
flaws appear as if from the ether, almost as if it's a natural consequence of
programming.

Meanwhile, Ada has Interfaces.Fortran for actually interfacing with the
language and doing it correctly. Fools will tell you C is the backbone of
civilization and everything interfaces with it, but this is yet another
example of how they're wrong.

~~~
anfilt
This is not a language issue, but programmers who have made a mistake. The
same thing would happen if an assembly programmer were not to set the correct
registers or push all the arguments to the stack before calling another
function. The same thing would happen in most languages if a programmer
declared their FFI incorrectly.

So programmer/s have written function prototypes for the Fortran functions,
but they have omitted the length parameter in that prototype. That is the
reason behind this. It should never have worked to begin with, but the way it
was handled on the Fortran side meant that for length-1 strings it did
not crash. GCC probably could have just made the change they wanted, but they
don't want to immediately break existing code that is technically already
broken.

It boils down to some programmer/s doing the following.

int foo(char *str);

instead of

int foo(char *str, int len); // The correct prototype.

The prototype is you saying there is a function that exists with this name,
return type and parameters. The compiler will then construct calls with what
you have defined as the function prototype. Then when linking the linker just
looks at the name of the called function and replaces it with the
address/memory position of the function.
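
A minimal sketch of what the correct declaration and call might look like follows. The trailing underscore in `foo_`, the `size_t` length type, the wrapper `call_foo`, and the body of `foo_` are all assumptions for illustration: name mangling and the hidden length type vary by compiler and version, and the body here is a C stand-in so the example links without a Fortran compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed C-side declaration for a Fortran function
 *     INTEGER FUNCTION FOO(STR)
 *     CHARACTER*(*) STR
 * with the hidden string length passed as a trailing argument. */
int foo_(const char *str, size_t str_len);

/* C stand-in for the Fortran side: counts the non-blank characters.
 * It relies on str_len because Fortran strings are not NUL-terminated. */
int foo_(const char *str, size_t str_len)
{
    size_t i;
    int count = 0;
    assert(str != NULL);
    for (i = 0; i < str_len; i++) {
        if (str[i] != ' ') {
            count++;
        }
    }
    return count;
}

/* Hypothetical wrapper: the caller supplies the length explicitly,
 * even when the string is a single character. */
int call_foo(const char *s)
{
    return foo_(s, strlen(s));
}
```

If the declaration dropped `str_len`, the compiler would build every call without it and the linker would still resolve `foo_` by name alone, so nothing in the toolchain would flag the mismatch.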

~~~
verisimilitudes
>This is not a language issue, but programmers who have made a mistake.

That's always the excuse with C, I know.

>The same thing would happen in most languages if a programmer declared their
FFI incorrectly.

If the Ada Interfaces.Fortran didn't work, that would be a compiler issue, but
that's only because Ada does things right.

>It boils down to some programmer/s doing the following.

I read the article, you know.

~~~
anfilt
You can do the same thing in ada. Instead of using the Fortran_Character type
defined in Interfaces.Fortran you could just define your own that does not
follow Fortran's string calling convention.

Is that a language problem with ada? That's all the C programmers did here.

~~~
verisimilitudes
>You can do the same thing in ada.

I wouldn't know how GNAT behaves in this case, but I figure it would at the
least be much harder to do by mistake, as has been done with C for years.

>Instead of using the Fortran_Character type defined in Interfaces.Fortran you
could just define your own that does not follow Fortran's string calling
convention.

The Ada compiler is aware of what's being done when you interface with another
language. You must tell it how the procedure is from a different language and
you're also able to specify the representation of Ada data types to conform. I
wouldn't be surprised to learn that GNAT is capable of checking on the Fortran
side of things as well, for interfacing like this.

>Is that a language problem with ada? That's all the C programmers did here.

I don't believe Ada has this issue, but I'd wager it at the least makes it
harder to make. In any case, I have no personal experience interfacing Ada
with Fortran.

My point, however, is that this is a very common issue with C programming,
where people don't know what they're doing and things break constantly. The C
language is not designed for interfacing with other languages, in part because
it wasn't designed much at all.

~~~
rightbyte
Calling conventions are ... conventions. What language magically solves
inter-architecture ABIs?

