

Abusing the C Compiler - wingo
http://wingolog.org/archives/2010/09/07/abusing-the-c-compiler

======
acqq
Hardly abusing, or original: generating small chunks of code and invoking the
C compiler on each of them is how GNU autoconf works (and the reason it is so
slow, too!)

~~~
wingo
An interesting parallel. I guess the novelty is that there is no intermediate
expanded file -- no config.h, etc.

~~~
mfukar
I don't see a single sentence in the article that implies this.

~~~
wingo
Not sure what you mean here; the autoconf analogy came from the grandparent.

If you are under the impression that an intermediate expanded file is
produced, then you misunderstand how Lisp macros work.

~~~
mfukar
I'd like to point you to the 2008 paper by Felix Klock [1], or perhaps the
more easily digestible presentation [2], where you will find (slides 39-47) the
description of the inner workings of procedural macros like define-c-info,
where it's explicitly stated that an intermediate C program is generated,
compiled & executed to get the desired result(s).

This is not groundbreaking per se, but it is certainly a smart compile-time
tactic of interfacing with a C ABI.

[1] http://www.ccs.neu.edu/home/pnkfelix/Published/klock-ffi-schemeworkshop-2008.pdf
[2] http://www.ccs.neu.edu/home/pnkfelix/Published/klock-ffi-schemeworkshop-2008-slides.pdf

~~~
wingo
I agree, that is a great paper. It's the one I linked to in my article :)

~~~
mfukar
I was specifically referring to the usage of intermediate files...I think I
assumed too much, since obviously you must have a very different view of what
constitutes an intermediate file than I do. Would you care to elaborate?

~~~
wingo
I don't want to get too bogged down here, but sure:

When you use autoconf, you typically run configure, then you're left with a
config.h, which then parameterizes later builds.

On the other hand, when you evaluate the definition of dirent->name, no
intermediate file is left behind.

Of course in both cases you make temporary C files, but they are ephemeral.
The difference is that in the first case you are generating files for
inclusion in a later phase, and in the second you are effectively extending
your Scheme compiler with a C compiler.

The surprising thing about this code from a Scheme programmer's POV is that usually
macros are about rewriting Scheme source using Scheme. In this case the macro
generates C source, forks to compile and run it, and munges the result into
the resulting text.
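
To make the "generates C source, compiles and runs it" step concrete, here is a rough sketch of what the throwaway C side might contain for the dirent case. The helper name `describe_dirent` and the s-expression output format are made up for illustration; this is not what Guile or define-c-info actually emits, and it assumes a POSIX system where `struct dirent` has a `d_name` member.

```c
#include <assert.h>
#include <dirent.h>   /* struct dirent (POSIX) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdio.h>    /* snprintf */

/* Hypothetical helper: formats the layout facts a macro expander might want
 * as a single s-expression in buf. Returns the number of characters written
 * (not counting the terminating NUL), or -1 if buf was too small. */
int describe_dirent(char *buf, size_t len)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, len,
                 "((sizeof . %zu) (d_name-offset . %zu))",
                 sizeof(struct dirent),
                 offsetof(struct dirent, d_name));
    if (n < 0 || (size_t)n >= len)
        return -1;   /* output error or truncated output */
    return n;
}
```

A generated program would call something like this from `main`, print the buffer, and exit; the macro would then read that text back and splice the numbers into the expansion.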

But sure, I can see that from a certain point of view, autoconf and scheme
macros can do similar things :)

~~~
mfukar
I see your point, but the fact remains that the underlying technique is just
the same. Whether intermediate files are used further down the build process
seems somewhat irrelevant.

------
nitfol
<http://perldoc.perl.org/pstruct.html> takes this a bit further (but less
portably), and compiles the code to assembly and parses out the debug records
to find the information about the structures.

------
JoeAltmaier
Very cute; very problematical. C struct offsets depend entirely on the model,
packing, target etc. How can this be controlled?

~~~
tedunangst
You include stddef.h and use offsetof().
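
A minimal sketch of that suggestion, using a made-up struct (`struct record`, its fields, and `read_count` are illustrative only). The idea is that the offset comes from the compiler for the current target and flags, and the caller only ever sees raw bytes plus that offset:

```c
#include <assert.h>
#include <stddef.h>  /* offsetof */
#include <string.h>  /* memcpy */

/* Hypothetical struct, for illustration only. */
struct record {
    char  tag;
    int   count;
    float weight;
};

/* Read the `count` field out of raw memory using only the offset that the
 * compiler reports, which is what an FFI caller would have to do. */
int read_count(const void *raw)
{
    int value;

    assert(raw != NULL);
    memcpy(&value,
           (const char *)raw + offsetof(struct record, count),
           sizeof value);
    return value;
}
```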

~~~
JoeAltmaier
Unless you're compiling for another platform, or a different code model than
the default etc.

~~~
koenigdavidmj
In the first case, you are likely going to have a libc and headers from the
target platform sitting around so that you can link.

~~~
JoeAltmaier
May take switches on the compiler command line.

Further, packing can depend upon the pragma in force when the header file is
included. How to set up that environment?
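
To illustrate the pragma concern, here is a small sketch with two made-up structs that have identical members but are declared under different packing. It assumes a compiler that supports `#pragma pack(push, N)` / `#pragma pack(pop)` (GCC, Clang and MSVC do; it is an extension, not standard C):

```c
#include <assert.h>
#include <stddef.h>  /* offsetof */

/* Default packing: the int is placed at its natural alignment. */
struct natural { char c; int i; };

/* Same members, but declared while 1-byte packing is in force. */
#pragma pack(push, 1)
struct packed { char c; int i; };
#pragma pack(pop)

size_t natural_offset(void) { return offsetof(struct natural, i); }
size_t packed_offset(void)  { return offsetof(struct packed, i); }

/* Sanity check that the pragma took effect on this compiler. */
void check_pack_pragma(void)
{
    assert(packed_offset() == 1);
    assert(natural_offset() >= packed_offset());
}
```

So any tool that asks the compiler for offsets has to include the header under the same pragma state and flags as the code that produced the data, otherwise it is measuring a different struct.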

~~~
wingo
You know, the whole dynamic FFI world is very much seatbelts-off, which is
initially quite disconcerting. Usually I rely on my distro to ensure
compatibility, but once you start saying "this syscall gives me a pointer to
memory which should be interpreted as two ints, one char, and a float, packed
conventionally" you begin to realize what exactly are the interfaces between
various bits on your system. For better and for worse of course.

What Guile has now is the "list-all-members-in-the-struct" approach that Klock
discourages. It's the difference between API and ABI compatibility. I'd like
to figure out how to do the former.
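
As a rough sketch of what listing the members by hand amounts to, here is the "two ints, one char, and a float" example with the offset of the float computed from member sizes and alignments alone, next to a struct the compiler lays out itself. The names `struct sample`, `align_up` and `computed_float_offset` are made up, and the hand computation assumes the conventional rule that each member starts at the next multiple of its own alignment:

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */

/* The layout from the comment above: two ints, one char, and a float. */
struct sample { int a; int b; char c; float f; };

/* Round offset up to the next multiple of align (a power of two). */
static size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Offset of the float, computed by walking the member list by hand, the way
 * an FFI that only knows the member types would have to. */
size_t computed_float_offset(void)
{
    size_t off = 0;

    off = align_up(off, _Alignof(int)) + sizeof(int);    /* a */
    off = align_up(off, _Alignof(int)) + sizeof(int);    /* b */
    off = align_up(off, _Alignof(char)) + sizeof(char);  /* c */
    return align_up(off, _Alignof(float));               /* start of f */
}
```

On a conventional ABI this should agree with `offsetof(struct sample, f)`, but only because the hand-written member list matches the real struct exactly. If the header grows a field, the compiler's answer changes and the hand-written one silently does not, which is the API versus ABI gap.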


