
Lisp macros for C - eudox
https://github.com/eudoxia0/cmacro
======
stevelosh
A good question for any "macro" system like this is something like the
following:

1. Suppose you've written a function 'foo' in your language that does
something useful (e.g. partitions a sequence, validates a map, or pretty much
anything really).

2. Some time later you want to write a macro 'bar'.

3. Can you use function 'foo' when writing 'bar'?

If the answer to 3 is "no" then you don't really have Lisp-like macros.

This is what makes Lisp macros so powerful. It's not just that you have a way
to mangle abstract syntax trees. If that's all you want then yeah, you can
write a parser and template language to do it, like this thing or sweet.js,
but it's not the same.

Lisp macros are beautiful because there's no real divide between "writing
code" and "writing macros". It's all just code. You don't have to worry about
which things are available in code-land and which are available in macro-land
because _they're the same place_. There's just "the language" which you
extend and mold with functions and macros woven together as necessary.

Contrast this with a language without real Lisp-like macros, like
Clojurescript. If I want to write the 'bar' macro in Clojurescript I need to
think "wait 'foo' is a Clojurescript function so now I need to port it back
into Clojure land before I can use it in this macro because macros in
Clojurescript live in Clojure-land not Clojurescript-land". I need to think
about this for everything I call inside a macro.

(Admittedly the situation isn't as bad in Clojurescript because it's fairly
close to Clojure in syntax and it's possible to cross-compile code so it lives
in both "lands", but the ugly divide is still there.)

Common Lisp, Clojure, Scheme, Wisp, Julia, etc. have Lisp-like macros.
sweet.js, Clojurescript, C, etc. don't.

~~~
groovy2shoes
These types of macros are very similar to Scheme's syntax-rules macros without
the hygiene.

Viewing Lisp macros as an arbitrary function from syntax to syntax is fine,
but such macros are not very Scheme-y. It's _very_ difficult to have such
macros and guarantee they're hygienic, which is why syntax-rules limits you to
matching patterns and producing templates -- it effectively limits the kinds
of functions which can be macros.

Other types of Scheme macros (which are non-standard), like explicit renaming
and syntactic closures, require programmers to opt-in to hygiene. syntax-case
(which is in R6RS) allows programmers to break hygiene by jumping through
hoops, but is otherwise similar to TFA's system as well.

So if Scheme is a Lisp and these macros are like Scheme macros, I don't think
it's inaccurate to call them Lisp-like macros. But it _is_ imprecise.

As an aside, I've written a compiler for a Lisp-like language. Speaking from
experience, getting the compiler to answer "yes" to 3 is non-trivial (even
though it's dead-simple in an interpreter). I suppose that's why Racket has
all the phase distinctions it has.

------
tomp
For something similar, take a look at Terra [1], the "low-level counterpart to
Lua". It's based on Lua, but has "low-level" functions that are JIT-compiled
to machine code using LLVM. It supports calling Lua macros from low-level
functions, or you can use Lua code to write complex high-level frameworks
(e.g. Go-like object system [2], Java-like object system [3]).

It's still in development, so there are parts of the system that don't work
very well (e.g. global variables in compiled executables cannot be statically
initialized), but it's a very interesting project.

[1] [http://terralang.org/](http://terralang.org/)

[2]
[https://github.com/zdevito/terra/blob/master/tests/lib/golik...](https://github.com/zdevito/terra/blob/master/tests/lib/golike.t)

[3]
[https://github.com/zdevito/terra/blob/master/tests/lib/javal...](https://github.com/zdevito/terra/blob/master/tests/lib/javalike.t)

~~~
sea6ear
I just wanted to say thanks for posting the link to Terra. I hadn't heard
about it before. I'm a big fan of Lua so it looks really interesting.

------
terhechte
It would be great to be able to integrate this into the Clang pipeline so that
Clang based auto completion and error detection would not trip over it but
correctly evaluate the resulting source and use that. Then, it would be easy
to integrate this into an existing project workflow. Otherwise it would limit
usability due to a lack of tooling support, I'd guess.

Clang already does that for the C preprocessor, although that is probably much
easier to do since it is not Turing-complete [1].

[1] By definition. Though some people throw crazy things at it:
[http://stackoverflow.com/questions/3136686/is-the-c99-preprocessor-turing-complete](http://stackoverflow.com/questions/3136686/is-the-c99-preprocessor-turing-complete)

~~~
eudox
I actually started out trying different parsers (Clang, pycparser), but they
validate too much. cmacro uses a home-spun parser because it's meant to let
you write things like:

    
    
      route "/user/edit" POST => lambda(Req* request) -> Resp* { ... }
    

Which isn't even close to valid C.

As for integration: it's pretty simple to operate from the command line, which
is good enough for integration with e.g. a Makefile-built project.

~~~
ghkbrew
Yeah, I'm guessing the parsing would be a problem for integrating with other
actual C-language parsers. It looks like the parsing is completely open-ended,
i.e. there is no way to tell if you've completely parsed the macro without
knowing the macro definition.

Maybe taking a page out of Rust's book and adding slightly more syntax to the
macro would help. If the parser knows that a macro is any identifier followed
by a ! and then enclosed in braces, it could simply ignore the contents and
continue parsing.

------
gavinpc
> Because a language without macros is a tool: You write applications with it.
> A language with macros is building material: You shape it and grow it into
> your application.

Is this the OP's analogy? It's quite interesting.

~~~
AnimalMuppet
This is a standard Lisp idea, but I think it's kind of off base.

Lisp has this idea of creating a language in which you can solve your problem.
That's fine, but the Lisp people mean something rather different than the rest
of us do.

When I "create a language to solve a problem" in, say, C++, I create some
nouns (objects), some verbs (methods), maybe some adjectives (other objects or
flags) and adverbs (more flags). Then I can write my application using that
"language", but using normal C/C++ syntax.

When Lisp people say "create a language to solve a problem", they mean "write
completely different syntax". The article gives some examples.

Why would I want to do that? Well, I might want to explore a new paradigm -
some new thing like aspect oriented programming, say. I don't have to wait for
a language that implements it, I can just do it myself. This is great for
academic research projects, but not nearly so great for production code.
(Bringing new people up to speed becomes a much longer process, if your code
outlives the original developers.)

But I might have to create new language features just to get anything done in
Lisp. One example is LOOP. It's a macro, because Lisp without macros doesn't
have very good looping ability. "It's a building material" partly in this
sense: _It's not a very good tool_. It's not all that usable until you add
the parts that, in most other languages, you already get.

Now, does LOOP do more than a C-style for or while loop? I doubt it, but
perhaps it does it more neatly. But C gives you 90% of what you need without a
macro, whereas Lisp gives you 10% without a macro. If you need more in C, you
can do it, even if it's a bit clumsy. But in Lisp without macros, it's
horribly clumsy all the time.

(Yes, I know that there are Lisp and Haskell types who seem to regard writing
a for loop over a container as a great waste of programmer time. I think that
they are mistaken.)

You can do some amazing things with Lisp macros. Paul Graham gives the example
of writing an extension language for Viaweb that Lisp macros turned into Lisp
code, which was then run on the server. That's really slick (though in the
current situation, you have to watch out for security issues unless you
validate that file _very_ carefully). But if he had chosen another approach,
what would change? The macros have to turn the file syntax into Lisp syntax.
If he had written an ordinary parser, he would have had to do the same. The
only difference is that, by using macros, he let the Lisp compiler do the
grunt work of the parsing.

TL;DR: Lisp _needs_ macros to be usable. I'm not convinced that C does.

~~~
lispm
> When I "create a language to solve a problem" in, say, C++,

Most of the time you just add words to a language by adding
verbs/adjectives/adverbs. It's not a new language.

> One example is LOOP. It's a macro, because Lisp without macros doesn't have
> very good looping ability.

That's backwards thinking.

LOOP is a macro, because macros are the way to implement code transformations
in Lisp. Lisp has other iteration constructs, which are implemented as
functions (MAP, REDUCE, ...).

Common Lisp is also its own compilation target. So at the very bottom there is
only one iteration construct provided: GOTO. The rest are macros/functions on
top of that. The language a compiler needs to understand is thus very small.
The built-in extension mechanism then allows almost arbitrary code
transformations. This is used by the language implementation AND the user. The
user/developer has access to the same facility to implement code
transformations in applications or libraries.

This is great for production, since you don't have to wait for some committee
or a benevolent dictator to implement the language feature you need. Instead
of waiting for a new iteration facility for months or years, you can implement
it in an afternoon. Productivity goes up. You don't need to wait for tools
generating code or more compact notations reducing boilerplate code - just
implement it yourself. You also don't need external preprocessors - just use
Lisp. You can also debug/extend your code transformation using the same
development tools, instead of maintaining external preprocessors. It also
enables complex programs to have comparatively small code bases. Which often
makes maintenance easier.

TL;DR: Lisp gives the user more expressive power and trusts them.

~~~
AnimalMuppet
But see, that's more or less my point.

In Lisp, you can write it in an afternoon. But in Lisp, because the language
(sans macros) doesn't give you much, you have a bunch of things that you have
to do that way. In C, because the base language gives you more, you have
enough that you don't _have_ to write the features that you need. (Granted,
this doesn't work if your goal is to reduce boilerplate code to zero...)

~~~
lispm
> In C, because the base language gives you more,

C does not have nested functions, lambda expressions, closures, ... it gives
me less.

Let's look at iteration.

C does give me primitive WHILE, DO WHILE and FOR statements. Nothing more.

I get almost nothing in C.

> But in Lisp, because the language (sans macros) doesn't give you much

It gives me already a language where WHILE and FOR can be written as
functions.

    
    
        (defun while (c b)
          (tagbody
           while
           (if (not (funcall c))
               (go end))
           (funcall b)
           (go while)
           end))
    

In C you get the best of both worlds: no powerful iteration and no easy way to
implement it.

------
canweriotnow
Even in C, always remember the rules of Macro Club[1]:

1) You do not write macros.

2) You do not write macros that violate expectations of normal code behavior.

3) If this is your first time, you must write a macro.

4) Don't think object-oriented.

5) No shirt, No shoes, No dynamic scope.

6) Don't create new scoping rules.

[1] [http://stuartsierra.com/download/2010-10-23-clojure-conj-macro-club.pdf](http://stuartsierra.com/download/2010-10-23-clojure-conj-macro-club.pdf)

------
norswap
Shameless self plug, the same thing for Java:
[https://github.com/norswap/caxap](https://github.com/norswap/caxap)

Although I'll acknowledge the criticism made elsewhere in the comments: this
is not really usable due to lack of tooling support.

~~~
fixermark
Normally, the lack of tooling support is the first thing that turns me off to
attempts such as this to adjust fundamental aspects of the coding pipeline.

In the context of C, however, you're up against a language (the preprocessor)
that itself has only bare-minimum tooling support, so you've got a leg-up over
similar projects in that regard. :)

------
geon
How does this compare to C-Amplify?

[http://voodoo-slide.blogspot.se/2010/01/amplifying-c.html](http://voodoo-slide.blogspot.se/2010/01/amplifying-c.html)

------
arh68
So I tried installing this on a Debian box; did anyone else have issues
compiling? I apt-got sbcl and flex, but make failed with _"asdf-linguist" not
found_ until I swapped lines 46/47 in the Makefile. Now it looks okay:

    
    
        [saving current Lisp image into cmc:
        writing 5952 bytes from the read-only space at 0x20000000
        writing 4000 bytes from the static space at 0x20100000
        writing 49545216 bytes from the dynamic space at 0x1000000000
        done]

------
1ris
There is the classic example of why one should not use the cpp as a
replacement for function calls:

    #define square(x) ((x)*(x))

breaks when called with a++ as an argument. Does this preprocessor address
this problem?

~~~
AnimalMuppet
I think you're thinking about this the wrong way. Lisp "macros" and C/C++
"macros" are different concepts. This is not a cfront replacement.

~~~
1ris
In Lisp one usually does not write something like (begin (set! a (+ a 1)) a),
while in C this is pretty common, so this is an extra issue that might be
addressed separately.

------
jderick
Looks cool. I actually ended up using PHP a while back when I needed to do
something like this in C++.

------
Dewie
> There is a sweet spot between low-level performance and control and high-
> level metaprogramming that is not yet occupied by any language:

Maybe Nimrod?

~~~
nly
C++. Too many other languages get wrapped up in ideology. D comes pretty
close. I can't get past Rust's unreadability. I'd like to give Nimrod a try,
especially since it can be compiled to C and presumably integrated into C++
projects.

[Edit] Read Nimrod tutorials and ported a few toy programs. It's encouragingly
clean and doesn't seem to shy away from features... I wonder if Alex Stepanov
knows that, in Nimrod, "if you overload the == operator, the != operator is
available automatically and does the right thing" ;)

~~~
Rusky
Curious- what makes Rust unreadable for you? Abbreviations? Cryptic symbols?

~~~
nly
Yes on both counts. The problem is that I can type faster than I can think. I
don't believe that terseness inherently improves readability or encourages
better design. I have felt this way since Perl.

Consider this from the tutorial:

    
    
        fn draw_all(shapes: &[~Drawable]) {
    

vs C++:

    
    
        void draw_all (const vector<unique_ptr<const Drawable>>& shapes) {
    

Transcribing that took a few seconds of thought, and all the terseness of a
few symbols has accomplished is to obscure a poor choice of API and ownership
semantics. That's just one symbol... how horrific can we make something
without noticing with just a few more params and symbols?

~~~
pcwalton
Why is that a poor choice of API and ownership semantics? That's entirely
natural. You use a range in order to provide the maximum convenience to your
callers, and you use unique ownership to provide efficient memory management
and reduce the amount of copying of data that has to be done when the backing
store is resized.

(Also, your equivalence is not quite correct: `&[T]` is more like a Boost
range. In particular your transcription makes it look like the original code
only accepted Vec<T>, which is not the case.)

> That's just one symbol... how horrific can we make something without
> noticing with just a few more params and symbols?

You've pretty much covered all the type-level symbols in Rust, except for *
for unsafe pointers.

~~~
nly
If &[T] is somewhat generic already, why even have 'Drawable' in the
signature?

Also, doesn't having a borrowed array of unique references to Drawables mean
the elements of the array are either now implicitly borrowed, or I have to
borrow each of them before they're accessed? Just knowing the symbols doesn't
make the semantics clear. In C++ all smart pointers are values in their own
right. In my example I have a reference to an array of smart pointers, and
there's no magic.

~~~
pcwalton
> If &[T] is somewhat generic already, why even have 'Drawable' in the
> signature?

`&[T]` isn't generic: it's a bounds-checked slice. Two pointers: start and
end.

Presumably `Drawable` is in the signature so that methods specific to
`Drawable` can be called.

> Also, doesn't having a borrowed array of unique references to Drawables mean
> the elements of the array are either now implicitly borrowed, or I have to
> borrow each of them before they're accessed?

They work like anything else: if you want to take a reference to a Drawable,
you borrow it.

> Just knowing the symbols doesn't make the semantics clear.

Yes. Also true for C++'s symbols; e.g. `&`.

> In C++ all smart pointers are values in their own right.

Same in Rust.

> In my example I have a reference to an array of smart pointers, and there's
> no magic.

Same in that example.

~~~
nly
A good test for me is whether the simplest possible implementation works:

    
    
        template <typename Range>
        void draw_all (Range r) {
       for (auto& e : r) { draw(e); }
        }
    

There are no magical symbols here at all. It's not efficient, but it's
completely memory safe... breaks with unique_ptr though, which is a good
indication for me that unique_ptr is the wrong choice. Here's the less safe
'borrowing' version:

    
    
        template <typename Range>
        void draw_all (Range const& r) {
       for (auto const& e : r) { draw(e); }
        }
    

and a sane compromise:

    
    
        template <typename Range>
        void draw_all (Range r) {
       for (auto const& e : r) { draw(e); }
        }
    

How would you write all of these in Rust?

~~~
pcwalton
The reason for use of a unique pointer is that if the caller had an array of
unique pointers to Drawables, then you want the caller to be able to call
draw_all() without recreating the array.

You could come up with a generic function that doesn't require unique
ownership (for example, one that takes an Iterator<&Drawable>), but the
function in that example wasn't generic because making everything generic just
in case is overengineering.

You seem pretty confused about how borrowing works. Borrowing is tangential to
that function.

~~~
nly
I'm not confused about borrowing. What I'm saying is that borrowing, like
plain refs in C++, is a performance hack in both languages. In my final
version above, I have assumed copying a range is cheap, and copying a Drawable
is expensive and unnecessary... seems perfect to me, and it will not compile
when passing a vector of unique_ptrs, which is what I would want.

~~~
pcwalton
References are not a performance hack; they're a fundamental way to avoid lots
of moves, which obscure algorithms and cause a lot of mutation. (In Rust they
are memory safe.)

Anyway, if you wanted a generic reference-taking version:

    
    
        fn draw_all<I:Iterator<&Drawable>>(iterator: I) {
            for drawable in iterator {
                drawable.draw();
            }
        }
    

And a generic move version:

    
    
        fn draw_all<I:Iterator<Drawable>>(iterator: I) {
            for drawable in iterator {
                drawable.draw();
            }
        }

~~~
nly
Internal moves make no sense. You can't move an object out of a container that
has invariants, like an ordered map. This is why it only makes sense for the
_caller_ to move a container in to the function. This is efficient. I've
edited my 3rd version appropriately, as moving from a const& was clearly
bogus.

    
    
        template <typename Range>
        void draw_all (Range r) {
       for (auto const& e : r) { draw(e); }
        }

~~~
pcwalton
Using the container's iterator to move allows the container to enforce those
invariants.

~~~
nly
I've totally lost track of what we are arguing about. Anything containing a
unique element has to be logically unique itself, right?

In C++, which defaults to value semantics, it's required that you move your
container if it contains a non-copyable (unique) element. So you only need to
move _into_ the draw_all function in this case, which is why taking the range
by value is not just efficient, but semantically correct. If the caller moves
into the function, then when it returns the caller will no longer own any
elements. The caller's vector will be empty, and the elements themselves will
still be unique, having never been copied, moved, or "borrowed".

If borrowing isn't a performance hack, then why not make everything you're
ever likely to borrow shared? I'd argue anything you're drawing is _shared_
between the draw routine and the caller. Drawing a distinction just because
the caller is suspended, seems like an impediment to future change if, for
example, you later switch to a coroutine or an asynchronous/threaded
operation. Copying the range and sharing elements gets you this for free.

In summary, 'draw_all' as specified was a bad API because:

* It restricted the type of range/container passed to it

* It had unnatural ownership semantics (borrowing a box of unique things without saying you're borrowing those things is weird).

* The implementation, as was, required further borrows which were only implied. In C++ you take everything straight away.

~~~
pcwalton
> * It restricted the type of range/container passed to it

Yes, it did, but as I mentioned before, it's overengineering to make
everything generic that could possibly be generic.

> * It had unnatural ownership semantics (borrowing a box of unique things
> without saying you're borrowing those things is weird).

No, it's not, it's quite natural. `&[&Drawable]` is not a subtype of
`&[~Drawable]`, so if your caller has an array of `&[~Drawable]`, then they
would have to recreate the array to pass it to that function.

> * The implementation, as was, required further borrows which were only
> implied. In C++ you take everything straight away.

I don't understand what this means, but in any case C++ and Rust don't differ
substantially on ownership/reference/move semantics.

------
aryastark
I just don't know how to reconcile the fact that HN tore the C language a new
arsehole over "goto fail", and is now back to praising macros.

Macros need to be tossed into the dustbin of history, right next to self-
modifying code and other cute but dangerous hacks.

~~~
AnimalMuppet
HN is not one person.

There are many people here, from many backgrounds, with many different
perspectives. Almost all languages have proponents here (with COBOL the
possible exception), and _all_ languages have detractors here.

~~~
aryastark
Look, I don't have time for your childish pedantry. There were entire articles
on the front page of HN blaming C for having poor language design on the issue
of "goto fail". Now there is an article promoting macros in C, no less.

HN may not be one person, but does have a front page as a result of aggregate
behavior.

~~~
dang
> Look, I don't have time for your childish pedantry

Please don't make personally aggressive comments on Hacker News.

~~~
aryastark
You can delete the "aryastark" account and all comments.

