Hacker News new | past | comments | ask | show | jobs | submit login
Brainfuck interpreter written in the C preprocessor (github.com/orangeduck)
118 points by pepsi_can on Nov 16, 2012 | hide | past | favorite | 35 comments

"There has been much speculation on the turing completeness of the C Preprocessor, but I believe this is the first demonstrative proof that the C preprocessor is turing complete. This uses no GCC extensions and other than the rules for macro evaluation the only 'features' it takes advantage of are token pasting and variable argument macros."

Can anyone confirm that this is really the first demonstrative proof of this kind?


It is well known that the C preprocessor is not turing complete. Looking in https://github.com/orangeduck/CPP_COMPLETE/blob/master/RECR.... shows us this technique is not turing complete, as these functions define a maximum recursion depth.

It is impossible to implement a system without a de-facto recursion depth limit.

But this is not a de facto recursion limit of the system - the limit is encoded in the program itself, in the macros used to implement the limited recursion.

void f() { f(); } is only limited by the system, and if compiled on a system that automatically expands the stack to fill all available memory, you could in theory keep adding memory, and the system could handle more recursions. The recursion limit is external to the program.

But the linked c-preprocessor programs would have to be explicitly modified by adding more lines of code to achieve the same thing.

Yes, but as I point out elsewhere, Turing Completeness separates the expression from the machine. Machines will always have limits. But the expression of computation does not have to have limits.

Recursion depth limit is really a memory limit, so you just need someone to come and install more RAM when you run out in order to obtain de facto infinite memory, hence infinite recursion depth.

Even that is not infinite because there is a limit to how much memory you can address. You don't have to bring physical realities into it.

How is a "limit to how much memory [one] can address" not a physical, rather than logical, constraint?

Really, the crux of the matter is that you can have a function call itself an unbounded number of times in C (i.e. unbounded recursion). That you eventually run into a stack overflow is a limitation of the machine, not of the language; the definition of C does not prescribe a maximum number of recursive calls. This, together with conditionals, makes the C language Turing-complete.

The system will just spontaneously update to the next power-of-two bits when address space gets short. Problem solved.

That would break binary compatibility with all existing programs, though. So I guess the claim should be that it's impossible to implement a practically useful system without a de-facto recursion depth limit.

How does tail-recursion fit into this discussion of infinite recursion?

I'm not sure how that disqualifies this from Turing completeness any more than finite pointer sizes disqualify every other language from Turing completeness.

There's some subtlety going on here, but I'm going to agree with CJefferson.

Turing completeness is about separating the expression of a computation from the machinery that carries out the computation. If your expression language can be mapped to the idealized Turing machine, then your expression language can express all computations that can be computed. (That is, it is Turing Complete.) This allows us to reason about the expression of computation outside of the machinery that carries it out - because the machinery will always have limits.

The subtlety is: where do we draw the line between the expression and the machinery?

That is, assuming the question is "is the C preprocessor Turing Complete?" I answer no. Because implied in that question is that the C preprocessor language is the expression language we are interested in, and the C preprocessor is the machinery. The reason why I say that the C preprocessor expression language is not Turing complete is simple: you cannot actually express recursion or unbounded loops. No, the above is not a counter example. Why? Because the simulated loops are bounded, always. If you can't express arbitrary loops or recursion, then you can't compute everything that can be computed.

Note that I stressed express. Yes, an actual C program is clearly limited in how much it can recurse, but the language itself can express unbounded recursion.

And that can make our reasoning a bit more interesting: what if rather than asking the question about the C preprocessor, we instead ask it about the implied functional language that the author defined? This implied functional language is implemented in the macros the author defined. If we assume that his implied functional language is the expression language, and the machinery is both the C preprocessor language and the by-hand macros that CJefferson pointed out, then yes, that implied functional language is turing complete. That is, we could take the semantics behind his top-level macros, and go implement a fully Turing Complete language with those semantics. That's what you are getting at when you consider those macros the same as physical memory. We can do that, but now we're reasoning about the author's implied functional language, not the C preprocessor itself.

If you still don't agree, consider that the author's macro language is essentially the same as BlooP: http://en.wikipedia.org/wiki/BlooP_and_FlooP

So the expression of unbounded recursion can be expressed like this:

    #define EMPTY()
    #define DEFER(id) id EMPTY()

    #define FOREVER() \
        ? \
This will print out `?` infinitely. Here is the machinery to evaluate the expression:

    #define EVAL(...) A(A(A(__VA_ARGS__)))
    #define A(...) B(B(B(__VA_ARGS__)))
    #define B(...) C(C(C(__VA_ARGS__)))
    #define C(...) D(D(D(__VA_ARGS__)))
    #define D(...) E(E(E(__VA_ARGS__)))
    #define E(...) __VA_ARGS__

Which will expand to something like this:

    ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? FOREVER_INDIRECT () ()
Well at the end we see `FOREVER_INDIRECT () ()` which means the algorithm can expand further, so we could just apply more scans:

And we get more `?`. And we didn't have to change the expression of the algorithm just the machinery to evaluate it.

This is why I say the C preprocessor can act as Turing-complete, because the expression and machinery, while separate, is both in the language itself. This is different than BlooP, which the only possible way to express Turing completeness is through an external source(such as an input file).

Of course, for all practical purposes, the C preprocessor is near Turing-complete enough. The whole goal of the preprocessor is to output a file, we don't need it run indefinitely. We can express almost any useful algorithm, and if it reaches the recursion limit, we just simply apply more scans.

This is why I say the C preprocessor can act as Turing-complete, because the expression and machinery, while separate, is both in the language itself.

Unfortunately, "can act as Turing-complete" is not a well defined concept. It is clearly not the same thing, and the distinction is important. Consider, as pointed out elsewhere in this thread, that we can always solve the halting problem for a C preprocessor program. We're in danger of violent agreement, though: I think we agree on the main points, just perhaps not on their relevance.

Since we've gone this far down the rabbit hole, I'd like to point out that C++ templates are Turing complete: http://netvor.sk/~umage/docs/3rdparty/C++%20Templates%20are%...

This is an excellent explanation.

As you say, if we assume that my set of macros are the machinery, then truly my "BRAINFUCK()" macro is Turing complete.

Using this distinction between language and machine, how peculiar that a Turing Complete thing can be expressed using a language which is not Turing Complete itself.

The question is really - are these distinctions between machine and language always meaningful and without contradiction.

Using this distinction between language and machine, how peculiar that a Turing Complete thing can be expressed using a language which is not Turing Complete itself.

Ah, but that's the rub: we can't, and we didn't. You are imparting semantics on your macros that the C preprocessor has not. That is, the semantics that you imagine the macros to have are solely in your mind. They are not implied by the semantics of the C preprocessor.

The C preprocessor is a pushdown automaton, which if you add fixpoints (unbounded recursion) becomes a bounded storage machine. Our actual computers are such machines, which if you give them an infinite amount of memory are Turing machines. A language can be Turing-complete even if the implementation of that language is not.

Interestingly, any physical, finite machine can be modelled as a finite state automaton (which is even more limited than a pushdown automaton) with a large enough state space. The distinction only comes up when you consider idealized infinite machines.

Theory of computation is weird.

Can C be implemented such that pointers do not have a fixed size? That is to say, suppose you have a hypothetical machine that uses bignum memory addresses. Can C be implemented to run on that machine without cheating and using a fixed pointer size?

You don't need to reason about C's pointers to consider its Turing Complete. That it allows arbitrary storage, it has conditionals and allows full looping (either unbounded loops or full recursion) is enough. That is, a language that had C's semantics sans pointers would still be Turing Complete.

Are you saying that C with neither pointers nor arrays would still be Turing complete? I am skeptical.

I actually pondered on this for a long time, and I haven't come to a conclusion. I cannot, however, counter any of your arguments, as I thought some of the same things myself.

This is a surprising train of thought.

One simple reason that the C Preprocessor language is not Turing complete - the answer to the halting problem for any CPP input is "yes".

Ah now, that is an interesting point. Any language that only supports looping by recursion could, in principle, avoid that problem regardless of stack size by transforming recursion into iteration. But CPP can't do that.

Thanks, that's the first clear distinction I've seen for why this is "more" Turing incomplete than C.

Now all you need to do is implement COBOL in it, and the gate to hell will open.

He said it doesn't use any gcc extensions, but this still doesn't seem to work with llvm-gcc or clang. :(

I haven't tried this, but did you compile with the equivalent of -std=c99?

If there's a recursion limit, can you not write a program that fails to terminate?


Well sure, but there's always a recursion limit of one kind or another.

All programs terminate at the heat death of the universe.

Impressive. Massively impressive. Does this count towards your thesis in any way? I'd hate the kind of effort this required to go to waste when PhDs sucks the life out of your for 6-8 years (at least in the US.)

Has anybody checked out [http://www.ioccc.org/2001/herrmann1.hint

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact