
What type of Machine is the C Preprocessor? - cremno
http://theorangeduck.com/page/what-type-machine-c-preprocessor
======
jerf
"We imagine Turing Machine memory having an infinite tape which is just there,
not having a tape that must be generated as the head moves."

This is one of those times when you're probably better off saying and thinking
"unbounded" rather than "infinite". The point of a TM is not that it
"literally" (to the extent that term applies anyhow) has "infinite" cells in
it, the point is that there is no point in the computation in which it will
reach for another cell and be told that there isn't one. (So there isn't even
a way to represent that.) Many "infinite" things are better conceived of as
"unbounded". For another example, take "infinite lists" in Haskell. Obviously, they
cannot concretely manifest infinitely many cells in memory, but there is no
point at which the runtime will reach for "the next cell" and be told it has
reached the end of the list.
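
A minimal sketch of the same idea, in Python rather than Haskell (the names `naturals` and `take` are made up for illustration). The generator never holds infinitely many cells in memory, yet a consumer asking for the next one is never told the list has ended:

```python
from itertools import islice


def naturals():
    """Yield 0, 1, 2, ... with no built-in end."""
    n = 0
    while True:
        yield n
        n += 1


def take(count, iterable):
    """Return the first `count` items of `iterable` as a list."""
    return list(islice(iterable, count))


first_five = take(5, naturals())
print(first_five)  # [0, 1, 2, 3, 4]
```

Only the cells actually demanded are ever produced, which is the "unbounded" reading rather than the "infinite" one.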

That is also why it's better to model our real computers with TM math, even
though they obviously do not have literally infinite amounts of memory. As
long as a software process runs along and is never told that it is out of
memory when it asks for more, we get fairly TM-like behavior, or we get a
crash if it does run out. By contrast, while the FSM model is in some sense
more mathematically accurate, it is less _useful_ for modeling real computers,
since it is so weak.

I think this formulation also better intuitively explains why we use infinity
so often in our proofs; by saying "and there's always another one if you want
it", we remove the case where we have to handle there _not_ being another one.
And that case can get quite hairy, as anyone who has watched their machine
self-immolate after it discovers it is out of RAM can attest. :) (It can
be every bit as mathematically tedious to deal with, too.)

The further I got through grad school, the more I said "unbounded" rather than
"infinite".

~~~
grannyg00se
"The further I got through grad school, the more I said "unbounded" rather
than "infinite"."

I think there's a huge difference between simply saying "unbounded" compared
to going through that detailed explanation you just gave. In the latter case,
it's very clear what the difference is. But without the explanation I'd guess
that people will think something like this:

    
    
        unbounded -> without bound -> infinite
    

In other words, without the explanation "unbounded" becomes a synonym for
"infinite".

~~~
baddox
"Unbounded" means "any finite number, no matter how large." For many people,
this might be what they think of when they hear "infinite," but in
mathematics, "infinity" is an entirely different concept.

~~~
jerf
As I mention in another comment, I am specifically referring to the fact that
when we computer scientists or discrete mathematicians use the term _in
proofs_, we are more often interested in the "unbounded"ness than in something
with "actual" infinite cardinality... the fact that I understand the
difference is precisely _why_ I tended to start saying unbounded more often.
Using infinity to mean unbounded is really too powerful, and a proof should
use the least power necessary. (It was only an intuitive feeling at the time;
I have a much better understanding of that fact now.)

When infinity is the relevant concept, use it.

~~~
marvin
This reminds me of the distinction between _random_ and _arbitrary_. Seems to
trip people up sometimes.

------
chriswarbo
I don't think there's a specific name for this kind of Turing Machine, but it
looks to me like it's primitive-recursive, similar to Hofstadter's BLooP
('bounded loop') language. It's basically the difference between for loops,
which specify up-front how many times to loop, and while loops, which keep
going until some dynamic condition is met. For loops are total; while loops
are Turing complete. Note that these are the CompSci definitions of for and
while; most languages with "for" syntax are actually implementing while loops.
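
As a rough illustration in Python (function names are hypothetical): a `for` over a `range` behaves like the bounded loop described above, because the trip count is fixed before the body runs, whereas a `while` keeps going until a condition computed at run time holds.

```python
def bounded_sum(n):
    """Bounded loop: exactly n iterations, decided before the loop starts."""
    total = 0
    for i in range(n):  # the body cannot extend this range
        total += i
    return total


def collatz_steps(n):
    """Unbounded loop: runs until a dynamic condition is met (assumes n >= 1).

    Nothing in the code itself tells us how many iterations this takes.
    """
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


print(bounded_sum(5))    # 10
print(collatz_steps(6))  # 8
```

A C-style `for (;;)` whose body can change the counter or the condition falls in the second category, despite the keyword.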

There _are_ some interesting kinds of Turing Machine which use limited tapes,
rather than limited transition tables. For example, a monotone Turing machine
can have heads which only move one way. They're useful for modelling demand-
driven input and output (one read-only, monotone input tape, one write-only,
monotone output tape and one read-write non-monotone work tape). Other machines
that I've seen in research are 'enumerable-output machines', which can only
edit their previous output if it ends up lexicographically higher, and
'Generalised Turing Machines' which can edit their output as long as each bit
takes a finite number of steps to 'stabilise'. These machines have been
investigated by Jürgen Schmidhuber, among others.

~~~
gus_massa
I agree. After reading in the article that:

> _The main argument against Turing completeness was that my implementation
> did not support unbounded recursion._

I’m almost sure they are primitive recursive functions, but I didn’t have time
to write the complete proof.
[http://en.wikipedia.org/wiki/Primitive_recursive_function#Li...](http://en.wikipedia.org/wiki/Primitive_recursive_function#Limitations)

~~~
gus_massa
From a previous article of the same author:
[http://theorangeduck.com/page/c-preprocessor-turing-
complete](http://theorangeduck.com/page/c-preprocessor-turing-complete)

> _This is a fairly clear distinction and a compelling argument. Clearly there
> are languages where you can effectively express unbounded recursion
> function() { function() } and the C preprocessor is not one of them._

> _But there are a number of subtleties going on here. For example, if we
> consider the set of macros I created to be the "machinery", and the "language"
> to be the Brainfuck input to my system, then it is indeed Turing complete.
> That is - I have created machinery which can simulate Turing complete
> languages. Taking all the above definitions into account, consider how odd it
> is that Turing complete machinery can be expressed in a non Turing complete
> language._

And from the “Brainfuck interpreter written in the C preprocessor”:
[https://github.com/orangeduck/CPP_COMPLETE](https://github.com/orangeduck/CPP_COMPLETE)

> _Currently the maximum recursion depth is set to around 1000 and the data
> array size around 100. These can be easily extended but for now, as a
> general rule of thumb computations exceeding 1000 steps may not run._

Here is the problem. We can consider the one-step-interpreter: it’s a function
that has as arguments a program, the current instruction index and the whole
memory state, and this function computes the next instruction index and the
next whole memory state. This one-step-interpreter is primitive recursive. He
put that function inside a bounded loop, which is also implementable as a
primitive recursive function.

So essentially he created an interpreter that runs a program for at most a
fixed number of steps, where this number is an argument of the function (or
worse, in this case a constant). Interpreter(Program, Memory, MaxSteps) is a
primitive recursive function. To be Turing complete, he needs to write
InterpreterForEverIfNecessary(Program, Memory).

With a fixed MaxSteps value, it can’t compute all primitive recursive
functions.
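
A hedged sketch of that structure in Python, using a made-up two-instruction machine instead of Brainfuck (the instruction set and the names `step` and `run_bounded` are assumptions for illustration, not the author's macros). The one-step function is trivially total, and wrapping it in a bounded loop gives an interpreter that always returns, whether or not the interpreted program would have halted:

```python
def step(program, pc, acc):
    """One-step interpreter: map the current configuration to the next one.

    Assumes 0 <= pc < len(program). Instructions are (op, arg) pairs.
    """
    op, arg = program[pc]
    if op == "inc":
        return pc + 1, acc + 1
    if op == "dec":
        return pc + 1, acc - 1
    if op == "jnz":  # jump to arg if the accumulator is non-zero
        return (arg if acc != 0 else pc + 1), acc
    raise ValueError(f"unknown op {op!r}")


def run_bounded(program, acc, max_steps):
    """Run for at most max_steps steps; return (halted, accumulator)."""
    pc = 0
    for _ in range(max_steps):      # bounded loop: always terminates
        if pc >= len(program):      # running off the end means "halt"
            break
        pc, acc = step(program, pc, acc)
    return pc >= len(program), acc


countdown = [("dec", None), ("jnz", 0)]  # halts after 2 * acc steps
forever = [("inc", None), ("jnz", 0)]    # never halts when started at 0

print(run_bounded(countdown, 3, 10))  # (True, 0)
print(run_bounded(countdown, 3, 4))   # (False, 1)
print(run_bounded(forever, 0, 100))   # (False, 50)
```

The second call shows the limitation: with too small a budget, a program that does halt is reported as unfinished, and no fixed budget works for every program.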

~~~
chriswarbo
This situation crops up in total languages like Coq. For any iterative
process, like the stepping of a Turing Machine, we can create an infinite
stream of iterations, eg.

Cons (tape, head, state) (Cons (tape, head, state) (Cons ...))

But since our program is total, we can only extract the first N states, for
some finite N. Compare this to a Turing Complete language like Haskell, where
we can also define an infinite stream of iterations, but we can also traverse
this stream indefinitely.

------
lisper
"What we are missing is the ability to re-enter old states."

That's deadly. Any machine with a finite number of states that cannot re-enter
a state it has previously been in must necessarily halt on all input. That
makes it strictly less powerful than even an FSM.
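
The halting claim is a pigeonhole argument: with k states and no revisits allowed, a run can make at most k - 1 transitions. A small Python sketch, under the assumption that "cannot re-enter a state" is modelled as stopping when the next state has already been seen (the function name and the example transition are hypothetical):

```python
def run_without_revisits(transition, start):
    """Follow `transition` from `start` until the next state was already visited.

    Returns the states in the order they were entered. On a finite state
    set this always terminates, because `seen` grows on every iteration.
    """
    visited = [start]
    seen = {start}
    state = start
    while True:
        state = transition(state)
        if state in seen:
            return visited
        seen.add(state)
        visited.append(state)


# Seven states 0..6; this transition would cycle forever if revisits were allowed.
trace = run_without_revisits(lambda s: (s + 3) % 7, 0)
print(trace)       # [0, 3, 6, 2, 5, 1, 4]
print(len(trace))  # 7
```

The trace can never be longer than the number of states, whatever the transition function is.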

~~~
caf
It is worth noting that _must necessarily halt on all input_ is quite a
desirable property in a macro language evaluated at compile time.

~~~
lisper
That is arguable. All else being equal it might be true, but all else is not
equal. The benefit of never having your compiler enter an infinite loop may
not be worth the cost in terms of lost expressive power.

~~~
mtdewcmu
Has it been proven that the ability to write undecidable programs confers
expressive power such that you can't have one without the other? It seems
self-evident that running forever can't be useful, so you should be able to
say WLOG that useful programs always halt. So the set of all useful programs
is strictly smaller than the set of Turing machines. Is there a formal reason
why Turing machines are desirable?

~~~
lisper
> Has it been proven that the ability to write undecidable programs confers
> expressive power such that you can't have one without the other?

Yes.

> It seems self-evident that running forever can't be useful

Really? Do you not think that it might be useful for, say, an operating system
to run forever?

~~~
caf
Do you think it is useful to implement an operating system entirely at
compile-time, in an implementation which uses separate compile and run phases?

~~~
mtdewcmu
I'm not sure I follow what you're talking about. Operating Systems are
tangential to the point, though. When I said all useful programs halt, the
kinds of programs I meant are the kind used to prove things about computation.
Those programs are simplified abstractions of real programs, and they
generally compute one result and then halt (if they halt). Being simple makes
those programs easier to use in proofs than real programs. However, the proofs
still generalize to real programs; a real-world program with multiple
functions is equivalent to several single-function programs glued together.

So my point in saying that all useful programs halt was really to argue that
all useful programs must have the constraint that their behavior is
always predictable, or controllable, in some sense. In a trivial way,
computers are always predictable, but Turing proved that there are cases where
it's impossible to predict the eventual outcome of a program in advance, i.e.,
the halting problem. In precise terms, the outcome (halt/not halt) of some
programs is said to be undecidable or uncomputable. Undecidability is a
hallmark of Turing machines. But since real programs are written by humans,
who can't solve undecidable problems, real programs will never be able to gain
any advantage from undecidable behavior.

------
sonnyOrullivan
Wouldn't it simply be a linear bounded automaton (LBA)? I understand you can
implement recursion via some tricks but only as many steps as you predefine
(see e.g. [https://stackoverflow.com/questions/7508475/is-c-
preprocesso...](https://stackoverflow.com/questions/7508475/is-c-preprocessor-
metaprogramming-turing-complete)) This reeks like an LBA to me.

~~~
roywiggins
My first instinct is that you're right. At least, I don't think it can be any
more powerful than an LBA: if there's a finite number of state transitions it
can make, then it can only actually "get to" a certain distance along the
tape, and you can just chop the tape at that point.

~~~
lisper
Nope, a cpp-machine is strictly less powerful than an LBA, or even an FSA. An
LBA (and an FSA) can enter an infinite loop. A cpp-machine can't.

------
NAFV_P
A few months ago I got fed up with losing track of what I had written in C, so
I knocked up some Python scripts to write templates for me. Each template puts
the date and time the file was created, along with the filename, in a comment.
This concept can be taken much further.

CPP has been known to be abused by programmers as a general text processor.
Has anyone else started using another language to help them write C?

~~~
ChuckMcM
You mean like $ SCCS %G% %M% %W% $ ?

~~~
jfb
Good times. I used to have emacs mode hooks that ensured that "$Id$" was
written into files. I don't miss that.

~~~
ChuckMcM
In what I considered minor blasphemy, one of my friends wrote git-hooks to keep
using them.

