
Proving that C++'s grammar is undecidable - vitaut
https://medium.com/@mujjingun_23509/full-proof-that-c-grammar-is-undecidable-34e22dd8b664
======
amluto
The algorithm in the article has an important bug. It’s a basic search: pop an
element from a queue, see if it’s a match and, if not, push two new elements.
If the queue is empty, declare failure. Think about this for a moment. Each
iteration, the queue length increases by one. Of course it’s never empty. So
the algorithm will eventually declare success on a positive input and will run
forever on negative input. It _has_ to be this way, since PCP is equivalent to
the halting problem. There is no algorithm that solves it correctly and always
halts. This is the whole point.
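
A minimal sketch of the search described above, assuming a hypothetical
`search_pcp` helper and a `max_steps` cap that the article's code does not
have. Without the cap, the loop would never terminate on an instance that has
no solution:

```cpp
#include <cstddef>
#include <deque>
#include <optional>
#include <string>
#include <utility>
#include <vector>

using Domino = std::pair<std::string, std::string>;  // (top, bottom)

// One partially built row of dominoes.
struct State {
    std::string top;
    std::string bottom;
    std::vector<std::size_t> picks;  // indices of the domino types used so far
};

// Breadth-first search over domino sequences, with no pruning.
// Returns the first matching sequence found, or std::nullopt if the
// step budget runs out first. The budget is an assumption of this sketch.
std::optional<std::vector<std::size_t>> search_pcp(
        const std::vector<Domino>& dominoes, std::size_t max_steps) {
    std::deque<State> queue;
    queue.push_back(State{});
    for (std::size_t step = 0; step < max_steps && !queue.empty(); ++step) {
        State current = std::move(queue.front());
        queue.pop_front();
        if (!current.picks.empty() && current.top == current.bottom) {
            return current.picks;  // success: top and bottom strings agree
        }
        // Push one successor per domino type; each type may be reused freely.
        for (std::size_t i = 0; i < dominoes.size(); ++i) {
            State next = current;
            next.top += dominoes[i].first;
            next.bottom += dominoes[i].second;
            next.picks.push_back(i);
            queue.push_back(std::move(next));
        }
    }
    return std::nullopt;  // budget exhausted: "don't know", not "no"
}
```

Because every pop is followed by one push per domino type, the queue never
empties for a non-empty domino list, so only the step cap can end a failed
search. A `std::nullopt` result therefore means "gave up", not "unsolvable".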

This means that this program does not prove that parsing C++ is undecidable in
the way the author thinks. There is no term that is a type or a value
depending on the solution to an undecidable problem. Instead, there is a term
that cannot be resolved in finite time. This is a necessary property of
pretty much any Turing-complete metaprogramming system.

(What I think is actually going on that’s a bit unique to C++ is that C++
cannot be parsed unambiguously. The fact that the grammar depends on whether a
term is a type or value and that you can’t determine this from the AST without
doing things like expanding templates is nasty.)

------
identity0
Am I missing something? The author says that a solution to the Post
Correspondence Problem with infinite dominoes in finite time is impossible. He
then writes a bunch of C++ code that can solve the PCP for any arbitrary list
of dominoes. Did he realize that this list must be finite since it’s written
into the C++ code? If the list had infinitely many dominoes it would require
infinite space to store and infinite time to compile anyways, PCP or not.
That’s like saying Java is undecidable because you can write
List<List<List<List<...>>>> and have something take infinitely long to
compile.

~~~
btilly
What you are missing is that in the Post Correspondence Problem you have an
unlimited supply of each type of domino. So a finite list of domino types
could require an unlimited number of dominos to decide.
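
To make the "unlimited supply" point concrete, here is a small sketch of a
checker for a proposed solution. The `is_solution` helper and the three-domino
instance are illustrative assumptions, not taken from the article. Note that
the pick list may repeat an index:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Domino = std::pair<std::string, std::string>;  // (top, bottom)

// Returns true if laying out the domino types in the order given by `picks`
// spells the same string along the top and the bottom. Indices may repeat,
// because each domino type is available in unlimited supply.
bool is_solution(const std::vector<Domino>& types,
                 const std::vector<std::size_t>& picks) {
    if (picks.empty()) return false;  // the empty row does not count
    std::string top;
    std::string bottom;
    for (std::size_t index : picks) {
        if (index >= types.size()) return false;  // unknown domino type
        top += types[index].first;
        bottom += types[index].second;
    }
    return top == bottom;
}

// Hypothetical instance with three domino types.
const std::vector<Domino> example_types = {
    {"a", "baa"}, {"ab", "aa"}, {"bba", "bb"}};

// Type 2 is used twice: the row has four dominoes but only three types.
const std::vector<std::size_t> example_picks = {2, 1, 2, 0};
```

Checking a given sequence is easy and always terminates. The hard part is
that nothing bounds how long the shortest matching sequence can be.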

Now we can write programs that attempt some sort of clever reasoning to figure
out whether or not there is a solution at all. But given any specific program
that does that, there is a finite arrangement of dominos such that either the
program produces the wrong answer, or cannot possibly finish. In other words,
for any algorithm there is a finite problem that cannot be properly decided by
that algorithm.

Given an axiom system such as ZFC, we can write a program that searches
through all proofs in ZFC for a proof or disproof that a particular case of
the Post Correspondence Problem is solvable. This, being an algorithm, has a
finite input that it will either produce a wrong answer for (meaning that ZFC
contradicts itself) or else will never finish. Whether or not that case has an
answer is not decidable within ZFC!

Ideally we want the answer to be "there is no solution" in this case. But if
we introduce the axiom that there is a solution, we will never find a
contradiction and never be aware of any disturbing conclusion beyond, "That
solution must be really, really big." It turns out that, no matter how hard we
try, first order logic cannot correctly encode the concept of "finite".

This is what we mean by "undecidable".

------
qppo
Isn't it well known that C++ templates are Turing complete? Aren't most
metaprogramming facilities in other languages as well?

~~~
roca
The problem here is that C++'s metaprogramming facilities have leaked into the
parser to the extent that it's undecidable whether to parse that final
declaration as a variable declaration or a function declaration.

So: arguably the _grammar_ is decidable, but to parse C++ you need more than
the grammar, you need the full metaprogramming machinery. That is, C++ has
ended up in the same state as a language that simply provides Turing-complete
macros.

~~~
tines
This is what I love about C++ that isn't provided by [m]any of the "modern"
languages (not including Lisp). I want programs in my programs dangit!

~~~
roca
If you really want that, it seems better to provide them properly like Rust
proc-macros than to provide them in C++'s awful half-baked form.

~~~
adev_
> If you really want that, it seems better to provide them properly like Rust
> proc-macros than to provide them in C++'s awful half-baked form.

- C++17 (and C++20) constexpr is a century ahead of Rust macros in terms of
power and what you can do with it. It is not even comparable.

- C++20 compile-time execution is currently pretty clean and has nothing to
do with the template mess shown here. Definitely not "half baked".

~~~
roca
Compile-time function evaluation is quite a different feature from macros.
Most of what you would do with macros (e.g. output entirely new definitions of
functions, types, etc) can't be done with constexpr at all.
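
A rough sketch of the distinction, using made-up names (`factorial`,
`DEFINE_CONSTANT_GETTER`) and the C preprocessor as a stand-in for a macro
system: compile-time function evaluation computes a value from existing
definitions, while a macro emits definitions that did not exist before.

```cpp
// Compile-time function evaluation: an ordinary function that the compiler
// can run while compiling. It yields a value, never a new declaration.
constexpr unsigned long long factorial(unsigned n) {
    unsigned long long result = 1;
    for (unsigned i = 2; i <= n; ++i) {
        result *= i;
    }
    return result;
}
static_assert(factorial(10) == 3628800ULL, "evaluated at compile time");

// Macro: textual code generation. Each use stamps out a brand-new function
// whose name is built from the macro argument.
#define DEFINE_CONSTANT_GETTER(name, value) \
    constexpr int get_##name() { return value; }

DEFINE_CONSTANT_GETTER(width, 640)   // defines get_width()
DEFINE_CONSTANT_GETTER(height, 480)  // defines get_height()

static_assert(get_width() * get_height() == 307200, "generated functions");
```

No constexpr function could introduce the names `get_width` and `get_height`,
and no preprocessor macro could run the loop in `factorial`.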

~~~
tines
And vice-versa.

------
phkahler
This kind of makes me laugh. Is this due to the relentless changes they've
been making to the language the last decade or two? I thought it kinda went
off the rails a while ago.

~~~
saagarjha
Nope, nothing new here. I mean, it uses some new spellings for certain
template concepts, but they are fairly easy to create yourself with a years-
old version of C++.

------
ridiculous_fish
> typeOrValue is a type int if the solution to the Post Correspondence Problem
> is “yes”, and a value 0 of type int if the solution is “no”, using SFINAE

Isn't this where the `typename` keyword comes in? `typename` is required when
using a qualified (meaning ::) dependent (it references a template parameter)
identifier as a type. My understanding is that uses of types without typename
is a common helpful extension but is not strictly conforming.

C++ language lawyers, please check my work...

~~~
grandmczeb
I think typename is only required inside of template definitions.

> The keyword typename must only be used in template declarations and
> definitions and only in contexts in which dependent names can be used.

https://en.cppreference.com/w/cpp/language/dependent_name

~~~
InfiniteRand
The example given in the original post involves a dependent name

~~~
grandmczeb
It has to be both dependent and inside of a template definition. The example
is a normal function declaration. The linked article has a couple examples.
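
A small sketch of the two situations, with hypothetical names
(`first_element`, `plain`). Inside the template body the qualified name
depends on a template parameter, so `typename` is needed; at namespace scope
the name is fully known and no keyword is needed:

```cpp
#include <type_traits>
#include <vector>

template <typename Container>
auto first_element(const Container& c) {
    // `Container::value_type` is a dependent qualified name. In a statement
    // like this one, the compiler assumes it is not a type unless told so.
    typename Container::value_type first = c.front();
    return first;
}

// Not in a template: `std::vector<int>` is a concrete type, so the compiler
// can look up `value_type` immediately and see that it names a type.
std::vector<int>::value_type plain = 7;

static_assert(std::is_same_v<decltype(plain), int>, "plain is an int");
```

The declaration in the article is of the second kind, which is why the
compiler has to instantiate the templates before it knows what it is parsing.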

------
zelly
By this logic, any language with metaprogramming is undecidable because you
can write undecidable programs as a metaprogram. Execution is not the same as
parsing.

~~~
Kranar
If that metaprogramming is Turing complete then yes. The point of this article
is to show that metaprogramming in C++ is Turing complete. It had already been
widely known but this article presents an explicit demonstration of it.

~~~
zelly
god knows there are easier ways to make a C++ compiler vomit on you

------
jonny383
C++ is legitimately gross to read these days, unless you sit there reading it
every day.

The syntax has become so ridiculously complicated, I have developed an
involuntary gag reflex every time I read any kind of "modern" C++ code (i.e.
heavy use of templates / generics / keywords).

~~~
gumby
I'm not sure what you mean by heavy use of keywords, but C++20 allows you to
express templates basically as generics, so if the <> syntax offends you, you
can do without it completely.

Do you also object to Lisp Macros?

~~~
jonny383
This is exactly the problem with C++. Let's fix ugly syntax by providing yet
another way to write something that accomplishes the same thing.

Over-evolution isn't a good thing.

~~~
pjmlp
Just like breaking backwards compatibility usually leads to a 17-year delay in
adoption. Can't please everyone.

------
ltbarcly3
Templates are Turing complete. Once you have that, it's pretty simple right?

------
vkaku
I can't decide if C++ has a grammar at all.

Please have a sense of humor.

------
rowanG077
This doesn't prove C++'s grammar is undecidable. Template expansion is separate
from parsing. Besides, you can do the same thing in any language that can do
Turing-complete compile-time programming.

~~~
Kranar
Template expansion is not separate from parsing in C++. That's the very thing
this article explicitly demonstrates. A program will end up being parsed in
two different ways depending on the solution to an instance of Post's
Correspondence Problem. In one solution a declaration of x parses into a
variable, in another solution a declaration of x parses into a function. Which
of those two parse trees is actually correct? That question is not decidable
and hence C++'s grammar is undecidable.
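
A much smaller illustration of the same ambiguity (not the article's PCP
encoding): the hypothetical `TypeOrValue` template below makes `name` a value
in the primary template and a type in one specialization. The two
declarations that follow have the same token shape, yet one declares a
variable and the other a function:

```cpp
#include <type_traits>

template <bool Flag>
struct TypeOrValue {
    static constexpr int name = 0;  // here `name` is a value
};

template <>
struct TypeOrValue<true> {
    using name = int;               // here `name` is a type
};

// `name` is the value 0, so this is a variable initialized with it.
int x(TypeOrValue<false>::name);

// `name` is the type int, so this declares a function taking an int.
int y(TypeOrValue<true>::name);

static_assert(std::is_same_v<decltype(x), int>, "x is a variable");
static_assert(std::is_function_v<decltype(y)>, "y is a function");
```

Here the template argument is a literal, so the compiler resolves it at once.
In the article the argument is the result of an unbounded template
computation, which is where the undecidability comes from.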

~~~
rowanG077
You are right! I didn't realize this.

------
frank2
The code in the OP is in images, i.e., it cannot be cut and pasted.

~~~
saagarjha
Sadly Medium seems to lack the tools to make decent standalone code snippets.

~~~
frank2
I don't know what qualifies as "decent" for you, but instead of an image, I'd
vastly prefer a plain old blockquote element combined with whatever html one
uses these days to specify a monospace font.

~~~
saagarjha
<pre> or <code>? That would be nice, but I don’t think Medium has that.

------
bialpio
The fact that the code compiles on godbolt.org makes me doubtful that this
particular instance proves that the grammar is undecidable.

~~~
saagarjha
The fact that it compiles shows that this particular setup halts. This isn’t
true in general.

------
winrid
The color scheme on the syntax highlighting makes me crave cotton candy...

------
convery
Can do a "constexpr void inf() { for (int i = 0; i < 2; ++i) i = 0; }" as a
shorter proof.

~~~
saagarjha
That’s not the grammar, though; that’s compile-time evaluation.

