
C++ is not a superset of C - lochsh
https://mcla.ug/blog/cpp-is-not-a-superset-of-c.html
======
_kst_
Several of the examples shown as valid C are not.

For example, this:

    
    
        const int foo = 1;
        int* bar = &foo;
        *bar = 2;
    

is said to have undefined behavior, but in fact the initialization of `bar` is
a constraint violation, requiring a diagnostic. (Some compilers will issue a
non-fatal warning, which is allowed by the C standard but IMHO is
unfortunate.)

Another example: it says that this:

    
    
        const size_t buffer_size = 5;
        int buffer[buffer_size];
    

will not compile in C, but it's valid at block scope in C99, which introduced
variable-length arrays. (C11 made them optional.)
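A minimal sketch of the block-scope case, assuming a compiler that supports VLAs (C99, or C11 and later without `__STDC_NO_VLA__` defined). The function name `vla_size_in_bytes` is made up for illustration:

```c
#include <stddef.h>

/* Hypothetical helper: declares the array at block scope, where C99 allows
 * a non-constant size, and reports how many bytes the array occupies. */
size_t vla_size_in_bytes(void)
{
    const size_t buffer_size = 5;  /* const object, not an integer constant expression in standard C */
    int buffer[buffer_size];       /* therefore a variable-length array */

    buffer[0] = 0;                 /* touch the array so it is not unused */
    return sizeof buffer;          /* for a VLA, sizeof is evaluated at run time */
}
```

Moving the two declarations to file scope is what turns this into an error in standard C, since a VLA cannot have static storage duration.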

"In C, this would compile, albeit likely with warnings about implicit
conversion:"

    
    
        int main() {
            auto x = "actually an int";
            return x;
        }
    

The "implicit int" rule was dropped in C99, and even before that the language
did not define an implicit conversion from char* to int. Again, some compilers
might support it with a warning, but it's a constraint violation.

~~~
lochsh
I haven't been able to find anything in the C11 standard about the
initialisation of bar being a constraint violation. In 6.7.3, the only
constraints for type qualifiers listed are for _Atomic and restrict. Could you
let me know where you got the information you talk about here from? I could be
missing part of the standard.

~~~
_kst_
N1570 6.7.9p11 says that the constraints and conversions for simple assignment
apply to scalar initializers.

6.5.16.1p1 says that, for pointers, "the type pointed to by the left has all
the qualifiers of the type pointed to by the right".

In this case, the right operand is a pointer to a const-qualified type and the
left operand is a pointer to a non-const-qualified type.

(Without this restriction, you could silently discard const qualification just
by assigning or initializing a pointer, which would largely defeat the purpose
of const.)
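As a sketch of what the rule permits, here is the same example with the qualifier kept on the left-hand side. The helper name `read_foo` is an assumption for illustration:

```c
static const int foo = 1;

/* Hypothetical helper showing an initialization that satisfies 6.5.16.1p1. */
int read_foo(void)
{
    const int *bar = &foo;  /* OK: the pointed-to type on the left keeps const */
    /* int *bad = &foo; */  /* constraint violation: would discard const */
    return *bar;
}
```

Uncommenting the `bad` line should produce at least a diagnostic from a conforming compiler.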

------
jcranmer
The C++ specification has an entire appendix devoted to listing
incompatibilities with C. Some features missing from this list:

* A char literal is an expression of type int in C, but type char in C++.

* String literals are const in C++ but non-const in C (although attempts to modify them are undefined behavior).

* This program is legal C but not C++:
    
    
       int i;
       int i;
    

* structs and unions occupy a different name space in C than they do in C++.

* main cannot be recursive in C++, but it can in C.

* C++ allows lvalues in a few more places. Usually, this amounts to a compiler error, but there are a few places where the additional lvalue-to-rvalue conversion is legal and produces a different result.

* There are some cases where C++ requires an explicit cast but C permits an implicit conversion (void* being the most well-known).
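
A few of the items above can be sketched in one C translation unit. The function names are hypothetical, and the code is meant to be compiled as C, not C++:

```c
#include <stdlib.h>

int i;  /* tentative definition */
int i;  /* a second tentative definition of the same object: fine in C, an error in C++ */

/* Returns 1 in C, where 'a' has type int. (In C++ it has type char.) */
int char_literal_has_int_size(void)
{
    return sizeof 'a' == sizeof(int);
}

/* malloc returns void*, which C converts to int* without a cast. */
int store_and_load(int value)
{
    int *p = malloc(sizeof *p);
    int result;

    if (p == NULL)
        return -1;  /* arbitrary failure value chosen for this sketch */
    *p = value;
    result = *p;
    free(p);
    return result;
}
```

A C++ compiler would reject both the repeated `int i;` and the uncast `malloc` result.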

------
umvi
This is true, but C++ is _mostly_ a superset of C, which is "good enough" for
the vast majority of developers. It's enough of a superset that we were able
to seamlessly integrate our legacy C libraries with our modern C++
applications without hassle (and we didn't run into any of the corner cases
listed in the article).

~~~
siggen
I am hoping this isn’t an implication that C is legacy and C++ is the modern
and the future. :-)

~~~
bitwize
C isn't just legacy, it's fundamentally broken. Rust is the future. C++, while
it mitigates some of the brokenness of C, favored backward compatibility with
C over fixing the brokenness once and for all.

~~~
verall
Does the RESF just browse C/C++ threads waiting for the right moment to jump
in and proselytize? It is not even mentioned once in the article.

------
stephen82
Nearly everything mentioned in this blog post is the subject of a proposed
change or addition for the next C standard, code-named C2x.

They are in discussions to introduce the following in C:

    
    
        * nullptr
        * auto
        * __has_include
        * make false and true first-class language features
        * constexpr
    

and lots of other goodies [1].

[1] [https://gustedt.wordpress.com/2018/11/12/c2x/](https://gustedt.wordpress.com/2018/11/12/c2x/)

~~~
mehrdadn
Are they trying to make C literally become the same as C++ except classes and
templates?

~~~
loeg
> Are they trying to make C literally become the same as C++ except classes
> and templates?

As a C programmer, aren't classes, templates, and exceptions the things that
have classically differentiated C and C++?[0] I don't see anything obviously
objectionable about nullptr[1], auto, __has_include, or constexpr. (I don't
have a ton of experience with them, either.)

I'll admit I don't really grok what "make false and true first-class language
features" means — maybe make them reserved keywords? (int)true must still
evaluate to 1, in any event.

[0]: "C++ is C with classes!"

[1]: C's "NULL" has this obnoxious wart in that it is implementation-defined
whether or not it is a pointer type. I.e., it can be "(void *)0" or just "0".
This means that it cannot be used safely in portable incantations of variadic
functions that expect pointer arguments.

~~~
klingonopera
> This means that it cannot be used safely in portable incantations of
> variadic functions that expect pointer arguments.

Why only variadic functions? The answer to that may make my next question
obsolete, namely: Wouldn't only C++ complain about this? In C, isn't the input
to anything coerced to the data type it will represent, in actual complete
disregard of the input type?

~~~
colonwqbang
In a fixadic function the integer literal would be promoted to a pointer,
because the caller knows that the argument should be a pointer. In a variadic
function call, the caller guesses the type of each argument based on what is
passed. A variadic function needs to know exactly the size of its arguments
and sizeof(int) != sizeof(pointer) on many systems.

This works:

    
    
        void f(char *p);
        
             f(0); /* integer literal auto promoted to pointer */
        
    

This probably doesn't, at least not as the number of arguments to f is
increased:

    
    
        void f(...);
        
             f(0); /* assume type of function is void f(int) */
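
A rough sketch of the portable way to pass a null pointer through `...`, assuming a hypothetical sentinel-terminated function `count_strings`:

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical variadic function: counts char* arguments up to a
 * null-pointer sentinel. */
int count_strings(char *first, ...)
{
    va_list ap;
    int count = 0;
    char *s;

    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        count++;
    va_end(ap);
    return count;
}
```

A call such as `count_strings("a", "b", (char *)NULL)` casts the sentinel so that a pointer is what actually gets passed. A bare `0` or an uncast `NULL` in a variadic position may be passed as an int, which is the case described above.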

------
autoexec
I love that roughly 35 years after its creation we're still arguing about what
c++ is or is not in relation to C. I'm looking forward to the debates and blog
posts of my grandchildren on Perl 6 vs Perl 5

~~~
user5994461
Doubt either of our grandchildren will ever use Perl. It is truly an arcane
language that has no niche and no new application written in it.

~~~
autoexec
I'm still using perl scripts for log parsing and some general sys-admin type
stuff, but yeah, it's way past its prime.

------
saagarjha
> I'm suspicious of restrict. It seems like playing with fire, and anecdotally
> it seems common to run into compiler optimisation bugs when using it because
> it's exercised so little.

On the contrary, I wish C++ had restrict in the standard, for exactly the
reasons mentioned: it can help the optimizer in certain cases.

~~~
cozzyd
Indeed, auto-vectorization usually won't work without it...
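
For illustration, a minimal sketch of the kind of loop where restrict hands the optimizer a no-overlap promise (C99 or later; the function name and signature are assumed, and whether a given compiler vectorizes it is not something this sketch shows):

```c
#include <stddef.h>

/* The restrict qualifiers promise that out, a and b do not overlap, so the
 * compiler need not reload a[i] or b[i] after each store to out[i]. */
void add_arrays(size_t n, float *restrict out,
                const float *restrict a, const float *restrict b)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}
```

Calling it with overlapping arrays breaks that promise and is undefined behavior.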

~~~
lochsh
This is all totally fair -- I've never used restrict, so it's hard for me to
have a useful opinion on it.

I think it's easy to dismiss things that can help optimisation when you've
never seen their effect first-hand. I imagine if I'd had an experience where
I'd used it to great advantage I'd be wishing it was in the C++ standard too
:)

------
monocasa
> The size of the array needs to be known at compile time. In C++, a const
> variable can be a constant expression, meaning it can be evaluated at
> compile time. In C, this is not the case, and we must instead use a
> preprocessor macro:

Enums are also good here in C land for specifying compile time constants.

~~~
_kst_
Yes, but only for constants of type int.

For example, this is valid:

    
    
        enum { ANSWER = 42 };
    

But C enumeration constants are always of type int, so you can't define a
constant of some other integer type this way.
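
Applied to the article's buffer example, the enum approach might look like this sketch (the names `BUFFER_SIZE` and `buffer_length` are assumptions):

```c
#include <stddef.h>

enum { BUFFER_SIZE = 5 };  /* enumeration constant: an integer constant expression */

int buffer[BUFFER_SIZE];   /* valid at file scope, no VLA involved */

/* Hypothetical helper so the element count can be checked. */
size_t buffer_length(void)
{
    return sizeof buffer / sizeof buffer[0];
}
```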

~~~
loeg
It's worse than that. The type of a C enumeration is implementation defined.
It must cover the full range of defined values, but it definitely doesn't have
to be 'int'. (§ 6.7.2.2 (4)):

> Each enumerated type shall be compatible with [ed: one of] char, a signed
> integer type, or an unsigned integer type. The choice of type is
> implementation-defined but shall be capable of representing the values of
> all the members of the enumeration.

One possible source of confusion is that explicit values in an enum (i.e.,
'42' in your example) must have values representable as int. (§ 6.7.2.2, (2)).

~~~
_kst_
I said _enumeration constants_ are of type int.

For example, given:

    
    
        enum foo { this, that, the_other };
        enum foo obj;
    

obj is of type "enum foo", which is compatible with some implementation-
defined integer type, but the constants "this", "that", and "the_other" are of
type int.
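
One way to check this, as a sketch assuming a C11 compiler for `_Generic` (the helper names are made up):

```c
enum foo { this, that, the_other };

/* Returns 1 if the enumeration constant `that` has type int, else 0. */
int constant_is_int(void)
{
    return _Generic(that, int: 1, default: 0);
}

/* obj has type enum foo; initializing it from a constant converts from int. */
int object_holds_constant(void)
{
    enum foo obj = the_other;
    return obj == 2;
}
```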

In C++, the constants are of type "enum foo", which can also be referred to as
"foo".

------
pietroglyph
The designated initializers example will actually compile in recent versions
of GCC, Clang, and Visual C++, even if that's not part of the standard prior
to C++20. A better example (i.e. one that doesn't compile in C++ but does in
C) would be

    
    
      int arr[3] = { [1] = 5 };
    

or even

    
    
      struct A c = {.x = 1, 2};
    

or another example where the designators are not declared in order of the
struct members, or where the designators are nested. The version of designated
initializers standardized in C++20 only allows the simple case that's
currently implemented in all the major compilers.

See
[https://stackoverflow.com/a/29337570](https://stackoverflow.com/a/29337570)
for more info and examples.
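
A sketch of those C99 forms with checks on the resulting values. The definition of `struct A` is hypothetical, since the comment leaves it unspecified:

```c
struct A { int x; int y; };  /* assumed layout for illustration */

/* Array designator: only element 1 is named, the rest are zero-initialized. */
int array_designator_ok(void)
{
    int arr[3] = { [1] = 5 };
    return arr[0] == 0 && arr[1] == 5 && arr[2] == 0;
}

/* Designated then positional: 2 initializes the member after x, which is y. */
int mixed_ok(void)
{
    struct A c = { .x = 1, 2 };
    return c.x == 1 && c.y == 2;
}

/* Designators listed in a different order from the member declarations. */
int out_of_order_ok(void)
{
    struct A d = { .y = 4, .x = 3 };
    return d.x == 3 && d.y == 4;
}
```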

------
camgunz
To offer a different opinion, I love the design/images/color scheme. Feels fun
and creative.

~~~
lochsh
Thank you so much! This is lovely to hear. ^_^ I like it too.

------
powzapbiff
The code:

        const size_t buffer_size = 5;
        int buffer[buffer_size];

compiles fine in C, but not for the same reason as in C++: C99 and C11 have
variable-length arrays.

[https://en.wikipedia.org/wiki/Variable-length_array#C99](https://en.wikipedia.org/wiki/Variable-length_array#C99)

~~~
eMSF
To be exact, it doesn't compile fine outside of functions (which the sample
code didn't have) because file scope arrays can't be VLAs.

~~~
lochsh
This is what I was going for in the blog post, but I got confused after seeing
these comments as Clang _does_ compile this when the buffer is of static
storage class.

Which I don't think is standard -- but it's not using VLAs. I wondered if it
just has constant expression semantics for const variables.

Weirdly, adding a _Static_assert to test this theory proves it for c99 but not
c11 :/

[https://godbolt.org/z/q-bb-n](https://godbolt.org/z/q-bb-n) c99 with clang
[https://godbolt.org/z/ad14Ah](https://godbolt.org/z/ad14Ah) c11 with clang
[https://godbolt.org/z/xJSDQa](https://godbolt.org/z/xJSDQa) c11 with gcc
(which is the only one giving the output I'd expect)

~~~
eMSF
>I wondered if it just has constant expression semantics for const variables.

That would be my guess also, for applicable const variables. File scope const
variables are quite constexpr-y in C anyway, since C requires all file scope
variable initializers to be constant expressions (C++ only requires that for
constexpr variables).

Toying around with clang in Godbolt, it seems that there are some quirks
regarding this. The following is accepted:

    
    
      static const int n = 0;
      int buf[n]; // invalid zero-length array is accepted
                  // probably another non-standard extension
    

But the following is not, despite _n_ having the same zero value:

    
    
      static const int n;
      int buf[n]; // complains about a file scope VLA

------
faehnrich
Reminds me of a blog post of mine[1].

Funny that I say C is not a subset, while this says C++ not a superset.

But mine doesn't go too deep and doesn't reference any standards.

I love things that explore dark corners of languages like this, look forward
to digging deeper.

Also, I like the web design, kinda cyberpunk.

[1] [http://faehnri.ch/how-c-is-not-a-subset-of-cpp/](http://faehnri.ch/how-c-is-not-a-subset-of-cpp/)

~~~
lochsh
Being told my web design is kinda cyberpunk is a great compliment, thank you
^_^ I like the banner image on yours.

------
kazinator
C++ was once a near superset of C. However, due to language divergence, the
situation is that both C and C++ are large supersets of an intersection.

What is in that intersection depends on which C and C++ dialect pair you
intersect.

E.g. C++11 and newer dialects have "long long", so if one is intersected with
C99, "long long" is in the intersection. If we intersect C++ older than C++11
with C, or C older than C99 with C++, then we don't have "long long".

(Except as a conforming extension from a compiler, which we could detect with
a configure script and use anyway.)
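
A rough sketch of that detection done in the preprocessor from the standard version macros, rather than in a configure script. The names `HAVE_LONG_LONG` and `widest_int` are made up for illustration, and compiler extensions in older modes are not detected:

```c
/* Assumes the compiler reports its language version accurately. */
#if (defined(__cplusplus) && __cplusplus >= 201103L) || \
    (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L)
#define HAVE_LONG_LONG 1
typedef long long widest_int;
#else
#define HAVE_LONG_LONG 0
typedef long widest_int;
#endif

/* Hypothetical helper: size in bytes of whichever type was selected. */
int widest_int_bytes(void)
{
    return (int)sizeof(widest_int);
}
```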

The thing is that the intersection languages are basically fully fledged C:
you can easily develop in them and do everything you'd want from C, if you're
willing to live without a few frills here and there like C99 designated
initializers, and variable length arrays (dropped from being required in C in
C11) and whatnot.

If you require complex numbers, that could get hairy.

A long-time C90 programmer will not find anything amiss, though.

------
jancsika
> The size of the array needs to be known at compile time.

In C99 the example you give does indeed compile and silently get turned into a
VLA.

~~~
lochsh
It would only be a VLA if defined in a block, I intended the code snippet to
be at file-scope, giving the buffer size and array static storage duration.

But it does indeed compile in Clang, and I'm looking into why. I think this
line in the C11 standard might be key: "An implementation may accept other
forms of constant expressions."

------
pmul
The first example doesn't produce undefined behaviour as c++ - it just won't
compile - you can't initialise that pointer to a non const int from a const
int without doing something naughty like a const_cast.

------
tambourine_man
Which is part of the beauty of ObjC

~~~
ncmncm
As Stroustrup said, "Smalltalk is the best Smalltalk I know of."

------
ndesaulniers
Designated initializers were added in c++20, I was just looking at a recent
change to clang's semantic analysis that warns on the differences between C99
and C++20 designated initializers (Todo: post link when not mobile.)

One recent difference I saw was valid in C (via a GNU C extension) but invalid
in C++:

    
    
        struct foo my_foo = ({
          init(&my_foo);
          my_foo;
        });

~~~
loeg
GNU C extensions are well outside the scope of standard C (or C++).

------
vwstt
That pink background to the left weighs 6 MB. Do you really think it's
necessary?

~~~
lochsh
I should probably make it smaller! 6MB is pretty indulgent. But I do like how
it looks and this is mostly just a fun blog for myself :)

~~~
bt848
I think your blog looks great. The color palette you are using for text is
very pleasing.

~~~
lochsh
Thank you! ^_^

------
lochsh
I've made some updates to the blog post based on feedback -- thank you
everyone who pointed out mistakes helpfully :)

I've made the updates clear, and linked to the archived version of the
original post.

------
alexnewman
Been saying this for years. The weird part is that it's a good thing. C++
redefined auto and it has never been the same. However, it's clearly a better
language than it was in 2010.

------
pmul
The first example won't compile in c++ (can't get a ptr to a non const from a
const without something naughty like a const_cast) - that's not undefined
behaviour is it?

~~~
lochsh
you're right, I messed up here. I'm working on some updates to the blog post
based on feedback.

------
jdmoreira
I never heard anyone say that C++ is a superset of C.

Sure, the first version was a preprocessor on top of C and certainly that is
common knowledge. But a superset? Never heard it.

ObjC on the other hand...

~~~
MauranKilom
You evidently haven't been looking at the stackoverflow questions coming in at
the [c] and [c++] tags... ;)

A frightening amount of [c][c++] tagging is expunged there every day.

~~~
Veedrac
Which is often just meaningless pedanticism. There are a _lot_ of questions
about C APIs where it's irrelevant whether the answer uses C or C++. The
latter might not be an _exact_ superset, but in most cases it's easily close
enough.

------
nneonneo
Nit: memmove allows src and dst to overlap while memcpy does not.
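
A small sketch of the overlapping case, using a hypothetical helper that shifts a buffer left by one byte:

```c
#include <stddef.h>
#include <string.h>

/* Copies buf[1..len-1] onto buf[0..len-2]. Source and destination overlap,
 * so memmove is required; memcpy would be undefined behavior here. */
void shift_left_by_one(char *buf, size_t len)
{
    if (len > 1)
        memmove(buf, buf + 1, len - 1);
}
```

The last byte is left as it was, so the final element ends up duplicated.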

~~~
lochsh
Ah yes, what I meant was that more optimisations can be made on memmove if we
restrict src and dst, as the overlap case no longer needs to be considered.

I've never used restrict, so I could be missing something, but this is what I
meant in the blog post by mentioning memmove.

------
mhh__
I thought this was common knowledge? Pointer aliasing for example

~~~
mark-r
Common knowledge is like common sense - not as common as you'd think.

------
adamnemecek
Very 1995, no offense.

~~~
stephen82
I hope you mean the website theme.

If yes, indeed.

I had to view in Reader mode via Firefox.

I could not read it, I got dizzy by its colors; my sensitive vision couldn't
stand it -_-

------
einpoklum
In other news, 1+1=2; and two wrongs don't make a right (but three lefts do.)

~~~
jhatemyjob
And now it's time for the show!

