C++ is not a superset of C (mcla.ug)
107 points by lochsh 41 days ago | 107 comments

Several of the examples shown as valid C are not.

For example, this:

    const int foo = 1;
    int* bar = &foo;
    *bar = 2;
is said to have undefined behavior, but in fact the initialization of `bar` is a constraint violation, requiring a diagnostic. (Some compilers will issue a non-fatal warning, which is allowed by the C standard but IMHO is unfortunate.)

Another example: it says that this:

    const size_t buffer_size = 5;
    int buffer[buffer_size];
will not compile in C, but it's valid at block scope in C99, which introduced variable-length arrays. (C11 made them optional.)
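A minimal sketch of the block-scope case, assuming a C99 or later compiler. The function name `vla_element_count` is made up for illustration, and the `__STDC_NO_VLA__` branch covers C11 implementations that opt out of VLAs:

```c
#include <stddef.h>

/* Hypothetical helper: returns the element count of a block-scope array whose
   size comes from a const variable rather than a constant expression. */
size_t vla_element_count(void)
{
#ifdef __STDC_NO_VLA__
    return 0;                        /* C11 and later: VLA support is optional */
#else
    const size_t buffer_size = 5;
    int buffer[buffer_size];         /* a VLA in C: buffer_size is not a constant expression */
    return sizeof buffer / sizeof buffer[0];   /* sizeof of a VLA is evaluated at run time */
#endif
}
```

On an implementation with VLA support the function returns 5. Moving the two declarations to file scope is what makes the code invalid, since file-scope arrays cannot be VLAs.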

"In C, this would compile, albeit likely with warnings about implicit conversion:"

    int main() {
        auto x = "actually an int";
        return x;
    }
The "implicit int" rule was dropped in C99, and even before that the language did not define an implicit conversion from char* to int. Again, some compilers might support it with a warning, but it's a constraint violation.

Thanks, this is useful! I'll have to look at the standards again and be more precise in my language.

I haven't been able to find anything in the C11 standard about the initialisation of bar being a constraint violation. In 6.7.3, the only constraints for type qualifiers listed are for _Atomic and restrict. Could you let me know where you got the information you talk about here from? I could be missing part of the standard.

N1570 6.7.9p11 says that the constraints and conversions for simple assignment apply to scalar initializers. Those constraints say that, for pointers, "the type pointed to by the left has all the qualifiers of the type pointed to by the right".

In this case, the right operand is a pointer to a const-qualified type and the left operand is a pointer to a non-const-qualified type.

(Without this restriction, you could silently discard const qualification just by assigning or initializing a pointer, which would largely defeat the purpose of const.)
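As a rough sketch of the qualifier-preserving form (the function name is hypothetical), declaring the pointer as pointer-to-const satisfies the constraint, while the commented-out line is the one that requires a diagnostic:

```c
/* Hypothetical helper: reads a const object through a pointer-to-const. */
int read_through_const_pointer(void)
{
    const int foo = 1;
    const int *bar = &foo;    /* OK: the pointed-to type keeps its const qualifier */
    /* int *bad = &foo; */    /* constraint violation: discards the const qualifier */
    return *bar;
}
```

Writing through `bar` is then rejected at compile time, which is the protection the restriction exists to provide.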

> Some compilers will issue a non-fatal warning, which is allowed by the C standard but IMHO is unfortunate.

GCC 5.1.0 reports:

  warning: initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]

The C++ specification has an entire appendix devoted to listing incompatibilities with C. Some features missing from this list:

* A char literal is an expression of type int in C, but type char in C++.

* String literals are const in C++ but non-const in C (although attempts to modify them are undefined behavior).

* This program is legal C but not C++:

   int i;
   int i;
* structs and unions occupy a different name space in C than they do in C++.

* main cannot be recursive in C++, but it can in C.

* C++ allows lvalues in a few more places. Usually, this amounts to a compiler error, but there are a few places where the additional lvalue-to-rvalue conversion is legal and produces a different result.

* There are some cases where C++ requires an explicit cast but C permits an implicit conversion (void* being the most well-known).
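A few of the bullets above can be checked with a short C sketch. The function and variable names are invented for illustration, and the comments describe C behavior only:

```c
#include <stdlib.h>

/* In C, a character constant such as 'a' has type int (in C++ it has type char). */
int char_constant_has_int_size(void)
{
    return sizeof('a') == sizeof(int);
}

/* In C, void * converts implicitly to any object pointer type,
   so the result of malloc needs no cast. C++ would require one. */
int void_pointer_converts_implicitly(void)
{
    int *p = malloc(sizeof *p);    /* no cast needed in C */
    if (p == NULL)
        return 0;
    *p = 42;
    int value = *p;
    free(p);
    return value;
}

/* Tentative definitions: repeating this file-scope declaration is legal C. */
int tentative_counter;
int tentative_counter;
```

Compiled as C++, the first function would return 0, the `malloc` line would need a cast, and the repeated declaration would be a redefinition error.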

This is true, but C++ is mostly a superset of C, which is "good enough" for the vast majority of developers. It's enough of a superset that we were able to seamlessly integrate our legacy C libraries with our modern C++ applications without hassle (and we didn't run into any of the corner cases listed in the article).

I am hoping this isn’t an implication that C is legacy and C++ is modern and the future. :-)

C isn't just legacy, it's fundamentally broken. Rust is the future. C++, while it mitigates some of the brokenness of C, favored backward compatibility with C over fixing the brokenness once and for all.

Does the RESF just browse C/C++ threads waiting for the right moment to jump in and proselytize? It is not even mentioned once in the article.

I love Rust as much as the next guy, but not everything that was a good fit for C is necessarily a natural transition to Rust.

Not that I've worked in a lot of places, but everywhere that I've worked that uses C at all treats it this way. Writing new C code is considered a no-no.

As a view into a different part of the industry, when I was writing code for cheap embedded processors, it was C code all the way.

I write code for expensive embedded processors and it's still C all the way, with C++ recently starting to make a (small) appearance. As such I'm acutely aware of the wonderful incompatibilities between C and C++. My favourite is this:

    struct point { int x; int y; };
    int main() { struct point p = { .y = 0, .x = 0 }; }

    error: designator order for field ‘point::x’ does not match declaration order

Yes, embedded development has always been stuck in the stone age. C is actually a recent, and major advance over assembly. It happened only because vendors discovered they could steal accounts from competitors by being source-compatible, even though their ISA was different. C++ does not offer that advantage for them, no matter how much it might benefit developers, so they have no intention of ever supporting it.

But Arduinos are programmed in a dialect of C++, so it will become necessary to enable it in the near future, i.e. by 2030. Civilization might fall first.

I write C every day at a major silicon valley networking company. So it's not at all considered bad.

It is considered bad at many other places.

I would add, more enlightened places.

But competent C++ programmers are expensive. It is harder to tell whether a C programmer is competent, because the code they write has less room to be better than minimally tolerable.

C is a niche language and C++ is about to hit the wall as companies abandon it for languages like Go and Rust.

You're high. In industry C and C++ are very widespread. Go and Rust are ... not. You'll have an easier time finding a job that requires FORTRAN than one requiring Rust, and Go is only a bit more popular.

Well, you go to C++ for the climate; you go to Go or Rust for... the company!

Let's touch base again in five years and see how C++ is doing.

Let's touch base in five years and see if anybody remembers Rust. It is likely they will, but not assured.

Considering that the number of programmers adopting C++ every day far, far exceeds the number adopting Rust, and the rate is increasing faster, in absolute numbers, than ever before, it is hard to imagine any scenario where C++ is less popular, relative to Rust, than today. Yes, C++ usage will less than double, while Rust usage will multiply, but the absolute difference in number will be much larger than today.

You seem to underestimate the vast amounts of existing C++ and C code. That code is not going anywhere, especially in only 5 years. Companies are not going to rewrite stuff in Rust just because of memory and type safety.

Touch base in 5 years... Rust will still be a niche language due to its complexity. (IMO, Go has a better chance of at least picking up a few percentage points.)

C++ is noticeably safer than C but C is still more widespread. Safety doesn't really sell.

I write C every day in embedded RISC-V. I think Rust looks neat, and would find it cool to use. But I just convinced the team to move off of gnu90, and a huge amount of my work is in assembly.

Rust offers the idea of safety and security, but at the same time hedges itself with unsafe - fine, sure. However, my work would be nearly 100% in unsafe. So it results in an unpredictable, complex, unsafe language - a non starter.

Nearly everything mentioned in this blog post is subject to change or addition in the next C standard, code-named C2x.

They are in discussions to introduce the following in C:

    * nullptr
    * auto
    * __has_include
    * make false and true first-class language features
    * constexpr
and lots of other goodies [1].


"Add a type aware print utility with format specification syntax similar to python"

Now that's exciting

__attribute__((format(printf, ...))) makes printf every bit as type-aware as I need it to be. Put that in the standard instead.

does anyone have more information on this? only reference I can find is that blog post; I can't find the specific section of the linked specs that discusses such a utility

Are they trying to make C literally become the same as C++ except classes and templates?

> Are they trying to make C literally become the same as C++ except classes and templates?

As a C programmer, aren't classes, templates, and exceptions the things that have classically differentiated C and C++?[0] I don't see anything obviously objectionable about nullptr[1], auto, __has_include, or constexpr. (I don't have a ton of experience with them, either.)

I'll admit I don't really grok what "make false and true first-class language features" means — maybe make them reserved keywords? (int)true must still evaluate to 1, in any event.

[0]: "C++ is C with classes!"

[1]: C's "NULL" has this obnoxious wart in that it is implementation-defined whether or not it is a pointer type. I.e., it can be "(void *)0" or just "0". This means that it cannot be used safely in portable incantations of variadic functions that expect pointer arguments.

They're not the "problems" with C++, they're what make C++ great. It's like C is slowly realizing C++ features are actually useful, but doing so as slowly as molasses, while still trying to pretend like this isn't the case...

> [Classes, templates, and exceptions are] not the "problems" with C++, they're what make C++ great.

We're going to have to agree to disagree on that one.

> It's like C is slowly realizing C++ features are actually useful, but doing so as slowly as molasses, while still trying to pretend like this isn't the case...

I think it makes sense to adopt C++ features that do some combination of (1) providing a useful feature, (2) aiding C++ compatibility, and (3) not majorly increasing the conceptual size of the C language. Just like it makes sense for C++ to adopt C99-compatibility (structure literals has taken y'all like 20+ years to adopt).

Templates, classes, and exceptions are a huge huge addition to the complexity of the language. If C programmers wanted them, with all of the pitfalls of manual memory management and C-style lifetime safety, they'd just use C++. Obviously, they don't.

Personally, if I had to choose a language other than C I'd pick something like Zig, Rust, or Go over C++.

C with templates (not classes), STL algorithms/ranges and small things like nullptr, auto, lambdas, stronger basic types, etc is actually a powerful configuration.

Everyone who likes C++ will tell you which parts they like and how powerful they are :-).

> This means that it cannot be used safely in portable incantations of variadic functions that expect pointer arguments.

Why only variadic functions? The answer to that may make my next question obsolete, namely: Wouldn't only C++ complain about this? In C, isn't the input to anything coerced to the data type it will represent, in actual complete disregard of the input type?

In a fixadic function the integer literal would be promoted to a pointer, because the caller knows that the argument should be a pointer. In a variadic function call, the caller guesses the type of each argument based on what is passed. A variadic function needs to know exactly the size of its arguments and sizeof(int) != sizeof(pointer) on many systems.

This works:

    void f(char *p);
    f(0); /* integer literal auto promoted to pointer */
This probably doesn't, at least not as the number of arguments to f is increased:

    void f(...);
    f(0); /* assume type of function is void f(int) */

For non-variadic functions, the argument is implicitly converted to the parameter type (if possible).

For variadic functions, the compiler doesn't know the parameter type.

So to print a null pointer, you need to do this:

    printf("NULL = %p\n", (void*)NULL);

    printf("null pointer = %p\n", (void*)0);
The %p format requires an argument of type void*, so you have to convert to that type if necessary.
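The same issue shows up with any null-terminated variadic argument list. Here is a minimal sketch, where `count_strings` is a hypothetical helper and the caller is assumed to terminate the list with an explicitly typed null pointer:

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: counts the char * arguments that precede a
   null-pointer terminator. */
int count_strings(const char *first, ...)
{
    int count = 0;
    va_list args;
    va_start(args, first);
    /* Each va_arg call reads a char *, so the caller must really pass pointers. */
    for (const char *s = first; s != NULL; s = va_arg(args, char *))
        count++;
    va_end(args);
    return count;
}
```

A call such as `count_strings("a", "b", "c", (char *)NULL)` is portable. Passing a bare `NULL` or `0` as the terminator is not, because the implementation may define `NULL` as an integer constant and nothing converts it to a pointer in a variadic call.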

colonwqbang hit the nail on the head.

For non-variadic functions, the function declaration provides the correct parameter type, and caller arguments are coerced to the correct type.

> In C, isn't the input to anything coerced to the data type it will represent, in actual complete disregard of the input type?

In short: no. Implicit conversions between incompatible types produce warnings or errors.

> aren't classes, templates, and exceptions the things that have classically differentiated C and C++?

The key difference is the destructor. Classes, templates, and exceptions extend the reach and value of the destructor. Absent the destructor, the other stuff is of little value (cf. Java).

Well, they are trying to slim down, so to speak, the gap between C and C++ for the sake of a safer interoperability between the languages and for the sake of safer and secure code.

If C could fix _Generic from C11 and make it behave more like a lightweight version of templates, then I could say we have a huge potential of having a safer version of what we currently have.

Are classes and templates not the bulwark of what makes C++ C++ though?

I do not really care about new features to C - I would just use C++ for these features - but what is your specific argument against something like auto? Or just in general on new features.

I love that roughly 35 years after its creation we're still arguing about what c++ is or is not in relation to C. I'm looking forward to the debates and blog posts of my grandchildren on Perl 6 vs Perl 5

Doubt either of our grandchildren will ever use Perl. It is truly an arcane language that has no niche and no new application written in it.

I'm still using perl scripts for log parsing and some general sys-admin type stuff, but yeah, it's way past its prime.

What are they?

> I'm suspicious of restrict. It seems like playing with fire, and anecdotally it seems common to run into compiler optimisation bugs when using it because it's exercised so little.

On the contrary, I wish C++ had restrict in the standard, for exactly the reasons mentioned: it can help the optimizer in certain cases.

Indeed, auto-vectorization usually won't work without it...

This is all totally fair -- I've never used restrict, so it's hard for me to have a useful opinion on it.

I think it's easy to dismiss things that can help optimisation when you've never seen their effect first-hand. I imagine if I'd had an experience where I'd used it to great advantage I'd be wishing it was in the C++ standard too :)

> The size of the array needs to be known at compile time. In C++, a const variable can be a constant expression, meaning it can be evaluated at compile time. In C, this is not the case, and we must instead use a pre-processor macro:

Enums are also good here in C land for specifying compile time constants.

Yes, but only for constants of type int.

For example, this is valid:

    enum { ANSWER = 42 };
But C enumeration constants are always of type int, so you can't define a constant of some other integer type this way.

It's worse than that. The type of a C enumeration is implementation defined. It must cover the full range of defined values, but it definitely doesn't have to be 'int'. (§ (4)):

> Each enumerated type shall be compatible with [ed: one of] char, a signed integer type, or an unsigned integer type. The choice of type is implementation-defined but shall be capable of representing the values of all the members of the enumeration.

One possible source of confusion is that explicit values in an enum (i.e., '42' in your example) must have values representable as int. (§, (2)).

I said enumeration constants are of type int.

For example, given:

    enum foo { this, that, the_other };
    enum foo obj;
obj is of type "enum foo", which is compatible with some implementation-defined integer type, but the constants "this", "that", and "the_other" are of type int.

In C++, the constants are of type "enum foo", which can also be referred to as "foo".
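The distinction can be probed with C11's `_Generic`. This is only a sketch: the `IS_INT` macro, the `color` enumeration, and the function names are invented here for illustration:

```c
enum { ANSWER = 42 };
enum color { RED, GREEN, BLUE };    /* hypothetical enumeration for illustration */

/* Evaluates to 1 if the expression has type int, 0 otherwise (C11 _Generic). */
#define IS_INT(expr) _Generic((expr), int: 1, default: 0)

/* Enumeration constants have type int in C. */
int enumeration_constant_is_int(void)
{
    return IS_INT(ANSWER) && IS_INT(RED);
}

/* An object of the enumerated type is a different matter. */
int enum_object_is_int(void)
{
    enum color c = RED;
    return IS_INT(c);    /* implementation-defined: depends on the compatible type chosen */
}
```

The first function returns 1 on a conforming C11 implementation. The result of the second is deliberately left unstated because it varies between implementations.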

The designated initializers example will actually compile in recent versions of GCC, Clang, and Visual C++, even if that's not part of the standard prior to C++20. A better example (i.e. one that doesn't compile in C++ but does in C) would be

  int arr[3] = { [1] = 5 };
or even

  struct A c = {.x = 1, 2};
or another example where the designators are not declared in order of the struct members, or where the designators are nested. The version of designated initializers standardized in C++20 only allows the simple case that's currently implemented in all the major compilers.
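For reference, here is a sketch of those C99 forms in one place. The layout of `struct A` is an assumption (two int members, `x` then `y`), and the function names are made up:

```c
struct point { int x; int y; };
struct A { int x; int y; };    /* assumed layout, for illustration only */

int array_designator_sum(void)
{
    int arr[3] = { [1] = 5 };               /* arr is {0, 5, 0} */
    return arr[0] + arr[1] + arr[2];
}

int out_of_order_designators(void)
{
    struct point p = { .y = 2, .x = 1 };    /* order differs from the declaration */
    return p.x * 10 + p.y;
}

int mixed_designator(void)
{
    struct A c = { .x = 1, 2 };             /* the undesignated 2 initializes the next member, y */
    return c.x * 10 + c.y;
}
```

All three initializers are valid C99. None of them falls within the subset of designated initializers that C++20 standardized.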

See https://stackoverflow.com/a/29337570 for more info and examples.

To offer a different opinion, I love the design/images/color scheme. Feels fun and creative.

Thank you so much! This is lovely to hear. ^_^ I like it too.

The code:

    const size_t buffer_size = 5;
    int buffer[buffer_size];

compiles fine in C, but not for the same reason: C99/C11 has variable-length arrays (VLAs).


Yes, I think I made an embarrassing mess up here by using a compiler that doesn't support VLAs, and not fully understanding VLAs. Thanks for pointing it out.

To be exact, it doesn't compile fine outside of functions (which the sample code didn't have) because file scope arrays can't be VLAs.

This is what I was going for in the blog post, but I got confused after seeing these comments as Clang _does_ compile this when the buffer is of static storage class.

Which I don't think is standard -- but it's not using VLAs. I wondered if it just has constant expression semantics for const variables.

Weirdly, adding a _Static_assert to test this theory proves it for c99 but not c11 :/

https://godbolt.org/z/q-bb-n c99 with clang

https://godbolt.org/z/ad14Ah c11 with clang

https://godbolt.org/z/xJSDQa c11 with gcc (which is the only one giving the output I'd expect)

>I wondered if it just has constant expression semantics for const variables.

That would be my guess also, for applicable const variables. File scope const variables are quite constexpr-y in C anyway, since C requires all file scope variable initializers to be constant expressions (C++ only requires that for constexpr variables).

Toying around with clang in Godbolt, it seems that there are some quirks regarding this. The following is accepted:

  static const int n = 0;
  int buf[n]; // invalid zero-length array is accepted
              // probably another non-standard extension
But the following is not, despite n having the same zero value:

  static const int n;
  int buf[n]; // complains about a file scope VLA

The C standard permits implementations to support additional forms of constant expressions.

Try "clang -std=c11 -pedantic-errors".
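For a file-scope array size that does not rely on any such extension, an enumeration constant is a true integer constant expression. A minimal sketch, with `BUFFER_SIZE`, `file_scope_buffer`, and the function name chosen here for illustration:

```c
#include <stddef.h>

enum { BUFFER_SIZE = 5 };               /* an integer constant expression in standard C */

int file_scope_buffer[BUFFER_SIZE];     /* static storage duration, not a VLA */

size_t file_scope_buffer_count(void)
{
    return sizeof file_scope_buffer / sizeof file_scope_buffer[0];
}
```

This form should be accepted under `-pedantic-errors`, whereas sizing the array from a `const` variable depends on the implementation accepting it as an additional form of constant expression.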

Reminds me of a blog post of mine[1].

Funny that I say C is not a subset, while this says C++ is not a superset.

But mine doesn't go too deep and doesn't reference any standards.

I love things that explore dark corners of languages like this, look forward to digging deeper.

Also, I like the web design, kinda cyberpunk.

1. http://faehnri.ch/how-c-is-not-a-subset-of-cpp/

Being told my web design is kinda cyberpunk is a great compliment, thank you ^_^ I like the banner image on yours.

C++ was once a near superset of C. However, due to language divergence, the situation is that both C and C++ are large supersets of an intersection.

What is in that intersection depends on which C and C++ dialect pair you intersect.

E.g. a newer C++11 dialect has "long long", so if intersected with C99, "long long" is in the dialect. If we intersect C++ older than C++11 with C, or C older than C99 with C++, then we don't have "long long".

(Except as a conforming extension from a compiler, which we could detect with a configure script and use anyway.)

The thing is that the intersection languages are basically fully fledged C: you can easily develop in them and do everything you'd want from C, if you're willing to live without a few frills here and there like C99 designated initializers, and variable length arrays (dropped from being required in C in C11) and whatnot.

If you require complex numbers, that could get hairy.

A long-time C90 programmer will not find anything amiss, though.

> The size of the array needs to be known at compile time.

In C99 the example you give does indeed compile and silently get turned into a VLA.

It would only be a VLA if defined in a block, I intended the code snippet to be at file-scope, giving the buffer size and array static storage duration.

But it does indeed compile in Clang, and I'm looking into why. I think this line in the C11 standard might be key: "An implementation may accept other forms of constant expressions."

The first example doesn't produce undefined behaviour as c++ - it just won't compile - you can't initialise that pointer to a non const int from a const int without doing something naughty like a const_cast.

Which is part of the beauty of ObjC

As Stroustrup said, "Smalltalk is the best Smalltalk I know of."

Designated initializers were added in c++20, I was just looking at a recent change to clang's semantic analysis that warns on the differences between C99 and C++20 designated initializers (Todo: post link when not mobile.)

One recent difference I saw was valid in C (via GNU C extension) but invalid in C++:

    struct foo my_foo = ({

GNU C extensions are well outside the scope of standard C (or C++).

That pink background to the left weighs 6 MB. Do you really think it's necessary?

I should probably make it smaller! 6MB is pretty indulgent. But I do like how it looks and this is mostly just a fun blog for myself :)

I've converted it to jpg and it's now 448K :)

I think your blog looks great. The color palette you are using for text is very pleasing.

Thank you! ^_^

You are also contributing to global warming. It takes energy to transfer this data all over the world. And poor people will run out of their monthly phone data limit.

I've made some updates to the blog post based on feedback -- thank you everyone who pointed out mistakes helpfully :)

I've made the updates clear, and linked to the archived version of the original post.

Been saying this for years. The weird part is that it's a good thing. C++ redefined auto and it has never been the same. However, it's clearly a better language than it was in 2010.

The first example won't compile in c++ (can't get a ptr to a non const from a const without something naughty like a const_cast) - that's not undefined behaviour is it?

you're right, I messed up here. I'm working on some updates to the blog post based on feedback.

I never heard anyone say that C++ is a superset of C.

Sure, the first version was a preprocessor on top of C and certainly that is common knowledge. But a superset? Never heard it.

ObjC on the other hand...

I think it's quite a common thing for people to say if they're not actually experienced in writing the languages. I've heard it from multiple people.

As I say in the blog post, it's common knowledge for people who are experienced in writing C and/or C++ that this isn't true. But it's a misconception that persists among other programmers.

Have you seen practical cases where there have been issues because C code is not compatible with C++? The blog has nice toy examples, but it would be nice to know if they actually occur in the wild.

Yes, all the time.

Off the top of my head, I've run into:

- C++ does not support 'foo([static N])' function declarations

- C++ does not support 'struct A a = { .b = c, ... }' initializers

The conflict arises when these are used in C headers, and then C++ programs attempt to include them.

FWIW, https://duckduckgo.com/?q=%22C%2B%2B+is+a+superset+of+C%22&t... finds things like:

- C++ is a superset of C, and that virtually any legal C program is a legal C++ program. https://www.tutorialspoint.com/cplusplus/cpp_overview.htm

- "C is a subset of C++." / "C++ is a superset of C." https://www.geeksforgeeks.org/difference-between-c-and-c/

- "C++ is a superset of C; (almost) anything you can do in C, you can do in C++." https://www.cprogramming.com/begin.html

- "As you recall C is a root for C++ and C++ is a superset of C." https://www.c-sharpcorner.com/article/similarities-and-diffe...

I thought it was until just now. I've never done anything of substance in either, but I've tried both. Now I'm wondering if it was something I heard, or if I just assumed based on my extremely limited experience with both languages.

You evidently haven't been looking at the stackoverflow questions coming in at the [c] and [c++] tags... ;)

A frightening amount of [c][c++] tagging is expunged there every day.

Which is often just meaningless pedanticism. There are a lot of questions about C APIs where it's irrelevant whether the answer uses C or C++. The latter might not be an exact superset, but in most cases it's easily close enough.

And 90% of those posts are made by beginners with a crashing linked list implementation. They're uninformed.

I've definitely heard c++ described this way.

It was a superset, once. C++ started as a C preprocessor. But the languages have diverged. It's enough of a superset, though, that C++ hasn't deleted bad features of C, like pointers and arrays being the same thing.

Once C++ gained "//" for comments - which was the original release in 1983 - it was no longer a superset:

   uses_BCPL_style_comments = 1 //*  */ 2
C99's // support did not restore the superset nature.

This means your "once" was almost certainly "never".
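The trick can be wrapped in a function to see which reading a given compiler mode picks. This is a sketch with a hypothetical function name. Under C99 or later (and C++), `//` starts a comment, while under C89/C90 only the `/* */` pair is a comment, leaving `1 / 2`:

```c
/* Hypothetical probe: distinguishes dialects with // comments from C89/C90. */
int bcpl_comment_probe(void)
{
    int uses_BCPL_style_comments = 1 //* */ 2
        ;
    return uses_BCPL_style_comments;    /* 1 with // comments, 0 under C89/C90 rules */
}
```

Built with a C99-or-later mode the function returns 1. A strict C89/C90 mode would evaluate the integer division instead and return 0.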

Pointers and arrays are not the same thing. See section 6 of the comp.lang.c FAQ, http://www.c-faq.com/ .
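One quick way to see the difference is `sizeof`, which reports the size of the whole array for an array object but only the size of the pointer for a pointer. A small sketch, with function names invented for illustration:

```c
#include <stddef.h>

size_t array_size_in_bytes(void)
{
    int arr[10];
    return sizeof arr;    /* size of the whole array: 10 * sizeof(int) */
}

size_t pointer_size_in_bytes(void)
{
    int arr[10];
    int *p = arr;         /* arr converts to a pointer to its first element */
    return sizeof p;      /* size of the pointer only */
}
```

The two results differ on typical implementations. The array only converts to a pointer in most expression contexts, and `sizeof` is one of the exceptions.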

I doubt that C++ was ever a strict superset of C. Has

    int class;
ever been a valid declaration in C++?

It was maybe a superset for the brief period between C++98 standardization and C99 standardization the following year.

Nit: memmove allows src and dst to overlap while memcpy does not.

Ah yes, what I meant was that more optimisations can be made on memmove if we restrict src and dst, as the overlap case no longer needs to be considered.

I've never used restrict, so I could be missing something, but this is what I meant in the blog post by mentioning memmove.
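For context, the standard declares `memcpy` with restrict-qualified parameters and `memmove` without them. A minimal sketch of the same idea in user code, where `add_into` and `restrict_demo` are hypothetical names:

```c
#include <stddef.h>

/* Hypothetical helper: adds src into dst element-wise. The restrict qualifiers
   promise the compiler that the two ranges do not overlap. */
void add_into(int *restrict dst, const int *restrict src, size_t n)
{
    for (size_t k = 0; k < n; k++)
        dst[k] += src[k];
}

int restrict_demo(void)
{
    int a[4] = {1, 2, 3, 4};
    int b[4] = {10, 20, 30, 40};
    add_into(a, b, 4);    /* a and b are distinct arrays, so the promise holds */
    return a[0] + a[1] + a[2] + a[3];
}
```

Calling `add_into` with overlapping ranges would break the promise and make the behavior undefined, which is the trade-off being discussed: the compiler may skip the overlap case, and the caller takes on the obligation.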

I thought this was common knowledge? Pointer aliasing for example

Common knowledge is like common sense - not as common as you'd think.

I knew most of those things except for the pointer aliasing details.

Very 1995, no offense.

I hope you mean the website theme.

If yes, indeed.

I had to view in Reader mode via Firefox.

I could not read it, I got dizzy by its colors; my sensitive vision couldn't stand it -_-


In other news, 1+1=2; and two wrongs don't make a right (but three lefts do.)

And now it's time for the show!
