Soursop and Ponies in Kona: A C++ Committee Trip Report (cor3ntin.github.io)
62 points by dtoma on Nov 27, 2022 | 67 comments



> why all these groups of people decided to start from scratch than to put up with the C++ committee

In which the C++ committee continues to not acknowledge that its problem is being a committee, in the most ridiculously bureaucratic sense of that word.

If the only way to get my contributions accepted into a project involves writing a paper about it, sure, I can do that. If it involves writing a paper about it, and then having endless meetings about it that could have been emails, some of which I have to physically travel to, I can't be bothered. I've left actual paying jobs over that, I'm not doing it for free.

And sure, I'm an individual, and most of the people they're talking about here are representatives of companies. But the effort-to-results ratios still exist, and C++ has managed to tip them to the point that making an entire new language is less effort than proposing a C++ change.


The D programming language came out of my inability to influence C++. Amusingly, D has had a lot more influence on C++'s direction than I was ever able to exert directly.


It is, and must be, more difficult to make changes than to start something new. When one person starts something new alone, it is easy to make choices that fit a single vision. Once something is popular, you cannot easily make changes that everyone will agree on.

C++ has painful experience of what happens when you don't carefully consider every proposed change and so miss something. export is an obvious example, but there are others that seemed good until painful experience later showed why not. Templates would look very different if they had known then what people would do with them.


> If the only way to get my contributions accepted into a project involves writing a paper about it, sure, I can do that. If it involves writing a paper about it, and then having endless meetings about it that could have been emails, some of which I have to physically travel to, I can't be bothered. I've left actual paying jobs over that, I'm not doing it for free.

If most people wrote papers so perfect in their construction that no one would ever need to ask questions about their content, because every relevant question was answered by reading the paper, then there would be no need to shepherd them through meetings. But in my limited experience, most papers aren't like that. In the numerics study group, we had one paper at the most recent meeting that was so vague we eventually decided we had no idea what it was actually proposing, so answering the question "would we like to move forward with this idea?" was impossible. And with the author not being present... well, that's more or less the end of the road for that idea.


So send an email, and ignore it until you get a response. You don't need to fly to Kona for this.


It's not just the paper that has to be perfect; the reviewers do, too. Sadly, people don't just ask reasonable and not-already-answered questions.


These replacements don't care about the committee per se; rather, they reject the committee's core goal of preserving backwards compatibility over everything else.

Making it easier to add more features to C++ can't fix the problem of being unable to simplify the language by removing unsafe and legacy features.

If they wanted C++ and only hated the committee process, they'd have forked the language and worked on compiler extensions (the way WHATWG bypassed the W3C process). But instead they all went for a clean slate with some level of interoperability.


That quote you cherry-picked is literally in a section talking about how hard it is to get into the committee and how slow the committee is.

So no, they are not failing to acknowledge that. It's literally the point of the quote you're responding to.


I just went back and reread the section to see if I'd missed something, and... kind of, I guess? They acknowledge that it is hard to join, but they don't seem to fully get why: the problem is their system of scheduling meetings instead of discussing things asynchronously, not ISO itself. Without that realization, I would be surprised if a post-ISO C++ committee didn't just keep doing the same thing as before, because face-to-face meetings are the only way to be productive, right?

The "look at this pretty place I got to go to" picture immediately after that section does nothing to help this impression.


> One of the concerns is that C and C++ are being discouraged for new projects by several branches of the US government[1], which makes memory safety important to address.

The biggest memory safety problem for C is array overflows. I proposed a simple, backwards compatible change to C years ago, and it has received zero traction. Note that we have 20 years of experience in D of how well it works.

https://www.digitalmars.com/articles/C-biggest-mistake.html

It'd improve C++ as well.

I really do not understand why C adds other things, but not this, as this would engender an enormous improvement to C.
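
For anyone who hasn't read it, the proposal is roughly the following (syntax approximate; `process` is a made-up name, see the article for the real thing):

    #include <stddef.h>

    // today: the array decays to a pointer and its length is lost
    void process(char *a, size_t len);

    // proposed (approximate syntax, per the article): the parameter
    // carries its length, so a[i] can be checked against a.length
    void process(char a[..]);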


P.S. Modules would also be a big improvement to C, as proven by implementing them in the ImportC C compiler. Modules don't take away anything from C's utility.


But that change is by definition not backwards compatible - neither ABI nor source level.

Using an ifdef to maintain source level compatibility doesn't work as two pieces of code will see the same function using different ABIs.

That said, I agree entirely - the conflation of array and pointer is the biggest flaw; it's what "necessitated" the null termination error that people are so fond of calling the biggest mistake.
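
To illustrate the ABI point: a slice/fat pointer is two machine words where a bare pointer is one, so the two declarations below use different calling conventions (a sketch; int_slice is a made-up type, not any particular proposal):

    #include <stddef.h>

    // a hypothetical fat pointer: pointer + length, two words
    struct int_slice { int *ptr; size_t len; };

    void f(int *a);              // passes one word
    void g(struct int_slice a);  // passes two words: a different ABI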


Sorry, I meant it does not break existing code.


Fair enough - but I also get the point that it's absurd that "modern" "safe" C++ is invariably more obnoxious and verbose than the unsafe version.

Then when new syntax _is_ added, there is no attempt to fix anything. My favorite example is the iterator syntax:

    for (auto& thing : things) ...
Being defined as equivalent to

    for (auto iter = things.begin(), end = things.end(); iter != end; ++iter) { auto& thing = *iter; ... }
Which is great, because it makes any kind of range-checked or reallocation-safe enumeration much slower. All just to allow the spec to avoid making changes to existing "idiomatic" begin()/end()-based code.

So prior to this syntax I had made the WebKit Vector class perform bounds checking on any indexed access, in all build modes. So enumeration was generally indexed, because begin()/end() are atrocious, and that meant that enumeration was both bounds and reallocation safe. But then the enumeration syntax came along, specifically in terms of begin()/end() - and I was never able to make an iterator that could have the required semantics that didn't simply shit the perf bed.

So now you have a new "modern" C++ iterator that is strictly less safe than the old form of iteration, and for which a safe and secure iterator is intrinsically hard to optimize due to the semantics of the rest of the language. Hurrah.
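
Roughly the shape of that always-on check (a sketch; CheckedVector is a made-up name, not WebKit's actual Vector):

    #include <cstdlib>
    #include <cstddef>

    template <typename T>
    class CheckedVector {
        T* m_data = nullptr;
        std::size_t m_size = 0;
    public:
        // indexed access is checked in every build mode, not just debug
        T& operator[](std::size_t i) {
            if (i >= m_size)
                std::abort();  // fail fast instead of corrupting memory
            return m_data[i];
        }
        // ...
    };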


For D it looks like:

    foreach (thing; things) { ... }
which is rewritten as:

    for (auto r = things; !r.empty; r.popFront) { auto thing = r.front; ... }


C++ already has vector and now has span. Your proposal doesn't add anything over those.


Which only do bounds checking in debug builds, or when one explicitly uses compiler switches to enable it in release builds. Good luck trying to advocate for using at() everywhere instead of brackets.

Before C++ got the STL, all collections libraries shipped with compilers used to have bounds checking enabled by default; apparently that is too much of a performance loss for the standard library.

Walter's proposal has bounds checking on unless explicitly disabled, like in any sane systems language.


As experience has shown, bounds checking is needed in the release builds, because those array overflows are only discovered by hackers in the released software.

D compilers allow that to be turned off, but it's only appropriate when:

1. evaluating how much the checking costs in runtime performance

2. doing competitive benchmarking

Otherwise, it should always be on.


Exactly, as any sane systems programming language. :)

I always force-enable bounds checking in C++ code, and I've never had a performance issue where the real culprit wasn't something else: the wrong algorithm or data structure for the problem at hand.
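
For reference, the usual switches (names vary by standard library and version, so treat these as pointers rather than gospel):

    // libstdc++: compile with -D_GLIBCXX_ASSERTIONS to make operator[] on
    // vector/string bounds-checked even in release builds.
    // libc++:    -D_LIBCPP_ENABLE_ASSERTIONS=1 (newer versions use hardening modes).
    #include <vector>

    int main() {
        std::vector<int> v{1, 2, 3};
        return v[3];  // UB normally; aborts reliably with the flags above
    }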


Nobody has ever accused C++ of having good defaults. That is also unfortunately one of the hardest things to change without language-fragmenting mega-breakages (see Python 3).


Adds:

1. convenience

2. attractive appearance

3. ubiquity

4. better error messages (because the compiler knows what they are)

5. one construct instead of two (vector and span)

6. overflow behavior selectable with compiler switch

For example:

    #include <vector>
    std::vector<int> v;
vs:

    int[] v;
Many D users have remarked that it's the single best feature of D.


I think the core issue here is that WG21 seems hell-bent on refusing any new syntax if at all possible, and instead requires "generic" solutions that can be used for many problems. The result is these unending train-wreck solutions for what should be basic features, and we end up with enable_if and SFINAE. At least C++ finally has the idea of specifying an interface for a template instantiation, except of course it's absurd in its own way. Rather than specifying an API, you write "code", and a type conforms to that concept if the code "compiles". So you have no way to say "I expect this type to conform to this concept" other than by using it.

The primary goal of "concepts" appears to be to mitigate the awful error messages, but even for that it fails.

For example, let's imagine Hashable from any other language, and look at a C++ concept version:

    #include <concepts>
    #include <cstddef>

    template <typename T>
    concept Equatable = requires (const T& a, const T& b) {
      { a == b } -> std::same_as<bool>;
    };

    template <typename T>
    concept Hashable = Equatable<T> && requires (const T& a) {
      { a.hashCode() } -> std::convertible_to<std::size_t>;
    };
and an implementation:

    struct Thing {
      size_t hashCode() const;
    };
    bool operator==(Thing, Thing);
Now how do we make sure Thing is actually going to conform to Hashable? With an assert, of course; why would we want anything so gauche as declarative syntax?

    static_assert(Hashable<Thing>);
You can see the brilliant syntax explicitly disallows us from specifying constraints on a concept, and instead we have to use the && expression. My personal belief here is that the banning of constraints in the template name is simply to force people to use logical composition, so the people who thought of it can claim people like it.
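
For what it's worth, the only other conformance check the language offers is at a constrained use site. A sketch, building on the definitions above (bucket_of is a made-up example):

    // the constraint is checked where the template is used, not where
    // Thing is defined
    template <Hashable T>
    std::size_t bucket_of(const T& t, std::size_t nbuckets) {
        return t.hashCode() % nbuckets;
    }

    // bucket_of(Thing{}, 64);  // only here would a non-Hashable Thing fail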


> 3. ubiquity

Hardly. A std library update is often easier and faster to ship than a new compiler toolchain. Especially in the C/C++ world, which is almost uniquely terrible in this regard.

> 5. one construct instead of two (vector and span)

Incorrect. Vector is an owned type, while both span and your proposal are borrowed types.

> 6. overflow behavior selectable with compiler switch

That sounds an awful lot like a negative to me, not a feature. It's not generally desirable for code behavior to change depending on how it's compiled; it makes debugging kinda hard.

Unless you're just referring to things like sanitizers, in which case yeah span & vector both have those same flags.


My proposal also uses [] as an owned type.

> It's not generally desirable for code behavior to change depending on how it's compiled, makes debugging kinda hard.

It only changes the behavior of what happens after the program has already failed. People want different behaviors, depending on their environment, and like having the choice.


D's slices "T[]" map to C++'s span<T> -- you're right. Though the semantics/APIs of each are slightly different, you can treat them as approximately equivalent concepts.

C has no such thing that I am aware of


Fully agree with #embed. I don't really need the feature often enough to justify it, but it feels fine to use the preprocessor for it. It's annoying enough for the people who need it to maintain some build-system workaround, so a simple, straightforward implementation seems better than an over-specified solution for every possible use case (imagine the horror of some template construct with locale/encoding specification added to it). Also, even if C keeps diverging from C++, having these basic constructs stay compatible is worth quite a lot, imho.
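
For anyone who hasn't seen it, the C23 form is pleasingly minimal:

    // the file's bytes expand into the initializer list (C23 #embed)
    static const unsigned char favicon[] = {
    #embed "favicon.ico"
    };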


One problem is that C++ wants to (eventually, at least as an ambition) deprecate the pre-processor. So it's embarrassing to add features which people need to this system which you claim you're deprecating. Which is it?

I think C++ would have been better off with a closer equivalent to include_bytes! (Rust's compiler intrinsic masquerading as a macro, which gives back an immutable reference to an array with your data in it) - but the C++ language doesn't really have a way to easily do that, and you can imagine wrestling with a mechanism to do that might miss C++ 26, which is really embarrassing when this is a feature your language ought to have had from the outset. So settling on #embed for C++ 26 means it's done.

I was concerned that maybe include_bytes! prevents the compiler from realising it doesn't need this data at runtime (e.g. you include_bytes! some data but just to calculate a compile time constant checksum from it) but nope, the compiler can see it doesn't need the array at runtime and remove it from the final binary just as a C++ compiler would with #embed.


> having these basic constructs stay compatible is worth quite a lot

Also, I think C++ can still use the same preprocessor as C at this point (it's been a while since I've had to deal with that)? If you're going to diverge the preprocessor you should get more benefit out of doing so than "not having #embed". For that matter, having important features like #embed only available via preprocessor also helps undermine the pointy-haired trolls who (allegedly?) keep trying to deprecate the preprocessor entirely in favor of some proprietary build system.


I would love to see a world in which a system is designed and built from multiple languages, so that the "right" tool could be used for each part. Does this even make sense? The modern distributed web seems to be leading us there. Slowly, slowly.

I would also love to see a world where all C, C++ dependencies magically port themselves to Rust without FFI or a first-cut rewrite that a hobbyist did. 10-20 years maybe?

Unless:

> One of the concerns is that C and C++ are being discouraged for new projects by several branches of the US government[1], which makes memory safety important to address.

Reading these posts really does make it seem like C and C++ are a derided, ancient construct of better days when we trusted software engineers and didn't write code for connected systems. It's just not possible to go back to those times.

While I'm extremely interested in Rust, the ecosystem for my entire industry is based on C++ with no change in sight, and built on C operating systems. Because, to date, we write code that executes on a machine that is not taking input from a user, and so does not have the brand of security concerns that make Rust attractive (for the most part). Here, static analyzers get us what we need at the 80/20 level.

1. https://media.defense.gov/2022/Nov/10/2003112742/-1/-1/0/CSI...


It only makes things slightly better, but Windows, Android, macOS, iOS, mbed, and plenty of others have enough C++ in them, even in kernel space.

And yes, it will either take decades to purge them from IT ecosystems, or they finally get some #pragma enable-bounds-checking, #pragma no-implicit-conversions (yes, there are compiler-specific ways to get things like this), and similar, so that they can stay in the game of safe computing.


I shall preface this by saying that I'm a beginner at C++.

I really like your idea of building a language from multiple parts.

Or multiple DSLs.

Maybe you could have a DSL for scheduling the code, a DSL for memory management, and a DSL for multithreading. A DSL for security or access control. And the program is woven together with those policies.

One of my ideas lately has been how to bootstrap a language fast. The most popular languages are multi-paradigm languages. What if the standard library could be written in an Interface Description Language and then ported to and inherited by any language via interoperability?

Could you transpile a language's standard library to another language? You would need to implement the low level functionality that the standard library uses for compatibility.

I started writing my own multithreaded interpreter and compiler that targets its own imaginary assembly language.

https://GitHub.com/samsquire/multiversion-concurrency-contro...

I like Python's standard library; it works.

I really enjoy Java's standard library for data structures and threading.

Regarding the article, I hope they resolve coroutines completely. I want to use them with threads, similar to an Nginx/Node.js event loop.

I tried to get the C++ coroutine code from this answer to my Stack Overflow post working on GCC 10.3.1, but I couldn't get it to compile; I get "co_return cannot turn int into int&&".

https://stackoverflow.com/questions/74520133/how-can-i-pass-...
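
For what it's worth, that error usually means the promise type's return_value only accepts an rvalue reference. A minimal self-contained sketch (task is a made-up type, not the code from that answer) where return_value takes the result by value, which sidesteps the mismatch; GCC 10 needs -std=c++20 -fcoroutines:

    #include <coroutine>
    #include <exception>
    #include <utility>

    template <typename T>
    struct task {
        struct promise_type {
            T value{};
            task get_return_object() {
                return task{std::coroutine_handle<promise_type>::from_promise(*this)};
            }
            std::suspend_never initial_suspend() noexcept { return {}; }  // run eagerly
            std::suspend_always final_suspend() noexcept { return {}; }   // keep the result alive
            void return_value(T v) { value = std::move(v); }  // by value: no int&& binding issue
            void unhandled_exception() { std::terminate(); }
        };

        explicit task(std::coroutine_handle<promise_type> h) : h(h) {}
        task(task&& o) noexcept : h(std::exchange(o.h, {})) {}
        task(const task&) = delete;
        ~task() { if (h) h.destroy(); }

        T result() const { return h.promise().value; }

        std::coroutine_handle<promise_type> h;
    };

    task<int> answer() { co_return 42; }

    int main() { return answer().result() == 42 ? 0 : 1; }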


>"I would also love to see a world where all C, C++ dependencies magically port themselves to Rust without FFI or a first-cut rewrite that a hobbyist did. 10-20 years maybe?"

10-20 years sounds like a pipe dream.


I've been dreaming about idiomatic interop for a while, but by definition it's out of scope for any given language.


Still hoping for compile-time introspection/reflection for class serialization. Whichever language implements it first (C++ or another), I'm all in on. I come from a scientific background, where running code on data-gathering machines, writing data out, then reading it back in later for analysis is 90% of what I do.


I've done this with libclang: parsing C++ with clang.cindex in Python, walking the AST for structs with the right annotation, and generating code to serialize/deserialize, all integrated into the build system so the dependency links are there. Obviously having it built into the language would be way better, but if I were spending 90% of my time on this, I would take whatever steps necessary.


Interesting, sounds similar to the dictionary that CERN ROOT generates. I'd like to be able to do the same, and a generic "dictionary maker" like what you've described could be useful for supporting multiple formats.


Interested in sharing any code? This will be useful to many C++ devs who need any sort of reflection in their workflow (especially for gamedevs)


Not OP, but I've done this a couple of times, both through the Python API:

    https://github.com/jcelerier/dynalizer
to automatically generate safe dlopen stubs for runtime dynamic library loading from header files

and through the C++ one (this one is an extremely quick and dirty prototype):

    https://github.com/ossia/score/blob/master/src/plugins/score-plugin-avnd/SourceParser/SourceParser.cpp
to pre-instantiate get<N>(aggregate), for_each(aggregate, f), and other similar functions in https://github.com/celtera/avendish, because of how slow it is when done through TMP (doing it that way removed literally dozens of megabytes from my .o files and had a positive performance impact even with -O3). So I weep a lot when I read that people in the committee object to pack...[indexing]


Have you checked out the PFR library (perfect flat reflection)? I've coupled this with the magic-enum library to good effect.

PFR can be rewritten in very little code, assuming c++14(?); magic-enum is long enough to just use.

I generally have one TU for just serialization, and don't let PFR and magic-enum "pollute" the rest of my code. This keeps compile times reasonable. (The other trick is to uniquely name the per-type serializer: C++'s overload resolution is O(n^2).) I then write a single-definition forwarding wrapper (a template) that desugars down to the per-type-named serializers. It strikes a good balance between hand-maintenance, automatic serialization support, and compile-time cost.
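
A sketch of the shape this takes with Boost.PFR, which visits fields positionally since field names aren't available (Sample and write_fields are made-up names):

    #include <boost/pfr.hpp>
    #include <iostream>

    struct Sample { int id; double x; double y; };

    // serialize any simple aggregate by visiting each field in order
    template <typename T>
    void write_fields(std::ostream& os, const T& t) {
        boost::pfr::for_each_field(t, [&](const auto& field) {
            os << field << ' ';
        });
    }

    int main() {
        write_fields(std::cout, Sample{1, 2.5, 3.5});  // prints: 1 2.5 3.5
    }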


This does look very interesting, thank you!


You can do this with Haskell (aeson package) and maybe with Rust (serde?)


Java annotations have enabled compile-time reflection since Java 6, and of course it has been used for serialization: https://github.com/square/moshi/#codegen


Rust basically supports this with pretty low complexity via serde, but I think many developed languages have at least something to do this, although in some it has to be hacked on.


Reflection is definitely a big topic of discussion, but I'm not sure whether it will make it in time for the finalization of the C++23 spec. I think this is the most recent iteration of the proposal:

https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2022/p12...


Now there are two competing proposals; with luck, maybe one of them can make it into C++26. Or maybe not.


I'm curious how you could use https://celtera.github.io/avendish for this. I developed it to make it very easy to create media processors with the little reflection we can currently do in C++20; in my mind, data gathering would not be too dissimilar a task.

It makes me really sad to read about the objections to pack indexing, as this library needs it a LOT (and currently, doing it with std::get<> or similar is pretty bad and does not scale at all past 200 elements in terms of build time, compiler memory usage & debug build size).


Compile-time introspection and reflection have been implemented in GHC Haskell as the Generic class. Basically the compiler synthesizes a representation of your data type in terms of basic operations like :+: or :*: (for sum types and product types) and you can easily operate on them. Is that what you mean by compile-time introspection?

It's already being used (for many years, in fact) to implement JSON serialization and deserialization in aeson without depending on Template Haskell (which is kind of like macros).


What about making JSON that reflects the class structure and serializing that?


Well yes, but if I had reflection I could make a general 'serializer' routine that has backends for multiple formats (JSON, HDF5, CDF, ROOT, etc).


I’m pretty sure D supports this.


> I spent an ungodly amount of time over the past couple of years exploring ways to get views::enumerate (a view that yields an index + an element for each element of a range) to produce a struct with named members (index & value), as this is more ergonomic and safer than std::get<0>(*it). Alas, despite my best efforts and hundreds of hours invested, this proved almost unworkable.

Isn't this a damning indictment of the language and everything that is wrong with it? How can something so simple be so hard?
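
For context, the ergonomic gap being described, assuming the proposed std::views::enumerate shape (adopted for C++23 after this thread):

    #include <ranges>
    #include <tuple>
    #include <vector>

    void f(const std::vector<int>& v) {
        for (auto&& e : std::views::enumerate(v)) {
            auto i = std::get<0>(e);  // the index... or was it the value?
            auto x = std::get<1>(e);
            // with named members this would read e.index and e.value
        }
    }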


It seems C++ is in the throes of the language version of the Innovator's Dilemma.


Herb's CppFront looks like the best hope for a clean C++ future


If anything, Circle would be it.

CppFront is just like Carbon and Val: a completely different syntax. Translating to C++ is just an implementation detail; he just markets it differently, given his position at ISO, most likely so as not to make too many waves.


Not really. We as a community already know that the best way to significantly change a language while keeping full compatibility is to write a preprocessor (the CppFront way).

Carbon is DOA as it hacks a compiler, and Circle isn't even in active development (again, if it compiled to C++, that would be a better direction).

At the same time, putting ideas from Carbon into CppFront is possible (I wish the Carbon developers would also consider going the preprocessing route).


Circle is the only one that is in active development and available today; you need to improve your fact-checking.

https://twitter.com/seanbax

CppFront is just like Carbon and Val; compiling to C++ is an implementation detail. C++ and Objective-C aren't C, just as CppFront isn't C++, regardless of the sales pitch.


Herb is one of the best things that ever happened to C++. Not only is he wicked smart, but his ability to persuade is most impressive. As if he needed more, he's also a very nice gentleman.


I agree. C++ really should just be left as-is and used as a compilation target, for easy bootstrapping and interoperability.


CppFront is a different project entirely. It's like saying C++ is C's future


CppFront compiles to C++ and everything is intended to map to clean usable C++ code so that if the project fails, the code is still salvageable in its C++ form.

It's not intended as a separate language.


I can do the same with Eiffel, so is Eiffel C++'s future?


I think you're removing all context and constructing a false equivalency

According to Wikipedia, Eiffel was created in 1986, making it a contemporary of C++'s initial development. From what I can tell, its creator had no affiliation with the development of C/C++, and it was created for reasons completely unrelated to C++.

CppFront was created by the C++ committee chairman with the explicit goal of providing a path forward for C++. Herb explicitly stated that the inspiration for using C++ as a compilation target came from Bjarne's initial implementation of C++, which compiled to C.


Nope, I am making the point that plenty of languages have compiling to C++ as a goal; Eiffel was only one example among the many I could have chosen.

The way Herb Sutter tries to sell CppFront, versus all those other languages with backends capable of generating C++, is exactly that: as ISO C++ chairman, he is trying to portray CppFront as unlike the others, given his position.

If that weren't the case, he would use the same terms as the Carbon and Val folks.


Exactly what C++ did with C at the beginning: https://en.m.wikipedia.org/wiki/Cfront


> I propose that C++ initialize all stack variables to zero, preventing ~10% of CVEs.

This can be done by compilers without any change to the language standard.
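
Indeed; for instance, Clang (and GCC 12+) accept -ftrivial-auto-var-init=zero, no standard change required. A sketch of the effect:

    // built with -ftrivial-auto-var-init=zero, 'x' is zeroed rather than
    // holding indeterminate garbage
    int f() {
        int x;     // reading this uninitialized is normally UB
        return x;  // returns 0 under the flag
    }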


It can, but most don't use it, even though Windows and Android have proven you can ship that into production with hardly any noticeable performance loss.





