My comment for too many years is that C/C++ fails to deal with three issues: "How big is it", "Who owns it", and "Who locks it". C++ has, with difficulty, made progress on "How big is it" through templates, but raw pointers keep leaking out. "Who owns it" has been tough to deal with, although "owned pointers" at least try. "Who locks it" has yet to be addressed at the language level, although at least there's now some agreement on multiprocessor semantics at the language level. The old line was that locking is an operating system problem.
After years of C++ and six months of Rust, I don't think C++ can catch up with modern languages with template gimmicks. Getting pointers right needs global analysis. Getting decent error messages about inconsistencies between point A here and point B way over there requires global analysis. Trying to build a borrow checker with the C++ template and type system is like trying to pound in a screw.
Rust has its own problems. Figuring out how to do something safely can be quite difficult. It can involve solving puzzles, and often involves rewriting things at several levels, especially when threads are involved. As a result, "unsafe" is too often used as an escape hatch when someone can't spend the time to get it right. Those who slave under the whips of "agile" may be forced to such hacks.
> three issues: "How big is it", "Who owns it", and "Who locks it"
The issue with this kind of thinking is the belief that there is an "it", rather than a "they".
Say you are writing an HTTP server [1]. A beginner mindset (what seems to be demonstrated in this post) will start allocating left and right: for every HTTP header, for every piece of string, for every piece of metadata record, etc. Thus, "how big is it" becomes "no bigger than it needs to be". The lifetime of these allocations and deallocations will be made as "narrow" as possible, meaning that the programmer will try to deallocate as soon as they stop needing the stored data; "who owns it" becomes the user of the data itself. And so on.
However, think of an alternative strategy, something which game and embedded developers have been using for decades. You allocate a memory arena when the HTTP request comes in. From that point on, every "allocation" is equivalent to bumping a pointer in that arena. If the arena gets full, we allocate another and chain it to the previous one, as a linked list. No deallocations are performed (or could possibly be performed) until the Request is parsed, relevant processing is done, and the Response is constructed and sent. At this point, the entire arena chain associated with that particular request-response cycle is deallocated in one go.
How big is it? We don't care, smaller than the arena.
Who owns it? That particular request-response cycle.
Who locks it? No one, if different request-response cycles are parallelised with respect to each other; otherwise, it depends on the nature of the concurrency/parallelism.
[1] Usually, I would have given a gamedev example, and then people would have said that it only works in gamedev. That's why I have tried to give a webdev example, considering the majority demographic on this website.
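To make the idea concrete, here is a rough C++ sketch of such a chained bump arena (the names, the block size, and the malloc-based backing are all illustrative, not taken from any particular codebase):

    #include <cstddef>
    #include <cstdlib>
    #include <new>

    // One link in the chain: a header followed by a raw payload.
    struct alignas(std::max_align_t) Block {
        Block*      prev;  // previously filled block, if any
        std::size_t used;  // bytes handed out so far
        std::size_t cap;   // payload capacity
    };

    class Arena {
    public:
        // "Allocation" is just bumping a pointer; when the current block is
        // full, a new one is malloc'ed and chained onto the previous one.
        void* alloc(std::size_t n) {
            n = (n + alignof(std::max_align_t) - 1) & ~(alignof(std::max_align_t) - 1);
            if (!head || head->used + n > head->cap)
                grow(n > kBlockSize ? n : kBlockSize);
            void* p = payload(head) + head->used;
            head->used += n;
            return p;
        }
        // The whole request-response cycle's memory goes away in one go.
        ~Arena() {
            while (head) { Block* prev = head->prev; std::free(head); head = prev; }
        }
    private:
        static constexpr std::size_t kBlockSize = 64 * 1024;
        static unsigned char* payload(Block* b) {
            return reinterpret_cast<unsigned char*>(b) + sizeof(Block);
        }
        void grow(std::size_t cap) {
            auto* b = static_cast<Block*>(std::malloc(sizeof(Block) + cap));
            if (!b) throw std::bad_alloc{};
            b->prev = head; b->used = 0; b->cap = cap;
            head = b;
        }
        Block* head = nullptr;
    };

A real server would hold one of these per request-response cycle; the point is the single deallocation at the end, not this specific implementation.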
> You allocate a memory arena when the HTTP request comes.
The per-request-arena concept is one of the most powerful and IMO underappreciated tools out there for dealing with memory-lifetime issues. I've used it or advocated for its use at multiple jobs and projects, quite successfully, even for kernel code. Windows NT uses it in IRPs. Network code in both BSD and SysV did something pretty similar, at least in the long-before-Linux days. It's tried and true. Having a clear moment when things should be freed and an equally clear description of what those things are is wonderful.
That said, even games and network servers and storage drivers often have other complex data structures not related to individual requests. It's usually not hard to come up with rules for each of these, but the rules forced on you by something like STL might not be as convenient, performant, and/or verifiable as those you come up with yourself.
I hope that some day programmers will stop using memory-unsafe languages except when they really need to, and that when it is necessary those languages will give them the tools to solve those problems themselves instead of pretending they're already solved. They're not. Rust's borrow checker plus Zig's arena concept plus a couple more pieces might actually get us there. C++'s feature salad never will.
Maybe I didn't express myself properly (English is not my native language), but the point wasn't that everyone should use Memory Arena™ instead of Smart Pointers™ or Borrow Checker™. I am not trying to sell a conference here :)
The point was, to put it in Mike Acton's words, "when there is one, there are many". Usually, things are grouped, perhaps semantically, and it makes sense to think of them as a group rather than breaking them into pieces. Arenas are one rather simple grouping mechanism, but they are not the only one. Of course, explaining complex grouping would require explaining the associated problem-solution pair, which is obviously out of scope for a forum reply.
All I would say is: if one steps out of one-at-a-time thinking, statically plans the dynamically occurring data transformation, and thereby exploits the patterns and relationships inherent in that data flow, programming in low-level languages becomes much easier.
But you need to know the problem (really really know it), and you need to know the solution. Temet Nosce.
Um, yeah, but there are other issues that make it not so nice. And it's an inoperative concept for most kernel/embedded work. I'd say in most contexts where forking would be OK you should be using a GC language anyway.
> And it's an inoperative concept for most kernel/embedded work
I recently integrated mini_httpd (a small forking server) into an embedded system as part of a solution for distributing updates across a cluster of these systems.
You must be talking about 15 cent microcontrollers (that still have a TCP/IP stack with SSL) or something, not ARM cores with MMU's running Linux.
I said most kernel or embedded. It's lovely that there are all sorts of embedded devices nowadays capable of supporting a style of programming indistinguishable from a general-purpose system, but that's not the only or even most relevant case. There are also billions of devices out there - not just 15-cent microcontrollers either - that don't lend themselves to that style because of real-time or other requirements. Even within a conventional system, there's a ton of stuff in the kernel - pre-init, board support, most storage and networking - that can still use a per-request-arena approach to good effect but can't fork a process.
As I said, if you can fork all the time you probably shouldn't be using a memory-unsafe language anyway. Nothing you've said so far has suggested otherwise. The basic techniques for implementing your own memory safety are still valuable and worth discussing for people who aren't in Easy Mode all the time.
This subthread is strictly about web serving. Requirement combinations like "HTTPS serving in pre-init or board-support code" or "real-time HTTPS serving on an under-powered embedded board" are not practical or relevant, unless we replace "HTTPS serving" with something else.
With regard to the other point, if we are using a memory-safe language, then we can use fork to get an instant arena, regardless of how that language performs resource management under the hood. Whatever allocations happen in the request, of memory or file descriptors, are reliably gone when that exits. If there is any problem in the implementation of the memory safety, the failure is contained to a process. Thus we have an additional reason to use process containment: not having control over the memory management, and not trusting it 100%.
A meta comment, not really related to your point, but I found it amusing in context.
You're writing about how doing lots of little things in an HTTP server is inefficient; and you're replying to John Nagle, who contributed the Nagle TCP optimization that debounces tiny sends with a little bit of delay, in the hope of doing more work in one go!
(On topic of your HTTP example: this is where generational GC can shine. Ideally one of your younger generations covers the full allocation cycle for a request & response, so that nothing survives the collection and it's ultra-cheap to collect since tracing the roots doesn't find anything. The nice thing about GC instead of an explicit arena is that it's global, so you don't need to contort all your library calls to ensure they're using the right allocator.)
Game dev here. Try reading the Godot code base. It's tiny allocations all the way through.
It's really quite bad.
BUT
I'm using it professionally and it's useful. It's hard for me to mentally reconcile how good and bad the creator of Godot simultaneously was (is?).
Casey also goes way too far in his take (as usual).
RAII is a method to guarantee safety and correctness. It's useful in many many scenarios.
Game dev and c++ is a great area to hone your performance skills. And it's awesome to really squeeze the power out of a CPU, but doing that is only a small small piece of the game development journey. Ultimately, you must make a product that is entertaining.
PS: Mike Acton's talk is really only tangentially related. He's just talking about uber perf in general.
As productive and useful as Godot is (it is an awesome engine), its design resembles game engines of the late 90s to early 2010s when OOP was all the rage. Back then, most game engines were written that way (lots of tiny heap-allocated objects, connected by shared pointers). It's really just since the 2010s that the CPU/memory gap is the main driving force of game engine design (which wasn't much of an issue in the late 90's).
On the other hand, many games don't need to juggle more than a few dozen to a few hundred dynamic instances of one "thing" (outside of the particle- and animation-systems at least), so for most games a modern ECS design is definitely overkill (except for some specific parts of the game). Providing a good "game building workflow" in the editor is definitely more important.
Hey Floh, love your work! It's validating to hear you say these things. For me, the biggest annoyance with Godot's code base is the lack of clear ownership semantics (no use of smart pointers), and the use of fully hand-rolled collection types for everything. It makes it hard to inspect variables in a debugger! Also, the performance of the collection types is bad because of all the allocations. Seems like those problems will never get solved, as it would be too expensive to rectify.
A fairly common way is to first implement experimental gameplay logic in a scripting- or visual-language, and once it works, extract the performance-critical parts into "proper" C++. Basically, the common lower level gameplay building blocks are written in C++, glued together by some higher level mechanism (like noodle graphs or a scripting language, and common features move into C++ as needed). Unless it's Unity, in that case, replace C++ with C#.
Wow, Casey doesn't hold back. Without hesitating, he clearly says that 100% of code written in RAII* style sucks. Interesting, considering Stroustrup considers RAII to be "the basis of some of the most effective modern C++ design techniques."
I'd like to hear Casey's comments on TDD.
*RAII style is lumped together with try/catch, smart pointers, tons of malloc/free new/delete
Casey hasn't a clue. He's only written games, and he hasn't actually had to solve the issues that are being solved in other domains. If he actually did have that experience he'd see his ideas don't scale. Games are, in general, vastly different from other apps (word processors, video editors, browsers, etc.) in that, for the most part, they get to choose all of their data upfront. If you're making "The Last of Us 2" there is no user data. There's no "some people will use this to write a letter to grandma, and yet some other people will write an 800-page book on physics with mathematical diagrams", and yet another will write a report on the market with linked live data.
Consider "games" like Minecraft, Roblox or Dreams, those are entirely user-data-driven. Different types of games are at least as different to each other as to other types of applications (or rather, they are not less diverse than other applications, they usually just have a higher focus on performance).
How are they different? You have some primitives like a block, and meaning is given by users to a group of them. The program has to care only about the primitives.
(Of course making it performant, not rendering/“simulating” everything and the like is exceedingly hard, but it is true that it is more of a “closed world” as opposed to some other areas of software development.)
This type of "creative game" lets users combine basic building blocks in the same way a word processor application allows writing entire books by combining a limited set of characters. The limitations of strictly linear games like Last of Us are not because of technological restrictions but because it's hard to tell a cinematic story while still giving the user complete freedom.
Minecraft was created in Java with duct tape and hacks and a lot of ugliness and perf characteristics which wouldn't fly past a pedantic programmer like Casey.
Is your argument that arenas don't scale because user-provided data is variable in size?
Although arena memory is casually described as "allocate one huge chunk of memory up front," you are not literally only allocating one block ever and praying it never runs out. If you run out, you allocate another block. The point is that you don't call malloc for every string, object, list, etc. Adhering to this largely eliminates the need for RAII. What about this doesn't scale?
Personal anecdote: I'm building an IDE, where literally all of my data is provided by the user, and arenas have worked perfectly. I don't think I have a single destructor except for dealing with things like file descriptors, etc.
Bjarne Stroustrup is a Director at Morgan Stanley, a major investment bank in New York where software difficulties may cost millions of dollars per minute. He has described his workday as people coming to him with a difficult software engineering problem, about which he asks increasingly detailed questions until light dawns, and they go away ready to re-write the badly designed subsystem causing the trouble.
When using a "rubber duck" I think you just force yourself to turn loose thoughts into coherent ideas by having to express them clearly. This can reveal some problems that were not apparent before the idea was expressed. It's a "rubber duck" because it's really just a monologue and an inanimate object could do the job of the listener.
What's described above seems more like Socratic questioning, where a person asks questions that reveal facets of an idea that the person being asked may not have considered, thereby prompting the asked to rethink their assumptions and draw new conclusions.
I don't see grouped resources as an alternative to the way we use RAII. It doesn't mean that the approach doesn't have its own merits, but it doesn't cover the things we do with RAII. The architecture of a DAW isn't much like a server that responds to requests, even if there are elements of it that do correspond to that pattern.
Alternatively, you are referring to something I'm not aware of.
Does TDD here mean ‘Type Driven Development’ a la Idris or, as I assume (and seems more likely) does it mean Test Driven Development?
If you mean test driven, I don’t think I have heard Casey discuss anything remotely close to the standard presentation of TDD strategies. I would imagine, based on watching a large amount of Casey streaming, both alone and with Blow, that he would view testing (as in TDD) as unproductive for his particular style of development and his goals as a programmer. But, since it’s in my brain now, I will ask him next time I catch a live stream, just to satisfy my curiosity.
I can’t speak to any discussions specifically about TypeDD. But I would bet Casey would be wholly uninterested and consider it theoretical academic fluff that pulls away from the fundamental data transformation work of development.
Your comment reads like you disagreed in some way with the parent, but then you showed a scenario and defined it in a way you can answer the important questions. Just drop the "we don't care" part and they're all good answers that can be used to model the ownership and usage.
The lack of locks in this scenario is important in itself and can be expressed in types. (It shows for example that as long as the whole process is migrated to another thread, you can have safe N:M scheduling) The process ownership of the arena can be well defined as well.
The "alternative strategy" as you described it doesn't have different questions - just different answers.
No, the point wasn't a scenario. The point was to stop focussing on individual allocation-deallocation and start thinking about the data transformation pipeline. And every program is a data transformer — because data transformation is all a computer does or can possibly do.
When one starts thinking in terms of a data transformation pipeline, and starts to relate the lifetime of data to the lifetime of the various phases of that pipeline, then one stops worrying about individual "objects" and starts thinking in terms of aggregates. Suddenly, there is no need to track the ownership or size of each object, because those properties are now shared with the same (or isomorphic) properties of the phases of the pipeline itself. As long as the position in the pipeline is tracked (e.g., by the call stack), all other lifetimes will automatically be tracked.
I get the idea you're describing and use it for various purposes. But I don't agree "there is no need to track the ownership or size for each object". You can offload some of that thinking to the arenas, the same way you'd do it with GC, sure. But you can't just ignore ownership - sure request context is owned by the arena - but do you need to copy the values you pass to logging? do you need to wait for log flush before destroying request context?
What about the more common cached data / process state?
Arenas simplify the processing just like GC but they're not magic and don't solve everything.
EDIT: And about data sharing (with logging, etc.), that's part of the data pipeline too. So yes, your aggregation mechanism will take it into account. These are not two separate problems, irritatingly coupled due to reality; these are two parts of the same problem.
> From that point on, every "allocation" is equivalent to bumping a pointer in that arena.
I'm not a C++ developer, so I have a hard time imagining how would this be implemented in real code. I know how to allocate a blob of memory, but how do I redirect all the later allocations (that use `new` or that are hidden in std::string and other containers) to use parts of that blob? How do I know when the blob is going to overflow so that I need to allocate another one? Is it possible to give 5-10 lines example showing the basics of the technique?
It’s not an either-or thing; there is probably no need to allocate strings in the arena.
Some cpp structures do allow for custom allocators, but you are more likely to have an arena instance and call some specific function on it and it will do the allocation for you. It usually operates on only a few types, not meant for arbitrary allocations.
> Some cpp structures do allow for custom allocators,
Yeah, that's what I was thinking about, but my only knowledge of allocators comes from gcc error messages which include all the template parameters - I have no idea how they work or how to switch to another one :(
I get the idea, though, it's basically what Erlang does for its processes - each one has a "private heap" just for it. If the process exits, the whole chunk of memory can be reclaimed and there's no need to run GC on it anymore (there's no sharing of memory between processes in Erlang, at all, so it's safe).
You either pass the allocator around to every stl container, and everything that uses an stl container. Or you override new/delete and have a sidechannel that defines what arena to allocate in.
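For what it's worth, since C++17 the standard library ships plumbing for exactly this in <memory_resource>. A rough sketch of the "pass the allocator to every container" route (the buffer size and names are only illustrative):

    #include <cstddef>
    #include <cstdio>
    #include <memory_resource>
    #include <string>
    #include <vector>

    int main() {
        // A stack buffer backs the first allocations; once it runs out,
        // monotonic_buffer_resource chains further blocks from the default
        // upstream resource. Nothing is released until the resource dies.
        std::byte buffer[4096];
        std::pmr::monotonic_buffer_resource arena(buffer, sizeof(buffer));

        // pmr containers and strings route their allocations through the arena.
        std::pmr::vector<std::pmr::string> headers(&arena);
        headers.emplace_back("Host: example.com");
        headers.emplace_back("Accept: */*");

        for (const auto& h : headers)
            std::printf("%s\n", h.c_str());
    }   // arena goes out of scope: every per-"request" allocation gone in one go

The other route, overriding new/delete with a side channel, avoids threading the allocator through every type, at the cost of global state.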
Another optimization would be to keep the request buffer around and use it as backing storage for all the string objects that result. The weakness of this strategy is that it doesn't make sense to keep it around if you only need a small part of it. It's a common source of memory leaks in Java, where `String.substring()` just creates a new object with different start and end indexes for the same backing buffer.
A possible disadvantage of arena allocators is that everything is now tied to the lifetime of the arena, even though it might not be required that long. You'd need to practice a strategy like subarenas, RAII, garbage collection or reference counting within the request as well.
The occurrence and advantages of all these strategies unfortunately are heavily workload-dependent. Most strategies work well for 80% of the workload. Different workloads of course.
This is why I'm a strong advocate of hiring former game developers - they know what matters when it matters, not before, and not after. How? Pain and tears of being there, trying the obvious wrong way first, and then fixing it during production while kiddies shout at you. Former game devs are the Marines of software.
Muratori has demonstrated to me one time too many that he hasn't got a clue about the things he's talking about.
To be precise: He may or may not have a clue about some things he's talking about, but I don't know enough about them to form an opinion. But on more than one occasion he went on to talk (sometimes at great length) about things where I know he's mostly or completely wrong in what he said.
I don't have a way to know the scope of these things, the things he doesn't know or understand yet talks about. Maybe it's just when it comes to "systems programming". Maybe it's everything that isn't game design and programming. And maybe he's wrong about games too. As I said, I don't know enough to judge everything he says, but the things I was able to judge convinced me not to trust him and his opinions.
And since I can't set the scope safely, I have to not trust him regarding everything.
For example, while his post[1][2] on ETW being "the worst API ever made" is slightly amusing and has _some_ good points, it mostly demonstrates:
(a) Inability or unwillingness to read the documentation. It would have saved most of his problems. But let's say that to criticize the _design_ of an API we can disregard that for a moment.
(b) Inability or unwillingness to pick the right tool for the task or pure trolling.
(c) Worst of all: Complete lack of understanding of the goals, purposes, limitations, design goals and tradeoffs of the system.
You can't criticize the design as being too complicated, while offering an irrelevant alternative for a "glorified memcpy()" (his words), when you don't understand what it does or what it's supposed to do.
(I emphasize again: I'm not saying the API is a paragon API design and that all is perfect there. I am saying the criticism is both grossly exaggerated and demonstrates misunderstanding of what the API actually does.)
After a couple of those I don't trust him and you shouldn't either but you're welcome to.
> You can't criticize the design as being too complicated...
I hadn't finished reading the ETW rant; but I don't see anything wrong with it. All he seems to be saying is that for communicating with OS, ioctl-like all-in-one APIs are bad while epoll like APIs are better. Not really a controversial statement.
> After a couple of those I don't trust him
Who said anything about trust? I have made enough mistakes and learnt enough lessons, and it's nice to see other well-respected programmers (like Casey and Mike) come to the same conclusions like mine. To paraphrase your post: "You shouldn't repeat those mistakes but you're welcome to".
EDIT: Actually, after finishing reading the blog post, wow! I thought Linux's perf_event API was shit, but this ETW crap takes the cake. Well done, Microsoft; you had one job.
While I agree with the sentiment, my impression is that the GP's point is about memory safety rather than performance. So yes, this applies for common patterns like per-frame memory in games, in which case the "it" is the arena. Otherwise, as a general rule, profile first, then optimize.
I made a special-purpose HTTP server in C++. I had a function called request() and just allocated everything on the stack of that function. Only a few KB, e.g. a vector of the headers etc. Of course, after the function was done the memory was freed.
> You allocate a memory arena when the HTTP request comes. From that point on, every "allocation" is equivalent to bumping a pointer in that arena. If the arena gets full, we allocate another and chain it to the previous one, as a linked list.
Didn't you just describe regular memory management as applied by the OS at process scope? At the completion of the process, all memory allocated by the process is reclaimed.
On some systems this approach is combined with granular quotas to allow multi-process coexistence.
Shifting this sort of memory management over to application scope may yield benefits now, but ultimately it's an OS-level concern.
Asking the OS for memory is a low-level operation, which means it is not portable and should be done by a library or framework. Also, since it's a syscall (involving a context switch), you don't want to do it in performance-sensitive code.
Some applications might benefit from allocating huge pages. By default, the page size is on the order of 4 KB or 16 KB, and large numbers of pages result in significant overhead. Allocating fewer, larger pages on the order of MBs can help.
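If it helps, here is a Linux-specific sketch of asking for explicit huge pages (it assumes huge pages have been reserved on the system, e.g. via /proc/sys/vm/nr_hugepages; transparent huge pages via madvise are the less explicit alternative):

    #include <cstddef>
    #include <cstdio>
    #include <sys/mman.h>

    int main() {
        // 2 MB is a common huge-page size on x86-64.
        const std::size_t len = 2 * 1024 * 1024;
        void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
        if (p == MAP_FAILED) {
            std::perror("mmap(MAP_HUGETLB)");  // fails if no huge pages are reserved
            return 1;
        }
        // ... hand the region to an arena / bump allocator ...
        munmap(p, len);
    }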
Interestingly, it's exactly what the PHP memory model does: each request is a shared-nothing VM space, and nothing survives the end of the request, even if allocated memory objects don't get freed.
It took me a while to understand what C++ was: a quest for the highest level semantics possible with the lowest performance overhead. It is a quest started a long time ago with a lot of dead ends and circumvolutions.
The quest is still valid but really, the complexity that C++ has given birth to is not worth it anymore. Other languages restarted from a blank state and are probably better bets. I almost switched to D years ago, for a while it looked like Go was going to take over but finally Rust seems to be it. My current work forced me to get into rust, and I don't think I'll go back to C++ anytime soon.
It was a fun ride. So long and thanks for the fish.
That is an interesting take, I don’t really disagree (regarding the high-level semantics/lowest overhead). I do question the use of ‘highest’ as the qualifier for the semantic level, but certainly would support a claim that the goal was some indeterminate high with regards to the semantic level. Even at the inception of C++ as something different than C with classes, there were at least a few languages with a higher semantic level than where C++ is today. Although I do not think there was any then current strategy for making those languages as low overhead as C in practice. So as time has progressed and the cruft and complexity have built up I think they have now decided to find the highest level they can reach, while maintaining backwards compatibility and the low overhead requirements.
I am inclined to agree that a green field language is probably a better bet for achieving the goal, but I am not sold on any of the current contenders. So I still plod along doing my performance sensitive development in stripped down C++ and plug away at my pet language project.
I think there's something important here, potentially obscured by imprecise terminology.
The Rust safety invariants are, at heart, global properties. The central one is that mutable references are unique. Another way of saying that is: if you hold a mutable reference, then all other references held by all other objects in the system do not conflict with it (either because it is included in the "stacked borrow" or because it doesn't reference the data at all). That kind of property is ordinarily quite difficult to prove.
Rust does it though, by encoding these invariants in the types of functions. Most importantly, these types compose, meaning if you've got one function that respects these invariants calling into another, then the whole thing will also be sound. You can compile them separately and still be confident.
By contrast, achieving similar goals in the C family of languages does require a global analysis. The classic example is alias analysis. It's implemented in many compilers, tons of PhD ink has been spilled, but long story short it doesn't work. You get okay results sometimes on small programs[1], but as systems scale up, basically there always becomes a way for one pointer to alias another.
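A tiny illustration: compiled in isolation, nothing tells the compiler whether dst and src below can overlap, so it must either assume they might or emit runtime overlap checks before vectorizing; whether they actually overlap is a whole-program question.

    // Without whole-program knowledge, the compiler cannot prove that dst and
    // src never alias, so it cannot freely reorder these loads and stores.
    void scale(float* dst, const float* src, int n, float k) {
        for (int i = 0; i < n; ++i)
            dst[i] = src[i] * k;
    }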
Thus, the claim I would make is the following: it will not be practical to retrofit Rust-style borrow checking onto an existing unsafe language, because the type system has to be rich enough to express the types of invariants required. The Rust type system has been carefully crafted to be powerful enough, at considerable cost: the complexity of the type system is one of the biggest complaints about the language, and slow compile times, one of its correlates, is another.
I would also claim that trying to scale up static analysis (based on alias analysis and other similar global analysis techniques), while perhaps somewhat helpful, is not going to give results comparable to Rust's. In order to get anywhere near the same level of confidence that all possible safety problems have been caught, the false positive rate would be unacceptably high.
I think this is one of the enduring achievements of Rust, and one likely to be carried forward in future programming language designs, but extremely unlikely to be successfully retrofitted to existing languages.
The point of the borrow checker is to replace the global property 'all memory has an owner' with the (approximate, conservative) equivalent 'all code passes the borrow checker'.
By tracing ownership information through each component, and forbidding (or ignoring via `unsafe`) situations which cannot be traced in this way, the latter property can be decomposed and solved locally.
In other languages, like C++, we may still want this global property, but we can't break it down into local reasoning. Even if we decide on an approximate, conservative equivalent like 'all code passes STATIC ANALYSER X', getting local reasoning to work would require enough changes that it's arguable whether we're actually programming in the same language anymore. For example, we may need to add extra annotations to our code (which aren't present in existing codebases); we may need to extract extra information from sources (preventing the normal use of separately-compiled libraries); we (and our dependencies!) may need to avoid certain valid/legal patterns of code which the analyser can't handle; etc.
Writing the above, I'm reminded of trying to use the mypy type checker in Python!
> In other languages, like C++, we may still want this global property, but we can't break it down into local reasoning.
That's exactly it. You want some set of machine-checkable constraints which add up to the desired global properties. Rust managed to do that. Attempts to fix this in C++ yield a set of slightly leaky constraints which sort of almost do that. Fixing this requires taking things out of the language, which is unpopular.
It's embarrassing that the code below still compiles with default gcc options, in either C or C++ mode. Yes, it's terrible C++. The compiler allows it.
#include <stdio.h>
#include <string.h>

int main(int argc, char* argv[]) {
    char buf[20] = "\0";
    char* s = buf;
    for (int i = 0; i < argc; i++) {
        s = strcat(buf, argv[i]);  /* unchecked concatenation: overflows buf once the arguments exceed 19 bytes */
    }
    printf("%s\n", s);
}
(Even Microsoft has "strcat" deprecated by default.)
D's ownership/borrowing system does data flow analysis within functions, i.e. it's intra-function. Inter-function (global) is handled via the function signature.
I think the idea is more like "the entire program is covered by lifetime analysis / borrow checking, so that nothing gets missed." Something like that?
Data-flow analysis is what I assume the author meant. Fancy word for "reasoning about behavior at a level higher than just a syntax node" (usually within a function or module, possibly within an entire program though)
Both D's and Rust's compilers do it, and I'm uncertain whether Zig's does, but it may too.
That's not true. Global analysis is a synonym for interprocedural analysis, which crosses function boundaries. An intraprocedural analysis that crosses many basic blocks is still "local" rather than "global".
This slide deck is weird. I've taught 143 before. Where did this come from? Perhaps in really old stodgy terminology before there was any interprocedural anything people used the words this way? I really don't know any static analysis or compilers person who would consider "global analysis" to mean "intraprocedural analysis that considers more than one BB".
It's funny because I also remembered it as cross-function analysis, but when I went back to grab a link I saw this and thought maybe I'm mistaken somewhere, so I just used this definition. Oh well. Main thing I was just trying to convey was that it's technical terminology.
I don't know if it's actually wrong though, given that a procedure is just a block of code with one entry point (and let's say one return, for the sake of discussion). I'd have to jog my memory, but I'm thinking: if you can already optimize inside a function (but outside basic blocks), then I don't recall what would be so drastically different across functions. The fundamentally hard part does seem to be going from one basic block to multiple. I might be forgetting something though... do you recall?
The only explanation I can think of is that this is old terminology that has stuck around in course material based off older textbooks that aren't bothering with any sort of interprocedural anything.
Function boundaries do make things fundamentally more difficult than basic block boundaries, for a large number of reasons. You can no longer have single definitions of values and updates to those values are not global updates for the entire program. This is why you need stuff like context/object sensitivity for interprocedural analysis but it doesn't matter for local analysis. Graph structures also become way more chaotic, preventing the nice efficient lattice movement you see in classical local fixed point computation.
Like, static analysis and compilers is my job and none of my colleagues would use this term this way.
But clearly there is some material using it that way. Weird. I've been wrong before and I'll be wrong again in the future. So I'm happy to be wrong here. Clearly there are some situations where "global analysis" is used to describe whole-function analysis.
Interesting... now I'm thinking maybe I was confusing it with interprocedural analysis? I've definitely seen the distinction made between them before, though I'm not sure if I've seen them mentioned alongside local analysis in the same text. I guess if you want to have fun this week, go ask your colleagues what the difference between local, interprocedural, and global analysis is. See if they say the last two are synonymous. :-)
Regarding the difference for interprocedural analysis: I'm not entirely sure I follow it unfortunately. Let's say you can do everything within a function. Can't you just inline all of its callees (at whatever depth you want) and do your analysis/optimizations based off the result of that? Inlining itself is a rather trivial mechanical transformation, and after that everything is inside one function again, which you can already handle. The only real obstacle here seems to be recursion, but if you just treat that as any other opaque function call that you can't optimize across, you should otherwise get the rest of the way there, right? What am I missing?
As a preface, it is difficult to speak about true fundamental limitations of things like abstract interpretation because "print Top" is a valid algorithm that will work for all programs. It is just completely useless. So you can model function calls in trivial ways (just treat calls and returns as giant phi nodes and widen whenever you hit mutual recursion). But this tends to produce pretty poor results for real programs and your fixed point computation tends to take longer due to the shape of real program call graphs. Add in dynamic dispatch and you've got all sorts of fun (watch all your pointers in a java program get merged through the receiver to Object.equals(), for example).
In practice, these structural differences require new approaches. The field has done a really good job at intraprocedural analysis, solving a lot of really important problems many decades ago. How to do fixed point computation over SSAed CFGs is well understood, even when you've got heap relationships to think about. Interprocedural analysis still largely sucks. Even modern approaches like CFL-reachability for dataflow analysis produce a ton of garbage. This is one reason why I say it is just harder.
As you mention, recursion prevents you from inlining everything into one giant function. You do get "the rest of the way there" by inlining to some depth limit in the sense that inlining is a way of achieving context sensitive analysis (though it is generally not preferred). But you've still got big problems at the points of mutual recursion (either you need to widen badly or you need to actually do interprocedural analysis) and your program is also exponentially larger in pathological cases.
>My comment for too many years is that C/C++ fails to deal with three issues: "How big is it", "Who owns it", and "Who locks it".
Ada is pretty good at dealing with all three of these, TBH.
"How big is it" — Given by the 'Size attribute, and representation-clauses explicitly control record-layout.
"Who owns it" — The declaring entity, which is why you can have dynamically-sized arrays without heap-allocation and "allow the scope to clean things up".
"Who locks it" — a bit more convoluted than the above, but generally one of several options: (1) the Task / protected-object via entries; (2) the object itself, via controlled/limited_controlled inheritance; OR (3) the subprogram/compilation-unit via parameter-passing and/or interface-control (i.e. the only way to alter the interior-value is by some exported interface).
I think the biggest problem with C++ is the lack of explicit safety, which requires you to understand the internal mechanisms of how everything works in order to make use of almost anything, combined with extremely high levels of abstraction.
The high levels of abstraction are powerful, but quickly become footguns to the uninitiated (and the initiated alike at times).
> Figuring out how to do something safely can be quite difficult. It can involve solving puzzles, and often involves rewriting things at several levels, especially when threads are involved.
I think this gets to the core of really closing the gap. While the "insider" knowledge of C++ is really understanding the internal workings of these abstractions so you don't do something stupid with memory, the "insider" knowledge of Rust is knowing the specific patterns and routes that will work with the safety features of the language rather than against it for solving various problems.
I can still say for me I spend a lot of time 'exploring' Rust, having the compiler bitch at me, and tweaking things about until they work.
Sometimes it can be extremely frustrating, and given I have the insider knowledge of C++ with 20+ years experience, it's easier for me to just drop in and do exactly what I want to do (because I know it's safe... or at least I think it is ;) )
That being said, I'm very excited for languages like Rust, but there is still a gap of cost/benefit between it and C++ for the time being, at least for my professional use.
Well, that depends on the uninitiated. Those who come from a more "user-friendly" language, e.g. Java, are pretty much safe - they can continue with C++ and its Standard Library as if it was Java (as long as they respect the RAII principle). The C people, on the other hand, should expect to be bitten on the ass many times (try, for example, to use the pointer to an element of an std::vector after you have pushed some more elements into it!), and so you are right, those people are, unfortunately, better off knowing the internals.
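A minimal version of that std::vector pitfall, for anyone who hasn't hit it:

    #include <cstdio>
    #include <vector>

    int main() {
        std::vector<int> v;
        v.push_back(1);
        int* first = &v[0];           // fine for now
        for (int i = 2; i <= 100; ++i)
            v.push_back(i);           // growth reallocates the buffer...
        std::printf("%d\n", *first);  // ...so this reads freed memory: undefined behaviour
    }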
Every time somebody got lazy and used "unsafe" in Rust that's unavoidably annotated in the code. If the compiler can't prove it's safe and you don't label it "unsafe" it won't compile at all.
Even coming in years later, a maintenance programmer can identify this code is suspect, whereas these other parts are safe.
Whereas if you get lazy in C++ you can do whatever unsafe things you want, anywhere, and it leaves no trace.
Same here; lots of large projects I worked on didn't have any unsafe at all. It's a very niche feature, mostly for people doing very low-level stuff.
But I suspect this is survivorship bias. I only ever worked with very experienced developers when doing Rust. I'm pretty sure as soon as I start working with more inexperienced devs I'll start seeing a multitude of clever ways to bypass the borrow checker, similar to the cleverness I currently see them doing in other languages.
It's a feature that's necessary when dealing with FFI or when implementing containers. It's a tool, rather than a crutch to work around the borrow checker. In fact, unless one is using raw pointers everywhere, it's very impractical to use unsafe blocks to circumvent the borrow checker.
I was with you until the last sentence. I write Rust code at scale, with schedule and performance constraints. I’ve literally never had to use the word unsafe in my code.
If you can’t figure out how to do it without fucking up the borrow checker, that’s because you can’t figure it out, and you need to learn how to do your job. You’re never “forced” to hack anything.
Blaming “a whip of agile” is just bad taste. Don’t be bad. Like, wtf, you had a decent comment and then shat the bed right at the end.
Here's the point of view from someone who has only written 1 piece of production software with both C++ and Rust.
Trying to write C++ I was constantly fighting accidental memory copies. With Rust all of this was trivial and everything works as I expect. This is pretty much the reason I chose Rust over C++.
As a beginner, I have no idea how I would even begin to elegantly write immutable-data, parallel code in C++. With Rust it's just .iter() to .par_iter() (everything is immutable by default).
C++ package management is awful (it has none). C++ headers are annoying and pointless.
I know these are strong statements and of course only my opinion but I just see people claiming C++ is fine just having a strong case of Stockholm Syndrome.
Rust is nowhere near perfect. It can often look like line noise, the project folder takes multiple gigabytes even for small-ish projects, compilation can be slow, and the tooling is not always there. This is still multiple orders of magnitude better than trying to Google how CMake works step by step.
On the other hand, I love rust-analyzer + VS Code. It was super easy to get going with and just works. Visual Studio C++ seems to have a much steeper learning curve.
While I agree with most of what you wrote, CMake feels supercharged compared to the build system part of Cargo.
Cargo is failing hard on the integration front, both on integrating things, and being integrated. build.rs is a minimal substitute of a build system: "deal with it yourself". The way Cargo wants to have total control, on the other hand, makes it hard to integrate with existing projects: those which don't use Rust at the top level.
The only great thing about Cargo's build system is that it works well on the happy path of Rust-only, crates.io-only software.
I was a C++ developer in a past life and CMake was a big reason I moved on to greener pastures. It’s the only language that is more toilsome to use than C++ itself, and that by an enormous margin. It’s stringly typed (no, that’s not a typo, everything is a string), its syntax is obscure, it extends so poorly that it just bakes in support for building popular libraries (e.g., Google’s test framework, Qt, etc., iirc), imports are implicit so it’s tedious to track down the definition for a particular symbol, and it completely punts on package management—not only were builds not reproducible, but you couldn’t even get it to download and install dependencies for you from a declarative list. CMake isn’t a build system, it’s a toolkit for scripting your own bespoke build system (and a crummy one at that), which basically means that every project is a unique snowflake with its own distinct quirks which are tedious to learn and maintain—even though 99% of projects would be covered by something like cargo. Those are some of the things I remember off the top of my head ten years later (it was also dog slow, and things would break across minor releases, but I’m told those things have improved).
Cargo is imperfect, but it’s the right tool for the job 99% of the time.
CMake may be toilsome, but it is acceptable for simple projects due to its builtin dependency resolution, and powerful enough to do anything you want on the complex side.
Cargo is perfect 80% of the time in my usage, but when it's not perfect, it's almost actively harmful, and much worse than CMake. And I say that as no fan of CMake.
I would not be bothered by Cargo not being a build system, except it's the one build system underpinning the entire Rust ecosystem via crates.io, and its "rules" are not interoperable with other build systems.
As a result, you have to deal with Cargo's terrible build system whenever you want to use an external crate, whether you want it or not.
I don't disagree, I find cargo insufficient for projects of any meaningful complexity (it doesn't even support post build steps...). But it's really good at one thing: compiling crates.
But I haven't had a ton of trouble integrating it into CMake projects. It falls into the category of "know your tools." Not everything can "just work" all the time.
> But it's really good at one thing: compiling crates.
Unless you're using "compiling" to mean strictly compiling, and not "building", I don't agree either.
It falls flat on its face if you want a build time choice between dependency versions, for example. And build.rs means that Cargo washes its hands from compiling parts of crates that are not written in Rust, so it's arguably not good at compiling (of anything but pure Rust).
The way Cargo is integrating several concerns also makes it hard to create better build systems for Rust, because they would have to pull in the same kitchen sink in order to support Cargo.toml. So that's being bad at letting others compile as a bonus.
EDIT: Actually, that wouldn't be a problem if Cargo the decent package manager didn't mandate Cargo the awful build system.
As much as I agree that Rust needs a better story for builds and interacting with other languages, it sounds like you have a misunderstanding over what Cargo primarily does and what crates are. A crate is a single compilation unit of Rust. Cargo is a tool for compiling crates and pulling in other crates that it references.
build.rs is a half measure to include foreign symbols in compilation artifacts like static/shared libraries and executables. I'd go so far as to advise against using it for anything but specifying linker flags.
I don't think I've ever had a use case for specifying dependency versions at build time. That seems insane, and I do insane things in cmake with regularity. There's a reason versions are pinned to a config file committed to repos in almost every contemporary language.
fwiw, Cargo is a crate itself and you can use it as a library. You can even compile it with C language bindings to call through FFI in other build systems if you felt like it. The lang tools team has done a great job with keeping the scope of Cargo manageable and putting in the ground work to make better tooling around it.
For complex Rust builds, check out cargo-make. It does most of what you'd need in a predominantly Rust codebase. For polyglot environments, cmake with custom targets is the least bad way I've found to do it - and it's not hard to do that by shelling out to Cargo.
A crate may be a compilation unit, but it's irrelevant. Within the Rust space, a crate is a library. something like 95% of crates are using Cargo, and Cargo requires that dependencies are also using Cargo. Today it's impossible to ditch Cargo, and publish your crate with e.g. Bazel as the build system.
There's no misunderstanding that what Cargo does it build Cargo crates. The problem is that it doesn't allow for sanely built (so not using Cargo) crates.
> specifying dependency versions at build time
Packaging for different distributions, where different versions of a dependency are provided, is quite a common thing, and has justifications beyond technical reasons.
> the project folder takes multiple gigabytes even for small-ish projects
This is a workaround, but you can specify a global directory for all those artifacts with the env var CARGO_TARGET_DIR. This will deduplicate common dependency versions and lets you stick it on a temp disk / exclude it from backups.
I think a lot of those issues are more related to how bad C++ is rather than how amazing Rust is. I abandoned C++ several years ago for D and I can't imagine ever going back to C++. Similar issues: header files, no package management.
I don't know Modula-3 and Eiffel, but I do know some Ada. From my experience Rust, with the borrow checker, still brings a lot to the table compared to Ada. Although Ada has things that Rust doesn't have, too, like delta types, which are immensely useful in embedded programming, and SPARK.
Ideally Rust would adopt some of these, or Ada would adopt Rust's in its next standard.
Ada is adding borrow-checker-like capabilities to SPARK.
In fact this is what I consider Rust's biggest contribution to the computing world.
Even if Rust dies tomorrow and eventually fades away, it has brought Cyclone and ATS ideas to the masses, to the point that many languages have done, are in the process of doing, design decisions to integrate affine or linear types to some extent with their type systems.
I love C++, and I have to admit that it will need a subset language soon, but the examples given in this article wouldn’t possibly exist in well-checked code-bases, because you can immediately see the usage stinks just by looking at it.
I think every C++ is bad article can be summarised like this:
1. C++ has lots of features.
2. Let’s nonsensically combine these features to shoot ourselves in the foot.
3. Uh oh, C++ didn’t help us write good code. Hence, C++ sucks.
People have to accept that in real life there is a thing called code-reviews and senior engineers are supposed to prevent such badly written code.
> People have to accept that in real life there is a thing called code-reviews and senior engineers are supposed to prevent such badly written code.
Ah, yes, the "you're holding it wrong" argument, just like the article predicted. The problem with this is that your version of "real life" is significantly different from the real "real life", in which often no code-review is held, or the reviewer misses bugs.
> People have to accept that in real life there is a thing called code-reviews and senior engineers are supposed to prevent such badly written code.
Why should people accept that? There is overwhelming evidence that such reviews do not reliably prevent such problems from making it into production. In some cases, other languages exist that do reliably prevent the problem from ever making it into production because the problem is impossible by design.
We can and should consider from time to time whether the advantages offered by an old but established language are now outweighed by the advantages offered by a newer but better designed language. If that doesn't happen with increasing frequency as time passes, we have a serious problem as an industry.
I think it is fair to point out that it requires a comparatively high degree of skill and experience to use C++ well relative to many other languages. Most people do not use C++ well because doing so is quite difficult and requires a large investment of time. This is not helped by the fact that it has an enormous amount of legacy baggage that is technically valid code that no one should ever use — there is an entire anti-language you have to learn too. Its standard library has many flaws such that many experienced programmers eventually write and use their own alternatives to many parts of it. These are all legitimate hurdles and criticisms, it is neither a pretty nor easy language.
The major benefit of C++ is that with sufficient mastery you can do complex things strictly, safely, and concisely that are difficult-to-impossible in any other systems language due to its flexibility and metaprogramming facilities. Its flaws are very real, but so are its strengths.
As someone not very acquainted with C++ I was under the impression that the vast standard library was one of the great selling points of the language. What are some examples of defects that force programmers to rewrite parts of it?
depends where you come from. If you come from Python, Java, you'll find that there's a lack of support for networking, etc. and that it is not vast at all. After all we lack a W3CEndpointReferenceBuilder (https://docs.oracle.com/javase/10/docs/api/javax/xml/ws/wsad...).... not sure how it's possible to be productive without that.
If you come from C you'll think that there are allocations everywhere and judge it unfit for purpose (even if in my experience, C libraries and software tend to allocate much more than C++, and use worse datastructures like linked lists, etc. just because it's easier to code in C).
Finally, the C++ committee and standard library implementers refuse to break ABI / API, which means that:
* the C++ standard library is not able to follow the state of the art for e.g. hash maps, because unordered_map has strict requirements that the fastest hash maps (if you only care about speed of insertion / retrieval, which is, let's be honest, 99.5% of the use cases out there) do not satisfy.
* things like regex stay broken
The standard library is quite good, but is designed to be fairly general. Sometimes this means that you can do better with something you write yourself.
Let's not confuse this with skill, though. I think ego is the reason why this has been allowed to go on for this long...
The "skill" is mainly pointless memorization of a bunch of idioms. I've attended C++ training with committee members that have made C++ their life... and it was quite discouraging to see that even they do: Let's try this... oh, that didn't work. Let's try this... hmm, right. I know what this is, I've seen it before. Now, it works.
C++ was my passion after Turbo Pascal, I fought to use it instead of C on university assignments, was a TA in C++, did research at CERN in C++, and used it at several multinationals before migrating into managed languages.
Well-checked code-bases are something that I have hardly seen in real life, outside of conference talks about best practices.
Bugs in the article are trivially obvious, because they're in 3-line code examples, explicitly pointed out.
The problem is, the same bugs happen in actual large codebases, without the priming to look for this particular issue out of hundreds of possible issues.
It's a difference between an article saying "This is Waldo" (duuh!) and a "Where's Waldo?" game, where you don't even know how many Waldos are there.
These Rustic people write very dishonest articles. They write up all the flaws of C++ and then compare how bad C++ is. It really harms the Rust language itself, not C++. Professionals will dislike Rust because of its community, not the language. Rust people should put more effort into writing a formal specification, or else many people will consider Rust undefined. My comment is getting flagged and downvoted, and therefore I am posting it here.
To add, C++ has survived for many years as the de facto system-level programming language (C has too). It has survived good and bad shifts in software engineering.
The old "C++ is not going anywhere" argument applies.
C++ is not a great choice for small or transient projects. There are other languages that are a better investment for those projects.
But if you're writing an infrastructure-level application, that is expected to have a shelf life in decades, C++ (or C) is a pragmatic and rational choice.
UB-invoking dereference in std::optional is such a baffling design choice.
The whole point of an optional type is to prevent accidental unchecked access to the value. Sure, sometimes it's useful for performance to skip the check when it's already known to be safe from context, but such a dangerous optimization should have been hidden behind a method like `beware_of_the_nasal_demons()`, not an innocent-looking convenience syntax for flirting with UB, in a language that was supposed to be cleaning up the unsafety.
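For readers less familiar with the API, a minimal sketch of the distinction being complained about (behaviour as documented on cppreference): the terse operator* does no check at all, while value() and value_or() do.

    #include <iostream>
    #include <optional>

    int main() {
        std::optional<int> maybe;  // empty

        // int a = *maybe;         // compiles fine; undefined behaviour at runtime

        std::cout << maybe.value_or(0) << '\n';  // checked access with a fallback

        try {
            std::cout << maybe.value() << '\n';  // checked access: throws when empty
        } catch (const std::bad_optional_access&) {
            std::cout << "empty optional\n";
        }
    }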
> The whole point of an optional type is to prevent accidental unchecked access to the value.
You might argue it should be that, but it wasn't my impression that it is that. My impression was std::optional is just there to allow for representing the absence of a value. Ideally that should be zero-overhead, which means dereferencing it should degenerate into a normal object access. Hence the current design.
Do you mean that the purpose of std::optional is to be faster than a pointer (which could represent absence as nullptr)? I would have thought it would be ML-style safety, but it seems like you're right. I guess a tagged union (or std::pair<T, bool>) would be faster—if the std::pair is a local variable, fields of the wrapped object will be at fixed offsets from your stack pointer or frame pointer, so you save yourself an indexed register load the first time you access a field of the object. That seems to have been the intended use.
No, the purpose wasn't to be faster than a pointer. You can't even necessarily use a pointer where you can use std::optional; the value of std::optional is embedded in itself.
One purpose would be type safety (and things revolving around that). Another would be to let you pass around an optional object just as easily as the underlying object.
As to whether it's a good idea to have it at all, I'm not particularly fond of it personally either. I don't have specific issues with its dereferencing safety though.
Don't confuse type safety with memory safety though. Type safety != memory safety != thread safety != ...
Optionals are not zero-overhead in any other language. If you aren't going to bother checking the optional, why not just return any old pointer type? The type system should be used to enforce a contract, and bypassing checks on that feels like you are just spitting in the face of the type checker.
> Optionals are not zero-overhead in any other language.
That's been true for so many other features in C++ too.
> If you aren't going to bother checking the optional, why not just return any old pointer type?
Pointers require something to point to. Optionals can embed the value.
> The type system should be used to enforce a contract, and bypassing checks on that feels like you are just spitting in the face of the type checker.
It's bypassing something alright, but I wouldn't say that's the type checker. And I mean, you could use this logic for everything else. Iterators, arrays, etc. should all have bounds checking too, right? If you want that, there's C#...
> That's been true for so many other features in C++ too.
Plenty of C++ features are zero-cost, but not necessarily zero-overhead. To me, having an optional without a check is like obtaining a shared_ptr without incrementing its use_count. It's like arguing that I should increment use_count manually so that copies are "zero-overhead". Sure it's "faster", but the design is still broken and will lead to all sorts of issues.
> Pointers require something to point to. Optionals can embed the value.
So? Just because it embeds a value rather than a pointer doesn't make your program any more correct. If the API designer returned an empty optional and you access it anyways you are still dealing with garbage data.
> And I mean, you could use this logic for everything else. Iterators, arrays, etc. should all have bounds checking too, right? If you want that, there's C#...
Ok? Then let's remove all overhead from the language. Why does shared_ptr increment its refcount behind my back? No more shared_ptr. This kind of argument without nuance gets you nowhere. Likewise, unlike bounds checks, nothing about std::optional is forced or baked into the language. If you don't want to do null checking, then just don't use std::optional: the fast option is still there, and unlike Rust, the fast option is the default.
Optionals are a great tool for eliminating null pointers from a codebase. That's a big enough problem that I would expect an optional type designed in 2018 to get it right. Allowing unchecked dereferencing is an oversight.
> And I mean, you could use this logic for everything else. Iterators, arrays, etc. should all have bounds checking too, right? If you want that, there's C#...
I mean yes, the logical conclusion is that trying to backport safety onto C++ is impossible and moving to another language is the only reasonable option. That's kind of the point of the article.
> I mean yes, the logical conclusion is that trying to backport safety onto C++ is impossible
The logical conclusion is that what some people here seem to want is so radically different from C++ that they really just want a different language altogether... which is fine. Nothing about this implies it's impossible to shoehorn C++ into something else (though that may still be true; I'm not sure). It just implies that another language should be considered when you want to prioritize memory safety, like maybe C# or Rust. (And nowhere here am I agreeing or disagreeing with the article.)
> Optionals are not zero-overhead in any other language.
This is a common misconception, and a pet peeve of mine. Properly designed optional types add neither space nor code-size overhead.
Consider the C function:
void foo(int* ptr) { ... }
If it is part of the API of foo that `ptr` may be null, then foo must not dereference `ptr` without verifying that it is non-null. So the body of foo must be something like:
void foo(int* ptr) {
    if (ptr != NULL) {
        ...
    }
}
Or, take the non-pointer case:
void foo(struct S s, bool s_present) { ... }
Clearly the intent of this function is that the contents of `s` should not be accessed if `s_present` is false, so the implementation of the function must check `s_present` before inspecting `s`. (Inspecting s incorrectly wouldn't necessarily be UB, like in the previous example, but it's clearly erroneous)
The only thing that a good optional type implementation does is make it a type error to fail to do what everyone agrees both of those functions must do anyway in order to be correct. It need not increase the size of the representations, nor must it emit a single unnecessary instruction when compiled. There are plenty of languages that do this, Rust being a prominent example.
> If it is part of the API of foo that `ptr` may be null, then foo must not dereference `ptr` without verifying that it is non-null. So the body of foo must be something like:
void foo(int* ptr) {
    if (ptr != NULL) {
        ...
    }
}
Then what?

void foo(int* ptr) {
    if (ptr != NULL) {
        *ptr += 1;                   // optional<T>::operator* would introduce a branch here?
        *ptr = *ptr / 2;             // same
        if (*ptr > some_constant) {  // same
            ...                      // etc etc
        }
    }
}
And compiler optimizations can't always be assumed; for instance, debug-mode (-O0) performance does matter.
But this declares a new variable - I find this super messy, and it really decreases readability in practice, since it's no longer obvious that what you're working with was a function parameter. It also adds a scope level. I much prefer the
if (!bla)
    return;
// use bla
early-return style. So that's really a no-go for me, from years of comparing the two styles.
Well, if you have aesthetic objections to the way Rust does it, I can't argue with you. Kotlin does it the way you like, though.
Anyway, I started in on this thread because I was objecting to the claim that optional types always have overhead. They don't. That's all I wanted to show.
> Well if you have aesthetic objections to the way rust does it,
I agree with GP that it is harder to maintain ("messy", they say) if there is more than one variable referring to the same value. It is not as subjective as you make it out to be.
The idiomatic way to solve that in rust is to re-bind to the same variable name.
if let Some(foo) = foo { /* ... */ }
That's possible because in Rust name shadowing
let foo = grab_foo_bytes();
let foo = parse_foo_bytes(foo);
makes the previous binding of the variable no longer namable and thus no longer accessible, but doesn't drop it (and trigger RAII destructors).
Now someone will probably come in and say "oh no, this isn't exactly like C, how will anyone ever understand it". To that I reply: why is it that C users get to say "if you don't know how C works exactly, you're holding it wrong", and then comment about other languages "I don't want to have to learn anything to hold it right"?
> The idiomatic way to solve that in rust is to re-bind to the same variable name.
OK, that's reasonable. Is the idiomatic way to use optionals to introduce a layer of nesting? I prefer keeping functions very "flat"-looking. It sounds like Rust's optionals will give people an excuse to create labyrinthine functions where I'm constantly scrolling around to remind myself of what level of nesting I'm at and whether I'm in a loop or not, etc.
As you say, you don't need a new block to shadow a previous var, so hopefully that style catches on.
> "oh no, this isn't exactly like c, how will anyone ever understand it"
Not very persuasive, sure, but the network effect of the C/C++ culture (including its general syntax and imperative nature) is a strength in and of itself. New languages would do well to coddle the existing C++, Java, et al. users wherever it doesn't contradict the language's central mission.
I definitely agree. One way I do that is by having an internal function that takes a valid value and a public function that does the validating/error handling.
That doesn't always make sense though. There's a few other idiomatic ways to avoid nesting. Since statements evaluate to values, you can write
let foo = if let Some(foo) = foo {
    foo
} else {
    // Something that either evaluates to the same type as foo or returns early
};
That's so common there's a special operator for it, ?. It essentially either early returns the sad path or evaluates to the happy path.
fn get_foo() -> Option<Foo>;

fn frob() -> Option<Bar> {
    let foo = get_foo()?;
    let bar = convert_to_bar(foo);
    Some(bar)
}
I prefer to use Result instead of Option to model missing-data-like cases because it composes better. So that might be:
fn get_foo() -> Option<Foo>;

fn frob() -> Result<Bar, BarNotFound> {
    let foo = get_foo().ok_or(BarNotFound)?;
    let bar = convert_to_bar(foo);
    Ok(bar)
}
#[derive(Debug, thiserror::Error)]
#[error("Bar not found")]
struct BarNotFound;
That last bit uses a stdlib macro and a very commonly used external lib macro to save a few lines of repetitive typing.
Edit: Also ? doesn't special case Result and Option. You can make your own type conform to the interface (trait) it requires. That would probably be weird though.
{
    int x = 42;
    {
        int x = x + 1; // x + 1 refers to this second x
        ...
    }
}
This is because the scope of the identifier being declared already starts at the =. So even if the redeclaration were allowed without opening a new block scope, it wouldn't work.
However, there is a good reason for that: initializers can be self-referential, so they have to have their own identifier in scope:
// define circular structure in one step, no assignments:
struct node n = { .next = &n, .prev = &n };
In this regard, the scoping rule is like letrec in Scheme or labels in Common Lisp.
Easily dealt with by a linter. C++ is not fundamentally a safe language. It's fundamentally a no-cost abstraction language. It's not a baffling design choice if you know C++.
> not an innocent-looking convenience syntax for flirting with the UB
I agree, but C++ uses this pattern in so many places that I think it is less confusing to be consistent. When I see a nice, concise syntax or function, I check cppreference for undefined behavior...
> The whole point of an optional type is to prevent accidental unchecked access to the value
No? It is to model the idea of "a value is there or is not there", definitely nothing more. If replacing a T* t by an optional<T>& incurs a meaningful performance cost (like a branch), then optional just won't be used.
boost::optional was designed to mimic pointers. Before boost::optional, some programmers would return a T* as a caveman's optional<T>. Boost wanted to preserve the syntax of *, ->, and operator bool. Dereferencing a null ptr is UB, so operator* for optional was the same.
I agree it would be nice if you could get an assert in operator*. But you can already fire up a debugger or sanitize build and get a nice error message anyway, so it's nbd.
> it would be nice if you could get an assert in operator*
If, as you wrote, operator* for an optional that doesn’t contain a value is UB, it can do whatever it wants, including assert.
I think the problem is that there’s an implicit requirement that operator* on an optional that does contain a value is as fast as a pointer dereference.
(Aside: reading https://en.cppreference.com/w/cpp/utility/optional, I wonder how one can misuse “When an object of type optional<T> is contextually converted to bool, the conversion returns true if the object contains a value and false if it does not contain a value.” to write obfuscated code or hide back doors using optional<bool>)
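For what it's worth, a small sketch of the trap that aside is hinting at (my own illustration): the contextual conversion answers "is a value present?", not "is the stored bool true?", which is easy to misread in review.

    #include <iostream>
    #include <optional>

    int main() {
        std::optional<bool> flag = false;  // engaged, holding the value false

        if (flag) {
            // Taken! The conversion only checks has_value(); to test the stored
            // bool you need *flag or flag.value().
            std::cout << std::boolalpha << *flag << '\n';  // prints: false
        }
    }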
As far as I'm concerned, the purpose of std::optional is so you don't have to allocate memory on the heap and then check for a null pointer. I don't want it to throw exceptions, just like I don't want the language to check the validity of a pointer every time I want to dereference it.
> There are significant challenges to migrating existing, large, C and C++ codebases to a different language – no one can deny this. Nonetheless, the question simply must be how we can accomplish it, rather than if we should try.
Um... no. Not every (or even most) existing large C/C++ codebase should be migrated to another language.
"But security vulnerabilities! And crashes!" I admit all that. But a program can be useful and therefore valuable, even if it crashes at times. Rewriting the program adds value only by removing crashes and exploitability. In many cases, that's effort that could be spent in more valuable ways.
This amounts to saying that we should accept codebases continuing to contain exploitable vulnerabilities indefinitely. Perhaps for codebases that have a finite expiry date that's tolerable, but for a codebase that's expected to be maintained indefinitely I don't see how it can possibly be worthwhile - a rewrite will be a one-time cost, whereas exploitation is an ongoing cost that will surely exceed the one-time cost eventually.
> This amounts to saying that we should accept codebases continuing to contain exploitable vulnerabilities indefinitely.
Right, why shouldn't we?
I have a CLI application for storing TODOs. It never accesses the network, I don't really care if it crashes, and it does not have an expiry date.
I have a constant amount of time to work on it. To me, it is more valuable to put that time into new features that save me a couple of seconds here and there as a daily user than to put months or a year of the time allocated for working on the app into rewriting it in a different language to remove the occasional monthly crash I don't care about.
For me, the decision is a no brainer: my time is better spent in the stuff that adds more value, and "avoiding exploitable vulnerabilities" is not it.
I suppose that this is the situation for many apps.
You are claiming that this is wrong. Prove your claim.
If it's a personal app that you're not going to share with others then that app does have a finite expiry, admittedly in a slightly morbid way. I'd submit that an app with a truly stable userbase requires an impossible level of fine tuning - in reality an app is either growing or shrinking.
A program can fail in exceedingly many ways. It is basically impossible to formally verify a program. It’s great to start a new project in a “safer” language, but porting to another language is a different thing.
So for example, let’s take SQLite. It is written in C, but it has an insane amount of tests. Would it benefit anyone to rewrite it in Rust? It will definitely be much more buggy for a long time.
> So for example, let’s take SQLite. It is written in C, but it has an insane amount of tests. Would it benefit anyone to rewrite it in Rust? It will definitely be much more buggy for a long time.
I bet it wouldn't be, actually. In my experience porting between languages is much easier and safer than people tend to think. Meanwhile even with all their tests (which certainly have a maintenance cost) SQLite has been known to have memory safety bugs.
An additional issue is that some sophisticated C++ doesn’t always translate easily into other languages. It isn’t just a fairly direct reimplementation but a legit redesign. That will be a bug factory, especially for the kinds of codes that tend to be difficult to translate, as proving equivalence won’t be trivial.
I'd submit that the kind of code that's difficult to translate - that is, code where it's not clear where the responsibility for the lifecycle of a given piece of memory lies - is already a bug factory.
Not at all. Modern C++ can express some memory safety and lifetime models simply and elegantly that are difficult to express in other systems languages. It doesn’t define one for you by default but it also doesn’t limit you to a single model that is clearly inappropriate for some important systems code.
The bug factory, in many cases, is a consequence of having no way to properly express lifetimes in languages that only support a single lifetime model (or no lifetime at all in the case of C), therefore requiring unsafe hacks and workarounds. If, for example, your entire address space is accessed via DMA then Rust’s memory lifecycle model breaks, and this is a canonical design characteristic of all high-performance database engines. You can trivially design C++ constructs that automagically handle lifetimes under these constraints; in other systems languages you have to do a lot of fiddly manual resource management in unsafe code blocks.
The idea that a single memory lifecycle model is appropriate or optimal for all systems applications is objectively wrong. C++ doesn’t implement other formally verifiable safety models but it provides the tools required to elegantly build applications using them and largely hide making them safe.
This is one of the well-known strengths of modern C++: the ability to implement many different formally verifiable memory safety and concurrency models as first-class constructs. Not every application needs it but some, certainly everything I work on, definitely do. I don’t disagree with the objective — I highly value the ability of the compiler to ensure that my code is safe.
Yet Microsoft and Google, despite their C++ investment into compilers and ISO seats, are also investing into hardware memory tagging, forcing static analysers down developer throats no matter what, while slowly adopting other AOT compiled languages on their products.
Because while Modern C++ does indeed improve the memory safety and lifetime models, a large majority of the C++ community doesn't care about modern C++ features and has even started the Orthodox C++ campaign.
Not every program has such a well-defined life cycle that it fits into Rust’s memory model. There was a great post on why the Wayland library’s Rust implementation was abandoned. There was basically no benefit to Rust’s memory model there over C.
For new code, absolutely. But rewrites, especially when the new language can’t necessarily give huge safety guarantees as in this case are very bug-prone.
You haven't known suffering until you've tried to rewrite a large C++ codebase in Java. No clear ownership, you say? All those members that clearly belong to one object suddenly have to be guarded against accidentally having their references shared, all that clear math code turns into an indecipherable mess, and you can forget about lifecycles unless you guard every resource with a try block. Sadly, I end up writing Java code from time to time because I can write passable Swing UIs faster than I can set up Qt.
Quake III appears to be C code. The argument for (automatically) rewriting C or nearly-C to Rust is stronger than for C++.
Curiously, they seem to have found just one, inconsequential, memory usage error in the entire, large C program. This calls into question the frequently repeated assertion that there is no substantial and correct C code.
Yeah, and clang-tidy gives another; I just typed the first example and got both -Wdangling-gsl and "std::basic_string_view outlives its value [bugprone-dangling-handle]" from clang-tidy (the pattern is sketched below).
It doesn't invalidate the point of the article, that these things perhaps should be easier to avoid, or impossible to express. You can write safe-ish C++ with high warning settings and enough linters and static analysis tools backing you, but it's not an ideal experience.
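For context, a minimal sketch of the classic dangling-view pattern such warnings are designed to catch (not necessarily the article's exact code): a std::string_view bound to a temporary std::string outlives the buffer it points into.

    #include <iostream>
    #include <string>
    #include <string_view>

    std::string make_greeting() { return "hello, world"; }

    int main() {
        // The temporary std::string dies at the end of this statement, so the
        // view dangles; clang reports -Wdangling-gsl on this line.
        std::string_view greeting = make_greeting();
        std::cout << greeting << '\n';  // undefined behaviour
    }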
I changed teams at work (Google) to one that uses a C++ codebase that has survived for 20 years. As someone writing C++ for the first time, clang-tidy has been huge. It stops so many screw ups before I get to the code reviewer.
I've not found C++ as bad as I feared, partly because Google's style guide limits the amount of silliness you can do, partly because I have avoided memory issues by never allocating on the heap with raw `new` and using smart pointers instead, but mostly because Clang and clang-tidy have helped me avoid all the footguns.
I wouldn't write a new project in C++ (especially outside Google, where the package management ecosystem is non-existent), but I'm no longer of the opinion that I would never join a team just because their current codebase was written in C++.
Those "C++ is bad, complex, unsafe, etc." articles are starting to get boring.
If one wants to shoot themselves in the foot, that is fine. C++ offers countless possibilities.
It (and a plethora of libraries) also offers a quick way to write sophisticated and performant applications without much fuss.
Make your choice. I personally use C++ to great advantage and find it very productive and safe. And while I am a good programmer / designer, I am not a C++ expert. Far from it.
Or use language of your choice. Nothing is wrong with it. We do not have to live in the world where "there can be only one".
In a way, I agree. To me, C++ is an absolute train wreck of a language and choosing it for a new project borders on malpractice.
But if people want to use it and it doesn't affect me, there's a limit to how much energy I'm willing to spend trying to talk them out of it... especially if they are a potential competitor, in which case I might nod encouragingly when I hear they're using it.
> To me, C++ is an absolute train wreck of a language and choosing it for a new project borders on malpractice.
Personally, I tend to think exactly the opposite. Choosing a brand-new hype language because "it's shiny and fun" for new projects that will have to be around for 20 years is just a sign of immaturity, and a malpractice.
There is no guarantee your shiny language will be alive, or even supported, on my next-gen platform. Whereas for sure the good old safe set: C++, C, JS, Python, Java will be there and alive, even 20 years from now.
And as a result you will probably struggle and spend more time getting your nice shiny Rust code running correctly on iPhone/Android 20/NG-Cloud than you will ever spend debugging a damn core file in C++.
To comment ironically on your post: it is rather "zealots" and "evangelists" like you that I personally refuse to talk to. These people are often more interested in playing with the latest fancy tech available than in producing anything useful and sustainable in their work.
> choosing it for a new project borders on malpractice
Did it multiple times recently and I’m fairly confident in my choice. Here’s the main reasons.
1. Interoperability. If you’re writing a web service which only needs TCP sockets and local files, standard libraries of all modern languages get you covered. However, many desktop applications need to consume large C or C++ APIs implemented by operating systems. Maintaining FFI wrappers is expensive in the long run.
2. Library ecosystem. For HPC, only Fortran has a comparable one; for game development, only C# does. The rest of the languages aren’t even close in these areas.
3. SIMD intrinsics are awesome for performance. They slowly appear in other languages, but so far, the support in C and C++ is just better. Probably because the support is first-party, by Intel and ARM.
4. Tooling is good. I use debugger, CPU and GPU profilers almost every day.
And I will nod encouragingly, knowing that you will spend hundreds of times as many hours waiting on builds as I will spend finding and fixing any bugs that using your compiler might have helped avoid; and knowing that you will find overwhelmingly fewer experienced coders available to help when you need them.
In the past decade, I have spent more time on filing compiler bug reports than I have on tracking down and fixing memory usage errors. Rust does not solve a problem I have. But its Node.js-like dependency milieu worries me.
That said, I wish you good fortune with your choice. I do not doubt you will find it. But if you do, it will be a result of your work, not your choice of language.
It's interesting that one of the things C++ got right was having value semantics by default and most of these "problems" are the result of using either references or types that behave like dumb references/pointers such as std::string_view.
The deliberate omission of std::span::at() is annoying. The paper that introduced std::span was titled "span: bounds-safe views for sequences of objects," but there is actually no bounds checking in the standard. You only get it with debug builds that also use slow, debug versions of the STL, which have almost negligible value compared to just using sanitizers.
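To illustrate the asymmetry (my own sketch, as of C++20): vector::at is range-checked, while span offers only the unchecked operator[].

    #include <iostream>
    #include <span>
    #include <stdexcept>
    #include <vector>

    int main() {
        std::vector<int> v{1, 2, 3};
        std::span<int> s(v);

        try {
            std::cout << v.at(10) << '\n';  // vector::at throws on a bad index
        } catch (const std::out_of_range&) {
            std::cout << "vector::at caught the bad index\n";
        }

        // std::cout << s[10] << '\n';      // span::operator[] is unchecked: UB,
                                            // and there is no s.at(10) to reach for
    }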
D is a garbage-collected language. I've spent the past weekend writing some toy programs in D, and it was a very pleasant experience, mainly because of the GC: I didn't have to think about allocating arrays or strings, I could just use them. There are ways to circumvent the GC, like RefCounted, but the baseline is different in D and C++.
Many of the comments here are focusing on Rust as the obvious alternative, but it's great that the article twice also mentions Swift and Rust together as the obvious choices. If Swift ever gets more adoption outside Apple's platforms, it will be a game-changer. It's a very elegant language with much stronger typing than C++. It feels like writing in a scripting language but retains the type and memory safety.
I moved away from full C++ development in 2006, yet C++ is still the tool I reach for in native code, because many libraries are only available in C++ or C, and I definitely will not be writing C unless obliged to do so.
Microsoft had a project to move Windows mostly onto .NET, which was sabotaged by their Windows team; a decade later they rebooted the ideas using COM instead, an idea that has been a commercial failure, while everyone just keeps migrating to .NET.
Microsoft's security team pushed a best-practices paper saying that new projects should use:
1. .NET
2. Rust
3. C++ with Core Guidelines, with Visual C++ analysers turned on
4. Security annotations for C and C++ code (SAL)
Google castrates the use of C++ for Android app developers, limiting it to writing native libraries to be consumed by Java and Kotlin; it is behind the effort to adopt hardware memory tagging alongside ARM, is the reason the Linux kernel no longer uses VLAs, and is the big pusher for Rust in the Linux kernel.
When you're writing systems software you need systems software engineers and most of them are far more competent in C/C++ than, say, Rust, Ada, Go, etc.
We choose C/C++ because it's what we know and what our colleagues know, but I think a lot of us wish there was a better alternative.
Not to take away anything from your post, but Go is not a systems language. No language with a mandatory garbage collector can make that claim since it makes some systems code effectively impossible to implement.
I agree, but some people don't. When I interviewed at Google a couple of years ago, they were rewriting the Fuchsia network stack in Rust. It was written in Go at the time. My jaw about hit the floor when I heard that. I'm guessing they realized it was a mistake, but then again maybe the Go version was just a temporary placeholder. IDK, didn't get the job.
Rust is an entirely adequate systems language for most purposes. I think it could probably replace C for almost all purposes except in cases where extreme portability is required, which C excels at.
C++ is really only the answer if you need extreme performance and/or expressiveness out of a systems language. Some code, like database engines, really benefits from that.
Uh, if you think that's bad, I once worked at a company that made communications gear for first responders and the military. Most of their stack was written in Python 2. I'm not talking UI, I'm talking systems software level stuff. Python tied together with dbus on top of a shake-n-bake Yocto distro that was out of date. People's lives depended on that tower of crap.
It isn’t an “anti-GC crowd”; it is based in pretty solid theoretical computer science with large amounts of empirical evidence behind it. GC-based environments are incompatible with schedule-based safety, optimization, etc., which are major optimizations and design elements in modern systems.
No one has ever demonstrated a systems architecture that can outperform a state-of-the-art schedule-optimized design. This result is expected in theory. There are several optimality theorems, treated as a soft limit, that only hold true if you don’t control the schedule. The requirements of GCs guarantee you don’t control the schedule. State-of-the-art system designs, in non-GC environments, aren’t limited by those theorems and frankly run circles around GC-based systems. I work with a lot of companies that run nothing but managed languages and even they don’t believe that produces an optimal system, just an adequate one for non-intensive use cases. And that is a legitimate position.
You clearly are deeply invested in the superiority of GCs for all use cases. That’s fine. I make a lot of money replacing them with empirically much higher performance systems. There isn’t a lot of computer science to support the pro-GC position if performance is the objective. And to be clear, the loss of performance is integer factor, not something that can be trivially dismissed.
And I make money replacing C++ systems with ones written in AOT/JIT managed languages, with different kinds of automatic memory management.
The last time I did full stack C++ development was in 2006, nowadays its use is constrained to a couple of unavoidable native libraries, or GPGPU shaders.
Not everything can be proved in practice when management prefers to give money to the ones that sabotage projects like Windows Dev team has done to Longhorn and Midori efforts.
Are you aware that for quite some time any of your Bing searches using Asian servers were powered by Midori?
Thankfully this is a problem that eventually will get fixed by generational evolution, pity I won't be around to fully witness it.
I agree with you that a GCd language, almost by definition can’t be a low-level language.
I’m not familiar with schedule-optimized design though, could you expand a bit on it?
But I assume it can’t easily be used with routinely changing design requirements and non-obvious object lifetimes, like most business applications and CRUD apps, which are the primary use case of high-level languages.
Each of those is also using and promoting Rust now as an alternative. Security is a major driver for that as each of those has been dealing with security issues in their C++ code bases regularly. They won't stop using C++ overnight of course but they are vastly less likely to use it for new things and are actively replacing C++ components with Rust equivalents at this point as well. In Google, which developed Go for the same reason, Rust is also competing with that.
Mission critical software projects, by their nature are started rarely and make very conservative language choices. By its nature, much of it is old, so it's in C++. There's also a fair amount of it in C# and Java, too.
The problem I see with the author's very narrow point of view is that Rust, C++, and C are just tools, which can all be misused and produce defects when used improperly or without a good understanding. A Rust-style borrow checker could be implemented in a C++ compiler over a subset of the language, which is ironically also what Rust does: the unsafe parts of Rust are still unsafe... Already, C++ compilers and static analyzers (thank you, LLVM) have gotten surprisingly good at detecting memory-unsafety issues (the author's string_view dangling reference, for instance). To really solve a broader class of issues (including logical issues), in 2021 I am frankly far more excited by languages with advanced type systems (for instance Idris) than by Rust.
Agreed! It's a tough world out there in the fight for relevance. Besides Go and Rust, the recent languages that more or less clamor to compete with C++ include Swift, Nim, Zig, D, etc., and honestly I don't know if Rust or any of those will save us from human error or incompetence.
Is there a safe subset of C++ defined somewhere? Can I turn off some of these ridiculous "features"? As we add so much to the language, the number of footguns increases quadratically.
C is still a widely used and beautiful language. It is "safe," in the sense that the machine does more or less exactly what you ask it to. Allocate memory when you need it, and free it when you consume it / are done with it.
You do not have to use these features if you do not want to.
C++ is very powerful, and has an amazing world of libraries available, but sometimes I feel like it is at least two languages at the same time.
For example, one benefit of using C++ over C is writing C-style programs with modern libraries and convenient containers from the STL, but you quickly learn that C-style programming isn't supported across the board, and hitting that wall can be jarring and unexpected (looking at you, OpenCV: just try to allocate space for a struct containing a cv::Mat; a sketch of the general problem follows below).
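A sketch of the general problem, using a hypothetical Record with a std::string member as a stand-in for a struct containing a cv::Mat: malloc hands back raw bytes and never runs constructors, so the C-style allocation pattern breaks as soon as a member is non-trivial.

    #include <cstdlib>
    #include <new>
    #include <string>

    struct Record {
        int id;
        std::string name;  // non-trivial member: its constructor must run
    };

    int main() {
        // C-style allocation: raw memory, no constructors. Touching r->name
        // right now would be undefined behaviour.
        Record* r = static_cast<Record*>(std::malloc(sizeof(Record)));

        new (r) Record{42, "ok"};  // placement new runs the constructors...
        r->~Record();              // ...and you must remember the matching destructor
        std::free(r);
    }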
I sympathize that compilers do not do "enough" of this for us. But there's really no substitute for a good code standard and developer culture focused on quality.
Unfortunately C++ inherited the performance-focused culture from C and sees such tooling as needless bloat, only required for developers that need an extra hand while coding.
If you check other well-known surveys among the C++ community, you will see similar results for static analysis tooling and coding standards adoption.
So while, contrary to the C community, WG21 members do strive to push code quality forward and publish tooling that helps to enforce it, the adoption could be much higher than it actually is in the field.
I used to write a lot of C++ (like 14 years ago at this point). I haven't really paid any attention to it in that time. I am looking at the code examples and what I see is a language trying to adopt features from Java, Rust and others - but where in those other respective languages there's like just 1 thing going on in a single line of code. In these C++ examples you have the namespacing stuff, templates/generics, implicit operators and constructor calls. It just feels like exactly what you would expect of a language like C++ to try to adopt the native features of these other languages. Which isn't a bad thing, but it just screams to me like wouldn't it just be easier to switch languages?
That all being said, I do get the "rush" C/C++ programmers feel when they write multi-threaded socket servers. You step back and go whoa - that worked?
It is, in fact, never a surprise when a modern C++ program runs the first time without faults. That is an important difference from C, and a quality it shares with Rust.
What is missed by the authors of all articles of this kind is that, while all of the errors cited are possible, none are tempting.
For example, while it is easy enough to overindex an std::vector<T>, the language provides a "for" statement that exactly walks the vector with no possibility of overrun.
Using an empty std::optional<T> may seem like a danger, but nobody uses one unaware that it might be empty; it needs different syntax to look into than a regular value.
std::string_view and span, similarly, may seem foolishly risky, but they are always safe when passed down a call chain. One does not find them returned from functions in a responsibly designed codebase.
C++ is a box of sharp tools that a responsible engineer can use to make fine, reliable software. We have, today, many, many times more responsible engineers writing good C++ code than we have Rust coders in total.
Work making C++ code safe and reliable thus has a much bigger impact on the world than anything ever done with Rust. That will be true for a long time to come. That is not a reason to avoid Rust, but it is a reason to remain respectful in interaction with those engineers.
Modern C++ is extremely powerful, more so than any other systems language, but not simple. I’m not even a fan of the language per se, but there aren’t any alternatives currently that are as powerful or expressive and I actually use that power pretty regularly. You can’t express modern C++ in other languages with similar code gen in remotely similar lines of code, which is its unique strength.
In my experience, if you use the full idiomatic toolbox of modern C++, code mostly works the first time. That was never the case when I was writing C, or even legacy C++ (which was terrible).
I did a lot of C++ pre-C++11, and then jumped back straight into C++17, and I confirm the experience of other commenters - in modern C++, "it works the first time" is normal. It wasn't in the old C++, usually when my code didn't crash after compilation, it was a sign there was a gnarly bug hidden :). But the improvements over the past ~15 years made a huge difference.
That said, even though modern C++ is a safety razor, you're still playing with a sharp object. Most of the footguns of yore are still there. Compilers got better at telling you when you're doing something stupid (I can't imagine writing C++ without turning on almost all available warnings), but if you stray away from (or abuse) the modern components, you'll have a bad day. Usually around unintentional dangling references.
The other day I did a quick refactor and managed to crash the app. The reason? I used a unique_ptr after moving from it. Compilers and linters usually catch such dumb mistakes, except in this one case they didn't (neither MSVC, nor Clang, nor clang-tidy). So you still end up overcompensating with tests, hedging against the edge cases of your tooling. But it's much, much better than it was before.
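A minimal sketch of that kind of mistake (not the commenter's actual code): the moved-from unique_ptr is left null, the code still compiles, and tooling does not reliably flag the later use.

    #include <cstdio>
    #include <memory>

    void consume(std::unique_ptr<int> p) { std::printf("%d\n", *p); }

    int main() {
        auto p = std::make_unique<int>(42);
        consume(std::move(p));

        // p is now empty. The next line compiles cleanly and is not reliably
        // diagnosed; at runtime it dereferences a null pointer.
        // std::printf("%d\n", *p);
    }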
> In these C++ examples you have the namespacing stuff, templates/generics, implicit operators and constructor calls.
Yeah, I have two separate pieces of opinion on it.
One, C++ is a peculiar language in which any new addition involves an extreme amount of "language lawyering". I imagine the committee and the authors of high-profile libraries to be like the people from Suits (the TV show): figuring out ways to "thread the needle", navigating around all the accumulated rules to make it possible to implement a new feature, possibly with a new DSL, using only the existing features of the language. That's how you end up with a ridiculously crazy mess in the templates[0].
Two, C++ is a poster child of Greenspun's Tenth Rule[3]. Standard library has drawn a lot of inspiration from Lisp over the years, and templates are essentially used as a half-working macro system these days. Some Lisp experience is actually useful here, because it lets you discover things you could do with a programming language that you never imagine you could. Knowing both the end goal and how it could be expressed in a language that was designed for it makes it easier to understand the crazier template constructs in C++.
> it just screams to me like wouldn't it just be easier to switch languages?
Maybe. If I were in charge of version 2.0 of the C++ codebase I'm working on, I'd probably consider Rust or .NET. But modern C++ is good enough, and we have good testing & QA culture, so there's no reason to change it. And for people like me, for whom C++ was their gateway to programming, it's actually somewhat pleasurable to work with.
--
[0] - Template metaprogramming is essentially its own separate language since at least C++11, but the community still isn't keen on treating it as such. I'm actually hoping C++ will adopt metaobject protoco... er[1], I mean metaclasses[2] - together with concepts (already in C++20), it might give a chance to "refactor the rules" as it is. Backwards compatibility is the sword of Damocles here, the promise to make a mess forever hanging above any improvement to the language, but the language lawyers are formidable, so maybe they'll make it work.
I completely agree with the points from this article. It is fundamentally infeasible to make all C and C++ code safe. The main reason: PAST code. There are countless lines of unsafe code already written. However, there is nothing magical about Rust's facilities (e.g. the borrow checker) that makes them inapplicable to FUTURE C and C++ code. I believe this is the only way forward for C and C++. They need to offer mechanisms that guarantee that any code written today can be provably memory- and thread-safe. That, of course, requires yet another tool in the myriad of tools that exist for C and C++.
Language semantics matter. If a language doesn't place very strong meaning on certain things from the get-go, no amount of static checking will ever be able to guarantee certain traits of the code written in it, because it doesn't have enough information to work off of. And I'm not just talking about undefined behavior.
People have been writing safety checkers/linters/etc for C and C++ for decades. Many of them are very impressive and useful. None of them can ever be totally sound, as a fact of the languages themselves.
By the same token you could say it's also fundamentally infeasible to make all Rust code safe, otherwise it would not have unsafe code... so in the end it's an article that says nothing interesting at all.
The ability to occasionally opt-out is both a feature and a necessity because certain systems are inherently unstable and offer varying guarantees. I’m arguing for an opt-in feature in C++. Of course there is no panacea.
The typical, well-known blog pattern for writing against C++: show some buggy code and then conclude that C++ is wrong. Instead of asking: Who is wrong, the programmer trying to speak the language or the language itself? As always: only use what you understand. Instead of directly escalating to cardinal questions of language superiority, he would have better spent his time constructively writing a tutorial on how to correctly use things like string_view and span, which are typed references to memory that must outlive the view itself.
> Who is wrong, the programmer trying to speak the language or the language itself?
This is more complicated a question than you think.
If the language is such that you always have to look up some particular feature, then it's likely the language is wrong in that the designer chose that syntax poorly. OTOH, if the programmer just "threw something together" that happened to be incorrect, then obviously the programmer.
This latter example is actually where having good error-messages can come in: pointing out what is wrong, and perhaps some relevant location in the language-definition. — Compare and contrast Ada and C++ error-messages (for "typical implementations") here as an example.
C++ is a systems programming language. Just like C, it allows you to do anything you want to do and be as close to the hardware as you want to be. This is completely unnecessary if you want to build something that displays pretty pictures on the screen in a web browser, and you really should not use C++ for a web back end unless you have really hard core performance requirements. But when you need to get close to the hardware and don't want to write assembly, your only real choices are C and C++.
To meet its design objectives, C++ NEEDS to allow you to do things that are unsafe, and it's up to you whether you want to use them or not. std::span literally exists only to allow aliasing, and the documentation warns you that it's up to you to ensure that there aren't lifecycle issues. If you want the lifecycle managed, don't use std::span, use copies, but then you have the performance hit of copies. If you don't want either, use one of the libraries that handle reference counting for you, but then you have the overhead of reference counting. If it's important not to have ANY overhead and you are certain there are no scope issues, how else would you handle it? The difference between std::span and a raw pointer is mainly a) documentation for programmers reading the code and b) making it easier for linters to find issues with how std::span is used. Similarly, if you want to capture by reference in a lambda, go right ahead, but then ensuring that the underlying objects don't go out of scope is up to you (a sketch of what goes wrong when they do is below).
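A small sketch of what goes wrong when they do (illustrative only): the lambda captures a local by reference and is then called after that local has been destroyed.

    #include <cstdio>
    #include <functional>

    std::function<int()> make_counter() {
        int count = 0;
        // count lives in make_counter's stack frame; capturing it by reference
        // means any call made after make_counter returns reads a dangling reference.
        return [&count] { return ++count; };
    }

    int main() {
        auto counter = make_counter();
        std::printf("%d\n", counter());  // undefined behaviour
    }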
Use something like Rust or <insert language du jour here>, and you have to lobotomize the language, giving up the "safety" benefits the proponents talk about. If you don't lobotomize the language, you either can't get what you want done or you give up a massive amount of efficiency.
When the rubber actually meets the road, and you try doing things like introducing Rust or <insert language du jour here> into something like drivers for the Linux kernel, you immediately run into serious problems that show you why systems programming languages are needed. See, for example, the recent discussion about introducing drivers written in Rust into the Linux kernel.
There are tools to do fairly hard-core checking of C++ code, and if you really must, you can write your own for things specific to your project using the clang libraries, which you can call even from Python and which do much of the heavy lifting for you. These tools just aren't built into most compilers and aren't required by the standard. They are, however, part of most serious development workflows. Further, if you actually make the effort to learn the language, code that looks remotely like what was written for the blog post is going to set off your spidey sense in about a millisecond. Nobody said using C++ was easy. The fact that it scares off dilettantes, and that when a dilettante makes it onto the team their lack of skill is immediately obvious in their code, is a feature, not a bug. It ensures that code quality and team quality remain high.
C++ (and C) is designed to solve hard problems or achieve performance you can't with other languages. Hard things are called hard because it's difficult to do them. Otherwise they would be called easy.
If you aren't even willing to read the documentation for std::span or lambda captures or plausibly read them and choose to use them in ways the documentation tells you will cause problems for a blog post, you are just wasting everyone's time.
Note that there is in fact a recommended subset and set of code guidelines of the C++ language that it's suggested be used for new projects. Still under development and heavy revision but already very useful. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines. clang-tidy has options to check for conformance to these and much more. The Core Guidelines aren't going to allow you to start writing production quality C++ in a day or anything like that. There will still be a steep learning curve. But if you setup your linters properly and read the Core Guidelines (and a couple of good C++ books) you won't be quite as bewildered about how to start.
C++ is a beautiful, pragmatic and very efficient language, just constrain your use of it to STL, and avoid Boost and other over-engineered dependencies.
The Ockham's razor principle of code design is not only important but essential when using large mature languages that have many surprising and nuanced ways to skin a cat. Being too clever with C++ is a recipe for disaster.
Avoid OOP (inheritance in particular) and the sacred art of templates metaprogramming, but also avoid the lowlands of using C++ as a version of C language (with malloc everywhere), and you shall be rewarded in this life or next.
OOP, Boost, and templates also have benefits when it comes to raising the barrier to static reverse engineering.
For example, Security researcher "Marcus Hutchins" famously echoed that Boost is a cluster fuck of OOP sadness [0]
Similarly, a close friend of mine "Omer Yair" mentioned [1],
"A well written OOP malware might be harder to RE statically than a poorly written C code. Writing OOP malware badly though just makes it similar to C code so not sure of the benefits going that route."
I'm not sure I understand your points about Boost - everything the article discusses is in the STL, is it not?
I'm also not sure I understand your point about being too clever and template metaprogramming - nothing the article discusses involves template metaprogramming, and I'd say it's doing very straightforward things. Do you think some part of this is too clever?
Also, I'm not sure I understand how to reconcile your "Avoid OOP" advice with the advice to use the STL. Doesn't using the STL mean making use of OOP?
STL and OOP are orthogonal, e.g. you can use STL without ever using the concept of inheritance.
Boost is too popular imo, and many people take it for granted when using C++ that it comes with this nice utility library to complement STL, kind of like another python package, no big deal.
From what I've seen on multiple projects it creates an explosion of poorly understood (and often poorly implemented) dependencies which leads to maintenance nightmare, memory leaks and bugs, performance issues and production outages.
If you can avoid using an external dependency such as a gigantic library full of experimental features created by thousands of C++ enthusiasts, avoid it at all costs, as it tempts the team to use it left and right when it's not really needed, just like templates.
Templates, if used without strictly adhering to the KISS principle, can get unwieldy, full of nuanced, intricate, and unnecessary abstraction, and make the code hard to understand and maintain.
C++ is a giant language and its use on a particular project has to be very tightly constrained to a very minimal set of features and dependencies.
What do you call "wisdom" that's passed from person to person, so everyone knows it, but it isn't actually true?
A huge amount of Boost is nothing but headers. And it's very easy to use those header-only libraries and avoid the rest. Sure, many people have just pulled in the entirety of Boost... but it's really strange to blame Boost for those bad choices.
By the way, a lot of what today is standard C++ had its origin in Boost. smart pointers, thread, regex, random, ratio, tuple, etc etc. All came from Boost, and those of us using Boost were happily taking advantage of it, while folks who believe the "wisdom" you just shared were building it yourselves.
Well, I've seen it used "enthusiastically", for lack of a better word, i.e. once someone on the team starts using one header with some relatively well-proven and robust feature (like smart pointers), it creates a temptation for them and other team members to use anything else available in that giant library "for free".
And some features there can be more experimental in nature, and some are either not intended for, or not really needed for, the simple use case you have at hand; so unless you veto each new #include by a panel, it becomes a giant cluster fsck full of infinite permutations of advanced, poorly understood features, which creates a very fragile foundation for the project and a maintenance nightmare.
Since the good parts of Boost are already in the STL, I always recommend just sticking with a (very minimal set of) STL, not using what you don't need right now, and avoiding the temptation to throw every available library or "neat" feature at a problem (which might save a bit of thinking time short term but opens a can of worms long term).
> And some features there can be more experimental in nature, and some are either not intended for, or not really needed for, the simple use case you have at hand; so unless you veto each new #include by a panel, it becomes a giant cluster fsck full of infinite permutations of advanced, poorly understood features, which creates a very fragile foundation for the project and a maintenance nightmare.
How is that better than the usual 2021 project having a package.json with 350 sub-dependencies?
Speed is a misnomer there, leading to "Java is faster than C" type of benchmarks. You (and most experienced programmers) care about determinism. If we can fix the minimum and maximum, we can make it work; otherwise, beware all ye who enter here.
Don't remember where I read this (twitter?) but it feels so true. Sure you can get a better chair arrangement with enough effort, but it doesn't matter since the boat is sinking.
I hate how people group C with C++ in conversations like this. Sure, C has problems, but nothing close to the insanity that C++ has, let alone hiding it the way C++ does.
There is a reason people jokingly say C++ can blow your leg off, while C is just a foot gun.
C++ is just a massively bloated language.
If you need native code, just use C, Rust, or even assembly. Stay away from C++.
C++ is a massively bloated language. But it is also vastly more expressive and more safe than C, and also significantly faster at runtime. It just has a stupidly high barrier to entry.
C++ cannot be safer than C as long as it remains backward compatible, which would mean becoming a fundamentally different language. You can't fix a leaky pot by adding another leaky pot on top (err, or something...).
Also "significantly faster at runtime" seems like a stretch. C++ doesn't have any fundamental features over C that would affect performance (for instance, it doesn't fix the pointer aliasing problem).
The poster child for C++ being faster than C is std::sort() vs qsort(). The latter requires a function pointer dereference for each call to the user-supplied ordering function - the former does not. There are many similar opportunities for C++ to be faster than C.
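A minimal sketch of the contrast being described (illustrative, not from any particular codebase): qsort receives the comparator through a function pointer, while std::sort receives it as part of the instantiated template, where the compiler can inline it.

    #include <algorithm>
    #include <cstdlib>

    static int cmp_int(const void* a, const void* b) {
        int lhs = *static_cast<const int*>(a);
        int rhs = *static_cast<const int*>(b);
        return (lhs > rhs) - (lhs < rhs);
    }

    void sort_c(int* data, std::size_t n) {
        // Every comparison is an indirect call through the cmp_int pointer.
        std::qsort(data, n, sizeof(int), cmp_int);
    }

    void sort_cpp(int* data, std::size_t n) {
        // The lambda's type is baked into the std::sort instantiation,
        // so the comparison can be inlined.
        std::sort(data, data + n, [](int lhs, int rhs) { return lhs < rhs; });
    }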
The solution for C would be to "stamp out" a specialization of qsort() for a specific data structure either manually or with code generation. The C++ template system really isn't some "magic performance pixie dust", it just saves some typing ;)
...or alternatively, with global optimization (LTO/LTCG) there's actually a good chance that the comparison function calls of a C qsort() implementation can be "dissolved" via aggressive inlining as long as all required parts are visible to the compiler - that requires a lot of trust in the compiler though (on the other hand, C++'s "zero cost abstraction" philosophy requires the same level of trust).
Much of what you say is true, but I'm not sure that this is:
> (on the other hand, C++'s "zero cost abstraction" philosophy requires the same level of trust).
The C++ Standard _requires_ that a Standard Library implementation implements certain performance guarantees - I don't think the C Standard has such requirements. This means that certain implementations are effectively required - for instance that std::map is implemented (for better or worse) as a red-black tree.
To be clear, the standard mandates the complexity of most standard library functions. In no way does it mandate zero-cost abstraction (and it is not clear how one would even word that). Features are usually defined in such a way that zero-cost abstractions are at least known to be possible, though.
Making a language more expressive at risk of making bugs less obvious is not a good thing. The goal with expressive syntax is to make code easier to read/write, not deceive the developer.
Also, C++ does not have a faster runtime than C. C has hardly any runtime requirements, mainly just setting up the stack; compared to C++'s runtime requirements, I don't see how anyone could claim the C++ runtime environment is faster.
Anyway, my point is that C and C++ have diverged far enough these days that bundling them together is not really productive or accurate.