I was once made fun of (in a kind of humiliating way, now that I recall) by a bunch of HPC scientists in a room at KAUST for saying that C++ had support for GC.
They were like, "if you knew the least bit about C++, you'd know it doesn't have a GC".
I was like, "yeah cool, but C++ does have some GC functionality", but they just shut me down. My background is Biology so there were a few remarks about how "a biologist just doesn't know these things, lol".
This was around 2019, btw. And sure, I'm a biologist, but I also happen to have spent 8-10 hours/week for the past 20 years coding because I just really like it; so, there's my 10k hours, ¯\_(ツ)_/¯.
So far as I can see, this was in the C++ spec but not implemented by any major compiler, though. So I guess the technical answer to “who’s right?” would depend on the verbiage of the discussion. :shrug:
(But regardless, that sounds like a toxic environment and I’m glad you’re out of there. To think that because someone has a degree in biology, that automatically precludes them from knowing anything about computers or programming that you don’t is just the epitome of arrogance and hubris.)
If it came to that, it's time to whip out the technically right big dick and remind them that C++ has had smart pointers since forever and that TRACING garbage collection is far from the only automatic garbage collection technique known to man. ;)
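In case anyone wants the receipts, a minimal sketch of non-tracing automatic memory management in bog-standard C++ (plain std::shared_ptr reference counting; the toy Node type is mine):

```cpp
// Reference counting is an automatic garbage collection technique too,
// just not a tracing one.
#include <iostream>
#include <memory>

struct Node {
    explicit Node(int v) : value(v) {}
    ~Node() { std::cout << "collected " << value << '\n'; }
    int value;
    std::shared_ptr<Node> next;  // caveat: reference cycles would leak
};

int main() {
    auto head = std::make_shared<Node>(1);
    head->next = std::make_shared<Node>(2);
    // No delete anywhere: both nodes are reclaimed automatically when
    // the last reference goes out of scope at the end of main.
}
```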
I'm kind of surprised they didn't try to get rid of/adapt the standard for C++/CLI on Windows since it actually uses GC for a lot of things. However, I guess the standard wasn't all that useful, especially since it mandated a lot of overhead for pointer arithmetic. My theory is that it is very difficult to come up with a general API for interacting with GC'd memory in C++ because making the API efficient depends heavily on how it is implemented.
Yeah, that is the thing: Unreal C++, C++/CLI, and the COM/WinRT frameworks are the main users of automatic memory management in C++ dialects, and none of them was taken into consideration.
Some of the best programmers I've worked with have backgrounds in sciences, and it's been a delight just being around them as they apply some of that knowledge in a different field.
That kind of blatant assholery is really common in C++ circles, and you even see Bjarne doing it.
"I understand what you are trying to say and I think you are wrong" at a conf to delighted razzes and jeers from the room. Bjarne had seemingly just discovered that link lists thrash cache and was presenting about swapping in a vector instead. Note it's got nothing to do with whether this guy is onto something or not. It is possible to disagree without being an asshole. Bjarne missed the opportunity. C++ culture follows.
IMHO it is unsurprising there's a lot of groupthink involved in such a culture and it has really damaged the language and certainly makes it unpleasant in the way you describe.
Bjarne is right, the guy is wrong, full stop. If you don't care about performance, then don't worry about performance. Use std::list if you like, and don't complain about it.
Ultimately, we write programs to make machines do things. What we write exactly determines what the machine does. It is our responsibility to understand what the machine will do with what we write; otherwise it won't do what we wanted. Often (maybe almost always nowadays) the performance details don't matter much. When they do matter, there is no substitute for understanding.
The Standard data structures are constructed simply, with understandable performance characteristics, imposing minimal overhead both in time and code space. This is an extremely important quality not to sacrifice idly. No one is obliged to use only the Standard containers; if you need something different, you can write something different yourself, or even find it online somewhere. You are encouraged to write something yourself, and maybe publish it. People do.
In a very real sense, the Standard containers are just examples. Making them unnecessarily complicated would make them worse examples, and would make them worse for almost all places they are used.
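For concreteness, here is the experiment in question, reduced to a sketch (my code, not Bjarne's slides; absolute numbers are machine-dependent, but the vector reliably wins at these sizes despite the list's "better" insertion complexity):

```cpp
// Keep a sequence sorted by inserting each new element at its position.
// On paper the list should win (O(1) splice vs O(n) shift), but the
// linear scan to find the position dominates, and the vector's scan is
// cache-friendly while the list's pointer chase is not.
#include <algorithm>
#include <chrono>
#include <iostream>
#include <list>
#include <random>
#include <vector>

template <class Seq>
long long insert_sorted_ms(int n) {
    std::mt19937 rng{42};
    Seq s;
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < n; ++i) {
        int x = static_cast<int>(rng());
        auto pos = std::find_if(s.begin(), s.end(),
                                [x](int y) { return y >= x; });
        s.insert(pos, x);
    }
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count();
}

int main() {
    std::cout << "vector: " << insert_sorted_ms<std::vector<int>>(20000) << " ms\n";
    std::cout << "list:   " << insert_sorted_ms<std::list<int>>(20000) << " ms\n";
}
```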
I think you may have missed the point of my raising that fwiw. I genuinely don't care who is right [1]. Bjarne's attitude and manner is very wrong and very normal amongst c++ people. That's a shame. He could have explained his point of view forcefully, without being a jerk. Bjarne made his choice.
Maybe that's why there have been so many terrible decisions in C++ over the years: being an asshole while right might scare off discussion, but it doesn't make even Bjarne infallible.
[1] If you care about performance you take full responsibility for all code used including all libraries like the stl, which might well mean you don't use the stl and vector and write something optimised for your specific purpose, and not for every possible purpose. But I don't want to be an ass about saying it.
He is not a jerk. He explained his point clearly, concisely and respectfully, without wasting the time of the next person with a question. If you make people into jerks because you don't like their answers, you'll find yourself surrounded by jerks. Then who's really the jerk?
He explains, "People have not figured out how to automate picking data structures optimally relative to their use in a program". In other words, the guy asked for something impossible. Bjarne was as diplomatic as was possible while still answering the question accurately.
"I think I understand what you are trying to say and I think that you are wrong."
The giveaway is the hoots and razzes from the crowd. No word above needed saying. It clarified nothing. It is pure assholery. (And he may not have even intended it. Clearly lots of C++ people have no issue with just how unpleasant community interaction is compared to other communities. It's also pretty obvious that many do see it plainly.)
For someone for whom English is a second language, what was wrong with his statement? For a non-native English speaker it seems expressing disagreement is becoming more and more difficult and complex.
I think you mean to say that you couldn't afford to devote the time to making a map that Microsoft did. Generally, in C++ we get to use data structures that have had far more attention to every detail than we could apply personally because they are used by many more than just us. Microsoft's gets used by everybody who uses their compiler, and so commands attention. It probably deserves more than Microsoft actually spends, but that is a management problem.
So definitely don't use it if you care about performance? (You can't write it, but anywhere you'd use it you can probably beat it). But we're getting into discussion of sensible benchmarking of actual deployable applications so let's leave that for another time.
To me it sounded more like a reaction to Bjarne phrasing himself in an unexpected way, much like you'd laugh at a joke. But regardless, I do agree that it's inappropriate to laugh at a sincere question. And as an extension of that, it would have been better if Bjarne's first sentence wasn't phrased as a joke.
But I don't think there's any issue with plainly stating that you think the asker is wrong, or with the rest of Bjarne's response. Of course, I'm not a native speaker.
He was fighting social proof. Because they viewed him as an outsider, no amount of facts would convince them. They wouldn't take his word for it and they wouldn't spend the time verifying his facts.
Been there many times. Including when a room full of programmers laughed at me for suggesting randomizing a list prior to using quicksort because "it messes up the big-O complexity".
(Yes yes, I know there are simpler ways to handle quicksort attacks, but that wasn't why they were laughing.)
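A sketch of what I meant, with a deliberately naive quicksort for illustration (std::sort in a real codebase already guards against this via introsort):

```cpp
#include <algorithm>
#include <random>
#include <vector>

// Naive quicksort, first element as pivot: O(n^2) on already-sorted
// (or adversarially crafted) input.
void quicksort(std::vector<int>& v, int lo, int hi) {
    if (lo >= hi) return;
    int pivot = v[lo], i = lo;
    for (int j = lo + 1; j <= hi; ++j)
        if (v[j] < pivot) std::swap(v[++i], v[j]);
    std::swap(v[lo], v[i]);
    quicksort(v, lo, i - 1);
    quicksort(v, i + 1, hi);
}

// Shuffling first moves the randomness into the algorithm: no fixed
// input can reliably trigger the worst case, so the expected cost is
// O(n log n). The big-O of the *worst case* is unchanged, which was
// exactly the (irrelevant) objection.
void randomized_quicksort(std::vector<int>& v) {
    std::shuffle(v.begin(), v.end(), std::mt19937{std::random_device{}()});
    if (!v.empty()) quicksort(v, 0, static_cast<int>(v.size()) - 1);
}

int main() {
    std::vector<int> v(10000);
    for (int i = 0; i < static_cast<int>(v.size()); ++i) v[i] = i; // sorted: worst case
    randomized_quicksort(v);  // fine; without the shuffle this recurses ~n deep
}
```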
There are so many arguments that I could have won, if I had just had an evening to mull it over.
I'm reminded of the movie Annie Hall where Woody Allen is waiting in line...
Alvy: [Hearing a man behind him rambling about Marshall McLuhan]
What I wouldn't give for a large sock with horse manure in it.
[Turns to the camera] What do you do when you get stuck in a
movie line with a guy like this behind you? It's just...maddening-
Man in Theatre Line: [Notices Alvy and walks up to him] Wait a minute,
why can't I give my opinion? It's a free country!
Alvy: Did-did he, he can give you- Do you have to give it so loud? I
mean, aren't you ashamed to pontificate like that? And the
funny part of it is, Marshall McLuhan; you don't know anything
about Marshall McLuhan!
Man in Theatre Line: Oh really, really? I happen to teach a class at
Columbia called "TV, Media, and Culture." So I think that my
insights into Mr. McLuhan, well, have a great deal of validity!
Alvy: Oh, do ya? Well, that's funny, because I happen to have Mr. McLuhan
right here, so, so, yeah, just lemme lemme lemme — [pulls McLuhan
from behind a nearby poster stand] — Come over here for a second.
Tell him!
Marshall McLuhan: I heard what you were saying. You know nothing of my
work. You mean my whole fallacy is wrong. How you ever got to
teach a course in anything is totally amazing.
Alvy: [To the camera] Boy, if life were only like this!
But if you're the one starting the argument, then you have all time in the world to mull over the proper defense of that argument. Weeks, months, years ...
There is no need to regret anything. Especially in a matter that isn't even an argument but basically just conveyance of facts from the public record.
tbqh kazinator is manifesting the bare minimum of Caring to be trusted with C++ code as a developer. If you aren't prepared to get into bizarre pseudo-legal arguments with collaborators about language specifications you are going to be happier somewhere else in the software industry.
I do not say this as someone who believes this represents healthy behavior or that wants to write C++ for a living. They've been like this since #C on EFNet and FreeNode were a thing 30+/25+ years ago. My guess is they got the behavior from Usenet or something but I didn't hang out there so I don't know first-hand.
In this situation, those others were not exactly wrong. It was about OP proving that he also wasn't wrong.
If people in your organization think you said something stupid, that could be bad for your opportunities there.
If you say something which is correct, but the correctness depends on some unusual conditions that are not well known, or special interpretations of common concepts, or some context that is not shared with whoever you're talking to, you have to either not mention that, or spell out those details.
If you posit that pigs do fly, don't forget the detail that it's when they are shot out of a cannon.
I would probably agree with those scientists; this seems like a "well actually" kind of gotcha claim with no practical significance when there is no usable implementation of this.
And more specifically, never get into arguments about what C++ can do because it’s guaranteed that someone has used it the way you assume it should never be used.
They probably totally forgot the conversation only seconds later, and did not intend humiliation or perceive any humiliation. He said something not usefully correct for the circumstances under discussion, and was corrected without fuss.
> If you want automatic garbage collection, there are good commercial and public-domain garbage collectors for C++. For applications where garbage collection is suitable, C++ is an excellent garbage collected language with a performance that compares favorably with other garbage collected languages. See The C++ Programming Language for a discussion of automatic garbage collection in C++. See also, Hans-J. Boehm's site for C and C++ garbage collection.
> Also, C++ supports programming techniques that allows memory management to be safe and implicit without a garbage collector. I consider garbage collection a last choice and an imperfect way of handling for resource management. That does not mean that it is never useful, just that there are better approaches in many situations.
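For context, using the Boehm collector the FAQ points to really is that mundane. A minimal sketch, assuming the bdwgc library is installed (build with something like `g++ demo.cpp -lgc`):

```cpp
// Boehm-Demers-Weiser conservative collector: allocate and never free.
#include <gc.h>      // from the bdwgc package
#include <cstdio>

int main() {
    GC_INIT();  // optional on many platforms, required on some
    for (int i = 0; i < 100000; ++i) {
        // Memory from GC_MALLOC is reclaimed automatically once it is
        // unreachable; there is no free()/delete anywhere.
        int* p = static_cast<int*>(GC_MALLOC(100 * sizeof(int)));
        p[0] = i;
    }
    std::printf("GC heap size: %zu bytes\n", GC_get_heap_size());
}
```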
It seems few remember the days when Java put so much pressure on C++ that Bjarne S was very much trying to find a way to provide GC for those who wanted it. In 2023, the folks wanting GC are all using those languages, and it's assumed C++ is anti-GC.
It is shocking how groups can create an environment of being proudly wrong. Once you see the pattern, you see it everywhere.
Hahaha, how can that guy be so stupid as to think the world isn't flat? Easy to see in hindsight, but it happens with everything (programming language a vs b, OS A vs B, etc).
Groupthink is strong and even the people who agree with you might find themselves not pointing it out.
Yeah, it's like saying that Linux has support for case-insensitive paths or that Windows supports forward slashes in paths.
People with some limited amount of knowledge will quickly tell you that you are wrong, but it is them who are wrong. Many don't even give you the little benefit of the doubt to do a small internet search, and yeah the problem is worse when you are an "outsider" in some sense.
It's like when you say Java has a GOTO. It's in the bytecode, it's a reserved word in the language, and labels give a rarely used partial equivalent:
https://www.geeksforgeeks.org/g-fact-64/
Ultimately, thankfully, it never got made into a real keyword.
Nothing from my POV.
GOTO is more necessary (or even required) in procedural non-OO languages because there are fewer constructs to support modelling or structure. The OP mentioned redundant features, and in OO languages GOTO is redundant.
Maybe you were technically right about some narrow scope of the problem, but no GC matching a proper industry definition was supported by any of the major C++ compilers. So the HPC scientists were right, imo.
Being technically right for an aspect that doesn't matter is a common trope that a lot of software engineers fall for.
Depends what they meant. An expert in C++ likely meant that C++ does not guarantee any fixed width types. Whether they are provided depends on the implementation.
Well, "by support" it means a few functions that allow user to query some information about GC if implementation had it. No implementation has ever implemented GC.
You'd have to print out your own; you can't actually buy a hard copy of any recent ISO C++ standard.
Some national standards bodies, entitled to do so, might claim they can sell you one, when actually they've got a deal with a relatively local print-on-demand company. Because this standard is huge, that company will reject the order, so you in fact can't buy the product: you'll get an apology, or a PDF, or both, despite specifically having ordered paper.
Although their processes are hopelessly antiquated (fly to Hawaii to fix a typo? I don't think so), the result is, as you might expect in 2023, just a bunch of document layout source, from which "draft" versions of the document are publicly available.
Not as useful if you're getting burgled, or to raise a display off the desk, but more practical for real work.
You joke about that, but people who blindly "know things" will refuse to even look at the standard. After all they already know what it says on the topic. So even carrying the standard around with you wouldn't help.
What does this have to do with the article? This sounds like you're here gloating about some pedantic argument you made 4 years ago. And of course we are only hearing your side, so of course you're the hero.
And judging by the fact that this is currently the top comment, I guess this is what other people come to HN (now) to read in the comments. Color me confused.
Did you read it? It's about GC in C++ being a relatively obscure feature that is now going to be removed.
>some pedantic argument you made 4 years ago
I just shared an anecdote related to it and I thought it'd be appropriate because this is/was quite likely the only chance I'd have in life to talk about this very very very specific situation with people who would understand it. And yes, it's getting a few upvotes; weird thing to be mad at, but to each their own.
We should be clear, GC was never a feature in C++.
What these features do, which are being removed, is make it easier to implement a GC allocator in a standard-conforming and portable fashion. Current GCs in C++, such as Boehm's GC, depend on platform-specific hacks that are incredibly brittle.
As it turns out, these features do not make it easier to actually implement a GC allocator in C++, so they are being removed.
I think most people can differentiate between a language providing a feature as a first class native property, and a library or feature built on top of that language. Most people would not argue against the position that C++ does not contain a web browser as a feature, even though it's possible to build a web browser using C++.
Similarly, C++ does not and has never had garbage collection, but for a time the standard did provide facilities that were supposed to make it easier to implement a garbage collection library.
If you want to go this way, then take a look at compiler support matrix for last few language versions - if having an implementation is necessary for something in spec to "be a feature of C++", then C++20 and C++23 are still missing some features they're supposed to have.
>I guess this is what other people come to HN (now) to read in the comments.
Reading comments (loosely-related, tangential, meta, whatever) that generate potentially interesting discussion? That's pretty much the only reason I've visited HN for years and years. What do you mean "now"?
C++ is the language that I know the most things about, but I would never say it's the language I know the best because there are so many things to know about it.
It does make it a somewhat intellectual exercise, precisely because it is so complex. A few years of C# and I felt like I knew the whole language quite well. Double the time with C++ and I'm fluent, but nowhere near an expert.
So that's one way to not be bored and to challenge yourself at work every day. :)
In all other languages I worked in, I basically can keep the entire language (not stdlib) in my head, and it's extremely valuable to understand what's going on. In C++ I just can't.
Without looking it up, do you know what the Java memory model guarantees are? Or what's the difference between >> and >>>?
Even "simple" languages have things that most people simply don't remember without looking it up because it's just not useful to know in day to day work.
> Without looking it up, do you know what the Java memory model guarantees are?
* Java is a happens-before language--you need a sequence of X happens-before Y from a write to a read that crosses threads for the language to be well-defined. I'm not going to list the "obvious" cases of happens-before relations.
* volatile variables in Java are sequentially consistent atomic variables.
* 64-bit loads are atomic in the sense that word tearing cannot be observed.
* Writes to a final variable in an object's constructor happen-before the end of constructing an object. (Probably the easiest thing for people to forget)
* There's a bunch of complex rules to try to explain what happens if you violate happens-before and you have a data race. These aren't really right anyways, so you can safely ignore these and pretend it's UB if you do create a data race, which is what happens in other happens-before language models.
> Or what's the difference between >> and >>>?
The first is an arithmetic shift right and the second is a logical shift right; alternatively, the first fills the vacated high bits with copies of the sign bit while the latter fills them with zeros.
> you can safely ignore these and pretend it's UB if you do create a data race
I mean, that's safe but it's thoroughly unnecessary. In a language where races are UB all bets are off and that's a pretty wild consequence for what may be a relatively minor mistake.
Because data races aren't UB in Java, our program even though its behaviour now likely defies ordinary understanding by its creators, does still have some coherent behaviour. For example maybe we raced code that's putting CustomerOrders in a List, and so the List is smashed. In a language like C++ it's entirely possible that the smashed List is so toxic that even asking how big it is potentially has completely insane answers, crashes, runs arbitrary code, anything might happen - let alone if trying to take all the CustomerOrders out one at a time for processing. In Java, maybe the List is empty, which is not what we wanted, or it has all the orders which shouldn't be there, it won't have for example a negative number of CustomerOrders or anything else completely nonsensical.
In terms of defensible business logic, Java's memory model isn't better. But in terms of "Do attackers who exploited my race also get to run arbitrary machine code with database access?" the answer is definitely going to be "No" in Java for a data race, unlike in C++ or even Go.
Also, while it's true that Undefined Behaviour is common for data races (because of SC/DRF and laziness), it's not what happens in OCaml: they've come up with an even cleverer scheme than Java, so that they hope their programs remain not only strictly well defined but maybe even something you can reason about despite a data race.
> 64-bit loads are atomic in the sense that word tearing cannot be observed.
This is true in practice for most JVM implementations, but it is not in fact mandated by the spec: only 32-bit values are guaranteed to be loaded atomically (so on a 32-bit JVM you might observe "tearing").
Nonetheless, I also think that you can pretty much hold all of Java in your head, as opposed to C++.
Yes... but I think the real question that will stump people is "What is <? super Foo> and how is it different from <? extends Foo>". :D
Java generic semantics are wild and often surprising.
But to your point, I think TypeScript is FAR more complex than C++, yet I almost never see similar complaints about its complexity. The richer the type system, the more complex the language.
That's not the real confusing one in generics. The real confusing one is "What is the difference between Class<List> and Class<List<?>>" (i.e., mixed generics and raw types)... I only know about this because it bit me in the ass once!
Ha! I thought about doing a tricky thing with wildcards vs missing generics.
What gets even more wild is, for reasons I really don't understand, when a missing generic makes its way into a generic pipeline it seems to have the tendency to erase other generics. (i.e., `list.stream().map((f) -> new HashMap())` does weird things with the stream as it gets more complex.) You can throw in "What's the difference between List<Object>, List<?>, List, and List<? extends Object>" for good measure :D
These are both pretty intuitive: you can provide any type that T extends or that extends T, respectively. Some of the consequences are less intuitive, but I don't think it's that hard to remember. As someone who last professionally programmed Java in the Java 7 days, it was still clear to me what these mean.
No, I only listed it because of how simple it is which makes me confident I understand exactly what's going on.
The full list of opcodes is like a stdlib - you don't need to know it by heart, just know what's available and how to search the docs. The things you can, and hopefully do, keep in your head are what addressing modes exist, how flags affect things, how interrupts work, etc.
Better low-level developers than me also understand all the concurrency stuff - what is kept in which cache when, which operations constitute which kinds of memory barriers, and what guarantees you get about the thing you just read or wrote.
Wow, they have a C# 12 now? I wrote "old versions of C#" because I last delivered a project in C# 3.5, and honestly that was maybe 12 years ago and it was a bit arrogant of me to add it, I don't think I understand C# _that_ well.
.NET Native has been dead for a long time; it's the least important runtime flavour compared to the others today because of this. IL2CPP and Mono are orthogonal and only ever relevant if you develop for Unity going forward (okay, there's Blazor too, which uses Mono for WASM).
No it isn't, given the extent to which UWP is still used on Windows after the failure of WinUI 3.0/WinAppSDK to deliver something that has feature parity with it.
Windows widgets and the new badly implemented File Explorer are probably the only WinUI 3.0 applications; the remaining stuff is still pretty much UWP, while the many third parties that still care about WinRT at all are stuck on UWP due to missing features and tooling in WinUI 3.0/WinAppSDK.
As for the rest of your remark, it doesn't change the fact that those runtimes exist and have relevant differences.
I am catching a whiff of judgement in your reply. You are the person on the team who knows the ins and outs of the language and is proud of it. You are a valuable member of the team, but I don’t want a whole team of you. There are other skills that are just as important in getting something “delivered” as knowing all the syntax.
I said there are other skills that are just as valuable that I look for when building a team. Exhaustive knowledge of a language is a perfect skill for writing docs, training, performing interviews & doing code reviews. My main point was, it is not a required skill to be a “good” developer, like you seemed to imply.
I said it's possible to keep a full model of how some languages work in my head and I find it helpful to, not that it's mandatory. I even said that I can't do it in C++ and that I've delivered C++ projects, so I don't think my phrasing implied that it's required :-)
A good developer ships high quality software. If the situation calls for it, she ships it in C++ or ECMAScript 3 or Microsoft InfoPath. Nothing else is required. Many other skills are very useful, like good communication skills or some understanding of business or math or how computer infrastructure works.
These days I seldom write code - maybe one out of 4 or 5 work days I'll spend more than half an hour typing code in a text editor - and yet having a good model in my head of how my language works, how my database works, how my caches work, how my network works, is something that's very useful to me everyday. Other talented developers I know don't have that, and I recommend most of them to spend a bit of time acquiring it because it's worth it. More talented developers than me that I know and work full time in C++ don't have it, and sadly I don't recommend they get it because I think it's beyond their abilities. I think that's a negative attribute of a language that has some other very positive attributes (like its culture of just writing code that solves problems instead of obsessing about tools and DX all day).
The problem with this is that what you don't know can hurt you; that's why I like Sean Baxter's initiative [0], where you can selectively disable C++ features (at file level!).
I was about to say the same thing. With C++, you don't necessarily know if you're looking at code that uses unfamiliar language rules, especially when templates are involved.
I think I have about 95% of Python in my head. I occasionally see a new edge case I hadn’t considered, but it’s been a while. The language itself is rather compact. There aren’t so very many keywords or builtin functions. Even the stdlib is manageable, although I'm always pleasantly surprised when a module has added some new functionality or convenience that will save me time. It’s knowable.
The problem with Python (and dynamically typed languages in general) is not so much knowing the language, but understanding the code that people write with it.
Since there are no type annotations to help your understanding of the code, you need to keep a lot in your head, as opposed to statically typed languages.
That’s not been my experience at all, but I could see it being confusing at first.
Also, we do have type annotations now. They’re not universally used, and definitely not required, but mypy and friends can go a long way toward finding type issues even without them.
Yeah, because there's a whole lot of redundancy. If you "know" 3.11, and upgrade to 3.12, there are only going to be a few changelog items you'll actually care about. The rest are generally internal implementation changes that don't affect how you use the language.
OK, so I don't have every corner of the stdlib memorized, especially the odd deprecated batteries-included stuff like playing sounds on a Sun workstation. Nor do I have the extension API docs committed to memory, as you can use the language a lot without having to write those things yourself.
But the core of Python, like the language itself and the most common parts of the stdlib that are actually used in 95% of Python code, I have a strong grip on. In particular, the actual language is pretty simple.
Other people have touched on this, but I think one's ability to be productive in a language is highly correlated with how much of the language (and project domain) they can "keep in their head" at a time.
This is why Go is my (and seemingly a lot of other people's) go-to "get shit done" language, I can write and edit code for hours without having to think about (too many) footguns or hunt for docs and esoterica.
I can write and edit C++ for hours without thinking too much too, and I’m no language lawyer. Almost all code one encounters in the wild is not particularly esoteric.
> but I think one's ability to be productive in a language is highly correlated with how much of the language (and project domain) they can "keep in their head" at a time.
I disagree. I'd say productivity has very little to do with language familiarity and way more to do with ecosystem familiarity. C++ is hard because the ecosystem has no real entry point and several "script language to write a script language to write a compilation definition to write a script language to compile a project" build systems. (Who's familiar with M4?)
But when you hit the edge of your language, it will be a very painful experience, with disgusting, unmaintainable workarounds.
Mind you, I’m not thinking of some obscure thing that is only possible in Lisps, as in my experience these are seldom required, and code generation is always an option.
Go is hilariously ill-designed from this perspective: it gives you the common 80%, but has no answers for the remaining 20%. Generics do improve it a bit, but there are so many edge cases that it is simply sad to see in a supposedly modern language.
I can see this as an attitude for staying the course while using/learning it. But not as an end game; "you'll never get it, don't even try!" The idea that you as a team might develop a large piece of software, but have an unknown feel for how much more there is that is unknown, is scary to me.
And a bit ironic. I lived through the static compile-time type / dynamic late-binding type wars. There was this argument that with C++ (and similar) you knew, and had some compiler-ratified certitude you could count on. But if you're telling me that a junior/intermediate dev can write code, perhaps making head-scratching design concessions to get the thing to compile, and might produce code that appears to run in the common case but has unexpected behavior in not-yet-exercised areas, because they can never hope to have enough knowledge of the language to build a semi-accurate model of what to expect, then we're really no better off than those late-bound interpreted hippies with their weird metaprogramming VMs.
Whoah, I think you completely mis-read my intention.
My point is that C++ offers users a lot of tools. You don't need to use most of them to solve most problems. Just start with simple tools: functions, classes, standard templates, etc.
That 10% effort will solve 90% of problems.
If you find a problem that can't be solved with the simple tools, then start looking into more advanced features C++ offers and use them if appropriate.
That is why no-one needs (or should try) to keep all of C++ in their head. It's not necessary for most problem solving.
It's of course fine to be enthusiastic about it and learn some new things just for the sake of learning. But it's not a requirement to know all of C++ to write C++.
>> But it's not a requirement to know all of C++ to write C++.
You just need to know as much as everyone else on your team and all of the libraries that you are using.
"Within C++, there is a much smaller and cleaner language struggling to get out". Yes, that quote can be found on page 207 of The Design and Evolution of C++. And no, that smaller and cleaner language is not Java or C#. The quote occurs in a section entitled "Beyond Files and Syntax". I was pointing out that the C++ semantics is much cleaner than its syntax. I was thinking of programming styles, libraries and programming environments that emphasized the cleaner and more effective practices over archaic uses focused on the low-level aspects of C." Source: https://www.stroustrup.com/quotes.html#:~:text=%22Within%20C....
The question is then, "which 'cleaner and more effective practices' are being used in the code you write and maintain?"
Only someone who knows too little about C++ would say something like that. Different features can interact in subtle and unexpected ways and that can bite you at runtime without any warning at compile time.
As someone in a large, long-tailed codebase with multiple people working on it...it really hasn't been an issue. As long as people know the basics, and leave comments for anything weird they're doing, it's fine.
The only time I've gotten friction is when in another project someone was really eager about making everything as abstract and modern as possible, to the point the code was (in my eyes) nearly unreadable. But that's not a C++ problem, that's a dev problem.
I have this impression that modern C++ is far more complex, with a far larger surface area (from the perspective of learning the language), than Rust.
Since I am not an expert in both: what do folks who came from C++ to Rust, or who regularly work with both, say about that?
I’ve done a pretty fair amount of both, more C++ in total, more recent work in Rust.
I think it really depends on whether you’ve got a mountain of legacy C++ with dated infrastructure and practices or a modern C++ code base at shop that runs a tight ship.
In the former case the incidental as opposed to inherent complexity in C++ is a real PITA compared to Rust (which isn’t exactly shy about gratuitous complexity especially in the trait system and going more than a bit overboard with macros IMHO). C++03 written without modern tooling and a hardass style guide and stuff is usually a nightmare. I would vastly prefer to work in almost any Rust codebase compared to a sprawling nightmare of code that still calls new/delete routinely.
A modern C++ codebase with all the best practices and tools and stuff? Six of one, half a dozen of the other, more or less: Rust is now fast and feature-full enough to be an option for most anything C++ would do. Do you like hardcore affine typing by default, or dislike it?
Another way to think about it is that modulo some big differences: Rust bundles (and mandates) a bunch of stuff you opt into in C++: a uniform set of best practices, hardcore static analysis, a credible strategy for catching memory safety issues and UB and thread safety issues. (The case is overstated about the difference in efficacy of e.g. ASAN and the borrow checker, they have pros and cons and it’s not a 1-bit debate).
C++ tooling has a few important edges (though Rust is catching up): clangd is usually (always?) faster and more stable than rust-analyzer but you can throw hardware at it so it’s not a huge deal.
Cargo is just a dunk below some project size. Above some project size the story is still evolving.
It’s just not as big of a difference as you often hear one way or the other. I’d probably default to Rust unless I had library interop reasons to go C++ (which is often).
I work with both, having started with C++ about 17 years ago, and agree that Rust feels like a relatively simple language compared to C++. Rust might feel harder to learn initially because the borrow checker won't let you compile certain programs, but once you are over this initial hump, the rest is quite straightforward.
I don't really agree with that. I'd say they're complex in different directions. C++ has complexities Rust doesn't because it bends over backwards for source-level compatibility. A lot of it is entirely at the semantic and pragmatic level. Rust's complexities are mostly due to its type system, meaning the complexity is at the syntactic level. I had never seen a computer crash because an IDE was trying to figure out how to parse a program before I worked with Rust, for example.
"Too much anime" was not a phrase I expected to see in a compiler bug report.
Internal compiler errors do happen from time to time. They're annoying but usually easy to work around. I've had projects where I just cannot use rust-analyzer because it never finishes. It just eats RAM without accomplishing anything.
>I had never seen a computer crash because an IDE was trying to figure out how to parse a program before
This happens daily on my Intel MBP in Xcode. In only a ~15k LoC small app, 99% Swift. I’ve had to break up several functions into smaller ones purely because the compiler chokes so easily. They actually have a dedicated error message to the tune of “couldn’t figure out types here, could you break this function up please?”.
But yeah, outside of that I’ve never seen it happen in major languages using major language tooling. Never even saw it in 5 million+ line C/C++/.NET CLR mixed codebases.
> I had never seen a computer crash because an IDE was trying to figure out how to parse a program before I worked with Rust, for example.
C++ is complex enough that the IDE can't really parse much of the program's code in any useful fashion. You're lucky if it can get the right type hints and jump to definition. And even the latter may not be complete.
Contrast with e.g. Java, which makes it easy for the IDEs to get the full picture.
Sure, but in those cases the parser just gives up. It doesn't grow its working set trying harder and harder seemingly forever.
We're talking about C++ and Rust here, so I don't know why you bring up Java. If parsing Rust was as easy as parsing Java you would not see me complaining about it.
Giving up vs. crashing is a trivial difference, ultimately boiling down to an extra terminating condition connected to real-world resource use. Either way, the parsing is useless.
I brought up Java as an example of what it means for the IDE parsing to work and be useful.
What kind of IDE are you working in that will lose your work when it crashes?
I don't know what's going on in the Rust world, but in C++ world, even the good ol' Visual C++ (2017, 2019), when it crashes (which it does surprisingly often on my work's codebase), it doesn't lose anything other than maybe unsaved edits. It's annoying, sure, but 30 seconds later it's back to where it was before the crash.
Also, a non-working parser is not a trivial inconvenience. It just means the tool doesn't work. From the POV of wanting to use advanced functionality that relies on parsing the code, it doesn't matter whether the tool aborts its execution so it doesn't crash, or has to be disabled because it doesn't abort its execution and just crashes. The end result is the same: I can't use the advanced functionality.
What I said was that the computer crashed. The IDE used so much memory that it took the system with it. When it came back up something weird had happened to the environment and it was ignoring the stuff in .bashrc.
>Also, a non-working parser is not a trivial inconvenience. [...] The end result is the same: I can't use the advanced functionality.
Yeah. Now compare "the IDE is just working as a glorified text editor" to what I'm describing above.
I'm sorry for misunderstanding your earlier comment, and thank you for clarifying. I can see how this is a much more serious problem.
However.
That sounds to me less like an IDE problem, and more like a Linux problem. Specifically, the problem with the... unique way a typical Linux system handles OOM state, i.e. by suddenly curling into a ball and becoming completely unresponsive until you power-cycle it. I've hit that a couple times in the past, and my solutions were, in order:
- Monitoring the memory usage and killing the offending process (a graph database runtime) before it OOMs the system;
- After becoming tired of the constant vigilance, quadrupling the amount of RAM available to the OS (yes, this is why people overprovision their machines relative to what naive economists or operations people may think);
- Changing the job, and switching to working on Windows; WSL may not be 100% like a real Linux, but it inherits sane OOM handling from its host platform.
I'm sure there is a way in Linux to set a memory quota for the offending IDE process. This would hopefully reduce your problem to the more benign (if annoying) case I described earlier.
I actually run Windows mainly. This was inside a Hyper-V VM with 32 GiB of RAM. I'd like to be able to work on this project from Windows, but unfortunately I can't, and don't have the energy or inclination to figure out how to get it building on Windows. I already knew rust-analyzer had this problem, which is partly why I allocated so much memory for the VM. Unfortunately I triggered a rust-analyzer rebuild just as I was already building another codebase in a different directory. That's what brought it over the edge.
While I agree that Linux sucks at handling this particular situation, my point was about Rust's complexity. Normally, when you're using C/C++ dependencies, the IDE doesn't need to inspect the internals of the libraries to figure out how to parse dependent code. And it's also true that rust-analyzer doesn't know how to stop trying. It will parse what you give it, even if it kills your machine.
I had emacs crash (well, lock-up) while trying to parse the error messages from some extreme C++ metaprogramming. It was at least a decade ago and both emacs and C++ have improved, but still...
edit: mind, in the same code base (10M+ loc), Visual Studio would crash while attempting to initialize Intellisense.
C++ precedes Rust by ~30 years. Just wait how large the Rust surface area might have become in 2053. There’s lessons to be learned from history, so hopefully less, but about every successful language so far has only kept growing and becoming more complex.
Indeed it's already grown `async` and `?` since 1.0. But I love those features, they're a good thing.
In Rust, the compiler still checks everything. I can't mis-use async or the question mark operator and accidentally make my code unsafe.
In C++, I'm expected to know everything about every feature I use, so I have to be paranoid. Sure, I remember that unique_ptr isn't atomic, but do my juniors remember that? Sure, returning a reference to a local variable is a warning, but it's not an error, right? Many less-healthy teams probably ignore those warnings. And I myself don't even remember the rules for SFINAE or `move` in full. Not to mention that OpenCV's mix of shallow and deep copying for matrices casually breaks `const`.
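To spell out that one footgun: the following compiles everywhere, and on typical settings it's only a warning (e.g. GCC's -Wreturn-local-addr, Clang's -Wreturn-stack-address), not an error:

```cpp
#include <iostream>

// Returning a reference to a local: the compiler warns, but will still
// happily produce a binary unless you add -Werror.
const int& broken() {
    int local = 42;
    return local;  // 'local' dies here; the reference dangles
}

int main() {
    const int& r = broken();
    std::cout << r << '\n';  // undefined behavior: reads a dead stack slot
}
```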
In Rust, more surface area is just more Lego bricks. If two Legos snap together, you're safe. If they don't, don't force them. C++ expects you to force everything and take responsibility for not being a human encyclopedia of the language.
Almost every feature is a good thing. But the total quantity can become a bad thing, because the interactions between the features become more and more complicated. It's almost like features have values proportional to the number of features, but they have costs proportional to the square of the number of features. You eventually reach the point where new features add more cost than they add value.
But it's not that simple, because each new feature adds value to a subset, but adds costs to everyone. If the subset is vocal, they often get what they want, even if it's a net loss for all users taken as a whole.
So the trick is, first, to stop adding features once the costs outweigh the benefits, and second, given that you have only a finite number of features that you can add before you reach that point, to add the total set of features that are going to make the most valuable language.
Separate compilation means that even if you write an interprocedural static analysis system (quite a bit more complex than what most people would call a linter), you still run into oodles of hard boundaries where you can't look into other functions. Fixing this requires explicit annotations all over the place to even have a chance.
C++ is also a remarkably complex language in terms of aliasing relationships. There are a bazillion ways you can make two names alias each other. This gives you a few options when writing an analysis. You can be sound and do weak updates everywhere, which means your alarms will basically never be confident. You can be unsound and assume no aliasing, which means that you've got a lot of false alarms. You also can't really make a rule "don't ever create aliasing relationships" because they are often idiomatic C++.
And finally, the key properties that people really deeply care about like heap lifetimes fundamentally involve complex reasoning about both temporal properties, dataflow, and heap shape. All of these are hairy static analysis problems. In combination, a nightmare.
It is possible to use C++ with a linter to help detect errors, but linters do not catch all errors nor do linters always understand what you are trying to do.
Most approaches to using C or C++ safely involve throwing out large portions of the language and disallowing features that are easy to misuse or problematic to analyze for safety.
Maybe. But my employer doesn't have those tools. Assuming we're a median team, then half of all c++ teams don't do any static analysis or even know how to use valgrind properly. (The only tools that looked useful to me were the ones that incurred a thousand x slowdown, and our code is too crappy to run correctly at different speeds)
Rust has the advantage of seeing the mistakes of the past and not making them. Many interesting ideas have been tried; only after significant use do we discover which are good and which are bad.
Compromise is sometimes needed. C++ had some ideas they knew at the time were bad, but backward compatibility forced it and backward compatibility is itself a great idea worth the costs.
Rust will have plenty of time to invent whole new categories of mistake, if it ever catches on. It started out with a raft of old familiar mistakes, and shed them over the years leading up to the 1.0 release, such as non-contiguous stacks and green threads. Maybe the way async is specified will turn out to have been one of the mistakes. It has, anyway, mechanisms to shed old mistakes that are not relied on much.
One of those history lessons is that change happens. This is why Rust has editions: allows for new features without breaking old code. There is still code complexity, but that complexity has been shifted/amortized into the compiler suite instead of everyone's project code.
Crucially, editions allow for deprecation, which is a trick C++ always had trouble with no matter how outdated the language construct.
I think the big problem with C++ was the “C” in it, that is maintaining compatibility with C (or sort of compatibility). Rust didn’t make this choice and it’s a completely new and different language.
Bad choices in C++ won't ever change, since fixing them would break compatibility with a ton of stuff. Rust has the concept of "editions", which allows migrating to new language versions gradually.
>> Just wait how large the Rust surface area might have become in 2053.
This is a valid concern. One can hope that Rust evolves very slowly and as-needed. IMHO part of the problem with C++ is the fact that a committee exists to advance the language and produce regular updates. Combine that with most of the language already being defined (with a lot of overlap with C BTW) and you get a lot of bolted-on stuff and core features as part of the standard library that might have otherwise been part of the language with nice syntax. Rust had the advantage that a lot of things had been learned prior to its creation so things are cleaner. Lets hope keeping it that way is a priority and not just adding new things on top of new things - I think they're doing it right, but I don't really follow it.
C is great in this regard. The language is IMHO mostly "done" and rarely changes. I'm happy to use C99 and not much demands newer.
Why do you think it will be abandoned? Programming languages past a certain point don't die easily, especially when used for low-level system programming (otherwise C and C++ would be long dead).
All languages stop evolving at some point. C isn't there yet, but there are many dead languages that people only use because a lot of things are already written in them, and nobody wants to change anything.
So why does that mean Rust will be abandoned in a few decades?
Do you think there will be a trend "back" to using C or C++ for systems programming? I would bet against it. I do believe, by the way, that C has stopped evolving (which is good) and that C++ should stop evolving as well.
Or do you think the replacement of old languages by new ones will accelerate? So in 30 years most systems programming will be done in a language that doesn't exist yet? Maybe not done at all in a "programming language" as we have today?
Or do you think Rust is clearly losing out to some other new languages for systems programming, such as Zig, and will never be popular enough in the first place to enter the "slowly dying legacy" regime?
You don't need a "trend back to using C++". C++ usage is still growing by leaps and bounds. The number of people picking up C++ for professional use in any short interval (it used to be two weeks, now a bit longer) is more than the total number who are coding Rust in production. That will be true for a long time.
I've worked a lot with C++, and a small to moderate amount with Rust. I tend to prefer Rust when given the choice.
Comparing the overall complexity levels is something of a category error, though. Most of the complexity of Rust is in the core functionality, idioms, and conventions of the language. You'll need to grapple with most of that complexity very early.
Most of the complexity of C++ is in the various functionality that was either inherited from C or accumulated over the decades after that. Most individual pieces of software don't use all of that. E.g. approximately nothing will use both va_list and variadic templates (ok, maybe indirectly through libraries, but not in a way the direct author needs to think about). The latter is just a better way of accomplishing what the former does. There are lots of variations on this theme.
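To illustrate the theme with that exact pair, here is a toy sum function written both ways (any given codebase realistically only ever needs one of them):

```cpp
#include <cstdarg>

// C-inherited va_list: type-unsafe, and the caller must supply the count.
int sum_c(int count, ...) {
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i)
        total += va_arg(args, int);  // wrong type or count here is UB
    va_end(args);
    return total;
}

// C++11 variadic template: type-checked at compile time, no count needed.
template <class... Ts>
int sum_cpp(Ts... xs) {
    return (0 + ... + xs);  // C++17 fold expression
}

int main() {
    int a = sum_c(3, 1, 2, 3);  // caller does the bookkeeping
    int b = sum_cpp(1, 2, 3);   // compiler does the bookkeeping
    return a - b;               // 0
}
```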
My sense is that, in practice, Rust has a steeper learning curve than C++. I find it more productive now that I'm pretty familiar with it, so I think it's worth that steeper learning curve. I still think it's a bit steeper.
This might be influenced by the fact that I've used C/C++ for much longer, but my impression is that Rust is an even larger language with more details to learn than C++. The difference is that in Rust if you get the details wrong (in safe code) you get a compiler error or at worst a logic bug, while in C++ sometimes you get a compiler error and sometimes you get memory corruption or integer overflows, or undefined behavior.
Expanding on this, the general concepts you need to understand for both are about the same, but because Rust enforces them in the compiler, you have to learn the detailed rules of how the language enforces ownership and lifetimes and such which is more detailed/complicated/restrictive than the concepts themselves.
Furthermore, some of the Rust language details cause library APIs to become more involved than they would be in C++. An example in the std library is that in C++ handling errors and function results are orthogonal features. With sum types they are intertwined, and the Rust community is more liberal with adding convenience functions and syntax, so you end up with a combinatorial explosion of all the different ways you want to handle the error combined with all the ways you want to handle the result, with about 40 methods each on Result and Option, plus methods for handling errors in iterators (functional streams). Lifetimes and async can also complicate crate APIs in ways that don't exist in C++. None of them are difficult on their own, but the sheer volume of things you need to learn and remember makes me appreciate the minimal Go philosophy.
On the flip side, the places where C++ gets more complex are all the little bad decisions that can't be fixed for backwards compatibility like the stupid numeric conversion rules inherited from C which can easily bite you. And both the C and C++ string/stream libraries suck in their own ways.
> This might be influenced by the fact that I've used C/C++ for much longer, but my impression is that Rust is an even larger language with more details to learn than C++. The difference is that in Rust if you get the details wrong (in safe code) you get a compiler error or at worst a logic bug, while in C++ sometimes you get a compiler error and sometimes you get memory corruption or integer overflows, or undefined behavior.
I disagree. Rust seems less complicated in many ways. For example, move semantics are a lot simpler; they're always a byte copy, whereas with C++, you have to remember rvalue references, lvalue references, and universal/forwarding references (which are usually rvalue references but sometimes are lvalue references). You also have to be careful not to mess with a moved-from object, as it's in an unspecified state.
C++ also makes a distinction between trivially copyable types and non-trivially copyable types (a distinction Rust doesn't make). It's difficult to remember the rules, but they boil down to "if it has a user-provided copy or move constructor, destructor, or copy/move assignment operator, or if it has virtual functions, then it's not trivially copyable."
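At least those rules can be checked rather than memorized; a quick sketch with the standard type trait (the toy types are mine):

```cpp
#include <string>
#include <type_traits>

struct Pod        { int x; double y; };          // no user-provided special members
struct HasDtor    { ~HasDtor() {} int x; };      // user-provided destructor
struct HasVirtual { virtual void f() {} int x; };

static_assert(std::is_trivially_copyable_v<Pod>);
static_assert(!std::is_trivially_copyable_v<HasDtor>);
static_assert(!std::is_trivially_copyable_v<HasVirtual>);
static_assert(!std::is_trivially_copyable_v<std::string>);

int main() {}  // trivially copyable types may be copied around with memcpy
```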
In Rust, all you have to remember is that types are by default moveable (and moves are a memcpy), deep copies are implemented through the Clone trait, and if your type can't be memcpy'd (e.g., a self-referential struct) it needs to only be accessible through Pin pointers, which ensure that it isn't memcpy'd in safe code.
Another thing: you can cause a use-after-free error by combining coroutines with lambdas [1]. An error like this only happens because C++'s rules around coroutines and lambdas are complicated enough that the committee didn't forsee this happening. This seems indicative of higher complexity in C++ than Rust, to me.
On the other hand, I find it annoying in Rust that an assignment or an unadorned function parameter might result in a move or a copy, and you can't tell from the function signature or call site. Instead it depends on whether the type implements the Copy trait, which is a big semantic difference based on "spooky action at a distance".
> You also have to be careful not to mess with a moved-from object, as it's in an unspecified state.
This is a great example of Rust being easier, even when it is just as complex. Neither language wants you to use an object after it has been moved from, so it is the same amount to learn and to think about when writing code, but Rust will give you a helpful compile-time error while C++ will let you blow your foot off.
>You also have to be careful not to mess with a moved-from object, as it's in an unspecified state.
This is incorrect, somewhat. It's "unspecified" in the sense that the standard doesn't mandate that user-defined move constructors and assignment operators leave source objects in any particular state. Moved-from standard library objects are left in a "valid but unspecified" state, and a few are specified exactly (a moved-from std::unique_ptr is guaranteed to be null); you can choose to define your own classes to give stronger guarantees. The usual rule of thumb is that a moved-from object should be in the same state as if it had just been default-constructed, or at minimum be safe to destroy and assign to.
Move semantics in C++ and in Rust are more or less equivalent. The major differences are that C++ copies by default and rust moves by default, and that Rust doesn't allow using a moved object while C++ does.
>In Rust, all you have to remember is that types are by default moveable (and moves are a memcpy), deep copies are implemented through the Clone trait, and if your type can't be memcpy'd (e.g., a self-referential struct) it needs to only be accessible through Pin pointers, which ensure that it isn't memcpy'd in safe code.
If defining value semantics was so simple you wouldn't have had to bring up Pin pointers (which I assume are not just pointers, but something special that needs to be kept in mind), or the fact that Rust understands that there are two different kinds of code, which C++ doesn't distinguish. It's suddenly so obvious that one is simpler than the other.
>Another thing: you can cause a use-after-free error by combining coroutines with lambdas [1]. An error like this only happens because C++'s rules around coroutines and lambdas are complicated enough that the committee didn't foresee this happening. This seems indicative of higher complexity in C++ than Rust, to me.
It's trivially easy to cause use-after-free errors with lambdas. Return an std::function<void()> that captures and reads a local std::unique_ptr by reference and then call operator()() on the object. This is not a complex interplay between features; lambdas necessarily introduce dynamic lifetimes into a language that was originally not designed to support them, so using them requires care.
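A minimal sketch of that bug (the function name is made up for illustration):

    #include <functional>
    #include <memory>

    std::function<void()> make_callback() {
        auto p = std::make_unique<int>(42);
        return [&p] { ++*p; };  // captures the local unique_ptr by reference
    }                           // p is destroyed here; the capture dangles

    int main() {
        auto f = make_callback();
        f();                    // use-after-free
    }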
> Move semantics in C++ and in Rust are more or less equivalent.
That's not my experience at all. The big annoyance for me with C++ move semantics is that it forces me to allow an invalid or default state (representing a moved-from object), which subverts a major premise of RAII: non-default constructors establish class invariants which are maintained for the lifetime of the object, so that all other code can assume those invariants. There should be no such thing as an invalid or partially initialized object. When I'm forced to allow an invalid state to support move semantics, all my code has to either check for this invalid state (if it's logically possible) or assert that it's not present (if it's not logically possible). That's a major source of gratuitous complexity that simply isn't inherent to move semantics, as Rust demonstrates. (The invalid state is necessary because there's no way to prevent destructors from running in C++, so a destructor needs some way to know that an object isn't properly initialized.)
C++ move semantics have brought us back to the bad old days of checking isInitialized flags and calling initialize() methods, which is what non-default constructors were supposed to solve.
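A hedged sketch of the pattern being criticized, with an invented Connection class: to support moves, the type has to admit an empty state, and the destructor and every operation now have to account for it.

    #include <cassert>
    #include <unistd.h>
    #include <utility>

    class Connection {
        int fd_ = -1;  // -1 doubles as the moved-from / "not initialized" state
    public:
        explicit Connection(int fd) : fd_(fd) {}
        Connection(Connection&& o) noexcept : fd_(std::exchange(o.fd_, -1)) {}
        ~Connection() { if (fd_ != -1) ::close(fd_); }  // destructor must check
        void send_byte(char c) {
            assert(fd_ != -1);                          // every operation must assert
            (void)::write(fd_, &c, 1);
        }
    };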
If you find yourself doing if (!initialized) initialize(); that's a sign that you should have just called initialize() on the moved object while still inside the move constructor, if initialize doesn't need any additional parameters. If there's no way to construct or initialize the class in a default state (e.g. something equivalent to an empty std::string) with no additional parameters, it's probable that the class shouldn't have been movable, and instead the object should have been wrapped either in std::unique_ptr or std::optional. Not every class needs to be movable.
Again, this has nothing to do with C++'s move semantics and everything to do with how you define your object's state transformations.
Let me make it a bit more concrete. I have a hazard pointer class, where the constructor registers the provided pointer for GC protection, and the destructor removes GC protection. I would like to be able to dereference this hazard pointer object freely, without doing null checks everywhere. RAII is the perfect fit for these semantics: the constructor establishes the invariant (GC-protected non-null pointer) and all other code can assume the invariant. Until I needed move semantics, that is.
In the constructor of a class with a hazard pointer member I needed to be able to initialize a hazard pointer on the stack and then move it into the member variable. (Because the hazard pointer constructor is fallible, I needed to catch exceptions thrown from the hazard pointer's constructor and retry from within the containing class's constructor, so I couldn't just use an initializer list.) In order to support move semantics, I had to give up the invariant that any hazard pointer instance is properly initialized (I needed to use a null pointer to represent the invalid state). That complicated all the clients, which now had to either check for or assert against the invalid state.
None of these gymnastics would have been necessary in Rust. Sure, it doesn't have constructors, but it's easy enough to write a factory method that establishes constructor invariants, and then you know that any object returned from that factory method will satisfy those invariants for the entire lifetime of the object. Since it is impossible to accidentally use a moved-from object (unlike C++), there is no need to introduce an invalid state to prevent misuse. I could just freely dereference my hazard pointers, with no checks or asserts necessary.
It seems like the obvious answer is to have a nullable_hazard_ptr and a hazard_ptr, where hazard_ptr wraps a nullable_hazard_ptr. nullable_hazard_ptr is movable and default-constructible, while hazard_ptr can only be constructed with arguments and cannot be moved, but can be constructed from a nullable_hazard_ptr &&.
So if you need to return a hazard pointer from a function, you return a nullable_hazard_ptr and the caller can choose to assign that value to auto or to hazard_ptr. In the latter case, the caller will have the guarantee that the object is valid, because if the function returned a null pointer the constructor will have thrown an exception. Furthermore the pointer will remain valid until it goes out of scope, because there's nothing that can be done to it to make it invalid (UB notwithstanding). Of course, anyone who chooses to use nullable_hazard_ptr will need to check for validity.
Unfortunately this does mean that it's the responsibility of the callers to choose the right pointer.
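A rough sketch of that two-type design, with all names invented and the actual GC registration elided:

    #include <stdexcept>
    #include <utility>

    template <class T>
    class nullable_hazard_ptr {
        T* p_ = nullptr;
    public:
        nullable_hazard_ptr() = default;               // default-constructible
        explicit nullable_hazard_ptr(T* p) : p_(p) {}  // would register GC protection
        nullable_hazard_ptr(nullable_hazard_ptr&& o) noexcept
            : p_(std::exchange(o.p_, nullptr)) {}      // movable
        ~nullable_hazard_ptr() {}                      // would unregister if p_ != nullptr
        T* get() const { return p_; }                  // callers must null-check
    };

    template <class T>
    class hazard_ptr {
        nullable_hazard_ptr<T> inner_;
    public:
        explicit hazard_ptr(nullable_hazard_ptr<T>&& n) : inner_(std::move(n)) {
            if (!inner_.get()) throw std::invalid_argument("null hazard pointer");
        }
        hazard_ptr(hazard_ptr&&) = delete;             // never movable, so never invalid
        T& operator*() const { return *inner_.get(); } // no null checks needed
    };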
>In the constructor of a class with a hazard pointer member I needed to be able to initialize a hazard pointer on the stack and then move it into the member variable. (Because the hazard pointer constructor is fallible, I needed to catch exceptions thrown from the hazard pointer's constructor and retry from within the containing class's constructor, so I couldn't just use an initializer list.)
This particular case would be handled by calling a helper function in the constructor's initialization list for each hazard_ptr member. As I said, this function should return nullable_hazard_ptr (always non-null; you will have already ensured this inside the function. You still need the nullable type because it's the only one that can be moved).
Ultimately what you have is something analogous to std::lock_guard and std::unique_lock. You are acquiring and releasing a resource, and in some cases you need to tie the acquisition to the program structure, while in other cases you need to be able to untie it. There's no way to specify that in C++'s type system other than by having two separate types.
Here's a much simpler example of something impossible in C++ and trivial in Rust: how about a non-nullable unique_ptr? The constructor should just be able to check for null and then no code need ever check for null again, right? Sorry, you need an invalid state to represent a moved-from instance, so this is impossible.
Are you telling me that having to accommodate invalid states that are semantically both unnecessary and undesirable is not a serious limitation of C++ move semantics?
Following the previous example, you wrap std::unique_ptr in another class that has no move constructor and forwards constructor parameters to std::make_unique(), and can also be constructed from an std::unique_ptr. Now you have a heap-allocated smart pointer class that can't possibly be null.
Alternatively, you make it movable, and if someone tries to call operator*(), operator->(), or get() on the null value, you throw an exception. Not as clean, but, hey, it's safe.
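A sketch of the first variant under those assumptions (the name non_null_unique is made up):

    #include <memory>
    #include <stdexcept>
    #include <utility>

    template <class T>
    class non_null_unique {
        std::unique_ptr<T> p_;
    public:
        template <class... Args>
        explicit non_null_unique(Args&&... args)
            : p_(std::make_unique<T>(std::forward<Args>(args)...)) {}
        // adopt an existing pointer (pass it with std::move)
        explicit non_null_unique(std::unique_ptr<T> p) : p_(std::move(p)) {
            if (!p_) throw std::invalid_argument("null pointer");
        }
        non_null_unique(non_null_unique&&) = delete;  // no moved-from state to represent
        T& operator*() const { return *p_; }          // never null, no checks needed
        T* operator->() const { return p_.get(); }
    };

The null check in the adopting constructor preserves the invariant, so nothing downstream ever has to check again.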
>If defining value semantics was so simple you wouldn't have had to bring up Pin pointers (which I assume are not just pointers, but something special that needs to be kept in mind) [...]
Pin<P> is basically a wrapper for any pointer type P, whether it's Box<T>, &mut T, &T, etc. Its sole purpose is to keep the object that's pointed at from being moved, byte-copied, or byte-swapped. For self-referential structs, that's all you can do; there's no equivalent to move constructors. See this link at the bottom [1].
> This is incorrect, somewhat. [...]
You're right, but I really just meant that you have to keep track of more. If you use a moved-from unique_ptr, for example, you'll dereference a null pointer. Sometimes you do want to use a moved-from object, though, so it's not like this can be disallowed. It's something you have to keep track of. I just think this is more complex than Rust's unconditional rule that moves are bytewise copies and moved-from objects aren't allowed to be used. I've heard C++ programmers complain about Rust being less powerful than C++ in this respect, but it is simpler.
> It's trivially easy to cause use-after-free errors with lambdas.
I don't fully understand it, but from what's described at the link, this causes the use-after-free bug:
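(If I'm reading the linked description right, the buggy version boils down to touching the capture after the suspension point, something like:)

    [x] () -> future<T> {
        co_await something();
        co_return x;   // x lives in the lambda object, which may already be destroyed
    }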
The fix is to write it like this, I'm pretty sure:
    [x] () -> future<T> {
        auto xx = x;            // copy the capture into the coroutine frame first
        co_await something();   // the lambda object may be destroyed during suspension
        co_return xx;           // safe: xx lives in the coroutine frame, not the lambda
    }
The reason is that coroutines copy their parameters into their own frames, but a lambda's captures live in the lambda object, which is only passed by address, so the coroutine later dereferences a dangling pointer. Again, it just seems more complex than anything in Rust.
>If you use a moved-from unique_ptr, for example, you'll dereference a null pointer.
You can still check if the pointer is null first, which would be UB if the object was simply in an undefined state.
>It's something you have to keep track of.
You don't have to keep track of it, you just have to design your classes such that moving a value leaves the source in a usable state. At the point of use, moving is no different from any other operation on the object. There's no intrinsic difference between moving an std::unique_ptr and reset()ing it. You do have to keep in mind the possible values of the object as you operate on it, but this has nothing to do with move semantics. In any language, you wouldn't want to attempt to access the 10th element of a list after clearing the list.
Now, if you design your class such that the moved-from state is invalid and distinct from any state the object could reach by any other means, then yes, you will need to treat moves on that type differently. However, that's a problem you created for yourself.
>The reason is because coroutines copy their input to their own stack frames, but lambdas are passed by address, and so the coroutine later dereferences a dangling pointer.
Yes, like I said, lambdas decouple lifetime from lexical scope. You have to use them carefully to avoid running into issues. This is not because C++ lambdas are a more complex feature than Rust lambdas. The opposite is true: Rust lambdas are more difficult to use incorrectly because Rust's type system is more complex and can keep closer track of the lifetimes of objects.
Dramatically more so. Rust is complex, but most (not all!) of the complexity is there to support a specific, modern, safe style of programming.
C++ adds fifty years of cruft to that.
A lot can be said about the surrounding social environment, but as far as the languages go, I don't think it's far wrong to say that Rust is "C++: The good bits".
Yeah I guess I'd phrase it as 3 kinds of "complexity"
Rust is complex because the compiler has many ways to say "No I won't compile that, it might be wrong".
C++ is complex because it requires the programmer to understand every feature used in the codebase. (e.g. Will the compiler warn me if I use inheritance and dynamic dispatch wrong? I'm not sure. I find code with inheritance hard to read.)
Go _code_ is complex because the language is too simple. (if err != nil is waterbed complexity)
It does, which is where the social aspect comes in: You're expected to make every reasonable effort not to use it, even if that comes at slight costs in performance, because practically speaking people are terrible at making those judgements.
Sometimes works well. Sometimes, unfortunately, it leads to bullying.
I always pictured that in a professional setting, engineering management would have an edict that `unsafe` is simply not allowed, or possibly, they allow it with extensive code review by multiple engineers.
And that there would also be a lot of people trying to implement a double-linked list and finding themselves having to sprinkle "unsafe" all over the place to satisfy the borrow checker.
Honestly, there's a "lot" of extra stuff, but the standards committee has a general principle that things that can be done in a library don't need to be supported at the language level, meaning 90% of the new stuff is going to be invisible to you except in making the language more ergonomic to use.
An easy example is ranged-for. It depends on a lot of internal complexity that the end user basically never sees. All they'll see is that as long as begin and end are defined for a container you can just `for (auto item : container)` it. Stuff like overload resolution is essential to it, but you don't need to know the overload resolution rules to use ranged-for.
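A minimal sketch (IntBox is invented for the example):

    #include <iostream>

    struct IntBox {
        int data[3] = {1, 2, 3};
        int* begin() { return data; }      // these two members are all
        int* end() { return data + 3; }    // ranged-for needs
    };

    int main() {
        IntBox box;
        for (auto item : box)              // rewritten by the compiler into
            std::cout << item << '\n';     // begin()/end() iterator calls
    }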
Or the way initializer lists make initialization so much simpler. The way you can leave constructor writing to the compiler for so many different types of constructors.
How the compiler handles elision so well because of how the constructors are designed. You don't need to write a single rvalue reference move constructor for simple types. They're generated for you.
The current ongoing push to make large parts of the standard library constexpr means you can have seriously complex things going on directly at compile time, with constexpr function calls simply collapsing into equivalent constants in the output.
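For instance, assuming a C++20 compiler, a computation like this folds away entirely at compile time:

    #include <array>
    #include <numeric>

    constexpr int sum123() {
        std::array<int, 3> a{1, 2, 3};
        return std::accumulate(a.begin(), a.end(), 0);  // constexpr since C++20
    }
    static_assert(sum123() == 6);  // evaluated by the compiler, zero runtime cost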
Like, the reality is, it is pretty much a core principle of the language that you can write C++2003 if you really want to, and everything on top of it just makes writing things easier. But if you want to piecewise substitute your code with the latest stuff it more often than not becomes terser and more straightforward because of the evolution of the std library.
I’ve got over a decade in C++ and I won’t use it unless forced (i.e., paid more money than I can reasonably refuse). It’s only gotten worse and worse. I would (and have) used C before I’d use C++, and now I’d use Rust before either of them.
C++ lets you write anything you can imagine, and the language features and standard library often facilitate that. The committee espouses the view that they want to provide many "zero [runtime] cost" abstractions. Anybody can contribute to the language, although the committee process is often slow and can be political; with each release the surface area and capability of the language get larger.
I believe Hazard Pointers are slated for C++26, and these will add a form of "free later, but not quite garbage collection" to the language. There was a talk this year about using hazard pointers to implement a much faster std::shared_ptr.
It's a language with incredible depth because so many different paradigms have been implemented in it, but also has many pitfalls for new and old users because there are many different ways of solving the same problem.
I feel that in C++, more than any other language, you need to know the actual implementation under the hood to use it effectively. This means knowing not just what the language specifies, but occasionally also what GCC or Clang generate on your particular hardware.
There are Java implementations written in Java, like Jikes RVM.
Garbage collected languages can be easily bootstrapped; it is a matter of what intrinsics are available, and what mechanisms are available beyond plain heap allocation.
Oberon, Go, D, Nim, Modula-3, Cedar are some examples.
Apples to oranges. Rust's borrow system is something you couldn't implement in C++, meanwhile C++ has far better allocator and compile-time support and probably more features in total (things like concepts, intrinsic bitfields, etc.).
Importantly (afaik), Rust has far fewer features which are deprecated and/or in the specification but barely implemented.
Arguably, Rust is “just” a C++ subset with RAII made into a compile-time guaranteed primitive, and the only possible way to manage objects.
Add to that that it is a new language designed with the hindsight of C++’s warts, and with no backwards-compatibility burden from C or from its own existing syntax, and it would be very surprising to see Rust turn out more complex.
There are many, many, many things that are regular classroom exercises in C++ that are extremely difficult or impossible to do in Rust. Rust programmers tend to just pretend that anything Rust doesn’t do well is an antipattern. Making the simple case simpler and the complex case way more complex is not desirable for most people.
I've used both professionally (Rust for the last 4 years) and I 100% agree. Rust, even with stuff like async, is a much, much, much simpler language than C++.
> When you read above about std::pointer_safety, did you understand the difference between relaxed and preferred? If you did, please explain in the comments section.
[Since I don’t want to sign up with Disqus, I’m commenting here.]
According to https://cplusplus.github.io/LWG/issue1098, “pointer_safety::preferred might be returned to indicate to the program that a leak detector is running so that the program can avoid spurious leak reports”.
This means that a program which uses different program logic depending on whether a GC is running or not, can choose to opt for the non-GC logic when the GC is running with a leak detector.
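Presumably the check would have looked something like this, using the C++11 API that's now being removed:

    #include <memory>

    bool leak_detector_may_be_running() {
        // preferred = "pointers should stay visible as pointers", e.g. because
        // a leak detector (or GC) is scanning for them; relaxed = no such need.
        return std::get_pointer_safety() == std::pointer_safety::preferred;
    }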
Who would write two separate but equivalent codebases, one which manages memory with RAII and another that assumes a GC, just to use that value for something? If you're already not assuming a GC is present, then why write everything a second time assuming one is?
This is probably aimed at libraries designed to be compatible with both kinds of environments. A library might implement some localized alternative logic to maximize efficiency in both cases. C++ is all about maximum runtime efficiency while allowing libraries to provide comprehensive compile-time abstractions.
RAII requires allocation patterns to coincide with lexical scopes, or to use reference counting, which is usually more expensive than GC. If RAII were all you need, no one would be using GC in the first place.
RAII ties resource management to lexical scope but doesn't strictly "require" it. Reference counting vs. GC cost varies by context; neither is universally "more expensive." Both RAII and GC have their merits; their use depends on the language and application needs, not on one being strictly superior.
You would usually do it for performance rather than RAII correctness.
For example, without GC, your library might use RAII with std::shared_ptr, for certain objects that outlive lexical scope. That is, reference counts tied to pointer copies and destructors. With GC, your library might omit the reference counts when deferred object destruction is ok, saving a little space and time.
If your library is dealing with something like a graph (nodes connected by pointers) which may have cycles, std::shared_ptr will not be enough to free all objects as they become unused, but the environment GC will do it. Therefore, without an environment GC, your graph-using library will need its own cycle detector, or if usage permits, something with arenas or std::weak_ptr. With GC, your graph-using library can omit that code entirely.
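A minimal sketch of the cycle problem:

    #include <memory>

    struct Node {
        std::shared_ptr<Node> next;  // a std::weak_ptr here would break the cycle
    };

    int main() {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;  // reference cycle
    }  // a and b go out of scope, but each count stays at 1: both Nodes leak
       // (a tracing GC would collect them; weak_ptr or an arena avoids this)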
There are occasions when you'd want to add code for correctness when using a GC, to release objects by clearing pointers to them in some circumstances where it's safe to leave the pointer uncleared (and possibly invalid but harmless) without GC.
There are several ways I can think of. For example, a library that stores application-provided elements (e.g. a graph library where nodes/edges can carry application-specific data) can support by-value RAII elements as well as by-reference GC’d elements and by-reference library-memory-managed elements (using new/delete). As another example, the destructor of a library-provided class might have to free a complex internal heap-based substructure, a potentially expensive operation (having to iterate over the substructure) that it can skip (only) when running under a GC.
That's kinda the point: you can't skip it with RAII, unless you avoid exiting a scope (longjmp?).
I suppose you can have two versions of the class, and instantiate the leaky one only if you have GC, but that seems like work that the GC should free you from. It seems better to optimize the memory allocations, perhaps with an arena.
Not sure what you mean. My point is, for the case of no GC, you have to provide a destructor, even if the work done by the destructor only consists of freeing memory. With GC, calling the destructor could be omitted if it only frees memory. Client code however doesn’t know what a destructor does internally, so always has to call it even under a GC. The implementation of the destructor, however, can check whether it is running under a GC, and can then skip the freeing.
Edit due to your edit: An arena allocator isn’t always convenient, for example when you have multiple objects with different lifetimes that move and/or partially share internal elements between them, so the element lifetimes are largely orthogonal to the containing object’s lifetimes.
It never suited the use cases of Unreal C++, C++/CLI, or COM/WinRT. With the three major C++ implementations of automatic memory management ignoring it, it's no wonder it was standard dead weight that should be removed.
That is what happens when features are added without taking into consideration existing use cases.
Do you know the history of its inception? I feel that would be a good read. Like, what use cases did the original authors have in mind, how did they convince others to accept a GC in the C++ spec, etc.
I like these kinds of development stories; they're often a weird mix of social and technical challenges from a specific time period.
* The "Kona compromise" of 2007, where a few features were stripped down to try to be able to get a new version of C++ shipped by the end of 2008 (which turned out to slip to 2011 anyways). This meant that GC went from a full-bore proposal to a barebones proposal.
* https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n24... is the first proposal of barebones GC support. As you can see, it's stripped down a lot from the original GC proposal. What remains is largely "it's UB to have a pointer not be visible as a pointer address, except if you use new library functionality to declare your pointer shenanigans."
* https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p21... proposes, in 2020, to remove the features added by N2670 on the basis of no real implementation or real-world support. The discussion in EWG indicates favor for removal (only one person objected), while LEWG indicates even stronger favor (nobody objects).
Unity/Unreal: devs who work with both engines complain about the GC; it was never something that made them pick the engine specifically.
When you dig into projects, you realize that most of the time people do extra work to work around the GC (pooling, arena allocation)
GC is nice to have but is _very_ task specific
Anyone who sells a GC as a universal solution is clueless and is dangerous
At 120fps, your frame budget is only 8ms, you don't have time to waste on GC related tasks, in fact, you don't want to do any kind of heap allocation or anything that might introduce any form of hiccups during gameplay (no JIT as well)
Likewise, people tend to use the GC as a scapegoat for not writing proper code in the first place, like not using the language's features for allocation-free code.
Those are the same folks that do memory allocation in C during frame rendering, with the compiler-provided malloc()/free() that are lousy for multithreaded game code, while complaining about the GC.
In summary: complaining is only allowed from those who actually master the toolbox.
Unity doesn’t use a fancy GC, though. Also, that is more than likely just bad game logic on their part; the engine itself doesn’t use GC, and most of the objects are not supposed to be on the heap.
I would argue that not using a GC is very task-specific. Besides some audio systems, I really have a hard time thinking of areas where my go-to wouldn’t be a managed language.
This is in part due to lots of objects having strong references from JS objects, and the complexity in dealing with already having a GC with reference counting techniques.
Additionally, it's much harder to introduce UAFs & type confusion bugs, but it also has interesting performance implications (it's not as simple as "GC is slow"; a lot of the time, GC'ing objects can make things faster).
> Garbage collection and related features are being removed from C++23. If you are surprised to learn that the C++ standard had support for GC, you are not alone. It was unimplemented, confusing and pretty useless hence it’s removed.
So the proposal was sponsored by HP, Symantec and Intel. I was expecting something related to Microsoft and its ill-conceived idea of "Managed C++", but it all boils down to: Hans Boehm.
As surprised as the next guy to find this in C++ standard. ...Again: good riddance.
Garbage Collection is not a bad idea, but, my takeaway is that if you try to bolt it onto a language that didn't have it in its core design, you're gonna have a bad time.
It's rather that with RAII (C++, Rust) you don't need a GC.
Of course, you can still create and operate a GC arena in C++, to manage a data structure or problem area where it makes sense. But why build it into the language, library, or runtime if developers don't need it?
I'd also argue that with well-used RAII, programs can be written more reliably than in some GC languages, as you have finer control over the lifetime of system resources like Windows HANDLEs or sockets and can free them on scope exit with a simple wrapper class.
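For example, a minimal sketch of such a wrapper, using a POSIX file descriptor; the same shape works for Windows HANDLEs or sockets:

    #include <unistd.h>

    class UniqueFd {
        int fd_;
    public:
        explicit UniqueFd(int fd) : fd_(fd) {}
        UniqueFd(const UniqueFd&) = delete;           // sole owner
        UniqueFd& operator=(const UniqueFd&) = delete;
        ~UniqueFd() { if (fd_ != -1) ::close(fd_); }  // freed deterministically on scope exit
        int get() const { return fd_; }
    };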
On .NET I'm constantly fiddling with IDisposable and checking each object to see if it needs a 'using' block so I'm not leaking resources.
On long running resource intensive apps, this can make a real difference and I prefer the finer control C++ gives me.
(Well-designed, first-class) GC is still the only system proven to completely remove memory safety issues. Rust comes close, but it's not RAII but the compiler-enforced ownership tracking which allows for this.
C++ is very much proven to be memory unsafe in practice despite the RAII and smart pointers being part of the standard for a decade or more now. Opt-in memory safety basically just means no memory safety.
Of course, this doesn't mean that you can't create memory safe C++ programs. Probably one or two actually exist. But there also exist memory safe C programs and probably somewhere there is some memory safe program written in assembly as well.
> (Well-designed, first-class) GC is still the only system proven to completely remove memory safety issues.
To be pedantic, you need more than a GC for memory safety. Like a language (type system, bounds checks) that forces you to look only inside the memory being managed. :-)
Also, "for free" memory safety as offered by java's GC doesn't automatically guarantee resource safety. Without RAII, you still need to manage you file handles the old-fashioned manual way.
Beyond that, there are safety concerns where the shared memory model managed by the GC bites you. For example: Rust solves threading issues with move semantics and pinning, guiding the programmer away from shared mutable state. Also think of the stop-the-world problems inherent to GCs. They can be life-threatening in real-time systems.
So: yes it's true that it's difficult to write safe code in C++, and many safety bugs are related to memory. But that doesn't mean that code in a GCed language is inherently safer.
> Also, "for free" memory safety as offered by Java's GC doesn't automatically guarantee resource safety. Without RAII, you still need to manage your file handles the old-fashioned manual way
Don’t you think a file handle is a completely different thing from what is normally talked about with memory safety? Can an unclosed handle let you ++ your way into remote code execution? Or let you read uninitialized memory?
Regardless, a GC is about trade-offs. A GC takes care of the 99% case in programming. Having to think about closing a file resource occasionally, especially since javac will warn you when you don’t, is leagues easier than having to navigate the borrow checker all of the time.
The same goes for concurrent programming. 99.9% of all code written in GC’d languages like Java never go over a thread boundary.
> Don’t you think a file handle is a completely different thing from what is normally talked about with memory safety? Can an unclosed handle let you ++ your way into remote code execution? Or let you read uninitialized memory?
Unfreed memory also doesn't let you ++ your way into remote code execution.
Using a file-handle after it's been closed can cause all sorts of issues; and if the underlying file-descriptor is being reused to e.g. write a shell script you can end up in RCE territory as well.
No, you can get an exception, but that can happen either way (the OS can close the file handle irrespective of what the process does), and it is not a safety violation.
The only problem that can happen in managed languages is leaking file handles by not manually closing them, leaving them to the GC, while opening handles faster than they are being closed; this can cause the OS to run out of handles for the process, terminating it.
But this is not a big issue in practice, in my experience; in Java, try-with-resources idiomatically solves this issue.
> To be pedantic, you need more than a GC for memory safety. Like a language (type system, bounds checks) that forces you to only look only inside the memory being managed. :-)
True enough :-)
> Also think of the stop-the-world problems inherent to GCs. They can be life-threatening in real-time systems.
There are GCs that can work in real-time systems, given enough resources. RAII is also not suitable for real-time systems without much care, so the point is pretty much moot. Real-time systems require a level of attention to timing that no regular programming paradigm enables.
> So: yes it's true that it's difficult to write safe code in C++, and many safety bugs are related to memory. But that doesn't mean that code in a GCed language is inherently safer.
It is very much clear that programming in a GC language is safer than in C++ (or C or assembly): not safe (the log4j vulnerability comes to mind), but absolutely and certainly safer. Rust has a radically new approach to memory safety that seems to offer similar advantages. It's still a little early (not that much internet-connected Rust-based infrastructure) to say for sure, but so far it does seem to offer similar or better safety guarantees than a GC language.
Java has try-with-resources blocks, though. But scope-based resource management is fundamentally limited and static; it is not a generic solution.
Preventing data races is cool, but it's, again, not a complete solution to the general category of race conditions, which are the real issue, and I'm not sure there is any system general enough to prevent those wholesale. Also, Java's data races are well-defined (while if you mess up some unsafe logic in Rust, your program is completely unpredictable from that point).
Hard real-time is so special that I don't see much point bringing it up here: standard Rust/C/whatever is also not fit for it in and of itself. For soft real-time, some GCs may be more than fine; the occasional frame drop won't kill anyone, for example.
> (Well-designed, first-class) GC is still the only system proven to completely remove memory safety issues.
This depends on what you mean by "proven", but I don't think I agree. It's pretty easy to demonstrate that RAII without raw pointers and without multithreading is going to be unable to escape its bounds. And for C++ that includes changing the standard library so it doesn't generate pointers without checking, for example vector is entirely capable of enforcing bounds.
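For example, the checked accessor already exists; a safe dialect would just have to drop the unchecked one:

    #include <iostream>
    #include <stdexcept>
    #include <vector>

    int main() {
        std::vector<int> v{1, 2, 3};
        try {
            std::cout << v.at(10);             // bounds-checked: throws
        } catch (const std::out_of_range& e) {
            std::cout << e.what() << '\n';
        }
        // v[10] is the unchecked (UB) variant.
    }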
But as you say, opting in on a per-object basis is not going to work.
I was saying "proven" more in the sense of "battle-tested": that is, we can point to large systems people actually use which are free of memory errors, written in Java or C# or other GC languages (those with a few more memory safety features, to be fair, such as out-of-bounds checks on all containers and a memory model that doesn't allow partial reads). We don't have a single large C++ system that is free of memory issues (we do have two for C, CompCert and seL4 - but those are not manually written in C, and they took many times more work than equivalent systems without this property). Perhaps some embedded high-safety systems such as embedded medical or avionics devices would qualify, but those are not public so it's hard to be sure.
So yes, theoretically you can build a complex system with a subset of C++ that is safe. Bjarne Stroustrup has explicitly defined such a subset for use in avionics and other military applications, even before many of the more modern C++ features. But there is little proof that this system can actually be successfully deployed in most organizations.
I'd also note "without multi threading" is such a huge limitation that it immediately moves your proposed system to toy language territory in my book. And even adding shared memory multi-process collaboration (e.g. mmap() ) would not be safe.
Well not many languages are actually trying to hit the intersection of memory-safe but no-GC, so I think that's a bit too high of a bar.
> So yes, theoretically you can build a complex system with a subset of C++ that is safe. Bjarne Stroustrup has explicitly defined such a subset for use in avionics and other military applications, even before many of the more modern C++ features. But there is little proof that this system can actually be successfully deployed in most organizations.
It's extremely difficult to deploy a subset of C++ as C++, for sure. I was thinking more about making a new specific language, but C++ could be a base for ease of explanation.
> I'd also note "without multi threading" is such a huge limitation that it immediately moves your proposed system to toy language territory in my book.
That's because I'm speaking about it as a proof of concept. Allowing multithreading is a hard problem, but it's not something that particularly shouts "you need garbage collection", so I leave it as an exercise to the reader.
Add some locks and stuff. There's a lot of ways to do it.
RAII gives you only static, scope-based lifetimes. These are not sufficient in themselves for every kind of application. With that said, C++ is a low-level language with manual memory management, so you can do whatever; an unused GC is indeed good riddance.
It's a matter of definitions. Technically, reference counting is a form of garbage collection, and it is often discussed in the garbage collection literature.
Most commonly though, "garbage collection" refers to global schemes which apply by default to all memory (perhaps with rare exceptions such as pinning).
So, in typical usage, C++ smart pointers, being opt-in, are not considered garbage collection. Languages which do automatic reference counting globally, such as Python or Swift, are indeed considered garbage collection schemes.
> Technically, reference counting is a form of garbage collection, and it is often discussed in the garbage collection literature.
This is something I always thought was true, but then I see people talk about reference counting VERSUS garbage collection, when I was taught that reference counting is a subset of garbage collection.
In other words, "garbage collection" merely refers to any method of automatic memory management that the programmer doesn't have to think about, and reference counting is merely one method of implementing garbage collection.
Heck, Python uses reference counting, yet the library for directly interacting with the reference counter is called "gc", for "garbage collection".
Different terms can have (slightly) different meanings in different contexts.
As I said, I think in practice any form of memory management that the language runtime/compiler does for you is considered garbage collection in the colloquial sense, but any such system where you have to use a specific language construct (such as std::shared_ptr or Rc) are not.
I'd also note that Python doesn't just do reference counting: it also has a tracing garbage collector that collects unreachable reference cycles. I'm not sure if Swift does something similar.
It uses reference counting and RAII. It does not use reachability analysis or tracing. Many people mean a tracing routine or reachability analysis when they refer to garbage collectors.
I would call a reference-counted mechanism automatic memory management. I would only call it garbage collection if there were a mechanism to try to collect cycles among reference counts.
This is a silly argument, though. GC has a clear de facto meaning. If you say "Garbage Collection" in a room full of programmers, they're going to assume the traditional meaning, as seen in Lisp and Java, where you have something capable of automatically collecting cycles without programmer intervention.
I've seen people claim malloc and free are GC. And sure, you can get there, but it makes the term utterly meaningless.
1. That's a pretty ungenerous interpretation of "as seen in".
2. Are people expected to know about those specific implementations, or else they're a bad programmer? If that's your intent then screw off.
Also that sounds like a broken way to do "Java" without qualifiers. For Lisp, eh, you can even use a bump allocator if you want but that doesn't mean everyone should consider it every time they talk about Lisp.
No, "reference counted java implementation" is not part of all of those. Please have perspective.
And again, "as seen in [language]" doesn't mean that's the only way to implement it. If I talk about global locking "as seen in python" it doesn't mean I'm unaware of non-GIL python implementations.
Yes it is, by having a sound compiler design section.
If you are aware of non-GIL Python, you would say "as seen in CPython"; otherwise it just proves the point of not being able to distinguish between implementations and language definitions.
Not where I work. Where I work, people are very well aware that reference counting is a form of garbage collection. Any memory management abstraction that simulates an infinite amount of memory is a form of garbage collection, including a system that never frees memory (the so-called null collector).
What's the formal difference between 'pointer safety' and 'pointer provenance' which is also a concern in recent C/C++ standardization work? Perhaps the former is getting removed because it turns out to be a less precise duplicate of the latter?
It seems strange to me that they're doing this rather than at least partially implementing something like an optional flag for memory-tracing safety features instead.
A bit like we're C, we love pointers and overflows, we don't need no stinking GC or safety rails :D