Some of this is opinionated, and some is real. It's helpful to separate the two.
We've learned some things over the years.
* Null-terminated strings were a really bad idea.
* Arrays of ambiguous size were a really bad idea.
* Multiple inheritance just gets too confusing.
* Immutability is very useful, but not everything can be immutable.
* If thing A over here and thing B way over there have to agree, that has to be checked at compile time, or someday, someone will break that invariant during maintenance.
* Concurrency has to be addressed at the language level, not the OS level.
* Which thread owns what data is a big deal, both for safety and performance reasons, and languages need to address that. Some data may be shared, and that data needs to be identified. This becomes more important as we have more CPUs that sort of share memory.
* The jury is still out on "async". Much of the motivation is web services holding open sessions with a huge number of clients. This is a special case but a big market.
* Running code through a standard formatter regularly is a win.
* Generics can get too complicated all too easily.
* There are three big questions that cause memory corruption: Who owns it, who can delete it, and who locks it? C addresses none of those issues, GC languages address the second, and Rust (and Erlang?) address all three. Future languages must address all three.
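The "thing A and thing B have to agree" point above can be sketched in C++ with `static_assert`. This is only an illustration: the `Color` enum, the `kColorNames` table, and `color_name` are hypothetical names made up for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical "thing A": an enum defined in one place.
enum class Color { Red, Green, Blue, Count };

// Hypothetical "thing B": a name table that must stay in sync with the enum.
constexpr const char* kColorNames[] = {"red", "green", "blue"};

// Number of entries in the table, computed at compile time.
constexpr std::size_t kNameCount = sizeof(kColorNames) / sizeof(kColorNames[0]);

// If someone adds a Color without adding a name (or the reverse), the build
// fails here instead of reading out of bounds at runtime.
static_assert(kNameCount == static_cast<std::size_t>(Color::Count),
              "kColorNames must have exactly one entry per Color");

const char* color_name(Color c) {
    const std::size_t i = static_cast<std::size_t>(c);
    assert(i < kNameCount);  // guards against passing Color::Count itself
    return kColorNames[i];
}
```

Adding a fourth enumerator before `Count` without touching the table stops the build at the `static_assert`, which is the kind of maintenance break the bullet is worried about.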
The biggest issue with C++ metaprogramming is not necessarily that it is that powerful (though I've concluded that, for me, C++ lets library authors overload semantics too much, and they seem to really like doing it), but that the metaprogramming "language", if you can call it that, is incredibly weird and like nothing else on Earth. There's an entire jargon that exists only for meta C++, and that should be a huge red flag, on top of the whole thing being an accident. It's a "weird machine" in the truest sense of the term that somehow became an industry standard.
I think it's "weird" because it fills this weird gap between shitty macros and generic programming but does none of the hard parts of either. The syntax itself is not nearly as strange as those semantics.
For example, if you look at what type traits are used for, they essentially make up for the fact that C++ lacked semantics for compile-time checks of generic arguments, forcing anyone who needed such checks to implement their own (often simple, special-case) constraint solvers. Meanwhile, languages with typeclasses implemented one generic constraint solver (ironic) and support the semantics to express those constraints on generic types, obviating the need for complex compile-time template hackery. Essentially, the lack of semantics for a concept that is simple to state (yet difficult to implement) forced a small extension to the template syntax, and that extension enabled really complex macro-style programming on the same templating engine.
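As a rough illustration of type traits standing in for a constraint system, here is a minimal pre-C++20 sketch using `std::enable_if`. The function names `describe` and `check_describe` are made up for the example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hand-rolled "constraint": this overload only takes part in overload
// resolution when T is an integer type.
template <typename T,
          typename std::enable_if<std::is_integral<T>::value, int>::type = 0>
std::string describe(T) {
    return "integral";
}

// A second overload, enabled only for floating-point types.
template <typename T,
          typename std::enable_if<std::is_floating_point<T>::value, int>::type = 0>
std::string describe(T) {
    return "floating";
}

// Hypothetical self-check showing which overload gets picked.
void check_describe() {
    assert(describe(42) == "integral");
    assert(describe(1.5) == "floating");
}
```

Calling `describe("text")` fails to compile because neither overload is enabled, but the "constraint" lives in the template parameter list as a trick rather than as a first-class language construct.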
It's no surprise that metaprogramming in C++ is exceptionally weird; it supports neither proper macros nor generic programming, yet half-implements both through templates to make up for the lack of either.
Right, D has fairly sane metaprogramming (just slap the static keyword on all the compile-time stuff!) and somehow managed to make it compile fast, unlike C++.
Not to mention C++ templates are hideously slow to compile (if used indiscriminately, as they often are). Although C++ isn't the only language that struggles with build times. At any rate, it does seem possible to do generics without slow compile times (C# seems to do fine, for instance), so C++ might be giving generics a bad name on that count as well.
Fast builds would be high on the list of my own "dream language" features. Long developer iteration times are poison.
I don't think C++ overdid it per se; I just think they got some of the ergonomics really wrong. SFINAE and the CRTP are examples of idioms that are possible because of the flexibility but are unintelligible to read. (I'm sure there's some population that really disagrees with me. I'm not saying they're not useful, just not ergonomic.)
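For readers who haven't met the CRTP, a minimal sketch looks something like this. `Shape`, `Square`, and `crtp_demo` are hypothetical names chosen for illustration.

```cpp
#include <cassert>

// CRTP: the base class is templated on the derived class, so it can call
// into the derived class without virtual dispatch.
template <typename Derived>
struct Shape {
    double scaled_area(double factor) const {
        // The cast is only correct if Derived really inherits Shape<Derived>.
        return factor * static_cast<const Derived&>(*this).area();
    }
};

// The derived class passes itself as the template argument to its own base.
struct Square : Shape<Square> {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const { return side * side; }
};

// Hypothetical usage: 3 * (2 * 2) = 12.
double crtp_demo() {
    Square sq(2.0);
    double result = sq.scaled_area(3.0);
    assert(result == 12.0);
    return result;
}
```

It works and costs nothing at runtime, but `struct Square : Shape<Square>` is exactly the kind of line that takes a second read.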
They're slowly layering in those ergonomics, which will eventually be nice, though there'll always be a big pile of legacy.
Generics in C# are crazy good, both to write and to use, all while the compiler checks correctness, especially when combined with constraints. I've tried "generics" in other languages, but they all fall short of C#'s way. Between reflection, generics, polymorphism, and dependency injection, you have a winning combo for code reuse. It can turn thousands of lines of normal code into a few hundred. I've surprised myself again and again with how useful it can be, all without getting lost in the abstractness of it. The people on the dotnet team who carefully thought it through and implemented it deserve a ton of recognition for this.
Eh, it can be good or bad. While generics (in C#) can make your code easier to use correctly, by making sure that everything is of the right type, and can give you a nice perf boost when using generic specialization with structs, they also have a tendency to infect every bit of code they touch with more generics.
For example, if a function is generic over T, where parameter X has type T and T is constrained to `ISomething`, then every function it calls with X should use constrained generics as well. This can easily lead to an explosion of generic functions in your codebase. Many times, instead of making a function generic over T, it's easier to make parameter X just be the interface `ISomething` and be done with it.
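The same trade-off can be sketched in C++ terms. This is an analogy rather than C#: an abstract base class plays the role of the `ISomething` interface, and `Five`, `inner_generic`, `outer_generic`, `inner_iface`, `outer_iface`, and `generics_demo` are all hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for the interface from the comment above.
struct ISomething {
    virtual ~ISomething() = default;
    virtual int value() const = 0;
};

struct Five : ISomething {
    int value() const override { return 5; }
};

// Generic route: the constraint is restated on every layer that passes X on.
template <typename T>
int inner_generic(const T& x) {
    static_assert(std::is_base_of<ISomething, T>::value,
                  "T must implement ISomething");
    return x.value() + 1;
}

template <typename T>
int outer_generic(const T& x) {
    static_assert(std::is_base_of<ISomething, T>::value,
                  "T must implement ISomething");
    return 2 * inner_generic(x);
}

// Interface route: plain functions, no template parameters to propagate.
int inner_iface(const ISomething& x) { return x.value() + 1; }
int outer_iface(const ISomething& x) { return 2 * inner_iface(x); }

// Hypothetical usage: both routes compute 2 * (5 + 1) = 12.
int generics_demo() {
    Five five;
    assert(outer_generic(five) == outer_iface(five));
    return outer_generic(five);
}
```

Both versions give the same answer; the difference is that the generic one drags a type parameter and its constraint through every layer, while the interface one pays for a virtual call instead.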
Very good point. I typically use both approaches. Sometimes using an interface as a method parameter just "feels" more natural, but other times a generic plus a type constraint is better. I usually draw the line when the infection you mention becomes too much, i.e. when 3 or 4 layers are forced to be generic because 1 layer needed it. So yes, very valid point. The key is to find a balance so that the code is still maintainable 6 months from now, still readable and fast, while reducing duplication where it makes sense.
I'd love to know if you write lots of code and/or have to maintain other people's hugely templated code. I find if you have to do lots of the second you quickly stop writing it.
I would still prefer that over copy pasting that shitty list implementation for the nth type, or whatever “convention” some C devs do to make it “generic”.
> Arrays of ambiguous size were a really bad idea.
When? And for whom? Having to make everything a fixed size array like in C is no good either, that's for sure. It leads to bad guesses and comments like "should be big enough".
I think he was referring to areas of contiguous memory (arrays) where the size of the area is separated from the pointer to that area. Nearly any operation on the array will require both pieces of information, and tons of C bugs come from making assumptions about the length of an array that aren't true.
So, better just to carry the length along with the pointer (Rust calls these "fat" pointers) and use that as the source of truth about the array length (for dynamically sized arrays, such as those created by malloc).
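A minimal sketch of the idea in C++, assuming a hand-written struct rather than any standard type (`IntSlice` and `sum` are hypothetical names; C++20's `std::span` is the standard-library version of the same thing):

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical minimal "fat pointer": the pointer and the length travel
// together, so the length cannot drift away from the data it describes.
struct IntSlice {
    const int* data;
    std::size_t len;

    // Bounds-checked access; the slice is the source of truth for the length.
    int at(std::size_t i) const {
        if (i >= len) throw std::out_of_range("IntSlice index out of range");
        return data[i];
    }
};

// The callee needs no separate length parameter that could be wrong.
long sum(IntSlice s) {
    assert(s.data != nullptr || s.len == 0);
    long total = 0;
    for (std::size_t i = 0; i < s.len; ++i) total += s.data[i];
    return total;
}
```

Compare this with the usual C signature `long sum(const int* data, size_t len)`, where nothing ties the two arguments together at the call site.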
They're referring to dynamically-sized arrays that do not store their own length. In C, this would be a pointer to the first element of an array; the length of the array must be passed around independently to use it safely. Instead, they're advocating collections which store their own length, such as vectors in C++ or arrays in C# or Java. Personally, I believe there is a need for three different kinds of array types:
1. Array types of fixed length, ensured by the type system. This corresponds to arrays in C, C++, and Rust, and fixed-size buffers in C#.
2. Array types of dynamic length, in which the length is immutable after creation. This corresponds to arrays in C# and Java, and boxed slices in Rust.
3. Array types of dynamic length which can be resized at runtime. This corresponds to vectors in C++ and Rust, and lists in C#, Java, and just about every interpreted language.
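To make the three kinds concrete, here is one way they might look in C++. The aliases `Triple` and `Growable` and the class `FixedBuffer` are hypothetical; in particular, the wrapper for the second kind is an assumption, since the C++ standard library has no direct equivalent.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// 1. Fixed length, encoded in the type.
using Triple = std::array<int, 3>;

// 2. Length chosen at runtime, then immutable after creation.
//    Hypothetical wrapper written for this example.
class FixedBuffer {
public:
    explicit FixedBuffer(std::size_t n) : data_(new int[n]()), size_(n) {}
    std::size_t size() const { return size_; }
    int& operator[](std::size_t i) {
        assert(i < size_);  // the buffer knows its own length
        return data_[i];
    }
private:
    std::unique_ptr<int[]> data_;  // zero-initialized storage
    const std::size_t size_;       // never changes after construction
};

// 3. Dynamic length, resizable at runtime.
using Growable = std::vector<int>;
```

Only the third kind can reallocate, which is why it is the one that raises the invalidation questions discussed further down.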
Of course, for FFI purposes, it is often acceptable to pass arrays as a simple pointer + length pair, since that is the common denominator of sequential arrays across low-level languages.
We also frequently need reference types to contiguous storage in addition to value types of contiguous storage. In C++ this is satisfied by std::span for both the static and dynamic cases.
True. In Rust those are regular slices, and in C# and Java all collections are owned by the runtime anyway. Something which several languages do lack, though, is a view of a contiguous subsequence, in the manner of an std::span; it would be nice to see those in more places.
I've also taken a look at Golang's slices, which have rather confusing ownership semantics. One can take a subslice of a slice, and its elements will track those of the original, but appending to that subslice either overwrites elements of the original backing array (if there is spare capacity) or copies the values into an entirely new array. Likewise, appending to the original slice past its capacity causes a reallocation, after which any preexisting subslices silently stop tracking it. A harsher version of this occurs with C++ vectors and spans, if I am not mistaken: there, the reallocation leaves the span dangling. This is an area where I think Rust's borrow checker really shines; it prevents you from resizing a vector if there are any active slices, encouraging you to instead store a pair of indices or some other self-contained representation.
You cannot append to a std::span, or append to a vector-backed span through the span. You have to perform the append on the underlying vector. It is possible to perform an insertion into the middle of a std::vector through std::vector::insert.
If you can establish the precondition that the underlying vector will not reallocate as a result of the append, then it is perfectly safe to perform such an append while holding a reference to one or more elements of the vector. The same goes for insertions: references to elements before the insertion point remain valid if you can establish the precondition that the vector will not reallocate. In both cases, it is straightforward to establish the precondition through the reserve() member plus some foreknowledge of how much extra capacity the algorithm needs.
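A small sketch of the reserve() precondition, assuming we know up front that the algorithm appends exactly two elements (the function name `reference_survives_append` is made up for the example):

```cpp
#include <cassert>
#include <vector>

// Returns true if a reference taken before the appends still refers to the
// vector's first element afterwards.
bool reference_survives_append() {
    std::vector<int> v = {10, 20, 30};

    // Establish the precondition: room for the two appends below.
    v.reserve(v.size() + 2);
    assert(v.capacity() >= 5);

    const int* old_data = v.data();
    int& first = v[0];

    v.push_back(40);  // size 4 <= capacity, so no reallocation
    v.push_back(50);  // size 5 <= capacity, so no reallocation

    first = 11;       // still a valid reference into the same storage
    return v.data() == old_data && v[0] == 11;
}
```

A third `push_back` would be allowed to reallocate, and at that point `first` could dangle; nothing in the language checks that the foreknowledge was right.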
You can always construct a user-defined reference type which back-doors the borrow checker, such as by storing indexes instead of iterators as you mentioned. If the std::vector shrinks, those indexes are just as invalid.
I think it's also okay to address memory corruption using tooling (like how seL4 proof-checks its C). The language might not have to address it directly, but it would be nice if the language made it easy to address.
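A sketch of that failure mode, assuming a hypothetical index-based view (`IndexRange`, `valid_for`, and `sum_range` are names invented here):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical index-based "view": it survives reallocation because it holds
// no pointers, but it does not survive the vector shrinking.
struct IndexRange {
    std::size_t begin;
    std::size_t end;  // one past the last element

    // Has to be re-checked against the vector every time the range is used.
    bool valid_for(const std::vector<int>& v) const {
        return begin <= end && end <= v.size();
    }
};

// Sums the elements covered by the range, checking validity first.
int sum_range(const std::vector<int>& v, IndexRange r) {
    assert(r.valid_for(v));
    int total = 0;
    for (std::size_t i = r.begin; i < r.end; ++i) total += v[i];
    return total;
}
```

The indexes stay usable across a `push_back` that reallocates, but after `v.resize(3)` a range of `{2, 5}` points past the end, and only a runtime check like `valid_for` will notice.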
* Error handling has to be designed in.