> Java is the only one on this list that has a different approach. Instead of using generics at compile time, it is actually doing a dispatch of the code to optimized routines based on runtime types. That does mean that they had to write the same sort code multiple times, of course.
> Using Java:
public static void sort(int[] a);
public static void sort(long[] a);
public static void sort(Object[] a);
Perhaps they meant it another way, but this makes it sound like there is runtime dispatch between these different signatures - there isn't. int, long etc. are unboxed types in Java, which are not Objects, and cannot be used for Object-based generic type-erasure. So the decision about which of these sorts gets used is made by the compiler. All Objects are sorted using the third signature.
Java generics work only on reference types (i.e. anything that derives from Object), and not on primitive types like int and long. For this reason, if you want a method that works on both primitives and reference types, you have to provide distinct implementations. This is true even if you provide a generic version like
public static <A> void sort(A[] e);
You will not be able to use the above method with an int[]. Whenever possible the compiler will try to be helpful, and automatically box/unbox primitives for you, so for example the following will work
public <A> A max(A a1, A a2);
/* ... */
void foo() {
int x = max(1,2);
}
this comes at the cost of boxing ints inside Integer, though.
So, if you ignore for a moment primitive types, whenever you have generics, everything boils down to a single method accepting Objects and returning Objects. What the JVM does is runtime profiling of what you are actually passing to the generic method, generating optimized routines for the "best case". In theory this is the best of both worlds, because (unlike, let's say, C++) you will have a single implementation of the method, avoiding code explosion, but if you use it in a hot spot you get the optimized code.
In a way, it is quite wasteful, because you throw away a lot of information at compile time, just to get it back (and maybe not all of it) at runtime through profiling, but in practice it works quite well.
A side effect of this is that it makes the JVM a wonderful VM for running dynamic languages like Ruby and Python, because that information is _not_ there at compile time. In particular GraalVM/Truffle exposes this functionality to dynamic language implementations, allowing very good performance (according to their website [1][2], Ruby and Python on Truffle are about 8x faster than the official implementations, and JS is in line with V8).
The state of Java’s type erasure makes me quite sad when I need the type information via introspection.
I’d say the best situation would be to keep the type information at compile time but let the VM decide at runtime whether multiple implementations are needed.
The arguments still carry their dynamic type, but you lose the type parameters for the functions applied to those arguments. Of course, at the call sites for known working code you can infer the type parameters -- let's say for example that the function has one type parameter and one argument of that same type: use reflection to get the runtime type of that argument and, under the premise that your code doesn't have a runtime type error, you know the type param for that function is that very same type.
However, what if you have a collection of generic functions divorced from the arguments? There's no way from the function alone to determine the type params. So you can't write code that reflects over a collection of generic functions and dispatches solely on the type params.
And, while I've been speaking of functions, this really generalizes to any generic type. If you have e.g. a collection of PetTrainer<?>, and you have a code path in which you have a pet of type Dog or Cat, how can you use only reflection to choose a trainer upon which you can call Train(pet)? You can't.
(I don't write Java, so please give me a little slack wrt syntax.)
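To make that concrete, here is a minimal Java sketch (PetTrainer, Dog and Cat are hypothetical stand-ins for the example above): both trainers report the same runtime class, so reflection alone has nothing to dispatch on.

import java.util.List;

class Dog {}
class Cat {}

class PetTrainer<P> {
    void train(P pet) { /* train the pet */ }
}

public class ErasureDemo {
    public static void main(String[] args) {
        List<PetTrainer<?>> trainers =
                List.of(new PetTrainer<Dog>(), new PetTrainer<Cat>());
        for (PetTrainer<?> trainer : trainers) {
            // Both elements print the same class name (PetTrainer); nothing in the
            // object records whether it was created for Dog or Cat, so there is no
            // reflective way to pick the right trainer for a given Dog or Cat.
            System.out.println(trainer.getClass().getName());
        }
    }
}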
Sounds like a missed opportunity for polymorphism. The whole point of OO and polymorphism is to let dynamic dispatch call the right method body for you.
Implementing that by hand using reflection just sounds plain wrong.
Would love to hear of a concrete real world example where this was an issue
Java users have to use double dispatch and such, because Java only implements dynamic dispatch on the type-erased type of the first function argument.
The last bit of the GP comment is an example of the sort of thing that isn't possible with generics in Java, because the PetTrainer-of-Dogs and PetTrainer-of-Cats are indistinguishable at runtime. If you want to solve the problem with dynamic dispatch on the type-erased type of the first argument only, you could *stop using generics*, write a separate class for each generic instantiation you would have had, send every pet to every trainer, and have each trainer check the runtime type of their argument and do nothing if it's the wrong type. This probably isn't what anyone really wants to write or an acceptably cheap thing to do at runtime.
Nobody actually deals with PetTrainers but people do deal with MessageSubscribers that work more or less exactly as described, and the type system does nothing to help. For a concrete example of the horrors of double dispatch, you could see the wonderful book Crafting Interpreters.
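A rough sketch of that non-generic workaround (hypothetical Dog/Cat/Trainer classes mirroring the example above): every trainer takes Object, every pet is sent to every trainer, and each trainer checks the runtime type itself.

class Dog {}
class Cat {}

interface Trainer {
    void train(Object pet); // no generics: every trainer accepts any Object
}

class DogTrainer implements Trainer {
    public void train(Object pet) {
        if (pet instanceof Dog) {
            System.out.println("trained a dog");
        }
        // wrong type: silently do nothing
    }
}

class CatTrainer implements Trainer {
    public void train(Object pet) {
        if (pet instanceof Cat) {
            System.out.println("trained a cat");
        }
    }
}

public class WorkaroundDemo {
    public static void main(String[] args) {
        Trainer[] trainers = { new DogTrainer(), new CatTrainer() };
        Object pet = new Dog();
        for (Trainer t : trainers) {
            t.train(pet); // send every pet to every trainer
        }
    }
}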
This is what I think .NET's generics are like. Type information including all parameters is available, and the JIT does as you say. The approach has some advantages (smaller binaries, fewer instantiations in memory when they can be shared) but a slight cost when a method is first JIT'ed. This only works for VM languages though.
Yeah, this really needs to point out that the elements in Arrays.sort have to be Comparable. More, they have to be Comparable with each other. Should probably show the Collections.sort that takes a Comparator to really make the point that that is dispatched at runtime.
But then you would probably need to cover just how aggressive the JIT is in the JVM and how the costs may not be what you would think they are.
You're not wrong, but I feel like this doesn't need to be explicitly spelled out, as it would be obvious to any Java dev (or C#, or indeed probably any language with generics).
I think my problem is that there is no runtime dispatch between the methods that were listed. So that was a bit of a non sequitur for the point at hand.
The technical term for this feature in this context is "ad hoc polymorphism". It's polymorphic because it only lets you vary the types: you can't just go around making six different foo() methods that all take a string, because the compiler couldn't tell which one you wanted. And it's ad hoc because you can just fill out whatever you want in each version.
The negative consequence of being ad hoc is that there's not even a very strong implication beyond the name that these different functions all do the same thing, and sure enough they often don't. So now either your users must read your documentation very carefully, or sometimes they will have unhappy accidents. This is made worse in many modern languages with type inference where your users may be mistaken about which of the functions they're actually calling since they don't always know the type of anything anyway...
Ad hoc polymorphism is over-used because it is relatively cheap to implement, and it's cheap to add more of these overrides later.
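A tiny Java sketch of what "ad hoc" means in practice (the describe() overloads are hypothetical): the compiler picks an overload from the static argument types alone, and nothing forces the overloads to agree on behaviour.

public class Overloads {
    static String describe(int x)    { return "int: " + x; }
    static String describe(long x)   { return "long: " + x; }
    static String describe(Object x) { return "Object: " + x; }

    public static void main(String[] args) {
        Object boxed = 7;                     // static type Object, runtime type Integer
        System.out.println(describe(7));      // "int: 7"    -- overload chosen at compile time
        System.out.println(describe(7L));     // "long: 7"
        System.out.println(describe(boxed));  // "Object: 7" -- the static type decides, not the runtime type
    }
}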
Well, having methods in different classes that happen to share the same name is also an example of that, and so it cannot be avoided (not in practice, anyway).
That is also not true — Java’s generics don’t add any overhead. If you would have written an ArrayList implementation before generics you would have had the exact same compiled method signature like put(Object o) and Object get() - the only difference is that it would not have been type safe, that is you could have inserted a String into it while you might have expected to get an Integer on the other side.
So generics are completely a language semantic concept in case of Java, with no performance detriment/benefit. Of course more information can be used up for better optimizations as well, see Rust’s or C++’s generics/templates.
The javac source compiler erases source-level types and inserts casts into the bytecode (at usage sites, not within the generic class). So if you have ArrayList<T>, it will generate one version of ArrayList. But usage sites will have casts inserted where necessary. In particular, a use of put() on the outside doesn't need a cast (because the single implementation of ArrayList uses Object underneath), but get() will have a cast from Object to String at the call site.
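A minimal sketch of what that looks like in source form (assuming a plain ArrayList<String>; the "raw" version approximates what the erased bytecode does):

import java.util.ArrayList;

public class ErasedCalls {
    public static void main(String[] args) {
        // What you write:
        ArrayList<String> names = new ArrayList<>();
        names.add("hello");               // no cast needed going in
        String s = names.get(0);          // javac inserts a checkcast to String here

        // Roughly the erased equivalent:
        ArrayList raw = new ArrayList();  // the single ArrayList implementation stores Objects
        raw.add("hello");
        String s2 = (String) raw.get(0);  // the cast lives at the call site, not inside ArrayList
        System.out.println(s + " " + s2);
    }
}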
The path Java chooses for optimization is through the JIT, but that's entirely orthogonal to generics. Java generics are a syntactic construction; there really is no JVM support for them whatsoever.
It's complicated. There actually IS JVM support for generics in surprising locations. Just not at method dispatch... sort of.
For example
<T extends Foo> void bar(T f);
will, under the covers, create a function that looks like
void bar(Foo f);
If you create a class
class Foo implements Supplier<Bar> {
    public Bar get() { return new Bar(); }
}
The compiled signature of `Foo`'s `get()` does in fact return a `Bar` and not some `Object`, even though `Supplier<Bar>` would imply that it is erased. The same applies if that generic was used within the method parameters.
That generic type information is, in fact, stored off at the class level and can be extracted through Java's reflection API.
This is how magic like
json.parse("[1,2,3]", new TypeReference<List<Integer>>() {});
is getting its stuff done. It's creating a new anonymous subclass of "TypeReference<List<Integer>>" which the json library can then use to introspect and pull the generic information out, and know that it should create a `List` of `Integer` rather than of `Long` or `Double` or a `Set`.
This makes generics more than just a syntactic construction in Java.
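A minimal sketch of how that trick works (this TypeReference is a stand-in, not any particular JSON library's API): the anonymous subclass bakes its type argument into the class file, where reflection can read it back.

import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

// A "super type token": subclasses record their type argument in class metadata.
abstract class TypeReference<T> {
    Type captured() {
        ParameterizedType superType =
                (ParameterizedType) getClass().getGenericSuperclass();
        return superType.getActualTypeArguments()[0];
    }
}

public class TypeTokenDemo {
    public static void main(String[] args) {
        // The trailing {} creates the anonymous subclass; without it the
        // type argument would be erased and unrecoverable.
        Type t = new TypeReference<List<Integer>>() {}.captured();
        System.out.println(t); // prints: java.util.List<java.lang.Integer>
    }
}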
The JVM still doesn't know about generics. The explanation you provided is entirely at the compiler (javac) level, and the JVM is still oblivious to it. There is no special dispatch, or optimization. At best, you could say there is support in the Java standard library, but not in the JVM or hotspot or anything from running "java".
An easy way to verify: Are generics in Clojure, Scala, or JRuby the same as Java (the programming language)?
class Foo implements Supplier<Bar> {
    public Bar get() { return new Bar(); }
}
Bar isn't a generic in Foo though (if we remove the implements-clause, that is very clear). It's only a generic in Supplier, so I don't see where the implied erasure quite comes from.
I didn't add it, but the `get` method is an override from `Supplier`.
For example, if we did something like
class Foo extends HashMap<Foo, Bar> {
@Override
public Bar put(Foo f, Bar b) {
    return super.put(f, b);
}
}
The `Map` signature for `put` is `Object put(Object, Object)` due to erasure. However, because you are extending and specifying the generics, the overridden method ends up reifying the type.
All this is to say that generics end up being a little more than just syntactic sugar.
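For what it's worth, you can see both methods through reflection (a small sketch built on the Foo/Bar example above): the compiler emits the specialized put(Foo, Bar) plus a synthetic bridge with the erased signature that casts and delegates.

import java.lang.reflect.Method;
import java.util.HashMap;

class Bar {}

class Foo extends HashMap<Foo, Bar> {
    @Override
    public Bar put(Foo f, Bar b) {
        return super.put(f, b);
    }
}

public class BridgeDemo {
    public static void main(String[] args) {
        for (Method m : Foo.class.getDeclaredMethods()) {
            // Prints put(Foo, Bar) and a synthetic bridge put(Object, Object).
            System.out.println(m + (m.isBridge() ? "   [bridge]" : ""));
        }
    }
}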
The example you provide has little to do with generics, but rather with how inheritance works.
In particular if you override a method in a derived class, you're allowed to return an instance of a more specific class (in your example, Bar is an Object so it is fine). At runtime, your example is equivalent to something like
interface Supplier {
Object get();
}
class Foo implements Supplier {
    public Bar get() { return new Bar(); }
}
I think the author got it slightly wrong here because he's conflating the programming language feature and its semantics with the run-time implications of the technique(s) you use to implement that feature.
For what it's worth, this seems to be a common misunderstanding where generics are concerned.
Generics allow optimization, but that optimization has to be implemented in the first place. Java generics is an afterthought so you can expect such a suboptimal design.
I don’t think it is fair to call it an afterthought. I would even say that it was extremely well thought out, managing to move a whole ecosystem with millions of lines of code from unsafe raw usage to type-safe generics without much hiccup, at a time when this feature wasn’t all that well known by the average developer.
Erasure was a bad design choice. C# did the exact same upgrade (add generics to the language), but they chose to reify them and add them to the binary format, with a better overall result (IMHO). C# does not have platform reach that Java has, but there are a lot of factors at play there that go beyond language design choices.
As I learned from another commenter here, C# 1.0 didn’t actually have generics at a language level, but was already designed that way on a vm level. So C# 2.0 could take advantage of that, making it much more of a planned feature from the start.
Also, I don’t see erased generics as all that big of a deal — if there was no reflection (which is not that common a feature in languages) one could not even differentiate between the two models. Haskell also does complete type erasure, for example.
As for another viewpoint, it is often brought up on HN that what made the JVM language ecosystem so thriving is exactly the type erasure model. C# effectively forces its idea of generics onto the whole platform, while JVM languages are free to choose their own model. I don’t know how true that is, but anecdotally it does seem plausible (F# does share its generics model with C#, and the CLR has a history of attempts at targeting it that were later dropped. Though I assume that is more often because it is simply a smaller target).
> if there was no reflection (which is not that common a feature in languages) one could not even differentiate between the two models. Haskell also does complete type erasure, for example.
You can, even without reflection, because the Java generic type system doesn't have the parametricity property; you can detect runtime types with cast/instanceof. Haskell doesn't have non-parametric operations, so erasure is not observable.
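For example (a minimal sketch): nothing stops a "generic" method from branching on the runtime class of its argument, which is exactly what parametricity would forbid.

public class NonParametric {
    // Despite the type parameter, the method can inspect the runtime type.
    static <T> String describe(T value) {
        if (value instanceof String)  return "a string";
        if (value instanceof Integer) return "an integer";
        return "something else";
    }

    public static void main(String[] args) {
        System.out.println(describe("hi"));  // a string
        System.out.println(describe(42));    // an integer
        System.out.println(describe(3.5));   // something else
    }
}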
Java optimizes at the method level and the majority of methods when writing an app will have constant types used against them.
Your method signature may look like
void foo(Iterable<Bar> b)
which would make you think of vtables and casting performance losses.
However, Java's JIT is fairly clever. It can tell that you always call `foo` with an `ArrayList` and it can tell that `ArrayList`'s iterator always emits a `Bar` Object. From there it can elide a lot of the type checks and virtual dispatch, turning them into direct dispatch. It does that at the cost of a couple of added runtime checks to make sure those assumptions aren't violated.
The impact of generics on performance has, for me, generally been pretty low. I've seen a much higher impact due to java's memory model (which is getting fixed with Valhalla).
This, btw, is how JavaScript's and indeed most high speed JITs function.
Yeah, JIT does subsume naive AOT + generics in many cases, but it is a bit of an unfair comparison because nothing stops other languages from implementing such optimizations (PGO is one of them). And such inference is not free and causes the usual JIT pain points.
Generics were only added in Java 1.5, they were absolutely not part of the original design, and type erasure is very much a considerable restriction in some cases. I would wager that a greenfield design would look quite different.
Whether you call that an afterthought? I'm sure the designers had this in mind, but at the same time they painted themselves into a corner somewhat with earlier decisions and a huge dedication to backward compatibility. Their work on threads on the other hand shows some real forethought and they were able to avoid most of the pitfalls that a few other language impl's fell into.
Generics in C# were also added after launch but have a significantly better implementation. While they are a key part of the language they aren't a key part of the platform.
>All Objects are sorted using the third signature.
Just to be pedantic, int[] and long[] are both objects, but they will be sorted using the first two signatures because the JVM uses different instructions for loading and storing elements from an array. Technically you could make a single function to return an array's length, since that uses the same instruction.
Plus I don't know about you, but I don't sort (or really use) arrays in Java very much. I see a lot more of the Collections.sort() methods, particularly the comparator based one.
Collections.sort() just copies the list into an array, sorts the array, and copies the sorted elements back into the list. So it also just sorts an array under the hood. To eliminate the temporary array (at least for ArrayList), the List#sort() method was added in Java 8.
Arrays are usually still preferred over collections in the case of primitive types (int[] etc.).
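For reference, a small usage sketch of the two entry points (nothing here is tied to any particular codebase):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class SortDemo {
    public static void main(String[] args) {
        List<Integer> xs = new ArrayList<>(Arrays.asList(3, 1, 2));

        Collections.sort(xs);                // the default path dumps to an array, sorts, copies back
        xs.sort(Comparator.reverseOrder());  // Java 8 List#sort; ArrayList sorts its backing array in place
        System.out.println(xs);              // [3, 2, 1]
    }
}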
Sorting arrays of primitives ought to be nearly as frequent as the need for sorted arrays of primitives, due to the downright comical overhead of List<Integer>, but sorting arrays of objects has to be rare in Java code written this millennium.
I mean I don't sort List<Integer> more than once in a blue moon either. I'm much more likely to either be sorting a list of model objects with a comparator, or manipulating a collection of objects that's already been sorted by some external source.
In the unlikely event that I'm modeling something that can be represented as a list of integers AND it's a significant performance bottleneck then, sure, I'll consider the array methods. But I don't recall the need arising in the last year or two.
Enterprise Java stuff just isn't often bottlenecked on that sort of thing. Network latency fills far more of my dreams ;)
Maybe we're seeing different sides of the language then. I'm doing a lot of integer-list sorting, building a search engine in java. I'm evenly split between Arrays.sort or GNU Trove's specialized primitive collections.
The overhead of List<Integer> really cannot be overstated: memory consumption can approach an order of 1000%, and speed is prima facie about half when dealing with collections, but in practice it can be much worse due to the GC churn. Even for lists as small as a million entries this can be quite untenable.
Maybe we're seeing different sides of the language then.
You are not seeing different sides of the language but participating in different domains, application/business logic programming vs more system level stuff.
Not mentioned: the type-specialized code is larger in code size, which can cause its own performance problems due to icache.
A virtue of C, in my opinion, is that it's very difficult to accidentally bloat the code. C code tends to produce smaller binaries, while leaving the option of expanding performance-critical code sections with macros in cases where it is beneficial. It's much more difficult to take a heavily-templated C++ binary and shrink code size in the cold parts of the binary (which is probably most of the binary).
Same thing in e.g. Rust: you can either monomorphize (expand code for each generic it's used with), use dynamic dispatch via a vtable or (with help from a crate) dispatch via an enum.
Each solution has its tradeoffs, you are empowered to choose.
Though C macros are so bad they give the whole term a bad name. You might as well write a sed script for “macro expansion” instead of what the preprocessor does.
I think the case against the C preprocessor is overblown. What's the worst problem C macros have caused you? I've been programming C for 27 years and I can't really think of them causing any bugs in my programs. The worst thing for me is that #defines don't show up in the debug info.
C programmers tend to reach for the preprocessor in cases where other tools would have been more appropriate. In particular, #ifdef. If you haven't experienced this then maybe you've worked with people that have uncommonly good taste.
I agree there's nothing particularly terrible about the preprocessor as such, if used tastefully. Especially if you compare it to other macro engines like maybe TeX, CPP strikes me as quite simple and clean.
Even newer languages like Haskell choose to adopt it after all.
Good point about #ifdef. I'm not sure what other tools are more appropriate though. I went through a phase of moving target-specific code into separate files instead of using #ifdef. But that makes the build system more complex and makes it harder to eyeball the difference between two targets' implementation of the same function. This is just an intrinsically complex problem and not one I blame the preprocessor for. As always, aiming for minimum total complexity is the goal. #ifdef is a useful tool in that pursuit.
That said, I'd be interested to hear about other solutions that are better. I guess Zig's comptime is a candidate, but I see that as only a superficial improvement (not a criticism - I can't think of anything it could do better).
I agree, #ifdef is sometimes the best tool for the job. I didn't mean that it's always wrong, but it does tend to be overused in my experience. For instance, hardcoding assumptions about the target platform when a dynamic lookup could have been used instead. E.g. compare old skool hardcoded addresses in Linux drivers to modern devicetrees.
I don’t think Zig’s comptime is a superficial improvement, especially not compared to C “macros”. It is such a simple, single concept yet replaces C++’s templates and constexpr functions.
But a macro system without understanding the AST is just shitty.
People usually overestimate their capabilities when it comes to such things. In this case I think you are underestimating yourself, I do not see how anyone can write something as bad as the C preprocessor if they had 2-3 days to work with.
> the type-specialized code is larger in code size, which can cause its own performance problems due to icache.
Since IIUC you mean the alternative would be essentially non-reified types, the shared code would involve more frequent branches, thus blowing the cache in general, even with branch prediction.
Anyway, I'm pretty sure I'm missing your point, so I figured I may as well ask.
When you ("you" in this comment is probably most usefully read as "a compiler" :) ) monomorphize a generic function or data structure, you make a copy of the code for every set of type parameters it's been called with.
Each of these copies requires space for the instructions...this means a larger binary on disk, more memory required when it runs, and more pressure on the CPU's caches, especially precious L1I.
The cache pressure issue is typically the big one worth thinking about, although binary size itself can be an issue for some cases - I've primarily got embedded use cases in mind, but I'm sure there are others I'm not thinking of.
Anyway, back to cache pressure. If there are a whole bunch of different monomorphizations of a given function, and they're all called frequently, that could mean that the CPU will frequently need to refer to slower L2$, or much slower L3$ (or much much slower RAM) to load instructions. That's no bueno from a performance standpoint.
Because of this, there are cases when dynamic dispatch can outperform monomorphization - it's definitely not as simple as "vtable slower, monomorphization faster" across the board, even if that is an OK rule of thumb.
The linker can usually merge identical functions, and the compiler does this too.
You'll only get different copies if they are relevant, and usually the branch elimination is more than enough to compensate.
Another aspect about this is that the cache will hold the _hot_ stuff. In most systems, even if you have three copies of a function, it will likely only have one of those that is hot.
C doesn't require you to use qsort; the only thing that makes qsort common is that it is available in the standard library. There are other libraries that provide a faster approach, e.g. SGLIB[0] uses a macro-based approach where the comparison can either be inlined with the sorting code, or functions can be defined for sorting specific types[1].
There are other libraries that do the same, another one was posted here recently. And of course, if needed, it can be implemented by the programmer. IMO the issue with the given Postgres example isn't that it is written in C therefore it is slow, it is why it doesn't use all of C's features to make it faster (assuming there is a real performance issue here - it is very possible that the Postgres authors didn't consider it a real issue until now).
> IMO the issue with the given Postgres example isn't that it is written in C therefore it is slow, it is why it doesn't use all of C's features to make it faster (assuming there is a real performance issue here - it is very possible that the Postgres authors didn't consider it a real issue until now).
Right, but that's kind of missing the author's point, which isn't "things that are written in C are slow" so much as "because C lacks generics and has a relatively-inefficient standard library implementation based on function pointers, most implementations do not take the extra steps of 'using all of C's features' to do something faster (because it's a lot of work, and the standard implementation exists)".
It's mostly just an argument that generic-lacking languages make it easier to do something slower than something faster, and most people and projects follow the path of least resistance. That there are more efficient implementations available if you sink more effort into it isn't in dispute.
But at that point it isn't C's fault but the developer's fault for not using C's features. And of course assuming there is a fault in the first place (i.e. performance issue that actually matters in practice).
> IMO the issue with the given Postgres example isn't that it is written in C therefore it is slow, it is why it doesn't use all of C's features to make it faster
I think the issue the author is pointing out is that, in the real world, this real-life (fairly obvious) performance improvement was enough of a pain in C, that it was not done for 25 years.
In languages with generics, this improvement would have come for free. With C, yes you can do the improvement, but you have to do it by hand, and you have now to maintain and debug multiple versions of sort.
> was enough of a pain in C, that it was not done for 25 years
As I wrote in another reply, that might be one way to read the situation - especially if you want to create a narrative where languages like C without generics have inherent faults - but another way is that it was there for 25 years because it wasn't an actual problem in practice, so nobody bothered to fix it until now.
You don't need to do it all the time - or at all - just for the few things that can be performance sensitive (assuming the only reason you'd care is performance and not, e.g., type safety), and only after you've actually figured out there is a performance issue that can only be solved with such an approach.
After all while the linked article's author seem to present it as a problem that it took 25 years for this to be fixed, another way to see it is that it took 25 years for this to be an issue for someone to bother enough to try and fix it.
Minor nitpick: if the implementation of qsort() were visible to the C compiler (for instance if it were defined as inline code in stdlib.h, or with a statically linked libc and LTO), the compiler would happily "stamp out" a specialized version of the function, just as it does for std::sort() in C++ (assuming the right optimization settings).
After all, "zero cost abstraction" (aka "optimizer passes") works just as well in C as in C++ ;)
The actual advantage of generics isn't performance, but less (manual) code duplication.
Not only the implementation of qsort would have to be visible to the compiler, but also the implementation of the comparison function, which may be defined in a separately compiled module or passed around dynamically as a function pointer (as might qsort itself, potentially).
> The actual advantage of generics isn't performance, but less (manual) code duplication.
Or type safety, since code duplication will often be shunned. And the benefit of type safety is there regardless of whether the generics implementation performs specialization.
> Not only the implementation of qsort would have to be visible to the compiler, but also the implementation of the comparison function, which may be defined in a separately compiled module or passed around dynamically as a function pointer (as might qsort itself, potentially).
In postgres' case the whole set of comparison functions isn't known at the compile time of the server - new types / operators can be added at runtime by extensions.
Hence the branches for the few most common/performance critical cases. I'd personally prefer if we gradually moved to C++ in postgres, but that'd not change that we'd need to have those explicit branches to benefit from a faster sort.
Of course it'd be a lot less annoying to implement in C++ - manually implementing templates with macro magic is many things, but not fun.
It's pretty straightforward to have a qsort() or bsearch() clone generated by a macro alongside a static inlined comparison function. You will get C++ level performance out of that.
glibc does exactly this for bsearch. The difference with C++ is that the specialized function (bsearch or qsort) does not receive vague linkage, so it is not going to be deduplicated by the link editor. For bsearch, the code size increase is probably not a problem, but qsort tends to be much larger. Basically, intra-procedural constant propagation plus inlining into qsort is very beneficial, but inlining the specialized qsort itself is probably not such a great idea. Presumably LTO with profile feedback could help here, but striking the right balance is going to be difficult (as is with C++, which tends towards code duplication, despite vague linkage and link editors that perform identical code folding).
I'm not sure if specialization is a good argument for generics. Not everyone implements generics through full specialization/monomorphization, and then the performance benefits are not much more likely to emerge than with function pointers or vtables. And many JIT compilers can effectively specialize code involving indirect calls.
In principle yes. With enough constant propagation, inlining and/or intraprocedural optimizations, the compiler can indeed inline the comparator in qsort (and in practice GCC has been capable of it for a few years).
The problem is that it is more work for the compiler. For example GCC has an early inliner pass that will exploit all easy "inlining" (like that provided by templates) to simplify the code and to expose more opportunities for later passes. On the other hand, indirect function inlining is only done at later stages. The effect is that indirect function call inlining is much less reliable and sometimes not as effective.
In the case of qsort() at least, there wouldn't be "preprocessor generics" involved, because the comparison is done in a user-provided callback function. But both the comparison function and the qsort implementation must be visible to the compiler (or in case of LTO: the linker) for the "auto-specialization" (via the optimizer) to kick in.
It's a bit brittle of course to depend too much on the optimizer, but the whole idea of "zero cost abstraction" in C++ also puts a lot of trust into the optimizer passes.
Now tell me I need generics for this to get optimized. (Generics are useful for type safety, which you can add using macros in C, but this is then ugly.)
Generics cannot reliably be optimized. They create specialized inline code by monomorphization (except partially in Go, which is a bit smarter). Specialization and inlining can be good if they lead to simplifications or bad when they lead to a combinatorial explosion, which is why optimizers are careful with them to avoid code bloat. Generics always create the bloat by default. If this were a good strategy, optimizers would always inline.
To reliably devirtualize function pointers in C you need to use non-standard extensions. But the best strategy is not to let the optimizer decide, and to apply this very carefully, only for the few selected cases where it matters.
Choose to believe this, it's your loss. The fact is that performance sensitive domains like browsers/HFT/HPC/ML/etc are almost universally written in heavily templated C++. Just look at linear algebra libraries, C can't compete.
> They create specialized inline code by monomorphization
Neither generics nor monomorphization imply inlining, that's purely a performance optimization.
> (except partially in Go which is a bit smarter).
Are you joking? Go, already an extremely bloated language, somehow managed to implement generics with runtime overhead.
> Generics always create the bloat by default.
Generics only generate code you would have manually written/copied otherwise.
> But the best strategy is not let the optimizer decide and apply this very carefully inly for the few selected cases where it matters.
Yes, you just compiled C code as C++. Actually write it as C++ (get rid of the void* mostly) and your program becomes a constant time evaluation: https://godbolt.org/z/5fedo8s5q
Well the whole point of the exercise was to see whether the C-version of qsort would be optimized in the same way as templated code when the compiler can see the whole thing instead of just the declaration. That's why I didn't write templated code.
Try to define the find() function as inline or static (but in this case the optimizer is actually too aggressive for this example and resolves the entire function call into a constant).
Okay, but as far as I know "inline" is just a suggestion for the compiler. Depending on the complexity of the function or the phase of the moon the compiler can choose not to inline it. I wouldn't want to rely on that for a critical piece of my performance.
True, that's what I mean with depending too much on the optimizer (in a sibling comment). But IME the threshold for inlining code is surprisingly big (which of course also has a downside: potential code bloat).
I added a little fix which prevents the compiler from discarding the result completely (just by going through __builtin_printf()):
`static` also allows it to optimize, and it's more of a code hygiene thing. Your original code as written unintentionally defines `find` as a function that must be preserved (as a separate function) into the compiled output, and `static` undoes that.
I think it would be better said that contemporary languages should consider research in PL design and type theory over the last few decades. There is a trend it seems to discard it as "academic" (never mind the academics need to be brought in eventually, as in Go's generics).
It feels like there's a lot of wheel reinvention going on, all the time. Especially with systems languages. Rust is the one that seems to "get it" (bite the bullet on syntax and learning curve, but give you reliable software that is pretty fast). I'm still waiting on Jai to be made public to see what is going on there.
Essentially without generics, any kind of polymorphic code needs to use dynamic dispatch. For example, if you add a generic sort() function for collections, you would need to lookup the comparison function for any arguments at runtime. Generic programming - be it through type parameters, ad hoc polymorphism, or whatever - allows a language implementation to know what that comparison function is at compile time and inline it accordingly. This is true for all kinds of things you may want to do generically. Sorting, layout of objects, iteration, etc.
Ultimately generic typing is a way of better expressing the constraints of a program to a compiler, which means the compiler can do a better job of generating code for specific instances. The downside is that the compiler needs to generate code explicitly for each concrete instance of a generic function or type, which means that it takes more compile time and code size is larger.
Edit: I'd also like to point out that JIT compilers and sufficiently-smart AOT compilers can do this sort of inlining if they can prove that for any combination of run time types passed into a sort() they have the same comparison function and inline it as needed. That said it's the same (or higher) complexity as a generics system, with more pitfalls (for example, if they can't prove it but make a guess the sort function never changes, then get the guess wrong, they have to de-optimize).
Monomorphization generally opens the door for faster execution on a modern processor.
If you have a layer of indirection (i.e., no generics, so you need to dispatch at runtime) then you wind up with an additional jump instruction.
While modern processors and branch prediction make jumps relatively cheap, they can’t avoid instruction cache misses if the dispatch is happening frequently.
By removing the layer of indirection, the compiler can choose to inline the implementation instead of using a jmp; this keeps our icache clean, and can lead to faster execution.
However, there’s a real cost to this! Inlining is, in broad strokes, often faster; but that’s not always true if you’re e.g. inlining a large number of instructions into a tight loop. As with all performance, profile to know the truth.
What is more meaningful perhaps is not the lack of a jump, but another optimization “pass” after the inline. Let’s say the compare function needs an object passed to it (e.g. in Java’s case). Without inlining, the call site would create two new objects, pass them to the compare function and jump there. With inlining and subsequent optimization, the call site might avoid the object creation entirely if the objects would only be destructured inside the compare code.
The ultra short version of what he is saying is that generics allow you to provide information to the compiler that enables inlining. So instead of doing something like this (pseudo assembly)
PUSH x
PUSH y
CALL compare   ; compare does the CMP internally, then returns
POP
You end up with
CMP x y
Of course this will vary based on the type that you're optimizing for but in all cases you're removing the need to push and pop the stack for every comparison. I'm not smart enough to talk about other optimizations enabled by generics though.
Anytime you can move a choice from runtime to compile time, it has the ability to speed up runtime. Such that if a lot of what your sort routine is doing at runtime is finding out what the type of the objects being sorted is, then moving that to a choice at compile time can recover all of that time.
Note, this is not just generics. Java could use overloading for years before generics to get similar benefits. Common LISP has type annotations that can give similar benefits. Generics is just a much easier/stronger to use construct for this in many cases.
I absolutely agree that modern programming languages need generics, but this particular issue with qsort is only a problem if you're not using link-time optimization (LTO). If you care about performance, you should be using LTO (all the FAANGs use it extensively, for example). And if you do use LTO, then the call overhead of qsort is eliminated via inlining.
As an example, here's musl's qsort() implementation paired with a use in Godbolt. (I couldn't actually use LTO on Godbolt because it doesn't LTO the standard library, but the effect would be the same.) Note that there are no call instructions at all and the compareme() function is completely eliminated by the optimizer: https://godbolt.org/z/6W8Wxf6xP
For LTO, you also need to statically link with your libc which is seldom done (do FAANGs do it?). Or write/duplicate qsort in your own library, which is more likely.
Not providing generics was already a mistake > 25 years ago. Coming from C++ with its templates, I was astonished when Java came out without generics in 1995. It was dumbfounding, because parametric polymorphism was so obviously an essential feature for static typing (otherwise one has to resort to runtime type-casts all the time, relinquishing compile-time type safety), and Java was positioned as an alternative to C++. It took ten years for generics to finally be added to Java. (And another four years later Go came out, again without generics…) I later read an article or interview that one of the original Java implementers actually pushed for generics, but it was too much work/complexity and they wanted to get the language out.
Don't forget C# 1.0 which also shipped without generics. The people designing these langs were not inexperienced, far from it. Yet the mistake keeps being repeated every decade. I feel like the industry would be a lot further along if we paid more attention to these lessons from the past.
> Don't forget C# 1.0 which also shipped without generics.
That's true, but work to support them was already underway and when C# 2.0 shipped with generics, the bytecode and VM already supported it. That's why C# generics support specialization for non-reference types. Don Syme and company knew they wanted generics, but it takes time to design a system like that, so they shipped C# before that work was complete.
Anders Hejlsberg, who designed C#, had previously designed Delphi, which has some support for generics (and I believe it had them before he left for Microsoft). I think the case may be that they're not making the same mistakes, but that they're just trying to ship software. Maybe though you would reply that generics are part of the minimum viable product of a programming language and it shouldn't ship without them. I can't say I disagree.
When C# first came out it was in competition with Java, which at that point didn’t have generics yet either. That was probably a relevant consideration for shipping the first version without generics.
It seems that these languages without generics keep getting popular. Perhaps they get popular in part because they don’t support generics at first, and so seem to be simpler?
It seems to be inevitable. Few languages can remain simple forever because they run into the realities of large scale systems development. If they remain simple, they either force the programs to become more complex or effectively constrain the complexity of the systems that are developed.
The question is for how long, if they hadn’t added generics at some point. There’s certainly a lot of pressure e.g. for Go to have added generics, and Rust and Swift likely wouldn’t have been as successful as they are if they didn’t include generics.
This is anecdotal, but Java 5 adding generics was crucial for me to staying with a Java job. Without generics, working with generic algorithms and facilities was just miserable, and there was great temptation to use code generators based on some sort of templated Java as a workaround.
Perhaps it's necessary to add generics to keep users, but my takeaway is this: if you have innovative language features, then you don't need generics to attract a passionate set of early adopters. If you don't have innovative language features, people won't use your language regardless of generics support.
Yeah, as I already said I don’t think Rust’s borrowing semantics would have been quite enough for it to be as successful without generics, and Swift as a modern replacement for Objective-C also wouldn’t have been quite as convincing without generics support (which was a pain point in Objective-C).
I can’t exclude the possibility of new innovative language features outweighing the penalty of not having generics (like Go’s concurrency support did for a while) even today, but I believe it’s becoming increasingly unlikely (for statically typed languages, obviously).
You can't do Rust without generics at all, they're utterly fundamental. Take the for loop. Rust's for loop is literally sugar for the infinite loop over an Iterator (a generic type) which emits an Option generic, with a break for None.
Rust calls such types "langitems" and they're required for the core Rust language. An allocator (for heap memory) isn't mandatory, fancy file I/O and networking aren't mandatory... But the langitems, and thus generic types, are mandatory.
Rust could conceivably have worked similar to C, where you have to cast void* to the correct type. You could still have borrowing annotations on such generic pointers, similar to how C(++) has const. Having borrowing semantics doesn’t necessarily imply a general-purpose generics feature. Stuff like iterators and option types would obviously be different then.
You could build a completely different language with some of Rust's features, if you go back a few years before Rust 1.0 there are lots of possibilities, but what I was getting at is that Rust itself is much more tightly wedded to the generics than many of the languages people associate with generics, a "Rust without generics" isn't Rust at all.
Even in C++ the generics are a later addition, Stroustrup's early 1980s language doesn't have generic types, templates were a proposal added to what would become the C++ 98 standard and the Standard Template Library presents generic types for containers in C++ years after the C++ language was first popularized.
I don’t think we really disagree. I was originally responding to the claim by DANK_YACHT stating that a (statically typed) language can be successful without having generics as long as it has other sufficiently innovative features. As a “counterexample” I hypothesized a version of Rust without generics but with the borrow checker (arguably the innovative feature most decisive for Rust’s success) and argued that I would doubt that this version would have been very successful, exactly because of its lack of generics (which I believe aligns with what you are arguing). I’m not sure if you’re arguing that this hypothetical scenario isn’t a valid counterargument to the original claim I was responding to.
Yes, technically you can do that but it makes no sense from a design point of view to have a language that cares so much about safety it adds a borrowing system but then says "fuck it" to type safety.
You have to have type safety to have memory safety and that was the whole point of Rust.
Rust's value proposition is being a modern language capable of low-level programming. Surely, the low-level part is essential, and the borrowing semantics brings it, but it wouldn't have much value if it didn't have modern features.
Just starting with C and adding a borrow checker would have had some value (and there are some linters that doggedly try to do this). I agree I’d much rather have Rust, though.
Both Java and Go are developed by mega corporations though. Hard to disentangle their successes from that. It's not hard to be successful (when success is measured as adoption) when you literally can't fail due to the giant pile and of never-ending cash and army of developers fueling your growth. Even if your language is terrible people will adopt it, if for no other reason than Google or Oracle are behind the project.
If Oracle were to disappear tomorrow I imagine Java would persist for quite a while as it's so embedded by now. It's the introductory language in schools across the US, so it's not going anywhere for now.
Go on the other hand I don't think would last very long (maybe a decade?) if Google decided to divorce themselves of the language, as it doesn't have the installed userbase of Java, but that's just my opinion. Who knows what would happen.
Yes, considering the impact to the standard library, but generics are very hard to do well. The C++ language committee is still 30 years later dealing with the fallout from the underspecified initial template implementation. Java's implementation seems to get mixed reviews.
Yes, it’s not easy, but not designing them in from the beginning only imposes further restrictions on the eventual design. And the lesson has been that generics will have to be designed in at some point for the language to stay relevant. Nowadays we at least have more existing languages — and advances in type theory — to learn from.
I would say that C++ templates have been extremely successful. The reason that the committee keeps tinkering with them is that users keep asking for even more features (very very few languages have all comparable features, so it is certainly not a matter of catching up).
It might have been understandable in 1995, because by then C++ already was mind-bogglingly complex for a lot of applications, and it might not have been entirely clear which of its features were the worst in that regard. But I was only starting programming with QBasic in 1995, so this is just a thought, not a fact.
Ada generics are easy to understand and were at least publicly known for ~12 years in 1995 (going by the publication date of the first Ada standard, older if we go to drafts and discussions prior to that standard's release). Generics aren't hard to use, and they aren't novel. That there's any debate nearly 40 years after at least one major procedural language introduced them is mind-boggling to me.
People really ought to look into Ada more. The language has a pretty neat type system (ranges are neat!), and it even has proper language constructs to do concurrent programming in it (had it for quite a while!). FWIW, search for "generic" here: https://en.wikibooks.org/wiki/Ada_Style_Guide/Concurrency
The title doesn't exactly reflect the contents. It should have been "modern programming languages that aim for high performance require generics", which is in my opinion close to the truth. It is a different matter whether a modern programming language should aim for high performance or not.
Well, the 2 main reasons to make a new language are higher performance and higher productivity. Generics are a clear win for productivity, but there's less widely available info on their effect on performance.
Static vs. dynamic typing has nothing to do with performance. You can have a reasonably performant implementation with dynamic typing, or conversely a statically typed language with zero concern about performance.
That's empirically not true. The dynamic language (implementation) that is closest to static typing performance is probably Javascript, and it achieves this using an absurd amount of work to infer static types and to jit the code.
If instead you make the reasonable assumption that dynamic typing => pointer indirection, and look at how memory works in modern machines, it's also theoretically untrue for a huge number of programs.
Not all code is written with performance in mind. In fact, most isn't. This alone gives much leeway to dynamic languages; if they can make a reasonably performant program easier to write, they can pay off even when their maximum performance is limited (this is my definition of "reasonably performant"). The exact amount of leeway will of course depend on circumstances, but I guess 10x is a fair estimate, which many JIT implementations can reach---not just crazy beasts like JS engines.
I agree with the conclusion (generics good) but the argument's not good.
All your 25 year-old projects will have performance bottlenecks to the extent you don't look for and address them.
The idea here that if you use generics your hot-path inner loops will be automatically optimized is nonsense. Yes, here and there you may get lucky -- the generics will expose some code to an optimizer which will generate some great instructions without any extra work by you.
But while shaving off a few bottlenecks is nice, if your goal is performance, the bulk of the work is the same: you'll still have to consistently profile and optimize your code.
Essentially every language provides some form of generics. Even Go 1.0 had `append` (a polymorphic function) and `map[K]V` (a polymorphic type). What Go 1.0 didn't have was user-defined generics. I am still of the (unorthodox) opinion that adding more built-in generics would have been a better solution than opening the door to arbitrary user-defined generics. Providing a generic `sort`, `reverse`, and a few others would cover 90% of the cases mentioned in every "Go needs generics!!" article.
I completely disagree. Putting builtins on some special pedestal and demoting user-defined classes and functions is always a mistake IMO. I also don't see how this is a reasonable approach even in the cases you mention. If map is polymorphic, but I can't define polymorphic classes myself, it seems impossible to define classes that can hold arbitrary maps. So I can't even wrap map.
Also, if I'm reading https://go.dev/tour/moretypes/15 correctly, append is a variadic function and not a polymorphic function. That's something very different.
> Also, if I'm reading https://go.dev/tour/moretypes/15 correctly, append is a variadic function and not a polymorphic function. That's something very different.
It is both variadic and polymorphic.
> Putting builtins on some special pedestal and demoting user-defined classes and functions is always a mistake IMO.
There is a line of thought in programming language design that claims everything should be user defined, but too much customization of everything inevitably leads to a lot of fragmentation, subtle incompatibilities and lots of ways to solve the same simple problem.
Avoiding all of these issues is an important part of go's reasoning behind some of their design choices and the proposed idea could probably fit that quite well.
The problem is that there is no mechanism that can deal with the remaining 10%. You either copy-paste code, add overhead or use some macros. For languages that are not well-suited for the last option we can see how “well” it turned out in case of C, where everyone has their own data structure lib, or use goddamn linked lists in a supposedly performant low-level language.
I think the idea that a language ought to cover 100% of possible use cases has done far more harm than good. It's a ridiculously high bar. We should aim for more "Pareto-optimized" languages -- languages that address 80% of the problem space with 20% of the complexity.
"What Go 1.0 didn't have was user-defined generics."
If you sit down and carefully write out the use cases for generics, which extend well beyond "parameterized data types", you'll note that Go interfaces cover a lot of them reasonably well, which is how it has been a viable language for versions 1.0 - 1.17. While this is an unpopular opinion, I don't really consider 1.18 to have "added" generics so much as completed them. (As evidence, consider how important "interfaces" still are in generics, and I can attest from personal experience with helping people that there's a lot of people who try to fire "generics" at a problem when they just need interfaces.)
I say this in support of the original post. Otherwise one must reconcile how Go could be successful "without generics"; my suggested resolution is that it shipped with "bad" or "incomplete" (more charitably) generics rather than no generics. Compare programming in Go with programming in C and the differences become very clear. There's a reason so many people were able to smoothly transition their code bases from Python to Go, for instance, which a view of Go in which it literally has "no" generics finds hard to account for. You need more than just "maps" and "slices" to do that successfully.
Honestly, I enjoy the current answer to this (which was around long before Go's generics) - there is the `sort` package, which provides a lot of baseline sorting functions for built-in types, and then a function that takes a slice of `interface{}/any` and a function to sort the elements of that slice.
I agree wholeheartedly. If your language has static types but no generics, it's akin to supporting functions but not parameters. Static types without generics are like the `GOSUB` of type systems.
Generics or otherwise, I just want type info to flow into and out of things like 'map' or 'SpecialContainer<T>', and for my editor to surface that information.
To say something that’s maybe obvious: generics don’t seem so necessary for languages with dynamic types. (Caveat: I definitely remember wanting to express parameterised types in Common Lisp but I was probably doing something wrong).
I’d argue that memory safety feels like more of a requirement but there are plenty of new languages which don’t have it.
The performance argument (for generics in statically typed languages) also feels weak to me.
Firstly, you can get much of the same value as generics using parametric polymorphism in a language like ML. E.g. sort might have a type like:
type ordering = LT | EQ | GT
val sort : ('a -> 'a -> ordering) -> 'a array -> 'a array
Which reads as ‘sort takes a function to compare things of some type 'a, and an array of such things, and gives you a (sorted) array of those things.’
You can get type safety enforcing correct call sites (i.e. no sorting an array of strings with an int comparison) and type safety in the implementation of the sort function. However, this could be implemented with (machine) code generated for the sort function only once, rather than a separate function for each different type it’s used with. Indeed this is what you’ll get with the classic OCaml compiler. The thing you would want for performance is either sufficient inlining or generics which generate new code for each call.
So it seems that performance is not a necessary consequence of that sort of code generation. (Also, you will get worse type checking with something like C++ templates, because they are type-checked where they’re instantiated rather than where they’re defined.)
Secondly, this style of generics may lead to code-size bloat, which may itself lead to poor performance. Sure, it is nice if you can speed up the functions on your critical path by 6%, but if you have a big program doing lots of things, the extra cache misses and reads from having more code may slow you down. It’s maybe not great if, e.g., you have a separate function to peek inside a future and extract the value for each different thing you’d put in it, though such a function would likely be inlined either way, so it isn’t a great example.
Found a bug though: the C qsort() comparison callback needs to be able to express a being less than b by returning a negative value. The post's version returns 0 if equal or 1 if greater.
My typical implementation for numerical types goes something like this:
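(A sketch in the shape qsort() expects; the name cmp_int is just illustrative. The (a > b) - (a < b) idiom yields negative, zero, or positive without the overflow risk of returning a - b.)

    static int cmp_int(const void *pa, const void *pb)
    {
        int a = *(const int *)pa;
        int b = *(const int *)pb;
        /* negative if a < b, zero if equal, positive if a > b */
        return (a > b) - (a < b);
    }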
Another aside: C++ also has the spaceship operator. If you default it, the compiler effectively gives you ==, <, >, <=, and >= for free. You can provide your own ==, too, if that would be faster, and the compiler will use that to generate !=.
There can still be reasons to define all the operations explicitly, but those are rarely encountered.
Even further aside, in discussing language features and design, the expression "but those are rare" comes up frequently. For different languages, its meaning may be radically different. In C++, it means "for that case, you have to do something a little less convenient". For almost all languages, it means "we don't handle that case, you're on your own". That depth is what has kept C++ on top of the performance and major system implementation heap for decades, not just, as is often asserted, inertia. Achieving depth takes decades.
No, functional languages also have generics. It is a type system feature, not related to OOP.
Maybe the best example of a language without generics is C. In short there are 3 solutions to counter the lack of generics: copy-paste the same code multiple times with minor changes, make this copy-paste automatic with macros, or circumvent the problem by further abstraction.
Let’s say you want to create a vector data structure that can grow. It is not a hard thing if you know the size of each element, so you can trivially write it for, say, ints. Now you want to reuse your logic for your custom structs that have a different size - the only change you would have to do is replace the element_size constant and repeat the whole thing with a different name (first solution). You might try to write some ugly hack with C’s preprocessor macros to generate this thing for you behind an easier-to-use wrapper, or you could just implement a wrapper that has next/prev pointers and use a linked-list structure (with very bad performance compared to storing the elements next to each other in an array). C code often uses the third option due to the inconvenience of the first two, which would be solved by generics.
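A sketch of the second option, with illustrative names and no error handling beyond the realloc check; the preprocessor stamps out the same push logic once per element type:

    #include <stdlib.h>

    #define DEFINE_VEC(T, NAME)                                       \
        struct NAME { T *data; size_t len, cap; };                    \
        static int NAME##_push(struct NAME *v, T elem) {              \
            if (v->len == v->cap) {                                   \
                size_t new_cap = v->cap ? v->cap * 2 : 8;             \
                T *p = realloc(v->data, new_cap * sizeof(T));         \
                if (!p) return -1;                                    \
                v->data = p; v->cap = new_cap;                        \
            }                                                         \
            v->data[v->len++] = elem;                                 \
            return 0;                                                 \
        }

    struct point { float x, y; };    /* some custom struct */

    DEFINE_VEC(int, int_vec)         /* struct int_vec + int_vec_push() */
    DEFINE_VEC(struct point, pt_vec) /* same logic, different element size */

Usage is then struct int_vec v = {0}; int_vec_push(&v, 42); -- workable, but every error message now points into a macro expansion, which is part of why plenty of C code falls back to the linked-list route instead.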
Aahhh, now I get it! So you basically write code for "one size fits all" for the programmer's convenience, at the cost of performance/general efficiency?
If you don't have generics, you either check the data type at runtime (e.g. dynamic types, or dynamic dispatch in a statically-typed language) or you write different functions for every data type you want your function to handle.
Generics are orthogonal to OO, they're about parameterizing your code with types (and sometimes values) rather than having to make many copies and tweaks for each possible specialization. Ada had generics in 1983, and only got OO in 1995.
> Can someone explain what people did to solve these problems before someone came up with generics?
Like many things in programming, when your language didn't have a feature you just did it yourself. You didn't have one sort() method you had a different one for integers, floats, strings, etc. Or you used other language features that could accomplish the same thing only a little messier.
The article is fun to read, but I’m not convinced C’s lack of generics prevents it from supporting fast generic sort. Imagine qsort() being a header library, and a compiler that can inline heavily. What would prevent C from inlining the compare() function, recovering the element types, and producing specialized sorts?
That misses a fundamental point: that is about equivalent to textual substitution. With enough inlines, the compiler can patch together a specialized version of the function applied with that comparator.
What it can't do is vary details of the implementation, at compile time, according to properties of the types operated on. C++ can do this. Rust can, to a degree. Go cannot, because it doesn't understand much about types. C macros cannot, because the macro system has no conception of types at all. The C optimizer can apply what it knows of types, but you have no way to tell it any more than it can puzzle out for itself.
If you are not used to a language that really supports strong types, you probably don't understand how much of programming work is offloaded onto the type system when using such languages. The C-grade "type checking" such languages provide is just the most trivial possible use of compile-time types. A powerful type system makes a language able to do automatically what you almost always cannot afford to do at all in a language without.
There was a C library - posted here but I cannot remember the name - which used the preprocessor to avoid invoking a function call per comparison. You'd use it something like this:
#define SORT_ELEMENT_TYPE int
#define SORT_COMPARE(a, b) sign(a - b)
#include "specialized_sort.h"
And the header would define a function with the following signature:
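Presumably something along these lines (the exact name and parameters depend on the library's own conventions, so treat this as a guess):

    /* hypothetical; generated by specialized_sort.h from the macros above */
    static void specialized_sort(int *array, size_t count);

with int coming from SORT_ELEMENT_TYPE and the comparison expanded inline in the body rather than called through a pointer.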
I've written code just like this. Another option is to make the entire sort function a C macro. This would be feasible for something simple like quicksort.
But generics provide another benefit: in theory with them, the compiler should be able to determine that two high-level types are actually the same machine type, so in the above example, SORT_ELEMENT_TYPE int and long would not generate duplicate code on many machines.
It's not necessary to use macros. Just make both the sort function which the comparison function is passed to, as well as the comparison function itself, both "static inline". The compiler is smart enough to figure it out.
Why is this limited to the preprocessor? I imagine the compiler proper could inline the compare function parameter into the qsort implementation body in the typical case, at the qsort() call site. (Assuming qsort's implementation was in a header, as the original comment posited).
You're right. I tried it here https://godbolt.org/z/457dYWfq5 and if the generic function is visible to the call site and inlineable, then the compare function can be inlined too. I didn't know this; I had always thought that the semantics of function pointers prevented this from being achievable!
If both the comparison function as well as the sort function which the comparison function is passed to are declared 'static inline' and available in a header, the compiler is smart enough to figure it out and inline the code, provided optimizations are turned on.
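A minimal sketch of that arrangement (illustrative names; insertion sort rather than a real quicksort to keep it short, and a fixed temporary buffer that assumes small elements):

    /* my_sort.h -- everything visible to the caller, everything static inline */
    #include <stddef.h>
    #include <string.h>

    static inline void my_sort(void *base, size_t n, size_t size,
                               int (*cmp)(const void *, const void *)) {
        char *a = base;
        char tmp[64];                     /* assumes size <= 64 for this sketch */
        for (size_t i = 1; i < n; i++) {
            memcpy(tmp, a + i * size, size);
            size_t j = i;
            while (j > 0 && cmp(a + (j - 1) * size, tmp) > 0) {
                memcpy(a + j * size, a + (j - 1) * size, size);
                j--;
            }
            memcpy(a + j * size, tmp, size);
        }
    }

    static inline int cmp_int(const void *pa, const void *pb) {
        int a = *(const int *)pa, b = *(const int *)pb;
        return (a > b) - (a < b);
    }

At a call site like my_sort(buf, n, sizeof(int), cmp_int), a compiler that can see both definitions will typically inline cmp_int and constant-fold size with optimizations on, though nothing in the language guarantees it.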
Does the necessary pointer casting in the implementations also make it slower than it should be (the specialized versions generated by compilers that support generics wouldn't have any casting)?
Its not the pointer casting, its the indirect function call that probably can be cached but it will always be another memory location that the CPU will need to jump to.
If the compare function is inlined into the qsort() body, there wouldn't need to be any function call. I think that was the point of the question above.
The way to answer this question is to look at the assembly code generated by the compiler with various optimization levels and see whether the compiler ever inlines the compare function.
I think complexity was the main argument against, and the majority opinion around performance was that the status quo was good enough. Unless you meant compile-time performance, in which case yes that was another argument against.
A key driver of Go design was compiler performance. Generics make the compiler more complex, use more memory, etc. and presumably require a less performant compiler.
I don't know what the implications are for Go with generic types. Either they've convinced themselves that a fast generics-capable compiler is possible, or the compiler performance goal is no longer the high priority it once was.
I like how C++ gets the verbose template parameter name of RandomAccessIterator (repeated 3 times), while Rust gets T. It's sort of like he's saying, "if you choose to use C++, this is the sort of muck you'll have to deal with." He's not wrong, it's just funny.
The C++ code could also have just been `T`. It's a bit of an odd comparison because `RandomAccessIterator` is just a name the author chose for the template typename.
I think that's what the person you responded to was saying actually, that the author specifically chose a verbose template parameter for the C++ examples.
You've confused RandomAccessIterator (a perfectly reasonable name for a Rust trait, but not really in the usual style of the C++ standard library) with std::random_access_iterator, an actual concept defined in C++20.
std::random_access_iterator, like most of its fellows in C++20, is largely duck typed: there's a semantic "model" provided, but this is not enforced anywhere, and your compiler will actually accept completely bogus types so long as the syntax matches. The standard washes its hands of the consequences by saying that if you don't obey the model your program is not valid C++, and the compiler is not expected to detect that and tell you about it.
As of C++20, with "Concepts", that name is not just a placeholder. It means something that enables the compiler to do work for you that Rust cannot. The most trivial of that work is to check that the type provided implements a comparison operator, which mostly affects error messages if you pass a type without one, but you can use it to make types pick up a great deal more responsibility.
It is not often mentioned on HN that whole swathes of functionality modern C++ coding relies on are entirely missing from Rust. Clearly, people can get along without powerful features -- important programs are coded even in C -- but it is a great deal more work. One consequence is that C++ and, to a degree, Rust programs are faster than programs coded in weaker languages, simply because more time is available to pay attention to performance, and any such attention to a library may help many more programs. Weaker languages are simply unable to express usability improvements and optimizations you would like to capture in a library.
> check that the type provided implements a comparison operator
If I understand correctly, this is done with traits in Rust. In this case, the PartialOrd and Ord traits. More elaborate use cases for traits cover things like "is this type safe to send between threads" and "does this type have a statically known size on the stack".
Nathan is talking about std::random_access_iterator, which is a C++20 concept, whereas the text in the blog said RandomAccessIterator, which was a C++17 (and earlier) iterator "category", now named LegacyRandomAccessIterator, which doesn't actually do anything in code - so yeah, you could just as well write T.
C++ concepts are just duck typed, whereas Rust's traits have actual semantics.
For example in C++ a float (the 32-bit floating point type) is std::totally_ordered because it has the necessary operators so "quack" - even though of course this type isn't actually totally ordered. If (when) you rely on this sort of promise your C++ program is "ill-formed, no diagnostic required"...
In contrast Rust's 32-bit floating point type f32 is not Ord, because well, it isn't totally ordered is it? NaN. So sort() doesn't work on say a slice of f32, because sort is defined over totally ordered types. You will need to provide an ordering function to explain how you think they're ordered.
C++ had no choice, you can't show up decades late to the party and say "Now we're defining these semantic traits" when there's so much code out there without them. Rust started from a fresh slate. Now, Nathan here (and many like him) would prefer to make a virtue of this necessity.
Once upon a time (before Rust 1.0) Rust did actually have a trait named RandomAccessIterator but it was not well loved because it wasn't practical to do the optimisations you'd actually want (they'd be unsafe, and an unsafe trait here is a big pain point), yet without them it was too slow to be reasonable to suggest anybody use it. So it was abolished.
In practice the slice type [T] is fine here, and the Ord constraint means this sort, like the real one defined on slice, enforces the requirement you'd actually expect.
The Rust sort() implementation is safe here too, unlike the C++ one. It is also stable, because Rust chooses to give you a stable sort, and have people who know they need it write sort_unstable() when that's what they meant, whereas C++ assumes you'd better know exactly what you need or else too bad here's an unstable sort by default...
The exact spelling of correct use of concepts differs from in the example code, but dwelling on it distracts from the valuable point.
It is not correct to say that C++ concepts are duck-typed. They can be used in a sort of duck-typed way, if you choose to. But it is not the normal way to use them. And, using them only as far as Rust's built-in checking can go wastes most of their expressive power, which is unavailable to Rust coders.
> dwelling on it distracts from the valuable point.
You brought up an entirely imaginary situation because you misread the C++ code.
> It is not correct to say that C++ concepts are duck-typed.
The actual mechanism delivered is duck typing. Rust comes with Eq and C++20 comes with std::equality_comparable. The former has semantics: Rust only provides implementations of Eq on types which exhibit equivalence. However, C++ provides numerous types which match the concept std::equality_comparable despite not exhibiting equivalence, because all it does is check that the != and == operators are present.
Would you prefer to say it's "structural typing"? Because the trouble is that it doesn't (and shouldn't) do all of what structural typing does, that's why "duck typing" is the correct phrase.
You are confusing limited use in certain Standard Library components with what concepts are able to do. Duck typing is, as I said, strictly optional. Where that is used in the Standard Library thus far, it improves upon the experience of legacy usage.
Doubling down when you have been corrected adds only heat, not light.
> Doubling down when you have been corrected adds only heat, not light.
All I've done is reported the facts, and, aside from hallucinating concepts in an example which didn't use them you've just insisted the facts aren't true.
I've provided examples like Ord and Eq, and you've provided nothing but denial. Nobody can do anything with that. The reason you find all those Concepts in the Standard Library with duck typing is that it's what Concepts do. If not for the duck typing they might as well just be comments.
Was that all you were getting at? Look at this fancy new comment syntax?
Presumably they intended to say that (1) the C++ thing can express something that Rust cannot, and (2) here’s an example of the thing. But then they didn’t properly delineate the two, so it might come off as if (2) is an argument for (1), when really (2) just talks about the thing in isolation and not as a comparison to Rust... The example was just a standalone example, unrelated to Rust.
If you read what it says, you will see that it says Rust has built-in the most trivial of the things that concepts are used to do, and cannot express the rest at all.
Rust's traits actually reflect semantics, which is what you want, whereas C++ concepts are duck typing and you just get lots of wishful thinking when (too often) that's not enough.
Take a look at Eq for example. Eq is a trait that inherits from PartialEq but has no additional syntactic requirements. It reflects equivalence. Choosing to implement Eq is a claim about equivalence for your type. It has semantics. With C++ concepts you'd have to invent a marker for that purpose, and so C++ does not have any equivalent to Eq. Instead it offers std::equality_comparable, which models equivalence but only delivers syntactic matching on the != and == operators, like Rust's PartialEq.
Whenever I read a post with this premise and the first thing that pops up is qsort, I instinctively call BS. (Sorry to the author, no offense meant, I didn't read the post.)
qsort is the prime example of why (when!) generics do not matter. Half of the time, if you need to sort, you're doing something wrong and the data should already be sorted (that's a sort in constant time). If that is not the case, but performance matters (which is quite unlikely by itself, since sorting is probably not where your program spends its time), then it is extremely likely that it's easy to write a much quicker sort than poor std::sort by taking advantage of the context. For example, some bucket sort which you could say runs in linear time.
The time spent waiting for bloated C++ templates to compile can instead be used to implement a sort routine that beats std::sort for the specific situation, and to improve on it 12 times over.
There is yet another point that is frequently heard, which is that compilers can inline function pointers if the "qsort" is defined as static inline. I'm just not sure how to guarantee that the inlining is actually performed.
To me this goes mostly to show that most people that whine about generics & performance don't know a lot about performance.
This comment really misses the point of the post and kinda just feels like you have an ax to grind about sorting. This feels like when people say "don't worry about memory safety, just don't write bad code". "don't worry about generics, just don't sort"
Generic sorting is something all programmers are familiar with so it makes good examples.
OOP syntax is not in fact required, and such hierarchies are not in fact presented for that reason. They are used to illustrate a concept using a familiar subject.
OOP is rarely the right way to organize a program, but almost any sufficiently large program has one or more subsystems that map naturally to OOP. A language that supports it is helpful in those places. Saying you could code the same thing without the feature wholly misses the point: of course you could, but it would be more work, and there would be more places to make mistakes.
The same applies to any powerful language feature. Where it is useful, using it helps. Where it is not useful, another may be. Languages lacking powerful features make the programmer do more work, leaving less time and attention to make the program, or other more important programs, better.
I wasn't saying there can't be a good way to put a feature to use. And, like any programmer that is maintaining a level of pride, I'm aware I am constantly looking for reasons why certain features aren't needed or are even bad in the global picture, at least in a given context. Looking for a minimal set of axioms to build the world on does make sense. Language feature can be seen as axioms in the sense that it's hard or impossible to break them down, since they're hidden in the compiler.
Regarding OOP syntax specifically, let us agree that it is a question of religion whether writing foo.bar() instead of foo_bar(foo) really is a huge saving of work. There are numerous counter-arguments (language-specific ones as well as more general ones) for why it might not be, and could in fact be detrimental.
I disagree with the view that using any feature to quickly get something done leaves more time to make the program better. In my view, less focus on syntactic superficialities like OOP method syntax, and premature optimizations like using std::sort, leaves more time to focus on what really matters, to understand what a program does and needs to do on a global scale, and it holds me back less waiting for slow builds.
All of that is pretty subjective, and there are certainly good examples for many different approaches, and naturally one is most familiar with those that one likes personally. The important thing is to be happy and be / feel productive.
OOP is absolutely not about syntax. At all. The differing syntax is nothing but a visual clue, for benefit of readers, that some late binding might be going on.
Literally anything that takes your attention necessarily takes it away from something else. Anything that does not need your attention anymore frees it up for important things you now have more time for. That is true for everyone, unless you waste it on looking at cat pictures (or, here) instead.
Failure to understand the uses and consequences of unfamiliar language features is not a virtue. Every single line of code anyone writes does not use almost all features of a language. All that is different in your case is failure to use features in those places where using them correctly would have made your code better.
The important thing is good code. If your code is not improving, you short-change yourself.
I often like your comments but sometimes not. Here I would definitely downvote if I could, let me explain why.
- You are disagreeing with me on things I did not say and that were not the subject of discussion. ("OOP is not about syntax")
- You are "teaching" me, from a high point, quite arrogantly. I'm particularly irritated by this: "The important thing is good code. If your code is not improving, you short-change yourself." Suggesting I am not improving, and giving commonplace advice as if I wouldn't realize that.
- You're suggesting a "failure" of mine to take an appropriate action, without an existing situation to judge.
- You are disagreeing with something I said and that I made an effort to create a reasonable argument for, but you're not presenting a counter-argument and instead are just repeating what you said before.
- While I think you might be super smart, I feel it was altogether a low quality comment of yours with little grounding in reason. Let me go back to this: "Failure to understand the uses and consequences of unfamiliar language features is not a virtue". Not that I ever claimed it was. Conversely, do you think that understanding the uses and consequences of all or most language features is a virtue? Do you think tools matter over results? Do you think a mathematical theory is more beautiful if you introduce more concepts than necessary to elegantly transport the message? Do you agree that features have a utility but also a cost? Can you see how writing a piece of code using a new feature creates new interactions and complexities with the rest of the code (including seemingly unrelated parts), producing new headaches?
(easy examples: isn't it enough to use functions? Why should I use methods, creating new problems when I need to take a function pointer and making it harder to find places of usage? Why should I use private methods, when there is no good reason to declare them in the header in the first place (static functions in the implementation file are just fine!) especially if this exposes implementation details?)
Do I have to apologize that while there might be theoretically ways to accommodate uses of many different features, I'm simply not willing to invest the effort to figure those out - effort that I'm not sinking into problems that I'm more interested in? I simply don't feel like the programming language I use (I like C) is holding me back from what I'm interested in doing.
First: I apologize for my tone. You did not deserve to be talked down to.
In answer to questions. Yes, it is a virtue to understand more language features, even in languages one doesn't use. Nobody understands all features in all languages. Many think they understand features they do not. (It appears, by your remarks, that you do not understand features that support OOP, so cannot reason reliably about their potential value for particular cases.)
A program using more powerful features, where they contribute something useful, will be shorter and clearer, and offer less scope for mistakes, than one with the same semantics cobbled together from low-level primitives. Anyone reading the latter has to dig down to see whether it really attempts what it seems to, and really does what it attempts. There are no such doubts about standard features.
A proof that relies on previously proven results is usually considered more elegant, but one that relies more on axioms may have more generality, which is also desired. The analogy to code would be a program relying on fewer third-party libraries, making it more portable and less fragile. That reasoning would not apply to language features, which are more akin to axioms.
(Euclid proved as much as he could without using the fifth axiom, which turned out millennia later to make those results applicable in spherical and hyperbolic geometry. This does not seem analogous to anything on-topic.)
Do features have a cost? Everything costs, so the cost of one choice can only be compared to the cost of another. Some costs are borne once, e.g. learning a feature. Some cost at build time. Some cost at runtime. We need to know which is which. Any abstraction used imposes the cost of knowing its domain, and, where that is limited, ensuring each use entirely fits in it. This cost is balanced against the load of extra detail exposed in not using it.
Results and tools are not at odds. Results include more than program output; the program itself is a result. Drawing upon the full suite of available tools produces better results. A screwdriver handle might substitute for a hammer, but it is not a thing to brag on.
Using a feature that doesn't contribute anything is just flailing. Everything in a program should bring it closer to achieving its goals. Not knowing to use a feature where it would have helped is paying a cost in every such case just to avoid the one-time cost of learning the feature, generally without even knowing we are paying it.
Features are added to languages always against enormous resistance. Any that make it in have passed the test of making important programs better. Not learning an available feature has the consequence of not knowing where using it would have made one's program better.
Not using a feature because you don't understand it is very different from not using the same feature because you have adjudged that it adds not enough value. Our tools affect us more than we imagine. What we are interested in doing is always affected by what we know how to do, or know can be done. We attempt more if we know more.
Attention is always the scarcest resource. Spending it learning language features takes it away from other things. But learning language features is an investment that pays back anywhere the feature might be useful, even where we don't end up using it.
> (It appears, by your remarks, that you do not understand features that support OOP, so cannot reason reliably about their potential value for particular cases.)
I would be interested to learn what you think those features are; I have years of experience fooling around in languages such as Python, Javascript, Java, C++11+, Haskell, and also other less common ones. Maybe I have in fact not understood how to best make use of certain features. I've always been a little bit obsessed with performance & control, and I ended up in frustration about non-orthogonality / non-composability of modern language features and approaches. So I found myself on the path of exploring what's possible when limiting myself to basic flow control and basic read and write operations and function calls.
It's certainly been a journey of ups and downs, but I'm in good company, and I've gotten to know some people who are amazingly productive this way. Basically my standpoint as of today is that at least for systems programming, languages should get out of the way. There is little value in using features that make it easier to forget about certain aspects of programming, when those tasks are actually only a small part of the total work, and for good performance it's important to get the details right - which can be done mostly in a centralized location. Of course I'm thinking about resource and memory management. (Most critics of bare-bones C programming still think it is about matching malloc() and free() calls. That's a terrible way to program, as they rightly claim, and there are far better approaches to do it that do not involve language features or libraries).
What matters IME is mostly systemic understanding, and sinking time into getting the details right. Papering over them is not a solution. YMMV, maybe I'll soon have another series of attempts at "modern" approaches, and whatever the way to programming heaven ultimately is, I've already learned a couple of insights that transformed my thinking significantly, and that I wouldn't have learnt otherwise.
The essence of OO, from the standpoint you describe, is a struct containing a pointer to an array of function pointers. Such an object is what, in kernel memory, a file descriptor in Unix locates; read() on a file descriptor calls through one of those pointers. Syntax this, "private" that, "this" the other are window dressing. You obviously can cobble that together by hand, as has been done in Unixy kernels forever. But, as a language feature, it is always done exactly right, with no distracting apparatus. You never need to check if it is really meant for what it seems to be, or if it really does exactly what it seems meant to do.
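In C, a minimal sketch of that shape might look like this (names are illustrative, loosely echoing the file-operations tables found in Unix-like kernels):

    #include <stddef.h>

    struct file;                              /* forward declaration */

    /* the "array of function pointers" */
    struct file_ops {
        long (*read)(struct file *f, void *buf, size_t len);
        long (*write)(struct file *f, const void *buf, size_t len);
    };

    /* the object: a pointer to its ops table plus whatever state it needs */
    struct file {
        const struct file_ops *ops;
        void *private_data;
    };

    /* a generic read(): dispatch through the table, i.e. a hand-rolled virtual call */
    static long do_read(struct file *f, void *buf, size_t len) {
        return f->ops->read(f, buf, len);
    }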
It is the same for other powerful features. You might, in principle, write code by hand to do the same thing, but in practice no one has enough spare time and attention, so you do something more crude, instead. A destructor is not something sneaky going on behind your back: you scheduled it explicitly when you called the constructor, certain it will run at exactly the right time, every time, with no possibility of a mistake. You never need to check up on it, or wonder if it was missed this one time.
The destructor is still the single most powerful run-time feature of C++. It was the only instance of deterministic automation in any language, ever, until D and Rust adapted it. Most of the other powerful features in C++, as in other powerful languages, direct action at compile time. The closest analogs usable in C are cpp, yacc, and lex. You can't cobble up your own without abusing macros or making other programs like yacc. The world has not been kind to attempts at those. But build systems see no problem with powerful language built-ins.
Static types are far more powerful than what C does with them, just checking that function calls match declarations. (Function declarations, BTW, were copied into C from C++.) A powerful language puts types to work to do what you could never afford to do by hand.
Sure I understand all of that, and I'm really surprised you thought I don't.
You describe the essence of OO being implemented by vtables and how C++ does the right thing while many Unixy kernels do this by hand.
(I understand the part about manually set up vtables as well, and in fact one of the things I do for money is maintenance of a commercial Linux kernel module that has lots of these. I don't feel like Linux has a problem encoding these tables by hand, while it is one of the projects that has the most need for vtables because it fits the description of a system that needs to be very flexible. Linux has many more relevant problems than hand-crafted vtables.)
Maybe it will surprise you, but for the greenfield stuff that I do in my spare time and that I'm allowed to do as my job, I can write dozens of KLOC of code without ever needing to set up a vtable. This probably isn't even that special for how I approach things - if we go to any random C++ class, chances are that by far most of the methods there aren't virtual, or at least are prematurely virtual.
Personally I think of vtable situations as less than ideal from a technical perspective. I get that these situations can arise naturally in highly flexible systems where there are lots of alternative implementations for the same concept. On the other hand, I think vtables are problematic as a default, in particular for abstractions that aren't yet proven by time to work. They tend to create this kind of callback hell where stuff never gets done in the most direct way, and where you end up with much more repeated boilerplate than you'd like.
There are many other ways to go about "interfaces" that are not vtables, and it depends on the situation which is best. Vtables are callback interfaces, and I try to avoid callbacks if possible. As described callbacks make for incoherent code, since the contexts of the caller and the callee are completely disjoint. Another problem is that they imply a tight temporal coupling (a callback is a synchronous call)!
I would say the primary approach that I use, in places where vtables are often advertised, is to decouple using queues (concurrency, not necessarily implying parallelism). This achieves not only decoupling of implementation but also temporal decoupling. Asynchronicity should never replace standard function calls, of course. But it is a great way to do communication between subsystems - not only for requests that could take a while to process (parallelism), but also from a perspective of modularity and maintainability.
read() and write() are the best examples, they are now starting to be replaced by io_uring in the Linux kernel. I think read() and write() maybe only ever made sense in the context of single-cpu machines. Doing these things asynchronously (io_uring) makes so much more sense from an architectural perspective and also from a performance / scheduling perspective if requests aren't required (or impossible) to be served immediately.
Leaving that aside, there are other ways to do vtables than the one C++ has baked in. A few months ago I watched a video about this by Casey Muratori that I liked, but I can't find it right now. He talked about how he dislikes exactly what you stated, because it promotes one way to "implement" OO (which might not be the best, for example because of double indirection) over others.
Regarding (RAII) destructors, it's a similar situation. If I need a language feature that helps me call my destructors in the right order, this could mean I have too many of them and have lost control. I can see value in automated destruction for script-like, scientific, and enterprise code, and admittedly also for some usage code in kernel / systems programming a la "MutexLocker locker(mutex);". But as code gets less scripty and more system-y, the cases where resources get initialized and destroyed in the same code location (scope) become fewer. I have decided to see what happens if I leave destructors out completely and try to find alternative structures, which I feel has been fruitful so far.
As I said before, OO is not a valid organizing principle for programs. It is just a pattern very occasionally useful. The feature is there for such occasions. It does not merit multiple paragraphs.
Destructors, by contrast, are a key tool. The same is true of other actually-powerful language features. Avoiding them makes your code less reliable and slower. You may assert some moral advantage in chopping down trees with an ax, but professionals use a chainsaw for reasons.
One of the projects I'm working on right now is a vector GUI with one OpenGL based and one software based render pipeline. It's also networked (one server only). Right now I can't tell you a single thing that needs to be destroyed in this program. When you start it, it acquires and sets up resources from the OS, and it uses those. Well, there is a worker thread that reads files from the filesystem, that one needs to fopen() and fclose() files. There is also a job_pool_lock() and job_pool_unlock() to read/write work from a queue...
When the program is closed, the OS will clean up everything, no need for me to do that (that would also be slower). And note that what the OS does is not like RAII destruction of individual objects. It's maybe also not like a garbage collector. It is like destruction of a (linear) arena allocator, which is how I try to structure my programs when that is required. This way reduces the need for destruction calls so much that I seriously don't worry about not being able to use RAII.
(The frame rendering also works like that: when starting a new frame, I reset the memory cursor for temporary objects that have frame lifetime back to the beginning, reusing all pages from the last frame. I'm not ever releasing these pages.)
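A sketch of that frame-arena pattern (illustrative names; a fixed backing buffer, and alignment handling left out):

    #include <stddef.h>

    struct arena {
        char  *base;    /* backing memory, obtained once at startup */
        size_t cap;
        size_t used;    /* the cursor */
    };

    /* bump allocation: no per-object free, no destructors */
    static void *arena_alloc(struct arena *a, size_t size) {
        if (a->used + size > a->cap) return NULL;   /* frame budget exceeded */
        void *p = a->base + a->used;
        a->used += size;
        return p;
    }

    /* start of a new frame: everything from the last frame is released at once
       by moving the cursor back; the memory itself is kept and reused */
    static void arena_reset(struct arena *a) {
        a->used = 0;
    }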
The widget hierarchy is constructed by nesting things in an HTML-like fashion. I have a simple macro that exploits for-loop syntax to "open" and automatically "close" these widgets. The macro works mostly fine in practice but can break if I accidentally return from this loop (a break is caught by the macro). An RAII-based solution would be better here, but I also feel that maybe there must be a better overall approach than nesting generic stuff in this way. The "closing" of elements in this case is needed not because they need to be destroyed (they are stored in the frame allocator after all) but because I don't want to name individual nodes in the hierarchy; I want to build the nesting implicitly from the way the nodes are opened and closed.
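One way such a for-loop macro can look (the ui_open/ui_close helpers are hypothetical; the inner loop is what lets a break in the body still fall through to the outer loop's increment and close the widget, whereas a return skips ui_close entirely, which is the failure mode mentioned above):

    struct widget;                      /* hypothetical UI node type */
    void ui_open(struct widget *w);     /* hypothetical: push node onto the hierarchy */
    void ui_close(struct widget *w);    /* hypothetical: pop it again */

    #define UI_WIDGET(w)                                                    \
        for (int _open = (ui_open(w), 0); !_open; ui_close(w), _open = 1)   \
            for (int _body = 0; !_body; _body = 1)

    /* usage:
       UI_WIDGET(panel) {
           UI_WIDGET(button) {
               ...
           }
       }
    */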
There is no way that you could convince me that "destruction" is an important concern in this program. It's probably not enough of a concern to make me switch to some idiomatic automation technology that has potentially large ramifications on how I need to structure my program.
That is literally what I described, and named. It appears I appear, to you, to be a lot of things that aren't so.
> Anyplace you do, there they are.
My point was, maybe, after all, there aren't that many things that have to be scheduled. And there are other, sometimes better (depending on situation) ways to schedule these things, too.
After all, there is a LOT of value in using plain-old-data without any "scheduled" stuff attached to it. POD is a huge tool for modularization, because it allows you to abstract from arbitrary objects to "untyped bytes".
But hey, guess I might go back at some point and use RAII for some things when I've learned enough good ways to avoid them, so I won't tend to paint myself in a corner just because RAII is always easily available.
Stable sort and binary search is the bread and butter of pretty much every database operation.
When you bulk insert into indexed columns, it sorts the inserted data and merges that with the existing index. If you do joins on non-indexed columns it often sorts the intermediate results.
If “my dude” offended you, I’m sorry. I was just trying to make it lighthearted. I get frustrated at the number of comments on HN articles that are mostly irrelevant if you’ve actually read the article.
> most people that whine about generics & performance don't know a lot about performance
Really it is almost everyone who doesn't know a lot about performance. Any modern programming language is targeted not at an expert but at a beginner (look at Rust, which is supposed to be relatively difficult, yet is quite quick for people to pick up in my experience).
Languages should be able to be performant even when used in poor ways. No matter what, people won't presort their data, since it likely comes from some database or CSV file or JSON response or who knows what. So a language should be trying to accommodate that use case.
- construct and query dictionaries (especially multi-maps, btrees, etc)
- group equivalent elements
- find the number of occurrences
- etc
Hashing solutions tend to be popular in modern language standard libraries, but not in C or C++ where in-place algorithms and simpler implementations are preferred. You could also make the argument that it's much harder to use hashes with generics as it's not always clear how to define good hashes for complex data types. Alternatively, sort requires only a < implementation.
> bucket sort which you could say runs in linear time.
This only works for integers in a small range. Also consider strings, doubles, or lexicographically ordered arrays.
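For what it's worth, this is what the small-range case being discussed looks like: a counting sort, the simplest bucket-style linear-time sort, assuming the keys are bytes (names illustrative):

    #include <stddef.h>

    /* sort n bytes in O(n + 256): count each key, then write the keys back out */
    static void counting_sort_u8(unsigned char *a, size_t n) {
        size_t count[256] = {0};
        for (size_t i = 0; i < n; i++)
            count[a[i]]++;
        size_t out = 0;
        for (size_t k = 0; k < 256; k++)
            for (size_t c = 0; c < count[k]; c++)
                a[out++] = (unsigned char)k;
    }

Once the keys are strings, doubles, or whole records, the trick stops being this simple, which is the objection above.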