The problem is that it operates on an abstract interface but it is NOT generic.
With any Generics implementation that generates code for the specific type under the hood (e.g. at compile-time like C++ Templates or Rust Traits, or just-in-time), there would be no overhead.
C's qsort() has a similar overhead vs C++ STL's std::sort() for much the same reason - the C version calls a comparator function but the C++ compiler can inline the comparator.
It seems like what amelius suggests in another thread would work too. I'm not advocating for this as desirable, but I'm thinking about the "possible" bit.
It's basically the same as what a JIT compiler does around type inference. If it can reason about the underlying type that's really going to Sort(), it can code generate a specialized version. The machinery for escape analysis that Go already has probably gets you at least part of the way there.
That pretty much boils down to a non-user-controllable templating done entirely through the compiler. Like I said... possible vs desirable.
What I take away from this, is that unless you are spending a lot of run-time sorting things, or you are sorting a lot of things, you might as well not bother to micro-optimize.
This article is not about micro-optimisations, it's about mesuring the cost of dynamic dispatch and virtual fonction call, which cannot be avoided in Go since there are no generics and the langage relies on interfaces instead.
I agree with you, but in the cases where it really matters you can still avoid it! There is community activity around code generation to fill in for the cases where generics would really be the right answer. Basically, you make a template and render it for your type, check that into source control, and use as normal.
This removes the dynamic overhead at the cost of having to run a code generator and maintain a rendered template library in your source. There's actually language support for hooking into these generators at build time.
> There's actually language support for hooking into these generators at build time.
Hmm, are you talking about "go generate" or is there a new feature? Because go generate is not a build time hook, but build time hook would be very useful in some cases.
To be honest, while the generate functionality has been helpful, I have to say it's fairly annoying to manage at times. It can be super tempting to create a new generation tool that writes "ugly" memory optimized code (a la the stringer example by Rob Pike) only to find out you need to debug that mess. In something like stringer it would be okay, but for something more complex, perhaps not.
This is just a minor complaint. There is at least a workaround, even if the answer is still typing a bit more than if you used Templates/Generics.
"It depends." Go's wicked fast for a scripting language, fairly slow for a compiled language.
I just grepped over an in-the-wild Perl source code base. It had 137000 lines of code, of which about 150 had sort in it, of which about half of them were either tests that wanted to maintain key stability[1] or other things that don't generally come up in Go. And this is a language where "sort" is just... "sort". So it is very easy to use, and still doesn't come up that often.
I think it's easy to overestimate how important sorting is. Especially after years of horror stories about bubble sorts taking hours. Most people's primary objection to Go's sort will be the need to write three stupid-simple methods rather than its speed, and in practice that's still just an annoyance rather than a stopper, unless you have a really sort-heavy use case.
[1]: For example, "code that emits HTML attributes specified in a hash". In real code you don't care to sort those, but if you want to verify in modern Perl that {a => 1, b => ">"} gets converted to 'a="1" b=">"', you have to sort because half the time they'll be reversed.
Yes but this is true for nearly every operation under the sun.
The real take away should be that generics do offer a sizable speed up for the trade off of added complexity. Modern compilers can leverage them to generate very performant code.
I can imagine the overhead for ints is large. But how often do you sort pure ints? A more realistic test would be some objects sorted on a field value.
That's great feedback. I'll try out this specialization on a simple struct (int32/string) combo later today. We can still get rid of the indirections around slice manipulation (Swap and Len) but obviously stay stuck with the Less.
Or try sorting points (struct of two float64s) by the x-coordinate, and compare the timing with similar JavaScript code running in V8. You might be surprised...