Hacker News
Interface Dispatch (2018) (lukasatkinson.de)
82 points by matthewwarren on April 18, 2019 | 36 comments

Some JVMs use fat pointers, where the pointer embeds the dynamic type of the instance rather than the statically declared type. This permits inline caches to verify the type of the instance without any indirection.

Delphi implemented interfaces along the lines of the C++ approach, and in order to be compatible with COM it didn't have much flexibility: the COM ABI mandates that interface methods be called through a vtable pointer stored at the start of the interface's instance data, with the interface pointer itself passed as the first argument.

This pretty much forces the vtable references to be stored in the object instance, one per implemented interface. Casting an instance to an interface means adding the offset to the vtable in the object data to the instance pointer.

The vtables for every interface point at stubs which subtract the same offset and then jump to the implementation, so that object data is where it's expected for a normal instance method call.

COM influenced the internal design of Delphi interfaces too - the first three slots were reserved for COM's lifetime management (AddRef/Release) and typecasting (QueryInterface) methods. I'm not sure anything practical ever came of that, though.

I reused the interface calling mechanism when I implemented anonymous methods in Delphi. Anonymous methods can capture variables from their declaring scope; the captured state needs lifetime management in the absence of GC, so I piggy-backed on COM's lifetime management. This reduced the effort to interop with C++ too.

I concur with the description of the Delphi implementation, as I've navigated it numerous times myself. I guess I don't understand the author's negative tone with regard to the C++/Delphi implementation. There's going to be overhead with any choice; C++ and Delphi chose to sacrifice memory usage for speed. Furthermore, the only part that really scales with the number of objects is the extra vtable pointer for each implemented interface in each newly constructed object. Hardly a significant overhead in today's memory-prolific computing environment. Also, pointer fixup thunks and unconditional jumps are easy to speculatively handle.

The author is interested in PL design, and thinks that while C++ style object implementations are great at offering compatibility with certain systems or interfaces, other approaches allow for far nicer semantics in a greenfield programming language.

For example, fat pointers make it possible to implement interfaces for existing types. I have experienced that as game-changing in Haskell, Go, and Rust.

I also disagree that any memory usage can be handwaved as “no longer relevant today”. Electron, or garbage collection: perhaps. But object size affects cache pressure, and cache utilization is critical to performance. C++ is typically good at keeping objects small, except when it comes to multiple inheritance.

Honest question: do you have a real-world example indicating that the extra pointers added by multiple inheritance actually result in performance degradation due to extra cache pressure?

With the C++ approach, there is also an optimisation which can be performed: put the method pointers in the object itself, instead of in a vtable. This is beneficial when there aren't many objects or methods, since it eliminates one indirection: instead of obj->table.method(), it becomes obj->method(). When I write "OOP" code in C, I usually do this instead of using a separate vtable.

As an aside, I noticed the word "klass", then looked up at the domain name and confirmed my suspicions. ;-)

"klass" is a common convention in OOP code when referring to classes, because "class" and "Class" are often reserved already.

Python typically goes with `cls`, particularly in conjunction with the `@classmethod` decorator.

Python code uses both.

Um, not really. PEP 8 explicitly says to use `cls`:

> Always use cls for the first argument to class methods.
Java seems to prefer `clazz`.

"klass" seems to somewhat outnumber "clazz" in OpenJDK, 1541 to 1021. Wonder if there's a reason they haven't settled on one.



C# prefers `@class`

Doing this prevents a large class of important compiler optimizations, such as devirtualization and inlining, since the compiler can no longer assume the function pointers are constant. Pitching a DIY vtable as an optimization requires a lot of hard evidence.

> C# generally follows the same approach as Java, but tries to perform more aggressive optimizations.

I don't understand this, given that it doesn't have deoptimization.

The Java example has one condition on the fast path, while the C# one has two nested loops with a condition inside!

Author here. Both the JVM and CLR do some pretty aggressive optimizations even for normal interface calls. In the Java section I also show inline caching, a standard optimization approach. The CLR calls its approach to caching “virtual stub dispatch”. Speculative optimizations and inlining would be more advanced optimizations that I did not cover.

But what about the worst case – the method lookup when the method is not in the cache? There, the CLR introduces an additional level of indirection for calls through interface types, which has benefits for memory usage and possibly dispatch time. But since this is the worst case, the complexity doesn't matter that much as long as the important cases (which can be cached) work well.

> In the Java section I also show inline caching, a standard optimization approach.

Thanks. Does the CLR do inline caching for interface calls which cannot be devirtualised without speculation? Or not because of the lack of deoptimisation? Or does it have self-modifying code instead?

It depends very much on which .NET runtime we're talking about (NGEN, Pre-RyuJIT, RyuJIT, IL2CPP, Mono, MDIL, .NET Native, .NET Core, Burst), and on which version of it.

The CLR routes interface calls through an indirect cell (or pad) that can be quickly rewritten to point to a different dispatch stub. Some stubs perform caching, whereas the generic resolver behaves as explained by the pseudocode in my article.

The “book of the runtime” has a graph visualizing these possibilities: https://github.com/dotnet/coreclr/blob/master/Documentation/...

So this is a self-modifying code approach.

Perhaps it's memory optimisation? Especially considering all the instantiations of generic types that Java wouldn't have.

Here's another interesting approach for Java:


Interface methods are looked up in a hash table containing function pointers (the table index can be computed at compile time). If there's a collision, a small stub is generated that dispatches to the right method.

Fat pointer cost can be mitigated by doing pointer pre-unpacking in hardware. That kind of support fell out of favor when RISC started beating CISC architectures, but if dynamic dispatch became a significant performance issue, I could see it going into certain processors. We have happily returned to an era of processor experimentation.

Speaking of interfaces in Go, can anyone explain this particular result? We've done a benchmark recently at work to test which way we would best represent a set. (Linux, go1.12)

      map[int]interface{} // where we put an empty struct in it.

     BenchmarkBoolSet-4           5886572     223 ns/op     39 B/op    0 allocs/op
     BenchmarkStructSet-4         6716685     206 ns/op     32 B/op    0 allocs/op
     BenchmarkInterfaceSet-4      4299072     313 ns/op    115 B/op    0 allocs/op

(I'm aware this is a bit of a tangent to TFA)

Can’t really say what’s going on here without seeing some code.

Well, the code is the same for all of them (but when using a bool map, we assign true instead of an empty struct). So we run this code inside a benchmarking function:

      m := make(map[int]interface{})
      // or m := make(map[int]struct{})
      for i := 0; i < t.N; i++ {
           m[i] = struct{}{}
      }
edit: fix code.

I'm speculating, but sizeof(bool) == 1, sizeof(struct{}) == 0, and an interface{} value is two words (16 bytes on 64-bit), so it might be as simple as that.


Yup that well could be it!

If you liked this you might also like the COLA system: https://en.wikipedia.org/wiki/COLA_(software_architecture)

> COLA is a system for experimenting with software design currently being investigated by the Viewpoints Research Institute. "COLA" stands for "Combined Object Lambda Architecture". A COLA is a self-describing language in two parts, an object system which is implemented in terms of objects, and a functional language to describe the computation to perform.

I'm curious about the history of the fat-pointer approach. The first exposition I personally ran across was http://erights.org/enative/fatpointers.html in the 90s. Can anyone point out others?

The earliest one I know of is in the Emerald programming language. There is a brief description of the technique in http://www.emeraldprogramminglanguage.org/OOPSLA-86-paper.pd... from 1986.

See http://citeseerx.ist.psu.edu/viewdoc/download?doi= for more history and description of the implementation. From that paper: "The run-time system constructed an AbCon for each <type, implementation> pair that it encountered. An object reference consisted not of a single pointer, but of a pair of pointers: a pointer to the object itself, and a pointer to the appropriate AbCon". (Called an AbCon because it "mapped Abstract operations to Concrete implementations".)

Great find! Thanks.

The ascii art is broken.

(Author) I am aware of the problem. For me it works on desktop but not on mobile browsers – apparently the Unicode box drawing characters are not really fixed-width on all systems?

One day I'll write a Pandoc filter that pipes these diagrams through a Shaky or Ditaa type tool first…

> For me it works on desktop but not on mobile browsers

Looks weird for me on Mac desktop Chrome. It looks like the default monospace font, Courier, lacks box drawing characters. So it probably falls back to some other monospace font which likely has different metrics.

A good CSS font stack for monospace fonts is something like:

    font-family: Consolas, Monaco, monospace;
The first catches almost all Windows users, the second almost all Mac users, and the last falls back to the browser's default.

Not in Mac Safari

Not on Firefox/Chromium.
