
Shenanigans With Hash Tables - fanf2
http://thume.ca/2019/07/29/shenanigans-with-hash-tables/
======
pcwalton
Objective-C uses a hash table cache for all method calls. It's a bit different
in that Objective-C deals with collisions by just not handling them: since
it's only a cache, collisions will result in going through the slow name
lookup path.

Mike Ash has a nice series that goes through the disassembly of objc_msgSend:
[https://www.mikeash.com/pyblog/friday-
qa-2017-06-30-dissecti...](https://www.mikeash.com/pyblog/friday-
qa-2017-06-30-dissecting-objc_msgsend-on-arm64.html)

~~~
jonathanstrange
But aren't hash tables orders of magnitude slower than dynamic dispatch with
jump tables? I'm just interested in PL implementation as a hobby, so correct
me if I'm wrong, but this seems horribly inefficient to me.

~~~
tcard
You can't use jump tables in Objective-C because methods can be added to
objects at runtime.

~~~
masklinn
That doesn't preclude using jump tables. You can also add methods to objects
at runtime in JS and modern runtimes use jump tables.

I expect you can't use jump tables in objc because you can't really "patch in"
jump tables at runtime as it's AOT-compiled rather than JIT-ed.

~~~
tcard
Right, but that's no longer "just" jump tables, you need additional JIT logic
for those tables to be correct.

Plus, in a JIT you can go even further and just do direct calls, no need for
the indirection of tables.

------
layoutIfNeeded
I recommend reading up on minimal perfect hashing: [http://ilan.schnell-
web.net/prog/perfect-hash/algo.html](http://ilan.schnell-web.net/prog/perfect-
hash/algo.html)

~~~
chaboud
This is what came to mind when I read this. Far too many professional
programmers are unaware of perfect hash functions. A disturbing number are
unaware of pathological collision cases.

For a complete set of methods known at compile time, a minimal perfect hash
is, well, kinda perfect...

~~~
trishume
In this case I did actually consider it but forgot to mention it in my post.
The slightly more complex constraints of having multiple different tables
using the same hash function make searching for the hash function more
annoying so I didn’t bother.

------
nwellnhof
Seems like the same idea as in this paper:

Efficient Implementation of Java Interfaces: Invokeinterface Considered
Harmless (2001)
[https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.21....](https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.21.7133)

------
rurban
I think Self did the same. At those times this was one of the most researched
topics, how to efficiently implement a PIC, polymorphic inline cache. Either
with a simple ordered list (usually constructed at run-time, a harder
problem), such a hash (good for compile-time) or vtables.

------
caf
_According to my compilers class professor there’s a broad literature of
optimizing the “giant table with all signatures” approach with heuristics for
saving space by rearranging and merging the tables of different classes into
the same space or re-using offsets across classes to make tables smaller. But
the general problem is NP-complete so can only be solved heuristically._

A SAT solver should actually be able to handle this fairly easily, because in
practice, you don't get long complex chains of overlap between the interfaces
implemented by common classes. If someone feeds your compiler an adversarial
input you can just fall back to the naive implementation.

------
raja_mukherji
I've been using a similar but flipped approach in my own interpreted language,
creating a hash table for each method and then look for the object type as the
key. This allows me to check the parent types if an exact object type is not
found. I then nest the tables for multiple dispatch.

------
nutjob2
I think this is more common that the author assumes, I use hash tables in a
similar way in my own project. It quickly suggests itself when you are mapping
names within types to different offsets per type.

------
jasoneckert
I'm quite experienced in shenanigans. Doing them with hash tables, however,
definitely peaks my interest.

~~~
radiowave
Please allow me to bring your whimsical humour back down to earth with some
tedious pedantry: "piques", not "peaks".

