
Handles Are the Better Pointers (2018) - fanf2
https://floooh.github.io/2018/06/17/handles-vs-pointers.html
======
twoodfin
There should be a minor law in the spirit of Greenspun’s tenth rule: Every
sufficiently performance-sensitive system eventually grows some version of a
slab allocator[1].

There’s really no substitute for packing fixed-sized objects together in N big
array chunks. If you can operate in a particular context (or entirely) with
N=1, you can substitute handles for pointers. For any reasonable # of objects,
those handles can be smaller than a pointer and provide even better compaction
and thus cache effectiveness.

[1]
[https://people.eecs.berkeley.edu/~kubitron/courses/cs194-24-...](https://people.eecs.berkeley.edu/~kubitron/courses/cs194-24-S14/hand-outs/bonwick_slab.pdf)
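
A minimal sketch of the idea in C++ (my own illustration, not from the linked
paper): a pool that hands out 32-bit indices into one contiguous array,
recycling freed slots.

    #include <cstdint>
    #include <vector>

    struct Particle { float x, y, z; };

    class ParticlePool {
    public:
        using Handle = uint32_t;   // half the size of a 64-bit pointer

        Handle alloc() {
            if (!free_list_.empty()) {
                Handle h = free_list_.back();
                free_list_.pop_back();
                return h;
            }
            items_.emplace_back();
            return static_cast<Handle>(items_.size() - 1);
        }

        void release(Handle h) { free_list_.push_back(h); }
        Particle& get(Handle h) { return items_[h]; }

    private:
        std::vector<Particle> items_;      // one contiguous slab (N = 1)
        std::vector<Handle> free_list_;    // recycled slots
    };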

~~~
epistasis
Everything I read these days seems to trend towards columnar stores...

I write a mixture of R, Python, C and C++, in that order of prevalence, and
mostly for data intensive tasks. After nearly two decades of unlearning loops
and implementing instead with FP idioms, vectorization, and matrix multiplies
to take advantage of R's strengths, I often have trouble recreating R's
performance when I first reimplement in C or C++.

I think it would be beneficial for more CS programs to start with functional
programming; FP idioms are a much better fit for modern CPU architectures in
many ways.

~~~
jerf
In my Perfect Programming Language That Doesn't Have To Deal With The Problems
Of Actually Existing, serialization of data structures into memory is
explicitly defined in the language, just as one would define a JSON or
Protobuf serialization. One could define a "Point3{x,y,z}", and hypothetically
define one array of them in the conventional manner, define another array as a
columnar serialization, and potentially even define a compressed data
structure right there in memory if that was advantageous.

It opens a wide array of exciting problems and possibilities, if you start
thinking of that as something a language should let you twiddle with rather
than hard coding it. It seems like a lot of high performance stuff is moving
in a direction where this would be advantageous.

(For example, if you compress an array, you lose random access. The language
would have to carefully build up capabilities from scratch so it can have a
concept of "arrays" that can't be randomly accessed, can only be written in
some very particular manner that may even require a "flush" concept, etc. My
gut says it ought to be possible, since, after all, we can do it manually, but
it would be a very careful operation to disentangle casual assumptions about
what data structures can do based on the definition of what data they contain
vs. how they are represented in memory, and what capabilities you get from
that.)
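
To make the idea concrete, here is a hand-written C++ approximation of what
such a language might generate for the two layouts (the type names are
illustrative, not from any real language):

    #include <cstddef>
    #include <vector>

    struct Point3 { float x, y, z; };

    // Conventional layout: xyzxyzxyz... ("array of structs")
    struct PointsAoS {
        std::vector<Point3> data;
        float& x(std::size_t i) { return data[i].x; }
    };

    // Columnar layout: xxx...yyy...zzz... ("struct of arrays")
    struct PointsSoA {
        std::vector<float> xs, ys, zs;
        float& x(std::size_t i) { return xs[i]; }
    };

The point of such a language would be that both declarations share one logical
definition of Point3, with the in-memory serialization chosen separately.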

~~~
zero_iq
Jonathan Blow's new language, Jai, has some control over layout, allowing easy
switching between array-of-structs and struct-of-arrays without having to
change the accessing code.

~~~
clucas
Here is the demo video of the stuff zero_iq is talking about:
[https://www.youtube.com/watch?v=ZHqFrNyLlpA](https://www.youtube.com/watch?v=ZHqFrNyLlpA)

Basically right on point with what the GP is saying.

~~~
epistasis
That video briefly mentions the CppCon 2014 talk by Mike Acton, "Data-Oriented
Design and C++". The talk was great, and pretty much lined up with how I think
of modern systems (treat RAM like disk: seeks take a long time, but streams
will likely be able to maximize CPU usage).

But the questions at the end, and the resistance to these basic ideas (roughly
"what about platform portability, programmer productivity, etc.?"), were a bit
shocking to me. Is this not a C++ conference? This was six years ago, so
perhaps the machine constraints were not as well known back then, but it
really shows how strongly the community culture will influence a language and
its capabilities.

------
muth02446
I am surprised that not every large project uses this technique. If you are
running on a 64-bit system, using 32-bit handles instead of pointers results
in huge memory savings. You can also put the start addresses of the arrays
(that the handles index into) anywhere in virtual memory and grow them as
needed. Finally, you can stripe the data for one object across multiple arrays
and use the same handle to index each of them, which can improve locality.
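
A small sketch of the striping idea in C++, using a made-up entity example
(the hot/cold field split is my own assumption):

    #include <cstdint>
    #include <vector>

    using Handle = uint32_t;

    // One logical "entity" striped across parallel arrays: the same handle
    // indexes each array, so hot fields stay densely packed in cache.
    std::vector<float> positions;   // touched every frame (hot)
    std::vector<int>   debug_ids;   // touched rarely (cold)

    float& position(Handle h) { return positions[h]; }
    int&   debug_id(Handle h) { return debug_ids[h]; }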

~~~
jagger27
> you can stripe the data for one object into multiple arrays and use the same
> handle to index each of them which can improve locality.

I wish programming languages made it easy to structure arrays of objects like
this. I think Jonathan Blow's Jai has some kind of transparent support for
SoA.

~~~
gumby
It’s trivial to do this in C++ and often the caller doesn’t even realize.

~~~
aks_tldr
do you have any pointers to such an implementation?

~~~
gumby
We have a mixin class that will allocate a subclass with its own array of
handles (the handles overload * and -> so they act like pointers). If you have
an instance variable, you can choose to give it a different allocator, so that
the instance variable of the object with handle `n` is the nth object in the
new area. Then your accessor `barf& foo();`, instead of returning the instance
variable `barf m_foo` directly, does some inlined arithmetic and gives the
caller a reference to the object.
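
gumby's actual mixin isn't shown, but a guess at its shape in C++, with a
static pool standing in for the per-class allocator:

    #include <cstdint>
    #include <vector>

    template <typename T>
    struct Handle {
        uint32_t index;

        // Backing storage; a real mixin would likely tie this to the class
        // that mixes it in rather than using a static member.
        static std::vector<T> pool;

        T& operator*() const { return pool[index]; }
        T* operator->() const { return &pool[index]; }
    };

    template <typename T>
    std::vector<T> Handle<T>::pool;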

------
anilakar
Moving away from pointers and using a shared pool of handles with integer
indexes will introduce a whole new array of issues. A plain int carries no
information about the validity of the object behind the handle, because
handles are carried around as copies, not references.

I remember debugging a regression with UNIX network sockets where valid
connections were being killed, and the bug was only triggered under heavy
load. Deadlines were approaching, and as a desperate measure I did the exact
opposite of what the article suggests: I wrapped all the socket calls to only
accept pointers to an opaque struct with a single integer, and made sure the
int was set to a sentinel value to indicate invalidation after an error. The
culprit was a double-close that normal debugging tools such as Valgrind could
not find but my unorthodox refactoring did.

Later I learnt to love UUIDs whenever performance allows. They're not
pointers, so they don't introduce memory-handling issues, and they can be
easily tracked in e.g. distributed data pipelines and logging systems.

~~~
AnIdiotOnTheNet
> A plain int carries no information about the validity of the object behind
> that handle because they are carried around as copies, not references.

A pointer is just an integer index to a byte in memory in most computing
architectures.

~~~
kazinator
Ah, but that pointer doesn't get re-used on you until you free it, and for
that there are strategies like refcounting and GC.

Integers in a small array will get re-used; they have to.

~~~
andolanra
You should probably read the whole article, because it describes strategies
for dealing with exactly this problem in the section under _Memory safety
considerations_ and then again in the update at the end. The short form is: if
you have a bounded size on your array, then you will often have unused bits in
your handle; you can use these to store a "generation counter", which needs to
match the slot. You can now re-use indices, because you can compare the
generation of the handle (stored in the extra bits) with the generation count
in the slot; if they don't match, it means that the item has been 'freed' and
a new thing is in the slot.
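
A minimal sketch of that scheme in C++, assuming an arbitrary 16/16 split
between index and generation bits (the article discusses other splits):

    #include <cstdint>

    constexpr uint32_t INDEX_BITS = 16;
    constexpr uint32_t INDEX_MASK = (1u << INDEX_BITS) - 1;

    struct Handle {
        uint32_t value;   // low 16 bits: slot index; high 16: generation
        uint32_t index() const { return value & INDEX_MASK; }
        uint32_t generation() const { return value >> INDEX_BITS; }
    };

    struct Slot {
        uint32_t generation;   // bumped every time the slot is freed
        // ... payload, or indices into payload arrays ...
    };

    // The handle is valid only while its generation matches the slot's.
    inline bool is_valid(Handle h, const Slot* slots) {
        return slots[h.index()].generation == h.generation();
    }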

------
wtracy
Thinking out loud:

Since handles refer to offsets rather than absolute addresses, this sort of
scheme should allow for "serializing" data by memory-mapping chunks of memory
to disk.

Obviously, this scheme wouldn't be useful for cross-platform save files, and
might not even translate across builds.

Still, as long as you have a robust scheme for invalidating data when a new
version is deployed, I could see this being useful for caching, and for saving
state for short periods of time on mobile platforms that like to restart
processes with little warning. (Rather than potentially losing your place
within the game level any time you switch between apps, you only lose your
place within the game level when an app update gets installed.)

~~~
bluetomcat
Handles are essentially integer indexes, relative to a base pointer, also
implying a certain object size (needed for index multiplication). A nice
property therefore is that the base pointer can be different each time the
program is started, but the handle stays constant.

So if that module-private array is mmapped to disk, handles themselves can be
serialised, unlike pointers. This can enable quick and easy storage of most of
the program state, allowing for a fast recovery afterwards.
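
A rough POSIX sketch of that property (error handling omitted; the file name
and sizes are made up): the base pointer differs on every run, but the stored
indices do not.

    #include <cstddef>
    #include <cstdint>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    struct Record { float x, y, z; };

    int main() {
        const std::size_t n = 1024;
        int fd = open("state.bin", O_RDWR | O_CREAT, 0644);
        ftruncate(fd, n * sizeof(Record));

        // base is different on each run; handles (indices) stay constant.
        Record* base = static_cast<Record*>(mmap(
            nullptr, n * sizeof(Record),
            PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));

        uint32_t handle = 7;     // serialisable, unlike a Record*
        base[handle].x = 1.0f;

        munmap(base, n * sizeof(Record));
        close(fd);
    }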

------
coldcode
MacOS 1.0 in 1984 used handles in order to allow the heap to be coalesced more
easily (although locked handles and regular pointers often created islands).
Given the first Mac had 128K RAM and no MMU, it was highly necessary at the
time.

~~~
jiveturkey
Of course you mean System 1, not MacOS 1.0. Handles were used all the way
through System 7, last released in 1997. Perhaps by then it was more about the
legacy than the need. Don't know the physical limits, but a Quadra 950 could
support 256MB RAM and more through VM (ramdisk), and of course these later gen
machines did have MMUs.

In ye olden days, addressing was probably physical, so being non-multi-tasking
ensured the safety of using handles, ie if used properly. IOW the system
process was free to clean up the heap and manipulate your pointers b/c your
app would only yield at specific points where you aren't holding onto a
dereferenced handle.

~~~
duskwuff
> Handles were used all the way through System 7

Handles continued to be used all the way through Mac OS 9, as they were used
by a ton of critical Toolbox APIs like the Resource Manager. A vestigial
version even existed in Mac OS X -- handles still existed in Carbon, but I'm
not sure they would ever be relocated by the system like they were previously.

> of course these later gen machines did have MMUs.

They did, but it was only used to simulate a system with more physical memory.
Every process still saw the same view of a single "flat" memory space.

------
cmrdporcupine
In many ways a pointer on a modern OS with a virtual memory subsystem really
is a handle already. It's not like it points to a single given exact location
in physical memory. The OS and CPU are already presenting a bit of a facade.

------
cryptonector
Yes! For example, in the GSS-API the various complex, opaque objects in the C
bindings are specified in the RFC (2744) as:


       typedef <platform-specific> gss_ctx_id_t;
       typedef <platform-specific> gss_cred_id_t;
       typedef <platform-specific> gss_name_t;


Sadly this is always a pointer in actual implementations, but should have been
a handle, where a handle is a structure containing two fields: an index or
pointer-like value, and a _verifier_ that can be used to check that the handle
is [still] valid. Using a handle would have made it possible to get much
better memory safety.
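
A sketch of the handle shape described here (my own C++ rendering, not the
actual GSS-API types): an index plus a verifier checked on every use.

    #include <cstdint>

    struct gss_handle {
        uint32_t index;      // slot in the implementation's context table
        uint32_t verifier;   // random value assigned when the slot is filled
    };

    // Dereference succeeds only while the stored verifier still matches;
    // a stale or forged handle is detected instead of corrupting memory.
    inline bool handle_valid(gss_handle h, const uint32_t* table_verifiers) {
        return table_verifiers[h.index] == h.verifier;
    }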

------
cpr
Perhaps they should ask the old MacOS toolbox Handle users what they think.

(Pretty painful trade-off in the long run, is what I think, having lived
through that.)

~~~
danaliv
Visions of the "moves memory" icon in THINK Reference came to mind as soon as
I saw the headline.

------
dang
Discussed at the time:
[https://news.ycombinator.com/item?id=17332638](https://news.ycombinator.com/item?id=17332638)

------
steveklabnik
(2018)

This is often a really useful strategy to use in Rust code as well.

~~~
edflsafoiewq
You can also use it in GCed languages if you want to avoid creating cycles in
the object graph.

~~~
earthboundkid
Yes, Go supports this idiom very well.

------
rwmj
Also helps with arrays that you might wish to realloc. I've sometimes hit the
(obvious, in hindsight) bug that you realloc an array and all pointers to
members that are held elsewhere suddenly become invalid.
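
A tiny C++ illustration of that bug (contrived sizes, of course): after
growth, the pointer dangles but the index is still good.

    #include <cstddef>
    #include <vector>

    int main() {
        std::vector<int> v{1, 2, 3};
        int* p = &v[0];          // pointer into the array
        std::size_t h = 0;       // handle (index) to the same element

        v.resize(1000000);       // may reallocate and move the storage

        // *p is now undefined behaviour; v[h] is still valid.
        return v[h];
    }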

------
hinkley
Somehow, my first specialty as a developer was perf analysis, and so all
through the Java Golden Era I read every press release about the JVM and
subscribed to the ACM SIGPLAN, which had at least one paper on GC in virtually
every proceeding that was published, and some occasional cool stuff on
interpreter construction. I believe the 4th technical book I read was The Java
Virtual Machine, after Stevens, Comer (3 volumes), and The Java Programming
Language.

The original GC had handles. With Hotspot (JDK 1.2, IIRC?) those were gone,
and by Java ~8 they were back again, but embedded in the object header, where
the pointer indirection presumably hurts less.

In an alternate universe, Sun would have shrunk the memory size for Strings
earlier and kept the overhead per object, in which case I suspect the GC stall
around 1-2GB would never have happened. But I don't think they had admitted to
themselves how Stringly Typed production Java code was, and how UTF-16 wasn't
going to last forever.

------
teddyh
“ _We can solve any problem by introducing an extra level of indirection._ ”

— “Fundamental theorem of software engineering”

[https://en.wikipedia.org/wiki/Fundamental_theorem_of_softwar...](https://en.wikipedia.org/wiki/Fundamental_theorem_of_software_engineering)

------
timgarmstrong
shared_ptrs, if misused, can cause a lot of memory management issues, because
you end up with a web of object lifetime dependencies that is hard to reason
about. I think this is a severely underrated problem in C++, particularly for
newer developers. shared_ptr is extremely powerful, but it makes it easy to
gloss over issues of object lifetimes and ownership, when in fact those issues
are still very important for building robust systems.

In the code bases I've worked on, we have pushed hard to have unique ownership
wherever possible and only use shared_ptr where there was a clear need. You
can still have a huge object graph kept alive by a single unique_ptr, but it
happens less often and it's easier to trace back and fix.

A nice thing about the handle approach is that it makes it a lot harder to
build up these object graphs or in general to implement anything without being
explicit about object lifetime.

I've seen parts of the handle/object pool approach misapplied and cause more
trouble than it's worth, though. It's a good idea for self-contained
subsystems where there are a limited number of object types that you would
apply this to. I don't think it scales to 100s or 1000s of distinct types of
objects, because then you're going to have headaches dealing with the sheer
number of object pools.

I've also seen object pooling be implemented proactively as an "optimisation"
to avoid calling malloc() but then become a bottleneck because of lock
contention in the object pool.

------
zoomablemind
Does this assume that the handle consumers have to somehow 'release' the
handle in addition to 'deleting' it, as was explained with the example of
delayed release of a deleted but still GPU-queued element?

I'm not sure the suggested generation counter approach is the optimal way to
deal with the delayed release. Also, when it comes down to writing your own
memory managers, benchmarking is a must. Added complexities may eventually eat
up the initial savings.

~~~
bluetomcat
The "generation counter" suggested in the article has nothing to do with
delayed release, as far as I understand. It is used as a strategy to avoid
returning colliding handles close in the time domain, reducing the chances for
a collision of the "unique bit pattern", which would allow non-detectable
dangling access conditions.

~~~
zoomablemind
> ... non-detectable dangling access conditions.

My understanding is that the dangling access conditions potentially arise due
to the delayed release of deleted handles.

["... Instead the rendering system would only mark the resource object for
destruction when the user code requests its destruction, but the actual
destruction happens at a later time when the GPU no longer uses the
resource."]

~~~
renox
Not necessarily; there's also the "use after free" error, which can happen
anywhere.

------
gumby
I have a standard mixin class that does this, overloading * and -> so users
often don’t even notice.

------
dtheodor
Curious how Rust's ownership model interacts with this memory management
approach. Does it actually get in the way of implementing this?

~~~
steveklabnik
On the contrary, it kind of encourages it:
[https://kyren.github.io/2018/09/14/rustconf-talk.html](https://kyren.github.io/2018/09/14/rustconf-talk.html)

