
Chris Lattner on Swift and dynamic dispatch - gbugniot
https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20151207/001948.html
======
ghshephard
The part of this (really excellent) writeup that caught my attention was this:

 _someone writing a bootloader or firmware can stick to using Swift structs
and have a simple guarantee of no dynamic overhead or runtime dependence_

There seems to be a pretty strong implication here that Swift could be used to
write firmware/bootloaders, and other low level constructs - including
operating systems. Has anyone worked with Swift yet on that type of project?

~~~
mpweiher
Except for the minor part that it's not true.

The performance variations in Swift are (as of now) much larger and less
predictable than the overhead of dynamic messaging in Objective-C, and the
latter can always be removed by using IMP-caching or converting to a C
function.

~~~
Someone
The claim is not about Swift in general, but about a limited subset of the
language: _"someone writing a boot loader or firmware can stick to using
Swift structs and have a simple guarantee of no dynamic overhead or runtime
dependence"_

Based on what I know about Swift (reading Apple's manual, a tiny bit of
experimenting, and decent knowledge about how compilers work), I would expect
Swift to compile purely statically, like C, for this case. Performance may
still be lower because the compiler is newer, but I wouldn't expect
performance variations.

And of course, the standard library may not be optimal for this use case. The
implementation of strings, for example, may be too dynamic for embedded work.

~~~
mpweiher
My measurements were also with a limited subset: integers, floats and arrays.

Depending on compiler optimization settings, you get from 30% to 1000x (yes, 3
orders of magnitude!) slower than C. Yes, technically that's not "dynamic"
overhead, but from where I stand that's still orders of magnitude more
variability and unpredictability (for reasonable values of "predictability")
than even byte-coded Smalltalk.

~~~
mikeash
If you're using arrays, you've departed from the limited subset in question.
They're structs on the outside but complicated on the inside.

~~~
mwcampbell
I wonder why the designers of Swift decided to make arrays structs on the
outside but complex on the inside. Why not make them objects, i.e. reference
types, as in Java and .NET?

~~~
mikeash
Because having collections be value types makes them _really_ nice to work
with.
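
Concretely (a small sketch, not from the thread): value semantics mean a copy
behaves like an independent array, and copy-on-write keeps that cheap until
one copy is actually mutated.

    // Arrays are value types: assignment conceptually copies the whole array,
    // and copy-on-write defers the real copy until a mutation happens.
    let a = [1, 2, 3]
    var b = a          // a and b still share storage here
    b.append(4)        // b gets its own storage now; a is untouched
    print(a)           // [1, 2, 3]
    print(b)           // [1, 2, 3, 4]

That shared, reference-counted buffer behind the struct is also the
"complicated on the inside" part mentioned above.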

------
mbrubeck
People interested in Swift at this level should also read Lattner’s comment on
/r/rust about the possibility of extending Swift to include Rust’s statically
checked ownership and borrowing concepts:

[https://www.reddit.com/r/rust/comments/3vadg8/swift_is_open_...](https://www.reddit.com/r/rust/comments/3vadg8/swift_is_open_source/cxnu2kk)

~~~
bluejekyll
If you read that though, it is not something that's a high priority for the
team. Basically, 90% of devs won't need or want some of these features. Unless
Apple decides to make a focused effort around rewriting the core of OS X in
Swift, which would require these changes, it's hard to see when the language
would gain these options.

Don't get me wrong, Swift is an awesome language, but I see Rust as
maintaining a position as a systems language equivalent to C, while Swift will
possibly start supplanting Java/Go/C++/Python use cases.

~~~
pjmlp
As someone who would like to see C replaced by something better, and who has
always been a fan of Wirth-influenced languages, I think Swift is actually in
a better position to achieve that than Rust.

The reason being that Apple, like the OS vendors that adopted C and later C++,
can push the language onto developers regardless of what those developers
think they should use.

All the systems languages that have survived in the mainstream have done so
by being adopted by an OS vendor.

~~~
lobster_johnson
Out of interest, in what way do you consider Swift a Wirth-influenced
language?

Go seems a closer cousin to Wirth's languages. Its strict type conversions,
value copying semantics, package system, lack of OO, use of ":=", type
declarations with "type", "var", GC, colons, etc. are all reminiscent of
Wirth's languages, especially Oberon. (Robert Griesemer worked with Oberon
under Wirth at ETH Zürich, so there's a clear relationship.)

Another highly Wirthian language is Nim, though its influence comes via
Delphi/Object Pascal.

~~~
pjmlp
> Out of interest, in what way do you consider Swift a Wirth-influenced
> language?

In the sense that systems programming languages shouldn't sacrifice safety,
except when needed, and should be explicit about it with constructs like
unsafe/SYSTEM and so on.

Swift is very explicit when doing C-style tricks, e.g. UnsafePointer<>.
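
For instance (a minimal sketch), peeking at raw bytes in Swift forces the
unsafety into the API names, where it can't be missed:

    // The unsafety is spelled out at the call site; nothing happens by accident.
    var value: Int32 = 42
    withUnsafeBytes(of: &value) { buffer in
        // buffer is an UnsafeRawBufferPointer over the four bytes of `value`
        for byte in buffer {
            print(byte)    // 42, 0, 0, 0 on a little-endian machine
        }
    }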

There is more to the Wirth school of languages than just syntax.

~~~
lobster_johnson
The only "safe" Wirth language I know about is Oberon-2, which has bounds-
checking of arrays, sentinels and runtime-checking of pointers — though no
compile-time checking. Is this what you're referring to?

I like Wirth, but his type systems were always super simple, and as far as I
know, he never innovated in compile-time safety the way that, say, Rust or
Haskell do. Again, Go seems a better analogue; anything that needs to deal
with unsafe pointers has to use the "unsafe" package.

~~~
pjmlp
Not sure what you mean by runtime vs. compile-time checking of pointers.

Wirth's type systems weren't always super simple, not when compared with
C-style ones.

Wirth designed Pascal, Modula-2, Oberon, Oberon-2, Active Oberon and
Oberon-07.

He contributed to Extended Pascal, Object Pascal and Component Pascal.

He also worked on Algol compilers and some of its dialects, and on Mesa and
Cedar while at Xerox PARC.

His work had an influence on Ada, Modula-2+ and Modula-3.

Some of the type-safe things that Modula-2 already offered in 1978, and that C
still doesn't do today:

- Type-safe enumerations

- Bounds-checked arrays

- Arrays indexed by enumerations

- Type-safe sets

- Range-typed integers

- No implicit conversions

- Reference parameters instead of pointers for function/procedure calls

- Tagged records

- Low-level coding/assembly intrinsics require use of the SYSTEM package

The only thing lacking was automatic memory management, which was later
introduced in the Oberon family.

However, as you can see from the list, there is a whole class of errors that
simply isn't possible.
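
Several of those guarantees carry straight over into Swift; a rough sketch of
the analogues:

    // Type-safe enumerations: a Direction is not interchangeable with an Int.
    enum Direction { case north, south, east, west }
    let d: Direction = .north

    // No implicit conversions: mixing integer widths needs an explicit cast.
    let small: Int32 = 1
    let big: Int64 = 2
    // let bad = small + big       // error: no implicit widening
    let sum = Int64(small) + big   // the conversion must be spelled out

    let xs = [1, 2, 3]             // bounds-checked: xs[5] traps, not garbage
    print(d, sum, xs[2])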

If you like to research old stuff, have a look at the Burroughs B5000 system,
which used Algol as its systems programming language in 1961.

Common to his languages and the Xerox PARC ones was the idea that you can do
systems programming in higher-level languages, provided the language offers
some unsafe escape hatch that has to be explicitly marked.

But type safety should take precedence over the "performance at any cost"
mantra.

Something also mentioned by Hoare in his 1980 ACM Turing Award speech, in
relation to Algol.

"A consequence of this principle is that every occurrence of every subscript
of every subscripted variable was on every occasion checked at run time
against both the upper and the lower declared bounds of the array. Many years
later we asked our customers whether they wished us to provide an option to
switch off these checks in the interest of efficiency on production runs.
Unanimously, they urged us not to - they already knew how frequently subscript
errors occur on production runs where failure to detect them could be
disastrous. I note with fear and horror that even in 1980, language designers
and users have not learned this lesson. In any respectable branch of
engineering, failure to observe such elementary precautions would have long
been against the law."

The problem is that younger devs never learned any of this and probably the
only thing they know about Wirth languages is some braindead Pascal compiler.

~~~
lobster_johnson
Oberon-2 had runtime-checking of pointers: It could check the validity of a
pointer and immediately panic rather than return garbage or crash with a page
fault. Similarly with arrays.

Wirth did some nice things with type safety, of course (I've always liked his
type-safe set/range types), but he never was much into type _systems_. For
example, his set and range types were hardwired into the language; you
couldn't build a set-like type yourself, or build something that acted as a
range, or overload any of the standard functions such as inc() or min(), or
indeed any type coercion. With the exception of Oberon-2's (limited and highly
Go-like) notion of abstract classes, nothing was polymorphic; every type was
only "itself", and its purpose was always entirely about holding data. This
is what I mean by his type systems always being "simple". He was attacking
practical engineering problems, not type-theory problems.

Go is exactly the same. Swift isn't; it has generics and sum types and
protocols and OO and pattern matching and lambdas and a bunch of other stuff
that Wirth would probably find too complex (his last few languages tended
towards mercilessly removing features).
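
In Swift, by contrast, you can build a set-like type yourself and hook it into
the language's literal syntax; a rough sketch (BitSet is a made-up type):

    // A user-defined set type that participates in the language via a
    // protocol: something Wirth's hardwired SET types never allowed.
    struct BitSet: ExpressibleByArrayLiteral {
        private var bits: UInt64 = 0
        init(arrayLiteral elements: Int...) {
            for e in elements { bits |= 1 << UInt64(e) }
        }
        func contains(_ e: Int) -> Bool { return bits & (1 << UInt64(e)) != 0 }
    }

    let s: BitSet = [1, 3, 5]   // array-literal syntax, like a built-in type
    print(s.contains(3))        // true
    print(s.contains(2))        // false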

I don't disagree that Wirth's languages are more advanced than C, though!

~~~
pjmlp
> Oberon-2 had runtime-checking of pointers: It could check the validity of a
> pointer and immediately panic rather than return garbage or crash with a
> page fault. Similarly with arrays.

Ah, but this was a property of any sane systems programming language with
automatic memory management.

> With the exception of Oberon-2's ....

How well do you know Active Oberon, Zonnon and Component Pascal?

They extend Oberon(-2) with abstract classes, generics, tasks (active
objects), type extensions and method definition signatures.

Also, many consider his work an influence on Ada and the Modula-2 successors,
whose type systems are not very far from C++'s.

Incidentally, he became disillusioned with these language variants and went
with a minimalist view (Oberon-07) that makes even Go's type system look
complex.

~~~
lobster_johnson
Sure, Active Oberon is pretty advanced, and Modula-3 had all sorts of things,
including generics — but to my knowledge Wirth was not directly involved in
those languages (or in Zonnon or Component Pascal), and he was unhappy with
their complexity. (He apparently prefers the entire language's grammar to fit
on a single screen.)

------
nielsbot
I feel like Swift reduces the runtime availability of dynamic dispatch by
default... so for things like live programming or runtime hacking Swift is
worse than Obj-C.

I envision a future similar to the "Smalltalk dream", where one can build apps
live, with minimal recompilation, and where dynamic dispatch is the default. I
think this scenario is less ideal for "systems programming", but for areas
where one is snapping together UI components and data sources I think it's
ideal. Perhaps someone here can come up with good counterexamples showing why
this isn't the case?

Finally, I think static dispatch, as is favored by Swift, tends to paint one
into a corner down the road... but it's possible Swift has enough leeway so
that this isn't the case?

~~~
mnem
Snapping together UI components and data sources has existed for a long time,
but it always seems to fail to catch on outside of niche uses. That's often
due to the overhead of runtimes and the unsuitability of generic UI components
once you get into the nitty-gritty of the application being designed. Of
course, that doesn't mean it can't be done; it's just an observation that the
attempts I've seen over the past 20 years or so haven't been very good,
ultimately.

I've often been curious why programming by manipulating metaphors on screen is
seen as an ultimate goal in the evolution of programming languages. It seems a
bit like creating a book by sticking pre-generated paragraphs together.

On your final point - static dispatch doesn't prevent you from making your own
dynamic dispatch mechanism in the language.
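
For example, a dictionary of closures already gives you a crude dynamic
dispatch in Swift (a sketch; the handler names are made up):

    // Hand-rolled dynamic dispatch: the call target is looked up by key at
    // runtime and can be rebound while the program runs.
    var handlers: [String: (Int) -> Int] = [
        "double": { $0 * 2 },
        "square": { $0 * $0 },
    ]
    handlers["double"] = { $0 + $0 }     // swap an implementation at runtime
    print(handlers["square"]?(7) ?? 0)   // 49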

~~~
Razengan
> It seems a bit like creating a book by sticking pre-generated paragraphs
> together.

But books are not tools or machines, whereas software applications are (not
counting art & entertainment, generally.)

Nobody creates every screw and electrical component from scratch when
designing and assembling other tools and machines.

"Programming by manipulating metaphors on screen" would be more akin to using
predefined mathematical symbols and formulas, and just putting in the numbers
and variables related to the problem you want to solve or the task you want to
perform.

Even most popular genres of games could be made entirely by wiring predesigned
components together in a visual environment, without writing any code at all.
You would just supply your own graphics, sound and other content.

~~~
oldmanjay
I think your vision of these tools is far more sophisticated and capable than
the current reality. It's certainly been an industry dream to produce tools
that work at the high level you've described, but they have very limited
domains and, thus far, less-than-impressive results.

------
pbreit
I'm definitely a novice programmer but that entire post is a foreign language
to me.

On what level do these concepts impact me? Performance? Managing a large code
base? Whether something is easy, hard, or impossible to do?

~~~
nemothekid
Chris Lattner is a compiler author (LLVM/Clang, Swift), so this post is mostly
about compiler semantics: pretty low-level stuff that isn't really visible to
the end user (the programmer of the language), except in C++.

When a compiler is compiling a program and you make a function call (let's say
`int x = obj.compute()`), the compiler has to know where the code for that
function is. In C (and C++ to an extent), this is easy: functions aren't very
fancy, and the compiler can just go to that place in memory, which doesn't
change during runtime. These are called "static functions", and the compiler's
method of calling them is called "static dispatch". Since the code that runs
is very predictable, it's easy for the compiler to optimize the function call.

In other languages (like Ruby, Java, Swift), a statement like `int x =
obj.compute()` has multiple possible meanings if `obj` can be subclassed. For
example, if you have an inheritance structure like (Dog, Cat) > Mammal >
Animal, `obj.compute()` could mean the compute function on Animal, Mammal, Dog
or Cat. At compile time, the compiler may have no way of knowing which
definition of `compute()` to call. So at runtime, the program will examine the
type information and call the correct function. These are called "virtual
functions", and they are invoked via "dynamic dispatch". Because there are
multiple functions that could run, actually predicting what will run on the
machine is hard.

Dynamic dispatch calls are generally slower than static ones, and a compiler
would prefer static calls. Chris goes over the many different methods
languages use to make dynamic calls faster - and Swift, thanks to its simple
programming model, (AFAIK) gets "smart" about function calls and can make
static dispatch calls where possible (Java can do this too, but what makes
Swift "cool" is that it doesn't need a just-in-time compiler (JIT) to do so).
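
To make that concrete, here's a rough Swift sketch of both cases (my
illustration, not from the post):

    // Dynamic dispatch: `obj` is statically an Animal, so which compute()
    // runs depends on the runtime type, found through a vtable.
    class Animal      { func compute() -> Int { return 0 } }
    class Dog: Animal { override func compute() -> Int { return 1 } }

    let obj: Animal = Dog()
    print(obj.compute())         // 1, resolved at runtime

    // Static dispatch: a struct method can't be overridden, so the target
    // is known at compile time and the call is a candidate for inlining.
    struct Point {
        var x = 0.0, y = 0.0
        func lengthSquared() -> Double { return x * x + y * y }
    }
    let p = Point(x: 3, y: 4)
    print(p.lengthSquared())     // 25.0, resolved at compile time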

What all this means for you (the end user) is that you can use all these fancy
functions in Swift without having to worry about performance. The overhead of
a function call may be something you have never considered (I certainly don't
think about it), but I'm sure it's something compiler geeks obsess over.

I may be 100% wrong here; I'm not a compiler author, I can only make a guess
from C/C++ experience, and I've never used Swift before.

~~~
krat0sprakhar
Thank you for the explanation!

------
twoodfin
How does the presence of C++'s placement new() affect devirtualization
opportunities? I'm sure Chris knows what he's talking about, so I'd like to
hear more.

~~~
chadaustin
I was going to write a response but bdash over on reddit did a great job, so
I'll just link that:
[https://www.reddit.com/r/programming/comments/3wla9f/chris_l...](https://www.reddit.com/r/programming/comments/3wla9f/chris_lattner_author_of_the_swift_programming/cxx7k0m)

One interesting note is that, the last time I looked into this, the compiler
_is_ allowed to assume that multiple sequential vtable calls do not replace
the vtable, allowing the vtable lookup to be cached. But that is not true
across a call to some arbitrary function external to the compilation unit.

~~~
bdash
In practice neither GCC nor Clang seems to be willing to believe that the
vtable is immutable. See [https://goo.gl/y0cx1r](https://goo.gl/y0cx1r), for
example, which shows only the first of two calls being devirtualized. The call
to fprintf within B::work is sufficient to cause both compilers to reload
the vtable pointer, preventing devirtualization of the second call.

Interestingly enough, when a use of placement new is visible to the compiler
it can prevent the conservative behavior mentioned above. In
[https://goo.gl/Y8rwYG](https://goo.gl/Y8rwYG) the vtable store that placement
new conceptually generates allows Clang to devirtualize the second virtual
call. GCC doesn't appear to catch this case.

~~~
chadaustin
Thanks for digging in! I saw the same thing with clang last year, in the
context of Emscripten, where we were hoping to eliminate the redundant vtable
loads as a JS code size optimization.

------
golergka
Can someone elaborate on what methods JIT uses to boost perfomance as compared
with AOT compilation? Do modern JIT compilers use statistics about what code
actually got called to optimize call dispatch? Where can I find more info
about this topic?

~~~
xxs
JITs do know the call target. This is a quote from "A JVM does that?"[0]:

    
    
      C++ avoids virtual calls – because they are slow
      ● Java embraces them – and makes them fast
      ● Well, mostly fast – JIT's do Class Hierarchy Analysis
      ● CHA turns most virtual calls into static calls
      ● JVM detects new classes loaded, adjusts CHA
      – May need to re-JIT
      ● When CHA fails to make the call static, inline caches
      ● When IC's fail, virtual calls are back to being slow
    

----

There are several misconceptions about Java in the write-up. The real reason
there won't be a Java bootloader is the need for the Java runtime and for GC.
Unoptimized virtual calls won't affect any bootloader, for that matter (and
you can/should use static and/or private methods for the inner loops).

[0]:[https://www.youtube.com/watch?v=uL2D3qzHtqY](https://www.youtube.com/watch?v=uL2D3qzHtqY)

~~~
pvg
He doesn't say you won't see a Java bootloader because of unoptimized virtual
calls, he says it's because the runtime is too big. CL: _It also means that
Java doesn’t “scale down” well to small embedded systems that can’t support a
JIT, like a bootloader._

~~~
xxs
_OTOH, since the compilation model assumes a JIT, this means that purely “AOT”
static compilers (which have no profile information, no knowledge of class
loaders, etc) necessarily produce inferior code. It also means that Java
doesn’t “scale down” well to small embedded systems that can’t support a JIT,
like a bootloader._

That's the quote. He talks about the JIT only; no runtime is mentioned.

~~~
pvg
He might, somewhat tersely, be conflating the JIT and the runtime as a whole -
see the mention of classloaders - but his meaning is clear: it's about size
rather than performance. 'It also means' etc.

~~~
masklinn
> it's about size rather than performance.

The tradeoff, really: what he's saying is that you can JIT Java to get speed
but you lose (resident) size, or you can AOT-compile Java to improve size but
then you lose speed, because calls may not be devirtualizable (unless you have
PGO, maybe? You'd lose some speed and size to guards and duplicated dispatch,
though).

~~~
pvg
You can't really AOT compile Java and have it be Java, and it seems fairly
obvious to me he's talking about size and runtime-related things, into which
he throws the JIT (i.e. the compilation model 'assumes' a JIT). I think he's
just being slightly loose with his terminology because he's replying on a
mailing list without the expectation that random nerds on an internet forum
will be talmudically dissecting his semiotic intent.

~~~
masklinn
> You can't really AOT compile java and have it be java

Of course you can.

> and it seems fairly obvious to me he's talking about size and runtime-
> related things

I'm not sure why you're so intent on reinterpreting his rather plain words
(and insulting people when they don't agree with your broad reinterpretation).

Lattner is literally just noting that an AOT-compiled Java program has no
knowledge of Java's dynamic semantics (classloaders et al.), so it can't
devirtualize calls and must necessarily pay the dynamic dispatch cost, in the
same way that e.g. messages in Objective-C can't be devirtualized. In essence,
an AOT Java is slightly better than a straight interpreter, but not by much.

The _whole essay_ is about static versus dynamic dispatch and how languages
provide for those; it's not exactly a stretch to assume that when Lattner
talks about _inferior code_, that's what he means: bootloader = no JIT =
pervasive dynamic dispatch = slow. You don't have to bring your own pet issues
into that.

~~~
comex
Not nearly as bad as an interpreter, especially if you do whole program
optimization. In fact, Android switched from the Dalvik JIT to ART, which
compiles AOT-ish.

------
Symmetry
Does this really buy you that much? For most programs the runtime is going to
be dominated by loop execution. If you're in a situation where all the objects
have the same type and the compiler could switch in static dispatch, the
computer's branch predictor should catch on very quickly and the performance
gains won't be that high. In cases where you have heterogeneous objects, the
branch predictor won't be able to catch on, but the compiler will be forced to
stick with dynamic dispatch anyway.

I suppose there is a bigger gain where you can inline static functions or
optimize the calling convention.

~~~
Rusky
Inlining is often called the mother of all optimizations. On its own, it
doesn't gain you much compared to any other optimization, but once it's done
it enables much, much more.

There's a reason languages like C that default to static dispatch on value
types are so much faster without much effort.
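
In Swift terms (a small sketch of my own), marking a class `final` is one way
to hand the optimizer that opportunity:

    // `final` promises no subclass can override area(), so the compiler can
    // devirtualize the call, inline the body, and keep optimizing from there.
    final class Circle {
        let r: Double
        init(r: Double) { self.r = r }
        func area() -> Double { return 3.14159 * r * r }
    }

    let c = Circle(r: 2.0)
    print(c.area())   // target known statically, so the call can be inlined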

------
tomcam
Good writeup. What is a checked downcast?

~~~
twoodfin
In this context, it's a runtime check inserted by the compiler to be sure that
an object retrieved via a generic interface matches the more specific type the
caller expects.
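
In Swift, that's the `as?`/`as!` operators, roughly (a small sketch):

    // `as?` is a checked downcast: the runtime type is tested, and the result
    // is nil if the object isn't actually the more specific type.
    let things: [Any] = ["hello", 42]

    if let s = things[0] as? String {
        print(s.count)             // 5: the check passed
    }
    let n = things[1] as? String   // nil: the check failed, no crash
    print(n as Any)                // prints "nil"

(`as!` performs the same check but traps on failure instead of returning nil.)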

