
Apple’s Bitcode Telegraphs Future CPU Plans - 127001brewer
https://medium.com/@InertialLemon/apple-s-bitcode-telegraphs-future-cpu-plans-a7b90d326228
======
drfuchs
I managed to ask Chris Lattner this very question at WWDC (during a moment
when he wasn't surrounded by adoring crowds). "So, you're signaling a new CPU
architecture?" But, "No; think more along the lines of 'adding a new multiply
instruction'. By the time you're in Bitcode, you're already fairly
architecture-specific" says he. My hopes for a return to big-endian are
dashed. [Quotes are approximate.]

~~~
kmicklas
Why on earth would you ever want to _return_ to big-endian?

~~~
drfuchs
Because big-endian matches how most humans have done it for most of history
("five hundred twenty one" is written "521" or "DXXI", not "125" or "IXXD").
Because the left-most bit in a byte is the high-order bit, so the left-most
byte in a word should be the high-order byte. Because ordering two 8-character
ascii strings can be done with a single 8-byte integer compare instruction
(with the obvious generalizations). Because looking for 0x12345678 in a hex
dump (visually or with an automatic tool) isn't a maddening task. Because
manipulating 1-bit-per-pixel image data and frame buffers (shifting left and
right, particularly) doesn't lead to despair. Because that's how any right-
thinking person's brain works.
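
For instance, the string-ordering trick looks something like this (a minimal
sketch, assuming a big-endian machine; cmp8 is a made-up name):

    #include <stdint.h>
    #include <string.h>

    /* Orders two 8-character ASCII keys with a single integer compare.
       On a big-endian machine the most significant byte of the loaded
       word is the first byte of the string, so unsigned numeric order
       equals lexicographic order. On little-endian you'd have to
       byte-swap each word first. */
    int cmp8(const char a[8], const char b[8]) {
        uint64_t x, y;
        memcpy(&x, a, 8);   /* memcpy sidesteps alignment issues */
        memcpy(&y, b, 8);
        return (x > y) - (x < y);   /* same sign as memcmp(a, b, 8) */
    }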

The one place I've seen little-endian actually be a help is that it tends to
catch "forgot to malloc strlen PLUS ONE for the terminating NUL byte" bugs
that go undetected for much longer on big-endian machines. Making such an
error means the NUL gets written just past the end of the malloc, which may be
the first byte of the next word in the heap, which (on many implementations of
malloc) holds the length of the next item in the heap, which is typically a
non-huge integer. Thus, on big-endian machines, you're overwriting a zero
(high order byte of a non-huge integer) with a zero, so no harm done, and the
bug is masked. On little-endian machines, though, you're very likely
clobbering malloc's idea of the size of the next item, and eventually it will
notice that its internal data structures have been corrupted and complain. I
learned this lesson after we'd been shipping crash-free FrameMaker for years
on 68000 and Sparc, and then ported to the short-lived Sun386i.
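
A minimal sketch of that bug (broken_strdup is a made-up name, and the
comments assume a malloc that stores the next chunk's size in the word
immediately following the allocation):

    #include <stdlib.h>
    #include <string.h>

    char *broken_strdup(const char *s) {
        char *p = malloc(strlen(s));   /* BUG: should be strlen(s) + 1 */
        if (p) strcpy(p, s);           /* NUL lands one byte past the block,
                                          on top of the next chunk's size:
                                          its high-order byte (big-endian,
                                          usually already zero) or its
                                          low-order byte (little-endian,
                                          now corrupted) */
        return p;
    }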

~~~
ahomescu1
> Because big-endian matches how most humans have done it for most of history
> ("five hundred twenty one" is written "521" or "DXXI", not "125" or "IXXD").

Actually, it is possible that that was nothing more than an accident. We use
Arabic numerals, and Arabic languages are written right-to-left. Then there
are languages like German where digits are read in reverse, so "42" is read as
"two-and-forty".

~~~
kylebgorman
The German "decade flip" is restricted to the tens and unit places; otherwise
the order is English-like, with larger terms leading.

The cardinal number systems for most major languages lead with larger terms
(as in English). I don't think there's anything deep about this, it's probably
an accident. And there are languages which lead with smaller terms, such as
Malagasy (the national language of Madagascar).

The ordering of digits in Arabic is not obviously relevant, per se, since
spoken English ("one hundred twenty one") matches the order of the Arabic
numbers, too.

~~~
DonHopkins
It's funny how the Germans and Dutch (rightfully) ridicule Americans for
writing dates in middle-endian order like 9/11/2001, yet they say numbers with
the decade flip "two and forty". That's just as ridiculous.

~~~
cactusface
MM/DD/YYYY is simply a direct transliteration of spoken English, which makes
it easy to read and write dates. In other languages, the spoken version is
little endian or big endian, and the written version aligns accordingly. (At
least for the languages I know.)

~~~
rjsw
ITYM "... spoken American".

~~~
cactusface
Are you saying it is commonly referred to as "The 11th of September, 2001" in
England?

~~~
smcgivern
We would normally refer to that as September 11 because it's much more talked
about in the US, and that's the phrase used there.

Any other dates will likely be in the same order as written. For instance, the
rhyme for bonfire night is 'remember remember, the fifth of November'. I
believe that many in the US also talk about the fourth of July, rather than
July fourth, so it's not like English has the hard-and-fast rule you were
proposing.

~~~
cactusface
Ok, fair enough. What do you say for non-special dates like July 3rd, 2015?

~~~
smcgivern
'Third of July, 2015' or more likely 'third of July'. The date format really
isn't lying to us in UK English.

------
monocasa
I thought that just being LLVM bitcode wasn't enough to guarantee portability
the way the author assumes.

There are ABI-specific pieces that are still not abstracted in the bitcode,
such as struct packing rules.
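
For example, here's a struct whose size and field offsets depend on the
target ABI (a hedged sketch; struct header is a made-up name and the numbers
assume typical ILP32 vs. LP64 rules):

    #include <stdio.h>

    /* With 4-byte longs and 4-byte alignment (typical 32-bit targets)
       this struct occupies 8 bytes; with 8-byte longs and 8-byte
       alignment (typical 64-bit targets) it occupies 16. The front end
       bakes the chosen sizes and offsets into the bitcode it emits. */
    struct header {
        char tag;
        long value;
    };

    int main(void) {
        printf("sizeof(struct header) = %zu\n", sizeof(struct header));
        return 0;
    }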

~~~
riscy
Not only that, but "[target data layout] is used by the mid-level optimizers
to improve code, and this only works if it matches what the ultimate code
generator uses. There is no way to generate IR that does not embed this
target-specific detail into the IR. If you don’t specify the string, the
default specifications will be used to generate a Data Layout and the
optimization phases will operate accordingly and introduce target specificity
into the IR with respect to these default specifications." [1]
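
Concretely, every module carries a header like this (the exact string varies
by target and LLVM version; this is roughly what clang emits for 64-bit ARM
on Darwin):

    target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
    target triple = "arm64-apple-ios"

The "e" pins down little-endian byte order, "m:o" selects Mach-O name
mangling, and the rest fix type alignments, native integer widths, and stack
alignment.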

Much of this article is simply inaccurate speculation.

[1] [http://llvm.org/docs/LangRef.html#data-layout](http://llvm.org/docs/LangRef.html#data-layout)

~~~
DannyBee
Note that data-layout used to be optional, but is now mandatory.

------
exelius
How does dynamic linking work in a scheme like this? Would any pre-compiled
libraries need to be distributed as Bitcode as well?

Due to the concerns above, IMO Bitcode is less about compatibility and more
about app thinning. It's one thing to go from Bitcode to four different
variants of ARM, but quite another to go from Bitcode to both x86 and ARM.
Currently, developers have to ship binaries compiled for multiple
architectures, which increases app sizes. I suspect Apple is just building a
workflow that creates a device-specific version of each app, and having
developers compile to Bitcode simplifies app submission.

~~~
hundchenkatze
Yeah, according to Apple: "If you provide bitcode, all apps and frameworks in
the app bundle need to include bitcode."

[https://developer.apple.com/library/prerelease/ios/documentation/IDEs/Conceptual/AppDistributionGuide/AppThinning/AppThinning.html#//apple_ref/doc/uid/TP40012582-CH35-SW2](https://developer.apple.com/library/prerelease/ios/documentation/IDEs/Conceptual/AppDistributionGuide/AppThinning/AppThinning.html#//apple_ref/doc/uid/TP40012582-CH35-SW2)

------
nickpsecurity
This is a smart move. It's essentially what the System/38 (later AS/400 & IBM
i) did. They had an ISA or microcode layer that all apps were compiled to.
Then, that was compiled onto whatever hardware they ran on. When IBM switched
to POWER processors, they just modified that low layer to target POWER. That
let them run the old apps without recompiling their
original source. They used this strategy over and over for decades. Always one
to keep in mind.

Going further, I think a team could get interesting results combining this
with design-by-contract, typed assembly, and certified compilation. Much like
verification condition generators, the compilation process would keep a set of
conditions that should be true regardless of what form the code is in. By the
time it gets to the lower level, those conditions & the data types can be used
in the final compile to the real architecture. It would preserve a context for
doing safe/secure optimizations, transforms, and integration without the
original source.

~~~
chmaynard
Based on other comments here, it sounds like the LLVM Bitcode representation
is more closely tied to a specific architecture than the System/38
intermediate representation was. Same idea, though.

~~~
nickpsecurity
I agree. That's either an advantage or another opportunity for modern IT to
learn from the past. Those old systems were full of so many tricks they're
still outclassing modern efforts in some ways haha.

------
Ruud-v-A
Microsoft has been doing a similar thing with [.NET
Native](https://msdn.microsoft.com/en-us/vstudio/dotnetnative.aspx) for a
while now, though MSIL is much higher level than LLVM IR. With .NET Native,
you _can_ submit your app once and run on ARM and x86, 32-bit or 64-bit.

------
pilif
I really hope this bitcode feature isn't going to cause a lot of trouble for
app developers. Up until now, the app you built on your machine (and tested
on your devices) was the app you submitted and the app running on your
customers' machines.

In the future, when your app crashes on customers' machines and doesn't on
yours, how are you going to debug it, much less explain it to Apple and have
them fix the issue for you?

This is especially scary when you consider the turnaround time of ~2 weeks
before your new build becomes available in the App Store for you to test.

~~~
DerekL
You can download the newly compiled binaries to your test devices. The App
Store won't release it to customers until you say it's okay.

------
stcredzero
With Bitcode, Apple could change OS X into something like the old TAOS
JIT-based OS. Except for a small kernel, all TAOS executables were intermediate
representation files. This IR could be translated to real machine code at the
same speed as disk access, and resulted in code running at 80-90% native speed
on most platforms.

With software like that, Apple could become independent of any particular
hardware architecture.

(TAOS dates from the 90's and is hard to google, but is mentioned in some
papers. And yes, the JIT translator could do that even on 90's machines.)

~~~
spitfire
Sounds somewhat similar to the IBM AS/400 (renamed many times). Applications
were shipped as byte code and translated to the local machine architecture.
The translated byte code was appended to the application, a bit like NeXT fat
binaries or OS X's universal binaries.

The native byte code was 64-bit, though IIRC the first implementation used
32-bit addresses. The result was that you could ship an application once,
and when it came time to move architectures, all you had to do was run one
command to retranslate the byte code.

Very neat. I'm not sure why this approach hasn't been used more often.

EDIT: Looks like they're going exactly the AS/400 route.

~~~
monocasa
Actually, AS/400 byte code has 128-bit pointers. It was originally implemented
on a 48-bit processor (weird 70's architectures FTW!).

~~~
spitfire
I stand corrected. Thanks.

AS/400 had some really cool technology built into it.

------
chuckcode
Note that gcc considers its monolithic design a feature to encourage
companies to contribute back code rather than a painful lesson to be learned
from...

[http://gcc.gnu.org/ml/gcc/2004-12/msg00888.html](http://gcc.gnu.org/ml/gcc/2004-12/msg00888.html)
[https://gcc.gnu.org/ml/gcc/2007-11/msg00460.html](https://gcc.gnu.org/ml/gcc/2007-11/msg00460.html)

~~~
azakai
The article was talking about the Bitcode design in LLVM, not the plugin
situation. Yes, gcc considered making plugins hard a feature, but as for a
design that separates the frontend and backend to make each more modular,
gcc does in fact have such a thing: GIMPLE.

[https://gcc.gnu.org/onlinedocs/gccint/GIMPLE.html](https://gcc.gnu.org/onlinedocs/gccint/GIMPLE.html)

I'm far from an expert, but from what I've seen and heard, GIMPLE is pretty
good for what it does. In other words, the author of the article is wrong to
say that LLVM learned from a painful lesson in this area.

------
Corrado
When I read something like this my mind wanders and I imagine a future MacBook
Pro that, instead of a single Intel processor, contains many ARM processors.
10-15 ARM processors acting as one could offer a whole lot of performance when
you needed it (use all of them), and a whole lot of energy saving when you
didn't (use 1 of them). With the current trend of multi-core CPUs, I see this
as the ultimate form of the architecture.

Now, whether Apple will do something like this or not is anyone's guess, but
it's nice to dream of the possibilities. :)

~~~
coldtea
> _I imagine a future MacBook Pro that, instead of a single Intel processor,
> contains many ARM processors. 10-15 ARM processors acting as one could offer
> a whole lot of performance when you needed it (use all of them), and a whole
> lot of energy saving when you didn't (use 1 of them). With the current
> trend of multi-core CPUs, I see this as the ultimate form of the
> architecture._

Doesn't make sense. What's the difference between "10-15 ARM processors" and
4, 8, or 12 Intel-based cores?

I mean apart from the fact that we aren't going back to multi-processor
architectures, since there's no benefit from that compared to cores (latency,
etc).

------
legulere
I wonder how Bitcode will play with profile-guided optimization. Will you
also provide PGO information to Apple, or will they generate it?

------
jarjoura
My suspicion is that bitcode allows the App Store team to provide
Watch/iOS-specific binaries for an individual device. Right now the solution
is to create fat binaries that eat precious space. The watch, being even more
constrained, can use all the help it can get.

------
gojomo
I wonder if this could also be a way to protect compiler-tech and silicon
trade secrets, even after they're widely used in the field? Perhaps only Apple
ever compiles the final, deployed versions of apps.

------
serve_yay
Yeah, and their patent applications telegraph the iMac with the fiber-optic
shell that's been just around the corner for 10 years.

