Bitcode (iOS, watchOS) (developer.apple.com)
117 points by comex on June 9, 2015 | 83 comments

In case y'all are curious, it would make sense for this to be LLVM bitcode: http://llvm.org/docs/BitCodeFormat.html

Seems like a very smart way to keep binaries up to date without developer intervention -- and possibly even allow re-targeting to different CPU architectures after the fact. That would eliminate the need for something like Rosetta if Apple ends up switching major CPU architectures again some day.

I really think that LLVM is one of the best things to happen to computer science in a long, long time.

I remember a letter from some time ago saying that LLVM IR format is private to specific version of LLVM, and is not guaranteed to be compatible with other versions in the future or in the past. Does this announcement mean that is not the case anymore, and LLVM IR is backwards compatible now?

There's IR assembly (the human-readable format) and the bitcode format. There are no compatibility guarantees for the assembly, but new LLVM versions can read old bitcode within the same major version.
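The two formats are easy to tell apart on disk: textual IR starts with ASCII, while bitcode begins with the magic bytes 'B' 'C' 0xC0 0xDE (and on Darwin, bitcode may be wrapped in a header with its own magic, 0x0B17C0DE). A minimal sketch of sniffing the format from a file's first bytes (the classification strings are my own):

```python
# Sketch: distinguish LLVM bitcode from textual IR by magic bytes.
# Raw bitcode starts with 'B' 'C' 0xC0 0xDE; the Darwin bitcode
# wrapper header starts with the little-endian magic 0x0B17C0DE.

BITCODE_MAGIC = b"BC\xc0\xde"
WRAPPER_MAGIC = (0x0B17C0DE).to_bytes(4, "little")

def sniff(first_bytes: bytes) -> str:
    """Classify the start of a file as bitcode, wrapped bitcode, or other."""
    if first_bytes.startswith(BITCODE_MAGIC):
        return "raw bitcode"
    if first_bytes.startswith(WRAPPER_MAGIC):
        return "wrapped bitcode"
    return "not bitcode (possibly textual IR)"
```

Textual IR, by contrast, is just text (e.g. it often begins with a `; ModuleID` comment), which is part of why only the binary format gets the compatibility promise.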


LLVM is a great tool, but I've had trouble in the past with version compatibility. I wrote a compiler that used LLVM as a backend (I used LLVM versions 3.4-3.6). The problem was that each minor version of LLVM slightly changed the API. It was things like removing parameters from methods, renaming methods, removing the need for some methods completely, or adding/removing some static-link libraries. If you only wish your tool to be compatible with a single version of LLVM, it's not a problem, but attempting to support a selection of minor versions ended up being a pain. These minor versions would come out 3-4 times a year, and each time I would need to find what broke, and whether there was even an equivalent solution in the new version.

I didn't work on the level of IR, so I didn't come across any problems there, but I wouldn't be surprised if the IR syntax changed slightly across minor versions.

I guess you used the C++ API? There's a C language subset available that has way more stability.

Given that Apple controls that format, even if it were to continue to change in the future, that would not necessarily be an insurmountable obstacle.

All of LLVM is open source, and the format isn't quite so fluid as GP makes it sound. It should be relatively trivial for one versed in the LLVM codebase to grok the current features of the bitcode.

That said, as long as you keep an indication of the LLVM version that your bitcode was generated with, I really don't see a problem with fluid bitcode specs.

I'm quite sure it is LLVM bitcode. Btw, the creator of LLVM is the same guy who created Swift, i.e. Chris Lattner.

> I really think that LLVM is one of the best things to happen to computer science in a long, long time.

While LLVM is a great compiler toolchain, it is no different than many other JIT/AOT frameworks since the old mainframe days.

What you are suggesting is how OS/400 is adapted to new CPUs.

The major difference with LLVM is modularity. If you create a new language frontend, you get all LLVM's CPU support "for free." If you create a new processor backend, you get all LLVM's frontend languages "for free."
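The economics of that modularity are O(N+M) rather than O(N×M): each of N frontends and M backends only has to speak the shared IR, so a new component immediately pairs with everything on the other side. A toy sketch of the idea (the mini-language, IR, and backends here are entirely made up for illustration):

```python
# Toy illustration of LLVM-style modularity: N frontends and M backends
# meet at one shared IR, so adding a component pairs with all existing ones.

def frontend_toylang(src: str) -> list:
    """Hypothetical frontend: lowers 'a + b' style source to a tiny IR."""
    lhs, op, rhs = src.split()
    return [f"load {lhs}", f"load {rhs}", "add" if op == "+" else "sub"]

def backend_stackvm(ir: list) -> str:
    """Hypothetical backend: renders the IR for a stack machine."""
    return "; ".join(ir)

def backend_fakearm(ir: list) -> str:
    """Hypothetical backend: a pretend ARM-ish rendering of the same IR."""
    return " -> ".join(instr.upper() for instr in ir)

# One frontend's output feeds every backend with no extra work:
ir = frontend_toylang("x + y")
print(backend_stackvm(ir))
print(backend_fakearm(ir))
```

Adding a second toy frontend here would require no backend changes at all, which is the whole point.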

This has interesting consequences such as retargeting anything from the frontend to anything on the backend. I'd venture a wager that in the old mainframe days, the monolithic nature of a JIT would not have been friendly to a porting campaign.

Again, I don't see how this is any different from mainframes.

On OS/400 all executables are bytecode (TIMI), with the JIT in the kernel.

When AS/400 changed processors, the programs continued to execute as always, no change required. All languages got the new processor for free.

Any compiler that targets TIMI gets OS/400 support for free.

The difference is that they had to port the entire JIT to the new processor. With LLVM, you just need the backend components that represent the CPU. Granted, from an app developer's perspective, this is not really relevant because they're targeting a VM; where the VM runs, their apps run.

Distinction without difference. These two components are essentially equivalent in scope.

Any resources on what the TIMI bytecode looks like? Could it be useful for an x86 interpreter?

There isn't much available. IBM doesn't disclose much about it.

However here are some links about OS/400, nowadays known as System i.



A story about the two times TIMI actually changed:


Some Redbooks about ILE, which sits on top of TIMI



All of them can be found at


My limited understanding is that LLVM bitcode doesn't insulate you completely from the ABI/platform differences, e.g. between 32- and 64-bit. So I wonder if they'll be "fat" binaries.

Coincidentally, there's been a bunch of stuff on the mailing lists recently about embedding bitcode in object files in order to support link time optimisation.

You are correct about the 32-bit, and furthermore correct about ABI (LLVM code targeted at Mac wouldn't work on Linux without a lot of work and inefficiencies, and it's not obvious how to fix that).

LLVM IR files are already target specific. There are some targets like Google NaCl that work on several architectures, but I doubt Apple wants to go that way.

So Apple is following in the footsteps of IBM, huh?


Everyone is.

I got to understand better the whole concept of having a bytecode format for executables, with JIT/AOT deployment options when I started delving into the old mainframe world.

I used to do AS/400 backups, but never coded for it. So it was quite interesting to discover all the TIMI concepts.

Also other similar environments like the Burroughs B5000.

The old is new again. :)

That's true...MS has been doing this for phone and Windows Store apps (the few that there are) that are written in .NET, to improve performance. They get AOT'd from CIL to the target binary when uploaded to the store.

I wonder why this isn't bigger in the FOSS world...maybe because the source and toolchain are already available...I don't know, it might be a neat idea to have an IR userland that compiles on install.

They seem to be preparing for something in the 2-5 years timeframe, but what?

Least grandiose theory I can come up with: two ARM cores in the Watch, one big, one small, with different instruction sets, for active/standby modes.

They're preparing for anything. Apple has, in its DNA, a dislike for relying heavily on any partner, thanks to PowerPC.

I think this is exactly it. Perhaps they do have a specific usage in mind that they're aiming for, but I bet they'd be taking these same steps regardless.

Not just that. Apple also switched from 68k to PowerPC.

True, though nominally the same partner. Motorola was the M in AIM, though yes, IBM did the heavy lifting. (Remember Motorola made a Mac clone? How weird.) And the situation wasn't as dire then.

I'm guessing they can just optimize for the watch when they come up with a new trick.

It also means they can optimize for different devices.

This is going to spell the end of all 3rd party crash reporting tools.

I'm not sure I understand what you are implying. How so?

Possible issues with symbolicating the crash logs if you don't have the debug symbols as they're generated on Apple's side.

If Apple does the linking of the final binary for you, you won't have a way to recover the symbol <-> address mappings you need to symbolicate (that's the dSYM file you usually upload to HockeyApp).
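Mechanically, symbolication is just a lookup from a crash address into a table of symbol start addresses, which is why losing the dSYM is fatal. A rough sketch of the idea with invented addresses and names (real tools read these ranges out of the dSYM's DWARF data):

```python
import bisect

# Toy symbolication: map a crash address back to the enclosing function.
# These (start address, name) entries are invented for illustration;
# a real dSYM provides them per build, keyed by the binary's UUID.
SYMBOLS = [
    (0x1000, "main"),
    (0x1400, "-[AppDelegate launch]"),
    (0x1a00, "crashyFunction"),
]  # sorted by start address

def symbolicate(addr: int) -> str:
    """Return 'name + offset' for the symbol containing addr."""
    starts = [start for start, _ in SYMBOLS]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return "<unknown>"
    start, name = SYMBOLS[i]
    return f"{name} + {addr - start:#x}"
```

Without the address-to-name table, all a crash reporter can show is raw addresses into a binary it has never seen.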

I'm honestly surprised that no one has commented on the security implications of this yet. After all, allowing Apple to recompile your "bitcode" essentially means that the user can in no way validate that the version he's running is the version the developer published. It would be fairly straightforward to perform MITM attacks on apps this way. (And, sure, Android's APKs face much the same issue - but AFAIK Google doesn't modify your bytecode (yet?))

Apple controls the platform. They can patch your app at runtime if they want.

This can't be stated enough. You can have all the fanciest encryption messenger app you like, but it all comes down to - do you trust Apple?

It is said they even do binary-level analysis of your app to verify that no restricted APIs are used.

They could already do that. They sign and encrypt the (plaintext) code segment you submit, so they could easily insert a JMP to their own payload at the head of your code any time.

They could also just have the kernel do whatever it wants to your program, because they control that too. If you are worried that Apple might tamper with your binaries if given the chance, using an iOS device is pure folly, because they outright control much more important parts of your system (the kernel, the UI libraries, the RNG…).

Why do they encrypt the code segment? The signature is essential, of course, but encryption looks unnecessary.

It's their fairplay DRM implementation that is supposed to prevent piracy.

Apple has a pretty complex boot system in which each stage of the boot process verifies a checksum of the next layer before starting it up. The lowest layer is etched in ROM. Theoretically, if you verify the integrity of the bitcode, and you verify the integrity of the bitcode compiler, you should be able to trust the native binary as well.
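That chain-of-trust idea is simple to sketch: each stage carries the expected digest of the next, and only the very first digest has to be trusted (on real hardware it is effectively the ROM). This is a toy model using SHA-256, not Apple's actual scheme:

```python
import hashlib

# Toy chain of trust: by convention here, the first 32 bytes of each
# stage's payload hold the SHA-256 digest of the *next* stage's payload.
# Verification walks the chain; only the root digest must be trusted.

def digest(payload: bytes) -> bytes:
    return hashlib.sha256(payload).digest()

def verify_chain(trusted_root_digest: bytes, stages: list) -> bool:
    """Verify each stage against the digest carried by its predecessor."""
    expected = trusted_root_digest
    for payload in stages:
        if digest(payload) != expected:
            return False  # this stage was tampered with
        expected = payload[:32]  # digest of the next stage
    return True
```

A quick usage example: build a two-stage chain (loader, then compiler), verify it, then flip a byte in the last stage and watch verification fail.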

Apps you upload to iTunes Connect that contain bitcode will be compiled and linked on the App Store.

Unless the description is wrong, this looks like Apple could insert any code they want into your binary, without your users noticing.

As others have pointed out - which they can do anyway because they own the entire operating system and application runtime.

Seeing as the only way to run these apps is to get them through the App Store Apple could also just completely replace your app code with whatever they want.

They can also choose to just not validate signatures. If they wanted to MITM your app, this doesn't make it any easier for them as it's already very easy for them to do that.

Well, APKs are signed by the developer. Google would have to replace them on first install.

Signed binaries don't mean anything to someone like Google or Apple who control the host OS (and the hardware).

In both their OS/kernel and in the hardware, Google has the ability to make your app do anything they want to regardless of how you coded it or signed the binary.

Putting Swift on Linux and Bitcode together, I'd take a wild guess that an Apple-powered PaaS is on its way.

Swift on Linux is probably just a consequence of Apple using Linux on their servers. Don't read too much into it.

PaaS (like App Engine) is exactly the kind of thing that can happen on the server.

They kind of have that already with CloudKit - it's not suitable for all use cases though.

They've just announced a major new addition to CloudKit, though, to basically allow integration with CloudKit via web apps, which seems like a pretty big deal.

CloudKit is BaaS (like Parse), and here I meant PaaS (like AppEngine).

I'm thinking that Apple sees a new threat to their mobile (profit) dominance: Fully portable Mobile+Desktop OS enabled through Core M, i.e. Windows 10. "Bitcode" is them preparing to unify their OSes just in case Win10 is successful and buyers suddenly want to have one device to rule them all (i.e. MS Continuum). This could very soon be a reality - and Apple seems to be preparing themselves. Smart move [1].

[1] http://img2.wikia.nocookie.net/__cb20130701185217/jurassicpa...

My thought is more that they are going the Android route of using an intermediary bytecode, to get both platform independence, but on-device compilation for performance. If they can push hardware that will automatically run faster, that makes it all that much easier to sell new product.

Yes, I didn't mean Apple is going the MS route from a technical point of view, just the timing of their platform independence efforts leads me to believe that it has to do with pressure (or the potential thereof) by Microsoft.

I find it interesting that this hasn't gotten as much attention as other announcements. It seems to hand over a great deal of responsibility to Apple, at least in terms of performance -- something developers might or might not feel comfortable with. I can imagine this lack of control over what instructions will actually run on the hardware might not sit well with some developers.

As long as you have (and keep updating to) the devices in question to test on, you'll have full performance info.

Most likely, however this is already being done in the other mobile platforms.

Not really, you can ship native code on Android and, for Java apps, intermediate bytecode is delivered to end-user phones, not compiled in the cloud.

Windows Phone 8:


Windows Phone 10:


As for Android, having ART on the phone is a technicality; given the platform fragmentation, Google would rather leave to the OEMs the task of making ART generate the proper code.

John Siracusa tweeted a rumor yesterday speculating/hinting at ARM Macs.


How fast are the fastest ARM chips compared to the lower-end Intel chips? Could we see low-powered Mac Airs with long battery lives?

This will affect crash reporting and symbolication. Presumably Apple strips debug symbols when they compile. Will they make the dSYMs (DWARF-containing files for stripped Mach-O binaries, necessary to symbolicate crash reports) available somehow? If not, third parties (or anyone who chooses to handle their own crash reports directly) are in trouble.

So is this just LLVM IR?

I'm curious how that works. LLVM bitcode backwards/forwards compatibility is quite poor, from what I've heard. Will they fix some bitcode version to transmit, and have translators on the client/server sides? Have a bunch of different toolchain versions on the server side? Force everyone to resubmit with a newer version if they want to do any changes on the server side?

Presumably the backwards/forwards compatibility is poor simply because people don't tend to have these files sitting around, they mostly exist (at the moment) as part of a compilation framework and are tempfiles.

If needed, it wouldn't be too hard for someone with a few programmers (Apple could spare a few!) to write the proper versioning to upgrade/downgrade the file format as LLVM changes.
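Versioning a serialized format is mostly bookkeeping: tag every file with the producer version and keep a chain of small upgrade steps, so any old file can be walked up to the current layout. A toy sketch of the pattern (the record layout and field names are invented, not LLVM's):

```python
# Toy versioned-format reader: each upgrader lifts a record exactly one
# version, so old files can always be migrated to the current layout.

CURRENT_VERSION = 3

def upgrade_1_to_2(rec: dict) -> dict:
    rec = dict(rec, version=2)
    rec.setdefault("target", "unknown")  # field introduced in v2
    return rec

def upgrade_2_to_3(rec: dict) -> dict:
    rec = dict(rec, version=3)
    rec["flags"] = rec.pop("flag_bits", 0)  # field renamed in v3
    return rec

UPGRADERS = {1: upgrade_1_to_2, 2: upgrade_2_to_3}

def load(rec: dict) -> dict:
    """Apply upgrade steps until the record is at CURRENT_VERSION."""
    while rec["version"] < CURRENT_VERSION:
        rec = UPGRADERS[rec["version"]](rec)
    return rec
```

The nice property is that each release only has to add one small upgrader, rather than every reader understanding every historical layout.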

Yep. I hope they leave this optional for iOS, as we use some proprietary libraries which we don't think the vendor will ship a new version of :(

You may get by with a decompiler as laid out in http://llvm.org/devmtg/2013-04/bougacha-slides.pdf

And since you'll certainly not be the only one with that problem, Apple making it mandatory for iOS would probably spur development in that area.

I am not so confident: they now require that you ship a 64-bit version of your apps, which can also be a problem if you use 32-bit-only proprietary libraries.

There's a good reason for that requirement. Loading 32-bit apps on a 64-bit device requires loading the 32-bit versions of all shared libs, massively increasing the memory footprint.

That's really cool.

I like that idea in general. The other day I was reading about OS/400 on Wikipedia. It always used an intermediate bytecode...and because of it they were able to move seamlessly (who knows how seamlessly) from architecture to architecture.

Doesn't it open doors for proprietary LLVM optimisations hidden from others?

So, is there any actual documentation of this anywhere?

Interesting to see Apple following in Microsoft's (MDIL, .NET Native) footsteps.

Is there any substantial new technology a major company gets that another competing company won't try to match soon?

Yeah, but the funny thing is how Apple's AOT compilation toolchain was being discussed by some Apple fans as the way to go.

Now they turn around and follow what the others are doing.

It still is AOT. There's just an intermediary step, which is what you send to Apple, but the code is compiled on its servers, not at runtime.

It's not like LLVM doesn't have several steps already in its pipelines.

I guess you don't know how MDIL and .NET Native work.

Not sure what you're getting at.

Like Apple's, .NET Native is AOT.

Like Apple's, .NET Native compiles to native code on the server.

It's your comment that gave the impression that you thought Apple's new bitcode thing is not AOT and that they follow MS in this not-AOT-ness.

It might not have been what you meant, but it's not very clear from the phrasing:

>"(...) Apple's AOT compilation toolchain was being discussed by some Apple fans as the way to go. Now they turn around and follow what the others are doing"

This reads like Apple had an AOT compilation toolchain that Apple fans thought was "the way to go", and now Apple doesn't have one (an AOT compilation toolchain) anymore, following MS's lead in this regard.

Whereas what you actually meant was probably that Apple fans thought that Apple's PREVIOUS AOT compilation toolchain was the way to go, but now they've changed course and went for an MS style AOT compilation toolchain.

(It read like you thought "Apple's AOT toolchain" was a thing of the past, and now they follow MS, which doesn't have AOT.)

> Whereas what you actually meant was probably that Apple fans thought that Apple's PREVIOUS AOT compilation toolchain was the way to go, but now they've changed course and went for an MS style AOT compilation toolchain.


Yeah, it took me a while to see it can mean that; I think the straightforward reading is the other one.

Well, not a native speaker here. :)

Both .NET Native and this (bitcode) are an AOT compilation toolchain.

Who says otherwise?

What I said is that I heard from many Apple fans that the MDIL/.NET Native compilation model [0] didn't make sense, and that direct compilation from Xcode to the device was the way to go.

[0] Uploading IL to the store and having a server-based compiler generate native code before download.
