Apple’s Bitcode Telegraphs Future CPU Plans (medium.com/inertiallemon)
161 points by 127001brewer on June 16, 2015 | 98 comments

I managed to ask Chris Lattner this very question at WWDC (during a moment when he wasn't surrounded by adoring crowds). "So, you're signaling a new CPU architecture?" But, "No; think more along the lines of 'adding a new multiply instruction'. By the time you're in Bitcode, you're already fairly architecture-specific" says he. My hopes for a return to big-endian are dashed. [Quotes are approximate.]

That sounds about right. No radical architectural shifts, but bitcode submissions should let Apple optimize apps automatically for whatever latest tweaks are available in issue width or fused instructions.

My most radical speculation is an iPhone A-series chip with an additional low power ARM core especially to support Watch apps without burning too much "host" device battery.

Why on earth would you ever want to _return_ to big-endian?

Because big-endian matches how most humans have done it for most of history ("five hundred twenty one" is written "521" or "DXXI", not "125" or "IXXD"). Because the left-most bit in a byte is the high-order bit, so the left-most byte in a word should be the high-order byte. Because ordering two 8-character ascii strings can be done with a single 8-byte integer compare instruction (with the obvious generalizations). Because looking for 0x12345678 in a hex dump (visually or with an automatic tool) isn't a maddening task. Because manipulating 1-bit-per-pixel image data and frame buffers (shifting left and right, particularly) doesn't lead to despair. Because that's how any right-thinking person's brain works.
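The 8-character string trick can be sketched in C. This is only an illustration: the helper names `be_key8` and `cmp8` are made up, and the key is built with shifts so the sketch behaves the same on any host. On a big-endian machine the key would simply be a plain 8-byte load.

```c
#include <stdint.h>
#include <string.h>

/* Build a 64-bit key from exactly 8 bytes, most significant byte first.
   On a big-endian machine a plain 8-byte load yields the same value;
   the shifts make this sketch independent of the host byte order. */
static uint64_t be_key8(const char *s)
{
    uint64_t key = 0;
    for (int i = 0; i < 8; i++)
        key = (key << 8) | (unsigned char)s[i];
    return key;
}

/* Returns <0, 0 or >0 with the same sign as memcmp(a, b, 8),
   using a single integer comparison of the two keys. */
static int cmp8(const char *a, const char *b)
{
    uint64_t ka = be_key8(a);
    uint64_t kb = be_key8(b);
    return (ka > kb) - (ka < kb);
}
```

For example, `cmp8("applepie", "applesau")` is negative, just like `memcmp` on the same 8 bytes. Building the key least significant byte first would not preserve that ordering.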

The one place I've seen little-endian actually be a help is that it tends to catch "forgot to malloc strlen PLUS ONE for the terminating NUL byte" bugs that go undetected for much longer on big-endian machines. Making such an error means the NUL gets written just past the end of the malloc, which may be the first byte of the next word in the heap, which (on many implementations of malloc) holds the length of the next item in the heap, which is typically a non-huge integer. Thus, on big-endian machines, you're overwriting a zero (high order byte of a non-huge integer) with a zero, so no harm done, and the bug is masked. On little-endian machines, though, you're very likely clobbering malloc's idea of the size of the next item, and eventually it will notice that its internal data structures have been corrupted and complain. I learned this lesson after we'd been shipping crash-free FrameMaker for years on 68000 and Sparc, and then ported to the short-lived Sun386i.
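A toy model of that off-by-one, as a sketch only. The 12-byte layout (8 payload bytes followed by a 4-byte size field for the "next" block) and all names here are hypothetical, not any real malloc. Both parts live in one array, so nothing actually overflows when this runs.

```c
#include <stdint.h>
#include <string.h>

/* Toy heap: 8 payload bytes followed directly by the 4-byte size field
   of the "next" block. Hypothetical layout, for illustration only. */
struct toy_heap {
    unsigned char bytes[12];
};

/* Write the size field at bytes[8..11] in the requested byte order. */
static void store_size(struct toy_heap *h, uint32_t size, int big_endian)
{
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        h->bytes[8 + i] = (unsigned char)(size >> shift);
    }
}

/* Read the size field back in the same byte order. */
static uint32_t load_size(const struct toy_heap *h, int big_endian)
{
    uint32_t size = 0;
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        size |= (uint32_t)h->bytes[8 + i] << shift;
    }
    return size;
}

/* The bug: an 8-character string plus its NUL is copied into 8 bytes,
   so the NUL lands on the first byte of the size field. */
static uint32_t size_after_overflow(uint32_t next_size, int big_endian)
{
    struct toy_heap h;
    store_size(&h, next_size, big_endian);
    memcpy(h.bytes, "12345678", 9);
    return load_size(&h, big_endian);
}
```

With a size of 48, the big-endian layout still reads 48 afterwards (a zero was overwritten with a zero), while the little-endian layout reads 0, which is the kind of corruption an allocator eventually notices.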

Humans never chose to communicate via hex dumps either. We're making some concessions to hardware concerns here at the cost of human readability anyway, so one can debate whether endianness should also be considered that way. Protocol decoders are really great.

Little endian is more natural for data structures. The least significant bit goes in the byte with the least address, and the most significant bit goes in the byte with the greatest address, so you never have to remember which way you're going, which is particularly nice when working with bitvectors and bit fields.

"Left" and "right" can go either way, depending on which kind of diagram you draw, even on big-endian machines, so those words always end up ambiguous. Stick to bit significance and address order and everything is unambiguous and naturally inclined to little endian.
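A small sketch of the bit-vector point, with made-up helper names. It sets bit i through a 32-bit word view, stores the word with an explicit byte order, and then asks where that bit shows up in the byte view (byte i/8, bit i%8).

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit word with an explicit byte order, so the result does
   not depend on the CPU this runs on. */
static void store32(unsigned char *p, uint32_t v, int big_endian)
{
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        p[i] = (unsigned char)(v >> shift);
    }
}

/* Byte view of a bit vector: bit i lives in byte i/8 at position i%8. */
static int test_bit(const unsigned char *bytes, size_t i)
{
    return (bytes[i / 8] >> (i % 8)) & 1;
}

/* Set bit i (i < 32) through the word view, write the word to memory in
   the given byte order, and return the index at which the byte view
   finds the bit. */
static size_t where_bit_lands(unsigned i, int big_endian)
{
    unsigned char mem[4];
    store32(mem, (uint32_t)1 << i, big_endian);
    for (size_t k = 0; k < 32; k++)
        if (test_bit(mem, k))
            return k;
    return 32; /* not reached for i < 32 */
}
```

In little-endian order bit 13 of the word is bit 13 of the byte view. In big-endian order it turns up at index 21, so the two views need a translation step.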

I'm not sure what you mean by the bit shifting case. The 8 char ASCII string compare is a neat trick with limited applicability these days.

Well for starters, ARM has runtime-selectable endianness, so if Apple had felt any reason to do so, they would have used a big-endian ABI by now.

This article covers the practical tradeoffs of little and big-endianness well: https://fgiesen.wordpress.com/2014/10/25/little-endian-vs-bi...

The tl;dr is that little-endian was a smart performance optimization in the early microprocessor days when nearly all arithmetic was effectively bignum arithmetic (because the ALUs were only 4 or 8 bits wide), but that doesn't really matter now, so we're stuck with little-endian despite big-endian having some small developer productivity benefits.

The thing is, little-endian won pretty much everywhere outside of network protocols, so almost all of the common data formats store words in little-endian format as a performance optimization. By going big-endian, you'd be both forced to eat a byte-swapping performance hit on every load or store to these formats, and you'd break a tremendous amount of software that assumes it is running on a little-endian architecture. Dealing with those headaches would absolutely not be worth the trouble for the almost insignificant benefit of slightly easier to read hex dumps, or the slightly more useful benefit of string comparison via memcmp that could be better performed by dedicated SIMD instructions anyway.

> Because big-endian matches how most humans have done it for most of history ("five hundred twenty one" is written "521" or "DXXI", not "125" or "IXXD").

Actually, it is possible that that was nothing more than an accident. We use Arabic numerals, and Arabic languages are written right-to-left. Then there are languages like German where digits are read in reverse, so "42" is read as "two-and-forty".

The German "decade flip" is restricted to the tens and unit places; otherwise the order is English-like, with larger terms leading.

The cardinal number systems for most major languages lead with larger terms (as in English). I don't think there's anything deep about this, it's probably an accident. And there are languages which lead with smaller terms, such as Malagasy (the national language of Madagascar).

The ordering of digits in Arabic is not obviously relevant, per se, since spoken English ("one hundred twenty one") matches the order of the Arabic numbers, too.

It's funny how the Germans and Dutch (rightfully) ridicule Americans for writing dates in middle-endian order like 9/11/2001, yet they say numbers with the decade flip "two and forty". That's just as ridiculous.

MM/DD/YYYY is simply a direct transliteration of spoken English, which makes it easy to read and write dates. In other languages, the spoken version is little endian or big endian, and the written version aligns accordingly. (At least for the languages I know.)

ITYM "... spoken American".

Are you saying it is commonly referred to as "The 11th of September, 2001" in England?

We would normally refer to that as September 11 because it's much more talked about in the US, and that's the phrase used there.

Any other dates will likely be in the same order as written. For instance, the rhyme for bonfire night is 'remember remember, the fifth of November'. I believe that many in the US also talk about the fourth of July, rather than July fourth, so it's not like English has the hard-and-fast rule you were proposing.

Ok, fair enough. What do you say for non-special dates like July 3rd, 2015?

'Third of July, 2015' or more likely 'third of July'. The date format really isn't lying to us in UK English.

As a British English speaker, I'd say "yes".

Technically, I'd drop the "th of" and just say/write "11 September 2001".

Interesting. Would you choose "three July twenty fifteen", "July third twenty fifteen", or "the third of July twenty fifteen" (substitute "two thousand" for "twenty" if you like)? Assume someone has asked you the date and you're responding out loud.

I would say "three July twenty fifteen".

It took me a while to figure this out because actually it's quite rare to speak a date including a year without reading it - most spontaneously spoken dates are this year (so the year is implied), and for dates I'm reading aloud, I'd probably say whatever was written.

The clincher was how I'd say my birth date, which would be of the form above.

I'm not claiming to be the definitive British English speaker, though! ;)

...and as another poster commented, it might depend on context - for example, "September 11" is often used in British English because it refers to an American event.

As a Brit, IMO both are perfectly acceptable in English prose. It isn't unusual to say "October the 4th" as opposed to "the 4th of October".

Americans read and write dates in middle endian. Germans and Dutch only read numbers with decade flip, they are written as usual. Furthermore with "two and forty" there can be no confusion since it's not "two and four", so it's clear that "forty" refers to the tens position and "two" to the units position. It is of course not ideal, but not nearly as much cause of confusion as the middle endian dates, because there's simply no way to know what 9/11/2001 means.

At least they (we) stay consistent between 13 and 99, while a certain other language elects to switch the flip at 20.

I think we should all take a moment to admire the francophone Swiss for boldly dropping much of the madness that is French counting. (Yes, I am looking at you, quatre-vingt-dix-neuf!)

> The ordering of digits in Arabic is not obviously relevant, per se, since spoken English ("one hundred twenty one") matches the order of the Arabic numbers, too.

I think it is relevant. It is possible that Western mathematics copied the Arabic notation (with right-to-left numbers), without also copying the correct way to read it (also right-to-left). For a similar situation in language, think of accents and the many different ways you can pronounce the same word.

Unlikely. Cardinal numbers in Old English, long before the slightest chance of contact with Arabs, were virtually identical to the modern system with respect to order of terms.

"For a similar situation in language, think of accents and the many different ways you can pronounce the same word."

Could you be more specific as to what you mean?

I mean the writing is identical, but the pronunciation varies wildly.

I think the way we do it is more natural. Numbers which have an infinite decimal expansion towards the right side of the decimal point are relatively much more common and useful compared to numbers which have an infinite decimal expansion towards the left.

For example, you can write out e as 2.7182...

However, if we were to flip this notation, ...2817.2, it isn't clear where to begin writing the number if we read (and write) from left to right. With the regular representation, you write out the 'major' parts of the number first and then give out as many details as you want. You have the beginning of your string in mind. With a reversed system, you don't have the beginning but the end of the string in mind.

See https://en.wikipedia.org/wiki/P-adic_number for an overview of the system of numbers that actually works this way, similar to https://en.wikipedia.org/wiki/Two's_complement with infinitely long registers.
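The two's complement analogy can be checked in a few lines of C. In the 2-adic numbers the infinite sum 1 + 2 + 4 + ... equals -1; a 32-bit register shows the truncated version of the same fact. The function names below are made up for this sketch.

```c
#include <stdint.h>

/* Add 1 in a 32-bit register; any carry out of the top bit is dropped. */
static uint32_t plus_one(uint32_t x)
{
    return (uint32_t)(x + 1u);
}

/* Sum 2^0 + 2^1 + ... + 2^31, built up from the low-order end, the way
   an expansion that extends to the left is written out. */
static uint32_t sum_of_powers_of_two(void)
{
    uint32_t sum = 0;
    for (int k = 0; k < 32; k++)
        sum += (uint32_t)1 << k;
    return sum;
}
```

The sum is the all-ones pattern, adding one to it gives zero, and it compares equal to `(uint32_t)-1`.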

All you are pointing out is that mixing little-endian and big-endian may cause trouble. You're not saying anything about which of the two is better.

Anyway, it doesn't matter what you think is 'more natural.' Computing in binary probably feels less natural to you, but nobody is going to stop making binary computers because of that.

Do they also call 1042 "Forty two and one thousand"?

After looking up some German for beginners (German speakers, feel free to correct me), I found out that 1042 is read like "one thousand two and forty".

German speaker here, that's correct. When the number is between 1000-1999 the "one" in "one thousand" is sometimes omitted, so "thousand two and forty".

Also known as middle endian.

Even if it's weird, it's nothing like French numerals.

Isn't the only weirdness in French 70, 80 and 90? That's nothing compared to German :)

Georgian has a fun number system too - numbers between 20-90 are expressed as a multiple of 20 + a number between 1 and 19: http://blog.conjugate.cz/georgian-is-fun

It's weird that it's not consistent! :D

The German thing is just weird until you get used to it; with French I constantly go, wait, crap, this is over 70, what's the deal again. I blame the wine consumption.

Can't be worse than Dutch counting!

Yeah, but the way individual languages do numbers doesn't necessarily make sense. In French, 99 is read as "four-twenties and ten-plus-nine"

Little endian is easier to deal with at the machine level as you don't need to adjust pointer offsets when referencing a smaller sized variable. A pointer to an 8, 16, 32, 64, 128 bit quantity will be the same. Big endian you will need to adjust the pointer up/down accordingly.
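A sketch of the pointer-offset point. The helpers `store64`, `narrow_offset` and `load_narrow` are hypothetical names, and the byte order is passed in explicitly so the example does not depend on the host CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 64-bit value into 8 bytes with an explicit byte order. */
static void store64(unsigned char out[8], uint64_t v, int big_endian)
{
    for (int i = 0; i < 8; i++) {
        int shift = big_endian ? 8 * (7 - i) : 8 * i;
        out[i] = (unsigned char)(v >> shift);
    }
}

/* Byte offset at which the low `width` bytes (1, 2, 4 or 8) of a stored
   64-bit value begin. Always 0 for little endian. */
static size_t narrow_offset(size_t width, int big_endian)
{
    return big_endian ? 8 - width : 0;
}

/* Read the low `width` bytes of the stored value as an integer. */
static uint64_t load_narrow(const unsigned char buf[8], size_t width,
                            int big_endian)
{
    const unsigned char *p = buf + narrow_offset(width, big_endian);
    uint64_t v = 0;
    for (size_t i = 0; i < width; i++) {
        size_t shift = big_endian ? 8 * (width - 1 - i) : 8 * i;
        v |= (uint64_t)p[i] << shift;
    }
    return v;
}
```

For a stored 0x1122334455667788, the 16-bit view is 0x7788 in both layouts, but it starts at offset 0 in the little-endian buffer and at offset 6 in the big-endian one.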

I always thought it was just one of those arbitrary choices made by companies with massive headaches resulting. I never thought to ask anyone if there was a good reason for one or the other. You've made quite the case for big-endian. I'm saving your comment (with attribution) for the next time this comes up in discussion. ;)

"Framemaker: it's riddled with features!"

>why on earth

there's your problem, you're living on earth. try living in the cloud. :) (network byte order)

Because BSD vs SysV just doesn't have that frisson of '80s nerdwar any longer?

Remember the Tie Fighter -vs- Death Star poster swag from Usenix?

4.x > V ∀ x from 0..∞

To think different

Exactly. Clang makes platform-specific lowerings and bakes in some ABI-specific gunk. Granted not so much as the object code on the other side of the back-end. Enabling target-specific vectorization and whole-program optimization are probably the goal.

I thought that just being LLVM bitcode wasn't enough to guarantee portability like the author assumes that it is.

There are ABI-specific pieces that are still not abstracted in the bitcode, like struct packing rules.
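As a rough illustration of the struct packing point, here is a C sketch with a made-up `struct record`. The front end fixes these offsets for the target ABI before any IR exists, so they end up as plain numbers in the generated code.

```c
#include <stddef.h>

/* Hypothetical record mixing a small field with a pointer-sized one. */
struct record {
    char  tag;
    void *payload;
    int   count;
};

/* Offset of the pointer field: decided by the target's alignment rules. */
static size_t payload_offset(void)
{
    return offsetof(struct record, payload);
}

/* Total size, including whatever padding the target ABI requires. */
static size_t record_size(void)
{
    return sizeof(struct record);
}
```

On a typical 32-bit ARM ABI one would expect offset 4 and size 12, and on a typical 64-bit ABI offset 8 and size 24. Those figures are what common ABIs produce, not something the C standard guarantees.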

You are correct, and the article isn't even close to right.

There are things you can do if the ABI's are the same, such as optimize for microarches, but they otherwise have literally no idea what they are talking about.

Bitcode is meant for repeated optimization of the same IR.

That is, it would make sense to start from a well-optimized AOT compiled version of bitcode, then JIT the bitcode at runtime and try to come up with something better once you have profiling feedback.

I expect this is the plan, given that it's what everyone else who is serious about LLVM does.

It would not make any sense to start from bitcode and try to generate code for different architectures.

LLVM is a low level virtual machine for a reason.

There has been work on things like virtual ISAs using LLVM (see http://llvm.org/pubs/2003-10-01-LLVA.html), but the result of this research was, IMHO, that it's not currently a worthwhile endeavor. You can also restrict bitcode in ways that make it portable (as PNaCl does, for example), but this is closer to the LLVA work than anything else: it's essentially a portability layer. It still requires porting, just porting to the single portability layer.

I appreciate the clarification.

Not only that, but "[target data layout] is used by the mid-level optimizers to improve code, and this only works if it matches what the ultimate code generator uses. There is no way to generate IR that does not embed this target-specific detail into the IR. If you don’t specify the string, the default specifications will be used to generate a Data Layout and the optimization phases will operate accordingly and introduce target specificity into the IR with respect to these default specifications."

Much of this article is simply inaccurate speculation.

[1] http://llvm.org/docs/LangRef.html#data-layout

note that data-layout used to be optional, but is now mandatory.

That's correct. Bitcode for a 32 bit processor won't work with 64 bit processors, for instance.

The ABI would be compatible between similar architectures: Apple has somewhat encouraged developers to provide binaries for all of armv6, armv7, and armv7s in the past, all of which are 32-bit ARM architectures. While some new features might be difficult to take advantage of without high-level information, I imagine a difference as drastic as ARM vs Thumb would be easy to pull off starting from bitcode.

Bitcode alone might not be infinitely malleable. But given that Apple knows intimately the original platform the submitted app targeted, it has the ability to add additional logic in iTunes Connect that permits tweaks and assumptions you couldn't get away with in the promiscuous and agnostic setting of a generic compiler toolchain.

Yeah, you more or less can't change struct layout after IR (maybe unless you had a no-aliasing no-FFI guarantee... safe Rust?) But you can encode new or different instructions and add new optimizations.

Perhaps someone can explain to me why converting a binary straight from one architecture to another couldn't be done. If emulators can work, why not translators?

An emulator has run-time information, a translator does not.

That can be important information. For example, the translator might not figure out where data embedded in the code (for example a jump table) ends, and, consequently, continue translating from a point mid-way into a multi-byte instruction.

Translating code that generates code (as done in JIT engines or to speed up image processing kernels) also is a challenge; the best one realistically can do is to translate the existing code as-is and then call the translator at run time to do the translation. If that works, the 'call the translator' step may kill any performance that was won by generating code.

Of course, self-modifying code is a challenge, too.

The problem is that perfect disassembly (figuring out where every instruction starts, and if bytes are instructions or data) of an arbitrary program is undecidable. Emulators get around that problem by only disassembling instructions that actually get executed at run-time (and therefore can safely be called "code").
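The difference can be shown with a toy example. The three-opcode instruction set below is entirely made up for this sketch; it only exists to show a static linear sweep losing sync on embedded data while an execution-driven walk does not.

```c
#include <stddef.h>

/* Toy variable-length ISA (entirely hypothetical):
     0x01          NOP          1 byte
     0x02 lo hi    LOAD imm16   3 bytes
     0x03 off      JMP +off     2 bytes, off counted from the next insn
   Any other byte is treated as a 1-byte instruction. */
static size_t insn_length(unsigned char opcode)
{
    switch (opcode) {
    case 0x02: return 3;
    case 0x03: return 2;
    default:   return 1;
    }
}

/* Static translator style: decode everything in address order.
   Records instruction start offsets and returns how many were found. */
static size_t linear_sweep(const unsigned char *code, size_t len,
                           size_t *starts, size_t max)
{
    size_t n = 0;
    for (size_t pc = 0; pc < len && n < max; pc += insn_length(code[pc]))
        starts[n++] = pc;
    return n;
}

/* Emulator style: only decode what control flow actually reaches. */
static size_t follow_flow(const unsigned char *code, size_t len,
                          size_t *starts, size_t max)
{
    size_t n = 0, pc = 0;
    while (pc < len && n < max) {
        size_t next = pc + insn_length(code[pc]);
        starts[n++] = pc;
        if (code[pc] == 0x03 && pc + 1 < len)
            next += code[pc + 1];   /* take the forward jump */
        pc = next;
    }
    return n;
}
```

For the byte sequence `03 01 02 02 34 12 01` (a jump over one data byte, then a LOAD and a NOP), the linear sweep reports instruction starts at 0, 2, 5 and 6 because it decodes the data byte as a LOAD. Following the jump gives the real starts: 0, 3 and 6.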

How does dynamic linking work in a scheme like this? Would any pre-compiled libraries need to be distributed as Bitcode as well?

Due to the concerns above, IMO Bitcode is less about compatibility and more about app thinning. It's one thing to go from Bitcode to 4 different variants of ARM, but another entirely to go from Bitcode to both x86 and ARM. Currently, developers have to ship binaries compiled for multiple architectures, which increases app sizes. I suspect Apple is just building a workflow that creates a device-specific version of each app, and having developers compile to Bitcode simplifies app submission.

Yeah, according to Apple "If you provide bitcode, all apps and frameworks in the app bundle need to include bitcode."


This is a smart move. It's essentially what the System/38 (later AS/400 & IBM i) did. They had an ISA or microcode layer that all apps were compiled to. Then, that was compiled onto whatever hardware they ran on. When IBM switched to POWER processors, they just modified that low layer to compile to POWER processors. That let them run the old apps without recompilation of their original source. They used this strategy over and over for decades. Always one to keep in mind.

Going further, I think a team could get interesting results combining this with design-by-contract, typed assembly, and certified compilation. Much like verification condition generators, the compilation process would keep a set of conditions that should be true regardless of what form the code is in. By the time it gets to the lower level, those conditions & the data types can be used in the final compile to real architecture. It would preserve a context for doing safe/secure optimizations, transforms, and integration without the original source.

Based on other comments here, it sounds like the LLVM Bitcode representation is more closely tied to a specific architecture than the System/38 intermediate representation was. Same idea, though.

I agree. That's either an advantage or another opportunity for modern IT to learn from the past. Those old systems were full of so many tricks they're still outclassing modern efforts in some ways haha.

Microsoft has been doing a similar thing with [.NET Native](https://msdn.microsoft.com/en-us/vstudio/dotnetnative.aspx) for a while now, though MSIL is much higher level than LLVM IR. With .NET Native, you _can_ submit your app once and run on ARM and x86, 32-bit or 64-bit.

I really hope this bitcode feature isn't going to cause a lot of trouble for app developers. Up until now, the app you built on your machine (and tested on your devices) is the app you submitted and which is running on your customers machines.

In the future, when your app crashes on customers' machines and doesn't on yours, how are you going to debug it, much less explain this to Apple and have them fix the issue for you?

This is especially scary when you consider the turnaround time of ~2 weeks before your new build becomes available in the app store for you to test.

You can download the newly compiled binaries to your test devices. The App Store won't release it to customers until you say it's okay.

With Bitcode, Apple could change OS X into something like the old TAOS jit based OS. Except for a small kernel, all TAOS executables were intermediate representation files. This IR could be translated to real machine code at the same speed as disk access, and resulted in code running at 80-90% native speed on most platforms.

With software like that, Apple could become independent of any particular software architecture.

(TAOS dates from the 90's and is hard to google, but is mentioned in some papers. And yes, the JIT translator could do that even on 90's machines.)

"With software like that, Apple could become independent of any particular software architecture. "

Not with the languages they use :)

Neither C, nor C++, nor Swift, can be made portable to new architectures through bitcode.

At least, not without language-level changes to each of them.

For example, for C and C++, sizeof is a constant expression, so you can't easily just do something like "defer evaluation to runtime". Plus ifdefs, struct layout, etc.

ANDF tried to solve these problems many years back. It may have even been "mostly possible" with c89. But today's languages, not so much.

(Even things like PNaCl and emscripten and what have you have restrictions on what C++ they allow)

> For example, for C and C++, sizeof is a constant expression, so you can't easily just do something like "defer evaluation to runtime". Plus ifdefs, struct layout, etc.

What prevents things like sizeof(T) or alignof(T) from being representable in the LLVM IR in a form that says, "defer to final translation" (to the target architecture)? Does substitution of platform-dependent types, for example, i64 for size_t on x86_64, happen prior to generating the LLVM IR? It would seem useful for me for LLVM IR to retain some platform-dependent types like size_t or intptr_t (deferring until final translation from bitcode to machine code), but maybe that would inhibit certain optimizations.

The short version is, that just isn't possible. Too many things would be unknown, and it is easy in both C and C++ to, at compile time, check the sizeof(size_t) and perform totally different behaviour depending on the value.
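A minimal sketch of what checking the size at compile time can look like. The names `hash_word` and `struct table` are invented for illustration. In both cases the decision is made by the front end, so only one variant ever reaches the IR.

```c
#include <stddef.h>
#include <stdint.h>

/* Preprocessor branch on a target property: only one arm is ever seen
   by the rest of the compiler. */
#if UINTPTR_MAX > 0xFFFFFFFFu
typedef uint64_t hash_word;   /* targets with pointers wider than 32 bits */
#else
typedef uint32_t hash_word;   /* everything else */
#endif

/* sizeof inside a constant expression: the array length, and with it
   the type and layout of the struct, depends on the target. */
struct table {
    unsigned char slots[sizeof(void *) == 8 ? 64 : 32];
};

static size_t hash_word_bytes(void)
{
    return sizeof(hash_word);
}

static size_t table_bytes(void)
{
    return sizeof(struct table);
}
```

Anything downstream that uses `hash_word` or `struct table` (struct offsets, loop bounds, overload or macro choices) is already specialized by the time bitcode is written.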

> The short version is, that just isn't possible. Too many things would be unknown, and it is easy in both C and C++ to, at compile time, check the sizeof(size_t) and perform totally different behaviour depending on the value.

I can see an issue with template instantiation in C++: for example, if a template uses SFINAE to specialize a template for types of certain sizes, that needs to be evaluated purely at the C++->LLVM stage of compilation.

Which compile-time part of C do you mean? sizeof(T) is evaluated at compile time of course, but it would still produce pseudocode like:

    if (4 < 16) { // sizeof(T) replaced with 4
        // do something
    } else {
        // do something else
    }

Of course, an optimizer would likely constant-fold that conditional expression to remove the branch entirely, but I'm having a difficult time seeing how one could perform different behavior at compile time with sizeof(T) in C.

    #define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))

There are many ways to make it perform different behavior at compile time (though admittedly, most are abuse). The above should produce a compile error when the condition is true, but if you push sizeof evaluation to a later stage, it will not.

Better form, for certain definitions of better (works in global scope):

    #define BUILD_BUG_ON(condition) extern int build_bug_on[1 - 2*!!(condition)]

It's worse than this. For example, you'd never be able to issue diagnostics about constexprs. You'd also not be able to lay out structs, etc.

"Does substitution of platform-dependent types, for example, i64 for size_t on x86_64, happen prior to generating the LLVM IR? "

In LLVM world, Clang is performing most of the ABI lowering.

For C and C++, the answer is "yes", and "it must happen this way", because struct layout etc. will depend on it. Not to mention that what you want is, at some level, impossible in the LLVM IR. LLVM types are not C/C++ types. This leaves you with no type system capable of doing the kind of thing you want to do :)

Good point.

One minor caveat: ~16 years ago, in C99, C got variable-length arrays (VLAs). If you use the sizeof operator on a VLA, it is evaluated at runtime.

Of course that doesn't change your argument; I just wanted to mention it for completeness' sake.
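For completeness, a tiny sketch of the VLA case. The function name is made up, and it assumes a compiler that supports VLAs (they became optional in C11).

```c
#include <stddef.h>

/* sizeof applied to a variable-length array cannot be folded by the
   compiler; it is computed when the function runs. n must be > 0. */
static size_t vla_bytes(size_t n)
{
    int scratch[n];          /* length known only at run time */
    return sizeof scratch;   /* n * sizeof(int), evaluated at run time */
}
```

Calling `vla_bytes(3)` and `vla_bytes(10)` returns different values from the same compiled code, which is not possible for sizeof on an ordinary type.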

Sounds somewhat similar to the IBM AS/400 (renamed many times). Applications were shipped as byte code and translated to the local machine architecture. The translated byte code was appended to the application, a bit like NeXT fat binaries or OS X's universal binaries.

The native byte code was 64bit, though iirc the first implementation used a 32bit address. The result was you could ship an application once and when it came time to move architectures all you had to do is one command to retranslate the byte code.

Very neat. I'm not sure why this approach hasn't been used more often.

EDIT: Looks like they're going exactly the AS/400 route.

Actually AS/400 byte code has 128 bit pointers. It was originally implemented on a 48 bit processor (weird 70's architectures FTW!).

I stand corrected. Thanks.

AS/400 had some really cool technology built into it.

The book "Inside the AS/400" has some really cool exposition on the unique aspects of the AS/400. If I remember right, the disks and memory are in a single address space, and of course, there is a relational database built in to the operating system.

I have seen the binary translation in action: it was required when POWER6 machines came out. On a POWER5 machine, I remember it taking around 20 minutes for an application that was around 20 MB zipped, but for software licensing reasons not all of the server models exposed the full performance of the processor, so that might not be representative. And at any rate, like the OP said, it's a one time thing, and then the translated version is kept.

There was the OpenGroup's Architecture Neutral Distribution Format, but I'm not sure if it was ever fully realized. Around the time that was proposed some real serious energy was put in to C compiler optimization, across the industry.

It does free them up in the future: they can make bigger hardware changes and just have to supply a bitstream player for the older code. I have a difficult time envisioning the architecture changes that make tons of sense to Apple right now though, other than like swapping video on their ARM chips and stuff like that. Running iOS apps on the desktop might make a ton of sense for some of them, or some sort of tweener between the iPad and MacBook.

We were thinking on the same lines. IBM System/38 was truly a brilliant, integrated design in so many ways. Even had capability-security built into the hardware. Check it out at the reference below. Intel made a monster of a system too. Would love to have either on a SOC or an inexpensive server.


You mean just like Java? The performance overhead of Java versus native code is pretty negligible at this point, and Swift is pretty similar to Java 8. I suspect if Apple wanted to go the intermediate compilation route, they would have just used Java.

I suspect Bitcode is more about helping developers ship a single binary that works across 7 or 8 devices with slightly different ISAs. App thinning is a big push for Apple right now because app sizes are getting out of control with the number of variants that developers are required to compile to. Developers compile to Bitcode, then upload to Apple, who then compiles multiple versions of the machine code, using the App Store to download the correct bundle for the device.

Every single binary ever emitted by LLVM has been Bitcode during the compilation process. The only new element here is that Apple's asking for this intermediate product in order to be on the App Store.

Yep, and what I'm saying is that Bitcode doesn't help portability; only in producing device-optimized binaries for variants of the same ISA. The end goal here is reducing app size, not making it easier to port things to a new ISA. A new ISA wouldn't be impossible -- Apple is one of the few companies to successfully navigate this transition -- but IMO Apple would be better served on the Mac side by lower-power x86 parts (which are finally available) than higher-performing ARM CPUs.

In my industry, app data size swamps app executable size by a factor of about 2000, so all this app thinning stuff seems useless (to me).

That's not universally true (as you seem to indicate already). For example, I included Google's WebRTC libraries in an iOS app and it bloated the binary by several hundred megabytes per architecture. Thinning that would help a lot.

Resources are a big part of it too, of course. A typical iOS app these days has three copies of every image, for 1x, 2x, and 3x resolutions. Dropping that down to one copy on the device is helpful. Of course this doesn't require any of this bitcode stuff, although neither does thinning the executable.

FFmpeg on Android also tends to make apps ridiculously huge when needed due to the need to support multiple architectures. Worse if you want something like x264 as well.

Well-optimized, lean apps often use cache and registers more efficiently. This brings performance boosts. That can save real $$$ on hardware or just let you do more with it.

TAOS, what a blast from the past. That system was cool. Sad that it never came to be.

Note that gcc considers their monolithic design a feature to encourage companies to contribute back code rather than a painful lesson to be learned from...

http://gcc.gnu.org/ml/gcc/2004-12/msg00888.html https://gcc.gnu.org/ml/gcc/2007-11/msg00460.html

The article was talking about the Bitcode design in LLVM, not the plugin situation. Yes, gcc considered making plugins hard a feature, but as for a design that separates the frontend and backend in order to make each more modular, then gcc in fact does have such a thing, gimple,


I'm far from an expert, but from what I've seen and heard, gimple is pretty good for what it does. In other words, the author of the article is wrong to say that LLVM learned from a painful lesson in this area.

When I read something like this my mind wanders and I imagine a future MacBook Pro that, instead of a single Intel processor, contains many ARM processors. 10-15 ARM processors acting as one could offer a whole lot of performance when you needed it (use all of them), and a whole lot of energy saving when you didn't (use 1 of them). With the current trend of multi-core CPUs, I see this as the ultimate form of the architecture.

Now, whether Apple will do something like this or not is anyone's guess, but it's nice to dream of the possibilities. :)

>I imagine a future MacBook Pro that, instead of a single Intel processor, contains many ARM processors. 10-15 ARM processors acting as one could offer a whole lot of performance when you needed it (use all of them), and a whole lot of energy saving when you didn't (use 1 of them). With the current trend of multi-core CPUs, I see this as the ultimate form of the architecture.

Doesn't make sense. What's the difference between "10-15 ARM processors" and 4-8-12 Intel based cores?

I mean apart from the fact that we aren't going back to multi-processor architectures, since there's no benefit from that compared to cores (latency, etc).

I wonder how Bitcode will play with profile-guided optimization. Will you also provide PGO information to Apple, or will they generate it?

My suspicion is that bitcode allows the App Store team to provide Watch/iOS specific binaries for an individual device. Right now the solution is to create fat binaries that eat precious space. The watch being even more constrained can use all the help it can get.

I wonder if this could also be a way to protect compiler-tech and silicon trade secrets, even after they're widely used in the field? Perhaps only Apple ever compiles the final, deployed versions of apps.

Yeah, and their patent applications telegraph the iMac with the fiber-optic shell that's been just around the corner for 10 years.
