Apple aims to sell Macs with its own chips starting in 2021 (bloomberg.com)
482 points by blopeur on April 23, 2020 | 533 comments

It's worth pointing out how extremely far ahead Apple seems to be in terms of CPU power and efficiency. It tends to go under the radar a bit because it's not easy to run your own software on an iPhone, and most apps on these things are not serious workhorse loads (a few specific use cases are, most are not).

The Apple A13, even as implemented in the iPhone SE, achieves single-core performance in microbenchmarks that is on par [1] with the Core i7 8086k [2] and Ryzen 9 3950X [3]. In principle, that's the highest single-core performance you can buy in a PC.

I don't have to explain how insane it is that a 5-ish watt smartphone CPU delivers that kind of performance, even if it is in bursts. By sticking to Intel or even x86 in general, there is ample evidence Apple is leaving a lot of performance on the table. Not just in MacBooks - but for the Mac Pro too.

[1]: https://browser.geekbench.com/v5/cpu/search?utf8=%E2%9C%93&q... - the iPhone 11 with the A13 has to do as a surrogate while the SE is just out, they benchmark the same.

[2]: https://browser.geekbench.com/v5/cpu/search?utf8=%E2%9C%93&q...

[3]: https://browser.geekbench.com/v5/cpu/search?utf8=%E2%9C%93&q...

It's worth noting that Geekbench is a pure microbenchmark. The iPhone will not sustain performance as long as the others. The point is that Apple could solve this when moving to bigger devices.

> It's worth pointing out how extremely far ahead Apple seems to be in terms of CPU power and efficiency.

Honestly, I'm not convinced they're THAT far ahead. They don't have a lot of the legacy baggage Intel has to contend with, and they're the only company making high end ARM chips (besides Amazon and a few other weird server implementations), but being able to match big core i7s in some benchmarks single threaded is to a large extent something that Intel's own low power chips can also do, at least burstily.

There are a lot of challenges to big many-cored chips beyond single-core performance, and we really don't know where they are with that yet, as there are no publicly-available examples of Apple desktop chips.

> and they're the only company making high end ARM chips

That's, if you'll pardon the pun, an Apples to Oranges comparison. Apple isn't making "high end ARM chips" either if your comparator is powerful servers. You need to look at what Apple is doing within their power envelope and compare that to what everyone else is doing within _their_ power envelopes. The A13 Bionic is an uncooled 6W TDP chip blowing past 95W base TDP chips that require hefty active cooling.

To add to that, if you look up the benchmarks of the i7 8500Y, which is Intel's top shipping 5W offering, you see it is vastly slower than the A13: a third slower in single core, and less than half the performance in multicore.

This whole thread reminds me of how passionately the PowerPC enthusiasts were defending it as superior, right before Apple switched to Intel and doubled Mac performance overnight.

> This whole thread reminds me of how passionately the PowerPC enthusiasts were defending it as superior, right before Apple switched to Intel and doubled Mac performance overnight.

To be fair, the PowerPCs _were_ measurably better and faster when each one was released. Apple just couldn't get a G5 CPU that would fit into a laptop, and IBM was an unreliable partner with a slow release cycle, so by the time the transition happened they had fallen behind.

The PowerPC G3 and G4 were great chips. The G3 was much better than competing x86 chips, and G4 beat contemporary ones consistently too.

The G5 wasn't great though. When Steve Jobs announced it, Apple already showed it only trading blows with the then-current Pentium 4 (a Pentium 4 - they sucked!). And a few months after the first Power Mac G5s were launched, they were already soundly beaten by the new Athlon 64s [1].

Add to that the fact that the G5 was basically a POWER4 server chip and that IBM was only going to build server chips in the future, and Apple basically had no choice. It had nothing really to do with PowerPC vs x86, and more to do with what kind of processors their suppliers were willing to build.

[1]: https://web.archive.org/web/20050605023250/https://www.pcwor...

Apple's volume was not enough to warrant investing in a G5 laptop. IBM offered Jobs the Cell and Jobs went elsewhere to get a better deal from Intel.

POWER is very much alive in the high-end server space, powering IBM's p and i series of machines.

Oddly enough, shortly _after_ the Intel transition was announced, IBM announced a PPC970 that would fit in a laptop power envelope.

I don't think it was ever used for anything.

That was probably the last PPC they did and was probably ready when Apple made the announcement. It just wasn't worth it for IBM to invest in workstation-class PowerPC chips with laptop power envelopes. The only other use for PowerPCs, from their PoV, was their own workstations and those could use the higher end POWER chips.

IBM continued making PowerPC G3 derivatives for a while for Nintendo. Nintendo ended Wii U production in January 2017, which I guess would mark the end of IBM's production of PowerPC as well.

The Wii U, for all its flaws, is probably the most "practical" Power-based machine you can get nowadays, given relative power, availability, size and price. A 1.2GHz triple core PowerPC G3 would probably still eke out Raspberry Pi 3 like performance. Shame the Linux port to it never really got off the ground (also partially due to IBM's hackjob of an SMP implementation for the G3).

Isn’t that, essentially, why everyone is now expecting Apple to switch away from Intel? The chipmaker is doing too little, too late?

>There are a lot of challenges to big many-cored chips beyond single-core performance, and we really don't know where they are with that yet, as there are no publicly-available examples of Apple desktop chips.

The A12X from 2018's iPad Pro is the sort of chip I'd expect to see in an ARM laptop, and its multicore scores are similar to the top end of 2018 Macbook Pros.

2020's iPad revision didn't get much in the way of processor improvements (just one more GPU core), so the new 16" MBP has pulled ahead with an 8-core i9, but when we get a new iPad based on an A13X or A14X I expect it to be back in that range again.

And these are in thin fanless tablets. With a proper cooling system, there's got to be some extra juice to be squeezed out of them.

I wonder if anyone's ever done A-series chip "overclocking", or at least manual overvolting with active cooling. It'd be cool to see what kind of performance increase might be available.

> Being able to match big core i7s in some benchmarks single threaded is to a large extent something that Intel's own low power chips can also do, at least burstily.

You're not wrong. In general, it's true that a microbenchmark amplifies the Apple A13's strengths due to power limits. The assumption I make is that microbenchmarks indicate the true peak performance of Apple's architecture, and as power limits become a smaller constraint when Apple uses its chips in laptops and desktops they will make available that performance in a more sustained way.

But even low power Intels don't compare that favourably. Intel's new i7 10510U delivers very nice single core performance [1]. But it's worth noting that 1) that still does not quite match the A13's burst performance 2) that chip is still rated for a power profile much larger than the A13's and 3) as always in these discussions - Intel's "TDP" is a marketing term not a power limit. At high turbos the chip is permitted to consume quite a bit more power than the 15W it's rated for.

This particular Intel chip boosts to 4.90GHz. For Apple chips, even stuff like clockspeeds are a matter of conjecture, but Wikichip without a source claims that the A13 tops out at 2.65GHz [2] which if true indicates a lot more thermal and frequency headroom in bigger form factors.

I just benchmarked my MacBook Pro+Safari in Jetstream 2.0 [3] - not a microbenchmark - and it scored nearly 145 compared to the nearly 130 the iPhone 11 scores [4]. That's with a "45W TDP" Core i7 8850H topping out at 4.3GHz. It's hard to benchmark iPhones well, but all evidence points to the fact that they are actually really fast.

[1]: https://browser.geekbench.com/v5/cpu/search?utf8=%E2%9C%93&q...

[2]: https://en.wikichip.org/wiki/apple/ax/a13 - worth noting that high-end Qualcomm SoCs also operate at comparable frequencies.

[3]: https://browserbench.org/JetStream/

[4]: https://www.anandtech.com/show/14892/the-apple-iphone-11-pro...

I tried this benchmark in Chrome on 3 computers:

* 6 year old Macbook Pro i7-4980HQ, Windows 10 - 102

* 5 year old Macbook Pro i7-5557U, OS X - 100

* Threadripper 2990WX desktop - 99

So, uh, I might have some questions about this benchmark's general validity now?! - though maybe it is some evidence in favour of my vague feeling that the Threadripper sometimes doesn't feel as fast as it seems like it ought to feel.

The Jetstream benchmark is made by the WebKit team. Its scores will vary per browser, so you have to compare browser to browser. I compared an iPhone 11 running Safari to a MacBook Pro running Safari. Apples to apples.

Moreover, while the Threadripper 2990WX is a really awesome processor, single core benchmarks (I think Jetstream is mostly limited to a single thread) aren't particularly its strength. Over multiple runs it should beat your Macbook Pro, but not by a huge amount. If not, take a look at how you're cooling that beast :)

I updated a six core i7 3930k with a sixteen core 1950x threadripper on one of my boxes. Biggest reason for the update was the disk IO (m2 drives) and number of cores and I'd had the 3930k since the launch day at Microcenter. On Windows, single threaded at stock speeds, the cores were comparable from a 6-8 year gap between the two CPUs. For hosting virtual machines... the speed did not matter as much as having an abundance of physical cores. Still - I was not expecting the 'core speed' as reported by a video game to be as close as it was.

The Zen2 (3900x) core speed on the other workstation reported as almost twice as fast, with 12 cores. Really wish that TR4 board supported the 39xx threadripper series.

iOS Safari and desktop Safari aren't identical.

Okay, thanks? Two actual apples are never identical either, but they're still more comparable than to oranges.

I don't think a threadripper would be expected to have particularly good single-threaded performance?

Zen 2 was a big uplift in single-threaded performance.

Ryzen and Threadripper 1000- and 2000-series, and Ryzen Mobile < 4000-series are all on Zen or Zen+ architecture.

The current gen Ryzen and Threadripper 3000-series and the Ryzen Mobile 4000-series are the ones running on Zen 2. This is where AMD is competitive with Intel on single-threaded workloads, largely across the board.

Parent mentioned a 2990WX, which is a Zen+ part.

Oh, right; didn't realise Zen 2 ones actually existed yet.

It's the opposite. The latest generation of Threadripper is not only ahead in multicore performance, it is also comparable to the highest single-core performance available in the regular Ryzen lineup.

>besides Amazon and a few other weird server implementations

One of those 'few other' was Scaleway, but they recently buckled and abruptly ended their ARM server lineup [1]. They were running Marvell ThunderX SoCs (up to 64 cores, 128GB RAM).

So Amazon might soon take over the ARM server market, i.e. at least until Tim Cook does a Satya Nadella and brings in Apple IaaS with ARM CPUs.


> and they're the only company making high end ARM chips

I'm typing this on a Surface Pro X running an ARM64 CPU called SQ1, which is a customization of a Snapdragon 8cx. It is quite high end and is not made by Apple. It might not be as amazing as the custom CPUs they have in iOS devices, but it is still a pretty good CPU.

I mean, the Surface Pro X chip is fine, but it would be hard to call it high end; it significantly lags the usual Intel chips used in tablets and small laptops on performance, especially single core. The newest Apple ones are competitive with those or beat them.

When your cheapest smartphone is faster than the fastest competitor smartphone, I’d say you are far ahead. But your point on how well that will apply in desktop PCs is certainly valid.

> By sticking to Intel or even x86 in general, there is ample evidence Apple is leaving a lot of performance on the table.

That isn't necessarily true. Having competitive performance at lower power isn't always the same thing as having better performance, even assuming these benchmarks are representative.

Processors designed specifically for low-power make different design trade offs. One of those is to exchange maximum clock speed for IPC (because higher clocks burn watts). The A13 maxes out at 2.65GHz, the i7 8086k hits 5GHz. Chances are you can't just give the A13 a 95W power budget and see it hit 5GHz, it would have to be redesigned and the kinds of changes necessary to get there would generally lower IPC.
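
To put a rough number on that trade-off, here is a small sketch. It assumes the usual first-order model that single-thread throughput is IPC times clock frequency, which ignores memory stalls, turbo behaviour and everything else that makes real chips messy. The clock figures are the ones quoted in this thread; the function name is made up for illustration.

```python
def required_ipc_ratio(fast_clock_ghz: float, slow_clock_ghz: float) -> float:
    """IPC advantage the lower-clocked core needs to match the higher-clocked one.

    Assumes throughput = IPC * clock frequency, a deliberate simplification.
    """
    if fast_clock_ghz <= 0 or slow_clock_ghz <= 0:
        raise ValueError("clock speeds must be positive")
    return fast_clock_ghz / slow_clock_ghz


A13_CLOCK_GHZ = 2.65       # figure quoted in the thread
I7_8086K_CLOCK_GHZ = 5.0   # figure quoted in the thread

ratio = required_ipc_ratio(I7_8086K_CLOCK_GHZ, A13_CLOCK_GHZ)
print(f"The A13 needs roughly {ratio:.2f}x the IPC to match at these clocks")
```

Under that model, parity in a single-core benchmark at those clocks implies the lower-clocked core is doing nearly twice the work per cycle, which is the IPC side of the trade being described.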

Apple is also riding the same advantage as AMD -- they're using TSMC's 7nm process which is better than what Intel is currently stuck with. Even AMD is still using an older process for the I/O die. We don't know what that's going to look like a year or two from now.

Meanwhile the renewed competition between Intel and AMD makes this kind of a bad time to move away. They're both going to be working hard to take the performance crown from each other and Apple would have to beat both of them to claim an advantage. And continue to do so, or they'd have a lot of pissed off customers and developers after forcing a transition to a new architecture only to have it fall behind right after the transition is over.

>Chances are you can't just give the A13 a 95W power budget and see it hit 5GHz, it would have to be redesigned and the kinds of changes necessary to get there would generally lower IPC.

It's not really a difficult concept to understand. If your CPU runs at 5GHz then the maximum time a single cycle is allowed to take is 0.2 nanoseconds. CPU designers have to make sure that this limit is never exceeded anywhere on the chip. If you make even the slightest mistake in some unimportant corner of the CPU you will end up limiting the maximum performance of the entire CPU.
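
The 0.2 nanosecond figure is just the reciprocal of the clock frequency. A minimal sketch of that arithmetic, with a helper name invented for illustration and the two clock speeds taken from this thread:

```python
def cycle_time_ns(clock_ghz: float) -> float:
    """Length of one clock cycle in nanoseconds for a clock given in GHz."""
    if clock_ghz <= 0:
        raise ValueError("clock speed must be positive")
    # 1 GHz is one cycle per nanosecond, so the period in ns is 1 / GHz.
    return 1.0 / clock_ghz


for name, ghz in [("5 GHz part", 5.0), ("2.65 GHz part", 2.65)]:
    print(f"{name}: {cycle_time_ns(ghz):.3f} ns per cycle")
```

So a design targeting 2.65GHz has nearly twice as long per cycle for its slowest path to settle, which is part of why it can't simply be clocked up to 5GHz without being reworked.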

Most CPUs are optimized for a specific clock frequency and going beyond it is not possible without sacrificing stability.

I'm always a bit suspicious about x86 vs. non-x86 micro-benchmarks. I remember all the fun people had with ByteMark back in the day, and while I assume that Geekbench doesn't play those sorts of games with compiler optimizations, I would really like to see data from something a bit more representative of a real-world CPU-bound application (short-lived is fine, just not synthetic).

Geekbench isn't a micro benchmark, it's a comprehensive test of the system using a variety of programs and workloads and aggregates the results. It's not a single program that one can play games with compiler optimizations.

I should have clarified that Geekbench can be more accurately described as a set of microbenchmarks. It does test a lot of different kinds of performance, but it does not test sustained performance.

I don't like arguing semantics but I don't really think any of what Geekbench does is a "micro" benchmark. At least for me that typically refers to running a small snippet of code, like calculating a dot product or something. Geekbench tests whole program performance.

It's not aida64 but it is a pretty decent metric, and consistent.

> It's worth pointing out how extremely far ahead Apple seems to be in terms of CPU power...

I agree that Apple's ARM CPUs are very competitive on simple scalar instructions and memory latency/bandwidth. However x86/x64 CPUs have up to 512 bit wide vector instructions and many programs use vector instructions somewhere deep down in the stack. I guess that the first generation of Apple ARM64 CPUs will offer only ARM NEON vector instructions which are 128 bit wide and honestly a little pathetic at this point in time. But on the other hand I am very excited about this new competition for x86 CPUs and I will for sure buy one of these new Macs in order to optimize my software for ARM64.

Also, vector instructions don't do that well on laptops: they get thermally throttled, which makes them less useful https://amp.reddit.com/r/hardware/comments/6mt6nx/why_does_s...

I am more than a little naive on the subject, but is it possible that the vector instructions could be farmed out to a co-processor that is dedicated to that kind of workload? I suspect that the rich instruction set leads to higher transistor count and density(?true?) and thus higher TDP?

Would love to learn more from sources if people might provide a newb an intro.

The vector instructions can't really be farmed out because they can be scattered inline with regular scalar code. A memcopy of a small to medium-sized struct might be compiled into a bunch of 128bit mov for example and then immediately working on that moved struct. If you were to offload that to a different processor waiting on that work to finish would stall the entire pipeline.

Could the compiler create a binary that had those instructions running on multiple processors? I see now I have some googling/reading to do about how you even use multiple processors (not cores) in a program.

That's what we call the magic impossible holy grail parallelizing compiler.

Good to know before I run off looking for the answer :)

The technological knowledge to do this is years and years away.

> The vector instructions can't really be farmed out because they can be scattered inline with regular scalar code.

If you believe this, you won't believe what's in this box[1].

[1]: https://www.sonnettech.com/product/egfx-breakaway-puck.html

> A memcopy of a small to medium-sized struct might be compiled into a bunch of 128bit mov for example and then immediately working on that moved struct

I'm not sure that's true: rep movs is pretty fast these days.

> If you believe this, you won't believe what's in this box[1].

There's a fundamental difference between GPU code and vector CPU instructions, though. GPU shader instructions aren't interwoven with the CPU instructions.

Yes, if you restrict yourself to not arbitrarily mixing the vector code with the non-vector code, you can put the vector code off in a dedicated processor (GPU in this case). The GP explicitly stated that a lack of this restriction prevents efficiently farming it off to a coprocessor.

> I'm not sure that's true: rep movs is pretty fast these days.

That's only true if you target Skylake and newer. If you target generic x86_64 then compilers will only emit rep movs for long copies due to some CPUs having a high baseline cost for it. There's some linker magic that might get you some optimized version when you callq memcpy, but that doesn't help with inlined copies.

I think people with computers more than five years old already know that their computer is slow.

Why exactly do you think seven-years-old is too-old, but five-years-old isn't?

That is irrelevant. The default target of compilers is some conservative minimum profile. Any binary you download is compiled for wide compatibility, not to run on your computer only.

That’s different. Rendering happens entirely on the GPU, so the only data transfer is a one-way DMA stream containing scene primitives and instructions.

There's absolutely no reason it _has_ to be one-way: It's not like the CPU intrinsically speaks x86_64 or is directly attached to memory anyway. When inventing a new ISA we can do anything.

And if we're talking about memcpy over (small) ranges that are likely still in L1 you're definitely not going to notice the difference.

By definition a co-processor won't share the L1 cache with another processor.


Then you will face the same problems that GPUs suffer from. Extremely high latency and constrained memory bandwidth. Sending an array with 100 elements to the GPU is rarely worth it. However, processing that array with vector instructions on the CPU is going to give you exactly the speedup you need because you can trivially mix and match scalar and vector instructions. I personally dislike GPU programming because GPUs are simply not flexible enough. Either it runs on a GPU or it doesn't. ML runs well on GPUs because graphics and ML both process big matrices. It's not like someone had an epiphany and somehow made a GPU incompatible algorithm run on a GPU (say deserializing JSON objects). They were a perfect match from the beginning.
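
A toy cost model makes the "100 elements is rarely worth it" point concrete. Everything numeric here is an assumption picked for illustration (1 ns per element on the CPU, 0.05 ns per element on the GPU, a 10 microsecond round trip); real figures vary enormously by hardware and workload, and the function names are hypothetical.

```python
from typing import Optional


def offload_wins(n_elements: int, cpu_ns_per_elem: float,
                 gpu_ns_per_elem: float, round_trip_ns: float) -> bool:
    """True if shipping the work to the coprocessor beats doing it in place."""
    cpu_time = n_elements * cpu_ns_per_elem
    gpu_time = round_trip_ns + n_elements * gpu_ns_per_elem
    return gpu_time < cpu_time


def break_even_elements(cpu_ns_per_elem: float, gpu_ns_per_elem: float,
                        round_trip_ns: float) -> Optional[float]:
    """Array size at which both options cost the same, or None if offload never wins."""
    if gpu_ns_per_elem >= cpu_ns_per_elem:
        return None
    return round_trip_ns / (cpu_ns_per_elem - gpu_ns_per_elem)


# Assumed, illustrative numbers only.
CPU_NS, GPU_NS, ROUND_TRIP_NS = 1.0, 0.05, 10_000.0

print(offload_wins(100, CPU_NS, GPU_NS, ROUND_TRIP_NS))        # small array
print(offload_wins(1_000_000, CPU_NS, GPU_NS, ROUND_TRIP_NS))  # large array
print(break_even_elements(CPU_NS, GPU_NS, ROUND_TRIP_NS))
```

With these made-up numbers the fixed round-trip cost dominates until the array reaches roughly ten thousand elements, which is the shape of the argument above: the per-element speedup only matters once the transfer overhead has been paid for.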

This is not an area of expertise for me, so is there a reason to not offload vector processing to the GPU and devote the CPU silicon to what it's good at, which is scalar instructions?

There are many reasons. The latency of getting data back and forth to the GPU is a pretty high threshold to cross before you even see benefits, and many tasks are still CPU bound because they have data dependencies and logic that benefit from good branch prediction and deep pipelines.

Many high compute tasks are CPU bound. GPUs are only good for lots of dumb math that doesn't change a lot. Turns out that only applies to a small set of problems, so you need to put in lots of effort to turn your problem into lots of dumb math instead of a little bit of smart math and justify the penalty for leaving L1.

Yes, communications overhead. SIMD instructions in the CPU have direct access to all the same registers and data as regular instructions. Moving data to a GPU and back is a very expensive operation relative to that. The chips are just physically further away and have to communicate mostly via memory.

Consider a typical use case for SIMD instructions - you just decrypted an image or bit of audio downloaded over SSL and want to process it for rendering. The data is in the CPU caches already. SIMD will munch it.

For certain professions like media editing, vector instructions help. But for your average Facebook / Netflix / Microsoft Word user, the kind of user that 95% of users are, there is less benefit from vector instructions.

Are you saying Facebook, Netflix and Microsoft Word don't require media processing? Pretty sure you'd see plenty of SIMD instructions being executed in libraries called by those applications.

AVX is widely used in things as basic as string parsing. Does your application touch XML or JSON? Odds are good that it probably uses AVX.

Does your game use Denuvo? Then it straight-up won't run without AVX.

People are stuck in a 2012 mindset that AVX is some newfangled thing. It's not, it's used everywhere now. And it will be even more widely used once AVX-512 hits the market - even if you are not using 512-bit width, AVX-512 adds a bunch of new instruction types that fill in some gaps in the existing sets, and extend it with GPU-like features (lane masking).

Are you saying that iPhones and iPads are bad at Facebook, Netflix, and Microsoft Word? If they are, the end user certainly can’t tell. If they aren’t, then it doesn’t really matter does it?

Phones are much more reliant on having hardware decoders for things like video while desktops can usually get away with a CPU-based implementation, yes.

Sure but the same is true about performance in general.

That's not really true. Single-threaded scalar performance is still super important for the everyday responsiveness of laptop/desktop systems. Especially for applications like web browsing which run JavaScript.

Your UI is slow because of IO and RAM and O(n^2) code, not CPU. Look at your activity monitor.

> It's worth noting that Geekbench is a pure microbenchmark.

This isn't just a note, it's an important clarification.

Microbenchmarking is used in lieu of proper benchmarking because you can't do proper benchmarking.

How was Apple able to get here? What secret sauce does Apple have in its chip that's beating out Qualcomm, Intel and AMD?

Apple is willing to trade die space (i.e. manufacturing cost) for performance in a way that their competitors are not.

Anandtech architecture reviews are helpful, as always. Worth reading for a page or two from the linked page:



The short of it is: massively wide execution units, massive amounts of SRAM, massive amounts of cache at all levels. There's no real "secret sauce", they're just willing to pay to make an incredibly fat core and Qualcomm and company are not.

They have 16 MB of system cache on A13 for 2 high-performance and 4 low-performance cores, which is as much as a 9900K gets for 8 cores and as much as Zen2 gets for 4 cores. Plus another 8MB per big core on top of that (so up to 24MB per core in single-threaded mode), and 4 MB per small core.

It helps that they're a vertically-integrated company, they don't have to sell their processors on the open market at competitive prices such that an OEM can also make a profit selling a finished product at competitive prices, they just sell the finished product.

It's understandable that Apple's A-series CPUs beat other ARM chips. But I can't see why they're competitive with big Intel/AMD chips despite their lower core clocks. Is x86-64 really the problem?

I don't buy it until I see some more well done tests.

Something like Cinebench run 10 times in a row, taking the average of the results, would be more meaningful.
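
For what it's worth, the harness for that kind of test is simple. Below is a minimal sketch: `toy_workload` is a hypothetical stand-in for a real CPU-bound benchmark (it is not Cinebench), and the helper name is invented. Comparing the first and last runs alongside the mean gives a crude hint of thermal throttling.

```python
import statistics
import time


def run_repeated(workload, runs=10):
    """Time `workload` back to back `runs` times; return per-run durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        durations.append(time.perf_counter() - start)
    return durations


def toy_workload():
    # Hypothetical stand-in for a real CPU-bound job such as a render.
    return sum(i * i for i in range(200_000))


durations = run_repeated(toy_workload, runs=10)
print(f"mean:  {statistics.mean(durations):.4f} s")
print(f"first: {durations[0]:.4f} s, last: {durations[-1]:.4f} s")
```

A workload this short will not heat anything up; the real test would need runs long enough for the device to reach its sustained power limit.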

Also, the benchmark has to enable or disable optimization on all platforms. Some people on reddit claim that Geekbench is highly optimized for ARM and less optimized for X86.

That is something I've been wondering too. Is such a huge difference in perf/watt due to the ISA?

I think you are misreading the L2 sizes. It looks like 1x8MB L2, 1x4MB L2, and 1x16MB system cache. So you are correct that a lone thread could get up to 24MB of cache, but that's not per core. It's a total of 28MB of cache on the die.

Zen2 has 2x16MB L3 and 2x4x512KB L2 per chiplet (36MB) so it's not like Apple is throwing down afore-unheard-of quantities of SRAM. It's true a single A13 thread has much more accessible L2 capacity, though.
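
Spelling out the arithmetic from the two paragraphs above, using only the cache sizes quoted in this thread (variable names are just for illustration, and the A13 breakdown is the reading given above, not an official spec):

```python
# A13 figures as read in the comment above, in MB.
A13_BIG_CLUSTER_L2_MB = 8
A13_SMALL_CLUSTER_L2_MB = 4
A13_SYSTEM_CACHE_MB = 16

a13_total_mb = A13_BIG_CLUSTER_L2_MB + A13_SMALL_CLUSTER_L2_MB + A13_SYSTEM_CACHE_MB
# A lone thread on a big core can reach the big-cluster L2 plus the system cache.
a13_lone_thread_mb = A13_BIG_CLUSTER_L2_MB + A13_SYSTEM_CACHE_MB

# Zen 2 chiplet as quoted: 2 x 16 MB L3, plus 2 x 4 cores with 512 KB L2 each.
zen2_l3_mb = 2 * 16
zen2_l2_mb = 2 * 4 * 512 / 1024
zen2_chiplet_total_mb = zen2_l3_mb + zen2_l2_mb

print(f"A13 total: {a13_total_mb} MB, lone thread: {a13_lone_thread_mb} MB")
print(f"Zen 2 chiplet total: {zen2_chiplet_total_mb:.0f} MB")
```

So the totals are 28MB versus 36MB, while the per-thread picture still favours the A13 as noted.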

Apple is arguably making some better design trade-offs. This is possible due to them making money on the whole computer/device, and not just the CPU as Intel/AMD/others need to do. So while the others are doing everything possible to both minimize die space and likely keep around a bunch of compatibility cruft that could probably go, Apple is free to go in a different direction.

The 5775c (https://wccftech.com/intel-broadwell-core-i7-5775c-128mb-l4-...) was a good example of a no-compromises (from a cache standpoint) CPU from Intel that just annihilated their other CPUs at the time... it's not that hard to do, provided you're willing to pay the price for it somewhere else.

Two reasons for x86 (IMO) and three reasons for ARM.

I've thought for years that the overhead of the extra decoder hardware and legacy cruft was non-trivial (though Intel claims that's not true). The evolution of ARMv8 (where ARM went much closer to its RISC roots) seems to disagree. This explains the performance per watt issue (and potentially some IPC).

That said, scaling IPC (instructions per clock) seems to have a pretty big limit. x86 has basically hit a wall, and it has taken lots of time and research for small gains. Additionally, the biggest challenge in large systems is that the cost to do a calculation on some piece of data is often less than the cost to move that data to and from the CPU. As Apple increases cache size and frequency and starts dealing with bigger interconnect issues, I suspect we'll see a distinct damper on their performance gains.

Qualcomm (and ARM as the designers of the core) has a very different problem to solve. They can't make money off of software. They make money when they sell new chips and they make more money from new designs than from old ones. This means incremental changes to ensure a steady revenue stream. Since Apple having a fast, proprietary CPU doesn't actually affect Qualcomm or ARM, they most likely don't even see themselves as in direct competition. Most people buy Android or iOS phones for reasons other than peak CPU performance and Qualcomm is fairly competitive with a lot of these (esp actual power usage).

A further complication is that they also need "one design to rule them all". They can't afford to make many different designs, so they make one design that does everything. Apple doesn't need to spend loads of time and money trying to optimize the horrible aarch32 ISA. Instead, they spend all that time on their aarch64 work. ARM and Qualcomm however need to add that feature so the markets that want it still buy their chips.

Apple shipped their large 64-bit design only a couple years after the ISA was introduced. Put simply, that is impossible. It takes 4-5 years to make a new, high-performance design. It took ARM 3 years for their small design (basically upgrading the existing A9 to the new ISA) and closer to 4.5 years to actually ship their large design (A57) and another year for a "fixed" design (A72, though it's actually a different design team and uarch). Though the gap has been closing, 2.5 years in the semicon business is an eternity.

A crufty ISA and non-CPU scaling problems seem to explain Intel/AMD. A late start, bigger market requirements, and perverse incentives against increasing performance seem to explain ARM/Qualcomm.

It's hard to buy that the ISA really has anything to do with it. As you mention, Apple has a fairly narrow market target for their cores. Both Intel and AMD are basically building server cores (can you say threading?) and selling them as client devices, mostly because that is where the real money is for them. Apple, OTOH, is building a client/mobile core, and they benefit from a number of "features" that they enable, which are known performance problems in the desktop/etc space but continue for legacy reasons. Combine that with Intel basically standing still for the last ~5 years, and the tables have reversed as far as who is ahead on process+microarch.

Basically, a lot of Apple's advantages are:

1: Complete vertical control of compiler+OS+hardware

2: Plenty of margin to spend on extra die

3: More advanced process @ TSMC

4: Very narrow focus. Apple has only a few models of iPhone+iPad, whereas Intel has dozens of different dies they modify/sell into hundreds of product lines, so everything is a compromise.

Any one of those four gives them a pretty significant advantage; the fact that they benefit from all four cannot be discounted.

aarch32 decode is far less complex than x86 and aarch64 is even less complex than that. On the power consumption side, decoders definitely make a difference. They use tons of power and a huge number of instructions means having a huge, power-hungry instruction decode cache.

In addition, complex instruction decoding requires more decode stages. This isn't a trivial cost. Intel can shave off several stages if they have a decode cache hit and that's not including the ones that are required regardless (even the simple Jaguar core by AMD has 7+ decode stages possible). Whenever you have a branch miss, you get penalized. Fewer necessary decode stages reduces that penalty.

OTOH, you have x86 using what is effectively a compressed instruction encoding, and a trace cache (although it's advantageous enough that ARM designs are apparently using them now too), which reduces the size of the icache for a given hit rate. So the arch loses a bit here and gains a bit elsewhere. It's the same thing with regard to TSO: a more relaxed memory model buys you a bit in single threaded contexts, but frequently TSO allows you to completely avoid locks/fencing in threaded workloads, which are far more expensive.

So people have been making these arguments for years, frequently with myopic views. These days what seems to be consuming power on x86 are fatter vector units, higher clock rates, and more IO/Memory lanes/channels/etc. Those are things that can't be waved away with alternative ISAs.

If it were vectors, clocks, and memory, then Atom would have been a success, but even stripping out everything resulted in a chip (Medfield) that under-performed while using way too much power.

Either the engineers at Intel and AMD are bad at their job (not likely) or the ISA actually does matter.

Atom is a success, just not where you think it is. The latest ones are quite nice for their power profile and fit into a number of low-end edge/embedded devices in the Denverton product lines. Similarly, the Gemini Lake cores are not only in a lot of fairly decent low-end products (pretty much all of Chuwi's product lines are N4100 https://www.chuwi.com/), but they also make perfectly capable, very low cost digital signage devices/etc.

So not as sexy as phones, but the power/perf profiles are very competitive with similar ARM devices (A72). If you compare the power/perf profile of a Denverton with a part like the SolidRun MACCHIATObin, the Atom is way ahead.

Check out https://www.dfi.com/ for ideas of where Intel might be doing quite well with those Atom/etc. devices.

Conversely, if the instruction set was the main factor, you'd expect Qualcomm and Samsung also to have ARM processors with a similar power to performance advantage over Intel chips.

The reality is just that Apple is ahead in chip design at the moment.

They are 2 years behind Apple and slowly catching up.

When Medfield came out, Apple didn't have its own chip and x86 still lost. It was an entire 1.5 nodes smaller and only a bit faster than the A9 chips of the time (and only in single-core benches). The A15 released not too long after absolutely trounced it.

>When Medfield came out, Apple didn't have its own chip

>It was an entire 1.5 nodes smaller and only a bit faster than the A9 chips of the time

You seem to have the chronology all mixed up here. Medfield came out in 2012. The A9 came out in 2015. Apple was already designing its own chips in 2012. (The A4 came out in 2010.)

> Apple shipped their large 64-bit design only a couple years after the ISA was introduced.

Actually ARM cores were available earlier than that, just nobody wanted to license them until the elephant in the room (Samsung) forced everybody to follow.


2011 -- ARM announces 64-bit ISA

2012 -- ARM announces they are working on A53 and A57 and AMD announces they'll be shipping Opteron A1100 in 2014.

2013 -- The Apple A7 ships doubling performance over ARM's A15 design.

2013 -- Qualcomm employee leaks that Apple's timeline floored them and their roadmap was "nowhere close to Apple's" (Qualcomm seems to switch to A57 design around here in desperation -- probably why the 810 was so disliked and terrible).

2014 -- Apple ships the A8 improving performance 25%.

early 2015 -- Samsung and Qualcomm devices ship with A57. Anandtech accurately describes it saying "Architecturally, the Cortex A57 is much like a tweaked Cortex A15 with 64-bit support." Unsurprisingly, the performance is very similar to A15.

late 2015 -- Apple ships A9 with a 70% boost in CPU performance.

later 2015 -- Qualcomm ships the custom 64-bit Kryo architecture as the 820. It regresses in some areas, but offers massive improvements in others for something close to a 30% performance improvement over the 810 with A57 cores.

2016 -- AMD finally launches the A1100. ARM finally ships the A72 as their first design really tailored to the new 64-bit ISA.

Final Scores

Apple -- 2 years to ship new high-performance design

ARM -- 4 years to ship high-performance design, 5 years for new design

Qualcomm -- 4.5 to 5 years to ship new high-performance design

Sorry, something's definitely fishy. Nobody can design and ship that good of a processor in less than 2 years.


Isn't the current Intel Core line an evolution of the Pentium M (2003), itself an evolution of the Pentium III (1999)? Apple starting a fresh design with up to date constraints may have given them room for improvements I guess.

That doesn't explain why AMD and Qualcomm can't keep up either.

Sandy Bridge was the last really big architectural change. It seems heavily inspired by the Alpha EV8 design (Intel bought Alpha from HP in 2001) with, of course, a very different decoder section (they wrap the x86 decoder around a RISC architecture).


Vertical integration? Even if their chips end up being a bit slower in the end, they'd probably increase their overall profit margins and get increased flexibility aligning their development cycles.

>It's worth noting that Geekbench is a pure microbenchmark. The iPhone will not sustain performance as long as the others. The point is that Apple could solve this when moving to bigger devices.

Is that inherent to the architecture or is this a self-imposed limitation by Apple since it has to sip power and run without any active cooling?

Also, the A-series chips seem to fall down in comparison against the Intel Macs on multi-core performance which seems like it would matter for anyone who needs a desktop.

> extremely far ahead Apple seems to be in terms of CPU power and efficiency

RISC finally coming into its own.

For the longest time, Intel was able to fend off much better architectures simply by being a fab generation or two ahead, more clock speed, more transistors, more brute force.

Not to belittle the engineers wringing out seemingly impossible performance from the venerable architecture, but the architectural limitations always mean extra work and extra constraints that have to be worked around.

And now that Dennard scaling is dead, Moore's law wheezing and not helping all that much for our mostly serial workloads, they just can't compensate for the architecture any longer, at least not against a determined, well-funded and technically competent competitor that's not beholden to Wintel.

I remember when the Archimedes came out and just offered incredibly better performance than the then prevalent code-museums, 386 and 68K variants, at incredibly lower transistor counts. The 486 and 68040 were able to compete again, but with vastly larger transistor budgets (and presumably power budgets as well, but we didn't look at that back then).

Oh, and can we have our Transputers now? Pretty please, Xmos?

I'm not sure how you can call that a victory. ARM and AMD are only ahead because of superior manufacturing processes. That's exactly the thing you accused Intel of doing.

How will this change relying on libraries like Intel's MKL library if Apple is using their own chips?

One of the general downsides of Apple development is how often they change absolutely everything. Like the time they switched from 68K to PowerPC. Or MacOS classic to Mac OS X. Or PowerPC to Intel.

Apple provides brief windows of automated compatibility, but code untouched since 2003 won’t run on a mac today, and code untouched since 1986 wouldn’t run on a 2003 mac.

Apple will say developers should have used Core Whatever instead of Intel libraries.

I'm not sure the difference is so great. It's all about the node about the node... (to the tune of "all about the bass" :)

I just got a 2020 MacBook Air with an Ice Lake 10th generation 10nm X64 chip in it.

It's a quad-core, newer core rev, has AVX2 and a bunch of other stuff.

It runs very noticeably cooler than a 2019 Air with a 14nm Amber Lake chip in it. Battery life is also noticeably better. When I read the spec differences I basically traded my 14nm Amber Lake Air for a 2020 Air (and also because of the better keyboard). It's a great machine.

The difference between the Amber Lake and Ice Lake core designs is not substantial and the Ice Lake has twice the cores and a larger on-board GPU, so how is it so noticeably cooler? I can fully max out all four cores of the 10nm Ice Lake and it doesn't get as hot as the 14nm Amber Lake did at lower loads! The answer is obviously the process node: 10nm vs 14nm. It's a more efficient chip at the physical circuit level.

ARM has some intrinsic power advantages over X64. The biggest thing is that ARM instructions come in only two or three sizes and it's easy to size and decode them, while X64 instructions come in sizes from one byte to 15 bytes and are a massive pain to decode. That decode cost comes in energy and transistors, but it's worth pointing out that this is a mostly fixed cost that shrinks as a percentage of the overall CPU power/transistor budget as the process node shrinks. In other words the cruftiness of X64 remains the same as you go from 22nm to 14nm to 10nm.
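To make the decode point concrete, here is a toy sketch in Python. The encoding is made up for illustration (each instruction's first byte simply states its total length); it is not real X64 or ARM encoding, and the function names are hypothetical. It only shows why instruction boundaries are free to compute with fixed-width instructions, but have to be discovered one after another when lengths vary.

```python
def fixed_width_boundaries(code: bytes, width: int = 4) -> list[int]:
    """Start offsets for a fixed-width ISA: known without reading any byte."""
    return list(range(0, len(code), width))


def variable_length_boundaries(code: bytes) -> list[int]:
    """Start offsets for a toy variable-length ISA.

    Assumption (hypothetical encoding): the first byte of every instruction
    holds that instruction's total length in bytes.
    """
    offsets = []
    pos = 0
    while pos < len(code):
        length = code[pos]
        if length < 1 or pos + length > len(code):
            raise ValueError(f"bad length byte at offset {pos}")
        offsets.append(pos)
        # The next start is only known after this instruction has been sized.
        pos += length
    return offsets


print(fixed_width_boundaries(bytes(16)))  # [0, 4, 8, 12]
toy_code = bytes([1, 3, 0, 0, 2, 0, 5, 0, 0, 0, 0])
print(variable_length_boundaries(toy_code))  # [0, 1, 4, 6]
```

In the fixed-width case every decoder lane can be pointed at its instruction up front; in the variable-length case each start depends on the previous one. That serial dependency is presumably part of what real variable-length decoders spend extra hardware to work around.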

Other than the ugly decode path the ALU, FPU, vector, crypto, etc. silicon is not fundamentally different from what you'd find in a high-end ARM chip. A lot of the difference is clearly in the fabrication. ARM chips have been below 14nm for a while, while Intel X64 chips have lagged.

(Tangent: since the actual engine block is largely the same, I've wondered if Apple might not slap an X64 decoder in front of their silicon in place of the ARM64 decoder and make Apple X64 chips?! I am not a semiconductor engineer though, so I don't know how hard this would be and/or what IP issues would prevent this. Probably unlikely but not impossible. They certainly have the cash and leverage to muscle Intel into licensing anything they need licensed.)

If Intel gets its act together with process nodes and/or starts using other fabs who are at tighter lower power nodes, the advantage will shrink a lot. AMD already has mobile chips that are close to Apple's ARM chips in performance/watt, partly because they are fabbed at TSMC at 7nm.

BTW: I'm not claiming Apple's chips aren't impressive, and as long as they don't lock down MacOS and make it no longer a "real computer" I personally don't mind if they go to ARM64. Also: the fact that Apple's chips only get this great performance in bursts is mostly due to power and cooling constraints on fanless thin phones and tablets. In a laptop with better cooling or a desktop they could sustain that performance no problem.

> Core i7 8086k

TIL of this commemorative naming. Sole comment I could find on HN 2 years ago https://news.ycombinator.com/item?id=17409849

> It's worth pointing out how extremely far ahead Apple seems to be in terms of CPU power and efficiency.

Nope. But Apple put a MASSIVE amount of cache on their chips.

It's not rocket science, but it is expensive.

>> By sticking to Intel or even x86 in general, there is ample evidence Apple is leaving a lot of performance on the table. Not just in MacBooks - but for the Mac Pro too.

So we should expect lock-in on Mac Pro chips as well. Exactly what people were asking for. Another brick you can't upgrade without paying the cost of a new machine.

having faster processors is what's wrong with our computing industry in general. processors on laptops | desktops have become 1000x faster, but has the software kept up?

no, software has gotten slower over the ages. Apple, Qualcomm, Intel can make ever faster processors with x cores, but do we have software able to utilize those? eh! run a JS heavy site | app and you see most processors heat up these days, whether mobile or desktop.

most programming languages can't easily delegate work to cores with smoothness like how Erlang n Elixir do it. in Python, threads were a nightmare, but now with concurrent.futures or dask at least we can utilize all cores.
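for anyone who hasn't tried it, a minimal sketch of the concurrent.futures route looks something like this. the prime-counting workload and the function names are made up for illustration; the only library pieces are `ProcessPoolExecutor` and its `map` method from the standard library.

```python
from concurrent.futures import ProcessPoolExecutor
import os


def count_primes(limit: int) -> int:
    """Count primes below `limit` by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits: list[int]) -> list[int]:
    """Run one count_primes call per input, spread across worker processes."""
    # Processes rather than threads: in standard CPython the GIL keeps
    # threads from running Python bytecode on several cores at once.
    with ProcessPoolExecutor(max_workers=os.cpu_count()) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    print(count_primes_parallel([50_000, 60_000, 70_000, 80_000]))
```

the `__main__` guard matters on platforms that start workers by re-importing the script.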

tldr - we need to make faster software

It's not even "faster software" so much as eliminating the culture of Developer Convenience at the expense of User Experience. That's what got us Electron. I've actually seen comments on HN unironically describing the web as "the perfect app platform".

Unfortunately, the majority of users seem conditioned to accept software with awful performance, so there's no impetus for developers to upgrade their skills.

It's not about "developer convenience". It's about "developers are expensive".

I can build an application in electron 4-10x faster (at least) versus building the same application in C. If I'm costing a company $100-200 per hour, would they rather pay me for 4 months (500 hours and $50,000-100,000) or would they rather pay me for 1-2 years (2-4k hours) at a cost of $200,000 - $800,000?

What about when we multiply that by a team of 5-10 people? Don't forget that time to market is often incredibly important. Tell them 2 years and 8 million or 4 months at 1 million and what will they say?
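For anyone who wants to check the arithmetic, here is a small sketch using only the hours and rates quoted above. The function name and its parameters are just for illustration, and the 10-person team is the top end of the "5-10 people" range.

```python
def cost_range(hours_low, hours_high, rate_low=100, rate_high=200, team_size=1):
    """Return (cheapest, priciest) total cost in dollars for the whole team."""
    low = hours_low * rate_low * team_size
    high = hours_high * rate_high * team_size
    return low, high


# One developer
electron_solo = cost_range(500, 500)    # (50000, 100000)
native_solo = cost_range(2000, 4000)    # (200000, 800000)

# Ten-person team
electron_team = cost_range(500, 500, team_size=10)    # (500000, 1000000)
native_team = cost_range(2000, 4000, team_size=10)    # (2000000, 8000000)

print(electron_solo, native_solo, electron_team, native_team)
```

The "2 years and 8 million or 4 months at 1 million" figures are the upper ends of the two team ranges.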

You might build an application faster in electron...

OTOH, C isn't a GUI development environment. If you want to compare a C-based environment, you compare it with GTK/Qt/WinForms/etc.

In the end, as someone who has written GUIs in a wide range of tooling, I'm not sure there really is that much difference.

I've yet to see an Electron application with 1/2 the functionality of similar native applications. Electron maybe gets you bootstrapped faster, but then you bog down in basic data manipulation and functional behavior, because it turns out HTML/CSS/JavaScript are absolutely terrible for building rich GUIs, even now, 20+ years after people first tried to do it. There are so many things people took for granted in the past (ex: grids with arbitrary sort, editing, and a scrollbar that represents where in the data you are) that are far more difficult in HTML than they are in more native solutions. Plus, the scalability is miserable: take your favorite framework and have it load 10k rows of data into a table. That was something you could do in VB/Delphi in the mid-1990s on a 486 in a matter of seconds. This is why pagination is so popular. Half a meg of actual data bloats up into half a gig when you try rendering it in Chrome/etc., so you're forced to leave it on the server and round trip for tiny bits.

It is because you want to hire inexpensive web developers to develop for desktop and get the check box ticked. Your average bootcamp webshit doesn't even know Big O notation. I don't expect them to be as productive as good developers either.

The web ecosystem is a big mess where trends change every month, and working in the web ecosystem requires looking things up a lot because no one bothers to master the thing. It doesn't help that many web developers don't have solid foundations.

This is a trap managers generally fall into. Cheap developers aren't equivalent to competent developers, and their incompetence will cost you more than what you save by hiring them instead of a competent developer.

Don't worry, the future is wearables and more efficient mobile phones. Electron just won't run on your Apple Watch.

The only reason Electron isn't on your Apple Watch today is because of App Store Review.

"No one will ever need more than 640K of RAM." - Definitely NOT Bill Gates.

The mobile equivalents of Electron (PhoneGap/Cordova/Ionic) are already extremely popular.

Never used electron but I did see threads on how it’s very bloated. Are there other issues?

Electron is Chrome and Chrome itself is pretty tightly optimised for what it does. The issue isn't actually Electron so much as using a platform designed for typesetting for making complex GUI apps, a task for which it was never designed and isn't particularly good at.

But. That said. Whilst I'm no big fan of web apps, there are good reasons they're so prevalent. It's not merely about developer convenience. Native GUI toolkits can appear artificially performant because they're required by the OS and thus almost always resident, vs cross platform toolkits that may be used by only one app at a time. When you open the lid though the gap between an engine like Blink and something like Qt, JavaFX or Cocoa isn't that big. They're mostly doing similar things in similar ways. The big cost on the web is the DOM+CSS but CSS has proven popular with devs, so native toolkits increasingly implement it or something like it.

> Chrome itself is pretty tightly optimised for what it does.

My laptop battery would disagree. I get about an hour and a half with Chrome/Electron, four or five without.

Which laptop is that?

Note: I didn't say the web itself is highly optimal. Just that for a web browser, Chrome is pretty thoroughly optimised.

It appears to me that Chrome is perfectly optimized for speed at the cost of memory and power.

Sort of reminds me of Braess's paradox: you add a new road and then overall traffic slows. More bandwidth? Larger videos. More roads? More cars. Faster processors? More abstractions.

Also the Jevons paradox: the more efficient you make a process or resource, the more consumption of it will grow.

>tldr - we need to make faster software

Just use compiled, strongly typed languages.

No: Javascript, Python, PHP, Ruby
Yes: Java, C#, C, C++, Rust.

Imagine one day we get a computer able to perform operations unimaginably faster than what we have now. It would process Snapchat's dog filters in femtoseconds, firing up and tearing down millions of kubernetes clusters every frame (because it would be easier to write it that way). Would it still make sense to try and optimize software? Wouldn't it be more constructive to solve real-world tasks instead?

Hardware is fast and cheap, and it's getting even faster and cheaper. It's perfectly fine to utilize this power, if it makes developing products faster, easier or cheaper.

Now, there are still cases when you need to send a machine to roam the mountains of another planet. This may justify doing some assembly.

> It would process Snapchat's dog filters in femtoseconds, firing up and tearing down millions of kubernetes clusters every frame (because it would be easier to write it that way).

No, it wouldn't. Those tasks would just become less efficient with time as developers stopped caring to optimize them, as has happened with the overwhelming majority of consumer software for the past several decades.

They would work well enough on the computers of that generation and painfully slow on today's supercomputers. Yes, just like the software we have today.

I was trying to express that Electron (and the like) is not an inherently bad thing. It lets you trade hardware capacity for an easier development experience. Those developers who use it create useful software that works. And software that works in a given environment is exactly the point of the industry, is it not?

Not to mention that with the latest few generations of CPUs Intel has also increased burst/decreased base clock while promising unrealistic power draws.

In other words, Intel's CPUs are mostly about "burst" performance these days, too. They don't get anywhere near the promised peak performance for any significant amount of time.


Oh boy, here comes the AMD peanut gallery.

The conclusion of your article is precisely about how motherboard manufacturers don't obey the official/nominal behavior and how that makes it irrelevant:

> Any modern BIOS system, particularly from the major motherboard vendors, will have options to set power limits (long power limit, short power limit) and power duration. In most cases, at default settings, the user won’t know what these are set to because it will just say ‘Auto’, which is a codeword for ‘we know what we want to set it as, don’t worry about it’. The vendors will have the values stored in memory and use them, but all the user will see is ‘Auto’. This lets them set PL2 to 4096W and Tau to something very large, such as 65535, or -1 (infinity, depending on the BIOS setup). This means the CPU will run in its turbo modes all day and all week, just as long as it doesn’t hit thermal limits.

Intel desktop processors will sustain boosts for an arbitrary amount of time. Yes, they will exceed the nominal TDP while doing so, but so do AMD processors (AMD's version of "PL2", which they call the "PPT limit", allows power consumption up to 30% higher than the nominal TDP while boosting, and there is no official limit on how long this state may last).
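To illustrate how the limits in the quoted passage interact, here is a deliberately simplified Python model. It is an assumption for illustration, not the actual firmware algorithm: the chip tracks an exponentially weighted moving average of its power draw with time constant Tau, may draw the short limit (PL2) while that average is below the long limit (called PL1 here), and is otherwise held to PL1. The wattages are made-up example values.

```python
def simulate_turbo(pl1, pl2, tau, seconds, dt=1.0):
    """Return the power drawn at each time step under sustained full load.

    Simplified model (an assumption, not the real firmware behavior): the
    moving average of power has time constant `tau`; while it is below
    `pl1` the chip may draw `pl2`, otherwise it is held to `pl1`.
    """
    avg = 0.0          # start from an idle package
    alpha = dt / tau   # weight of the newest sample in the moving average
    trace = []
    for _ in range(int(seconds / dt)):
        power = pl2 if avg < pl1 else pl1
        avg += alpha * (power - avg)
        trace.append(power)
    return trace


# Hypothetical 65 W part that may boost to 90 W.
short_tau = simulate_turbo(pl1=65, pl2=90, tau=28, seconds=600)
huge_tau = simulate_turbo(pl1=65, pl2=90, tau=65535, seconds=600)
print("seconds at PL2 with tau=28:   ", short_tau.count(90))
print("seconds at PL2 with tau=65535:", huge_tau.count(90))
```

In this model a short Tau ends the boost after a few tens of seconds, while a very large Tau keeps the chip at PL2 for the whole run, which is the effect the 'Auto' settings described above are going for.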

These limits are of course observed much more strictly on laptops since power/thermal constraints actually make a real-world difference there. But overall, Ice Lake perf/watt is competitive with Renoir and its IPC and per-core performance is actually higher than Renoir. You just get fewer cores.

While I don't know much about phones and suppose you are correct there, I have been disappointed with the hardware that they pushed to their macs the last 5 or so years. For instance, why did they ship newer macs without upgrading to the newest Intel chips?

I guess you can solve anything if you do it yourself, but that was perplexing for me that they would do that. I am not an expert on this, but this happened some time between 2016–2018.

Edit: I stuck with my upgraded 2013 model and even today it's fast and good for the job. Even my 2009 mac is still running. So I am not saying they can't pull it off, I am just wondering whether they give enough attention to their non-iPhone products.

Since I was downvoted, here are some discussions on what happened circa 2016–2018:




Don't bother with the links, the comments are more informative. In some of the comments people explain why it's Intel's fault and not Apple, in some comments people explain processor speeds. A lot of it is about the keyboard that they changed, but that is not related to this. My point was more that I think Apple should put the amount of effort into their laptops that they did in 2009, whether or not they do is subjective until we see their new processors.

"newest" might be unfair. I imagine there's a lot of product life-cycle / driver testing to ensure high quality.

Whereas other PC shops typically will just throw PCs together and assume they'll fix issues found in patches.

That's not going to fly for "just works" Apple product image - and why it's positioned and priced as a "luxury" product.

Well, I've purchased MacBook Pros, iPhones, and iPads in the past, and every single one got incrementally more sluggish with each OS update until it was unusable. This was in 2013-2015. It's been 5 years since I've owned an Apple product, but I just have a hard time believing Apple is head and shoulders above everyone else.

1. ARM-based CPUs have for years (decades) been much more power-efficient than Intel and AMD CPUs. This is regardless of Apple, and is not really about being ahead - it's a different path in the design space.

2. If Apple were "far ahead" in terms of performance in general, they would probably have been using these chips in products which aren't smartphones.

wrt point 2. I always assumed that it's not only chip performance or efficiency, but ecosystem and software compatibility that's a larger issue.

It's not trivial to port x86 software to ARM, or even run an energy efficient emulator.

Is that correct?

I believe you are correct. I suspect a big part of Apple's increased popularity since moving to x86 has been the ability to run an alternate OS either via dual-boot or in a VM at native speeds. While it was possible to run Windows in a VM back in the PowerPC days, it was sloooow. While switching to ARM would be relatively trivial for Apple's own software and even OS X applications, the downside would be losing easy/performant/power efficient access to non-OS X applications. That said, I expect Apple to make that trade-off sooner rather than later... x86 compatibility isn't nearly as important today as it was 10 or even 5 years ago.

In most cases it's fine since Linux distros for arm have existed for a while. Getting proprietary vendors to support it is a different question.

> because it's not easy to run your own software on an iPhone

Then why do I need all this performance? Perhaps to run js on shitty websites. Well, maybe for gaming.

Alternatively, when you don't need to be running full throttle, you get better battery life. Which any consumer will appreciate. And is also perhaps an environmental win, if it allows them to get away with smaller batteries.

That said, the experience of trying to browse a local newspaper's website with NoScript turned off on my 2015 MacBook Pro tells me that, yes, JS on shitty websites is a problem. And not one that most people would find to be particularly avoidable. Heck, even GMail is getting to be noticeably slow on that computer.

The extreme slowness of the modern web honestly is a reason. I do a lot of my personal web browsing on an elderly (nearly 6 year old) iPad Air 2. For everything but the web, and for old websites, it's fine. For the "100MB of react crap" type of website, it's getting pretty painful, tho.

My personal laptop is a Lenovo X200s from 2008 or 2009 with 8GB RAM. I use it to write words, read PDFs, do some software development (most often C and "vanilla" web stuff), design electronics in Kicad, and casually browse the web... and I'd be happy with it as my only computing device forever, except for some websites it's starting to be a little slow, especially a well-known site that's meant for exchanging messages under 500 characters. There's some irony there.

Mine is an X220s, so slightly newer than yours, but still a modest i3, upgraded to 8GB RAM, 128GB SSD and dual-band WLAN. With a new 9-cell battery, I get ~6 hours of battery life, and it's fine for general browsing with a decent number of open tabs, even a few games (Darkest Dungeon and some emulated SNES games).

As long as it keeps ticking and I can get whichever spare parts I need on eBay or something, I'm not going to replace it anytime soon.

The only reason I have a more powerful desktop PC (still ~2011 vintage) and don't just use the X220s in a dock, is that it struggles with a 1440p external monitor (full HD is fine, though) and I sometimes like to play more graphically intense games.

>As long as it keeps ticking and I can get whichever spare parts I need on eBay or something, I'm not going to replace it anytime soon.

Good luck with that as the Javascript crowd are in competition to slow the web as much as they can.

I generally try to stay away from the worst offenders, most sites I use are relatively lightweight, like HN and various forums. Even Youtube isn't too bad, as long as you force it to not use the broken Polymer rendering on Firefox.

Do you use an ad blocker like uBlock Origin? That speeds up the web by a significant margin. I also used that to disable JS by default and enable it only when needed, which works well for me.

You will never be able to outrun bad software with good hardware. At best it's a rat race where devs target the nth percentile machine and performance is effectively auction priced, so you have to buy an above "average" machine.

Just say No to bad software, starting with web ads.

>You will never be able to outrun bad software with good hardware.

We are doing just that. And one of the reasons is that better software costs more than faster hardware.

Not that these Apple chips will run x86 or x64 code. Unless they bother with a Rosetta 2.0, gaming is out of the question if Apple makes this jump.

The Catalina move already killed off a lot of games where the developers didn't provide an updated (x86_64) binary. A move to ARM would kill more, of course, but it's not like there's not precedent.

I'm half-convinced that Apple killed 32bit support so early precisely to see how developers coped; if it had been really bad they could have re-introduced it in a point release of Catalina. As it was, it wasn't very bad and most developers complied, which is an argument in favour of an ARM transition being feasible.

The only other reason I can think of to so aggressively move to 64bit is security, but most of the apps that were stuck on 32bit were not that big a security concern.

Having worked at big companies, I start to suspect that many recent Apple deprecations are detached from technical reasons or customer scenarios. They are playing internal games and don't care if it makes sense nor will they have any interest in reversing or revisiting the wrong call later on.

I am not just talking about 32-bit support. It shows up in a lot of random libraries that wind up deprecated and replaced with something less capable. That's a pattern I have seen a lot elsewhere and it's usually a bad sign for overall product quality.

Yes, I think the true reason Catalina is so incompatible with old programs is that they wanted to get the big compatibility break out of the way before they announce the ARM transition, which would then not look like such a big step.

This does not make sense. I can give a counter-example: the education market, especially the higher-education market. This market has existed for so long and aged so well that, in real life, enterprise-level deployments of Macs most likely exist mostly in this market now. The downside of this market is that it is slow or reluctant to change. Those who make research-related or educational software are never quick enough to make the big jump for an architecture change, and they also might not be able to hire more people to do it. The customers also hate those kinds of changes, both the IT departments and the researchers: nobody wants to find that their code can't run properly anymore on these new machines.

But that is already true. You cannot buy a Mac any more which runs 32-bit software. So Apple is obviously willing to make things miserable for a considerable part of their user base, me included. I will avoid anything with Catalina, because I still have a very few 32-bit programs I want to run and can't upgrade. As long as macOS runs on x86 hardware, there is not much justification for such a break. Yes, they clean up the software stack, but at a very high price.

The only good reason I can imagine is that when Tim Cook announces macOS on ARM, he will claim "runs everything that runs on Catalina".

Doesn’t that invalidate your earlier claim that “it wasn't very bad and most developers complied, which is an argument in favour of an ARM transition being feasible”? It doesn’t matter how nice the experience is for those who upgrade if a significant number of people avoid upgrading because the experience would be terrible. That’s selection bias.

Sure, Coke sales are down 50%, but the customers who are buying New Coke say they like it just as much as the old recipe!

Which earlier claim of mine? Are you mistaking me for another poster? My point was that they already had the breaking change, so the change for the Catalina users - which certainly is only a part of the Mac users, many stayed on Mojave because of the 32bit support - will be smooth.

Well, if you need to run diverse software, macOS isn't the best choice.

The best bet is Windows since it would happily run software made 25 years ago in Windows 95 time.

To use macOS, you have to be satisfied by using the apps which run on macOS: Adobe apps, Apple apps, Microsoft Word and open source software.

Not sure what you mean by "diverse". There is a lot of legit macOS software, which is 32 bit only and no longer updated - for example because the company went out of business or couldn't justify the effort for a port. Cutting support off for these programs is a harsh step. While I can understand that Apple doesn't want infinite backwards compatibility, it hits a lot of users. One reason for this might be the preparation for the bigger switch to a new cpu architecture.

Apple's market is: creatives, iOS developers, some other developers and Mac enthusiasts.

Enterprise is Windows territory, most businesses are Windows territory, education is Windows territory, most home users / small businesses are also Windows territory.

So if Adobe apps will have an ARM build, that will satisfy a huge part of their user base. The rest would use Apple tools which will get ARM builds and open source tools which already have or will have ARM builds.

> I'm half-convinced that Apple killed 32bit support so early precisely to see how developers coped; if it had been really bad they could have re-introduced it in a point release of Catalina.

Is it really so easy to introduce 32-bit support back?

I mean, it depends on exactly what they did to get rid of it. For many Linux distros you add 32bit support back by "apt-get install glibc-x86" or similar.

_If_ they were taking the approach of a deliberately early deprecation, which it seems like they were given the timing relative to the rest of the industry, it would only make sense to make it be an easily reversible decision.

Have you looked at the iOS App Store lately? There are tens of thousands of games there. Apple Arcade has a small (100+) but well-curated selection of good games that run on both iOS and MacOS.

Oh, you meant PC games? Sure, the Mac only has a small percentage of (for example) Steam games, but that percentage is steadily rising - it's now over 25%. Switching architecture is unlikely to present a major problem for most developers, especially given that they're probably using Unity or Unreal Engine.

>Switching architecture is unlikely to present a major problem for most developers, especially given that they're probably using Unity or Unreal Engine.

As a former game developer I can tell you that that is an issue. Apple is totally against using cross platform tools. They break compatibility as much as they can.

Framework change, architecture change and so on.

Instead of going with OpenGL ES, Vulkan, OpenCL they made Metal.

If the user base is large enough, as is the case with iOS, there's an incentive to go through the pains of releasing for that platform. But that isn't the case with macOS. Maybe for Adobe it is worth it to spend resources to build software for macOS, but for other companies that might not be the case.

Anyway, you make much more money by targeting PlayStation and Xbox, and the resources needed are the same in money and man-hours, so it makes sense to target macOS last, if ever.

And no, not everybody is using Unity and Unreal. That is true mostly for indies.

Aren't apps already submitted as bitcode? They don't have to emulate, just recompile.

LLVM bitcode is still architecture-specific. That is, bitcode generated targeting x86 is not compatible with bitcode targeting ARM, which is not compatible with bitcode targeting aarch. The reason they are distributed as bitcode is to enable additional optimisations based on what exact chip will run the code (i.e. a -march=native equivalent).
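One reason bitcode stays tied to a target can be sketched in C++ (an illustration added here, not taken from the thread or the linked article): by the time a C or C++ front end emits IR, it has already picked preprocessor branches and fixed ABI-dependent type sizes for one target, so those answers are baked into the bitcode. `TargetFacts` and `target_facts` are hypothetical names.

```cpp
#include <cstddef>

// Facts that a C/C++ front end resolves for one specific target
// before any LLVM IR or bitcode is produced.
struct TargetFacts {
    const char* arch;               // selected by the preprocessor
    std::size_t pointer_bytes;      // fixed by the target ABI
    std::size_t long_double_bytes;  // differs between common ABIs
};

inline TargetFacts target_facts() {
#if defined(__x86_64__) || defined(_M_X64)
    const char* arch = "x86_64";
#elif defined(__aarch64__) || defined(_M_ARM64)
    const char* arch = "arm64";
#elif defined(__arm__) || defined(_M_ARM)
    const char* arch = "arm32";
#else
    const char* arch = "other";
#endif
    // sizeof is a compile-time constant, so the emitted code contains
    // plain numbers here, not a question to be answered later.
    return TargetFacts{arch, sizeof(void*), sizeof(long double)};
}
```

Recompiling from source re-evaluates all of this for the new target; retargeting existing bitcode does not.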


This article appears to successfully statically translate arm64 binary into an x86_64 binary using bitcode.

Using LLVM IR merely solves the problem of being able to compile to a specific instruction. You still need a compatibility wrapper for all the platform specific APIs like wine or Windows on Windows. If Apple puts in the effort then it might work out.

On Mac? I don’t think so, no, because you can move App Store apps to different machines on USB keys.

Also, a considerable number of Mac apps don’t use the App Store for distribution.

No, the Mac compiler target has never supported bitcode. And LLVM bitcode doesn’t work like that anyway.

Or run virtual machines like Android users do.

And more future-proof too.

A lot of people who don't need it already buy it for the image and social status. Apple doing this for workstation-type machines too just makes even more sense.

I remember watching this video on the Intel 8086 by Harvard Business School, and they mentioned how revolutionary it is from a business perspective because it took a vertically integrated market and made it a horizontally integrated market.

(I can't find the video after a cursory search...)

Is this the reversal of this revolution? Are we going back to a vertically integrated market due to consolidation in market players or because of performance / power concerns? Everyone seems to be making their own chips and boards these days. Google / TPU, AWS / Graviton, Microsoft / SQ1...

Will we ever see a fragmentation in ISAs a la EEE? IMHO that would be a catastrophic regression in the software space, easily a black swan event, if you, say, needed to compile software differently between major cloud vendors just to deploy.

In a certain way, this move is horizontally integrated. Intel vertically integrated its chip design and fab. This moves design and fab apart. One of the reasons that Intel is falling so far behind is that they can't keep up with TSMC (and maybe others as well) on the fab side.

Intel's vertical integration worked well for them for so many years. However, the crack has been around long enough for others to start muscling in. AWS can push Graviton because Intel has been stuck at 14nm for so long (yes, they have some 10nm parts now, but it's been limited). Apple can push a move to ARM desktop/laptops because Intel has stagnated on the fab side.

I wouldn't say this is a reversal of that revolution as much as a demonstration of the power and fragility of vertical integration. Intel's vertical integration of design and fab gave them a lot of power. Money from chip sales drove fab improvements for a long time and kept them well ahead of competitors. However, enough stumbling left them in a fragile place.

I think part of it is that ARM also has reference implementations that people can use as starting blocks. I don't know a lot about chip design myself, but it seems like it would be a lot easier to start off with a working processor and improve it than starting from scratch.

I think we're just seeing a dominant player that no one really likes stumble for long enough combined with people willing to target ARM. Whether I run my Python or C# or Java on ARM or Intel doesn't matter too much to me and if AWS can offer me ARM servers at a discount, I might as well take advantage of that. Intel pressed its fab advantage and the importance of the x86 instruction set against everyone. Now Intel has a worse fab and their instruction set isn't as important anymore. They've basically lost their two competitive advantages. I'm not arguing that Intel is dying, but they're certainly in a weaker position than they used to be.

I think a larger piece is that Intel was able to jump from desktops to laptops but not cellphones. This meant TSMC simply had more economy of scale to push fabs further.

I never understood why Intel gave up on smartphone chips. I remember thinking at the time how that would bite them in the next 10 years.

I had an ASUS ZenFone 2 powered by Intel. It was fantastic! I didn’t notice any problems or major slowness compared to a Qualcomm chip. To me it seemed like they had a competent product they could iterate on. And they just canceled the program, how short-sighted!

I mean, maybe I’m wrong here and there isn’t really money in that business.

>I never understood why Intel gave up on smartphone chips.

Margins. They were so focused on margins that they didn't realise their moat had cracked once they let go of it. 1 billion+ smartphone / tablet SoCs, modems and many other pieces of silicon are now fabbed at TSMC. None of these existed 10 years ago. It's just like the PC revolution: while Sun and IBM were enjoying the booming workstation and server markets, x86 took over the PC market segment, and slowly over the years took over the server market.

The same could happen to Intel; this time it is ARM, and it will likely take 10-15 years.

And I keep seeing the same thing happen over and over again: companies are so focused on their current market and short-term benefits and margins that they fail to see the bigger picture. Both Microsoft and Intel are similar here. (And many other non-tech companies.)

>Both Microsoft and Intel are similar here. ( And many other non tech companies. )

Microsoft kind of moved from their traditional market. Now they make more money from services and cloud.

They missed the boat on mobile, though, by focusing too much on Windows

That's a big list of those who missed the mobile boat:

Microsoft, Intel, RIM, Nokia, HP, MIPS, Sony, HTC

And these are just a few of the big players.

And even if Microsoft lost with Windows on phones, they still try to make apps. I am currently using Edge on Android because its built in ad block is quite good.

People also forget that Intel had a pretty decent line of ARM CPUs right before the time the iPhone was released (XScale). And they gave up on those too, just in time for the market for smartphone CPUs to explode.

Not only pretty decent — XScale was the best and fastest ARM for PDA-size devices.

The only reason Intel killed XScale was that it wasn't x86, it came through an acquisition, and they were afraid to cannibalize their own x86-based mobile plans.

Turns out it would have been far better to disrupt yourself rather than let others do it.

Weird since Intel tried themselves to kill x86 with IA-64.

IA-64 is the "INTEL Architecture 64", xScale is the non-Intel architecture.

A childish NIH driven decision that destroyed their future.

Intel tried to replace x86 with Itanium. XScale was just a thing they did on the side.

Because they won't have to share IA-64 with AMD.

ARM has many other licensees so Intel didn't want to continue

Their hearts were never really in XScale. Intel only ended up having that group due to a bizarre legal settlement with DEC, and pawned it off to Marvell in some misguided attempt at corporate streamlining.

Should also point out that the lead on the DEC StrongARM (which became Intel XScale) was Dan Dobberpuhl, who founded PASemi, which Apple bought to help their own chip development work. In between he also co-founded SiByte which made some of the best MIPS chips ever made.

I get thinking phones were too small of a market or margins were too thin back then. But you would think when the smartphone market got bigger and prices went up someone would have re-evaluated that decision.

Executives are regularly incentivized to focus on short term profits over long term company survival. I call this process “bonuses today, layoffs tomorrow”.

Intel was actually subsidizing smartphone vendors who wanted to use Intel chips. Unfortunately, hardly any smartphone vendor took up this offer (except for some small experiments like the mentioned ASUS ZenFone 2).

>I never understood why Intel gave up on smartphone chips.

They didn't sell enough. Phone makers, software developers and users settled on ARM.

Software developers? A large percentage of those developers would not have known the difference since these phones ran Android and the VM abstracted this away. Only for games and some other apps that use the NDK would this have made a difference.

ARM was well established for smartphones and handheld computers well before the HTC Dream ever existed.

I don't understand this comment. Who is talking about when ARM was established? Android runs on the vast majority of the world's smartphones and smart devices TODAY and was compatible with Intel's (x86) mobile processor. That would have been enough of a market.

edit:// I am commenting on the ASUS ZenFone 2 which has the Intel processor running ANDROID, fyi.

> Whether I run my Python or C# or Java on ARM or Intel doesn't matter too much to me

I think you make a key point here. A whole lot of code now runs inside one runtime or another, and even outside of that, cross-architecture toolchains have gotten a lot better, partly thanks to LLVM.

The instruction set just doesn't matter even to most programmers these days.

> One of the reasons that Intel is falling so far behind is that they can't keep up with TSMC (and maybe others as well) on the fab side

Actually, it's more that they bit off way more than they could chew when they started the original 10nm node, which would've been incredibly powerful if they had managed to pull it off. But they couldn't, and so they stagnated on 14nm and had to improve that node forever and ever. They also stagnated on the microarch, because Skylake was amazing beyond anything else (cutting corners on speculative execution, yes), so all the following lakes were just rehashes of Skylake.

Those were bad decisions that were tied to Intel not solving the 10nm node (remember tick-tock? Which then became architecture-node-optimization? And then it was just tick-tock-tock-tock-tock forever and ever), and insisting on a microarch that, as time went by, started to show its age.

Meanwhile AMD was running from behind, but they had clearly identified their shortcomings and how they could effectively tackle them. Having the option to manufacture with either Global Foundries or TSMC was just another good decision, but not really a game changer until TSMC showed that 7nm was not just a marketing fad, but a clearly superior node to 14nm+++ (and a good competitor to 10nm+, which Intel is still ironing out).

That brings us to 2020, where AMD is about to beat them hard both on mobile (for the first time ever) and yet again on desktop, with "just" a new microarch (Zen 3, coming late 2020). The fact that this new microarch will be manufactured on 7nm+ is just icing on the cake; even if AMD stayed on the 7nm process they'd still have a clear advantage over Zen 2 (their own, of course) and against anything Intel can place in front of them.

That brings us to Apple. Apple is choosing to manufacture their own chips for notebooks not because there's no good x86 part, but because they can and want to. This is simply further vertical integration for them. And this way they can couple their A-whatever chips ever more tightly with their software and their needs. Not a bad thing per se, but it will separate the Macs even more from a developer perspective.

And despite CS having improved a lot in the field of emulation, cross compilers, and whatever clever trick we can think of to get x86-over-ARM, I think in the end this move will severely affect software that is developed multiplatform (this'd be mac/windows/linux, take two and ignore the other). This is some slight debacle that we've seen with consoles and PC games before.

PC, Xbox (can't remember which) and PS3 were three very different platforms back in 2005-ish. And while the PS3 held a monster processor which was indeed a "supercomputer on a chip" (for its time), it was extremely alien. Games which were developed to be multiplatform had to be developed at a much higher cost, because they could not have an entirely shared code base. Remember Skyrim being optimized by a mod? That was because the PC version was based on the Xbox version, but they had to turn off all compiler optimizations to get it to compile. And that shipped because they had to.

Now imagine having Adobe shipping a non-optimized mac-ARM version of their products because they had to turn off a lot of optimizations from their products to get them to compile. Will it be that Adobe suddenly started making bad software, or that Adobe-on-Mac is now slow?

Maybe I got a little ranty here. In the end, I guess time will tell if this was a good or a bad move from Apple.

Because x86 emulation will be a no-go for Photoshop and the like, can Apple simply ship with both x86 and ARM CPUs in the same laptop?

In a way they already do.

All current Macs include a T2 chip, which is a variant of the A10 chip that handles various tasks like controlling the SSD NAND, TouchID, Webcam DSP, various security tasks and more.

The scenario you mention (an upgraded "T3" chip based on a newer architecture that would act as a coprocessor used to execute ARM code natively on x86 machines) seems possible, but I don't know how likely it is.

Yeah, but what would be the rationale? They want to avoid x86 as a main CPU, so either you'd get an "x86 coprocessor to run Photoshop" (let's go with the PS example here).

Or you'd have to have fat binaries to have x86/ARM execution, assuming the T3 chip would get the chance to run programs. Now either program would have to be pinned to an x86 or ARM core at their start (maybe some applications can set preference, like having PS be always pinned to x86 cores) or have the magical ability to migrate threads/processes from one arch to another, on the fly, while keeping the state consistent... I don't think such a thing has ever even been dreamed of.
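For readers who haven't looked inside a fat binary, here is a minimal C++ sketch of reading the slice table at the start of a universal Mach-O file. The layout and constants follow Apple's `<mach-o/fat.h>` and `<mach/machine.h>`; the names `FatSlice`, `read_be32` and `list_slices` are hypothetical, and only the 32-bit fat header variant is handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kFatMagic      = 0xCAFEBABE; // FAT_MAGIC
constexpr std::uint32_t kCpuTypeX86_64 = 0x01000007; // CPU_TYPE_X86_64
constexpr std::uint32_t kCpuTypeArm64  = 0x0100000C; // CPU_TYPE_ARM64

struct FatSlice {
    std::uint32_t cputype;  // which architecture this slice is for
    std::uint32_t offset;   // where the slice starts in the file
    std::uint32_t size;     // slice length in bytes
};

// The fat header is stored big-endian regardless of the host CPU.
inline std::uint32_t read_be32(const std::vector<std::uint8_t>& b, std::size_t at) {
    return (std::uint32_t(b[at]) << 24) | (std::uint32_t(b[at + 1]) << 16) |
           (std::uint32_t(b[at + 2]) << 8) | std::uint32_t(b[at + 3]);
}

// Returns the slices listed in the header, or an empty vector if the
// buffer does not start with a fat header or the table is truncated.
inline std::vector<FatSlice> list_slices(const std::vector<std::uint8_t>& file) {
    std::vector<FatSlice> out;
    if (file.size() < 8 || read_be32(file, 0) != kFatMagic) return out;
    const std::uint32_t count = read_be32(file, 4);
    for (std::uint32_t i = 0; i < count; ++i) {
        // Each fat_arch entry is five 32-bit fields:
        // cputype, cpusubtype, offset, size, align.
        const std::size_t entry = 8 + std::size_t(i) * 20;
        if (entry + 20 > file.size()) return {};
        out.push_back({read_be32(file, entry),
                       read_be32(file, entry + 8),
                       read_be32(file, entry + 12)});
    }
    return out;
}
```

The loader picks one slice per process at launch, which is why the format by itself says nothing about migrating a running process between architectures.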

I don't think there's a chance to have ARM/x86 coexist as "main CPUs" in the same computer without it being extremely expensive, and even defeating the purpose of having a custom-made CPU to begin with.

An x86 coprocessor is not that outlandish. Sun offered this with some of their SPARC workstations multiple decades ago, IIRC.

Doing so definitely would be counterproductive for Apple in the short-term, but at the same time might be a reasonable long-term play to get people exposed to and programming against the ARM processor while still being able to use the x86 processor for tasks that haven't yet been ported. Eventually the x86 processor would get sunsetted (or perhaps relegated to an add-on card or somesuch).

Whether it's for performance, battery life or cost reasons, it wouldn't really make sense:

a) performance-wise, the move would be driven by having a better-performing A chip

b) if they aimed at a 15W part battery life would suffer. 6W parts don't deliver good performance.

c) for cost, they'd have to buy the intel processor, and the infrastructure to support it (socket, chipset, heatsink, etc)

Especially for (c), I don't think Intel would accept selling chips as co-processors (it'd be like admitting their processors aren't good enough to be main processors), nor would Apple put itself in a position to adjust the internals of their computers just to accommodate something which they are trying to get away from.

> I don't think either Intel would accept selling chips as co-processors

Who said they'd have to be from Intel specifically? AMD makes x86 CPUs, too. Speaking of:

> 6W parts don't deliver good performance.

AMD's APUs have historically been pretty decent performance-wise (relative to Intel alternatives at least), and a 6W dual-core APU is on the horizon: https://www.anandtech.com/show/15554/amd-launches-ultralowpo...

Apple probably doesn't need the integrated GPU, so an AMD-based coprocessor could trim that off for additional power savings (making room in the power budget to re-add hyperthreading or additional cores and/or to bump up the base or burst clock speeds).

> for cost, they'd have to buy the intel processor


> and the infrastructure to support it (socket, chipset, heatsink, etc)

Laptops (at least the ones as thin as Macbooks) haven't used discrete "sockets"... ever, I'm pretty sure. The vast majority of the time the CPU is soldered directly to the motherboard, and indeed that seems to be the case for the above-linked APU. The heatsink is something that's already needed anyway, and these APUs don't typically need much of it. The chipset's definitely a valid point, but a lot of it can be shaved off by virtue of it being a coprocessor.

I'd imagine Apple are likely a big enough part of Adobe's userbase for them to release a native ARM version.

I'd imagine that user base would move to Windows if Adobe won't release a macOS/ARM version.

People care more about the software tools they use to do their jobs than on the operating system and hardware.

So, it might depend on how much it would cost Adobe to release for ARM.

> So, it might depend on how much would cost Adobe to release for ARM.

This is a fair point. If Apple had any sense they'd pay Adobe to do it if it came to it.

Didn't they already port the Photoshop engine to ARM? As far as I remember the iPad app shares the same engine only the user interface is different.

Most of it must be ARM compatible already, for the iPad version.

Also Photoshop was first released in 1987 and has been through all the same CPU transitions as Apple (m68k/ppc/...) so presumably some architecture-independence is baked in at some level.

>Whether I run my Python or C# or Java on ARM or Intel doesn't matter too much to me

Me too. But if the apps I need don't come to ARM, I don't care too much about ARM on the desktop/laptop/workstation.

In what sense has Intel vertically integrated design and fab according to you?

NodeJS is in that category as well. If you avoid native modules it’s easy to cross-deploy on ARM. The workload seems to translate well to the process model of scaling out as well.

Really though, unless you're writing x86 assembly, any language should be just fine on ARM. The only potential holdup is if you rely on precompiled binaries at some point. Otherwise it should just be a matter of hitting the compile button again.

In C/C++ land the "just try and see if it works" kind of development is super common in proprietary software, leading to issues such as:

Using volatile for thread-safe stuff. ARM has a weaker memory model than x86, so it requires barriers. The C++ standard threading library handles this for you, but not everyone uses it.
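To make the volatile point concrete, here is a minimal C++ sketch of a one-shot handoff between two threads (the `Mailbox` type is a hypothetical example, not code from any project mentioned here). With `volatile bool ready` the compiler emits no barriers; that often appears to work on x86 because the hardware does not reorder stores with other stores, but on ARM the consumer may see `ready` set before `payload` is visible. `std::atomic` with release/acquire ordering states the requirement explicitly.

```cpp
#include <atomic>

// One-shot handoff from a producer thread to a consumer thread.
struct Mailbox {
    int payload = 0;                 // plain data, published via `ready`
    std::atomic<bool> ready{false};  // deliberately not `volatile bool`

    // Producer side: write the data, then publish it.
    void publish(int value) {
        payload = value;
        // release: the write to payload cannot be reordered after this store
        ready.store(true, std::memory_order_release);
    }

    // Consumer side: returns true and fills `out` once the data is visible.
    bool try_consume(int& out) {
        // acquire: the read of payload cannot be reordered before this load
        if (!ready.load(std::memory_order_acquire)) return false;
        out = payload;
        return true;
    }
};
```

In real use the producer calls `publish` on one thread while the consumer polls `try_consume` on another; the compiler inserts whatever barrier instructions the target needs.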

Memory alignment. ARM tends to be stricter about that. While it’s impossible for a well-formed C++ program to mess it up, it’s quite common for people to just go "Hey it’s just a number" and go full yolo with it. Because hey, it works on their machine.
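A minimal C++ sketch of the alignment hazard and the usual fix (`load_u32` is a hypothetical helper name). Casting a byte pointer to `uint32_t*` and dereferencing it is undefined behaviour when the address is misaligned, even on hardware that tolerates it; copying through `std::memcpy` is well defined everywhere, and compilers typically lower it to a single load where the target allows.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Read a 32-bit value in host byte order from any byte offset.
inline std::uint32_t load_u32(const unsigned char* bytes, std::size_t offset) {
    // Tempting, but undefined behaviour when bytes + offset is misaligned:
    //   return *reinterpret_cast<const std::uint32_t*>(bytes + offset);
    std::uint32_t value;
    std::memcpy(&value, bytes + offset, sizeof value);
    return value;
}
```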

Apple's recent 64-bit ARM processors now support unaligned memory access, so there's one porting problem out of the way:


It's difficult to avoid native modules though. Many basic things like SQLite or image manipulation are binary blobs.

In SQLite's case (and probably in the case of most reasonably popular image manipulation libraries), those binary blobs very likely already exist or can be readily recreated. ARM is not some newfangled obscure architecture; SQLite's been used to great success on far more exotic platforms than that (including, I'd imagine, on devices running iOS).

I seldom see issues porting C and C++ to ARM unless people do weird things with casts that are undefined in the language spec but that work on X86 or use X86-specific vector intrinsics. Most well-written C and C++ compiles out of the box to ARM and just works.
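As an illustration of the vector-intrinsics case, here is a small C++ sketch that keeps an SSE path, adds a NEON path, and falls back to scalar code elsewhere (`add4` is a hypothetical function). Code that only has the first branch is exactly what fails to build on ARM.

```cpp
#if defined(__SSE2__)
  #include <emmintrin.h>   // x86 SSE/SSE2 intrinsics
#elif defined(__ARM_NEON)
  #include <arm_neon.h>    // ARM NEON intrinsics
#endif

// out[i] = a[i] + b[i] for i in 0..3
inline void add4(const float* a, const float* b, float* out) {
#if defined(__SSE2__)
    // Unaligned loads/stores, so callers need no special alignment.
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
#elif defined(__ARM_NEON)
    vst1q_f32(out, vaddq_f32(vld1q_f32(a), vld1q_f32(b)));
#else
    for (int i = 0; i < 4; ++i) out[i] = a[i] + b[i];  // portable fallback
#endif
}
```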

> Whether I run my Python or C# or Java on ARM or Intel doesn't matter too much to me and if AWS can offer me ARM servers at a discount, I might as well take advantage of that.

I think that's the fundamental mistake in reasoning:

If ARM is cheaper for AWS, then AWS has no reason at all to offer it to its customers at a discount because the customers will not move if no discount is offered. As long as there's no mass market for ARM PCs/servers that work with all the modern software that anyone can rack and sell a-la ServInt/Erols/EV1 circa 1996 there won't be pricing pressure.

This has played out time and time again in transit pricing.

AWS is offering ARM servers at a discount right now.

One cannot buy another ARM with the same performance and compare it head to head.

AWS ARM instances are a mattress market.

> I remember watching this video on the Intel 8086 by Harvard Business School, and they mentioned how revolutionary it is from a business perspective because it took a vertically integrated market and made it a horizontally integrated market.

I never really fully appreciated this until I read a review (probably linked on HN) for a new system released in 1981 or 1982 and my mind couldn't stop boggling at how there was a CPU with its custom ISA and an OS written specifically for that ISA and applications written specifically for that OS, and the reviewer was praising some innovative features of the ISA and how the applications could make use of those.

The icing on the cake was how the reviewer discussed how this system could be a big commercial success and which other systems it might take market share from - without ever mentioning the IBM PC released around the same time...

The irony is that the ARM architecture was also developed in the same way. A custom CPU, for a new machine, with an operating system specifically designed for it, and applications written specifically for that OS (the Acorn Archimedes/RiscOS).

The platform ended up failing, but they spun out ARM. If ARM ends up overtaking Intel on the desktop, it gives the story some entertaining irony/symmetry.

I still can't quite believe those crappy computers in my school IT lab eventually morphed into iPhones.

Saying that, I did write my first ever computer program on them and ended up making iPhone apps.

When they were released in the 1990s they were pretty capable machines. The OS had its issues, but I don’t think it fared badly against DOS/Windows 3.11. They were however largely limited to the UK market, and had a more limited selection of software.

I’m not sure it’s really fair to characterize them as “crappy”.

I'm being mean I guess - I had a PC at home and it could do so so so much more.

This is a great reminder that everything in business history looks obvious in retrospect.

This development of market dynamics & structure has been famously featured in the book "Only the Paranoid Survive" by Andy Grove, which I can only recommend (it's a quick and easy read).

Anecdotally, it seems to me that today, the more successful companies are those that tend to be vertically integrated, such as Apple, Tesla, and Amazon.

Yet many companies are in a craze to go to cloud and outsource every bit of competence which isn't a core product.

It's not that bad yet. Almost everything is still some flavor of x86 or ARM. Back in the 90's server/workstation space, almost every player made their own wildly incompatible chips;

* DEC Alpha



* SGI MIPS (MIPS was owned by SGI in the 90's)

* IBM PowerPC, RS/6000

I kind of hope so. AMD and Intel haven't really had a tight software department. For example, offloading some math to the Intel GPU has been in the OpenMP 4.5 spec since 2015 in a really easy way. It is supported by using Intel® oneAPI Base Toolkit AND the Intel® oneAPI HPC Toolkit. Which... no one uses.
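For reference, this is roughly what that OpenMP 4.5 offload looks like, as a minimal C++ sketch (`saxpy` is just the conventional example name; which device, if any, receives the loop depends on the compiler and its offload configuration, which is an assumption not covered here). When the compiler is not run with OpenMP enabled, the pragma is ignored and the loop simply runs on the host.

```cpp
// y[i] = a * x[i] + y[i], offloaded to an accelerator when one is configured.
void saxpy(float a, const float* x, float* y, int n) {
    // target: run the region on the device
    // map(to:)     copies x to the device
    // map(tofrom:) copies y to the device and back afterwards
    #pragma omp target teams distribute parallel for map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```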

That we still don't have any good GPGPU resources is just crazy to me.

Heterogenous computing with TPUs/GPUs/DSPs and other chips should be standard by now.

>Heterogenous computing with TPUs/GPUs/DSPs and other chips should be standard by now.

It sounds nice in theory but in practice is hard. Writing CUDA or OpenCL is not exactly pleasant or easy and compilers do a poor job at vectorizing code.

So we use accelerators when it's an absolute must.

So we also need to be able to see the IR/ASM. It's not like the CPU only compilers are great at it either, quite a lot of handholding is needed there too, but one step at a time.

> it took a vertically integrated market and made it a horizontally integrated market

I find that a curious statement. While there was quite a bit of diversity in microprocessors, microcomputer manufacturers almost never built their own processors. The majority of the 8 bit market was 6502 and Z80, with a smattering of 8080 and 6809. The 16 bit market mostly was 68000.

Neither Motorola nor Intel were big players in microcomputer manufacturing. The 6502 story is a bit more complicated: MOS sold the KIM-1. And MOS itself was bought by Commodore, though there were several 6502 manufacturers.

The only microcomputer manufacturers I can think of that truly designed their processors were Acorn and Texas Instruments.

At that time microcomputers were still a small portion of the computer business. Most of the market was for mainframes and minicomputers, which were vertically integrated.

But the 8086 was used in microcomputers, not mainframes or minis.

The broader interpretation of this idea, that the x86 era saw a shift away from vertically integrated computer manufacturing, is absolutely true. The narrower interpretation, that the 8086 chip triggered a shift away from vertically integrated computer manufacturing, is not.

The "x86 era" is really the era of microcomputer architectures eating mainframes and minis. I suspect that would still have happened if the 68k (or Z8000, NS32000, etc) had won the war instead of the x86.

Random data point. In 1977, a couple of serious business industrial machines from two worlds:

Cromemco Z-2: 8-bit Z80 @ 4 MHz, maximum 256 kB RAM (bank-switched in a 64 kB address space), 0.007 Whetstone MIPS [1]

VAX-11/780: 32-bit VAX @ 5 MHz, maximum 8 MB RAM, 0.476 Whetstone MIPS [1]

I have no idea what the prices were. One source reckons a Z-2 was $995 [2], but a price list from 1983 has a system with 64 kB and two floppy drives for $4695 [3]. In 1978, the list price of a 11/780 with half a megabyte of memory, a floppy drive, a tape drive, and two hard disk spindles (possibly not including disk packs, though) was $241,255 [4].

[1] https://amaus.net/static/S100/cromemco/IOnews/05x04%20198607...

[2] https://www.old-computers.com/museum/computer.asp?st=1&c=559

[3] http://www.hartetechnologies.com/manuals/Cromemco/Cromemco%2...

[4] https://www.bell-labs.com/usr/dmr/www/otherports/32v.pdf
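Taking the figures above at face value, the price/performance comparison can be worked out in a few lines of C++ (a rough sketch only: the two prices come from different years and different configurations, so the result is indicative at best; all names are hypothetical).

```cpp
// Figures quoted above: Whetstone MIPS and list price in US dollars.
struct Machine {
    double whetstone_mips;
    double price_usd;
};

constexpr Machine kCromemcoZ2{0.007, 4695.0};   // 1983 price list, 64 kB + two floppies
constexpr Machine kVax11_780{0.476, 241255.0};  // 1978 list price, 0.5 MB configuration

// How many times faster the VAX is on this benchmark.
constexpr double speed_ratio() {
    return kVax11_780.whetstone_mips / kCromemcoZ2.whetstone_mips;
}

// How many times more expensive the VAX is.
constexpr double price_ratio() {
    return kVax11_780.price_usd / kCromemcoZ2.price_usd;
}

// Greater than 1 means the VAX delivers more Whetstone MIPS per dollar.
constexpr double value_ratio() {
    return speed_ratio() / price_ratio();
}
```

On these numbers the VAX is about 68 times faster for about 51 times the price, so roughly 1.3 times the floating-point throughput per dollar.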

>I suspect that would still have happened if the 68k (or Z8000, NS32000, etc) had won the war instead of the x86.

I think that is probably correct. The economics of "commodity" microprocessors were such that one (or two) would have probably won had the x86 not. (Of course, the shift to horizontal is not just about the microprocessor but volume operating systems, volume scale-out servers, packaged software and open source, etc.)

the 68000 is a 32 bit cpu

The registers were 32 bit, but the data bus on the OG 68000 was 16 bit. I tend to count the latter width (I don't see the Z80 as a 16 bit processor either).

To make things even more confusing, the Z80's ALU operates on 4 bits at a time while the 68000 has 3 16 bit ALUs and so can crunch 48 bits per clock.

Half of the early ads for the 68000 called it a 16 bit processor and the rest 32 bits. The latter became more popular as time went on.

Apple has a history of doing everything in their own way if feasible. Not sure if one should extrapolate from them to the entire market.

Absolutely agreed with that. The key is that they insisted on it, and hence they failed to fight WinTel on the desktop; then they went on to try non-desktop products: iPod, iPhone, iPad, etc. The Mac was left abandoned for quite a while, and I am not sure it is still a big enough business for them.

I won't be surprised if what happened in desktop space will repeat in mobile space.

The pressure from Android side is huge.

Apple also made their own operating system, their own programming language, their own external connectors, etc. Have their competitors followed their lead in becoming more vertically integrated in any other way? I’m not really seeing it.

This wouldn’t be the first time Apple used a non-x86 chip in the Mac. Nobody followed them last time.

Yet most things they did weren't original. macOS was mostly made from bits of BSD. Apple's CPUs rely on ARM and in the beginning they were designed with the help of Samsung.

They are mainly polishing things.

>Nobody followed them last time.

Breaking compatibility with software and operating systems and hardware is bad for the consumer.

With x86 I can run any software I need and I can optimize for cost and performance. I can use diverse CPUs, use graphic cards, memory chips, cases, PSUs, SSDs and HDDs from different makers at different price points and performance points.

I can hit exactly the sweet point I need to. And if something breaks, it won't be hard to replace.

Andy Grove goes into the flip to horizontal integration at length in Only the Paranoid Survive. He also touches on it in this video: https://www.youtube.com/watch?v=LfU2Qu4MzZk

There's something of a shift back to more vertical integration but:

-- As another commenter mentioned, there's simultaneously been something of a split between chip design and fabs

-- The big public cloud providers probably make a stronger case for a return to vertical integration than Apple does

-- For the most part, any vertical integration today is taking place in the context of both global supply chains and standards. You don't have every company using their own networking protocols and disk drive interconnects.

Of the Intel 8086 or of the IBM PC platform?

I don't remember anything specific to the 8086 that made it a "horizontally integrated market" w.r.t. its competitors (I might be wrong, of course)

>> "IMHO that would be a catastrophic regression in the software space, easily a black swan event, if you say needed to compile software differently between major cloud vendors just to deploy."

Wasm blobs seem poised to moot this in the next 10 years. So many things on the desktop have already moved to the web or Electron. Server-side applications turning into browser-executed blobs backed by one of a few database systems still on the server is the next logical step.

I am tired of wasm hype. Can someone explain what it does better than existing cross-platform solutions like CLR / JVM? It doesn't yet have a GC, nor is it the most efficient for that matter. And JIT is not efficient in terms of memory and power compared to native code.

Both of those are owned by organizations that have behaved pathologically in the past so I think many people reflexively avoid them.

Wasm is supposedly language agnostic (although I’d disagree heh) whereas even CLR imposes a lot on the language you build for it.

I heard this 10 years ago, together with: "in 10 years we won't need desktops, only dumb clients as most software will run from the cloud"

It certainly feels that way. I think that the fragmentation problem might be diluted this time because the software is way more abstracted from hardware (mainstream languages are all very high level, and there is widespread adoption of open compilers - with llvm/clang being pushed by apple)

I don't see it as a catastrophic regression. We're already seeing instances of software gatekeepers like the Apple App Store requiring users to upload bitcode that can be specialized to different processors. This is strictly better for consumers because they get more efficient software. Developers don't have to do much work; they just need to pick a compiler that supports the bitcode output. Hopefully, we'll settle on a bitcode format that works across multiple clouds and storefronts. LLVM bitcode or WASM are the best candidates right now. The clouds/storefronts can perform additional optimizations (perhaps even run a superoptimizer per hardware SKU), and additional privacy/security checks.

With Moore's Law ending, these are the tricks we need to get up to to improve user experience, reduce power draw, and make hardware go further.

Well, if you have 100 billion dollars sitting around doing nothing like Apple does, spending it to get a new competitive advantage may make sense. They can do what no other computer manufacturer can do in that position.

Those billions are made from iPhone sales. Lenovo, Dell and HP don't make billions from selling phones.

And the move to ARM isn't guaranteed to bring them more sales. Customers might dislike it and software makers might dislike it.

Difference being these aren't vertically integrated in an ecosystem, they're made in-house as a result of corps owning their supply chains.

I know it won't be exactly the same thing as the 68k and PPC eras, but I'm excited nonetheless. I'm a sucker for stories of going against the mainstream and pulling it off. I remember being wholly underwhelmed when Apple switched to Intel in the mid 2000s; my PPC Mac mini was more performant than the first Intel mini by far, especially the GPU. Given how powerful the A series mobile processors are (the new iPhone SE makes my Galaxy Note 10+ look like a slouch in benchmarks), I have a feeling the new Macs will be worth a look for anyone without hard requirements for Windows 10.

> I have a feeling the new Macs will be worth a look for anyone without hard requirements for Windows 10

Not just that - but the many, many legacy apps professionals rely on. The 64-bit move already decimated the professional audio space. The cynic in me can't help but see this as a move towards pure consumer device. As a development machine it will be all but useless as you find things that won't compile or work correctly on the new CPU arch.

> As a development machine it will be all but useless as you find things that won't compile or work correctly on the new CPU arch.

I mean, I suppose it depends on what sort of development you're doing, but in 2020 most libraries do work on ARM. There'll no doubt be a painful period (as there was with the death of PPC), but it shouldn't be that dramatic.

Still, wouldn't you want to test it on the same architecture? It's fine if you just compile on a server but then you can as well have a cheap client with only a web interface.

Mobile developers have done this for years; I don't really see how testing it on your local machine can really tell you much other than "it compiles and seems to run" if you're actually going to run it on completely different hardware.

> As a development machine it will be all but useless as you find things that won't compile or work correctly on the new CPU arch.

I'm not a developer so forgive my ignorance, but isn't this what cross-compiling is for? I get that compiling natively can increase performance and find obscure hardware issues, but it's my understanding that, for example, ARM builds of GNU/Linux binaries are just cross-compiled by server farms that are also natively compiling the AMD64 builds.

Also, fat binaries and JIT emulation have been a thing forever, especially for Apple who has dealt with these changes twice now (68k -> PPC -> x86-64).

I just don't see this being any different than current multi-platform efforts like Debian, NetBSD, etc., except it's a for-profit company with billions of dollars and thousands of expert employees behind it.

There can be subtle bugs, especially if you have code which has to adhere to a strict on-disk or on-network format.

I recently committed a change to FreeBSD's kernel (written in C) which I'd tested on amd64 and x86. Much to my surprise, when I did the cross-build for all platforms, I had a compile-time assert fail on 32-bit arm because the size of a struct was too large. It turns out that 32-bit arm (at least on FreeBSD) naturally aligns 8 byte data types to 8 byte boundaries, whereas x86 packs them and allows them to straddle the 8 byte boundary. This left a 4-byte hole in my struct, and caused it to be 4 bytes too large.

These are the sorts of things that bite you when moving from x86/x86_64 to a RISC platform, even when it's the same endianness.

If you care about the packing of your struct, then you should probably be using compiler-specific packing attributes.
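
A minimal sketch of the padding issue described above, using Python's struct module rather than the FreeBSD code in question. The two-field record (a 32-bit tag followed by a 64-bit value) is a hypothetical stand-in, not the actual kernel struct. Native mode ("@") follows the host C compiler's alignment rules, while "<" uses standard sizes with no padding, which is roughly what a packed attribute gives you in C.

```python
import struct

# Hypothetical record for illustration: uint32 tag followed by uint64 value.
FIELDS = "IQ"


def layout_sizes(fields=FIELDS):
    """Return (native_size, packed_size) in bytes for the given field string."""
    # "@" = native byte order, native sizes, native alignment (host ABI rules).
    native = struct.calcsize("@" + fields)
    # "<" = little-endian, standard sizes, no alignment padding at all.
    packed = struct.calcsize("<" + fields)
    return native, packed


native, packed = layout_sizes()
# The difference is the padding the host ABI inserted before the 64-bit field.
print(f"native={native} packed={packed} padding={native - packed}")
```

The packed size is 12 bytes on any host. The native size depends on the ABI: hosts that align 64-bit integers to 8 bytes insert a 4-byte hole, hosts that align them to 4 bytes do not, which is the kind of mismatch a compile-time size assert catches.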

There are bugs that will show up on certain x86 that won't show up on ARM. Plus you've got things like code that is already built that you'll have to emulate in some way or another.

Cross compiling is more for supporting more platforms with minimal or no code changes. If you develop applications that will always just run on x86, developing and testing them on another platform doesn't make much sense in my opinion. You might have to compromise just because of your development environment and errors could be caught too late.

Here is Linus Torvalds explaining why cross-development isn't great:


Pro audio tends to be a lot more low-level to prevent any kind of stuttering or delay, so it may not always be compatible across stacks. It also tends to not get updated very often. A creator makes one plugin for Ableton, then moves on to the next project.

I have a hard time feeling sorry for companies that are still churning out 32-bit software. The move to 64-bit wasn't a surprise to anyone that was paying attention. If the pro audio space is really decimated then I'm not sure they were that hardy to begin with.

Granted, the move to ARM is a bit more work, but again I don't think anyone should be especially surprised by it. As soon as the "A" processors started posting real, positive benchmark numbers I figured that Apple would move to them and away from Intel. In my mind's eye I can see Apple differentiating machines depending on how many ARM chips they have. MacBook Air - 2 A15; MacBook - 4 A15X; MacBook Pro - 12 A15X; Mac - 128 A16X (or something like that)

> I have a hard time feeling sorry for companies that are still churning out 32-bit software.

A lack of sympathy does not make my audio software work, however. And I believe what parent is driving at is that should the ARM transition take place, even more stuff isn't going to work for the end-user. So a little empathy for the user, eh?

I meant no offense to the user and have myself been the victim of using old, unsupported software (i.e. I play TF2). The point I was trying to make is that companies can't expect the world to stop changing and advancing just because they don't want to build new things. AFAIK there is no computational reason that any software can't be made 64-bit (within reason). We've moved on from 8-bit and 16-bit software; what's stopping them from leaving 32-bit behind?

When I hear that some group (e.g. pro audio software makers) won't update their offerings I have to believe it's because they don't want to or they just don't want to serve the market anymore. In either case it seems like there is an opportunity for someone to create an alternative, and possibly make some great money.

Those companies aren't around anymore. There's no one left to update the software, and like vintage gear, it's irreplaceable.

Yeah, I was having a long discussion about this with a friend, and this is precisely where the misconception lies.

The problem with the 32->64 transition (or x86->ARM) doesn't lie with active businesses failing to "get with the program" and update their software - it lies with abandonware. With software that's been put out either by defunct companies or sometimes literally deceased programmers.

In some niches, this sort of stuff is really, really common - generally this is the case if there's a really stable API for building things, like VST plugins, and if the niche in question has a lot of failed businesses. A lot of times in the pro audio space, a musician will spend a large part of their career "collecting" a ton of little one-off sound libraries and fx plugins, because these are the only way they can get the computer to produce that exact kind of sound they're looking for. This collection slowly builds up over the course of, say, a decade - just like a graphic designer would collect fonts.

The difference is that unlike fonts, which last had their "greet the reaper" moment back when bitmap fonts got scrapped in the mid-90s (despite OpenType becoming a thing, TrueType fonts from the 90s still work fine, some 30 years later), any audio plugins that aren't compatible with the cpu architecture die out. And that's just really brutal to a working musician.

You can't get an update to most of those because there's a ton of attrition in that industry; lots of small-time plugin makers realize pretty fast that it's a very difficult place to make any kind of ROI, so they quit after a few years.


Games are in a really similar place - they're a business slaughterhouse where most companies that attempt to make something discover they're not going to cover the initial investment, so after the game's produced it typically gets a couple of years of barebones support, and then gets abandoned - or the company just croaks. Any kind of rewrite is completely out of the question. The tragedy is that most of these games are pretty good and fun, they're just not economically viable.

I love Apple moving the tech forward, but we desperately need a better emulation solution, and/or we've got to get the industry off of coding for bare metal.

I am sympathetic, don’t get me wrong, but this isn’t the only industry and this isn’t the only time this has happened. It happens every day in every industry and has been happening forever and will continue to happen forever. The old example is buggy whips: someone might have spent hundreds of thousands of dollars making or collecting whips for horses only to have their investment disappear when cars came around. Or making/collecting spears just as the Roman army switched to swords. Or medieval turnip farmers when suddenly everyone wanted to eat the newly discovered potato. Or people who bought all their favorite movies on Betamax and can’t easily watch them anymore.

There are very few creators who actually create the things they rely on to create, ultimately we are all consumers even inside our professions. And like any consumer, we are all at the mercy of a market we don’t control. Either you have to accept that everything must come to an end and plan for that eventual end, or you have to dig deeper into your creativity.

Everyone has something they rely on that will disappear before they’re ready to lose it. It’s a reality of life and as much as humankind has experienced that loss for thousands of years, we never seem to get any better at accepting it or planning for it.

Counterexample: today you can run 25-year-old Windows software that was made in the time of Windows 95.

Yes, Windows has made a reputation for backwards compatibility and it works surprisingly well. However, it's not perfect and it only goes so far. I know there are a lot of companies out there still running Windows 95 because their software doesn't work on more advanced versions. Also, what has Microsoft given up for this compatibility? It's my understanding that Windows source code is a huge mess which has made updating it a large project. Just look at the control panel for the most recent version of Windows; it's a mish-mash of styles and layouts.

Some things make no sense whatsoever though. Processor architectures only matter because CPU companies can't just add their competitors' ISA to their own chips. Nvidia and Transmeta experimented with processors that contained all the features necessary for x86 or ARM support and then simply used a software JIT to convert ARM and x86 to an internal VLIW architecture. Translation from one instruction set to another is already a solved problem. The real problems are that some architectures are fundamentally different, e.g. the memory model in x86 vs ARM. You can solve it by simply including both memory models in your universal CPU. The only thing standing in the way of this is the darn patents.

> Transmeta

...was slow.

> The 64-bit move already decimated the professional audio space

Every architecture switch has casualties. What I feel might be shortsighted of Apple is that the x86 Cocoa apps that die this time are not going to be replaced by Catalyst iPad ports or ARM builds; they'll be replaced with Electron apps.

Apps that die from these kinds of switches tend to keep users that don't have a choice from upgrading at all. There's still people running PPC macs.

Also virtual machines. Depending on what kind of development you do, those are indispensable in some cases. Does ARM have the same kind of hardware virtualization x86 has?

I feel like this will be an easier migration from a software perspective.

At least both architectures are 64 bit little endian, so all those buggy C programs that do illegal type casting might still work.
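
A rough illustration of why shared endianness lets that kind of cast keep working. The helper below is hypothetical and only mimics, in Python, a C cast like *(uint32_t *)&some_u64 by reading the first four bytes of a 64-bit integer as it would sit in memory under each byte order. It is a sketch of the idea, not a statement about any particular program.

```python
import struct


def reinterpret_first_u32(value, byte_order):
    """Mimic C's *(uint32_t *)&u64: read the first 4 bytes of a 64-bit integer.

    byte_order is "<" for little-endian or ">" for big-endian memory layout.
    """
    raw = struct.pack(byte_order + "Q", value)           # 8 bytes as laid out in memory
    return struct.unpack(byte_order + "I", raw[:4])[0]   # reinterpret the first 4 bytes


small = 42
# Little-endian: the low-order bytes come first, so the cast "works" for small values.
print(reinterpret_first_u32(small, "<"))
# Big-endian: the high-order bytes come first, so the same cast reads zero.
print(reinterpret_first_u32(small, ">"))
```

Since x86-64 and 64-bit ARM as used by Apple both store integers little-endian, code relying on the first behaviour sees the same bytes on either one, even though the cast itself remains invalid C.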

One issue might be that x86 is pretty relaxed when it comes to unaligned memory accesses. I'm not sure how ARM64 handles it.

It's slower but generally will not trap unless the processor is set to do so.

I'll bet there's a lot happening under the hood with Xcode that means any app in the App Store today is going to 'just work' with an ARM chip without the dev having to do a thing.

Anything external will suffer, and Apple probably isn't gonna cry about that.

Xcode is not doing much in this direction at all.

How can you be so sure what they're working on?

I worry about legacy apps also, specifically the professional version of LispWorks. I expect a version for Apple’s new systems will be available, but that will be another $3k for a new license. Oh well.

At that time, it probably made more strategic sense for Apple to move to 'commodity' hardware and away from relying on Motorola. The success of the iPhone, iPad, etc., and Apple's huge cash pile put them in a completely different position now. They aren't going against the mainstream; they are the mainstream.

>my PPC Mac mini was more performant than the first Intel mini by far

That's true only if "Apple using CPU I deem cool" counts as performance.

Otherwise, benchmarks say that the first Intel Mac mini was almost twice as fast as the last PowerPC Mac mini.


CPU speed alone doesn't indicate overall system performance, nor do synthetic benchmarks. The PPC mini had a discrete GPU with dedicated RAM, and OS X at that time was mature on the platform and very performant. The first Intel mini was a Core Solo machine that was horribly underpowered, had Intel GMA 950 graphics not even powerful enough to properly render QE/CI without stuttering, and was hobbled by a very slow system bus. Intel Macs didn't start performing better than their PPC counterparts until Lion was released and the second generation of Core 2 CPUs with faster system bus and higher clock speeds came about.

I get the point you're trying to make, but unless you owned both the last PPC mini and the first Intel mini at the same time, as I did in 2006 (and still do), you have no idea what you're talking about.
