I’ll still be waiting to see what latency-sensitive performance is like (specifically audio plugins) but this is halfway to addressing my biggest concern about buying an M-series Mac while the software market is still finding its footing.
I think the contrast in approach here comes down to motive: I get the impression that Microsoft wanted developers to write ARM apps and publish them to its Store, so it made emulated programs less attractive by virtue of their poorer performance.
Apple, on the other hand, is keen to get rid of Intel as quickly as they can, so they had to make the transition as seamless as possible.
Their current implementation only supports 32-bit code; x64 translation is still underway. It is not known how well translated x64 code will perform relative to native code.
Off topic: my job has been in virtualisation for the last 12 years, so I am very familiar with the publicly available body of research on this topic. Ahead-of-time binary translation has been a niche area at best.
Newer instructions are still encumbered by patents.
That is what patents are. They encourage you to actually get there by giving you a time-limited monopoly on the implementation. It is easy to say "a freshman CS student could have figured this out", but if they did, they could have had the patent instead and licensed it to Intel and Apple. Instead, Intel and Apple had to figure it out on their own.
In particular, costs are high and litigation to protect a patent is expensive, so your average student wouldn't be able to afford this. The fact that software typically ships virtually means that borders are practically non-existent, so wide patents are often needed, or a company needs to give up on defending their patent outside its primary market.
Patenting an idea will probably cost around $10-100k per market. To cover the US, EU, and large Asian markets, you're looking at $500k-1m, and that's just to get the patent. Then you'll need to defend it, which can be hard to do against entities based in non-compliant countries such as China.
This all means that unless you're defending the very core of your entire business proposition, you probably need to be a >$100m company before it's worth pursuing patents, and even for the core of your business you probably need to have several million in funding.
Was that from a presentation on Rosetta 2 that I have missed? It certainly makes sense, but you'd also need to watch for writes to executable pages that have already been AOT translated.
I also see issues with self-modifying code, but this is also very rare (I knew some .NET code that was injecting/manipulating its own JIT code to get around C# private/protected encapsulation).
Browsers themselves no, but there's many apps out there based on browser tech (e.g. electron apps). Also many apps shipping with their own JVM or other JIT engine.
Anecdote incoming: my pet project is based on Electron. I am currently building a Mac x86 version but don't plan to ship an ARM version (since testing it without an actual physical Mac is going to be even more difficult, if not impossible).
Wouldn't executable pages (translated or not) normally be mapped as read-only, ensuring the processor faults on any attempt to modify them?
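That's the usual trick. A minimal sketch of the idea, assuming a POSIX-style mprotect/SIGSEGV approach (not necessarily what Rosetta 2 actually does; Linux-flavoured, and macOS may raise SIGBUS instead):

    /* Sketch: write-protect guest code pages once they've been translated and
       catch any attempt to modify them, so the stale translation can be thrown
       away. Hypothetical; not Apple's mechanism. */
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static unsigned char *guest_page;   /* page holding the original x86 bytes */
    static long page_size;

    static void on_write_fault(int sig, siginfo_t *info, void *ctx) {
        (void)sig; (void)ctx;
        unsigned char *addr = info->si_addr;
        if (addr >= guest_page && addr < guest_page + page_size) {
            /* A translated page is being modified: invalidate its cached
               translation here, then let the write proceed. */
            mprotect(guest_page, page_size, PROT_READ | PROT_WRITE);
            return;
        }
        abort();                        /* unrelated fault: crash as usual */
    }

    int main(void) {
        page_size = sysconf(_SC_PAGESIZE);
        guest_page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        memset(guest_page, 0x90, page_size);        /* pretend: x86 NOPs */

        /* ...translate the page and cache the result... */
        mprotect(guest_page, page_size, PROT_READ); /* now read-only */

        struct sigaction sa = {0};
        sa.sa_sigaction = on_write_fault;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, NULL);

        guest_page[0] = 0xC3;   /* self-modifying write: faults, handler runs */
        printf("write to translated page was caught\n");
        return 0;
    }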
And the correct term for "JIT-based emulation" is "Dynamic Binary Translation" (DBT).
At least these are the terms you should use if you want to find some literature on this subject.
We're not talking about JIT or AOT compilers because it's not really compilation (compilation is translating to a lower-level language).
I think a lot of people talk about JIT rather than DBT because the JIT term is better known, and there is confusion when Apple says they do "Dynamic translation for JITs".
Which means that: they do DBT to handle applications that use JIT.
Edit: So Rosetta 2 does both SBT and DBT.
Furthermore, SBT, even for user-mode binaries, can rarely reach the performance levels that we see with Rosetta 2. There are many issues in determining what is code, where the branch destinations are in the case of indirect branches, etc. What we have here is certainly a feat of engineering in its own right.
Yes, handling indirect branches seems a bit complex, and I'm not a specialist in the field.
But I'm pretty sure that indirect branches are rare enough that an additional indirection is relatively inexpensive.
Adding a simple address mapping table should meet most of the cases.
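A rough sketch of what such a table could look like (all names here are made up; this is the idea, not how Rosetta 2 actually implements it):

    /* Sketch of a "simple address mapping table": guest (x86) code address
       -> host (ARM) entry point of its translation. Every translated indirect
       jump/call/ret funnels through resolve_indirect() instead of branching
       to the raw guest address. */
    #include <stdint.h>
    #include <stdlib.h>

    #define MAP_BUCKETS 4096                /* power of two for cheap masking */

    struct map_entry {
        uint64_t guest_pc;                  /* original x86 address            */
        void    *host_code;                 /* translated ARM code entry point */
        struct map_entry *next;
    };

    static struct map_entry *buckets[MAP_BUCKETS];

    static size_t hash_pc(uint64_t pc) { return (pc >> 2) & (MAP_BUCKETS - 1); }

    /* Provided by the translator proper (hypothetical). */
    extern void *translate_block(uint64_t guest_pc);

    void map_insert(uint64_t guest_pc, void *host_code) {
        struct map_entry *e = malloc(sizeof *e);
        e->guest_pc = guest_pc;
        e->host_code = host_code;
        e->next = buckets[hash_pc(guest_pc)];
        buckets[hash_pc(guest_pc)] = e;
    }

    /* Called from translated code whenever a branch target is only known at
       run time; the common case is a single hash lookup. */
    void *resolve_indirect(uint64_t guest_pc) {
        for (struct map_entry *e = buckets[hash_pc(guest_pc)]; e; e = e->next)
            if (e->guest_pc == guest_pc)
                return e->host_code;        /* fast path */
        return translate_block(guest_pc);   /* miss: translate on demand */
    }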
An interesting question would also be whether Apple has added features to the hardware to improve the translation?
We know, for example, that Apple introduced a special register to temporarily switch from the ARM memory consistency model to x86's TSO (Total Store Order) model.
See: https://github.com/saagarjha/TSOEnabler
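To illustrate why that register matters, here's the classic message-passing pattern, written with relaxed C atomics to mimic plain x86 mov instructions. Under x86/TSO the ordering below holds for free; on ARM's weaker model a translator without the hardware switch would have to insert barriers around essentially every store and load (a sketch, not Apple's code):

    /* Message-passing litmus test with relaxed atomics (~ plain x86 movs).
       Under TSO the reader can never see flag == 1 with data still 0; on a
       weakly ordered ARM core that outcome is allowed unless barriers are
       added between the two stores and between the two loads. */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    static atomic_int data, flag;

    static void *writer(void *arg) {
        (void)arg;
        atomic_store_explicit(&data, 42, memory_order_relaxed);
        /* Translating x86 to weakly ordered ARM would need a barrier here. */
        atomic_store_explicit(&flag, 1, memory_order_relaxed);
        return NULL;
    }

    static void *reader(void *arg) {
        (void)arg;
        if (atomic_load_explicit(&flag, memory_order_relaxed) == 1) {
            /* Another barrier would be needed here without TSO. */
            int d = atomic_load_explicit(&data, memory_order_relaxed);
            printf("data = %d\n", d);  /* always 42 on x86; not guaranteed on plain ARM */
        }
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        pthread_create(&a, NULL, writer, NULL);
        pthread_create(&b, NULL, reader, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
    }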
Anything less than that is emulation, and requires dynamic elements. All modern emulators use JIT, and caching the result is similar to AoT translation; plus, JIT can sometimes be faster than AoT because it can take advantage of runtime profiling. And you can never guarantee ~full AoT translation, even of binaries without self-modifying code, without additional metadata (like a list of all branch destinations), so Rosetta cannot possibly claim it does that with full coverage.

On top of that, you need to add a level of indirection to all indirect branches, as you cannot statically change all function pointers in data structures (that's an even harder problem). At that point you're adding enough bookkeeping gunk to the translated code that it is no longer the straight translation Apple would want you to believe it is.

JIT is binary translation too, so by Apple's marketing standards, qemu, Dolphin, and basically every other modern emulator is also doing "translation". Which is just not useful.
So everyone saying that "Rosetta 2 is AoT translation" as if that means it's fundamentally better/faster than other emulation technologies is just falling for marketing.
Whatever you call it, it's not fundamentally different from any other emulator in a way that puts it in another class of technology. It is not straight converting x86 to ARM. That's just not a thing and it never will be. The end result is that the CPU is going to be executing a series of translated basic blocks interspersed with code added by the translation to glue everything together, which is the same thing every JIT-based emulator does, and will have the same performance characteristics, and the fact that some of that work can be done ahead of time is not a fundamental difference.
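For concreteness, that "translated basic blocks plus glue" structure boils down to a dispatch loop like the one below in every DBT, whether the blocks were produced ahead of time or just in time (hypothetical names, not Rosetta 2's internals):

    /* Hypothetical skeleton of the dispatch loop shared by every dynamic
       binary translator: look the current guest PC up in a translation
       cache, translate the basic block on a miss, run the translated code,
       repeat. Ahead-of-time translation only changes how often the miss
       path is taken; the loop and its glue are still there. */
    #include <stdint.h>

    struct cpu_state;                                /* emulated x86 registers/flags */

    /* Provided elsewhere in this hypothetical translator. */
    void *tcache_lookup(uint64_t guest_pc);          /* NULL on a cache miss    */
    void *translate_basic_block(uint64_t guest_pc);  /* emit ARM code, cache it */

    /* Each translated block runs until its terminating branch and returns
       the next guest PC -- the "glue" wrapped around the straight translation. */
    typedef uint64_t (*translated_block_fn)(struct cpu_state *);

    void run_guest(struct cpu_state *cpu, uint64_t entry_pc) {
        uint64_t pc = entry_pc;
        for (;;) {
            void *host = tcache_lookup(pc);
            if (!host)
                host = translate_basic_block(pc);    /* cold path */
            pc = ((translated_block_fn)host)(cpu);   /* hot path: native ARM */
        }
    }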
If you want to look for reasons why Rosetta 2 is faster than other emulators, look for places where Apple cheated and made their CPUs implement x86 things like its memory consistency model. That can have massive gains. I bet if you port a decent JIT-based emulator to use that feature on M1, and compare it to Rosetta 2 for number crunching inner loops and such, you'll find you can get very similar performance numbers out of it once the JIT cache is warm.
It'll be interesting when people take a deep dive into specific things Rosetta 2 does.
FWIW Chris Randall from Audio Damage posted a quick note saying performance of plugins under Rosetta was basically comparable with Intel: https://www.audiodamage.com/blogs/news/a-quick-note-about-ap...
I've read tweets from other plugin developers saying similar things, so preliminary feedback seems quite positive!
Either he's talking about AUv3 specifically, or the hosts he tested already are doing out-of-process wrapping, or Rosetta 2 is actually magic (AFAICT this isn't a generally solvable problem at that layer), or he's confused.
They'd still have to be run under Rosetta 2 (because programs can write code and branch to it), but a lot of the computation could be done once rather than every time.
Even if that means the users have to retranslate, that's still essentially "free" (to Apple) distributed compute.
What really matters is that it's using about 1/3 the power of a comparable Intel chip while doing it, and that Intel routinely has a 60%+ quarterly profit margin (probably much higher on PC chips) that Apple now gets to keep.
They could in theory move production to the U.S.; however, without the ecosystem of talent, suppliers, and partners that Taiwan or China has, it will be hard.
They already spent 10+ years improving their iPad chips, so switching made sense, especially given Intel's lack of a reasonable roadmap. The effort to take over TSMC's part of the chain does not have the same value given the risks today.
Are you confusing this with Foxconn or the other assembly partners that Apple uses? For Apple to roll out their own fab infrastructure would be a massive undertaking with very little benefit.
Honestly, if Apple or Google or any other company with top-tier engineering wanted to get into the game and were serious about it, they could be competitive in a < 10-year time frame, and easily be lapping the competition past that.
It could be dramatically more expensive to make an M1 chip, and Apple is making the trade-off for performance and walled-garden security.
Unlikely, but possible
But Apple didn’t switch chips for the cost. Just like the PPC to Intel switch, they did it for the performance and efficiency. More efficient means smaller, thinner, less battery, etc... more design flexibility. That and the fact that they fully control their release cycle is also a nice bonus.
Please also note that a modern non-M1 Mac already included TSMC silicon (the T2, roughly A10-class in complexity, albeit on a larger node) that's no longer present.
Me personally? I have no way of knowing or estimating those costs either way. But that was how I interpreted the parent comment.
If you allow that R&D costs for the M1 are largely aligned with the A* series for iPad that they would have already been paying (from your comment), then you're left with the manufacturing costs.
For Intel, they have: 1) chip design R&D, 2) fab method R&D, 3) fab costs, 4) sales/marketing/profit margin
By switching to their own chips, Apple has absorbed #1 and #4, and left #2 and #3 to be outsourced to TSMC. If the main costs for #1 are duplicated from the iPad chip, then you're left with fab costs (and fab R&D).
But, again, even with the likely cost benefits of making your own chips (and avoiding the Intel profit margin), I doubt that was the deciding factor in switching. The costs probably made the switch possible, but wouldn't have been the primary factor. Apple would gladly pay more per chip if it was more efficient (heat and power) and if they could control the release schedule. Even if their total costs for a single M1 chip are more than for an Intel chip, they probably still would have made the switch.
Apple has only taken over the design of the chip, not the manufacturing; Intel was doing both for Apple. Nobody has the data to know how the manufacturing costs have changed.
There is a real chance that total costs for Apple could have increased overall, and there is no way to know. Intel was in a difficult position over the Apple deal for years; Apple would have negotiated down significantly given their threat to move.
For all we know, Intel was selling at a loss per chip to Apple and making it up in other business.
The vertical integration of it all is frightening, though. It hasn't ended well in many industries. You need competition: being on one ARM design and one GPU means all that fierce CPU and GPU competition between Intel, AMD, and Nvidia becomes just figures on a page. You don't get the insight and innovation that comes from seeing how good or bad the other side is.
There are some limitations: lack of PCI-E lanes, lack of off-package RAM support, lack of user-replaceable RAM support, lack of discrete graphics support, lack of Thunderbolt 4 support. These are all essentially non-performance features of the chips that need to be developed to bring them up to feature parity with the Intel chips.
This generation happens to have carefully bypassed the need for those features, a smart move by Apple for the first release, but later releases for the higher end MacBook Pros, the iMac/iMac Pro, and the Mac Pro, will need most/all of these features, and those will likely take significant time to work into their architecture.
Purely from a CPU/SoC units-shipped perspective, Apple currently ships more CPUs than Intel, both on the leading edge and in total volume.
I couldn’t find hard numbers, but it seems that it isn’t a given that, adding all CPU power globally, there is more CPU power in servers than in smartphones.
What made TSMC so successful? Is it primarily thanks to their business strategy? Or did Intel do something so wrong they tumbled down this far?
I know that Intel's 10nm is closer to TSMC's 7nm, but still, their competition is coming up with interesting and relevant technologies while Intel is like a junkyard of half-baked ideas. 5G modems? Arduino competitor? Vaporware GPUs since Larrabee? Claims of dominance in NN accelerators with nothing solid? Nirvana? Optane 600 series garbage SSDs? Stupid desktop computing form factor ideas? I can go on...
I don’t hate Intel or root for any other company. I’m just trying to understand how incompetence like this happens in companies
Basically, they are riding the new wave of cheap devices that outnumber x86 devices by 10x-20x.
Basically everything uses an ARM CPU these days, not just tablets and phones, but microwaves, TVs, projectors, refrigerators, ovens, 3D printers...
That makes those devices extremely cheap at volume and makes innovation happen faster than it could at a single company like Intel, which was not interested in those low-margin products.
Intel is far from incompetent; they just decided to take advantage of their monopoly position to reap the biggest profits and margins they could for the longest possible time, instead of cannibalizing themselves with lower margins.
And it was great for them. Their executives have done great. They have just ruled the semiconductor industry and wanted to enjoy it.
When the iPhone launched, Steve Ballmer laughed at the price and pushed a $99 competitor with MS software. The phone market was very, very different before the iPhone got real traction.
Most of these things are not on bleeding edge 5nm or 7nm process, though. Most microcontrollers are more like 90nm (e.g. STM32 up to F7 is 90nm; STM32H7 is 40nm... many smaller micros of the M0 variety are even 180nm...)
Basically, if you're not video processing, or a real computing device, or something power sensitive, 90nm is still a pretty sweet place to be-- ~$300k for a mask set, easy to have 5V tolerant I/O if that's something you need, high likelihood of common I/O and core voltage, &c.
discussed recently: https://news.ycombinator.com/item?id=25092721
Also, Intel declined to make a CPU for the iPhone, dropped their own ARM line, and didn't get into fabbing for other companies when they still had a large lead in fab technology.
I do get the impression Intel was far too happy selling x86 chips, which worked and gave them lots of revenue until they got stuck on the 10nm process, all while TSMC grew into the power it is today.
Funny how 15 years out it's the exact opposite now.
That sounds exactly like an incompetent strategy: being lazy and ignoring possible competitors.
They are in fact one player among many. Their strategic advantages have evaporated. The entire mobile space passed them by and now AMD is seriously threatening their x86 business. Something has clearly gone very wrong at Intel. Shareholders can't be happy about that.
True, but only a small minority of companies that went into that space made money on it. Any Intel mobile division could very easily have been the next Blackberry or Nokia. Staying out of that fight may well have been the best decision for their shareholders.
> AMD is seriously threatening their x86 business.
If they lose the profitable server segment to AMD then that's a serious problem, I'd agree with that. That's far from settled though.
(On the other hand, I've often pointed out that Intel's attempts to develop two microarchitectures in parallel have always failed in the long run, with one project ending up woefully uncompetitive.)
Some friends and I were BSing about the "pro" level parts: if you can graft 2 or 4 M1s together, use off-chip RAM, and then treat that onboard 16GB like cache? We're talking about some game-changing stuff.
If Apple integrates two more memory chips, it'll be able to power a pretty solid desktop or laptop.
On the performance, Rosetta is most likely doing JIT so that most of the time it's running native ARM code. It did this with PPC binaries and DEC had it for Alpha.
It is not on-chip memory; the dies are separate, they're just in the same package. They seem to use standard LPDDR4 connectivity, so I don't think it's actually faster. The "unified" bit seems to matter more: having a single address space for both CPU & GPU, but this is pure speculation. I don't know if AMD or Intel APUs do this too.
As a result, you would be able to drive higher bandwidth because you don't have to be as conservative with signal transfer times.
They may have been one of the earliest investors in EUV (along with TSMC, by the way), but in terms of adoption and roadmap they have been way behind both TSMC and Samsung. I don't know the exact numbers of machines but my educated guess is that TSMC and Samsung together probably have close to 10x the EUV wafer capacity compared to Intel. And have had it for much longer as well.
The problem Intel created for itself is that they have always had a very stubborn over-confidence in their own knowledge of process technology, and have driven tool manufacturers like ASML to work within Intel's constraints, instead of working together to alleviate them. Their hubris has bitten them now that EUV has become economically viable compared to Intel's process technology, which relies heavily on triple and quadruple patterning, and very little of Intel's 'old' process technology knowledge carries over to EUV.
TSMC has also had a lot of teething pains with EUV but they have been very determined to make it work, and that's paying off now.
There are currently zero EUV wafers from Intel, which means the answer to your question would be close to infinite.
Binary translation can work pretty well for user code, especially synthetic benchmarks.
Arguably Intel has been falling behind since the delays in replacing Haswell (so, last six years or so). It just hasn’t been particularly visible, as the ARM vendors simply don’t compete in the same spaces, until now.
Though, in what might be an early sign in retrospect, x86 phone chips, after a lacklustre launch, vanished without a trace some years back.
For example: these benchmarks show that under excessive load (as benchmarks do), the M1 is capable of out-performing Intel chips at something like 1/3 the energy usage. No matter how you slice it, that is an impressive achievement and shows what M1 could potentially do for the most perfectly optimized software.
Realistically, of course, software will not attain this. But having the upper bound lets software writers know what to aim for and when it's not worth pushing for extra performance.
I wasn't aware of this. Can you provide a link?
* Compress a 2.3MB file that fits in cache.
* Alter a 24MP JPEG (that's around 5MB filesize).
* Gaussian blur of 24MP JPEG
* Gumbo Parser of an HTML file then execute some stuff with duktape -- note: this parses HTML to a simple DOM -- nowhere close to actual rendering
* Text rendering of 1,700 words into a 12MP image
* Horizon detection of 9MP image (that's around 2MB)
* Image repaint of 1MP image (that's around 200KB)
* HDR image (4MP -- around 800KB) from 4 normal images
* Neural Net of tiny 224x224 images
* Navigate using a graph with 200k nodes and 450k edges.
* SQLite is between 0.5MB and 1.1MB depending on the compiler options. Maybe the dataset pushes it out of cache, but I wouldn't bet on them creating many millions of rows.
Not sure, but may not fit in cache
* Google's PDFium render. Not sure about the library itself, but a 200dpi map doesn't sound like anything worth mentioning
* Camera test gives very little in specifics, but with several steps and a handful of libraries, this probably overflows cache a little.
* Ray Tracing 3.6K triangles and 768x768 output. I'd put it elsewhere, but they could be using a huge number of rays (though I seriously doubt it)
Undoubtedly doesn't fit in cache
* Clang compiling 730 LOC (seriously?). Clang itself is pretty big and most likely doesn't fit in cache.
Zero actual details
* Speech Recognition
* N-Body Simulation
* Rigid Body Simulation
* Face Detection
* Structure from Motion
All in all, them saying these things are "real world" would be a huge overstatement at best. I don't see anything here to contradict Linus' assessment.
What your list shows is that caches are huge these days. Because 24MP JPEGs are very much a real world scenario when most cameras sold in the world have half that resolution.
But sure, go ahead and pick another industry standard benchmark of your choice.
Let's see how the M1 fares.
Also curious to see how Rosetta will work with x86 code whose instruction alignments can't be determined statically.
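A contrived illustration of why those boundaries can't always be found statically: the same bytes form different, equally valid instruction streams depending on the offset where decoding starts (example bytes made up for this purpose):

    /* The same bytes are a different, equally valid x86 instruction stream
       depending on where decoding starts. */
    static const unsigned char code[] = {
        0x05, 0x01, 0x00, 0x00, 0x00,   /* offset 0: add eax, 1 */
        0xC3                            /*           ret        */
    };
    /* Decoding the same buffer from offset 1 instead (32-bit mode):
         01 00   add [eax], eax
         00 00   add [eax], al
         C3      ret
       If a computed branch ever lands at offset 1, a static translator that
       only decoded from offset 0 never saw those instructions. */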
(I didn't know what you were talking about at first, had to work it out.)
I presume the exact terms of Apple's license agreement with ARM are not public, so who knows exactly what is in it. It might have different terms from what ARM offers in the general case.
Shortsighted, really, considering there weren't any other options for instruction sets for Apple to choose: all this stuff was signed before RISC-V was a thing, and PPC and MIPS had pretty high barriers to entry (lots of porting work, lacking SIMD-type instructions, and another migration for all Apple devs) and poor performance.
Is it just that “right now” we can’t run a windows VM but with some work by Apple/Microsoft/Parallels it can happen or is there some fundamental blocker here?
I thought there was an ARM version of Windows already?
The link does mention an early access program.
I remember this was promised a long time ago. I'm surprised it's not the case.
Parallels, like VMware, is depending on Microsoft here. If Microsoft doesn't play ball, there will be no Windows on Apple Silicon Macs. Currently there is no officially supported way of running Windows on ARM on M1 Macs, as the Windows license does not allow it.
Parallels is trying emulation on M1 but who knows how well that will work. https://www.parallels.com/blogs/parallels-desktop-apple-sili...
Not "officially", but you can get hold of it as an individual and install it on e.g. a raspberry pi relatively easily already. Running it in a VM might be doable.
Actual Windows 10 ARM too, not the restrictive IoT version that's officially supported on the pi.
It can only virtualize ARM, of course.
You’re going to need an ARM version of Docker, running ARM Linux kernels, and ARM userland Linux binaries.
In any case, I bet the M3 will be able to run rings around the current MacPro.
• Apple has the benefits of complete vertical integration, both on the hardware and software side.
• Neural Engine is essentially the equivalent of the Tensor Cores in NVIDIA's GPUs, but occupies at least 4x the equivalent die area (no public details on performance).
• NVIDIA doesn’t want to make their consumer GPUs too powerful on tensor operations in order to not cannibalise their 1000% markup ML cards.
It's almost a classic Intel: financial greed and financial engineering, combined with complacency from being ahead for so long.
Heck, AMD’s new top end card is tied with the 3090 - but $500 cheaper.
NVIDIA is reportedly scrambling to try and get back into TSMC who is going to make an example out of them.
If you want to talk TOPS, an A100 does up to 1.2 exa-ops INT8, and 2.4 exa-tops INT4. That is, an A100 is more than 1000 times more powerful than the M1 at inferencing, while also supporting up to FP32/FP64 weights, and a 3080 is more than 500 times more powerful than the M1 neural engine.
Now lets talk about actual processors instead of doing an analogy. 15cm per ns is more than enough to travel everywhere inside your CPU but the vast majority of logic is localized (usually signals stay in the same core). It's only awful if you go off package to DRAM or a second socket but then the budget is often higher than 1ns. Apple probably scores a lot of performance points here because the RAM is so close to the CPU.
> Compare eg, wind speed (particles) versus the speed of sound (field).
Huh? The speed of particles in the air is even faster than the speed of sound.
Getting a better result via Rosetta 2 on the single-threaded benchmark is very impressive. I assumed that their benchmark included some vector instructions (which as far as I understand Rosetta 2 does not emulate) so this would mean that these higher numbers are for general-purpose instructions vs vector. That said, I can't find any reference to AVX or SSE in the "Geekbench 5 CPU Workloads" document, although the one for Geekbench 4 does mention them. It would be interesting to see numbers for Geekbench 4 on M1, if it runs via Rosetta at all. I would imagine that the tool is able to detect what is supported to run optimized code for each CPU being evaluated.
If indeed v5 tests are not particularly about specialized performance but more about real-world use cases (e.g. PDF rendering, SQLite, image compression as others have mentioned) then running Geekbench v4 on M1+Rosetta would give a comparison of emulated x86 without vector instructions versus native x86 with all of its modern capabilities. Now if the M1 wins that…
A comparison like this would have demonstrated this point more definitively than the many graphs Apple showed that were conspicuously lacking axis labels.
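For what it's worth, the runtime detection mentioned above usually looks something like this on x86 (a sketch using the GCC/Clang builtins; I'm not claiming this is what Geekbench actually does):

    /* Pick a vector code path at run time and fall back to scalar code when
       the extension isn't available, e.g. under a translator without AVX.
       The kernels here are placeholders. */
    #include <stdio.h>

    static void run_avx2_kernel(void)   { puts("AVX2 path"); }
    static void run_sse42_kernel(void)  { puts("SSE4.2 path"); }
    static void run_scalar_kernel(void) { puts("scalar path"); }

    int main(void) {
    #if defined(__x86_64__) || defined(__i386__)
        __builtin_cpu_init();
        if (__builtin_cpu_supports("avx2"))
            run_avx2_kernel();
        else if (__builtin_cpu_supports("sse4.2"))
            run_sse42_kernel();
        else
    #endif
            run_scalar_kernel();
        return 0;
    }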
However, the cynical take is that Apple is having difficulty getting developers on board. Apple is notorious for not encouraging game developers to be on OS X, to the point that the common refrain in the Mac community has been to "buy a different system to game on". Many people don't have that luxury, or, like me, don't want two systems on their desk when one should be sufficient.
The M1 debut was notable for one reason too many overlooked: they had very few developers showing off their wares, and most of those they did show are not well known.
Another good source of info is the article over at AnandTech. Apple is doing a lot of stuff that Intel can't because of x86. And also a bunch of stuff that neither Intel nor AMD want to do.
Also don’t forget that Apple is on TSMC 5nm while Intel is for the most part on Intel 14nm which is more or less equivalent to TSMC 10nm.
Conway's Law playing out, really.
* Decoding an x86 instruction takes a ridiculous amount of resources. Can't be optimized away because of backwards compatibility.
* Limited to small pages without jumping through weird OS hoops.
- Huge pages don't exactly require weird OS hoops, although I agree the 4KB→2MB→1GB page sizes are inconvenient.
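For example, on Linux a large allocation can opt into 2MB transparent huge pages with a single madvise() hint, no special filesystems or boot flags required (a minimal sketch):

    /* Ask for transparent huge pages on a large anonymous mapping. */
    #include <string.h>
    #include <sys/mman.h>

    #define ALLOC_SIZE (64UL << 20)     /* 64 MB */

    int main(void) {
        void *buf = mmap(NULL, ALLOC_SIZE, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED)
            return 1;

        madvise(buf, ALLOC_SIZE, MADV_HUGEPAGE);  /* hint: back with huge pages */

        memset(buf, 0, ALLOC_SIZE);     /* touch it so pages get populated */
        munmap(buf, ALLOC_SIZE);
        return 0;
    }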
Since Apple's chip does not include this backwards compatibility, they essentially have to do it in software when emulating. They do this with a combination of recompilation at install time and some emulation at runtime. I suspect the compilation step is one of the main contributors to this boost in speed and efficiency. Windows could do the same: optimize the executable to better fit the underlying chip.
It's insane how quickly they fell behind.
I mean, 2022 for their first 7nm chips to hit market? That's crazy. Zen 3 hit the market two weeks ago.
If Apple continue like this, x86 machines won't be able to compete. I wonder if other vendors are looking at this and contemplating transitioning to ARM as well.
It's a Chromebook/Android Tablet hybrid for $300. It's now my primary ereader and "laptop" when I'm away from home since it has decent Android and Linux support.
Chromebooks keep being overpriced for what they offer.
Perhaps, but with what hardware? The PC era was defined by mostly open hardware that had a high degree of interoperability. ARM does not bring the same.
The only reason why ARM seems more closed is that you primarily interact with ARM chips on highly restrictive platforms. The M1 chip will go a long way towards changing that perception, as consumers will now be able to use an ARM-based platform as they normally would a regular computer. But that's not the only way; you can buy a wide array of ARM-based devices with recent-generation ARM designs and very flexible IO.
Also until you can run non-linux "desktop software" on one, it will always feel like a "tablet"
Pretty much all ARM application processors use soldered BGA packages over sockets, and there also isn't the same tradition of defining a standard pinout for a generation or two (or five) of chips like there has been for x86 (e.g. LGA1151, AM4, etc.).
You can't really design a motherboard without this, so each design tends to have a custom breakout board or carrier card that's designed to be used in relatively specific installations.
The closest analogue to what you described in the ARM world would be plugging either a Computer-on-Module or System-on-Chip into a carrier board. Today's ARM deployments are presently largely focused on energy efficiency, so they make heavier use of integration than typical x86 deployments.
> Also until you can run non-linux "desktop software" on one, it will always feel like a "tablet"
There are millions of linux users around the world that would challenge your assertion that desktop linux doesn't qualify as anything beyond a tablet. In any event, Windows already runs on ARM and some rapscallions have already got it up and running on a Raspberry Pi despite Microsoft's present prohibition on doing so. I'm sure Microsoft will free up the licensing in due time as market demand presents itself (if only to allow interoperability with M1 Mac VMs).
Until you can build your own general-purpose PC and can decide to make it ARM, it will not be considered an equivalent option, it will just be another tablet, “chromebook” or “board for hackers”
You can build your own general-purpose PC with ARM today. It's not done regularly because ARM chips have historically tended to be less powerful, so the type of person that went out and built their own PC wouldn't want to build an ARM PC. This is now rapidly changing, given that chips like Apple's M1 and the upcoming Arm Cortex-X1 are surpassing their x86 competitors in performance-oriented benchmarks.
Once Microsoft gets their act together with their own version of Rosetta 2 and we reach the X2 generation, you'll start to see a rapid shift towards ARM desktops with X1 chips in the mid-tier segment. Eventually the premium "gamer" tier will follow, as M1 has proven that it's possible to create an ARM SoC that surpasses top-tier x86 single-threaded performance at a fraction of the power budget and with mountains of thermal headroom. This will be accelerated by the fact that Nvidia is purchasing ARM and now has a massive incentive to get Nvidia discrete graphics cards paired up with ARM SoCs.
Most ARM SoCs are the opposite of open - they're proprietary to the max and don't even support an open boot process.
> The M1 chip will go a long way towards changing that perception as consumers
The M1 chip will change nothing w.r.t. the perception of "openness" - consumers don't care about the CPU inside their devices.
What consumers see is a machine that'll last longer on battery, is much quieter and performs outstandingly well.
Openness is important to developers and enthusiasts who want to use the hardware as more than just a commodity.
Apple Silicon will kill the ability to upgrade your Apple device - no more RAM expansion slots, no more external GPU support, no more upgrading of internal storage, no more 3rd party repairs.
Apple Silicon turns laptops and Macs into consoles: closed ecosystems in terms of software and hardware. I wouldn't call that "openness" at all.
The RK3399 inside the Pinebook Pro is also not exactly "open" from a hardware point of view: still no open source drivers for the GPU for example (other than reverse-engineered volunteer efforts).
It would seem that ARM has a much higher potential performance ceiling than the already heavily optimized x86 and its derivatives, so once we have enough viable ARM options for desktops or high-performance laptops, the industry won't have a choice but to move along with it.
If you're building software and your competitor releases ARM support and suddenly their product performs better than yours, users will go to them.
If videogame makers realize that they can push more performance on ARM, they will start releasing ARM-optimizing games or even shift to ARM-first in time.
That said, I seriously doubt that the ISA in and of itself is the reason the chips are slower. After all, underneath is a state-of-the-art microarchitecture. Until I see an independent, truly cross-platform benchmark, e.g. in dotnet, that is significantly faster and more efficient, I remain skeptical of such incredible claims.
Mine is a delight.
They could even do it like the MacPro and couple it with some whackadoodle system for connecting modules (like Raspberry Pi hats) without having to worry too much about how exactly they'll integrate.
Wonder what they'll come up with for the full-port 13", the 16", and the iMac.
The top-of-the-line i9 MacBook really doesn't cut it for a lot of professional work, especially when it thermally throttles before the work can even begin.
The AMD64 "emulation" on the other hand is brand new.
I was thinking that they'd probably add custom instructions for x86 emulation, I haven't heard if that's the case or not. Would be interesting to find out. Given the complexity of a modern x86-64 frontend, it was never that far-fetched to think that a combination of world-class JIT and a bit of custom silicon could have a negligible overhead compared to a pure silicon solution. After that it was mostly a question of how good the CPU core and technology node was. And Anandtech, among others, had already shown they were well on track to beating Intel there.
The real question is… how long does Rosetta 2 stick around? Rosetta 1 only lasted OSX 10.4 - 10.6.