The ARM vs x86 Wars Have Begun: Power Analysis of Atom, Krait, and Cortex A15 (anandtech.com)
80 points by lispython on Jan 4, 2013 | 51 comments



This is a fascinating article both in what it says and in what it doesn't say. If the future really is low-powered SoCs, then Intel is in a world of hurt. The reason is that if Intel can only match ARM's power/performance in their chips, then it's a toss-up for the manufacturer in terms of user-visible impact, and that makes cost and/or differentiation the next selector in the 'whose do we buy' tree.

If Intel has to go to the mat on price it really kills their business model. One of the amazing things about Intel over the last 20 years has been that their part sells for up to 30% of the total cost of all parts in the system. What that means is that if you get the CPU for half as much, you save 15% on your total parts cost. That is a huge savings. But if Intel has to cut their margins to get sales, they lose a lot of R&D dollars and their legendary advantage in fabrication is suddenly underfunded. That is a huge problem for them. Intel has to win this fight or they will have to radically change the way they have structured their business.

The second part is differentiation. Intel simply doesn't license the x86 architecture any more; they got out of that when competitors were out executing them in architecture design. What that means is that you really can't add an instruction if you're an Apple or a Cisco or what not to your CPU to make your particular use of the part more efficient. But with ARM you can. If you are an instruction set licensee you can make internal changes and keep the compatibility high. Apple just demonstrated with the A5x that this was something they were willing to do. There is no relationship with Intel where that sort of option would be possible.

So if Intel can only match ARM on its performance and power curves, they lose. They have to be either 50% faster at the same power levels on the same workload, or 50% more efficient at the same performance. +/- 10% or so isn't going to cut it when they are commanding a huge chunk of the parts cost.


On the contrary, intel made significant changes to their CPUs to court apple's business: the low-power Core CPU was started specifically to enable the MBA, as was the focus on a powerful integrated graphics chip for laptop CPUs. Yes, you need to be a major player to have this kind of relationship, but the same goes if you want to design your own ARM CPUs.

Intel's plan continues to be to make up 30%+ of system cost -- but they don't do this by simply providing a powerful CPU -- they integrate more and more into the CPU silicon, and that is their mobile/atom strategy too (see digital baseband integration, etc). This is not abnormal, however; the battery, digitizer, etc. are all commodity parts now, screens are quickly getting there too, and the SoC is the only area that is so loosely bounded in terms of its improvement potential.

Things have rarely looked better for intel's fabs either: they almost totally dominate the desktop/laptop/server space, and have a serious prospect of significant expansion for the first time in a long time. There may be a small decline in PC sales, but the overall $ spent on processing systems (i.e. adding in phones and tablets) is rising a lot more.

I do agree with your broad point that intel will have to be noticeably better to get higher revenues for their chips, though. But if they get atom onto their leading CPU process (as planned), as well as out-of-order, then they are very likely to achieve this.


"On the contrary, intel made significant changes to their CPUs court apple's business: the low power core cpu was started specifically to enable the MBA, as well as the focus on a powerful integrated graphics chip for laptop CPUs."

Um, since the CPUs you are talking about came out years before the MBA did, that sounds more like Steve talking...

Also, intel GPUs are the most commonly used in the (x86) world and have been for a long time (since before apple even made products using x86 processors). How the idea or execution of including one in the CPU has anything to do with apple, I haven't got a clue.

I'd be more willing to "blame" the performance arms race of integrated GPU solutions on AMD buying ATi.


Qualcomm has already integrated pretty much everything and their SoCs are cheap, so the idea that Intel can charge a higher price in exchange for having more features sounds unlikely.


But the point stands that Apple can't add instructions to x86 like they can on their iOS devices.

The emphasis on a low power x86 core may have been driven in the short term by a desire to win Apple's business, but power consumption and heat dissipation are (literally) forces of nature that have been around forever. The increasing desire to reduce datacenter costs and increase server density and mobile battery life are very long term trends.


"On the contrary, intel made significant changes to their CPUs court apple's business"

Since Apple adopted Intel processors, Intel has removed NVIDIA's ability to build chipsets for Intel processors. Apple was using one of those chipsets when this happened. Given the choice between a newer NVIDIA chipset and Intel's integrated graphics, I would bet Apple would still be using NVIDIA.


> What that means is that you really can't add an instruction if you're an Apple or a Cisco or what not to your CPU to make your particular use of the part more efficient. But with ARM you can. If you are an instruction set licensee you can make internal changes and keep the compatibility high. Apple just demonstrated with the A5x that this was something they were willing to do. There is no relationship with Intel where that sort of option would be possible.

By licensing armv7s, Apple got to make the A6's CPU unbeholden to Samsung, Qualcomm, etc. They also got an armv7s-based processor out before anyone else by making their own.

Intel's fab advantage and CPU design expertise, plus their willingness to make custom parts for Apple (see small outline C2D for original Macbook Air), would make up for what design control Apple loses by not designing in house.


"Intel's fab advantage and CPU design expertise, plus their willingness to make custom parts for Apple (see small outline C2D for original Macbook Air), would make up for what design control Apple loses by not designing in house."

As I understand it the small outline C2D was simply a packaging option rather than a silicon change. And then Intel turned around and has been trying to get all of the 'other' manufacturers to make MBA clones aka Ultrabooks.

If Apple says, "Make us a Haswell chip where there is an extra GPU core, but don't let anyone but us use that core," that is a hard place for Intel to be. Now they invest time and energy in a silicon change, and an all-mask change at that (not just a metal-layer change), with the proviso that only Apple can use those extra transistors. The only way to amortize that cost is on Apple sales, and if Apple says "Gee, that hasn't been as successful as we'd like, never mind," Intel takes a hit.

When I was at Sun, Sun decided to start making its own CPUs (the SPARC architecture). Being able to tune their code and their process to the CPU gave them advantages similar to those of DEC, HP, and IBM, who all had in-house CPU capability. Previously Motorola was the chip maker, but Motorola wouldn't agree to 'Sun specific' changes, especially if it forced Motorola to support a specialized part just for Sun. Early on, Apple tried a different approach: a consortium with its own CPU architecture (PowerPC) that could make chips just for it. That must have shown them the advantages of that approach but dissuaded them from wanting to convince 'somebody else' when they wanted something done their way.

FWIW, I also think the fact that Samsung had that capability (making their own ARM chips) and was executing faster than Apple really pushed Apple into the space it is in today.

What that adds up to is Intel being frozen out of the market. Can't compete on price, can't compete on features, and can't compete on software base (any more). Bad news all around. It will be interesting to see if Intel decides to start making smartphones/tablets themselves.


"Apple just demonstrated with the A5x that this was something they were willing to do."

Is that right? I thought the two license types were for the full core designs (for those who just want to stamp out a pre-designed core) and an ISA license (for those who want to craft their own implementation). I was unaware having an ISA license entitled Apple to introduce new instructions, nor that they had added new instructions to the A6. (The A5x, AFAIK, is just a Cortex A9 at heart).


I was referring to news we read about the iPhone 5, like this coverage: http://arstechnica.com/apple/2012/09/apple-using-custom-arm-... which discussed changes Apple made to the architecture of their ARM SoC to achieve their goals. These are not changes they could have made to an Intel SoC. They could have asked for them, but if they got them, then so would everyone else.


It looks like the instruction set they implemented was ARMv7s, which is notable for its support of VFPv4 (hardware fused multiply-add and better support for single-precision vector floating-point arithmetic).

While Apple may certainly have been involved, as far as I know these are ARM technologies. In general, if you want a new instruction, I'd think you are better off going to a CISC vendor than a RISC one.


Putting Apple aside, an ISA license does allow adding instructions; see XScale's WMMX (even though it failed in the market).


> when competitors were out executing them

Missing a hyphen. I hope.


You'll notice that Intel isn't coming to Anandtech and offering to break open a bunch of android phones and compare them with the atom android port.

On windows Intel is benefitting from a kernel and compiler that have spent 15 years being optimized for their ISA. I am sure qualcomm worked very hard with microsoft getting RT out the door, but I would wager there is a lot more room left for optimization of krait on windows RT than there is for atom on windows 8.

Intel and Acer also have a lot of experience optimizing for the hard separation between the platform code and the OS, which typically involves a lot of cheating / second guessing. I'm pretty sure microsoft requires their arm platforms to support the traditional x86-style platform interfaces like UEFI and ACPI. Arm SoC manufacturers have traditionally benefitted from being able to do deep integration into the OS and exploit tons of manufacturer-specific optimizations. I highly doubt qualcomm enjoyed the same freedom with the NT kernel.

So while it's impressive to see atom operating with much better gating than it traditionally has had, I suspect that if you did the comparison on neutral ground, using gcc on linux, and let all the manufacturers do as much optimizing as they wanted, you'd see the arm systems improving their performance per watt significantly. Meanwhile the atom would be lucky to just tread water.


You can argue it both ways: on the one hand, windows benefits from the last 15 years of x86 optimization. On the other hand, arm is benefiting from a lifetime of embedded optimization. Intel has only relatively recently shifted their focus to low-wattage parts, and this demonstration on atom is not on their latest process.

What this benchmark shows is that the speculation about x86 being inherently inferior to ARM for low wattage is just that: speculation. x86 and ARM are very much in the same ballpark, and we can expect the competition to really heat up in the coming years.


There is no war to be had until Intel seriously considers operating on much lower margins. Their problem is not just idle power & heat dissipation; their real problem is cost per unit. Whatever Intel does going forward, their fat days are over.


Exactly, I think you're 100% right.

The real "war" is a price one. If Intel are willing to cannibalise their own per-CPU margins, they'll sell bucket loads of mobile chips.

If they're not, then they won't - it's not much harder than that to understand. I can see why they wouldn't be willing: once you get a relatively high-performance Atom chip, the use case for a highly profitable Xeon chip becomes much narrower.


Intel is producing Atoms on an ancient process, while they are one generation ahead of everybody else on the cutting edge. So this is Intel getting its feet wet, not Intel trying to compete. And on top of this, I am pretty sure there is a market for really high-powered smartphones above the current price range.

[Edit: reformulated last sentence]


I'm curious as to what applications will drive the need for more phone performance. I am pretty happy with the performance of my S3; if it were four times faster, I don't see how that would noticeably improve my user experience.


Actual desktop replacement. Think about a phone with specs similar to a three-year-old desktop. You could then carry your desktop around, do actual work wherever you find a reasonable screen and keyboard, and no longer need to sync several devices.

Apart from this, any real augmented reality application can easily burn any excess computing power you may carry.


My only worry about having my phone be my entire computing infrastructure is it getting stolen. Not sure I'm up for that future possibility just yet.


Full disk encryption and, obviously, a serious backup solution should take care of most issues.

Using it more as a thin client would also minimize the risk.


I think I'd still want them to be more akin to Sun's original vision of network appliances, with the phone acting more as a session token to an external system and a thin client.

Just noting that the size and always-present aspect make security a different problem.


And yet Atom is on 32 nm while Tegra 3 is on 40 nm. And Intel has announced plans to catch Atom up to their latest process.


Indeed. Their business model needs changing. It's not just price; it's also the fact that they don't offer as many flavours of SoC, e.g. with different basebands or integrated encoders, to even compete closely with the ARM ecosystem. Technically they may now be able to compete; from a business perspective: not!


The graphs in this article border on unreadable. I don't know why you'd post graphs like that unsmoothed unless your goal was for your readers to ignore them completely.

The claims he makes about power consumption certainly seem interesting, but I don't really feel like I can take them at face value given how hard it is to read a lot of the graphs he's drawing conclusions from.

The general trend of Atom finishing benchmarks earlier without drawing much more power is pretty interesting, at least. I never would have guessed that Atom would be a winner here - it has such a bad reputation.


The bad reputation came from the low performance (in low-end Windows machines), and it looks like it will keep that reputation now that Cortex A15 is out.

From the power consumption point of view, it was completely unusable for a smartphone or tablet until recently, because when they made it, Atom had a 10W TDP, and for the past 5 years they've mainly tried to get that down to 2W while keeping the performance mostly unchanged. That performance used to be much higher than the high-end ARM chips of the time when it was released, but that's not the case anymore.


Another key problem with early Atom-based machines was the rest of the chipset (I/O controller and such) that went with them, which could draw a fair amount of power too. I presume this has been improved along with the CPU's power needs.


I believe one overlooked factor is the ability of a company to purchase a license to design ARM processors for their own particular usage scenarios, as Apple has done. I believe this gives ARM a distinct advantage over Intel. With Intel you have to wait until they release a processor and then integrate it into your design. I think the ARM way of doing licensing will be a boon for them.


Has anyone else noticed how odd these benchmarks seem?

They run SunSpider on different browsers in different OSes and use the results to compare processor performance per watt.

On the same OS, with the same browser, I can get larger variations in performance per watt from a single compiler flag (-ffast-math, -O2 vs. -O3, etc.). The same could be said of changing the kernel's scheduler. If we keep in mind that the browser, compiler, and OS are all different for these tests, how can we accept the results as anything but noise from those differences?

One striking example of this is that the new version of chrome on the nexus 10 could easily account for the performance difference in the kraken test. How can we use that data to compare the performance per watt of the processors?


I think Intel is in an interesting business position. Just when AMD is getting into serious trouble, raising possible antitrust issues, ARM steps up. However, ARM actually has the inferior architecture (for desktop/modern tablet use).[1] So Intel can avoid antitrust issues by pointing to a competitor who simply does not threaten their core revenue generators (desktops, notebooks [2]).

[1] I believe they still use in-order architecture. And certainly no one but Intel has a 22nm FinFET process running. [2] I am actually not as sure about servers.


A15 is out of order.


Thanks, I should have checked that. (The argument is probably about prefetch logic, but basically similar.)


Cortex A9 was out of order, too.

Ironically, Atom is still in-order.


A question for iOS devs out there: how tied to ARM is iOS Objective-C programming?

I ask because Android already supports Atom-based devices, and presumably Win8RT apps are easy to recompile for Intel - so it seems likely that it would be straightforward for any of the players in the mobile space to switch architectures on fairly short notice.


Objective-C was created in 1983.

http://en.wikipedia.org/wiki/Objective-C

Apple's operating systems and NeXTSTEP have run on Intel, Motorola 68k, and PowerPC chips in the past. iOS is a subset of Mac OS, which currently runs on Intel.

In short, the CPU architecture choice is irrelevant.


You left out SPARC and PA-RISC. NeXTSTEP was probably the most cross-platform desktop OS ever sold.


As mentioned, Apple has plenty of experience switching architectures, so technically it's not a huge issue. Don't hold your breath, though; since Apple's going out of its way to build custom ARM cores now, it's safe to assume ARM (or, if anything, a custom arch sooner than Intel) is their future. Apple got burned when IBM couldn't keep up power/performance on PowerPC; leaving their fate entirely to a third party is a mistake they're unlikely to make again.


Not an iOS dev, so I cannot tell you how much inline assembler magic is in apps. But the operating system of course needs a fair amount of platform-specific code, as does the compiler. And in general quite a few optimizations are platform-specific. So even though Apple uses llvm, the swap is a lot more involved than a -march=x86-64 compiler flag. This is also true for Android and WinRT.


Almost no libraries out there have inline assembler magic. If you wrote a library with inline assembler, you'd have to write it for both arm and i386 so that developers could still use it in the simulator. There are very few people out there competent in both assembly languages who are also iOS developers. The need is also pretty low because if you want fast computations, you'd do them on the GPU. The only need might be some bit-banging code for interfacing with external hardware. So the only people who might need assembly are making complicated interface hardware.
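
To make that maintenance burden concrete, here's a minimal sketch of the dual-architecture pattern -- a hypothetical count-leading-zeros helper, not taken from any real library: inline asm for the ARM device build, a separate x86 path for the simulator, and a plain C fallback for everything else.

    #include <stdint.h>

    /* Hypothetical example: count leading zeros, with per-architecture
       inline asm for the device (ARM) and the simulator (x86), plus a
       portable C fallback for any other target. */
    static inline uint32_t my_clz32(uint32_t x)
    {
    #if defined(__arm__)
        uint32_t n;
        __asm__("clz %0, %1" : "=r"(n) : "r"(x));   /* ARM clz instruction */
        return n;
    #elif defined(__i386__) || defined(__x86_64__)
        if (x == 0) return 32;                      /* bsr is undefined for 0 */
        uint32_t idx;
        __asm__("bsr %1, %0" : "=r"(idx) : "r"(x)); /* index of highest set bit */
        return 31 - idx;
    #else
        uint32_t n = 0;                             /* generic C fallback */
        while (n < 32 && !(x & (0x80000000u >> n))) ++n;
        return n;
    #endif
    }

Every new target means another branch like this, which is exactly why so few libraries bother.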


There are plenty of cases where the cost and complexity of sending code to the GPU mean that you use processor intrinsics to speed up your code.

As far as I know, most people still do encoding/decoding of audio/images/video on the CPU, unless there's a hardware codec available. Audio DSP is mostly done on the CPU too, I think. Audio is a bit more demanding than graphics in terms of bit depth: the "highp" precision should be taken as a minimum for audio work, it's not always available in the fragment shader, and it's definitely not enough precision for audio production apps like GarageBand.


> The only need might be some bit-banging code for interfacing with external hardware.

Atomic primitives are another reason to do inline assembly, though since recent GCC (and probably LLVM - though I haven't looked into it) provides CPU-agnostic atomic ops this isn't as big a deal as it used to be. (C++11 also probably would help, though I'm not sure how the adoption has been in mobile SDKs.)

Pretty sure inline asm is also still common for SIMD ops, although the sane approach there would be to have a generic C implementation to fall back on if there is no optimized one present.
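
For reference, the CPU-agnostic route looks roughly like this -- a minimal sketch of a hypothetical counter using the __atomic built-ins that GCC 4.7+ and Clang provide; the compiler picks the right instruction sequence for the target (ldrex/strex loops on ARM, lock-prefixed ops on x86), so no inline asm is needed.

    #include <stdint.h>

    /* Hypothetical lock-free counter built on the compiler's portable
       __atomic built-ins instead of per-architecture inline assembly. */
    typedef struct { int32_t value; } counter_t;

    static inline int32_t counter_increment(counter_t *c)
    {
        /* Atomically adds 1 and returns the new value,
           with sequentially consistent ordering. */
        return __atomic_add_fetch(&c->value, 1, __ATOMIC_SEQ_CST);
    }

    static inline int32_t counter_read(counter_t *c)
    {
        return __atomic_load_n(&c->value, __ATOMIC_SEQ_CST);
    }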


Clang and GCC both have new atomic built-ins that can be used for implementing C11 or C++11 atomics. You can use them whether or not you have C++11 support.

But I don't see much inline asm for SIMD ops. Almost every piece of SIMD code I see either uses intrinsics or writes the whole function in assembly.
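
To illustrate the intrinsics-plus-generic-fallback style mentioned above, here is a minimal sketch of a hypothetical element-wise buffer add, assuming an ARM NEON target; the same shape works with SSE intrinsics on x86.

    #include <stddef.h>
    #ifdef __ARM_NEON__
    #include <arm_neon.h>
    #endif

    /* Hypothetical example: element-wise addition of two float buffers.
       The NEON path handles four floats per iteration; every other
       target (and the leftover tail) uses the plain scalar loop. */
    static void add_buffers(float *dst, const float *a, const float *b, size_t n)
    {
        size_t i = 0;
    #ifdef __ARM_NEON__
        for (; i + 4 <= n; i += 4) {
            float32x4_t va = vld1q_f32(a + i);
            float32x4_t vb = vld1q_f32(b + i);
            vst1q_f32(dst + i, vaddq_f32(va, vb));
        }
    #endif
        for (; i < n; ++i)          /* scalar tail / generic fallback */
            dst[i] = a[i] + b[i];
    }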


I doubt a manufacturer will switch chips just because one has better power consumption. There are lots of other power-hungry components in a mobile phone. I'm guessing the CPU might only account for 15% of the total power consumption, so a CPU with half the power draw would only cut total consumption by about 7%, and increase battery life by roughly that much.


Let's remember that Intel is sitting on a mountain of cash and, even more importantly, mountains of chip design talent and fab hardware. Intel can afford to play the long game. The interesting thing is how competitive Intel's first-gen x86 phone part has been.


Anand does an exceptional job of acting as a proxy for Intel's marketing department.


This comment would have a lot more force with specifics that show how Anand/Intel's numbers are wrong or misleading.


I think the parent post is too strong, but numbers are by no means the only way of showing a bias.

Whilst I believe he tries to be objective, Anand may hold a slight bias towards intel. Take the discussion of the future: he talks of the advancements of the Core architecture, comparing a yet-unseen 8W TDP[1] haswell, then talking confidently of halving that at 14nm. This is by no means a certainty; I think it is fair to say that would be a best-case scenario in which intel focused fully on reducing power and also had a very successful 14nm node.

On top of this, there is no mention of price in this equation; I can't see a 'core' CPU in a nexus tablet (too expensive for the price point) or an ipad (too expensive for apple's margins) if intel want to keep their margins.

I should reiterate that everyone has a bias, and if Anand's is slightly pro-intel, then it is surely propagated by the high level of access intel seem to give him (take for example the 'x86 power myth busted' article, where the atom was compared to an intel-cherry-picked (outdated) tegra 3), and by the fact that we have a very capable intel today. As said, I believe he still attempts to be objective, so I still think his journalism is some of the best, if not the best, in this area.

1. Keep in mind that TDP is not peak power (which is where the A15 would be close to 4W), but the sustained heat output that should be dissipated by whatever enclosure is used, for which there is no standard. This would put the A15 in the 1-2W range.


That's the kind of specifics I was looking for (thanks). I agree about numbers, and one doesn't have to answer with numbers if they can be brushed aside with specifics that offer proper perspective.

I'm pretty hardware-ignorant myself, so I'd be at the mercy of articles like this if it weren't for comments.

These are interesting times - the stakes seem high because of the way that some platforms are tied to hardware architectures (which is my real interest here).


Intel already demonstrated at IDF that they have Haswell running at 8W while giving the same frame rate as a 17W Ivy Bridge part when running the Unigine Heaven benchmark: http://www.anandtech.com/show/6262/intels-haswell-20x-lower-...


Four things determine the takeaway a reader will have when reading a benchmark:

a) What is measured

b) What the product is compared against

c) Which subset of features the reviewer focusses on

d) The reviewer's take on the design compromises specific to the particular product

Unfortunately all four of those are fairly subjective things, so it is hard to make a fact-based case against someone who is diligent with the measurements and excels at writing. That being said, I invite you to find any focus on intel's inability to ship working graphics drivers, its inability for the longest time to decode 1080p video with an integrated chipset, the early focus on multithreaded benchmarks when amd was shipping more (albeit slower) cores at the same price point, the lack of linux support for poulsbo chipsets, or any comment on intel's habit of gimping processors within the same family by disabling functionality (support for ECC, support for acceleration of virtual machines; reference - http://wiki.xensource.com/xenwiki/VTdHowTo ).



