X86's Days as a Consumer Microarchitecture are Numbered (plus.google.com)
97 points by andrewmunn | 78 comments

This is a largely vapid and meaningless prediction by someone who doesn't demonstrate anything but the most superficial knowledge of the microprocessor industry. Perhaps he knows something we don't, but as far as I can tell he's only extrapolating current market trends.

Obviously Intel (once led by Andy Grove, author of "Only the Paranoid Survive") is aware of the threat posed by ARM. If someone could explain how Intel will fail to meet the challenges of getting x86's performance-per-watt to match ARM's, and how this compares to the challenges ARM vendors face in order to get raw performance up to Intel's level, I would love to read it. However, this post offers little such insight.

One thing that could work against them is their cost structure. Intel is used to throwing a thousand or so engineers on each design. Having so many designers means they can squeeze out every last MHz, but it also means they need big margins and large volumes to recoup their costs.

The other big technical issue is the end of Dennard scaling. For most of the last three decades, scaling CMOS processes bought you three things: more transistors, higher frequency and lower power. Things are different now. We can't really scale frequency any more because we've run into the power wall. We used to get lower power at the same frequency by scaling the supply voltage, but this also required us to scale the threshold voltage (a device parameter). Unfortunately we can't scale the threshold voltage willy-nilly like in the past because leakage power increases for lower threshold voltages and is now a significant contributor to total power. We still get more transistors per unit area, but it's not clear whether the economic costs of building up new fabs and switching to a new process are offset by the benefits of having more transistors to play with.
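To make that concrete, here is a rough back-of-the-envelope sketch of why Vt can no longer be scaled "willy-nilly". Every constant is illustrative and not tied to any real process; the ~90mV-per-decade subthreshold behaviour is just the usual textbook figure.

    import math

    def dynamic_power(C, Vdd, f, activity=0.1):
        # Switching power: P_dyn ~ activity * C * Vdd^2 * f
        return activity * C * Vdd ** 2 * f

    def relative_leakage(Vt, n=1.5, Vtherm=0.026):
        # Subthreshold leakage goes as exp(-Vt / (n*kT/q)):
        # roughly one decade for every ~90 mV shaved off Vt.
        return math.exp(-Vt / (n * Vtherm))

    baseline = relative_leakage(0.40)
    for Vdd, Vt, f_ghz in [(1.2, 0.40, 2.0), (0.9, 0.30, 2.6), (0.7, 0.20, 3.3)]:
        p_dyn = dynamic_power(C=1e-9, Vdd=Vdd, f=f_ghz * 1e9)
        leak = relative_leakage(Vt) / baseline
        print("Vdd=%.1fV Vt=%.2fV  dynamic ~ %.2f (arb. units)  leakage ~ %.0fx baseline"
              % (Vdd, Vt, p_dyn, leak))

Dynamic power per gate drops only modestly as you scale Vdd and Vt together, while leakage balloons by a couple of orders of magnitude. That is the wall the paragraph above is describing.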

The bottom line is that it's not clear whether Intel's biggest competitive advantage, that of having a manufacturing process superior to everyone else's, is still that much of an advantage.

PS. One thing I find truly amazing is that Dennard predicted that we'd run into all these problems back in his landmark paper in 1974!

One thing that could work against them is their cost structure. Intel is used to throwing a thousand or so engineers on each design.

Exactly. It's also not obvious that the mobile chip market will pay a premium for Intel-calibre fabs. If it won't, then the question becomes whether a TSMC-made Atom is better than a TSMC-made ARM.

It's also fairly common to have custom hardware added to SoCs. Is Intel prepared to open up their processes to that sort of thing?

Unfortunately we can't scale the threshold voltage willy-nilly like in the past because leakage power increases for lower threshold voltages and is now a significant contributor to total power.

This one cuts both ways though. With leakage dominating active power, within a given node, the fabrication process will be relatively more important than microarchitecture, which is a point in Intel's favour.

This one cuts both ways though. With leakage dominating active power, within a given node, the fabrication process will be relatively more important than microarchitecture, which is a point in Intel's favour.

This is kinda nitpicking, but I'm not sure leakage will ever dominate active power. We still have the ability to reduce leakage if we want; we just have to give up frequency for it. In the past we didn't have to make this trade-off, but even now I don't think it ever makes sense to run your chip so fast that leakage is more than dynamic power.

I do agree that for any given node, Intel is still going to be ahead of the rest. It'll be interesting to see how much this helps them.

Leakage power is actually a very significant part of the total power usage [1] and one of the bigger reasons why Intel developed the tri-gate technology [2].

Active power is the one that's related to the frequency (P ~= CV^2f). Leakage power will "leak" even if the transistor is not switching.

1. http://www.eetimes.com/electronics-news/4215605/Leakage-powe...
2. http://realworldtech.com/page.cfm?ArticleID=RWT050511195446

Not sure what you mean by significant, but typical leakage power numbers are something like 15-30% of total power.

Maybe you're referring to some papers that came out a few years ago suggesting that leakage power will dominate total power. As I said above, this is unlikely to happen. It doesn't make sense to operate at a combination of supply voltage (Vdd) and threshold voltage (Vt) where leakage dominates total power. I think these papers overlooked the fact that threshold voltage, and hence leakage itself, is a knob that the device manufacturing folks can control.

Active power is the one that's related to the frequency (P ~= CV^2f). Leakage power will "leak" even if the transistor is not switching.

If you're implying that leakage power doesn't affect frequency, you are wrong. Transistor speed depends on the gate overdrive which, for modern velocity-saturated devices, is proportional to Vdd-Vt. Leakage power itself is proportional to exp(-Vt). There is a clear trade-off here between how fast you run your chip and how much it will leak.
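To put rough numbers on that knob, here's a quick sketch using the textbook alpha-power delay model. Every constant below is made up for illustration, not measured from any real device.

    import math

    ALPHA = 1.3              # velocity-saturation exponent (~1.3 for modern devices)
    N_VTHERM = 1.5 * 0.026   # n * kT/q at room temperature, in volts

    def relative_speed(Vdd, Vt):
        # Alpha-power law: gate delay ~ Vdd / (Vdd - Vt)**ALPHA, speed is its inverse
        return (Vdd - Vt) ** ALPHA / Vdd

    def relative_leakage(Vt):
        # Subthreshold leakage ~ exp(-Vt / (n*kT/q))
        return math.exp(-Vt / N_VTHERM)

    Vdd = 1.0
    ref_speed, ref_leak = relative_speed(Vdd, 0.35), relative_leakage(0.35)
    for Vt in (0.35, 0.30, 0.25, 0.20):
        print("Vt=%.2fV: %.2fx speed, %.0fx leakage"
              % (Vt, relative_speed(Vdd, Vt) / ref_speed, relative_leakage(Vt) / ref_leak))

Lowering Vt by 150mV buys you roughly 30% more speed at the cost of tens of times more leakage, which is exactly the trade-off described above.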

The papers I've seen point to values larger than 15~30% - I've seen ~50% cited for geometries as large as 65nm, and it only gets worse as we go to even smaller feature sizes. [1]

Threshold voltage is not really an effective knob, unless you assume that the feature size is a knob and go against Moore's law, or that a brand new, once-in-ten-years process innovation is a knob that designers can pick out of a hat. I don't think anyone's clamoring for a return to 130nm parts on a smartphone. At each new process node, you're going to lose out on the amount of control you'll have over Vth.

This is basically what Intel did with the tri-gate transistors, which gives them a longer lease on life until they bump against subthreshold leakage. TSMC is on its first generation of high-k metal gates, and still a process node or two away from jumping over to the tri-gate party.

1. http://www.eetimes.com/design/eda-design/4211228/Overcoming-...

If you're referring to this graph [1], that comes from an ITRS prediction. These predictions seem to assume that we'll scale feature sizes while everything else stays the same, which of course is never the case. I wouldn't read too much into them. BTW, ITRS is famous for making ridiculous predictions, like that we'd be at 15GHz by 2011.

[1] http://www.eetimes.com/ContentEETimes/Images/Design/Prog%20L....

Threshold voltage is not really an effective knob

Why is it not an effective knob? Most modern designs include sleep transistors in an attempt to not leak when a circuit is inactive. These would not work unless we could engineer high-vt transistors.

One rule of thumb I've seen writers on Anandtech express several times is that any given microarchitecture can cover at most about one order of magnitude for power consumption. This leads to laptop/desktop-oriented microarchitectures that can scale at most from about 13W to 130W by tweaking clock speeds and voltage. More recently, the high end has dropped down to at most about 95W for non-server chips, but it still means you have to go back to the drawing board before you have a CPU that can work in tablets and smaller devices.
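One way to rationalize that rule of thumb (my own back-of-the-envelope model, not anything the AnandTech writers published): achievable frequency scales roughly with supply voltage, so power grows roughly with the cube of Vdd, and the usable voltage window of a single design caps the power range it can cover.

    # If frequency scales roughly with Vdd, then P ~ C * Vdd**2 * f ~ Vdd**3,
    # so the usable voltage window of one design caps the power range it spans.
    low_v, high_v = 0.7, 1.4   # plausible operating-voltage window, in volts
    power_span = (high_v / low_v) ** 3
    print("Vdd span %.1fx -> power span ~%.0fx" % (high_v / low_v, power_span))
    # ~8x from voltage/frequency alone; binning, core counts and turbo stretch it
    # to roughly the 13W-130W decade mentioned above.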

So far, Intel has shown that they aren't very good at simultaneously developing two parallel product lines of CPUs. Their tick/tock strategy of alternating process shrinks and microarchitecture updates has been working great for years, but Atom has clearly been neglected. Prior to that, they had the P4 NetBurst architecture and the P6-based Pentium Ms on the market at the same time, but NetBurst hit a wall and the company lost a lot of ground to the Athlon 64 before they could come up with a high-performance successor to the Pentium M.

95W / 10 = 9.5W, which is a problem, but with 1/4 the cores that's about 2.4W, which is reasonable.

>lost a lot of ground to the Athlon 64 before they could come up with a high-performance successor to the Pentium M.

You realize that Intel's EM64T is effectively AMD64, right? Intel, which loathed that cross-patent deal, is now reaping AMD's rewards. Core 2 and later series processors are all using AMD's intellectual property (legally).

Intel is most certainly not using AMD's architecture. They are using AMD's instruction set, yes, but the internals have nothing to do with AMD's designs. The Core series was an evolution of the Pentium M, which was based on the Pentium III architecture.

Yes. Intel has more than made up for the mess they were in circa 2003-2005, but the fact remains that AMD was able to truly embarrass Intel for quite some time, both by beating Intel to market with several new technologies, and by significantly eroding Intel's market share for desktop and server chips.

Hello, I'm the author of this article. I do not have a degree in electrical or computer engineering. I'm merely stating the trends I've seen in the PC industry over the last few years.

I am, however, a Software Engineer. I know that most of the perceived lag on a modern desktop is not due to the CPU, but inefficient I/O to the hard disk or network. One must only look at the iPad 2 to see that it's very possible to make a fast computer with beautiful 60 FPS animations and snappy applications using only an 800 MHz dual core ARM CPU. Ironically, my iPad feels way faster than my Macbook Pro most of the time.

You don't need to be an expert in the microprocessor industry to know that the CPU performance race is over. It's all about power consumption now, and X86 fails miserably at low-power computing. Unless you know something I don't.

X86 fails miserably at low-power computing

x86 currently doesn't scale down to the level required for smartphones.

However, it is getting close in the tablet space. Estimates for the Tegra 3 are around 3-4W TDP [1], while the Cedar Trail Atoms are around 5.5W TDP [2]. In early 2012 Intel will release their Medfield Atom chips, which will make the competition even more interesting.

[1] http://semiaccurate.com/forums/showthread.php?t=4169

[2] http://www.extremetech.com/computing/94184-early-cedar-trail...

>"Unless you know something I don't."

Based upon history, reports of the x86's death have been greatly exaggerated - since the late 1980s.

Here's a nice 1999 article from Ars: [http://arstechnica.com/cpu/4q99/risc-cisc/rvc-1.html]

and the archive.org version for those without IE4 or Netscape Navigator: [http://web.archive.org/web/19991129051550/http://arstechnica...]

Floating-point operations are an example of a hurdle faced by RISC processors such as ARM - RISC ideology suggests that dedicated FPU hardware and instructions should not be used, despite the performance hit that software implementations incur.

On the other hand, the x86 CISC approach has allowed for increased integration based on changing market demands over the past 20 years (e.g. FPU integration with the 80486 in 1989 and MMX in 1996 on the Pentium).

That sort of flexibility has advantages.

I just want to point out that the RISC/CISC distinction is anachronistic; it doesn't really apply anymore in the desktop and server world. Intel's x86 processors, for example, execute RISC-like micro-instructions behind a CISC interface, effectively blending both. It's what allowed them to race ahead of all competitors in the first place.

edit: My apologies, I couldn't access the cited Ars Technica article.

The linked Ars Technica article in fact agrees with you, but retains the term to describe competing design philosophies.

I know that most of the perceived lag on a modern desktop is not due to the CPU, but inefficient I/O to the hard disk or network

Most perceived lag on a modern desktop comes from excessive abstraction which results in poor coding practices. You could certainly argue that IO bottlenecks or a lack of system resources will have an impact, but that impact won't be realized until the environment is somewhat saturated. A simple solution to the hard drive bottleneck is to throw a SATA3 SSD in there instead, or to give a system more RAM to boost disk caching; problem solved. On the other hand, no amount of system resources will alleviate a performance hit caused by shoddy coding. This is the reason that I refuse to use Google Docs: the performance is about as good as WordPerfect on Windows 95 because of all the abstraction insanity.

One must only look at the iPad 2 to see that it's very possible to make a fast computer with beautiful 60 FPS animations and snappy applications using only an 800 MHz dual core ARM CPU.

The iPad 2 is about as powerful as my Pentium 4 was back in the early 2000s. Shrinking it down to that level is certainly an accomplishment but it's not worth the shock and awe that you present it to be. It's nice to have a device such as the iPad 2 to fill the time when you wish you had a computer but it is in no way a full desktop substitute.

Ironically, my iPad feels way faster than my Macbook Pro most of the time.

Your MacBook is a fundamentally different device than your iPad. They may feel similar, but this is purely superficial; the underlying operations are vastly different. If your MacBook is that sluggish, it's either because you're using an Apple product or you've got a PEBKAC error.

You don't need to be an expert in the microprocessor industry to know that the CPU performance race is over

Yes you do. The CPU performance race has been over for the past 5 years, but not for the reason you think it is. The CPU performance race is over because AMD choked and threw in the towel. In 2007 AMD's flagship Phenom processor was bested by Intel's then worst-in-class Core 2 Quad Q6600 in almost every benchmark (if not every benchmark). In 2011 AMD's flagship octal-core Bulldozer processor was beaten by Intel's worst-in-class quad-core i7 920 from two years earlier, which also had the added handicap of only having 2 of its 3 memory channels loaded with DIMMs. Don't blame AMD's failures on the market, or Intel; blame them on AMD.

The fact that the CPU performance race is over doesn't mean that Intel has won, it merely means that Intel is the only competitor since AMD is effectively now a non-contender. It also doesn't mean that there is room in the desktop market for ARM CPUs, or that desktop hardware manufacturers are suddenly going to start writing drivers for two completely different architectures.

While it is certainly true that ARM is gaining on Intel in the performance space, it is still a long, long way behind, and that gap is only going to get harder and harder to close as time goes on. This is going to be doubly difficult when ARM manufacturers try to catch up to Intel in the general purpose execution department. It's easy enough to say that ARM has a lead in performance per watt if you ignore all of the special hardware capabilities that Intel CPUs have which are mostly absent on ARM, or if you forget that power consumption scales roughly with the square of voltage and that voltage is necessary to maintain a higher frequency.

It's all about power consumption now, and X86 fails miserably at low-power computing. Unless you know something I don't.

I do know something you don't. Architectures aren't designed to scale infinitely in both directions on the power scale, yet Intel still manages to operate dual-core, full-featured processors in the 17-watt range that will still destroy any dual- or quad-core ARM processor that gets put up against them. Also, I'm not sure how you can justify your statement "it's all about power consumption", because for 95% of the desktop market heat is a non-issue whereas a lack of performance certainly is. If you live in a datacenter the constant whine of fans and AC units can certainly get annoying, but as I mentioned above, there are already low-power solutions that can be had without reinventing the wheel.

It strikes me as funny how AMD couldn't compete in a market with Intel and maybe Via, but somehow they think they can compete in a market with 3+ strong competitors.

This feels like throwing the baby out with the bathwater.

Most perceived lag on a modern desktop comes from excessive abstraction which results in poor coding practices.

This is worthless without actual numbers, which I doubt you have. Hardware people blame software, software people blame hardware, as it has always been, so mote it be, amen.

Here, though, it is not about blaming software or hardware people.

Here is John Carmack talking about his troubles with the lack of PC performance due to the multitude of API layers needed to reach the hardware:

John Carmack: ... That's really been driven home by this past project by working at a very low level of the hardware on consoles and comparing that to these PCs that are true orders of magnitude more powerful than the PS3 or something, but struggle in many cases to keep up the same minimum latency. They have tons of bandwidth, they can render at many more multi-samples, multiple megapixels per screen, but to be able to go through the cycle and get feedback... “fence here, update this here, and draw them there...” it struggles to get that done in 16ms, and that is frustrating.

Later in the article John expands on the thick software problem.

The article is here: http://pcper.com/reviews/Editorial/John-Carmack-Interview-GP...

That quote is a bit out of context. The paragraph starts "I don't worry about the GPU hardware at all. I worry about the drivers a lot...". He's talking specifically about GPU performance.

If you want numbers, try comparing the stack depth in a modern application's event handler to those from 10 years ago. Qt4 alone, for example, routinely approaches 50 calls deep just to update a canvas in response to a mouse event. Add to that a dozen or more layers between the compositing manager, window manager, X, display driver, and the kernel, and the end-to-end latency climbs through the roof.

I hope that the end of ever-higher clock speeds will make it viable again to optimize for code performance instead of for developer time.

Indirectly related: the size of current systems. A typical desktop system is written in about 200 million lines of code (about 10K books, or a library). http://vpri.org/ (co-founded by Alan Kay) is trying to make a roughly equivalent system in 20K LOC, or about one single book. And it looks like they can do it (5 years into the project, 1 more year to go).

Let's say it is possible. That would mean current systems are about ten thousand times bigger than they could be. That's 4 orders of magnitude. And even if it isn't 4 full orders of magnitude, I'm willing to bet on 3.

It is not yet about raw speed, or latency. But when a system is at least 3 orders of magnitude bigger than it could be, it does mean that something there is vastly suboptimal. And runtime performance could very well be part of that "something".

Yes, but is that 20K LOC system equivalent in functionality to the larger systems? In every respect, and not just the ones you happen to care about?

Just the ones they happen to care about. I don't think it matters such a great deal, however: people tend to care about the same things. Feature creep is when you want to fully satisfy everyone, a few people at a time. Plus, if you want your missing feature, you can code it. I mean, you really can. Many components of that system don't run to more than 1K LOC; they really are accessible.

But that's kind of a straw man. Even if you convince me that feature creep really is valuable, lack of features explains but 1 order of magnitude out of 4. There are still 3 to go. I have two explanations for those.

First, they reuse their code. A lot. When they write a compiler, all phases (parsing, AST to intermediate language, optimizations, code generation) are done with the same tool (augmented Parsing Expression Grammars; search for the OMeta language for more details). When they draw something on the screen, be it a window frame, a drawing, or text, they again use a single piece of code. Mere factorization goes a long way. I'd say it explains about 1 order of magnitude as well.

Second, their use of specialized languages yields astonishing results: they can build a self-implementing compilation system in about 1000 lines (including a bunch of optimizations). 200 more lines get you a reasonably efficient implementation of Javascript, 200 more get you Prolog, and a couple hundred more can get you about any DSL you may want (external DSLs, not your average Ruby/Haskell combinator library). They implemented an equivalent of Cairo in 457 lines, which is about 100 times smaller (and quite efficient to boot, but that was a surprise bonus). They did a TCP/IP stack in about 160 lines, which again is about 100 times smaller than a typical C implementation. And they did all that with specialized languages that themselves are implemented in very little code. Based on that, I'd say their use of domain-specific languages explains about 2 orders of magnitude. (Don't take my word for it. See their last progress report here: http://www.vpri.org/pdf/tr2011004_steps11.pdf )

To sum up, we could argue that current systems are about 4 orders of magnitude too big. Of the 4, 1 may be debatable (lots of features). Another (not reusing and factorizing code) is obviously something that has Gone Wrong™ (I mean, it could have been avoided if we cared about it). The remaining 2 (DSLs) are a Silver Bullet. Not enough to kill the Complexity Werewolf, but it sure makes it much less frightening. By the way, we should note that the idea of DSLs has been around for quite some time. Not using them so far may count as something that has Gone Wrong as well, though I'm not sure.

Also the beginning of "good enough" performance? In recent years client-side software has not generated any new breakthroughs that require vast amounts of computing power. Most OpenGL 2+ games look more or less the same, ray tracing hasn't taken off in spite of Intel's best efforts, and there are multiple solutions for doing video decode in hardware. I think Intel sorely needs something to come along that causes regular users to want to pay top dollar for top performance.

The only thing I run with any regularity that feels like it could use more raw horsepower is the Clang/LLVM static analyzer.

Adding an SSD seemed to make no difference, but with luck, the software will get some love and speed up.

Seems like a harsh characterization. When I read articles like this from a person who is enthusiastic and smart but not widely experienced yet, I read it in the frame of mind that this is what the author wishes were true, rather than as something that actually is true. In this case Andrew, who is a student and has apparently been interning at Google, would really like the world to leave x86 behind and move on to something presumably more akin to whatever he happens to think should be a worthy successor.

That being said, in terms of CPUs being shipped that are 'customer facing' and programmable with applications from multiple third parties, ARM chips in 'smart' phones and tablets are taking up a bigger chunk of the pie than any previous instruction set architecture (ISA). That includes both PowerPC (Apple products) and Motorola's 68K architecture (Sun and Apple products).

However, what Andrew misses out on completely is the distinction between systems and processors and the effect that has on adoption rate. This 'secret weapon' that guards the x86 ISA from death, like the charm on Harry Potter's head, was put there by IBM in 1981.

In 1981 IBM shipped its first "Personal Computer", and because doing that was new to IBM and they expected mostly hobbyists to buy them, the 'hardware information' manual came with schematics, a BIOS listing, and details of where all the various chips were addressed and how those chips would work. Then, as its popularity soared, it was 'cloned' (and this is very important) right down to the register level and with identical BIOS code. The parts were available from non-IBM sources and there was really nothing preventing an engineer from doing it except the off chance that IBM would sue them for something.

As it turned out, they did sue for copyright violation on the BIOS code, but that was really all they could do; the schematic could be copyrighted but implementations of the schematic were not. Once someone had implemented a BIOS in a 'clean room', and the legitimacy of that BIOS had been successfully litigated, the door was opened and the 'PC' business was born. The key here, however, was that every single one of them was register and peripheral compatible.

Another event happened at this time which helped seal the charm. Microsoft started selling MS-DOS, which was software compatible (which is to say had the same APIs) with PC-DOS but could run on hardware that was not register compatible. Intel made a high-integration chip, the 80186, which you could think of as an ancestor of today's system-on-a-chip (SoC) ARM chips. It ran MS-DOS, but because the registers and peripherals were slightly different (better engineering-wise, but different), programs that ran on PCs would not run on it if they talked to, say, the interrupt controller or the keyboard processor. Thus the term 'well behaved' programs was born, and they were few and far between. On the other side was Microsoft Flight Simulator which, in order to get any sort of performance at all, talked almost exclusively to the bare metal and so became the barometer of 'clone-ness'. The question "Can it run Flight Simulator?" was a buyer discriminator, and if the answer was 'no' then sales were disappointing.

Those two events cemented, for almost two decades, the definition of what it meant to be a 'PC'.

Into those decades billions of person-hours were invested in software and tools and programs and features. Microsoft and Intel regularly got together with OEMs and chip makers and system builders to define all of the details, the same details that were originally in the PC Hardware Manual, that everyone would agree constituted a "PC". These became known as the "PC-98" standard (for PCs built after 1998) or the "PC-2000" standard. Things like power supplies, keyboards, board form factors and slot configurations all became sub-processes within that ecosystem and followed the lead of this over-arching standard. Obscure stuff like what the thread pitch would be on the screws that sealed the cabinet, not-so-obscure stuff like the dimensions of the 'cut out' for built-in peripheral ports. And during all that time the basic registers, the boot sequence, what BIOS provided, and the set of things that could be counted on to exist so that you could boot to a point to discover the new stuff all remained constant.

ARM doesn't have any of that. ARM, as an ISA, is controlled by a company that doesn't build chips, doesn't sell systems using those chips, and is not affected by 'stupid' choices in their architecture. All of that is offloaded to the 'ARM licensees.' And since anyone can license an ARM core, they do. And that means you have ARM chips in FPGAs and ARM chips from embedded processor manufacturers, and ARM chips from video graphics companies. They are all different. Worse, they all boot differently, they all have different capabilities, they don't talk to a standard graphics configuration, they don't have a standard I/O configuration, they don't have a place where USB ports are expected to appear, or a standard way of asking 'what device is booting me and can I ask it for data?' Quite simply, there is no standard ARM system.

And because they don't have a standard system, there isn't any leverage. It's like running a race with lead shoes: possible, but very tiring.

Now some folks, and Andrew here is clearly one of them, think the system problem is solved by 'Android.' They believe that because software developers can write to Android APIs and have their code run on all Android machines, they are done. Except that getting Android to run on an ARM system is painful. And worse, the 'high volume' Android systems have features in different places (where the accelerometer is, how the graphics work, can it do 2D acceleration or not?). There is no Android 'PC' which gets to define all the detail bits and thus free manufacturers from the grip of having to hire expensive software types to figure this out.

In the end I agree with Stephen's comment that "If someone could explain how Intel will fail to meet the challenges of getting x86's performance-per-watt to match ARM's...." is a red herring, since Intel has literally years of runway to do that; meanwhile ARM platforms are dying (Playbook, anyone?) because the cost to make them pushes them out beyond what the market will bear (and yes, the iPad/iPhone are keeping a lid on what you can charge for one of these things).

This is insightful, but I think you've pointed out the solution (for ARM) at the same time as the problem. Is there any reason that Microsoft couldn't repeat their earlier work and define an ARMPC-2013 standard? It seems like this will be necessary if they want Windows8-on-ARM to be a useable proposition.

The trick is to get a system standard in place; that has the tendency to commoditize the chips, and Intel was in a position to control the value chain (the price of the CPU is still a disproportionate share of the overall system cost). So someone has to create the standard on faith that the increased volume will make up for the price pressure that comes with commoditization.

Now ARM could come up with a spec for the 'ARM System Standard' and license/certify that. That has some possibility if someone like Google made sure that the Android kernel always ran on the 'reference design' standard. But that level of strategic thinking has been very hard to co-ordinate to date.

It seems to me that Microsoft is in a perfect position to promulgate such a standard - they don't care if the hardware is commoditized (in fact they would welcome it). It also appears that "will run Windows 8" should be a sufficient carrot to convince manufacturers to build to the spec.

As you say, Google is in a similar position, so perhaps a Microsoft/Google jointly supported standard makes a certain amount of sense, as odd as that sounds...

This reads like a person who has been saying "RISC beats CISC" since the Apple-on-PowerPC days because of ideological opinions about elegance, and is looking for any excuse to re-express that viewpoint.

Look, I still think microkernels are better than monolithic kernels, but you don't see me claiming Linux is doomed just because the L4 microkernel is running on 300 million mobile phones worldwide (http://en.wikipedia.org/wiki/Open_Kernel_Labs).

Monolithic kernels aren't going anywhere, and neither is x86.

Things are a lot different now than they were back then. The fact that ARM keeps growing at a very fast rate is very real. ARM will be in billions of smartphones and tablets and who knows what other kinds of devices in a few years. If people start using those devices (including ARM PCs) more and more instead of x86 PCs, then it's over for Intel.

ARM doesn't have to beat Intel in raw performance. They just need to make them irrelevant to most people. And they are succeeding.

In classic disruptive-innovation fashion, Intel will continue to move up-market, into servers and supercomputers, where it will be more profitable for them and where ARM won't be able to reach them (for now). This will become obvious when ARM-based Windows 8 and OS X computers become available at the end of next year or in 2013. As soon as that happens, laptops will start becoming a low-margin business for Intel.

As a "normal" person, why would you get a $1000 ultra-book, if a similar looking Transformer-like device will be available for $500, and have almost the same "perceived" performance. Dual core and quad core 2.5 Ghz Cortex A15 chips will show how that kind of performance is enough for most people.

That being said, since the recent multi-core ARM CPUs (hello, Tegra 3) have a TDP close to Intel's Atoms, and with the amount of engineering Intel pushes into the mobile area, I wouldn't be too surprised if Intel actually beat ARM in that area as well.

Personally, I don't want a Transformer device. The main reason is that it runs Android and has few connectivity options. It cannot do anything CLOSE to what I do on the laptop. I own some tablets (!) and I rarely use them. I like to test stuff on them and sometimes browse the web or the like, but they're not very useful.

I like the phone better (it's smaller!) and the light laptop better (it does everything without compromises, and I'm talking about using Word, the web, IM, etc., not coding). Heck, IM plus web and copy-pasting around in Android is such a pain. Not even talking about getting a proper video out, or copying files onto a USB stick.

I love my transformer, but you can get a "normal person" ultrabook for the same price with better performance and maybe half the battery life. For "normal desktop" use that's probably a good tradeoff.

X86 is not going away, I agree, but Intel can hardly exercise the kind of dominance they've enjoyed for the last several years when they're facing serious threats at both the low and high end. At the low end, ARM simply beats x86 for anything with a battery. Intel has already lost the phone and tablet markets, and laptops are highly likely to follow.

At the high end, look at http://top500.org/. #1 is based on SPARC VIIIfx. #10 is based on the PowerXCell 8i. #2 and #4 both derive much of their power from GPUs. Even that understates the situation, because many of the most powerful computers next year - Blue Waters, Mira, Sequoia - will also be based on non-x86 architectures. Then look at what Tilera or Adapteva are doing with many-core, what Convey is doing with FPGAs, what everyone is doing with GPUs. Intel is going to be a minority in the top ten soon, and what happens in HPC tends to filter down to servers.

So Intel has already lost mobile and HPC. Even if Intel keeps all of the desktop market, what percentage of the laptop and server markets could they afford to lose before they follow AMD? Maybe it will happen, maybe it won't, but anybody who can see beyond the "Windows and its imitators" segment of the market would recognize that as a realistic possibility.

> what happens in HPC tends to filter down to servers

Is this conventional wisdom? How does a petaflop race affect app servers and databases? It seems like most traditional server workloads could get by without a single FPU. The only thing they have in common is IO. Are there many data centers using Infiniband? (Maybe there are, I don't know.)

The Cell architecture is an evolutionary dead end. SPARC is no more of a threat to X86 now than before. GPUs may be the next big thing for HPC, but they've got a long way to go to get out of their niche in the server market. (That niche being... face detection for photo sharing sites? Black-Scholes? Help me out here.)

I mean, I agree with your overall point, but I think it's more likely that ARM will steal all the data center work before anything from the HPC world does. They are too focused on LINPACK.

Are there many data centers using IB? Yes. SMP was common in HPC before it came down-market, likewise NUMA. Commodity processors have many features - vector instructions, speculative execution, SMT - first found in HPC. Power and cooling design at places like Google and Facebook is heavily HPC-influenced as well. Certainly some things go the other way - e.g. Linux - but usually today's server design looks like last year's HPC design.

I'm not quite sure it's valid to write off SPARC as an architectural dead end when the current fastest computer in the world uses it, and the next crop of US competitors for that crown are all based on the Cell/BlueGene lineage. GPUs are also more broadly applicable than you might think. Besides video and audio processing, they can be used for many crypto-related tasks (witness their popularity for Bitcoin mining), various kinds of math relevant to data storage (e.g. erasure codes or hashes for dedup), and so on. Many of their architectural features are also being copied by more general-purpose processors as core counts increase.

Yes, high-end HPC is too obsessed with LINPACK. Nonetheless, it remains a good place to look when trying to predict the future of commodity servers. Even if ARM does displace x86 instead, many features besides the ISA are likely to come from HPC. Perhaps more relevantly, either outcome is still very bad for Intel.

Who cares what the top supercomputers are powered by; what matters is the market. I'm fairly certain that numerical simulation systems do not dominate the high-end computer market. Instead, it's still all about servers that spend most of their time shuffling data around, which Intel platforms still do quite well at.

Additionally, Intel still has plenty of time to get up to pace in the mobile market. The tablet market is as yet largely untapped, especially globally. I wouldn't be surprised if next-gen Atom processors made their way into leading-edge tablets in the next few years, for example.

Generally speaking: forecasts that require Intel to roll over and take a massive beating while billions upon billions of business leak away to its competitors don't tend to pan out in reality. The only way that works is if Intel goes bankrupt the instant a competitor comes on the scene, and that's just fantasy.

Assuming tablets will run Android, I would be surprised if Intel made much money from Atom processors on tablets, considering the competition.

I don't see it as a given that x86 tablets would have to run android. There are some roadblocks to running iOS, for example, but none that are insurmountable.

> At the low end, ARM simply beats x86 for anything with a battery.

I've seen this claim often, yet I could not find any sources that could back up this claim. Can you post a link to an article or some research that compares performance/watt (as well as actual power usage) between ARM and x86? I'm genuinely interested in this.

"Performance per watt" is great, but when I step within 2 meters of a power outlet I want "performance." And many machines (including laptops) spend their lives within 2 meters of a power outlet.

At what point did "consumer" start meaning "low end" or "handheld"? The 27" quad-core i7 iMac is a consumer computer. Gaming PCs are consumer computers.

Presumably you still care about "performance per dollar" though, and we're approaching a time when "dollars per Watt" isn't negligible.

Probably around the same time people started predicting the end of the PC era as smartphones and tablets took off.

This is really more AMD throwing in the towel vs. Intel. It's really disappointing to see a competitor leave such an important market. I'm much more disappointed to see the X86 market go from 2 to 1, than I am happy to see the ARM market go from 4 to 5.

If Intel manages to beat AMD out of the x86 market then it will be the end of an era. AMD has done a pretty good job of guarding the x86 flanks at critical times - the P4 debacle, the move to 64-bit, operating in low-margin parts of the market where Intel has few offerings, offering 4-socket servers, currently offering a higher number of cores than Intel equivalents. Also, they do a good job of exploring the design space and have at various times come up with useful innovations. AMD going out would make x86 a much more monolithic entity in the market and much more open to attack from competitors. Even currently, AMD's low-power Brazos designs eke out a segment of the market where Intel has no real offerings ("good" OpenGL performance on a SoC).

Of course, AMD has been in trouble before and those times weren't as significant. I think what could end the era this time is not AMD throwing in the towel as such, but the fact that AMD could throw in the towel because there is somewhere else to go.

It's an important market now. AMD is doing the right thing: they are in no position to compete with Intel and the x86 is not going to see big growth unless Intel pulls off a miracle.

Where is AMD throwing in the towel? I know they've had some trouble with their latest designs, but I haven't seen anything about them giving up, just that they are trying again.

I think in a lot of ways this is related to the relative overpowering of desktop computers for daily use. It's been true for years now that computers are highly overpowered for what they are typically used for.

Quad core machine with a million gigs of ram for email and a web browser?

Sure there are LOTS of good reasons for having legitimate CPU power, but a lot of the time any random GHz-level processor is going to provide plenty of responsiveness for daily tasks. The only thing I can think of that people typically do that is processor intensive is HD playback, and that is easily accelerated nowadays.

It's not always about absolute performance, it's about "good enough" performance. If ARM is going to supply good enough performance with the additional benefits of being cheaper and more portable, then why NOT use it?

This isn't about ARM versus Intel. This is about having adequately powered portables.

Intel is losing the low-end CPU market. That much is true. But the low-end CPU market is the new middle-end CPU market. I think we are going to see an age where more and more people have "low-end" portables as their main computers. The barrier between low-end, middle-end, and high-end has shifted significantly I think. A few years ago, we all had uses for high-end computers. Nowadays, what would be considered high-end is a waste for most people.

Also, we can't forget the impact of the cloud on this. We don't need a lot of computing power locally now. For many of the types of applications that one would need a lot of CPU for, the cloud potentially provides those solutions for us.

I, for one, don't see myself trading in my desktop at work anytime soon. But I do see myself using my laptop a lot more than my desktop at home. My couch is a lot more comfortable than my computer chair.

> Quad core machine with a million gigs of ram for email and a web browser?

I hear this sentiment a lot, and in general I agree, but the fact is the app with the largest RAM footprint on my laptop is Firefox. Given the proliferation of web-based apps, I don't see the complexity of web browsers going down. We can always use more power.

And Firefox is actually one of the most memory efficient browsers right now.

What people fail to see is that "just a browser" is a completely idiotic and misinformed statement.

Browsers are probably one of the _most_ complex and powerful apps on the system.

Browsers are basically running entire applications, virtualized, in a sandbox per tab!

Heck, some websites are just not viewable on mobile right now (unfortunately) because mobile just isn't nearly fast enough. Think WebGL, for example. Few mobile browsers support it, but when they do, it's pretty slow if the author didn't make a super-low polygon and texture count version...

I disagree. A few years ago, what we had for high-end computers was practically equivalent to what we currently hold in our pockets. It wasn't long ago that I got my first dual-core desktop, but dual-core CPUs in smartphones and tablets are now standard, along with powerful GPUs and lots of RAM. And it's not enough. Every year, each device needs to be significantly more powerful than the last. The idea that we've reached some plateau of mobile computing power doesn't hold up to even recent history.

And that's the real problem for Intel. Mobile computing power is improving at an incredible rate -- probably faster than anyone could have predicted -- and soon enough it'll reach a level where ARM and Intel are actually competing head to head. We're not there yet but it's close. At that point we'll see if Intel has what it takes to stay in the game.

You seem to be operating on the assumption that an ARM CPU can match an x86 CPU in performance clock-for-clock.

While this may be true, it is not necessarily so. Just pull out your old 4GHz P4 and see how it stacks up against your modern 3GHz desktop.

The point being that the architecture has come a long way for x86, and even if they match frequency, ARM will not necessarily come out of the gate capable of matching that.

> Quad core machine with a million gigs of ram for email and a web browser?

Sure, web browsers nowadays do much more than rendering HTML. They're actually among the most complex software packages that your computer runs. The high amounts of memory, especially, do not go to waste for a heavy web user.

I'm excited about ARM processors for low-power applications but I can't see ARM replacing x86 processors (even in the consumer space) until ARM performance is comparable to x86 performance. Right now the slowest MacBook Air is 6x faster than the iPad 2; until the difference is 2x or less I just can't see companies or consumers switching to ARM.

I think it's more a matter of the bulk of consumer "micro architecture" usage shifting from computers to "devices", a la tablets, mobile phones, cars, appliances, etc. Right now, we frame our picture of consumer micro architecture as devices that look or act like a computer, but that vision is changing rapidly. Cumulatively, these devices far, far outnumber computers.

6x the performance... for 8.5x the TDP. It's pretty easy to see why ARM designs are already preferable for almost anything that has a battery, and laptops are outselling desktops already. Sure, there will always be applications where single-thread performance will matter more than total performance or performance per watt for many cores - believe me, we learned that lesson at SiCortex - but those applications are not enough to sustain Intel as we know it. There's a reason they're developing MIC; without it they'd be squeezed between ARM clients and servers with 20+ ARM/MIPS/POWER/SPARC cores per chip (not even counting GPU/FPGA server plays). They need their own many-core product to compete.
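For the record, the arithmetic behind that comparison, taking the parent's 6x figure and the 8.5x TDP ratio at face value:

    perf_ratio = 6.0    # MacBook Air vs iPad 2 performance, per the parent comment
    tdp_ratio = 8.5     # rough TDP ratio used above
    x86_perf_per_watt = perf_ratio / tdp_ratio
    print("x86 part: ~%.2fx the perf/W of the ARM part" % x86_perf_per_watt)
    print("i.e. the ARM design gets ~%.1fx better perf/W" % (1 / x86_perf_per_watt))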

I will take 6x the performance for 8.5x the TDP. My 3 year old MacBook Pro has plenty of battery life for my use case. You have to look at absolute performance numbers for the application.

The only thing I use that causes the fans on my Core 2 Duo MacBook Air to kick in is Flash, so I would definitely consider a slower CPU.

Also, note that the savings/benefits can be in ways other than more battery life e.g. price, fanless & sealed cpu, etc.

How are they mutually exclusive products?

Does it still have plenty of battery life when you're doing something besides commenting on Hacker News? Perhaps more importantly, how fast do you need a single instruction stream to be? I'm writing this on my own three-year-old MacBook Pro, but I know that its battery life is mostly contingent on the CPU remaining idle 99% of the time. For any multi-tasking workload, or even for single-task workloads if the software is written correctly, a more efficient CPU architecture with many smaller cores would offer both greater responsiveness and longer battery life even during my most computationally intensive moments. Maybe you're one of the 1% who really need higher single-thread performance, and not just because of crappy multi-core-naive software, but the market is driven by the other 99%.

I'm surprised that nobody's mentioned that X86 and ARM are instruction set architectures, not microarchitectures.

The two are interlinked, though.

Absolutely not at all.

We don't mind using an assortment of different tools to accomplish other tasks. Why should we expect that a one-size-fits all world of ARM tablets and phones is going to completely displace the tools we're using now?

There are many reasons why the current status quo won't change so soon, and an important one is that ARM still lacks 64-bit addressing. Servers have long since made the jump to 64-bit, as have most consumer-grade computers.

Without 64-bit support from its competitors, Intel doesn't have much to fear, especially in the datacenter space, where performance per watt is a powerful selling point.

ARMv8 64-bit instruction set architecture announced last month, designs due next year. When should Intel become afraid?

Absolutely never solely because of this. Intel makes ARM CPUs. It's a very important player in the ARM market.

It's more a case of "AMD's days are numbered": instead of competing with one serious competitor, they go head-on with a few serious competitors using a (commercially, for AMD) new architecture.

Said like that, the title is pretty moot. But if you change it to "X86's Days as the only Personal Computer CPU Microarchitecture are Numbered" the author could have a point.

Good thing it is an 80-bit FP number.

Also, RISC is the Next Big Thing! By the way, Thin Client computing is going to kill the desktop, C++ and C are dead, Microsoft is killing Win32, Microsoft is dead, etc, etc.

Flagged for no meaningful content and inflammatory title.

Also 2012 is the Year of Linux on the Desktop. :)
