
Intel Atom versus ARM Cortex-A9 - gvb
http://www.eetimes.com/rss/showArticle.jhtml?articleID=222200621
======
protomyth
I am curious, given the amount of memory that we can put in smaller and
smaller devices, is there a clear 64-bit migration for the ARM like the Atoms
already have? It seems we will be exceeding 4GB of RAM in devices in the
not-too-distant future.

~~~
rbanffy
I don't know, but if we stick to running open-source Unix-like software, it
should be trivial to switch architectures when needed. Most software I use has
had actively maintained ports to 64-bit architectures since the mid-'90s.

I have several mid-to-late-90's RISC workstations (I restore and collect them)
that have 64-bit processors and OSs in them. 64-bit computing is news only to
the x86 world.

With about 1.5 gigs of RAM, my Atom-based netbook hardly hits the swap
partition unless I am running Windows in a VM.

~~~
timthorn
I believe the question is more to do with the hardware options. Going to
64-bit would require a great deal more silicon area, with a consequent increase
in power consumption, which tends to go against the ARM philosophy (indeed,
the later ARM instruction sets have moved toward 16-bit Thumb encodings).

~~~
rbanffy
There is no real need to migrate to 64-bit unless you need to cleanly address
more than 4 GB of memory. The case for the x86 migration to 64-bits was not so
much the extra memory space, but the extra registers. ARM always had plenty of
them.
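
The 4 GB figure is just 32-bit address arithmetic; a quick sketch (Python,
illustrative only):

```python
# A 32-bit pointer can distinguish 2**32 distinct byte addresses.
ADDRESS_BITS = 32
max_bytes = 2 ** ADDRESS_BITS

# 2**32 bytes is exactly 4 GiB -- the ceiling on cleanly addressable
# RAM for a 32-bit core.
print(max_bytes // 2 ** 30)  # 4
```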

More to the point, moving ARM to 64-bit is relatively simple - and the ARM
cores are so small that even if you needed twice the silicon area for it (a
worse-than-worst-case scenario) you would end up with a core that is still a
fraction of the size of even a tiny x86. A Cortex core occupies less than 2
square millimeters.

With cache.

~~~
timthorn
Ah, Cortex ...

Cortex is the branding for the ARM implementations of the ARM v7 Architecture.
A brand was chosen for the first time because historically, even numbered ARM
cores don't sell well (ARM8, anyone? ARM10?), so the next core after the ARM11
would have had to skip to ARM13. But 13 is unlucky. So that's ARM15 then... In
addition, it would be very easy for the ARM v7 architecture to be confused
with the ARM7TDMI implementation of the ARM v4 architecture.

The grand plan was to bring out new cores in V7 targeting three distinct
market segments - Applications, Realtime and Microcontrollers (see what they
did there?!), with very different performance points and feature sets. Thus,
the Cortex M3 has a die size of 0.86mm2 on a 180nm process, but that's a weedy
core. The A8 has a die size of a bit under 4mm2 on a 65nm (albeit speed
optimised) process, without the ETM included.

So, I'm not quite sure what core you are talking about - but it is also the
case that the Atom die does contain a significant number of peripherals and
buses.

A 64-bit ARM won't have a problem beating an x64 processor in terms of
MIPS/Watt, I'm fairly sure.

Edit - I should have explained that the number following the A/R/M is meant to
indicate performance and not be a version number. The A5 and A9 both came out
after the A8, for example - the ARM market isn't about increasing performance
with every design, but ensuring a good fit for a design envelope.

~~~
rbanffy
"So, I'm not quite sure what core you are talking about"

<http://www.arm.com/products/CPUs/ARM_Cortex-R4F.html>

Whatever that is. I am no ARM expert.

Edit: I think this one would be better. The R is for real-time applications

<http://www.arm.com/products/CPUs/ARMCortex-A9_MPCore.html>

This other one (<http://www.arm.com/products/CPUs/ARM_Cortex-A8.html>) is
about 4 mm2.

------
notauser
FTA> "As form factors shrink power efficiency and battery life will
increasingly drive market success."

I hope it does work out like that, because cheap Linux-only hardware will only
help Linux adoption! However I still think this is incorrect. You can already
get a netbook with more than one useful day's battery life, which means you
can always plug it in overnight. There are people who need more but that's a
fairly small niche.

x86 compatibility is a much bigger deal. The percentage of netbooks sold with
Linux has dropped from 100% to somewhere between 33% and 4% (sources:
[http://windowsteamblog.com/blogs/windowsexperience/archive/2...](http://windowsteamblog.com/blogs/windowsexperience/archive/2009/04/03/windows-
on-netbook-pcs-a-year-in-review.aspx) and <http://www.osnews.com/story/22587>
).

That's still not a bad number, especially if it's closer to 33%, but it does
put ARM in the position of at best trying to win a market within a market.

On top of that there's a question of how much further form factors are really
going to shrink. Keyboards can only get so small, and we are already seeing a
trend towards _bigger_ netbooks - not smaller.

Finally, they are comparing against the old single-core Atoms, not the new
dual-core Atoms that are already available.

That still leaves ARM with a potential advantage in the tablet market, where
there is a lot less space for batteries and the software stack is better
concealed. But the fate of the tablet market is still in doubt.

~~~
gvb
You do need to be a little careful looking at netbook market share and size
statistics because Microsoft decided they could not cede that market to Linux.
As a result, they have distorted the market by licensing WinXP (and now Win7?)
at an attractive price point and added laptop/netbook size and speed criteria
to discourage low end netbooks.

~~~
timthorn
"As a result, they have distorted the market by licensing WinXP (and now
Win7?) at an attractive price point and added laptop/netbook size and speed
criteria to discourage low end netbooks."

How is selling their software at an attractive price to the market classed as
distortion?

~~~
notauser
Because in my opinion it looks awfully close to dumping, cross-subsidized by
their profitable laptop and desktop monopoly.

"Microsoft charges local OEMs $32 for XP Home on netbooks, compared to around
$65 for XP Home on desktops. Large OEMs are believed to pay much less."
<http://www.crn.com/software/212902058>

~~~
timthorn
Surely it is only dumping when selling below cost? I can't imagine that MS is
losing money on this.

Charging differing prices for different market segments is a reasonable
business practice, and it could be argued that Linux is the OS that is
distorting the market, seeing that there's no licensing fee at all involved.
:)

~~~
notauser
With software it's really hard to say what below cost actually is. Amortized
fixed cost, or just the variable cost?

If you are talking about the latter then no, it's probably not dumping. If you
are looking at the former I bet it's not so clear cut. WinXP development and
all of the associated cost (such as the loss-making products like IE that help
keep WinXP in its market position) would have been pretty expensive - and the
expenditure is still going on today thanks to things like security support and
IE upgrades.

------
tedunangst
The video is designed to convince you that the ARM system is just as fast as
the Atom system. But they don't present evidence to support this. They only
demonstrate that the ARM is sufficient to do basic web browsing. I want to see
a demo playing something that actually taxes the CPU, e.g. Quake2. Then we'll
see what's better.

If it's all about "just fast enough to view the BBC", then what prevents Intel
from scaling the Atom back to 500MHz?

PS: What is the graphics "accelerator" in the Intel system?

~~~
wmf
_They only demonstrate that the ARM is sufficient to do basic web browsing._

That is impressive, considering that the ARM Cortex A8 was _not_ sufficient.
They have gone from "not good enough" to "good enough".

------
pieter
_Even more notable is that the Intel-based netbook is running at a 1.6-GHz
clock frequency while the dual-core Cortex-A9 development board is running at
500-MHz._

I don't get why people keep saying this. One is x86-based, the other is ARM-
based. Why should you be able to compare cpu frequency? Is it notable that a
Core 2 processor performs better at 3.2GHz than a P4? And that's even in the
same general CPU architecture.

~~~
gvb
This is the core of the power vs. speed trade-off. Since the ARM is getting
roughly the same amount of work done at 1/3 the gross clock frequency, it has
a _substantial_ advantage in producing a lower power implementation.

Intel has been very clever (read "spends a _lot_ of money") in their process
to make it run faster at a given power level, but their legacy architecture is
a major millstone around their low power aspirations.

There are two aspects of power consumption in digital electronics:

1) Static power (leakage)

2) Dynamic power (cost of switching state)

The static power is what the part consumes while sitting idle. As a gross
generalization, optimizing for execution speed typically hurts static power
(you end up with more leakage to facilitate faster switching speeds). Note
that smaller geometries help speed, but hurt leakage. Semiconductor
manufacturers, including Intel, fight this by being more clever with their
process, doing things like SOI (IBM/Freescale, also AMD?) and high-K insulation
(the latest "hotness", if you'll pardon my pun).

Dynamic power is the cost of switching transistors on and off, of toggling
interconnects high and low. This is largely a capacitive charge/discharge
effect. Every time you switch a transistor gate or drive a transmission line,
you have to push charge onto or drain charge off of the gate or line. It costs
power to do this. It costs more power to do this faster.
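
The textbook relation behind this is P_dyn = a * C * V^2 * f (activity factor,
switched capacitance, supply voltage, clock). A minimal sketch with made-up
numbers (Python; nothing here is a measured value):

```python
def dynamic_power(activity, capacitance_f, voltage_v, freq_hz):
    """Classic CMOS dynamic-power estimate: P = a * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * freq_hz

# Illustrative numbers only. At fixed voltage, dynamic power scales
# linearly with clock; raising the voltage (usually required to hit
# higher clocks) costs quadratically on top of that.
p_500 = dynamic_power(0.1, 1e-9, 1.1, 500e6)
p_1600 = dynamic_power(0.1, 1e-9, 1.1, 1.6e9)
print(p_1600 / p_500)  # ~3.2: linear in frequency at fixed voltage
```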

CPU speed can also have "knock-on" effects throughout the system. Faster CPUs
(typically) have or need faster memories and other peripherals. If the
memories are slow, for instance, the CPU spends a _lot_ of its time waiting
for cache lines to be filled or emptied. Intel mitigates this with larger
caches, but caches cost power too - bigger caches have more leakage (1) and
faster caches cost more for dynamic power (2).

Intel is working hard to make a low power x86 architecture and succeeding to
some extent, but they have to do a lot of brute forcing to overcome their
legacy CPU architecture.

As in martial arts, your strength (the x86 ISA and a legacy of emphasizing raw
clock rate) can become your weakness if your opponent can figure out how to
use it against you.

~~~
tedunangst
I liked your post, but you didn't explain at all why a legacy x86 ISA has
anything at all to do with either static or dynamic power.

~~~
timthorn
The x86 ISA has significant complexity to it. The essence of the problem is
that the logic required to decode x86 instructions (originally designed to be
featureful for assembly programmers' convenience) is large, compared to that
required for the ARM ISA which was designed to be decoded in an efficient
manner.

That further logic certainly increases static power in terms of the extra area
used, and also increases the dynamic power as more translation is required to
get from an x86 instruction to the hardware control signals (I believe that
the central part of the Atom effectively runs a different ISA translated from
the x86 ISA via microcode).
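
A toy way to see the decode asymmetry (Python; the opcode-length table is
hypothetical, not real x86 encoding):

```python
def arm_style_boundaries(code: bytes):
    # Fixed 4-byte instructions: every boundary is known up front, so
    # several decoders can work on the stream in parallel.
    return list(range(0, len(code), 4))

def x86_style_boundaries(code: bytes, length_of):
    # Variable-length instructions: the start of instruction N+1 is not
    # known until instruction N is at least partially decoded, so finding
    # boundaries is inherently serial -- extra logic, extra power.
    offsets, pc = [], 0
    while pc < len(code):
        offsets.append(pc)
        pc += length_of(code[pc])
    return offsets

# Hypothetical first-byte -> length table, for illustration only.
toy_lengths = {0x90: 1, 0xB8: 5, 0x0F: 3}
code = bytes([0x90, 0xB8, 0, 0, 0, 0, 0x0F, 0, 0])
print(x86_style_boundaries(code, lambda b: toy_lengths[b]))  # [0, 1, 6]
print(arm_style_boundaries(bytes(8)))                        # [0, 4]
```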

~~~
tedunangst
What percentage of an Atom is devoted to decode? Atom is around 45 million
transistors. The PPro, which also decoded to RISC-like micro-ops, had 5
million. Somehow I don't think the Atom is really burning that much space/power
on decode.

You and gvb seem to imply that an ARM (RISC) chip which does less work per
instruction somehow does more work per clock. I find that surprising.
Otherwise, what is meant by the "emphasize raw clock rate" comment?

~~~
sketerpot
The Cortex-A9 executes more instructions per clock because it can decode more
than one instruction at a time (2 to 4, usually 4) and execute them out of
order to get more instruction-level parallelism. Atom can decode a maximum of
two instructions at a time, and executes them in order.

The Atom also uses precious pipeline stages on decoding the x86 instruction
set -- all of it. There are a lot of instructions there, and all but the most
common require the processor to read sequences of several micro-instructions
from a ROM.

Now, the Atom tries to compensate for this with Hyperthreading: they share the
decode unit between two threads (hypothetically) and have those threads share
the execution units on the chip. This gives Atom a boost in performance per
watt if you can find two processes that need to run at the same time. For
single-threaded workloads, Cortex-A9 looks like it will continue to trounce
Atom pretty severely, and for multi-threaded workloads, it's not expensive to
put down some more A9 cores. It's not like they have to decode x86
instructions or anything, so they can be pretty darn small.
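
The clock-versus-width trade above fits on a napkin. The IPC figures below are
assumptions for illustration, not benchmark results (Python):

```python
def throughput(clock_hz, ipc, cores=1):
    # Useful work per second ~ clock * sustained instructions-per-clock.
    return clock_hz * ipc * cores

atom = throughput(1.6e9, ipc=1.0)           # in-order, 2-wide decode
a9 = throughput(500e6, ipc=1.6, cores=2)    # out-of-order, dual core
print(a9 / atom)  # 1.0 -- comparable work at under a third of the clock
```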

