
The ARM server apocalypse - Ecio78
http://storagezilla.typepad.com/storagezilla/2013/06/the-arm-server-apocalypse.html
======
hapless
"If, by 2015, 64-Bit ARM makes it within striking distance of x86 server
processor performance [...]"

There isn't a big enough rolleyes in the universe. ARM's sink-or-swim pitch
points will be power consumption and density. Competition on
performance isn't even a choice.

x86 server buys, today, are often about how much RAM I can get in a rack unit.
If AArch64 servers can get enough RAM tied to a chip, and make memory access
fast enough, there's a pitch to be made based on ultra low-power, high-density
virt hosts.

If they can't, they remain novelties for "hyperscale" hosting of static
content and toy sites.

~~~
akiselev
Yeah, the chance of ARM improving their architecture to match 10nm SoCs (2015
is the release date for Intel's 10nm line, and they said that they will push
their SoCs to the latest tech), let alone 2015 x86, is laughable.

Intel's mad focus on power consumption might catch up to within like 10-50% of
ARM at 25-100% better performance.

------
t05ter
The article is pretty shallow in the sense that it just casually glosses over
a few key characteristics that make the server/datacenter space an interesting
battlefield for the current round of architecture wars.

It will be an interesting next couple of years. Intel didn't sit on their
hands with this one, like they did with the mobile market. They saw the
potential for ARM SoCs, using a large number of smaller/lighter cores to take
market share in things like simple web servers or hosting static data. Their
response was their Atom based "Centerton" which is cannibalizing their own
Xeon line.

As for more CPU performance intensive tasks, ARM has yet to prove itself in
the performance/watt arena, especially in the 64-bit realm of server features,
like error correcting code or RAS. Most sys admins building infrastructure
like to play things safe, since they have to live with their decisions for
years.

It seems like ARM's biggest advantage will be price, and the SoC business model
of custom tailoring silicon for a customer's needs.

------
glurgh
Someone's been just about to beat Intel at the high-performance CPU game, any
day now, for a couple of decades. The AIM (Apple/IBM/Motorola) alliance was
very fond of hockeystick graphs showing their planned crushing of other CPUs
(shown as a flatter line, usually coyly labeled 'CISC'), for instance, back in
the early 90s.

It's not an impossible thing but given the track record of such claims and
predictions, the only comment they should elicit is 'Shut up and show me the
silicon'

~~~
venomsnake
If you throw enough "This time it is different" on the web a person may guess
one.

When I shop for a new desktop I feel I am essentially paying an Intel tax. Since
AMD began to play catch-up, the Intel desktop product line has moved from
providing value for the customer to extracting consumer surplus more
efficiently. I would be surprised if this is not the case in the server market.

The lack of VT-d on K processors and the totally locked other CPUs are a good
example. Nobody likes being milked, which can create some desire to diversify
away from Intel even if the savings are not enormous.

~~~
frozenport
Are you saying Dell won't buy Intel because Intel is evil? Recently China
started to wind down its anti-Intel stance in part due to a new supercomputer
deal.

~~~
venomsnake
Dell doesn't buy Intel. They resell Intel, so it is all the same for them.

They will gladly resell whatever someone decides to buy. But every time there
is a new chip on the market, the big cloud operators and big end clients for
Intel chips take out their calculators and begin to evaluate very carefully.

Also, Intel is not evil. They just milk their customers.

------
moconnor
Intel's Xeon Phi cards pack 240 x86 CPUs per card at _very_ low power per
core. You can run Linux on them. Soon we'll be serving web content and VMs on
them.

The best bang-per-watt supercomputer in the world right now is a ton of Xeon
hosts with 2 of these cards per machine.

We've got a stack of them in the office; they're pretty interesting.

~~~
lgeek
> Intel's Xeon Phi cards pack 240 x86 CPUs per card

> We've got a stack of them in the office; they're pretty interesting.

That's funny, seeing as the only Phi coprocessor you can buy right now only
has 60 cores [0].

I mean, what Intel has done is quite impressive, but let's not exaggerate.

[0] [http://ark.intel.com/products/71992/Intel-Xeon-Phi-Coprocess...](http://ark.intel.com/products/71992/Intel-Xeon-Phi-Coprocessor-5110P-8GB-1_053-GHz-60-core)

------
berkut
This is fine for VMs and maybe low-load servers, but the problem for
high-performance servers is that ARM sucks memory-wise, as they don't have the
patents on the cache hierarchies that Intel has (and AMD can use), and thus
have very limited cache systems.

AMD's future entry into the ARM market might change this, as they can use
these patents.

------
VLM
"... with that comes all the awe & terror of living with Windows."

I had to LOL. Not even any point stating the obvious. That's good writing, can
coin quite a phrase, unless that's a quote from somewhere.

On the bigger picture, I donno. I've seen the infinite circular wheel of IT
rotate around quite a few times and this overall idea sounds like a rehashed
Transmeta marketing message. That didn't turn out so well last time, but maybe
the situation is different this time. Probably not, but maybe.

~~~
sp332
Transmeta's CPUs worked fine, but the market wasn't looking for underpowered
mobile processors at the time. They sold off their IP to Intel who rolled it
into the Pentium M which was the basis for the Core series.

~~~
akanaber
The Transmeta Crusoe (VLIW with software translation) was nothing like the
Pentium M which was a derivative of the P6. The Pentium M did add some
cleverness in fusing some of its hardware translated micro-ops but that's not
terribly related to what Transmeta were doing.

You may be misremembering the patent lawsuit Transmeta filed against Intel,
long after the Pentium M had been released and Crusoe had failed commercially,
basically as an alternative monetisation strategy. Intel did end up paying
them to go away, but their IP was not "rolled into" Pentium M any more than
Eolas's was into Internet Explorer.

~~~
sp332
Oh I didn't mean core architecture or instruction set or anything, just some
of the power management.

------
luu
The business case for serving web content isn't as strong as you might think
from a back of the envelope calculation about cost/power vs. performance. It's
true that you'll find that you can get more throughput per cost with ARM/Atom
(mostly due to the lower power, but also because the machines are cheaper),
but, when you actually do it you'll find that latency is significantly higher,
if you compare ARM/Atom boxes with low load vs. fast x86 machines with high
utilization [1]. An argument people often make is that ARM is going to catch
x86. Sure, possibly, but why wouldn't you expect x86 to catch ARM? [2]
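
To make the latency point concrete, here is a minimal sketch using a textbook M/M/1 queue, where mean response time is the service time divided by (1 - utilization). The model and every number in it (a 10 ms per-request service time for the low-power box, 2.5 ms for the fast x86 box) are illustrative assumptions, not measurements from the paper in [1]; real web workloads are not this simple.

```python
"""Toy M/M/1 comparison of a slow box at low load vs. a fast box at high load.

All numbers are illustrative assumptions, not measurements.
"""


def mm1_latency_ms(service_ms: float, utilization: float) -> float:
    """Mean response time of an M/M/1 queue: service time / (1 - utilization)."""
    if service_ms <= 0:
        raise ValueError("service_ms must be positive")
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_ms / (1.0 - utilization)


def break_even_utilization(fast_service_ms: float, slow_latency_ms: float) -> float:
    """Utilization at which the fast box's mean latency equals slow_latency_ms."""
    if slow_latency_ms < fast_service_ms:
        raise ValueError("target latency is below the fast box's own service time")
    return 1.0 - fast_service_ms / slow_latency_ms


if __name__ == "__main__":
    SLOW_SERVICE_MS = 10.0  # hypothetical: low-power ARM/Atom box
    FAST_SERVICE_MS = 2.5   # hypothetical: fast x86 box

    slow = mm1_latency_ms(SLOW_SERVICE_MS, 0.10)
    print(f"slow box at 10% load: {slow:.1f} ms")
    for rho in (0.30, 0.50, 0.70):
        fast = mm1_latency_ms(FAST_SERVICE_MS, rho)
        print(f"fast box at {rho:.0%} load: {fast:.1f} ms")
    even = break_even_utilization(FAST_SERVICE_MS, slow)
    print(f"fast box matches the slow box at utilization {even:.3f}")
```

Under these made-up numbers the slow box can never answer faster than its 10 ms service time, while the fast box stays below that until it is loaded past roughly three quarters of its capacity. The throughput-per-cost arithmetic does not capture that floor.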

A lesson that people seem to have to keep re-learning, over and over again, is
that latency matters a lot on the web. A few ms increase in latency has a
measurable effect on your income, as people just close the webpage and click
elsewhere. A significant increase in latency is disastrous.

[1]
[http://users.ece.utexas.edu/~vjreddi/UT/Publications/Entries...](http://users.ece.utexas.edu/~vjreddi/UT/Publications/Entries/2011/8/1_Mobile_Processors_for_Energy-Efficient_Web_Search,_In__i_IEEE_Transactions_on_Computer_Systems_\(TOCS\)__i_,_Vol._29,_No._4,_Article_9,_August_2011._files/reddi2011-tocs.pdf).
This paper describes a way you can use low-power boxes to get better results
for the same cost, but it doesn't involve simply swapping your Core i5s with
Atoms or ARMs, which is what a lot of people seem to want to do.

[2] One of the major lessons computer architects learned perhaps 10-15 years
ago is that the instruction set just doesn't matter that much, compared to the
microarchitecture, the manufacturing process, and the quality of the circuit
design. Intel has a decisive advantage in manufacturing that's been growing
for approximately two decades, and ARM doesn't even try to compete in circuit
design with full-custom design or fancy circuit techniques, so that leaves the
microarchitecture. Although I'd disagree, you might make a case that ARM
simply has better architects, and that ARM would produce a better design than
Intel if they targeted the exact same space. But, I doubt you'd try to make
the case that you'd expect ARM's design to be so superior that it will
obviously overcome Intel's other advantages.

Another case you might make is that Intel simply won't target the same space,
to avoid cannibalizing their own market, but, in addition to obviously moving
towards that space with Atom, they have a history of ruthlessness that makes
that seem unlikely.

Intel used to be a dominant player in the DRAM industry, but they killed off
their DRAM business when they were a leader in the field, because they
recognized that it would become a commodity industry. After Intel became a
market leader in SRAMs, one of its competitors invented flash; Intel realized
the significance and focused on flash and microprocessors, while, again, killing
off what was (then) a major cash cow. It's very hard to imagine Intel just
sitting and slowly losing their dominance of the microprocessor industry,
ending up with a position like IBM or Sun. They've never done that in the
past, so why would you expect them to start now?

~~~
akiselev
Just to second this point, most people just have no conception of what
semiconductor manufacturing is like and this article seems especially
misinformed about the manufacturing side. For example: "The ARM collective
simply show up to Samsung or TWSC, or TI or Global Foundries, ask for a
zillion processors to run off the manufacturing lines and wonder if they can
have them by Tuesday?" which is just total bullshit. The reason Apple had to
buy so many processors from Samsung for so long is because setting up a new
processor is expensive and difficult, whether it's at Intel or Global
Foundries, and it sure as hell doesn't take less than six months, let alone a
week, before you can get a reliable shipment of chips for a consumer product.
(And before you can talk about volume of ARM vs Intel, compare the dozen+
companies that can manufacture 24-60-something nm to the ONE company that can
do 22 and lower: Intel.)

There are many competitors in the semiconductor manufacturing space because
COMBINED they don't have the manufacturing or scientific resources of Intel
(TI and Qualcomm have massive R&D arms but they're more concerned with
wireless and other electronics). Intel's first mass market 14 nm facility is
supposed to go live this year and Samsung just BARELY got their 14nm demo out
this year. Intel's plant was $5 billion and started in 2011 which means that
Intel has years to improve their power, bus speeds, and reliability before any
manufacturer will even be able to sell their first processor. As the
technology nears sub-10nm, the gaps between performance and power consumption
between the architectures will be more and more obvious.

All of the other semifab manufacturers are extremely reliant on third party
suppliers for their factories whereas Intel helped develop a huge portion of
their technology, many times outright owning part of the equipment
manufacturers (for example,
[http://www.extremetech.com/computing/132604-intel-invests-in...](http://www.extremetech.com/computing/132604-intel-invests-in-asml-to-boost-extreme-uv-lithography-massive-450mm-wafers)).
14nm was the point where
they hit a lot of physical phenomena that prevent e-beam lithography and some
of the other methods from working and Intel's pretty much the only one that
can really push this technology forward.

Edit: Also, [http://www.tomshardware.com/news/intel-cpu-processor-5nm,175...](http://www.tomshardware.com/news/intel-cpu-processor-5nm,17578.html) - it's over.
Notice how Intel said they're going to push their SoCs to their current
processes? Hopefully this means that in 2015 we'll have 10nm x86 SoCs (at which
point ARM will be, best case scenario, nearing end-of-life 22nm).

~~~
brigade
Shrinking feature size alone has had diminishing returns on efficiency for the
last couple generations, and I haven't heard any indications that sub 22 nm
would reverse that. Obviously it still increases your transistor budget, which
is admittedly important for server CPUs since caches are generally the single
largest use of transistors.

But efficiency-wise it's FinFET, not feature size, that has given Intel the
biggest advantage as of late. And indeed the ARM foundries might not have
volume shipping FinFET SoCs until 2015.

I have no idea why you're comparing TI/Qualcomm to Intel when discussing
foundry R&D - it's TSMC, GloFo, and Samsung that are relevant there.

Also, Tom's Hardware has its years off - Intel isn't shipping 14nm CPUs until
_next_ year, with 20nm ARM SoCs expected around the same timeframe. Then 10nm
isn't expected until 2016 at the earliest, again with 14-16nm ARM SoCs in
about the same timeframe.

So yes, Intel is about a generation ahead of its competition, and even more
factoring in FinFET. But that's not the end of the world - IBM, AMD, and
Oracle still make server CPUs despite this. Not as successfully as Intel
obviously, but enough that there could be room for one or two makers of ARM
microservers to enter.

~~~
akiselev
I forgot about FinFET and just extrapolated from the i-series jump, but the
only important differences I've seen in processors in the last 3-4 years have
been power efficiency, cache, and bus speed. Clock speeds (especially with
TurboBoost thrown in) became more and more erratic as a metric for my use
cases so at this point it's all about Intel's microarchitecture and process
(in my perspective). They've also been one of the most advanced firms
materials-wise, so as we start to get smaller and smaller, I think Intel will
begin to use new materials that will increase the gains on FinFET and process.
IIRC in 2012 they were two "generations" ahead on high dielectric materials
and their silicon straining process for mass production.

I just picked two companies off the cuff that are relatively well known and
that I think could make an impact in the server market with ARM. I think
comparing ARM mobile SoCs to Intel's x86 or AMD's AMD64 is disingenuous and
since all TSMC and Global Foundries can do is play catch-up (hence me touting
Intel's process advantage), the brunt is left on the microarchitecture
designers and the integrators to really make a server that can beat Intel's
Xeon. I'm sure TSMC/Global Foundries are deeply involved in the design aspect
but I think other companies will make or break ARM in the server market.

As for the dates, that's a shame to hear :(. However, I just looked it up and
TSMC finished their 20nm design this year so I really doubt 10nm chips will be
in a server-ready state in 2016. I think that the next few years are on ARM's
turf but if Intel can hit the cost sweet spot that ARM is at (or even just in
the ballpark) with Intel's process and performance, it might be the end of ARM
as a non-mobile/embedded contender.

------
anizan
ARM SoCs have big problems when it comes to memory throughput, which is a big
part of a server's performance. I ran
[http://code.google.com/p/byte-unixbench/](http://code.google.com/p/byte-unixbench/)
on an Exynos 4412 SoC and some of the memory-dependent results were horrendous.

~~~
timthorn
Try a server optimised SoC rather than a mobile one; the optimisations are
somewhat different.

------
Swannie
Am I being naive here? Isn't the whole point of virtualisation that I can put
50 "average" VMs onto a top-end server, or 200 "small" VMs, or
1000s of lightweight LXC containers?

What limitation are ARM micro-servers going to overcome? CPU bound tasks? No.
Memory bound tasks? Well, with memory extension techniques, x86 can have huge
quantities of memory, so no, not so much. So IO bound tasks? Maybe... Is there
something else I missed?

Workload per watt? Power management in x86 processors, chipsets, and general
server design is getting better all the time. Of course, there are things
where FPGAs or ASICs will win every time... so what is it that I get with an
ARM that I can't get with an FPGA + x86? Or is it more that ARM SoC + FPGA/ASIC
is where things get attractive?

~~~
vardump
Memcached/redis sounds like something ARM CPUs might be good at.

Maybe execution performance predictability and better isolation of running a
single instance on bare metal ARMv8 vs. a hypervisor running 50 instances on
x86? In my experience, performance varies wildly on virtualized systems. Maybe
x86 VM worst case can be worse than running on bare metal on ARM?

Anything where you need only a few instances and have relatively low
performance requirements, like SOHO servers? I'd love to have something
_generic_ that consumes just a few watts but could do diverse tasks from
routing, VPN, file serving, etc. at 500+ Mbps.

I guess it remains to be seen what kind of niche ARM servers will carve. I'm
excited to try them out, to see how far they can be pushed.

------
csense
Modern websites have a lot of moving parts, and they _all_ have to move to ARM
before you can think about changing your server architecture.

Do proprietary databases run on ARM? What about the language your web
application is written in? Last I heard (~6 months ago) official ARM Java is
available for the bleeding edge, but I'm not sure if it's made it into an
official major release yet.

Then there's the incompatible softfloat / hardfloat divide, just to make life
fun for those who write JIT compilers (and regular compilers).

------
PaulHoule
Ordinary Windows 8 on x86 has the same waves of instability as the author
remarks about ARM builds.

I've noticed weird and wonderful changes in Windows 8: for example, a situation
that would have been an intermittent BSOD before now causes an instant and
deliberate reboot and an attempt to, as much as possible, pretend it didn't
happen.

I say it's about time.

~~~
voltagex_
What? There's still going to be memory.dmp somewhere in c:\windows that'll
tell you what happened.

~~~
PaulHoule
Sure, but it's just the right thing, the Unix thing to do.

------
Qantourisc
They say it only costs 30M to make an ARM, but how much would it cost to make
one that runs at the same practical speed as an x86? That's IMO the real
question ...

------
Qantourisc
Oh, I'd also like an ARM that can emulate x86 at native speed please. Otherwise
a lot of stuff will stop running :)

------
pbharrin
I think he meant TSMC when he said TWSC.

