
ARM v8-A with Scalable Vector Extensions: Aiming for HPC and Data Center - pella
http://www.anandtech.com/show/10586/arm-announces-arm-v8a-with-scalable-vector-extensions-aiming-for-hpc-and-data-center
======
dogma1138
Intel is losing the HPC space to NVIDIA (GPUs) and now ARM; they can't be happy.

I don't know how "profitable" the HPC market is, at least directly, especially
nation-state-level supercomputing. But the indirect gains from it (prestige,
contacts, and technology transfer during the co-development process) are
likely to be quite huge.

If AMD actually hits Intel where it hurts with Zen, and we don't get a
Bulldozen at the end like we got with, well, Bulldozer, Intel might be in for a
rough ride, especially if its 3D memory technology, which now seems to be the
"next big thing" as far as Intel's development goes, doesn't pan out that
well.

~~~
SixSigma
Intel just announced they will be fabbing ARM chips

[http://www.bloomberg.com/news/articles/2016-08-16/intel-lice...](http://www.bloomberg.com/news/articles/2016-08-16/intel-licenses-arm-technology-in-move-to-boost-foundry-business)

~~~
mtgx
Which only means ARM chips will become even more competitive with x86 chips.
Intel seems to want to be in the "dumb chip manufacturer" market (kind of like
how ISPs are "dumb pipes") - because if this business pans out for them, then
the x86 business will struggle even more as a direct consequence of it. x86
was barely competitive with ARM at the same performance and power consumption
level when Intel was using its manufacturing edge against ARM chip makers. If
that gets equalized, then ARM chips should win big against Intel's own chips.

However, I expect Intel to try to shoot itself in the foot in this business,
for the same reason they killed XScale previously. Intel is at a cross-point
now (pun intended). They have to make a choice: either they "go big" with ARM
chip manufacturing, or they try to protect the interests of the x86 division.
They may put all sorts of caveats on ARM chip customers that want to buy
Intel's manufacturing, so their chips can't compete directly with Intel's own
x86 chips. But that will significantly hurt the potential of its ARM chip
manufacturing business, so then they may have to ask themselves: why are they
even bothering with ARM chip manufacturing if they're not going to go all-in
on it?

So either Intel repeats its XScale mistake by protecting the x86 chip
business, and risks being completely left out of the mobile/IoT markets
forever (in any capacity) while also losing a few billion more dollars on
this "ARM manufacturing experiment", or it goes all-in with ARM chip
manufacturing, making anything from IoT ARM chips to Xeon ARM competitors for
anyone from MediaTek to Qualcomm, and then its x86 business risks a severe
decline over the next decade.

Either way, Intel has to make a choice. They can't have their cake and eat it,
too, no matter how much they'd wish that to happen right now.

~~~
pjmlp
Still, I would like them to keep improving x86; otherwise we will eventually
just switch chip overlords.

~~~
pcwalton
AArch64 is a very clean ISA. From an architectural point of view, I wouldn't
be upset if AArch64 eroded away x86-64.

Of course, I don't want to sacrifice competition. It'd be awesome to see Intel
and ARM competing in features, performance, power, and price on a simple
target for compilers and programmers.

(Yes, I know as a practical matter x86-64 will never go away; it's simply that
I like the direction the market is going.)

------
pella
with code examples:

"Little ARMs pump 2,048-bit muscles in training for Fujitsu's Post-K exascale
mega-brain"

[http://www.theregister.co.uk/2016/08/22/armv8_scalable_vecto...](http://www.theregister.co.uk/2016/08/22/armv8_scalable_vectors/)

~~~
monocasa
Those code examples are all NEON or no vector ops. No SVE unfortunately.
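
For anyone wanting a feel for the vector-length-agnostic idea in the
meantime, here is a minimal sketch in plain Python. It is not SVE code and
does not use any real SVE instruction or intrinsic; `vla_add` and its `lanes`
parameter are hypothetical names made up for illustration. It only models the
claim in The Register piece that one binary can run on different vector
widths: the loop never hard-codes the width, and a per-lane predicate masks
off the tail.

```python
def vla_add(a, b, lanes):
    """Add two equal-length lists in chunks of `lanes` elements.

    `lanes` stands in for the hardware vector length in elements
    (an assumption of this model); the loop body never hard-codes it.
    """
    assert len(a) == len(b)
    out = [0] * len(a)
    i = 0
    while i < len(a):
        # Predicate: which lanes of this chunk are still inside the array.
        active = [i + lane < len(a) for lane in range(lanes)]
        for lane, on in enumerate(active):
            if on:
                out[i + lane] = a[i + lane] + b[i + lane]
        i += lanes  # advance by however many lanes the "hardware" has
    return out


if __name__ == "__main__":
    a = list(range(10))
    b = [10 * x for x in a]
    # 4, 8 and 16 lanes model 128-, 256- and 512-bit vectors of 32-bit elements.
    for lanes in (4, 8, 16):
        print(lanes, vla_add(a, b, lanes))
```

The result is the same for every `lanes` value; only the number of loop
iterations changes, which is the point of not baking the width into the code.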

------
0xFFC
I would love to see more attacks from ARM on Intel in different sectors. I
would love to see ARM SoCs for desktop users.

Intel is kind of a company that is used to ripping people off when it has no
serious competitor. I really like the idea of competition in the CompArch area.

Let's be honest: before the rise of smartphones, Intel was sucking people's
blood, and in most areas they didn't have any serious competitor.

I know ARM was selling small chips, but AFAIK (I might be wrong) those markets
are not even close to the server, desktop, etc. markets, where Intel was king
for a long time.

~~~
SixSigma
Maybe this is part of the reason Intel decided to re-license ARM and get
fabbing

[http://www.bloomberg.com/news/articles/2016-08-16/intel-lice...](http://www.bloomberg.com/news/articles/2016-08-16/intel-licenses-arm-technology-in-move-to-boost-foundry-business)

------
cesarb
Does anyone know how this compares with Hwacha or the still unreleased RISC-V
Vector extension?

------
pella
[https://community.arm.com/groups/processors/blog/2016/08/22/...](https://community.arm.com/groups/processors/blog/2016/08/22/technology-update-the-scalable-vector-extension-sve-for-the-armv8-a-architecture)

------
ericvh
More details in the upstreamed binutils support:
[https://sourceware.org/ml/binutils/2016-08/msg00166.html](https://sourceware.org/ml/binutils/2016-08/msg00166.html)

------
gbrown_
Anyone have details on the cache design for this? 2048 bits is much larger
than most cache lines in chips today.

~~~
beautifulpeople
Looking at the article, I'm not sure 2048-bit lines are really needed (not to
say that wouldn't be interesting). From
([http://www.theregister.co.uk/2016/08/22/armv8_scalable_vecto...](http://www.theregister.co.uk/2016/08/22/armv8_scalable_vectors/)):
"And once a program has been built for SVE, it will run comfortably on any
SVE-capable processor without recompilation, whether the CPU has support for
512, 1,024 or the full 2,048 bits. The SVE unit can automatically break a
2,048-bit vector into, say, four 512-bit vectors if its silicon implementation
doesn't support the full length."

This paragraph from The Register implies that the chunks could be smaller or
larger depending on the silicon implementation. With a 64-byte cache line
(what most architectures have today; POWER and Itanium are notable
exceptions), that means 512 bits per line (assuming you can use the whole
line, i.e. the data is packed). A 2048-bit vector therefore spans 4 cache
lines' worth of data that could potentially be operated on at once.
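
The arithmetic can be checked with a few lines of Python. This is only a
back-of-the-envelope sketch: the 64-byte line size is the assumption from the
comment, the data is assumed packed and line-aligned, and
`cache_lines_per_vector` is a hypothetical helper name, not anything from ARM.

```python
CACHE_LINE_BYTES = 64  # assumed line size, as in the comment


def cache_lines_per_vector(vector_bits, line_bytes=CACHE_LINE_BYTES):
    """Whole cache lines needed to hold one packed, line-aligned vector."""
    vector_bytes = vector_bits // 8
    return -(-vector_bytes // line_bytes)  # ceiling division


if __name__ == "__main__":
    for bits in (128, 512, 1024, 2048):
        print(bits, "bits ->", cache_lines_per_vector(bits), "line(s)")
```

Under those assumptions a 512-bit vector fills exactly one line and a
2048-bit vector spans four; an unaligned vector would touch one more line
than this count.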

