
Fujitsu Switches Horses for Post-K Supercomputer, Will Ride ARM into Exascale - rbc
https://www.top500.org/news/fujitsu-switches-horses-for-post-k-supercomputer-will-ride-arm-into-exascale/
======
monocasa
> Designing a manycore CPU, which the Post-K processor will almost certainly
> be, with a simpler RISC core as its base, is inherently more efficient than
> trying to do that with a more complex architecture like SPARC64.

Honestly, ARM is a very complex architecture these days. There's somewhere in
the neighborhood of a thousand instructions. It's very close to x86 in
complexity. Tons of cruft built up over the past thirty years too (although it
was a cleaner architecture than x86 to start off with, so it has that going
for it).

~~~
trsohmers
While the conversion from x86 to Intel's internal micro-ops obviously adds
complexity, internally Intel has been a RISC load/store architecture since
2006. Since ARM moved to ARMv8, I groaningly refer to v8 as "x86 v2", with
most implementations having ridiculous pipelines and OoO execution. The only
genuinely good ARMv8 implementation I've seen doesn't implement ARMv8 at all
internally... NVIDIA's (now basically dead) Denver architecture does
Transmeta-style code morphing to support ARMv8 instructions while having a
much simpler (and more efficient) internal architecture.

~~~
scriptdevil
Denver isn't close to dead. I work in the Denver team at NVIDIA. If you mean
market adoption, yes - Nexus 9 is the only major device to have shipped with
Denver cores as of today. But development is still very active internally.

~~~
lgeek
Somewhat off topic, but I wish NVIDIA published more information about
Denver's DBO and the reasoning behind some design decisions. For example, the
company line is to just generate 'decent code' and let the DBO take care of it
[0], but that clearly doesn't always work in practice [1], so generating good
code for it requires a lot of trial and error and some reverse engineering.
Also, the decision to always invalidate the translation when writing to an
executable page (or whatever other super-slow operation happens in that case)
seems odd, and it hurts some JIT approaches, since ARM software is expected to
explicitly flush the caches when modifying code anyway.

[0]
[https://www.youtube.com/watch?v=oEuXA0_9feM](https://www.youtube.com/watch?v=oEuXA0_9feM)

[1] [https://github.com/ssvb/tinymembench/wiki/Nexus-9-(Tegra-TK1-T132---Denver)](https://github.com/ssvb/tinymembench/wiki/Nexus-9-\(Tegra-TK1-T132---Denver\))

------
cordite
In the general space of computing, what alternatives to the Xeon Phi are there
for people who want to run their own mini-supercomputer tasks? (Very
subjective on task definition, but for now let's suppose things that are hard
to do effectively on a GPU.)

It seems like the Parallella tried to get close, but did not succeed in
breaking into the market. (On Amazon it looks to be marketed more as a
high-powered Raspberry Pi alternative...)

Which approach seems more useful? Compiling the same code with alternate flags
(Xeon Phi style); cross-compiling to other architectures (x86 host -> ARM
binary); or JITing bytecode, LLVM bitcode, or other intermediate-representation
formats?

What kinds of access patterns would be most common for a hobbyist or an
enterprise to cater to? For example, one issue the Xeon Phi has is memory
controller contention, which makes it less suitable for less structured
relational analysis.

------
hangonhn
I don't get why people keep reporting this as ARM unseating x86 when in this
specific case it's ARM replacing SPARC. Is there a perspective I'm missing?

~~~
trhway
An interesting choice for Fujitsu, to say the least. I mean, they do have
SPARC/RISC experience, and given that the current top dog in the world is
Alpha/RISC based
([https://en.wikipedia.org/wiki/Sunway_TaihuLight](https://en.wikipedia.org/wiki/Sunway_TaihuLight)),
it seems strange to make such a huge bet on a completely new architecture; the
Top500 has no ARM systems, at least at the top of the list.

Besides the big picture, there are pesky details like a "Fortran for ARM" HPC
compiler. All those bearded (and not-so-bearded) guys who walked Sun's
hallways for years ... The Intel HPC compilers are also well established. What
is available for ARM in that department?

~~~
beautifulpeople
I think you're missing the big picture here. Cray has been looking at ARM for
a while for Fast Forward 2 (US DoE, I think; announced 2014:
[http://investors.cray.com/phoenix.zhtml?c=98390&p=irol-newsArticle&ID=1990117](http://investors.cray.com/phoenix.zhtml?c=98390&p=irol-newsArticle&ID=1990117)).
The Barcelona Supercomputing Center also deployed ARM-based test systems quite
a while ago
([http://nvidianews.nvidia.com/news/barcelona-supercomputing-center-to-deploy-world-s-first-arm-based-cpu-gpu-hybrid-supercomputer](http://nvidianews.nvidia.com/news/barcelona-supercomputing-center-to-deploy-world-s-first-arm-based-cpu-gpu-hybrid-supercomputer),
[http://gizmodo.com/5898633/can-an-arm-based-supercomputer-become-the-worlds-fastest](http://gizmodo.com/5898633/can-an-arm-based-supercomputer-become-the-worlds-fastest)).

~~~
gnufx
For what it's worth, there's current ARM propaganda [huge PDF!] at
[http://emit.tech/EMiT2016/Adeniyi-Jones-EMiT2016-Barcelona.pdf](http://emit.tech/EMiT2016/Adeniyi-Jones-EMiT2016-Barcelona.pdf)
(and incidentally something related to a more interesting sort of ARM-based
system from the same meeting:
[http://emit.tech/EMiT2016/Navaridas-EMiT2016-Barcelona.pdf](http://emit.tech/EMiT2016/Navaridas-EMiT2016-Barcelona.pdf)).
That may not be terribly relevant to Fujitsu, though.

