
Japan plans 130-petaflops supercomputer - thetechgraph
http://arstechnica.com/business/2016/11/supercomputer-japan-130-petaflop-china-beating-number-cruncher/
======
jpatokal
Serious if likely naive question: what's the use case for supercomputers these
days, instead of just spinning up a cluster in your favorite cloud and running
your favorite MapReduce derivative on it? These guys pulled off a petaflop on
AWS with 156k cores for $33k, and that was in 2013.

[1] [http://arstechnica.com/information-technology/2013/11/18-hou...](http://arstechnica.com/information-technology/2013/11/18-hours-33k-and-156314-cores-amazon-cloud-hpc-hits-a-petaflop/)

~~~
fnord123
There are plenty of reasons. Here's a bullet list with some thoughts.

* Using virtual machines isn't a bad idea when you have a one-off calculation for a paper you need to publish. But some groups need to model and predict continually: weather forecasting, for example, has to run every day.

* Hadoop MapReduce and friends are extremely slow, and virtualization slows things down further. The person in your article was running software from Schrödinger, which isn't MR.

* Often, MapReduce performance is bottlenecked by the shuffle stage, which requires a lot of data to be passed around the network. Ethernet and HTTP connections are not ideal for this if performance is your goal (though they are good for resilience). RDMA (remote direct memory access) over Infiniband is much better (it offloads work from the CPU, there's less chatter, etc). RDMA plugins exist for Hadoop, but their quality is ymmv.

* MR isn't an ideal paradigm for every calculation. MR comes from search indexing (Google) and is useful where you need to look at all the data and moving the data around costs more than the calculations. If you can filter, feature-extract, or otherwise reduce the amount of data being moved, HPC-style clusters win out. Places where you must look at all the data: indexing, adversary detection (insurance fraud, intruder detection), format changes (mp4->webm, gz, json->bson), ingestion, backups. If you're not doing these things, consider whether MR is right for you.

* Even in search indexing, deep learning is coming into vogue, and it is extremely computationally expensive. This means the cost of moving the data around is again taking a back seat.

* A large part of the benefit of Hadoop-style MR was data locality. But network speeds have outpaced drive speeds, so locality matters less. Coupled with deep learning, there's less need for HDFS-type systems; GPFS or object storage is fine.
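To make the shuffle point concrete, here's a toy word count in plain Python (no framework, purely illustrative) showing where the all-to-all data movement would happen on a real cluster:

```python
from collections import defaultdict

def map_phase(documents):
    # Map: each node emits (key, value) pairs from its local data.
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def shuffle(pairs):
    # Shuffle: group all values by key. On a cluster this is the
    # all-to-all network exchange that becomes the bottleneck --
    # every mapper may have to send data to every reducer.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: combine each key's values into a final result.
    return {key: sum(values) for key, values in groups.items()}

docs = ["the cat sat", "the dog sat"]
counts = reduce_phase(shuffle(map_phase(docs)))
```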

Also, the page says it's only a theoretical petaflop:

"While Stowe says the Amazon cluster hit 1.21 petaflops, that's the
theoretical peak speed rather than the actual performance. In the Linpack
benchmark used to test supercomputer speeds, the theoretical peak is always
reported, but the real-world results are what count when ranking the world's
fastest machines."
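For context on what "theoretical peak" means: it's pure arithmetic, cores times clock times FLOPs per cycle, with no memory, network, or algorithm effects, which is why measured Linpack numbers always come in lower. A back-of-envelope sketch (the clock rate and per-core FLOP rate below are illustrative guesses, not the actual AWS instance specs):

```python
cores = 156_314       # core count from the article
clock_hz = 2.0e9      # illustrative 2 GHz clock (assumption)
flops_per_cycle = 4   # illustrative per-core SIMD rate (assumption)

# Theoretical peak: every core retiring its maximum FLOPs every cycle.
peak_flops = cores * clock_hz * flops_per_cycle
print(f"{peak_flops / 1e15:.2f} petaflops theoretical peak")
# prints "1.25 petaflops theoretical peak" -- in the ballpark of the
# 1.21 PF quoted above; no real workload sustains this rate.
```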

~~~
dankohn1
This is a fantastic answer. Could you or anyone else please point me to any
examples of trying to run high-resolution weather forecasting, such as COAMPS
[0] or WRF [1] on commodity cloud infrastructure instead of HPC?

[0] [http://www.nrlmry.navy.mil/coamps-web/web/home](http://www.nrlmry.navy.mil/coamps-web/web/home)

[1] [http://www.wrf-model.org/index.php](http://www.wrf-model.org/index.php)

And fnord123, could you please email me?

~~~
fnord123
I think the only thing that most HPC sites use which wouldn't be found in
cloud infrastructure is the interconnects and a parallel file system. Back in
2004 I guess HPC sites were often using PPC on AIX. That's just gone. It's
almost all Linux on x86_64 now aside from some interesting machines in Japan
and China.

So the notion of "commodity" hardware that Google used in the original MapReduce paper in 2004 is gone, and the benefits of data locality are all but gone. The paper was great when it came out, but it's long in the tooth now that the hardware is different.

To answer your actual question, I've never heard of anyone running WRF on
cloud infrastructure. It's not necessarily a terrible idea. If you're trying
to develop a competing model, you might need the agility: being able to roll
onto new hardware as it becomes available, instead of waiting on budget
cycles for a hardware refresh, lets you move from Tesla K80s to Titans
whenever AWS decides to make them available.

Searching around, I see Glenn Lockwood did some benchmarks in 2013 with AWS:

[https://glennklockwood.blogspot.be/2013/12/high-performance-...](https://glennklockwood.blogspot.be/2013/12/high-performance-virtualization-sr-iov_14.html)

This lot seem to have done some more benchmarking in 2014 (tl;dr: 44% overhead
on VMs - but take the time to read the post above too, since Lockwood is always a good read):
[https://www.researchgate.net/publication/301407792_Performan...](https://www.researchgate.net/publication/301407792_Performance_Characterisation_and_Evaluation_of_WRF_Model_on_Cloud_and_HPC_Architectures)

I think the prevailing issue is that latency between virtualized nodes is
poor, so you have to work around it. Maybe fat nodes with GPGPUs can overcome
that.

If you need to talk to me privately then you can message me on reddit.

------
brey
I wonder how long until we have that kind of computing power on our phones and
look back with a wry smile ...

or - how long ago would $173M have bought you the kind of power we carry
around today in a pocket?

~~~
ddeck
While not quite $173M, the 1985 Cray-2 cost $32 million in 2010 dollars[1],
weighed 2,500 kg, and consumed 150-200 kilowatts of power.

Estimates put the processing power of an iPhone 5 at around 2.7 times that
of the Cray-2[2].

[1] [http://www.theregister.co.uk/2012/03/08/supercomputing_vs_ho...](http://www.theregister.co.uk/2012/03/08/supercomputing_vs_home_usage/)

[2] [http://pages.experts-exchange.com/processing-power-compared/](http://pages.experts-exchange.com/processing-power-compared/)

~~~
sqldba
Wish there was a good book about the history of crays. It's the kind of thing
you hear talked about but are unlikely to ever see.

Mainframes in general too.

I'm currently listening to Soul of a New Machine.

~~~
strictnein
> "Wish there was a good book about the history of crays. It's the kind of
> thing you hear talked about but are unlikely to ever see."

That would be a great read.

My favorite Cray tidbit: for fun, Seymour Cray dug tunnels underneath his
home, and had a lot of his breakthroughs while doing so.

    he attributed the secret of his success to "visits by elves" while he
    worked in the tunnel: "While I'm digging in the tunnel, the elves will
    often come to me with solutions to my problem."

[https://en.wikipedia.org/wiki/Seymour_Cray#Personal_life](https://en.wikipedia.org/wiki/Seymour_Cray#Personal_life)

~~~
sedachv
My favorite Seymour Cray anecdote is from Alan Kay's AMA here on HN a few
months ago:

"Seymour Cray was a man of few words. I was there for three weeks before I
realized he was not the janitor."

[https://news.ycombinator.com/item?id=11941941](https://news.ycombinator.com/item?id=11941941)

------
stuntprogrammer
An important detail is that this is not 130 petaflops of double-precision
floating point. The target is >130 petaflops at single or even half precision.

------
BuuQu9hu
Some technical details would be interesting.

~~~
hvidgaard
Bidding to build the computer is open until Dec 8, so right now they only
know how fast they want it to be, and that's about it.

~~~
chx
But then how do they know the price?

~~~
ethbro
Because it's a government project. We give you a price and specs, you build it
late with cost overruns, then we realize the specs weren't really what we
wanted.

------
thesz
How much power will this thing draw?

Right now, I'd guess 1 GFLOPS costs about 0.5-1 W of power, which means a
130 PFLOPS computer would draw 65-130 MW, possibly even more.

It can be a problem.
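Running the numbers on that efficiency guess (a rough sketch; real watts-per-GFLOPS varies widely by architecture and precision):

```python
pflops = 130
gflops = pflops * 1e6            # 1 PFLOPS = 1,000,000 GFLOPS
watts_per_gflops = (0.5, 1.0)    # the rough efficiency guess above

# Total draw at each end of the guessed efficiency range.
low, high = (gflops * w for w in watts_per_gflops)
print(f"{low / 1e6:.0f}-{high / 1e6:.0f} MW")  # prints "65-130 MW"
```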

------
at-fates-hands
Does it worry anybody that China owns the top two positions and the US
supercomputers in places 3-6 aren't even close to the top two in terms of
performance?

~~~
dekhn
No. Having the top 3 Top500 supercomputers doesn't translate into any real
economic benefit or other gain. They're really just wastes of resources.

------
ju-st
Is $173M expensive or cheap for 130 petaflops?

~~~
jankiel
Sunway TaihuLight (the most powerful supercomputer right now) was benchmarked
at 93 petaflops[1]. It cost 1.8 billion yuan (US$273 million). So cheap, I
guess.

[1] [https://www.top500.org/news/china-tops-supercomputer-ranking...](https://www.top500.org/news/china-tops-supercomputer-rankings-with-new-93-petaflop-machine/)
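A quick cost-per-petaflop comparison (with the caveat that TaihuLight's 93 PF is a measured Linpack number, while the Japanese machine's 130 PF is a target, likely at lower precision):

```python
systems = {
    "Sunway TaihuLight": (273e6, 93),          # US$273M, 93 PFLOPS measured
    "Planned Japanese machine": (173e6, 130),  # US$173M, 130 PFLOPS target
}
# Dollars per petaflop for each system.
for name, (cost_usd, pflops) in systems.items():
    print(f"{name}: ${cost_usd / pflops / 1e6:.1f}M per petaflop")
# TaihuLight comes out around $2.9M/PF; the new machine targets ~$1.3M/PF.
```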

------
zumu
I have no relation to ars technica, but I found their version of the story to
be written in much more coherent English. For parties interested:
[http://arstechnica.com/business/2016/11/supercomputer-japan-...](http://arstechnica.com/business/2016/11/supercomputer-japan-130-petaflop-china-beating-number-cruncher/)

~~~
sctb
Thanks! We updated the link from [https://thetechgraph.com/2016/11/28/japans-new-130-petaflops...](https://thetechgraph.com/2016/11/28/japans-new-130-petaflops-supercomputer-will-cost-173-million/) a little while ago.

------
burai
But will it run cr... Sigh, need to update my cliches, BRB

~~~
chrift
BUT WILL IT BLEND?!

------
debt
i look forward to these massive supercomputers taking over and eventually
being capable of computing anything instantly.

all algorithms will become O(1) and even the shittiest code will be
instantaneous. which is great for me.

great for me maybe bad for encryption and governments and idk.

------
sickbeard
Every time I see one of these headlines, I think "so what?". Are we just awed
by how many "flops" it has or is there some actual practical advancement we
should be looking forward to?

