
Linux dominates supercomputers as never before - tanglesome
http://www.zdnet.com/linux-dominates-supercomputers-as-never-before-7000030890/
======
zurn
Another way to put it is that specialized supercomputers no longer exist, and
the top500 is now populated with super-branded racks of PCs running Linux with
souped-up network cards.

For the young ones in the audience, the supercomputer term has historically
been used to describe machines with largely custom hardware from the CPUs up -
or at least from the CPUs onwards. Cray vector machines are the canonical
example. These were shared-memory / single-system-image machines: the
software really saw them as one computer instead of a networked cluster.

~~~
alephnil
I don't think this description is entirely fair. What is expensive about these
computers is the fast interconnect between processors and modules, which is
needed for the many scientific problems that can't easily be divided into
independent subproblems. Just having a large cluster of computers connected
with normal Ethernet would not cut it. OK, you say souped-up network cards,
and if you had said souped-up switches as well, I could have given you some
points, but this is the major cost of modern supercomputers. You can get a
non-souped-up cluster with the same number of processors for a one-digit
percentage of the price of the supercomputer.

~~~
walshemj
Some Hadoop clusters are looking at/using InfiniBand - cluster network design
depends on what sort of problems you're dealing with.

~~~
weland
Pretty sure everything in top500 is using infiniband by now.

I was briefly involved with the supercomputing world about four years ago.
"Should we use InfiniBand?" was already considered a rhetorical question even
then.

~~~
alephnil
Without InfiniBand or a similar specialized interconnect, it is usually not
considered a supercomputer at all, but merely a cluster.

~~~
Alupis
I think the lines between "Super Computer" and "High Performance Cluster" have
become blurred.

------
Alupis
It really amazes me that Linux is so darn flexible that it runs on everything
from some of the world's largest machines down to the smallest embedded
devices.

After years of being a Windows fanboy -- I look back now and am very glad I
made the switch. With Linux -- it's never a case of "you can't do that".

~~~
hadrianoliveira
It also really amazes me how crappy the drivers vendors make for Linux are
compared to their Windows ones.

~~~
Alupis
This is not really true anymore.

About 99% of all devices "just work" out of the box; the other 1% are either
very obscure/specialty hardware or brand-new hardware. Linux doesn't need a
vendor to support it like it used to... most generic drivers are built in
and work OK (just like the generic drivers built into Windows nowadays).

~~~
adamors
> About 99% of all devices "just work" out of the box, the other 1% either are
> very obscure/specialty hardware, or are new hardware

Well, if you count most GPUs under specialty hardware then I guess you're
right, but most people don't. And most Linux GPU drivers still pale in
comparison to their Windows versions.

~~~
eslaught
Actually I would think GPUs are one of the biggest counterexamples where
driver support has been noticeably improving recently (mostly thanks to Valve
and Steam). Even the open source drivers work well enough to let you get work
done (although the performance might not be where you want it to be).

Compare that to driver support for things like the fingerprint reader in a
Lenovo Thinkpad. I wouldn't want to bet on getting that working at all, unless
Lenovo had preinstalled Linux for you. And unlike GPUs, this is likely to be
all or nothing, so if you don't get the drivers you can't use that hardware
feature at all.

~~~
jnbiche
Actually, there are both closed and open source drivers for Thinkpad
fingerprint readers. The open source ones are even well-maintained:

[https://launchpad.net/~fingerprint/+archive/fingerprint-gui](https://launchpad.net/~fingerprint/+archive/fingerprint-gui)

~~~
eslaught
Aha! Good to be proven wrong there.

------
onalark
It's worth pointing out that this is what most people think of as the
traditional "Linux Operating System" but sans the Linux kernel.

Also, there are a few components on these supercomputers that you won't find
on a typical workstation or cluster machine.

First, the kernel itself is frequently very small, lightweight, and much
closer to what you would find on an embedded system than on a traditional
desktop computer. That's because once the program on a supercomputer is
loaded, the kernel's job is mostly to get out of the way. This isn't to say
that you don't find the Linux kernel on these supercomputers; it just isn't
as common as you would think from reading this piece.

Fear not, many pieces of these operating systems are still open source. Here's
IBM's FusedOS prototype:
[https://github.com/ibm-research/fusedos](https://github.com/ibm-research/fusedos).

Second, many of these computers primarily run code in C, C++, Fortran, and
Python. These tend to be the only major languages in play on HPC machines,
with acceleration frameworks such as OpenMP, OpenCL, and CUDA playing major
roles.

Finally, everything is glued together with MPI, a high-level (at least it was
in the 90s) abstraction for scientific programming that maps down to very
high-performance networks designed to help scientific codes "scale", that is,
run effectively when millions of cores are simultaneously engaged.

These are beautiful machines producing important science, and the GNU/Linux
operating system plays an incredibly important role in both their
implementation and culture.

~~~
gh02t
This is not exactly true, most of the ones I know are running straight up
Linux. Titan, for example, runs basically SLES on the login nodes and Compute
Node Linux on the compute nodes. CNL is a pared down version of Linux, but it
definitely is the "real" Linux kernel with the functionality you'd expect to
be there.

Most are just slightly spruced-up commodity server hardware running Linux. I'm
not sure if this is what you're suggesting, but they don't run
C/Fortran/whatever on bare metal. Programs are run by the OS on the compute
node just like a normal OS process, except that tasks are dispatched to
compute nodes by a central cluster manager. Processes running in a gang
communicate via MPI to share data, though coprocessors are also pretty
popular, so you see a lot of communication between the host processor and a
coprocessor too. Titan and Tianhe both actually have most of their compute
power in the coprocessors (Nvidia Teslas and Xeon Phis, respectively), but
they're still in a master-slave arrangement just like if you bought a Phi or
Tesla and stuck it in a spare PCI-E slot. They use plain old PCI-E, too. The
Cray XT/XE series (a popular line of which Titan is an example) is basically
just really nice blades with integrated cooling and a network backbone in a
custom cabinet, possibly with coprocessors attached to each blade. You could
just as easily run Windows XP and play Minesweeper on each blade if you really
wanted to, except maybe for some driver issues. The most foreign thing is
probably the network backbone, where fabric architectures like InfiniBand are
popular.

They're also not limited to specific programming languages. In truth, you can
run whatever you want if someone has paid the bill for your resource
allocation. I watch people run MATLAB on large clusters all the time, which
hurts me because it's so damned inefficient. That said, Fortran and C++
comprise the overwhelming majority of large and computationally taxing codes.
Just because all that power is there doesn't mean that all of the users take
proper advantage of it. One of the larger calculations run on Titan that I
know of (Denovo, a nuclear reactor simulation code) didn't even use the GPUs,
only the CPUs. Making codes that can take advantage of GPU processing on
machines like Titan and its predecessor Jaguar has been a major project at
the DOE, with libraries like Trilinos being developed to make it easier on
scientists, many of whom are computer programmers only as a secondary concern.

The setup you're describing used to be how it was until maybe 10 years ago,
and there are still systems in the top 500 that work like that. Probably some
new ones being built, too. But what I've described seems to be what's in
fashion these days, and the machines I use are all like that. I've heard
mumblings about FPGA coprocessors being the Next Big Thing, but we will see.

~~~
onalark
Yup, thanks for the comment. I think FPGA has interesting potential for
sequencing (and we need more sequencing compute), I'm not sure what's ahead
for the simulation machines.

~~~
gh02t
FPGAs have potential for simulation too from what I know, but I honestly don't
know enough to comment intelligently about them. I expect most people will use
them via an intermediate abstraction layer, since they can apparently be tough
to program for.

------
CraigJPerry
What does Linux not dominate?

I can think of only a few categories: desktops, ATMs, real time, SCADA.

What are the remaining big sectors Linux doesn't yet dominate?

~~~
nardi
How has no one mentioned _mobile_? Smartphones/tablets? You know, the largest
segment of general computation devices?

~~~
azdle
> How has no one mentioned mobile?

Because Linux does dominate mobile? Android is Linux, and iOS is a *nix. (iOS
is a BSD derivative.)

~~~
JoshTheGeek
BSD isn't Linux. Android uses the Linux kernel, but iOS doesn't. I'm not sure
which dominates, though.

~~~
barbs
In numbers, I'm pretty sure Android dominates (I can't remember the exact
figures, but Google should reveal them for you).

In revenue (app-sales etc) iOS dominates.

~~~
shmerl
_> In revenue (app-sales etc) iOS dominates._

Who did that research? I'm not sure it's even possible to know exactly,
unless all vendors who run their own stores publish sales statistics.

~~~
TheSoftwareGuy
>unless all vendors who run their own stores publish sales statistics.

They do, that's one of those things that publicly traded companies are
required to do.

~~~
nl
That's untrue.

Publicly traded companies are required to disclose revenue, but not required
to break it down by source.

Neither Apple nor Google breaks out the revenue from the *Stores (unless they
have started doing it very recently). There are third party analytics firms
that provide reasonable estimates though[1].

[1] eg [http://techcrunch.com/2014/06/23/google-play-quarterly-app-r...](http://techcrunch.com/2014/06/23/google-play-quarterly-app-revenue-more-than-doubled-over-past-year-thanks-to-games-freemium-apps/)

------
jason_slack
I think that I read a long while back that there was a cluster of PlayStation
3's, all running Linux that was pretty fast too.

Edit: Here is one of the articles:
[http://phys.org/news/2010-12-air-playstation-3s-supercompute...](http://phys.org/news/2010-12-air-playstation-3s-supercomputer.html)

------
piokuc
The article mentioned that ECMWF's supercomputer was the fastest of those not
running Linux. It's worth mentioning that ECMWF is in the process of migrating
to a Cray XC30 system, which runs Linux.

------
nroose
Um... That graph goes up to 600. Kind of an epic fail when it is an article
about the fastest computers in the world.

~~~
mnw21cam
Yes, and also a fairly basic research fail: referring to ECMWF's _old_
supercomputer at position 60 as the UK's weather-predicting system. They have
a couple of new systems at positions 19 and 20. And ECMWF is for Europe, not
the UK, even if it is based in Reading. The clue is in the name.

------
bch
What _are_ examples of HPC installations in which *BSD is key? FreeBSD,
NetBSD, DragonFly, OpenBSD?

------
lucidguppy
What alternatives are there really?

