

Hadoop's Uncomfortable Fit in HPC - drjohnson
http://glennklockwood.blogspot.com/2014/05/hadoops-uncomfortable-fit-in-hpc.html

======
jandrewrogers
As someone who has straddled the worlds of HPC and Big Data for years, I'd say
this is a pretty solid summary of the issues. From my perspective, both sides
could learn quite a bit from the other, but there is little cross-fertilization.

Basically, Big Data is much more sophisticated at distributed-systems computer
science, and HPC is much more sophisticated at massively parallel algorithms.
HPC tends to throw money at hardware to solve distributed computing problems,
and Big Data is still very limited when it comes to parallel algorithm design
techniques because its sharding models are incredibly primitive (hash and
range only? no thanks), partly because of its original use cases.
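
To make "primitive" concrete, here's a minimal sketch of the two schemes in
Java (the shard count and split points are made up for illustration, not from
any particular system):

    import java.util.UUID;

    // Illustrative sketch of hash vs. range sharding; NUM_SHARDS and the
    // split points are invented for the example.
    public class Sharding {
        static final int NUM_SHARDS = 16;

        // Hash sharding: spreads keys uniformly but destroys key locality,
        // so any range scan has to touch every shard.
        static int hashShard(String key) {
            return Math.floorMod(key.hashCode(), NUM_SHARDS);
        }

        // Range sharding: preserves sort order for scans, but hot key
        // ranges concentrate load on a single shard.
        static int rangeShard(String key, String[] splitPoints) {
            for (int i = 0; i < splitPoints.length; i++) {
                if (key.compareTo(splitPoints[i]) < 0) return i;
            }
            return splitPoints.length;
        }

        public static void main(String[] args) {
            String[] splits = {"g", "n", "t"};  // four range shards
            String key = UUID.randomUUID().toString();
            System.out.println("hash shard:  " + hashShard(key));
            System.out.println("range shard: " + rangeShard(key, splits));
        }
    }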

In my perfect world, a good platform would combine the areas of sophistication
from both, but people in one domain tend to lack the perspective and experience
of the other. And in many cases, the only way you can learn the more advanced
techniques is by working in the field; I can't tell you how much really
sophisticated HPC computer science is not documented anywhere on the web, yet
I used it every day when I was working on those systems.

~~~
KaiserPro
The one thing that HPC (and VFX) have over anyone else is rational thought.

A thing is used because it is the very fastest. We know it's fast because we
have the numbers to prove it.

InfiniBand is expensive, but it's much faster than Ethernet for RDMA. HTTP is
far too inefficient for MPI. POSIX file systems are brilliant. Object storage
is far too slow and unreliable (see HTTP) and usually misses important features
like locking or concurrent access.

When you're paying by the hour, that 5-30% VM/bytecode penalty is ruinously
expensive.

------
colin_mccabe
This article is a good overview-- one of the most informed I've seen from the
HPC side of the world.

I don't think "HDFS is very slow and very obtuse" is true, though. If you use
HDFS for what it was intended for-- reading large files that are mostly
local-- it does pretty well. MapReduce and Impala explicitly support this
paradigm by looking at where files are and scheduling computation there. If
your workload has a ton of metadata operations, this can become a bottleneck,
but that is not unique to HDFS. For example, if you use "ls" with the wrong
options on Lustre, things can get quite slow due to the large number of
metadata operations generated:
[https://wikis.nyu.edu/display/NYUHPC/Lustre+FAQ](https://wikis.nyu.edu/display/NYUHPC/Lustre+FAQ)
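
On the locality point, the placement data those schedulers consume is exposed
directly through the standard FileSystem API. A minimal sketch (the path is
made up, and it assumes a reachable HDFS configuration):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Prints which hosts hold each block of an HDFS file -- the same
    // locality data MapReduce and Impala use to schedule computation
    // near the data.
    public class BlockLocations {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            FileStatus stat = fs.getFileStatus(new Path("/data/example.txt"));
            for (BlockLocation b : fs.getFileBlockLocations(stat, 0, stat.getLen())) {
                System.out.printf("offset %d, length %d, hosts %s%n",
                    b.getOffset(), b.getLength(), String.join(",", b.getHosts()));
            }
            fs.close();
        }
    }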

Lustre is a cool filesystem in many ways, and has been battle-hardened for
HPC. However, I think Lustre is relatively unlikely to gain ground in Hadoop
due to the complexity of administering and upgrading its in-kernel server.
Ceph might have a chance, though-- we'll see.

The article also doesn't discuss a lot of the areas in which Hadoop is way
ahead. Hadoop handles faults in individual nodes, rather than just assuming we
will use gold-plated hardware with infallible RAID. Node failures happen, and
HDFS and Hadoop can recover-- even from NameNode failures, these days.
Google's Spanner paper points the way towards having strong consistency
without sacrificing (as much) availability.

I agree that cramming traditional HPC problems into the Map-Reduce framework
is not really a great idea. Instead, it would be better to use something like
Spark (which, to his credit, Lockwood mentions). Spark is getting a huge
amount of attention right now, and I think it's going to resolve some of these
issues.
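
Part of the appeal is that Spark keeps working sets in memory between steps
rather than round-tripping through HDFS after every map/reduce pair. A toy
sketch with the Java API (local master and made-up data, obviously):

    import java.util.Arrays;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    // Toy example: a map/reduce over an in-memory RDD. Iterative algorithms
    // can reuse the same RDD across steps without touching disk.
    public class SparkSketch {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("sketch").setMaster("local[*]");
            JavaSparkContext sc = new JavaSparkContext(conf);
            double sumOfSquares = sc.parallelize(Arrays.asList(1, 2, 3, 4, 5))
                .map(x -> (double) (x * x))  // transform each element
                .reduce(Double::sum);        // aggregate across partitions
            System.out.println("sum of squares = " + sumOfSquares);
            sc.stop();
        }
    }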

------
batbomb
Hadoop will not penetrate HPC.

I'll say it again:

Hadoop will not penetrate HPC.

The reasons why vary quite a bit. But the core reasons are these:

Data is rarely local.

Resources are shared.

Users are heterogeneous, even if the cluster isn't.

Users don't typically have CS degrees or understand more than basic networking
(computational chemists and biologists tend to be a bit more advanced).

Users still heavily rely on Fortran.

Users rarely know much about SQL, so even Hive is often out of the question.

Clusters are rarely under the control of users. On top of that, users are
rarely able to launch VMs.

Lots of other reasons too.

~~~
seanmcdirmid
And yet, CUDA is currently dominating in many HPC applications. There are
reasons Hadoop/MapReduce is a poor fit for HPC, but your points above aren't
very universal.

~~~
KaiserPro
In HPC they are.

CUDA wins because it is wonderfully fast. There is a huge benefit to
implementing it, even though it requires lots of work (as does efficient MPI),
because the rewards are massive.

Hadoop uses REST, ffs. You're never going to get low latency with REST. It's
a nasty hack in the world of the web, let alone HPC, where there really isn't
any excuse for using HTTP to do something it's really not designed for.

But the question is this:

What do people actually use Hadoop for?

~~~
growse
What do people use Hadoop for? Are you asking because you genuinely don't
know?

Hadoop lets me pick up PBs of (typically) record-format data and stick it on a
bunch of cheap servers that are already standard items in my datacenter
catalogue and are well understood by my datacenter/ops teams. I can then open
up any Java IDE, write a program, and process that data. Up until this point,
I've not really needed much specialist knowledge, perhaps with the exception
of understanding the map/reduce paradigm, which is a fairly basic programming
construct anyway.
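
To show how basic: the canonical WordCount mapper and reducer below are
essentially the whole paradigm (this is the standard tutorial shape, not
production code):

    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    public class WordCount {
        // Map: emit (word, 1) for every token in the input line.
        public static class TokenizerMapper
                extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                StringTokenizer it = new StringTokenizer(value.toString());
                while (it.hasMoreTokens()) {
                    word.set(it.nextToken());
                    ctx.write(word, ONE);
                }
            }
        }

        // Reduce: sum the counts for each word.
        public static class IntSumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) sum += v.get();
                ctx.write(key, new IntWritable(sum));
            }
        }
    }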

There are lots of companies out there with data problems bigger than a few TB,
and they're bored of shovelling money at Oracle for a fancy, expensive
database when their data isn't actually relational to begin with.

~~~
KaiserPro
That's the thing: so I get that it gives you a network-aware fork(). It's just
that the scheduling is a bit, well, lacking.

In my datacenter, we run dirt-cheap iron. The only special things are the core
switches (lots and lots of ten-gig ports, to cope with the petabytes that we
shove across them daily).

We use standard Linux tools and sharded POSIX file systems with a decent task
dispatcher that understands dependencies.

------
cabinpark
I guess, but at an HPC conference I went to last year there was a big session
on Hadoop, introducing it to researchers as a potential alternative for
certain tasks. I also believe that technologies take a lot longer to penetrate
academia than industry, given the nature of academics, so I don't doubt
Hadoop's slow acceptance into the HPC community. Look at GPUs, for example.
They've been around for a while, but it's really only in the last few years
that they have been put into researchers' toolkits.

I do know one math professor who just spent a few hundred thousand on a Hadoop
cluster to experiment with numerical algorithms. Talking with him (and I trust
his opinion, since he does research specifically in high-performance linear
algebra), his take was that the technology is very appealing and can be an
excellent addition to the repertoire. However, there is still much work to be
done, since numerical linear algebra (which is the backbone of almost every
HPC calculation) on the Hadoop framework is very young. I believe it'll take a
few years for researchers to determine whether Hadoop is a useful tool for the
types of problems academics solve, or whether it's just not as effective as
the current tools.

------
tdurden
> Its runtime environment exposes a lot of very strange things to the user for
> no particularly good reason (-Xmx1g? I'm still not sure why I need to
> specify this to see the version of Java I'm running

I am pretty sure you don't need to specify a maximum memory allocation size to
determine the version of Java being used.

~~~
MichaelGG
He added(?) a note explaining that the default is 4GB and that the JVM will
fail to start if the user has memory limits in place, which is common in HPC
configurations.

~~~
EdwardDiego
The default is 1/4 of physical memory for a server JVM; I'd emphasise the word
"server". Client JVMs get 1/4 of physical memory, up to a max of 256MB.
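
Easy to verify; this trivial sketch prints the effective cap, so you can run
it with and without an explicit -Xmx:

    // Prints the JVM's effective max heap. With no -Xmx flag, a server-class
    // JVM picks roughly 1/4 of physical RAM; under a strict memory ulimit it
    // can fail to start before main() ever runs -- hence the blog author
    // needing -Xmx1g just to run `java -version`.
    public class MaxHeap {
        public static void main(String[] args) {
            long maxBytes = Runtime.getRuntime().maxMemory();
            System.out.printf("Max heap: %d MiB%n", maxBytes / (1024 * 1024));
        }
    }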

I always enjoy reading blogs by people unfamiliar with a technology about why
their poor understanding of it indicates that the technology they know better
is superior.

------
ChuckMcM
This was a fascinating look at an HPC person's thoughts on Hadoop. Interesting
that before Beowulf, "supercomputing" was exemplified by the Cray series. The
word was that NUMA was incompatible with HPC, but folks found ways around
that.

It's a totally valid observation that many HPC problems are not amenable to a
Hadoop cluster, but it is not clear to me yet that there are no HPC problems
that a distributed-data approach might handle better than the current systems
:-)

~~~
KaiserPro
Scheduling is what HPC can do much better. In the Hadoop world, the concept of
data/message/cache/storage/memory affinity seems to be a novelty.

------
epaulson
The real mismatch is handling failures. The HPC world has grown up handling
failures by failing the entire computation. As a defense against failure, they
usually checkpoint the entire computation.
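
In code, that defense amounts to little more than this (a minimal sketch; the
file name and the step-counter "state" are made up, and real checkpoints
serialize far more than a counter):

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardCopyOption;

    // Restart-from-checkpoint skeleton: any failure kills the whole run,
    // and the next run resumes from the last checkpoint on disk.
    public class CheckpointLoop {
        static final Path CKPT = Paths.get("state.ckpt");

        public static void main(String[] args) throws Exception {
            long step = Files.exists(CKPT)
                ? Long.parseLong(Files.readString(CKPT).trim())  // resume
                : 0;                                             // cold start
            for (; step < 1_000_000; step++) {
                compute(step);
                if (step % 10_000 == 0) {
                    // Write to a temp file, then rename atomically so a crash
                    // mid-write can never corrupt the checkpoint itself.
                    Path tmp = Paths.get("state.ckpt.tmp");
                    Files.writeString(tmp, Long.toString(step));
                    Files.move(tmp, CKPT, StandardCopyOption.ATOMIC_MOVE);
                }
            }
        }

        static void compute(long step) { /* one unit of work */ }
    }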

Part of this is the fault of the MPI standardization process: MPI didn't even
try to deal with dynamic-but-expected changes to the computation (much less
changes caused by "node" failure) because not all of the initial systems MPI
was targeting could handle nodes coming or going. In this way, MPI was sort of
a step backwards - PVM was much more flexible than MPI from the get-go.

I haven't paid much attention to MPI in a few years, so maybe some of the
MPI-FT approaches have caught on.

One of the initial targets for YARN was MPI, but it was never clear to me that
any MPI users actually wanted to use YARN.

With just a few tweaks, PVM and AFS could have given us a pretty good Big Data
stack 15 years ago (and, in the case of security, a better stack than what
we've got today). The hardware wasn't quite cheap enough to make the case as
compelling as it is today, and the MapReduce programming model helped sell the
entire approach, but it was a real missed opportunity.

~~~
anonymousDan
Out of interest, what are the security issues you see with Big Data stacks?

~~~
TallGuyShort
Most of the tools are written under the assumption that you're behind a
firewall or airgap that will do all your security for you, and the controls
are very weak and poorly enforced in the younger projects. Even in the mature
ones, there's hasn't really been a cohesive model of security across the stack
that meets the needs of stricter organizations. That's improving - I'm doing
some work with Apache Sentry (incubating) that lets you apply more
sophisticated security policies to fine-grained data in Hive, Impala and Solr
(and in the future, hopefully many others). There are also projects like
Apache Knox that, as I understand it, is a gateway for your cluster - so
taking the idea of having an external tool protect your cluster, but at least
one that understands what's happening in the cluster and handles
authentication, etc. Multi-tenancy is also in it's infancy for Hadoop - but
I'd expect as more of the projects start integrating better with YARN's
resource management this will also be improving in the very near future.

------
pjmlp
I think one possible good outcome of people trying to squeeze Hadoop into the
HPC world is already happening: the pressure to give more focus to performance
on the Java 9 roadmap.

Whether it will really happen remains to be seen, but I am following it with
interest.

------
frozenport
The real problem is everybody trying to jump on the Big Data™ train. I work a
block from the NCSA, and everybody is saying "big data", even when it means a
few gigabytes in a traditionally non-data format like images. The molecular
dynamics simulations at UIUC don't generate Big Data unless you count the log
files, which are written once and are an order of magnitude smaller than what
most people see.

"What is Big Data?" is frequently answered with "What I do!"

~~~
EdwardDiego
The only working definition of "Big Data" that I've come across is "Data too
large to work with in a traditional RDBMS".

~~~
slurry
"Big Data is any thing which is crash Excel." \- @DEVOPS_BORAT

~~~
EdwardDiego
Oh, that is fantastic. :D

