Hadoop is a poor imitation of a mainframe. When you are processing large lumps of data you need the following:
A task scheduler
A message passing system (although in some workloads this can be proxied by the scheduler or filesystem)
A file store
In mainframe land, you have a large machine with many processors all linked to the same kernel. As it's one large machine, there is a scheduler that's so good you don't even know it's there. However, mainframes require specialist skills.
If you want a shared memory cluster, you're going to need InfiniBand (which is actually quite cheap now, and has a 40-gig-Ethernet-type majigger in it as well)
but people are scared of that, so that leaves us with your standard commodity cluster.
In VFX we have this nailed. Before leaving the industry I was looking after a ~25k-CPU cluster backed by 15 PB of storage. No fancy map:reduce bollocks, just plain files and scripts. Everything was connected by 10-gig Ethernet. Sustained write rate of 25 gigabytes a second.
Firstly, task placement is dead simple: https://github.com/mikrosimage/OpenRenderManagement/wiki/Int... look how simple that task mapping is. It is legitimate to point out that there are no primitives for IPC, but then your IPC needs are varied. Most log-churn apps don't really need IPC, they just need a catcher to aggregate the results. Map:reduce has fooled people into thinking that everything must be that way.
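To make the "plain files and scripts" point concrete, here's a minimal sketch of the catcher pattern: each task writes its result to its own file, and a separate catcher script aggregates everything afterwards. The filesystem is the only interface; no IPC, no framework. (The file names, the squaring "work", and the sum-based aggregation are all illustrative, not from any real farm setup.)

```python
# Minimal sketch of the "plain files and scripts" pattern: tasks write
# result files, a catcher aggregates them after the fact. No IPC needed.
import os
import tempfile

def run_task(task_id: int, outdir: str) -> None:
    """A stand-in for a farm task: do some work, write one result file."""
    result = task_id * task_id  # pretend this is the expensive bit
    with open(os.path.join(outdir, f"task_{task_id}.out"), "w") as f:
        f.write(str(result))

def catch(outdir: str) -> int:
    """The 'catcher': aggregate every result file once the tasks finish."""
    total = 0
    for name in os.listdir(outdir):
        if name.endswith(".out"):
            with open(os.path.join(outdir, name)) as f:
                total += int(f.read())
    return total

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for i in range(10):   # on a real farm the scheduler fans these out
            run_task(i, d)
        print(catch(d))       # prints 285, the sum of squares 0..9
```

On a real farm the for-loop is replaced by the scheduler dispatching tasks to machines; the catcher runs as a dependent job once they are all done.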
This is designed to run on bare metal. Anything else is a waste of CPU/IO. We used cgroups to limit process memory, so that we could run concurrent tasks on the same box. But everything ran the same image; anything else is just a plain arse ache.
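For anyone who hasn't used cgroups for this, the memory cap is a few lines of shell, roughly what a farm wrapper would do before exec'ing the job. This is a sketch using the cgroup v1 memory controller; the cgroup name, the 4 GB limit, and the task command are made up for illustration, and it needs root.

```shell
# Create a cgroup for this task and cap its memory (v1 memory controller).
mkdir /sys/fs/cgroup/memory/render_task_42
echo $((4 * 1024 * 1024 * 1024)) > /sys/fs/cgroup/memory/render_task_42/memory.limit_in_bytes

# Move the current shell into the cgroup, then exec the task. If the task
# blows past the limit, the kernel OOM-kills it, not its neighbours on the box.
echo $$ > /sys/fs/cgroup/memory/render_task_42/cgroup.procs
exec ./render_frame --frame 42
```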
The last item is file storage. HDFS is just terrible. The reason that it's able to scale is that it provides literally nothing. It's just a key:value store, based on a bunch of key:value stores.
If you want a fast, super-scalable POSIX filesystem, then use GPFS (or Elastic Storage). The new release has a kind of software RAID that allows you to recover from a dead disk in less than an hour. (Think ZFS with shards.)
Failing that, you can use Lustre (although that requires decent hardware RAID, as it's a glorified network RAID 0)
or, if you are lucky, you can split your filesystem into namespaces that reside on different NFS servers.
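The namespace-per-NFS-server trick is just mounts: carve the tree along natural boundaries and point each subtree at a different server. A sketch of what the /etc/fstab entries might look like; server names, export paths, and mount options here are invented for illustration.

```
# /etc/fstab -- one subtree per NFS server, stitched into a single tree
nfs-a:/export/shows/showA   /mnt/shows/showA   nfs  rw,hard  0 0
nfs-b:/export/shows/showB   /mnt/shows/showB   nfs  rw,hard  0 0
nfs-c:/export/renders       /mnt/renders       nfs  rw,hard  0 0
```

The catch is that it only works if your workload splits cleanly along those boundaries; one hot namespace still saturates one server.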
Either way, Hadoop and Hadoop-inspired infrastructures are normally not the right answer. Mesos is also probably not the answer unless you want a shared memory cluster. But then if you want that, you need InfiniBand...