MapReduce has a lot of limitations. It doesn't have a query language; instead, you have to figure out the sequence of map and reduce steps yourself and implement them in your favourite low-level language.
And it can't do efficient joins. That means you need to visit each and every row for each and every map-reduce stage. There's no b-tree or other "lookup structure".
And it's a batch-based framework, which means if you add 1% more data, you have to re-analyze the entire data set rather than update the previous results with the new 1%.
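To make the "no query language" point concrete, here's a rough single-machine sketch (plain Python, not code for Hadoop or any real framework, and the names are just illustrative): a query that would be one line of SQL has to be hand-decomposed into explicit map, shuffle, and reduce phases.

```python
# What "SELECT word, COUNT(*) FROM words GROUP BY word" turns into when you
# have to spell out the map and reduce steps yourself.

from collections import defaultdict

def map_phase(lines):
    # Emit (key, value) pairs: one (word, 1) per occurrence.
    for line in lines:
        for word in line.split():
            yield word, 1

def shuffle(pairs):
    # Group values by key, as the framework would do between map and reduce.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped.items()

def reduce_phase(grouped):
    # Combine each key's values into a final result.
    for key, values in grouped:
        yield key, sum(values)

if __name__ == "__main__":
    lines = ["the quick brown fox", "the lazy dog"]
    for word, count in reduce_phase(shuffle(map_phase(lines))):
        print(word, count)
```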
Disclaimer: I work at Endeca, which is about to launch Latitude, an Enterprise platform for big data analysis. But (a) I work in Engineering, not sales or marketing, so I spend my time thinking about the advantages and disadvantages of various technologies, rather than how to sell them, and (b) I'm an actual human being who has independent thoughts.
Similarly, it's up to the programmer to write a reducer that can incorporate new data incrementally. And even then, you still need to make a pass over all the data.
But the limit on the reducer is fundamental. Some reduce functions are not associative, and some don't even have a type like [T x U] -> T x U, where the output can be fed back in as another input. In those cases, there is nothing to be done but redo the reduce.
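A toy illustration of the difference (plain Python, nothing to do with any particular framework): a sum combines an old result with a summary of the new data, while a median can't be updated that way at all.

```python
# Why the reducer's type and associativity matter for incremental updates.
old_data = [3, 1, 4, 1, 5, 9, 2, 6]
new_data = [5, 3, 5]          # the freshly added 1%

# Incremental-friendly reduce: (T, T) -> T, associative.
old_sum = sum(old_data)
incremental_sum = old_sum + sum(new_data)
assert incremental_sum == sum(old_data + new_data)

# Not incremental-friendly: the old median tells you almost nothing.
def median(xs):
    s = sorted(xs)
    return s[len(s) // 2]

old_median = median(old_data)
# There is no way to combine old_median with new_data; you have to redo the reduce:
true_median = median(old_data + new_data)
print(incremental_sum, old_median, true_median)
```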
For quick-and-dirty MapReduce on a smaller node count, I've started to really like Disco (discoproject.org). You just pull down the backend with your package manager, push your files into ddfs, write a Python script, and run it.
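For a sense of what that script looks like, this is roughly the word-count job from the Disco tutorial (written from memory, so check the docs for your version; the input URL is just the sample text the tutorial uses):

```python
from disco.core import Job, result_iterator

def fun_map(line, params):
    # Emit (word, 1) for every word in the input line.
    for word in line.split():
        yield word, 1

def fun_reduce(iter, params):
    # Group the sorted (word, count) pairs by word and sum the counts.
    from disco.util import kvgroup
    for word, counts in kvgroup(sorted(iter)):
        yield word, sum(counts)

if __name__ == '__main__':
    job = Job().run(input=["http://discoproject.org/media/text/chekhov.txt"],
                    map=fun_map,
                    reduce=fun_reduce)
    for word, count in result_iterator(job.wait(show=True)):
        print(word, count)
```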
My personal favorite is BashReduce (~120 lines of shell script vs ~600k lines of Java code in Hadoop): http://blog.last.fm/2009/04/06/mapreduce-bash-script
If you're in bioinformatics you might be interested in this talk on handling ridiculous amounts of data (PyCon 2011): http://blip.tv/pycon-us-videos-2009-2010-2011/pycon-2011-han...
There's no question that Hadoop is the elephant in the room, so to speak. It is very robust and performant, and there's a great ecosystem and community. But it is quite complex as a result, and getting it set up and tuned can take a lot of time and effort.
I've got the distributed file system working and am working on the processing part now. The underlying framework is more general-purpose than MR, working at the level of data or record streams that can be run through LINQ, for example. Dryad has this, but it's a much more complicated beast.
Even though more general-purpose computation is possible with such a framework, it turns out that to achieve scale, your problem needs to be parallelizable, and MR is a good way to do that. I think that's why we aren't seeing much in the way of alternatives yet; it's a case of "good enough" being the enemy of better.
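As a toy sketch of the record-stream idea (emphatically not my actual framework, just Python generators standing in for LINQ-style operators on a made-up file and schema):

```python
# Hypothetical illustration: each stage consumes and produces a lazy stream of
# records, so stages can be chained like LINQ operators.

def read_records(path):
    # Assumes a tab-separated file of (user, bytes) rows; purely illustrative.
    with open(path) as f:
        for line in f:
            fields = line.rstrip("\n").split("\t")
            yield {"user": fields[0], "bytes": int(fields[1])}

def where(stream, predicate):
    return (rec for rec in stream if predicate(rec))

def select(stream, projection):
    return (projection(rec) for rec in stream)

def take(stream, n):
    for i, rec in enumerate(stream):
        if i >= n:
            break
        yield rec

if __name__ == "__main__":
    pipeline = take(
        select(
            where(read_records("access.tsv"), lambda r: r["bytes"] > 1024),
            lambda r: r["user"],
        ),
        10,
    )
    for user in pipeline:
        print(user)
```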