

Big Data Analytics: MapReduce - haifeng
http://haifengl.wordpress.com/2014/08/18/big-data-analytics-mapreduce/

======
dgomez1092
I'm curious to know more about why it would be harder to run a K-means
clustering analysis with MapReduce jobs. Beyond the CPU cost, what is the I/O
cost when each iteration's mappers have to re-read the data from the remote
file system? I'm assuming Apache Spark would be better here since it allows
in-memory processing. Is there really a significant difference in throughput?

~~~
glxc
k-means clustering is iterative, while standard MapReduce is single-pass: each
k-means iteration becomes a separate MapReduce job that re-reads the input from
disk, whereas Spark can keep the dataset cached in memory across iterations.
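
A minimal pure-Python sketch of why this matters (toy data and deterministic initialization are my own illustration, not from the post): the loop below re-scans the entire dataset once per iteration. In Hadoop MapReduce each such scan is a separate job reading from HDFS; Spark can cache the points in memory and avoid the repeated disk reads.

```python
def kmeans(points, k, iterations=10):
    # Deterministic init for the sketch: use the first k points as centroids.
    centroids = list(points[:k])
    for _ in range(iterations):              # one full scan of the data per iteration
        clusters = [[] for _ in range(k)]
        for p in points:                     # "map" step: assign to nearest centroid
            i = min(range(k),
                    key=lambda c: (p[0] - centroids[c][0]) ** 2
                                + (p[1] - centroids[c][1]) ** 2)
            clusters[i].append(p)
        for i, cl in enumerate(clusters):    # "reduce" step: recompute centroids
            if cl:
                centroids[i] = (sum(x for x, _ in cl) / len(cl),
                                sum(y for _, y in cl) / len(cl))
    return centroids

pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
       (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
print(sorted(kmeans(pts, 2)))  # two centroids, near (0.03, 0.03) and (5.03, 5.03)
```

In MapReduce terms, the inner assignment pass is the map phase and the centroid recomputation is the reduce phase, but chaining ten iterations means ten jobs and ten reads of the input; that repeated I/O, not the arithmetic, is usually the dominant cost.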

