

Mining of Massive Datasets - rajesht
http://infolab.stanford.edu/~ullman/mmds.html

======
0x12
The book mentions a number of distributed file systems but omits one that I
think deserves mention: <http://www.gluster.org/>

GlusterFS is an interesting take on the DFS concept, and it is open source.

~~~
Estragon
GlusterFS is what we use at work, and I have seen some fairly strange behavior
from it under heavy FS traffic. Makes me a bit nervous.

------
ahalan
Related courses:
<http://www.quora.com/What-are-some-courses-on-large-scale-learning>

Workshops:
<http://www.quora.com/What-are-some-workshops-on-large-scale-learning>

Also see the tutorial "Scaling Up Machine Learning" at KDD2011:
<http://hunch.net/~large_scale_survey/>

------
danso
Thanks for the link, highly useful and very readable.

Loved this from the webpage intro: "We are sorry to have to mention this
point, but we have evidence that other items we have published on the Web have
been appropriated and republished under other names. It is easy to detect such
misuse, by the way, as you will learn in Chapter 3."
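
For anyone curious what that joke is pointing at: Chapter 3 of the book is about finding similar items. A minimal sketch of the basic idea, assuming character k-shingles compared with Jaccard similarity (the function names, the k=5 default, and the sample strings are my own illustration, not taken from the book or its code):

```python
def shingles(text, k=5):
    """Return the set of k-character shingles of text.

    Whitespace is collapsed and case is ignored first, so trivial
    reformatting does not hide a copy. k=5 is an illustrative choice.
    """
    normalized = " ".join(text.lower().split())
    if len(normalized) < k:
        # Text shorter than k: treat the whole string as a single shingle.
        return {normalized} if normalized else set()
    return {normalized[i:i + k] for i in range(len(normalized) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two sets."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


# Hypothetical sample texts, for illustration only.
original = "It is easy to detect such misuse, as you will learn in Chapter 3."
copied = "It is easy to detect such misuse, as you will learn in chapter 3!"
unrelated = "GlusterFS is an open source distributed file system."

print(jaccard(shingles(original), shingles(copied)))
print(jaccard(shingles(original), shingles(unrelated)))
```

The lightly edited copy should score far closer to 1 than the unrelated sentence. Comparing every pair of documents this way does not scale to a large collection, which is the problem the rest of that chapter deals with.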

------
smokinn
I've recently finished up to chapter 3 and already I can say I highly
recommend reading this. So far it's been excellent.

------
webspiderus
This is a course that went with this book:
<http://www.stanford.edu/class/cs246/cs246-11-mmds/>

Some interesting material in the presentations and the homeworks as well,
although the bulk of the content is definitely in the textbook.

------
binarysolo
Great class, wonderful teachers -- thanks for the submit.

------
wslh
I recommend looking at the <http://theinfo.org/> community.

------
PaulHoule
This is nice. Nothing in it is really new, though; flipping through it, I kept
thinking about what I was doing five years ago. On the other hand, past is
prelude.

------
Hitchhiker
Thanks!

------
khookie
Thank you!

