

Data Mining and Analysis: Fundamental Concepts and Algorithms [pdf] - rawfael
http://www.dcc.ufmg.br/livros/miningalgorithms/DokuWiki/doku.php?id=pdf

======
ralmeida
I had the luck of being taught by Prof. Meira - who is one hell of a professor
- while this book was in draft stage. Here you'll find not only a great
toolkit of techniques of data mining, but more importantly, a very
comprehensive arsenal of concepts to properly analyse your data and the
results of the techniques applied.

As gizzlon said in a sibling comment, you'll find the chapter dependencies on
page 10, in case you need to look up/learn something specific on demand.

~~~
mathattack
Listing the chapter dependencies is one of those things that is so obvious you
wonder why it isn't in every textbook.

How was the class? The book is very comprehensive. I'd imagine it would be
hard to cover it in a year, let alone a quarter or semester. What environments
and languages did the class program in?

~~~
ralmeida
Agreed about the chapter dependencies. I hope this becomes a trend!

The class is one semester long, and is divided between data analysis, frequent
pattern mining, clustering and classification. Most of the book is either
covered or briefly discussed. It is indeed a 'deep' class: UFMG has a strong
data mining/machine learning/information retrieval/natural computing/other
related areas program, so each class can afford to be pretty specific.

The class is taught at the same time to undergrad and graduate students (the
difference being that each group has a different class project; grads have to
write a basic research paper).

There are lots of pen-and-paper theoretical quizzes and tests. Technologically,
AFAIK there are no restrictions on which technologies to use, but popular
choices are those which the TAs are most experienced in, usually Weka, C++ and
Python. Like other classes taught by Prof. Meira, students are pushed as far
as possible in terms of evaluation difficulty, then graded on a curve. UFMG
students considering these classes should be careful if they decide to take them
along with other difficult classes.

Here is the course page (in pt-br, but Google Translate should be OK):
[http://homepages.dcc.ufmg.br/~meira/DokuWiki/wiki/ensino_md](http://homepages.dcc.ufmg.br/~meira/DokuWiki/wiki/ensino_md)

Off-topic: OP seems to share my first and one of my last names lol. Another
seemingly Brazilian commenter in the topic also seems to share my first name.
This may or may not indicate a correlation between the name Rafael and
computer science in Brazil lol.

~~~
mathattack
Ahhh - UFMG - solid place. I heard UFMG is why Google is in Belo Horizonte.

As for names, when I saw yours, I thought of this R Almeida.
[http://www.sherdog.com/fighter/Ricardo-Almeida-11](http://www.sherdog.com/fighter/Ricardo-Almeida-11)

~~~
ralmeida
That would be correct. Google settled in Belo Horizonte by buying Akwan Search
Technologies, which was co-founded by UFMG professors Nivio Ziviani[1] and
Berthier Ribeiro-Neto (currently Google's Head of Engineering in Brazil).
Information Retrieval - a department in which they play key roles - is another
strong area of the university.

Ziviani is currently a co-founder of the recommendation startups Zunnit[2] (just
upstairs in the building I am in right now) and Neemu[3].

PS: Ha! Maybe this guy is the reason I can't get my screen name everywhere I
want!

[1] [http://en.wikipedia.org/wiki/Nivio_Ziviani](http://en.wikipedia.org/wiki/Nivio_Ziviani)
[2] [http://zunnit.com/](http://zunnit.com/)
[3] [http://neemu.com/](http://neemu.com/)

------
gizzlon
Like the _Chapter Dependencies_ on page 10.. Don't think I have seen that
before

------
greenm2
This is great, I've been looking for something like this, to refresh and build
on a class I took in college. To my surprise, this is my professor's actual
book. I look forward to reading it again. Thanks!

------
arthurcolle
Thank you for the PDF. I am a math major and have been feeling rather bummed
that I haven't had much of a chance to learn techniques for data analysis, so
this is really appreciated.

------
thejosh
Awesome, thanks!

I have ~3.5 million pieces of structured data to go through, so I will find this
paper interesting.

------
r4pha
It's great to see quality content from Brazil. Thank you.

------
ssantic
Looks great!

------
matiasb
Good!

