

Advice for AI aspirations? - matmann2001

I'm an undergrad in Computer Engineering, and I find the fields of AI, Machine Learning, and HCI EXTREMELY interesting.

Do you have any advice for important skills or things to learn, people to talk to, or things to read that would better prepare me for this field? Any advice is welcome.
======
mindcrime
You can't take enough maths and statistics classes. Machine Learning - these
days at least - is very maths and statistics oriented. Linear Algebra is big,
so make sure you have that covered.

If you want to get your toes in the water a bit with ML, there are some great
ML libraries that encapsulate some of the popular algorithms. Mahout[1],
Weka[2] and Mallet[3] are popular in the Java world.

A lot of folks use Python for ML as well, and there are some good libraries
there.
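As a taste of what such libraries encapsulate, here is a from-scratch sketch of one classic algorithm, k-nearest neighbours, in plain Python (the `knn_predict` helper and the toy data are my own illustration, not taken from any of the libraries above):

```python
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; features are
    equal-length tuples of numbers."""
    # Sort all training points by Euclidean distance to the query.
    dists = sorted((math.dist(x, query), label) for x, label in train)
    # Take the labels of the k closest and return the most common one.
    top = [label for _, label in dists[:k]]
    return max(set(top), key=top.count)

# Toy 2-D data: two well-separated clusters labelled "a" and "b".
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]

print(knn_predict(train, (0.5, 0.5)))  # near the first cluster -> "a"
print(knn_predict(train, (5.5, 5.5)))  # near the second cluster -> "b"
```

The library versions add the things that matter at scale (indexing structures, distance metrics, cross-validation), but the core idea is this small.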

The R language is also popular in ML circles; as is C++. If you learn some
combination of Java, Python, C++ and/or R, you'll be in good shape from a
programming language standpoint.

Check out <http://mloss.org/software/> also.

Some good books to get started with include:

Algorithms of the Intelligent Web[4]

Programming Collective Intelligence[5]

Collective Intelligence In Action[6]

Stanford makes a great series of lectures[7] available online that you might
find useful.

[1]: <http://mahout.apache.org/>

[2]: <http://www.cs.waikato.ac.nz/ml/weka/>

[3]: <http://mallet.cs.umass.edu/>

[4]: <http://www.manning.com/marmanis/>

[5]: <http://www.amazon.com/gp/product/0596529325>

[6]: <http://www.amazon.com/Collective-Intelligence-Action-Satnam-Alag/dp/1933988312>

[7]: <http://see.stanford.edu/see/lecturelist.aspx?coll=348ca38a-3a6d-4052-937d-cb017338d7b1>

------
mustafaf
If you really want to learn the fundamental underpinnings of machine learning,
you will need a strong background in probability and stochastic processes. I
would suggest Python (or MATLAB if you can get access to it) for learning how
different methods work. That way you can separate mathematical issues from
programming issues. As far as courses go, you should be looking for courses in
Linear Algebra, Numerical Computation/Optimization (Convex, Nonlinear),
Statistical Inference, and Stochastic Processes.
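In that spirit, a minimal example of prototyping a method where the math stays visible: fitting a line by gradient descent on mean squared error, in plain Python (the `fit_line` helper and the toy data are my own illustration):

```python
# Fit y ~ w*x + b by gradient descent on the mean squared error.
# Plain Python, so the math (the gradient of the loss) is explicit
# and mathematical issues stay separate from programming issues.

def fit_line(xs, ys, lr=0.01, steps=5000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of L = (1/n) * sum((w*x + b - y)^2)
        grad_w = (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]             # exactly y = 2x + 1
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))  # converges to 2.0 and 1.0
```

Once the math is right in a prototype like this, moving to a library or a faster language is the easy part.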

Good References:

1) The Elements of Statistical Learning - Hastie, Tibshirani and Friedman

2) Pattern Classification - Duda, Hart and Stork

3) Pattern Recognition - Theodoridis, Koutroumbas

4) Machine Learning - Tom Mitchell

5) <http://videolectures.net/Top/Computer_Science/Machine_Learning/>

------
snikolov
Beyond just learning theory (although this is crucial), make sure to get your
hands dirty implementing something, talking to knowledgeable people
(professors, researchers in industry, classmates with common interests), and
finding out what actually gets used in practice and why. Sometimes they don't
tell you these things in classes.

When you are somewhat comfortable with the basic concepts, it might be
interesting to form a reading group with some like-minded classmates and
scrutinize some papers, from the classics to more recent research.

------
snikolov
I've found this set of tutorials to be a great resource for picking up many
machine learning methods

<http://www.autonlab.org/tutorials/>

Somewhat complementary is this set of implementations (in Python) of many
common machine learning techniques

<http://www-ist.massey.ac.nz/smarsland/MLbook.html>

------
curt
Take classes in or read about biology/neurobiology, human development (how the
brain develops and learns), and psychology (how the brain reacts and
responds). I've done AI work and come at it from a unique perspective due to
my biology/engineering education, which often leads me to novel solutions and
products for problems that can't be solved by conventional means.

------
HilbertSpace
Let's roll back to some simple things that actually make sense. Suppose we are
given a triangle ABC with sides a, b, c, that is, side a is opposite angle A,
etc. Then from some simple trigonometry we have the law of cosines:

c^2 = a^2 + b^2 - 2ab cos(C)

Then if C is a right angle, cos(C) = 0 and we have the Pythagorean theorem

c^2 = a^2 + b^2

So, given the triangle we have the law of cosines. If in addition we know that
angle C is a right angle, then we have the Pythagorean theorem.

Well, to conclude the Pythagorean theorem, it is just crucial to have a
suitable assumption, say, angle C is a right angle. Else the Pythagorean
theorem does not hold.
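A quick numeric check of that point in Python (an illustrative sketch): with a right angle C the law-of-cosines correction term vanishes and c^2 = a^2 + b^2; with any other angle it does not.

```python
import math

a, b = 3.0, 4.0

for C_deg in (90, 60):
    C = math.radians(C_deg)
    # Law of cosines: c^2 = a^2 + b^2 - 2ab cos(C)
    c_sq = a**2 + b**2 - 2 * a * b * math.cos(C)
    # Compare against the Pythagorean value a^2 + b^2.
    print(C_deg, round(c_sq, 9), a**2 + b**2)
```

At 90 degrees the two values agree (25.0); at 60 degrees the law of cosines gives 13.0, not 25.0, because the assumption of a right angle fails.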

We know these things.

'Machine learning' is a four letter word -- junk. Here is why: They have a
'paradigm': Get an 'algorithm' from somewhere, put it in a black box, and say
"Try this!". By analogy, this is mixing up snake oil instead of good
chemistry.

The big, huge problem is that, with their nonsense paradigm, there is no good
reason to believe that the black box has any value. By analogy, they take the
Pythagorean theorem and apply it to all triangles, ignoring whether there is a
right angle.

There is a rational sickness ubiquitous in 'computer science': Take results
from mathematics and with great determination ignore the assumptions. Machine
learning is one of the sickest.

Bluntly put, done with any rational care, machine learning is a topic in
applied math, but the workers in the field know far too little math, make
gross mistakes at nearly every chance, and, in particular, ignore the
mathematical assumptions and, thus, proceed with no rational support.

Computing was not always this way: The origins of electronic digital computing
were in scientific calculations in WWII. D. Knuth's books on 'The Art of
Computer Programming' are fairly careful mathematically.

At the time of the first editions of Knuth's books, hot topics were
'algorithms', especially for sorting and searching. So, there it was fairly
clear what such an algorithm was to do.

Then the definition of an 'algorithm' grew to include just any piece of code
that ran and, hopefully, usually stopped, and what properties the results had
were not emphasized. So, we have gone some years with a programmer announcing
that they have an algorithm to tell people, say, what they should eat for
breakfast. They call a venture partner who says, "So, you have an algorithm.".
Gee, anyone can have an algorithm for anything; the question is, what
properties does the algorithm have? As long as computing ignores the
properties and the assumptions, it will have a tough time doing better than
snake oil.

Part of this very relaxed attitude toward algorithms came from early work in
'artificial intelligence' (AI) where the criteria were:

"Do you have running code? Does the code produce something that at first
glance looks like something a human might do?".

Then these criteria opened the doors wide to just about any intuitive,
heuristic, or black box technique. In particular, now nearly anything can be
called 'machine learning'.

Actually just guessing, as from intuitive and heuristic efforts, is a tough
way to find something powerful for a challenging problem. Instead, starting
with some assumptions and doing some derivations is much more powerful, e.g.,
as for the law of cosines and the Pythagorean theorem.

The obstacle to AI, machine learning, etc. proceeding mathematically within
computer science is that the corresponding math is often a bit advanced, and
many of the profs didn't take the right courses in grad school and don't know
the math.

Actually, for 'learning', there is a lot that is quite solid in parts of
mathematical statistics. There the math is done with good care and is
basically correct, and someone with a better math background than many of the
statistics profs can make it rock solid. There is more for 'learning' in
optimization, control theory, and stochastic optimal control.

So, much of AI and machine learning is a C- student's rip-off of some Cliffs
Notes summary of mathematical statistics, applied while ignoring the assumptions.
Bummer.

All that said, if you want to see AI approach _the singularity_ where there is
something like actual intelligence, then it does appear that you will need
some very new approaches and ideas. Even done very well, not all of this work
will be solid mathematics.

In the meanwhile, broadly the _good stuff_ is in selected topics in
mathematics. I suggest you emphasize linear algebra, mathematical analysis,
and optimization and then relatively advanced approaches to probability,
stochastic processes, and mathematical statistics.

