Main website: http://guidetodatamining.com/chapter-1/
The actual pdfs
I think a solid machine learning book coupled with a chart like this (http://scikit-learn.org/stable/tutorial/machine_learning_map...) is the most useful way to approach machine learning.
I have thought about this problem for a long time. Initially, I thought it was merely a matter of finding clever Machine Learning algorithms that could auto-magically detect features for you. Others think it is a matter of throwing a bunch of subject matter experts at the problem and building feature vectors by hand. I have worked with people from both camps. I think the answer lies somewhere in the middle. And it is a fucking hard problem.
You need to do some clever algorithmic work. Maybe use PCA or your favorite dimensionality reduction trick, but also be clever enough (mathematically, that is) to realize when PCA is brittle and when not to use it. You also need enough understanding of the subject to ask the right questions of your subject matter experts. It is easy to get five people in a room and ask them to describe a song. It is much harder to find the right way of asking them: Is this song the kind of song that you would dance to while drunk on three beers in a podunk town in the Midwest? It takes experience to build that sort of intuition.
The only thing I can recommend (pun unintended) is to keep solving these problems and to focus on the domain(s) that you are deeply passionate about. Don't spread yourself too thin by trying to be a general-purpose Machine Learner. That is the only way to get to building beautiful end-to-end products.
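To make "PCA is brittle" a bit more concrete, here is a minimal sketch of one well-known failure mode: PCA follows raw variance, so changing the units of a single feature can flip the leading component. The data, the feature meanings, and the helper name `first_component` are all made up for illustration; this is only one way PCA can mislead, not a full list.

```python
import numpy as np

def first_component(X):
    """Return the unit-length first principal direction of X (rows are samples)."""
    Xc = X - X.mean(axis=0)  # PCA works on centered data
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[0]  # right singular vector with the largest singular value

rng = np.random.default_rng(0)

# Two independent, hypothetical features: 'a' has std 1, 'b' has std 2.
a = rng.normal(0.0, 1.0, size=500)
b = rng.normal(0.0, 2.0, size=500)

# In the original units, the leading component points along 'b'.
pc_raw = first_component(np.column_stack([a, b]))

# Express 'a' in units 1000x smaller (say metres -> millimetres).
# Nothing about the data changed, yet the leading component now points along 'a'.
pc_scaled = first_component(np.column_stack([a * 1000.0, b]))

print("raw units:   ", np.round(np.abs(pc_raw), 3))
print("rescaled 'a':", np.round(np.abs(pc_scaled), 3))
```

Standardizing features before PCA is the usual guard against this particular issue, though whether standardizing is appropriate is itself a question for whoever knows the domain.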
The usual iteration loop for a machine-learning system is:
1) Try features.
2) Evaluate your system. If you are happy with your scores/metrics, stop.
3) Do error analysis/run diagnostics. Go to 1.
It is the last step that I find inexperienced people usually skip. You need to examine your errors and find commonalities among them.
For example, is there a certain class of error that hints at a certain feature you haven't tried? Do your existing features provide decent results to begin with? Or maybe you should just get more data? Whatever path you take, it needs to be defensible.
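One cheap way to start looking for commonalities among errors is to break the error rate down by some attribute of the examples. The sketch below assumes you have a tag per example (the labels, predictions, and the "studio"/"live" tags are invented for illustration, as is the helper name `error_breakdown`); it is a starting point for step 3, not a complete diagnostic.

```python
from collections import Counter

def error_breakdown(y_true, y_pred, tags):
    """Return the fraction of misclassified examples for each tag."""
    totals = Counter(tags)
    errors = Counter(
        tag for truth, pred, tag in zip(y_true, y_pred, tags) if truth != pred
    )
    # Counter returns 0 for tags that never appear among the errors
    return {tag: errors[tag] / totals[tag] for tag in totals}

# Hypothetical evaluation results: true labels, predictions, and one tag per example.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 0, 1]
tags = ["studio", "studio", "live", "live", "studio", "studio", "live", "live"]

rates = error_breakdown(y_true, y_pred, tags)
for tag, rate in sorted(rates.items()):
    print(f"{tag}: {rate:.2f}")
```

If one group carries most of the errors, as "live" does in this toy data, that hints at a feature you haven't tried or at a slice of data you need more of, and it gives you something defensible to say about why you are trying it.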
So, in terms of general advice:
1) Realise it's an iteration loop, not a fixed recipe.
2) Be able to justify why you are trying whatever it is you are trying.
With time, you'll develop intuition for given domains.
This is just a restatement, I think, of the question you are responding to. I suspect a book on this topic would be quite popular.
Then move on to another domain and repeat. As you continue doing this, patterns will emerge across the domains. This is usually the best time to start working through a machine learning textbook.
When you take this approach you defer a chunk of the theory in favor of practical examples, but you can immediately start applying the algorithms and getting results.
Machine Learning: a Probabilistic Perspective, by Murphy
Pattern Classification, by Duda et al.
The Elements of Statistical Learning, by Hastie et al. It is free from Stanford.
Mining of Massive Datasets, free from Stanford.
Bayesian Reasoning and Machine Learning, by Barber, freely available online.
Learning from Data, by Abu-Mostafa.
It comes with Caltech video lectures: http://work.caltech.edu/telecourse.html
Pattern Recognition and Machine Learning, by Bishop
Information Theory, Inference, and Learning Algorithms, by MacKay, free.
Classification, Parameter Estimation and State Estimation, by van der Heijden.
Computer Vision: Models, Learning, and Inference, by Prince, available for free
Probabilistic Graphical Models, by Koller. Has an accompanying course on Coursera.
This is an excellent review (but it doesn't cover the books by Mohri, Rostamizadeh, Talwalkar or by Abu-Mostafa, Magdon-Ismail, Lin): http://www.amazon.com/review/R32N9EIEOMIPQU/ref=cm_cr_pr_per...
But he goes quite deep in the mathematical explanations (which is a great point, there is no better way to learn and understand) meaning you have to be willing to work on your math for this book.
Great content, otherwise. But I swear all this 'web 3.0' gimmickry has set the web back at least a decade.
The problem was my poor CSS practices.
Edit: fixed link
My explanation is that it's easy to write about that and feel good after.
Can you be more specific?