I do believe, however, that some experience is needed to understand what is possible, to get the most out of existing tools, and to be able to communicate with machine learning engineers about your needs.
I was reading Domingos' "The Master Algorithm" several days ago and a mathematician inquired about the book. He knew a group of ML developers. His opinion was that "ML doesn't look very interesting: all you do is play with the parameters, turn the knobs, and/or change the model until something works. There's no real progress there; nothing substantial."
Rather than sending a battalion of bright developers into the ML swamp, where they will largely be frustrated, learn little, and contribute less, I'd be tempted to guide them into other fields.
1) I think working knowledge of ML is extremely useful to many developers, and generally under-taught in universities. See the old Joel article which mentions "Google uses Bayesian filtering like MS uses the IF statement" (http://www.joelonsoftware.com/items/2005/10/17.html). A well-rounded developer should know the basics (logistic regression, SVMs, some things about CNNs, etc.); it will make him much more adept at problem-solving. I suspect Google's internal push to get people up to speed is not to turn them all into ML researchers, but rather to make sure that everybody "knows the basics well enough".
So I think it is useful to teach developers about the things ML has to offer.
2) Mathematically, it seems that in ML the "engineering" side has run far ahead of the theory side. The sudden breakthrough in the mid-2000s is IMO still not fully understood - and parts of it may have been very accidental. Initially, pre-training was thought to be the key advance, but in retrospect it is quite unclear what the breakthrough actually was. It could be that simply the increase in data / compute sizes and the switch to minibatch SGD explains why modern DNNs generalize well (interesting paper on the topic: https://arxiv.org/abs/1509.01240). There is a lot of good mathematics to be written, but I am not sure whether the folks at Google will write it - given the incentive structures (performance reviews, impact statements), it is unlikely that somebody gets promoted for "cleaning up the theory".
3) From a development perspective: There are a ton of interesting engineering problems underneath the progress in ML. If you look at Jeff Dean, he is a superstar engineer, not necessarily a mathematician, and a lot of the progress the Google Brain team made was engineering advances in scaling / distribution, etc. - so by training the engineers in ML, you also get better infrastructure over time.
So I don't think they are sending "developers into ML swamps"; I think they are trying to reach the point where "Google uses DNNs like MS uses IF".
Google, Facebook, M$ Research, and perhaps Yahoo are extreme outliers. They have zettabytes of broad unstructured text data, so they mine it. Everybody else has megabytes of narrow structured data, most of it commercial transactions for their products. That stuff has already been mined effectively by traditional basic OLAP methods. Most or all of the value has been extracted.
Mainstream software apps have yet to show the value of using ML. Such apps have access to very limited data of very narrow relevance. The utility of ML in such domains isn't new; it's classic optimization, or Bayesian anticipation. But it's not a game changer. Frankly, the use of ML in most mainstream apps is more likely to add distraction and annoyance as the computer mispredicts your intent -- like Microsoft Bob did.
Maybe "life in the cloud" will create new opportunities for smarter software. But I definitely don't want free apps making their own decisions when to notify me. I guarantee that will get old immediately. So how will this work? Frankly, I can't guess. Like Apple's iAds, programming ML into the mainstream or cloud sounds like an idea that will serve the software / cloud vendor far better than the user.
Humans have been gathering and analyzing data for thousands of years. We have _not_ waited for Google's latest ML or neural nets to do analyses. Otherwise I'd be carving this post onto a stone for future generations to peruse.
The valuable and understandable AI, the step that will make a difference, isn't in "big data" - it's in figuring out how to do what those humans have been doing all those thousands of years.
I can't speak for anyone else, but "M$" makes it hard for me to take the rest of a comment seriously.
Often the available sensors/assays failed to detect reliable info. Or the phenomenon of interest depended on too many interacting variables, expressed across too great a dynamic range, for us to detect reliably or model usefully. (The present lull in genomics R&D illustrates this well, as does automated interpretation of signals like EEG and NMR spectra.) And the signals that we can extract are often uninterpretable or sporadic. Alas, gathering more data won't yield more signal. Given the present limit on sensor resolution, you just get more mixed signals.
The potential of all ML is limited by the depth of the data; depth is what is essential for discriminating subtler signals. In the domains you mention (medicine, biology, geology, other sciences), I'm convinced we need better sensors more than greater amounts of the same data available now. We need better hypotheses, which lead to better ideas of where to look and what to look for. In general, ML can't help with that. Until we better imagine how the mechanism might work, our questions remain too vague.
To wit, I'm afraid that applying ML to most software apps will suffer from the same limited ROI. I suspect that most app and user data is too shallow for mining to add appreciable value, no matter how clever it is.
Only a wild-eyed ML "gold digger" could imagine that there is a vein of gold in those mines. The reality is that, with few exceptions, we'll find more lumps of coal.
Perhaps I should switch from an ML swamp metaphor to an ML mine metaphor? <--Hah! Do that with ML!
As a snarky remark: maybe I am not yet qualified enough for real criticism as a CS student, but I don't like such sharp distinctions between engineering and theory. All the "trial and error" in ML can be a useful guide to solving the theory. Also, I'd guess the work of Jeff Dean is quite often more theoretical than the work of an average engineer. While I feel that if we have not developed a theory behind such tools we have not really understood them, no one knows how complex these things really are. I think this makes ML-related engineering harder than software projects with a well-understood theory.
I just hope there are enough computer scientists/mathematicians at universities (or Google ;) ) looking sharply at all the progress made in ML from the engineering side and asking themselves "what does that really mean?", because that's a hell of an interesting problem.
I may be wrong; my lecture course on ML is next semester ;)
Is this a critique of the human mind or a praise of AI?
> When all ML work is done we'll have great pattern recognizers but nothing remotely akin to thought
Maybe our brains too are nothing but pattern recognizers. Maybe they are nothing but chemical reactions, or energy fields. But being reductionist about AI won't help us understand it either.
I have never used neural nets, etc., but with simple Bayesian spam filters this was possible and quite useful. I used to check which words were pushing a text into one category or the other, and which did not (when they should have).
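For anyone curious what that inspection looks like in practice, here is a minimal sketch (toy made-up documents, scikit-learn's naive Bayes assumed; not the parent's actual setup) of listing which words push a document toward spam:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    docs = ["cheap pills buy now", "meeting agenda attached",
            "buy cheap watches now", "lunch meeting tomorrow"]
    labels = [1, 0, 1, 0]  # 1 = spam, 0 = ham (hypothetical)

    vec = CountVectorizer()
    X = vec.fit_transform(docs)
    clf = MultinomialNB().fit(X, labels)

    # Per-word log-likelihood difference between the two classes:
    # positive values push a document toward spam, negative toward ham.
    words = vec.get_feature_names_out()
    scores = clf.feature_log_prob_[1] - clf.feature_log_prob_[0]
    for w, s in sorted(zip(words, scores), key=lambda t: -t[1]):
        print("%10s %+.2f" % (w, s))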
I may be one of the developers you speak of (with academic aspirations), presently considering my path forward.
I'm sceptical that going down into the ML swamp is the best way forward.
> software engineers use the model.
You aren't disagreeing.
While it is true that most people will not need to whiteboard a binary tree inversion in their day-to-day work, it seems like they expect their engineers to be able to throw themselves at any problem they're given, require them to pivot in skill set quickly, and want them to have an appreciation of all the developments going on around them, so they can apply any novel ideas developed internally to what they are currently working on.
In those cases, hiring based on sound knowledge of CS fundamentals seems like a good bet...
60k engineers is a pretty terrifying number though.
It's hard to describe, but research (which the vast majority of ML remains) is something for which even a sound knowledge of fundamentals might not remotely be enough.
No, because they have rejected ML experts if they can't do their stupid dog & pony show.
> hiring based on sound knowledge of CS fundamentals seems like a good bet...
Too bad many of them can't get their heads around the ML math.
As an intro to ML, I am a fan of Coursera's ML specialization from the University of Washington (https://www.coursera.org/specializations/machine-learning). It's free, except for the capstone, and the instructors do a good job of giving both theoretical and practical grounding in various aspects of ML.
I am sure others will have good suggestions as well. Good luck.
I started with Andrew Ng's course and found it way too dry and too mathematical, whereas the Dato one seemed too simple.
The TensorFlow course seems humorously hard: 15 minutes in you get "Please implement softmax using Python". OK, maybe later.
    import numpy as np

    def softmax(x):
        # shift by the max first: exp() overflows for large inputs otherwise
        e = np.exp(x - np.max(x))
        return e / e.sum()

where x is an array of numbers.
From that standpoint, graduate mathematics is more useful for a practitioner than any robust programming experience.
For a CS engineer who wants to be able to use the latest Inception neural net from Google in his pipeline, there is actually almost zero math need. It's like any other API. In goes the image, out comes the label.
What she would need to know, as a good utilizer of ML, is just a bunch of concepts, such as training/test/validation splits, bias/variance, how to extract features from data, and how to select a good algorithm and framework. So it's mostly data cleaning and tuning hyperparameters, the latter of which can be learned by trial and error and by talking to experts. The direct applications of math for such an engineer would be pretty slim to nonexistent.
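To make that concrete, here is a rough sketch of that workflow (scikit-learn assumed; the dataset and the hyperparameter grid are hypothetical): split the data, tune one hyperparameter by cross-validated score on the training portion, then check the held-out test set.

    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score, train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)

    # tune the regularization strength C by 5-fold cross-validation
    best_C = max([0.01, 0.1, 1.0, 10.0],
                 key=lambda C: cross_val_score(
                     LogisticRegression(C=C, max_iter=1000),
                     X_train, y_train, cv=5).mean())

    clf = LogisticRegression(C=best_C, max_iter=1000).fit(X_train, y_train)
    print(best_C, clf.score(X_test, y_test))  # held-out test accuracy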
Since ML comes from statistics, math, programming, but also other scientific fields, it can even have many terms for essentially the same thing.
For me, as a developer, it was actually easiest to just read some tutorials like the docs for scikit learn and then just start digging through the code of a bunch of libraries. How people name the classes tells you what they think things should be called. But the code tells you what it actually does. I just bounced back and forth between code, tutorials/blogs and books. After a few months, I can actually have a reasonable conversation with our ML people in the language they use and everything else I look at seems easier because I understand most of the terms.
I think asking how to learn ML is a lot like asking how to learn German. It might feel like you need to start with the grammar rules. But I think immersion is the best way. Get the vocabulary, then come back to the rules. I also find that having a burning question in my mind helps me with immersion. So, if you can find a project that drives you, maybe that will help.
So starting with the math fundamentals as a developer seems like an easy way to burn yourself out. But everyone does learn differently. If not, there wouldn't be so many ML algorithms, right? Right?
Am I likely to need matrix multiplication if I start doing machine learning, or is that the equivalent of writing a sort algorithm for a web dev - maybe useful to know the concepts, but in reality you won't actually use it?
It's easier to write algorithms against this.
Since a lot of ML libraries use native libraries for linear algebra, you might see a lot of implementations that are written in terms of linear algebra operations. So, if you're trying to read the code and you don't understand what the operations do, it may be hard to grok.
So, yeah, I think some understanding of linear algebra is necessary. Because it's sort of the atomic set of operations underlying most ML you'll see. To read the code, you need to be able to read the linear algebra. But you probably don't need to go read a book on linear algebra. I tried that and it pulled me away from what I wanted to know. It might be enough to just understand the numpy docs.
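As a tiny illustration of code "written in terms of linear algebra operations", here is least-squares regression via the normal equations in numpy (the data are made up for the example):

    import numpy as np

    # toy data: a bias column plus one feature; targets are hypothetical
    X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
    y = np.array([2.0, 3.0, 4.0])

    # solve the normal equations (X^T X) w = X^T y for the weights
    w = np.linalg.solve(X.T @ X, X.T @ y)
    print(w)  # ~[1.0, 1.0]: intercept 1, slope 1

If you can read the transposes and the solve, the equivalent code in most ML libraries stops looking like magic.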
Not that there isn't value in immediate results for building excitement and interest--I just want to have proper expectations before I check it out as I'm in a similar state to the parent in terms of where my math is and wanting to dive in.
I've never used this Microsoft product, but if lets you take educated guesses at what will work, and gives you some insights into the intermediate steps, then its useful as a check that your mental model of machine learning is becoming more coherent and useful.
Plus, if you slot in something and it gives a better output, you can go back to your studies with a new target of finding out why X param changed things.
Again, that's just one example, and the instant visual feedback is awesome (I'm a visual learner, so that's huge). But at the end of the day, I know that there is a lot of math and code under the pretty graphics, and at some point I'll need to tackle that to make sure I am actually learning this and not just making assumptions based on what I can eyeball with some visualizations.
That's enough to implement and understand neural networks. You'll fumble around a lot more than you have to, but you can figure it out.
Honestly, you could probably fight your way through Ng's class with just matrix multiplication, which you can learn in less than an hour fairly easily.
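For instance, a single neural-network layer is just a matrix multiply plus a nonlinearity, which is most of the linear algebra the course leans on. A sketch with made-up shapes and random data:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 3))   # 4 examples, 3 features (hypothetical)
    W = rng.normal(size=(3, 2))   # weights mapping 3 inputs to 2 hidden units
    b = np.zeros(2)

    # one layer = matrix multiply plus a nonlinearity (sigmoid here)
    hidden = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    print(hidden.shape)  # (4, 2)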
More in-depth videos of the course are on YouTube: https://www.youtube.com/playlist?list=PLA89DCFA6ADACE599
Not exactly light on math, so you may want to read up on some multivariate Calculus and Linear Algebra before the later chapters. First few sections should be approachable regardless.
I looked a while ago, and the Udacity nanodegree looks interesting but is kind of a subset of the materials I'd already lined up. I also think part of the challenge is tailoring a curriculum to one's existing strengths; in my case I'm spending less time on general programming / data munging and more on stats fundamentals and ML algorithms, and I find that most all-in-one MOOCs have some material that is less worthwhile for me. Also: some of the projects they feature, like the Kaggle competition https://www.kaggle.com/c/titanic, can be undertaken independent of Udacity.
I really think Python Machine Learning + https://www.kaggle.com/c/titanic + kaggle.com/c/forest-cover-type-prediction is a great place to start on the practical ML side.
Below is my favorite response by vaibkv:
vaibkv 15 days ago
Here's a tentative plan:
1. Do Andrew Ng's course from Coursera in full.
2. Do the AnalyticsEdge course by the MIT folks on edx.org. I can't recommend this course highly enough; it's a gem. You will learn practical stuff like ROC curves and whatnot. Note that for a few things you will need to Google around and read on your own, as the course might just give you an overview.
3. Keep the book "The Elements of Statistical Learning" by Trevor Hastie handy. You will need to refer to this book a lot.
4. There is also a course that Professor Hastie runs, but I don't know the link for it. I highly recommend it, as it gives a very good grounding in things like GBMs, which are used a lot in practical scenarios.
5. Pick up Twitter / Enron email / product review datasets and do sentiment analysis on them (see the sketch after this list).
6. Pick up a lot of documents on some topic and make a program that automatically produces a summary of those documents - read some papers on it first.
7. Don't do Kaggle. It's something you do when you have considerable expertise with ML/AI.
8. Pick up flight data and predict flight delays. Use different algorithms and compare them.
9. Make a recommendation system to recommend books/music/movies (or all).
10. Make a Neural Network to predict moves in a tic-tac-toe game.
These are a few things that can get you started. This is a vast field, but once you've done the above in earnest I think you'll have a good grounding.
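For item 5, a minimal bag-of-words sentiment sketch (scikit-learn assumed; the tiny inline reviews are made up - substitute the Twitter / Enron / product-review corpora from the plan) might look like:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    reviews = ["great product, works perfectly", "terrible, broke in a day",
               "love it, highly recommend", "waste of money, very poor"]
    sentiment = [1, 0, 1, 0]  # 1 = positive, 0 = negative (hypothetical)

    # vectorize the text, then fit a linear classifier on the word weights
    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(reviews, sentiment)
    print(model.predict(["poor quality, broke quickly"]))  # expect [0]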
Pick a topic that interests you and write a paper on it - it's not such a big deal.
You need to first learn calculus and linear algebra, and learn them very well. I would also recommend a good understanding of probability. Learning all of these well will take at least a year, if not longer. For instance, I took one year of calculus in high school and then one semester each of linear algebra and probability, which adds up to two years.
You'll need calculus so you can do optimization (i.e., at the simplest level, take a derivative, set it to 0, and solve; of course there's more you can do with calculus in Machine Learning). You'll need linear algebra for almost everything in Machine Learning. Lastly, probability will be useful for understanding very basic methods like Naive Bayes. There are other methods built on probability as well.
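Here is that "take a derivative, set it to 0, and solve" step worked on a toy function, using sympy (the function is just an illustrative example):

    import sympy as sp

    x = sp.symbols('x')
    f = (x - 3)**2                           # toy function, minimum at x = 3
    stationary = sp.solve(sp.diff(f, x), x)  # solve f'(x) = 2(x - 3) = 0
    print(stationary)                        # [3], the minimizer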
If you skimp on learning any of these, you will never be able to understand Machine Learning at a deep level, and perhaps not even at a shallow one.
"kitten bathroom 2013"
And there was a picture of the cat sitting in the tub on a blanket. Simply amazing.
"Sacrificing what remained of their power user appeal" is imitating Apple!
Reminds me of the Ballmer/Gates strategy of everything must be Windows, which seemed flawed to me.
I would argue that Google+ didn't work out because Google was trying to play catch-up in a field that it just lacked knowledge in (social networks).
Whereas with machine learning, they're not playing catch-up, everyone else is. Of all the other tech titans out there, they're the ones really leading the pack.
That remark aside though, I agree with you. An attempt to go hard on machine learning and apply it everywhere will probably work out pretty badly. As fascinating as ML is, I just haven't bothered to learn it yet because I haven't the slightest idea what new and novel problem I'd solve with it that doesn't have a better solution through a more straight-forward approach.
Assuming they have the money, isn't this exactly the kind of reason Google should train up a wide spectrum of engineers from different teams and then see how they apply machine learning to their respective domains? It would be foolish for Google's management to think they can divine a priori all the best possible uses of ML in their various lines of business. Why not tool up a bunch of smart people, set them loose, and see what works?
After Google Now, DeepDream and all the self driving car hype, reading about that workshop being the start of the big transformation seems strange.
How did you get this impression? It has little basis in reality.
Interestingly, Google Trends shows me a steady climb for 'machine learning', while searches for 'neural networks' have been dropping since 2004.
Also, 2008 in deep learning is 100 years ago :)
Sigh. Another instance of pop science getting most everything wrong (and I haven't even bothered to write anything about the technical content in the article).
AI is Google's leverage. It should keep exploring down that path.
Jeff Dean said, "The more people who think about solving problems in this way, the better we'll be". I sincerely hope that Sundar emphasizes the thoughtful application of ML and does not allow black-box algorithms to take too central a role.
This kind of hubris swept through Wall Street banks during the structured products boom, ultimately leading to products such as synthetic collateralized debt obligations. Taking Jeff Dean's opinion about whether machine learning is a good thing is like asking the creator of synthetic CDOs whether they were a good thing. The authors and evangelists are blinded by optimism and opportunity.
Is Sundar Pichai swept away by the opportunities of machine learning and too biased to be aware of the risks? Is Sundar acting like Stan O'Neal did when he pulled out all the stops at Merrill Lynch and went all-in on CDOs? I hope he isn't. It does not seem to be the case, as he mentions thoughtful use of ML.
Nonetheless, caution should be taken.
Like you say, it can be easy to think that something works when it really doesn't. I hope that the above quote isn't meant to be interpreted as "believe the results are correct." Evaluation is paramount when working on these systems to avoid making such mistakes. I assume Google is including evaluation in their machine learning training, but it would have been nice to see that pointed out in the article for folks who may have an interest in machine learning but don't know what's important to focus on.
Surely it is better to talk of learning deep neural nets, and such things. Or maybe "machine training" would be less intimidating. But I guess we're stuck with it, and it's not so bad.
I don't really like the word, but I don't really give a flop either.
I'm not sure how it's better or worse than guru, rockstar, or any other lame word recruiters like to use to make us feel like the special snowflakes we are.
Which word would you like to see in place of 'ninja'?
Sheesh... "Do you want to make the world a better place?", with a photo of Gavin Belson holding an animal, would make me more inspired.
The article leads with a low-testosterone star.