Hacker News
Enrollment Is Surging in Machine Learning Classes (nvidia.com)
278 points by Osiris30 on March 14, 2016 | 127 comments

Based on recent (successful) job interviewing, I'd recommend people looking for a job in data science/ML to do a statistical learning course such as the Hastie and Tibshirani Stanford one [1] as a higher priority over ML/deep learning courses. It gives you a base level of knowledge in the field, and even for jobs that do deep learning, most of the technical questions will be about making sure you know the classical concepts really well.

[1] https://lagunita.stanford.edu/courses/HumanitiesSciences/Sta...

Everyone keeps linking ESL, but really ISLR is much easier to understand, provides more important clarifying context, and covers more or less the same information. ESL is more like a reference and prototype for ISLR.

[edited] I would hire you tomorrow if you come from a quantitative background, know at least ISLR inside and out, and can communicate in a professional manner.

Excellent. I just started reading it. Shall we catch up in about 2 months?


I'm currently a software developer - feel free to reach out at jonschoning at gmail dotcom



Thanks for posting this


http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Sixth%20Printing.p... is the download link for the newest printing.

How difficult is it to land an AI engineer/data engineer (fresh grad) position for someone without a master's/Ph.D.? What do you recommend to a person like this?

While I disagree with stuxnet79 about needing an advanced degree, you'll likely not find much without some kind of experience. In lieu of an MS or PhD, you may want to start in entry-level development at a shop that also has a machine learning or big data analysis group and work your way in there over the course of a few years. After a few years you may be in a better position, experience-wise, than many M.S. grads I've seen.


To add some evidence - I co-chair PyDataLondon (3,000 members, UK's largest Python u/g, UK's most active data science group). I survey our members, our monthly attending group are 40% PhD, 40% MSc, 20% other, few have 5-10 yrs industry experience, the majority have 2-4 years. I'd argue that you need at least a relevant MSc + a couple of years experience to begin to talk of being a data scientist/AI engineer. Coming through data engineering in support of data science is a great route to get practical experience where there's a lot of job demand, at least in London.

Thanks for the input. I have a BS in CS. I took a couple of AI/ML-centric courses in my undergrad. I worked on a couple of ML-centric open source projects, one of which was featured on the front page of HN. I also have good stats on Kaggle. I'm applying for a job at a top ML firm. I'm a fresh grad. Should I apply for software engineer or research engineer/data scientist? Is my experience enough for research engineer/data scientist?


It's hard to say as I don't know your complete background or the level of the role at this firm. If they're a top ML firm and you're applying for a research engineer/data scientist role, then you're probably competing with a hefty bunch of experienced candidates (many of whom I'm sure are on Kaggle too). If you're still lacking professional experience, then I'd suggest starting at a lesser role to get yourself in the door.

Thanks for the help. That makes sense. I guess I'm gonna apply for the software engineering role.

Unless your experience is exceptional and you are acing interviews left and right, I'd recommend getting a Masters at minimum. The unfortunate reality is that few will take you seriously without at least one advanced credential above a BSc.

With no experience you will need at least a Masters. Last I checked half of data scientists in the industry held a PhD and the other half held a Masters degree. You can get away without it if you somehow have significant experience in the field (experience always trumps everything else), but that's pretty rare.

Thanks! I have a BS in CS. I took a couple of AI/ML-centric courses in my undergrad. I worked on a couple of ML-centric open source projects, one of which was featured on the front page of HN. I also have good stats on Kaggle. I'm applying for a job at a top ML firm. I'm a fresh grad. Should I apply for software engineer or research engineer/data scientist? Is my experience enough for research engineer/data scientist?

Sorry, didn't see this until now.

It's hard for me to say without knowing which firm or reading the job description, but to me research engineer implies a PhD level of knowledge. You won't get that from some open source projects and kaggle competitions.

CS109: Introduction to Probability for Computer Scientists http://web.stanford.edu/class/cs109/

also a great resource.

I'd be really interested in your background, the kind of projects you did and the kind of positions you applied for (I'm trying to switch to ML/data science)

That's nice to hear! I'm doing my master's thesis on statistical learning and I often think how unglamorous this field now is. No Bayesianism, less engineering, but having probabilistic guarantees on your out of sample results as well as sample complexity, no matter the underlying distribution, can be quite beneficial.

If you have time, I'd also read something like David Mackay's information theory textbook [1] for more of a Bayesian perspective. Interviewers did seem to appreciate having multiple perspectives and interpretations of basic results, though less practical.

[1] http://www.inference.phy.cam.ac.uk/mackay/itila/

Given you're making recommendations on the topic of statistics, exactly how many companies did you talk to in order to reach the recommendations you're providing, and what if any bias was there in your job search?

Good points. I'm coming from a computational physics research background and applied for a few data science positions at startups, a large social network and a private ML research group, so not that many overall; beware of the small sample size.

The smaller startups seemed to want more "data engineering" experience.

What do you mean by "what if any bias was there in your job search"?

> having probabilistic guarantees on your out of sample results as well as sample complexity, no matter the underlying distribution

What technique are you referring to?

Using concentration inequalities on Lipschitz convex learning algorithms to derive generalization bounds. The seminal papers for this would be Stability and Generalization by Bousquet and Elisseeff (2002), or those by Shalev-Shwartz.
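
For readers who have not seen this kind of result: the main bound in the Bousquet and Elisseeff paper has, as far as I recall, roughly the shape below. Treat it as a sketch from memory and check the paper for the exact constants and conditions. The symbols are $\beta$ for the uniform stability of the learning algorithm $A$, $M$ for an upper bound on the loss, $m$ for the sample size, and $R$ and $R_{\mathrm{emp}}$ for the true and empirical risk of the hypothesis $A_S$ learned from sample $S$.

```latex
% With probability at least 1 - \delta over an i.i.d. sample S of size m:
R(A_S) \;\le\; R_{\mathrm{emp}}(A_S) + 2\beta
        + \left(4 m \beta + M\right)\sqrt{\frac{\ln(1/\delta)}{2m}}
```

No assumption is made about the data distribution beyond i.i.d. sampling. The bound is only informative when $\beta$ shrinks faster than $1/\sqrt{m}$; regularized convex learners with a Lipschitz loss typically have $\beta$ on the order of $1/m$.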

Tagging this thread for future use.

My wife is a PhD Data Scientist, and I'd like to at least have a basic level of understanding of the theory, processes and tools used in her field.

I can't see this link, what is it?

It's a link to his personal saved comments on HN. You can't see it because it's for his username.

Yours would be https://news.ycombinator.com/saved?id=catilac&comments=t (access it by clicking on your username on the top right then "saved comments")

Thank you!

You're very welcome!

ditto (except the wife part)

Are the lectures for that available outside of the regularly scheduled course offerings? I'm interested in coming at ML from the statistical direction.

I see in the link an empty course info page. Is it an online course that requires a login?

Anecdotally, a lot of the people taking these classes totally lack the background or intuition needed and aren't getting any real training in machine learning. They're learning some very rudimentary bits of data cleaning and how to use basic machine learning libraries.

I recently interviewed someone taking a (reputable) online masters in machine learning, and they couldn't describe how or why any of the models they were using worked, nor could they answer most basic questions about the problems / data they were working on.

I've always wondered why data science is so dominated by CS people. CS concepts are the least important thing for a data scientist to know. Fundamentals in math, statistics, and especially linear algebra are far more important. I would hire a statistician who has learned a few CS concepts over a computer scientist who has learned a few statistics concepts any day. Obviously being an expert in both is ideal, but that's pretty rare to find.

I teach at a well respected university for what is essentially a data science masters program, and most of my students come from CS. They are woefully unprepared, and even worse few of them seem to care at all about learning the mathematics behind anything that is going on.

Personally, I think if you can't read linear algebra at the same proficiency as you read English then you have no business calling yourself a data scientist. Unfortunately, in my experience that would describe most people who label themselves data scientists.

More worryingly, they make glaring data errors, such as poor sample estimates, not understanding extrapolation of data to a larger subset, or not even knowing what a confounding variable is.

See I think that the points you've raised are far more important than linear algebra. Learning linear algebra is pretty easy compared to applying statistical concepts to real problems (source: I do this stuff for a living).

Agreed. It's one of those subjects that people love to be fascinated with, and they see how lucrative the field is, so it gets a ton of attention. Unfortunately, they all want to shortcut it, and there are plenty of organizations who will help them try, but machine learning requires math, stats and arguably CS proficiency at at least an undergraduate minor level. And not many people have that.

You need to know the background, which is: statistics, discrete math, algorithms.

Most books on the subject assume you already know what linear regression is; Naive Bayes is just explained briefly in theory, and then it goes right into the code in Spark, R or Clojure, for example. However, UC Berkeley's course is very theoretical: almost no code is shown, and what there is is just MATLAB code (don't remember the name off the top of my head). Their Spark course, though, is heavy on the code, with IPython activities.

This could be a real problem in their respective careers (or not). Here's the analogy: scientists who use statistical tools as blackboxes are the ones responsible for the whole problem with misusing p-values. Similarly, poor intuition and training in machine learning leads to the blackbox mentality and consequently, problems with building working systems.

That's not really an analogy, it's an assertion without a shred of evidence.

I'm torn about the blackbox thing. On one hand, it's important to understand the underpinnings of a model. On the other, we utilize a multitude of things in our daily lives of which we have no fundamental understanding; that's abstraction in a nutshell.

Machine learning gets kinda scary, though. For one example, discrimination with ML is super easy. Check out fatml.org for instance. Also, with ML it's really easy for an amateur to overfit like crazy and draw spurious conclusions due to poor methods. People think they have an intelligence when they instead have very finicky tools.

Edit: another pointer here https://algorithmicfairness.wordpress.com
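
To make the overfitting point concrete, here is a minimal sketch in plain Python (standard library only). Everything in it is made up for illustration: the labels are coin flips, so there is nothing to learn, yet a 1-nearest-neighbour classifier still scores perfectly on its own training data. The function names and sample sizes are assumptions chosen for the example, not anything taken from the linked sites.

```python
import random


def nearest_label(train, point):
    """Return the label of the training example closest to `point`."""
    best = min(
        train,
        key=lambda ex: (ex[0][0] - point[0]) ** 2 + (ex[0][1] - point[1]) ** 2,
    )
    return best[1]


def accuracy(train, data):
    """Fraction of examples in `data` that 1-NN (fit on `train`) labels correctly."""
    hits = sum(1 for x, y in data if nearest_label(train, x) == y)
    return hits / len(data)


def make_noise(rng, n):
    """Random 2-D points with coin-flip labels: the features carry no signal."""
    return [((rng.random(), rng.random()), rng.randint(0, 1)) for _ in range(n)]


rng = random.Random(0)
train = make_noise(rng, 200)
test = make_noise(rng, 2000)

train_acc = accuracy(train, train)  # each point is its own nearest neighbour
test_acc = accuracy(train, test)    # fresh data: should hover around chance

print(f"train accuracy: {train_acc:.2f}")
print(f"test accuracy:  {test_acc:.2f}")
```

Someone who only looks at the training score would conclude the model is excellent; the held-out score shows it has memorized noise.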

There is a crucial distinction between the multitude of things we utilize in our daily lives and machine learning/high-dimensional data analysis: we aren't equipped to intuit the workings of high-dimensional advanced-math statistical inference in the same way that we can intuit the workings of say, a water pump, or simple arithmetic on Excel, or simple database systems, unless we are appropriately trained in the relevant math and science.

Some examples: blackbox application of classifiers (e.g. WEKA gui as used by some for data exploration) can ignore parameter optimization, unbalanced sets, parsimony in features, dimensionality reduction, etc. etc.

Scientists are also domain experts tho, but bootcamp-style practitioners won't be.

As somebody who tries to learn from fundamentals, can you suggest a book (or two) and an online course to accompany it, to get started in this area? I have been working as a programmer for a while (10 years).

Sure, a couple things.

(I'm assuming you're comfortable with multivariable calculus.)

Andrew Ng's Coursera course is good.

PRML (Pattern Recognition and Machine Learning) by Bishop is good, and has a useful introduction to probability theory.

You also want a good grounding in linear algebra. Strang is basically the authority on linear algebra: http://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-...

You want a strong grounding in probability theory and statistics. (This is the basic language and intuition of the entire field.) I don't have as many preferences here (although it's the most important); someone in this thread pointed to a course on statistical learning at Stanford that's good.

A good understanding of optimization is helpful. Here's a link that leads to a useful MOOC for that: http://stanford.edu/~boyd/cvxbook/

There's a lot of other stuff (Markov decision processes, Gaussian processes, Monte Carlo methods come to mind) that is useful that I'm not pointing to, but if you've hit the other stuff here then you'll probably be able to find out those things.

If you're into it, https://www.coursera.org/course/pgm is good but not vital.

You may want to know about reinforcement learning. This answer does better than I can: https://www.quora.com/What-are-the-best-books-about-reinforc...

Deep learning seems popular these days :) (http://www.deeplearningbook.org/)

Otherwise, it depends on the domain.

For NLP, there's a great Stanford course on deep learning + NLP (http://cs224d.stanford.edu/syllabus.html), but there's a ton of domain knowledge for most NLP work (and a lot of it really centers around data preparation).

For speech, theoretical computer science matters (weighted finite state transducers, formal languages, etc.).

For vision, again, Stanford: (http://cs231n.stanford.edu/syllabus.html)

For other applications, well, ask someone else? :)


arxiv.org/list/cs.CL/recent arxiv.org/list/cs.NE/recent arxiv.org/list/cs.LG/recent arxiv.org/list/cs.AI/recent

EDIT: unfortunately, there's also a lot of practitioner's dark art; I picked a lot up as a research assistant, and then my first year in industry felt like being strapped to a rocket.

Oh no! I forgot about information theory! I don't have a specific recommendation, but it's very useful background.

This is fascinating stuff and well worth learning for its own sake but I think people that are jumping on this bandwagon with the idea that there is a bonanza of high paying jobs waiting for them are going to be disappointed. Even very heavily data-centric companies only hire a few ML specialists for every handful of general purpose code monkeys.

I think the same thing that happened to quants on Wall Street will happen to those pursuing Data Science/ML today (if it hasn't already happened). Post-2008 there was such a glut of qualified quants that companies moved the goalpost and now it's very difficult to even be considered for a role if you don't have a PhD.

With so many programs and courses springing up about ML, in a few years I suspect there will be a glut of Data Scientists on the market, and once again companies will use the PhD as a filter. Already it seems like a good number of places will only hire PhDs for such roles.

As far as I can tell Wall Street has always had a penchant for pedigree in addition to skills. Large companies are often the same. People used to talk about all software engineers needing CS degrees in the future. Again in Wall Street and large companies you'll see this, but there are still plenty of well paid, talented software developers with no formal CS training.

In my experience hiring and chatting with other people hiring data scientists, there's the same trouble as there is with software engineers. No matter how many people have the training there's still a dearth of applicants that are truly talented and can actually do things. PhDs fleeing academia for a promise of easy employment and money are a huge bulk of new data scientists that I've seen and most of them have a very hard time taking deep knowledge and applying it to solve real-world problems.

At least in tech I think the future of Data Science lies in the perpetually small group of people that will have a proven track record of coming into companies and actually solving problems, just as it has been in software development.

> I think the same thing that happened to quants on Wall Street will happen to those pursuing Data Science/ML today (if it hasn't already happened). Post-2008 there was such a glut of qualified quants that companies moved the goalpost and now it's very difficult to even be considered for a role if you don't have a PhD.

I'm not sure that's a valid comparison. There's a relatively fixed and fairly small pool of companies who need quants. Machine Learning, OTOH, can be used by almost any company in existence (even if most of them don't realize it yet). And plenty of companies don't need somebody doing cutting edge academic research in ML... they need somebody who can use a pre-packaged library or service and apply linear regression, or k-means, or build a simple neural network with backprop.

It can't, though. ML works best when you have enough data to build models that are robust and resilient to noise, and enough customers (or users) that these models will move the needle.

The vast majority of startups and small businesses - those whose customer base measures in the dozens to hundreds - should be going out, engaging their customers person-to-person, and looking for qualitative data, because that's what'll move the needle on their sales. There's no point in understanding "your customer base" as a unit until it's big enough that it behaves, statistically, as a unit; instead, you should be focusing on "your customers", individually. Once you get into the thousands of customers you can start applying some basic learning models, and once you get into the millions machine-learning becomes as fundamental as pricing.
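
A rough way to see why the customer count matters: the uncertainty on any rate you measure shrinks only with the square root of the sample size. The sketch below uses the textbook normal approximation for a proportion; the 5% conversion rate and the three customer counts are hypothetical, and the approximation itself is crude at the small end, which is rather the point.

```python
import math


def conversion_margin(n_customers, rate=0.05, z=1.96):
    """Approximate 95% margin of error for an observed rate (normal approximation)."""
    return z * math.sqrt(rate * (1 - rate) / n_customers)


# Hypothetical customer counts: dozens, thousands, millions.
for n in (50, 5_000, 5_000_000):
    margin = conversion_margin(n)
    print(f"{n:>9} customers: 5% conversion, +/- {margin * 100:.2f} points")
```

With a few dozen customers the margin of error is larger than the rate being measured, so talking to customers individually tells you more than any model would.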

But you gotta get there first, and many businesses haven't. And even if they have, userbase-wise, they need to build the infrastructure (through web & mobile devs, backend engineers, data scientists, etc.) to log, store, and clean all that data before they can apply machine-learning to it.

> But you gotta get there first, and many businesses haven't.

Agreed. But many have as well. So I'll still argue that there are more potential positions for people doing "applied ML" than there are for quants. I'm open to being proven wrong though.

> And even if they have, userbase-wise, they need to build the infrastructure (through web & mobile devs, backend engineers, data scientists, etc.) to log, store, and clean all that data before they can apply machine-learning to it.

We're working on a MLaaS offering to help reduce the need for a lot of that stuff. And there are some offerings in that space already.

Once it becomes straightforward to do ML with a pre-packaged library, you'll quickly start to see Amazon or a third-party offer effective MLaaS. That will suck quite a bit of oxygen out of the room.

Already exists:


Problem is, writing an effective machine-learning model already doesn't require knowing the algorithms well. It requires knowing your data well. You can provide tools for this, and AML does, but there's no substitute for actually working with the data day-in-and-day-out and developing an intuition for it.

(Deep learning promises to change that a bit, since the relevant features are extracted for you by the algorithm and you don't need to do any particular data cleaning or feature extraction work. You still need to understand your data well to understand how to train the model, though, and how to apply primitive ML operations - classification, regression, clustering, etc. - to a real-world problem.)

> You still need to understand your data well to understand how to train the model, though, and how to apply primitive ML operations - classification, regression, clustering, etc. - to a real-world problem.)

And this is the kind of stuff that I believe can be done by people who don't necessarily need PhDs in Stats or ML. A decent grounding in statistics / ML, and good domain knowledge should be enough to support using pre-packaged algorithms to solve business problems.

Interesting. I sometimes mentor junior / aspiring devs on a site, and have seen that many of them seem to be trying to get on to the bandwagon you mention; some of them don't even have good experience with basic programming, and are still trying to do machine learning / data science MOOCs, bootcamps or courses on free sites. Some of them seem to be trying to learn programming at the same time as data science / machine learning. Don't think that is going to work out well for them. Many of them seem to be in it for the supposed bonanza. Wanting to get a well-paying job is fine in itself, but many of them seem to be trying to shortcut the learning process, either due to ignorance or just because they don't want to spend a lot of time.

Edit: grammar.

Here's an example I came across on that site, that shows what I mean --- someone put in a request: "I want to build an advanced app and I don't know where to start."

Hey Wow. Hoo Boy (or Girl). Maybe you should start by learning the basics, get some work experience, and only then try to pull a Zark Muckerberg or a Bacefook.

I have been working in teams with a big ML component for many years, but I have been working from "the outside". I keep on being promised to work more in depth, but never get a chance to. When job hunting last year, I noticed that at almost every company, the division between developers and those working on ML is almost 100%. Both roles are siloed. Good luck transitioning from dev to ML if you have already started your career. It was not like this just 4 years ago.

100% true, most of the time companies hire PhDs only.

Ooyala has only 3 data scientists. All have PhDs and come from a math/statistics/com sci background. Kueski only has 1!

True from my experience as well. If you're going to invest in a Data Scientist, you want the cachet of having a PhD in that role.

Not startups.

Where would you go next if that sub-area is already saturated? I'd particularly like to get some opinions for self-taught devs tired of CRUD and DevOps land(s.)

Good question. I think GOOD CRUD developers are still in high demand and probably will be for some time. As for myself, I've been focusing on getting good at building complex user interfaces. Can't go wrong learning React now.

Personally, I think that data visualization would be an interesting niche to go to.

> This is fascinating stuff and well worth learning for its own sake

Touché. As a web developer, I have no illusions about competing with people with the proper mathematical background. However, there are still some problems that are relevant and interesting in our field (collaborative filtering, for instance).

So I guess if you temper your expectations, you won't get burned.

>>Even very heavily data-centric companies only hire a few ML specialists for every handful of general purpose code monkeys.

True for now, but it will change. Once terms like 'Statistics', 'ML' and 'Data science' become commonplace, there will be a ridiculous amount of lust for automation jobs.

Sure, you won't be fixing the beta release bugs for skynet 1.0. But almost any job which has scope for data collection, analysis and automation will demand skills in these areas.

I'm currently enrolled in a Stanford ML class on Coursera. I'm not very far into it so I don't have a lot to say about the class in particular but I know WHY I signed up for it.

I graduated from uni with a Computer Engineering degree in 2005- ML wasn't really a thing being offered at that point. I loved algebra and calculus but hated statistics. All I hear about these days is machine learning, so I wanted to see what the fuss was all about, as I've really never been exposed to it. I also wanted a university style exposure to it as I wanted to ease into some of the statistical concepts necessary as I wasn't good at them way back when and I haven't practiced them in 10 years.

Finally, after some initial exposure, I may find that ML or some of its concepts will be another tool in my problem solving tool bag.

I feel like a lot of people have my same motivation. We are hearing all the hype, we weren't really exposed in our formative years, and we're curious. Also, my particular course was free. So what's the harm?

Good points. Just to clarify in case folks take your experience in 2005 as the rule, while certainly not as widespread as today, ML was definitely being offered at various universities in the late 90s and early 2000s: for example, the textbook "Machine Learning" by Tom Mitchell [1] was published in 1997 and was used in the undergrad ML class at Georgia Tech when I took it in 2001-2002.

[1] http://www.cs.cmu.edu/afs/cs.cmu.edu/user/mitchell/ftp/mlboo...

That was why I said it "wasn't really a thing being offered". There were certainly some classes that dabbled in the general area, but nothing that I can remember that came out and said "this is Machine Learning".

At least in my Computer Engineering curriculum it was very much about the electrical engineering and software development fundamentals.

I graduated in 1999, and that too in India. So imagine my situation :). I finished the Coursera ML class recently (though I had done some ML prior to that) and I really enjoyed the class. You should definitely aim to finish it. The assignments towards the end feel relatively simpler compared to the initial ones.

I've read that machine learning is making statistics classes fall out of favor. I can understand why. In a lot of ways, machine learning is statistics in reverse. Instead of starting with a hypothesis and trying to reject the null hypothesis, unsupervised machine learning starts with data and generates hypotheses.

I'm not even sure it's "in reverse." The supervised portion of ML is almost entirely regular ol statistics.

I think Statistics departments have lagged behind in teaching ML techniques as first class citizens of the statistical world, but also that ML folks tend to gloss over or ignore a lot of the benefits formal inference and statistical thinking bring to the table.

Traditional statistics is very very good at helping us learn a lot about relatively simple (or carefully and deliberately simplified) processes, and provides a rich background in study design.

ML techniques are good at helping us learn a little bit about arbitrarily complicated processes, and apply that knowledge quickly. A modern practitioner in either field should have a working knowledge of both [families of] paradigms.

I guess there's a difference in perspective if you take your machine learning classes with the Mathematics/Stats department rather than the Computer Science department.

It's more than just statistics, broadly speaking that is the scientific process.

It is part of the scientific process, just as it is part of the machine learning process. The hypothesis has to come from somewhere. In classic science, it comes from a scientist or group of scientists who notice something strange or see a correlation, devise an explanation, then test it to see if it is correct. Unsupervised machine learning just automates the second and part of the first step by having one or more algorithms devise the hypotheses instead of humans.

ML is still supposed to do the third step. IMO, where ML often falls down due to the immaturity of the field is in not creating good experiments to test the models (hypotheses) generated by the algorithms.

Side note: Zen and the Art of Motorcycle Maintenance is a classic read about "The hypothesis has to come from somewhere".

Concretely, does anyone know of a good competitor to Andrew Ng's infamous Coursera class? His material dates to 2011 and that's an eternity in the computing world

ML is almost entirely based upon math that's decades upon decades old. The only reason it's become such a big thing lately is that the hardware and data sets have finally caught up to the point where it's useful on a broad scale.

Sure, the research portion of the field has made a lot of strides since 2011 but for anything that's not PhD or research level stuff, Ng's class is perfectly up to date.

You'd be hard pressed to find a better class anywhere.

If you're speaking tools and frameworks, yes, but core data structures, algorithms, and paradigms change very slowly. Andrew's course focuses on the latter, basic ML techniques that have been used for years and will continue to be used for years to come.

As someone pointed out in an earlier thread, the actual course is more advanced than the Coursera class, and its lectures are also available online: https://www.youtube.com/view_play_list?p=A89DCFA6ADACE599

As well as the problem sets: https://see.stanford.edu/Course/CS229

They are not any more recent, though.

I really enjoyed this one: https://work.caltech.edu/telecourse.html

As stated in another comment, the basics haven't changed much. The libraries you will use have evolved though. My impression is that that is where the innovation has been.

All of the concepts are still valid since he covers the fundamentals.

This math has been around since the early 80s. Don't panic.

How about this one?


There's also the Geoffrey Hinton class on Coursera, although I'm not sure if additional sections of it are being offered per se. But you can still enroll in it and watch the videos and stuff. I don't know if it's any more recent than Ng's class, but it goes into more detail in some areas and covers slightly different topics. At worst, it's a good complement to Ng's class.

For Neural Networks, I'm currently working along with Stanford's CS231N: http://cs231n.stanford.edu/

But I wouldn't really be able to keep up with it if I hadn't taken Ng's Machine Learning course first. The basics it teaches aren't out of date at all, and there's lots of regular old ML stuff in there that is useful now and hasn't changed in the interim apart from maybe which library you might pick.

I took an Intro AI class in college (a broad ML class with some other techniques thrown in) and to be honest, I found it incredibly boring. I don't mean this to be a dig at the field, but I just thought I'd share my personal feelings. I didn't much enjoy the process that ML required but the end result was fairly cool (though not personally rewarding enough to be worth the process).

Distributed Systems was far more fascinating in my opinion.

This has generally been my experience too, the results of AI are cool and exciting, but the programming side often ends up dull and boring - accentuated by the "I have no idea why tuning that parameter worked" factor that happens in any complex AI system.

Introducing senses and movement - i.e., robotics - makes the boring parts worth it for me though.

If you have a background in hardware design / chip architecture, things get even more interesting. Figuring out how to balance memory bandwidth, power and optimal datatypes (think integer v. floating point) is fascinating stuff.

This is great news; with the CS & ML basics covered, we in the industry [1] are only left with instilling some common sense into the graduates :)

In academia, it's publish or perish, so much of the cutting edge research is over-engineered (over-researched?) and too brittle to be relied on in production. Not to mention lacking a usable implementation.

Because in practice, many business problems that call for automation and ML can be solved using the simplest of techniques to a satisfactory degree. The challenge fresh graduates face is rarely advanced math. It's usually solving the right problem and making the solution robust enough to be reliable in production (+communicating this to all stakeholders).

Model interpretability and your ability to analyse errors and iterate on the solution are worth way more than a few percent gain in accuracy (accuracy/F1 are also rarely the measures most relevant to the business goals; the cost matrix is usually trickier than that). Pulling every opaque deep learning library under the sun into a system that could be solved using a few regexps and ifs to get a 5% KPI boost is not a good idea.

Building practical Machine Learning models is as much about solid engineering and understanding the business objectives as it is about math and theory, though the math cannot be skipped. We're not nearly at the stage where some "generic ML in the cloud" can cut it, wherever there's real money on the line. Successful systems are still very domain specific, built with significant SME expertise.

[1] Source & plug: we run an ML mentoring program for promising university students, as well as give corporate trainings on Machine Learning in Python: http://rare-technologies.com
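
To make the cost matrix point above a bit more concrete, here is a minimal sketch. The labels, the two sets of predictions, and the cost values are all made up for illustration (nothing here comes from a real system); it only shows how the model with the better accuracy can be the more expensive one once errors are priced differently.

```python
# Hypothetical labels: 1 = fraud, 0 = legitimate (illustrative data only).
y_true  = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
model_a = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # never flags anything
model_b = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # over-flags, but catches all fraud

# Assumed cost matrix, indexed as COST[true_label][predicted_label].
COST = {
    0: {0: 0, 1: 5},    # a false alarm costs a manual review
    1: {0: 100, 1: 0},  # a missed fraud case is far more expensive
}

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def total_cost(y_true, y_pred, cost):
    """Sum the assumed business cost of every prediction."""
    return sum(cost[t][p] for t, p in zip(y_true, y_pred))

for name, preds in [("A", model_a), ("B", model_b)]:
    print(name, accuracy(y_true, preds), total_cost(y_true, preds, COST))
```

With these made-up numbers, model A has the higher accuracy but the higher total cost, which is the kind of mismatch that a single accuracy figure hides.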

This is definitely consistent with my experience. Every candidate I've interviewed for software engineering internships this year says that their main technical interest is machine learning.

Meaning no offense to you, but that sounds like a terribly boring bunch. More constructively, from today's vantage point it looks like putting oneself on a fast track to becoming a commodity.

Strengthening your comfort with math and algorithms is the first step. If you are doing your undergrad, do as many math classes as you possibly can. I think the hardest part of breaking into a career in machine learning and data science is learning the math on your own. Being enrolled in a real class helps.

In layman's terms, what's the intersection between ML and AI?

ML is a subset of AI.

So your answer should have been "ML" :p

Genuine question: how do you benefit from these classes if you just want to use third-party libraries? I mean, you can now do amazing things just using others' APIs without knowing what's behind them.

If you want to use a library to find the shortest path in a graph, then the input is simply the list of nodes and edges and is easy to understand. You can use a third party library to solve the problem without knowing how the algorithm works under the hood.

But the arguments to an ML algorithm provided by third-party libraries include various parameters to tune the algorithm and make tradeoffs between accuracy, time, etc. Debugging the algorithm for your particular dataset also requires some knowledge of how the algorithm works.

But there are use cases where you need not know what's happening behind the scenes. For example, you can use an ML-as-a-Service provider to classify your photos without knowing how their algorithms work.
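
A rough illustration of the tuning point above, as a sketch rather than how any particular library does it. This is a tiny hand-rolled k-nearest-neighbours classifier on made-up 1-D data; `knn_predict`, the training points, and the query value are all hypothetical. Unlike a shortest-path call, the answer depends on a parameter (`k`) that the caller has to choose.

```python
from collections import Counter

# Made-up training data: (feature value, label).
train = [(1.0, "a"), (2.0, "a"), (3.0, "b"), (4.0, "b"), (5.0, "b")]

def knn_predict(train, x, k):
    """Predict the label of x by majority vote among its k nearest points."""
    neighbors = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Same data, same query, different k: the prediction changes.
for k in (1, 3, 5):
    print(k, knn_predict(train, 2.4, k))
```

Here k=1 and k=3 vote "a" while k=5 votes "b", so picking k sensibly requires at least some idea of what the algorithm is doing with it.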

There seems to be a bit of an art to choosing your machine learning algorithm and parameters. The class is a good way to get a grasp on that.

I agree. Also, I think there is going to be a proliferation of APIs to choose from and it will be helpful to understand which ones are the best and why. Some could be flat out wrong.

You will get more accurate results that are better motivated if you understand the underlying algorithm. Just like you get better results if you understand your data.

Don't rely on a magic box.

> Don't rely on a magic box.

That's quite funny given the circumstances. I think the neural network algorithm disagrees, and it really wants you to rely on a magic box.

Have you tried training deep learning models before? Deep learning requires a ton of hyperparameter and architectural tuning, and there are a bajillion data transformation, mathematical optimization, etc. tricks that people use. The end product is still a magic box, but there's a ton of background knowledge required to train these models well.
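
For a feel of why tuning matters, here is a toy sketch and not a deep learning model: plain gradient descent on f(w) = w**2, where the only hyperparameter is the learning rate. The function name `gradient_descent` and the specific rates are assumptions chosen for illustration. Real networks have many more knobs, but the same kind of sensitivity shows up.

```python
def gradient_descent(lr, steps=20, w=1.0):
    """Minimize f(w) = w**2, whose gradient is 2*w, starting from w."""
    for _ in range(steps):
        w -= lr * 2 * w  # standard update: w <- w - lr * gradient
    return w

small = gradient_descent(lr=0.1)  # each step scales w by 0.8, so it shrinks toward 0
large = gradient_descent(lr=1.1)  # each step scales w by -1.2, so it oscillates and grows

print(abs(small), abs(large))
```

The small rate converges toward the minimum while the large one diverges, even though the code and the problem are otherwise identical.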

This set of lectures provides a nice balance between providing a (relatively) gentle theoretical basis for the algorithms and also providing code labs in R for using R library implementations.


Somebody has to be the 'others'. Somebody has to know what's behind.

You still need to know what is appropriate to try. Depending on the problem, there might be colossal mistakes that you can make just because you don't know what's behind the scenes.

Another useful, related, online course, Statistical Aspects of Data Mining (Stats 202) from Stanford & GoogleTechTalks https://www.youtube.com/watch?v=zRsMEl6PHhM&list=PLA4B3B4CB6...

I've studied EE before at master's level, but I'm going back to do a master's in CS with 4/5 courses in ML. In the end I feel there is a great advantage in being able to implement the algorithms you design for ML applications. What do you think?

The ML class here is the most popular class by far in the entire department, enough that it's the only class where enrollment is based upon an exam. Half of the "enrolled" ended up either getting cut or dropping before the first day of class.

For people who want to learn more about machine learning, check out this free textbook written by Yoshua Bengio.


I always felt classes in emerging fields were a little premature. It seems like it always takes a few years for professors to learn how to teach it, much less what exactly they should be teaching.

Yep. Berkeley only recently started offering an ML class in 2013. The last time it was offered was 1988-1990, the last AI boom (assuming the course numbers meant the same thing back then).

tagging this thread for future reference

