Meanwhile, statistical methods for non-image/text data with identifiable features can often work better than neural networks, but they are not as sexy. (Good discussion on HN about this: https://news.ycombinator.com/item?id=13563892)
Maybe we should explain the cost/benefit of the buzzwords vs. the science?
I'm curious what Fisher, Neyman, and Pearson would say about the current state of the field. Especially considering how often they and other statisticians disagreed with each other throughout the 20th century.
As long as we can reasonably understand the I/O (and that use of the software can pass a security review), I don't see a problem with this.
The Upfront Summit is LA's premier technology event, with more than 750 of the country's top investors, startups, and corporate executives.
If he was talking to executives, it may be sound advice. It's extremely likely the hot business opportunities in the short-term will be applying ML techniques to outstrip competitors trying to solve problems with traditional hand-coded solutions. In business-speak, "Learn ML" translates to "Familiarize yourself with the space and hire the people you need who know the topic," because that's how a company "learns" something.
E.g. when I was reading up on genetic algorithms etc. 20 years ago we also expected the "revolution" to be right around the corner, and that things like genetic programming would change the world in a few years' time. And while various of those methods found use in some places, most places that could have used at least some of the simpler ones still don't.
In other words, I think talking about a 3 year timeline is crazy. It's getting more attention, sure, but there is so much low-hanging fruit that most developers could be busy for the next 20 years putting in place the most trivial algorithms all over the place, and we still wouldn't have picked off even the low-hanging fruit where the computational resources, algorithms, and data to make a big impact were well within reach 20 years ago.
This certainly means there is plenty of room for a lot of developers to do very cool stuff and build careers on machine learning today, but it also means most developers will not have to learn the state of the art - or anything near it - for a very long time.
As a concrete example I give to people, consider all of the search boxes out there on various sites - product searches, location searches, site searches - that are straight keyword based searches that don't take into account any clickstream data to improve ranking. The proportion of search boxes I see that take advantage of the available data is vanishingly small, even though very basic analysis can improve the perceived relevance of the results massively.
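To make the search box example a bit more concrete, here is a minimal sketch of the kind of "very basic analysis" that could sit on top of a plain keyword search: boost each match by a smoothed click-through rate taken from the clickstream log. The function names, the scoring formula, the prior values, and the toy data are all assumptions for illustration, not something taken from a real site.

```python
def keyword_score(query, title):
    """Fraction of query terms that appear in the title (plain keyword match)."""
    terms = query.lower().split()
    words = set(title.lower().split())
    if not terms:
        return 0.0
    return sum(term in words for term in terms) / len(terms)


def smoothed_ctr(clicks, impressions, prior_clicks=1.0, prior_impressions=10.0):
    """Click-through rate with an additive prior, so items shown only a few
    times are not ranked on a handful of lucky clicks. Priors are assumed."""
    return (clicks + prior_clicks) / (impressions + prior_impressions)


def rank(query, items, stats, click_weight=1.0):
    """Order keyword matches by keyword score boosted by smoothed CTR.

    items: {item_id: title}
    stats: {(query, item_id): (clicks, impressions)} from the clickstream log
    """
    scored = []
    for item_id, title in items.items():
        kw = keyword_score(query, title)
        if kw == 0.0:
            continue  # not a keyword match at all
        clicks, impressions = stats.get((query, item_id), (0, 0))
        score = kw * (1.0 + click_weight * smoothed_ctr(clicks, impressions))
        scored.append((score, item_id))
    # Highest score first; ties fall back to item id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item_id for _, item_id in scored]


# Hypothetical catalogue and click log.
items = {
    "a": "red running shoes",
    "b": "running shoes for trail",
    "c": "blue sandals",
}
stats = {
    ("running shoes", "a"): (2, 100),
    ("running shoes", "b"): (40, 100),
}

print(rank("running shoes", items, stats))                    # ['b', 'a']
print(rank("running shoes", items, stats, click_weight=0.0))  # ['a', 'b']
```

With `click_weight=0.0` this degrades to the straight keyword search most sites already have, which is roughly the point: the clickstream part is a small addition, not a rewrite.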
We certainly will see more companies invest in proper machine learning as the payoff gets higher and difficulty in taking advantage of it drops. But we will also see a huge proportion of sites that could use it continue to ignore it for years to come.
There are big business opportunities in finding ways of making a dent in that portion of the market, though, and so learning this stuff can certainly be well worth it on a personal level, but I don't believe in his timeline in terms of the overall market.
- I know enough machine learning to be dangerous, but I'm hardly ever asked to use it. I designed a Bayesian classifier for my own startup around 6 years ago, analyzing political donor networks. I've completed the Stanford ML course. Back in college I did a math minor, so I'm comfortable with linear algebra, calculus, etc. I'm pretty comfortable with statistics of both kinds. But my bread-and-butter is freelance web development . . . and I'm not really even sure how to find work doing more MLy things.
- I've read over and over that the most time-consuming part of ML work is data collection & cleanup, and that matches my own experience. It is the same thing that killed so many data warehouse projects in the 90s. You don't need a Ph.D. to do it, but it is a tough and costly prerequisite. So it seems like you'll need non-ML programmers even for specifically ML projects.
- In a similar vein, Google has written about the challenges of "operationalizing" machine learning projects.[1] Having a little experience collaborating with a team doing an ML project, where they did the ML engine and I did the user-facing application, I can say that many ML experts are not experts in building reliable, production-ready software.
- Will there ever be a WordPress of machine learning? If there is, the author will be rich, but you won't need a Ph.D. to operate it. But because ML requires hooks into your existing systems, I don't know if this will ever happen. What will happen, I think, is plugins to existing e-commerce systems for product recommendation or other off-the-shelf ML-powered features. These already exist, but I assume they will become more prevalent and powerful over time. In any case, the mainstreaming of ML for business will be inversely correlated with the expense to implement it, which suggests it will be easier and easier for non-expert developers to use (and misuse).
[1] https://research.google.com/pubs/pub43146.html
So there's some off-the-shelf ML, which is pretty great, but the problem of automatic feature selection - especially if new features need to be plumbed into a system - is still insufficiently solved. We've got systems for paring down the set of available features, but our systems have no idea how to ask for new features. That's essentially domain research, after all, and probably a 'strong AI-complete' problem.
I think this is key - prospective clients won't ask for it because they don't understand where it could be used, and they won't understand the heavy ML methods. An approach there would be to pitch things like improving search results using a Bayesian classifier applied to analytics data as a cheap upgrade when quoting other work. Until people are used to even the basic statistical approaches they won't be ready to invest in something more drastic.
Maybe if it was coming from Bill Gates, Mark Zuckerberg, or another tech titan with some actual coding experience and a deeper level of learning about what ML even is. Cuban's a businessman, and most CEOs I know don't have a clue about the stacks that run their own company, let alone what's popular.
That said, I do think ML will be important, but I develop ecommerce apps and the like in Laravel. Unless I move into AI and neural nets, I don't see needing to know a lot about ML (though I wouldn't mind moving in that direction as that space picks up) -- but there are still plenty of opportunities without it.
I suppose there will be (and to some degree already are) machine learning magic wands, but they are going to be the kinds of things that suck capital out of the companies that utilize them (through loss of proprietary data and competition-blocking moats).
"""He thinks even programming is vulnerable to being automated and reducing the number of available programming jobs."""
I believed something similar could happen within 1-2 years of learning/writing AI programs (more than 12 years back). I believed it so much that it consumed most of my weekends as I took on the Genetic Programming approach. Yes! Computers can write programs - BUT try reading those. Eventually, after spending hours or days, you will be able to read those programs, and you might find a simple "hello-world" program represented by a complex mathematical equation. Good luck trying to get such a program fixed by humans. Imagine the experience of decoding deep-learning neural nets.
However, that is black-box from a programmer perspective.
From a business/management personnel perspective - the code is a black box anyways. When they get NNs that can generate required software, they will replace the people-manager with a NN-manager (who is a programmer btw!)
This is more VC/founders who are hyping up AI and need more ML folks so they can drive down costs.
The modern tools for ML/deep learning are accessible to all, open source, and well documented. And as I note in my top-level comment, old-fashioned statistical methods like linear regression are more than sufficient for real-world business problems, and definitely do not require a PhD to grok.
Stop spouting this bullshit. You don't need a PhD, and you don't need to advance the field to be doing 'serious ML'. All you need to be able to do is know how and when to apply it to solve crucial business problems.
Yet, I can still apply machine learning to solve Ax=b problems. More importantly, I can use business analysis and write code to transform business problems into an Ax=b problem, and then optimize it.
You don't need a PhD to grok optimizing a vector to transform a matrix of inputs into a vector of observed outputs, then apply that trained vector going forward. Neural nets are slightly more complicated than a straight linear regression, but only slightly. I'd call decision tree methods like GBM even more complex, but still eminently grokkable for a decent programmer.
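For anyone who wants to see how small the "optimize a vector" step really is, here is a rough sketch in plain Python: fit the vector x in Ax=b by gradient descent on squared error, then apply it to a new row. The toy data, the learning rate, the step count, and the function names are assumptions picked for illustration; in practice you would reach for a library solver rather than a hand-written loop.

```python
def predict(row, weights):
    """Dot product of one input row with the trained weight vector."""
    return sum(x * w for x, w in zip(row, weights))


def fit_linear(A, b, learning_rate=0.05, steps=2000):
    """Minimize mean squared error of A @ x against b by batch gradient descent.

    A is a list of input rows, b the list of observed outputs.
    learning_rate and steps are assumed values that suit this toy data only.
    """
    n, d = len(A), len(A[0])
    x = [0.0] * d
    for _ in range(steps):
        residuals = [predict(row, x) - target for row, target in zip(A, b)]
        # Gradient of (1/n) * sum(residual^2) with respect to each weight.
        grad = [
            2.0 / n * sum(r * row[j] for r, row in zip(residuals, A))
            for j in range(d)
        ]
        x = [xj - learning_rate * gj for xj, gj in zip(x, grad)]
    return x


# Toy data generated from y = 3 * feature + 2.
# The leading 1.0 in each row is the intercept column.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
b = [2.0, 5.0, 8.0, 11.0]

weights = fit_linear(A, b)

# "Apply that trained vector going forward": predict for an unseen input.
forecast = predict([1.0, 4.0], weights)
```

On this data the weights should land very close to [2.0, 3.0] and the forecast close to 14.0. The hard part in a real project is deciding what goes into A and b, not the loop.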
It isn't required for any serious research effort, except by the accident of inertia. And it certainly isn't a necessary indicator of determination.
Good thing the vast majority of businesses won't need "serious" ML, but instead will require only simple implementations to help solve business problems.
Obviously you will want a theoretical expert on your team. But if your startup is counting on having a room full of them, good luck.
I've a PhD and held ML-engineer positions in a few different companies - I've good industry awareness.
Most applied ML, for most companies, right now, is actually relatively simple models (hand-coded rules! logistic regression! You'd be shocked how common these are.) The bulk of the work is data cleaning, gathering, integration, deployment, productisation, reliability, avoiding pathological cases, special-casing, Product, UX. You do need ML specialists who understand the stuff, to make it all work and come together - but the ratio of ML specialists to the wider team is low. Maybe 1 or 2 specialists on a team of 10 for an ML-heavy product.
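To give a sense of how simple "relatively simple" is, here is a sketch of a one-feature logistic regression trained by gradient descent in plain Python. The scenario (support tickets predicting churn), the data, and the hyperparameters are all made up for illustration; a real team would more likely call an off-the-shelf library.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.3, steps=10000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by gradient descent on mean log loss.

    learning_rate and steps are assumed values that suit this toy data only.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data: support tickets per month, and whether the customer churned.
tickets = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
churned = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(tickets, churned)


def churn_probability(x):
    """Predicted churn probability for a customer with x tickets per month."""
    return sigmoid(w * x + b)
```

The model itself is a dozen lines. Everything listed above (cleaning, integration, deployment, special-casing) is where the other 8 or 9 people on the team spend their time.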
This is going to remain the case IMO. Yes, there will be small teams, in highly resourced organizations (GOOG, FB etc), academic research labs, or occasional hard-tech startups, who do new model development. Maybe if AI becomes huge, you'll see more traditional Fortune 500s spin up similar efforts.
But there'll be a much wider set of people and businesses applying and tuning well understood approaches, rather than doing new model development. And you just don't need as many ML specialists for that approach.
Even with deep learning, the tooling will advance. I mean, even look at all the research papers describing applications at the moment - so many of them are using pre-trained models. Industry will be similar. The tooling will keep advancing, and you'll be able to do increasingly more with off-the-shelf pieces.
I think ML is absolutely going to have a big impact - I buy at least some of the hype. But should all developers, or even a substantial minority of developers, start learning ML as a career imperative? I don't think so.
Finally, it takes serious time to learn this stuff.
It's easy to dabble (and worthwhile doing - it's fun; and sometimes you can do powerful things using tools in a very black-box manner!). But actually thoroughly learning it takes time. It takes serious time to build statistical intuition, as just one example.
We could easily end up with a great many career developers who have a specialization in ML, frustrated they never get to use it.
Reading through the comments, I see that Cuban's statement upset and even angered several HN commenters. That is a strong emotional reaction.
I am not saying it is the 5 stages of grief, but the first 3 fit: denial, anger, bargaining, depression and acceptance.
Also from Howard Aiken: Don't worry about people stealing your ideas. If your ideas are any good, you'll have to ram them down people's throats.
The Luddites of the early 19th century thought they would never be replaced and continued on their trajectory.
A lot of white-collar jobs may be automated (or otherwise changed beyond recognition due to technology) after about thirty years maybe, but not three.
By all means, start learning actual machine learning methods too, rather than just statistical methods, but as I pointed out elsewhere on this thread: there is low hanging fruit everywhere.
Not nearly all of those will need "proper" machine learning, and even fewer will be willing to pay what it will cost to hire people with in-depth machine learning experience or pay the development costs or computational costs anytime soon. But a lot of them will buy into the buzzwords and look for a cheaper halfway-house or be open to pitches.
Perhaps a good time to be an ML grad coming fresh out of college.
This Edward project, for example, looks really interesting.
http://edwardlib.org/
