
Disclosure: I work at Google on Kubeflow

There are a few different reasons:

1) Most companies have very limited skill in building advanced models. While some folks are chasing an extra 5% of accuracy, MOST folks are trying to achieve ANY accuracy (since they have nothing today)

2) Many problems are not very complicated and do not require a custom model. From the blog post, the "cloud" example only requires a small number of changes to classify for a specific domain; building an entirely new model for that, or training on millions of images, seems like overkill (a rough sketch of that kind of tweak is below the references)

3) AutoML is (often) already better than humans[1]. So if you want that extra 5%, you MIGHT need to use a machine anyway.

[1] https://arxiv.org/abs/1712.00559
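
To make point 2 concrete, here's a rough sketch (PyTorch/torchvision) of the kind of small change I mean: reuse a pretrained ImageNet model and retrain only the final layer on a small domain-specific dataset. The folder path, class count, and hyperparameters are made up, and ResNet-50 just stands in for whatever architecture (Inception, etc.) you'd actually pick.

    import torch
    import torch.nn as nn
    from torchvision import datasets, models, transforms

    NUM_CLASSES = 5  # hypothetical number of domain-specific classes

    # Start from ImageNet weights; only the final layer gets retrained.
    model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    for param in model.parameters():
        param.requires_grad = False
    model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])
    # Expects a small labeled folder layout: my_domain/train/<class>/*.jpg
    train_data = datasets.ImageFolder("my_domain/train", transform=preprocess)
    loader = torch.utils.data.DataLoader(train_data, batch_size=32, shuffle=True)

    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    model.train()
    for images, labels in loader:  # single pass over the small dataset, for brevity
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()

For a lot of the simple domain-specific classification problems in point 2, that is basically the whole change.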




Exactly, most non-IT companies have very limited skills in this area. This is why they outsource. As an analogy, when I want to furnish my office, I look for an interior decorator who takes care of ordering furniture, etc., within a budget. I don't screw around with a 3D printing API to make chairs and tables. It is too low-level and not my core expertise.

And second, your comment that most folks are trying to achieve ANY accuracy is strange. Those are not real businesses with money to spend; they are mostly developers and hobbyists trying to learn. They sign up for Kaggle, poke at a few scripts, and watch half a Coursera course on ML. Most of the real businesses hire startups or large companies that employ data scientists with domain expertise in oil, manufacturing, etc. (the IBM model). ML as an API is a disaster as a business model.

Also, AutoML is not even close to being better than humans even for a specific problem (across datasets). These click-bait titles don't fly outside of AI conferences.


I think we may be talking to different customers. I talked with ~200 customers last year, and the most common question was "What do I use ML for?" and the second most common was "How do I get started?"

Put another way, the average customer has ~zero ML usage today. I'd guess that 95%+ of all businesses have ML usage today. Further guessing would say that <1% of ML users actually care about levels of accuracy beyond "it's better than the hacked-together set of rules/filters I use today." These are very large businesses with lots of money to spend on a solution.

There are many ways to measure "better", and AutoML does apply here. This includes "better == faster to train or develop" [1], "better == you need less data" [2], and "better == lower error rates" [3]. While I agree that many of these measures do not apply across datasets, most customers only have one dataset per problem.

[1] Predictive accuracy and run-time - https://repositorio-aberto.up.pt/bitstream/10216/104210/2/19...

[2] Less data - https://arxiv.org/abs/1703.03400

[3] https://link.springer.com/article/10.1007/s10994-017-5687-8

Disclosure: I work at Google on Kubeflow


Sorry, that should have read:

- The average customer has ~zero ML usage

- Nearly no customers are using any ML (difference between median and mode)

- Of those that are using it, very few care about better-than-human performance


I have actually worked with lots of customers on deploying ML; that was my perspective. Thanks for sharing your perspective from Google.


From the article: "There’s a very limited number of people that can create advanced machine learning models." -- Curious if this is really the case? It is certainly the case with my generation of engineers, but half the student interns I interview from top-20 comp sci programs do this on weekends for hackathons.

Is the argument that it is easy to implement stock models but hard to tune the models for specific types of image inputs? Isn't that pretty easily solved with some parameter grid searches? How much specialized skill does it take to re-do networks from a traditional Inception architecture or whatnot into something specific for hot dogs or satellite imagery or medical images?
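
To make the question concrete, here's roughly the kind of parameter grid search I have in mind; the scikit-learn MLP and toy digits dataset are just stand-ins for whatever model family and knobs you'd actually sweep, and the parameter ranges are made up.

    from sklearn.datasets import load_digits
    from sklearn.model_selection import GridSearchCV
    from sklearn.neural_network import MLPClassifier

    X, y = load_digits(return_X_y=True)  # toy stand-in for a real dataset

    param_grid = {
        "hidden_layer_sizes": [(64,), (128,), (128, 64)],
        "alpha": [1e-4, 1e-3, 1e-2],        # L2 regularization strength
        "learning_rate_init": [1e-3, 1e-2],
    }

    # Exhaustively tries every combination with 3-fold cross-validation.
    search = GridSearchCV(
        MLPClassifier(max_iter=500, random_state=0),
        param_grid,
        cv=3,
        n_jobs=-1,
    )
    search.fit(X, y)
    print(search.best_params_, search.best_score_)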


(I build machine learning models professionally)

*half the student interns I interview from top-20 comp sci programs do this on weekends for hackathons*

It's trivially easy to take what someone else has built and modify it slightly for a similar problem, especially in a hackathon environment where you can ignore edge cases, etc.

See if they can build a new model from scratch for a new type of problem. I'm not saying that AutoML can do this either, but I interview large numbers of PhDs who don't know where to start on doing something new.



