
Ask HN: Does ML research ever get translated to industry? - hsikka
It seems like industry doesn’t really use any of the recent advancements that come fresh from research, including things like capsule networks or advances in neural architecture search.

Is there a gulf between the release of cutting edge research and its commercialization? Couldn’t many of these models be commercialized faster and be made available as enterprise products?
======
dennybritz
Some reasons that come to mind:

1. Most of the advances do not result in large enough gains to justify
being translated into industry. 99.9% of research papers propose techniques
that result in small gains on the optimization metric (accuracy, ROC AUC, BLEU
score, etc.). However, this comes at the expense of added complexity,
more expensive training, model instability, challenges in code
maintainability, and so on. For the vast majority of companies, unless you are
Google AdWords or Google Translate, a tiny gain in metric X is not worth the
costs mentioned above. You're much better off using proven off-the-shelf
models that have stood the test of time, are fast to train, and are easy to
maintain, even if they are 1% worse.

2. Research tends to focus on model improvements, and you are not allowed to
touch your train/test data. That makes sense, as otherwise competing approaches
would not be comparable. However, in the real world you have the freedom to
collect more training data, clean your data, select more appropriate
validation/test data, and so on. The vast majority of the time, getting
better/cleaner/more data beats getting a slightly better model. And it's much
easier to implement. So for industry it often makes more sense to focus on
that.
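
The "more data beats a slightly better model" point can be sketched with a toy simulation (everything here is made up for illustration): estimating a slope from noisy points, where 10x more data shrinks the error far more than any small modeling tweak would.

```python
import random
import statistics

random.seed(0)

def slope_error(n_train):
    """Fit y ~ a*x by least squares on n_train noisy points and
    return the absolute error of the recovered slope (true a = 2.0)."""
    xs = [random.uniform(-1, 1) for _ in range(n_train)]
    ys = [2.0 * x + random.gauss(0, 0.5) for x in xs]
    a_hat = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return abs(a_hat - 2.0)

# Average over repeated trials so the comparison is stable.
err_small = statistics.mean(slope_error(20) for _ in range(300))
err_large = statistics.mean(slope_error(200) for _ in range(300))
print(err_small, err_large)  # error shrinks roughly as 1/sqrt(n)
```

No change of model class buys that kind of improvement as cheaply as collecting ten times the data does here.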

3. Metrics optimized in research papers rarely translate into real-world
business metrics, and many research ideas are overfit to those metrics and/or
datasets. For example, translation papers optimize something called BLEU
score, but in the real world the thing that matters is user satisfaction and
"human evaluations", which cannot easily be optimized in research. Similarly,
no business sells "ImageNet recognition accuracy". Research overfits to this
metric on this dataset (because that's how papers are evaluated), but it's not
obvious that a model doing better on this metric will also do better on some
other metric or dataset, even if they are similar. In fact, even datasets that
are known to contain errors are still used as-is, because they have always
been used.
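
To make the BLEU point concrete, here is a toy single-reference BLEU (real implementations, e.g. in NLTK or sacrebleu, add smoothing and multi-reference support; this is just a sketch of the idea): a one-word substitution that preserves the meaning still drops the score sharply, which is exactly the gap between the paper metric and user satisfaction.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(reference, hypothesis, max_n=4):
    """Toy single-reference BLEU: geometric mean of clipped n-gram
    precisions, multiplied by a brevity penalty."""
    log_precisions = []
    for n in range(1, max_n + 1):
        ref_counts = Counter(ngrams(reference, n))
        hyp_counts = Counter(ngrams(hypothesis, n))
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # real BLEU smooths this case instead
        log_precisions.append(math.log(overlap / total))
    bp = min(1.0, math.exp(1 - len(reference) / len(hypothesis)))
    return bp * math.exp(sum(log_precisions) / max_n)

ref = "the cat sat on the mat".split()
print(bleu(ref, "the cat sat on the mat".split()))  # 1.0
print(bleu(ref, "the cat sat on a mat".split()))    # ~0.54, though a human would call it fine
```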

~~~
jlelonm
Interesting - I replied to your twitter post about this but I’ll go ahead and
say it here as well:

If you were starting a PhD in CS/ML right now, and you wanted to be as useful
as possible primarily to industry (while still being impactful academically),
would you focus on the theoretical aspects of those weaknesses you mentioned
(e.g. model maintainability, complexity, etc.)?

~~~
hsikka
I’d also love to hear the answer to this :)

------
idan_pan
I think that an important obstacle is that industry and academia have
different goals and therefore different motivations. We (researchers from Palo
Alto Networks and Shodan) recently published a paper on machine learning
challenges in cyber security, aimed at academia:
[https://arxiv.org/abs/1812.07858](https://arxiv.org/abs/1812.07858)

Finding problems that would interest academia wasn't easy. Taking
algorithms based on them and turning them into a product will be
hard.

We explained why textbook machine learning fails on these problems and what
makes them interesting academically. We also provide data for academic use,
which enables academic research.

So far the response has been positive, so I hope we have found a way for academia-
industry cooperation.

------
tixocloud
When we talk about commercialisation, it almost always has to do with
improving business performance. When you’re thinking about making a change to
improve the performance of the business, you have to consider the benefit/risk
ratio of your change (in this case, swapping in a better model).

Say your model only increases performance by 1%. It’s unproven within the
business and hasn’t stood the test of time. Not to mention someone generally
needs to know how it works and maintain it. Someone needs to be responsible
for the change and be able to explain why the new model is better (and will
continue to stay better).

And generally businesses buy solutions, not releases of models. There’s a lot
more work that goes into commercialising a model as a product than
the model itself. I know because I’ve tried to do it before.

We are working on developing software that will let us be more agile with ML
and rapidly release new models to compete with existing models, which will
help us learn about the effectiveness of new modelling techniques and help us
build business cases.

------
p1esk
Capsule networks don’t work very well. Why would anyone want to use them in
industry?

~~~
malux85
and Neural Architecture search is (currently) too computationally expensive
for non-trivial problems

------
bhaskar75
Large enterprises need to have a faster research adoption cycle too. They need
to do their bit; it’s not all the researchers’ fault. At least in my org (a
very large conventional co.), our DL adoption is hardly beyond transfer
learning, using CNNs as feature extractors. We lament the lack of labeled
data, and there is tangible research in few-shot learning and semi-supervised
learning that could at least get us started. But we still don’t do it. I do
understand that it is sometimes hard to conceptualize a product around some
advance in ML, e.g. GANs or RL, but there are way too many others that we can
adopt without killing ourselves.
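
The "CNN as a frozen feature extractor" pattern the parent describes can be sketched in a few lines. Everything below is a made-up toy: the "backbone" is just a fixed random linear map standing in for a pretrained network, and only the small head is trained.

```python
import math
import random

random.seed(1)

# Toy stand-in for a pretrained backbone: a FROZEN random linear map.
# In real transfer learning this would be, e.g., a CNN pretrained on
# ImageNet with its weights kept fixed.
BACKBONE = [[random.gauss(0, 1) for _ in range(2)] for _ in range(4)]

def features(x):
    # Frozen feature extraction: never updated during training.
    return [sum(w * xi for w, xi in zip(row, x)) for row in BACKBONE]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp for numerical safety
    return 1.0 / (1.0 + math.exp(-z))

# Toy dataset: label is 1 when x0 + x1 > 0.
points = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(200)]
labeled = [(x, 1.0 if x[0] + x[1] > 0 else 0.0) for x in points]

# Train only the small "head" (logistic regression) on frozen features.
w, b, lr = [0.0] * 4, 0.0, 0.5
for _ in range(300):
    for x, y in labeled:
        f = features(x)
        p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
        g = p - y  # gradient of the log loss w.r.t. the logit
        w = [wi - lr * g * fi for wi, fi in zip(w, f)]
        b -= lr * g

acc = sum(
    (sigmoid(sum(wi * fi for wi, fi in zip(w, features(x))) + b) > 0.5) == (y == 1.0)
    for x, y in labeled
) / len(labeled)
print(f"train accuracy with frozen backbone: {acc:.2f}")
```

The appeal for an enterprise is exactly what this toy shows: the expensive part (the backbone) is reused as-is, and only a tiny, cheap, explainable model is fit in-house.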

Enterprises should perhaps have a team/group that tries to fill the gap
between research and adoption.

------
dendisuhubdy
I beg to differ. Some ideas related to training methods for deep neural
networks, e.g. binarized neural networks and asynchronous SGD, are used in
datacenter infrastructure; see [https://research.fb.com/wp-
content/uploads/2017/12/hpca-2018...](https://research.fb.com/wp-
content/uploads/2017/12/hpca-2018-facebook.pdf). Some of the companies that I
built in the past in Montreal use SOTA research for sound event detection, and
a dropout method that friends and I co-authored was used in production at an
NLP company (that I cannot disclose).

------
crazysci9
Have a look at [http://cKnowledge.org/rpi-crowd-
tuning](http://cKnowledge.org/rpi-crowd-tuning) and
[https://portalparts.acm.org/3230000/3229762/fm/frontmatter.p...](https://portalparts.acm.org/3230000/3229762/fm/frontmatter.pdf)
- they also discuss a growing gap between academia and industry, and suggest
some solutions.

------
mostafab
Because researchers have little incentive to commercialize their research,
especially in academia. They take pride in this insulation from the 'real
world'. That's the root cause of why research metrics differ from industry
metrics.

~~~
eghri
As someone who works in applied ML but tries to stay heavily on top of
research, I think it’s this. When I try to recruit people, talk with folks at
conferences, or even look for jobs myself, I am always stunned by how few
people care about commercialization. I’ve seen this attitude grow a lot in
corporate R&D in recent years too, as it pulls in more academics with promises
of independence from the “business grind”.

~~~
idan_pan
I'm not sure that it is just a lack of interest in commercialisation, though
that does exist. I think that academia and industry work on different scales.
In industry, low-risk, short-term projects have high value. Academia aims for
novelty, which usually means higher risk and longer terms. Companies that take
on high-risk, longer-term projects are usually either startups for which that
project is the core, or large companies that can afford to have 9 out of 10
projects fail and still recoup the investment from the others within 5 years.

