
Ask HN: Is there really a market for deep learning skills without a Ph.D? - meow_mix
Earlier I was intrigued by a crash course posted on here titled "Learn Tensorflow and deep learning without a Ph.D", linking to a GCP page here: https://cloud.google.com/blog/big-data/2017/01/learn-tensorflow-and-deep-learning-without-a-phd

Would those offering jobs related to deep learning really be comfortable offering a position building / using these kinds of models to someone without an advanced degree?

Those who have gotten a job in deep learning / machine learning without an advanced degree, could you share your experience?
======
brandonb
Yes! Deep learning is a young field. I have a B.S., for example, and have
published work at Interspeech (speech recognition conference) and NIPS
workshop (machine learning conference). You can too.

My advice:

    
    
      * 80% of machine learning is software engineering. If you're a strong software engineer, with some good math foundations (calculus, probability, matrix algebra), you can make significant contributions.
    
      * For deep learning, you can start with the crash course, but you'll eventually want to read a full textbook. If you read the new Goodfellow, Bengio, and Courville textbook and understand the details of the algorithms, then you'll be up to speed with the major ideas in deep learning today: http://www.deeplearningbook.org/
    
      * Find a company with existing deep learning experts who need software help -- that's most companies applying deep learning.
    
      * Read research papers. Some will be easy to read, some will be hard. If they're hard, read the references they cite. Eventually they'll get easier.
    

Let me know if you need help!

~~~
dorian-graph

      * 80% of machine learning is software engineering. If you're a strong software engineer, with some good math foundations (calculus, probability, matrix algebra), you can make significant contributions.
    

Do you have some examples of this? I'd love to have a look at some real world
ones.

~~~
brandonb
Not sure if this is what you're looking for, but Andrew Ng touched on a lot of
pragmatic parts of software engineering in machine learning teams at Deep
Learning School:
[https://www.youtube.com/watch?v=F1ka6a13S9I&t=3380s](https://www.youtube.com/watch?v=F1ka6a13S9I&t=3380s)

------
kozikow
I am a co-founder of a deep learning startup -
[https://tensorflight.com](https://tensorflight.com). Engineers without a PhD,
but with some deep learning experience, are useful on applied deep learning
teams like ours. Please get in touch at kozikow@tensorflight.com.

Training the model itself takes maybe 10-20% of overall engineering effort.
You need to have at least one person on the team who has cut their teeth on
training deep learning models. My co-founder says at least half a year of
full-time experience. I would say that there is nothing that would
fundamentally require a PhD, just that very few people without a PhD have
sufficient experience right now. What's most important is an intuition for
which method will help the most in your situation.

Within model training there are many tasks that can be off-loaded to someone
without heavy deep learning expertise - e.g. data augmentation or evaluation
statistics. Someone with deep learning experience is still required to know
which tasks will have the highest impact.

Apart from model training, there is lots of work on everything around it:
inference pipeline, web server, data gathering, hardware. The majority is
"plain engineering", but there are a few engineering skills specific to deep
learning - e.g. inference pipeline or hardware to train models. What's more,
even within a domain like computer vision, "classic" methods are sometimes
still more cost effective than deep learning.

~~~
searine
>What's most important is an intuition for which method will help the most in
your situation.

A PhD doesn't teach you a method, it teaches you this.

That intuition comes from years of specialized training that no crash-course
or bootcamp can teach.

~~~
kozikow
"deep learning bootcamp" wouldn't be sufficient.

On the other hand, training deep learning models was a very different world
two years ago. From a few examples in the field I know of, someone reasonably
talented with decent math fundamentals can catch up with the state of the art
in 6-12 months.

~~~
searine
>someone reasonably talented with decent math fundamentals can catch up with
the state of the art in 6-12 months.

Being able to parrot what is known is far, far, far different from being able
to ask what comes next.

I am talking about the training it takes to develop a critical and creative
mind that can ask and answer novel questions.

------
alexmlamb2
My background is that I did my undergrad at JHU, then did applied Machine
Learning research for two years at Amazon, and now I'm a PhD student in ML/DL.
I was very into ML during undergrad, so overall I have ~7 years of experience.

My view is that Machine Learning is a deep field which could take decades to
fully appreciate, but which is also easily accessible, especially if one is
interested in building applications.

I think that a good litmus test for ML expertise is asking someone what % of
all NIPS papers from last year they'd be able to understand well enough to
reproduce after a quick reading.

[https://nips.cc/Conferences/2016/AcceptedPapers](https://nips.cc/Conferences/2016/AcceptedPapers)

Even though I have ~7 years of experience doing ML, I would say that I could
probably only fully appreciate ~10% of all NIPS papers.

~~~
t3nary
How difficult is it to get into a position at a top company where you can do
ML research, without having a PhD?

And I'm also wondering about going back to university: Is it difficult to get
into a good ML PhD program if you've been in the industry for some years?

~~~
achompas
Very difficult, at least for research. Top companies with ML research labs (FB,
Google, Uber) are basically externalized academic labs. The heads of these
labs are simply bringing talent over from their former departments (NYU,
Cambridge/UBC, CMU).

It is not impossible to work at these labs _as an engineer_ without a PhD, but
I don't have a deep understanding of these roles.

------
cpkscpks
I've hired a person without even a bachelor's. They competed in a Kaggle
competition relevant to what we were working on, and were one of the top
competitors. I reached out, hired them, and it worked out well for all sides.
Eventually, they left to do a startup. But we got our money's worth, and they
got a decent salary for a while, and a decent spot on their resume.

I don't think the emphasis should be on Ph.D so much as on demonstrating your
competence. A thesis is a way to do that. Kaggle is another. A startup is
another. So is doing some interesting crunching on data sets and writing blog
posts or articles. Another way is to work up from Analyst to Data Scientist to
ML (or other jobs -- you might be able to start in ML at a startup at a low
salary).

But if you have no track record, you'll need to get one somehow. A single
course isn't sufficient. That's 1/32nd of a Bachelor's degree, 1/8th of
a Master's, or 1/25th of a Ph.D.

~~~
jdonaldson
If you hit Ph.D. level, courses are an afterthought, and no longer a
meaningful measure of the progress on your degree.

The situation reminds me of the first dotcom bubble, when folks were getting
hired to write web apps with little formal training. I can only imagine the
technical debt that is accumulating right now in the industry.

~~~
webmaven
_> folks were getting hired to write web apps with little formal training_

Yes, well, at the time, most people with formal training wouldn't go near a
web app project with a ten foot pole ("it'll all end in tears", they said).
And old-school Unix hackers versed in Perl weren't necessarily any more likely
to have a related degree.

 _> I can only imagine the technical debt that is accumulating right now in
the industry._

Don't worry about it. Most technical debt will get wiped out with the failure
of the company (usually for reasons unrelated to technical debt). Companies
that survive will do so despite the technical debt, and will have the
resources to rewrite things (hopefully avoiding Second System Syndrome).

OTOH, I can only imagine the future pain that low quality DL software patents
are going to cause when IP from failed companies gets sold off.

------
siliconc0w
Machine learning in most shops is in desperate need of better engineering
practices. PhDs are brilliant but they tend to only apply that brilliance
narrowly. They often don't really care about things like re-usability,
performance, testing, monitoring, or even version control. There is large
demand for 'model engineering' to basically help build pipelines to get them
data and to create reusable frameworks so models can be iterated on and
validated and deployed quickly and easily. It's easy to say the new model has
a better AUC than the old one but the true test is whether the model in front
of users actually does what it's supposed to do.
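
To make the offline comparison concrete, here is a minimal sketch of computing AUC (area under the ROC curve) for two models on the same held-out labels. It uses the pairwise-ranking definition rather than any particular library, and the labels and scores are made-up illustrative values, not data from a real system.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. This is O(n_pos * n_neg), which is fine for a
    small offline check but not for large evaluation sets.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical held-out labels and scores from an "old" and a "new" model.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
old_scores = [0.9, 0.4, 0.35, 0.8, 0.5, 0.2, 0.6, 0.7]
new_scores = [0.9, 0.3, 0.65, 0.8, 0.4, 0.2, 0.7, 0.5]

old_auc = roc_auc(labels, old_scores)
new_auc = roc_auc(labels, new_scores)
print(f"old AUC = {old_auc:.3f}, new AUC = {new_auc:.3f}")
```

A higher number here only says the new model ranks this particular held-out set better; it does not replace monitoring how the deployed model behaves for real users.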

------
verdverm
My answer is yes. I've worked in companies which have many engineers who work
with deep learning. No PhD needed.

There is a difference between doing deep learning research and building a
product powered by deep learning (with success in each category correlating
to some degree with having a PhD).
In my experience, the engineers are far, far better at building a product
which can create value in the market place. Deep learning algorithms /
architectures cannot do this alone. A product encapsulates a user experience
which is often completely separated from the particular learner powering the
experience. However, engineers without the understanding of basic ML practices
(which apply more generally) cannot build great products. (they tend to
violate ML theory, i.e. they make dirty data or draw causation where there is
correlation). You can see why Google is putting all of their engineers through
a 6 month ML course.

------
santaclaus
A Ph.D. is about publishing papers in top journals and managing academic
drama. If neither of those is your end goal, then if you can demonstrate an
ability to bring value with ML then a Ph.D. is definitely not required (maybe
half the ML group at our company has PhDs).

------
thomnific
I would echo another comment here and say that it depends. However (speaking
from my own perspective) depending on the discipline there's no specific need
to do a PhD in order to understand and apply machine learning.

It also depends on what you mean by "advanced degree". I have an undergraduate
degree in physics/math and a master's in economics/finance, and I find between
those two things I've been able to follow developments in machine learning and
also to apply them to my work. In fact, I used to get a bit annoyed with those
who would imply that I "must" do a PhD ... I would say that's certainly true
if I wanted to invent new estimators etc., but otherwise not so much. A PhD
can be great for other reasons but it's not the sort of thing required for
actually doing my job.

------
trelliscoded
I've had to build some relatively simple deep learning systems around
tensorflow, and I don't have any special training other than the usual
engineering math and statistics.

My observation is that it's much more important to be clever with identifying
possible inputs to train on rather than focusing too much on the machine
learning itself. A crappy ML implementation that was trained on 20 data sets
which are highly relevant and well curated does better than a highly tuned ML
system with half the inputs and bad outliers.

------
rinchik
A typical answer to any somewhat complicated question: it depends.

If you find use cases for your newly obtained knowledge, then you will be
able to secure a job. It's all about practicality. You most likely will get
hired to solve a specific problem, and if you are able to market yourself
and showcase how your skills can boost revenues, you'll be okay.

That's where a "domain-diverse" mindset is important. I don't think a Ph.D is
required anywhere outside of R&D. A Ph.D is certainly a plus, but not a
requirement.

------
sumodm
My 2 cents:

1. You don't need a Ph.D to build a product that uses DL/ML. Some part of your
work here could be to define the problem and to understand how to phrase the
problem so that it is solvable (if it's too easy, your moat won't be
technological). Your contribution here could be applying deep learning to new
applications. Specifically for applying DL in industry, someone with the
ability to quickly try out a lot of things is good to have (with some amount
of self-discipline).

2. You don't need a Ph.D to be part of a team that is taking on a hard DL/ML
problem. You will hopefully have leaders who can set the direction.

3. A Ph.D, like most degrees, is a label. If you can develop the skills of a
good researcher (like any other craftsman), learning to comprehend research
ideas quickly and keeping tabs on interesting ideas and recent progress, then
you could easily find connections and even publish good papers.

4. Now if you really want to understand what happens deep down (why DL works,
how it could be viewed as Tensor Decomposition, coming up with new
mathematical optimization, or drawing a new connection between sub-areas),
then having dedicated time to build those skill sets (aka a Ph.D) is very
helpful.

------
Dzugaru
I'm working in the field (and winning contests) without a bachelor's. As was
said here [0]

"Overall, machine learning systems can be thought of as a machine learning
core — usually an advanced algorithm which requires a few chapters from Ian’s
book to understand — surrounded by a huge amount of software engineering."

[0] [https://blog.gregbrockman.com/define-cto-openai](https://blog.gregbrockman.com/define-cto-openai)

So that's what I mostly do - write code, try new ideas :) Surely, I had to
refresh my knowledge in some areas like probability, and have a general
understanding of how the math works, but anyone can do that.

------
skadamat
Many have gotten research engineer jobs (vs. research scientist). Check out
the profiles of research engineers at places like Apple, Facebook, and OpenAI.
Knowing deep learning is an advantage in these roles even if the job is
primarily software engineering (not research / science).

------
tjpaudio
Definitely. I have a high-paying data science job at a major corporation with
only a 4 year degree in economics and got it after 2 years in the field
working analyst jobs. I took stats classes with my free electives, got
database skills from my first job and taught myself how to code along the way.
Many of my peers have advanced degrees and I can hang no problem. I will say
though, that the data science boot camps are not the answer. The applicants we
receive from them are generally weak and we have never hired any of them.

~~~
jayajay
you said none of the bootcamp people that applied were hired. out of those
bootcamp people, how many had bachelors, and what was the distribution of
their degrees? i.e. how many had degrees in CS, Math, Physics, Biology, etc.
Alternatively, out of the math degrees that applied (regardless of bootcamp),
how many were hired?

~~~
tjpaudio
Of the bootcamp people, everyone has had at least a bachelors, maybe weighted
towards less stem but that's certainly not the rule. I don't think I have a
good numbers backed answer for you, but I also don't think prior academics is
the issue. It's mostly people's ability to problem solve. We have no shortage
of applicants but it still takes us forever to find people that can do the
job.

~~~
jayajay
You mentioned people's "ability to problem solve", and that it "takes forever
to find people that can do the job". This is very interesting. How do you
measure "problem solving ability" in your applicant pool when all you have are
resumes? Do you simply look for points on the resume which reflect positive
improvement? Do you have more than just a resume to base your judgement off
of?

E.g. consider these two similar points on a resume: "improved loading times
tenfold by using compound indexes and denormalization" would reflect an
improvement from a problematic situation to a better situation. Whereas
"designed schema and implemented indexed database" does not explicitly name a
problem and its solution.

I hope my point makes sense, and I am wondering if little differences in the
wording make a big impact in how you perceive someone's ability to identify
and solve problems.

~~~
tjpaudio
We give a data set and ask some open ended questions. Out of 800 applicants,
maybe 50 complete the exercise, and then maybe less than 10 we'll want to talk
to based on what they sent. I spent maybe 5 hours on the dataset when I was
interviewing for the position myself.

~~~
jayajay
Wow, only 6% complete the task? I am testing my luck here, but do you have any
sample/link to a data-set or questions you have asked before, or maybe a link
to something similar to what you have asked?

~~~
tjpaudio
The dataset is private but the general gist is it's like 100,000 rows with
conversion and revenue data with time and 2 other dimensions. There are some
outliers (clearly errors) in the data and strong seasonality differences in
some dimensions, and data sparsity in others. The applicant isn't told about
these, but we look for them to find it in their analysis and control for it in
a predictive model they are asked to build based on the data.
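
As a rough illustration of the "outliers that are clearly errors" part, here is a minimal sketch that flags suspicious values with a modified z-score based on the median absolute deviation. The revenue numbers are invented for the example (the real dataset is private), the function name is hypothetical, and 3.5 is just a commonly used cutoff, not something stated in this thread.

```python
import statistics


def flag_outliers(values, threshold=3.5):
    """Return the indices whose modified z-score exceeds the threshold.

    Uses the median and the median absolute deviation (MAD), which are less
    distorted by the outliers themselves than the mean and standard deviation.
    """
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - median) / mad > threshold
    ]


# Invented daily revenue figures with two obvious data-entry errors.
daily_revenue = [120.0, 135.0, 128.0, 131.0, 99999.0,
                 125.0, 140.0, -500.0, 133.0, 127.0]

suspect = flag_outliers(daily_revenue)
print("suspect row indices:", suspect)
```

With strong seasonality, a check like this would probably need to run within each season or on residuals after removing the seasonal pattern; otherwise legitimate peaks can look like errors.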

------
DrNuke
In real life, having a Ph.D. means you might find, test and put forward a new
and original case for your industry by using deep learning or whatever new
tool. Without a Ph.D. and the burden of originality, the classical route
stands: apply wherever you can, prepare for interviews, await response until
you get an invitation and hopefully an offer.

------
earthly10x
There's always going to be a market for results. Better results, better
recommendations, predictions and relevancy. These are the battle lines.
Advance these and you're on the team, at least my team. Nothing beats
side-by-side comparisons despite the reputations industry or academia would
like to safeguard.

------
mrmaximus
Ask yourself this: Is there a market for database developers that don't have a
Ph.D in Relational Algebra? "AI" will be democratized very quickly just like
data (SQL, Excel, etc.) has and solving business problems with technology will
be more important than ivory tower tinkering. That being said, if you're
interested in Deep Learning... do yourself the favor of truly understanding
what is happening with backpropagation and eventually learn to code networks
by hand, from memory; simply to aid your own intuition.
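
For anyone wondering what "by hand" might look like, here is a minimal sketch: a tiny network with one tanh hidden layer, the chain rule written out explicitly, and a finite-difference check on one weight. The layer sizes, input, and target are arbitrary choices for illustration, and it uses only the standard library so nothing is hidden behind a framework.

```python
import math
import random


def forward(params, x):
    """One tanh hidden layer and a linear output unit."""
    w1, b1, w2, b2 = params
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    y_hat = sum(w * h for w, h in zip(w2, hidden)) + b2
    return y_hat, hidden


def loss(params, x, y):
    y_hat, _ = forward(params, x)
    return 0.5 * (y_hat - y) ** 2


def backward(params, x, y):
    """Gradients of the squared-error loss, derived with the chain rule."""
    w1, b1, w2, b2 = params
    y_hat, hidden = forward(params, x)
    d_out = y_hat - y                      # dL/dy_hat
    g_w2 = [d_out * h for h in hidden]     # dL/dw2
    g_b2 = d_out                           # dL/db2
    # Back through tanh: d tanh(z)/dz = 1 - tanh(z)^2
    d_hidden = [d_out * w * (1.0 - h * h) for w, h in zip(w2, hidden)]
    g_w1 = [[d * xi for xi in x] for d in d_hidden]
    g_b1 = d_hidden
    return g_w1, g_b1, g_w2, g_b2


random.seed(0)
w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
b1 = [0.0, 0.0, 0.0]
w2 = [random.uniform(-1, 1) for _ in range(3)]
b2 = 0.0
x, y = [0.5, -1.2], 0.7

g_w1, g_b1, g_w2, g_b2 = backward((w1, b1, w2, b2), x, y)

# Gradient check: nudge one weight and compare with a central difference.
eps = 1e-6
w1_plus = [row[:] for row in w1]
w1_minus = [row[:] for row in w1]
w1_plus[0][0] += eps
w1_minus[0][0] -= eps
numeric = (loss((w1_plus, b1, w2, b2), x, y)
           - loss((w1_minus, b1, w2, b2), x, y)) / (2 * eps)
analytic = g_w1[0][0]
print(f"analytic = {analytic:.8f}, numeric = {numeric:.8f}")
```

If the two printed numbers disagree, the hand-derived gradient is wrong somewhere, which is exactly the kind of feedback that builds intuition.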

~~~
cr0sh
> "AI" will be democratized very quickly...

Honestly, I think we're already there, with frameworks like TensorFlow/Keras,
Torch, and others.

As a software engineer with 25+ years of experience, these frameworks take a
ton of the pain out of writing ML applications (particularly neural networks,
which is where they are mainly focused). They also make it easy to integrate
GPU and other multi-core-based training into the mix with almost zero effort.

------
nabla9
Having good working knowledge in numerical programming, signal processing,
statistics etc. transfers relatively well into data science and deep learning.
But if you see a graph like this:
[http://imgur.com/a/bxR2x](http://imgur.com/a/bxR2x) and don't immediately
recognize what kind of process generates it, you will find it difficult to
understand and analyze why things work and why they don't work. The ability to
read a difficult research paper, then reproduce and implement it, might be a
good test of your skills if you doubt them.

In most data science/mining/analytics companies there are PhDs working as
chief scientist, senior analytical lead etc. Below that there can be junior
analytic positions that are more programming oriented, but they require a good
mathematical background. There are also junior positions for programmers who
do mostly programming as part of the team. In larger companies there are
senior engineer positions that concentrate on numerical programming and
implementing company specific algorithms etc. Understanding the terminology
and software is very valuable and helps getting into these positions (software
developer, engineer) even without deep domain knowledge. Someone who knows the
ins and outs of low level graphics and game programming might be a valuable
asset, and their knowledge might transfer.

~~~
jnbiche
Is the graph specific to deep learning, or do you just mean something like a
bimodal distribution or some other statistics concept?

~~~
nabla9
It's the probability distribution of an oscillating function (sine wave) with
some noise. It's just one test to see if one has working knowledge.

Recognizing distributions you have seen in the book is book knowledge. Seeing
a distribution and being able to mentally see what kind of thing it might be
when drawn on x,y axes, and figuring it out, is working knowledge.
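
A quick way to see the shape being described is to simulate it. This is a minimal sketch that assumes the sine wave is sampled at a uniformly random phase and that the noise is Gaussian with a small standard deviation; both are assumptions for illustration, since the linked image does not state them.

```python
import math
import random


def histogram(samples, bins, lo, hi):
    """Count samples into equal-width bins, clamping values outside [lo, hi)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for s in samples:
        idx = int((s - lo) / width)
        idx = min(max(idx, 0), bins - 1)
        counts[idx] += 1
    return counts


random.seed(1)
n = 50_000
# Assumed process: sin(phase) with uniform phase, plus small Gaussian noise.
samples = [
    math.sin(random.uniform(0.0, 2.0 * math.pi)) + random.gauss(0.0, 0.05)
    for _ in range(n)
]

counts = histogram(samples, bins=20, lo=-1.2, hi=1.2)
for i, c in enumerate(counts):
    left = -1.2 + i * 0.12
    print(f"{left:+.2f} {'#' * (c // 200)}")
```

The text histogram piles up near -1 and +1 with a dip in the middle, because a sine wave spends most of its time near its extremes. A different distribution for the phase would give a different picture.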

~~~
thw198990
The distribution of sin(X) depends on that of X. It can even be made to look
Gaussian (well other than the tail).

Your post strikes me as annoyingly pretentious.

------
g_delgado14
I'd assume there are a lot of menial tasks within the field that don't require
years of academic experience. Not to mention one can learn on the job and get
assigned increasingly complex tasks as the candidate proves his / her ability
to learn quickly. Also, the salary of this type of candidate is probably lower
in the long run due to the missing credentials and formal education.

------
meow_mix
Thanks for so many positive responses everyone! Seems like the outlook on this
question is brighter than I had anticipated.

------
deepnotderp
Yes! There are many companies (including mine!) who have DL experts but need
software engineers. Ping me if interested :)

~~~
tempw
Could you share your company?

~~~
deepnotderp
sixsamuraisoldier [at] gmail [dot] com

Yes, we allow remote work. You'll mainly be implementing fast deep learning
algorithms such as FFT convolutions, quantization, etc.
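
For readers unfamiliar with the term, here is a minimal sketch of FFT-based convolution checked against the direct definition. It is a plain-Python, 1-D illustration with made-up inputs and a textbook recursive FFT, not the implementation this company uses and nowhere near what a fast production kernel would look like.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(a) must be a power of two.

    With invert=True the result is the unnormalised inverse transform.
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_convolve(x, h):
    """Linear convolution: zero-pad, multiply spectra, transform back."""
    out_len = len(x) + len(h) - 1
    size = 1
    while size < out_len:
        size *= 2
    fx = fft([complex(v) for v in x] + [0j] * (size - len(x)))
    fh = fft([complex(v) for v in h] + [0j] * (size - len(h)))
    product = [p * q for p, q in zip(fx, fh)]
    result = fft(product, invert=True)
    return [v.real / size for v in result[:out_len]]


def direct_convolve(x, h):
    """Reference O(len(x) * len(h)) convolution straight from the definition."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            out[i + j] += xv * hv
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [0.5, -1.0, 0.25]
fast = fft_convolve(signal, kernel)
slow = direct_convolve(signal, kernel)
print([round(v, 6) for v in fast])
```

The payoff of the FFT route is that cost grows roughly as n log n rather than with the product of the two lengths, which matters for large kernels. Note that the "convolution" in most deep learning frameworks is actually cross-correlation, so the kernel would be flipped before applying this.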

------
sytelus
It depends on what you want to do. Training and testing models with deep
learning doesn't require a PhD. Designing network architectures, debugging and
pushing a model to its limits, on the other hand, requires significant insight
into the theoretical foundations. Doing something novel like GANs would
require even more formal background. However, I would say anyone with
undergraduate math (linear algebra and a bit of calculus) can learn all of
this if they are determined and willing to invest a couple of years of
intensive study time. PhD programs just facilitate this more easily, along
with a certificate of approval from experts in the field that others can trust
about your skill sets.

------
jmcmahon443
I have met a lot of companies doing natural language processing, who are
looking for BS and MS.

~~~
webmaven
Likely that is actually a filter for experience in the NLP domain, which
you aren't likely to get in most garden-variety SWE positions.

------
deepnotderp
And yes, I published without a PhD as well.

