
Ask HN: Is deep learning obsession in college ill founded? - muazzam
Background: I'm a CS junior (about to become a senior), and in our last year, people choose a capstone project that they work on for the entire year. For some years now (say, since 2017), deep learning projects have completely dominated the other projects, both in number and in the awards they go on to win. I understand the principles behind it, even find it cool, but the whole inscrutable nature of it is problematic to me.
======
ageitgey
Deep learning is what is cool in CS right now. It lets you do new things that
you couldn't do before. Based on that, it's going to be over-represented in
projects by undergrads looking to do "cool" projects and show off their new-
found skills.

But that isn't really a problem. In most cases, the projects you do as an
undergrad don't affect your professional life in any way after you get your
first job. Very, very few undergrad projects turn into real projects that
anyone uses after the student graduates.

So don't worry too much about it. Ten years ago, every senior project was an
app. Twenty years ago, every senior project was a website. It's just a sign of
the times and doesn't matter in the long run.

------
rvz
> I understand the principles behind it, even find it cool, but the whole
> inscrutable nature of it is problematic to me.

Spot on.

On top of that, we have 'AI' models getting fooled by adversarial attacks that
involve changing a single pixel. As long as these issues go untackled or
under-researched, we'll pretty much be heading into another AI Winter, and the
hype cycle will go through its trough-of-disillusonment phase. The
uninspectable, black-box nature of such deep-learning systems is why highly
regulated industries where lives are at stake, such as healthcare and other
safety-critical fields, label deep-learning solutions as unsafe.

Sure, all you see right now are other students and startups 'applying' deep
learning everywhere, but they are hardly advancing the field the way DeepMind
and OpenAI are. In terms of learning, it's something good to learn as a
student in college, but creating an AI startup now requires using Google's,
Amazon's, or Microsoft's data centers for training, which is clearly not
sustainable anyway.

Security related projects and research are always where it's at.

~~~
visarga
> As long as these issues go untackled or under-researched, we'll pretty much
> be heading into another AI Winter, and the hype cycle will go through its
> trough-of-disillusionment phase.

It's not all or nothing, as you present it. ML models can be useful even if
they are imperfect - and we should not forget humans aren't perfect either.
For example, a model could cut in half the time needed to enter an invoice
into the database. It's imperfect, yet useful.

A model need not run alone without any safety net. It can have plain old
programming rules to validate its outputs, or a human in the loop.
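The pattern described above can be sketched in a few lines. Everything here is hypothetical (the field names, the model, the validation rules); it just shows plain programming rules gating a model's output, with a human fallback when they fail:

```python
# Sketch: wrapping an imperfect model's output with plain validation rules
# and a human-in-the-loop fallback. Field names and "model" are made up.

def validate_invoice(fields):
    """Plain old programming rules applied to a model's extracted fields."""
    errors = []
    if fields.get("total", -1) < 0:
        errors.append("total must be non-negative")
    if not fields.get("invoice_id"):
        errors.append("missing invoice_id")
    return errors

def process_invoice(model_extract, raw_invoice, ask_human):
    """Run the model, validate, and fall back to a human when rules fail."""
    fields = model_extract(raw_invoice)
    if validate_invoice(fields):          # any rule violated?
        return ask_human(raw_invoice)     # human in the loop
    return fields                         # model output accepted

# A fake "model" that mis-reads the total triggers the human path.
fields = process_invoice(
    model_extract=lambda inv: {"invoice_id": "A-17", "total": -3.0},
    raw_invoice="...",
    ask_human=lambda inv: {"invoice_id": "A-17", "total": 3.0},
)
print(fields["total"])  # 3.0
```

Even a model that is wrong some fraction of the time is useful in this setup, because the rules decide which outputs are trusted.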

> Sure, all you see right now are other students and startups 'applying' deep
> learning everywhere, but they are hardly advancing the field the way
> DeepMind and OpenAI are.

On the contrary, I would say that what DeepMind and OpenAI are doing is
largely irrelevant for industry. There is a huge number of domains where no ML
model has been created, and that is because there are so few people who can
make them. The low hanging fruit hasn't been picked yet. It's like electricity
at the beginning of the 20th century. The work these students and startups are
doing is the good, useful work. You don't need DeepMind grade models to solve
most real problems.

> creating an AI startup now requires using Google's, Amazon's, or Microsoft's
> data centers for training

You can train most useful models on a single machine today. Some, like
Logistic Regression, train in seconds or minutes. Others take an hour, or a
day. Some heavy ones take a week to train. If you don't do hyper-parameter
search or cutting edge research you only need a few runs to get a working
model. It's data tagging that usually takes months or years.
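To illustrate the "seconds or minutes" claim, here is a minimal sketch (assuming scikit-learn and a synthetic dataset) that trains a logistic regression on one machine and times it:

```python
# Sketch: a useful model trained on a single machine, no data center needed.
# Synthetic data stands in for a real (already-tagged) dataset.
import time
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

start = time.time()
clf = LogisticRegression(max_iter=1000).fit(X, y)
elapsed = time.time() - start

print(f"trained in {elapsed:.2f}s, accuracy {clf.score(X, y):.2f}")
```

The expensive part in practice is getting the 5000 labeled rows, not the fit.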

------
_bxg1
Personally I think deep-learning is a bubble, and it will soon collapse to its
natural place in computer science. Which is not to say that it's a fad that
will disappear, only that it will retreat to being just a regular tool among
the many tools we have for solving different kinds of problems. Its
inscrutable nature is definitely problematic for some use-cases, and not so
problematic for others.

~~~
sin7
I've been doing the data thing for a while. During one of my defenses of R,
someone brought up that R was a black box: if you programmed in R, you were a
user who just filled in the correct function arguments and it spat out the
answer. That was when my thoughts on machine learning changed.

The vast majority of us are users. We massage the data into a certain shape,
then feed it through a machine that someone else created. We can change the
parameters. We can change the data. But few of us are ever going to look into
the code of a random forest function.
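That "user" workflow looks something like this sketch (assuming scikit-learn and synthetic data): all the visible work is choosing arguments to a function somebody else wrote.

```python
# Sketch of the workflow described above: shape the data, then fill in the
# arguments of a random forest someone else implemented.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# The "programming" is the argument list; the algorithm itself is opaque.
clf = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=0)
clf.fit(X, y)
print(round(clf.score(X, y), 2))
```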

I've switched tracks and started doing web development. Playing with the
hyperparameters in machine learning is no different from changing the feel of
a dropdown by changing the colors, fonts, and other things to fit a certain
aesthetic.

I could be wrong, but I have yet to meet anyone who calls themselves a data
scientist and has done anything beyond using packages created by others. I
think that opens the field up to becoming just another tool, no different from
Excel.

~~~
ishjoh
Years ago, on the first ML hype wave, I completed the excellent MOOC by Andrew
Ng. In that course he did go through the math, and it was helpful to me in
understanding what was going on under the hood, but even then the value of ML
wasn't in understanding what the model was doing, but in understanding whether
your model was doing something well. I think your take is right that using
packages created by others will be mostly what we do moving forward, and
that's also true of pretty much all software development.

------
uoaei
Not ill-founded so much as jumping the gun.

To understand why neural networks work, you will have to understand how a
whole host of smaller, simpler ML models work in _excruciating detail_.
Multiple linear regression, logistic regression, etc. What they mean, how they
work, what's really going on "inside", what the underlying probabilistic model
represents, etc.

Neural networks are great because it takes basically all of those smaller
ideas and concatenates them into a super flexible statistical machine. It's
really cool to see the "in->out" but it's even cooler once you have a good
grasp on what's going on in the intermediate steps.

In my experience, almost no one working with neural networks has those details
down. This goes 100-fold for non-research roles. They learned the Keras API
and are happy stacking layers, and as long as the output looks nice they push
to production. For most cases empirical validation is probably enough, because
NNs can usually achieve _some_ incremental improvement just by virtue of
having so many damn degrees of freedom. But to get a well-performing,
well-founded model, you need to know the ins and outs.

------
deuslovult
I'm an ML engineer, and I agree with you: deep learning is by far the most
common approach for new problems in informatics.

Imo deep learning is so popular because it "works". For a classification
problem, if you try both a linear baseline and a deep learning model, and you
do a reasonable job of hyperparameter tuning and experimental design, the deep
model will likely outperform the simpler one. This holds true across many
problem spaces.
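A minimal sketch of that comparison (assuming scikit-learn, on a deliberately non-linear toy problem where the gap is visible):

```python
# Sketch: linear baseline vs. a small neural network on a non-linear
# classification problem. Dataset and architecture are arbitrary choices.
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=2000, noise=0.2, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

linear = LogisticRegression().fit(Xtr, ytr)
mlp = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=2000,
                    random_state=0).fit(Xtr, ytr)

print(f"linear: {linear.score(Xte, yte):.2f}, mlp: {mlp.score(Xte, yte):.2f}")
```

On a problem this non-linear the MLP wins comfortably; whether the gap survives a properly tuned baseline is exactly the dispute in the replies below.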

I think the issue is that modern DL frameworks make it a little too easy to
get pretty good performance on new problems. Other techniques generally
require more background knowledge to make reasonable modeling assumptions, and
still frequently perform worse than a naively applied DL approach.

I think DL will remain, in practice and education, a very popular tool. But it
is essential to learn traditional statistical inference and other background
to appropriately contextualize DL models so it isn't just some form of black
magic.

~~~
mattkrause
A lot of those comparisons strike me as shaky.

It's easy to beat a naive logistic regression model with a good neural
network, but the gap often closes once you start trying to tune the logistic
model too. (And it's not like the neural networks aren't tuned either--
architecture search, data augmentation, etc).

Recent review on medical data:
[https://www.sciencedirect.com/science/article/abs/pii/S08954...](https://www.sciencedirect.com/science/article/abs/pii/S0895435618310813)

~~~
deuslovult
Logistic regression is exactly a NN with no hidden layers and a sigmoid
activation function. A feedforward NN with additional layers is strictly more
expressive than logistic regression.
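That equivalence can be checked numerically in a few lines (assuming scikit-learn and numpy): the fitted model's probabilities are exactly sigmoid(w·x + b), i.e. a network with zero hidden layers.

```python
# Sketch: logistic regression recomputed by hand as a one-layer "network"
# with a sigmoid activation; the probabilities match sklearn's exactly.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
clf = LogisticRegression().fit(X, y)

logits = X @ clf.coef_.ravel() + clf.intercept_[0]  # w.x + b
by_hand = 1.0 / (1.0 + np.exp(-logits))             # sigmoid activation

print(np.allclose(by_hand, clf.predict_proba(X)[:, 1]))  # True
```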

~~~
mattkrause
Yes! The million dollar question is how much of that expressivity is actually
required.

In many papers, the "baseline" logistic regression model is very stripped
down: y~logit(.) but the neural network has had its expressiveness optimized
in various ways. People aren't comparing against a 3-layer feedforward network;
there's augmentation and pre-training, architecture search and special
learning schemes.

My claim is that if you want to claim that a problem _needs_ the expressivity
that (only) a neural network provides, you ought to be devoting a great deal
of effort to the logistic regression model too. Make it a steelman, rather
than a strawman, if you will.
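A sketch of what "steelmanning" the baseline might look like (assuming scikit-learn; the feature expansion and search grid are illustrative choices): give the logistic model some tuning effort before declaring the neural network necessary.

```python
# Sketch: naive logistic regression vs. a tuned one (feature expansion plus
# a regularization search) on the same non-linear toy problem.
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

X, y = make_moons(n_samples=2000, noise=0.2, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

naive = LogisticRegression().fit(Xtr, ytr)

# The "steelman": polynomial features + grid search, i.e. the same kind of
# effort usually spent on the neural network.
pipe = make_pipeline(PolynomialFeatures(), LogisticRegression(max_iter=5000))
grid = GridSearchCV(pipe, {"polynomialfeatures__degree": [2, 3, 4],
                           "logisticregression__C": [0.1, 1.0, 10.0]})
tuned = grid.fit(Xtr, ytr)

print(f"naive: {naive.score(Xte, yte):.2f}, tuned: {tuned.score(Xte, yte):.2f}")
```

Much of the headline gap between "logistic regression" and "deep learning" can close this way, which is the point: compare tuned against tuned.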

------
poulsbohemian
22 years ago when I was in your shoes, distributed systems were the topic of
the day, and all of us were going to be building systems with CORBA and
DCOM... so guess what my project and paper were about? That's right, things I
never touched in my career, but darn it if they didn't help me get my first
job because they were hot topics of the moment.

So, pick something in "AI" that is the hotness of the moment, learn what you
can, do your best, and then get on with life and career.

------
md2020
My situation is the same as yours: I'm a CS junior heading into my capstone
project next semester, and my opinion is a resounding yes. The deep learning
obsession is almost certainly a hype bubble. I have observed the same here at
my university; the "But what if we did it with deep learning?" projects are
almost reaching meme status.

It's rather disheartening as someone who actually is interested in AGI, but
I've been driven away from wanting to pursue the field since the current
research seems lacking in ambition and substance. My previous summer
internship had me reading a lot of deep learning papers on arXiv, and the vast
majority of them seemed to be tweaking a single parameter in a DNN, achieving
a 0.3% increase in score on an arbitrary benchmark, and calling it a
meaningful result.

I'd personally like to see more people doing the kind of work DeepMind does,
which seems to actually achieve breakthroughs informed by knowledge from
neuroscience, but I have a feeling we won't see that anytime soon, since
DeepMind gets its pick of the best researchers in the world.

I'm just an undergrad though, and would love to hear the opinions of more
knowledgeable people! Specifically, I'd like to hear arguments against the
sentiment that "deep learning right now is pretty much alchemy". How is the
work in deep learning helping us understand the nature of intelligence, rather
than just helping Facebook and Amazon better target advertisements and product
recommendations?

~~~
visarga
> How is the work in deep learning helping us understand the nature of
> intelligence

Neural network performance on a problem is a benchmark of that problem's real
difficulty. It gives us insight, a new perspective.

In millennia of deliberation, what have philosophers discovered about the
nature of intelligence? And then... a neural net beats us at all board games,
another can solve differential equations, another can translate, another can
see, and so on. Have we really not learned anything from these inventions?

Another advantage of DL is that it frames the problem of intelligence in
mathematical concepts and rigorous evaluation.

I, for one, have reconsidered all my spiritual beliefs after learning about
the agent-environment-reward model of reinforcement learning. A new way of
framing the agent and life, so parsimonious and powerful. And it does not
require a soul, or a god, or anything outside our real environment, and yet
can explain so much.
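The agent-environment-reward framing really is that compact. A minimal sketch (a toy environment and agent, not any real RL library) showing the entire loop:

```python
# Sketch: the agent-environment-reward loop of reinforcement learning,
# with trivial stand-ins for both sides.
import random

class Environment:
    """A tiny world: the agent should act with the hidden target value."""
    def __init__(self):
        self.target = 3
    def step(self, action):
        reward = 1.0 if action == self.target else 0.0
        observation = self.target  # fully revealed after acting
        return observation, reward

class Agent:
    """Acts randomly at first, then repeats whatever it last observed."""
    def __init__(self):
        self.belief = None
    def act(self):
        return self.belief if self.belief is not None else random.randint(0, 9)
    def learn(self, observation, reward):
        self.belief = observation

env, agent = Environment(), Agent()
total = 0.0
for _ in range(10):
    obs, reward = env.step(agent.act())
    agent.learn(obs, reward)
    total += reward
print(total)  # at least 9.0: after one step the agent always hits the target
```

Everything in the paradigm (policies, value functions, exploration) is elaboration on this one loop.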

The whole machine learning paradigm is another powerful concept through which
we can understand how we might function. Previously you might wonder how
emotion, thought, sensation, imagination and will relate to each other. Now we
can understand how they might be implemented and wired together, and what
principles support their function.

~~~
md2020
> And then .. a neural net beats us at all board games, another can solve
> differential equations, another can translate, another can see, and so on.
> Have we really not learned anything by these inventions?

I would still argue no, we haven't learned anything about intelligence from
these. They are impressive achievements, but strictly in the sense of "We
found a way to use computers in a way we were not using them before".

1.) Neural nets + MCTS beat us at all board games--board games that humans
invented and can achieve mastery at. If a human Go player was born who could
beat that version of AlphaZero, we would not say that person has solved
intelligence.

2.) Differential equations: also invented/discovered by humans, can also be
solved by humans

3.) Translate and see: See above, with the additional caveat that humans
actually are better at translating and seeing than deep learning systems are.

In addition, these were all achieved individually by systems with different
architectures and massive amounts of training data that would amount to
several human lifetimes. An 18 year-old human can play board games, drive a
car, do differential equations, and learn multiple languages, with a single
brain using a generalized structure and a fraction of the "training data"
afforded to DL. This indicates to me that ML as a whole is still very far off
the mark of General Intelligence.

> Previously you might wonder how emotion, thought, sensation, imagination and
> will relate to each other. Now we can understand how they might be
> implemented and wired together, and what principles support their function.

Previously? This is still an unanswered question. Show me where deep learning
research has even come close to producing a system that can learn and adapt
like a human mind does.

------
Irishsteve
It's the current trendy technology (for a lot of good reasons), so it's only
normal that students gravitate towards it.

It's not ideal - but if it wasn't DL it would be another topic / application.

------
debbiedowner
It's a well funded research area and in many cases outperforms other
solutions, so it's not going anywhere in Unis for the next decade at least.

It's not completely inscrutable as to why it works; follow the thread of deep
learning theory research by starting from the names here:
[http://www.vision.jhu.edu/tutorials/CVPR17-Tutorial-Math-Dee...](http://www.vision.jhu.edu/tutorials/CVPR17-Tutorial-Math-Deep-Learning.htm)

------
en4bz
> During the gold rush it's a good time to be in the pick and shovel business

~~~
makapuf
Sure, nvidia is doing pretty well indeed.

------
AnimalMuppet
For your purposes, it doesn't matter. You can do your capstone in the current
hotness, or not, as you please. If you do it in deep learning and in five
years deep learning is passe, it won't matter. You'll have your degree and be
four years into your career. Or you'll be four years into grad school. You'll
be fine. (In this scenario, if your grad school is in deep learning, you
_won't_ be fine. Think harder about that choice than about your capstone.)

Given all that you've said, a capstone that tries to dent the "inscrutable
nature" of deep learning might be an interesting choice.

------
omarhaneef
I think deep learning works better than simple linear regression because we
have already succeeded with simple linear regression wherever we could, while
we have only just started with "deep" learning. And the best part is that as
new computers come out, the deeper we can learn.

I will point out that the real win is with new data sources, and simple linear
regressions may still work there.

------
burfog
Possible projects that might keep you occupied for the entire year:

Make a Rust front-end for the GNU Compiler Collection.

Emulate something.

Write a hypervisor.

~~~
muazzam
Thanks, it helps. I'm actively looking for project ideas.

------
sys_64738
"Deep learning" is the latest buzzword to get all the dollars nowadays. In a
past decade corporations used to even fund Second Life for use at work.

------
Kenji
The most important thing is this: You pick a problem you want to solve, you
pick the tools you want to solve that problem with, and one of the tools in
your bag of tools is deep learning which you may end up using. Do it in that
order. Do not pick deep learning and then try to solve everything with deep
learning, that's putting the cart before the horse. That's all I can say about
things like deep learning, blockchain, etc. Let the problems lead you to the
solutions, not the other way around.

