
Foundations of deep learning - aidanrocke
https://github.com/pauli-space/foundations_for_deep_learning
======
Smerity
With all due respect, this is quite a random reading list, and appears to be
more the result of the 'if "deep learning" in post.title: post.upvote()'
trend on Hacker News ...

I will pick on two under the "Classics" section simply as I know the author
and have used their work, so am in no way saying the work isn't useful (it
certainly can be in the right spot!), but it isn't "classic" or what I'd
recommend for early readers at all. "Uncertainty in Deep Learning" and
"Dropout as a Bayesian Approximation" were both published within the last
year and a half, and are a PhD thesis + paper on Bayesian interpretations of
neural networks. "Classic" for a paper + thesis less than two years old is
quite a stretch even for the fast-moving field of deep learning. The same
holds true for many of the other papers in "Classics", such as batch norm,
which is (a) recent, (b) certainly not the only such technique (see layer
norm, recurrent batch norm, ...), and (c) has complications in
implementation[1].
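
To make (c) concrete, here's a rough sketch of the train/test asymmetry at
the heart of batch norm (plain numpy, not any particular framework's API;
the names here are mine, not from the paper):

```python
import numpy as np

def batch_norm(x, gamma, beta, running_mean, running_var,
               training, momentum=0.1, eps=1e-5):
    """Toy batch norm over a (batch, features) array."""
    if training:
        # Train time: normalize with *batch* statistics and keep
        # running averages up to date for later use at test time.
        mean = x.mean(axis=0)
        var = x.var(axis=0)
        running_mean[:] = (1 - momentum) * running_mean + momentum * mean
        running_var[:] = (1 - momentum) * running_var + momentum * var
    else:
        # Test time: batch statistics are meaningless for a batch of
        # one, so use the stored running estimates instead.
        mean, var = running_mean, running_var
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```

Most of the pitfalls in [1] trace back to that `training` flag: forget to
flip it, or let the running estimates go stale, and the network silently
misbehaves at inference.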

As the simplest example, why is the original dropout paper[7] not under
classics? It's an elegant paper, fundamentally important for current neural
networks, and far more of a classic than "Dropout as a Bayesian
Approximation" or "Dropout Rademacher Complexity of Deep Neural Networks",
both of which are listed.

I'm also highly dubious of the noted neuroscience connection - most deep
learning researchers use very little from neuroscience.

Again, this list may be helpful to the creator of the repo and tailored
toward their specific research direction, but it is not useful for readers
from Hacker News or those aiming to get their start in deep learning. Why so
many upvotes? Zero comments? Zero discussion?

If you want a book, check out the Deep Learning book[2]. If you want a course
for RNNs, check out CS224d[3]. If you want a course for CNNs, check out
CS231n[4]. If you want to get down and dirty in a practical software
engineering way, check out Fast AI[5]. If you want summaries of select recent
deep learning papers in GitHub format, check out Denny Britz's notes[8]. There
are many other starting points but those are my default suggestions.

If you really want to start learning, this isn't the right list for you and
I'd really like to suggest a more sane and potentially tailored path.
Seriously. If you reply with what you want, I'll do my best to suggest a
starting point.

Background: I'm a deep learning researcher who publishes papers and
articles[6].

[1]: [http://www.alexirpan.com/2017/04/26/perils-batch-norm.html](http://www.alexirpan.com/2017/04/26/perils-batch-norm.html)

[2]: [http://www.deeplearningbook.org/](http://www.deeplearningbook.org/)

[3]: [http://cs224d.stanford.edu/](http://cs224d.stanford.edu/)

[4]: [http://cs231n.github.io/](http://cs231n.github.io/)

[5]: [http://course.fast.ai/](http://course.fast.ai/)

[6]: [http://smerity.com/articles/2016/google_nmt_arch.html](http://smerity.com/articles/2016/google_nmt_arch.html)

[7]: [https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf](https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf)

[8]: [https://github.com/dennybritz/deeplearning-papernotes](https://github.com/dennybritz/deeplearning-papernotes)

~~~
aidanrocke
Hello Smerity,

I have read the deep learning book and I think it's great that you know
Yarin. His PhD thesis is the closest thing there is to a survey paper on
uncertainty in deep learning. I also think that while the original dropout
paper is nice, Yarin's paper extends the method in a very useful manner.
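
Roughly, the extension keeps dropout active at test time and averages over
several stochastic forward passes, with the spread serving as an
(approximately Bayesian) uncertainty estimate. A minimal sketch, where
`stochastic_forward` is a stand-in for any dropout-trained model run with
dropout left on:

```python
import numpy as np

def mc_dropout_predict(stochastic_forward, x, n_samples=50):
    """Monte Carlo dropout, roughly as in Gal & Ghahramani: sample
    n_samples stochastic forward passes and summarize them."""
    preds = np.stack([stochastic_forward(x) for _ in range(n_samples)])
    # Mean as the prediction, standard deviation as uncertainty.
    return preds.mean(axis=0), preds.std(axis=0)
```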

Now, let me address your other points:

1\. This is not for people who want to get started in deep learning. Or it
could be. Do you know how I got started in mathematics?

2\. The dubious neuroscience connection isn't dubious at all. In particular,
I would advise that you read 'Towards an integration of deep learning and
neuroscience' by Marblestone et al. If you still disagree, I can assure you
that Dr Kriegeskorte, Konrad Kording, Professor Bengio and many others are
of a different opinion.

Furthermore, the language of statistical learning has shown its limitations.
Consider Srivastava's paper on Locally Competitive Networks, or any other
paper on deep rectifier networks, for example. As the complexity of deep
models approaches that of biological systems, I have no doubt we'll be using
similar tools to understand these models.

fyi: I'm currently working on deep rectifier networks.

Aidan

~~~
Smerity
Hey Aidan,

As noted, it's not a criticism of a reading list tailored for you; I simply
feel it's a really confusing reading list given the title and the likely
audience that will arrive there from Hacker News. I also don't think it can
be pieced together to form a good introductory reading list, or that the
contents as they stand are "foundational" yet, given the gaps and the
progression.

Your (1) confuses me still - I admittedly don't know how you got started in
mathematics or how that's relevant here?

Regarding (2), there certainly is a place for neuroscience to influence
thinking or introduce new ideas, and I'm certainly not disregarding the
entirety of that potential intersection, but it's a very specific field that
is still largely disconnected from the practical application of neural
networks. This may change, slightly or substantially, in the future, but
thinking again with the lens of the Hacker News audience (where incorrect
adages of the style "deep learning learns just like a human brain" get
thrown around frequently), I don't want to exaggerate the influence that
neuroscience has on the field at this stage. I'd also lightly note that
appealing to authority isn't an argument, though I do note those authors'
pedigree.

~~~
aidanrocke
Good morning Smerity,

I'm sorry if my response sounded harsh. It wasn't my goal to confuse
readers, and I admit that it's a work in progress. That said, I must clarify
a few things, as I realise I left out important details:

1\. Regarding neuroscience: the researchers I mentioned (Kording et al.) are
people I've discussed this issue with recently. That's the context. However,
I must say that I thoroughly agree with you that "deep learning learns just
like a human brain" is a very inaccurate statement, and a reflection of the
imprecise understanding many deep learning practitioners (and some
researchers) have of their own field.

2\. My approach to math: it was certainly not the approach taken by deep
learning tutorials. The issue I have with the proliferation of these
tutorials on the internet is that they lead to familiarity rather than
understanding. How to reach such a theoretical understanding is debatable,
and I'd like to encourage a debate.

In fact, this list is part of a bigger project which I'd like to collaborate
on with others in the near future. You can think of it as the early stages of
a Bourbaki project for deep learning. :)

Who was Bourbaki:
[http://www.ams.org/notices/200709/tx070901150p.pdf](http://www.ams.org/notices/200709/tx070901150p.pdf)

------
minimaxir
You can't take papers and re-upload them to GitHub without citing the
original source publication.

~~~
grzm
Also, the HN guidelines specify:

> _Please submit the original source._

[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)

~~~
minimaxir
The "Awesome List" trend of link aggregation on GitHub has been a gray area
for that particular HN rule.

~~~
grzm
That's generally a list of links, though, right? Or am I misremembering?

