
Ask HN: What Neural Networks/Deep Learning Books Should I Read? - Oreb
There are lots of deep learning books on the market. The vast majority of them present practical examples using some Python (or whatever) deep learning framework. Such books don't interest me at all. If I wanted to learn some particular framework, I would just look up the documentation for that framework.

I'm looking for two types of books:

1. A technical, math-heavy introduction to neural networks and deep learning, with little or no actual code (except possibly some pseudocode). The often recommended book by Goodfellow et al resembles what I'm looking for, but unfortunately, it completely lacks exercises.

2. An entertaining pop-science-like book which takes a more philosophical and cross-disciplinary look at neural networks, as well as their inspirations and applications. I haven't been able to find a single book like this, but surely it has to exist?

Recommendations, anyone?
======
phonebucket
It is a little old given how quickly the field moves, but 'Information Theory,
Inference, and Learning Algorithms' has a chapter on neural networks. It is an
outstanding book: a labour of love from a very smart person. The exercises are
varied, the explanations are great, there's a sprinkling of humour, and
connections are drawn between multiple fields of study. Moreover, it is freely
available from the author's website:
[http://www.inference.org.uk/itprnn/book.html](http://www.inference.org.uk/itprnn/book.html)

Have you considered giving Goodfellow another shot, but trying to re-derive
the results therein as a form of exercise? I think that would likely be one of
the faster methods to bring yourself reasonably up to date with the field.

~~~
buckminster
You can find videos of MacKay's lectures covering parts of the book online. I'm
more of a reader, but I really enjoyed them. And the book is excellent.

~~~
yesenadam
Videos here:
[http://videolectures.net/course_information_theory_pattern_r...](http://videolectures.net/course_information_theory_pattern_recognition/)

Youtube links for the 2 videos on NN:

Lecture 15
[https://www.youtube.com/watch?v=Z1pcTxvCOgw](https://www.youtube.com/watch?v=Z1pcTxvCOgw)

16
[https://www.youtube.com/watch?v=OvMGPHpa_tM](https://www.youtube.com/watch?v=OvMGPHpa_tM)

------
nafizh
IMHO, people should not suggest books they have not personally read, as it
creates a bias toward books whose names people have heard somewhere (mostly
books by a big-shot author or from a big university/company). So, if you are
suggesting a book, please at least mention whether you have read it.

------
FrozenSynapse
This for the background: [https://mml-book.github.io/](https://mml-book.github.io/)

This for actual methods:
[http://www.deeplearningbook.org/](http://www.deeplearningbook.org/)

This is also useful, but harder to read than the previous ones:
[https://web.stanford.edu/~hastie/Papers/ESLII.pdf](https://web.stanford.edu/~hastie/Papers/ESLII.pdf)

~~~
humility
Hey, I did not take an undergraduate course in Algorithms but I'm super
interested in AI, especially neural nets and deep learning- would you
recommend I start with the background book?

Thanks for your suggestion btw.

~~~
_fullpint
OP mentions The Elements of Statistical Learning — it’s a pretty heavy book
that gets mentioned quite a bit.

From the same Stanford group there is An Introduction to Statistical Learning.
It’s a good intro to Machine Learning as a whole.

Far too often it seems people want to jump directly into Deep Learning. I’d
shy away from that; having a better understanding of ML as a discipline makes
the application of DL much more productive.

Edit: I’d also like to add that a lot of people want to use DL for imaging.
Take some time to understand Digital Image Processing as well. It’s a good
introduction to convolution and filtering, as well as to just understanding
what an image is and what can be done with it!

This is just sort of advice from my path.

The second book they mention also has some pretty heavy stuff involving
probability and probability models. If you can take some time to understand
automata and their applications, such as Hidden Markov Models, that’ll be a
big help.

You also mentioned never having taken a formal algorithms course. While that
isn’t strictly necessary, as you probably won’t be building anything from
scratch, learning some dynamic programming methods is very helpful for
understanding the FFT and its impact on convolution methods, and also how some
of these hidden probability models are evaluated efficiently.
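To make the FFT point concrete, here's a quick NumPy sketch (my own
illustration, with arbitrary sizes): direct convolution of two length-n
signals costs O(n^2), while going through the FFT costs O(n log n), and both
give the same answer.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(256)   # signal
h = rng.standard_normal(256)   # filter

# Direct method: O(n^2) multiply-adds.
direct = np.convolve(x, h)

# FFT method: multiply in the frequency domain, O(n log n).
# Zero-pad both inputs to the full convolution length first.
n = len(x) + len(h) - 1
fast = np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(h, n), n)

assert np.allclose(direct, fast)  # identical results, very different cost
```

The same trick is why libraries can evaluate large convolutional filters far
faster than the naive loop would suggest.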

------
stared
I wrote a piece on that in: [https://github.com/stared/thinking-in-tensors-writing-in-pyt...](https://github.com/stared/thinking-in-tensors-writing-in-pytorch#why-not-something-else)

> If I wanted to learn some particular framework, I would just look up the
> documentation for that framework.

Well, if you don't know deep learning, that's not how it works (unless it is a
poor book which only provides an introduction to some API). Still, I recommend
"Deep Learning with Python" by François Chollet, as it provides a good
overview of practical deep learning. For practical applications, a book WILL
use one framework or another, or it will be useless. If you understand
overfitting, L2 regularization or batch processing in Keras, you will be able
to use them in any other framework (after looking up its API).
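The point that L2 carries over between frameworks can be sketched
framework-free (my illustration, with made-up numbers): the penalty
lam * ||w||^2 just adds 2 * lam * w to the gradient, so every SGD step shrinks
the weights toward zero. Keras exposes this as `kernel_regularizer=l2(lam)`,
PyTorch optimizers as `weight_decay`; same idea, different API.

```python
import numpy as np

def sgd_step(w, grad_loss, lam, lr):
    """One SGD step on loss + lam * ||w||^2."""
    grad = grad_loss + 2 * lam * w  # data gradient plus L2 penalty gradient
    return w - lr * grad

w = np.array([1.0, -2.0])

# With a zero data gradient, plain SGD leaves w unchanged...
w_plain = sgd_step(w, np.zeros(2), lam=0.0, lr=0.1)

# ...while the L2-regularized step shrinks it toward zero: [0.9, -1.8].
w_decay = sgd_step(w, np.zeros(2), lam=0.5, lr=0.1)
```

Once you see it this way, the framework-specific knob is just where you pass
`lam`.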

When it comes to the mathematical background, the Deep Learning book by Ian
Goodfellow et al. is a great starting point, giving a broad overview. It does,
however, require a lot of interest in maths, and convolutional networks don't
start until well after page 300.

I struggled to find something in the middle ground: something showing the
mathematical foundations of deep learning, step by step, while at the same
time translating them into code. The closest example is CS231n: Convolutional
Neural Networks for Visual Recognition (which is, IMHO, a masterpiece). Though
I believe that instead of NumPy we can use PyTorch, giving a smooth transition
between mathematical ideas and practical, working code.

Not a book per se, but better than any other.

I am in the process of writing "Thinking in Tensors, Writing in PyTorch" (with
the idea of showing maths, code, fundamentals and practical examples), but it
is a slow process. It's a collaborative, open-source repo, so it's open for
collaborators and contributors. :)

------
mendeza
My go-to when I started in Deep Learning was Stanford's CS231n course
[http://cs231n.github.io/](http://cs231n.github.io/); the lecture notes are
amazing. It was instrumental when I first dove into Deep Learning and helped
me understand all the components needed to make Convolutional Neural Networks
(CNNs) and Neural Networks (NNs) work. I first read the notes and watched the
lecture videos. This course was a great primer for understanding the content
and theory in Goodfellow's Deep Learning book.

~~~
nullbyte
Reading through the first module, this looks great! Lots of helpful graphics
and code examples.

Thanks for sharing.

------
high_derivative
For the topic of optimization, I recommend:

[https://people.csail.mit.edu/jsolomon/share/book/numerical_b...](https://people.csail.mit.edu/jsolomon/share/book/numerical_book.pdf)

It will give you a decent introduction to optimization methods underpinning
deep learning. Deep learning theory is optimization theory.

There are youtube lectures as well and Justin is a great lecturer.

I personally would not recommend the Goodfellow book. It's not a good book for
newcomers; at best it's a quick reminder of how some things work at a non-
rigorous level.

------
earthnail
I know it doesn’t sound at first like what you’re looking for, but I strongly
recommend the Fast.ai course.

Despite advertising itself as being for coders and not math-heavy, I found it
to be much better at explaining the math than, for example, Goodfellow's book.

It’s also much better at talking about the inspirations of certain methods -
such as Dropout - than other sources I’ve found.

Overall, if you want a deep understanding of neural networks, fast.ai is -
somewhat ironically, given its branding - the best resource I've encountered
by a long shot.

~~~
nafizh
The best part of fast.ai, in my experience, is the discussion on its forums.

------
rramadass
I am not a fan of your category (1) books before getting an intuition for the
underlying concepts. However, while searching for such a book, I came across
the following, which might be to your taste (note that I have not read it):

* Neural Networks and Deep Learning: A Textbook by Charu Aggarwal - [https://www.amazon.com/Neural-Networks-Deep-Learning-Textboo...](https://www.amazon.com/Neural-Networks-Deep-Learning-Textbook/dp/3030068560/ref=sr_1_1?keywords=neural+networks+charu&qid=1565616070&s=books&sr=1-1) The author (from the IBM Watson Research Center) has also written several other books on related domains.

Under your category (2), though it is not a pop-science book, I found the
following old book (hence no DL) very good for really understanding the
intuition behind ANNs.

* Neural Networks for Applied Sciences and Engineering: From Fundamentals to Complex Pattern Recognition by Sandhya Samarasinghe - [https://www.amazon.com/Neural-Networks-Applied-Sciences-Engi...](https://www.amazon.com/Neural-Networks-Applied-Sciences-Engineering-dp-084933375X/dp/084933375X/ref=mt_hardcover?_encoding=UTF8&me=&qid=1565616264)

------
yboris
I loved _Deep Learning with Python_ by François Chollet -- the creator of
_Keras_

[https://www.manning.com/books/deep-learning-with-python](https://www.manning.com/books/deep-learning-with-python)

An excellent, coherent overview with working code to train a variety of neural
networks.

~~~
chimprich
This is a really excellent book - I'm currently reading it - but it seems to
be everything the questioner wanted to avoid.

It's not heavy on maths, and it mostly sticks with one framework and language.

~~~
brutus1213
Agreed that the book is excellent, and, somewhat surprisingly, it does not fit
the categories requested by the OP.

I wish the author would write a sequel. The book is still totally relevant,
but I can only dream of what a modern version would look like.

------
ReDeiPirati
1\. Grokking Deep Learning (Andrew W. Trask) and Neural Networks and Deep
Learning (Michael Nielsen)

2\. I'm probably off-point here, but maybe The Book of Why (Judea Pearl) could
be an interesting read for you.

FloydHub has a great article about this topic on their blog:
[https://blog.floydhub.com/best-deep-learning-books-updated-f...](https://blog.floydhub.com/best-deep-learning-books-updated-for-2019/).

------
mxyzptlk
Check out "Linear Algebra and Learning from Data" by Gilbert Strang. It
includes a nice introduction to Linear Algebra, touches on relevant statistics
and optimization, then puts them all together in chapters on neural networks.
It's a textbook, so exercises are included.

------
RocketSyntax
I'm reading "Hands-On Machine Learning with Scikit-Learn & TensorFlow." It's
really good at explaining the theory of how things work and providing examples
of what choices to make when using TF.

~~~
Oreb
Thank you, but unfortunately, that sounds like exactly the kind of book I
_don't_ want. Like I said, I'm not looking for books that present some
particular language, framework or toolkit.

~~~
acollins1331
I have also tried that book, and if you really want to skip the code snippets
in it, you can. Its explanations, from the basics of machine learning through
the more typical NN architectures (RNN, CNN, etc.), are probably the best I've
found. Aurélien Géron (the author) is a fantastic teacher. He is also a great
speaker; look for some of his talks on YouTube.

------
currymj
The Michael Nielsen online text is very well regarded and mostly fits what you
describe for (1), although it may have too much code for your taste in
addition to the mathematical basics. However, the code is not "how to use
TensorFlow" but rather "implement backpropagation from scratch". It covers
some interesting topics, including the universal approximation proof.

It has some reasonable exercises but no solutions, though I think people have
posted their own solutions around the web.

[http://neuralnetworksanddeeplearning.com](http://neuralnetworksanddeeplearning.com)

I think this is your best bet, to be honest. Meat-and-potatoes neural networks
don't really require any super-deep mathematical knowledge (just linear
algebra and a very hand-wavy ability to do basic matrix calculus), and the
more advanced topics are moving way too fast for a textbook to cover them
(Goodfellow et al. is already getting out of date).
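To give a flavour of what "backpropagation from scratch" amounts to, here is a
tiny two-layer network trained with manually derived gradients. This is my own
sketch (architecture, data and learning rate are arbitrary choices), not code
from Nielsen's book, but it's the kind of exercise the book walks you through.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((32, 3))            # 32 samples, 3 features
y = X @ np.array([[1.0], [-2.0], [0.5]])    # target the net should fit

W1 = rng.standard_normal((3, 8)) * 0.5      # input -> hidden weights
W2 = rng.standard_normal((8, 1)) * 0.5      # hidden -> output weights

def mse(pred):
    return float(np.mean((pred - y) ** 2))

loss_before = mse(np.tanh(X @ W1) @ W2)

for _ in range(500):
    h = np.tanh(X @ W1)                     # forward pass, hidden layer
    pred = h @ W2                           # forward pass, output
    err = 2 * (pred - y) / len(X)           # dMSE/dpred
    gW2 = h.T @ err                         # chain rule, output layer
    gW1 = X.T @ ((err @ W2.T) * (1 - h**2))  # chain rule through tanh
    W2 -= 0.05 * gW2                        # plain gradient descent
    W1 -= 0.05 * gW1

loss_after = mse(np.tanh(X @ W1) @ W2)
assert loss_after < loss_before             # training reduced the error
```

Note that the whole thing really is just linear algebra plus the chain rule,
which is the "hand-wavy matrix calculus" mentioned above.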

The recommendation of Strang's new book is probably also pretty good.

------
bitcurious
I thought this was a great book:
[http://neuralnetworksanddeeplearning.com/index.html](http://neuralnetworksanddeeplearning.com/index.html)

It's well balanced, going into the intuition and math while not getting lost
in the theoretical weeds.

It doesn’t go into the state of the art, but it does give you the background
you’ll need to understand it.

------
jointpdf
Take a look at “Neural Networks: A Systematic Introduction” by Raul Rojas:
[https://page.mi.fu-berlin.de/rojas/neural/](https://page.mi.fu-berlin.de/rojas/neural/)

It’s an older book, but it’s a deep dive into the math and intuition of neural
networks. We used it for a grad-level applied math course in neural networks
in 2013 (just as deep learning was emerging). It has tons of great
visualizations and interesting exercises, is very readable, and is the best
price (free).

I recommend reading a chapter or two to see how you like it; it has a bit of a
different flavor compared to more modern deep learning books.

~~~
thelastbender12
Thank you, this looks very readable. I was very much looking for a treatment
like this, one that spends more time on how the primitives (perceptron,
backprop, etc.) came about.

------
banjo_milkman
1\. The Goodfellow book is the obvious one. Another option is 'Machine
Learning: A Probabilistic Perspective' by Kevin Murphy, which does have
exercises but has less material on CNNs IIRC and is slightly older. Also,
Bishop's 'Pattern Recognition and Machine Learning' is a classic reference,
but even older.

2\. The Pedro Domingos book The Master Algorithm tries to address this itch,
though it may be less philosophical than you want. It is >okay<, but I don't
think people love it.

------
iandanforth
"Mathematics for Machine Learning" by Marc Peter Deisenroth, A Aldo Faisal,
and Cheng Soon Ong.

[https://mml-book.github.io/](https://mml-book.github.io/)

------
thelastbender12
Thanks for asking this. I had a related question: have you found any
textbooks/courses that take a chronological view of neural networks, sort of a
guided tour through their history?

A lot of MOOCs and open courses do a very good job at teaching the deep
learning toolset for specific domains - vision, text, and so on. I was looking
to find a curated source on how neural network architectures and algorithms
gradually evolved over time as people realized they could solve a wider
variety of problems.

This probably sounds like a `seminar` course with extensive readings? A good
analogy, for instance, would be the MAA book on the history of integration
[0], which describes how the notion of integration was formalized over time.
Thank you for your help!

[0] [https://www.maa.org/press/maa-reviews/a-radical-approach-to-...](https://www.maa.org/press/maa-reviews/a-radical-approach-to-lebesgues-theory-of-integration)

------
pjmorris
Seems like 'Deep Learning' [0] might suit your first type of book. For the
second type, may I suggest 'The Book of Why' by Judea Pearl? It isn't focused
specifically on deep learning, but it is focused on philosophical and
cross-disciplinary applications of statistical techniques.

disclaimer: I haven't really dug into deep learning, so I'll wager there may
be great resources I'm completely unaware of.

[0] 'Deep Learning', Ian Goodfellow, Yoshua Bengio and Aaron Courville, MIT
Press, [https://www.deeplearningbook.org/](https://www.deeplearningbook.org/)

------
hnarayanan
I would read the Goodfellow book (though it is exactly as you describe).
Beyond this, I would follow courses online depending on your interest, e.g.:

\- Stanford's CS231n
([http://cs231n.stanford.edu](http://cs231n.stanford.edu)) for Computer Vision

\- Stanford's CS224n
([http://web.stanford.edu/class/cs224n/](http://web.stanford.edu/class/cs224n/))
for NLP

They both have pretty solid exercises, which include work like implementing
back-propagation from first principles.

------
UncleOxidant
For your category (2) book I would recommend The Master Algorithm: How the
Quest for the Ultimate Learning Machine Will Remake Our World:
[https://www.goodreads.com/book/show/24612233-the-master-algo...](https://www.goodreads.com/book/show/24612233-the-master-algorithm)
What I like about this book is that it covers the various ML camps: not just
Deep Learning, but the Bayesian approach as well.

------
emilwallner
Is there a math equivalent to Charles Petzold's Code? I.e. a brief history of
math where you feel like you invent the key steps yourself.

------
iandanforth
"[Goodfellow et al] completely lacks exercises."

Untrue, it has one
[https://www.deeplearningbook.org/linear_algebra.pdf](https://www.deeplearningbook.org/linear_algebra.pdf)
(tongue in cheek)

~~~
rhabarber
Correct. You have proved that the statement is wrong; thus, the whole question
is wrong now.

------
vonnik
For #1: I second the recommendations for Nielsen, Goodfellow and Chollet. For
#2: Pedro Domingos and our wiki:
[https://skymind.com/wiki/](https://skymind.com/wiki/)

------
wpmoradi
I recommend reading
[http://aima.cs.berkeley.edu/](http://aima.cs.berkeley.edu/). It has a great
intro and goes into the details.

~~~
dlo
On that note, this is an excellent class based on the book:

[http://ai.berkeley.edu/home.html](http://ai.berkeley.edu/home.html)

------
rajacombinator
Read an intro to statistics instead. Don’t become another CS punter throwing
neural nets at problems with no understanding of the underlying material.

~~~
thrwwwy38947
Can you justify the importance of statistics for neural nets?

------
martygwilliams
This stuff moves too quickly for print. Check out Siraj Raval on YouTube.
Anything by Andrew Ng is great, too.

------
RocketSyntax
Ah, I just read your description. Allow me to change my recommendation to
"Practical Statistics for Data Scientists"

------
nicholast
Check out my collection of essays, "From the Diaries of John Henry". It's not
exclusively about neural networks; it also covers ground like quantum
computing and complexity theory. Book 3 goes into detail about the development
of the Automunge platform for preparing data for machine learning. More at:
[http://turingsquared.com](http://turingsquared.com)

------
billconan
There is only one book I like:

[http://neuralnetworksanddeeplearning.com/](http://neuralnetworksanddeeplearning.com/)

It explains the basics and intuitions well, with a step-by-step
implementation.

The issue is that it lacks coverage of the latest topics.

------
jfdi
Type 1: Not sure.

Type 2: The Master Algorithm by Pedro Domingos

