Ask HN: What Neural Networks/Deep Learning Books Should I Read?
275 points by Oreb 11 days ago | 48 comments
There are lots of deep learning books on the market. The vast majority of them are presenting practical examples using some Python (or whatever) deep learning framework. Such books don't interest me at all. If I wanted to learn some particular framework, I would just look up the documentation for that framework.

I'm looking for two types of books:

1. A technical, math-heavy introduction to neural networks and deep learning, with little or no actual code (except possibly some pseudocode). The often recommended book by Goodfellow et al resembles what I'm looking for, but unfortunately, it completely lacks exercises.

2. An entertaining pop science like book which takes a more philosophical and cross disciplinary look at neural networks as well as their inspirations and applications. I haven't been able to find a single book like this, but surely it has to exist?

Recommendations, anyone?


It is a little old given how quickly the field moves, but 'Information Theory, Inference and Learning Algorithms' has a chapter on neural networks. It is an outstanding book: a labour of love from a very smart person. The exercises are varied, explanations are great, there's a sprinkling of humour, and connections drawn between multiple fields of study. Moreover, it is freely available from the author's website: http://www.inference.org.uk/itprnn/book.html

Have you considered giving Goodfellow another shot, but trying to re-derive the results therein as a form of exercise? I think that would likely be one of the faster methods to bring yourself reasonably up to date with the field.

I second this as well! ITILA is not just outstanding but offers a timeless perspective to statistical and uncertainty modeling, not to mention information theory which is often understudied in this field (including by myself).

As the parent poster says, this field moves fast but this book will give a solid grounding.

Even though the treatment of neural networks is short, the beginning chapters are worthwhile. The chapter on random variables and probability is one of the best introductions to probabilistic modeling which I’ve seen.

You can find video of MacKay's lectures covering parts of the book online. I'm more of a reader but I really enjoyed them. And the book is excellent.

Second the MacKay book recommendation! It is a lovely text with a ton of examples, and a good reference for a senior course on applied probability.

IMHO, people should not suggest books they have not personally read, as it creates a bias towards books whose names people have heard somewhere (mostly books by a big-shot author or from a big university/company). So, if you are suggesting a book, please at least mention whether you have read it or not.

This for the background: https://mml-book.github.io/

This for actual methods: http://www.deeplearningbook.org/

This is also useful, but harder to read than the previous ones: https://web.stanford.edu/~hastie/Papers/ESLII.pdf

Hey, I did not take an undergraduate course in Algorithms, but I'm super interested in AI, especially neural nets and deep learning. Would you recommend I start with the background book?

Thanks for your suggestion btw.

OP mentions Elements of Statistical Learning. It’s a pretty heavy book that gets mentioned quite a bit.

From the same Stanford authors there is An Introduction to Statistical Learning. It’s a good intro to Machine Learning as a whole.

Far too often it seems people want to jump directly into Deep Learning. I’d shy away from that; having a better understanding of ML as a discipline makes the application of DL much more productive.

Edit: I’d also add that a lot of people want to use DL for imaging stuff. Take some time to understand Digital Image Processing as well. It’s a good introduction to convolution and filtering, as well as to understanding what an image is and what can be done with it!

This is just sort of advice from my path.

The second book they mention also has some pretty heavy stuff involving probability and probability models. If you can take some time to understand automata and their applications, such as Hidden Markov Models, that’ll be a big help.

You also mention that you never took a formal algorithms course. That isn’t strictly necessary, as you probably won’t be building anything from scratch, but learning some dynamic programming methods is very helpful for understanding the FFT and its impact on convolution methods, and also how some of these hidden models for probability are evaluated efficiently.
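As a rough illustration of the dynamic programming point, here is a minimal sketch of the forward algorithm for a Hidden Markov Model. The two-state weather model and every name in it (`forward`, `brute_force`, `start_p`, `trans_p`, `emit_p`) are made up for this example and are not taken from any of the books mentioned. The idea is that the probability of an observation sequence can be computed in O(T·N²) time by reusing partial sums, instead of enumerating all N^T hidden paths.

```python
from itertools import product


def forward(obs, states, start_p, trans_p, emit_p):
    """Probability of the observation sequence under the HMM (forward algorithm)."""
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {
            s: emit_p[s][o] * sum(alpha[prev] * trans_p[prev][s] for prev in states)
            for s in states
        }
    return sum(alpha.values())


def brute_force(obs, states, start_p, trans_p, emit_p):
    """Same quantity by summing over every hidden path; exponential, for checking only."""
    total = 0.0
    for path in product(states, repeat=len(obs)):
        p = start_p[path[0]] * emit_p[path[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= trans_p[path[t - 1]][path[t]] * emit_p[path[t]][obs[t]]
        total += p
    return total


# Hypothetical toy model: hidden weather, observed activity
states = ("rainy", "sunny")
start_p = {"rainy": 0.6, "sunny": 0.4}
trans_p = {
    "rainy": {"rainy": 0.7, "sunny": 0.3},
    "sunny": {"rainy": 0.4, "sunny": 0.6},
}
emit_p = {
    "rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}
obs = ("walk", "shop", "clean")

p_dp = forward(obs, states, start_p, trans_p, emit_p)
p_bf = brute_force(obs, states, start_p, trans_p, emit_p)
```

Both functions should agree on this toy input; the difference only shows up in running time as the sequence gets longer.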

I wrote a piece on that in: https://github.com/stared/thinking-in-tensors-writing-in-pyt...

> If I wanted to learn some particular framework, I would just look up the documentation for that framework.

Well, if you don't know deep learning, that is not how it works (unless it is a poor book, which only provides an introduction to some API). Still, I recommend "Deep Learning with Python" by Francois Chollet as it provides a good overview of practical deep learning. For practical applications, a book WILL use one framework or another or will be useless. If you understand overfitting, L2 or batch processing in Keras, you will be able to use them in any other framework (after looking up its API).

When it comes to the mathematical background, Deep Learning Book by Ian Goodfellow et al. is a great starting point, giving a lot of overview. Though, it requires a lot of interest in maths. Convolutional networks start well after page 300.

I struggled to find something in the middle ground: showing the mathematical foundations of deep learning, step by step, while at the same time translating them into code. The closest example is CS231n: Convolutional Neural Networks for Visual Recognition (which is, IMHO, a masterpiece). Though, I believe that instead of using NumPy we can use PyTorch, giving a smooth transition between mathematical ideas and practical, working code.

Not a book per se, but better than any other.

I am in the process of writing "Thinking in Tensors, Writing in PyTorch" (with the idea of showing maths, code, fundamentals and practical examples) but it is a slow process. It's a collaborative, open-source repo, so it is open for collaborators and contributors. :)

My go-to when I started in Deep Learning was Stanford's CS 231n course, http://cs231n.github.io/. The lecture notes are amazing. It was instrumental when I first dove deep into Deep Learning and helped me understand all the components needed to make Convolutional Neural Networks (CNN) and Neural Networks (NN) work. I first read this and watched the lecture videos. This course was a great primer for me to understand the content and theory in Goodfellow's Deep Learning book.

Reading through the first module, this looks great! Lots of helpful graphics and code examples.

Thanks for sharing.

For the topic of optimization, I recommend:


It will give you a decent introduction to optimization methods underpinning deep learning. Deep learning theory is optimization theory.

There are youtube lectures as well and Justin is a great lecturer.
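To make the "deep learning is optimization" remark a bit more concrete, here is a minimal sketch of plain gradient descent on a least-squares line fit. The data, learning rate and step count are arbitrary choices for illustration, and `gradient_descent` is a hypothetical helper, not something from the recommended material. Training a neural network follows the same loop, only with many more parameters and gradients obtained by backpropagation.

```python
def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w*x + b by minimising mean squared error with plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b
        grad_w = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Step against the gradient
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Noise-free toy data generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = gradient_descent(xs, ys)
```

With these settings the estimates should settle very close to the generating slope and intercept; a learning rate that is too large would make the same loop diverge.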

I personally would not recommend the Goodfellow book. It's not a good book for newcomers, at best it's a quick reminder of how some things work on a non-rigorous level.

I know it doesn’t sound at first like what you’re looking for, but I strongly recommend the Fast.ai course.

Despite advertising itself as for coders and not math heavy, I found it to be much better at explaining the math than, for example, Goodfellow’s book.

It’s also much better at talking about the inspirations of certain methods - such as Dropout - than other sources I’ve found.

Overall, if you want a deep understanding of neural networks, fast.ai is - somewhat ironically given its branding - the best resource I’ve encountered by a long shot.

The best resource of fast.ai in my experience is the discussion on its forums.

I am not a fan of your category (1) books before getting the intuition about the underlying concepts. However, while searching for such a book I came across the following, which might be to your taste (note that I have not read it):

* Neural Networks and Deep Learning: A Textbook by Charu Aggarwal - https://www.amazon.com/Neural-Networks-Deep-Learning-Textboo... The author (from IBM Watson Research center) also has written several other books on related domains.

Under your category (2), though not a pop-science book, I found the following old book (hence no DL) very good for really understanding the intuition behind ANNs.

* Neural Networks for Applied Sciences and Engineering: From Fundamentals to Complex Pattern Recognition by Sandhya Samarasinghe - https://www.amazon.com/Neural-Networks-Applied-Sciences-Engi...

I loved Deep Learning with Python by François Chollet -- the creator of Keras


An excellent, coherent overview with working code to train a variety of neural networks.

This is a really excellent book - I'm currently reading it - but it seems to be everything the questioner wanted to avoid.

It's not heavy on maths, and it mostly sticks with one framework and language.

Agreed that the book is excellent, and it also surprisingly does not fit into the categories requested by the OP.

I wish the author did a sequel. The book is totally relevant but I can only dream of what a modern version would look like.

+1 for this book. Such a pleasure to read and to work through the samples.

1. Grokking Deep Learning (Andrew W. Trask) and Neural Networks and Deep Learning (Michael Nielsen)

2. I'll probably be off-point here, but maybe The Book of Why (Judea Pearl) could be interesting reading for you.

FloydHub has a great article about this topic on their blog: https://blog.floydhub.com/best-deep-learning-books-updated-f....

Check out "Linear Algebra and Learning from Data" by Gilbert Strang. It includes a nice introduction to Linear Algebra, touches on relevant statistics and optimization, then puts them all together in chapters on neural networks. It's a textbook, so exercises are included.

I'm reading "Hands-On Machine Learning with Scikit-Learn & TensorFlow." It's really good at explaining the theory of how it works and providing examples of what choices to make when using tf.

Thank you, but unfortunately, that sounds like exactly the kind of book I don't want. Like I said, I'm not looking for books that present some particular language, framework or toolkit.

I have also tried that book, and if you really want to skip the code snippets in it, you can. The explanations, from the basics of machine learning through the more typical NN architectures (RNN, CNN etc.), are probably the best I've found. Aurélien Géron (the author) is a fantastic teacher. He is also a great speaker; look for some of his talks on YouTube.

"Mathematics for Machine Learning" by Marc Peter Deisenroth, A Aldo Faisal, and Cheng Soon Ong.


The Michael Nielsen online text is very well-regarded and mostly fits what you describe for 1, although it may have too much code for your taste in addition to the mathematical basics. However the code is not "how to use TensorFlow" but rather "implement backpropagation from scratch". It covers some interesting topics including the universal approximation proof.

It has some reasonable exercises but no solutions, but I think people have posted their own solutions online around the web.
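For a sense of what "implement backpropagation from scratch" means in practice, here is a minimal sketch in plain Python. It is not Nielsen's code: the network shape (2 inputs, 3 sigmoid hidden units, 1 sigmoid output), the squared-error loss, the seed and all function names are assumptions chosen for illustration. It computes gradients with the chain rule and then checks one of them against a finite-difference estimate.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """One hidden layer of sigmoid units, then a single sigmoid output."""
    W1, b1, W2, b2 = params
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * hi for w, hi in zip(W2, h)) + b2)
    return h, y


def loss(params, x, target):
    _, y = forward(params, x)
    return 0.5 * (y - target) ** 2


def backprop(params, x, target):
    """Gradients of the squared-error loss via the chain rule, layer by layer."""
    W1, b1, W2, b2 = params
    h, y = forward(params, x)
    delta_out = (y - target) * y * (1.0 - y)  # dL/dz at the output unit
    grad_W2 = [delta_out * hi for hi in h]
    grad_b2 = delta_out
    # Propagate the error back through W2 and the hidden sigmoids
    delta_hid = [delta_out * w * hi * (1.0 - hi) for w, hi in zip(W2, h)]
    grad_W1 = [[d * xi for xi in x] for d in delta_hid]
    grad_b1 = delta_hid
    return grad_W1, grad_b1, grad_W2, grad_b2


random.seed(0)  # arbitrary seed so the example is repeatable
params = (
    [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)],  # W1: 3 hidden x 2 inputs
    [random.uniform(-1, 1) for _ in range(3)],                      # b1
    [random.uniform(-1, 1) for _ in range(3)],                      # W2
    random.uniform(-1, 1),                                          # b2
)
x, target = [0.5, -1.2], 1.0
grad_W1, grad_b1, grad_W2, grad_b2 = backprop(params, x, target)

# Finite-difference check on a single weight, W1[0][1]
eps = 1e-6
W1, b1, W2, b2 = params
W1_plus = [row[:] for row in W1]
W1_plus[0][1] += eps
W1_minus = [row[:] for row in W1]
W1_minus[0][1] -= eps
numeric = (
    loss((W1_plus, b1, W2, b2), x, target) - loss((W1_minus, b1, W2, b2), x, target)
) / (2 * eps)
```

The analytic value `grad_W1[0][1]` and the estimate `numeric` should agree to many decimal places; this kind of gradient check is a common way to catch mistakes in a hand-written backward pass.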


I think this is your best bet, to be honest. Meat & potatoes neural networks don't really require any super-deep mathematical knowledge (just linear algebra and a very hand-wavy ability to do basic matrix calculus), and the more advanced topics are moving way too fast for a textbook to cover them (Goodfellow et al. is already getting out of date).

The recommendation of Strang's new book is probably also pretty good.

I thought this was a great book: http://neuralnetworksanddeeplearning.com/index.html

It is well balanced, going into the intuition and math while not getting lost in the theoretical weeds.

It doesn’t go into the state of the art, but it does give you the background you’ll need to understand it.

Take a look at “Neural Networks: A Systematic Introduction” by Raul Rojas: https://page.mi.fu-berlin.de/rojas/neural/

It’s an older book, but it’s a deep dive into the math and intuition of neural networks. We used it for a grad-level applied math course in neural networks in 2013 (just as deep learning was emerging). It has tons of great visualizations and interesting exercises, is very readable, and is the best price (free).

I recommend reading a chapter or two to see how you like it, it has a bit of a different flavor compared to more modern deep learning books.

thank you, this looks very readable. I was very much looking for a treatment like this that spends more time on how the primitives - perceptron, backprop, etc. came about.

1. The Goodfellow book is the obvious one. Another option is 'Machine Learning: A Probabilistic Perspective' by Kevin Murphy, which does have exercises but has less material on CNNs IIRC and is slightly older. Also, Bishop's 'Pattern Recognition and Machine Learning' is a classic reference but even older.

2. The Pedro Domingos book, The Master Algorithm, tries to address this itch, though it may be less philosophical than you want. It is >okay< but I don't think people love it.

Thanks for asking this. I had a related question - have you found any other textbooks/courses which take a chronological view on the subject of neural networks, sort of a tour-de-force of their history?

A lot of MOOCs and open courses do a very good job at teaching the deep learning toolset for specific domains - vision, text, and so on. I was looking to find a curated source on how neural network architectures and algorithms gradually evolved over time as people realized they could solve a wider variety of problems.

This probably sounds like a `seminar` course with extensive readings? A good analogy for instance would be the MAA book on the history of integration [0], which describes how the notion of integration was formalized over time. Thank you for your help!

[0] https://www.maa.org/press/maa-reviews/a-radical-approach-to-...

Seems like 'Deep Learning' [0] might suit your first type of book. For the second type of book, may I suggest 'The Book of Why' by Judea Pearl. It isn't focused specifically on deep learning, but it is focused on philosophical and cross-disciplinary applications of statistical techniques.

disclaimer: I haven't really dug in to deep learning, so I'll wager there may be great resources I'm completely unaware of.

[0] 'Deep Learning', Ian Goodfellow, Yoshua Bengio and Aaron Courville, MIT Press, https://www.deeplearningbook.org/

I would read the Goodfellow book (though it is exactly as you describe). Beyond this, I would follow courses online depending on your interest, e.g.:

- Stanford's CS231n (http://cs231n.stanford.edu) for Computer Vision

- Stanford's CS224n (http://web.stanford.edu/class/cs224n/) for NLP

They both have pretty solid exercises, which include work like implementing back-propagation from first principles.

For your category 2 book I would recommend The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World: https://www.goodreads.com/book/show/24612233-the-master-algo... What I like about this book is that it covers the various ML camps: not just Deep Learning, but the Bayesian approach as well.

Is there a math equivalent to Charles Petzold's Code? I.e. a brief history of math where you feel like you invent the key steps yourself.

"[Goodfellow et al] completely lacks exercises."

Untrue, it has one https://www.deeplearningbook.org/linear_algebra.pdf (tongue in cheek)

Correct. You have proved that the statement is wrong. Thus, the whole question is wrong now..

For #1: I second the recommendations for Nielsen, Goodfellow and Chollet. For #2: Pedro Domingos and our wiki: https://skymind.com/wiki/

I recommend reading http://aima.cs.berkeley.edu/ . Has great intro and goes into the details.

On that note, this is an excellent class based on the book:


Read an intro to statistics instead. Don’t become another CS punter throwing neural nets at problems with no understanding of the underlying material.

can you justify the importance of statistics for Neural Nets?

This stuff moves too quick for print. Check out Siraj Raval on Youtube. Anything by Andrew Ng is great, too.

Ah, I just read your description. Allow me to change my recommendation to "Practical Statistics for Data Scientists"

Check out my collection of essays "From the Diaries of John Henry". It's not exclusively about neural networks, also covers ground like quantum computing and complexity theory. Book 3 goes into detail about the development of the Automunge platform for preparing data for machine learning. More at: http://turingsquared.com

there is only one book I like:


it explains the basics and intuitions well, with a step-by-step implementation.

the issue is that it lacks coverage on the latest topics.

Type 1: Not sure.

Type 2: The Master Algorithm by Pedro Domingos
