
Free “Deep Learning” Textbook by Goodfellow and Bengio Now Finished - mbrundle
https://www.facebook.com/ia3n.goodfellow/posts/10102223910143043
======
j2kun
I spent a few weeks closely reading this book and I have to disagree with the
majority here. I didn't like the book at all. And I am an advanced math geek.

My main issue is that the book tells you all about the different parameter
tweaks, but passes little concrete wisdom to the reader. It doesn't
distinguish between modeling assumptions, and it replaces very simple
explanations of concepts with complicated paragraphs that I can't make sense
of.

I think it boils down to something that I have been feeling and hearing a lot
in the past few years: the statistical jargon is so overwhelming that the
authors can't explain things clearly. I can point to many examples in this
book that I feel are unnecessary stumbling blocks, but the fact is that I'll
spend an hour or two discussing parts of this book with a room full of smart
machine learning researchers, and at the end we'll all agree we don't
understand the material better than we did at the start.

On the other hand, I'll read _research papers_ that don't force the
statistical perspective down the reader's throat (e.g.
[http://arxiv.org/abs/1602.04485v1](http://arxiv.org/abs/1602.04485v1)) and
find them very easy to understand by comparison.

It might be a cultural difference, but I've heard this complaint enough from
experts who straddle both sides of the computational/statistical machine
learning divide that I don't think it's just me.

~~~
khuss
>> it replaces very simple explanations of concepts with complicated
paragraphs that I can't make sense of

It is good to see critical views but it will be even better if you could give
concrete examples for statements like the above. Also, what other books do you
recommend?

~~~
j2kun
> Also, what other books do you recommend?

This has been my frustration. I've been wanting to grok deep learning, but
haven't found any source that can explain it in a way that doesn't
overcomplicate simple things (which I can only tell it's doing when I already
understand the topic). I also don't have the time or incentive to really dig
in and do the math myself from scratch, since so much of it is wide open
research directions.

I've also had this experience with much, much simpler areas of statistics and
statistical ML (cf. Markov chain Monte Carlo), so this is a sort of recurring
theme for me. Considering how simple MCMC is now that I do understand it, it's
difficult to dispel the nagging feeling that all the statistical ML literature
is (likely unintentionally) obfuscated.
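For what it's worth, the core of MCMC really is small once the statistical framing is stripped away. A bare-bones random-walk Metropolis sampler fits in a dozen lines (the function names and the target density here are my own illustration, not drawn from any particular text):

```python
import math
import random

random.seed(0)

def metropolis(log_density, x0, n_samples, step=1.0):
    """Random-walk Metropolis: propose a Gaussian step, accept with
    probability min(1, p(x') / p(x)), computed in log space."""
    x, samples = x0, []
    for _ in range(n_samples):
        proposal = x + random.gauss(0.0, step)
        # Accept if the proposal is more probable, or with the
        # density ratio otherwise.
        if log_density(proposal) - log_density(x) > math.log(random.random()):
            x = proposal
        samples.append(x)
    return samples

# Sample from a standard normal: log p(x) = -x^2 / 2 (up to a constant).
draws = metropolis(lambda x: -x * x / 2.0, x0=0.0, n_samples=50_000)
```

The sample mean and variance of `draws` should come out close to 0 and 1, which is about all there is to verify for this toy target.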

I could give concrete examples of excerpts and entire sections of the book
that don't make sense to me, but I don't think it's all that productive
because a lot of it boils down to organizational disagreements, cultural
assumptions behind the math, and differences in priorities. Individually it
seems like nitpicking, but they add up quickly to general muddled confusion.
This is especially true when, for example, the honest answer to the question
"Why does this particular technique work?" is almost always "We have no clue,
but here is some anecdotal evidence and some half-substantiated,
oversimplified theories." Instead, these answers are passed off as well-known
fact.

~~~
mattmcknight
The new O'Reilly book "Fundamentals of Deep Learning" by Nikhil Buduma
(available on Safari for a while now) is good at the fundamentals: very
clearly explained, with nice diagrams. It is relatively close to the path of
my Neural Networks classes (although those were 20 years ago).

It is necessarily shorter on detail in terms of the implementation tricks that
have radically improved the performance of these techniques over the past 5
years, and might serve as a good read before diving in via Goodfellow, Bengio,
and Courville.

~~~
j2kun
I haven't seen this one before. Looks like it's still in development.

~~~
mattmcknight
Yes, only the first three chapters are released.

------
baltcode
First impressions:

1. It also covers "classical" artificial neural networks, i.e., things like
backprop from before Hinton and others made breakthroughs for deep learning.
This means you can start with this book even if you are new to ANNs. The later
sections cover "real deep learning".

2. The language is great for beginners and users. You don't have to be an
advanced math geek to follow everything. They seem to cover a fair amount of
ground too, so it's not dumbed down either.

3. I guess it covers most of the underlying theory and practical techniques
but is implementation-neutral. You should probably pick up a tutorial for your
favorite framework, like Theano, TensorFlow, etc.

All in all, I like it a lot.

~~~
ogrisel
The old backprop from the 80's / 90's is still in use and is still the primary
way to train deep nets. We tend to call it SGD (on a composition of
differentiable functions) nowadays, but it's the same algorithm.
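To make the equivalence concrete, here's a minimal sketch of backprop as the chain rule applied through a two-function composition, with a plain SGD update at the end. This is a toy problem of my own (the data is exactly representable at w1=2, w2=3), not an example from the book:

```python
import math
import random

random.seed(0)

# A two-stage composition: h = tanh(w1 * x), y_hat = w2 * h.
# "Backprop" is the chain rule applied through that composition;
# the update at the end is plain stochastic gradient descent.
data = [(x / 10.0, 3.0 * math.tanh(2.0 * x / 10.0)) for x in range(-10, 11)]

w1, w2 = 0.1, 0.1
lr = 0.1
for step in range(5000):
    x, y = random.choice(data)                 # stochastic: one example
    h = math.tanh(w1 * x)                      # forward pass
    y_hat = w2 * h
    d_yhat = 2.0 * (y_hat - y)                 # d(squared error)/d(y_hat)
    grad_w2 = d_yhat * h                       # chain rule, outer stage
    grad_w1 = d_yhat * w2 * (1.0 - h * h) * x  # chain rule, inner stage
    w2 -= lr * grad_w2                         # SGD update
    w1 -= lr * grad_w1
```

After training, the mean squared error over the toy dataset should be near zero; nothing here is specific to the 80's or to "deep learning", which is the commenter's point.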

------
MasterScrat
This looks interesting, can't wait to dig into it.

Another great great free online book on this topic:
[http://neuralnetworksanddeeplearning.com/](http://neuralnetworksanddeeplearning.com/)

~~~
taneq
I'm working my way through that at the moment and so far it's pitched rather
well (at least for my prior experience), describing things simply and
concisely but still with enough background that it's easy enough to see where
everything fits.

------
rtnyftxx
url is [http://www.deeplearningbook.org/](http://www.deeplearningbook.org/)

~~~
newman314
PDF version is:
[http://www.deeplearningbook.org/front_matter.pdf](http://www.deeplearningbook.org/front_matter.pdf)

~~~
kolencherry
Unfortunately, that's just the ToC and the bibliography. It looks like their
"contract with MIT Press forbids distribution of too easily copied electronic
formats of the book."

~~~
wyldfire
Does that mean there's a fifteen-step asciidoc/gitbook/LaTeX build that is
freely distributed? Because I'm willing to spend a few minutes building a pdf
for the sake of getting good content.

~~~
teraflop
It's pretty easy to just run "Save as PDF" on each chapter individually, and
then stitch them together with:

    pdftk $(ls -tr *.pdf) cat output DeepLearningBook.pdf
Out of respect for the authors' contract, I won't post the resulting file
here, but anyone can reproduce it with about 5 minutes of work.

~~~
throwaway287391
In case anyone else has trouble with the "Save as PDF" part, I did it
successfully using Firefox on OSX: from the print menu, choose "show details"
and change all headers/footers to "--blank--" so there's no ugly URLs/dates
in the corners, then press PDF -> "Save as PDF". (In Chrome, on the other
hand, saving/printing as PDF chopped each page into quarters for me...) At
least in the one chapter I've saved so far, all the math notation renders just
like in the HTML version.

------
liviu-
For anyone interested, Goodfellow is answering questions about the book at:
[https://www.reddit.com/r/MachineLearning/comments/4domnk/the...](https://www.reddit.com/r/MachineLearning/comments/4domnk/the_deep_learning_textbook_is_now_complete/)

------
muyuu
I don't claim to have a solution, but these models of book monetisation really
seem doomed. What are the chances that I will buy this book just because they
made it artificially harder for me to download it? Probably a net negative.

------
phatbyte
Thanks for this. I'm currently re-learning statistics/probability and linear
algebra, so your book will be useful a few months down the line ;)

~~~
knoble
Would you mind sharing any of the resources you are using for re-learning?
I've been meaning to do the same.

~~~
lindbergh
Just saying, but if you want to hop onto the ML bandwagon (for instance),
don't bother going over linear algebra or probability first; instead, just
learn what you need as you go. For example, the first sections of this book
are already devoted to getting you on the right track, and it's somewhat
standard to do so. Besides, there's no need to learn what rotation matrices
are if you won't use them.

~~~
yompers888
As a counterpoint, if parent is interested in taking ML further, a solid
foundation in linear algebra will be huge when more advanced signal processing
applications come up.

------
uptownfunk
This looks great, any other recommendations for enjoyable reads on ml/stat
learning?

\- ESL

\- ISLR

\- Doing Bayesian Data Analysis w/ JAGS/Stan

\- BDA3 - Gelman

\- Probabilistic graphical models

\- Convex analysis - Boyd

\- Advanced Data Analysis from an Elementary Point of View - Shalizi

Trying to build out my library. I have a background in prob/stats/analysis,
measure theory/linear algebra, and knowledge of algorithms and data structures
at the advanced undergrad level, so I'm not too concerned about technical
depth; I just want to enjoy good technical exposition and gain intuition.

------
osoba
Does anybody know how to make the book actually readable?
[http://i.imgur.com/C4rhclk.png](http://i.imgur.com/C4rhclk.png)

~~~
gcr
You can print to a pdf on Chrome. That comes out nicely for me.

~~~
davnicwil
Thanks!

Related to printing - this made me chuckle:

> Printing seems to work best printing directly from the browser, using
> Chrome. Other browsers do not work as well. In particular, the Edge browser
> displays the "does not equal" sign as the "equals" sign in some cases.

Of all the printing bugs for a maths/logic heavy text! Can just visualise the
head banging against the desk upon discovering this one, having struggled with
understanding something for hours - _well ok, either the universe is broken
or... oohhhhhhh_ :-D

------
kkylin
Can any practitioners / experts out there comment on the range of topics? For
example, I understand the book to be introductory, and so the scope is likely
somewhat limited. But how close does it get you to the ANNs currently in use,
at least conceptually if not in complete detail? Thanks!

~~~
ericjang
Non-expert DL practitioner here. The coverage is very good - if I were running
some kind of "Deep Learning onboarding class" for a university or tech
company, this is what I would present.

My favorite aspect of this book is that it provides a graphical models
interpretation of DL methods, which is the most powerful perspective we have
right now to reason about model design (instead of some large black box
function that we train end-to-end without knowing what's in between).

It also explains some fairly recent models and techniques well (VAE, DCGAN,
regularization) that form the basis of more complex architectures. If you
understand the models here, you should be able to understand the design
choices made in more complex architectures.

Thanks to Goodfellow, Bengio, and Courville for this excellent work.

~~~
kkylin
Great, that's very helpful to know. Thanks!

------
wodenokoto
The HTML format is quite peculiar.

It kinda looks like someone ran the original PDF through PDF.js and saved the
rendered output to an HTML file.

~~~
yorwba
The HTML source contains this:

<!-- Created by pdf2htmlEX
([https://github.com/coolwanglu/pdf2htmlex](https://github.com/coolwanglu/pdf2htmlex))
-->

------
MistahKoala
A bit unrelated, but can anybody tell me what typeface is used for the body
text in the PDF?

~~~
gcr
The font is "Computer Modern," the default LaTeX font. Nothing screams "I'm an
academic engineering textbook!" quite like Computer Modern.

~~~
eru
Concrete Roman could compete:
[http://www.tex.ac.uk/FAQ-concrete.html](http://www.tex.ac.uk/FAQ-concrete.html)

~~~
amelius
It is amazing that Knuth still had time left for some mathematics, between
designing fonts :)

~~~
mturmon
(Obligatory: [http://www.xkcd.com/974/](http://www.xkcd.com/974/) -- but
Knuth solved the arbitrary condiments problem, and others besides.)

The _Concrete Mathematics_ book that resulted was a masterpiece in my opinion.
I learned so much about how to computationally _do_ discrete math from that
book. And it's a very elegant package.

------
maxaf
Athena ([http://athenapdf.com/](http://athenapdf.com/)) does a phenomenal job
at turning those HTML pages into convenient PDF files.

~~~
rajeevk
I spent a lot of time converting each page using this website and merging the
PDFs into a single PDF, only to find that Athena hadn't done the conversion
correctly. The diagrams weren't converted properly.

Waste of time!!

------
arbre
Does this book mention attention models?

------
patmcguire
Don't quite get the complaints about it not being available in PDF. "We'll
publish your book, and you can give it away for free as long as you make
people click through to each chapter" is a much, much better deal than I would
expect from a big publisher.

~~~
baltcode
Pretty much ... the only edge case I feel for is having an offline copy in
places with low or expensive internet access. Like parts of the developing
world. Clicking through isn't such a big deal.

~~~
patmcguire
In the old days IE had a feature for saving pages for offline reading that
would ask you how deep to recursively traverse. It was considered hilarious to
set it for 99 on the school dialup connection.

------
jjawssd
Remove Facebook link

------
dandermotj
Somebody please package the html into a pdf!

~~~
bootload
_" Can I get a PDF of this book? No, our contract with MIT Press forbids
distribution of too easily copied electronic formats of the book."_

~~~
dandermotj
Look, I'm not planning on printing the whole thing, binding it and sticking it
on my shelf. I want the pdf because then I can use it when I'm offline, across
devices and search it easily.

If I want this book in hard copy, then I will purchase it - I've done this
regularly with free digital books - but when it is offered free digitally,
restricting it to only certain file formats is, in my opinion, futile (as
evidenced here); such constraints are ineffective attempts to push people
toward the hard copy through inconvenience.

And I must add that this is no slight to the authors, who have my greatest
appreciation for compiling their vast knowledge into a book and offering it
for free. These guys are legends.

~~~
bootload
@dandermotj I understand that; I just included the reason why you cannot get
it. The online book really sucks. I turned the styling off.

------
1024core
Does it cover new(er) topics like Deep Reinforcement Learning, Residual
Networks, Inception nets, etc.?

~~~
kleiba
Have you tried looking at the TOC and index?

------
chatman
If there are restrictions around distribution formats, it is misleading to
call it "free".

~~~
sherjilozair
The FSF does not have a monopoly over the word "free". For a lot of people,
"free" means as in "free beer", which this book is, in the online format.

------
max_
They better release a PDF in the future.

