
Grokking Deep Learning - williamtrask
https://iamtrask.github.io/2016/08/17/grokking-deep-learning/
======
confiscate
Hey William, it's a great book. The beginning was great--it was a great intro.

I think the book is still too long. Some of the passages are huge, with long
blocks of text. There are a lot of filler words in there like "which is a
bummer", and a lot of "say..." dot dot dots. As a reader, you need to spend
mental energy trying to figure out the key point of each paragraph--that would
be fine if the book were dense to begin with, but since you are trying to make
this a concise intro, it would be best to reduce that mental energy requirement
to the absolute minimum.

A bit of feedback: read the paragraphs you wrote, ask yourself "what is
the exact point I am trying to convey here?", and then remove words that can be
removed without taking away from the point. You want each paragraph to be as
short and concise as possible, since that's what makes your book different
from all the other "dense" books out there on the same topic.

If someone in the bookstore opens up your book and "skims", and your
paragraphs are clearly small, with lots of whitespace between them, then
even without reading the text in detail, the person can immediately
tell your book is special and completely different from all the other books
on NNs in the bookstore, and will be more likely to buy it right away.

I think the beginning was great. The use of everyday examples and
analogies made it very quick and simple for a reader to "get" what
you mean. But as the chapters progress, e.g. chapter 3, the text gets denser
and the examples get fewer--in the later chapters it looks like you got more
excited about the technical details, the calculus, etc., and those later
chapters no longer relate to examples as much or stay as concise as
before. They look more and more similar to the other, denser
literature in the field.

In any case, overall it's a great book. It's very unique--I have never seen
this condensed approach before, and it's very special compared to all the
other NN literature out there. It's very refreshing. I'm sure it will inspire
a new generation of machine learning scientists who will remember it for years
to come!

Thanks for writing this!

~~~
paulsutter
An editor. Behind every great writer was an even better editor.

For technical documentation, it's actually better to hire a really good
technical writing editor than a tech writer. Have the engineer spew out the
right ideas, and then the editor puts in the magic that makes it an effective
read. It's an easier process than trying to teach the subject to a tech
writer.

~~~
agibsonccc
We have an O'Reilly book on deep learning dropping next month at Strata + Hadoop
in New York. We've been working on it for a few years now, but yes: can
confirm. Our editor has been amazing. The last 10 feet and minor tweaks have
been the biggest lesson for us in writing this.

(If you click this link: warning: it's not python)

[https://amazon.com/Deep-Learning-Practitioners-Adam-
Gibson/d...](https://amazon.com/Deep-Learning-Practitioners-Adam-
Gibson/dp/1491914254%3FSubscriptionId=AKIAJ2HQNLFCLUOUMLHQ&tag=zippylab-20&linkCode=sp1&camp=2025&creative=165953&creativeASIN=1491914254)

~~~
goldenkey
Warning: if you click the above link you will be cookied with an affiliate
tracker, zippylab-20. Even if you buy other products, they will be notified of
the exact items purchased and will receive a percentage of your purchases for
up to the next 24 hours.

The amazon affiliate panel shows an item by item breakdown of any purchases
made by cookied users, goodbye private purchases.

Here is a non tracked link:

[https://amazon.com/Deep-Learning-Practitioners-Adam-
Gibson/d...](https://amazon.com/Deep-Learning-Practitioners-Adam-
Gibson/dp/1491914254)

~~~
agibsonccc
Oh, good find! I actually just copied and pasted this from my search history.
Thanks for the catch. I will be more careful in the future. Completely my
fault there.

~~~
denzil_correa
You can still edit your link in the earlier comment. Actions always speak
louder.

~~~
agibsonccc
I sadly can't. (It's past the edit window.) @dan is welcome to, though.
Believe me, authors don't make money from their books anyways :P (at least
when not self-publishing).

------
kylek
Ch 1-3 sample at [https://manning-
content.s3.amazonaws.com/download/2/0d079ab-...](https://manning-
content.s3.amazonaws.com/download/2/0d079ab-2402-4c30-85d6-91ba1063cca0/Trask_DeepLearning_MEAP_V01_ch1.pdf)
linked from [https://www.manning.com/books/grokking-deep-
learning](https://www.manning.com/books/grokking-deep-learning)

~~~
haskal
For people who are getting "Access Denied", I have a local copy. Send me an
email at <my username here> at protonmail.com.

EDIT: And reply to this message.

~~~
stevetursi
replying, as requested.. (thanks!)

------
rsp1984
Lots of great comments here already. Just some feedback in the hope that it's
useful:

\- Assume that your target audience is going to be very eager to learn about
DL but has no clue about what exactly to learn or where to even start. That's
why they are buying _your_ book in the first place and not some other, more
dense text.

\- Hence, telling your readers what to learn and where to find more info is
just as important as the subject matter itself. This can be as easy as e.g.
telling the readers about certain keywords that they can use in their Google
searches.

\- The very best texts that I've read on complicated subjects were always
"coarse-to-fine", i.e. give the readers the big picture as early as possible,
then enable them to go into details at their own pace.

\- Conversely, the worst texts that I've read on complicated subjects were
either fine-to-coarse (trying to explain individual components in detail
before getting to the big picture), didn't explain the big picture at all, or
were too verbose in the beginning (slowing down eager readers and killing
their motivation). A good example of the latter is Apple's "Programming with
Objective-C" [1]. Horrible text IMO.

\- Following what was said above, sometimes details aren't even necessary to
include in _your_ text, as long as the readers are confident that they can
find their way around and get the details elsewhere.

\- The very _very_ best texts I've read also always had a motivational
component. For someone who's just starting out, the field looks vast,
unconquerable, and scary. If you show them, in simple words, the boundaries of
the field, which areas the experts are working on, and even where current
research is struggling, you give confidence and trajectory to your readers, so
they can strive to become experts too.

[1]
[https://developer.apple.com/library/mac/documentation/Cocoa/...](https://developer.apple.com/library/mac/documentation/Cocoa/Conceptual/ProgrammingWithObjectiveC/Introduction/Introduction.html)

~~~
williamtrask
Excellent feedback! Thank you so much!

~~~
wibr
Just FYI, there are still a few typos in the sample pdf, I think a spell
checker might be useful?

"A neural network learns a function. This might seem confusing since I just
told you that it is a funtion. However, every neural network starts out
predicting randomly. In other words, our starting weight values are random...
thus our function predicts randomly. It's a random function. As you may
remember from the previous chapter, a neural network learns how to take an
input dataset and convert it into an output dataset. For example, it might
take an input dataset of Farenheit temperatures and learn to convert it into
an output dataset of Celsius temperatures. It might covert a pixel values
dataset"

funtion, Farenheit, covert

Edit: "We just take each weight... compute its affect on the error... and move
it in the right direction so that the error goes down (to 0)."

affect vs effect
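For readers skimming the thread, the step that quoted sentence describes can be sketched roughly like this (a minimal illustration, not the book's actual code; the squared-error setup and the names here are assumptions):

```python
# One gradient-descent step for a single weight, assuming the simple
# setup prediction = input * weight and error = (prediction - goal) ** 2.
def gradient_descent_step(weight, input, goal, alpha=0.1):
    prediction = input * weight
    delta = prediction - goal             # how far off the prediction is
    weight_delta = delta * input          # this weight's effect on the error
    return weight - alpha * weight_delta  # move weight so the error goes down

# Repeating the step drives the error toward 0.
weight = 0.0
for _ in range(100):
    weight = gradient_descent_step(weight, input=0.5, goal=0.8)
```

(The factor of 2 from differentiating the squared error is folded into `alpha`, a common simplification.)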

------
dramaqueen
There's another book from this publisher called Grokking Algorithms. That one
left me very impressed. Usually, I don't care for any
"simplified/easified/dumbed-down" books because they often feel like a
compilation of buzzwords with all the important bits removed. I thought
Grokking Algorithms was simple, yet very meaty/substantial if that makes
sense.

~~~
williamtrask
I believe all the "Grokking" books have the same editor with very high
standards. :)

------
meursault334
The discussion of gradient descent was excellent. So far I'm quite impressed.
As others have said the question is going to be whether you can succeed at
building on this base in a way that makes the later topics accessible.

A nitpick is that you use the words "matrices" and "differentiable" at the end
of chapter 2. Maybe this is okay because you are signposting that these
concepts will be explained but if you are aiming for high school algebra level
readers with some python experience this could intimidate people.

~~~
radicality
He did say 'high school math' and not just algebra. It would be a pretty crap
high school math course if it didn't cover linear algebra and beginner-
intermediate calculus.

------
asah
Read sample chapter. Indeed, it's shockingly easy without being obvious. Hats
off.

~~~
giardini
Well, it should be, since there's nothing of consequence in it.

IMO this topic in Ycombinator is being heavily blogspammed on this book and
the topic should be deleted.

------
_raoulcousins
I get that paying for the MEAP will eventually get you access to all of the
chapters, but it seems a little steep at this point. I'd be a lot more willing
to pay, say, $10 for access to _only_ the first three chapters, and then pay more
if I get hooked. I'm guessing that isn't possible.

It also stung a bit that the link said 'Click To See the First Few Chapters'
when in fact you click to see the first chapter and pay for the rest.

~~~
williamtrask
Not exactly the same, but here's a 50% off coupon code for the whole thing, so
only $20 for the whole book.

"mltrask"

~~~
williamtrask
(expires August 26th)

------
williamtrask
50% off coupon code!! (expires Aug 26)

"mltrask"

------
hayksaakian
Has anyone here read the book? what do you think about it, does it deliver on
the promise?

~~~
williamtrask
Author here. The first 3 chapters are in pre-publication. It is my hope that
people are willing to check out those chapters and help me refine them in any
way they don't live up to the promise. Anyone who does, feel free to reach
out to me @iamtrask or via the book's forum.

------
MehdiHK
I know it is silly and has probably been taken care of already. But just in
case, from the sample in chapter 3, the parameter names don't match the
variables used in the body:

    
    
      def neural_network(input_data, weight): 
        prediction = input * knob_weight 
        return prediction
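For what it's worth, one consistent version of that snippet (purely illustrative, renaming so the parameters match the body) would be:

```python
def neural_network(input, knob_weight):
    # parameter names now match the variables used inside the function
    prediction = input * knob_weight
    return prediction
```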

------
iamcreasy
To the author : What else do I need to know besides basic python and algebra?
Since I am not a python programmer, can I translate the theories into other
languages easily?

~~~
oelmekki
I've been learning about neural networks lately and implemented mine in Go.
The biggest problem I had was that python is not chosen randomly: neural
network researchers use it because of numpy.

Most importantly, numpy makes it really easy to deal with matrices (~ arrays
of arrays). You just perform operations on them as if they were plain numbers
(so you can do `a + b`, where both a and b are matrices).

While translating it into other languages that don't have numpy is indeed
possible, expect a bit of intellectual gymnastics.
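As a minimal sketch of what that convenience looks like (the array values here are made up for illustration):

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([[10.0, 20.0],
              [30.0, 40.0]])

# Operations read like plain arithmetic, applied cell by cell.
c = a + b      # [[11., 22.], [33., 44.]]
d = a * 2.0    # every cell doubled
```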

I don't know which language you targeted, but for those who wish to use
golang, I made this matrix library:
[https://github.com/oelmekki/matrix](https://github.com/oelmekki/matrix)

EDIT: oh, btw, I'm originally a ruby dev. I learnt just enough python to be
able to understand NN code, and that was easy (took me an afternoon). I won't
pretend this makes me a python developer, but learning just enough to
translate code into another language is straightforward.

~~~
iamcreasy
I plan to use Java since that's what I am most familiar with. I googled a
bit and I think there are a couple[1] of Java libraries that handle
n-dimensional arrays. Let's hope it will work out.

[1] [http://nd4j.org/](http://nd4j.org/)

~~~
oelmekki
It seems like nd4j has everything you need. Just remember, when you see calls
to the `np.dot()` function in python's numpy, that it is the "dot product"
operation on matrices, also known as standard mathematical matrix
multiplication, which nd4j in turn calls `mmul` ("matrix multiplication",
in the "Linear Algebra Operations" section).

When numpy uses the * operator between two matrices, it just does a
cell-by-cell multiplication,
ie:

    
    
        result_matrix[x][y] = matrix1[x][y] * matrix2[x][y]
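A small numpy sketch of the two operations side by side (illustrative values; the `np.dot` line is the one that corresponds to nd4j's `mmul`):

```python
import numpy as np

m1 = np.array([[1, 2],
               [3, 4]])
m2 = np.array([[5, 6],
               [7, 8]])

elementwise = m1 * m2       # cell by cell: [[ 5, 12], [21, 32]]
matmul = np.dot(m1, m2)     # matrix multiplication: [[19, 22], [43, 50]]
```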

------
e19293001
Thank you for writing this, especially for recommending memorization. I used
to do this, though I never thought somebody else would be doing it to grok
something, since it would seem like a crazy idea to others. I was really
surprised by that.

I hope you will use spaced repetition, so that the reader has a base from
which to move on to a deeper level of understanding. Can't wait to buy this
book soon.

------
p1esk
How is it different from Michael Nielsen's book?

------
purplecpa
I'm interested but thought scikit-learn was the go to for Python machine
learning. Is there a reason there is no mention of it?

~~~
verandaguy
scikit-learn would likely fall into the category of "black box" frameworks the
author mentions. If I understand correctly, this book will let the reader gain
an understanding of the underlying algorithms from an intuitive standpoint.

~~~
lmcinnes
It is worth noting that scikit-learn does value clear understandable
implementations, so you can actually pop open the source code and expect to
find something other than a black box. Now, in many cases you'll have
optimization work that means a slightly less obvious approach is taken, but
the scikit-learn maintainers do work hard to try and ensure that, if you want
to learn, you should be able to open up the code and do so.

------
MehdiHK
Is there any chance you (William) could add hand-drawn illustrations and
flowcharts? :) That makes a book outright welcoming IMHO. I loved that style
in Grokking Algorithms (or even in Getting Started in Electronics by Forrest
M. Mims III, if you want a distant example).

Thanks for your initiative; I will buy the book as soon as it is available.

------
kp25
Is there a subscription form, so that interested users can be notified when
the book is ready?

~~~
williamtrask
i'll tweet it out when it's done @iamtrask

------
lasfter
Hey William! Just wanted to point out a typo on the "The Solution." paragraph
on the front page.

"Everything you need to know to undrstand Deep Learning will be explained like
you would to a 5 year old, including..."

Love the intro

~~~
williamtrask
Fixed! Thank you so much!

------
kevindeasis
I don't usually buy programming books. But when I do it's usually with
manning. I think this will be the first time I will actually learn about deep
learning. Anyone got one of those 50% off coupons?

~~~
vollmond
[https://news.ycombinator.com/item?id=12312803](https://news.ycombinator.com/item?id=12312803)

------
bordercases
They should have called it "Learning Deep Learning Deeply"

~~~
sauwan
"Deep Learning, Deeply Learned"

~~~
curuinor
It's actually quite a shallow treatment. That's OK. Presumably, most folks who
just want to use these things don't want to know about transience in chaotic
attractors, or VC dimension and Rademacher complexity, and stuff like that.

------
vgoklani
Thanks Andrew! I really enjoy reading your blog, buying the book is just a way
of saying thanks :) looking forward to your future work!

~~~
williamtrask
Thank you so much! Very kind words.

------
giardini
Where's the beef? The only "chapter" is 5 pages of fluff telling me why I
should read the book.

This should be flagged as blogspam.

~~~
egonschiele
You have left tons of negative comments throughout this post[1]. We get it:
you don't like the book. Writing a book is not profitable; Trask is going to
spend a lot of time just to help out other folks and share his knowledge. Be
aware that he is trying to do something nice here, and try to empathize.

[1]
[https://news.ycombinator.com/threads?id=giardini](https://news.ycombinator.com/threads?id=giardini)

~~~
giardini
"We get it, you don't like the book."

No, you don't "get it". And I may "love the book", if I ever read the supposed
"book".

I believe this is merely a marketing test for a book which likely does not yet
exist and news.ycombinator is not a forum for market testing or advertising
for books, even books on NN.

Furthermore, a post in trask's defense by another author who also has a book
published by Manning is most unsavory. Publishers, authors, and their agents
should cease spamming news.ycombinator.

~~~
egonschiele
His post has 455 points. I think that's a clear sign that people are actually
interested in this book, and it's not some publisher scam.

Yup, I have written a book with Manning (Grokking Algorithms), which is why I
commented about empathy towards authors.

~~~
giardini
Your comment has caused me to review the Hacker News Guidelines, and I find
that I have violated no fewer than two, and now possibly three, of the
guidelines, in particular the following:

"Please don't submit comments complaining that a submission is inappropriate
for the site. If you think a story is spam or off-topic, flag it by clicking
on its 'flag' link. If you think a comment is egregious, click on its
timestamp to go to its page, then click 'flag' at the top. (Not all users see
flag links; there's a small karma threshold.)"

"If you flag something, please don't also comment that you did."

....

"Please resist commenting about being downvoted. It never does any good, and
it makes boring reading."

I apologize to all for these violations.

Nonetheless, I find the initial "chapter" of the aforementioned "book" to be
void of significant content, and I will await the full publication before
spending any money.

~~~
egonschiele
That's great! Thanks for taking the time to read through that :)

------
reachtarunhere
This is very much like the Collective Intelligence book, but targeting deep
learning. I am really excited. The draft seems pretty good.

------
jray
Publication in Spring 2017 (estimated)

:(

~~~
williamtrask
that's very conservative... i hope it is published long before then :)

------
shostack
I wish more than the first chapter were free so I could get a sense of the
"meat" of the book vs just the intro. I'd even be willing to give up my email
for it to be notified of new chapters (hint hint).

~~~
williamtrask
It was the first 3 chapters until about 6am this morning. I can offer a 50%
off coupon code though (fwiw)

"mltrask"

~~~
shostack
I appreciate the offer, but I'd much rather just be able to see at least
Chapter 2 or something like that. If it seems like the learning style is my
cup of tea, I'd definitely buy it, even at full price. I just need to be sure
the way lessons are presented aligns with my learning style, as I'm a bit
picky there. Unfortunately there isn't much taught in Chapter 1.

~~~
williamtrask
shoot me an email at liamtrask@gmail and i'll hook you up

~~~
shostack
Email sent, thanks for being flexible!

------
adamwi
First chapter was a great read! Will the following chapters be released all at
once, or one by one as they are ready?

~~~
giardini
WTF, how is it a good read? Rah-rah stuff on why one should study NNs? Geez!

------
OliverD
Bought it thank you. Looking forward to reading the whole book :)

~~~
giardini
I fear you're in for a long wait. The only visible part currently is 5 pages
of "rah-rah - I love NNs".

------
jamesfisher
Is everyone here buying the $40 ebook?

------
giardini
You're off to a bad start by spamming news.ycombinator.

~~~
williamtrask
Shoot me an email and I can make it happen for you. liamtrask@ gmail

(but i don't control the website download offer)

------
th0ma5
Is using GitHub pages for promoting and selling a non open source project
really appropriate?

~~~
williamtrask
If it's not, i'm happy to change it. It's just my blog.

------
kkotak
Please change the title to say it's not free/open source.

~~~
reactor
Since when does everything posted on HN have to be free/open source?

------
molikto
I know DL, but will never use Python... 23333

~~~
Onewildgamer
Python feels like pseudocode and is a popular language among hackers and
data scientists. As a result, there are more examples and tutorials to get
started with; that's one of the chief reasons many choose it to learn new
concepts.

------
avindroth
Gonna play Devil's Advocate here: is this the correct way to lower the barrier
to entry?

This is like trying to teach monads without having taught lambda calculus,
functors, and applicatives.

There is a clear order to knowledge, and people should master the books
dealing with prereqs if they want to grok deep learning.

Not jump into deep learning, just because it's the hot shit.

Part of these efforts to cheaply popularize CS makes computer science look
not like a real field, but just a fad.

Nobody would write a book called "Grokking Quantum Physics" claiming that
explaining quantum mechanics like "you are a five year old" will somehow
cover for the necessary mastery of classical physics.

If you read such a book and think you understand quantum physics, you are
terribly misguided.

Dunning-Kruger addicts people to feeling like they have mastered a subject
without putting in the effort.

A little learning is a dangerous thing.

~~~
minimaxir
Making the field more accessible allows for more people to get involved and
contribute new ideas back to the field, which benefits everyone.

~~~
master_yoda_1
It's not about making it accessible; it's about writing baseless, wrong
things. The book's author will do more harm than good for a general audience.
It's a total waste of time to read this book. Our time is limited; better to
spend it on something useful and correct.

