Grokking Deep Learning (iamtrask.github.io)
530 points by williamtrask on Aug 18, 2016 | 112 comments

Hey William it's a great book. The beginning was great--it was a great intro.

I think the book is still too long. Some of the passages are huge, with long blocks of text. There are a lot of filler words in there like "which is a bummer", and a lot of "say..." dot dot dots. As a reader, you need to spend mental energy figuring out the key point you are trying to convey in each paragraph. This would be fine if the book were dense to begin with, but since you are trying to make this a concise intro, it would be best to reduce this mental energy requirement to the absolute minimum.

A bit of feedback: read the paragraphs you wrote, ask yourself "what is the exact point I am trying to convey here?", and then remove the words that can be removed without taking away from the point. You want each paragraph to be as short and concise as possible, since that's what makes your book different from all the other "dense" books out there on the same topic.

If someone in the book store opens up your book and "skims", if your paragraphs are clearly small, with lots of whitespace between paragraphs, then even without reading the text content in detail, the person can immediately tell your book is special, and completely different from all the other books on NN in the bookstore, and will be more likely to buy it right away.

I think at the beginning it was great. The use of everyday examples and analogies made it very quick and simple for someone reading it to "get" what you mean. But as the chapters progress, e.g. chapter 3, the text gets denser and the examples get fewer. In the later chapters it looks like you got more excited about the technical details, the calculus etc., and the later chapters no longer seem to relate to examples as much, nor are they as concise as before. The later chapters look more and more similar to the denser literature in the field.

In any case, overall it's a great book. It's unique: I have never seen this condensed approach before, and it's very special compared to all the other NN literature out there. It's very refreshing. I'm sure it will inspire a new generation of machine scientists who will remember this for years to come!

Thanks for writing this!

An editor. Behind every great writer was an even better editor.

For technical documentation, it's actually better to hire a really good technical writing editor than a tech writer. Have the engineer spew out the right ideas, and then the editor puts in the magic that makes it an effective read. It's an easier process than trying to teach the subject to a tech writer.

We have an O'Reilly book on deep learning dropping next month at Strata Hadoop in New York. We've been working on it for a few years now, but yes: can confirm. Our editor has been amazing. The last 10 ft and minor tweaks have been the biggest lesson for us in writing this.

(If you click this link: warning: it's not python)


Warning: if you click the above link you will be cookied with an affiliate tracker, zippylab-20. Even if you buy other products, they will be notified of the exact items purchased and will receive a percentage of your purchases for up to the next 24 hours.

The amazon affiliate panel shows an item by item breakdown of any purchases made by cookied users, goodbye private purchases.

Here is a non tracked link:


Oh, good find! I actually just copied and pasted this from my search history. Thanks for the catch. I will be more careful in the future. Completely my fault there.

You can still edit your link in the earlier comment. Actions always speak louder.

I sadly can't (it's past the edit window). @dan is welcome to, though. Believe me, authors don't make money from their books anyway :P (at least when not self-publishing).

> agibsonccc 9 hours ago [-]

> goldenkey 3 hours ago [-]

so no he can't.

So now we all need to delete our cookies....

This looks great! Any chance this book is available before then via O'Reilly Early Release etc? Even just the ToC would be nice.

Every so often I see something insightful, and surprising, and this is one of them :)

When I saw the premise of the book I was initially turned off, fearing an attempt at trivialization of the ML subject matter, but after reading the intro I kind of like it. It's intuitive and would work well as an introduction for hackers.

It seems a good way into ML is to hack your way around libraries until you get a feel for it, and only after that start reading up on theory or taking some ML classes. The other way around is dry.

That's one of the nicest examples of constructive criticism that I've come across in a long time. Bookmarked for future reference to help me become better at this.

Thank you for writing this!

I'll second that

Thank you so much! Consider it done.

For people who are getting "Access Denied", I have a local copy. Send me an email at <my username here> at protonmail.com.

EDIT: And reply to this message.

replying, as requested.. (thanks!)

Just sent you an email.

Hi! Me too, thanks.

That comes back as 'access denied' for me.

Why did it get removed? I can only seem to view the first chapter now. Which really isn't enough to tell me if I'll like this or not.

Can't edit my post - but the links did work at first, sorry!

I get:

<Error> <Code> AccessDenied </Code> <Message> Access Denied </Message> <RequestId> 0EDEA284D5DB49F8 </RequestId> <HostId> XFsWH6DAZg2XergKFcQzu6qYFMvMd71dLlmHPHK9I0zQAwCk4B7c8pVbC499SAWMWTkuyGZo/Q0= </HostId> </Error>

Access denied

Lots of great comments here already. Just some feedback in the hope that it's useful:

- Assume that your target audience is going to be very eager to learn about DL but have no clue about what exactly to learn or where to even start. That's why they are buying your book in the first place and not some other more dense text.

- Hence, telling your readers what to learn and where to find more info is just as important as the subject matter itself. This can be as easy as e.g. telling the readers about certain keywords that they can use in their Google searches.

- The very best texts that I've read on complicated subjects were always "coarse-to-fine", i.e. give the readers the big picture as early as possible, then enable them to go into details at their own pace.

- Conversely, the worst texts that I've read on complicated subjects were either fine-to-coarse (trying to explain individual components in detail before going to the big picture), did not explain the big picture at all, or were too verbose in the beginning (slowing down the eager readers and killing their motivation). A good example of the latter is Apple's "Programming with Objective-C" [1]. Horrible text IMO.

- Following on from the above, sometimes details aren't even necessary to include in your text, as long as the readers are confident that they can find their way around and get the details elsewhere.

- The very, very best texts I've read also always had a motivational component. For someone who's just starting out, the field looks vast, unconquerable and scary. If you show them, in simple words, the boundaries of the field, which areas the experts are working on, and even where current research is struggling, you help give confidence and trajectory to your readers, so they can strive to become experts too.

[1] https://developer.apple.com/library/mac/documentation/Cocoa/...

Excellent feedback! Thank you so much!

Just FYI, there are still a few typos in the sample pdf, I think a spell checker might be useful?

"A neural network learns a function. This might seem confusing since I just told you that it is a funtion. However, every neural network starts out predicting randomly. In other words, our starting weight values are random... thus our function predicts randomly. It's a random function. As you may remember from the previous chapter, a neural network learns how to take an input dataset and convert it into an output dataset. For example, it might take an input dataset of Farenheit temperatures and learn to convert it into an output dataset of Celsius temperatures. It might covert a pixel values dataset"

funtion, Farenheit, covert

Edit: "We just take each weight... compute its affect on the error... and move it in the right direction so that the error goes down (to 0)."

affect vs effect

There's another book from this publisher called Grokking Algorithms. That one left me very impressed. Usually, I don't care for any "simplified/easified/dumbed-down" books because they often feel like a compilation of buzzwords with all the important bits removed. I thought Grokking Algorithms was simple, yet very meaty/substantial if that makes sense.

I believe all the "Grokking" books have the same editor with very high standards. :)

The discussion of gradient descent was excellent. So far I'm quite impressed. As others have said the question is going to be whether you can succeed at building on this base in a way that makes the later topics accessible.

A nitpick is that you use the words "matrices" and "differentiable" at the end of chapter 2. Maybe this is okay because you are signposting that these concepts will be explained but if you are aiming for high school algebra level readers with some python experience this could intimidate people.

He did say 'high school math' and not just algebra. It would be a pretty crap high school math course if it didn't cover linear algebra and beginner-intermediate calculus.

Read sample chapter. Indeed, it's shockingly easy without being obvious. Hats off.

Well, it should be, since there's nothing of consequence in it.

IMO this topic in Ycombinator is being heavily blogspammed on this book and the topic should be deleted.

Thank you so much!

I get that paying for the MEAP will eventually get you access to all of the chapters, but it seems a little steep at this point. I'd be a lot more willing to pay, say, $10 for access to only the first three chapters, and then pay more if I get hooked. I'm guessing that isn't possible.

It also stung a bit that the link said 'Click To See the First Few Chapters' when in fact you click to see the first chapter and pay for the rest.

Thanks for the feedback. I'll update the link title to be more precise.

Also, FWIW, I'm offering free Q/A for feedback on those chapters (assuming I don't get totally overwhelmed).

Having gone through the first chapter, I agree, if something like what the poster above mentioned is possible, that'd be preferable for my situation as well. Just my 2 cents.

Here's a 50% off coupon code, so only $20 for the whole thing


(that expires Aug 26 btw)

Not exactly the same, but here's a 50% off coupon code for the whole thing, so only $20 for the whole book.


(expires August 26th)

Nice! Bought it.

50% off coupon code!! (expires Aug 26)


Has anyone here read the book? What do you think about it? Does it deliver on the promise?

Author here, the first 3 chapters are in pre-publication. It is my hope that people are willing to check out said chapters and help me refine it in any ways it doesn't live up to the promise. Anyone who does, feel free to reach out to me @iamtrask or via the book's Forum.

To the first question, apparently "no", since the book is likely not yet written.

Welcome to the internet.

I know it is silly and has probably been taken care of already, but just in case, from the sample in chapter 3:

  def neural_network(input_data, weight): 
    prediction = input * knob_weight 
    return prediction
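For anyone skimming: the body of that snippet refers to `input` and `knob_weight`, while the parameters are named `input_data` and `weight`, so it would not run as printed. A minimal corrected sketch, assuming the intent is simply "input times weight" (that is my reading, not necessarily the book's final text, and the sample values are made up):

```python
def neural_network(input_data, weight):
    # Scale the single input by the single weight to get the prediction.
    prediction = input_data * weight
    return prediction


# Hypothetical sample values, purely for illustration.
print(neural_network(8.5, 0.1))
```

The only change is making the names in the body match the parameter names.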

To the author : What else do I need to know besides basic python and algebra? Since I am not a python programmer, can I translate the theories into other languages easily?

I've been learning about neural networks lately and implemented mine in golang. The biggest problem I had was that the choice of python is not arbitrary: neural network researchers use it because of numpy.

Most importantly, numpy makes it really easy to deal with matrices (~ array of arrays). You just make operations on them as if they were classic numbers (so, you can do `a + b`, where both a and b are matrices).
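To make the `a + b` point concrete, here is a small sketch (the array values are made up for illustration) showing numpy adding two matrices cell by cell, next to what the same thing might look like with plain nested lists:

```python
import numpy as np

# Two 2x2 matrices with made-up values.
a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])

# With numpy, matrices add like ordinary numbers (element by element).
total = a + b

# The same operation written by hand on plain lists of lists.
a_list = [[1, 2], [3, 4]]
b_list = [[10, 20], [30, 40]]
total_list = [
    [x + y for x, y in zip(row_a, row_b)]
    for row_a, row_b in zip(a_list, b_list)
]

print(total)
print(total_list)
```

The hand-written version is roughly what you end up reimplementing in a language without a numpy equivalent.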

Translating it into languages that don't have numpy is indeed possible, but expect a bit of intellectual gymnastics.

I don't know which language you targeted, but for those who wish to use golang, I made this matrix library: https://github.com/oelmekki/matrix

EDIT: oh, btw, I'm initially a ruby dev. I've learnt python just enough to be able to understand NN code, and that was easy (took me an afternoon). I won't pretend this makes me a python developer, but learning just enough to translate code into another language is straightforward.

I plan to use Java since that's what I am most familiar with. I googled a bit and I think there are a couple[1] of Java libraries for handling n-dimensional arrays. Let's hope it works out.

[1] http://nd4j.org/

It seems like nd4j has everything you need. Just remember, when you see calls to the `np.dot()` function in python's numpy, that on matrices (2-D arrays) it is the "dot product" operation, also known as standard mathematical matrix multiplication, which nd4j in turn calls `mmul` ("matrix multiplication", in the "Linear Algebra Operations" section).

When numpy uses the * operator between two matrices of the same shape, it basically just does a cell-by-cell multiplication:


    result_matrix[x][y] = matrix1[x][y] * matrix2[x][y]
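A quick sketch of the difference, with made-up 2x2 values: `*` multiplies matching cells, while `np.dot()` on two 2-D arrays does row-by-column matrix multiplication.

```python
import numpy as np

# Two 2x2 matrices with made-up values.
m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[5, 6], [7, 8]])

# Cell-by-cell product: result[x][y] = m1[x][y] * m2[x][y]
elementwise = m1 * m2

# Matrix multiplication: each result cell is a row of m1 dotted with a column of m2.
matmul = np.dot(m1, m2)

print(elementwise)  # [[ 5 12] [21 32]]
print(matmul)       # [[19 22] [43 50]]
```

When porting to another library, it is worth checking which of the two a given call corresponds to, since mixing them up usually still runs but gives wrong numbers.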

Hmm, you certainly can, but many of the intuitions come from reading little bits of python code. It's intuitive but I'd recommend doing a python tutorial first.

In a similar vein, is it using python specific libraries or would the examples be easily portable to other languages?

Not the author, but in the blog post he says he's using [numpy](http://www.numpy.org/). With Google you might find similar libraries for the language you want to use; with a quick search I just found a Quora post with a few similar libraries listed for C++.

what are those libraries? Thanks.

Thank you for writing this, especially for recommending memorization. I used to do this, though I never thought somebody else would do it to grok something, since it would seem a crazy idea to others. I was really surprised by that.

I hope you will use spaced repetition, so that the reader has a base from which to move on to a deeper level of understanding. Can't wait to buy this book soon.

How is it different from Michael Nielsen's book?

I'm interested but thought scikit-learn was the go to for Python machine learning. Is there a reason there is no mention of it?

scikit-learn doesn't have a strong neural network codebase -- for anything not NN based they've largely got you covered (along with good infrastructure tooling for pipelines, cross validation, hyper-parameter searching etc.). Contrary to the impression you may get if you only follow the current buzzwords there is a great deal of value in machine learning right now beyond NNs and deep learning. On the other hand if deep learning is what you want to do, scikit-learn is not currently the best library for that.

scikit-learn would likely fall into the category of "black box" frameworks the author mentions. If I understand correctly, this book will let the reader gain an understanding of the underlying algorithms from an intuitive standpoint.

It is worth noting that scikit-learn does value clear understandable implementations, so you can actually pop open the source code and expect to find something other than a black box. Now, in many cases you'll have optimization work that means a slightly less obvious approach is taken, but the scikit-learn maintainers do work hard to try and ensure that, if you want to learn, you should be able to open up the code and do so.

couldn't have said it better myself :)

Is there any chance you (William) could add hand-drawn illustrations and flow-charts? :) This makes a book outright welcoming IMHO. I loved that style in Grokking Algorithms (or even in Getting Started in Electronics by Forrest M. Mims III, if you want a distant example).

Thanks for your initiative, will buy the book as soon as it is available.

Is there a subscription form, so that interested users can be notified when the book will be ready ?

I'll tweet it out when it's done @iamtrask

Hey William! Just wanted to point out a typo on the "The Solution." paragraph on the front page.

"Everything you need to know to undrstand Deep Learning will be explained like you would to a 5 year old, including..."

Love the intro

Fixed! Thank you so much!

I don't usually buy programming books. But when I do, it's usually from Manning. I think this will be the first time I will actually learn about deep learning. Anyone got one of those 50% off coupons?

> Anyone got one of those 50% off coupons?

Googling around usually gets you at least 39% off. Here is one that worked for me: ctwgeopytw

Signing up for their email usually gets you one of those soon after the book is announced and once close to publication date.

They should have called it "Learning Deep Learning Deeply"

"Deep Learning, Deeply Learned"

It's actually quite a shallow treatment. That's OK. Presumably, most folks who just want to use the things don't want to know about transience in chaotic attractors or VC-dimension and Rademacher complexity and stuff like that.

Thanks Andrew! I really enjoy reading your blog, buying the book is just a way of saying thanks :) looking forward to your future work!

Thank you so much! Very kind words.

Where's the beef? The only "chapter" is 5 pages of fluff telling me why I should read the book.

This should be flagged as blogspam.

You have left tons of negative comments throughout this post[1]. We get it, you don't like the book. Writing a book is not profitable; Trask is going to spend a lot of time just to help out other folks and share his knowledge. Be aware that he is trying to do something nice here, and try to empathize.

[1] https://news.ycombinator.com/threads?id=giardini

"We get it, you don't like the book."

No, you don't "get it". And I may "love the book", if I ever read the supposed "book".

I believe this is merely a marketing test for a book which likely does not yet exist and news.ycombinator is not a forum for market testing or advertising for books, even books on NN.

Furthermore, a post in trask's defense by another author who also has a book published by Manning, is most unsavory. Publishers and authors and their agents should cease spamming news.ycombinator.

His post has 455 points. I think that's a clear sign that people are actually interested in this book, and it's not some publisher scam.

Yup, I have written a book with Manning (Grokking Algorithms), which is why I commented about empathy towards authors.

Your comment has caused me to review the Hacker News Guidelines and I find that I have violated no less than two, and now, possibly three, of the guidelines, in particular the following:

"Please don't submit comments complaining that a submission is inappropriate for the site. If you think a story is spam or off-topic, flag it by clicking on its 'flag' link. If you think a comment is egregious, click on its timestamp to go to its page, then click 'flag' at the top. (Not all users see flag links; there's a small karma threshold.)"

"If you flag something, please don't also comment that you did."

"Please resist commenting about being downvoted. It never does any good, and it makes boring reading."

I apologize to all for these violations.

Nonetheless I find the initial "chapter" of the aforementioned "book" to be void of significant content and await the full publication before spending any money.

That's great! Thanks for taking the time to read through that :)

This is very much like the Collective Intelligence book, but targeting deep learning. I am really excited. The draft seems pretty good.

Publication in Spring 2017 (estimated)


That's very conservative... I hope it is published long before then :)

I wish more than the first chapter were free so I could get a sense of the "meat" of the book vs just the intro. I'd even be willing to give up my email for it to be notified of new chapters (hint hint).

It was the first 3 chapters until about 6am this morning. I can offer a 50% off coupon code though (fwiw)


I appreciate the offer, but I'd much rather just be able to see at least Chapter 2 or something like that. If it seems like the learning style is my cup of tea, I'd definitely buy it, and even at full price. I just need to be sure the way lessons are presented align with my learning style as I'm a bit picky there. Unfortunately there isn't much taught in Chapter 1.

Shoot me an email at liamtrask@gmail and I'll hook you up.

Email sent, thanks for being flexible!

First chapter was a great read! Will the following chapters be released all at once or one by one when they are ready?

WTF is it a good read? Rah-rah stuff on why one should study NNs? Geez!

Bought it thank you. Looking forward to reading the whole book :)

I fear you're in for a long wait. The only visible part currently is 5 pages of "rah-rah - I love NNs".

Is everyone here buying the $40 ebook?

You're off to a bad start by spamming news.ycombinator.

We detached this subthread from https://news.ycombinator.com/item?id=12312791 and marked it off-topic.

Well, giardini, I'm not a bot. My offer to send you 3 chapters still stands. Sorry you're not happy. I also have plenty of free educational materials on Deep Learning on the above linked blog which you can use to rate my writing. If there's anything else I can do for you, please let me know.

Shoot me an email and I can make it happen for you. liamtrask@ gmail

(but I don't control the website download offer)


I haven't made any conclusions about williamtrask but I'm getting there with you. Your behavior in this thread is deplorable.

jacquesm: "Your behavior in this thread is deplorable."

Why, may I ask?

> I've concluded williamtrask is a bot - it really doesn't seem to "get" the point!

You're insulting one of your peers here. Insults, clever or otherwise do not belong on HN, especially not in threads where those peers offer their works for - limited - review.

If you're shocked by someone writing a book that has an actual audience here, and even more shocked that they would have the temerity to charge for their work, and if you complain they won't give you 'access' and then insult them to boot when they offer to do just that, you go from 'clever' to 'asshole'.

I've flagged your comments and would really appreciate it if you found it in you to apologize to the topic starter, subthreads like these make me sad.

Is using GitHub pages for promoting and selling a non open source project really appropriate?

If it's not, I'm happy to change it. It's just my blog.

Why wouldn't it be? He's just using it as a blog.

Please change the title to say it's not free/open source.

Since when should everything posted on HN be free/open source?

I know DL. but will never use Python... 23333

Python feels like pseudocode and is a popular language among hackers and data scientists. As a result, there are more examples and tutorials to get started with, which is one of the chief reasons many choose it to learn new concepts.

Gonna play Devil's Advocate here: is this the correct way to lower the barrier to entry?

This is like trying to teach monads without having taught lambda calculus, functors, and applicatives.

There is a clear order to knowledge, and people should master the books dealing with prereqs if they want to grok deep learning.

Not jump into deep learning, just because it's the hot shit.

Part of these efforts to cheaply popularize CS makes computer science not a real field. Just a fad.

Nobody will write a book called "Grokking Quantum Physics" claiming that explaining quantum like "you are a five year old" will somehow cover for the necessary mastery of classical physics.

If you read such a book and think you understand quantum physics, you are terribly misguided.

Dunning-Kruger addicts people to feeling like they have mastered a subject, without putting in the effort.

Little learning is a dangerous thing.

Making the field more accessible allows for more people to get involved and contribute new ideas back to the field, which benefits everyone.

It's not about making it accessible, it's about writing baseless, wrong things. The book's author will do more harm than good for a general audience. It's a total waste of time to read this book. Our time is limited; better to spend it on something useful that is correct.
