Hacker News
Books every self-taught computer scientist should read (themoritzfamily.com)
229 points by ericmoritz on Feb 15, 2012 | 76 comments

Speaking as an actually credentialed computer scientist, this list is ridiculous.

First and foremost, only one of these books is actually about theoretical computer science, and even then.

Secondly, (feel free to disagree on this one) K&R is recommended more as a prestige book. So much of C is in the toolkit and libraries that I think it's a little silly to be recommending a 30-year-old intro that is actually kind of hard to read.

Thirdly, this question comes up all the time. Here's an actually serious version of this question: http://cstheory.stackexchange.com/questions/3253/what-books-...

If you only care about issues that are actually practical to your life as a programmer, give this list a shot: http://news.ycombinator.com/item?id=3320813

Addendum: Although I'm credentialed as a CompSci, I really work as an engineer. The difference? Scientists read papers (Avi Bryant) http://vimeo.com/4763707 and think about the nature of our work (Greg Wilson) http://vimeo.com/9270320

I apologize for any offense that I may have given by using the phrase "self-taught computer scientist".

This article is my personal list of books that I often recommend to junior developers that are smart but didn't complete a degree and often find their knowledge of basic computer science lacking.

To be fair, I didn't take offense to "self-taught computer scientist". I took offense to "every self-taught computer scientist".

I'll be the first to admit that university is bullshit and mostly acts as a social signifier. Like I mentioned above, I'm credentialed, and I'll put that on my business cards but between you and me and the internet at large I mostly work as a software engineer.

There are even many problems with talking about "computer science" because for the most part it's treated like a branch of mathematics and its academic circle really hates dabbling in messy empirical data.

Yet, there's a degree of rigour in it. There's an actual underpinning behind a lot of this stuff.

So! Can you be a self-taught computer scientist? Certainly! K&R and The Little Schemer just have almost nothing to do with it - even if they might make you a better programmer :).

> There are even many problems with talking about "computer science" because for the most part it's treated like a branch of mathematics and its academic circle really hates dabbling in messy empirical data.

This gets repeated again and again on HN, but it's not true. I am a CS researcher, and I always have messy empirical data. Systems research almost always has tons of experimental results. A large chunk of our papers are dedicated to the experimental design and results.

I have made this point many times before:




One of these days I'm going to actually set up a blog, write this point up as an essay, and be able to point to it.

My own personal definition of computer science: everything concerned with computation, both in the abstract and in the implementation.

From what I can tell (based on HN and other forums), systems research is empirical testing of systems, while theoretical computer science is math heavy. The typical systems paper is "The design, implementation, and performance of a [application] system, using [technology]", while a theory paper is "Proof of the existence of a solution for [problem] in time [O(something)]".

These two branches do talk to each other, but not much.

In general you got it, but CS theorists do more than just algorithmic analysis. They also may be after some other desired, provably guaranteed effect with an algorithm, and they'll say in passing "By the way, this is a linear algorithm, so its performance isn't an issue." This is true in scheduling, cryptography and probably many other areas. (It's early, sorry.)

You got systems research pretty much head-on, but I would stress the design and implementation part more. That is, CS systems research tends towards engineering, where we claim we made a better whatzit, and then we need to provide a suite of experiments to support our claim.

Please do! And link to the rest of your research :)!

I'm just talking from my own (very) limited personal experience, though, and the Greg Wilson video I linked to (albeit it's been two years since I last watched it).

You can find everything that I have currently published in a peer-reviewed place on my (old) webpage in my profile. I've done lots since graduating (lots), but we've hit a bunch of paper rejections (aarrrrggg), so they're not published yet. I was, however, involved in a paper that we've submitted to a journal, so it has not gone through peer-review yet, but we made a tech report so that others could cite it: http://researcher.ibm.com/files/us-hirzel/tr11-rc25215-opt-c...

That looks interesting! I saved it for later.

As a self-taught C software developer, I came here to say the same thing Phillip said.

To that, I would also add that C isn't the "Latin" of computer science (if any language is going to prove to be that, it's Lisp, but it's too early to say, other than to say that it isn't going to be C).

I'd also suggest that the best book to follow up K&R is Hanson's _C Interfaces and Implementations_.

If I wanted to teach someone to be a computer scientist, I'd look for a book that would help them read papers. I'd also point them towards compiler theory, not so much because it's fundamental computer science (it's a vital applied discipline), but because it exposes you to more real computer science than most other domains.

Lisp is the "Greek". Older, more free-wheeling. Intellectually more influential but swamped in "the real world" by the Latin speakers.

And full of lambdas.

Do you have a book recommendation to help someone read papers? I spent a couple of evenings going over MIT course notes on discrete mathematics to attempt to read a paper on Conflict-free Replicated Data Types <http://hal.archives-ouvertes.fr/docs/00/55/55/88/PDF/techrep...>

What I would have given at that time for a book that would help me translate the following:

    merge (X, Y ) : payload Z
       let ∀i ∈ [0, n − 1] : Z.P[i] = max(X.P[i], Y.P[i])

into something like this:

    import copy

    def merge(X, Y):
        Z = copy.deepcopy(X)  # stand-in for "... new object ..."
        # [0, n − 1] includes n − 1, so visit all n indices
        for i in range(len(X.P)):
            Z.P[i] = max(X.P[i], Y.P[i])
        return Z

I eventually figured it out, but it was a bit rough trying to work out what all the symbols meant.
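
For anyone else stuck on the same notation, here is a minimal, self-contained sketch of that merge in Python. Only the element-wise max comes from the quoted formula; the `Payload` class, the length check, and the sample values are hypothetical additions for illustration. The quantifier ∀i ∈ [0, n − 1] includes both endpoints, so all n entries of P take part.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Payload:
    """Hypothetical stand-in for the paper's payload: a vector P of n integers."""
    P: List[int]


def merge(X: Payload, Y: Payload) -> Payload:
    """Return a new payload Z with Z.P[i] = max(X.P[i], Y.P[i]) for every i in [0, n - 1]."""
    if len(X.P) != len(Y.P):
        # Assumption: both payloads describe the same n entries.
        raise ValueError("X and Y must have the same number of entries")
    # zip walks all n positions, which is what the quantifier over [0, n - 1] asks for.
    return Payload(P=[max(x, y) for x, y in zip(X.P, Y.P)])


# Sample values are made up for illustration.
a = Payload(P=[3, 0, 5])
b = Payload(P=[1, 4, 5])
merged = merge(a, b)
print(merged.P)  # [3, 4, 5]
```

Because max is commutative and idempotent, `merge(a, b)` equals `merge(b, a)` and `merge(a, a)` equals `a`, and neither input is modified.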

Well… it depends on the paper! Some stuff might need more than a single course of background to fully understand.

For more rudimentary papers, any undergrad course on discrete mathematics should get you started. I personally was forced to read http://www.amazon.com/Discrete-Mathematics-Applications-Susa... - and it's pretty decent.

Thanks for the recommendation. That was one of the textbooks that I looked at, but the sticker shock steered me towards free MIT course notes.

I don't know if it is true of that book in particular, but if I'm buying a textbook as a resource I usually look for the international editions. Near-identical versions of the books (with the problem sets shuffled around) are sold in other places for much, much less.

Buy it second hand from Amazon - they are much cheaper! ($17)

+1 for the book recommendation - I also have a copy of that.

There is a really nice book trying to teach exactly this (among other things) called "The Haskell Road To Logic, Maths and Programming":


available for free, according to


> books that I often recommend to junior developers that are smart but didn't complete a degree and often find their knowledge of basic computer science lacking.

I think you may be confusing "Computer Science" with "Software Engineering".

What is a "Credentialed Computer Scientist"? Do you mean a college degree in CS/EE? I'd imagine a whole slew of people here on HN are Credentialed in that sense.

I take it to mean, Ph.D; not from experience, but that's the only meaning that makes sense in the context.

My read: he just means it in the sense that I mean "professional" when I say I'm a professional application security person, except that it's weird to say you're a "professional computer scientist" when your day job doesn't involve writing papers.

That's pretty much it.

I got a nice piece of paper, but it's not like my day to day job involves much of what I learned during school.

One thing I've not seen yet is a list of books or online resources to take someone from beginner to expert in computer science. I've just seen lists like the submitted list or the lists in the Stack Exchange discussion that phillmv linked to, which are suggestions for books or articles or papers that everyone (in someone's opinion) should have read.

Anyone seen a CS equivalent of either of the following?

1. Gerald 't Hooft, the Nobel Prize winning physicist, maintains this list: http://www.staff.science.uu.nl/~hooft101/theorist.html

That list gives an ordered progression of links to online material meant to take a talented and ambitious person from high school to research level quantum field theory and string theory.

2. The "How to Become a Pure Mathematician (or Statistician)" list here: http://hbpms.blogspot.com/

That list starts assuming high school mathematics, and gives a series of stages to go from that to advanced graduate level, giving for each area of mathematics at each stage lists of books and online material appropriate for that stage.

I too would be interested in a CS equivalent of those resources.

Also, thanks for that first link! That could be useful to prepare myself for 6.002x.

I haven't read Mastering Algorithms in C (I have the Perl version), but if you want to mention one book on algorithms, that's got to be Skiena's Algorithm Design Manual (http://www.cs.sunysb.edu/~algorith/video-lectures/). I also like Jeff Erickson's algorithm lecture notes (http://www.cs.uiuc.edu/~jeffe/teaching/algorithms/notes/all-...), which come as an 800+ page PDF file.

I have to second Skiena's book. I had three books available when learning advanced algorithms: the famous Cormen tome, "Algorithm Design" by Kleinberg and Tardos, and Skiena's "Algorithm Design Manual". I'm still amazed at how Skiena's book both covers the basics in a far better pedagogical way than the other two and serves as a really nice lookup utility for tackling specific problems (the second half of the book is jam-packed with short problem descriptions, as well as an overview of approaches and considerations). I kept looking up the same topic in all three, and ended up reading the Algorithm Design Manual.

That said, Cormen's book is famous for a reason, but more often than not, I felt it favored mathematical rigor over plain language.

Note: The second edition of Skiena's book is significantly improved over the first... well, besides the quality of the paper.

edit: Meant second edition of Skiena's book, not Cormen's. Sorry.

Skiena's is my favorite computer book of the last 5 years; it's radically different (and I think better) than CLR.

TAOCP is, considered on the whole, an even better work, but I end up using it way differently; I've consulted Skiena to solve problems, but TAOCP I more or less flip through and then read 10-20 pages of at random; I am never sorry I did.

CLRS (the Cormen tome) is actually on its third edition now, with some updates/additional content.

Thanks for the link! Erickson's notes appear to be a treasure trove of insight into algorithms. I flipped to the section on dynamic programming and was impressed by how well the idea was boiled down. Also, sorry for the pedantry, but that link is to a 374-page PDF. For anyone who is truly insatiable, peel back the link to algorithms and you will find an everything.pdf that is indeed over 800 pages!

Itching to buy Skiena's book for the practical examples (after perusing what amazon would show me of it)! Another very accessible algorithms book I highly recommend is: http://hetland.org/writing/python-algorithms/

The big difference between Skiena's book and the others I've seen (Cormen, Knuth) is the inclusion of his own war stories and real-life experiences, and less emphasis on rigorous proof.

You mean "every self-taught PROGRAMMER should read".

Computer science is very different from day-to-day "real world" programming. Computer science is more akin to applied mathematics... it's the study of algorithms and computation.

What your books teach is programming, building software and such.

See my reply to phillmv elsewhere in the thread.

I have a set of honest questions to make:

I understand what it means to be a self-taught programmer, which I was from ages 13 to 17, before I got into college and, in some sense, still am today, as many of the things I learn and have learned for the past few years were "self-taught" (or, as I prefer to say, taught me by the authors of books, papers, and blog posts I read).

Some people like to call themselves self-taught hackers, software engineers, etc. And that is ok too.

But what does it mean to be a self-taught computer scientist? Are there some criteria, e.g. do you have to publish a peer-reviewed paper or something like that?

I would say a self-taught computer scientist is well-versed in theoretical, or 'purer' forms of computation. That is, I would expect knowledge of λ-calculus, what it means for a language to be regular or context-free, and so on. A software developer and a computer scientist are really very different things (but not mutually exclusive).

When I say "self-taught computer scientist", I have in mind someone, much like myself, who did not go through college to earn a comp-sci degree but instead learned on the job and solves problems scientifically rather than by brute force.

I agree with you that it is a bit of a self-applied label that could be offensive to those who finished college and earned a degree. Much like someone calling themselves a doctor without getting a PhD.

You provided a good definition: solves problems scientifically rather than by brute force. I believe the essence of science comes from being able to think (and report your thinking) in a structured way. Meaning it is about content, but it is also about form.

And just so we can get this out of the way, I'm not in any way offended by people applying labels to themselves in good faith; I'm only slightly offended by people who are offended by that.

It's a good question. I am self-taught, and would never call myself a "self-taught computer scientist" ... "self-taught hacker" is just about right. For me the distinction would be that it's never going to get too theoretical, it's all applied (though I make every effort to make my applications efficient, correct and legible)

It's a matter of semantics, but I completely agree with you. I would have liked it if there was a book on Design Patterns in the list as well.

The Design Patterns book is not on the list because I have not read it yet. Based on what I know of it, I am sure that it deserves to be on the list.

(iirc) all but one of the interviewees in Coders at Work said the GOF design patterns book is rubbish. The guy who liked it said it was helpful to establish common language.

Read Refactoring.


In my view it is the single most important programming book in existence. Yes, other books will teach you other, incredibly important things. Other books will teach you the specifics of one technology or another. Other books will open up your mind on the very nature of programming (especially learning functional programming). And other books will teach you all of the incredibly important stuff that isn't sitting at a keyboard and typing in code. But personally I'd rather work with someone who thoroughly groks refactoring even over someone who thoroughly groks functional programming.

I have to disagree with The Little Schemer, despite the fact that I took freshman CS with Matthias Felleisen (one of the two authors), thought his course was brilliant, and absolutely swore by Scheme by the time I was done.

I felt the book paled in comparison with his lectures. The book is, in some sense, too short; it doesn't have enough examples, and it also doesn't discuss essential language features such as let/local. Also, it has a shortage of long examples (pretty much everything there consists of a single function).

Also the book is ridiculously easy for the entire first half (approximately), and then suddenly dumps you into the deep end (they present the Y combinator).

What I got out of The Little Schemer wasn't really how to write Scheme but more how to solve problems recursively.

At least for me it excelled in that aspect. As a method for teaching Scheme, I am certain it leaves much to be desired.

> What I got out of The Little Schemer wasn't really how to write Scheme but more how to solve problems recursively.

They do explicitly claim in the book that that is their goal!

Programming Pearls is also an amazing gem you should read as early in your career as possible. If you end up heading into software, also check out The Mythical Man-Month. Although dated, the lessons are timeless.

Programming Pearls was the book I came here to recommend. (Assuming that the original poster really meant "programmer" rather than "computer scientist".)

Unlike a lot of folks here I mostly agree with the list - I'm a self-taught programmer as well. But don't stop with The Little Schemer - grab The Seasoned Schemer as well, and if you can make it through The Reasoned Schemer you'll learn quite a few things that many programmers know nothing about.

I still think that Rich Hickey's list, http://www.amazon.com/Clojure-Bookshelf/lm/R3LG3ZBZS4GCTH is the best collection of self-taught programmer must reads I've encountered.

Mastering Algorithms in C will go down in history as one of the most horribly formatted books ever. The code samples in this book are positively unreadable due to comments that go on for pages. To be fair, the content is good. But the formatting is so annoying that it's hard to sink into that content.

I read it on my Kindle so I assumed the bad formatting was caused by reformatting to fit the Kindle.

Okay, now I am confused. This article basically directly explained my situation. I learned by doing, not by studying or taking classes. I have a HUGE hole in my knowledge regarding algorithms, uncommon features of C, and things classified as true "Computer Science".

Anyway, I was thinking about reading these books. That is, until I saw phillmv's comment. There has been a lot of debate and suggested lists of books to read on this comment thread, but no concrete suggestions. I went from 3 suggested books to a few hundred. I know there are no magic books that make you a master of Computer Science, but does anyone have a concrete quick list of what is necessary for filling in the blanks in a self-taught programmer's education?

If the focus is more on software engineering than computer science, I'd recommend "Code Complete" and "Patterns of Enterprise Application Architecture".

"Joel recommends SICP, The C Programming Language, The Unix Programming Environment, and Introduction to Algorithms as solid books for programmers who want to brush up on their fundamentals and potentially do well at programming interviews." - http://itc.conversationsnetwork.org/shows/detail4144.html

I would suggest the following book as a precursor to all of these books: Code by Charles Petzold.

It is genuinely one of the best books I have ever read.

Those are Amazon affiliate links in the article. Getting this on the front of HN will get you a few clicks.

That seems like fair compensation for assembling a list of books that I may be interested in reading. Do you actually find this objectionable?

The only time there's really a valid reason to complain about affiliate links is if you're rewriting links posted by others; and even then IMHO disclosure is enough to clear you. Otherwise, affiliate links are serving their purpose properly (encouraging bloggers to post links to Amazon; and Amazon compensating the bloggers for the referral).

It's just honest to state that they are affiliate links, regardless of whether he deserves the revenue or not.

looks like he jumps at any opportunity to get his amazon affiliate link out there. even posting it with a pretty much useless comment on this g+ post:


(this link is from a story that just recently made the frontpage on hacker news)

Eric, how's the affiliate marketing going?

I would find a follow-up post discussing the numbers re: Amazon Affiliate click-through and purchase rates from a story hitting the front page of HN more interesting than yet another "here is a list of books that you should read" post, or is that a little too meta?

I have The Little Schemer, but have had trouble figuring out the best way to use the book. Do you just read it from cover to cover? Or do you cover up the right hand of the page and try to come up with an answer (maybe even typing it into the repl) before looking?

I read it cover to cover. For most of the chapters I read the left side, answer the question in my head and then looked to the right to see if I was right.

Having just learned Clojure using "The Joy of Clojure", if there was anything I needed to verify, I translated the code into clj and pasted it into a Clojure REPL.

I would add "Numerical Recipes in C" to the list. I have owned this book for many years and get a lot of use out of it.


Add to the list Structure and Interpretation of Computer Programs, which fits perfectly following The Little Schemer, and you're all set.

As a self-taught programmer, were I to read all of the books "every programmer should read", I would never have time to actually program.

The author definitely should have included Algorithms in C++ [Sedgewick] and Introduction to Algorithms [CLRS].

I guess you really meant programmer, but I think this is a great list.

What if somebody starts with some other language, such as Java or Python? Is mastering C a must to be a good programmer?

No, I don't think so. Those books were chosen not because they're good at teaching you C but rather because they're good at introducing you to concepts like memory layout, data structures and algorithms.

That was very far from what I expected :)

I am sorry. What did you expect?

The Little Schemer made me understand programming in a functional style much better than I did before reading it. It also solidified my opinion that I really would not like working in that style at all.

A scientist researches. Was that intentional?
