Personally, I find that the research literature is usually (intentionally or unintentionally) opaque and that most ideas worth learning are at least 5 years old, so textbooks are the best place to actually learn things from a source intended to teach you the material.
The list I could actually vouch for would be:
Russell and Norvig
Introduction to IR (Manning et al)
Database Systems: The Complete Book
Convex Optimization (Boyd)
Foundations of Statistical NLP
For distributed systems I'd read Lynch's book on Distributed Algorithms.
For stats, David Freedman's book, Statistics, is a good simple introduction. Someone mentioned Calculus and Statistics (http://www.amazon.com/Calculus-Statistics-Dover-Books-Mathem...) here on HN a couple of weeks ago. I had looked at it when it came out and re-reviewed it after that thread -- I like it a fair bit more than I remembered, and it's a better text for those looking for a more rigorous treatment than Freedman -- although probably still too simple if you're reading Foundations of Statistical NLP.
There's also Scott's book on Programming Languages, which is worth reading.
There's this list: http://www.reddit.com/r/compsci/comments/gprp0/is_there_a_li...
I would add Lisp In Small Pieces to the compilers section of that list.
Regarding the list: I think it's a nice effort, but it lacks several excellent and remarkable books. Just quickly browsing, I found the following essential books missing:
* In algorithms:
- Aho, Hopcroft, Ullman: The Design and Analysis of Computer Algorithms.
- Wirth: Algorithms and Data Structures.
* In Compilers:
- Wirth: Compiler Construction.
- Grune, Bal, Jacobs, Langendoen: Modern Compiler Design.
- Muchnick: Advanced Compiler Design and Implementation.
* In Lambda calculus:
- Barendregt: Introduction to Lambda Calculus.
* In theoretical computer science:
- Kozen: Automata and Computability.
- Davis: Computability and Unsolvability.
* In concepts of PLs:
- Turbak, Gifford, Sheldon: Design Concepts in Programming Languages.
Regarding compiler construction: as has been mentioned several times on HN, Cooper and Torczon's "Engineering a Compiler" is a more recent and (IMHO) much more readable and accessible choice.
I am, however, a big fan of Matt Might's lectures: http://matt.might.net/teaching/compilers/spring-2011/ -- very beautiful slides, very nice approach overall.
The Dragon Book is mostly old and hoary by today's standards: we frequently use GLR instead of LALR(1), it has no discussion of interesting modern developments like packrat parsers, there is next to no discussion of any part of the compiler that isn't parsing, etc. A product of its time, but not an edifice for the ages in the way that SICP is.
If you find a good stats/probability book, let me know; I've been looking for one for years and come up short. Russell and Norvig is ironically superior to most texts in the field.
The 0th edition (green dragon, Aho & Ullman) was mostly about parsing, but talked about everything.
The 1st edition (red dragon, Aho, Sethi & Ullman) was about everything except JITs, vectorization, GC, and parallelization.
The 2nd edition (purple dragon, Aho, Lam, Sethi & Ullman) is about everything, and they even dropped operator precedence (Pratt) parsing to make room for more non-parsing stuff.
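(Aside: Pratt parsing is small enough to show whole, which makes its removal a bit of a shame. A toy Python sketch -- the binding powers and integer-only grammar are mine, not from any edition of the book:)

```python
# Minimal Pratt (operator-precedence) parser for integer arithmetic.
# Each operator gets a binding power; higher binds tighter. The loop
# keeps folding operators as long as they bind more tightly than the
# minimum power the caller passed in, which yields left associativity.
import re

BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}

def tokenize(src):
    return re.findall(r"\d+|[+\-*/()]", src)

def parse(tokens, min_bp=0):
    tok = tokens.pop(0)
    if tok == "(":
        left = parse(tokens, 0)
        tokens.pop(0)  # consume ")"
    else:
        left = int(tok)
    while tokens and tokens[0] in BINDING_POWER and BINDING_POWER[tokens[0]] > min_bp:
        op = tokens.pop(0)
        right = parse(tokens, BINDING_POWER[op])
        left = {"+": lambda a, b: a + b,
                "-": lambda a, b: a - b,
                "*": lambda a, b: a * b,
                "/": lambda a, b: a // b}[op](left, right)
    return left

print(parse(tokenize("2+3*4")))    # 14
print(parse(tokenize("(2+3)*4")))  # 20
```

This evaluates as it parses; a real compiler would build AST nodes instead of folding to an integer, but the control flow is identical.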
Are you sure? There are 200 pages that cover lexing and parsing -- of a 950-page book.
If anything, I'd argue that they've largely punted on parsing, given that it's a lot less important than it used to be (which may explain why they didn't add GLR or packrat parsing), and focused instead on what were considered more interesting issues in compilers.
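(For anyone who hasn't seen packrat parsing: it's just recursive-descent PEG parsing plus memoization, so each rule runs at most once per input position and parsing is linear-time despite backtracking. A toy Python sketch over a two-rule grammar of my own devising:)

```python
# Packrat parsing = recursive-descent PEG parsing with memoization:
# each (rule, position) pair is computed at most once, so parsing is
# linear-time even with unlimited backtracking. Illustrative grammar:
#   expr <- term '+' expr / term
#   term <- [0-9]+
from functools import lru_cache

SRC = "1+2+3"

@lru_cache(maxsize=None)
def term(pos):
    end = pos
    while end < len(SRC) and SRC[end].isdigit():
        end += 1
    return (int(SRC[pos:end]), end) if end > pos else None

@lru_cache(maxsize=None)
def expr(pos):
    left = term(pos)
    if left and left[1] < len(SRC) and SRC[left[1]] == "+":
        rest = expr(left[1] + 1)  # memoized: revisited positions are free
        if rest:
            return (left[0] + rest[0], rest[1])
    return left  # PEG ordered choice: fall back to the bare term

print(expr(0))  # (6, 5): value 6, consumed all 5 characters
```

`lru_cache` plays the role of the packrat memo table; a real implementation would key on (rule, position) explicitly and parameterize over the input.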
First, The Elements of Statistical Learning.
Second, while focused on computer vision, Computer Vision: Models, Learning, and Inference has a great intro to probability and learning.
Having some background in statistics, but none in either linguistics or NLP, I found that book a revelation. If you read it and implemented all the exercises, you'd find a way to make millions, as NLP is a big deal right now. I did find it concentrates a little too much on the low-level stuff (character parsing, bag of words, etc.), but in conjunction with The Elements of Statistical Learning it's wonderful.
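(To be fair, the bag-of-words representation really is low-level -- the whole idea fits in a few lines. A toy Python sketch, with example documents of my own, comparing two documents by cosine similarity of their word counts:)

```python
# Bag-of-words: represent a document as unordered word counts, then
# compare documents by the cosine of the angle between count vectors.
from collections import Counter
import math

def bag(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)  # Counter returns 0 for absent words
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

d1 = bag("the cat sat on the mat")
d2 = bag("the cat ate the rat")
print(round(cosine(d1, d2), 3))  # 0.668
```

All word order is thrown away, which is exactly the complaint about the low-level chapters: the interesting NLP starts once you move past representations like this.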
I'm surprised nobody's mentioned Knuth, though maybe that goes without saying?
As for probability and statistics, I haven't really found anything (at an advanced level) that I've been happy with. Maybe it's because my background is in statistics, so it's a perception bias (I see the flaws more easily than in other subjects), but I think that most statistics textbooks are pretty rotten.
There are really only two that I'd recommend, and only one at a high level: Gelman & Hill is a great introduction to computational statistics, while still very readable (and enjoyable!).
Other than that, the only truly stellar statistics textbook I've ever seen was the one I used in my intro class in high school. That's sad, but it says something true about the current state of most statistics textbooks (that I can find).
Everyone's waiting for the complete book.
I also like Papadimitriou's textbooks: Elements of the Theory of Computation and Computational Complexity.
> When a free register is wanted, and there is none left on the free-storage list, a reclamation (7) cycle starts.
> (7) We already called this process "garbage collection", but I guess I chickened out of using it in the paper -- or else the Research Laboratory of Electronics grammar ladies wouldn't let us.
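(The "reclamation cycle" in that footnote is the original mark-and-sweep garbage collection over a free-storage list. A toy Python sketch of the idea -- the cell layout, heap size, and root-set handling are mine, not McCarthy's:)

```python
# Toy mark-and-sweep in the spirit of the quoted passage: when a free
# cell is wanted and the free-storage list is empty, mark everything
# reachable from the roots, then sweep unmarked cells back onto the
# free list.

class Cell:
    def __init__(self):
        self.car = self.cdr = None
        self.marked = False

HEAP = [Cell() for _ in range(8)]
free_list = list(HEAP)
roots = []  # the mutator's root set (illustrative)

def mark(cell):
    if isinstance(cell, Cell) and not cell.marked:
        cell.marked = True
        mark(cell.car)
        mark(cell.cdr)

def reclaim():
    # Only ever called with an empty free list, so no cell is double-freed.
    for r in roots:
        mark(r)
    for c in HEAP:
        if not c.marked:
            c.car = c.cdr = None
            free_list.append(c)
        c.marked = False

def alloc(car, cdr):
    if not free_list:
        reclaim()  # the "reclamation cycle" from the quote
        if not free_list:
            raise MemoryError("heap exhausted")
    cell = free_list.pop()
    cell.car, cell.cdr = car, cdr
    return cell

# Exhaust the free list, drop all roots, then allocate again:
# the reclamation cycle recovers every cell.
roots.append(alloc(1, alloc(2, None)))
for _ in range(6):
    roots.append(alloc(0, None))  # free list now empty
roots.clear()                     # everything becomes garbage
c = alloc(3, None)                # triggers reclaim(), succeeds
print(c.car)  # 3
```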
The Lambda Papers (Steele, Sussman).
And by the way, an interview with John Backus: http://archive.computerhistory.org/resources/access/text/Ora... -- discussing, among other things, the pros and cons of FP and his struggles with fitting I/O into it; not mandatory, but highly refreshing.
I'd also count a serious refactoring or structural improvement as having programmed something again.
Suppose we are to explain omaranto's statement to someone who is not a programmer. My attempt: given that (i) programs, whether in source or binary form, can be replicated indefinitely at virtually zero cost, (ii) we know how to write modular programs, and (iii) there are plenty of useful modules available for re-use, we never have to substantially rewrite anything from scratch.
But (iii) is arguably false due to a variety of reasons: intellectual property issues, incompatible platforms, "incompatible" programming languages, etc.
And (ii) is a big lie.