I see a lot of comments asking how one should read these things, but the answer is that one shouldn't. These documents are more like diaries than textbooks. They'll show you the scaffolding of somebody else's technical mind, not how to build your own. The best textbooks are always focused, and honed by teaching experience.
Then again, maybe I just haven't experienced better material on that topic. Do you have any suggestions?
The problem is when people try to use the whole book for some goal -- whatever that goal is, it's never an efficient route.
Unfortunately, math textbooks that are both immediately useful and instantly accessible to computer scientists are few and far between. Why doesn't anyone strive to cover that gap? My major complaint about most math textbooks is their tendency to pile one definition after another without showing me how each definition is useful in practice. There is a limit to how many abstract mathematical structures and theorems I can absorb before my brain simply wipes them out, having deemed them useless.
For that reason, the best math textbooks for me have been grounded in practicality (and are not really math textbooks). I learned calculus from Spivak, abstract algebra from Antoine Joux and analysis from Sussman.
As I was working through the books I would capture every numbered definition or theorem in a table, along with all of the explicit cross-references or implicit dependencies. Then I wrote a little D3 app to visualise the graph, with a few customisable views. E.g., what is the dependency structure of the chapters? What is the full list of dependencies for a given theorem?
The result is actually pretty interesting, and the different levels of rigor that different authors choose really comes through. I would love to have a few weeks to sink into this project properly and make a public tool out of it.
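The second query described above (the full list of dependencies for a given theorem) amounts to computing ancestors in the DAG. A minimal sketch in plain Python; the theorem/definition labels here are invented placeholders, not references into any actual textbook:

```python
# Dependency graph: node -> list of direct prerequisites.
# The labels below are invented for illustration only.
deps = {
    "thm:2.41": ["def:2.31", "thm:2.36"],
    "thm:2.36": ["def:2.31"],
    "def:2.31": ["def:2.18"],
    "def:2.18": [],
}

def all_dependencies(node, deps):
    """Return the full (transitive) set of prerequisites for a node."""
    seen, stack = set(), [node]
    while stack:
        for d in deps.get(stack.pop(), []):
            if d not in seen:
                seen.add(d)
                stack.append(d)
    return seen

print(sorted(all_dependencies("thm:2.41", deps)))
```

Everything a theorem ultimately rests on falls out of one depth-first walk; the chapter-level view is the same graph with nodes merged by chapter.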
For that nearly 2000-page PDF, I actually know enough about the contents to suspect that, again, a dependency-structure graph would, surprisingly, not look very interesting, especially to anyone with a background in Baby Rudin (Principles of Mathematical Analysis) or Big Rudin (Real and Complex Analysis, or maybe Functional Analysis).
From just going through the table of contents, there are a lot of interesting and uncommon topics in those ~2000 pages, but the book looks more like a small encyclopedia or reference than an integrated whole.
And for something that long, and in places covering some rare topics, there is, surprisingly, especially given the goal of ML (machine learning), comparatively little on probability, stochastic processes, and mathematical statistics.
But in those many pages there are many topics that are possibly valuable and not easy to find elsewhere, so keep a copy of the PDF and use it as a reference when appropriate.
Here is a short remark for readers without much background: Don't be afraid to dig in nearly anywhere in the ~2000 pages; maybe you will have to backtrack a little, but in general you can start nearly anywhere and don't have to read sequentially through chapters 1, 2, ....
Warning: Some topics are covered relatively superficially while others have some really rare details.
Broad, Curious Point: The ~2000 pages may be this and that, but they are relatively close to the applied mathematics of operations research and numerical analysis -- closer to those fields than to what is usually regarded as computer science, unless computer science wants to absorb the math of operations research and numerical analysis. For more, as I recall, some of the numerical analysis was mostly of interest for numerical solutions of partial differential equations. Is that field also to be subsumed by computer science?
It absolutely depends on the text, of course. On one end of the spectrum you have something like Euclid's Elements, where a DAG representation seems very well suited to the spirit of the text. Of the books that I explored, Baby Rudin came the closest to this ideal. While many of the individual chapters contain long runs of mostly linear development, the interdependence of the chapters is complex, and apparently very carefully considered.
You might be right about the linked textbook. I suspect that there would be a core linear algebra subgraph that most of the rest of the book depends on, but otherwise little else in the way of complex structure.
One thing that occurred to me was that the reductio ad absurdum of this ends up being something like mathematical logic. As you try to capture the dependency structure with more and more precision, you find yourself trying to build a proof assistant.
Still, I think that there's something to be said for studying the structure of great textbooks purely as cultural artifacts.
Naw! Take another pass through Baby Rudin!
Here is some help: With appropriate mild assumptions, a continuous, real-valued function on a compact set is uniformly continuous. For the finite-dimensional spaces of interest, compact is the same as closed and bounded. That is the main content of the first chapters.
First Application: With uniform continuity, one gets to show that the Riemann and Riemann-Stieltjes sums converge, so that those integrals do exist.
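A sketch of the standard argument being referenced, assuming $f$ is continuous on $[a,b]$: uniform continuity bounds the gap between upper and lower sums by the mesh of the partition.

```latex
% Uniform continuity gives, for \varepsilon > 0, a \delta > 0 with
%   |x - y| < \delta \implies |f(x) - f(y)| < \varepsilon / (b - a).
% Then for any partition P = {x_0, ..., x_n} with mesh < \delta:
\[
  U(P,f) - L(P,f)
  = \sum_{i=1}^{n} (M_i - m_i)\,\Delta x_i
  < \frac{\varepsilon}{b-a} \sum_{i=1}^{n} \Delta x_i
  = \varepsilon ,
\]
% so the upper and lower integrals agree and the Riemann integral exists.
% For Riemann-Stieltjes with increasing \alpha, replace \Delta x_i by
% \Delta\alpha_i and b - a by \alpha(b) - \alpha(a).
```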
Second Application: The uniform limit of a sequence of continuous functions is continuous. This was a question on my Analysis Ph.D. qualifying exam -- I got it! Thus a certain normed space is complete, that is, is a Banach space -- I'm not sure this remark is in the book.
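The usual textbook proof is the $\varepsilon/3$ argument, sketched here (the standard argument, not necessarily the exam's wording):

```latex
% Suppose f_n -> f uniformly with each f_n continuous; fix x_0 and
% \varepsilon > 0. Pick n with \sup_x |f_n(x) - f(x)| < \varepsilon/3,
% then \delta > 0 from continuity of f_n at x_0. For |x - x_0| < \delta:
\[
  |f(x) - f(x_0)|
  \le |f(x) - f_n(x)| + |f_n(x) - f_n(x_0)| + |f_n(x_0) - f(x_0)|
  < \tfrac{\varepsilon}{3} + \tfrac{\varepsilon}{3} + \tfrac{\varepsilon}{3}
  = \varepsilon .
\]
% The same estimate shows C(K) with the sup norm is complete, i.e., a
% Banach space: a uniformly Cauchy sequence of continuous functions has a
% continuous limit.
```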
The chapter(s) on sequences and series are used to define ln, sin, and cosine so that one can do Fourier series.
The exterior algebra is essentially separate.
That's the main stuff.
Don't take the Riemann integral very seriously: The good stuff is Lebesgue's integral (it partitions the range instead of the domain), as in the first half of Rudin's Real and Complex Analysis. There you also get to see Banach space, Hilbert space, von Neumann's novel proof of the Radon-Nikodym theorem (the grown-up version of the fundamental theorem of calculus, and with astounding power), and the Fourier integral. I won't take off points if you don't read the second half.
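A one-line picture of "partitions the range": for measurable $f \ge 0$, the Lebesgue integral is approximated by summing over slices of the range rather than of the domain.

```latex
\[
  \int f \, d\mu \;\approx\; \sum_{i} y_i \,
  \mu\bigl(\{\, x : y_i \le f(x) < y_{i+1} \,\}\bigr),
\]
% where y_0 < y_1 < \cdots chops up the range of f. However wild the
% level sets are, only their measure matters, which is why the Lebesgue
% theory handles limits so much better than Riemann's.
```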
Rudin's Functional Analysis -- get to see distributions.
It's not at all clear to me what part of my comment you are correcting.
> Of the books that I explored, Baby Rudin came the closest to this ideal. While many of the individual chapters contains long runs of mostly linear development, the interdependence of the chapters is complex, and apparently very carefully considered.
and I removed what was "complex".
Many students hear bad things about Baby Rudin and don't try it. Of students who do try it, too many don't finish well. Of the students who do finish, only a small fraction take a second pass where they will get some of the clarity I typed in for you.
For my Ph.D., there were five qualifying exams. I did the best on four of them, and one of those was Analysis and fairly close to Baby Rudin. The department didn't offer a course to help students prepare for the Analysis exam. So, I was likely the only student who had taken two passes through Baby Rudin; the first pass was in a course; I did okay; the second pass was on my own and slowly; it was fun! The second pass is the main reason I did the best on that exam. The next year the department tried teaching a course from Baby Rudin: They did that for only one year; I can believe that few or none of the students did well with the course. A friend from another department wanted to learn Baby Rudin, took the course, had a hard time, and dropped it. He went away unhappy, and that was sad and not really necessary.
So, my experience with Baby Rudin suggests (i) take a formal course, (ii) use advice such as I gave here to see during the course some clarity in what is going on, how, and why; (iii) also study or glance at one or two other competitive or similar books, e.g., Spivak, Calculus on Manifolds, Fleming, Functions of Several Variables, Kolmogorov and Fomin, Apostol, and more, and (iv) take a second pass through Baby Rudin.
Net, Baby Rudin has discouraged a lot of students, too many. To many students, Baby Rudin looks forbiddingly severe and abstract; students can't figure out what the heck is going on, how, or why. E.g., your view was "complex" -- it can be "simple". With some help, e.g., as I typed in here for you, Baby Rudin can be okay.
I'm saying that after encoding the graph structure I found that Baby Rudin contained more complexity (i.e., connectedness) than the other texts that I repeated the exercise for.
I know these areas well, and I would be hesitant to agree with this statement.
I stand behind my statement.
These are all fundamental OR topics but the book makes different choices on topics to cover. (to be fair, the book does not claim to target OR, so I'm not disagreeing with the book's premise, only yours).
I do think though that the book could benefit from feedback from specialists in each individual area. In some areas, I find the selection of topics to be a little bit "unconventional".
I thought his linear programming coverage was superficial. In particular, I didn't notice any coverage of linear programming for the problem of least cost flows on a network where each arc has a maximum capacity. The nice anti-cycling rule for the version of the simplex algorithm, with strongly feasible bases, is from W. Cunningham, long Chair at the Waterloo Department of Combinatorics and Optimization. It turns out that that algorithm makes a nice dent in the challenges of linear integer programming.
Yes, I saw not much on constraint qualifications. I'm the guy who showed that the Zangwill and Kuhn-Tucker constraint qualifications are independent for problems with only functional constraints -- my work also solved a problem stated but not solved in the famous Arrow, Hurwicz, Uzawa paper on constraint qualifications in mathematical economics.
Any really good book on linear programming is "close" to OR but can be attacked with your objection that it omits lots of OR.
so you're 2 removed from someone whose authority on the matter we would trust?
>I've published peer-reviewed original research, taught graduate courses
lol so does every phd student. not a high bar we're talking about here.
Please review the site guidelines: https://news.ycombinator.com/newsguidelines.html. Note that the first rule under Comments is "Be kind" and the second is "Don't be snarky".
I collected the data in a spreadsheet for each text. The columns were "Id, Type, Name, Chapter, Description, Dependencies". Id is the margin reference from the textbook, Type is one of theorem, definition, corollary, etc., Name is for named theorems or definitions, Dependencies is a comma separated list of Ids.
I then had a small Python script that built up a NetworkX representation. Initially I was using Cytoscape to explore the graphs, but then started work on a web interface using dagre-d3 for automated graph layout. This example is very similar to the types of layouts it produced.
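The ingestion step under those column conventions might look like the following. This is a sketch with plain dicts standing in for the NetworkX graph, and the sample rows are invented, not from any real text:

```python
import csv
import io

# Invented sample rows in the "Id, Type, Name, Chapter, Description,
# Dependencies" layout; Dependencies is a comma-separated list of Ids.
raw = """Id,Type,Name,Chapter,Description,Dependencies
3.2,definition,Limit,3,Limit of a sequence,
3.6,theorem,,3,Limits are unique,3.2
4.1,definition,Continuity,4,Continuous at a point,3.2
4.8,theorem,IVT,4,Intermediate value theorem,"4.1, 3.6"
"""

graph = {}    # Id -> list of direct prerequisite Ids
chapter = {}  # Id -> chapter (kept as a string, as in the spreadsheet)
for row in csv.DictReader(io.StringIO(raw)):
    field = row["Dependencies"]
    graph[row["Id"]] = [d.strip() for d in field.split(",")] if field else []
    chapter[row["Id"]] = row["Chapter"]
```

From `graph` the NetworkX step is just building a `DiGraph` from the list of (prerequisite, item) edge pairs.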
I found that a large textbook quickly grows into a big hairy ball, and that it is only useful if you are able to collapse and/or hide nodes intelligently. I.e., show the chapter dependency structure, or look at the sub-DAG involving a selection of nodes. I only got a little way into doing this before I got moved on to other distractions. Shazam for bird song, or something like that...
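The collapse-to-chapters view is just projecting each node-level edge through a node-to-chapter map and dropping within-chapter edges. A minimal sketch, with invented data:

```python
# Invented node-level data: Id -> direct prerequisites, and Id -> chapter.
deps = {"3.2": [], "3.6": ["3.2"], "4.1": ["3.2"], "4.8": ["4.1", "3.6"]}
chapter = {"3.2": 3, "3.6": 3, "4.1": 4, "4.8": 4}

# Collapse to the chapter level: map each endpoint of every edge to its
# chapter and keep only the edges that cross chapter boundaries.
chapter_edges = set()
for node, prereqs in deps.items():
    for p in prereqs:
        if chapter[p] != chapter[node]:
            chapter_edges.add((chapter[p], chapter[node]))

print(sorted(chapter_edges))
```

The sub-DAG view for a selection of nodes is the same idea in reverse: keep only the selected nodes plus their transitive prerequisites, and restrict the edges accordingly.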
Please feel free to run with it yourself!
* This is dense
* This is hard to understand if you don't have an education in mathematics
* This is not really meant to be an "introduction to X" book unless you have the background knowledge
* Gallier (who imo is an awesome professor) really values rigor and not hiding anything behind hand-wavy statements. He is an incredibly serious mathematician + computer scientist (https://en.wikipedia.org/wiki/Jean_Gallier). Accordingly, this book (work in progress) tries to be incredibly detailed on the topics it chooses to cover (and may miss out on some areas of CS + ML solely bc it chooses detail over comprehensiveness).
There seems to be a lack of mathematical unification in the work, except perhaps that these were a collection of topics that were of interest to the author(s).
I want to say this could be a good handbook, but handbooks (e.g. engineering handbooks) are usually collections of pieces written by specialists in each subfield. I suspect that by attempting to cover such a large chunk of the mathematical landscape, some things may have been lost in the treatment (except in topics that the authors are specialists in). Still, it is a good effort and I congratulate the authors on the ambitious attempt but would advise readers to consult texts written by experts in each subfield to get a sense of what topics are important and what are not.
Yes. Compared with this book, the one semester course "Foundations of Optimization" I took was better balanced and usually deeper.
> I want to say this could be a good handbook
I see it as best as a handbook. So, e.g., if you hear about some topic X, want to know more, and find a treatment in this book, then give the book a shot, but don't take the book too seriously as the balanced view of an expert.
> advise readers to consult texts written by experts in each subfield to get a sense of what topics are important and what are not.
Yes. And not just for "a sense" but for the actual content. E.g., I would go to some of the books in my professional library for the numerical analysis, linear programming, non-linear programming, regression analysis, Hilbert space, exterior algebra, abstract algebra, Lagrangian techniques, etc. And in the table of contents I didn't see anything on network flows which, considering what else is in that book, is a curious omission.
But, again, in summary, if you want to know about a topic X that is in the book, then likely give the book a shot, at least for a few minutes.
One more point: For students who want careers in academic research, one reason, likely the most important reason, to study such advanced and specialized topics is as an introduction to some promising streams of research. Well, I'd try to avoid using this book much for that goal and, instead, look for better material from better sources.
Right: Sure, some undergraduate math major who has been rushing ahead in math since, say, middle school could study all of this book, but most of the book is more advanced and/or specialized than what undergraduate math majors will encounter.
But, yes, it would be wise to regard an undergraduate math major as a nearly necessary prerequisite for the book.
In that case, 70+% of the academic computer science community will have a tough time having enough math background to make good use of this book.
Weird as it sounds, some of the happiest moments of my life come from deeply studying mathematics like this. When I finally cash out of the entrepreneur game, I'm looking forward to spending several decadent hours each day studying something like this again. Great fun.
The interesting thing is that, for a Math student (or a college professor) this is the FOUNDATION of math you need in order to study advanced stuff. For some of the recent math proofs (think famous ones) you'd need to study a whole year (after you know most of this stuff) just to learn the theory foundations for that proof.
I'm of course exaggerating a bit -- most Math majors would probably go through most of the stuff in this PDF, but not necessarily all of it (machine learning stuff for example, and possibly numerical analysis stuff). And if they do, they usually forget* most of it, except the areas they use in their daily research (assuming they went for an advanced degree). And when you use some of this stuff every day it becomes second nature after a certain point.
My experience is mostly based on the Romanian educational system which is typically more hardcore science (Math and Physics) even for Computer Engineering degrees.
*Sometimes you don't forget stuff, you just don't think about it until you see it again (concrete example for me: Schur decomposition) and then partial memories of it come back.
Before you begin, have a single source of truth for your daily goals, whether it's a calendar, todo list, etc. I like a paper agenda notebook; my favorite layout is the emergent task planner from David Seah and the roterunner on amazon. A quick note on process: you want your schedule to be flexible and fault tolerant but specific, actionable and accountable. I like to block times of day for an activity and have a general idea of my goals for that day. I keep a big weekly todo list for each activity and chip away at it each week. You want to give yourself the least possible amount of choices each hour of your day but be able to tolerate errors in estimating time for a goal.
Write down a purpose: what you want to get out of the book. Think back to the learning objectives on a syllabus in college; these should be specific.
Look at the outline, take inventory of the topics you want to cover and their chapters/sections.
Now schedule some block of time in your calendar, agenda, etc each week to study the book.
Now when studying a chapter/section: first survey for a summary, practice questions, exercises etc. to get an overview.
Start by answering the exercises, making predictions about a section, and generally pre-testing your knowledge of the topic. Based on some psychological studies in the book, the more you challenge your mind the more connections your brain makes, thus improving learning.
Next, read through the section in order to fill in the gaps identified in your pre-test phase; this purpose-based reading improves retention. Don't underline, copy sections into notes, highlight, etc. -- these aren't worth the time.
Now create a diagram, outline or other summary of the text for your notes without copying things from the book. Try to do it in your own words.
Finally, go back to the exercises you didn't complete, or revisit your pretest questions and answer them. This greatly increases retention. Quizzing yourself is the best way to learn.
In summary, the way you tackle 2k pages of material is the same way you do in school: weekly work towards the goal with practice, clear objectives, creating study guides and outlines, and testing yourself.
I'm really poor at math, so I've learned to work slowly and check if I really understand something before I go to the next chapter. Most maths is based on previous work / understanding. So, when I rush or try to skip pieces, I misunderstand the next part, making my rushing completely useless.
OTOH, this is a fairly extensive and in-depth reference, which means that if - in the course of your studies or research - you hit on a topic you need to understand better, the book is here to be consulted on that specific topic.
And you won't get the "I don't understand that part" feeling when you read it because the part that's giving you a headache is likely explained somewhere else in the book.
Then study the main trunk and larger branches.
For the deep roots, that is, the foundations, go light or you will get confused, discouraged, and generally stuck-o. E.g., do some axiomatic set theory if you really want; see some of the axiom of choice; but go light on the latest material on model theory and forcing.
Then with the trunk and larger branches, you will be able, as circumstances require, to get to nearly any of the leaves fairly quickly.
For that ~2000-page PDF, there, too, try to find the trunk and larger branches and mostly skip over the leaves. Maybe 80% of that book is leaves.
Finding the trunk and larger branches in that book could be challenging. So, instead, just stay close to the commonly recommended undergraduate math major texts and topics. For each such text, at least glance at the 2-3 leading competitive texts and, then, take what is in common as the trunk and larger branches. Also for each topic, pick the best looking treatment and then glance at the other treatments.
If you can, find a really good math prof in a really good math department at a really good university and try to chat with him about your progress for a cup of coffee (you pay) hopefully 4 times a year. If you can find more than one such prof, even better.
Some departments publish their old copies of their qualifying exams; might get some of those!
One more: Don't let yourself get stuck or discouraged. It's a fact of life that not all the materials are good. So there are some actual errors. Some exercises are just too darned hard, need material presented later in the book or not at all, or are just tangential curiosities not worth the trouble. Some of the writing can be obscure. Often there is not enough motivation to let you figure out what is important and what is not: If really obscure definition X or theorem Y seems not to be used in the rest of the book, then find some good reason to study X and Y or maybe delay them or just skip them. Sometimes obscure X, Y actually are important; sometimes not. There should be motivation, but commonly there is not.
If you can't after a good effort understand something or solve an exercise, don't assume that the difficulty is you or that you are leaving a dangerous hole in your knowledge. If later you decide you need that something, then return and give it another shot.
Generally for a really good, mature view of a book, just need more than one pass through it; so don't seek everything on just a first pass. Even given a really thorough first pass, you will notice significantly more on a second pass, e.g., some later material on the first pass will help you better understand the real points and importance of some earlier material on a second pass. And do glance at some alternative treatments.
One recommendation is to chew on a proof until you really see it thoroughly. A counter recommendation is just to notice that we are all short of time. Net, use some judgment.
TBBH, that's kind of disappointing.
I was hoping to see some sort of insight around deep learning stuff.