A 2020 Vision of Linear Algebra (ocw.mit.edu)
835 points by organicfigs on May 12, 2020 | hide | past | favorite | 200 comments

I had a very intelligent linear algebra professor in college, but he was, in my opinion, a very poor communicator. I paid attention to lectures and stared at the text, but couldn't really understand the material. For the first part of a linear algebra course, students who don't mind blindly following mechanical processes for solving problems can do very well.

Unfortunately I'm one of those people who tends to reject the process until I understand why it works.

If it wasn't for Strang's thoughtful and sometimes even entertaining lectures via OCW, I probably would have failed the course. Instead, as the material became considerably more abstract and actually required understanding, I had my strongest exam scores. I didn't even pay attention in class. I finished with an A. Although my first exam was a 70/100, below the class average, the fact that I got an A overall suggests how poorly the rest of the class must have done on the latter material, where I felt my strongest thanks to the videos.

So anyway, thank you Gilbert Strang.

Durable and flexible knowledge...

After reading your comment and ansible's reply [0] I wanted to pause and comment on this.

The United States Air Force Academy found that cadets who took their first calculus class with a professor who focused on conceptual understanding developed a durable and flexible understanding of the math [1].

The kicker is that the cadets got worse scores in Calculus I and gave professors who taught in this way worse ratings.

Ansible's anecdotal reply is what a lot of students experience. A feeling of initial success with the material, but they later find that their knowledge of it was fleeting and inflexible. What the Air Force Academy study found was that professors who taught in the manner ansible described, that resulted in fleeting and inflexible knowledge, were rated higher by their students. Those students got better initial scores in Calculus I, but went on to do worse in later calculus courses and related courses.

I encourage you to read the study. It is as good a study design and execution as you can get in the social sciences.

David Epstein also discusses the study in Chapter 4 of his book, Range [2].

[0] https://news.ycombinator.com/item?id=23154241 [1] http://faculty.econ.ucdavis.edu/faculty/scarrell/profqual2.p... [2] https://www.goodreads.com/book/show/41795733-range

When I taught Calculus, I taught understanding over memorizing steps to arrive at a solution for a particular type of problem.

The very best students loved it, but most people didn't like it at all.

With mathematics, like with gym, you gain when you put in effort. Most people don't enjoy either.

Yes indeed. Outside of work, I'm an endurance sports person, so basically performance is correlated strongly with training hard and suffering. There is a saying, "Pain is weakness leaving the body"; I first heard it in high school (the team went on to win a state championship in a highly competitive state). When I was suffering in workouts I just pictured myself getting stronger.

OK hopefully I didn't get too far afield. To me, the analogous concept in learning, particularly in technical fields, is that "learning is ignorance leaving the mind".

In college, particularly math and physics, I /always/ focused on understanding the underlying principles. Initially it was out of fear that if I forgot the formulas, I could re-derive them. But a strange thing happened... through that process, I developed an intuition and an ability to "see" what formulas and concepts to apply when. Once I got to that point in a problem, "seeing it for what it was", finishing to the solution became busywork.

You're between a rock and a hard place.

The rock is the incentives: how your performance is measured, and the short duration you will have teaching these students.

The hard place is students who have likely spent 13 years in K-12 learning without understanding and are now being asked to engage in practices they have little to no experience with.* They also have incentives to get good grades and a good GPA, which can be at odds with actual learning.

*To get more concrete, the practices have a name--Standards for Mathematical Practice (SMPs). The National Council of Teachers of Mathematics developed them and considers them the "heart and soul" of the Common Core Mathematics Standards. Not only are these practices absent from most classrooms, all too many teachers are not even aware of them! (see my Notch Generation reply to Sriram to understand why)


Did you happen to explain why you were teaching this way?

Very interesting. I don't understand why a teaching system cannot incorporate both a conceptual understanding as well as hands-on applied knowledge. Is it a matter of the time available?

Hi Sriram, a teaching system can incorporate both!

Apologies if my original reply made it seem like it can't.

Why don't teaching systems in America incorporate both the majority of the time?

Two major reasons:

1. Cultural inertia. Most teachers emulate the pedagogy that they experienced in their schooling. Some are aware that you can try to mix conceptual+procedural and try to. I call them the "notch generation": trying to teach in a way that is different from how they were taught. It's hard to do because...

2. The system is not designed to accommodate it. Incentives and higher order effects all conspire with cultural inertia to thwart it.

#2 bothered me so much in school. The system gauges success via tests that check short term learning. It really, really isn't good at measuring learning.

I always did very well on tests at school, but I wasn't really learning anything, or more precisely, I wasn't learning how to learn. I was learning how to pass tests, but that's a rather useless skill to have. I had to learn learning as an adult, and it was more difficult than it would have been as a child.

Hey man, that really sucks, and I'm sorry to hear it. I have a bunch of follow-up questions I'm curious about. I know HN isn't the best way to track replies. I've got heymijo.hn at gmail set up if you want to shoot me an e-mail.

I've worked in both K-12 and post-secondary education, studied the history of education reform in the United States, and visited schools/met teachers/students/etc that I've connected with across the U.S.

I'm always interested in hearing someone's story about school, how it did/didn't meet their needs, and how it has impacted them.

> I paid attention to lectures and stared at the text, but couldn't really understand the material. For the first part of a linear algebra course, students who don't mind blindly following mechanical processes for solving problems can do very well.

I had a similar, though sort of opposite experience.

In high school, I breezed through the material, and started teaching myself calculus during the summer to prepare for university. Other than being a lazy student, I had no problems taking the 2nd-semester advanced calc 2 and 3 courses my freshman year. I totally got what was being taught. There weren't a ton of practical examples, but I could easily see (for example) what the purpose of integration is, and how and why you'd do it in two or more dimensions. I could work the equations, no problem. Everything was great.

Along comes sophomore year, and still thinking I am hot stuff, I take advanced linear algebra and differential equations. More of the same, I thought.

Well... we seemed to spend the entire semester just solving different kinds of equations. No explanation was given as to what they were for, where they were used, or what the point of any of it was. I struggled, for the very first time.

I either got a D or F for the mid-term exam, which was shocking to me.

We had one chapter where we did something practical: you have a water tank with a hole in the bottom. Because the pressure lessens as the tank empties, the flow rate is not constant. However, you can solve this via diff equations, and I really grokked it. I finally saw the point of some of what we had been doing. But it was just that one chapter; we skipped any other practical aspects of what we were studying.

I did end up pulling out a 'C' in that class, to my relief. Sure, most of the blame for my lousy performance must rest with me, because of my poor study habits. And a little blame can go to the TA, who wasn't a good communicator, so that hour every week was kind of useless. But I also blame the material and how it was presented.

I think that whether or not students do well, there's a common theme in university math curricula for non-math majors. Basically, math gets taught as a kind of "toolbox" of techniques. Unless there's a strong follow-up in subject matter courses (for example in engineering coursework), those math skills effectively evaporate.

Some places use a rigorous "proof-theoretic" approach in math curricula. It's much harder and takes more time, but it's better than merely grinding on hundreds of easy calc-101/diff-eq problems, because students gain an understanding that doesn't erode as easily once they forget "the tricks".

More CS, engineering and science students, IMHO, should dabble in math department courses beyond the usual "required" sequence for their majors. It can be eye-opening and provide long-lasting benefit to take a hardcore real-analysis course, abstract algebra, or a number of other courses in math.

> More CS, engineering and science students, IMHO, should dabble in math department courses beyond the the usual "required" sequence for their majors

That was absolutely not allowed at my faculty (admittedly computational linguistics, but I would have massively benefited from math courses). No courses other than the predefined ones, no matter how relevant. Now I have to learn so much afterwards, it's not even funny.

> ...have to learn so much afterwards, it's not even funny.

It's true.

The sad thing is these problems start well before university when high schools pressure students into "advanced" math coursework without demonstrating mastery of previous topics. It builds a shaky foundation and sets the student up for a lot of needless difficulty later on.

Much better to slow down, focus on fundamentals early on and then build breadth in university coursework.

Oh man, Differential Equations. After doing well in Calc 1-3 I thought it would be no big deal. I paid attention in class and barely did the homework because it all seemed so straightforward but it was boring and I was not engaged.

I came in for the first exam, sat there for maybe 15 minutes reading the questions, and realized I had no idea how to solve any of them.

Luckily it was before the drop date! That was a turning point where I decided to only take classes that seemed fun. For me that was discrete math, number theory, abstract algebra, etc.

It's probably an oversimplification, but differential equations -- as a field of study -- tends to be much more a grab bag of tricks than many branches of mathematics.

I took linear algebra through a community college and had one of those rare, really awesome CC instructors. He had spent most of his career at Cray and later Raytheon and then semi-retired as a community college instructor. He took time to make really great interactive Jupyter notebooks. That, combined with 3Blue1Brown videos, really made linear algebra click for me.

My only regret is that I took the class as a six week short course. I think my recall would be better if I had taken the full semester. We covered all the material, but missed out on the longer spaced repetition. Linear Algebra was by far my favorite pure math course, I hope to revisit it soon. Maybe Strang's lectures are the way to do that.

There is a linear algebra series on Udemy called "Complete Linear Algebra: theory and implementation" by Mike Cohen that I really enjoyed doing, because he walks you through Matlab demonstrations (code included for Matlab and Python; I adapt it to Julia using the PyPlot wrapper).

I particularly like his videos because he breaks them down into small bites that are easy to work into your day and he's a great teacher.


Has he published these notebooks online anywhere?

At college I had a very poor algebra professor, and my first grade was 15/100. I didn't want to fail any classes, and since I only had them in the evening, I went on to take algebra with other two professors during the day. Those two were different, but I wouldn't say better than the first. But with determination, it finally clicked.

Second exam was 85/100, the highest between C.S. and Automation Engineering (both lectured by that first professor). While I do agree that a good teacher can pave the way for a good student, I think most of the work you have to do yourself, as if your life depended on it (mine did).

I had a very similar situation in my linear algebra course: in hindsight, I would literally have been better off teaching myself the material than listening to the professor. To this day it's still the main weak spot in my math/stats knowledge base. I'm really interested to check out these lectures.

Intermediate Stats, writing out chi-squares by hand on exams, literally ended my academic inclinations. I had been using software to do this for a while, and something about being forced to spend hours memorizing how to do it by hand just to "earn" a letter grade rubbed me the wrong way. I absolutely know much more about chi-squares than I'd ever need to, possibly an imprint of the bad experience.

I haven't watched, but based on the summary these seem more about pedagogy than the subject. That said, there is a full course worth of videos taught by the same professor that are pretty good.

Oh yeah, Gilbert Strang's original course is amazing: https://ocw.mit.edu/courses/mathematics/18-06-linear-algebra...

It's ridiculous how much random college-level linear algebra textbook material I stared at before things clicked in the course of just jumping in and exploring 3D graphics and writing my own 3D vector, matrix multiplication and 3D transform headers and using them in making some games in plain C.

At some point it's like "Wait, is linear algebra really just about heaps of multiplication and addition? Like every dimension gets multiplied by values for every dimension, and values 0 and 1 are way more interesting than I previously appreciated. That funny identity matrix with the diagonal 1s in a sea of 0s, that's just an orthonormal basis where each corresponding dimension's axis is getting 100% of the multiplication like a noop. This is ridiculously simple yet unlocks an entire new world of understanding, why the hell couldn't my textbooks explain it in these terms on page 1? FML"
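That "heaps of multiplication and addition" reading is easy to check directly; here's a minimal sketch in Python (rather than C, purely for brevity) of a matrix-vector product and the identity-as-noop observation:

```python
# Each output component of a matrix-vector product is just a dot product:
# multiply dimension-by-dimension, then add everything up.
def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

identity = [[1, 0, 0],
            [0, 1, 0],
            [0, 0, 1]]

v = [2.0, -5.0, 7.0]
# The diagonal 1s hand each axis 100% of its own component and none of
# the others, so the identity matrix is a no-op.
assert mat_vec(identity, v) == v
```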

I'm still a noob when it comes to linear algebra and 3D stuff, but it feels like all the textbooks in the world couldn't have taught me what some hands-on 3D graphics programming impressed upon me rather quickly. Maybe my understanding is all wrong, feel free to correct me, as my understanding on this subject is entirely self-taught.

> Maybe my understanding is all wrong, feel free to correct me, as my understanding on this subject is entirely self-taught.

I wouldn't say it is all wrong, just that the stuff you are talking about is a very tiny fraction of LA. I took a graduate class in LA, based on Strang's book; I have the book right here in front of me. The stuff you allude to, i.e. the rotation matrix, reflection matrix & projection matrix, is on p130 of Chapter 2. We got to that in the 1st month of the semester, and it got about 1 hour of class time total. That's it. An LA class is like 4 months, or 50 hours. Is the point of LA to derive those matrices so one can do 3D computer graphics with scaling, rotation & projection? No, that stuff is too basic. We got 1 homework problem on that, that's it.

The stuff that most of the class struggled with (& still struggle with, because Strang goes over it rather quickly in his book) is function spaces (chapter 3, p182), Gram-Schmidt for functions (p184), FFTs (p195), Fibonacci & Lucas numbers (p255), the whole stability-of-differential-equations chapter (he gives hard and fast rules, like a differential equation is stable if the trace is negative & the determinant is positive, but it's not too clear why), and quadratic forms & minimum principles; that whole 6th chapter glosses over too much material imo.

Overall, Strang's book is a solid A+ on how to get stuff done, but maybe a B- on why stuff works the way it works. Like, why should I find the Rayleigh quotient if I want to minimize one quadratic divided by another? Strang just says, do it & you'll get the minimum. How to find a quadratic over [-1,1] that is the least distance away from a cubic in that same space? Again, Strang gives a method, but the why of it is quite mysterious.

Thanks for writing this. I intend to pick up a physical copy of Strang's book as it's been repeatedly mentioned in a positive light on HN.

So does LA get substantially more involved than just lots of multiplications and additions or is it always at the end of the day still just bags of floats getting multiplied and summed? Is it just a fantastic rabbit hole describing what values you put where in those bags of numbers?

Is it the same issue as the infamous "monads tutorials" problem, where the understanding takes a lot of time to infuse but looks obvious in retrospect when it finally clicks?

The "monad" problem is worse; many of the "tutorials" were actively wrong about some critical element, often more than one. I don't think I've seen someone claim to have linear algebra "click" but be fundamentally wrong about it somehow.

One advantage of linear algebra is that it is, well, linear. Linear is nice. It means you can decompose things into their independent elements, and put them all together again, without loss. The monad interface, as simple as it is, is not linear; specific implementations of it can have levels of complexity more like a Turing machine.
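That decompose-and-recombine property is just additivity, which you can verify concretely; a small Python sketch:

```python
def apply(m, v):  # 2x2 matrix times a 2-vector
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

A = [[2, 1],
     [0, 3]]
x = [1, 4]
y = [5, -2]

# Transform the sum, or transform the pieces and recombine: same answer.
whole = apply(A, [x[0] + y[0], x[1] + y[1]])
pieces = [a + b for a, b in zip(apply(A, x), apply(A, y))]
assert whole == pieces
```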

Maybe I'm biased, but I really don't think so. Monads are quite a bit more abstract than the concepts in linear algebra. Linear algebra is both geometric and algorithmic and therefore very intuitive. Most of the difficulty people have learning linear algebra can be attributed to poor teaching methods.

That depends on the part of linear algebra. In an abstract function space when you start calculating dimensions of kernels and the like and get ready to make the jump to infinite dimensions, Banach spaces, and Hilbert spaces, it's about as abstract as monads.

Well, to be fair, functional analysis is not part of linear algebra proper. (If you want to get more abstract, you go to rings and modules and from there to category theory.)

Typically, linear algebra is understood to be the study of finite-dimensional vector spaces, so functional analysis is not necessarily part of it.

However, things like the vector space of polynomials of degree at most n, the vector space of all homomorphisms between two vector spaces, the dual space of a vector space, etc. are all concepts that belong to linear algebra proper yet are more "abstract" than just "computations with matrices".

That's fair. I may have a bias coming from physics because quantum mechanics demands Hilbert Space Now! from the students.

LA is one of those topics that, to an extent, is built on a handful of core capabilities and concepts. Once you master those much of what follows are logical extensions or combinations of the previous. It goes on from there, but the value returned from the core material is wide reaching.

"Whenever somebody gets a deeper understanding of monads, they immediately lose the ability to explain it to others." I don't remember where I read this, but it still holds even today.

I am convinced that monads induce a very specific kind of brain damage that makes a person incapable of ever explaining monads.

Start with a container.

    M a
Then add a way to put things in the container.

    a -> M a
Then add a way to use the thing in the container.

    M a -> (a -> M b) -> M b
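To make those three signatures concrete, here is one hypothetical Python rendering, using a bare value-or-None encoding as the "container" (unit, bind, and safe_div are my names, not any standard API; note this toy encoding can't distinguish "contains None" from "empty", which a real Maybe type can):

```python
def unit(x):          # a -> M a: put a thing in the container
    return x          # (wrapping is trivial in this encoding)

def bind(m, f):       # M a -> (a -> M b) -> M b: use the thing, if any
    return None if m is None else f(m)

def safe_div(n):      # a -> M b: a step that may fail
    return None if n == 0 else 10 / n

assert bind(unit(5), safe_div) == 2.0
assert bind(unit(0), safe_div) is None
assert bind(None, safe_div) is None   # failure propagates through bind
```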

Well, you see, that's one of the problems... monad implementations don't have to be "containers", or at least not the way most people mean. This was one of the critical errors in many of the aforementioned "tutorials". IO, the quintessential monad, is not a container, for instance.

(A nearly-exact parallel can be seen in the Iterator interface. You can describe it as "a thing that walks through a container presenting the items in order"... and yeah, that's the majority use case and where the idea came from... but it's also wrong. What it really is is just "a thing that presents items in some order". It doesn't have to be from "a container". You can have an iterator that produces integers in order, or strings in lexicographic order, or yields bytes from a socket as they come in, or other things that have no "container" anywhere to be found. If you have "from a container" in your mental model then those things are confusing; if you understand it simply as "presenting items in order" then having an iterator that just yields integers makes perfect sense. A lot of the Monad confusion comes from adding extra clauses to what it is. Though by no means all of it.)

I wouldn't over-think it and over-describe it.

The "aha" realization that the "container" can be an ephemeral concept and not resident at run time can come later.

FWIW, I think of IO as a container: it contains the risk of side-effects within. All the examples you gave are containers in their own way.

The problem is telling people it's a container is "over describing" it. We don't need to hypothesize about that. We have the space suits and burritos to prove it is not a good didactic approach. It is not removing from the definition to simplify, it is adding to the definition, exactly as I carefully showed in my description of "Iterator". An Iterator is "a thing that presents a series of items". It does not simplify the discussion of Iterator to say "It's a thing that presents a series of items out of a container, but also, it doesn't have to be a container". It's not the first definition that's "overdescribing", it's the second.

Containers make sense.

Abstract computer science doesn't.

Part of why Haskell appears like such an implacable curmudgeon is the predilection of its community to believe that users must grasp type and logic theory to use it.

They don't.

Just like they don't need to have a mental model of their computer to write software for it.

In my experience, not having a mental model of the computer you are going to run your software on will bite you on the ass sooner or later.

Countless mobile apps and web sites have been made with nigh-zero understanding of the VMs, rendering engines, or underlying machine architecture.

It's not the 80s anymore.

I'll just point out that neither of you have managed to really take a single step towards actually explaining monads.

I'm not trying, so that's not a surprise.

This has inspired me to try to update my post on the idea in a side window, but it's been sitting on my hard drive for over a year now and probably still has a ways to go yet.

Yes, well, good example.

Pretty sure my point still stands.

I've had good luck with explaining it as a characteristic of a programming language. In a language consisting of sequences of statements with bindings and function calls, we expect that

    f(x)

is the same as

    a = x; f(a)

and the same as

    g = f; g(x)

That's the monad laws. Whatever craziness you want to put in the semantics, those are properties you probably would like to preserve in your language.
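Those three equivalences can be checked mechanically. A sketch in Python with the list monad, where unit wraps a value and bind is flat-map (the names are mine):

```python
def unit(x):          # wrap a value: a -> M a
    return [x]

def bind(m, f):       # flat-map: M a -> (a -> M b) -> M b
    return [y for x in m for y in f(x)]

f = lambda x: [x, x + 1]
g = lambda x: [x * 10]
m = [1, 2]

# "f(x)" is the same as "a = x; f(a)": left identity.
assert bind(unit(3), f) == f(3)
# Binding with plain wrapping changes nothing: right identity.
assert bind(m, unit) == m
# Grouping of the steps doesn't matter: associativity.
assert bind(bind(m, f), g) == bind(m, lambda x: bind(f(x), g))
```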

I think I understand monads less now.

Do you really understand them less, or has it dislodged ideas that you thought were true? Moving towards zero is not always decreasing.

I'm no expert, although I think I remember that a Monad is basically just like allowing a sequence of statements to be executed. Like executing a code file ;)

Functional languages are really weird, for instance it's possible to switch line order of statements and the compiler will still figure out how to stitch that together. I think even JS in parts has or at least had that behaviour. (Actually that's useful when having mathematical formulas that are interdependent and you're too lazy to order them topologically by dependence)

On the other hand, just executing a sequence of commands in order to do I/O is only a normal thing to do since recently as far as I understand. The sweet spot for FP is IMHO something like React where state is strictly separated from the functions. (Imagine writing Hello World using Normal Maths)

(Please correct me if I'm wrong, which is probably quite likely ;))

"A monad is just a monoid in the category of endofunctors, what's the problem?"

I don't think this definition is correct. (A monad is an endofunctor.)

It's correct but jargony. So in the category of sets there is a notion of product between two sets called the Cartesian product, and one can do a couple things to endow this product with an identity element, for example one might use {{}} as that object in the category of sets.

The claim is that in other categories, there might be other natural combinations between two objects, for example a tensor product of Abelian groups combined with the integers Z as unit, or a composition of two endofunctors into a new endofunctor FF combined with the identity functor.

So the idea is that a monoid is somehow a destroyer of this combination operation; a monoid in sets un-combines the Cartesian product M × M back into the set M, and indeed this is a function (a set-arrow) from the combined objects to the underlying object.

By having an endofunctor combined with a natural transformation from FF back to F (natural transformations are the arrows in the category of endofunctors) a monad is therefore doing exactly what a monoid does, if you replace the "pre-monoid" combination step of the Cartesian product with instead a new "pre-monoid" combination step of endofunctor composition.

The definition is correct. A monad is an endofunctor with return and join functions. Just like a monoid in the category of sets is a set with identity and multiplication.

Honestly, I think working with computers (possibly some programming) should be more frequently integrated into math courses. A computer is a natural way to really interact with the material, like labs in the natural sciences. We're lucky to live in a time where this is possible, but sadly math education is taking its sweet time taking advantage of it.

Sounds like a case of You Can't Tell People Anything: http://habitatchronicles.com/2004/04/you-cant-tell-people-an...

I guess a mathematician might look down upon sticking with 2D and 3D stuff because it leaves out all the interesting things that happen in dimension 92382 or at negative infinity. But yeah, matrices are basically just a convenient way to write rows and rows of "ax + by + cz...". In linear algebra you do it so often, people made up their own syntax. And nothing can visualize it like transforming graphics, IMO.

You don't even have to go 3D, just starting with the points of a rectangle in 2D and asking, "how do you put the edge points of this rectangle 10px to the left, rotate them 45° and stretch them 200% vertically?" and you've applied a matrix. Even if you're not using the fancy brackets, you're using a matrix, and understanding it.
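As a sketch of exactly that transform in Python (the translation is done directly here; in matrix form it needs the 3x3 homogeneous-coordinates trick, and composing all three steps is just matrix multiplication):

```python
import math

def translate(p, dx, dy):
    return (p[0] + dx, p[1] + dy)

def rotate(p, deg):  # counter-clockwise rotation about the origin
    t = math.radians(deg)
    return (p[0] * math.cos(t) - p[1] * math.sin(t),
            p[0] * math.sin(t) + p[1] * math.cos(t))

def scale(p, sx, sy):
    return (p[0] * sx, p[1] * sy)

corners = [(0, 0), (4, 0), (4, 2), (0, 2)]
# 10px to the left, rotate 45 degrees, stretch 200% vertically.
out = [scale(rotate(translate(p, -10, 0), 45), 1, 2) for p in corners]

# The origin corner lands at (-10/sqrt(2), -20/sqrt(2)).
assert math.isclose(out[0][0], -10 / math.sqrt(2))
assert math.isclose(out[0][1], -20 / math.sqrt(2))
```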

I think these are good examples, but to me "linear algebra thinking" lies in its generality. For example, the derivative is a linear operator, so how do you write it down as a matrix? Google's PageRank is a solution of a matrix equation, what does that matrix represent? Etc.

> For example, the derivative is a linear operator, so how do you write it down as a matrix?

Consider polynomials in X of degree up to, but not including, N. The powers 1, X, ..., X^(N-1) form a basis. Then the coefficients of the polynomial can be put in a column vector. If D is the derivative operator, DX^n = nX^(n-1), so the derivative can be expressed as a sparse matrix with entries D_(n,n+1) = n. Visually, it's a matrix with the integers 1, 2, ..., N-1 on the superdiagonal.

You can also see that this is a nilpotent matrix for finite N, since repeated multiplication sends the entries further up into the upper right corner.

You can extend this to the infinite case for formal power series in X, too, where you don't worry about convergence.
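A quick Python sketch of that matrix for N = 4 (coefficients in the basis 1, X, X^2, X^3), including the nilpotency observation:

```python
N = 4  # polynomials of degree < 4

# Superdiagonal matrix: in 0-indexed terms, D[i][j] = j exactly when j == i + 1,
# giving the integers 1, 2, 3 on the superdiagonal.
D = [[j if j == i + 1 else 0 for j in range(N)] for i in range(N)]

def apply(m, v):
    return [sum(m[i][j] * v[j] for j in range(N)) for i in range(N)]

# p(X) = 2 + 3X + 5X^2 + 7X^3, so p'(X) = 3 + 10X + 21X^2.
p = [2, 3, 5, 7]
assert apply(D, p) == [3, 10, 21, 0]

# Nilpotent: differentiating a cubic N times annihilates it.
q = p
for _ in range(N):
    q = apply(D, q)
assert q == [0, 0, 0, 0]
```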

> Google's PageRank is a solution of a matrix equation, what does that matrix represent?

Isn't it just the adjacency matrix of a big graph?

Anyway, I agree with you. Matrices and linear algebra are really good inspiration for higher-level concepts like vector spaces and Hilbert spaces and so on. That's where the real power lies. But even in such general domains, matrices are often used to do concrete computations, because we have a lot of tools for matrices.
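Roughly, yes: PageRank's matrix is built from the column-normalized adjacency (link) matrix, blended with a uniform "teleport" term, and the ranks are its dominant eigenvector, findable by power iteration. A minimal pure-Python sketch (the 3-page web here is made up; 0.85 is the commonly cited damping factor):

```python
# A tiny hypothetical 3-page web: links[j] = pages that page j links to.
links = {0: [1, 2], 1: [2], 2: [0]}
n, d = 3, 0.85  # d is the damping factor

rank = [1.0 / n] * n
for _ in range(100):
    # Uniform "teleport" mass, plus each page passing its rank to its outlinks.
    new = [(1 - d) / n] * n
    for j, outs in links.items():
        for i in outs:
            new[i] += d * rank[j] / len(outs)
    rank = new

# Ranks form a probability distribution; page 2 (most linked-to) ranks highest.
assert abs(sum(rank) - 1.0) < 1e-9
assert rank.index(max(rank)) == 2
```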

I think the best thing anyone told me about linear algebra was that “matrices are just the coordinate form of a linear map.” So applying the map is equivalent to multiplying its matrix, etc.

Reality, as always, is not that simple. Matrix analysis is a huge area in itself; and matrices can also be used to represent tensors (which generally are not seen as linear maps) and some other things.

> Wait, is linear algebra really just about heaps of multiplication and addition?

That's just one of dozens of things LA is "about".

> why the hell couldn't my textbooks explain it in these terms on page 1

Because you wouldn't have understood terms like

> orthonormal

and because it would have been unhelpful to everyone else, who wasn't in LA for the exact same reason you were.

Being obvious in retrospect doesn't mean it was obvious in forespect. You had to learn the material first.

Do you have any recommendations on videos/tutorials with graphics stuff that would help me grasp some of the linear algebra and basics about 2D/3D graphics? Your comment about 'staring at random college-level linear algebra' stuff resonated deeply with me, and I've always felt like I'm just not understanding how it all connects.

Not really, I didn't follow any specific guide. But if you like learning from youtube videos Casey Muratori has some decent streams about this stuff on his Handmade Hero channel. The 3Blue1Brown channel also has some relevant videos.

If you've never written a standalone software-rendered ray tracer, I found that to be a very useful exercise early on. There are plenty of tutorials for those on the interwebs.

Not the person you responded to but I found this course to be very good: https://www.edx.org/course/computer-graphics-2

Gilbert Strang's linear algebra course blew my mind back in high school, and I still use insights from it every day. Strang has a particular lecturing style where he approaches every topic several times, often beginning many lectures before the main treatment. At first I thought it was a bit confusing, but later I realized it helped build fluency, just like a language class.

I'm really thankful to MIT OCW for putting his lectures out for free -- in fact, I think I'll go donate to them now.

+1, I was very grateful to MIT OCW because when I learned Linear algebra, I could not have afforded it. Later when I got a job, I donated to OCW and I bought his book full price from his own site [1] just as a tribute to the guy.

[1] https://math.mit.edu/~gs/linearalgebra/

Hey, me too! (All of it.)

Of course it is up to you what you do with your money but, they have a $17.5B endowment, so there may be more needy causes if you were so inclined.

OCW opens up top-notch education to anyone and everyone, regardless of social or economic background. I wouldn’t take it for granted even with MIT’s eye-watering endowment, and I doubt donations to it are going to be paying cafeteria lunches for the students.

I hope you’re donating and actively contributing to many non-profit projects and that your comment comes from being tired of the world’s injustices rather than from callous impertinence, although I suspect it does not.

Most money which comes into MIT passes through overhead. That means if a foundation donates to MIT, a bit over 1/3 of that money might end up with whatever they donated to. A bit under 2/3 might go into the general budget (overheads vary by funding source, but the numbers above are from one specific project).

On paper, overhead is used for costs of running the place. In practice, it's used for things like upscale faculty clubs, million-dollar executive salaries, $200 million buildings, etc. MIT has among the highest overheads in the academy. Ironically, MIT claims its ocean yacht makes money rather than losing money (which could very well be true).

If you're okay with the majority of your money going to graft, donate to MIT. With a project like OCW, which has such a huge cost:benefit ratio, accepting the graft with the donation may be a rational decision, if you subscribe to a system of ethics like utilitarianism.

Personally, I almost never donate to a charity where the highest-earner makes more than I do. I think if everyone did that, MIT might lose some of the graft and corruption which has built up there over the years.

MIT’s current overhead rate is 50.5%, but that’s pretty standard.

Here’s a scatterplot showing lots of research institutions’ actual and calculated rates. When this was published, MIT’s rate was slightly higher (54%): https://www.nature.com/news/indirect-costs-keeping-the-light...

This also applies to federal research grants and is meant to cover costs associated with actually hosting the research (rent, utilities, support staff). Foundations can (and often do) negotiate lower rates. I’m not sure how donations are handled, but I don’t think the same F&A rates apply.

The article quotes 56% as the negotiated rate on NIH projects, not 54%, but the rate varies by funder and by project. The 2/3 number is the base overhead rate from an actual project I had insight into.

What's a little bit hidden there is that MIT dips into these funds multiple times, in pretty complex ways. For example, a sponsor might pay overhead and graduate student tuition (which just flows into MIT's general coffers). Or graduate student tuition might be waived, and the sponsor pays just overhead. Or a donor might pay overhead when the money comes in, and again on specific purchases. Or capital expenditures might waive overhead. Etc. The level of complexity is high, while the level of transparency is low.

I'm not claiming any of this is unique to MIT by any means. It's just where I have the most visibility.

What's cute is that MIT claims to lose money on everything. In the article you cited: "'We lose money on every piece of research that we do,' says Maria Zuber, vice-president for research at the Massachusetts Institute of Technology (MIT)." You'll find similar statements about educating undergraduates and tuition. And just about everything else. I've worked through the numbers at some point, and MIT only loses money with clever accounting; it's good PR to say MIT subsidizes everything it does, but it's often not the reality.

I think his point is there are plenty of other noble causes that could use the money a lot more.

Malcolm Gladwell made a similar point on his podcast http://revisionisthistory.com/episodes/06-my-little-hundred-...

Is there a better way to incentivize educational institutions to offer free content?

A modest proposal: ban donating to them so they need to radically increase their student body (online or offline) to earn their keep with tuition, instead of relying on the donations of the extremely rich parents of legacy admissions students.

What problem do you think drastically increasing the student body will solve?

I'm mainly thinking of elite institutions. I'd first invert the question and ask what problem places like Harvard, Yale, MIT are currently solving. (This small set of elite universities is absorbing the vast majority of donations.)

As the number of students has exploded over the decades around them, they've kept their student numbers the same even as they collect more and more money, which they hoard with no concern for the opportunity cost of capital whatsoever. This has created an intensifying zero-sum battle for the admissions slots at these universities. Meanwhile the state systems are increasingly overwhelmed, and sketchy private universities are increasingly scamming students on the edges.

By keeping their numbers so low, the Ivy League retains comfortably small classes, no major change in their overall mission and professorial lifestyle. At the student level they function to perpetuate privilege across the generations, especially via legacy admissions which seems like a gilded age concept I can hardly believe is still a thing in 2020.

I'm saying if the Ivy League were to change its mission to serve America and the world as much as they could instead, it would do a great deal to help the vast mass of striving students who are barely not making the cut, as well as the economy over the long run, and even reduce tensions in American democracy. And that change of mission probably can only happen if they start to ignore what their biggest donors think they should focus on.

I'm in an online Master's program now. With way more students than I feel they can handle. I'm sure they're milking the tuition just fine, but when projects and papers aren't graded in time to determine if you should withdraw or soldier on, it's less awesome.

While your view makes sense in the theoretical, once again human failings cause it to not work well in reality.

+1. Since I took the OCW course, whenever I multiply a matrix and a vector or two matrices by hand, I hear his voice saying, "combinations of columns." Strang must have said those words hundreds of times in it. His lectures stick like nothing else.
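That slogan is easy to check in a few lines; here is a small NumPy sketch (the matrix and vector are my own example, not from the lectures):

```python
import numpy as np

# "Combinations of columns": A @ x equals x[0]*(column 0) + x[1]*(column 1) + ...
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [4.0, 0.0, 1.0]])
x = np.array([2.0, -1.0, 3.0])

by_columns = sum(x[j] * A[:, j] for j in range(A.shape[1]))
assert np.allclose(A @ x, by_columns)  # [0, 8, 11] either way
```

Once you see A @ x this way, the column space ("all combinations of the columns") stops being an abstract definition.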

This is very well put. Knowledge has a hierarchical (or perhaps even cyclical!) structure and it's unrealistic to think that a body of knowledge can be taught or learned sequentially.

In High school?

Yes. Linear Algebra is an extension of what is commonly called Algebra 2 or Precalculus in high school.

LA and Calculus can be studied independently in any order and then fruitfully combined later.

Have not watched the videos yet, but that seems to me more like an 1820 vision of linear algebra :-)

If you look at the order of topics in his book "Introduction to Linear Algebra", you will find the topic "Linear Transformations" not until chapter 8 -- even after the chapters on eigenvalue decomposition and singular value decomposition. But understanding that a matrix is just the representation of a linear transformation in a particular basis is probably the most important and first thing you should learn about matrices ...

Gaussian Elimination is indeed from the 1820s. All the rest is more recent than that. The idea of matrix decomposition per se comes from the 1850s. The earliest work on something like the SVD is from the 1870s.

You are onto something though. Strang is coming from a direction of numerical computations and algorithms for solving real-world problems. Pure mathematics departments for at least the past maybe 80 years often look down on numerical analysis, statistics, engineering, and natural science, and adopt a position that education of students should be optimized in the direction of helping them prove the maximally general results using the most abstract and technical machinery, with an unfortunate emphasis on symbol twiddling vs. examining concrete examples. By contrast, in the 19th century there was much more of a unified vision and more respect for computations and real-world problems. Gauss himself was employed throughout his career as an astronomer / geodesist, rather than as a mathematician, and arguably his most important work was inventing the method of least squares, which he used for interpreting astronomical observations.

With the rise of electronic computers, it is possible that the dominant 2050 vision of linear algebra and the dominant 1900 vision of linear algebra will be closer to each other than either one is to a 1950 vision from a graduate course in a pure math department.

I believe this is largely because in the field of mathematics, Linear algebra is just the seed that sprouts the growth of other very very useful mathematical subjects like Abstract Algebra, Functional Analysis and so on. Linear Algebra is used as a stepping stone to more general theories that are also super useful.

Take Hilbert spaces for example. They are based on linear algebra. They are quite general and you might argue that there's a lot of symbol twiddling there. However, Hilbert spaces are/were essential in the study of Quantum Mechanics, which we can argue is a very important topic.

And if you only stick with matrices and numerics, you're bound to get stuck in the numbers and details and miss the big picture. A lot of results are much cleaner to obtain once you divorce yourself from the concrete world of matrix representation.

Of course, we should probably have the best of both worlds. I'm not saying applications are unimportant. Take something like signal processing, which relies heavily on both numerics and general theory.

So I'd like to add something to your point. Math departments optimize the education of math students towards the more general, and perhaps students not interested in pursuing pure math should have course-work that reflects that.

> Pure mathematics departments for at least the past maybe 80 years often look down on numerical analysis, statistics, engineering, and natural science, and adopt a position that education of students should be optimized in the direction of helping them prove the maximally general results using the most abstract and technical machinery, with an unfortunate emphasis on symbol twiddling vs. examining concrete examples.

I had this view when I took linear algebra as an undergraduate, but I have gradually changed on the subject over time. I took a standard "linear algebra for scientists and engineers" course but I found it too abstract at the time. The instructor rarely concentrated on examples and applications despite the more applied focus in the course title. Later I came to appreciate the abstraction, since it helped me understand more advanced mathematical topics unrelated to the "number-crunching" I originally associated the topic with. I now think the instructor had a more "unified" approach, but I didn't realize it at the time.

I believe in applications and theory going hand in hand and benefiting each other. The computer is an incredibly powerful tool perfectly suited for this purpose, if we resist the urge to see it as a push-button technology. Viewing matrices as a box of numbers instead of as a representation of a linear transformation leans too much in the push-button direction for my taste.

Gil Strang does not view matrices as “just boxes of numbers”, nor does he teach that view.

YMMV, but I find pure mathematicians treat computers as “push-button technology” much more than applied mathematicians.

I am not disagreeing with you there when it comes to pure mathematicians :-D

Edit: But there are of course big exceptions there as well, for example Thomas Hales.

If you want to get into serious physics/engineering, the abstract aspect of linear algebra is much more important than the boring computational mechanics. And quite a bit of those computational mechanics lead you astray when you go to infinite dimensions.

How might they lead you astray? It almost sounds like you have an example in mind.

Indeed, Strang’s textbook starts with “I believe the teaching of linear algebra has become too abstract.”

He mentions this sentiment in a lot of interviews and things too.

> Have not watched the videos yet,

Then please do. I took several online linear algebra courses from sources I trust and they were pretty bad. Or let's put it another way: I'm a pretty clever guy and I was still left confused. Strang is excellent in the classroom, and I almost even like videos for learning now thanks to him (x1.5 speed is your friend). His videos should not be your only learning source, but judging his course only by the book might result in a lot of learners skipping what I found to be the best course by far. If you want to learn linear algebra, give Strang a try first and you might save a lot of time.

> a matrix is just the representation of a linear transformation

While this view certainly helps intuition at the initial stages of learning, a matrix is not "just" that, and computational methods involving matrices are of much more practical importance (similar to adding and multiplying numbers, which we are taught early in life), which is probably why the stress is on them first and foremost. Someone said, "learn to calculate; understanding will come later."

I don't think it was this way in 1820, but I agree with your point.

Even though I also use Linear Algebra mostly computationally today, the origin of it is in the geometry and I think this connection should come first. Also, "number crunching" is a boring way to learn things.

Though, "matrix way" can be good for engineers.

I also think that this point should be emphasized more. It helped me a lot when I realized it. I also liked the abstract approach to vector spaces. Of course, a matrix is just one way to represent linear transformations and it can also represent other things like a system of linear equations.

This is silly. LA is an incredibly rich, broad, and deep area of study. You can't just grab one part of it and say it's the most important and first thing. And it's silly to say that whatever is most important should come first -- the central ideas depend on preparation.

I’m not sure I agree when that one thing is the word linear (half the name). It just feels wrong not to ground the subject in the idea of a linear transformation.

When I was self studying linear algebra I found Strang’s book to be inadequate on its own; you really need the lectures to get the most value out of it.

I found going through Linear Algebra Done Right to provide a good counterbalance to Strang’s book+lectures.

It is interesting to compare this with 3Blue1Brown's linear algebra introduction on YouTube. He seems to have been the only mathematician who has actually mastered the medium; linear algebra lends itself very well to animations. The mathematicians don't understand how badly they need to animate some of these concepts.

3Blue1Brown's linear algebra animations were fun to watch but they did almost nothing for me except the basic fact that the "linear" part means lines.

The rest was effectively preaching to the choir so those that already know linear algebra nodded their heads and idiots like me were still flummoxed

That's interesting. I do think 3B1B's goal is probably to build better intuitions in people who already know it.

Yes, those lectures are good for building intuition only. For practice one has to read books. I just hope the viewer knows that, which I have seen is not always the case.

Yes. 3Blue1Brown has himself said multiple times that no video is a substitute for textbooks.

Such is life

Agree, world class education in maths there. He understands the importance of examples and visualization and it changes everything.

Mathematicians and teachers aren't all computer programmers, so creating animations isn't their forte.

Books have pictures that do a pretty good job.

Animations are pretty and interesting, but that isn't the same as teaching all the math.

Another good linear algebra book is "Linear Algebra Done Right", which Springer is giving away for free right now.

Link: https://link.springer.com/book/10.1007/978-3-319-11080-6

There's also a free book "Linear Algebra Done Wrong," which might be worth checking out.


IMO Axler's book should be read either during or after you take an introductory course on Linear Algebra.

> You are probably about to begin your second exposure to linear algebra. Unlike your first brush with the subject, which probably emphasized Euclidean spaces and matrices, this encounter will focus on abstract vector spaces and linear maps.

I whole-heartedly agree. Axler's book is a great stepping stone to more abstract linear algebra.

Came here to say that. It is a wonderful book, and I think it provides a more "modern" approach than the one presented in the videos.

Thanks for this -- do you know if there's a consolidated list anywhere of other books Springer is making available for free right now?

Woah. That's a lot of books. Between the math, physics, and cs sections, years from now you'd look back and wonder if you really should have downloaded all of them.

This book is great and very much complementary to Strang's approach in that it leans more towards "abstract" linear algebra.

So I like this outline. It is very MIT-ish where there is a sense of teaching someone to solve practical engineering problems with matrices.

But I do foresee some difficulties. One thing that I find really difficult, for example, is that I take undergrads who have had linear algebra and ask "what is the determinant?" and seldom get back the "best" conceptual answer, "the determinant is the product of the eigenvalues." This is math, so the best answer should not be the only one, but it should ideally be the most popular. We would consider it a failure if the most popular explanation of the fundamental theorem of calculus was not some variation of "integrals undo derivatives and vice versa". I don't see this approach solving that.

Furthermore, there is a lot of focus from day one on this CR decomposition, which serves to say that a linear transform from R^m to R^n might map onto a subspace of R^n with smaller dimension r < min(m, n). While in some sense this is true, it is quite "unphysical": if a matrix contains noisy entries, then it will be degenerate in this way only with probability zero. (You need perfect noise cancellation to get degeneracies, which amounts to a sort of neglected underlying conserved quantity pushing back on you and demanding to be conserved.) In that sense the CR decomposition is kind of pointless, just working around some "perfect little counterexamples", so it seems weird to hold it up as the most important thing.

> seldom get back the "best" conceptual answer, "the determinant is the product of the eigenvalues."

I found that the "best conceptual" answer depends a lot on taste, and what concepts you are familiar with.

In this case:

- Calculating exact eigenvalues of matrices larger than 4x4 is impractical, since it requires you to solve a polynomial of degree >4.

- Eigenvalues in general exist only over algebraically closed fields (the complex numbers), while the determinant itself lives in the base field (rationals, reals).

How about:

- [Geometric Determinant] The determinant is the (signed) volume of the parallelepiped spanned by the column vectors of the matrix.

- [Coordinate Free Determinant] The determinant is the map induced between the highest exterior powers of the source and target vector spaces (https://en.wikipedia.org/wiki/Exterior_algebra)

- I think there is also a representation theoretic version, that characterizes the determinant as invariant under the Symmetric group acting by permutation on the columns/rows of the matrix.
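The geometric version is also the easiest to play with numerically; a small sketch (the matrices are my own examples):

```python
import numpy as np

# The columns span a parallelepiped; |det| is its volume.
# A box stretched to 2 x 3 x 4 should have volume 24.
A = np.diag([2.0, 3.0, 4.0])
assert np.isclose(abs(np.linalg.det(A)), 24.0)

# Shearing doesn't change the volume (base times height is preserved),
# consistent with det(S A) = det(S) det(A) and det(shear) = 1.
shear = np.array([[1.0, 0.5, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
assert np.isclose(abs(np.linalg.det(shear @ A)), 24.0)
```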

Re: your last point, the determinant is the matrix function, which is fully anti-symmetric under permutation of rows and columns i.e. swapping a pair of rows or a pair of cols pulls out a minus sign. The definition of the determinant is related to the alternating representation of the symmetric group.

The permanent [1] is the matrix function which is fully symmetric, so permuting any rows or cols leaves it invariant. It emerges from the identity representation.

Finally, partially symmetric matrix functions are known as immanants [2], defined using the other irreps of the symmetric group.

[1] https://en.wikipedia.org/wiki/Permanent_%28mathematics%29

[2] https://en.wikipedia.org/wiki/Immanant

I don’t have an intuition for these concepts I’m afraid (I probably should watch the videos). What I don’t see for instance is how this relates to the fact that a matrix A with det(A) = 0 is not invertible.

Take a 3x3 matrix A for example. Then det(A) is the volume of the parallelepiped formed by the row vectors. If one vector is a linear combination of the other two, this means that the vectors lie in a plane, which has volume 0 in 3D, so det(A) = 0. Since we have a plane in 3D, this means A can't express all vectors in 3D, so it's not invertible. This generalizes to any dimension.
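A quick numerical sketch of that point (the matrix here is my own example):

```python
import numpy as np

# Row 2 is row 0 plus row 1, so the three rows lie in a plane:
# the parallelepiped they span is flat and has volume 0.
A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0],
              [1.0, 3.0, 4.0]])

assert np.isclose(np.linalg.det(A), 0.0)
assert np.linalg.matrix_rank(A) == 2  # rank-deficient, hence not invertible
```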

The geometric version is the most intuitive for me:

If the volume of the parallelepiped is zero, then there will be directions in the target space that you did not hit. Hence the matrix cannot be invertible.

I will have to understand what directions in target space are first. I’ll guess I’ll have to do the work ;)

So an eigenvector is a special direction which most linear transformations have. If you ask them to transform a vector in that direction, they do not rotate that direction to some other direction, they only scale it by a constant called the eigenvalue. Usually there are a bunch of these directions for one transform, and they do not need to be orthogonal. We often choose one vector as representative, like (-1, 1): but this is shorthand for all vectors (-t, t) for all t. One important thing is that the zero vector (0, 0) doesn't count (even though T 0 = 0) because it can't represent a whole direction.

So for example if I take (x, y) to T (x, y) = (3x + y, 2x + 4y), that is an example of what we call a linear transformation -- it obeys T(p1 + p2) = T p1 + T p2, it distributes over addition.

Now in addition to noticing that this is linear we may happen to notice that T (-1, 1) = (-3 + 1, -2 + 4) = (-2, 2). So in the direction (-t, t) we are just scaling vectors by a factor of 2, to (-2t, 2t). Similarly we might notice that T (1, 2) = (3 + 2, 2 + 8) = (5, 10). So in the direction (t, 2t) we are just scaling vectors by a factor of 5 to (5t, 10t).

These two scaling factors, 2 and 5, are called the eigenvalues of T. Their product, 10, is called the determinant of T. And in this case their eigenvectors span the entire space -- you can make any other (a, b) as a combination (-t1, t1) + (t2, 2 t2), for some numbers t1, t2. Actually t1 = (-2a + b)/3 and t2 = (a + b)/3, I can work out pretty quickly. And in this t-space this transformation is very easy to think about, it has been "diagonalized."

Sometimes these eigenvalues and eigenvectors don't exist, but we can patch that up with one of two tricks. The first trick is, for example, used for the 2x2 rotation matrices. These rotate every direction into some other direction, so how will I find some direction which "stands still"? The answer here is complex numbers: in this case it turns out that any 2x2 rotation by angle t will map the complex vector (1, i) to (cos t + i sin t, -sin t + i cos t) = (cos t + i sin t) * (1, i), so it has two complex eigenvalues e^(it), e^(-it). So the first trick is complex numbers.

There is, it turns out, only one other class of weird transformation. In these weird transformations, it is possible to define chains of "generalized eigenvectors". Each chain starts with one ordinary eigenvector with an ordinary eigenvalue q, T v1 = q v1; the next element of the chain is a "generalized eigenvector of rank 2" which has T v2 = v1 + q v2; the next is a "generalized eigenvector of rank 3" which has T v3 = v2 + q v3; and so on.

So it is a theorem that any NxN complex linear transformation has N linearly independent generalized eigenvectors which span the space, and "usually" these are all just normal eigenvectors and the matrix is "diagonalizable" (and even if they aren't, they come in families which start from one normal eigenvector and the matrix can be put into "Jordan normal form").

If you understood all of that, you are ready for the main result that you asked about. :)

For a linear transformation to be invertible, it needs to map distinct input vectors to distinct output vectors. If it maps two different input vectors to the same output, then invertibility fails.

So we know that invertibility fails when we can find distinct v1 and v2 such that T v1 = T v2.

Put another way, T v1 - T v2 = 0. But by the linearity property, T distributes over additions and subtractions, so this is the same as saying that T (v1 - v2) = 0 for v1 - v2 nonzero. This is enough to establish that v1 - v2 is an eigenvector with eigenvalue zero.

What does this do to the determinant, the product of all the eigenvalues? Well, zero times anything is zero. So if some linear transformation T is not invertible, then you immediately can conclude that det(T) = 0.

Furthermore this argument goes the other way too, with only a little subtlety related to these "generalized eigenvalues" -- basically, that the generalized eigenvalues always exist and there is always at least one eigenvector which actually has that eigenvalue, and that complex numbers still have the property that a finite product of complex numbers can only be zero if one of those numbers was zero. If you know all of those things, then you can work your way backwards to conclude that det(T) = 0 implies that one of the generalized eigenvalues is zero, which has at least one normal eigenvector v such that T v = 0, which I can then use to find many inputs for any given output: T u = T (u + k v) for any k.

So to say that this product-of-eigenvalues is zero is to say that one of the eigenvalues is zero, and therefore the linear transformation is projecting down to some smaller, flatter subspace in a way that cannot be uniquely undone. If there is no such smaller flatter subspace, then the transformation must have been invertible all along.
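The worked example above (T(x, y) = (3x + y, 2x + 4y), eigenvalues 2 and 5, determinant 10) can be verified numerically:

```python
import numpy as np

# T(x, y) = (3x + y, 2x + 4y) from the explanation above, as a matrix.
T = np.array([[3.0, 1.0],
              [2.0, 4.0]])

vals, vecs = np.linalg.eig(T)
assert np.allclose(np.sort(vals.real), [2.0, 5.0])  # the two eigenvalues
assert np.isclose(np.linalg.det(T), 10.0)           # their product

# The direction (-1, 1) is only scaled (by 2), never rotated.
v = np.array([-1.0, 1.0])
assert np.allclose(T @ v, 2 * v)
```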

Subjective, I find the geometric interpretation of the determinant to be the "best".

The most intuitive definition of the determinant for me is that it's the local growth or shrink factor of a differentiable function at a point (the determinant of its Jacobian). The eigenvalues of the Jacobian then reveal the various sources and sinks in a dynamical system.

This view also motivates the concept of vector bundles and vector spaces at a point.
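A concrete instance of that view, using the standard polar-coordinates example (my own choice, not the commenter's):

```python
import numpy as np

# For f(r, t) = (r cos t, r sin t), the Jacobian determinant is r:
# a small patch near radius r gets its area scaled by r, which is
# exactly the "r dr dt" factor in the change-of-variables formula.
def jac_det(r, t):
    J = np.array([[np.cos(t), -r * np.sin(t)],
                  [np.sin(t),  r * np.cos(t)]])
    return np.linalg.det(J)

assert np.isclose(jac_det(2.0, 0.7), 2.0)
```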

After watching this and having read the comments, I am quite puzzled by the approach Americans seem to take to linear algebra. Are matrices viewed as the core of the subject in the USA?

My country curriculum introduces linear algebra through group theory and vector spaces. Matrices come later.

I would not describe his approach as “the US approach” but it is a pretty standard approach to introducing linear algebra to engineers, which is the theme of the course.

I was also taught linear algebra this way, by an applied mathematician with a background in chemical engineering:

- start by solving Ax=b with row reduction

- develop theorems about linear independence and spanning sets of vectors based on these exercises

- introduce the determinant from the perspective of linear systems (rather than eg geometry or group theory)

- eigenvectors and eigenvalues

Later I switched from physics to math and TAed a more “algebraic” approach involving groups/rings/fields. But the matrix-first approach was more helpful for both my physics coursework and later courses in numerical linear algebra.
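The first step in that sequence maps directly onto library code; `np.linalg.solve` performs an LU factorization, i.e. row reduction, under the hood (the system below is my own toy example):

```python
import numpy as np

# Solve Ax = b for the system  2x + y = 3,  x + 3y = 5.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

x = np.linalg.solve(A, b)  # LU with partial pivoting internally
assert np.allclose(A @ x, b)
```

Doing the same elimination by hand a few times is what makes the later theorems about independence and spanning feel concrete.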

Yeah I would call this the engineering approach (matrices) vs the mathematical approach (algebra).

I took like 3-4 courses in the US involving the engineering approach, starting in high school and continuing through the college as a CS major. That was all that was required.

But I also like algebra, so I happened to take a 400-level course that only math majors take my senior of college. And then I got the group theory / vector space view on it. I don't think 95% of CS majors got that.

I don't think one is better than the other, but they should have tried to balance it out more. It helps to understand both viewpoints. (If you haven't seen the latter, then picture a 300-page text on linear algebra that doesn't mention matrices at all. It's all linear transformations and spaces.)

What country were you taught in? Wild guess: France?

I can't say whether or not it is the standard approach, but I do know that it is very common in many countries to teach a linear algebra course so heavy on matrix operations that you can come away believing linear algebra is somehow _about_ matrices and their operations. Many in my university class seemed to believe that.

A book I enjoyed is Axler's Linear Algebra Done Right[0], which, if I remember correctly, doesn't contain a single matrix.


I've recently started going through Axler carefully and doing the problems, a quarantine activity I guess, and have been enjoying it. I actually learned about this book on an older HN post.

It does have plenty of matrices. The main thing it really does is avoid determinants until the very end. The determinant is certainly something I remember learning as a kind of rote operation, without really understanding any intuition behind why you'd multiply and add these numbers in this particular way. I still feel lacking in "feel" here, which is why I suppose I'm going through Axler now.

Yeah, math professor here, this drove me crazy.

For example, I remember looking at the linear algebra book my department had used previously. Early on, it introduced the concept of the transpose of a matrix:


Superficially, it looks like something good to introduce. It is fodder for easy homework exercises, and there is a satisfyingly long list of formal properties satisfied.

But why? What does the transpose mean? For what sort of problem would you want to compute it?

There are good answers to these questions (see the "transpose of a linear map" section of the Wikipedia article I linked), but they are not easy for a beginner to the subject to appreciate.

IMO Axler's book should be read either during or after you take an introductory course on Linear Algebra.

> You are probably about to begin your second exposure to linear algebra. Unlike your first brush with the subject, which probably emphasized Euclidean spaces and matrices, this encounter will focus on abstract vector spaces and linear maps.

American here. We started with the group theory and vector space approach, though the group theory was fairly limited to just enough for vector spaces as there was a separate set of algebra classes.

It's not universal.

In general math departments in the US are less frightened of accidentally teaching the students something useful.

There's nothing particularly linear about groups.

No but the curriculum goes from groups to fields and from fields to vector spaces.

> After watching this and having read the comments, I am quite puzzled by the approach Americans seem to take to linear algebra. Are matrices viewed as the core of the subject in the USA?

The US is a very big place. I doubt there is an American approach to linear algebra. We really don't have a single approach to anything. Different schools and majors probably approach the topic differently. My college had a linear algebra course specifically crafted for CS majors and engineers. I took that and it did focus on matrices. It was also the only math class that required programming. I believe math majors had their own linear algebra course.

> My country curriculum introduces linear algebra through group theory and vector spaces. Matrices come later.

Different strokes for different folks. If it worked out for you that's all that matters.

Admittedly, I never fully grokked linear algebra.

Some of the concepts made sense, especially solving for linear systems of equations.

Recently, I decided to brush up on my math skills via Youtube videos, and came across this series: https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw

It explains Linear Algebra concepts using 2D and 3D vector manipulation, and the animations help me visualize the underlying maths.

I am familiar with the material of linear algebra but haven't read his books. Could someone who has absorbed linear algebra from different sources and is familiar with Strang's books comment on what's good, bad, and unique about them?

In my time I picked up LA from Ben Noble, Halmos, and Axler, and the computational side of things from Golub & van Loan.

I like your bibliography!

Ben Noble's book was my entry to LA. I was an undergraduate and involved in a research activity that demanded a lot of knowledge of the eigenvalue problem. The concrete approach in that book helped a lot.

It was only later on that I took a class based on G&vL (implementing a bunch of basic LA factorizations in Matlab), and in my spare time read Halmos's book. I understand the coordinate-free algebraic approach, but I work on applications and that viewpoint has not stuck with me. The stuff on numerical accuracy in GvL really did stick, OTOH.

From the comments here, and Strang's book's table of contents, I gather that his book (which has a lot of fans) has a concrete geometric approach.

Self-reply: Here's a comparative review of Noble+Daniel vs. Strang: https://pdf.sciencedirectassets.com/271586/1-s2.0-S002437950...

Hey thanks. I am quite surprised to meet a fellow Nobler. I thought I was the only one. I self studied the material and no one in my social circle had read it from Noble.

I'm still a beginner and I think Strang's linear algebra books are more like supplementary material to his lectures. If you need to build a solid theoretical foundation of linear algebra you'd need to consider other resources too.

Having said that, he explains many things really well and that helps a lot in building intuition. He is always careful to flag things that are computationally inefficient and to suggest alternatives.

Exercises are too hard for me personally. I'd prefer a more laborious set of exercises helping to cement the material, (as in calculus or usual algebra) and then have one or two problem solving puzzles at the end.

So the focus is on different recipes to cook a matrix with? Different operations one can do on a matrix?

I hope it's not just that; that would be very limiting considering what linear algebra is about and capable of.

No, his books are not recipes. The thing I'm struggling to communicate here is that he has a more pragmatic style compared to other textbooks. The material he presents is complete, and he does a great job making it approachable for non-mathematicians.

His books usually expand on the subjects he presents at his online lectures. I see them as advanced lecture notes.

No worries, not your fault; it's really on me (to read the books). It's not really fair of me to ask for a comprehensive description in comments. Thanks for your comments anyhow.

I read his book and I would say it works in complement with his lectures. It is not good enough on its own.

I've been wanting to learn linear algebra. I had some exposure in college along with my calc classes, but never really understood it fundamentally. As mentioned above, I mostly did matrix transforms but never fundamentally grasped them.

I started doing LA on Khan Academy, and checked out Linear Algebra Done Right. LADR was a little too much into the deep end for me. KA seemed to be good. One nice thing about KA is that when I didn't quite remember something (e.g. how exactly to multiply matrices) I could just go to an earlier pre-LA lesson, pick it up, and then go back to LA where I left off. I'm a few lessons in.

What do you all recommend for someone like me?

If you want to learn LA by coding, coding the matrix by philip klein is a really good book. He even has his Coursera lecture videos (not available on coursera anymore last I checked) up on his website.

I'm currently trying to grok the finite element method. Gilbert Strang's explanation of the transition from the Galerkin method to FEM did more for me in terms of connecting the dots than anything else I could find on the web. And it wasn't even a lecture, just a kind of an interview. I think it's this one: youtube.com/watch?v=WwgrAH-IMOk


I feel like I don't really understand his explanation, because it's kind of vague. But I think that might be because you've seen the equations dozens of times, and I haven't seen them at all, so you were prepared to understand the video.

This makes sense. As said, it was about connecting the dots for me. Also, I don't even claim I fully understand FEM (or even Galerkin), it's just my hobby project.

That sounds interesting! What are you doing with it?

I just want to understand the magic behind static stress analysis. More generally I'm interested in emulating physics behind the rigid body model.

Maybe I will create a game prototype based on the mechanics but this is just a vague idea.

A 2018 paper by Strang about this approach:


I like all Strang's books... at least the ones I have. I don't have his Learning from Data book, yet... however.

I also really like the applied linear algebra book by Boyd and Vandenberghe: https://web.stanford.edu/~boyd/vmls/ A free PDF is available on their website. There are Julia and Python code companions for the book, and lecture slides from both profs on their websites. Also check out their other books, many of which have free PDFs available.

I can also recommend Data-Driven Science and Engineering by Brunton and Kutz. http://databookuw.com/ There used to be a free preprint PDF of the book but I can't find it now. The book is totally worth picking up... MATLAB and Python code available. Steve Brunton's lectures on YouTube are pretty damn good and complement the book well: https://www.youtube.com/channel/UCm5mt-A4w61lknZ9lCsZtBw/fea...

Another really cool book is Algorithms for Optimization by Mykel Kochenderfer and Tim Wheeler: https://mitpress.mit.edu/books/algorithms-optimization. Julia code used in book.

Back in uni (2005), we used Dr. Strang's text for linear algebra. When reading the text, I felt like some down-to-earth professor was trying to explain these difficult topics as simply as possible. I remember discovering mit.edu back then and finding precious video lectures that went along with the book after the course. One of the very few times I was so genuinely happy and excited to watch math lectures online :p

We used his book then too, at Drexel in Philadelphia. Our prof at the time invited Dr Strang to guest lecture one time and I remember it being so clear and obvious as he talked that I thought “wow this is why an MIT education is so revered”.

I waited after the lecture to personally thank him and have him autograph the textbook; very glad I did in retrospect.

I started by only reading his book, thinking it was enough. I was very wrong; these videos marry themselves beautifully with the content of the book, which suddenly became incredibly clear once I started watching them. Strang's teaching style can also seem odd at first, but don't give up; he is an amazing teacher who makes every concept simple to understand. This course is a true gift.

Wow, Prof. Strang is 85 and still teaching! That's very impressive and inspiring.

I've been doing the Statistics Micromasters from MIT. It's rigorous and very deep. I look forward to doing this.

For anyone who is already familiar with Prof. Strang's lectures from previous years, the main new thing in this five-lecture mini-series is that he tries to condense the material even further—maximum intuition and power-ideas, instead of the full-length in-class lecture format with derivations. This makes the material difficult to understand for beginners, but it makes a great second source in addition to or after a regular LA class.

One of the interesting new ways of thinking in these lectures is the A = CR decomposition for any matrix A, where C is a matrix that contains a basis for the column space of A, while R contains the non-zero rows in RREF(A) — in other words a basis for the row space, see https://ocw.mit.edu/resources/res-18-010-a-2020-vision-of-li...

Example you can play with: https://live.sympy.org/?evaluate=C%20%3D%20Matrix(%5B%5B1%2C...

Thinking of A as CR might be a little intense as a first contact with linear algebra, but I think it contains the "essence" of what is going on, and could potentially set the stage for when these concepts are explained (normally much later in a linear algebra course). Also, I think the "A=CR picture" is a nice justification for where RREF(A) comes from... otherwise students always complain that the first few chapters on Gauss-Jordan elimination are "mind-numbing arithmetic" (which is kind of true...), but maybe if we presented the algorithm as "finding the CR decomposition, which will help you understand dozens of other concepts in the remainder of the course," it would motivate more people to learn about RREFs and the G-J algo.

Since y'all are code-literate, here is the SymPy function for finding the CR decomposition of any matrix A:

  def crd(A):
      """Compute the CR decomposition of the matrix A."""
      rrefA, licols = A.rref()  # RREF(A) and pivot column indices
      C = A[:, list(licols)]    # linearly indep. cols of A
      r = len(licols)           # = rank(A)
      R = rrefA[0:r, :]         # non-zero rows in RREF(A)
      return C, R

Test to check it works: https://live.sympy.org/?evaluate=A%20%3D%20Matrix(%5B%0A%20%...

Gilbert Strang taught me how to sanely multiply matrices. His Introduction to Linear Algebra is very approachable. It's a wildly different experience compared to the linear algebra courses I had at university; it actually makes sense and is fun!

Linear algebra was one of those classes I was forced to take in undergrad as an engineering requirement - only to end up appreciating it immensely later on when I realized how many real world problems can be converted to matrix operations.

Professor Strang’s lectures helped me greatly during my linear algebra class. I thoroughly appreciated his clear, coherent lecture style.

On another note, he is such a nice guy. 10/10.

Two things really made linear algebra click for me: representing camera projections in a computer vision class and spectral graph theory, which basically connects graphs with linear algebra. In both of these, it seems like linear algebra was taken from the electrical engineering domain into computer science, which better fit my perspective.

Can anyone explain why, in the first 5 minutes of part 1, he shows a 3 by 3 matrix multiplied by a 1 by 3 vector, yet verbally he pulls out of nowhere this idea that if you have _two_ 1 by 3 vectors that pass through the origin then their linear combinations can be represented by a plane? The jump from 3D to 2D had me lost and I gave up.

The same concept applies in 2D, which might help you build the intuition to understand it in 3D.

If you have a vector v=(1,0) that points to the right, you can scale this vector infinitely in that direction by multiplying it by a positive scalar.

5v = (5,0)

62.1v = (62.1,0)

Similarly, you can scale that vector infinitely in the opposite direction (i.e. left) by multiplying it by a negative scalar:

-987v = (-987,0)

If we call this scalar c, the expression cv allows us to represent any point along the X axis simply by varying c, meaning that cv defines a line along that axis.

Similarly, we can do the same for a vector w=(0,1) along the Y axis, scaling it by d.

Now we have a method for moving to any point on the XY plane simply by varying c and d in the linear combination: cv + dw, meaning that we've defined a plane using two vectors.

Two caveats:

- this won't work if v and w are parallel; for example, if v = -w (and neither are zero) then we can only move along a line instead of a plane

- it also won't work if either of the vectors are zero, because no matter what you multiply by, a zero vector can only represent a single point
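The argument above can be checked numerically. Here's a minimal NumPy sketch (not from the original comment; the vectors v and w are the ones described above):

```python
import numpy as np

# The v and w from the comment: unit vectors along the X and Y axes.
v = np.array([1.0, 0.0])
w = np.array([0.0, 1.0])

# Any target point (x, y) is reached by the combination c*v + d*w;
# solving the 2x2 system recovers the scalars c and d.
target = np.array([3.5, -2.0])
c, d = np.linalg.solve(np.column_stack([v, w]), target)
assert np.allclose(c * v + d * w, target)

# First caveat: parallel vectors span only a line, so the matrix
# built from v and -v is singular (rank 1, not 2).
assert np.linalg.matrix_rank(np.column_stack([v, -v])) == 1
```

The same `np.linalg.solve` call works for any pair of non-parallel vectors, not just the axis-aligned ones.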

If you multiply the 1x3 vector by all scalars from -infinity to infinity you get all the points on a line.

If you do the same for another 1x3 vector, and it is not parallel to the first, you get all the points on a different line.

These two lines define a plane (and the cross product of the two vectors defines its normal vector)
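A quick NumPy sketch of that last point (my own illustration, not from the comment): the cross product of the two spanning vectors is perpendicular to every linear combination of them, i.e. to the whole plane.

```python
import numpy as np

# Two non-parallel 3-vectors; their scalar multiples trace two lines
# through the origin, and together they span a plane.
u = np.array([1.0, 2.0, 0.0])
v = np.array([0.0, 1.0, 3.0])

n = np.cross(u, v)  # normal to the plane spanned by u and v

# The normal is perpendicular to every combination c*u + d*v.
for c, d in [(1, 0), (0, 1), (2.5, -1.3)]:
    assert abs(np.dot(n, c * u + d * v)) < 1e-12
```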

I don’t know which video you’re talking about but two non parallel vectors are enough to represent a plane, the normal will be their cross product.

Also, if you have 3-dimensional vectors, you were always in 3D.

I recently came across this rather in-depth series on linear algebra: https://www.youtube.com/playlist?list=PLlXfTHzgMRUKXD88IdzS1.... FWIW, I myself have only gone half-way through part 1.

An interesting alternative to linear algebra is geometric algebra. I recommend googling around a bit for geometric algebra and trying out my implementation https://github.com/chakravala/Grassmann.jl

my linear algebra professor began his first lecture with a 5 minute rant informally titled “how you could have gotten an A in differential equations last semester without ever having taken calculus”

That certainly got our attention. I’ve always found linear algebra to be kind of ... almost soothing.

That's a rant I'd like to hear.

"look guys, at end of class, exam is always long list of silly second order system of differential equations.

Well, every time, we can make the so-called "guess" that the solution looks like e^(rt). Why? We know that because professors will only give well-behaved systems on final exams because it's hard to grade the other kind.

So we know characteristic polynomials look like so (because of course they do, you can just memorize this) ... so now we lift out the coefficients into nifty thing called _matrix_ and now follow these easy four steps to get roots, plug back in, and incidentally these are "eigenvalues", we'll talk about this later ...

Bam. Done. A-, easy. No sweat."
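The "rant" above can be sketched in a few lines of SymPy (my own hypothetical example: the equation y'' + 3y' + 2y = 0, rewritten as a first-order system x' = Ax with x = (y, y')):

```python
import sympy as sp

# Companion matrix of y'' + 3y' + 2y = 0: x1 = y, x2 = y',
# so x1' = x2 and x2' = -2*x1 - 3*x2.
A = sp.Matrix([[0, 1], [-2, -3]])

# The "guess" y = e^(rt) gives the characteristic polynomial
# r^2 + 3r + 2 = 0, whose roots are exactly the eigenvalues of A.
r = sp.symbols('r')
char_poly = sp.factor(A.charpoly(r).as_expr())
print(char_poly)      # (r + 1)*(r + 2)
print(A.eigenvals())  # roots r = -1 and r = -2
```

The general solution is then a combination of e^(-t) and e^(-2t), which is the "plug back in" step of the rant.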

Minus the matrix bits, that's basically how I slogged through DiffEq. Showing up to class was pointless because the prof would make up an equation to exploratively solve and inevitably it would be poorly behaved and the lecture would end with "... And and and for this kind of problem we have to use numerical methods".

IMO, if one is interested in a computational approach to Linear Algebra, Trefethens book, Numerical Linear Algebra, is the best.

That book discusses the actual algorithms used for computation. It is a bit more advanced, but amazingly clear.

What would you recommend as a good resource for learning about Linear Algebra in 2020?

I am aware of his course on OCW, but wondering is there something more interactive and/or newer than those lectures that has similar quality.

Frankly, coupled with this book, it hardly gets better. You still have the 3blue1brown[1] video series, but it only scratches the surface.

[1] https://www.youtube.com/watch?v=fNk_zzaMoSs&list=PLZHQObOWTQ...

Picked up Strang's linear algebra book recently and I'm enjoying it. I've been consistently impressed with the content of MIT books.

Just curious... what are the use cases where Linear Algebra is really applied? Any domain of software development?

Any fields that have anything to do with video, image, audio, games, machine learning.

Just to have a taste of the use cases: compression, filters (image filters for de-noising, HP & LP filters for audio), encoding/decoding, computer vision techniques, cryptography, neural nets, computer graphics (this is where most people learn how to use it in real computer programs).

It's applied pretty much everywhere. Most numerical problems have some linear algebra component to them. Physics uses it a lot too. A lot of non-linear problems have a linearization on which you can use linear algebra to obtain approximations. Ideas from linear algebra are used a lot in things like signal processing, quantum mechanics, etc.

- scale or rotate an image.

- root finding algorithm with more than one variable.

- graph problems like Google's PageRank

- statistical analysis

- 3d rendering (projecting a 3d scene onto a 2d image)

- solving systems of equations (also see linear programming)
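As a tiny illustration of the first item in that list (rotating an image's coordinates), here's a hedged NumPy sketch of a 2D rotation matrix (my own example, not from the comment):

```python
import numpy as np

theta = np.pi / 2  # rotate 90 degrees counterclockwise
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

p = np.array([1.0, 0.0])  # a point on the x-axis
rotated = R @ p           # lands on the y-axis: (0, 1)
```

The same matrix-vector product applied to every pixel coordinate is, conceptually, how an image rotation works.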

Linear algebra is very basic and fundamental to physics and math.

Every science degree covers algebra in its first course; it should be regarded as something pretty basic. How come people are still talking about this?

It was a good course I watched it online but I didn't understand much.

As someone who understands nothing of linear algebra, I have to say this "introduction" was gibberish. He may be a fantastic teacher, and perhaps it's a bit much to expect a 4 minute video to teach me anything, but it reminds me of talks from business people where what they're saying is obvious if you already understand it, and completely obscure otherwise.

This is not a course, a primer, or an introduction to Linear Algebra.

From the course description:

> These six brief videos, recorded in 2020, contain ideas and suggestions from Professor Strang about the recommended order of topics in teaching and learning linear algebra.
