Linear algebra tutorial in four pages 256 points by ivan_ah on Dec 10, 2013 | hide | past | web | favorite | 111 comments

 A few minor mistakes (perhaps just in my eyes), but overall pretty good. The hardest part about teaching linear algebra is that nobody explains the big picture. I teach mathematics and computer science and regularly tutor linear algebra students, and I encounter students all the time who ask me "What are vectors good for? I thought all we cared about were matrices and doing RREF and stuff." For this reason, I deemphasize computations and emphasize the connection between linear maps and matrices. It can be summed up as follows: if you fix a basis, every linear map is a matrix and every matrix is a linear map, and operations on functions (like composition, inversion, whatever) correspond to operations on the corresponding matrices. It's definitely not an analogy or anything in "scare quotes" that would imply something different is going on behind the scenes. It's exactly the same thing. Other questions usually left out of discussions about linear algebra (and these notes): what are orthogonal vectors good for? Why would we ever want a basis besides the standard basis in real Euclidean space? Is Euclidean space the only vector space out there? Do vectors have to be lists of numbers?
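The "fix a basis, then linear map = matrix" correspondence can be checked in a few lines of numpy. This is a sketch with a made-up map f (a 90-degree rotation of the plane): the matrix of f is built by applying f to the standard basis vectors, and then applying the function and multiplying by the matrix agree on every vector.

```python
import numpy as np

# Hypothetical linear map: rotate the plane by 90 degrees.
def f(v):
    x, y = v
    return np.array([-y, x])

# A linear map is pinned down by where it sends the basis vectors:
# column j of its matrix A is f(e_j).
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
A = np.column_stack([f(e1), f(e2)])

v = np.array([3.0, 4.0])
# Applying the function and multiplying by the matrix give the same answer.
assert np.allclose(f(v), A @ v)
```

The same check works for any linear f; for a nonlinear f, no matrix reproduces it, which is exactly the point of the correspondence.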
 > What are orthogonal vectors good for?

Any set of n linearly independent vectors B_a = {\vec{a}_i}_{i=1..n} in a vector space can be used as a coordinate system, but the "computational cost" of finding the coordinates w.r.t. the basis B_a will be annoying. Each time you want to find the coordinates of a vector you have to solve a system of linear equations. A basis consisting of orthogonal vectors {\vec{e}_i}_{i=1..n} is way cooler because you can calculate the coefficients of any vector using the formula v_i = (\vec{v} · \vec{e}_i)/||\vec{e}_i||^2. Of course the best thing is to have an orthonormal basis B_s = {\hat{e}_i}_{i=1..n}, so that the coefficients of a vector w.r.t. B_s can be calculated simply as v_i = \vec{v} · \hat{e}_i, a projection onto the \hat{e}_i subspace.

> Why would we ever want a basis besides the standard basis in real Euclidean space?

Hmm... I'm thinking eigenbases? The action of a matrix A on vectors expressed in terms of its eigenbasis is just a scaling, i.e. {B_e}[A]{B_e} = Q^{-1} {B_s}[A]{B_s} Q = diagonal matrix = Λ.

> Is Euclidean space the only vector space out there?
> Do vectors have to be lists of numbers?

Good point. If I bump this to 5-6 pages, I will try to include something about generalized vector spaces. Orthogonal polynomials could be a good one to cover.

 Computer graphics is full of fun topics that answer these questions in accessible and visible/tangible ways! Skeletal character rigging in particular motivates a nice understanding of orthonormal bases and why a basis other than the identity matrix is very useful. Even simple 2d conversions between screen space & pixel space, for example, can be useful motivational examples: it's the thing your web browser did to render what you're reading. :) Not that anyone would want to cover those topics in a 4-page linear algebra primer, but maybe there is some inspiration there for ways to include pictures that explain, instead of more math notation...
;) The last 2 questions there are interesting and worthwhile topics for the interested student, but I'd say, just my $0.02, don't ruin a good thing by stuffing too much into it...
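The dot-product formula from the orthogonal-basis answer above can be sanity-checked in a few lines of numpy. The orthonormal basis here is made up (the standard basis rotated by 45 degrees): each coefficient is just a dot product, with no linear system to solve.

```python
import numpy as np

# Hypothetical orthonormal basis for R^2: standard basis rotated 45 degrees.
e1 = np.array([1.0, 1.0]) / np.sqrt(2)
e2 = np.array([-1.0, 1.0]) / np.sqrt(2)

v = np.array([2.0, 3.0])

# With an orthonormal basis, v_i = v . e_i -- just dot products.
v1, v2 = v @ e1, v @ e2

# Reconstructing v from its coefficients recovers the original vector.
assert np.allclose(v1 * e1 + v2 * e2, v)
```

With a merely orthogonal (not orthonormal) basis, the same works after dividing each dot product by ||e_i||^2, as in the formula above.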
 > Each time you want to find the coordinates of a vector you have to solve a system of linear equations.

This is wrong, unless I'm misreading you. The first time you want the coordinates of a vector in a nonstandard basis you need to construct the matrix that does the change of basis, which requires inverting a matrix (same cost as solving an LSE), but after that, it's just plain matrix-vector multiplication.
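The invert-once-then-multiply point above can be sketched in numpy with a made-up non-orthogonal basis: the inversion cost is paid a single time, and every subsequent coordinate lookup is a plain matrix-vector product.

```python
import numpy as np

# Hypothetical non-orthogonal basis for R^2, stored as the columns of B.
B = np.array([[1.0, 1.0],
              [0.0, 2.0]])

# Pay the inversion cost once...
B_inv = np.linalg.inv(B)

# ...then the coordinates of any vector are a plain matrix-vector product.
for v in [np.array([3.0, 4.0]), np.array([-1.0, 5.0])]:
    coords = B_inv @ v
    # Sanity check: the coordinates reproduce v in the basis B.
    assert np.allclose(B @ coords, v)
```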
 I knew the answers to these questions :) I'm just saying they belong in a discussion of the topics. Maybe I just like to take math slower than a 4-page summary.
 maybe bump it to two different handouts?
 This is my favorite introductory textbook on linear algebra which goes to great lengths to avoid matrix and determinant computations:
 I read Axler's book quite recently, after poring over reviews of linear algebra books. I did like it quite a lot -- it has a very clean, approachable narrative style. Having completed it, though, I feel there's maybe not enough operational knowledge gained from it. Meaning, it explains the theory and the motivations well, but one might augment it with Schaum's[1], or some such guided exercises.
 Thanks, I have been doing a lot of machine learning hacks lately and almost everything involves "vector spaces". 249 pages should be fun; just finishing up "Think Bayes" by Allen Downey.
 Axler is not a text in applied linear algebra, it is a book you should read after you are already familiar with both basic abstract algebra and basic linear algebra (i.e. you know how to multiply matrices etc.), it then provides a nice motivation for linear algebra as a subfield of abstract algebra. You may still, however, not find the ideas of determinant, trace and other practical concepts very intuitive after reading Axler.
 Basic abstract algebra? Definitely not needed to use that book to its full potential.
 > Why would we ever want a basis besides the standard basis in real Euclidean space?

I think it can be explained from an SVD point of view: the standard basis is not likely to be the optimal one, and doesn't capture the most variance. The basis vectors with larger variance will presumably contain more information. After obtaining a good basis, we can approximate the original matrix using only the vectors (often much smaller in number than the standard basis) with the larger variances. This is useful for finding a shorter representation that still preserves much of the distance information. I think this post explains the intuition pretty clearly:
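The SVD point above can be sketched with numpy on synthetic data: a matrix that is nearly rank 2 is rebuilt from just its two largest singular directions, and the low-rank version captures almost all of it.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 50x20 matrix that is approximately rank 2, plus a little noise.
A = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 20))
A += 0.01 * rng.normal(size=A.shape)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the two directions with the largest singular values
# (the "basis with larger variance" from the comment above).
k = 2
A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]

# The rank-2 approximation captures almost all of A.
rel_err = np.linalg.norm(A - A_k) / np.linalg.norm(A)
assert rel_err < 0.05
```

Storing A_k needs only 50*2 + 2 + 2*20 numbers instead of 50*20, which is the "shorter representation" idea in miniature.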
 Couldn't agree more! "Matrix is a function" is a super useful analogy, especially for us programmers, who deal with functions every day.
 Well, it's not an analogy! There's a 1-1 correspondence between matrices and linear maps over a vector space with a fixed basis. "A matrix represents a function" is no more an analogy than "The symbols 'V', '5', and '五' represent the number five" is an analogy.It's more like matrices are a useful way of representing certain functions, just like tally marks are a more useful way of representing numbers when you're counting the number of guests as they arrive but arabic numerals are more useful when you're doing arithmetic. Because these functions — linear maps — have a special structure to them, we can define matrix "multiplication" in a way that corresponds to the composition of linear maps.In other words, it's not a happy accident. We invented matrices as a way to reason about and compute with linear maps more effectively. For example, "diagonal matrices" and "upper-triangular matrices" (among others) play important roles throughout linear algebra, but without the matrix representation of linear maps it'd be hard to talk about these things.Here's the man you have to thank for putting all this together: http://en.wikipedia.org/wiki/Arthur_Cayley
 And matrix mult ~= function composition. I had a long aha moment reading this.
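That "aha" can be verified directly in numpy with two made-up linear maps: composing the functions (apply one, then the other) gives the same result as multiplying their matrices once and applying the product.

```python
import numpy as np

# Two hypothetical linear maps on R^2: a 90-degree rotation and a scaling.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])   # rotate
S = np.array([[2.0, 0.0],
              [0.0, 3.0]])    # scale

v = np.array([1.0, 2.0])

# Composing the functions: first rotate, then scale...
composed = S @ (R @ v)

# ...is the same as applying the single matrix product S @ R.
assert np.allclose((S @ R) @ v, composed)
```

Note the order: S @ R means "rotate first, scale second", mirroring function composition (S ∘ R)(v) = S(R(v)).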
 That's a good idea. Perhaps when teaching people familiar with SVG, you can start by showing them the transform matrix syntax and explaining what it does. https://developer.mozilla.org/en-US/docs/Web/SVG/Attribute/t...
 So... are you going to answer them?
 Gilbert Strang explains clearly why we need change of basis with an example of image compression here, http://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-...
 How about "Is Euclidean space the only vector space out there?" and "Do vectors have to be lists of numbers?" Perhaps with some examples?
 > "Do vectors have to be lists of numbers?"

No. They just have to satisfy the vector space axioms (greatly simplifying: you can add vectors together and multiply them by scalars, and they behave as you would expect: associative, commutative, distributive, etc. as appropriate). So, in this regard, functions can be vectors, if you define addition to be pointwise addition. For example, consider the set of continuous functions on the unit interval [0, 1]. Add two together and you get another continuous function on the unit interval, multiply by a scalar and you get another continuous function on the unit interval, etc.
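The function-space example above can be sketched with plain Python closures. This is a toy illustration (the helper names are made up): "vectors" are functions on [0, 1], addition is pointwise, and scalar multiplication scales the output.

```python
# Functions on [0, 1] as vectors, under pointwise operations.

def add(f, g):
    return lambda x: f(x) + g(x)      # pointwise vector addition

def scale(c, f):
    return lambda x: c * f(x)         # scalar multiplication

f = lambda x: x ** 2
g = lambda x: 1 - x

h = add(scale(3.0, f), g)             # the "vector" 3f + g

# Check the value at a sample point: 3*(0.5)^2 + (1 - 0.5) = 1.25
assert abs(h(0.5) - 1.25) < 1e-12
```

No list of numbers appears anywhere, yet all the vector space axioms hold for these objects.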
 It turns out that, if you restrict to spaces of finite dimension, then Euclidean space is the only vector space out there, up to your choice of labelling the basis elements and up to a certain linear transformation.In this sense, vectors can always be written as lists of numbers, but no, they don't have to be this in general. For example, you could have a vector space of smooth functions. It's infinite dimensional, and its elements (its vectors) are functions, not tuples of numbers.
 You must also add the condition that your scalars are real numbers. There are many finite dimensional vector spaces that are nothing like Euclidean space: any finite extension of a finite field, for example. This comes from Galois theory, but is not just abstract nonsense. One application to computing is if your scalars are only 0 or 1 and all operations are done mod 2, then you have a framework for doing error correcting codes, among other things. We call this set of scalars either the finite field of size 2 or the Galois field of size 2, aka GF(2).
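The GF(2) idea above can be sketched in numpy: addition is mod-2 (XOR), and a parity-check style computation is just a matrix-vector product over these scalars. The matrix H and the codeword below are made-up toy values, not a standard code.

```python
import numpy as np

# A hypothetical 3x6 parity-check-style matrix over GF(2).
H = np.array([[1, 0, 1, 1, 0, 0],
              [1, 1, 0, 0, 1, 0],
              [0, 1, 1, 0, 0, 1]])

codeword = np.array([1, 0, 1, 0, 1, 1])

# Matrix-vector product mod 2: same linear algebra, different scalars.
syndrome = (H @ codeword) % 2

# A zero syndrome means the word satisfies every parity check.
assert not syndrome.any()
```

This is the sense in which error-correcting codes are "just" linear algebra over the field of size 2.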
 Yes, what I should have said is: if k is an algebraically closed field then n-dimensional vector spaces over k are isomorphic to k^n. Unless I've had too much to drink this evening.
 It's a little surprising, in a "no-bullshit" discussion of "theoretical and computational aspects of linear algebra," to see matrix inversion touted as the way to solve linear equations. The guide literally introduces the examples by saying "Dude, enough with the theory talk, let's see some calculations." Yet standard numerical practice avoids literal inversion, in favor of factorization methods.

E.g., "It is common practice to write the form A^{-1}b for economy of notation in mathematical formulas... The trouble is that a reader unfamiliar with numerical computation might assume that we actually compute A^{-1}... On most computers it is always more effective to calculate A^{-1}b by solving the linear system Ax = b using matrix factorization methods..." (Dennis & Schnabel, "Numerical Methods for Unconstrained Optimization and Nonlinear Equations", section 3.2).

E.g., "As a final example we show how to avoid the pitfall of explicit inverse computation... The point of this example is to stress that when a matrix inverse is encountered in a formula, we must think in terms of solving equations rather than in terms of explicit inverse formation." (Golub and Van Loan, "Matrix Computations", section 3.4.11).
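The solve-don't-invert advice maps directly onto numpy: `np.linalg.solve` factorizes (LU) and back-substitutes, while forming the explicit inverse is both slower and generally less accurate. A small sketch on random data:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(100, 100))
b = rng.normal(size=100)

# Textbook notation: x = A^{-1} b.  Numerical practice: solve A x = b
# by factorization instead of forming the inverse explicitly.
x_solve = np.linalg.solve(A, b)        # LU factorization under the hood
x_inv = np.linalg.inv(A) @ b           # what the notation seems to suggest

# Both agree here, but solve() is cheaper and typically more accurate,
# especially for large or ill-conditioned systems.
assert np.allclose(A @ x_solve, b)
assert np.allclose(x_solve, x_inv)
```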
 Yes, one almost never inverts a matrix using a computer, and one never inverts a matrix by hand. What does that teach you? I was pretty surprised to see trivialities like this in a four-page summary. But, unfortunately, this is just a poor summary. It also barely covers the spectral decomposition, and has no mention at all of the singular-value decomposition.
 we do matrix inversion by hand in mathematics :D
 As someone with a math degree, I love this. However, I think the author over-estimates the familiarity of a typical high school student with mathematical notation:

> The only prerequisite for this tutorial is a basic understanding of high school math concepts

I think fundamentally the material in these four pages is accessible to many high school graduates, but perhaps not in this concise rendering (which is awesome for me, but probably overwhelming for someone not familiar with set-theoretic notation, summation notation, etc.).
 You are right. I am totally trying to sneak the \in and the \mathbb{R}'s in there to see if people will notice. Previously I had \forall in there too, but went over and removed things. On the TODO list is to introduce the { obj | obj description } notation for defining sets (in this case vector spaces).
 Well, I noticed! It stood out pretty clearly, I'd say. I really don't think that the intended audience of this sort of ground-up summary is going to be comfortable with \mathbb{R}^{m\times n} notation all over the place. Yes, you explain what the notation means, but there's zero chance that a reader who didn't already know that notation will become fluent in it right away: they're going to be flipping back to the definition every time. That might be okay if you're trying to teach all of math, but a person reading "Linear Algebra in Four Pages" presumably neither wants nor needs that full skill set. (I'd say that {obj|obj description} notation would make the text even more opaque to novices, no matter how cleanly introduced.) Why not just make a less precise reference to a table of "numbers" (which, as an added bonus, would leave the door open to complex matrices without having to start from scratch)?
 Twist: This is a great argument to teach notation earlier !
 A little observation: you repeatedly use language that fails to distinguish definitions & properties vs. effective computational methods. For example, in section G:

> To find the eigenvalue of a matrix we start from the eigenvalue equation ...

Solving the resulting equation is one way of computing eigenvalues. But it might not be the one you want to use in some practical situation. Just before that, in section F:

> The determinant of a matrix, ... serves to check if a matrix is invertible or not.

It is true that a square matrix is invertible iff it has nonzero determinant. It certainly is not true that, for a matrix of any size, computing the determinant is a good method for checking whether a matrix is invertible.
 > Solving the resulting equation is one way of computing eigenvalues. But it might not be the one you want to use in some practical situation.Related fun fact: Solving the eigenvalue equation is (in general) impossible in closed form for matrices 5x5 or larger, because there's no closed form solution to the quintic. So some other algorithm for finding eigenvectors and eigenvalues is often absolutely required in practice.
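In practice this means eigenvalues of anything 5x5 or bigger are computed iteratively, not from the characteristic polynomial. A quick numpy sketch (LAPACK's iterative QR-type algorithm does the work):

```python
import numpy as np

rng = np.random.default_rng(2)
# For a 5x5 matrix the characteristic polynomial has degree 5, so its
# roots have no general closed form; eigenvalues are found iteratively.
A = rng.normal(size=(5, 5))

eigs, V = np.linalg.eig(A)   # numerical eigenvalues and eigenvectors

# Each eigenpair still satisfies the defining equation A v = lambda v
# (V * eigs scales column j of V by eigs[j]).
assert np.allclose(A @ V, V * eigs)
```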
 Well, that depends on whether you want the exact eigenvalues.
 > It certainly is not true that, for a matrix of any size, computing the determinant is a good method for checking whether a matrix is invertible.

Can you explain this part? I don't understand what you mean, and I have an Advanced Linear Algebra final in a few days, so I should probably know this stuff. I thought the determinant was not defined for non-square matrices. Also, I thought that inverses only existed for square matrices (implying that the determinant check is a good way of testing invertibility).
 It's a valid way to test whether a (square) matrix has an inverse, but computing big determinants is a lot of effort (both from a "student work" perspective and from a "computational cost" perspective). The point is that in many cases, there are enormously more efficient ways to find an inverse (or, more generally, to solve matrix problems with inverses in them).
 Oh, okay. Thanks for the clarification.
 Good luck to anyone who has a linear algebra exam coming up!I also have a similar short tutorial for Newtonian mechanics here: http://cnd.mcgill.ca/~ivan/miniref/mech_in_7_pages.pdf
 LinAlg 2 coming up in two days. It's a fun course!
 I really enjoyed the Gilbert Strang videos on linear algebra back when I was taking the course at McGill: http://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-...
 He has an exceedingly dry sense of humor. It's quite nice, when you're 40 minutes into an involved lecture. The series is indeed worth the watch.
 Yes, very entertaining and enlightening (even if you already know the subject). Love Strang's teaching style.
 I agree, his lecture deriving the determinant from a few desirable properties is a classic in my mind.
 Just to throw this out there in the hope it is useful, I have been using this book as a review of linear algebra:
 +1This book is awesome! The entire text is interspersed with collapsable examples and exercises.
 Roger Horn is one of the best linear algebra and matrix guys around, and I was, by wide margins, the star student in his class, effortlessly.

For the notes, I found serious problems in just the first half of the first column on the first page. F'get about the four pages.

If enough people want an outline in a few pages, then I'll consider knocking one out and putting it up somewhere as a PDF file.
 I'd be happy to see three serious problems described in the first column of the first page.
 I'll take a crack at being as pedantic as possible. I'm just doing this to play along.

1) "Linear algebra is the math of vectors and matrices" - I think this would be better phrased as "the math of vectors and linear transformations." Matrices are a convenient way to represent finite-dimensional linear transformations, but I think it's putting the cart before the horse to say that linear algebra is the math of matrices. This is a minor problem, though.

2) "A vector \vec{v} ∈ R^n is an array of n real numbers" - In general vectors are not arrays of numbers, real or otherwise. This kind of definition will run into problems when you get into Hilbert/Banach spaces.

3) "the function f(x) = ln(x) has the inverse f^{-1}(x) = e^x" - this is only true for x > 0 if we're talking about the real-valued logarithm function, since it's not defined for x <= 0.
 Thank you, that is awesome. So are all the problems ones of a pedantic nature? For example I understood the inverse function description as a definition even though there was a range (which I also knew) for which it was invalid.Generally I consider those sorts of things to be the difference between writing 'proofs' versus explaining concepts.
 Yeah, I wouldn't really consider any of those things to be "serious" problems, at least for the level that this guide is pitched at.
 > 2) "A vector v⃗ ∈ R^n is an array of n real numbers" - In general vectors are not arrays of numbers, real or otherwise. This kind of definition will run into problems when you get into Hilbert/Banach spaces.Surely this is ok if it is parsed as "(a vector in R^n) is an array of n real numbers".
 That's better, but still not entirely correct. An array of real numbers is not itself a vector. It is a representation of a vector in a particular basis. This is a technical point but very important to grok, especially when starting to look at vector spaces other than F^n.
 > An array of real numbers is not itself a vector. It is a representation of a vector in a particular basis.

No, such an array can be either a vector or the representation (which also has to be a vector) you mentioned. To make such things make sense, you need a definition of a vector space. It can be tempting to omit that definition from an outline, but the definition is just crucial, e.g., when talking about subspaces, which are often important. E.g., for an m x n matrix A and n x 1 'vector' x, the set of all x such that Ax = 0 is a subspace, but to know this you need the definition of a vector space.
 (1) In R^n, none of n, R, or R^n has been defined.

Better: Let n be a positive integer, and let R be the set of real numbers. Then R^n is the set of all n-tuples over the real numbers. Such an n-tuple is a list of n numbers; e.g., (1, 3, 2.6, 1, 7) is an n-tuple for n = 5.

For positive integers m and n, an m x n (read "m by n") (real) 'matrix' is a rectangular array of real numbers with m rows and n columns. Typically, for integer i = 1, 2, ..., m and integer j = 1, 2, ..., n, in the m x n matrix A = [a_{ij}], a_{ij} is the number, or 'component', in row i and column j. An example of a 2 x 3 matrix is

  1  9  4
  4  0 -1

(typically surrounded with square brackets, not easy to type without, say, TeX).

Typically an n-tuple x in R^n is regarded as an n x 1 matrix. In this way, with an m x n matrix A, an n x 1 vector b, and matrix multiplication (we have yet to define) Ax, we can write Ax = b as a system of m linear equations in n unknowns x, and this is the most important foundation and application of linear algebra.

(2) Similarly, for positive integer n and the set of complex numbers C, C^n is the set of all n-tuples of complex numbers. And we can have an m x n matrix of complex numbers.

Once you get very far in linear algebra, complex numbers become essential, if only from the fundamental theorem of algebra (that is, from factoring a polynomial). In particular, eigenvalues are important, and they are often complex numbers.

(3) R^n and C^n are examples of 'vector spaces', but there are also many other examples.

For a 'vector space', we have a non-empty set V of 'vectors' and a 'field' F. Note: a 'field' is defined carefully in courses in abstract algebra, but the two leading examples in practice are R and C. Most commonly F = R or F = C, but there is also interest in other fields, e.g., the integers modulo a prime number. Here we assume that F = R or F = C.

On V we have an operation 'addition' denoted by +. So, for vectors x in V and y in V, there is a vector x + y in V.
Addition is commutative, so that x + y = y + x. Vector addition is also associative, that is, for x, y, z in V,

(x + y) + z = x + (y + z).

Also there is a vector 0 in V so that for any x in V we have 0 + x = x. Also, for x in V, there is -x in V so that x + (-x) = x - x = 0. So the set V with the operation + forms a commutative group (defined carefully in courses in abstract algebra).

On V with F we have an operation 'multiplication'. Then for x in V and a in F, with this 'multiplication' the 'product' ax is in V. For x, y in V and a, b in F, we have two distributive laws, (a + b)x = ax + bx and a(x + y) = ax + ay. Also, for 0 in F, 0x is the vector 0. For the vector 0, a0 is also the vector 0. We have that (-1)x = -x.

For some examples of vector spaces, (A) let F = R and let V be the set of all monophonic audio signals exactly 10 seconds long, (B) let F = R and let V be the set of all real-valued random variables X where E[X^2] is finite, and (C) let F = R and, for an m x n matrix A and n x 1 vector x, let V be the set of all x such that Ax = 0 (so far the matrix product Ax has not been defined, but it will be defined so that Ax = 0 is a system of m linear equations in n unknowns, the components of x).

(4) For a set S, x in S means that x is an 'element' of S.

(5) The notation (R, R, R) = R^3 is neither good nor standard. The notation R^3 = R x R x R as the set-theory Cartesian product is standard. Actually, we already know just what (R, R, R) is -- a 3-tuple, and not the set of all 3-tuples of real numbers.

(6) R^{m x n} is questionable notation and, anyway, needs a definition.

(7) For the "power" of that mathematics, might illustrate that with some applications. One of the most important is solving systems of linear equations. For more, a matrix can describe a linear transformation, and linearity is a powerful assumption that commonly holds with good accuracy in practice.
So vectors and matrices become powerful means of working with linear systems, where a matrix describes the system and the inputs and outputs of that system are vectors.

From G. F. Simmons, the two pillars of analysis in mathematics are continuity and linearity, and linear algebra emphasizes both, especially linearity. More examples of linearity include both differentiation and integration in calculus and the Fourier transform.

(8) The definition of inverse function is wrong. That definition holds only when the given function f is 1-1. Even if f is not 1-1, the inverse is still useful, but its values can be sets of values, that is, the inverse is multivalued. E.g., just take f(x) = x^2. Then f^{-1}(1) = {-1, 1}.
 And modest too.
 Would you mind dropping the titles of those books. I think I found the P Halmos (http://www.amazon.com/Algebra-Problem-Dolciani-Mathematical-...), but I'm having some trouble finding the Nearing.Also, what background do you think is necessary for these texts?
 > I'm having some trouble finding the Nearing.In my first post here, I misspelled the name. As in my first response to your post, the correct spelling is Nering.> Also, what background do you think is necessary for these texts?The background to get started in linear algebra is essentially just high school algebra. In college, linear algebra is commonly the next course after calculus.For more, after linear algebra, commonly there is a course in 'analysis' such as Rudin's 'Principles' in my list. Without a good course, this book would be tough reading.The applications of linear algebra are so broad that for a good sampling of those applications the prerequisites in total are essentially at least a good undergraduate major in mathematics.
 Modded down because I found two useless statements in the first two sentences of your post.
 I'd appreciate that. My linear algebra is extremely basic (all picked up as needed for an AI course). Any well-rounded ground up resources would be awesome.
 I'd be keen to see your outline.
 If this post is representative, I want to read as little of your writing as possible.
 yet here you are posting on an internet forum
 Roger Horn is one of the best linear algebra and matrix guys around, and I was, by wide margins, the star student in his class, effortlessly.That's good, because you clearly wouldn't have been the star student in English class.
 As English, there's nothing wrong with what I wrote.
 and I was, effortlessly and by a wide margin, the star student in his class.
 Please do. I got lost in section B.
 If I write a PDF file outline of linear algebra, where can I post it?Any suggestions?
 You could make a Dropbox file public with little effort, or create a Scribd account and post it there. Thank you in advance.
 Thanks.
 interested!
 erk, I don't think condensing information into the smallest space possible is the best way to learn something (or even review it). I'm all for a no-bullshit and quick way to get something. That's why I sometimes check learnXinYminutes.com or some random cheatsheets on Google. But this doesn't do it for me.

Btw, if you really want to get a good grasp on Linear Algebra you should check Gilbert Strang's video courses on MIT OpenCourseWare. They are amazing and soooo easy to understand you don't even need to like mathematics to watch them. I haven't come across a better resource to start with Linear Algebra.
 Course codes 18.06SC and 18.06, I think, if anyone else was looking. Thanks for the pointer, really appreciated!
 Got excited for a moment... then did some prerequisites digging and ended up with a prerequisite dependency chain:

18.06 -> 18.02 -> 18.01

where 18.02 = Multivariable Calc and 18.01 = Single Variable Calc.

Considering that my whole point of learning linear algebra was to clear it as a roadblock for Machine Learning, this is what my whole dependency chain looks like:

Machine Learning -> Linear Algebra -> Multivariable Calc -> Single Variable Calc -> high school algebra and trigonometry

I have a feeling I'll end up sticking to being a web developer :)
 Calculus gets a bad rap for being difficult, but if you're learning on your own, you can just focus on the ideas and not on the arduous computation (which something like maxima can do for you). The core ideas of calculus can probably be learned in a week. Review all the trig on Khan Academy, then try watching some calculus lectures, focusing on the big ideas, not memorizing rules for computing derivatives or integrals.BTW, you probably don't need much calculus to learn most of the linear algebra you need; those requirements are mostly there for mathematical maturity, plus then being able to assign more interesting exercises.
 Gilbert Strang made a great series of lectures on the big ideas of calculus sans most of the computation - http://www.youtube.com/playlist?list=PLBE9407EA64E2C318
 You don't really need those calc courses to learn linear algebra. It's more about "mathematical maturity": familiarity with vectors, functions, inverse functions, a general intuition about multidimensional coordinates, and so on. A programming background would probably go a long way (although I don't speak from experience; I learned a lot of math before I started programming).It's hard to overstate the importance of linear algebra in software. It's really worth learning.
 I think the calc course prerequisites have an even simpler explanation: mathematicians don't usually manage to mention the existence of vectors until the multivariable calculus class. If you're already familiar with that idea, you shouldn't need to know anything about partial derivatives and surface integrals to understand linear algebra!
 Don't look at the prerequisites and dig right in. I don't think you'll have any problems. Linear Algebra is something that kinda stands on its own.

> this is what my whole Dependency chain looks like

You should not think like that when learning. If you want to be efficient, just take shortcuts. Learn machine learning (btw there are great courses on Coursera or from Stanford about that), and if you're stuck on something just check Wikipedia or another resource.
 Why do linear algebra teaching materials never mention what a determinant is?(It's the product of the eigenvalues.)
 > It's the product of the eigenvalues.

It sure is (provided you count multiplicities correctly), but that's not the one-sentence explanation I would have gone with, even to someone with much more intuition about eigenvalues than you'd get from this document. I would say the determinant is the volume scale factor (or hypervolume scale factor, in the general case).

Actually, what's really interesting to me is your general point: why do teaching materials never mention what a thing really is? And why does this question not get asked more often? The nearest I've been able to come to a plausible answer is a mixture of (i) the key insight varies from person to person more than you'd think, combined with the closely related (ii) you can't teach key insights just by saying a few words.
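Both readings of the determinant mentioned in this exchange can be verified numerically. A sketch on a random 3x3 matrix: (i) det(A) equals the product of the eigenvalues, and (ii) |det(A)| equals the volume of the parallelepiped spanned by A's columns (the image of the unit cube), computed here via the Gram-determinant formula.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(3, 3))

det = np.linalg.det(A)
eigs = np.linalg.eigvals(A)

# (i) det(A) = product of the eigenvalues (counted with multiplicity).
assert np.isclose(np.prod(eigs), det)

# (ii) the Gram-determinant formula for the volume of the parallelepiped
# spanned by the columns of A gives back |det(A)|: the volume scale factor.
volume = np.sqrt(np.linalg.det(A.T @ A))
assert np.isclose(volume, abs(det))
```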
Thank you for your comment! First off, the explanation regarding det(A) = the volume scaling of the transformation T_A associated with the matrix A. I've been stuck for over a week now trying to word this any other way because, in the current ordering of the sections, I cover determinants before linear transformations. Perhaps there is no better one-sentence explanation than talking about the volume, and I should reorder the sections...

You've raised a very important point regarding "routes to concepts" which really should be asked more often!

> (ii) you can't teach key insights just by saying a few words.

Generally true, though we have to say that the key difficulty in communicating insights is lacking prerequisites. Therefore, if you think very carefully about the prerequisite structure (i.e., a model of the reader's previous knowledge) you can do lots of interesting stuff in very few words.

> (i) the key insight varies from person to person more than you'd think,

Let G=(V,E) where V is the set of math concepts, and E are the links between them. Then there are as many ways for a concept x to "click" as there are in-links for x! In this case we have at least three routes:

Route 1 (geometric): a lin. trans. T_A = {matrix: A, B_in: {e1,..,en}, B_out: {f1,..,fn}} ---> what happens to a unit 1x1x...x1 cube, represented (1,...,1) w.r.t. B_in, after going through T_A? ---> the output is (1,...,1) w.r.t. B_out ---> observe that the cols of A are f1..fn, therefore det(A) = (hyper)volume of the (hyper)parallelepiped formed by the vectors f1..fn.

Route 2 (computational): det(A) = {formula for 2x2, formula for 3x3, ...} ---> an easy test for invertibility of an n-by-n matrix. Sidenote: see Cramer's rule for another application of dets.

Route 3 (algebraic): given a 2x2 matrix A, find coefficients T and D that satisfy the matrix equation A² - T·A + D·I = 0. The linear coefficient is the trace T = Σ_i a_ii, and the constant term is D = det(A). Sidenote: λ² - T·λ + D = {characteristic poly. of A} = det(A - λI).

So perhaps the task of teaching math is not so much to find "the right" or "the best" route to a concept, but to collect many explanations of routes (edges in G), and then come up with a coherent narrative that covers as many edges as possible (while respecting the partial order of prerequisites).
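The algebraic route is the 2x2 case of the Cayley–Hamilton theorem, and it is easy to check numerically; a small NumPy sketch (my illustration, not from the thread):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

T = np.trace(A)        # sum of the diagonal entries, here 5.0
D = np.linalg.det(A)   # here -2.0 (up to float error)

# Cayley-Hamilton for 2x2: A satisfies its own characteristic polynomial,
#   A^2 - tr(A)*A + det(A)*I = 0
residual = A @ A - T * A + D * np.eye(2)
print(np.allclose(residual, 0))  # True
```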
Uh, so you think this varies significantly from person to person? I don't know about you, but I think

> The determinant of a matrix is the product of its eigenvalues, and calculating it allows you to check if the matrix is invertible or not.

is a heck of a lot better than

> The determinant of a matrix is "a special way" to combine the entries of a matrix that serves to check if a matrix is invertible or not.

to pretty much any person.
 I love a lesson that doesn't include ANY real world examples. What's the purpose of this document? Does it accomplish that purpose?
I'll have to add some examples, yes >> TODO.txt

purpose = to give a quick intro to the subject (the most important idea at least, namely, that the matrix-vector product can represent linear transformations).
Suppose that you are driving a car and you hit a big tree. The main eigenvector is in the direction from the car to the tree, and the eigenvalue measures the deformation of the car produced by the accident. If the car is half as long after the accident, then the eigenvalue is 1/2.
 Wow amazing stuff here, I agree with you that most math textbooks in our education system fail to explain concepts which are supposed to be very simple. Thank you so much for making things simpler. I'm having MATH 270 final next week at McGill, this is going to help a lot in studying :)
No pseudo-inverse, no singular value decomposition, no least-squares regression, no k-nearest neighbors, and I'm sure I'm leaving out many other things: all fundamental linear algebra very much needed for understanding and developing computer science (e.g. machine learning, data storage and compression), telecommunications (e.g. queueing theory, multimedia coding and streaming) and many other fields related to engineering. I love/hate maths (I really struggle with it), but I honestly think our linear algebra book back in grad school was 600 pages for a good reason; you just can't do proper engineering without it. (Also I have another really thick book on the most important numerical methods for implementing that algebra on a computer, so, come on! :-))
This is really basic. No one (who uses the subject) should need a cheat sheet for this... Also, there are really good books out there, and linalg is a must nowadays. As a textbook: http://www.amazon.com/Linear-Algebra-Right-Undergraduate-Mat... As a reference: http://www.amazon.com/Matrix-Analysis-Roger-Horn/dp/05213863...
Suppose that you want to project a vector of data (x1, x2, x3, ..., xn) onto the one-dimensional subspace generated by the vector (1, 1, ..., 1). What is the projection in this case?

Hint: you obtain the most important concept of statistics.
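For the record, here is a NumPy sketch of that projection (my illustration, not from the comment). The projection coefficient (x · u)/(u · u) with u = (1,...,1) works out to sum(x)/n:

```python
import numpy as np

x = np.array([3.0, 5.0, 7.0, 9.0])  # some data vector
u = np.ones_like(x)                  # the vector (1, 1, ..., 1)

coef = (x @ u) / (u @ u)             # projection coefficient = sum(x)/n
proj = coef * u                      # the projected vector

# The coefficient is exactly the sample mean of the data.
print(np.isclose(coef, x.mean()))  # True
```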
 I wish I had this in grad school when I had forgotten all my undergrad Linear Algebra. It's not a very hard or deep aspect of math if you understand the fundamentals, so this is very useful.
Courses would benefit from quick-reference guides like the one at the end of Dror Bar-Natan's paper on Khovanov homology http://arxiv.org/abs/math/0201043 He says "It can fit inside your wallet."
Very cool, I like the title of the associated book, "No Bullshit Guide To Linear Algebra", too. Does anyone know of a nice, short summary of discrete mathematics to go along with this?
Studying at a university where all this and more is part of the first-year education of computer scientists, I - probably foolishly - assumed basic linear algebra was common knowledge in the community. Now my interest is piqued: which educational path / career path did you take to end up in IT, and (more specifically) how much mathematical education did it include?
 I bought this guy's previous book and the print quality was crap. I hope he improves this on his next book.
On page 3, section B, Using elementary matrices: to remember which matrix corresponds to a row operation, just apply that operation to the identity matrix and you obtain the associated elementary matrix.
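That trick is easy to demonstrate; a NumPy sketch (my own example, not from the tutorial) for the row operation R2 <- R2 - 3*R1:

```python
import numpy as np

# Apply the row operation R2 <- R2 - 3*R1 to the identity matrix:
# the result E is the associated elementary matrix.
E = np.eye(3)
E[1, :] = E[1, :] - 3 * E[0, :]

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

# Left-multiplying by E performs the same row operation on A.
B = A.copy()
B[1, :] = B[1, :] - 3 * B[0, :]   # the row op applied directly to A

print(np.allclose(E @ A, B))  # True
```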
Here is a 1914 book on calculus: Calculus Made Easy by Silvanus P. Thompson.
 thanks for the post, i've been trying to learn linear algebra for a long time but keep getting stuck / bored.
For me what makes it really interesting is the realization that all of linear regression (think y = mx + b) is based on linear algebra, specifically the notion that figuring out a best-fit line (in the case of a single-dimensional input variable) is a projection of the N-dimensional vector of observations (where you have N data points) onto the best-fitting 2-dimensional subspace (assuming you're fitting a slope and an intercept). When thought of this way, a lot of linear algebra has geometric interpretations, and for me this makes it a lot less abstract.
 Nearly everything in linear algebra has important geometric interpretations.In addition, a lot that works in the 3-space we live in (for the non-relativistic approximation) generalizes to linear algebra, vector spaces more generally, and, in particular, both Hilbert space and Banach space.
Yeah, dergachev is pointing to one of the cool direct applications of linear algebra to machine learning.

Suppose you are given the data D = (r_1; r_2; ...; r_N) where each row r_i = (a_i1, a_i2, ..., a_in, b_i) consists of some observed data. We want to predict a future b_j given a future \vec{a}_j, given that we have seen {r_i}_{i=1..N} (the data set consists of N rows r_i in which both \vec{a}_i and b_i are known).

One simple model for b_i given \vec{a}_i = (a_i1, a_i2, ..., a_in) is a linear model with n parameters m_1, m_2, ..., m_n:

  y_m(x_1, ..., x_n) = m_1 x_1 + m_2 x_2 + ... + m_n x_n = \vec{m} · \vec{x}

If the model is good, then y_m(\vec{a}_i) approximates b_i well. But startup wisdom dictates that we should measure everything! Enter the error term:

  e_i(\vec{m}) = |y_m(\vec{a}_i) - b_i|²,

the squared absolute value of the difference between the model's prediction and the actual output---hence the name error term. Our goal is to make the sum S of all the error terms as small as possible:

  S(\vec{m}) = \sum_{i=1}^{N} e_i(\vec{m})

Note that the "total squared error" is a function of the model parameters \vec{m}. At this point we have reached a level of complexity that becomes difficult to follow. Linear algebra to the rescue! We can express the "vector prediction" of the model y_m in one shot in terms of the following matrix equation:

  A\vec{m} = \vec{b}

where A is an N-by-n matrix (contains the a_ij part of the data), \vec{m} is an n-by-1 vector (the model parameters---the unknown), and \vec{b} is an N-by-1 vector (contains the b_i part of the data).

To find \vec{m}, we must solve this matrix equation; however, A is not a square matrix: A is a tall skinny matrix (N >> n), so there is no A^{-1}. Okay, so we don't have an A^{-1} to throw at the equation A\vec{m} = \vec{b} to cancel the A, but what else could we throw at it? Let's throw A^T at it!

  A^T A \vec{m} = A^T \vec{b}
  N \vec{m} = A^T \vec{b},    where N = A^T A (an n-by-n matrix)

Now the thing to observe is that if N is invertible, then we can find \vec{m} using

  \vec{m} = N^{-1} A^T \vec{b}

This solution is known as the "least squares fit" (i.e., the choice of parameters m_1, m_2, ..., m_n for the model). The name comes from the fact that the vector \vec{m} is the output of the following optimization problem:

  \vec{m} = minimize S(\vec{m}) over all \vec{m}

Proof: http://en.wikipedia.org/wiki/Linear_least_squares_(mathemati...

Technical detail: the matrix N = A^T A is invertible if and only if the columns of A are linearly independent.

TL;DR: When you fit a "linear regression" model to a data matrix X and labels \vec{y}, the best (in the sense of least squared error) linear model is \vec{m} = (X^T X)^{-1} X^T \vec{y}.
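The whole recipe fits in a few lines of NumPy; a sketch on synthetic data (my illustration, with made-up weights, not from the comment):

```python
import numpy as np

rng = np.random.default_rng(0)

# Fake data: N=50 observations of n=3 features, generated from known
# weights plus a little noise.
A = rng.normal(size=(50, 3))
m_true = np.array([2.0, -1.0, 0.5])
b = A @ m_true + 0.01 * rng.normal(size=50)

# Normal equations: solve (A^T A) m = A^T b instead of inverting A^T A.
N_mat = A.T @ A
m_hat = np.linalg.solve(N_mat, A.T @ b)

# The recovered weights are close to the true ones, and agree with
# NumPy's built-in least-squares solver.
print(np.allclose(m_hat, m_true, atol=0.05))                        # True
print(np.allclose(m_hat, np.linalg.lstsq(A, b, rcond=None)[0]))     # True
```

In practice one uses `np.linalg.lstsq` (or a QR/SVD-based solver) rather than forming A^T A explicitly, since the normal equations square the condition number, but the formula above is the one derived in the comment.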
 That literally made more sense than anything I've learned this semester. :)
 Can you please post the tex associated with the pdf? Thanks in advance.
 Does anyone have a recommendation for an equally good and concise guide for Discrete Mathematics by any chance?
 Literally just finished this exam. Too bad I didn't see this summary 4 hours ago!
 Linear algebra class in one abbreviation: LAPACK.
