
> seldom get back the "best" conceptual answer, "the determinant is the product of the eigenvalues."

I have found that the "best conceptual" answer depends a lot on taste, and on what concepts you are familiar with.

In this case:

- Calculating exact eigenvalues of matrices larger than 4x4 is impractical, since it requires you to solve a polynomial equation of degree > 4, which in general has no closed-form solution in radicals (Abel-Ruffini).

- The eigenvalues are guaranteed to exist only in an algebraically closed field (the complex numbers), while the determinant itself lives in the base field (rationals, reals).
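
To see the second point concretely, here is a quick sketch (assuming numpy): a real rotation matrix has only complex eigenvalues, yet its determinant is an ordinary real number.

    import numpy as np

    R = np.array([[0.0, -1.0],
                  [1.0,  0.0]])      # rotation by 90 degrees

    print(np.linalg.eigvals(R))      # [0.+1.j 0.-1.j] -- complex eigenvalues
    print(np.linalg.det(R))          # 1.0 -- real, lives in the base field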

How about:

- [Geometric Determinant] The determinant is the (signed) volume of the polytope (a parallelepiped) spanned by the column vectors of the matrix (see the numerical sketch after this list).

- [Coordinate Free Determinant] The determinant is the map induced between the highest exterior powers of the source and target vector spaces (https://en.wikipedia.org/wiki/Exterior_algebra).

- I think there is also a representation-theoretic version that characterizes the determinant as invariant under the symmetric group acting by permutation on the columns/rows of the matrix.
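
On the geometric version: a small numerical sketch (assuming numpy), checking the determinant against the scalar triple product as an independent volume computation:

    import numpy as np

    # Columns of A span a parallelepiped; det(A) is its signed volume.
    A = np.array([[2.0, 0.0, 0.0],
                  [0.0, 3.0, 0.0],
                  [0.0, 0.0, 4.0]])    # a 2 x 3 x 4 box
    print(np.linalg.det(A))            # 24.0

    # The scalar triple product a . (b x c) gives the same signed volume.
    a, b, c = A[:, 0], A[:, 1], A[:, 2]
    print(np.dot(a, np.cross(b, c)))   # 24.0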




Re: your last point, the determinant is the matrix function which is fully anti-symmetric under permutation of rows and columns, i.e. swapping a pair of rows or a pair of columns pulls out a minus sign. The definition of the determinant is related to the alternating (sign) representation of the symmetric group.
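
A quick numerical check of the sign flip (a sketch, assuming numpy):

    import numpy as np

    A = np.array([[3.0, 1.0],
                  [2.0, 4.0]])
    B = A[[1, 0], :]             # swap the two rows

    print(np.linalg.det(A))      # 10.0
    print(np.linalg.det(B))      # -10.0 -- one swap, one minus sign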

The permanent [1] is the matrix function which is fully symmetric, so permuting any rows or columns leaves it invariant. It emerges from the identity (trivial) representation.
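
Since the permanent is just the Leibniz sum over permutations with every sign set to +1, a brute-force sketch (assuming numpy and itertools) makes the contrast with the determinant explicit:

    from itertools import permutations
    import numpy as np

    def permanent(M):
        # Like the Leibniz formula for det, but every term enters with a plus.
        n = len(M)
        return sum(np.prod([M[i, s[i]] for i in range(n)])
                   for s in permutations(range(n)))

    M = np.array([[1, 2],
                  [3, 4]])
    print(permanent(M))          # 1*4 + 2*3 = 10
    print(np.linalg.det(M))      # 1*4 - 2*3 = -2.0 (up to rounding)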

Finally, partially symmetric matrix functions are known as immanants [2], defined using the other irreps of the symmetric group.

[1] https://en.wikipedia.org/wiki/Permanent_%28mathematics%29

[2] https://en.wikipedia.org/wiki/Immanant


I don’t have an intuition for these concepts, I’m afraid (I probably should watch the videos). What I don’t see, for instance, is how this relates to the fact that a matrix A with det(A) = 0 is not invertible.


Take a 3x3 matrix A, for example. Then det(A) is the volume of the parallelepiped formed by the row vectors. If one vector is a linear combination of the other two, the vectors lie in a plane, which has volume 0 in 3D, so det(A) = 0. And since the image of A is only a plane inside 3D, A can't reach all vectors in 3D, so it's not invertible. This generalizes to any dimension.
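
A sketch of this in numpy (my example, with the third row the sum of the first two):

    import numpy as np

    A = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [1.0, 1.0, 0.0]])    # row 3 = row 1 + row 2

    print(np.linalg.det(A))            # 0.0 -- the rows only span a plane
    print(np.linalg.matrix_rank(A))    # 2
    # np.linalg.inv(A) raises LinAlgError: the matrix is singular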


The geometric version is the most intuitive for me:

If the volume of the parallelepiped is zero, then there will be directions in the target space that you did not hit. Hence the matrix cannot be invertible.


I will have to understand what directions in the target space are first. I guess I’ll have to do the work ;)


So an eigenvector is a special direction which most linear transformations have. If you ask them to transform a vector in that direction, they do not rotate that direction to some other direction, they only scale it by a constant called the eigenvalue. Usually there are a bunch of these directions for one transform, and they do not need to be orthogonal. We often choose one vector as representative, like (-1, 1): but this is shorthand for all vectors (-t, t) for all t. One important thing is that the zero vector (0, 0) doesn't count (even though T 0 = 0) because it can't represent a whole direction.

So for example if I take (x, y) to T (x, y) = (3x + y, 2x + 4y), that is an example of what we call a linear transformation -- it obeys T(p1 + p2) = T p1 + T p2 and T(c p) = c T p, so it distributes over addition and over scaling by constants.

Now in addition to noticing that this is linear we may happen to notice that T (-1, 1) = (-3 + 1, -2 + 4) = (-2, 2). So in the direction (-t, t) we are just scaling vectors by a factor of 2, to (-2t, 2t). Similarly we might notice that T (1, 2) = (3 + 2, 2 + 8) = (5, 10). So in the direction (t, 2t) we are just scaling vectors by a factor of 5 to (5t, 10t).

These two scaling factors, 2 and 5, are called the eigenvalues of T. Their product, 10, is called the determinant of T. And in this case their eigenvectors span the entire space -- you can make any other (a, b) as a combination (-t1, t1) + (t2, 2 t2), for some numbers t1, t2. Actually t1 = (-2a + b)/3 and t2 = (a + b)/3, I can work out pretty quickly. And in this t-space this transformation is very easy to think about, it has been "diagonalized."
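
This can all be reproduced numerically; a sketch assuming numpy (eig returns the eigenvalues in no guaranteed order, and scales the eigenvectors to unit length):

    import numpy as np

    T = np.array([[3.0, 1.0],
                  [2.0, 4.0]])    # T(x, y) = (3x + y, 2x + 4y)

    vals, vecs = np.linalg.eig(T)
    print(vals)                   # [2. 5.]
    print(vecs)                   # columns parallel to (-1, 1) and (1, 2)
    print(np.linalg.det(T))       # 10.0 = 2 * 5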

Sometimes these eigenvalues and eigenvectors don't exist, but we can patch that up with one of two tricks. The first trick is, for example, used for the 2x2 rotation matrices. These rotate every direction into some other direction, so how will I find some direction which "stands still"? The answer here is complex numbers: in this case it turns out that any 2x2 rotation by angle t will map the complex vector (1, i) to (cos t + i sin t, -sin t + i cos t) = (cos t + i sin t) * (1, i), so it has two complex eigenvalues e^(it), e^(-it). So the first trick is complex numbers.

There is, it turns out, only one other class of weird transformation. In these weird transformations, it is possible to define chains of "generalized eigenvectors". Each chain starts with one ordinary eigenvector with an ordinary eigenvalue q, T v1 = q v1; the next element of the chain is a "generalized eigenvector of rank 2" which has T v2 = v1 + q v2; the next element is a "generalized eigenvector of rank 3" which has T v3 = v2 + q v3; and so on.
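
Numerically, with the same rotation convention as above (a sketch assuming numpy):

    import numpy as np

    t = 0.7
    R = np.array([[ np.cos(t), np.sin(t)],
                  [-np.sin(t), np.cos(t)]])   # the rotation used above

    print(np.linalg.eigvals(R))    # cos t +/- i sin t, i.e. e^{it} and e^{-it}
    print(np.exp(1j * t))          # matches one of them
    print(np.linalg.det(R))        # 1.0 = e^{it} * e^{-it}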

So it is a theorem that any NxN complex linear transformation has N linearly independent generalized eigenvectors which span the space, and "usually" these are all just normal eigenvectors and the matrix is "diagonalizable" (and even if they aren't, they come in families which start from one normal eigenvector and the matrix can be put into "Jordan normal form").
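
sympy can exhibit such a chain directly; a sketch using the standard example of a shear, which has only one ordinary eigenvector for its eigenvalue q = 1 and so is not diagonalizable:

    import sympy as sp

    M = sp.Matrix([[1, 1],
                   [0, 1]])          # a shear

    P, J = M.jordan_form()           # M = P * J * P**-1
    print(J)                         # Matrix([[1, 1], [0, 1]]): one 2x2 Jordan block

    # The chain from above: v1 is an ordinary eigenvector with M v1 = q v1,
    # and v2 is a generalized eigenvector of rank 2 with M v2 = v1 + q v2.
    v1, v2, q = sp.Matrix([1, 0]), sp.Matrix([0, 1]), 1
    print(M * v1 == q * v1)          # True
    print(M * v2 == v1 + q * v2)     # True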

If you understood all of that, you are ready for the main result that you asked about. :)

For a linear transformation to be invertible, it needs to map distinct input vectors to distinct output vectors. If it maps two different input vectors to the same output, then invertibility fails.

So we know that invertibility fails when we can find distinct v1 and v2 such that T v1 = T v2.

Put another way, T v1 - T v2 = 0. But by the linearity property, T distributes over additions and subtractions, so this is the same as saying that T (v1 - v2) = 0 for v1 - v2 nonzero. This is enough to establish that v1 - v2 is an eigenvector with eigenvalue zero.

What does this do to the determinant, the product of all the eigenvalues? Well, zero times anything is zero. So if some linear transformation T is not invertible, then you immediately can conclude that det(T) = 0.

Furthermore, this argument goes the other way too, with only a little subtlety related to these "generalized eigenvalues" -- basically, that the generalized eigenvalues always exist, that there is always at least one eigenvector which actually has that eigenvalue, and that complex numbers still have the property that a finite product of complex numbers can only be zero if one of the factors is zero. If you know all of those things, then you can work your way backwards: det(T) = 0 implies that one of the generalized eigenvalues is zero, which has at least one normal eigenvector v such that T v = 0, which I can then use to find many inputs for any given output, since T u = T (u + k v) for any k.
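
All of this fits in a short numpy experiment (a sketch; the eigenvector comes out of a numerical routine, so expect rounding noise around 1e-16):

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [2.0, 4.0]])       # rows are dependent

    print(np.linalg.det(A))          # 0.0
    vals, vecs = np.linalg.eig(A)
    v = vecs[:, np.argmin(np.abs(vals))]   # eigenvector with eigenvalue 0

    u = np.array([1.0, 1.0])
    for k in (0.0, 1.0, 7.0):
        print(A @ (u + k * v))       # same output (up to rounding) for every k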

So to say that this product-of-eigenvalues is zero is to say that one of the eigenvalues is zero, and therefore the linear transformation is projecting down to some smaller, flatter subspace in a way that cannot be uniquely undone. If there is no such smaller flatter subspace, then the transformation must have been invertible all along.



