Power Method for Approximating Eigenvalues [pdf] (ugr.es)
77 points by kercker on Dec 27, 2017 | 27 comments



Fun fact: the power method is what all neutronics codes that simulate neutron distributions in nuclear reactors use. The diffusion/transport equation in a multiplying medium is an eigenvalue equation, and the dominant eigenvalue is the inverse of k, the multiplication factor.


EDIT: Explained in detail in another comment: https://news.ycombinator.com/item?id=16017949
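For reference, the power method itself is just "apply A repeatedly and renormalize"; a minimal numpy sketch:

    import numpy as np

    def power_method(A, num_iters=1000, tol=1e-12):
        # Repeatedly apply A and renormalize; the iterate converges to the
        # dominant eigenvector (for almost any starting vector), and the
        # Rayleigh quotient converges to the dominant eigenvalue.
        x = np.random.default_rng(0).standard_normal(A.shape[0])
        x /= np.linalg.norm(x)
        lam = x @ A @ x
        for _ in range(num_iters):
            x = A @ x
            x /= np.linalg.norm(x)
            lam, lam_prev = x @ A @ x, lam
            if abs(lam - lam_prev) < tol:
                break
        return lam, x

    A = np.array([[2.0, 1.0], [1.0, 3.0]])
    print(power_method(A)[0])   # ~3.618, the dominant eigenvalue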


The convergence analysis is a bit lacking, but there is a significant speed-up when you store the last power of A and keep multiplying by that instead of by A. That is: A, A^2, A^4, A^8, ...

It makes the second case they give go from 60 iterations down to 7.
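A sketch of that squaring scheme (note each step now squares the full matrix, which is the cost trade-off raised in the reply below):

    import numpy as np

    def power_method_squaring(A, num_steps=7):
        # Each step multiplies x by the current power of A, then squares
        # that power: x sees A, A^2, A^4, A^8, ... Far fewer steps are
        # needed, but each squaring is a full O(n^3) matrix product.
        x = np.random.default_rng(0).standard_normal(A.shape[0])
        B = A.copy()
        for _ in range(num_steps):
            x = B @ x
            x /= np.linalg.norm(x)
            B = B @ B
            B /= np.linalg.norm(B)   # rescale to avoid overflow; direction is unchanged
        return x @ A @ x, x          # Rayleigh quotient w.r.t. the original A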


Multiplying a matrix by a matrix is a lot more expensive than multiplying a vector by a matrix.


Yeah, and if you're gonna do matrix-matrix products you might as well use the QR method.


Is there a relationship between the power method and some "standard" optimization algorithm (grad. descent, Newton's, ...) applied to maximization of the Rayleigh quotient?


Conjugate gradient (and other Krylov subspace iterations) is basically a generalization of power iteration, where you consider the whole set {b, Ab, A^2 b, ...} rather than just the latest vector.
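A sketch of that idea: build the Krylov basis explicitly, orthonormalize it, and read off Ritz values (real Arnoldi/Lanczos implementations orthogonalize as they go, because raw powers of A quickly become nearly linearly dependent):

    import numpy as np

    def krylov_ritz_values(A, b, m=10):
        # Span {b, Ab, A^2 b, ...}; columns are normalized only for
        # numerical hygiene (scaling doesn't change the span).
        cols = [b / np.linalg.norm(b)]
        for _ in range(m - 1):
            v = A @ cols[-1]
            cols.append(v / np.linalg.norm(v))
        Q, _ = np.linalg.qr(np.column_stack(cols))  # orthonormal Krylov basis
        H = Q.T @ A @ Q                             # A projected onto the subspace
        return np.linalg.eigvals(H)                 # Ritz values: several eigenvalue estimates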


MATLAB, for one, uses Arnoldi with restarts in its eigs function (picking an arbitrary seed to zoom in on "good" parts of the Krylov subspace). Imagine we pick an Arnoldi vector for restarting (a member of the orthonormal basis spanning that subspace), which is equivalent to some polynomial in the initial matrix applied to our initial vector. We essentially want to choose this polynomial (not the characteristic one) so that its values peak at the desired true eigenvalues of the matrix; this lets us pick out the Krylov subspace that better approximates the correct eigenvector, and hence the correct eigenvalue.
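SciPy's scipy.sparse.linalg.eigs wraps ARPACK's implicitly restarted Arnoldi, the same family of method described above; a minimal usage sketch:

    import numpy as np
    from scipy.sparse.linalg import eigs   # ARPACK: implicitly restarted Arnoldi

    A = np.random.default_rng(0).standard_normal((500, 500))
    vals, vecs = eigs(A, k=5, which='LM')  # the 5 eigenvalues of largest magnitude
    print(vals)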


For symmetric A, the leading eigenspace gives the best rank-one approximation of A. So the power method finds a (suitably scaled) minimizer v of the Frobenius norm of A - vv^T.
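A quick numerical check of that claim for a symmetric positive definite A (the minimizing v is the leading eigenvector scaled by the square root of its eigenvalue, and the error equals the norm of the remaining spectrum):

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.standard_normal((5, 5))
    A = B @ B.T                              # symmetric positive definite
    vals, vecs = np.linalg.eigh(A)           # ascending eigenvalues
    v = np.sqrt(vals[-1]) * vecs[:, -1]      # scaled leading eigenvector
    print(np.linalg.norm(A - np.outer(v, v), 'fro'))  # best rank-one error
    print(np.sqrt(np.sum(vals[:-1] ** 2)))            # the two numbers agree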


Once you know the dominant eigenvector, I recall there was some trick you could do to get the second-dominant eigenvector, by projecting the dominant one out somehow. How can you repeat power iteration to get all the eigenvectors of a matrix?


The largest-magnitude eigenvalue of (A - kI)^{-1} is 1 / (x - k), where x is the eigenvalue of A closest to k. So by running the power method on (A - kI)^{-1} for various values of k (shift-and-invert), you can find all of the eigenvalues of A.
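A minimal sketch of that shift-and-invert trick (the inverse is applied via a linear solve rather than formed explicitly):

    import numpy as np

    def shift_invert_power(A, shift, num_iters=100):
        # Power iteration on (A - shift*I)^(-1) converges to the
        # eigenvector whose eigenvalue is closest to the shift.
        M = A - shift * np.eye(A.shape[0])
        x = np.random.default_rng(0).standard_normal(A.shape[0])
        for _ in range(num_iters):
            x = np.linalg.solve(M, x)   # apply the inverse without forming it
            x /= np.linalg.norm(x)
        return x @ A @ x                # Rayleigh quotient: eigenvalue of A near the shift

    A = np.diag([1.0, 4.0, 9.0])
    print(shift_invert_power(A, 3.5))   # -> 4.0, the eigenvalue closest to 3.5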


In practice this is numerically unstable, and it's generally better to use methods that work with a larger, orthogonal subspace (i.e. Krylov methods) to get multiple eigenvalue/eigenvector pairs. In nuclear reactor problems you can maybe get about 5 eigenvalues by filtering out the larger ones, but after that it gets noisy fast. I've gotten up to 1000 good eigenvalues from a large neutron diffusion problem using Arnoldi.


You're talking about Arnoldi iteration. It's useful in a few cases, but it doesn't scale as well because the component matrices are no longer sparse.


If you know the first one, you can project everything onto its orthogonal complement.
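A sketch of that deflation idea, assuming A is symmetric so its eigenvectors are orthogonal:

    import numpy as np

    def deflated_power_method(A, known, num_iters=1000):
        # Power iteration restricted to the orthogonal complement of the
        # eigenvectors already found (columns of `known`). Re-projecting
        # every iteration keeps roundoff from re-introducing them.
        x = np.random.default_rng(0).standard_normal(A.shape[0])
        for _ in range(num_iters):
            x = A @ x
            x -= known @ (known.T @ x)   # project out the known eigenvectors
            x /= np.linalg.norm(x)
        return x @ A @ x, x

    A = np.diag([5.0, 3.0, 1.0])
    e1 = np.array([1.0, 0.0, 0.0])       # suppose the dominant eigenvector is known
    lam2, v2 = deflated_power_method(A, e1[:, None])
    print(lam2)                          # -> 3.0, the second eigenvalue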


What are you all computing eigenvalues for? I am curious.


PageRank is computing the principal eigenvector of the transition matrix describing the connectivity of the web. PageRank is no longer, AFAIK, the primary signal that Google uses, but it's an idea that was worth at least a billion dollars.
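A toy PageRank as power iteration (the damping term folds in "random surfer" teleportation; the adjacency convention and the graph below are made up for illustration):

    import numpy as np

    def pagerank(adj, d=0.85, num_iters=100):
        # adj[i, j] = 1 if page j links to page i (dense toy representation).
        out_degree = adj.sum(axis=0)
        out_degree[out_degree == 0] = 1          # guard against dangling pages
        P = adj / out_degree                     # column-stochastic transition matrix
        r = np.full(adj.shape[0], 1.0 / adj.shape[0])
        for _ in range(num_iters):               # power iteration with damping
            r = d * (P @ r) + (1 - d) / adj.shape[0]
        return r

    # Tiny 4-page web: 0->1, 0->2, 1->2, 2->0, 3->2
    adj = np.array([[0, 0, 1, 0],
                    [1, 0, 0, 0],
                    [1, 1, 0, 1],
                    [0, 0, 0, 0]], dtype=float)
    print(pagerank(adj))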



In a nuclear reactor, neutrons are born (from fission or n,2n reactions) and neutrons are lost (by diffusion or absorption). The rates of these processes are all dependent on how many neutrons there are (call it X), and by conservation of neutrons, the mechanisms can be written:

Losses * X = Gains * X

or equivalently: (Losses - Gains) * X = 0

As it turns out, when you discretize this (matrix) equation over some spatial mesh (and sometimes energy and angular meshes as well), you get a classic generalized eigenvalue problem, LX = lambda*GX, or (L - lambda*G)X = 0. You can just write down the loss and gain terms at every spatial mesh point, pass the matrices into an eigenvalue solver like the power method, and you'll get values for lambda and X. X (the eigenvector) is your neutron distribution in the reactor: it tells you where the neutrons are, where they're going, and how fast they are. This determines how much fission power is being produced where, and how much each material is transmuting into fission products and whatnot. The eigenvalue lambda tells you how critical the chain reaction is. Since lambda is the inverse of the multiplication factor k: if it's 1.0, you're critical; if it's less than 1 you're exponentially growing (going up in power or exploding); and if it's above 1, you're exponentially decaying.

And that, my friends, is how computing eigenvalues is the key to nuclear engineering.
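A toy version of that setup: one energy group, 1-D diffusion, zero flux at the boundaries, with made-up cross sections. It computes k = 1/lambda directly, as the dominant eigenvalue of L^{-1} G:

    import numpy as np

    # Discretize -D*phi'' + sigma_a*phi = (1/k)*nu_sigma_f*phi on a 1-D mesh.
    # Then L*phi = (1/k)*G*phi, so M = L^{-1} G has dominant eigenvalue k.
    n, h = 100, 1.0
    D, sigma_a, nu_sigma_f = 1.0, 0.07, 0.08     # illustrative numbers only

    main = np.full(n, 2 * D / h**2 + sigma_a)    # losses: leakage + absorption
    off = np.full(n - 1, -D / h**2)
    L = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    G = nu_sigma_f * np.eye(n)                   # gains: fission source

    phi, k = np.ones(n), 1.0
    for _ in range(200):                         # power iteration on L^{-1} G
        phi = np.linalg.solve(L, G @ phi)
        k = np.linalg.norm(phi)                  # dominant eigenvalue estimate
        phi /= k
    print(k)                                     # multiplication factor; phi is the flux shape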


They're also used in spectral clustering, which is a powerful method for clustering data (see Ng 2001 for some pictures).
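A minimal spectral bisection sketch (the two-cluster special case of spectral clustering), using a made-up adjacency matrix of two loosely connected triangles:

    import numpy as np

    W = np.array([[0, 1, 1, 0, 0, 0],    # triangle {0,1,2} and triangle {3,4,5}
                  [1, 0, 1, 0, 0, 0],    # joined by the single edge 2-3
                  [1, 1, 0, 1, 0, 0],
                  [0, 0, 1, 0, 1, 1],
                  [0, 0, 0, 1, 0, 1],
                  [0, 0, 0, 1, 1, 0]], dtype=float)
    L = np.diag(W.sum(axis=1)) - W       # unnormalized graph Laplacian
    vals, vecs = np.linalg.eigh(L)
    fiedler = vecs[:, 1]                 # eigenvector of the 2nd-smallest eigenvalue
    print((fiedler > 0).astype(int))     # signs split the graph: e.g. [0 0 0 1 1 1]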


One example: a common ML algorithm, principal component analysis, makes use of the eigenvectors and eigenvalues of a data set's correlation matrix. The dominant eigenvectors can be used as a basis for data compression, and the size of their eigenvalues determines the accuracy of the compression.
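A sketch of that: eigendecompose the correlation matrix and keep the top eigenvectors (scatter-plotting the two columns of scores gives the 2-D view the next comment describes):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 10))  # toy data

    C = np.corrcoef(X, rowvar=False)             # correlation matrix of the columns
    vals, vecs = np.linalg.eigh(C)               # ascending eigenvalues
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]

    Z = (X - X.mean(axis=0)) / X.std(axis=0)     # standardize to match the correlation matrix
    scores = Z @ vecs[:, :2]                     # project onto the top-2 eigenvectors
    print(vals[:2].sum() / vals.sum())           # fraction of variance the top 2 capture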


With high-dimensionality data (think many columns), plotting the rows projected onto the first two eigenvectors provides a fantastic clustering of the rows.

In genetics, for example, this plot may separate samples with different ethnicities (see PCA on SNPs).


The eigenvalues of the Hessian of a neural network's loss describe the curvature of the loss surface. It is generally believed that, in NNs trained with stochastic gradient descent, minima in regions of high curvature tend to generalize poorly.
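Those Hessian eigenvalues are themselves often estimated by power iteration, since the Hessian is far too big to form; a toy sketch using finite-difference Hessian-vector products on a stand-in logistic loss (not a real network):

    import numpy as np

    def logistic_grad(w, X, y):
        # Gradient of a logistic-regression loss; a stand-in for any
        # model's gradient function.
        p = 1.0 / (1.0 + np.exp(-X @ w))
        return X.T @ (p - y) / len(y)

    def top_hessian_eigenvalue(w, grad_fn, num_iters=100, eps=1e-5):
        # Power iteration using H*v ~= (grad(w + eps*v) - grad(w)) / eps,
        # so the Hessian is never materialized.
        v = np.random.default_rng(0).standard_normal(w.shape)
        v /= np.linalg.norm(v)
        g0, lam = grad_fn(w), 0.0
        for _ in range(num_iters):
            hv = (grad_fn(w + eps * v) - g0) / eps
            lam = v @ hv                 # Rayleigh quotient (v is unit-norm)
            v = hv / np.linalg.norm(hv)
        return lam

    rng = np.random.default_rng(1)
    X, y = rng.standard_normal((100, 5)), rng.integers(0, 2, 100).astype(float)
    print(top_hessian_eigenvalue(rng.standard_normal(5), lambda w: logistic_grad(w, X, y)))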


What textbook is this excerpt from?


The content seems to be in Larson, Elementary Linear Algebra, and also in Complex Variables and Numerical Methods (GTU 2016) by Ravish R. Singh and Mukul Bhatt.


Erwin Kreyszig, Advanced Engineering Mathematics


Yep, I recognize it. That's an awesome book.


It's actually my semester textbook



