The visualization helped me spot an unstable fixed point and understand the behaviour of the algorithm near eigenvalue clashes. The behaviour is quite subtle.
I think I wrote this there, but here goes again. The idea is that a positive-definite symmetric matrix can be visualized as an ellipse. This follows from the spectral theorem. Each iteration of the QR algorithm causes the ellipse to fall towards the x-axis, as if under the influence of gravity. The unstable fixed point corresponds to when the ellipse is standing up precariously, unable to fall in either direction. If you tilt it by just a bit, it will fall over (so the fixed point is unstable).
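For concreteness, here's a minimal numpy sketch of the plain (unshifted) QR iteration on a 2-by-2 positive-definite symmetric matrix; the matrix and iteration count are my own made-up example. The off-diagonal entry shrinking to zero is the algebraic shadow of the ellipse falling flat:

```python
import numpy as np

# Unshifted QR iteration on a PD symmetric 2x2 matrix
# (eigenvalues 3 and 1). Each step is a similarity
# transform: if A = QR, then RQ = Q^T A Q.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

for _ in range(20):
    Q, R = np.linalg.qr(A)
    A = R @ Q  # the ellipse "falls" a bit further each step

print(np.round(A, 6))  # approximately diag(3, 1)
```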
The case when the ellipse is nearly circular (corresponding to near eigenvalue clashes) causes the ellipse to fall over slowly. I think this also makes physical sense, if you think of it being under the influence of gravity. If you think of this near-circle as being a matrix, then this matrix is nearly equal to a scalar multiple of the identity matrix, so its eigenvalues are essentially known. The fact that the ellipse falls very slowly implies that the eigenvectors are unstable near eigenvalue clashes, but the eigenvalues are easy to find.
Note: The issues surrounding the unstable fixed point can be fixed using Wilkinson shifts. This makes each iteration into a discontinuous function, allowing all the fixed points to be stable. The issue surrounding instability of the eigenvectors near eigenvalue clashes cannot be fixed, as it's intrinsic to eigendecomposition (even of symmetric matrices). The latter difficulties can be dodged by slightly perturbing the matrix, but the resulting eigenvectors can be very different from the eigenvectors of the unperturbed matrix.
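A sketch of what a shifted iteration might look like, with my own `wilkinson_shift` helper (the standard formula for the eigenvalue of the trailing 2-by-2 block closest to the bottom-right entry); this is an illustration, not a production implementation:

```python
import numpy as np

def wilkinson_shift(A):
    # Eigenvalue of the trailing 2x2 block closest to A[-1, -1].
    a, b, c = A[-2, -2], A[-1, -2], A[-1, -1]
    d = (a - c) / 2.0
    if d == 0.0 and b == 0.0:
        return c
    s = 1.0 if d >= 0 else -1.0
    return c - s * b**2 / (abs(d) + np.hypot(d, b))

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
I = np.eye(2)
for _ in range(5):
    mu = wilkinson_shift(A)
    Q, R = np.linalg.qr(A - mu * I)  # shifted QR step
    A = R @ Q + mu * I               # still a similarity transform

print(np.round(A, 6))  # approximately diag(3, 1)
```

Note how the shift choice depends discontinuously on the entries of A, which is the discontinuity mentioned above.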
Since you mentioned QR, I want to link to one of my favorite articles [1], which explains how to use the QR decomposition to sample an orthogonal matrix uniformly at random. The basic idea, of course, is that if A is a random matrix and we compute A = QR, then Q is a random orthogonal matrix. But with what distribution? Well, it turns out to be almost uniform (according to Haar measure), but there are some density issues. A simple rescaling of the QR factors resolves the issue.
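From memory, the fix is to rescale the columns of Q by the signs of R's diagonal, which removes the sign ambiguity in the QR factorization that biases the naive distribution. A sketch (the function name is mine):

```python
import numpy as np

def haar_orthogonal(n, rng):
    # Naive QR of a Gaussian matrix is NOT quite Haar-distributed:
    # the sign ambiguity in the factorization biases Q. Multiplying
    # column j of Q by sign(R[j, j]) fixes the density.
    Z = rng.standard_normal((n, n))
    Q, R = np.linalg.qr(Z)
    return Q * np.sign(np.diag(R))  # broadcasts over columns

rng = np.random.default_rng(0)
Q = haar_orthogonal(3, rng)
print(np.round(Q.T @ Q, 8))  # identity: Q is orthogonal
```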
When I was a TA I loved giving this as an example to students. It's a great teaser for the rabbit hole that is random matrix theory.
> a positive-definite symmetric matrix can be visualized as an ellipse.
It is clear that you mean the ellipsoid as a set of points {x | x^T * A * x = 1} (or some other constant). There is another way in which all square matrices define an ellipsoid based on how the matrix transforms a unit sphere: {A*x | x^T * x = 1} (different matrices can map to the same ellipse here, however).
I always like to clarify which one.
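A small numeric sketch of the difference between the two definitions, using a diagonal PD matrix so both ellipses can be read off directly (my own toy example):

```python
import numpy as np

A = np.array([[4.0, 0.0],
              [0.0, 1.0]])  # PD symmetric, eigenvalues 4 and 1

# Image ellipse {A x | x^T x = 1}: semi-axis lengths are the
# singular values (here 4 and 1).
e1 = A @ np.array([1.0, 0.0])   # endpoint of the long semi-axis

# Level-set ellipse {x | x^T A x = 1}: semi-axis lengths are
# 1/sqrt(eigenvalue) (here 1/2 and 1), so a big eigenvalue
# means a SHORT axis in this picture.
x = np.array([0.5, 0.0])        # lies on the level set

print(e1, x @ A @ x)
```

So for the same matrix the two pictures give ellipses with reciprocal-square-root versus linear axis lengths, which is one reason it pays to say which one is meant.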
(I unfortunately do not have intuition for the QR algorithm, but I am distracted by the description of an ellipse falling, as if it's rolling against the x-axis)
> It is clear that you mean the ellipsoid as a set of points {x | x^T * A * x = 1} (or some other constant). There is another way in which all square matrices define an ellipsoid based on how the matrix transforms a unit sphere: {A*x | x^T * x = 1} (different matrices can map to the same ellipse here, however).
I actually had the latter one in mind: The one given by {A*x | x^T * x = 1}. I hadn't thought of your other suggestion.
> (I unfortunately do not have intuition for the QR algorithm, but I am distracted by the description of an ellipse falling, as if it's rolling against the x-axis)
Each iteration causes it to rotate around the origin. The big semi-axis of the ellipse makes a smaller angle with the x-axis on each iteration. When the semi-axes are parallel to the coordinate axes, the matrix is diagonal.
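One way to watch this numerically (my own sketch): track the angle between the x-axis and the eigenvector of the largest eigenvalue, which points along the big semi-axis:

```python
import numpy as np

def major_axis_angle(A):
    # Angle in [0, pi/2] between the big semi-axis and the x-axis,
    # computed from the eigenvector of the largest eigenvalue.
    w, V = np.linalg.eigh(A)            # eigenvalues ascending
    v = V[:, -1]                        # top eigenvector
    return np.arctan2(abs(v[1]), abs(v[0]))

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
angles = []
for _ in range(6):
    angles.append(major_axis_angle(A))
    Q, R = np.linalg.qr(A)
    A = R @ Q

print(np.round(angles, 4))  # shrinks toward 0: the ellipse falls flat
```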
> I actually had the latter one in mind: The one given by {Ax | x^T x = 1}.
I am near certain that you did not mean this.
> When the semi-axes are parallel to the coordinate axes, the matrix is diagonal.
Let D be the matrix
| 2 0 |
| 0 1 |
And let R be any 2-by-2 rotation matrix.
A = D * R would give you a locus of points (under {Ax | x^T x = 1}) that has the semi-axes parallel to the coordinate axes. In general, A is not diagonal.
The condition you mention (axis-aligned semi-axes of the ellipsoid iff the matrix is diagonal) is true for {x | x^T * A * x = 1}.
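Checking the counterexample numerically (my own sketch): A = D * R has an axis-aligned image ellipse {Ax | x^T x = 1} but is not diagonal.

```python
import numpy as np

D = np.array([[2.0, 0.0],
              [0.0, 1.0]])
t = 0.7  # any rotation angle
R = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])
A = D @ R  # not diagonal

# The semi-axes of the image ellipse point along the left singular
# vectors, scaled by the singular values. Here U = I up to signs:
# the axes lie along e1 and e2 even though A is not diagonal.
U, s, Vt = np.linalg.svd(A)
print(np.round(np.abs(U), 6))  # identity up to signs
print(np.round(s, 6))          # [2, 1]
```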
I only consider positive semi-definite symmetric matrices. A = D * R is in general not such a matrix. The positive semi-definite symmetric case is the only one needed to compute SVDs.
A proof of convergence for the algorithm is only known for certain families of matrices. These include the positive-definite symmetric matrices, all symmetric matrices (once suitable improvements to the algorithm are made), the Hermitian matrices, and perhaps some others. But a proof that's valid for all matrices isn't known, even though the algorithm (when improved using Wilkinson shifts) appears to converge everywhere.
>> Under certain conditions,[4] the matrices Ak converge to a triangular matrix, the Schur form of A.
> So am I to understand that for A positive semi-definite, the A_{k} converges to a diagonal matrix?
Yes. Each iteration of QR (and LR) turns a PSD matrix into another PSD matrix. A PSD matrix in Schur form is necessarily a diagonal matrix.
>> The basic QR algorithm can be visualized in the case where A is a positive-definite symmetric matrix.
> It sounds like you can visualize the iterates A_{k} no matter what. Is the problem that there isn't a fixed point when the ellipsoid is axis-aligned?
My visualization technique assumes that the matrix is PSD. In the PSD case, there is a one-to-one correspondence between ellipsoids and PSD matrices. I don't know what happens if you apply it to non-PSD matrices, but the one-to-one correspondence gets lost.