
Ask HN: what linear algebra do you use most often for practical problems? - vang3lis
Linear algebra comes up in almost every area of modern research, but which applications have you used most often? Is it SVD for latent semantic analysis, computing eigenvectors for PageRank, or something else? Discuss.
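For the PageRank case, the whole computation fits in a few lines of numpy: the score vector is the principal eigenvector of the damped, column-stochastic link matrix, found by power iteration. (The link structure below is invented for illustration.)

```python
import numpy as np

# Toy PageRank by power iteration. The link structure is made up.
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}   # src -> list of dsts
n, d = 4, 0.85                                # d is the damping factor
M = np.zeros((n, n))
for src, outs in links.items():
    for dst in outs:
        M[dst, src] = 1.0 / len(outs)
G = d * M + (1 - d) / n        # damped matrix; every column sums to 1
r = np.ones(n) / n
for _ in range(100):
    r = G @ r                  # power iteration; r stays a probability vector
# r now approximates the PageRank scores
```

Node 2 ends up ranked highest here, as you'd expect: it has the most inbound links.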
======
jey
I'm using SVD in building a porn recommendation engine. (Very NSFW, and the
recommendation engine itself isn't live yet: <http://fapseek.com>)

~~~
redrobot5050
You hiring?

~~~
jey
Email me.

------
kurtosis
I apologize for threadjacking but you guys might be able to help.

I have a different problem - I _would_ like to compute an approximate SVD of a
very large sparse matrix (for spectral clustering), but I can't find a good
implementation that works for datasets too large to fit in core. This is a
Hadoop-scale problem. What's the best way to do this?

Of course, finding _all_ the singular values/vectors is out of the question,
but I just need the top hundred or so.

Does anyone here have any suggestions for how to do this? The obvious strategy
is just to construct a rank-100 approximation and optimize the singular values
and vectors so that they get as close as possible to the real matrix. I guess
gradient descent or something like that would work. Are there existing
packages that do this with Hadoop?
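One sketch of the rank-k idea, using a randomized projection instead of gradient descent: it only ever touches the matrix through products with A and A.T, so an out-of-core or MapReduce variant just has to implement those two products. (Function name and defaults below are mine, purely illustrative.)

```python
import numpy as np

def randomized_svd(A, k, oversample=10, n_iter=4, seed=0):
    # Randomized truncated SVD sketch: project A onto a random subspace
    # slightly larger than k, orthonormalize, then take an exact SVD of
    # the resulting small matrix.
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Q = rng.standard_normal((n, k + oversample))
    Q, _ = np.linalg.qr(A @ Q)           # orthonormal basis for range(A)
    for _ in range(n_iter):              # power iterations sharpen the basis
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)
    B = Q.T @ A                          # small (k + oversample) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k]
```

If the matrix does fit in core as a sparse structure, `scipy.sparse.linalg.svds` computes the top-k singular triplets directly.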

~~~
buss
Gradient descent is a good solution for approximate SVD; I'm using it as part
of my data mining final project (working on the Netflix Prize). I'm using this
guy's code: <http://www.timelydevelopment.com/demos/NetflixPrize.aspx>,
modified to print out the singular vectors when it finishes. It took about 32
hours (can't quite remember) to find the first 64 singular values* on the
Netflix dataset (480000x18000, 1.2% non-zero (or is it 1.8%?)) using a 2.2GHz
Opteron and ~2 gigs of RAM.

I'm sure there are better methods, but this one is easy and is producing great
results. If you have any questions, you can shoot me an email at sbuss at cise
dot ufl dot edu.

As for Hadoop, I don't know of any existing parallel implementations of this,
but I don't think it would be /too/ hard to parallelize the gradient descent
approach. Just split the error calculation into several smaller chunks. If
you get it running in parallel, let me know.

*edit: changed "vectors" to "values" in first paragraph.
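The core of the incremental approach is short enough to sketch (a toy version with made-up defaults, not the code from that link): approximate the rating matrix as a product of two skinny factors by stochastic gradient descent over the observed (user, item, value) triples.

```python
import numpy as np

def sgd_factorize(ratings, n_users, n_items, k, lr=0.05, reg=1e-4,
                  epochs=500, seed=0):
    # Funk-style incremental matrix factorization: fit U and V so that
    # U[u] @ V[i] approximates each observed rating, with L2 shrinkage.
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_users, k))
    V = 0.1 * rng.standard_normal((n_items, k))
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - U[u] @ V[i]
            u_old = U[u].copy()            # use the pre-update value for V's step
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * u_old - reg * V[i])
    return U, V
```

The inner loop only ever reads one (u, i, r) triple at a time, which is why splitting the error calculation into chunks parallelizes naturally.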

~~~
kurtosis
Thanks for the link - these results are encouraging. I'll think about it - if
I come up with anything I'll let you know. Good luck.

------
Raplh
I'm not in software, but rather in algorithm development for determining radio
positions (like GPS, but including other methods for cell phones). I have used
eigenanalysis a number of times, a lot of orthogonal basis shifting of various
kinds, and implicitly a lot of matrix inversions and pseudoinversions to solve
things. I do most of my R&D in MATLAB, and have learned a lot of linear
algebra reading their help files. I use Hilbert spaces and what I learned about
them 30 years ago almost constantly.
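The pseudoinverse pattern behind a lot of this is just least squares on an overdetermined system: more measurements than unknowns, as in a multilateration-style position fix. (The numbers below are made up for illustration.)

```python
import numpy as np

# Overdetermined system A x = b: four noisy measurements, two unknowns.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
b = np.array([2.1, 2.9, 4.2, 4.8])
x = np.linalg.pinv(A) @ b   # least-squares solution via the pseudoinverse
```

`np.linalg.lstsq(A, b)` gives the same answer and is numerically preferable for large problems, since it avoids forming the pseudoinverse explicitly.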

------
Tangurena
The simplex algorithm, for a resource optimization solver.

<http://en.wikipedia.org/wiki/Simplex_algorithm>
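For a taste, here's a toy resource-allocation LP solved with SciPy's `linprog` (all numbers invented; `linprog` minimizes, so the objective is negated):

```python
from scipy.optimize import linprog

# Maximize 3x + 5y subject to:
#   x <= 4,  2y <= 12,  3x + 2y <= 18,  x >= 0,  y >= 0
res = linprog(c=[-3, -5],
              A_ub=[[1, 0], [0, 2], [3, 2]],
              b_ub=[4, 12, 18])
# Optimum is at x = 2, y = 6 with objective value 36.
```

The optimum always sits at a vertex of the feasible polytope, which is exactly the structure the simplex method exploits.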

------
thomaspaine
Non-negative matrix factorization for text clustering:
<http://www.procoders.net/?p=128>
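A minimal numpy sketch of NMF via Lee-Seung multiplicative updates (my own toy version, not the code behind that post):

```python
import numpy as np

def nmf(V, k, n_iter=500, seed=0):
    # Factor V ~= W @ H with all entries non-negative, minimizing
    # Frobenius error via Lee-Seung multiplicative updates.
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, k)) + 1e-3
    H = rng.random((k, n)) + 1e-3
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)   # epsilon avoids divide-by-zero
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H
```

For text clustering, V is the term-document matrix and each document can be assigned to the dominant component of its column of H.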

------
biohacker42
SVD, PCA, all kinds of matrix transformations, all for bioinformatics.

------
omarish
You can save a ton of time by formulating a linear algebra problem as a
shortest-path problem; you can end up using Dijkstra's algorithm to enumerate
the possible solutions.

<http://en.wikipedia.org/wiki/Knapsack_problem>

~~~
jey
I don't get it (at all). Can you elaborate?

~~~
darkxanthos
He's basically saying that instead of manipulating a bunch of data
algebraically to get the exact, provably right answer, you can use this other
algo to get a damn good answer faster. (I think)

~~~
omarish
accurate

------
biotech
I probably use multi-linear regression most often, but occasionally I'll
encounter a Linear Programming problem.

I'm currently looking for a good Partial Least Squares algorithm in C/C++. Any
suggestions? I know R has a popular PLS algorithm, but I was hoping to avoid
the learning curve.
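Not a C/C++ answer, but the core NIPALS iteration for single-response PLS is short enough to sketch in Python (my own toy version) and port to C by hand:

```python
import numpy as np

def pls1(X, y, n_components):
    # NIPALS for single-response PLS regression. Returns coefficients B
    # such that y_hat = (X_new - X.mean(0)) @ B + y.mean().
    Xm, ym = X.mean(0), y.mean()
    Xk, yk = X - Xm, y - ym
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)          # weight vector
        t = Xk @ w                      # scores
        tt = t @ t
        p = Xk.T @ t / tt               # X loadings
        c = (yk @ t) / tt               # y loading
        Xk = Xk - np.outer(t, p)        # deflate X and y
        yk = yk - c * t
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)
```

With n_components equal to the rank of X this reproduces ordinary least squares; fewer components give the usual PLS shrinkage.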

------
njoubert
Generating Jacobians for Implicit Euler Integration in a numerical simulation
project. SVD for rigid body center-of-mass calculations. Conjugate Gradient
for solving Ax=b on 9000x9000 matrices.
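The CG loop itself is tiny; here's a bare-bones textbook sketch (not the poster's code), which only needs matrix-vector products with A:

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-8, max_iter=None):
    # Plain conjugate gradient for A x = b, with A symmetric positive
    # definite. A never needs to be stored densely if a fast matvec exists.
    x = np.zeros_like(b)
    r = b - A @ x                       # residual
    p = r.copy()                        # search direction
    rs = r @ r
    for _ in range(max_iter or len(b)):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p       # new direction, A-conjugate to the old
        rs = rs_new
    return x
```

On a 9000x9000 system the iteration count (and hence a good preconditioner) is what dominates, not the per-step cost.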

------
keefe
It's not a very interesting answer, but it's technically linear algebra:
transformation matrices for rotating and scaling graphics...
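For the record, the 2-D versions are one-liners (numpy sketch):

```python
import numpy as np

def rotation(theta):
    # counter-clockwise rotation by theta radians
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def scaling(sx, sy):
    return np.array([[sx, 0.0], [0.0, sy]])

# Compose right-to-left: rotate 90 degrees, then double the y scale.
M = scaling(1.0, 2.0) @ rotation(np.pi / 2)
p = M @ np.array([1.0, 0.0])   # (1, 0) -> (0, 1) -> approximately (0, 2)
```

Graphics pipelines usually use 3x3 (or 4x4) homogeneous matrices instead so that translation composes by the same matrix multiply.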

------
frankus
I've poked around the CGAffineTransform classes that Apple uses in their Core
Animation framework a bit, but unless you're doing something out of the
ordinary, you don't really need to know what's going on under the hood.

