

Case Study: Optimizing Matrix * Vector Multiplication for SSE - JabavuAdams
http://www.cortstratton.org/articles/OptimizingForSSE.php

======
manvsmachine
This is exactly why I'm really hoping that OpenCL takes off. While a good
read, to me the article shows why you shouldn't necessarily continue with the
status quo just because "it works" (granted, the article was written back in
'02 when there was no other real option). We have tons of insanely cheap
hardware now that handles SIMD better than any CPU; it's about time we start
making use of it.

