
Linear Algebra: What matrices are - lormayna
https://nolaymanleftbehind.wordpress.com/2011/07/10/linear-algebra-what-matrices-actually-are/
======
jameshart
I actually object a little bit to the claim that matrices _are_
representations of linear transformations. No, matrices are just two-
dimensional arrays of numbers. If you then define a specific 'multiplication'
operation on those arrays, and a mapping from linear functions to matrices, it
turns out that that multiplication operation is isomorphic to function
composition. That's neat! But it doesn't mean that that is what matrices
_are_.

If you came up with an isomorphism from matrices to a domain where it made
sense to define a different multiplication operation - like maybe placewise
multiplication, where

    
    
       [a b] * [e f] = [ae bf]
       [c d]   [g h]   [cg dh]
    

then that would be just as valid, but it wouldn't change what matrices 'are'.
In fact, because you know how to map matrices to linear functions, it would
let you describe an operation to combine two linear functions in a new way and
that might lead to some new insight about linear algebra!

It's like how, in school, you were taught that you can't multiply vectors
together. Yet, in shader languages, it turns out that it's _really useful_ to
be able to multiply two vectors just by multiplying each component
([a,b]*[c,d]=[ac,bd]), so they define that as a valid operation.
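
A minimal sketch of that componentwise (Hadamard) product in Python/numpy
(numbers of my own choosing); numpy, like shader languages, treats * as the
elementwise product for both matrices and vectors:

    import numpy as np
    
    # Elementwise (Hadamard) product, matching [ae bf; cg dh] above:
    A = np.array([[1, 2], [3, 4]])
    B = np.array([[5, 6], [7, 8]])
    print(A * B)  # [[ 5 12], [21 32]]
    
    # The same operator on vectors, as in shader languages:
    print(np.array([1, 2]) * np.array([3, 4]))  # [3 8]
    
    # Ordinary matrix multiplication is a different operator entirely:
    print(A @ B)  # [[19 22], [43 50]]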

~~~
ambicapter
Why were you taught that you can't multiply vectors together? Aren't there two
ways to "multiply" vectors–dot and cross?

~~~
philh
Traditionally, multiplication takes two elements of a set and turns them into
another element of that set. The dot product doesn't do that, it takes two
vectors and turns them into a scalar.

The cross product only exists in three dimensions. And it's not associative
(A×B×C gives a different answer depending on which order you do it in);
associativity is another property multiplication usually satisfies.

There are two other not-quite-multiplication operators that I recall seeing.
There's an analog of the cross product in two dimensions: (a,b,0)×(c,d,0) =
(0,0,ad-bc), so it can be useful to have an operator (a,b)×(c,d) = ad-bc,
again turning two vectors into a scalar.

And if the dot product is defined in terms of matrix multiplication by A·B =
AᵀB, then you can also define an operator ABᵀ, turning two vectors into a
matrix. These vectors don't even need to have the same length.
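
A small numpy sketch of those last two constructions plus the 2D cross
analog (numbers of my own choosing), treating vectors as column matrices:

    import numpy as np
    
    A = np.array([[1.0], [2.0], [3.0]])  # 3x1 column vector
    B = np.array([[4.0], [5.0], [6.0]])  # 3x1 column vector
    
    # Dot product as a matrix product: AᵀB is a 1x1 matrix.
    print(A.T @ B)        # [[32.]]
    
    # 2D cross-product analog: (a,b)×(c,d) = ad - bc, a scalar.
    a, b, c, d = 1.0, 2.0, 3.0, 4.0
    print(a * d - b * c)  # -2.0
    
    # Outer product ABᵀ: a matrix; the lengths need not match.
    C = np.array([[7.0], [8.0]])  # 2x1 column vector
    print(A @ C.T)        # a 3x2 matrix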

~~~
jules
FYI, in geometric algebra you can truly multiply two vectors and get another
element of the geometric algebra out, no matter the dimensionality.
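
A toy sketch of the 2D case (my own construction, not from any GA library):
the geometric product of two vectors is a scalar part (the dot product) plus
a bivector part, and that pair is itself an element of the algebra:

    # Geometric product of 2D vectors a and b, returned as
    # (scalar part, coefficient of the bivector e1^e2).
    def geometric_product_2d(a, b):
        scalar = a[0] * b[0] + a[1] * b[1]    # a . b
        bivector = a[0] * b[1] - a[1] * b[0]  # a ^ b
        return (scalar, bivector)
    
    print(geometric_product_2d((1, 2), (3, 4)))  # (11, -2)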

~~~
tomadi
But the product isn't a vector of the same shape as the inputs, so it isn't a
vector multiplication. It is tensor multiplication.

------
terminalcommand
I always had trouble with math in high school. I managed to pass in the end,
but I never really got it. I think math should be taught like programming:
they should present you with a problem and then show you a cool way to solve
it. My math classes consisted of memorizing formulas and algorithms for
standardized tests. I learned about hyperbolas, matrices, integrals etc., but
none of them stuck with me. One of the few things I remember is basic
trigonometry, because we used to do practical stuff like measuring an angle
from the ground and the distance from a building to calculate its height.

Moreover, I think math seriously needs a REPL. At my math exams (not tests,
written ones) I could never calculate the solution right; I would always make
a mistake. We need to acknowledge the fact that we are human and humans make
errors. We need to teach high school math in a hackable, practice-oriented
way. The current math curriculum excludes pupils who think differently. If you
can't solve it in the traditional way, you're doomed. But you're actually
smart and can _understand_ math if you learn it by doing, hacking,
programming.

~~~
andrepd
Mathematics can be taught by good teachers and bad teachers, in a very good
or a very bad way. It's an important and difficult issue: teaching
mathematics in a way that captures the attention and interest of school
children is a hard problem, not least because there isn't even a consensus on
what teaching mathematics the right way _is_.

However, I don't think in the slightest that throwing CS buzzwords such as
"REPL" and "hackable" at the problem is the way to go.

~~~
hueving
They aren't buzzwords, they are simple ideas and don't have to be referred to
with those terms if they are triggering you. The OP just wants interactive,
flexible systems to learn math from, which I wholeheartedly agree with.

~~~
andrepd
Okay, my bad, then. Can you tell me what exactly is meant by a "hackable" math
learning environment? And how will a REPL (however it is realized) make
learning mathematics better?

~~~
reitanqild
Excel, for all its warts, is hackable. Programming languages are hackable.

Fill in these boxes and get the solution is not hackable (unless the
programmer seriously failed at validating input).

~~~
jawbone3
The great tragedy of maths education is that people come out the other side
and think "fill in the box" is maths. No, that is doing sums, not living in
the world of mathematics.

Filling in the boxes is better compared to finding the typo in a complicated
regex, or the file with incorrect permissions on the web server.

~~~
hueving
"Fill in the box" is not referring to sums. It's referring to extremely rigid
homework where you fill out each step in a process to finding out a result
very precisely. It coincides with the horrific bubble tests where you select 1
of 5 answers and there is no partial credit.

This style of learning is what dominates US schools now from basic arithmetic
all of the way up through advanced calculus.

------
opposite_moron
I know many have experience with Gilbert Strang's Introduction to Linear
Algebra textbook and course
([http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/Syllabus/](http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/Syllabus/)),
but I thought it would be useful to mention them here. I've found his
explanations to be very intuitive.

~~~
craigching
Agreed, I bought the book and am doing the class as a "refresher". I put
refresher in quotes because there is stuff in Strang's class that I know we
didn't cover in my university class, notably SVD. Strang definitely has a way
of really helping you deeply understand the concepts. I remember the subject
pretty well but I don't think I understood it as well as I do now.

~~~
lotharbot
I don't know if Strang's approach would be good as an initial run through
Linear Algebra, but it's a great refresher/enhancer (I read through it in grad
school to prepare for prelims, back when I thought I was going to go for an
applied mathematics PhD.)

~~~
bosie
What would be?

~~~
jfaat
Maybe Khan academy?

[https://www.khanacademy.org/math/linear-algebra](https://www.khanacademy.org/math/linear-algebra)

~~~
tomadi
Khan Academy is great for people with no access to educational materials --
that's why it was created. But it isn't material developed by professional or
expert educators.

------
cousin_it
So, matrices are a convenient way to write down linear transformations. But
then the student might ask, why study linear transformations at all? Just
because they have nice properties? Why this particular set of nice properties,
and not some other set? As a rule of thumb, a good math explanation shouldn't
start with axioms and claim that they are "nice". First give some intuitive
examples, and only then say which axioms they satisfy.

For linear transformations, one possible avenue is to start with the notion of
_derivative_. If we take a real-valued function, its derivative at a
particular point is just a single number, which represents the function's rate
of change. But what if we have a function that accepts, say, two real numbers
and outputs three? It turns out that the natural generalization of
"derivative" to such functions is a rectangular array of numbers:

    
    
        dy1/dx1 dy1/dx2
        dy2/dx1 dy2/dx2
        dy3/dx1 dy3/dx2
    

If we know these numbers (and nothing else), we can linearly approximate the
values of a function near a particular point, with at most quadratic error.

Now let's say we have two functions. The first one takes two numbers and
outputs three, and the second one takes three numbers and outputs four. If we
compose them together, can we find the derivative of the composite from the
two simpler derivatives by some kind of chain rule, like the one we have for
ordinary real-valued functions? It turns out that yes, we can, if we replace
the product of two numbers with the product of two matrices (defined in a
particular way).
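
As an illustration of that multidimensional chain rule, here is a small
sketch in Python/numpy (the functions f and g are examples of my own
choosing), comparing the product of the two Jacobian matrices against a
finite-difference estimate of the composite's derivative:

    import numpy as np
    
    # f: R^2 -> R^3 and g: R^3 -> R^4, with their Jacobians written out.
    def f(x):
        return np.array([x[0] * x[1], x[0] ** 2, np.sin(x[1])])
    
    def Jf(x):
        return np.array([[x[1], x[0]],
                         [2 * x[0], 0.0],
                         [0.0, np.cos(x[1])]])
    
    def g(y):
        return np.array([y[0] + y[1], y[1] * y[2], y[0] ** 2, y[2]])
    
    def Jg(y):
        return np.array([[1.0, 1.0, 0.0],
                         [0.0, y[2], y[1]],
                         [2 * y[0], 0.0, 0.0],
                         [0.0, 0.0, 1.0]])
    
    x = np.array([1.0, 2.0])
    
    # Chain rule: the derivative of g(f(x)) is the product Jg(f(x)) @ Jf(x).
    chain = Jg(f(x)) @ Jf(x)
    
    # Finite-difference estimate of the composite's 4x2 Jacobian.
    eps = 1e-6
    fd = np.column_stack([
        (g(f(x + eps * e)) - g(f(x - eps * e))) / (2 * eps)
        for e in np.eye(2)
    ])
    
    print(np.allclose(chain, fd, atol=1e-5))  # True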

Now it's easy to explain what linear transformations are. They are just
multidimensional functions whose derivative (matrix) is the same at every
point. They are just like one-dimensional linear functions, whose derivative
(number) is the same at every point. (For convenience, people also say that
every linear transformation must take the point (0,0,...) to the point
(0,0,...), so that matrices correspond one-to-one to linear transformations
and vice versa.)
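
A quick sketch of that characterization, reusing the finite-difference idea
from above (the matrix A is an arbitrary choice of mine): for f(x) = Ax, the
estimated derivative is the same matrix A at every point:

    import numpy as np
    
    A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    f = lambda x: A @ x    # a linear map R^2 -> R^3, with f(0) = 0
    
    def jacobian(f, x, eps=1e-6):
        # Central-difference estimate, one column per input dimension.
        return np.column_stack([
            (f(x + eps * e) - f(x - eps * e)) / (2 * eps)
            for e in np.eye(len(x))
        ])
    
    print(np.allclose(jacobian(f, np.array([0.0, 0.0])), A))   # True
    print(np.allclose(jacobian(f, np.array([5.0, -3.0])), A))  # True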

If you want to work with linear transformations effortlessly, there's a lot
more intuition to develop, but this should serve for the basics.

~~~
tanderson92
> If we know these numbers (and nothing else), we can linearly approximate the
> values of a function near a particular point, with at most quadratic error.

Ignoring the fact that you need the function's values actually _at_ the
particular point in question, you do in fact need to know something else: you
need to know that the function in question has sufficient regularity (enough
smoothness) to allow an application of Taylor's theorem.

~~~
cousin_it
Yeah, I'm glossing over a bunch of things.

------
dahart
Matrices are axes: vectors, or points, side by side. It wasn't until
graduate school, after years of math and OpenGL, that what that meant really
sank in and I really got how and why matrix ops are made of dot products.

The article is right in my case: I didn't learn the intuitive understanding
at first, and it could have been taught that way. It is also well written,
for people who know math, but I also feel like the article describes math
using more math, and that the point could be better made with a picture or
two. There's something about just seeing the correspondence between matrix
rows, or columns, and the axes of a space, or transform, that finally helped
it all sink in for me.

~~~
valarauca1
3 years into a math degree and this never clicked until I watched Feynman's
QED lectures.

Everyone just tells you, "oh, now this can be a matrix." Nobody tells you
_why_ matrices are useful, or how we ended up with them. Just: _you know
matrices, use them_.

~~~
andrepd
The problem can be that most mathematicians maintain that mathematics can be
done for its own sake (something I 100% agree with). However interesting
mathematics is, though, there will be plenty of students more interested in
its applications to the real world (and there's nothing wrong with that), or
they may even have no interest in the mathematics until they see some clever
ways it can be used to solve or abstract a specific problem, at which point
they take a greater interest in the math itself.

As anecdotal evidence, most math teachers I had in college were at best
uninterested in and at worst disdainful of applying the math to the real
world beyond theorems on a blackboard. Most physics teachers, however, when
introducing us to new concepts, made an effort to show as soon as possible
how the definitions we made were inspired by real-world problems and in turn
simplified or helped create new physics. This made it much easier to
appreciate the pure math itself.

~~~
valarauca1
Explaining the pure mathematical roots, or the cross-field relations, isn't
always done. It drives me insane when you use cross-field tools and bring up,
_Oh, so we're doing X but with Y_.

The response is often, _No, we do X because of Z_, when you're still in the
middle of learning Z. So now you just feel lost and confused. It's not until
two classes later that you learn that A maps Y to Z, and that X is actually
an operation on A, so it applies to both.

I guess it's just me, but the joy of math has always been its tangled
relationships to itself.

------
liamconnell
Not to be a smartass, but some might find this point of view interesting:

Mathematics never pays attention to what objects ARE, but rather what they DO.

~~~
ayberkt
Can you elaborate on that a bit, or maybe give an example? It seems to me
that mathematics does pay attention to what things are, since mathematicians
often start their arguments by getting themselves and their readers to agree
on rigorous definitions of mathematical structures before they do anything
with them.

~~~
Retra
It's basically duck-typing. If you can do arithmetic on it, it's a number. If
you can do matrix operations, it's a matrix. If it satisfies the axioms of X,
it's an X.
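
In programming terms it's duck typing in the literal sense; a toy Python
sketch (function and names of my own invention): code written for "numbers"
accepts anything that supports the arithmetic it performs:

    from fractions import Fraction
    
    def average(xs):
        # Written with "numbers" in mind, but works for anything that
        # supports + with its own kind and / by an int.
        total = xs[0]
        for x in xs[1:]:
            total = total + x
        return total / len(xs)
    
    # Floats, Fractions, and complex numbers all quack like numbers:
    print(average([1.0, 2.0, 3.0]))                   # 2.0
    print(average([Fraction(1, 3), Fraction(2, 3)]))  # 1/2
    print(average([1 + 2j, 3 - 2j]))                  # (2+0j)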

~~~
AstralStorm
Not true. You can do arithmetic on many things that are not numbers. This is
why objects such as "rings" exist.

A matrix is not defined just by operations, the way a ring is, but also by
structure: you have N independent axes in a specified space.

~~~
verbin217
That's a bit pedantic but sure. Here it is fixed:

> If you can do _numeric_ arithmetic on it, it's a number.

~~~
tomadi
That is begging the question. What does "numeric" mean?

------
edtechdev
See also

[http://betterexplained.com/articles/linear-algebra-guide/](http://betterexplained.com/articles/linear-algebra-guide/)

But to me, it's better to start with a context, a purpose for using matrices &
linear algebra first, and learn what and how to use matrices in that context.
The contexts that helped me included 3D graphics/games and later circuit
simulations.

------
craigching
Bengio's deep learning book has a nice linear algebra refresher if that's what
you need:
[http://www.iro.umontreal.ca/~bengioy/dlbook/](http://www.iro.umontreal.ca/~bengioy/dlbook/)

------
nimrody
Gilbert Strang's "The Fundamental Theorem of Linear Algebra" is very well
written and expands on the topic of matrices as mappings.

------
mxfh
In general I find it helpful, in understanding mathematical concepts and
notations, to learn about their history as well: what sort of problems they
were meant to solve in the first place, and what methods were used before
them.

[http://www-groups.dcs.st-andrews.ac.uk/~history/HistTopics/Matrices_and_determinants.html](http://www-groups.dcs.st-andrews.ac.uk/~history/HistTopics/Matrices_and_determinants.html)

------
z3t4
As a programmer I see matrices as an abstraction layer. Without them,
formulas for 3D calculations get very long and error-prone.

~~~
zornthewise
From that point of view, the interesting thing would then be why this
particular abstraction layer works so well, which is what linear algebra
answers.

This is in some sense the process all math students go through. The formulas
for computing determinants and multiplying matrices look really complicated,
and it feels like a mystery why they work at all, but then linear algebra
explains all of that slowly.

------
trampi
Great post. I will definitely share this when someone has trouble
understanding what matrices are used for.

------
octatoan
Everyone in this thread mourning how nobody really learns that symbol-pushing
isn't math (it isn't!) needs to look at Lockhart's Lament.

[https://www.maa.org/external_archive/devlin/LockhartsLament.pdf](https://www.maa.org/external_archive/devlin/LockhartsLament.pdf)

------
snake117
I haven't done anything linear algebra related since my high school Algebra II
course. This was really simple for me to follow, so thanks for posting. It's a
shame that the authors don't update it; they have some great content and I
would've definitely followed it.

------
pmalynin
Well, what you see as matrices (that is, the array of numbers) is underneath
a function f: N x N -> F (where F is a field of your choice, or a ring if you
so desire). So it's a function that takes two natural numbers (i.e. column
and row) and outputs a number.
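
A tiny sketch of that view in Python (toy code of mine): a matrix as a
function from index pairs to numbers, with matrix addition falling out as
pointwise addition of functions:

    # The 2x2 matrix [[1, 2], [3, 4]] viewed as a function of (i, j):
    def m(i, j):
        return [[1, 2], [3, 4]][i][j]
    
    # Matrix addition is then just pointwise addition of functions:
    def add(f, g):
        return lambda i, j: f(i, j) + g(i, j)
    
    s = add(m, m)
    print([[s(i, j) for j in range(2)] for i in range(2)])  # [[2, 4], [6, 8]]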

------
ivan_ah
What is more interesting to me is that linear functions \mathbb{R}^n -->
\mathbb{R}^m turn out to be useful when applied in many, many different
problem areas. Call this "the unreasonable effectiveness of linear operators"
if you will.

~~~
wetmore
Linear operators are useful in many situations because:

(1) They are "nice" operators which carry properties most functions can only
dream of having: f(v + w) = f(v) + f(w), and f(a·v) = a·f(v). From these
properties (along with the properties of vectors) you can develop a very rich
theory. These are the sorts of properties that we want all of our functions
to have when we are young and first learning algebra: how many prealgebra
students wish they could simplify (a + b)^2 to a^2 + b^2?

(2) A linear approximation is the first useful approximation for most
behaviours, and at a small enough scale almost anything looks linear. Once
you've made this approximation, you get to exploit the properties of (1).
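
A quick numpy check of the properties in (1) for a matrix map (the matrix
and vectors are arbitrary choices of mine), which is exactly the
simplification those prealgebra students wish for:

    import numpy as np
    
    A = np.array([[2.0, 0.0], [1.0, 3.0]])
    f = lambda v: A @ v
    
    v = np.array([1.0, 2.0])
    w = np.array([-3.0, 0.5])
    
    # f(v + w) == f(v) + f(w), and f(a*v) == a*f(v):
    print(np.allclose(f(v + w), f(v) + f(w)))   # True
    print(np.allclose(f(5.0 * v), 5.0 * f(v)))  # True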

------
elektromekatron
I learnt this from looking at code for old 3D engines, where you have to
first write your own matrix functions, but then also unroll them to optimise
things like rotations around a single axis.
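
For example, a rotation about the z axis is a full 3x3 matrix multiply in
the general path, but unrolled it only touches x and y. A minimal sketch in
Python (not from any particular engine):

    import math
    
    def rotate_z(p, t):
        # Unrolled z-axis rotation: the 3x3 matrix multiply collapses
        # to four multiplies and two adds, and z passes through.
        c, s = math.cos(t), math.sin(t)
        x, y, z = p
        return (c * x - s * y, s * x + c * y, z)
    
    print(rotate_z((1.0, 0.0, 5.0), math.pi / 2))  # ~(0.0, 1.0, 5.0)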

