There is also a theorem due to H. Hopf, stating that any finite-dimensional commutative division algebra over R (even a non-associative one) has dimension at most 2, so you never get anything bigger than C or R itself, but the only proof I know requires pretty heavy topological machinery.
See "The Brachistochrone Problem and Modern Control Theory" (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.26....)
The TL;DR is that classical mechanics can be formulated as finding a time history of position and velocity which minimizes a certain function of them both. Hamiltonian mechanics essentially optimizes the function over position and velocity as if position and velocity were independent, then adds the constraint that velocity be the time derivative of position via a Lagrange multiplier. This Lagrange multiplier is the momentum.
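A rough sketch of that construction, in notation I am assuming here rather than taking from the linked paper: q is position, v is velocity, L(q, v, t) is the function being integrated, and p(t) is the multiplier. Strictly speaking the integral is made stationary rather than minimized.

```latex
% Treat q and v as independent and enforce \dot q = v with a multiplier p(t):
\[
  S[q, v, p] = \int_{t_0}^{t_1} \Big[ L(q, v, t) + p\,(\dot q - v) \Big]\, dt
\]
% Stationarity with respect to v identifies the multiplier with the momentum:
\[
  \frac{\partial L}{\partial v} - p = 0
  \quad\Longrightarrow\quad
  p = \frac{\partial L}{\partial v}
\]
% Define H(q, p, t) = p\,v - L(q, v, t), with v eliminated using the relation above.
% The integrand then becomes p\,\dot q - H:
\[
  S[q, p] = \int_{t_0}^{t_1} \Big[ p\,\dot q - H(q, p, t) \Big]\, dt
\]
% Stationarity with respect to p and q (integrating p\,\dot q by parts, endpoints fixed)
% gives Hamilton's equations:
\[
  \dot q = \frac{\partial H}{\partial p},
  \qquad
  \dot p = -\frac{\partial H}{\partial q}
\]
```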
Of course, no one in a physics course actually learns it that way; undergraduates just memorize some equations and graduate students, if they are lucky, learn it from the Hamilton-Jacobi equation (which is how Hamilton developed it in the first place) where the momenta become the spatial gradient of the function to be minimized when evaluated at all positions. (It's exactly the same as the Bellman return function in dynamic programming; hence the name "Hamilton-Jacobi-Bellman equation.")
IMHO, neither of these other developments is as intuitive as the Lagrange multiplier one; but because Hamilton's original formulation obscures it, no one in physics learns it that way.
A fun postscript: when you consider the equations of classical mechanics in terms of positions and momenta (this combination is what physicists call "phase space"), the phase space forms a manifold with a special property called "symplectic"; symplectic geometry can be formulated in terms of matrices over the quaternions. According to some people, this idea was a big breakthrough, but I don't believe Hamilton himself, despite having formulated both, ever noticed.
I'm pretty sure learning a bit about Lagrange multipliers is standard for a grad classical mechanics class. Granted, I've forgotten all I learned back then, b/c it has never come up in anything else I do, but it was definitely part of the course.
The interpretation of the momenta themselves as Lagrange multipliers is completely nonstandard, though. Try googling "hamiltonian mechanics lagrange multipliers"; you'll get either examples like the sphere thing or hits from optimal control tutorials.
In his original paper deriving EPR ( http://en.wikipedia.org/wiki/EPR_paradox ), he believed it was a reductio ad absurdum which invalidated configuration-space based quantum mechanical theories:
We are thus forced to conclude that the quantum mechanical description of physical reality given by wavefunctions is not complete [...] No reasonable definition of reality could be expected to permit this.
Nevertheless, he thought a local and complete theory of QM was possible and spent many years searching. Bell showed this to be impossible (after Einstein's death). Eventually experiments showed the predictions of the EPR paper to hold, thereby implying that the definition of reality is not "reasonable".
If a scientist never gets things wrong, they aren't doing anything interesting.
Geometric Algebra for Physicists is very clear and gently paced. http://www.amazon.com/Geometric-Algebra-Physicists-Chris-Dor...
This one also looks good: http://www.amazon.com/Linear-Geometric-Algebra-Alan-Macdonal...
Complex numbers are a powerful tool for studying the two-dimensional plane: each point corresponds to a unique complex number. The beauty of this correspondence is that it allows you to add, subtract, and multiply points in the plane.
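A small sketch of that correspondence using Python's built-in complex type. The points and the `rotate` helper are made up for illustration; the only assumption is the usual identification of the point (x, y) with x + yi.

```python
import cmath
import math

# Identify the point (x, y) with the complex number x + yi.
p = complex(3, 4)    # the point (3, 4)
q = complex(1, -2)   # the point (1, -2)

total = p + q        # componentwise addition, like vectors: (4, 2)
diff = p - q         # componentwise subtraction: (2, 6)
rotated = p * 1j     # multiplying by i rotates 90 degrees counterclockwise: (-4, 3)
length = abs(p)      # distance from the origin: 5.0


def rotate(point: complex, angle: float) -> complex:
    """Rotate a point about the origin by `angle` radians (hypothetical helper)."""
    return point * cmath.exp(1j * angle)


# Rotating by pi/2 agrees with multiplying by i, up to floating-point rounding.
quarter_turn = rotate(p, math.pi / 2)
```

Multiplying two points multiplies their distances from the origin and adds their angles, which is why a unit complex number acts as a pure rotation.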
When the professor mentioned that I could write all the vectors as complex numbers, and do regular old algebra with them, my brain almost melted.