
Math Symbols Explained with Python - amitness
https://amitness.com/2019/08/math-for-programmers/
======
exmadscientist
This is not a very good list. Several items are between "misleading" and
"wrong":

1\. "Pipe" should be called "bar" (IMO)

2\. "Vector norm" is a much deeper idea than just "2-norm" and if you limit
yourself to the 2-norm, you will run into problems

3\. "Set membership" is not called "epsilon"!

4\. I wouldn't describe functions as operating on "pools", but that's just me.
("Pools" seems to imply vector arguments, or at least set arguments, which is
not how functions are usually thought of operating, though they are defined
that way.)

5\. The description of R^2 as "2-D array" is wrong.

6\. I've never seen that notation for elementwise multiplication, though that
doesn't mean much.

7\. I've rarely seen that notation for dot products. Usually the center dot is
much more common.

8\. Hat is... nonstandard. It has many, many meanings.
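
To make item 2 concrete, here is a minimal pure-Python sketch of the general p-norm, of which the 2-norm is only one case. The function name `p_norm` and the sample vector are illustrative choices, not taken from the article.

```python
import math

def p_norm(x, p=2):
    """Return the p-norm of a sequence of numbers (p >= 1, or math.inf)."""
    if p == math.inf:
        # Infinity norm: the largest absolute component.
        return max(abs(v) for v in x)
    if p < 1:
        raise ValueError("p must be >= 1 for this to be a norm")
    return sum(abs(v) ** p for v in x) ** (1 / p)

x = [3, -4]
print(p_norm(x, 1))         # 1-norm: sum of absolute values
print(p_norm(x, 2))         # 2-norm: Euclidean length
print(p_norm(x, math.inf))  # infinity norm: largest magnitude
```

The same vector gets three different "lengths", which is why a bare ||x|| in a paper needs its context checked before it is translated into code.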

~~~
justinmeiners
Good points. I know you know, but it's also worth mentioning that most math
symbols don't have a definite meaning. Meaning is defined in the context it is
presented, in whatever way the author feels makes the most sense. Math is a
natural, not a formal language.

~~~
carapace
The messed up thing is that the math notation of _computer science_ is in the
same mess. It doesn't even have a name.

(There's a talk about it but I can't find the link right now.)

~~~
justinmeiners
This isn't a mess, this is a good thing. Math is a way of talking about ideas,
not a formal system for computers.

~~~
carapace
So what's the difference in your view between math and philosophy?

~~~
justinmeiners
That's a little complicated as math is intertwined with science. But at the
core I only see them as a difference in subject matter. Math is a
philosophical study of space, logical structure, quantity, etc.

Proofs are just tools for articulating arguments about these subjects.
Deduction just tends to work better in math, so the methods are more refined
and agreed upon.

------
reikonomusha
It is faulty to think of the math notation and code being equivalent. The math
states something much more general, and often much deeper, than some mundane
for-loop. I’d have titled this “Python Recipes to Calculate Values of Simple
Vector Math Expressions”.

Often times the symbols may not even be concrete numbers. Often the “real”
number may not even be representable on a computer. Often a symbol _does_
represent a concrete number but indirectly defined by some set of rules (e.g.,
x is the smallest eigenvalue of M; or y is the greatest value less than the
supremum of f) that may require a whole library of code to calculate.
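
As a rough illustration of an indirectly defined symbol, the sketch below computes "x is the smallest eigenvalue of M" for the easiest possible case, a real symmetric 2x2 matrix, using the closed form of the characteristic polynomial. The function name and the sample matrix are assumptions made for this example; anything larger or non-symmetric would realistically need a numerical library routine.

```python
import math

def smallest_eigenvalue_sym2(a, b, d):
    """Smallest eigenvalue of the symmetric 2x2 matrix [[a, b], [b, d]].

    Solves lambda^2 - (a + d) * lambda + (a * d - b * b) = 0 in closed form.
    """
    trace = a + d
    # Discriminant trace^2 - 4 * det simplifies to (a - d)^2 + 4 b^2,
    # which is never negative for a real symmetric matrix.
    disc = (a - d) ** 2 + 4 * b * b
    return (trace - math.sqrt(disc)) / 2

# M = [[2, 1], [1, 2]] has eigenvalues 1 and 3.
x = smallest_eigenvalue_sym2(2, 1, 2)
print(x)
```

Even here the one-line definition turns into a derivation plus code, and the closed form does not generalize past small matrices.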

The mathematical notation is “meant” to be flexible and manipulated, not to be
interpreted (necessarily) as a rigid computational process. It should also be
noted that while some notation comes from convention, a lot of it is
improvised and contextual!

~~~
spinningslate
Well, yes. Sort of.

How many people will be coming to the maths of machine learning but _not_
expecting to implement it in a programming language?

Not denying your point: the maths is fundamentally broader. But the article
doesn't refute that. The opening sentence makes that pretty clear:

> When working with Machine Learning projects, you will come across a wide
> variety of equations that you need to implement in code.

~~~
reikonomusha
I claim that even in the math of machine learning, many of the symbols
presented do not mechanically translate as such. Other commenters have
discussed differing notational conventions, but even to assume that x_i is
indexing some real numbers in memory is often too much.

I’m not saying there isn’t any value showing somebody how an elementary
summation corresponds to a for-loop. I am saying it’s a lot more than that,
though.

------
lonelappde
I do wonder, are there people writing ML programs in Python without learning
basic math notation?

It's hard to imagine someone learning important fundamental math concepts without
using math notation, so anyone who benefits from this article should probably
take a detour through a math book before continuing their Python ML, or else
risk having beautiful code that implements nonsensical math.

~~~
tmh88j
>so anyone who benefits from this article should probably take a detour
through a math book before continuing their Python ML, or else risk having
beautiful code that implements nonsensical math.

That seems akin to suggesting someone should learn music theory before ever
attempting to pluck a guitar string. I personally learn best by putting a
subject to practice and figuring out the nuances by experimenting. No one
expects beginners in any subject to be capable of producing quality work,
whether programming or learning to cook.

~~~
vecter
That's a very bad analogy. People can become great musicians without ever
learning music theory, since being a "musician" is such a wide category. You
can be an awesome singer and not know what a diminished seventh is. I learned
how to play Moonlight Sonata, 3rd Movement in high school [0] (not that that
makes me good, just low-intermediate at best). I know practically nothing
about music theory except how to read sheet music.

On the other hand, you can never become a great, or good, or even mediocre ML
engineer without understanding the math. If anything, in a production business
environment, if you are responsible for creating the model yourself, you'd
just be doing more harm than good. Yes, you'd be able to use pre-baked
frameworks to run data through an algorithm or model that you have no idea of
what it's doing or why and get some numbers and graphs out, but you won't be
doing actual science or modeling. You might know how to use Python to do
something as simple as run a linear regression, which if you don't know the
math, you'd just have a vague understanding of it "finding the best fit line",
whatever that means. However, you wouldn't understand that (under the L^2
norm) it's minimizing the sum of squared errors, the properties of that (BLUE
[1]), whether you'd want to apply regularization [2], statistical tests of
whether a linear fit is even applicable, the importance of outliers due to
their overweighting under the L^2 norm, etc.
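
For readers who want to see what "minimizing the sum of squared errors" means in code, here is a small pure-Python sketch of the closed-form least-squares line fit. The function names and the data are made up for illustration; a real project would more likely call a library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept.

    Returns the (slope, intercept) that minimize the sum of squared errors.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sse(xs, ys, slope, intercept):
    """Sum of squared errors of the line on the data."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]            # lies exactly on y = 2x + 1
print(fit_line(xs, ys))

ys_outlier = [1, 3, 5, 7, 30]   # same data with one corrupted point
print(fit_line(xs, ys_outlier))
```

Because the errors are squared, the single corrupted point pulls the fitted slope well away from 2, which is the outlier overweighting mentioned above.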

If I were responsible for a modeling or prediction project, I would never
trust it to a software engineer that didn't understand the math of what's
actually going on.

> so anyone who benefits from this article should probably take a detour
> through a math book before continuing their Python ML, or else risk having
> beautiful code that implements nonsensical math.

I couldn't agree with the GP more. That is 100% spot on.

[0]
[https://www.youtube.com/watch?v=zucBfXpCA6s](https://www.youtube.com/watch?v=zucBfXpCA6s)

[1]
[https://en.wikipedia.org/wiki/Gauss%E2%80%93Markov_theorem](https://en.wikipedia.org/wiki/Gauss%E2%80%93Markov_theorem)

[2]
[https://en.wikipedia.org/wiki/Tikhonov_regularization](https://en.wikipedia.org/wiki/Tikhonov_regularization)

~~~
tmh88j
>That's a very bad analogy. People can become great musicians without ever
learning music theory, since being a "musician" is such a wide category

I disagree. You're equating being good at something with being a world-class
expert. You can become a great Python developer without ever knowing how the
language implements a dictionary. You can't expect to remove the black box
aspect from everything you use; that's just impossible.

~~~
vecter
We're not talking about being a great Python developer. We're talking about
being a great applied ML practitioner. You simply cannot be good at that
without having a firm grasp of the underlying math.

~~~
tmh88j
This was about how the content is learned, not what content is learned.

>so anyone who benefits from this article should probably take a detour
through a math book before continuing their Python ML, or else risk having
beautiful code that implements nonsensical math.

My response to that comment was you can learn both at the same time. I
preferred to learn music theory while also learning how to play the guitar. I
also preferred to learn mathematics subjects while applying them, such as
machine learning.

------
canjobear
The article explains \hat{} as meaning a vector of unit length but in my
experience in ML it nearly always designates an estimate (including in the
first expression in this article).

~~~
newen
Lol yeah. You are probably best off thinking of x-hat as a completely
different letter to x, where x-hat is useful because it looks like x, and so
you can talk about an object x-hat that has some relation to the object x
without losing track of which objects you are talking about.

x-dot and x-tilde are also in the same family and I'm sure there are lots of
other symbols that people have put on top of x too.

------
boublepop
It’s interesting where the “technical limitations” set in on computers. We
can’t have symbolic math accurately represented because “how would you even do
that with keyboard input, and how would you encode it?!” Yet we have poop
emojis because that’s an absolute essential part of written communication
these days.

------
Davidbrcz
I'm always in awe when such basic maths makes the front page of Hacker News.

~~~
lawik
I'm delighted. I've programmed since my teens but I never really enjoyed math
and have mostly picked up what I need when I really need to. I think more in
the ways of programming than in math, so this is a good way of helping me with
the syntax.

People come here from many angles. Mine was largely free of math. I gather
yours wasn't.

------
bitforger
I see many comments criticizing the post for only implementing simple
mathematics. This is true.

However, the approach of understanding math through code is still very
helpful, I think. Personally, implementing things that are fuzzy
mathematically provides immense clarity once I write them down.

For example, a simple Monty Hall simulator[1]. Or implementing matrix
multiplication multiple ways to understand why each is equivalent[2], and why
multiplying A(BC) can sometimes be faster than (AB)C[3].
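
The A(BC) versus (AB)C point can be checked without multiplying anything, just by counting scalar multiplications for the schoolbook algorithm. This is a sketch with hypothetical shapes picked to make the gap obvious; it is not the code from the linked repository.

```python
def matmul_cost(shape_a, shape_b):
    """Scalar multiplications for the schoolbook product of an (m x n)
    matrix and an (n x p) matrix, together with the result shape."""
    m, n = shape_a
    n2, p = shape_b
    if n != n2:
        raise ValueError("inner dimensions must match")
    return m * n * p, (m, p)

# Hypothetical shapes chosen for illustration.
A, B, C = (1000, 10), (10, 1000), (1000, 1)

cost_ab, shape_ab = matmul_cost(A, B)
cost_ab_c, _ = matmul_cost(shape_ab, C)
left_first = cost_ab + cost_ab_c      # (AB)C

cost_bc, shape_bc = matmul_cost(B, C)
cost_a_bc, _ = matmul_cost(A, shape_bc)
right_first = cost_bc + cost_a_bc     # A(BC)

print(left_first, right_first)
```

Both orders give the same matrix, since multiplication is associative, but (AB)C builds a 1000 x 1000 intermediate while A(BC) only ever builds a 10 x 1 one.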

I am not sure why this helps me. It may be because I was "raised" as a coder,
and so that is how my brain works. But I also think that implementing
something in code is very close to constructivist mathematics, in spirit. You
cannot prove anything if you cannot construct (/implement) it.

[1]
[https://github.com/mitchellgordon95/implementing-paradoxes/b...](https://github.com/mitchellgordon95/implementing-paradoxes/blob/master/monty_hall.py)

[2]
[https://github.com/mitchellgordon95/implementing-paradoxes/b...](https://github.com/mitchellgordon95/implementing-paradoxes/blob/master/matmul.py)

[3]
[https://github.com/mitchellgordon95/implementing-paradoxes/b...](https://github.com/mitchellgordon95/implementing-paradoxes/blob/master/matmul_assoc.py)

------
carapace
If you liked this...

[https://mitpress.mit.edu/sites/default/files/titles/content/...](https://mitpress.mit.edu/sites/default/files/titles/content/sicm_edition_2/book.html)

"Structure and Interpretation of Classical Mechanics" by Gerald Jay Sussman
and Jack Wisdom

> There has been a remarkable revival of interest in classical mechanics in
> recent years. We now know that there is much more to classical mechanics
> than previously suspected. The behavior of classical systems is surprisingly
> rich; derivation of the equations of motion, the focus of traditional
> presentations of mechanics, is just the beginning. Classical systems display
> a complicated array of phenomena such as nonlinear resonances, chaotic
> behavior, and transitions to chaos.

> Traditional treatments of mechanics concentrate most of their effort on the
> extremely small class of symbolically tractable dynamical systems. We
> concentrate on developing general methods for studying the behavior of
> systems, whether or not they have a symbolic solution. Typical systems
> exhibit behavior that is qualitatively different from the solvable systems
> and surprisingly complicated. We focus on the phenomena of motion, and we
> make extensive use of computer simulation to explore this motion.

They are basically making the computer do the work, with emphasis on
unambiguous, computable notation.

(It could be considered a companion to SICP.)

------
westurner
Average of a _finite_ series: There's a statistics module in Python 3.4+
(`fmean` was added in 3.8):

      from statistics import mean, fmean

      X = [1, 2, 3]

      mean(X)
      fmean(X)  # always returns a float (Python 3.8+)

      # may or may not be preferable to
      sum(X) / len(X)
    

[https://docs.python.org/3/library/statistics.html#statistics...](https://docs.python.org/3/library/statistics.html#statistics.fmean)

Product of a terminating iterable:

    
    
      import operator
      from functools import reduce
      # from itertools import accumulate
      reduce(operator.mul, X)
    

Vector norm:

    
    
      from numpy import linalg as LA
      LA.norm(X)
    

[https://docs.scipy.org/doc/numpy/reference/generated/numpy.l...](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.norm.html)

Function domains and ranges can be specified with type annotations and checked
statically (e.g., with mypy), checked at runtime with type()/isinstance(), or
enforced with something like PyContracts or icontract for preconditions and
postconditions.
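
A minimal sketch of the runtime option, using only the standard library: the annotations document the intended domain and range, and explicit checks enforce them. The function `safe_sqrt` is a made-up example. Note that the Python interpreter does not enforce annotations by itself.

```python
import math

def safe_sqrt(x: float) -> float:
    """Square root restricted to the domain x >= 0."""
    # Precondition: x must be a real number (bool is excluded on purpose).
    if isinstance(x, bool) or not isinstance(x, (int, float)):
        raise TypeError("x must be a real number")
    # Precondition: x must lie in the function's domain.
    if x < 0:
        raise ValueError("x must be non-negative")
    result = math.sqrt(x)
    # Postcondition: the result lies in the function's range.
    assert result >= 0
    return result

print(safe_sqrt(9))
```

A contracts library would express the same pre- and postconditions declaratively, typically as decorators, rather than as hand-written checks in the function body.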

Dot product:

    
    
      import numpy as np

      Y = [4, 5, 6]
      np.dot(X, Y)
    

[https://docs.scipy.org/doc/numpy/reference/generated/numpy.d...](https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html)

Unit vector:

    
    
      np.asarray(X) / np.linalg.norm(X)  # make the array conversion explicit

------
kstrauser
Seeing `for i in range(len(lst)): lst[i]...` gives me hives. That's cool if
you're wanting to be super explicit about how the indexing works, but in this
page it goes on to say "or you can just write sum(lst)" without worrying about
the indexing.

I would write the explanations like:

    
    
      result = 1
      x = [1, 2, 3, 4, 5]
      for number in x:
        result = result * number
      print(result)
    

which in my opinion is much closer to the way a mathematician would think
about the process.

------
jpeanuts
Maths notation can be wonderfully concise and precise, so it is worth thinking
about following it closely when programming. One of my favorite examples of
this is the numpy `einsum` call [1]. It implements Einstein summation
convention [2] - thereby making working with the many dimensions of high-rank
tensors feasible.

E.g. this (LaTeX):

$C_{ml} = A_{ijkl} B_{ijkm}$

becomes (in Python):

C = einsum('ijkl,ijkm->ml', A, B)
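
To show what that one line stands for, here is the same contraction written as explicit loops over nested lists. This is a pure-Python sketch for illustration (the name `contract` and the tiny inputs are assumptions); with NumPy arrays of matching shapes, `np.einsum('ijkl,ijkm->ml', A, B)` should produce the same numbers.

```python
def contract(A, B):
    """Compute C[m][l] = sum over i, j, k of A[i][j][k][l] * B[i][j][k][m].

    A and B are nested lists whose first three dimensions agree.
    """
    I, J, K = len(A), len(A[0]), len(A[0][0])
    L = len(A[0][0][0])
    M = len(B[0][0][0])
    C = [[0] * L for _ in range(M)]
    # Repeated indices (i, j, k) are summed over; free indices (m, l) remain.
    for i in range(I):
        for j in range(J):
            for k in range(K):
                for l in range(L):
                    for m in range(M):
                        C[m][l] += A[i][j][k][l] * B[i][j][k][m]
    return C

# Tiny hand-checkable case with I = J = 1, K = 2, L = 2, M = 1.
A = [[[[1, 2], [3, 4]]]]
B = [[[[5], [6]]]]
print(contract(A, B))
```

Five nested loops collapse into one index string, which is the conciseness being praised here.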

[1]
[https://docs.scipy.org/doc/numpy/reference/generated/numpy.e...](https://docs.scipy.org/doc/numpy/reference/generated/numpy.einsum.html)
[2]
[https://en.wikipedia.org/wiki/Einstein_notation](https://en.wikipedia.org/wiki/Einstein_notation)

------
jay754
Solid post. I think someone who's just starting out with ML or even basic
mathematical notation would find this very helpful, especially someone who
knows programming but struggles with mathematical jargon.

------
lonelappde
For arrays, the math uses ordinal convention (1-based) but the Python uses
offset convention (0-based). That's OK, but the article should explicitly
mention that to avoid confusion.
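
A small sketch of the index shift, mirroring the average formula (1/N) * sum_{i=1}^{N} x_i as literally as possible. The function name and the sample list are illustrative only.

```python
def average(x):
    """Mean of x, written to mirror (1/N) * sum_{i=1}^{N} x_i."""
    N = len(x)
    total = 0
    # The math index i runs over 1..N inclusive ...
    for i in range(1, N + 1):
        # ... so the i-th element x_i lives at offset i - 1 in the list.
        total += x[i - 1]
    return total / N

x = [10, 20, 30]
print(average(x))
```

In everyday code one would write `sum(x) / len(x)`; the explicit `i - 1` is only there to make the ordinal-to-offset translation visible.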

------
stared
Shameless plug, but I worked for some time on "Thinking in Tensors, Writing in
PyTorch" pretty much focusing on turning LaTeX math into executable code.

E.g. Gradient Descent: formula, code, and visualization:

[https://github.com/stared/thinking-in-tensors-writing-in-pyt...](https://github.com/stared/thinking-in-tensors-writing-in-pytorch/blob/master/2%20Gradient%20Descent.ipynb)

------
gugagore
I think it is confusing not to address that math notation often (including the
one used here) uses 1-based indexing, while Python uses 0-based indexing.

------
fatso784
Given that FORTRAN was designed as basically rote translations from discrete
math notation, it is deeply ironic that we now need to explain things in the
opposite direction.

------
m4r35n357
TFA doesn't even mention what it is _really_ using (numpy); virtually none of
this is pure Python.

------
hwc
`math.sqrt(x[0]**2 + x[1]**2 + x[2]**2)`

can be simplified to:

`math.sqrt(sum(v**2 for v in x))`

------
_ZeD_
>>> Average: $$\frac{1}{N}\sum_{i=1}^N x_i$$

Since when does 1/N mean "len(n)"?

~~~
lonelappde
The article says N = len(x)

