Problems of Traditional Math Notation (2004) (xahlee.info)
67 points by mindcrime on Nov 5, 2017 | 69 comments

> Traditional math notation has a lot problems. Inconsistencies and ambiguities. It lacks a grammar.

It doesn't require a grammar, because it is interpreted by humans, not parsed by a computer.

Ambiguity is a major problem in computer languages. It is not so in human languages—human languages are riddled with ambiguity from start to finish. But they've also been developed over a period of hundreds of years to assist human communication.

Math is written for humans to read. Programs are written for computers to read. What's good for one, is very bad for the other. Don't confuse the two—computers are very precise, and very stupid. Humans are very smart up to a point, but no further. Their needs are very different.

I agree. I find that it's easier to read and write math when the notation is ambiguous, but clear in a context shared by both the reader and writer. Even in the English language, it's up to the author to gauge the audience and decide how ambiguous they want to make their sentence. If you want to be fully precise you end up with sentences like "for every epsilon greater than zero there exists a delta greater than zero such that for all x if the difference between x and y is less than delta then the difference between f(x) and f(y) is less than epsilon".
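For comparison, the same sentence in symbols, as a sketch that reads "difference" as absolute difference (that reading is an assumption) and states continuity of f at the point y:

```latex
\[
  \forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x :\quad
  |x - y| < \delta \;\implies\; |f(x) - f(y)| < \varepsilon
\]
```

The symbolic form is shorter, but it still leans on context: the domain of x and the meaning of the bars are not stated in the formula itself.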

The author is prematurely optimizing math - is Leibniz notation _really_ going to cause problems, or ambiguity with the meaning of parentheses? Out of all the things that cause me difficulties with math, notation is the least significant. If notation _is_ a problem, then that means that I don't know what I'm doing.

If you really need to be specific, there are plenty of ways to do so using the English language.

> Math is written for humans to read. Programs are written for computers to read. What's good for one, is very bad for the other.

If programs were written for computers to read, we would still be writing them in machine language. Quite the contrary - programming language design is to a large degree a human-interface problem, because programs are primarily written to be read by humans: our teammates, our successors, even ourselves, coming back to the project after working on something else.

The existence of Brainfuck demonstrates that computers are indifferent to the qualities which make a programming language useful or legible to humans.

The irony in your comment that "math is written for humans to read" is that I find traditional math notation to be practically illegible. I typically cannot make any sense out of the pseudocode in a CS paper until it has been translated out of traditional math notation into some more familiar, programming-language-style notation.

To employ a metaphor, Brainfuck is so close to a Turing machine, it is the computer. At least, a general purpose computing machine without a program is not a complete computer. On the upside, the machine definition is perfectly readable, because it's so simple.

> I typically cannot make any sense

The operative word here is "I"

And computer languages are obviously designed to be understandable by both humans and computers, but it is still the human that has to bend to the ways of the computer.

> but it is still the human that has to bend to the ways of the ...

... elder programmers. who would be called computers in olde times.

The real problem is symbolism, though it's necessary, because only in symbols can we express a wrong statement, while on the other hand you can't take an apple and another apple and have three in the end.

Programs are written for computers AND humans, and much ado goes into optimizing programming languages for human consumption. Interestingly, I don't think they are tending toward something that looks like math notation.

Math definitely has a grammar. In upper level classes and in papers mathematicians even use correct punctuation.

It's all very much just an extension of human language, just like you said.

Traditionally, mathematicians have a pointlessly hard time when manipulating higher-order functions. This goes from inventing new names for higher-order functions (e.g., "functional", "transformation"), to constantly using new notations for application (F(g) becomes F[g], becomes F{g}, becomes \int g(x) dx, etc.), to leaving out the binders.

The latter is crazy. Consider the expression "F{e^{-t^2-y^2}}", which might stand for the Fourier transform of a 2D Gaussian, or it might be the Fourier transform in t, with a parameter y, or it might just be the Fourier transform of a constant function, or... It is only really defined in the surrounding text. The notation is incomplete, and while in this example that might not be such a large problem, it just gets worse as you pile on complexity. All integral transforms are written like this, as are expected value, variance and so on.
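To make the missing-binder point concrete, here is a minimal Python sketch in which the binder is always explicit. It uses plain integration over [-10, 10] as a stand-in for an integral transform. The names `integrate`, `gauss_in_t`, and `constant_in_s` are hypothetical, chosen only for illustration.

```python
import math

def integrate(f, lo, hi, n=2000):
    """Midpoint-rule integral of a one-argument function f over [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

def gauss_in_t(y):
    """Read e^{-t^2-y^2} as a function of t, with y as a fixed parameter."""
    return lambda t: math.exp(-t**2 - y**2)

def constant_in_s(t, y):
    """Read e^{-t^2-y^2} as a constant function of some other variable s."""
    return lambda s: math.exp(-t**2 - y**2)

# Same written expression, two different functions, two different results.
as_function_of_t = integrate(gauss_in_t(0.0), -10.0, 10.0)        # about sqrt(pi)
as_constant = integrate(constant_in_s(0.0, 0.0), -10.0, 10.0)     # about 20
```

Once the lambda names the bound variable, the operator no longer depends on surrounding text to say which reading is meant.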

A lot of times notation in math is chosen to be suggestive. For example, the integral and sum notations are actually pretty neat, since they make common manipulations more visually clear. In particular, since the order of integration doesn't matter, putting the binder inside the integral as a "factor" is actually pretty inspired - commutativity makes Fubini obvious. The "d/dx" the author complains about can be made precise and is similarly great. Consider "dx/dz = dx/dy dy/dz", doesn't this just look entirely natural? But there's something about manipulating functions as first class objects that seems so unnatural to mathematicians that it needs cryptic notation to ward off the unwary...

If it becomes a problem, it is changed.

Plus there are reasons to treat functions of functions differently. Often there are subtle assumptions made on the domain of a function of a function. The space of functions is just too damn big to work with naturally. Almost none of the things you want to do to functions are applicable to all functions. R^R consists mostly of pathologies and monsters.

e.g. https://en.wikipedia.org/wiki/Weierstrass_function

And often we have functions which are only defined in an open neighborhood of zero, yet we still call them functions.

That the set theoretic definition of "function" as a functional relation between sets is rarely useful isn't really so surprising that you need to emphasize that you use a more reasonable notion of function all the time.

And the integral transform notation is frequently changed, but only ever locally and in different ways by different authors. Pick up two books on "Fourier transforms for engineers" and I promise you that you will find different notations for the same thing and probably even different notations within the same book. And none of them will be good.

The only problem with what you call higher order functions that I've ever had is that such expressions are necessarily more abstract and hard to grasp because they are more complex. But I don't see why you would say that it seems unnatural to mathematicians, on the contrary it's the most natural thing in the world. That's why no special notation is needed.

Maybe I'm missing something, but I don't see an issue at all. Perhaps you can present some examples?

I understand where you're going with the Fourier example, you often need surrounding context to know what's going on, you need to know what it is used for.

This is perhaps the biggest difference between math and computer notation, but it's a feature, not a limitation. If you want to you could easily repeat all the variables on the left side of each expression, to make it clear what the frequency domain variable is. But you don't do that, because there is no need for each expression to stand on its own. It's not even needed in all computer languages, e.g. Swift is typesafe but type is inferred and often not even explicit anywhere.

> The equation sign is used ambiguously. (for definition and for equality)

Definition and equality are really the same thing. A definition is just an introduction of a term/variable, and an equality giving it a value. So the only difference is if a term in the equation hasn't been introduced yet.

> The bracketing placement of symbols for absolute value, is not a matching pair, thus is ambiguous (when nested or sequential).

How was it not a matching pair? I only see two |'s in each equation so they have to pair up. What other way is there to interpret it?

> the notation for differentials “dy/dx” rapes the division notation. The differential notion has no position in math regarded as a computer language, as implied by the math philosophies of formalism and logicism.

I believe the original reasoning for notating derivatives as dy/dx is that it is a limit of a fraction and so approaches a fraction of infinitesimals. With that background, the division notation is simply extended to a new class of numbers.

I am not sure what the second sentence is trying to say.

Also pretty much all of the issues could be solved with gratuitous parentheses. So we have a solution to get a formal grammar; we use the current notation because it is faster to read and write. So without any concrete proposals, this critique has very little content.

Absolute value is ambiguous if you have more than one occurrence, like |a|b|c| could be (|a|)b(|c|) or |a(|b|)c|. But as you said, gratuitous parentheses solve this problem. A good typesetting program will also let you display the outer bars larger than the inner ones, so you can match them up even without parentheses.
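The two readings are not just different parses; they can give different values. A small Python check, with sample values chosen here purely for illustration:

```python
# Sample values picked so that the two readings of |a|b|c| disagree.
a, b, c = 2, -3, 4

reading_1 = abs(a) * b * abs(c)   # (|a|) b (|c|)
reading_2 = abs(a * abs(b) * c)   # |a (|b|) c|

print(reading_1, reading_2)       # -24 24
```

The readings only disagree when b is negative, which is part of why the ambiguity rarely bites in practice.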

> Absolute value is ambiguous if you have more than one occurrence, like |a|b|c| could be (|a|)b(|c|) or |a(|b|)c|.

Yeah, I get the issue in general, but the quote I referenced was specifically referring to the equations from Wikipedia.

My thought is that pretty much any mathematician would not choose to use absolute value bars in a case where it would be ambiguous. For example (|a|)b(|c|) would be instead written |ac|b and |a(|b|)c| would be written |abc|. And so unless the equations are specifically about showing properties of absolute value (in which case just use parentheses), I don't see a case where the ambiguity would pop up in real discourse.

> For example (|a|)b(|c|) would be instead written |ac|b

That changes the meaning. You're now relying on multiplication being commutative which is not true for all sets.

> That changes the meaning. You're now relying on multiplication being commutative which is not true for all sets

As I said at the end of my last comment: I don't see a case where the ambiguity would pop up in real discourse. If you are dealing with non-commutative objects you can deal with a few more parentheses. Just like non-associative objects result in many more.

It doesn't matter in practice. These two bars denote either absolute value, or norm, or determinant, all of which are real or complex numbers. People surely have the freedom to define a non-commutative multiplication with real/complex scalars, but have you ever seen one?

> Definition and equality are really the same thing. [...] So the only difference is [...].

An interesting way of justifying how two things are really the same thing, that is, by acknowledging their difference.

The point is that the pointed ambiguity (the difference) can be quite significant, and these become apparent when studying these sentences in a formal logic. Hence, the significance of the ambiguity.

This (and the article under discussion) just echoes the idea that mathematical sentences and proofs are considered "rigorous" but "informal", where "formal" is the domain of logic.

> Definition and equality are really the same thing. A definition is just an introduction of a term/variable, and an equality giving it a value. So the only difference is if a term in the equation hasn't been introduced yet.

In CS we talk about l-values and r-values. Only an l-value can be assigned to, but you can compare r-values for equality. So there is a solid distinction there between types of expression which is upheld in programming languages.

The main difference between math and programming languages is that it's perfectly fine in math to define l-values.

For example i^2 = -1 is the generally accepted definition of the imaginary unit.

It's r-values that cannot be assigned to. What's really missing from most conventional math, I think, is the notion of conditional expressions based on equality testing. "=" is usually only used in a declarative way, to declare that two expressions are equal.
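For readers without the CS background, a minimal Python sketch of the distinctions in this sub-thread: assignment, equality testing, a conditional built on an equality test, and the rule that a literal cannot be assigned to. The helper name `literal_is_assignable` is made up for this example.

```python
x = 3                    # assignment: the name x is bound to a value
is_equal = (x == 3)      # equality test: an expression that evaluates to True

# A conditional expression driven by an equality test.
kind = "zero" if x == 0 else "nonzero"

def literal_is_assignable():
    """Check whether Python accepts a literal on the left of '='."""
    try:
        compile("3 = x", "<example>", "exec")
    except SyntaxError:
        return False     # the parser rejects assignment to a literal
    return True
```

Python spells the two roles differently (`=` and `==`), which is the separation that a single declarative "=" leaves to context.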

See n+k patterns in Haskell for a programming exception to the rule you mention.

Actually, though, set-builder notation can use the equals sign to express a condition. e.g. the examples here https://en.wikipedia.org/wiki/Set-builder_notation. Maybe there are other examples in math.

Typically, mathematicians will just say it doesn't matter, because by context it makes it clear to humans. I disagree. I think it introduces lots of mis-understanding and garbage into our minds, especially those who have not studied symbolic logic and proof systems (which, is actually majority of mathematicians). The wishy-washy, ill-defined, subconscious, notions and formulas make logical analysis of math subjects difficult. Of course, mathematicians simply grew up with this, got used to it, so don't perceive any problem. They'd rather attribute the problem to the inherent difficulty of math concepts.

You could substitute quite a lot of other things for 'mathematicians/math' and this would ring equally true. I've noticed that people often defend the system they learned within even when (or perhaps because?) its limitations are creating a barrier for potential new entrants.

For most of mathematics, potential new entrants would be mathematicians from a different field. Then using familiar notation is actually helping them understand the new concepts, even if it erects a barrier for people with zero background knowledge. But if you have zero background, you'd be better off starting from the fundamentals (and learning the prevalent notation on the way) rather than diving straight into deeper mathematics.

In that, notation is really no different from a specialized language. A sentence like "a monad is a monoid in the category of endofunctors" doesn't make sense if you don't know what those strange words mean; but if you understand the concepts, you probably also know what they are called.

It doesn't create a barrier, except possibly for people from CS. I learned math before programming and it really struck me how much more OCD programmers are. Math notation is fine, ambiguity is not an issue, it's a strength. People only have philosophical objections to it.

Contrived examples of ambiguous expressions like in the article are pointless, since there are always non-ambiguous ways to write the same thing.

The flexibility of math notation makes it flow much better, you can vary the level of verbosity to suit the context. There are even conventions around variable names, so that no one would misunderstand an expression such as f = x^2 and think that f is a constant.

It's a beautiful, intuitive, highly efficient system evolved over millennia. It might not be perfect but if someone had come up with an improvement it would have been implemented already.

In this case, it is less about defending the system and more just confusion that these would be the sticking points.

I mean, I get that there is a lot to be confused by. There is a lot going on. But the points being argued here seem irrelevant to the point of someone just trying to get pedantry points.

If anything, I think everyone should be encouraged to create their own notation and have fun with it. A large part of the challenge will be translating it between their notation and a common one.

I understand that some symbolic systems are meant to be parsed by a machine (programming languages) and others are designed to be understood by people (spoken language) but mathematical notation is unique in that it does neither.

I think it's safe to say that the alternative to ambiguity would be a lot more symbols, I'm not sure that would help.

I find that it's very seldom that I struggle to understand an expression because of limits in the notation system, I think this is more a case of someone coming from computer science, where notation by necessity is completely unambiguous, and being dissatisfied with math notation on a philosophical level.

Math notation is a bit like writing down spoken languages, the intention is to provide just enough information so that a native speaker can understand which word is being referenced. It's definitely not to provide an encoding of the sounds that can be decoded by anyone with a simple table of correspondences.

Sometimes I find myself looking at an expression and understanding what is meant, but not really knowing _how_ I can understand it. Like when you intuitively can pronounce a word you've never seen before, but you can't consciously determine what rules you applied to arrive at that pronunciation. I think that is beautiful!

If I remember correctly, von Neumann was reviewing someone's dissertation (might have been Minsky's), and someone asked "Is this mathematics?", to which von Neumann replied "If it isn't now, it will be soon."

Doing the CS guys a kindness will probably pay the field back several times over. The latex-fussing and symbol collision in mathematics is not pretty.

I don't actually think it is desirable to make mathematical notation unambiguous, if that is even well defined. To again use natural language as an analogy, there are only some 26 letters in most Latin based alphabets, but there are many more sounds. The same "letter", or historical sound, is pronounced differently in different contexts, but since it's almost always predictable based on context or knowledge, the system works.

In the same way, you could say that in one sense (though possibly not the mathematical sense) there are lots of different uses for "=". Most of the time this obviously hasn't been an issue, if it had been, people would have introduced new symbols. For some cases they have done just that, it can sometimes be useful to explicitly distinguish definitions with ":=" or a triangle above the =. In the same way people sometimes use a triple line = for "is identical with". But seeing that everyone knows these symbols, but still make do with just = in most cases, even they are only half useful.
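Proof assistants make the same split mandatory. A minimal Lean 4 sketch, where the names `double` and `double_two` are hypothetical and chosen only for illustration:

```lean
-- ":=" introduces a new name by definition.
def double (n : Nat) : Nat := n + n

-- "=" states a proposition about existing terms, which then has to be proved.
theorem double_two : double 2 = 4 := rfl
```

Here the checker forces the distinction that a human reader is expected to recover from context.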

If we use ordinary language as an analogy, we're usually all speaking the same one, and ambiguity is manageable.

Whereas in the professions, we are so deep into sub-specialties and sub-languages that even without ambiguity we are approaching a Babel where mathematician does not understand mathematician and lawyer does not understand lawyer.

I have seen many computer programmers express their distaste for traditional math notation. I have the opposite opinion, probably because I majored in physics as an undergraduate. Whenever I read/write source code with heavy math formulas, I feel lost in a sea of verbosity. While the unambiguity renders the trees clearer, the verbosity blinds me to the forest. I always have to step back and write in the traditional form to understand.

Because people with a math background, rather than those with a CS background, dominate the field of math, the notation is unlikely to change ever.

The examples in the article don't bother me much, but I've noticed that I struggle a lot more when it comes to linear algebra. It might be due to less experience, but I can easily lose track of which things are scalars vs vectors vs matrices vs tensors. I'm noticing it right now working through various tutorials on neural networks... I get the concepts easily but when I sit down to implement, translating the notation into code is a lot more difficult than the scalar functions I'm used to.

Some authors use conventions such as boldface for vectors and capital letters for matrices. There are also often conventions around different types of e.g. scalars. a, b, c are typically used for different things than i, j, k, within a text.

Matrix and trace derivatives became a lot easier when I started doing it all in index notation.
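A small worked instance of the index-notation approach, offered as an illustration rather than as the commenter's own example. It assumes A and X are square matrices with independent entries:

```latex
\[
  \operatorname{tr}(AX) = \sum_{i,j} A_{ij} X_{ji}
\]
\[
  \frac{\partial}{\partial X_{kl}} \operatorname{tr}(AX)
    = \sum_{i,j} A_{ij}\, \delta_{jk}\, \delta_{il}
    = A_{lk}
    = (A^{\mathsf{T}})_{kl}
\]
```

Every symbol with indices is a scalar, so the scalar/vector/matrix bookkeeping is visible in the indices themselves.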

TLDR: author complains.

Why there is no Hitchhiker’s Guide to Mathematics for Programmers https://jeremykun.com/2013/02/08/why-there-is-no-hitchhikers...

Not directly math but Guy Steele's talk at Clojure Conj [0] on computer science type notation touches on related issues and is a great watch.

0. https://youtu.be/dCuZkaaou0Q

I am glad somebody is finally looking into this! There is an interaction between used/formalized language and thinking and often certain types of definitions and notations force thinking one way, making another way, maybe better in other situations, less visible. As a practical consequence, we might have a potent hammer for certain situations and teach people to nail all problems, instead of using another, more suitable approach based on different mechanics, just because the prevalent way of formalism enforces certain rules and obstructs creativity.

How is `||c||a|b|` ambiguous, given the knowledge that taking the absolute value is an idempotent function? I get you could read it as `abs(abs(c)).a.abs(b)`, but no one would. To me, it’s unambiguously `abs(abs(c).abs(a).b)`. AFAIK, this also applies to other uses of these symbols and, e.g., norm. (In fact, norm’ing twice doesn’t even make sense because the types clash, to use the programming term.) Also, IIRC, in LaTeX the size of the | symbols can be balanced and autoscaled with \left and \right, like parentheses.

It is ambiguous because, as you point out, it can be read either as `abs(abs(c)).a.abs(b)` or `abs(abs(c).abs(a).b)`.

A mathematician can of course think about it and know what is meant. But there are many alternate notations that could be used, where such ambiguities don't even occur (e.g., balanced parens).

Autoscaling does seem like a nice solution in this case, which preserves the traditional notation while resolving the ambiguity...
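A LaTeX sketch of the sizing idea. One caveat: as far as I know, \left| and \right| scale to the height of what they enclose, so for flat content like this the bars may come out the same size, and manual sizes such as \bigl| and \bigr| are the more reliable way to show the pairing.

```latex
% All bars the same size: the pairing is left to the reader.
\[ |a|b|c| \]

% Larger outer bars: reads as the absolute value of a |b| c.
\[ \bigl| a \, |b| \, c \bigr| \]

% Spacing alone: reads as |a| times b times |c|.
\[ |a| \, b \, |c| \]
```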

It is ambiguous because, as you point out, it can be read either as `abs(abs(c)).a.abs(b)` or `abs(abs(c).abs(a).b)`

Except in this case the ambiguity comes from typography rather than notation. ‖c‖a|b| for example is exactly the same statement with different typography and is a lot less ambiguous. The problem here isn't math notation, but trying to apply that notation using only pure ASCII.

I'm on your side, but single | usually doesn't mean norm, rather absolute value, so it would be fine to do it twice. Though as you said it's contrived.

My point is that abs∘abs=abs and mathematicians will simplify when they can. Double-| is norm and you definitely cannot do that twice, because:

    ‖.‖ : V → ℝ

...where V is a vector space. I guess technically the reals fulfil the definition of a vector space, but I'd imagine that would grate against most people!

I suppose one could read ||c||a|b| as norm(c).a.abs(b), but usually the sets involved and typesetting would distinguish this. (i.e., It's only ambiguous in ASCII!)

I know what you meant and I agree, I just read you as (also) saying that the single lines could be norm. There are situations where you wouldn’t simplify, like if this expression was the result of some function applications and then in the next step you simplify it.

R can definitely be seen as a vector space, just like R^3 for example, but then you wouldn’t also have absolute value.

It wouldn’t be ambiguous in ASCII either because c would be defined earlier, you’d know what it is.

I have wondered whether postfix might be more suitable for writing math expressions. Yes, for simple expressions like 2*3+4 postfix is cumbersome, but for more complicated ones I think it has benefits. For example, "k fac ln 2 ^ -1 ^" would be "1/ln(k!)^2".

It is easy to see that the postfix expression is just a composition of a bunch of functions "fac ln 2 ^ -1 ^", but not so easy in the infix case. It is also easy to derive the inverse of it: "-1 ^ 1/2 ^ exp fac^-1"
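The postfix expression above can be checked mechanically. This is a minimal stack-based evaluator in Python, supporting only the tokens used in that example. The function name `eval_postfix` and the choice k = 5 are assumptions made for illustration.

```python
import math

def eval_postfix(expr, env):
    """Evaluate a whitespace-separated postfix expression with fac, ln and ^."""
    stack = []
    for tok in expr.split():
        if tok == "fac":
            stack.append(math.factorial(int(stack.pop())))
        elif tok == "ln":
            stack.append(math.log(stack.pop()))
        elif tok == "^":
            exponent = stack.pop()      # top of stack is the exponent
            base = stack.pop()
            stack.append(base ** exponent)
        elif tok in env:
            stack.append(env[tok])      # variable lookup
        else:
            stack.append(float(tok))    # numeric literal
    return stack.pop()

k = 5
postfix_value = eval_postfix("k fac ln 2 ^ -1 ^", {"k": k})
infix_value = 1 / math.log(math.factorial(k)) ** 2
```

Each token is either a value or a function applied to what came before, which is the "composition of a bunch of functions" reading described above.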

I'm fairly sure infix notation is easier to read for associative operations. It's hard to beat the simplicity of a + (b + c) = (a + b) + c.

I'm also not convinced that (a x n ^ *) is going to catch on as a replacement for ax^n.

I think some kind of postfix notation could be nice for category theory though, it always confuses me that X -f-> Y -g-> Z is a diagram for gf, not fg.

With infix notation you can clearly group terms and factors in your head. Postfix makes it almost impossible.

As a reasonably good computer programmer with a facility but no education in math, when I need to learn something new I always try to find some software program, typically something open source, that implements the math in question. Then I can work through the ambiguities of the equations by looking at the source code.

It's cool that you do this, but I wouldn't want anyone to get the idea that it's not worth learning to read mathematical notation.

> the ambiguities of the equations

Mathematical language is the least ambiguous language you will find anywhere, other than executable code.

I'm not exactly clear on what you mean by "mathematical language" in this context, as mathematical notation is clearly filled with ambiguity, so I have to assume it is something else you are getting at. Can you elaborate on that?

I said least ambiguous, not free of ambiguity. I challenge you to name a community which strives for precise language more than professional mathematicians.

Whether or not it's worth it depends on how often you need to interact with it. I don't need to interact with math notation very often; maybe only a few times per year. Further, deciphering the math notation by referencing its translation is not a bad way to learn.

And the commenter chooses the executable code.

3Blue1Brown made a video about math notation for power/logarithms/roots: https://www.youtube.com/watch?v=sULa9Lc4pck

It's messy to actually draw a triangle, but there are alternative notations for it.

what's messy about drawing?

I think in practice a triangle can be too large.

if you think large, write large.

The main problem with mathematics is the lack of types.

It has types—it has tons of types. They're just used very differently than in programming.

Which is a problem tho.

What do you see as the problems with that?

Lack of checkability which slows down progress. Look into work of the late Vladimir Voevodsky https://www.quantamagazine.org/univalent-foundations-redefin...

it would be so much better if they used Rust

More like coq

Coq is intuitionistic, classical proofs cannot be expressed. Mathematicians almost always (unless intuitionistic math is your niche) use classical proofs.

It kinda is yes. I don’t understand why classical proofs are still a thing.

What's wrong with the expression

(a+b)/a = a/b = φ

Does the author not like that a new symbol is introduced like that?

There seems to be a lot of confusion here. Yes, math notation certainly _can_ be used to write ambiguous expressions. This in itself is not the slightest problem, because it doesn't mean that you are _forced_ to use ambiguous syntax, you never are.
