
How to Read Mathematics - espeed
http://web.stonehill.edu/compsci//History_Math/math-read.htm
======
rcthompson
I took a history of mathematics course at the University of Virginia. One of
the interesting things I learned was that people used to write out equations
as sentences before symbolic notation was invented. And so you would realize
that a single moderately large equation (such as the quadratic equation) might
be equivalent to a full paragraph or more of text. So if you ever get
discouraged at how long it takes you to read mathematics, remember that
equations can be as information-dense as entire paragraphs, or even more so,
and just because it is presented in a spatially compact format doesn't
necessarily mean that it shouldn't take you at least as long as a text
paragraph to read and comprehend.

~~~
JumpCrisscross
Happen to remember the text you used?

~~~
Dn_Ab
I've read a couple of texts on the history of math. Most of them were dry; a
few were very entertaining, like Crowe's on Vector Analysis. But I haven't
found anything that beats MacTutor. I used to read it nearly every day many
years back.

The section on Al-Khwarizmi, a key innovator in algebra, shows how laborious
the task was (the site is down, so I'm linking the cache):

[http://webcache.googleusercontent.com/search?q=cache:nOT6h0c...](http://webcache.googleusercontent.com/search?q=cache:nOT6h0c7J2oJ:www-history.mcs.st-andrews.ac.uk/Biographies/Al-Khwarizmi.html+&cd=1&hl=en&ct=clnk)

 _a square and 10 roots are equal to 39 units. The question therefore in this
type of equation is about as follows: what is the square which combined with
ten of its roots will give a sum total of 39? The manner of solving this type
of equation is to take one-half of the roots just mentioned. Now the roots in
the problem before us are 10. Therefore take 5, which multiplied by itself
gives 25, an amount which you add to 39 giving 64. Having taken then the
square root of this which is 8, subtract from it half the roots, 5 leaving 3.
The number three therefore represents one root of this square, which itself,
of course is 9. Nine therefore gives the square._
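
In modern notation the quoted passage solves x^2 + 10x = 39 by completing the
square. Here is a minimal sketch of the same steps in Python, assuming the
general form x^2 + b*x = c with positive b and c. The function name
`solve_square_plus_roots` is made up for illustration and is not from the
MacTutor text.

```python
import math

def solve_square_plus_roots(b, c):
    """Positive root of x^2 + b*x = c, following the quoted steps.

    Hypothetical helper for illustration; assumes b > 0 and c > 0.
    """
    half = b / 2              # "take one-half of the roots"
    squared = half * half     # "which multiplied by itself gives 25"
    total = squared + c       # "an amount which you add to 39 giving 64"
    root = math.sqrt(total)   # "the square root of this which is 8"
    return root - half        # "subtract from it half the roots"

x = solve_square_plus_roots(10, 39)
print(x, x * x)  # 3.0 9.0
```

Five short lines of code (or one line of algebra) stand in for the whole
paragraph, which is the density point made at the top of the thread.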

------
lotharbot
It can also be helpful to find others who are interested in reading the same
bit of math, and talk through it with them. They don't have to be particularly
better at it than you, they just have to have a similar level of interest and
curiosity. In grad school we read recently published papers in a small group
setting we called "journal club".

This process helps in a number of ways. It keeps you from reading too fast or
too passively because you're constantly asking and answering questions. It
makes you less likely to get stuck in a dead end for very long, because others
will see alternatives. It gives you an opportunity to ask about notation or
background concepts you aren't familiar with. It helps you keep track of the
big picture, because while some people are bogged down in a particular detail
(like "how do they get from equation A to equation B?") others will be trying
to tie it back to the big picture ("how does equation B fit into our overall
goal?"). And it allows you to see the even bigger picture as others bring in
relevant knowledge or experience; it was pretty common to be working through a
paper and have someone mention how it tied in to their current research
project.

------
gajomi
A very nice article.

>A particular notorious example is the use of “It follows easily that” and
equivalent constructs. It means something like this:

>One can now check that the next statement is true with a certain amount of
essentially mechanical, though perhaps laborious, checking. I, the author,
could do it, but it would use up a large amount of space and perhaps not
accomplish much, since it'd be best for you to go ahead and do the computation
to clarify for yourself what's going on here. I promise that no new ideas are
involved, though of course you might need to think a little in order to find
just the right combination of good ideas to apply.

Even knowing this ahead of time, this kind of thing can be maddening. Namely,
when the meaning of the assertion is sensitive to little sign changes, index
shifts, and the like, these are quite likely to end up in the computation.

~~~
Avshalom
My Basic Concepts of Mathematics teacher was notorious for getting about
halfway through a problem before announcing "and from here it's trivial to
show...". One day he walked into class and addressed us:

"I've been told by the head of the department that I can no longer say 'it's
trivial'. You have to understand when I say 'it's trivial', I just mean that I
can do it."

------
espeed
The article is by Shai Simonson, one of the instructors at ArsDigita
University (<http://aduni.org/>).

This was part of the reading material for ADU Course 0: "Mathematics for
Computer Science" (<http://aduni.org/courses/math/index.php?view=cw>), but
unfortunately the video lectures aren't available, whereas they are available
for all the other courses.

~~~
pm90
Shai is an incredible teacher; his video lectures on theory of computation
were a major part of what made me understand the beauty of CS theory
(automata, CFGs, etc.). It was sad to see that they had to shut down ADU; but I
guess it was an unsustainable model; if I recall correctly, there was
absolutely no tuition charged to the students.

------
johnbender
"The same half hour in a math article buys you 0-10 lines depending on the
article and how experienced you are at reading mathematics"

I'm still reading through the article, but this might be the most important bit
of information for anyone getting started reading papers that rely on maths. I
wish someone had told me this when I started in with the more complex comp-sci
papers because it's hard not to feel dense when you have to go over the same 3
or 4 pages of text time and time again to get the concepts.

~~~
evincarofautumn
Yeah, especially if you’re already a fast reader, it’s frustrating at first to
have to drop to what feels like a snail’s pace to properly understand
everything. But the information density is a real life-saver once you’re more
experienced, because it lets you readily experiment with things at that high
level, unencumbered.

------
lmarinho
During college, in order to plan my time, I used to measure my studying time
on more mathematically intense texts. I took 30-60mins per page on average. It
is a hard read, at least for me, but intensely gratifying too.

------
6ren

      > The way to really understand the idea is to re-create what the author left out.
    

If reading mathematics requires re-creating what the author left out, why not
leave it in?

Sure, it will be longer, but if the purpose is communication, wouldn't that be
better? The reasons I can think of are not beneficial for communicating
knowledge: that's how the game is played (tradition); it excludes the
uninitiated/untalented; it's neater to leave out the truth of discovery; it
makes the author seem superhuman; there's satisfaction for the reader in
understanding the puzzle.

 _EDIT_ the reader can skip explanations he can work out himself (or use them
as a check); papers are already structured with details in deeper, skip-able
sections. One can have a summary that excludes details altogether (like an
abstract, or equivalent to a present maths paper). In the article, the parts
left out are not "known", but steps that the author could work out themselves,
perhaps after many dead-ends to find the right combination. To avoid
repetition of known specific concepts (like vocab), one could explicitly
reference them, or assume them for a given audience.

Perhaps the essential problem is that the omitted steps are not a single
concept (like a well-known term), but many concepts, combined according to
other concepts (like a complex expression), so they can't easily be
referenced, nor assumed. Someone, somewhere will have to work out the
combination - I'm suggesting it is more efficient for it to be the one writer
than the many readers.

~~~
ColinWright
One of my university lecturers gave what we all thought were dreadful
lectures. Muddled, unclear, chaotic, with no discernible thread. It took ages
to reconstruct and rework the material to a point where we could attack the
problems and old exam questions.

I got nearly full marks on that exam.

Other lecturers were brilliant. Clear, lucid, entertaining. I didn't get full
marks on their exams, because I found it hard to do the problems, even though
I thought I understood the material from the lectures.

Math is not a spectator sport. You need to get involved, otherwise you're in
the situation of someone who has watched a lot of tennis, but never played.

I used to mock the "it is clear that" phrase when it would take two or three
pages to show the result, but having done the work to show it, I was then
equipped to handle the next stage of the work. Having the explanation given to
me as to why it was "clear" would not have done that; my understanding would
be meagre, and unsatisfactory, and I would gradually fall behind and not
understand what was missing.

So no, it's not because:

* that's how the game is played (tradition);

* it excludes the uninitiated/untalented;

* it's neater to leave out the truth of discovery;

* it makes the author seem superhuman;

* there's satisfaction for the reader in understanding the puzzle.

When done properly it's genuinely for more effective communication. I'm not
saying it's always done well - not everyone writes equally well - and I'm not
saying that everyone always has the best motives, but working on what you see
as gaps in the presentation really is the best way to understand the material.

 _Added in edit:_

You said:

    
    
      > Someone, somewhere will have to work out the combination -
      > I'm suggesting it is more efficient for it to be the one writer
      > than the many readers.
    

If your purpose is to have it written down, then yes. If your purpose is to
communicate effectively to the readers, then no. The "doing" is an essential
part of the eventual "understanding".

------
RollAHardSix
Slightly off-topic but I will forever remember a math professor of mine, Bryan
Snare, declaring that 'You read Mathematics with a pencil!'.

True then, true today, true for eternity.

------
X4
Hi,

that's exactly the kind of problem that has been going through my mind for
about 2 weeks now. We've never learned how to analyze whitepapers or
dissertations, let alone how to convert an unknown mathematical formula into
usable code. I see this as one of the most essential skills we never learned.
Could someone please teach that skill? =))

I'd be very grateful if someone could help me and others understand how to
dissect and codify the main point of a whitepaper.

This is a very interesting paper for example, that I've tried to understand,
but I still only have a vague idea of it:
<http://pages.cs.wisc.edu/~jyc/papers/fibonacci.pdf>

Here <http://groups.csail.mit.edu/netmit/sFFT/> is another very interesting
algorithm I've known about for a long time, but even though I understand the
principle and tried to put it into code, I wasn't able to identify what the
important part of their whitepaper is. They even provided some pseudo-code of
the kind I've seen in many other papers, but I've never seen where and how that
pseudo-code notation was standardized. The pseudo-code looks ambiguous to me.
(NOTE: They provide the code now, but that wasn't the case when the paper was
first published).

This kind of thing makes me feel stupid.

------
mavelikara
Previous discussion: <http://news.ycombinator.com/item?id=1576969>

------
tylerneylon
I agree that typical math writing must be absorbed _very_ slowly and carefully
to be understood well.

Unfortunately, in the world of mathematicians, this isn't what really happens
most of the time. In grad school classes (for math), students don't have time
to absorb the material this carefully - they must 'learn' too much too quickly
(or at least this was the case at my grad school), so that reading math
becomes learning just enough to pass exams. Writing a dissertation involves a
lot of talking to people and skimming papers to hopefully grab relevant bits.
Reviewing papers, for professors, involves handing papers to their grad
students and asking them to read them. Even writing a paper for journal
submission feels primarily about satisfying reviewers and getting into the
best possible journal - being readable becomes a lower priority than space
constraints and the quibbles of a particular reviewer.

------
bearwithclaws
Here's the PDF (from Hacker Monthly) if anyone wants to print out in nice
magazine format:
<https://dl.dropbox.com/u/48613464/How%20To%20Read%20Math.pdf>

~~~
espeed
Oh cool, it was in issue #5 (<http://hackermonthly.com/issue-5.html>).

------
muyuu
Off-topic, but who uses an IE document icon as a favicon? Got freaked out here
in the office thinking IE had managed to crawl back somewhere I could
mistakenly open it.

~~~
Aeons
The site was made using Word.

------
mck-
Great article. I think it can very well extend to reading code as well, which
may seem daunting at first.

~~~
slowpoke
Except that (good) code is written to be read, and to be understood - and not
be terse and cryptic, just so that people writing it have less to do. That's
essentially what most math texts do. I don't want math to be prose, but often,
a little more verbosity or communication of intent would be nice. Just like
good comments and documentation. In CS, this is universally accepted as good
style, and for very good reasons - and for the same reasons, it should be in
math, as well.

~~~
hackinthebochs
I think you're misunderstanding the purpose of the terseness of mathematics.
The point is to convey the important concepts as cleanly as possible, assuming
your reader has a certain level of prerequisite knowledge. The same can be
said about code: you assume your reader has a certain level of understanding
of the domain that you're modelling through code. The terseness is a virtue,
in the sense that only the critical bits of new information need be conveyed.
All the prerequisite knowledge is hidden behind abstractions that the reader
should be familiar with. Math, and code, is cryptic if one doesn't have the
prerequisite knowledge, but this is by design.

~~~
slowpoke
_> The same can be said about code: you assume your reader has a certain level
of understanding of the domain that you're modelling through code._

No, this is not at all the same. Code can be read without understanding
everything if it is well documented. Yes, code should be short and simple,
too. But not terse and cryptic - and that's what so many math texts do. So
yea, cool, you can compress half a page of written English into half a line of
mumbo-jumbo. Great. You have gained nothing.

I'm not saying that the half page of written English is better. Not at all.
But there's something somewhere between those two where math should be. And
not at the cryptic end of the scale.

Take just simple things such as _meaningful variable naming_. There are tons
of good reasons why we do this in programming. I can't count the number of
times I've tried to decipher math texts and had to look up over and over and
over again what some x or f or lambda or phi is actually supposed to
represent. It gets worse when they start making distinctions based on the
bloody way a symbol is typeset (x vs _x_ ).

It's perfectly fine to do this in quick calculations on paper (just as it is
fine to use one character variables in quick one-off scripts). But if you are
writing a prolonged mathematical text, then take the time to give variables
meaningful names. It's not that hard, and would go a long way toward improving
readability.

~~~
hackinthebochs
I don't think descriptive variable names would be useful in math. The thing in
math is that "x" you see can be repeated 20 times in a set of equations.
Having to read "running_total" 20 times instead is a hindrance here rather
than a benefit.

The difference between math and code is the number of variables. A piece of
code can have an order of magnitude more variables than a set of math
equations. But in math, a few variables will usually be repeated many times.
You're simply optimizing different usage profiles.

So the benefit in descriptive names in code is being able to distinguish
easily the many different variables. The benefit in single letter variable
names in math is that you're able to write the information in a more compact,
digestible manner.

The reason this latter point is important is for the same reason short,
compact programming languages are a boon to comprehension. The faster you can
read a set of related items, the more of it is in your working memory at any
given time and thus the better you're able to understand it. I fully believe
the time it takes to input a chunk of information through your visual system
is inversely related to one's understanding of it. To put it another way,
working memory's decay function is parameterized by time.
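
As a rough illustration of the trade-off being argued here, the sketch below
writes the quadratic formula twice in Python, once with single-letter names and
once with descriptive ones. The function names and the choice of formula are
assumptions made for illustration; nothing here comes from the article or the
thread.

```python
import math

def roots_terse(a, b, c):
    # Real roots of a*x^2 + b*x + c = 0, assuming a != 0 and d >= 0.
    d = b * b - 4 * a * c
    return ((-b + math.sqrt(d)) / (2 * a),
            (-b - math.sqrt(d)) / (2 * a))

def roots_descriptive(quadratic_coefficient, linear_coefficient, constant_term):
    # Same computation, with every quantity given a descriptive name.
    discriminant = (linear_coefficient * linear_coefficient
                    - 4 * quadratic_coefficient * constant_term)
    discriminant_root = math.sqrt(discriminant)
    denominator = 2 * quadratic_coefficient
    return ((-linear_coefficient + discriminant_root) / denominator,
            (-linear_coefficient - discriminant_root) / denominator)

print(roots_terse(1, 10, -39))        # (3.0, -13.0)
print(roots_descriptive(1, 10, -39))  # (3.0, -13.0)
```

Both versions compute the same thing. Which one is easier to take in at a
glance is exactly the point under dispute.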

~~~
slowpoke
_> The benefit in single letter variable names in math is that you're able to
write the information in a more compact, digestible manner._

You missed my point entirely. I said that you should not optimize for
_writing_ math, you should optimize for _reading_ math. And that's where the
same argument as for code holds true as well: It doesn't matter if it takes
longer to write, it will be read orders of magnitude more often, and that is what
matters most.

Again, if you're just doing quick calculations on paper or hacking something
into your shell/REPL, I really don't care if you use cryptic variable names. I
do, too. But I don't want to ever see this in a good program, and I think that
it doesn't belong in a good math text, either.

 _> The reason this latter point is important is for the same reason short,
compact programming languages are a boon to comprehension._

And again you missed what I explicitly stated: it's good to write in a concise
way, but only as long as it doesn't become cryptic. And I disagree with your
belief about information processing. It's better to build semantic relations
by having meaningful names than it is to process a lot of information in a
short time. It really doesn't matter how fast you are able to read something - if
it doesn't make sense, you won't understand it.

~~~
hackinthebochs
>You missed my point entirely. I said that you should not optimize for writing
math, you should optimize for reading math.

 _My_ point is that these goals are nearly one and the same when you get to the
high level. Many mathematical relationships are very complex, usually not the
step-by-step procedures that are common in code. Thus being able to hold the
entire relationship in your head at once is crucial. Single letter names for
variables and functions are critical here (for the previously mentioned reason:
working-memory decay).

I basically have a bachelor's in math, and I could not imagine reading complex
equations with full variable names. The hard part is understanding the whole,
not remembering what x or i means. If you find an equation cryptic, that just
means you don't have the requisite knowledge to really understand it.

~~~
slowpoke
Sorry, if all you have left is the (very typical) argument of "if you think
that way you just don't understand it", then I'll consider this discussion
over. I have no interest in hearing endless appeals to tradition. I'm sure
people made similar arguments for GOTO back in the day. Good thing we moved
on.

~~~
hackinthebochs
I would like to think my argument is far more nuanced than you're giving it
credit for. This isn't an appeal to tradition; I'm making an argument that
explains why traditionally math has stayed with the "cryptic" notation rather
than descriptive variable names.

You're completely ignoring the working memory argument I'm making. A large
part of math is pattern matching: recognizing a common relationship between
variables (a common pattern or "theme") and investigating that relationship
further. This ability is critical to mathematical ability. Visual compactness
is crucial here. Using descriptive names will completely bog down your visual
system with reading words rather than identifying patterns.

Pattern matching is a critical skill when one becomes an expert in any field;
in math it's of utmost importance. Math is optimized for reading by other
experts who have those same visual patterns committed to memory. This is the
optimal approach for the work's target audience. Yes, it makes advanced math
largely inaccessible to outsiders, but that's a part of the trade off.

~~~
slowpoke
_> You're completely ignoring the working memory argument I'm making._

I'm not ignoring it. I said I don't buy it. I generally don't believe that you
need visual compactness to the point of being cryptic to make use of pattern
matching. I'm not saying this is necessarily easy. Neither is choosing good
variable names in programming, especially at high abstraction levels. But
that's no excuse for not doing it.

The other thing is that mathematicians will finally have to accept that _they
aren't the only ones who need math_. It's not, for the most part, a "field for
experts" (to paraphrase you) which only those who are willing to become
experts at it are allowed to understand. It's a beautiful science with lots of
applications in almost every other field. But it's inaccessible as hell in large
parts due to mathematicians with your attitude.

I strongly believe that math can be made more accessible and more "user-
friendly" for people who aren't experts at it. It will be hard to do that
without sacrificing its tremendous power and flexibility (which I am totally
against in software, as well). But it must (and will) eventually be done.

Take note, by the way, that the accessibility problem is less and less
important the more specialized the math you are doing becomes. I don't really
care if some hyper-abstract field of math that a few dozen people in the world
are even able to grasp the basics of is arcane and cryptic. That's something
only experts will care for and if they are fine with their niche field being
overly cryptic, so be it. What I'm concerned about is math in general.

I firmly agree with the position that math should not be taught purely for its
applications, because that is missing the point. However, it's downright fatuous
to ignore the fact that math _has_ applications, and that therefore, a lot of
people will have to learn a subset of it. To excuse bad practices which have
survived by virtue of no one questioning them with nothing but "it's optimized
for edge cases (experts)" is a cop-out. It ignores the problem.

I also find it funny that I already met with such vehement opposition for just
proposing meaningful variable and function names. That was just an example.
There are a lot more problems to be solved. Stuff like "this is trivial" or
"left as an exercise" as a way to avoid writing out cumbersome, but
nonetheless important (for non-experts) parts, the general terseness of
mathematical texts, including the parts in plain English. I could probably
find more.

Let me conclude with this: Math as a whole has accessibility problems, just
like a lot of CS and computer stuff. Flat out ignoring them or even asserting
"it must be this way" is incredibly ignorant. And we need dialogue to do this
- not condescending "there is no problem" rebuttals.

~~~
hackinthebochs
You make a lot of points that I agree with. I just don't think your solution
is the right one. Math, when it comes to writing proofs and peer review,
should absolutely be optimized for other experts. Whatever makes communicating
ideas precisely among themselves easier is what's important here. I don't see the
benefit of changing this process so those of us as outsiders can take a peek.

But as you said, there is a very large part of math that deals with applying
these concepts and communicating them to non-mathematicians. The way we go
about this is definitely in need of a do-over. Finding more intuitive ways to
communicate these ideas is a critical part of this (possibly
including more descriptive variable names). But, I think there will always be
an inherent chasm between the math experts and those who are using the results
they discover. An example is the difference between the calculus track of
courses that every science major takes, and say, abstract algebra. The math
that mathematicians do and the math that the rest of us learn will always be
vastly different. It's just the nature of the beast.

