Hacker News
Programming as a Way of Thinking (scientificamerican.com)
322 points by pmcpinto on May 1, 2017 | 131 comments

> With a computational approach, we can go “top down”, starting with libraries that implement the most important algorithms, like Fast Fourier Transform. Students can use the algorithms first and learn how they work later.

This is exactly how I understood it. Three years back, WebAudio API was a new addition in browsers and I decided I should make something with it. I settled on building a Whistle Detector using a cited research paper as my basis. I barely understood anything in that paper and had to dig into DSP, FFT and other basics to get around. As I had no external help, I struggled plenty but managed to complete it after two weeks [1].

Funny enough, we had a DSP course which never made sense to me. Two weeks into my puzzle I walked away with more useful knowledge than the course ever did. What has always motivated me to learn is the application. I can work tirelessly if there is an interesting thing to build, even if I have to go through mundane theory. But, I find it utterly tedious to learn theory with no immediate goal in mind.

[1]: https://stuff.shubhamjain.co/whistlerr/
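
To make the "use the algorithm first, learn how it works later" idea concrete, here is a minimal sketch of the core step in tone detection: find the strongest frequency in a block of samples. It is not the code behind the linked project and not the method of the paper mentioned above. The function name `dominant_frequency` and the test signal are made up for illustration, and a naive O(n²) DFT stands in for a real FFT so that only the standard library is needed.

```python
import cmath
import math

def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest non-DC bin of a naive DFT."""
    n = len(samples)
    best_bin, best_mag = 0, 0.0
    # For a real-valued signal only bins 1 .. n//2 carry distinct information.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    # Bin k corresponds to k * sample_rate / n Hz.
    return best_bin * sample_rate / n

# Hypothetical test signal: a pure 1000 Hz tone sampled at 8000 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
peak = dominant_frequency(tone, sample_rate)
```

A real detector would presumably swap the inner loop for a library FFT and add thresholds on top; the point is only that the "which bin is loudest" question can be asked long before the transform itself is understood.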

That's a very interesting comment to me because I may be the opposite. I got a slow and frustrating start learning application development--I wasn't getting it. I felt I was following instructions on how to write esoteric instructions and not understanding how it worked. I found the book Code by Petzold and read it. Then I discussed some of this with other developers and realized they were even more clueless than I was.

It seems some people love theory/understanding while some love building something practical/useful.

I learn best when I build something that illustrates the theory. Reading gives me a dim idea, but building something (and freely experimenting on the way) teaches me how it works.

I'm the opposite, and had to be persistent with searching for the right learning tools for programming, with the least terms and behaviours left undefined. If there are such things as 'learning styles', theoretical learners vs applied learners is the most intuitively clear distinction. However choosing a teaching and measurement paradigm that fits both without a lot of wiggling is not so intuitive...

I feel the same way about math and statistics. It makes much more sense if the knowledge is applied and makes it more interesting and worth learning.

I had a math teacher at my high school who used Python and SAGE to teach us pre-calculus as well as programming, computational thinking and some mathematical logic. He was an absolute genius and I wouldn't be where I am if it weren't for him - unfortunately the school saw him as a threat to the standard model of teaching and drove him insane before he just quit. I hope he's doing ok.

You should try to find him and say hello.

I'll have to email him - I was planning on doing so once I finish my senior honors thesis in CS education (which I'm dedicating to him).

I had a teacher that changed me for the better, unfortunately I never got the chance to thank him as he passed away unexpectedly :/

If I were you I would send an email right away. You never know what happens.

I had a similar experience with my AP calculus teacher in high school. Luckily, he's dramatically overqualified for the teaching position and his classes consistently do exceptionally on standardized tests, so the administration basically gives him a wide berth.

I've had thoughts like this, using SciPy to teach kids how to hack math. Our education system is kind of sad face.

It was definitely cool - he had us writing python functions to estimate the area under a curve (and then visualize it), and other things I don't quite remember at this moment. Once he made us define math functions just based on there being a 0, and a successor function, which happens to be what I'm doing in a graduate Math Logic course at my university.
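
For anyone curious what those two exercises might look like, here is a rough sketch. It is a reconstruction from the description above, not the teacher's actual material: the midpoint rule is one of several ways to estimate an area, and the names `area_under_curve`, `ZERO`, `succ`, `add` and `to_int` are invented for illustration. The plotting step is left out.

```python
def area_under_curve(f, a, b, n=1000):
    """Estimate the integral of f on [a, b] with the midpoint rule."""
    width = (b - a) / n
    # Sum the areas of n thin rectangles, each sampled at its midpoint.
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

# Natural numbers built from nothing but a zero and a successor function.
ZERO = ()

def succ(n):
    """Wrap n once more: the successor of n."""
    return (n,)

def add(m, n):
    """Peano addition: m + 0 = m, and m + succ(k) = succ(m + k)."""
    if n == ZERO:
        return m
    return succ(add(m, n[0]))

def to_int(n):
    """Count successor applications, for display only."""
    count = 0
    while n != ZERO:
        n = n[0]
        count += 1
    return count

estimate = area_under_curve(lambda x: x * x, 0.0, 1.0)  # exact value is 1/3
two = succ(succ(ZERO))
three = succ(two)
five = to_int(add(two, three))
```

Multiplication can be defined the same way, by repeated addition, without ever touching Python's built-in integers except in `to_int`.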

That's really incredible.

It's sad to see educators like this get pushed out of their field. I had some incredible teachers as well, but also some awful ones. There need to be more competitive components in the education system, such as statistically significant measures of pupil success in education and career outcomes.

Have you heard this Wolfram TED talk about the subject:


> People laughed at Seymour Papert in the 1960s, more than half a century ago, when he vividly talked about children using computers as instruments for learning and for enhancing creativity, innovation, and "concretizing" computational thinking.[1]

[1] http://papert.org/

Did anyone else, other students, find his methods effective?

For sure - I can think of several other students who took his class and are now doing extremely well in Computer Science careers. Of course I don't have statistics on that or anything, but there's a large group of graduates from my school who were extremely upset when they found out what the school did to him.

In the early 1980s there was a concept called "Programming is the second literacy" [1], introduced by the Soviet computer scientist A.P. Ershov. Nowadays it's obvious that there is huge demand for special programming tools not only for professional programmers (C++, Java..), but also for domain experts who use algorithms (Python, DSLs..).

There is a popular objection that "not everyone should be able to code". But it depends on the "to code" definition. A good example here is with game designers. They may not know how to do low-level coding, work with 3d math or use C++ templates. But still for a really good game designer it's very important to have algorithmic thinking and they need to have a tool for testing their algorithmic ideas on the computer.

As for Python, there is an issue: it's hard to find examples of good Python programming style (I think only a few of us actually learnt Python with the help of a textbook). Some time ago I was happy to find notes by Peter Norvig about Python and about comparing Python and Lisp. His code is very elegant, see, for example: http://norvig.com/python-lisp.html

[1] http://www.sciencedirect.com/science/article/pii/01656074819...

>it's hard to find examples of good style of Python programming

I've never really bought that this matters so much as long as you just write sensibly in general. How many python users really make sure to do everything a "pythonic" way?

I think, it depends on how you use Python.

1. "Executable pseudocode", almost a toy language to explain a few algorithms in your article.

2. Sysadmin tool. Replacement for shell, a way to write quick throwaway scripts.

3. "A modern Lisp". Universal language for implementing AI algorithms, DSLs etc.

I, personally, use the 3rd variant. I write quite big programs in Python; for example, I have written a few compilers in it. That's why code quality is important for me here.

I would posit there are far more methods than these besides. Python is a powerful language that is flexible enough to be used for all these and more. You or anyone will inevitably write code they believe to be best suited to their problem, pythonic or not, weighted by the coder's ability and personal definition of "pythonic"

I saw Allen Downey speak about this concept a year ago in a small forum at the University of Richmond. Several professors in the audience challenged his thinking. I approached one after that and asked what he thought and the answer was basically that Dr. Downey's approach is spot on. This coming from a CS professor.

I have just started reading his Think Stats book http://www.allendowney.com/wp/books/ which is starting to help me better understand statistics. Not far enough into the book yet to make a complete judgement however.

Got to say I like seeing different perspectives on traditional subjects. My youngest is going to Olin next year in Mechanical Engineering and seeing these sorts of articles keeps me excited about her unconventional engineering education at Olin.

I don't quite understand your comment. If the professor you spoke with said "Dr. Downey's approach is spot on" then why would said professor "challenge [Dr. Downey's] thinking"?

These were testing his thinking. They wanted to find holes in it and in this one professor's view they did not.

Nice. I've actually found ideas in programming about modularity and complexity to be ways to approach such complexity in economics, physics, social relationships etc. "Notes on the Synthesis of Form" for example is about design in the abstract and often through examples in architecture but the ideas re: the ontology developed help and are helped by similar ideas in structuring software. This is especially the case when thinking about modeling live simulations like video games. In the vein of "Notes," our normal semantics about the "categories" of thought we have like CS, philosophy etc. might be more "categorical" than needed.

Something I want to play with more is OS-y scheduling and AI-y thought as ways to think about time management or problem-solving for "humans."

"With a computational approach, we can go “top down”, starting with libraries that implement the most important algorithms, like Fast Fourier Transform. Students can use the algorithms first and learn how they work later."

The "Learn how they work later" part sends out alarm bells to me. I don't have a lot of confidence that students will be particularly motivated to dig into how an algorithm works after their problem has been solved. I fear this will lead to a generation of "just use $x package" and people blindly plugging in "magic algorithms" without understanding their choice. "Quicksort for everything" or "include leftpad" if you will...

I see it a little in my industry (engineering): "oh, your data is noisy, just apply a Kalman filter", never mind whether it is appropriate or not.

Another aspect, one that I've encountered many times, is that even when someone has to implement the idea from scratch, they don't work to understand it and instead just translate pseudocode. I once had an interview where I was asked to explain why a piece of code (that the interviewer wrote) worked, because the interviewer did not know.

A benefit of the bottom up approach, of starting with the math and no reference implementation, is that you are less likely to implement it unless you understand it. And I should stress, these are tradeoffs.

This post reminds me of Amir Rachum's post about Knowledge Debt[1].

"This is how programming should be taught. You should do stuff way before you can figure out how it works. For a while, you should intentionally be ignorant about distracting details."

"You should, intentionally and tactically, decide which piece of information you can do without, for now. But you should also, intentionally and strategically, decide when to pay back that debt."

[1]: http://amir.rachum.com/blog/2016/09/15/knowledge-debt/

> Programming has changed. In first generation languages like FORTRAN and C, the burden was on programmers to translate high-level concepts into code. With modern programming languages—I’ll use Python as an example—we use functions, objects, modules, and libraries to extend the language...

This is the first paragraph. I understand that it's just setting the stage, but it makes so little sense by itself that I had to make an effort to continue reading.

Yeah, that was an odd start, given that Dr. Downey is a Professor of Computer Science. Even a charitable suggestion that he was looking to keep things as understandable as possible for a wide audience doesn't seem to give much defense.

I think he was trying to articulate how much more expressive/extensive programming languages and libraries (and tools?) have gotten over the years, so that a student can get to do something interesting with much less down-and-dirty arithmetic and arcana.

What has changed is the modules and libraries, which have exploded in availability.

The actual programming methods haven't changed much since the dawn of Unix, or perhaps Lisp - although it took a while to spread.

The canonical, award-winning version of these ideas is described in Ken Iverson's seminal 1979 paper 'Notation as a Tool of Thought'[0] - highly recommended to anyone who finds the ideas discussed interesting (even if they find the Scientific American article lacking ... because it is)

[0] http://www.jsoftware.com/papers/tot.htm

Peter Naur's "Programming as Theory Building" (http://pages.cs.wisc.edu/~remzi/Naur.pdf) is also a closely related read.

"The computer revolution is a revolution in the way we think and in the way we express what we think.", SICP, 2nd Edition.

The article misses what I believe is the most important point, which is the concept of the abstraction of a function. A function is more than a computer concept: a function is a hammer, a violin, and a microscope; you put an input and you get an output.

Not only that, but you can then realise that functions are manipulable objects too, and have operations of their own--eta abstraction, reduction, composition, etc., and can be used to build all of modern mathematics pretty much from scratch.
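
A small illustration of functions as objects with operations of their own, using composition as the example. This is only a sketch; `compose`, `increment` and `double` are names invented here, not anything from the article.

```python
from functools import reduce

def compose(*functions):
    """Compose right to left: compose(f, g)(x) == f(g(x))."""
    def composed(x):
        # Apply the last function first, then feed each result to the next one.
        return reduce(lambda acc, fn: fn(acc), reversed(functions), x)
    return composed

def increment(x):
    return x + 1

def double(x):
    return x * 2

# A new function built purely by combining two existing ones.
double_then_increment = compose(increment, double)
result = double_then_increment(5)  # increment(double(5))
```

Note that `compose()` with no arguments returns the identity function, which is what you would want from the empty composition.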

The author focuses on the algorithmic and computational aspects of programming. Yet, nowadays programming is much more than this beautiful classical view on programming. Programming is more about describing a complex system by taking into account such aspects as concurrency, cross-cutting concerns, asynchronous events, transactionality, distributed computations etc.

That is not surprising, as this is not an article about programming in general, but about using programming for teaching in other areas.

I would, however, welcome the companion article, "Thinking as a Way of Programming", to balance the prevalent notion that testing is the only way that matters.

You might want to check out the author's book titled Think Complexity....

BTW, the article author, Allen B. Downey, has a marvelous collection of free technical books: http://greenteapress.com/wp/

And in his books, he practices what he preaches: he teaches you complex topics using computer programs. I don't know if it is a good approach for the general population, but for us programmers, it certainly is.

Thanks for linking those!

Not sure what I was expecting from the title, but for sure not what I read. Maybe I'm quite old now, but working with latches and switches was always a way of thinking. Writing awful BASIC with GOTO and subroutines was not that different. Writing OO code is quite different at first impact but then you discover that it is basically the same way of thinking. The real way of thinking of programmers doesn't have anything to do with the languages, it's more about finding solutions for some problem. If you can't find solutions by yourself, banging your head, it doesn't mean that you have to find a better language for your software, doing nothing in the meantime, it means that you have to start from scratch understanding how to fix things and how to solve problems. If you just wait for the next big thing or the next shiny solution that fixes everything for you then imho programming is not for you.

Good point, I think it's about finding the 'correct' ways to put small parts together into a solution for your problem. We certainly have some tools to help us along the way--OOP, FP, etc.--but there's definitely an element of craftsmanship to it.

Nice work. Of course teaching engineering [0] and physics [1] from a computational perspective is not a new idea.

[0] https://en.m.wikipedia.org/wiki/Structure_and_Interpretation...

[1] https://en.m.wikipedia.org/wiki/Structure_and_Interpretation...

I agree partially with this article. I think programming in a highly expressive language and runtime is a great way to get started exploring and developing intuition. I solved differential equations using Mathematica in middle school, long before I even learned how to integrate. I solved instances of the traveling salesman problem long before I understood how complex it is. I played around with images and video files long before I learned anything about signal processing. I find that it is very helpful as a student to be able to learn from bottom up and top down simultaneously. Mathematica is much more forgiving than Python if a student has no prior programming experience.

> The languages I am calling modern are not particularly new; in fact, Python is more than 25 years old. But they are not yet widely taught in high schools and colleges.

Seems like a strange claim, because

> eight of the top 10 CS departments, and 27 of the top 39, teach Python in introductory CS0 or CS1 courses. [1]

[1] https://cacm.acm.org/blogs/blog-cacm/176450-python-is-now-th...

If you posit that python is a better way to teach programming, and that top universities are more likely than average to use better ways of teaching, then that data point may be skewed.

Peter Naur's "Programming as Theory Building" is an excellent read on the topic. Beyond executable code, I've also found the type notation used in Haskell to be a great modeling tool with which I can check the consistency of what I'm designing. I'm not talking Coq-level proofs here - just an elegant notation that you can use on paper as effectively as in code.

Also check out his companion blog post: "Python as a way of thinking". Has some nice slides there.


If you liked this, you may also like https://www.youtube.com/watch?v=6J1vRrozgBg .

PS There is a written version of this, but I could not find a copy that wasn't behind a paywall. Perhaps someone else can.

Some discussion of the efficiency cost of our current favorite higher level languages would be nice. There are things about Python that people love, and I think we should strive to provide that with something more like a 2% performance penalty instead of ~200%+

Languages like Nim are very promising on that front.

Opinion is not fact. Given this is Scientific American should we expect something, say, more scientific?

Let's put this under the heading, "One Size Does Not Fit All".

The underlying assumption of this opinion is that there is only one model to teach to.

In the modern era we really need to take into account our differences.

In this case, functional versus procedural programming.

1. Write this in a Lisp as functional.

2. Present the functional form as human readable as the given procedural.

3. Conduct a study and gather the data.

4. Present the findings.

My experience has taught me that there is a subset of the human population who think more naturally in functional programming.

But if one model is insufficient, why would two be sufficient?

What languages are yet to be invented to meet the needs of the millions of daily programmers?

It is time for experimental science to be conducted given the millions of people involved.

The path has already been laid out with Usability research that is entirely experimental.

It is time to do the hard science on usability of claims of functional and procedural programming languages.

It is time to put to rest opinion and one size fits all.

The reason we went for Turbo Pascal in the old days was the readability (it was C or Pascal).

This article starts to get at an idea that I've had for a while now -- that mathematicians are doing it wrong.

In the software business we learned a long time ago to name our variables properly, name our functions logically, and to control complexity by breaking ideas into modules and then hiding the details inside.

If you can't name something, then you don't know what it is, and that tells you that you should rethink your design.

Mathematicians don't do that. They name variables "a" or "x", or worse, they use some Greek letter I can't type on my keyboard. They are entirely inconsistent in their use of variables: "phi" or "theta" can mean a zillion different things. I can't tell you the number of times I've read a computer science paper, a paper that uses entirely unnecessary equations, and doesn't bother to define symbols. This practice wastes everyone's time.

It's laziness, pure and simple.

Mathematics needs a general overhaul. The language of math needs a complete redesign with a focus on understandability. And the key to it is to force mathematicians to name their variables.

Yes, I realize that mathematics deals in abstractions that have little relationship to the outside world, and it makes little sense to call a variable a "dog" or a "car". So what? It just means that we need a new vocabulary, a vocabulary that includes terms like "fourier transform" or "hypotenuse". Pretty much every industry has its own vocabulary. Chemistry has thousands of terms. Biology even more. Computer science is full of them. Math is full of symbols that have no inherent or well-understood meaning. That should change.

> This article starts to get at an idea that I've had for a while now -- that mathematicians are doing it wrong.

And you're making a typical programmer mistake in thinking that.

Programming is geared towards having people who are experts in programming, but not in a specific subject area, be able to develop and maintain code. Therefore practices like descriptive variable names help provide context to people who cannot be expected to have a deep understanding of their subject.

By contrast mathematics is geared towards having people who are experts in a particular branch of mathematics be able to think and communicate with other experts.

Experts naturally create jargon. Jargon is short, concise and precise. This frees up the mind to be able to consider more complex statements, and removes mental friction from more complex manipulations of those statements. Over a lifetime of expertise, the mental effort of learning the jargon for your specialty gets repaid over and over again.

Mathematics takes this to an extreme. And for professional mathematicians, it is repaid in spades. There is an obvious barrier for generalists who wish to approach a mathematical topic. But your experience of how to be a generalist does not make a mathematician's experience of how to be an effective specialist any less valid.

(Note, this comment is informed by my experience having done mathematics into graduate school, followed by 20 years as a professional programmer.)

True, much of math is relegated to experts. This is _exactly_ the problem. Why do you think so many adults have such little understanding of math and will proudly proclaim that they haven't used algebra or geometry in their adult life?

Yes, there are some areas of mathematics that are extremely specific and do require experts, but a lot of math is useful to people in general, and anything we do to make it easier for a new person to learn is something that will help the average person use it and will also make it easier for new people to enter the field.

The problem is that you can't really "think mathematically" without concise notation. Your brain won't hold the ideas.

The more that you get to stuff that you juggle around, the more that conciseness matters. So an important concept like "eigenvector" can have a lengthy name, but your vector is v, linear transform is T and the matrix representing the transform is M. Any other choice is actively harmful.

Making mathematics accessible to the general public isn't part of mathematicians' jobs, nor is it something most of them care about.

If you want to write some articles that make mathematics more accessible to the general public, and you want to use longer variable names in order to do that, nobody is going to stop you. It's unreasonable, however, to expect mathematicians to try to do this at the same time they're doing their actual jobs.

There is job security in keeping it arcane. Lawyers know that, too. But, since some professional mathematicians teach and others need to communicate with business people, you'd figure there might be some interest.

Yes, but in addition to bringing in more of the general public it also means you can get up to speed in a new area of the field more quickly.

The perception that math is useful stems from its use by non experts.

As a topic goes from specialists to generalists, reusing a tiny set of terms becomes a horrible waste of time. Worse, math becomes less precise when different areas reuse terms like infinity, set, number, area, etc. to mean vastly different things, let alone identical notation for different ideas.

> The perception that math is useful stems from it's use by non experts.

You might be surprised by how many mathematicians would disagree with the assertion that "math is useful" (I suspect a majority would at least think such a statement was overly broad).

Not really, I chose to stop studying math because what I found interesting was generally really useless, and I find a pointless life's work to be a terrible waste of time.

Worst outcome I recall was being yelled at by a professor for not getting a PhD. ¯\_(ツ)_/¯

You could have specialized in a more applied field though? Math is a big place and plenty of people get their hands dirty with useful things.

Yes. As an example of how we programmers are no different than mathematicians, consider terms like "service" (or "data"), which is pretty much our version of the variable "x". Some of the terms in our jargon get used and reused in so many contexts, it must appear nonsensical to an outsider.

> Programming is geared towards having people who are experts in programming

Programming is getting simpler all the time. Compare modern Python or Ruby to the earliest punch cards. We programmers are going out of our way to make it as easy to understand as possible for as many as we can. There are programming languages suitable for children to use while making robots out of Legos.

It is hard to see how descriptive variable names, compared with single-character ones, would create more friction than they remove. They might even let in domain experts from outside math who lack higher-level math degrees.

I am curious which other practices are historical habit rather than objectively pragmatic requirements. There is much greater pressure on programmers to use time efficiently, so we spend a lot of time on pragmatic thinking, and we are nowhere near done. By contrast, you sound like many mathematicians and niche experts claiming that you have it solved and know the most efficient way to proceed without even considering an experiment.

A mathematician doing math is more akin to a processor executing instructions than a developer doing programming. When you have to reference a variable X in your head 1000 times (an underestimate) in the process of working on a problem, it makes immense sense to name it X rather than DescriptiveVariableName.

In fact, in my own research for my Master's degree, I only made breakthroughs once I simplified the notation. And we're talking about a change like Q(x,y,z) -> Q, Q(x,0,y) -> Q_2^0, dQ(0,y,0)/dx -> Q_{1,3}^{1,0}. The power of concise notation can be quite a bit greater than the increase in opacity it creates.

The way math is communicated suffers from this, for sure. However, you gotta think about what will happen once you read a paper that has more descriptive variable names. You sit down to prove a few results. You start manipulating concepts and symbols. You end up shortening the names until you basically come up with your own concise notation. The thing is that how math is read is very much linked to how math is done.

I used to think this way, but as I learned more math, I realized that the way mathematicians do things isn't an issue in practice. I do wish notation was more standardized, but that's a nitpick (i.e. whether/how scalars and vectors are distinguished).

When it comes to proofs, you can only keep so much information in your head at once. In most cases, if you have a ton of variables floating around, you are probably trying to do too much at once and would be better served by breaking out a lemma or two from the main proof.

I will note the worst code I've ever read was written by a control theorist that used only single letter variables. I later found out the variable names were the same as a paper, but the paper was not provided in the program comments.

> Pretty much every industry has it's own vocabulary.

So does math. Open up any textbook and you'll see definitions everywhere. When it is useful to name something, mathematicians normally do.

> whether/how scalars and vectors are distinguished

This becomes less of an issue once you go through some abstract algebra. But typically, constants would come from the start of the English alphabet, and vectors, from the end.

> I will note the worst code I've ever read was written by a control theorist that used only single letter variables. I later found out the variable names were the same as a paper, but the paper was not provided in the program comments.

Did the code remain "worst" after you found out the names came from a paper?

> This becomes less of an issue once you go through some abstract algebra. But typically, constants would come from the start of the English alphabet, and vectors, from the end.

That rule gets broken quite a bit though. For example, if you have an optimization program with some affine inequality constraint A * x <= b, it's not uncommon to refer to the individual hyperplane constraints as a_i^T x <= b_i.
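
To spell out the relationship between the matrix form and the row form, here is a minimal sketch that keeps the math names on purpose: A x <= b holds exactly when every row constraint a_i^T x <= b_i holds. The helper names `dot` and `violated_constraints` and the unit-square example are invented for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def violated_constraints(A, b, x):
    """Return the indices i where a_i^T x <= b_i fails.

    An empty list means the whole system A x <= b is satisfied.
    """
    return [i for i, (a_i, b_i) in enumerate(zip(A, b)) if dot(a_i, x) > b_i]

# Hypothetical example: the unit square 0 <= x1, x2 <= 1 written as A x <= b.
A = [[1, 0], [-1, 0], [0, 1], [0, -1]]
b = [1, 0, 1, 0]

inside = violated_constraints(A, b, [0.5, 0.5])    # no rows violated
outside = violated_constraints(A, b, [1.5, -0.2])  # rows 0 and 3 violated
```

Here uppercase `A` is the matrix, lowercase `a_i` is one of its rows, and `b_i` is a scalar, which is exactly the kind of convention-bending the comment above describes.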

Math obviously has many more concepts than we have notation so selecting a unique notation for each concept is not feasible, but I bet some standardization can be done.

> Did the code remain "worst" after you found out the names came from a paper?

No, I am fine with naming things after variables from papers, but I always put a note as to which paper the notation came from. Giving someone a few hundred lines of code where few variables have more than 2 letters and no context is just cruel though.
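
As a sketch of what such a note can look like, here is a scalar Kalman filter step that keeps the usual single-letter symbols but documents them. The random-walk model and the symbol table follow the common textbook convention and are an assumption here, not taken from any particular paper; in real code the docstring would name the actual source.

```python
def kalman_step(x, P, z, Q, R):
    """One predict/update step of a scalar Kalman filter for a random-walk state.

    Notation (placeholder: cite the paper or textbook the symbols come from):
        x : state estimate
        P : variance of the state estimate
        z : new measurement
        Q : process noise variance
        R : measurement noise variance
        K : Kalman gain
    """
    P = P + Q            # predict: uncertainty grows by the process noise
    K = P / (P + R)      # gain: how much to trust the measurement
    x = x + K * (z - x)  # update: move the estimate toward the measurement
    P = (1 - K) * P      # update: uncertainty shrinks after measuring
    return x, P

# With equal prior and measurement variance the estimate lands halfway.
x, P = kalman_step(x=0.0, P=1.0, z=1.0, Q=0.0, R=1.0)
```

Six lines of docstring are enough to turn "a few hundred lines of two-letter variables" into something a maintainer can check against the source.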

It's more standard than people realize. Offhand:

x: "some real-valued variable"

n: "a countable quantity, usually a total"

i: "an index"

k: "some kind of constant", often an integer, whose value doesn't change, "c" is also used for this

e: almost never used as a variable, it's Euler's number

p: some kind of probability, or a prime number (often alongside q)

t: some kind of parameter, often goes from [0,1] or (0,1)

z: complex numbers

I have a Master's in EE so I've studied this a bit.

For any high school student... i is the imaginary unit. z is also the third dimension, as in x, y, z. e is a base-10 exponent, as in 4e12.

So, while you might be used to that notation and in context it's clear, that's far from universal. Further, someone can be an expert in more than one area, and incompatible notation slows down collaboration for little gain.

U, V: vector spaces

u, v: elements of vector spaces

G, H: groups

g, h: elements of groups or

g, h: homomorphisms, isomorphisms etc

e: group identity

K, F: fields

I: ideals

f: functions

(x): sequences

x_i: i-th element of a sequence

A, B: matrices

It all depends on the context it's used of course

This notation has developed over tens to hundreds of years before we had the capabilities of autocomplete and formal typing where a computer can help us write longer names more quickly. This is why single letters became prominent, they were simply easier and faster to write.

But anyone who is serious about writing maintainable code today should be using an IDE, where the benefits of succinctness are entirely negated by IntelliSense-like tools.

Trading readability for conciseness is near the top of my list of "crimes against future maintainers."

So I had never thought about this in the context of mathematical symbols, but this makes total sense and I'm strongly in favor of trading mathematical conciseness for readability and specificity.

It's actually the other way around. Mathematical notation used to be very much language-like and tedious to read. As time went by (and math became more complicated) notation was developed to make it more succinct and easier to understand. (And sometimes the more succinct notation helps to develop new insights. The change from Roman to the Indian/Arabic number systems made calculations easier for everybody) https://en.wikipedia.org/wiki/History_of_mathematical_notati...

Compare the two following statements:

One from Euclid's Elements (written 2.5k years ago):

"Given two straight lines constructed from the ends of a straight line and meeting in a point, there cannot be constructed from the ends of the same straight line, and on the same side of it, two other straight lines meeting in another point and equal to the former two respectively, namely each equal to that from the same end."

And my attempt at translating the above into what should effectively be Hilbert's notation (19th-20th century):

If there are two triangles ABC and ABD where AC=AD and BC=BD and C and D are on the same side of AB then C and D are the same point.

Which one was easier to parse in your mind?

As a bonus, try rewriting this formula using longer variable names and tell me how legible it would look: http://i.imgur.com/wCWkyNL.png (it's from a proof of one of Sylow's theorems, https://en.wikipedia.org/wiki/Sylow_theorems)

Conversely, you just draw a picture and leave all of this tedium behind, making the import of what you are talking about obvious at a glance.

Proofs by picture aren't proofs, though. And how would you even convey by picture that two line segments are of the same length? Or that if you drew C and D as separate points, they turn out to be the same point?

Oliver Byrne's edition of Euclid is a nice proof-by-picture example: https://www.math.ubc.ca/~cass/euclid/byrne.html

Those look more like proofs with pictures, than proofs by picture, but I'm too lazy to get involved into the obscure notation of that book and check whether a random proof from the book would be equivalent to a modern proof from a standard textbook.

Although, judging by the old-timey language of the book, it's possible the book predates Hilbert's axiomatization of Euclidean geometry and the proofs in it were good enough for the standards of its time.

In modern mathematics, proof by picture generally means you've drawn or pointed out a single example, possibly wrongly or in a way that doesn't generalize, and because you've shown that one example holds, you assume all possible examples hold. That, obviously, need not be the case.

If you're talking about mathematically rigorous programming, conciseness and readability go hand in hand. In many cases it's actually easier to read a formula with single-letter components than a complex one with long descriptive names. Also, if you're doing that kind of programming, you're also doing a lot of computation by hand or in LaTeX. Minimizing the visual distance between the code and the convenient handwritten or typeset notation is incredibly important.

I once saw a statistics prof give a lecture with so many different meanings for sigma, s and S, that he had to resort to colored markers. And we were all still lost!

k, along with j, is also often an index.

Of course, this mathematical convention carried over into programming as well, with i,j, and k serving as index variables.

The letter "e" is often used as a generic edge name in a graph. Usually "v" is for vertex in that context.

If a variable can be literally anything, what would you name it instead of 'a'? 'anything'?

Sometimes a variable does not refer to anything specific at all. For example, type variables in Haskell are often denoted using single letters, simply because they can be literally anything, and no sane human wants to type out 'anything' all the time, just so her code can be used as a beginner's intro to the language in question.

There may be occasions when a single letter makes sense, but this should be the exception, not the rule. If you've got a ton of "anything" variables in your code, then that means there's probably a better, more understandable design for your system which expresses what the variables represent with more precision.

The variables named 'anything' can represent anything, that's the point. Are you saying we should avoid writing code that's too general?

I understand that not all languages are powerful enough to allow defining functions that can operate on any type. But this doesn't mean that languages with this expressive power should refrain from using it; it means that users of less powerful languages need to learn something new (as I had to).

That would create incredibly verbose code, which would be very hard to parse (IMHO).

Exactly the opposite. Typed languages without parametric/generic code result in extraordinarily verbose and bug-prone code. You end up constantly casting to/from void* or Object.

That's why all popular typed languages after C provide a facility for writing parametric code.

And without exception, the standard libraries for those languages use single-letter names (S, T) for types. Because when you have 1-3 type variables, and you're writing generic code, "Type" doesn't really communicate anything that "T" doesn't.
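To make that concrete, here's a rough sketch in Python using the standard `typing` module. The function name `map_list` and the sample data are just made up for illustration; the point is only that `T` and `U` say as much as any longer name would.

```python
from typing import Callable, Iterable, List, TypeVar

# T and U are conventional single-letter type variables; they stand for
# "some type" and "some other type" and carry no further meaning.
T = TypeVar("T")
U = TypeVar("U")


def map_list(f: Callable[[T], U], xs: Iterable[T]) -> List[U]:
    """Apply f to every element of xs, whatever the element type is."""
    return [f(x) for x in xs]


lengths = map_list(len, ["a", "bb", "ccc"])  # here T = str, U = int
halves = map_list(lambda n: n / 2, [2, 4])   # here T = int, U = float
```

Renaming `T` to `InputElementType` wouldn't tell the reader anything new, since the function makes no assumption about what the elements are.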

> If you've got a ton of "anything" variables in your code

> design for your system

We're not talking about programming; we're talking about math; that's the whole point.

In a broader context, the two are more similar than you are giving them credit for. I was working on a paper and wrote a rather long messy proof that involved a large number of variables. When I finished, I thought the thing was too difficult to follow so I rewrote it. This time I "refactored" the proof by breaking out a bunch of lemmas and developing a more concise notation. The result was not only a more legible proof, but a shorter one as well.

I completely agree that doing math and doing programming feel similar in a lot of ways.

I just think they are different enough that not everything is analogous, including best practices on variable naming.

Mathematicians work with pen and paper regularly. So long var names can be frustrating. If everyone goes to digital format it may be easier.

However, there are forces working against that. Mathematical proofs are more like sentences, or even poetry, than a structured language that can be compiled. They are meant for humans to read. So you jot down your proof over and over again in different formats, playing with both the logic and also the visual representation.

Another issue is that a lot of math is done just in your head, so you don't need a computer at all. When writing code you inherently need to test the code's execution, and you also reference digital materials more frequently in your research.

Conciseness matters. For example, Einstein's convention for tensor notation - a lot of explicit sums are lost!
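For anyone who hasn't met the convention, a small illustration (matrix multiplication is my choice of example, not the parent's): an index that appears once up and once down is summed over, so the summation sign can be dropped.

```latex
% Explicit sum over the repeated index j
C^{i}{}_{k} = \sum_{j=1}^{n} A^{i}{}_{j} \, B^{j}{}_{k}

% Einstein convention: the repeated index j is summed implicitly
C^{i}{}_{k} = A^{i}{}_{j} \, B^{j}{}_{k}
```

With three or four contracted indices in one expression, the saving becomes much more noticeable.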

> ... a paper that uses entirely unnecessary equations, and doesn't bother to define symbols. This practice wastes everyone's time.

I agree with you there, notation should be defined, otherwise it doesn't help explain anything.

However, consider that for people who are already used to the notation those "unnecessary" equations are actually more compact and precise than reading the accompanying text. What seems difficult to you may be easy for someone else, and vice versa.

> ... the key to it is to force mathematicians to name their variables.

I believe you overestimate how much of an improvement that would be. How much code have you seen that had variables like int num or String str? If num can be any arbitrary number, and str is just a generic string, there isn't necessarily a more descriptive name you can use.

Mathematics is full of these cases. Say, some general equation involves three real numbers and a real valued function of two real parameters. They are completely generic, so a mathematician might name them x, y, z and f and write the equation as f(x,y) = z.

What would you call them? All I can think of would be the first parameter, the second parameter, the result and the function, which is not much more descriptive for the added verbosity.

The problem with using descriptive names in mathematics is that most entities you are dealing with are so generic that naming them doesn't help. Of course I'm not opposed to e.g. writing NormalDistribution instead of just N, since this is actually a very specific concept. But there is just no way to completely eliminate single letter variables from mathematics without using an equally nondescriptive replacement.

> However, consider that for people who are already used to the notation those "unnecessary" equations are actually more compact and precise than reading the accompanying text. What seems difficult to you may be easy for someone else, and vice versa.

A struggle for me over the years has been that there does not really seem to be any way of learning the notation, separate from the standard university education process. I'm not about to drop out of my career and go back to school just to learn to read mathematical notation, and it's basically impossible to look up the definitions for obscure symbols with unguessable names, so I simply remain ignorant.

Can you give an example of a place you've seen notation you can't understand? Typically, papers include a brief notation section where they define notation you can then google. In other cases, I've normally been able to google things like "what does the bar over <entity> mean?" successfully.

The last paper I remember trying to crack was Damas & Milner, on type systems, since I was working on a problem which appeared to be analogous to type inference. I asked a friend for help, and he very kindly translated it for me - he's actually written a long series of blog posts now, based on those emails:


As with virtually every other CS paper using math notation, I found the Damas & Milner paper utterly incomprehensible at first, but once I'd seen the formulas translated into a notation I can actually read, I was able to go back and learn something useful from it.

Unfortunately, it appears to be the case that there really is no one such thing as "math notation", not in the sense that there is one programming notation called "Python" and another called "Haskell" and yet another called "C++", such that one can go read a tutorial for some specific language and thereby come to understand how its notation works. Instead, it appears that "math notation" is a huge collection of little micro-notations, all mixed up together in a more or less ad-hoc fashion by the author of each paper.

Naming is hard, no question. Context helps, though. In the realm where I live, computer science, x, y, and z likely correspond to something in the problem domain, like "rate" or "time". And even if we're dealing in pure, as opposed to applied, math, you should still be able to come up with names like "surfaceArea" or "interval".

So how would you rename a, b, c, x and y in the quadratic formula?


It is impossible to get better than one-letter variable names here. This goes for most expressions in maths: to make things simpler, we just use one-letter variables for everything. This forces us to learn that the names aren't important; all that matters is how the names relate to each other.

The problem with notation only occurs when non-mathematicians start to write down formulas without properly defining what everything is. This is not a problem with mathematical notation; it is a problem with statisticians, economists, physicists, computer scientists, etc., who define their own global naming rules where P might denote probability, price, momentum, polynomial-time problems, etc.
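For comparison, here is the standard quadratic formula written both ways in Python. This is only a sketch: the long names (`quadratic_coefficient` and so on) are hypothetical ones I picked, and `cmath` is used so that a negative discriminant still gives an answer.

```python
import cmath


def solve_quadratic(a, b, c):
    """Roots of a*x**2 + b*x + c = 0, assuming a != 0."""
    d = cmath.sqrt(b * b - 4 * a * c)  # square root of the discriminant
    return (-b + d) / (2 * a), (-b - d) / (2 * a)


def solve_quadratic_verbose(quadratic_coefficient, linear_coefficient, constant_term):
    """Same computation with hypothetical descriptive names."""
    discriminant_root = cmath.sqrt(
        linear_coefficient * linear_coefficient
        - 4 * quadratic_coefficient * constant_term
    )
    return (
        (-linear_coefficient + discriminant_root) / (2 * quadratic_coefficient),
        (-linear_coefficient - discriminant_root) / (2 * quadratic_coefficient),
    )
```

Both functions compute the same thing; which one is easier to check against the textbook formula is left to the reader.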

The problem with the quadratic formula is that, without referring back to the polynomial or thinking through the proof, I have no way to know which coefficient corresponds to which term of the polynomial. Here's an improvement on the variable naming:

X and Y are descriptive of an overwhelmingly common abstraction, the cartesian coordinate system. They do about the best job they can do.

The coefficients (a, b, c) are opaque. It seems like they would be better generalized as (C_2, C_1, C_0) (or subscripts instead of _#), to let C_N represent "The coefficient for the x^N term of the polynomial").

A few extra characters that are acceptable to mathematicians (subscripts are ++good) would make the formula much more readable, and lets us generalize to higher-order polynomials for similar solutions (not that they necessarily exist).
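Spelled out with lowercase c for the coefficients (my choice of letter), the indexed version of the quadratic formula would look something like this:

```latex
% Polynomial: c_2 x^2 + c_1 x + c_0 = 0, with c_2 \neq 0
x = \frac{-c_1 \pm \sqrt{c_1^{2} - 4 \, c_2 \, c_0}}{2 \, c_2}
```

The subscript now tells you which power of x each coefficient belongs to, without referring back to the polynomial.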

By the time you get to math in college (at least past the normal calculus sequence) that's how it typically is done.

  \sum_{i=1}^{n} a_i x^i
And other similar conventions.

Not everything can be so easily named. A lot of times mathematicians deal with a much higher level of abstraction than you are, and at that level, there are no intuitive names available. It is not a failing of mathematicians, but rather a failing of human natural language to be unable to name these concepts.

You might want to read some very old math books that predate the introduction of symbols to stand for things. Go read how ancient mathematicians expressed Pythagoras's theorem. Those verbose monstrosities are much simpler now.

Natural language is verbose. Once you can deal with the level of abstraction, conciseness reduces your cognitive load significantly.

It is a different convention but I don't think it's inherently less understandable. The semantically opaque variable names in math texts are usually accompanied by some sort of explanation, e.g. "Let x be... and let f(x) be...". Whereas this type of comment is usually required for a proof, comments are often omitted from programming statements when the variable names are self-documenting.

I took discrete math last semester and part of the course focused on binary relations and their properties (reflexive, symmetric, transitive...etc). I saw students memorize the mathematical definitions for all properties and yet be unable to actually apply them to a particular relation. I suspect many people ended up ignoring the definitions and coped by developing their own or using intuition. This seems like a sign that math is not being communicated effectively.

"Translating" the math into Python code[1] helped me understand what these definitions were saying and identify algorithms for exam time. The downside is that it took a lot of time and may not have been as effective as grinding practice questions.

1 - https://nbviewer.jupyter.org/github/bryik/jupyter-notebooks/...
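For readers who don't want to open the notebook, a translation along these lines might look like the sketch below. This is not the code from the linked notebook; the function names and the `divides` example are assumptions of mine, with a relation modeled as a set of pairs.

```python
def is_reflexive(domain, relation):
    """Every a in the domain satisfies (a, a) in R."""
    return all((a, a) in relation for a in domain)


def is_symmetric(relation):
    """Whenever (a, b) is in R, (b, a) is in R too."""
    return all((b, a) in relation for (a, b) in relation)


def is_transitive(relation):
    """Whenever (a, b) and (b, c) are in R, (a, c) is in R too."""
    return all(
        (a, d) in relation
        for (a, b) in relation
        for (c, d) in relation
        if b == c
    )


# Example relation: "a divides b" on the set {1, 2, 3}.
domain = {1, 2, 3}
divides = {(a, b) for a in domain for b in domain if b % a == 0}
```

Each function is nearly a word-for-word transcription of the quantified definition, which is probably why the exercise helps.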

IMO this is sort of an ironic example because it's a case where the names actually convey meaning well. It's a case where I've specifically noted to myself "I'll always remember these definitions because the names are so nice".

What about symmetric vs antisymmetric? My professor even warned us about how the names can be misleading (it is possible to have a relation that is both symmetric and antisymmetric).
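A quick check of that last point, using the usual textbook definition of antisymmetry (if a R b and b R a, then a = b). The example relations below are ones I picked for illustration:

```python
def is_symmetric(relation):
    """Whenever (a, b) is in R, (b, a) is in R too."""
    return all((b, a) in relation for (a, b) in relation)


def is_antisymmetric(relation):
    """Whenever (a, b) and (b, a) are both in R, a == b."""
    return all(a == b for (a, b) in relation if (b, a) in relation)


# Equality on {1, 2, 3}: only pairs of the form (a, a).
equality = {(1, 1), (2, 2), (3, 3)}

# "Less than or equal" on {1, 2, 3}.
less_or_equal = {(a, b) for a in range(1, 4) for b in range(1, 4) if a <= b}
```

Here `equality` passes both checks, while `less_or_equal` is antisymmetric but not symmetric, so the two properties are not opposites.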

In this case there's not really a naming problem either. The confusion comes from the idea that ((A -> B) && (A -> ~B)) is sometimes true. Capturing that subtlety in a name doesn't seem practical to me.

I'm not sure I understand. The problem with symmetric/antisymmetric is that the names make students think they are related and this can lead to misunderstanding (e.g. "since this relation is antisymmetric, it cannot be symmetric"). Arguably, this confusion would not exist if they were named differently.

I think maybe there's some confusion here regarding the definitions of symmetry and antisymmetry, because they're very closely related. Symmetry says "whenever A, then B" and antisymmetry says "whenever A, then not B". I.e, given a piece of information A, symmetry and antisymmetry tell you to draw opposite conclusions, which is why one is "anti" the other. The only time a relation can be both symmetric and antisymmetric is when the antecedent in their definitions is never true, which is the trivial case.

Is the problem that you want formulas to make sense on their own without the surrounding text? That's not how math is meant to be read. Formulas are reserved for only a few thoughts that are best expressed in that concise notation. The bulk of most mathematical writings is text.

Exactly. These arguments seem to be coming largely from people who don't actually use much math in their programming.

Are you a mathematician? How much math have you studied?

I'm a software developer. I need to understand computer science papers to get my work done.

I've studied enough. High school and several courses in college. Enough that I should be able to take it from here and learn on my own.

So let me get this straight...You are mainly a software developer who has an issue with math notation that is meant for mathematicians? Could it perhaps be that you haven't taken enough math to become in tune with how it's done/studied/read?

I come from the other direction. I studied math first. And yes, at first I had an issue with the long variable/function names. But soon enough I realized that programming is done quite differently than math. With IDEs and autocompletion, long variable names are not really a problem.

One could argue that variable names should be descriptive in programming, because there is a lot more reading involved. Whereas in math, there is a lot more writing/doing involved. I might manipulate a math variable name 1000 times in my head in the process of working on a problem. I might even WRITE it down that many times! In programming, I might READ a variable name 1000 times in debugging.

I agree. As someone who has done mathematics research, and studied on the purest end of the math spectrum, it's so aggravating to hear engineers who "took BOTH differential equations classes" explain how math should be when they clearly don't know what they're talking about.

I don't like mathematical notation's ambiguity and opaqueness much either, but it is fairly important for the notation to be terse. It's more important than in software because so many proofs are about this being equivalent to that, and symbols give you a lot of room to work with, which you would not have with words. And I mean, the vocabulary already exists, papers just need to explain which symbol represents what. If a paper tells me that epsilon is the learning rate for some model, I know what that means and I can just keep it in mind when I read the math. I don't think I'd understand better if they write out "learning rate" over and over again.

But that's not our experience in software. We do write "learningRate" over and over again, because we've learned that that makes the code much more understandable than just writing "e". In fact, we've learned that excessive terseness in software is an anti-pattern.

That works fine for software, but mathematics is a different use case. If you want to simplify or rearrange an equation, what matters most is the structure, not the precise meanings you assign to the variables. If you want to understand how an equation relates to the real world or the task at hand, I agree that named variables help -- but if you want to do math, that is to say, manipulate equations, it makes more sense to work in a terser representation. Patterns will stand out more.

Instead of dismissing mathematical syntax as laziness, plain and simple, Jeremy Kun (a mathematician) explains some of the reasons behind mathematical syntax.


Short variable names as "doing it wrong" depends on context. There are good reasons in math and in programming to use short names. See for example:

Descriptive Variable Names: A Code Smell. http://degoes.net/articles/insufficiently-polymorphic

Interestingly, 500 years ago, math was to your taste. There were not many symbols in it, but lots and lots of English (or, more often, French, German, or Latin) sentences that took paragraphs upon paragraphs of text to describe what today a couple of equations do...

To "explain".

I find a good paragraph explaining what the equation means and implies is far more useful than the equation alone.

This is all the typical "what" vs "why". If you just want to describe the what, then the formula alone might be enough (presuming everyone knows what the variables mean), but if you actually want to convey some meaning, that formula needs to be explained.

You are arguing against a strawman. I have never seen any piece of mathematics that was just formulas. The heavy majority of any mathematical writing is plain text -- formulas and notation are only used when it would be much less practical to express something in text.

I appreciate your response; it is well thought out and leaves me struggling hard to find a counter-example. I feel that I know that I have had to deal with this, but I cannot find examples.

Before your response I had several hours with just downvotes and did not know why. I wish people would comment with their downvotes.

This feels like an instance of the XKCD:927 principle. Math notation tends to be big and ugly and hard to wrap your head around at times. It's context-specific: if you're summing over an array, you'll maybe use i as an index value and reference x_i; if you're working with complex numbers, i is a constant.

Importantly, this language and its practices have evolved over centuries, in response to a lot of different competing needs. It's probably highly-optimized for something, just not what you need it for.

That said, if you're dealing with indices of an array in a sum / product series and that array is of complex numbers, it's actually not terribly ambiguous to do x_i * (2i - 1) or something.
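As a rough sketch of how that reads in code, here is the same kind of sum in Python, assuming the i in (2i - 1) is the imaginary unit and the subscript i is the index. Python sidesteps the clash by spelling the imaginary unit `1j`; the sample values in `xs` are made up.

```python
xs = [1 + 2j, 3 - 1j, 0.5j]  # hypothetical sample data

# i is the loop index here; the imaginary unit is written 1j in Python,
# so "x_i * (2i - 1)" becomes xs[i] * (2j - 1).
total = sum(xs[i] * (2j - 1) for i in range(len(xs)))
```

Whether a reader takes "2i" as twice the index or twice the imaginary unit is exactly the kind of thing the surrounding text has to settle.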

On the other (third? gripping?) hand, you do have a point - there are usually unwritten practices about what different variables reference (theta is usually an angle, k is a constant, n an integer, usually). That can be gatekeeping, and it's super-frustrating to work in a language that doesn't expose type information.

927: https://xkcd.com/927/

Thank you for writing a terrific criticism on how math is taught today.

In part because it's one of the oldest taught subjects, I think the field has gotten away with how it teaches. In every other field we have an escalation ladder of abstraction that starts from tangible descriptions that a motivated learner could understand all the way up to symbolic jargon that is only accessible to insiders.

Math doesn't have this at all. It goes from arithmetic straight into symbolic manipulation with only a sideways glance at describing three-dimensional shapes. It's lazy, and the field needs to acknowledge that it has a practical dimension as well as a theoretical one, and that each needs to be taught and described differently.

> It's laziness, pure and simple.

No: it's assembly language, for the hand and pen.

Languages like Julia allow you to use greek letters. Just saying.

Doesn't any language that supports Unicode allow that? Java, for example. But it's rather impractical to use, since Greek letters are not that easy to type with an English keyboard.

Julia's REPL allows for LaTeX-like notation (e.g., you can type \alpha to get the matching Greek letter), which makes it quite natural to use the Greek letters in places that make sense.

If the intended result is some kind of scientific presentation then kinda yes. If I am writing a classic software code, then I do not see how \alpha is better or more readable than, well - alpha.
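For what it's worth, Python 3 also accepts Greek letters as identifiers, so the two styles can be put side by side. The damped-cosine expression below is just an arbitrary example I made up to have something formula-like to look at.

```python
import math

# Greek identifiers are legal Python 3 names.
α = 0.5
θ = math.pi / 3
damped = math.exp(-α) * math.cos(θ)

# The same computation with ASCII names.
alpha = 0.5
theta = math.pi / 3
damped_ascii = math.exp(-alpha) * math.cos(theta)
```

The results are identical; the only difference is how closely the source resembles the typeset formula, and how easy it is to type.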
