Can a simple algebra test predict programming aptitude? (codeup.com)
104 points by dennyferra on Dec 12, 2014 | 212 comments

Why do you care about it in the first place? Also, what is your definition of a good programmer? Someone who can just write a function? A class? A big app?

IMO you don't need to learn much math before programming to be a good programmer. Basic operations are good enough; if you need more, you learn as you go, since you'll never know in advance what you'll need. Lazy learning, like lazy variable loading.

The test they are giving is more like problem solving, which is closer to what programming actually is, than math. Of course you need basic math (+, -, /, *), but what really matters is knowing which operations to do. The operation itself is the easy part.

A better test would be to have candidates choose between approaches to solving a problem and explain the rationale behind their choice. That is what will define a good programmer.

Also, what better indicator do you want than programming itself? You'd never make a decision based on this test alone.

Also, it's one thing to figure out how to solve "here are 3 consecutive integers with a sum of 69. What are they?" and quite another to solve "here are X consecutive integers with a sum of Y. What are they?". Same with "Adriana’s age is 1/3rd of her dad’s age. If her dad is 36 years old, how old is Adriana?" vs "W’s age is X of Y’s age. If Y is Z years old, how old is W?". The complexity increases a lot.
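For what it's worth, the generalized version is still only a few lines once you do the algebra (a sketch in Python; the function and variable names are mine): the smallest term n of x consecutive integers summing to y satisfies n*x + x*(x-1)/2 = y.

```python
def consecutive_integers(x, y):
    """Find x consecutive integers summing to y, or None if no integer solution exists."""
    # n*x + (0 + 1 + ... + (x-1)) == y, so solve for the smallest term n.
    numerator = y - x * (x - 1) // 2
    if numerator % x != 0:
        return None  # y cannot be split into x consecutive integers
    n = numerator // x
    return list(range(n, n + x))

print(consecutive_integers(3, 69))  # -> [22, 23, 24]
```

So the "harder" generalization collapses back to the same one-step algebra the test is probing for.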

I disagree with many of your points.

First, the algorithmic part of programming is a lot like doing algebra. So much so that algebra is an important part of any serious CS syllabus. Clearly "writing a function or a class" is not the skill they are testing here, but problem-solving.

Knowing how to solve simple algebraic problems engages your "problem solving" skills in a similar way as solving something by writing a program. Note that algebra is not "basic math (+,-,/,*)" like you said. Algebra, like programming, requires the ability to understand and write abstractions, and to figure out how to approach a problem.

> Also what better indicator you want than programming itself?

They clearly want to establish a correlation between something else and programming. This is interesting in itself.

IMO, what algebra and programming have in common is simply abstraction, so people in the mindset of recognizing patterns to which abstraction can be applied tend to do better as programmers / automation engineers in general.

Algebra and programming are very, very similar. Algebra [1] deals with formal languages and their rewrite rules in general. Programming is dealing with a certain (but yet very wide) class of formal languages and an extremely wide variety of rewrite rules (i.e., algebras).

It does not matter that nobody is using, say, a term "refactoring algebra", but it is still, essentially, an algebra.

[1] Here I'm referring to a generic meaning of "algebra", as in https://en.wikipedia.org/wiki/Universal_algebra

Even if it were just that, it would probably work as a predictor of skill. After all, being good at one system of formalized abstract thought is surely a reasonable predictor of being likely to master another one.

But it's more than that. Algebra and "algebraic reasoning" (to use a non-technical term I just made up) are the underpinnings of Computer Science, which is why Algebra is featured prominently in the CS syllabus. Of course this is obvious in CS theory (where you'll play with numbers as if they were programs), but it's also useful in order to write actual programs.

I'm not saying you must understand Algebra in order to write a CRUD in your typical web application. Likewise, you probably don't need to understand Relational Algebra to write SQL for the aforementioned web application. You can understand surprisingly little in order to "muddle through". And intelligent self-made programmers will actually be able to write complex software without a solid theoretical foundation to back them up (maybe rediscovering the wheel along the way). It's definitely possible to write software without knowing the theory.

However, I think you'll be a better programmer if you do. (My highly subjective opinion: it's definitely more fun if you also understand at least some of the theory!)

>why do you care about it in the first place? Also what is your definition of good programmer?

Is this a joke? They care because they are selecting applicants to a programming boot camp and want to select the ones more likely to be successful. I suppose they could make all of the applicants learn to program and select the best ones, but there could be issues with that. And they are pretty clear that they are defining 'good programmer' as 'good outcome in their course'.

> A better test would be...

Prove it.

>They care because they are selecting applicants to a programming boot camp and want to select the ones more likely to be successful.

Just give a programming test. IMO it's more reliable than relying on whether someone knows algebra well or not. But I don't have any data to show you; it's just me thinking about it.

>Prove it.

I can't, you are right. I just said what I was thinking.

You seem to be missing the point. They can't use a programming test on people who don't know how to program.

The thing is that they aren't saying that algebra learning causes good programming. They're saying it predicts it. They don't delve into underlying factors, but general mental ability is certainly implied.

In other words, an algebra test just acted as a noisy, imprecise measurement of general mental ability.

For an article about math, they seem surprisingly weak at differentiating between causation and correlation.

It’s fairly common for incoming Computer Science majors to ask the question, “Why do I have to learn all this math if I just want to learn to program?” The correlation above suggests a possible answer: The ability to understand basic mathematics is likely correlated with the ability to “think algorithmically,” which is well-known to be a foundational skill for expert programmers.

If it's just a weak (as the chart suggests) correlation, it doesn't mean learning one will make you better at the other. And if you do assume causality - it can go either way. Perhaps learning math makes you better at programming. Perhaps learning programming makes you better at math.

My 2 cents... It's more complicated because there are other things involved. Perhaps it's the nerd gene that makes people who like computers also more likely to play D&D and be in band. (I was a card-carrying member.) Does one of these 5 variables cause the others, or are they all part of the same thing?

They may be bad at differentiating causation and correlation, however, if they're just using the results as a way to try and predict which candidates will do better in the classes, this seems like a pretty good application. They're not going around teaching people algebra, based on this hypothesis -- they're recruiting potentially good programmers.

However, it is correct that educators would need to do a little more research if they want to try and apply these results to a classroom.

Precisely what I thought as soon as I read the study, simply because I was terrible at math generally (and algebra specifically) before I studied CS. Afterwards, it turns out I'm fine at math generally (and have a bit of a knack for algebra). The problem I had was not the subject matter but the teaching paradigm.

A much more interesting experiment would be to add the algebra test back into the end of the program and see if there is improvement and how that correlates (or doesn't) to programming competence.

I was in the exact same boat. I was "shit" at math, but my love for tech pushed me to CS. I barely passed the required maths such as the calcs and discrete math, but I did exceedingly well in the CS classes. I even did okay in the algo classes just because I was driven. The longer I'm in the industry, though, the easier math becomes. I've been teaching myself algebra and calc on the side and things come a lot easier. Maybe it's just the wrestle-with-it attitude I've picked up over time, or maybe my brain is rewired. Who knows.

The way I see it, you became good at math the same way a musician gets to Carnegie Hall - practice, practice, practice.

For some reason - possibly due to the disproportionately young age of mathematicians - there's a prevailing idea that math is an innate talent, something which you are either born with, or you aren't. I strongly disagree with that notion - putting in that 10,000 hours can improve one's mathematical abilities.

Shouldn't you, if you are going into a CE or CS course, have learnt to code at school before going to university? English lit or classics students are expected to be able to read.

I learnt to program when I was 13, and that was in the middle stream at my upper school.

This is highly dependent on your opportunity. As hard as it is to find qualified math and English teachers prior to the college level, think how hard it would be to find programming instruction.

That is to say, when I was in school before university there was no choice for studying programming.

There was but it was offered at a private religious school/camp, not via the public schools.

This was in a comprehensive, i.e. a bog-standard school, and was in the middle, i.e. vocational, stream. Possibly the approach of specialising at middle and upper school works better for feeding children into the university/vocational track.

Right, so the bog standard schools you had the opportunity to attend offered something mine (and many others) do not. As someone who is always looking for good developers I certainly don't want to limit the number of people who can learn CS to only those who happened to live somewhere that offered programming at the public school level.

Possibly not fetishizing CS over a more useful general introduction to programming might help.

I haven't had to do any of the big-O stuff in anger, but I have had to make sure a major telco's billing system reconciled.

At my uni the assumption was that you didn't need to know before you started, but the uptick in difficulty in 2nd-year classes was steep, and you'd wind up with a few "15+ hours of non-class work per week" courses simultaneously. So most of the top students had programmed before, but there were a couple of very talented outliers who just had an aptitude for it. (Smart, plus no bad habits, plus work ethic.)

Your quote doesn't prove your point that they're confusing correlation and causation. In fact, it even uses the word "correlation" in it.

If there is one thing that has always bothered me, it is people evaluating students' prowess in "Computer Science" (which is not programming, shame on you) by having them learn your normal run-of-the-mill programming language/paradigm like Javascript, Python, PHP and whatnot, and then claiming they are not apt to program because they cannot solve problems or handle variables, mutability, etc.

There are A LOT of different programming/automation paradigms in the world, and lots of different languages: declarative, functional, OOP, imperative, logic, etc. Maybe some of these students just can't get the convoluted mutable and unsafe structure of a Python or Java program, but they might grok and appreciate the safety and simplicity of ML/Haskell or the flexibility of Lisp. Why would you turn them down in such a way, or force them through a path they don't really need to go through?

During my first year of my Bachelor's in Computer Science at university we had an introduction to programming in Scheme. It was a mind-expanding experience for me. I already knew how to program in C, Java, C++ and C# with a bit of PHP/Javascript back then, and I was well ahead of most of my peers, but still I sat at the computer trying to learn this language that was alien to me, and luckily I was humble enough to force myself through it and adapt my mind to a different paradigm. I know for certain that a lot of my friends and colleagues who already knew how to program struggled hard on that exam, and some had to re-take it the following year (when it was unfortunately changed to C++, and they managed to pass without trouble), simply because they didn't think Scheme was a "real" programming language or useful at all. It was just too different from their previous experience.

What's even more interesting, I know some people from that class who had never touched a programming language before and weren't particularly strong at math. Those people are the ones I recall enjoying the course the most; they found it the easiest among all the other courses we had and passed it with excellent grades. Simply because their minds were apparently better wired for such a paradigm, and they had no preconceptions or prejudices that prevented them from learning it properly.

So my bottom line is, what makes you think that some people might not just have a "differently wired" brain that makes them think more easily with a different non-imperative paradigm for programming?

A "differently wired" brain is one with the ability to manipulate abstractions, which is required no matter what kind of programming language you use. And it is also required for doing algebra.

Recursion is as hard as mutable state to grasp, if not harder. We might find a few people who can do one thing slightly better than the other, but not complete blanks.

The whole point of what I said is that "hard" is a difficult thing to judge. For some tasks and some languages recursion is much easier to understand, teach and use. Just take the good ol' classical example of quicksort in Haskell or fibonacci in Lisp.

Some people can grok recursion in a functional programming language much faster and more easily than they would understand the concept of branching, variable assignment (with mutability) and non-pure functions (what we used to call "procedures" back in the day).

> Just take the good ol' classical example of quicksort in Haskell or fibonacci in Lisp.

Hell, recursive quicksort and mergesort are easier to teach and remember than the iterative versions even in C and Java. They're divide-and-conquer algorithms, which makes them a natural fit for a recursive solution.
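To illustrate how short the recursive formulation is (a sketch in Python rather than Haskell, but the same shape as the classic Haskell example):

```python
def quicksort(xs):
    """Recursive quicksort: partition around a pivot, recurse on each side."""
    if len(xs) <= 1:
        return xs  # base case: already sorted
    pivot, rest = xs[0], xs[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)

print(quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # -> [1, 1, 2, 3, 4, 5, 6, 9]
```

The divide-and-conquer structure reads straight off the code, which is exactly why the recursive version is the one people remember.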

The classic C/C++ versions are recursive, and using the STL, C++'s quicksort looks very similar to the functional one.

From cppreference.com: http://coliru.stacked-crooked.com/view?id=0b2cec4d8c69ffaf


I disagree. Recursion is simply inductive reasoning, which people are able to grasp at a young age.

Maybe the initial concept, but which is easier to debug afterwards?

So much teaching material for beginning programming is tainted by the cult of mutability that I have coworkers writing for loops instead of maps or the like for relatively trivial tasks. The end result is I have a harder time reading code, and the amount of possible bugs grows.

In 35 years of programming across platforms and languages, covering embedded to web applications, it is my opinion that recursion is dangerous and useless. There is no way I would use recursion to evaluate how good a programmer might be. To quote the movie: "Frankly, my dear, I don't give a damn."

Do you imply that functional programming in general is dangerous and useless?

And what's your standing regarding the total languages? How dangerous is, say, Agda2? How useless is HOL?

Nothing whatsoever to do with functional programming.

How is it so? Recursion is the only possible form of iterative control flow in functional languages. If you do not want to allow any form of recursion, you cannot use functional languages at all.

Of course, you can explicitly demand totality, and it's perfectly fine, but still it is quite a severe limitation, and you'll have to allow optional unrestricted recursion if you want to have (at least local) Turing completeness.

Recursion is a tool. It can be misused, but it is just a tool. And sometimes it can be very useful.

Recursion is mostly an academic tool. In the real world it is a bad idea.

It's interesting that a bunch of people went off in a down-vote spree yet not one person offered a non-trivial use of recursion in commercial, medical or military software.

It wastes resources and it simply isn't safe. Used in the wrong place it can actually kill people.

> Recursion is mostly an academic tool.


> In the real world it is a bad idea.

Wrong. Your view of the real world is severely distorted.

> yet not one person offered a non-trivial use of recursion in commercial, medical or military software.

I suspect your 35 years of experience was 1 year repeated 35 times. Otherwise you would have known thousands of cases where recursion is unavoidable.

Take any real-world compiler, for example. People nowadays are spoiled: they want meaningful error messages, they want semantic highlighting in IDEs, and so on. So forget about the dragon book, automata and all such crap - your only option is a recursive descent parser.

If you want a guaranteed quality and proven correctness of your software (since you've mentioned medical and military), you have no other option but to use total languages. It's just too hard to reason about anything else. Which means - you only have recursion and no other means of control flow. Want your software to be 100% correct, robust and predictable - use recursion. I can go on forever, but things I've listed are already more than enough to debunk your point.

> It wastes resources


> and it simply isn't safe.


> Used in the wrong place it can actually kill people.

Incompetent coders kill people. Epileptoids suffering from severe forms of Dunning-Kruger effect kill people. Those who repeat 1 year of experience instead of building experience progressively kill people.

Very nice attempt at shooting the messenger to make a really lame point. Of course you can use recursion in non-mission-critical applications such as a compiler. Take a moment to educate yourself as to the requirements for industrial, medical and military devices and you might begin to understand. And, BTW, we don't have to agree; that's the beauty of it. You can go on believing you are right, and I can now see about repeating that one year of experience for the 36th time.

> Take a moment to educate yourself

Said someone who does not even know what functional programming is. Hilarious.

I already told you that you can only prove correctness with total languages. Guess you're not qualified enough to find any arguments to counter this point.

I asked you to educate yourself as to the issues and requirements of software used in the context of a system or device that could kill people. In this context industrial, medical and military applications are the most obvious places where you will find these kinds of systems. I have worked in all three of these domains in my 35 years of, as you suggested, repeating one year of experience. And, BTW, I am not just a software engineer, I quite often design the very hardware these software systems have to run on, which for the past 15 years or so has usually meant some rather massive Xilinx FPGA's with embedded processors, etc.

You continue to make a lame attempt to attack me instead of trying to understand the issue. Let me see if I can lend a perspective here. My prior posts were made from my iPad which is horrible for typing.

I can't remember how many programming languages I've used during my career. I started with machine language, then moved to Forth, then C, APL, Lisp, C++ and Python. On the hardware front it's Verilog, but that's an HDL, which is irrelevant as it pertains to this discussion. The languages listed above probably cover a huge chunk of my life's work. Of course, I've also worked extensively with what I call "web languages", mostly PHP and Javascript.

See any functional languages in that list? Yes, yes, I know they are "impure". Geez.

In the case of APL, this old paper deals with the question:


I worked with Lisp and APL professionally (not in an academic setting) for about ten years. At the age of 20 I published my first paper at an international ACM conference on APL. I have a picture right here on my wall with Ken Iverson (boy did I look like a dork at 20!). That was 30 years ago!

Please stop insulting me.

Can one do recursion safely?

Yes. Absolutely. Without a doubt.

Wait a minute! Why, then, am I here saying that recursion is a bad idea and dangerous?

First you have to consider the fact that during the first, oh, twenty years of my career as a hardware/software engineer I designed things that could kill people if they failed catastrophically. That very quickly shifts your way of thinking. Screwing around with a website that can fail and nobody dies (and in a lot of cases, nobody cares) is very different from writing software for an industrial system that can cut a man in half. Very different. If you've never had that responsibility it might be hard to understand that elegance can be the enemy of safety.

Have you ever had to write software that considers the real-world possibility of a bit flipping in memory? And so my issue with recursion is one of theory vs. practice.

I like recursion. It can result in really elegant solutions when used in the right place and with proper safeguards. That last part is a hint as to where my opinion comes from. First, I'll let this very well respected post do some of the talking for me:


So you have software engineers graduating from CS programs where they are shown the elegance and magic of recursion. They've "done" Fibonacci and a number of other assignments, parsers, etc. Now they get hired to work somewhere like, say, Toyota, and 90+ people die because someone thought it'd be neat to use unbounded recursion in the engine control system. But, of course, this programmer didn't make a conscious decision to actually use unbounded recursion. All he was taught was recursion. You know, that neat Fibonacci trick? And so he used it, oblivious to the potential consequences. He had no idea of what was happening at a low level in the system.

Frankly, I can't remember any programming technique that has killed more people. I wouldn't even know how to search for that information. And the Toyota case might be the only publicly available case we have on recursion.

If the problem highlighted in the CodingHorror.com article is true, why would anyone willingly allow a programmer to use such a dangerous approach to solving a problem? I know, without a shadow of a doubt, what CodingHorror is talking about because I have founded and run three tech companies and have had the pleasure of having to hire both hardware and software engineers. I have to tell you, until you experience this, it's unbelievable. And then you hire them and you come to discover how much more they don't know.

Part of the disconnect is that a lot of programmers have no clue as to what is actually going on at the machine level. Those of us who came up through machine/assembly language and low-level stuff like Forth learn this because, well, you have to. Someone starting life as a programmer with something like Python has ZERO idea of what is going on behind the curtains. He is taught recursion, and it is hot-shit-neat, and he is eager to use it somewhere to show his buddies just how smart he is. And that's how you can end up with a situation like the Toyota problem or, less dramatically, a library with hidden recursion that absolutely blows up the system without the library's user having a clue that an unbounded recursive function is sitting there waiting for the right opportunity to cause a failure.

So I'll put it this way: Every experienced and capable engineer I know and have worked with will use recursion if, and only if, there is no other way to accomplish the same task. Where there is, it often is faster, less resource intensive and potentially safer. When there isn't, it must be tightly bounded and tested to failure. Introducing recursion into a mission critical piece of software means introducing something that must forever be in the back of your mind as a potential source of catastrophic failure.
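One common middle ground between "never recurse" and "recurse freely" is to make the bound explicit, so the failure mode is a checked error rather than a blown stack (a sketch in Python; the depth limit and names are illustrative, not from any coding standard):

```python
MAX_DEPTH = 64  # chosen to comfortably exceed any valid input; illustrative only

class DepthExceeded(Exception):
    """Raised instead of silently overflowing the call stack."""

def bounded_sum(node, depth=0):
    """Sum a nested list of numbers, refusing to recurse past MAX_DEPTH."""
    if depth > MAX_DEPTH:
        raise DepthExceeded(f"nesting deeper than {MAX_DEPTH}")
    total = 0
    for item in node:
        if isinstance(item, list):
            total += bounded_sum(item, depth + 1)  # bound checked on every descent
        else:
            total += item
    return total

print(bounded_sum([1, [2, [3]], 4]))  # -> 10
```

The point of the explicit bound is exactly the "tightly bounded and tested to failure" discipline described above: the worst-case stack usage becomes a number you can verify, not a property of untrusted input.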

> See any functional languages in that list?

Not a single one. Not even a functional language, not to mention total languages. Lisp does not qualify as a functional language and never even pretended to be one. And APL is a tacit one, quite a distant thing from anything functional.

Now, try to read my previous post carefully. Referring to the relevant literature when in doubt. My point, again:

the only practical way to have a program 100% proven to be correct, and at the same time avoid wasting hundreds of man-years developing it, is to develop it in one of the total languages. Which implies that the only means of iteration you've got is recursion and nothing but recursion.

I.e., if you want to be 100% sure that your software and hardware will not kill, and will behave exactly as specified, then you must use recursion.

Yes, of course I'm perfectly aware of the reasons behind things like the JPL coding standards. They're pretty outdated and should not be brought back now, in the 21st century.

> Have you ever had to write software that considers the real-world possibility of a bit flipping in memory?

I did, as well as designing hardware with such a possibility in mind. Before things like HOL and ACL2 became available it was a total hell to do so. Now it's a piece of cake.

> So you have software engineers graduating from CS programs

And how is recursion responsible for idiotic HR policies? One should never hire graduate engineers for anything distantly mission-critical. Never.

> And so he used it, oblivious to the potential consequences.

This sort of people will screw up anyway, no matter what kind of tools they're allowed to use. Recursion is far below pretty much everything else on a list of things that an ignorant epileptoid can screw up, starting with off-by-one and pointer arithmetics.

And, leaving the embedded software aside, stack overflow is by far the least common issue out there, dwarfed by things like null pointer dereferencing and the said off-by-one.

> And so my issue with recursion is one of theory vs. practice.

Practice is irrelevant. It will never give you full coverage of all the things you can potentially screw up, not even with hundreds of years of experience. Theory, on the contrary, is perfectly capable of doing so.

> Frankly, I can't remember any programming technique that has killed more people.

Magic constants. Perverted approach to parallelism (shared memory instead of message passing). Google for "Therac-25" for example.

> why would anyone willingly allow a programmer to use such a dangerous approach to solving a problem?

Where the other means of ensuring correctness are not available, the right coding standards are more or less ok (the said JPL coding guidelines for example).

> Those of use who came up through machine/assembly language and low level stuff like Forth learn this because, well, you have to.

Can you imagine someone without this kind of background allowed to meddle with a mission-critical embedded software? I'm not aware of such cases, and I spent quite a few years in predominantly embedded environment.

> if, and only if, there is no other way to accomplish the same task.

Bingo. In the total languages there is no other way to accomplish, well, anything. And only the total languages are safe enough for the mission-critical software and hardware. Turing-completeness is evil - this is what you should have been attacking, not the innocent recursion.

> Introducing recursion into a mission critical piece of software means introducing something that must forever be in the back of your mind as a potential source of catastrophic failure.

Human mind is a broken tool and should not be allowed to judge a quality of a software. Formal logic is much better in ensuring that the software behaves exactly as prescribed.

You see, I am arguing from practice and you from theory. In theory you are absolutely correct. In practice the vast majority of code is NOT written in total languages. A more likely scenario is one where somebody is using language X and, in the middle of a project, they decide there's this recursion thing they read about or learned in school. And off they go. In that context, the context of a practical world where virtually nobody is using total languages, the injection of recursion by the wrong people is dangerous. I say this from first-hand experience.

Let's just agree to disagree.

> You see, I am arguing from practice and you from theory.


My "theory" is being extremely widely used in practice. Take a look at the Intel FDIV bug history and its consequences for example. They've been formally certifying critical parts of the hardware ever since then.

Or take a look at the widely used seL4 kernel: http://ssrg.nicta.com.au/projects/seL4/

> In practice the vast majority of code is NOT written in total languages.

And this is the problem, not recursion or whatever else. Incompetent coders are evil, and you cannot blame any particular technique for any failures. Luckily, in my practice, nobody ever dared to let any graduate larvae hang anywhere near any kind of mission-critical code. I've heard stories, of course, but never witnessed anything like that myself.

> they decide there's this recursion thing they read about or learned in school.

As I said, people of this degree of cognitive development will screw up anyway, no matter what techniques they use. In a monkey with a hand grenade situation you should not worry too much about a sharp stick this monkey may accidentally pick up.

> the injection of recursion by the wrong people is dangerous

Pretty much everything else they do is equally or more dangerous anyway.

And, getting back to your rant which started this thread - you said that recursion is dangerous and useless. Unconditionally.

Now it turns out that it may be dangerous, but only when used by complete tools and only in heavily mission-critical embedded projects. Quite a distance from what you've said originally. No wonder you've got such a response.

I think you are at a point where you just want to argue for the sake of arguing. I'll give this one more spin and then I am out.

The world you paint isn't the real world. The world I am considering is this one:


In other words, one where your total languages are virtually nowhere to be found. Not by a small margin, by a landslide.

So, as they say in the movies, in a world where the top 5 most-used languages are C and C derivatives, and total languages are almost nowhere to be found or a thing of academia, then, yes, what I am saying is very relevant and, dare I say, true and appropriate.

And, BTW, even experienced programmers screw up recursion because it just isn't used that often. I don't want to blame newbies.

Looking at this world, as evidenced by the charts on that site, and data I am sure could be dug up on a bunch of other sites, I can very easily conclude that total languages are still mostly in academia, and so is safe recursion. In a world dominated by C and its derivatives, recursion can be really dangerous. Bad recursion hidden inside a library can be really bad news.

Anyhow, I am done with the back and forth. I understand where you are coming from. I am just asking that you understand that the frame of reference you have constructed is not one that matches our current reality. Which means one can't say that recursion is safe on the strength of total languages when virtually nobody, in relative terms, uses them. And then, when they do use them, you have to ask what they are using them for. Is it academia or real-world application?

Enough said. Thanks for a good discussion. I wish people were not so nasty with down-votes on HN as it detracts from trying to have a discussion based on ideas contrary to the underlying HN culture. That's just the way it is.

> The world you paint isn't the real world.

My world is this one:




... and so on.

And, honestly, I don't want to have anything in common with your "real world", where graduate larvae with IQ < 80 are allowed to code automotive microcontrollers.

> In other words, one where your total languages are virtually nowhere to be found.

We're speaking about mission critical stuff, which is already a virtually nonexistent thing dwarfed by CRUD and all such crap.

> And, BTW, even experienced programmers screw up recursion because it just isn't used that often. I don't want to blame newbies.

I see. You did not get a single word from what I said. Pity.

Let me repeat my point again: humans are brainless scumbags. They should never be trusted with anything important. If anything can be screwed up, it will be screwed up on an epic scale. The only way to avoid an epic screwup is to exclude this brainless scum from the process, and let the immaculate formal systems do the job.

> I am just asking that you understand that the frame of reference you have constructed is not one that matches our current reality.

I'd prefer to stay away as far as possible from your reality. Mine is much better. In my reality, a code without multiple layers of formal proof would not ever be signed off for anything mission critical (although I admit that the most deadly stuff I worked with were anti-aircraft systems, nothing fancy like nuclear plants and such).


Just so I can answer, why what?

Just in case this is what you were trying to understand:



You can certainly dig up more information. For me, and this is just my opinion based on 35 years of software and hardware engineering, recursion is a neat academic idea but in the real world it is useless and dangerous.

I'm seeing stack overflows (not a problem with tail-call optimization) as the biggest problem here. There are equally dangerous problems, though. Buffer overflows are a very real concern, especially in embedded software, but that doesn't mean that people don't ever use pointers. Everything about software has risks. Yes, they probably shouldn't have used recursion there in that car; but honestly, it's just a stack. Enough calls and some state could have been frozen. They should have put mitigations in place to stop a potential stack overflow from having actual, detrimental results. Do you refuse to ever make a function call for fear that you might be at the top of the stack somehow? After all, how can you know that you haven't accidentally created a state machine with function calls and are at the peak of it now, ending up causing the same problem? Recursion may have been the final manifestation, but it's certainly not the only place that failed.

I argue against your thought that recursion is useless, as well. It's a natural and intuitive solution for a lot of problems. The very concept of a dynamic programming solution starts with a recurrence relation. It starts with recursion. Sorting algorithms are intuitively recursive in many cases; Quicksort and Mergesort are just two examples.

And, 'dangerous' is an interesting term. I understand you've worked in embedded systems and hardware, and so those things are often critical systems that have to be super secure and completely bug free. Not every system is like that, though.

I want to understand why you believe that recursion is both useless and dangerous. Could you give some broad examples of the dangers of recursion from your years of experience?

It's trivial to blow the stack even on languages with tail call optimization.

Ok, try to blow the stack in a stackless language implementation.

F(x) = F(x) + F(x)

Eventually you hit OOM or some death limit even on a stackless language.

This is a memory leak now, not a stack overflow.

Thus demonstrating a finite stack. "Stackless" is just a more efficient way to use memory. You could get the same thing by dynamically allocating and deallocating stack space as needed.

Put another way, allocating 1GB of stack space on a machine with 2GB of RAM is at worst the equivalent of using a stackless language on a machine limited to 1GB of RAM.

PS: It's actually worse, as stackless languages are strictly slower than having the equivalent stack space pre-allocated, and you need a pointer for each stack call plus overhead for allocating memory.

By writing incorrect recursion? Or in some other way?

Both. The obvious example is the Fibonacci sequence, f(x) = f(x-1) + f(x-2). There are lots of efficient ways of coding it, but the most obvious approach takes exponential time, is n deep, and can't be tail-call optimized.

Granted, you can write much more efficient code that can be tail-call optimized, but it's more complex and less obvious. There are also languages which cache previous results, so the naive approach becomes O(n) with fairly elegant code that's still depth n.

PS: Anyway, tail call optimization only works when you can rewrite the code as a loop; for more complex structures it's less useful.
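To make that concrete, here's a quick sketch in Python (my own illustration, not from the thread) of the three variants being discussed: the naive recursive form, the cached form, and the plain loop.

```python
from functools import lru_cache

def fib_naive(n):
    # Direct transcription of f(n) = f(n-1) + f(n-2): call depth n,
    # exponential running time, and not tail-recursive (the addition
    # happens after both calls return), so TCO cannot help here.
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    # Caching previous results makes the naive form O(n) in time,
    # but the recursion is still n deep.
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

def fib_iter(n):
    # The loop version: O(n) time, O(1) space, no recursion at all.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fib_naive(10), fib_memo(30), fib_iter(30))  # 55 832040 832040
```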

> Anyway, tail call optimization only works when you can rewrite the code as a loop

You cannot turn a virtual tail call into a static jump.

Sure, it needs bookkeeping but it's still trivial to create a loop equivalent of any tail call optimizable code that your compiler recognizes.

PS: Feel free to look for any counter example.

> PS: Feel free to look for any counter example.

Ok, create a loop equivalent for `(define (f g x) (g x))`.


Though the mechanical equivalent would be something like:

  current_function = f; continue = true;
  while (continue)
  {
    if (current_function == f)
    { input = x; current_function = g; }
    if (current_function == g)
    { output = input; continue = false; }
  }
  return output;
You would then add any function you would tail call optimize into that function. Granted, with the right structure (inside > outside >... > inside) it can show up more than once on the stack but the same thing can happen despite tail call optimization.

Mini-interpreter implementations do not count, for performance reasons. You could have called a VM bytecode interpreter loop a "loop alternative to tail call" here.

My first example of (f x) is about as fast as you can get.

The longer example is not great; the point is that it's a very language-agnostic tradeoff of speed vs. memory. Replace the 'if' conditionals with a case statement and it's much faster; you can speed it up further by using continue.

You can speed it up even more by having a jump at the end of each statement, but that stops looking like a loop and is basically just the tail call optimization.

Anyway, the important part is its low memory utilization and compiler independence.

PS: I in no way suggest tail call optimization was not useful; just you can mechanically simulate it when not available. You can usually beat that mechanical approach if you need to speed things up further.
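For the curious, the mechanical simulation being described is usually called a trampoline. A minimal Python sketch (function names are mine): each "tail call" returns a zero-argument closure instead of recursing, and a driver loop keeps invoking results until a non-callable value comes back, so the native stack never grows.

```python
def trampoline(fn, *args):
    # Driver loop: keep calling until the result is no longer a thunk.
    result = fn(*args)
    while callable(result):
        result = result()
    return result

def countdown(n):
    # Tail-recursive in spirit: instead of calling itself directly,
    # it hands a thunk back to the trampoline, so stack depth stays constant.
    if n == 0:
        return 0
    return lambda: countdown(n - 1)

print(trampoline(countdown, 1_000_000))  # prints 0, no stack overflow
```

One caveat of this sketch: it assumes the final answer is never itself callable; a production version would wrap thunks in a distinct marker type.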

> as fast as you can get.

But yet 10x slower than a single indirect jump.

> just you can mechanically simulate it when not available.

It's too slow to ever be anywhere near practical - for this reason, almost no JVM language implementation really does anything like this, besides Kawa, Bigloo and the like, which are slower than some of the dumb interpreters like SISC.

I don't think I was clear enough: it simplifies to zero code (not even a jump), just the original input. (I thought about saying no-op, but even that's wasteful.)

>"it's slow"

Again, it's just a technique; think of writing embedded code with a really dumb compiler. Anyway, saying it's slow is not really a counterargument: if you're limited to, say, 2,000 bytes of RAM, you will make lots of tradeoffs between efficiency and speed. Perhaps you have a select statement, perhaps you don't, but starting off with pure ASM is often a pain.

EDIT: This is all from the perspective of dealing with tools that don't do tail call optimization, not writing a compiler / VM etc.

And how exactly it is related to recursion? You should have said "memory leaks are dangerous", and everyone would agree.

Ah, Scheme.. You must be an Indiana University grad.

Only school I know that uses Scheme that much.

Lots of top schools use scheme (via SICP). Cal, Stanford and MIT all do (or at least did when I was an undergrad).

After the dot.com bust cratered EECS enrollment (down by more than half after being steady for decades), MIT revised its curriculum and completely purged Scheme from it (or at least the required parts). It's all Python and Java now :-(.

Dude you went to MIT? That's rad. EDIT: why downvote someone for complimenting them on their top schooling?

Because it is noise (do you really think that either the poster or the rest of HN need to hear that you are impressed by MIT?), and also because it isn't even stated that he did go to MIT.

Of course they don't need to hear that; it's pretty much a given. Their superiority is well-established at this point.

Nope, sorry. Wrong side of the ocean.


Minnesota used Scheme when my son studied there. MIT did until about that year too.

"that much"

It's just one intro CS201 course. There might be another Scheme elective, though.

The scatter plots don't show that strong a correlation.

From the test questions, this isn't an "algebra test". It's a word problem decoding test. That's appropriate to programming, where you have to go from an informal specification of the problem to a formal one.

It's only the middling scores that aren't very predictive. Scores >=80% and <=60% are strongly predictive. Only 4/17 that scored 80% or higher scored lower than a 3 on programming ability. Only 1/13 that scored 60% or lower scored a 3 or higher on programming ability.

Yes, it is algebra. Take the second problem, for instance.

x + (x + 1) + (x + 2) = 69. Solve for x.

Or just recognize that three consecutive numbers mean you divide by three and get the median of the three numbers. So 69/3-1, 69/3, 69/3+1, giving 22, 23, 24. No algebra involved, you just need to interpret the problem and build a tiny model in your head. Exactly like (some kinds of) programming.

Amusingly, I taught a college math class many years ago. A question on one exam was:

x^3 = 27

One student solved it in the following way:

1^3 = 1

2^3 = 8

3^3 = 27

ans = 3

There was a meeting for the teachers to discuss grading standards, and I brought up my student's answer. I got a number of responses:

"That's not algebra." "The student obviously doesn't understand the problem."

And so forth. Being somewhat of a punk, I asked if any of them had, in their years of teaching, actually taught their students to recognize when an answer is "algebra" or how a right answer of 3 differs from a wrong answer of 3. It wasn't a pleasant discussion. This was a university with a prestigious math department, and I was just some guy off the street. I chatted with my students the next day. None of them had ever been told what it means to "show your work," or any of the other important trappings of school math. They either did it or they didn't.

Thinking about it more, I would have sidestepped the issue completely, by changing the problem to:

x^3 = 26

That's still algebra, using the equation (x - 1) + x + (x + 1) = 69 instead.

That's exactly how I solved it. I'd consider myself a pretty decent programmer, but I did poorly in algebra and calculus at middle, high school, and university level.

Give me "x + (x + 1) + (x + 2) = 69" and I wouldn't know where to start. I could probably turn that word problem into that equation, but I would have no idea how to actually solve the equation algebraically.

But given the word problem alone I knew to divide by 3 and add the neighboring numbers. Now, actually dividing by 3 I'd need to use a calculator or spend a lot of time with a pencil and paper, but at least I knew the steps to solve it.

I'm awful at rote math and mental math, which is why I'm thankful I can make the computer do it for me.

Are you saying that if the word problem was "2 consecutive numbers and a third number that was five higher than the second number" you'd be unable to solve it?

"Tricks" work for simple cases. Algebra works for all cases.

Yeah, I would be unable to solve it, given that I'm pretty sure there is no solution if we assume that "consecutive" implies integers (and I'd be curious what 'consecutive' means if we don't).

If instead we made it "2 consecutive numbers and a third number that was five higher than the second number add up to 70" (i.e., 21 + 22 + 27), that's easily solved using 'tricks'. Really, the trick mention (which is what I did in my head, too) is just a rephrasing of the algebra. That is, I would subtract 4 off of the thing that is 5 higher (so that it's now 1 higher; the problem is now the same, get 3 consecutive integers), subtract 4 off of the number I'm trying to get (so 66), so the problem is now 3 consecutive integers that equal 66, and solve the same way (66/3 = 22, so -> 21, 22, 23), and then just add the four back in to the highest (21, 22, 27).

You're right, my mistake. That "five higher" should have been "four higher".

The thing is that your solution is the algebraic solution. You're simplifying an equation by balancing both sides, when you subtract four. Just from our perspective, you simplify to a still difficult state, instead of the easiest possible state.

Algebra is just a way of formally stating what you did, and then offering some simplifications that speed up the process. Or offering more powerful methods that making solving more difficult problems easier.

'The thing is that your solution is the algebraic solution'

I agree. That's why I said

'Really, the trick mention(ed) ... is just a rephrasing of the algebra'

Just because you do number juggling in your head rather than write it formulaically doesn't make it an inferior technique, or make it not algebra, which was my point.

> Give me "x + (x + 1) + (x + 2) = 69" and I wouldn't know where to start.

You memorize a few rules about what things you're allowed to do, then you apply the rules to simplify the problem in front of you. Sometimes, the hard part is that you don't know one of the rules you need to know. Other times, it's figuring out which of the rules you need to know that you should use. By practicing lots of problems, you get a good intuition for which direction to go, but it's not uncommon with more difficult problems to take the wrong way and end up confused.

First, we're allowed to drop the parentheses with addition (the "associative" rule for addition), so

x + x + 1 + x + 2 = 69

Now we put the similar terms together:

x + x + x + 1 + 2 = 69

How many Xs do we have? Three. Another way of writing that is 3x. So replace that part. Also, 1+2=3, so we'll replace that as well. "Apply rule and replace" is pretty much the most fundamental mathematical operation.

3x + 3 = 69

Let's get rid of the 3 by subtracting it from both sides to keep the equation balanced. On the left side the 3 cancels out (that's why we did this). On the right, we get 69-3, which is 66.

3x = 66

At this point, we just divide by 3 and simplify.

x = 66/3

x = 22

Our numbers are x (22), x+1 (23) and x+2 (24) according to how we listed them in the original problem.
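If you want to sanity-check the arithmetic, the same simplification can be written as a couple of lines of Python (my illustration):

```python
total = 69
# 3x + 3 = 69  ->  3x = 66  ->  x = 22
x = (total - 3) // 3
triple = (x, x + 1, x + 2)
print(triple, sum(triple))  # (22, 23, 24) 69
```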

Found that approach creative and well structured. Point free mathematics :)

It seems like you're doing algebra :)

I had a little bit of a problem with that question as well. A clever student might just divide 69 by three and take that number, as well as the numbers on either side, as the answer.

Is that algebra? Is that not algebra? Is that some sort of more nebulous skill that's also correlated with programming? If they actually wanted to correlate ALGEBRA algebra with programming, I also thought they'd have more straight-forward questions. Using word problems brings all sorts of other issues into the picture.

I would say that it requires algebraic thoughts. You are assigning a placeholder for "x" as "that number that would fit in the middle here". To me, that seems algebraic. Especially at this very, very basic level of algebra we are discussing.

Except you are also testing one's ability to set up that problem, which is a major part of problem-solving ability in the first place. I might agree with you if the problems were already formatted and it was simply plug-and-chug algebra rather than deciphering word problems.

Does a "simple algebra test" correlate with IQ? Does "programming aptitude" (whatever the hell that is meant to be) correlate with IQ? The headline would then follow.

I know this is a socially unacceptable opinion but I think those correlations exist.

Codeup CEO here. We actually did some brief testing with classical IQ tests and saw two problems:

1) IQ tests are really long and expensive, so difficult to implement

2) They didn't correlate with performance as well as algebra.

I've read your answer about your methodology of gauging "programming aptitude" and I'm worried it might be susceptible to your judges' biases - it being purely subjective. I'll put more faith in your point number 2 if you come up with a more objective measure.

Agreed that an objective measurement would be ideal and this is second best.

However, we've had a hard time creating or locating an objective measurement of general programming skill. As demonstrated by the industry's non-use of standardized tests for programming, I don't think anyone else has either.

If you have a lead on something, would love to see it.

Trying to create an accurate measurement of programming skill is far more difficult than looking for correlation on tests compared with some other subjective measure.

This misguided attempt at instrumenting "skill" could prevent highly capable individuals from entering the field who don't have a solid math foundation, for societal or other reasons.

OTOH, identifying a decent mathematical thinking background as a predictor of CS potential is a pretty big win at a societal level. Teaching math is relatively cheap; you don't even have to buy computers...

I remember reading about a similar but different test that had good results. I can't find the source so this is just from what I remember. Before starting a course the students were shown short snippets of code and asked simple questions about it. They found that consistent answers were a predictor of who would do well. The theory was that people who are able to construct a mental model for the problems ended up being good at programming tasks, even if the models they had made didn't match what code actually does. Have you heard of or tried this approach?

I think you're remembering some papers from Dehnadi and Bornat, starting with http://www.eis.mdx.ac.uk/research/PhDArea/saeed/paper1.pdf

That looks like the paper I was thinking of. Thanks for including the link.

How do you know they didn't correlate with performance as well as algebra if they were too difficult to implement?

Because they tried them, then decided to not continue trying them because they didn't correlate as well and were difficult and expensive to implement.

Their main test has 60 data points... how many tries do you think they made, exactly, before deciding they didn't correlate as well? Yes, IQ tests are expensive, but if you buy one test, the marginal cost of giving it to an extra student is basically zero. I suspect it wasn't tested at all.

A real IQ test takes at least 2 hours of 1-on-1 time with a psychologist to administer, so in the event that someone actually wants to do real IQ testing, the marginal cost is quite high.

The ASVAB, GRE, LSAT, MCAT, and SAT all have correlations with psychologist administered IQ tests of around 0.9. WORDSUM, which is a ten word vocabulary test has a correlation of 0.71[1].

[1]Every time I use the WORDSUM variable from the GSS people will complain that a score on a 10-question vocabulary test is not a good measure of intelligence. The reality is that “good” is too imprecise a term. The correlation between adult IQ and WORDSUM = 0.71. The source for this number is a 1980 paper, The Enduring Effects of Education on Verbal Skills. I’ve reproduced the relevant table…

The Enduring Effect of Education on Verbal Skills


That's what psychometricians want you to believe. There is no scientific reason it needs to be administered by a human or be so long.

Are they really that hard? I have done IQ tests administered by professionals and I can't recall them taking that long.

Not quite. X can correlate with Y, and Y can correlate with Z without X and Z having a correlation. As a simple toy example let X and Z be independently taken from the standard uniform distribution, and let Y = X + Z.

fwiw, correlation actually is transitive if strong enough. This is obvious: if corr(x,y) = 1 and corr(y,z) = 1, then corr(x,z) = 1. More generally, you can get this constraint by looking at the minimal value a for which the matrix [[1, corr(x,y), a], [corr(x,y), 1, corr(y,z)], [a, corr(y,z), 1]] remains positive semidefinite, to find the minimal correlation of x and z given that of (x,y) and (y,z). I mention this because in this case the correlations may be strong enough that the minimal a is positive, and because it's important to understand exactly when our intuition about transitivity of correlation breaks down.

For example, when x,y and y,z are perfectly correlated, we can see that corr(x,z) must be greater than 0.9 from: http://www.wolframalpha.com/input/?i=%5B%5B1%2C1%2C0.9%5D%2C...
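The toy example from the parent comment can be checked numerically; here's a quick NumPy sketch (sample correlations will vary slightly from the theoretical 1/sqrt(2) ≈ 0.71):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.uniform(size=n)   # X ~ U(0, 1)
z = rng.uniform(size=n)   # Z ~ U(0, 1), independent of X
y = x + z                 # Y = X + Z

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

# X and Y correlate, Y and Z correlate, but X and Z do not.
print(corr(x, y))  # ~0.71
print(corr(y, z))  # ~0.71
print(corr(x, z))  # ~0.00
```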

A lot of things "correlate with IQ" to varying degrees. Education level, length of sleep, size of vocabulary, etc. I don't understand why the relationships you suggested would be socially unacceptable.

But just because those two things correlate with IQ (which is probably true, so let's just grant them as true even without evidence for the sake of argument), that does not imply that they are sufficiently strongly correlated with each other to be interesting for hiring purposes.

To see my point, consider that being red is correlated with being purple, and being blue is correlated with being purple, but being red is not correlated with being blue. (at least not without evidence).

I don't think the existence of those correlations depends on what you think. Har har. Suggesting a correlation is a statistical hypothesis that is testable but not really debatable.

I don't think such a hypothesis is awkward, but I doubt high IQ is an excellent predictor of an individual's performance in tasks that are not IQ tests. As a layman when it comes to psychology, I think IQ measures some things, but given the way the brain works, no real-world task a person does is exactly like those IQ tests.

A brain is not a CPU and an IQ is not its clock frequency.

Aside from being wrong, and indeed often disparaged, this still-common attitude has the unfortunate effect of helping to exclude those who can make an excellent contribution to a programming team but who approach programming from a different point of view. For example, I have a friend who's working quite competently in the field, has a general block with respect to math, and came to programming through music and graphic design.

>"programming aptitude" (whatever the hell that is meant to be)

What's not clear about the term?

We as an industry have no objective measure for what is and is not a good program. Therefore, we have no objective measure for who is good at programming.

If you ask me, a good program is one that is documented accurately and thoroughly. So a good programmer is one who cares enough to document well, and has enough skill to do it with precision, speed, and with good judgement.

Of course, being a good scientist or mathematician requires the same skills: know who you are talking to, what they already understand, and why your work matters to them. And be obsessive about doing it the right way.

There is virtually no evidence that program documentation leads to "better" programs. In fact, there are many people that will maintain that the need for code documentation is evidence of a "bad" program.

Regardless of whether you subscribe to this theory or the opposite, there is no "objective" way to differentiate between these two positions.

No evidence? No evidence for a _definition_?

I am telling you what a good definition of a good program is. Of course there is no evidence that my definition is the same as your (unstated) definition. You can disagree, but this has nothing to do with evidence.

This article is built on a fundamentally invalid premise:

> 30-60% of CS college majors have failed their Introduction to Computer Science course because they simply could not learn to program.

This "You either have it or you don't" mindset is strikingly elitist. I simply don't buy the idea that there are people out there who absolutely cannot learn to program. I mean, put someone on a desert island and tell them they can't leave until they can write a program that uses a linked list, and I'm pretty sure most people would be able to get off the island eventually.

Of course, that doesn't mean that some people can learn to code more easily than others. But I'd be willing to posit that the vast majority of people could learn to code given enough time and the proper instruction.

This StackExchange seems to have a pretty good overview on this very question: http://programmers.stackexchange.com/questions/163631/has-no...

There is a No True Scotsman argument that arises here: "Well, if someone later became a programmer, despite failing those tests earlier, then clearly they were skilled at becoming a programmer," blah blah. And yes, there will be outliers, of course; and there will be people that gradually mold their brains to change the way they think; but it is still an interesting phenomenon.

I bet we could see this in other fields, too, it's just that programming is presently the hotness.

(edit: added more)

As for elitism. What is elitist about it if that /is/ true? Oh no, Billy can't program, but he's great with cars. Sarah sucks at linked lists, but she has a better understanding of the human body than any of her peers in the ICU. And so on.

I am not great at higher-level math (calculus), and while I'm good with algebra, I do not think I'd consider those good predictors. I feel like what allows me to program is strong spatial reasoning. Doing a quick Google search I found this: http://www.quora.com/Whats-the-nature-of-the-relationship-if...

Probably just anecdotal but curious to know what others feel about spatial reasoning? I think specially with object oriented programming it is helpful to be able to visualize all of the 'actors' in your head and how they relate/intertwine.

I would also agree that spatial reasoning is key to strong programming ability. Most data structures are best understood through spatial reasoning, for example. But I would also say that spatial reasoning is key to mathematical ability. I'm actually surprised that you would consider yourself strong with spatial reasoning but had trouble with calculus. Perhaps you just had a bad teacher or lacked proper motivation?

I'm terrible at spatial reasoning but have no problem with formal logic. I'd also like to think I have strong programming ability (of course we don't know how to measure that).

Spatial reasoning is a key mathematical ability for certain classes of problems (a big chunk of calculus as you mention) but helps not at all with another big class of mathematics (set theory for instance).

Finally, to you data structures are best understood via spatial reasoning, but there is nothing actually spatial about data structures, so that is most likely just your own preference for modeling them.

I'm sure my bias plays a role, but I would disagree that there's nothing spatial about data structures. Linked lists, trees, heaps, etc, all have the characteristic property of a geometric interpretation: a natural notion of "distance" inherent in their definitions.

Set Theory?

Spatial reasoning is very helpful for me, particularly when designing data structures and algorithms. When reasoning through a problem, I can often be found making shapes in the air with my hands. Easier than getting up and walking to the whiteboard. :)

What? Where's the correlation? The data looks like a random scatter plot to me. It reminds me of this Physics report from ages ago...


I think you can deduce some correlation just by noting that the upper left and lower right parts of the graphs are relatively empty. Certainly there are statistical tests that would be more concrete. You can see some figures in the bottom right corners, but they're never discussed in the text.

An R^2 value of 0.33 does not even come close to implying linear correlation.

This is the most important comment in this dramatically overwrought thread. The post's own data analysis refutes its hypothesis unambiguously.

Basic algebra is used all the time in programming. It's part of a programmer's fundamental skill set. Of course, somebody who already has that particular skill will definitely have an advantage in a short programming course. It's like starting a 100m sprint with a 3-second head start. It says nothing, however, about who will be the better programmer over the long term. So I don't think their test is a good predictor of programming aptitude.

I'll reply to a specific comment already made as a subcomment here, to comment about the larger issues that have come up in several top-level replies in this thread.

I don't think such a hypothesis is awkward but I doubt high IQ as an excellent predictor for an individuals performance in tasks that are not IQ tests. As a layman when it comes to psychology I think IQ measures some things but the way the brain works, no real world task a person does is exactly like those IQ tests.

The interesting blog post submitted here is talking about the bread and butter of "industrial and organizational psychology," namely about how to select individuals for a training program. There are three generations of published research on this topic already, and there is a huge amount of ongoing research on this topic, because organizations all over the world want to figure out how to select successful applicants when there are more applicants than places in school or work programs.

The short answer is that there is a HUGE body of research to show that the single best hiring process you can use for hiring a worker, if you want to get a worker who will perform well on the job, is to use an IQ test for hiring.[1] The long answer is that some other applicant characteristics matter too, of course, but the single best thing to look at in a job applicant is "general mental ability." Work-sample tests are also very good for hiring for specific jobs, and are grossly underused in hiring in the United States.

To the point of the interesting submitted blog post, one always has to be empirical about these issues. The people running the bootcamp so far have found data that suggests that the algebra test they have tried is a bit more revelatory than the IQ test they tried, and less expensive besides. One response to that might be to suggest a test like the Wonderlic test (an inexpensive IQ test designed for company hiring procedures) but in the end, results matter. If empirically at this bootcamp, the algebra test works better than some other selection procedure, it doesn't even really matter why it works, just that it serves the bootcamp's purpose of identifying successful students from among multiple applicants. The data set is still small. I am very glad that the blog post includes a scatterplot of the data. More bivariate data should be shown that way, in blog posts on dozens of topics.

[1] My FAQ post on company hiring procedures, which I am still revising to put on my personal website after composing it for Hacker News, provides references for this and some commentary on legal issues in hiring.


  there is a HUGE body of research to show that the 
  single best hiring process you can use for hiring 
  a worker
The best process, assuming you have to use a single process for every job in the world? Or the best process even when hiring for a specific role?

I can understand IQ being the best choice if you had to judge poets, plumbers, town planners, golf instructors, programmers, salespeople and warehouse workers using the same process. But surely if you're exclusively hiring for one of those roles, you'd want to test their domain-specific knowledge?

As was written above:

> Work-sample tests are also very good for hiring for specific jobs, and are grossly underused in hiring in the United States.

But isn't essentially all hiring for a specific job?

Industrial and organizational psychologists have researched this issue, with hundreds of studies spanning most kinds of jobs in most developed countries now published. A work-sample test is especially strong for spotting people who need to do something right away after they are hired. For any worker who may need to learn new things on the job--and that is a lot of different kinds of workers, in a lot of different industries, and almost all managers--a "general mental ability" test adds to the predictive validity of selecting applicants who will do well on the job over time as industry conditions and market pressures and technologies change.

Not necessarily. For example, militaries routinely recruit (hire) first, and only later figure out where to put the new hire. The same is less common but does happen in industry as well, with entry-level college-graduate positions, and in some cases even with higher-level positions.

Even if what you say is true, though, some companies intentionally hire people without the necessary skills to do a job, then train them. In that case there's no point in testing them for job-specific skills they don't have so what you want is a way to determine which candidates are likely to be successful after training.

I can't dispute the author's statistics, after all, they're just a function of the data. But looking at the first graph, it seems like a generous interpretation of "predict." The anticipated signal-to-noise for any individual student looks like it would be pretty poor.

One other thought occurs to me. Would the same algebra test produce similar correlations with courses such as English and History, when adjusted for the differences in overall pass rates for those subjects?

It seems odd that something as interesting as programming isn't introduced until college. That seems like accepting students as music majors, who have never played music before. If I had my druthers, computation and simplistic programming would be part of the mainstream K-12 curriculum.

>It’s fairly common for incoming Computer Science majors to ask the question, “Why do I have to learn all this math if I just want to learn to program?”

Well, there's the reason you have a high drop-out rate in CS. People don't know what it is! Computer Science is not a vocational program. Computer Science is not computer programming.

If someone had talked to these kids asking “Why do I have to learn all this math if I just want to learn to program?” and told them "You don't have to learn all this math if you just want to learn to program" before they became CS majors, then they might not have become a CS major in the first place.

And, likely, some good computer scientists would be lost to the world. You should also ask them why they want to learn programming. Many of them will answer that they want to become programmers, without having a clear picture of what that entails (it is not writing code all the time; most programmers do not get to write games in languages of their choosing; and despite it being a job where workers can automate repetitive tasks, a lot of the work is repetitive; etc.)

Maybe, once they realize the difference between programming and computer science, they will prefer computer science. The problem is that you cannot teach them the difference in a few weeks; it takes years to sink in.

So, what do you let them do in the mean time? Waiting is a waste of the years in which learning is easiest for them. So, do we let some grown-up decide who likely will make a good computer scientist, or do we let many more start on that trajectory and see how far they get?

I think the latter is the better choice, if we also provide smooth ways to move from one to the other.

[slightly related: I once read a teacher in a nursing school state: "when they come in, all the boys want to ride an ambulance, and all the girls want to work with kids. We have to work a bit on that in the first year"]

Most programming jobs involve a lot of algebra. You have variables, you do arithmetic to those variables, you apply functions. It's not a great surprise to me that if you are good at high school algebra, you can probably learn how to program.

You might not be a great programmer, but you'll be able to do it.
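To make that concrete: word problems like the "X consecutive integers with a sum of Y" one quoted elsewhere in this thread translate almost mechanically into code once you do the algebra. A quick sketch (Python; the function name and structure are mine, not from the article):

```python
def consecutive_integers(count, total):
    """Find `count` consecutive integers summing to `total`.

    The algebra: if the first integer is n, then
    n + (n+1) + ... + (n+count-1) = count*n + count*(count-1)/2,
    so n = (total - count*(count-1)/2) / count.
    Returns None when no integer solution exists.
    """
    offset = count * (count - 1) // 2
    numerator = total - offset
    if numerator % count != 0:
        return None  # e.g. two consecutive integers can never sum to an even total
    first = numerator // count
    return [first + i for i in range(count)]

print(consecutive_integers(3, 69))  # the thread's example: [22, 23, 24]
```

Variables, a little arithmetic, a function: exactly the correspondence the comment above is pointing at.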

An algebra is just a system of rules.

Being good at number rules (school algebra) does not have much to do with being good at state rules (code algebra).

I'd say there's more correlation between functional programming and school algebra, if anything.

Being good at a set of abstract formal rules is probably at least some kind of indicator that you'll be good at another set of abstract formal rules.

Having recently taken a battery of psychological tests that measure several different cognitive functions, I am one living example that that is not the case.

How can you be an example that what I said "is not the case"? Are you saying that because you are not similarly good in two different sets of abstract formal systems, that this isn't a good indicator in general?

A lot of people in here are offering cheap opinions, while mgirdley's team has done the actual work. They're not saying that the kind of algebra test they're proposing is foolproof, perfect, subjectivity-free, or even the best way of testing for programming aptitude.

What they _are_ saying is that the test is an efficient, effective way to do it, with better ROI than IQ tests.

It doesn't actually matter _why_ that's the case, though Animats' suggestion seems plausible: "From the test questions, this isn't an "algebra test". It's a word problem decoding test. That's appropriate to programming, where you have to go from an informal specification of the problem to a formal one."

I think that many people have difficulty learning to code in college because many of them aren't that enthusiastic about it. At least anecdotally, many people at my school are in computer science because that's where the zeitgeist is. If it was the eighties they'd be getting ready for law school.

I think that people would do much better in intro CS classes if we could get rid of some of the hype around programming and tech in general, and treat it like other courses. That way, far more of the students would be genuinely interested in the subject.

I'm very good at algebraic math, but I don't think an algebra test will show programming aptitude. Algebraic and calculus analysis are a form of problem solving that works incredibly well, but they are by no means the fastest or most elegant way of solving a problem. They're just a single tool that a problem solver can use.

The main thing I fall back on is Heron's Problem. The classic algebraic/analytical solution is to find the local minimum of a function. It works; the method isn't flashy, but it's completely functional. It requires no insight, just mentally vomiting something you learned in high school/college.

The geometric solution is far simpler, and offers insight into the foundations of trigonometry. It's really so simple and elegant that you feel like a brutish idiot doing a long-form analysis.

I offer this because tools are merely tools, and math supplies many tools. You wouldn't hire a finish carpenter based solely on his/her ability to swing a sledgehammer, would you? I mean, s/he will have to swing a hammer, but will it have to be a sledgehammer? Will they have to use a nail gun? A screwdriver? I hope they are proficient with all of these tools, and since I'm outsourcing my time to them, I hope they know which tool to use at which time.

The correlation doesn't seem particularly strong, but the dependent variable here isn't right. It's a stack ranking that's forced to a normal distribution. A normal distribution might be appropriate if they randomly selected US citizens to join the program, but they don't. Students self-select to apply and they reject the bottom 66% of applicants. The resulting talent profile is far from being normally distributed.
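A quick sketch of why the forced normal is suspect, under a toy model (Python; the 66% rejection figure comes from the comment above, everything else is invented):

```python
import random
import statistics

random.seed(1)

# Hypothetical model: aptitude is standard normal in the applicant pool,
# but the school rejects the bottom 66%, so the admitted cohort is a
# left-truncated sample, not a fresh normal.
applicants = [random.gauss(0, 1) for _ in range(100_000)]
cutoff = sorted(applicants)[int(0.66 * len(applicants))]
admitted = [x for x in applicants if x >= cutoff]

# The truncated sample is right-skewed: its mean sits above its median,
# so forcing the cohort's grades onto a normal curve misfits the data.
print(statistics.mean(admitted) > statistics.median(admitted))  # True
```

Under this toy model, stack-ranking the admitted students to a normal distribution compresses real differences at the top and invents differences near the cutoff.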

You can correlate most anything with most anything. I'm not saying don't try, just to don't give correlations much credence in and of themselves. Thinking globally, what are the odds a coincidence doesn't happen? And when you exclude outliers, you may be excluding the most profound subjects, just as when you include non-outliers, the subjects may not be able to generalize outside the system of measurement.

> Despite hours of studying and tutoring, most of these underperforming students struggle with, and many ultimately give up on, programming as a career

They should follow the lead of those who figure out earlier they're not cut out for programming: embellish the CV with fake study, and go straight into the workforce as a programmer. Or they could get transferred into programming from some user department pitching to bring a "valuable user perspective" to IT. Or grease up some IT manager after-hours who'll bring them in as contractor with "special skills not readily available in the labor market" to bypass the usual HR checks and aptitude filters. Or if they're not of the same ethnicity as the HR personnel, send in a double to sit the aptitude test for them, knowing HR staff don't check applicant ID's too closely because they know no-one's going to make it through to an IT interview if they do. Most HR staff and IT managers just want the staff count up, they don't care whether they're productive or detrimental to the projects. If they have staff they have people to blame.

So I guess the real underlying message is: if a student can't pass an algebra test, they probably can't learn to program, so don't even bother trying to teach them. A more generalized form of the hypothesis is that most people can't learn how to program.

Both forms of the hypothesis are pretty easy to refute and have already been refuted. See http://code.org and http://madewithcode.com for example. Just yesterday I taught programming to 2nd graders who don't even know how to divide, let alone any algebra.

And of course, there's always the radical idea that maybe you could actually teach them some algebra skills in the context of programming tasks. Just like some calculus students may need some remedial instruction and support.

Luckily, there's a whole field of research with several articles on this very topic. The field is called computer science education. See SIGCSE, for example: http://www.sigcse.org/

In fact, the field has already researched and debated this issue before, as well. There was a controversial article called "The camel has two humps" in 2006 that made the same claim as this post that a simple aptitude test could predict whether someone could learn how to program or not. The article (which was never even officially published) was later retracted: http://retractionwatch.com/2014/07/18/the-camel-doesnt-have-...

I don't think that's the message.

We are focused on making people professional, employable programmers. While I agree that everyone can play with scratch or build a basic webpage, that's different than the programming skills required to be employed in most dev jobs today.

(Codeup CEO here.)

An older study suggests:

In both genders, performance on Mathematics was found to be the best predictor of programming ability, followed by performance on a spatial test.


Surely the confidence bands in the plot aren't 95% confidence intervals...


R-Squared of 0.33?? Not a great fit, considering the sample size.

An R^2 of 0.33 is a fairly decent correlation for a study involving human subjects. It's obviously not perfect, but the authors are not claiming that it's perfect.
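As a back-of-the-envelope check (assuming a simple two-variable linear fit, where R² is the square of Pearson's r and the sign is known to be positive):

```python
import math

r_squared = 0.33
r = math.sqrt(r_squared)        # Pearson r for a simple bivariate fit
print(round(r, 2))              # 0.57
print(round(1 - r_squared, 2))  # 0.67 of the variance is left unexplained
```

So the test explains about a third of the variance in outcomes: a usable screening signal for human-subjects data, but nowhere near a deterministic predictor for any individual student.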

How is programming aptitude measured and defined?

Codeup CEO here: it's the instructors evaluating the students based on their interactions with them and answering the question: how capable is this person of building quality web applications?

I'm just wondering whether the model on which they base their programming problems for web apps is isomorphic (structure-preserving) to algebraic concepts.

Computer science and development in school is very different from computer science and development in the real world. You don't get to know what you have to know before you know it - that information doesn't exist in the ether of the collective consciousness anywhere. You have to draw it out from yourself.

You might not even know the words for the concepts you have to create in the real world - because they aren't defined, and it's different from pattern matching, finding invariants, optimizing around invariants, simplifying semantically and forming relational constructions. It's different from probabilistic modelling and inference. It's different from a machine doing all those things and a human reasoning on top of it, turtles all the way down (or up, rather).

If you really want to teach people, you have to be able to believe that every person has the capacity to exceed their boundaries and even perhaps demonstrate that you can exceed your own.

This is really philosophical at this point, but don't make the mistakes I've seen tons of educational institutions make. Don't define your students before they learn how to define themselves. Coding at its core is a creative endeavor. If you want to build robots, code. If you want to build students, learn.

That is so subjective that it should be ignored. I've interviewed people who interviewed very well but in practice they failed at basic computer programming concepts and implementations.

We simply do not have an objective measure for program quality and thus do not have an objective measure for programming aptitude.

In the face of this problem we can either give up and not try to measure anything, or we can use an obviously subjective measure and understand where the weaknesses are.

I've interviewed and worked with many people, MANY!, including e.g., Stanford MS CS graduates, who can't write real-world apps. I assume these people can score well on an algebra test. As far as I know, the ability to write non-trivial programs in a commercially relevant time frame can only be judged based on a candidate having already done so. If there's otherwise some way to predict who these people are, I'd love to know what it is.

Social factors in a workplace also seem to be important and hard to predict. Excellent work for one employer isn't necessarily reproducible for another.

These instructors are spending all day, every day with the students as teachers lecturing and helping them through exercises. They're not interviewing them.

I'd be interested to see more questions from the test, but from the examples given I'd say s/algebra test/problem solving test/ is in order.

In that light, the findings become less surprising---still interesting though, since we can test math-word-problem-solving ability quite easily with pen and paper. Could this mean my KhanAcademy math achievements are a better hiring signal than my github?

In terms of hiring, it certainly makes more sense for candidates to pass a 1h test rather than expect them to "prove themselves" by solving a week-long coding challenge...

You're conflating that with Graham's submarines, showing a mere correlation, not causation.

> we’ve possibly found a simple method for sorting out the ninja programmers from the less capable programmers

Can we please stop calling them ninja/rockstar/etc. programmers and simply use "professional programmers" or "master developers"? This language is cute when you're 16, I guess, but personally I don't want to be compared to a medieval Asian hitman for hire.

Aptitude tests really annoy me. At least for me, when I was told that I was bad at math, it discouraged me and I didn't pursue it. Later when I was all grown up and decided to go back to school, I really applied myself and it turns out I was pretty good at physics and math, despite my aptitude when I was a teen.

I'll only comment on the opening line and the research it points to about the failure rate in intro to Computer Science. The author of the codeup piece immediately jumps to the conclusion that it's an inability on the part of the failing students to be able to learn to program (a detail that the study linked doesn't comment on). However, my first reaction is that if the failure rate is higher than we want, it means we either (1) aren't teaching the class well; or (2) the students are coming in without proper preparation.

Many students coming into an intro to CS class are already programming whizzes, and then there are others who have no experience with programming at all. So, already the instructor is in an impossible situation. Do you bore the promising kids with the major head start, risking losing future good students? Or do you ramp the class up to their speed and risk losing the kids who didn't come in already knowing the material? Obviously, most departments and instructors will, wittingly or not, choose the latter.

All of this is predicated on the idea that programming is solving sexy math problems.

Reality is that most programming is taking bits out of one bucket, combining them, and dropping them into some other bucket. It's the digital equivalent of shovel work.


What kind of "programming"?

What "ability"?

Ability to write space-O(my_god * n) sorting algorithms? Ability to duct-tape queries to an ill-documented SOAP webservice and upsert the return into a Your-Boss-Normal-Form database schema? Ability to brutally optimize the runtime of a really hairy numerical analysis algorithm? Ability to design a centralized data pipeline architecture and lead 12 hackers during the implementation and migration? Ability to find the off-the-shelf OSS project that solves the first 50% of the problem instead?

I was watching the news the other day on TV, which is something I almost never do, and there was a segment about the educational value of tablets, smartphones and laptops for teenagers and children. The value of the freely available information on the internet was discussed, along with the quality of free and... not-so-free educational and scientific applications, and the ill effects of such devices on people's attention spans, and... That's it.

Example devices included many, many Apple products, a few off-the-shelf Android devices, and a Windows machine running MS Word.

Same ideology regarding technology in education: locked-down workstations with user-friendly applications, internet access which blocks every port except for 80 and 443.

A relentless and sustained effort to erase everything but the topmost layer of the IT stack. Locked-down devices. People come to me asking me to remove viruses, to speed up their old XP machines fraught with annoyware. No Windows license, don't want to pay for one, don't want Linux. Computers run on magic, right? How can you know all that programming stuff and not be able to remove the bad magic from my computer? Computer Science students who can't find the slash on their keyboards. People who see computers as a monolithic entity rather than a brittle but transparent stack of conceptually distinct layers. The new generation of systems administrators can't scp a tarball to the other end of the lab.

A teacher spending three hours explaining red-black trees to a class scrolling down 9gag; students, diploma in hand, who aren't entirely sure what the difference is between an array and a hash, who are not sure what happens when one puts an array in another array, in PHP, after 120 hours spent theoretically writing PHP.

Computer Science students convinced that Linux is 100% not worth learning, in any way, because nobody uses it.

This is how the knowledge economy dies.

This is one of the most epic rants I've read on HN in quite a while. Thank you for the comedy, but it's unclear how anything beyond the first couple of paragraphs relates to the discussion here. You're saying that our education system produces ill-prepared CS graduates right? What does that have to do with the correlation between an algebra test and programming aptitude?

1) it was a rant that I needed to make

2) what the exact fuck is "programming aptitude" anyways? The article defines it as "having good grades in a CS program". Which is part of why I needed to rant against CS education.

>Computer Science students convinced that Linux is 100% not worth learning, in any way, because nobody uses it.

I learned Linux, and used it all throughout school. We never touched Windows or learned about how to develop on it.

I am now a full time, happily-embedded-in-the-windows- ecosystem C# developer.

I think fundamentals + balance of education on implementations is a more practical approach for a CS curriculum rather than "OMG LINUX IS TEH BEST AND ONLY".

At least you understand the nuance between a computer and an OS. This is a pretty arcane piece of knowledge, these days.

I appreciate the idea behind the article, which may have value, but the data seem ordinal: 1.0-4.0? Is that trying to mirror the US GPA system? Normalized based on what? And p-values are an archaic measure, only useful for MBAs and med students.

We presumed and used a normal distribution for each class.

The numbers were arbitrary. Just what felt easier to understand for the staff.

Normality on ordinal-scale data. You've done the equivalent of calling functions that expect numeric input with ordinal data. https://en.wikipedia.org/wiki/Ordinal_data
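If the ratings really are ordinal, a rank-based statistic like Spearman's rho sidesteps the interval-scale assumption entirely. A self-contained sketch (plain Python, no scipy; the helper names are mine):

```python
def ranks(values):
    """Rank values 1..n, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the current run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of positions i..j
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

Applied to the post's 1.0-4.0 ratings, this would treat them as an ordering only, which is all an ordinal scale licenses.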

Don't have the study handy, but measures of working memory such as the OSPAN were found to predict proficiency in a programming course even better than an algebra test that participants also took.

It would be interesting to administer these algebra tests to junior/mid/senior level programmers currently in the field.

One thing I'm a little uncertain of is the ideology behind these kinds of things; it seems like the primary purpose here is to keep people out of the field.

Also I think those are some pretty weak quantitative results, and good exhibits for argument against p-values.

Codeup CEO here: For our purposes we want to only take students we know we can help and will eventually succeed. If we bring in people who fail during and after class, it's bad for business.

It may be controversial but running this place has shown me that not everyone can program competently or is cut out to be a professional programmer. Anyone who says "everyone can code" is mistaken.

That might be overly cynical. If it's true that there is some fundamental way of thinking or personality trait that is both limited to a large minority of people and necessary to be a proficient programmer, it is extremely valuable to know about it.

A lot of people in the field, myself included, get a hint of intuition that some people either have it or they don't.

If we could get at (or closer to) some basic principle, we could both help people focus on their strengths (by identifying those who show this trait, and being transparent with ones who don't).. and hopefully learn how to cultivate the underlying trait itself.

Wouldn't it be funny/cute/useful if just practicing a bunch of algebra word problems made proficient programmers even better?

Why do you think the results are quantitatively weak? My first critique is just the small sample size (~60 students) and the tests, which appear to differ between students.

The OP alluded to p-values, and I think this may be relevant:

(In short: p-values are pretty much useless as an indicator of "this experiment can be reproduced".)

Not that this necessarily refutes the original article, I just thought it might be relevant to discussion.

If you look in the methods, the experiment has been successfully reproduced twice after the initial trial with distinct cohorts of students.

Is the complete algebra test available somewhere?



I really don't want to sound rude, but this sounds like a classic case of correlation not causality, and I could see the consequences really hurting some people.

I am REALLY terrible at algebra (to the extent that I am borderline LD in math), and I have completed a degree in CS and hold a steady job as a software engineer. And I am definitely not the only one. Ask a room full of developers how many are bad at math, and I guarantee the results will surprise you. On the other hand, I know plenty of folks I went through school with who were math majors so good at algebra they could do full-page derivations piss drunk without a hitch. And yet, they would take an intro Java course and be totally lost. If they didn't get a Java intro class, how do you think they would have done with something as algorithmic as an assembly language, a functional language, or understanding the nuts and bolts like Turing machines and automata?

Algorithmic thinking is definitely an important part of programming, but it is just that, only a part. And quite a few types of development don't emphasize that sort of thinking. I could perhaps see this being a PARTIAL solution for areas where one would be doing a lot of functional programming, but the vast majority of the job market these days is still OOP.

This could really quash the job prospects of programmers who are perfectly capable of writing quality code but are poor with algebra.

I don't think the author claimed causality. He's interested in prediction, for which correlation is perfectly well-suited.

Also, nobody is claiming that the correlation is perfect. You can perfectly well see outliers in his graphs, but outliers don't disprove the correlation.

Fair point, but showing these graphs, complete with the outliers, is a very poor platform on which to base handing out algebra tests in interviews without establishing causality.

> very poor platform to base handing out algebra tests in interviews without establishing causality


It probably shouldn't be your only criterion, but if it has a high correlation, why wouldn't that be sufficient for handing out algebra tests?

A high correlation corresponds to predictive power, even if it's not a causal relationship.

I think the point being made is that there are probably predictors which have many orders of magnitude more predictive power (e.g. work experience, education, interview questions, programming exercise, available source code, etc.).

The male gender is probably correlated to programming ability as well due to the fact that there are more male programmers but it's not a very useful predictor.

Predictors for a school to apply? I wouldn't bet on it. None of your suggestions seems a good fit.

Also, for a job it's probably a better predictor than a resume.

> Predictors for a school to apply?

First, it was suggested that this test should be used for programming job interviews. Second, why would a school who teaches programming care if students had programming ability? If you answer it is because the school wants to maintain a reputation of producing good programmers (hence adding value to the students' resumes), it contradicts your statement that this test is a better predictor than a resume. Otherwise, people would just study algebra instead of going to that school.

The author uses it as a predictor for acceptance at a school.

By the way, he was around the thread explaining that he can only train a limited number of people, and that some of them not learning is wasteful and uninspiring.

> A high correlation corresponds to predictive power...

That's a bold statement. And one that is unfortunately often false. Spurious correlations pop up everywhere. They can be notoriously bad at predicting things, and care must be taken not to take rash actions based on correlations that may have no actual clout. I'd say denying someone a job based on an algebra test is sufficiently rash.

There are some correlations I love that illustrate this. You have "Lack of Pirates Causing Global Warming":


and one I hadn't seen before, "Internet Explorer vs. Murder Rate",


Why do you think those correlations don't have predictive power, eg, that if you give me a graph of the pirate rate over a long period of time, I can't make a pretty good guess about the global warming rate?

Now, obviously, these are only very coarsely correlated because they don't reflect small scale variations in either data set (which is why the graphs are so coarse). However, in this case, both seem to actually be driven by a hidden factor (technology and societal advancement), which means that they'll actually likely stay correlated (as long as the correlations with the hidden factor stay strong).

Similarly, internet explorer and murder rates.

tl;dr: Your articles are correct that correlation doesn't imply causation, but I never argued that increasing math or computer science education would improve scores in the other. Instead, I merely argued that as long as they're correlated, knowing about one will tell you about the other.

(Also, the examples you picked don't meet my definition of strongly correlated, because they don't mimic fine features of the data, but that's actually secondary, because the argument you presented doesn't even relate to what I claimed.)

If I'm trying to pick years with global warming and all I know is the pirate population, I can do pretty well.
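This is easy to demonstrate with a toy simulation: two series that never influence each other, both tracking a shared hidden driver, stay strongly correlated, so one predicts the other despite zero causation (Python; all numbers are invented):

```python
import random

random.seed(0)

# A hidden driver (call it "technological progress") plus independent noise.
# Neither derived series has any effect on the other.
trend = [0.1 * t for t in range(200)]
pirate_decline = [t + random.gauss(0, 1) for t in trend]
warming = [t + random.gauss(0, 1) for t in trend]

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Strongly correlated with zero direct causation: knowing one series
# predicts the other for as long as the shared driver persists.
print(pearson(pirate_decline, warming) > 0.9)  # True
```

Which is exactly the claim: the correlation keeps its predictive power as long as the hidden factor keeps driving both series, even though intervening on one would do nothing to the other.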

There is a difference between “algebra with numbers” and “algebraic manipulation”. You do the latter all the time with code when you observe and use equivalences like this:

    if a then
      return b
    else
      return c


    return (if a then b else c)
I.e., “return” is distributive over the branches of the “if” expression, so you can factor it out.

When you see “bad” programmers redundantly specifying Boolean expressions like “x == true” instead of “x”, it’s because they haven’t internalised the fact that “if” takes any algebraic expression denoting a Boolean, not just a special sort of expression with “&&” and “||” and “==” and “<”.
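In Python terms, the kind of simplification being described looks like this (a hypothetical example; both functions behave identically):

```python
def is_adult_verbose(age):
    # Redundant: compares a Boolean-valued expression to True,
    # then spells out the Boolean result branch by branch.
    if (age >= 18) == True:
        return True
    else:
        return False

def is_adult(age):
    # `age >= 18` already denotes a Boolean; return it directly.
    return age >= 18

# The two agree everywhere; the second is the algebraically factored form.
assert all(is_adult(a) == is_adult_verbose(a) for a in range(100))
```

Recognizing that the condition is just an expression, and that the `if`/`return` scaffolding can be factored away, is precisely the algebraic move.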

Beginning programmers often struggle with learning the elements available in a language and how they can be composed, and that is exactly what an algebra is. I would be very surprised if an aptitude for algebra did not correlate with an aptitude for programming.

This skill can help to write more concise code and refactor better but I wouldn't say it's the core component. Very well-functioning, maintainable code could certainly be written that was full of "== true" lines. That is preferable to indecipherable long-winded one-liners with no comments that are hard to parse.

That’s a false dichotomy. An indecipherable long-winded one-liner with no comments is not well factored code either.

I didn’t mean to imply that syntactic manipulation is the core component of programming, only that it’s a required skill, which beginners commonly struggle to learn. I’ve tutored a good number of beginner CS students, and a big issue is developing a general understanding of a language from the few concrete examples they’ve seen. If you’ve only ever seen “name.name()”, you may be surprised that “(expression).name()” is possible. If you’ve only ever seen “if (name == expression)”, you may be surprised that “if (name)” or “name = (name == expression)” are possible.

This article concerns how beginners learn to program, and I’m only arguing that I would expect an early understanding of the algebra of programs to be an advantage in learning how to productively manipulate programs. When you’re comfortable with the syntax, it is much easier to reason about the more important issue of semantics.

>only that it’s a required skill

What I'm saying is that it might not be required, just pleasant, given my example of someone who codes like that but otherwise writes well-structured good code.

I completely agree. My roommate is amazing with math—he has taken every math class available at Drexel for undergrads—but he cannot code. He took an intro to programming class and did not pass. I, on the other hand, am only okay at math, including algebra, but I can code. It is easy for me to understand, but impossible for him. I think it comes down to algorithmic thinking, but logical thinking as well. That's the difference I see between my roommate and me.

> If they didn't get a java intro class, how do you think they would have done with something as algorithmic as an assembly language, a functional language, or understanding the nuts and bolts like Turing machines and automata?

I would argue that your last several examples (functional languages on) are more likely to be understood by people really good at math than Java is, and that Java isn't especially predictive in that regard.

Functional programming languages are closer in their structure to the way that math is structured than Java itself is, while Turing machines and automata are pretty much just outright math. (Especially when you start talking about the computational power of automata versus Turing machines and the equivalence between some kinds of grammars and automata.)

I expect you listed these because they're "hard" computer science, but I'd argue that you think they're "hard" computer science because they're mathy and not your forte.

I, for one, had much less trouble with Turing machines, automata, formal languages, computability, etc. than I did with learning Java.

I think I would agree with you more if we were talking about Java on a full implementation basis (actually using the nuances of the language at a high level vs just learning the basic concepts such as variables and control structures common to most languages).

I also found Turing machines and automata much easier than conventional math classes. The point I was making was not that they are necessarily harder, but rather that they require a different type of mathematical thinking than your typical algebra-based math. I would wager the same is true for discrete math.

As for functional languages, I admittedly have never programmed in one and perhaps spoke out of personal ignorance. I have, however, heard from colleagues that back my original statement. It's basically hearsay, so take it or leave it.

I agree. I think we need to be specific about what's being measured here. Seems it'd be more helpful for backend algorithm development, less critical for front end coding.

I don't think it's OOP as you suggest—I saw the same correlation in a college C++ class. Half the class got easy A's; the rest got really hard C's. I've never seen such a bifurcation in any other college course; usually it's a more gradual, continuous distribution.

I think you misread my reasoning about it not being OOP—I meant Object-Oriented Programming. I'm not sure how the C++ correlation refutes that. Did you mean something else?

Completely agree. In the test presented, one could probably replace 'algebra' with 'electrical engineering' or 'electromechanics' or 'physics' etc. and get the same outcome. I'm not exactly good at algebra either, mainly because it's not my field, but I know enough about statistics in general to see that this article (well, not so much the article itself as some of the conclusions presented) is flawed.
