As others have mentioned, it's less the syntax of a language and more the plethora of libraries that is the real boon when it comes to using a programming language. It has been said that the majority of a tough problem is worked out unconsciously. If we eschew memorization for "just in time" methods, we are essentially putting a hard limit on the type and difficulty of problems we're capable of solving.
I personally think the amount of library functions memorized is the biggest difference between average programmers and the so-called 10x programmers. An extremely high level of fluency with a programming language environment is invaluable when it comes to efficiency and code quality. It is analogous to conversing in a language where one has to look up every other word compared to a native speaker who has a deep fluency. It's not just a matter of speed; the depth and quality of thought is orders of magnitude greater in the fluent speaker. The more of the heavy lifting you can do unconsciously, the higher the level of output you can produce for the same amount of mental effort. We accept this in just about every profession, yet we resist when it comes to programming. Personally, I'm glad my doctor memorizes a large amount of the facts he uses in his day-to-day work.
It is unfortunate that just about every new language brings with it a new set of standard libraries that we must learn to use effectively. We give up so much expertise and efficiency when we don't allow ourselves to build up a high level of familiarity with them. I can imagine a future where there is a standard library of methods that every programmer attempts to memorize and that every new programming language is programmed specifically for this standard library. We are already seeing this with language frameworks like the JVM and .NET, but we need to go even further. Hopefully libraries themselves will be created specifically to be easy to memorize.
The problem is that languages and especially libraries change too frequently. A doctor can learn the name of every bone in the body and that knowledge will hold for the rest of his life. In programming, there are very few things that will still be the same in 20+ years and we have no idea what those things are. It depends on the field but things can change so much within 2 or 3 years that it's just not worth memorizing everything.
The timespan of change doesn't matter; the question, as always, is whether the benefits outweigh the cost.
Now, the cost is less than 5 minutes per flashcard to give you roughly 95% recall (more details: http://www.gwern.net/Spaced%20repetition#what-to-add ); so the question is whether not knowing something you could've put into a flashcard (the consequences of not knowing it, the time it takes to look it up, whether you will even know to look it up, etc.) outweighs the cost of a few minutes' review over that time period.
I think for a lot of stuff this can be true: shell scripting is not going away in 20 years, for example. If you're programming in it routinely, a language you will use for only 10 years can be worth memorizing for a while. If you're using a tool every day for the next year, there's going to be a lot worth memorizing there too.
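To make that trade-off concrete, here is a back-of-the-envelope sketch in Python. The 5-minute total review cost is the figure quoted above; the 10-second lookup time and the function name are assumptions made purely for illustration.

```python
def break_even_lookups(review_cost_s: float, lookup_cost_s: float) -> float:
    """Number of avoided lookups at which a flashcard pays for itself."""
    if lookup_cost_s <= 0:
        raise ValueError("lookup_cost_s must be positive")
    return review_cost_s / lookup_cost_s


# Assumed numbers: 5 minutes of total review per card, 10 seconds per lookup.
REVIEW_COST_S = 5 * 60
LOOKUP_COST_S = 10

print(break_even_lookups(REVIEW_COST_S, LOOKUP_COST_S))  # 30.0
```

This counts lookup time only. It leaves out the harder-to-measure costs mentioned above, such as never realizing there was something to look up.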
> In programming, there are very few things that will still
> be the same in 20+ years and we have no idea what those
> things are.
For the most part we are walking toward local optima in programming languages with larger jumps rarely succeeding.
SRS is well designed for gradually changing bodies of knowledge. New cards are added and come up more often; older cards can be retired, but they rarely come up anyway.
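A minimal sketch of why old cards rarely come up, assuming a deliberately simplified rule (double the interval on a successful recall, reset it on a failure). This is not the actual scheduling algorithm used by Anki or any other SRS program; `Card` and `review` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Card:
    prompt: str
    interval_days: int = 1  # days until the next review


def review(card: Card, recalled: bool) -> Card:
    """Double the interval on success; reset to one day on failure."""
    card.interval_days = card.interval_days * 2 if recalled else 1
    return card


card = Card("Which re function must match the whole string?")
for _ in range(5):
    review(card, recalled=True)
print(card.interval_days)  # 32
```

After a handful of successful reviews a card is already a month away, so a stale card costs almost nothing before you get around to retiring it.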
Interesting point. I think an analogy that many people can appreciate is that one's knowledge of a language and libraries plays the same role in programming that one's knowledge of mathematics plays when learning physics in school. If you're struggling with the math, you aren't able to devote much thought to the physics you're supposed to be learning. I understood the principle very well in school and always made sure to refresh my knowledge of the required math skills before tackling my physics work, but I haven't applied it to programming as well as I should. Food for thought.
"I personally think the amount of library functions memorized is the biggest difference between average programmers and the so-called 10x programmers."
I don't know, what about the ability to think outside the box, a solid grasp of the fundamentals, or the capability to understand and create high level abstractions?
Seems like you're saying the ability to glue together a bunch of library functions is what classifies a great programmer. If that's the case, it should make programming interview tests rather trivial.
The point is that the more of the details you have committed to memory, the better you're able to "think outside of the box" because your limited cognitive capacity is not bogged down by the minutiae. The analogy of fluency in natural language I think is very relevant. The difference in complexity of thought between someone fluent in a language vs someone just learning isn't a matter of intelligence, it's a matter of having the building blocks of thought committed to memory. The more of the details that are second nature, the higher level of thought that results. I don't see why this is any different when it comes to programming.
These were my thoughts too - while I agree with the general sentiment of using Spaced Repetition, it also depends on the language you are using. If I find I'm beginning to recognize an abstraction I immediately begin to search for something someone else has already done.
The author of the post used PHP as an example and I think it's a terrible example because PHP's standard library and 3rd party libraries were all over the place in terms of naming conventions, structure, duplicated but not quite abstractions, &c...
I've found learning languages like Haskell to be very different - where learning the axioms and postulates is what leads me to implementation specifics. A good example: I was building accumulating recursive functions for a little project and was looking at the code and said this to myself, "This doesn't feel right, this looks like Scheme or Erlang code - not Haskell."
I set out to find any abstractions built into Haskell or 3rd party libraries that handled accumulators - lo and behold, folds! I've done that numerous times with this language, proceeding from fundamentals and the process of abstraction to find first if the abstraction has already been done!
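For readers who haven't met folds, here is the same move from a hand-written accumulator to a fold, sketched in Python rather than Haskell. `functools.reduce` is Python's left fold, roughly analogous to Haskell's `foldl`; the function names are made up for illustration.

```python
from functools import reduce


def total_recursive(xs, acc=0):
    """Hand-rolled accumulator recursion, in the style described above."""
    if not xs:
        return acc
    return total_recursive(xs[1:], acc + xs[0])


def total_fold(xs):
    """The same computation expressed as a left fold."""
    return reduce(lambda acc, x: acc + x, xs, 0)


print(total_recursive([1, 2, 3, 4]))  # 10
print(total_fold([1, 2, 3, 4]))       # 10
```

The fold version names the pattern (start value, combining step, sequence) instead of spelling out the recursion each time.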
[EDIT]
I use Spaced Repetition for learning Haskell - but it isn't names of functions; it is the Monad laws, or Functor laws, &c... that I want to memorize.
I agree with you and I like your last point. It is indeed a problem that we have many languages and many libraries when many of them do the same thing. One big library to rule them all. Some day we will have that. There will be many algorithms in there, containers, network stack, everything you want and need. And cross platform of course.
What you mention is one more thing to remember when starting on a specific project/team: which of the possible library functions should you use for a given task?
> The more of the heavy lifting you can do unconsciously, the higher the level of output you can produce for the same amount of mental effort.
I agree, but working from the standard library has it backwards. One should subconsciously know what is in principle possible and then look up/remember what it's called in the language you are currently working in, and if it is not there implement it yourself.
But the act of searching and evaluating the results limits your cognitive capacity for building even greater abstractions. If you already have some standard set of implementations memorized, that frees up your mind to use those pieces as building blocks for new, grander structures. As you fill your working memory with minutiae, the larger abstractions are bumped out, which makes it harder or impossible to create new abstractions using the higher level building blocks.
I get the resistance to memorization--I am the laziest SOB I know (and being in the company of programmers, that's no small feat). I got through college almost never taking notes and by just understanding concepts. It worked great for some subjects, namely math, CS, physics, etc. But one thing a math professor said one time in a higher level course for math majors that stuck with me: "If you don't memorize what came before, you will never be able to make new discoveries". In hard math it is expected to memorize the results that came before as this is the only way to discover patterns between them and create new associations.
I think the same applies to programming. Your mind can only hold and manipulate so many units of information at a time. The more abstract those units are, the greater the resulting mental structures will be. We artificially limit ourselves in our resistance to memorization.
I hope this format is useful to you guys. I actually spent the last two days making a video of this, with screen recordings of using Anki and such, and then felt it's a lot more efficient to just get the info on a static HTML page instead.
Certainly very appreciated. I find I can learn more from a static HTML page than a video, because it is easier to scan the content, and read and reread things, than it would be in a video. It also gives the ability to search the whole tutorial or article without having to rewatch the whole video.
In our extensive user testing we have seen our users make great gains by using SRS. In blind Ruby coding tests we have seen users complete them faster with more confidence in their abilities and results they generate. Got some great data and numbers on exactly how much advantage learning this way gives you. Plan to release it when we launch in a couple of weeks. We are http://www.codesonic.com/ by the way.
I like static text much better. That said, I thought your article was great, but all the value was in the title and possibly the link to Wozniak's page. (I was already familiar with the concept of spaced repetition, so once you pointed out that this works for programming too I pretty much got it.)
I was a contributor to Anki and one of the earlier language learning bloggers who wrote about spaced repetition for Chinese learning. One thing I noticed at that time was that those of us Chinese learners who were programmers went crazy about spaced repetition... some went so far as to SRS to the exclusion of actual reading. Other language learners blogging about SRS often weren't so thrilled about the concept.
At this point, I'm seeing the same thing again, but distributed to a larger group of people. Everybody and their cousins are making SRS programs for learning languages, often Chinese, and that giddy feeling I had when encountering the idea of SRS seems to be all over the place!
How I wish I could explain that the context matters and they'd learn faster by actually using these discrete chunks of knowledge in context! That reviewing decontextualized data is a poor way to synthesize most things! But who would listen? I wouldn't have 4 years ago.
SRS is seductive. It's amazingly good for things in which there is a set number of essentially unrelated pieces of information to memorize (e.g. 5000 common Chinese characters), but it's terrible for learning something more nuanced (e.g. how to read Chinese). I do still keep it as a tool, but it's a special purpose one.
I agree with everything you said. SRS is very seductive, but after you do it for some time it loses its glamor and just becomes another tool in the tool belt. John Pasden's article that you linked to is great and on point. SRS becomes annoying after some time.
A few months from now Derek is going to be asked what Array(5) does. He will be pissed off that he doesn't know the correct answer, but does now remember to not use the Array constructor because it is ambiguous and the literal syntax is better.
var a = []; a.length = 5; is better than: var a = Array(5)
The former is obvious nonsense, the latter is non-obvious nonsense. If Derek is like me, he will then be asked to rate his knowledge, and be pissed off that he now knows more but Anki makes him admit to not knowing it.
> How I wish I could explain that the context matters and they'd learn faster by actually using these discrete chunks of knowledge in context! That reviewing decontextualized data is a poor way to synthesize most things! But who would listen? I wouldn't have 4 years ago.
Good point. When I read about SRS for the first time a few weeks ago, I had roughly the same idea as the author of this article (except I didn't carry it out--a million kudos because he did :) ).
Now I think the difference with using SRS for learning a natural language is the ultimate goal: It's not "I wanted to learn how to program in language X", if I wanted to do that, I'd do some tutorials or try to complete some Project Euler challenges in that language. Instead, it's "I want to spend less time having to look up parameters and usage of functions in language X's standard library, a language I am already quite familiar with, apart from spending too much time reading the documentation".
Now that I think about it, I see another great advantage: It's a personal thing, I don't expect most people to battle with this problem as much as I do, but whenever I need to look up something in the docs, the problem is not only that this takes time, but also that I have the tendency to read way more of the docs than what is relevant to the problem at hand, I click through to other pages that seem interesting, which reference an external site, I look up stuff related to that and get completely sidetracked. Not only does this take an extraordinary amount of time, but also drains mental energy that I should be using to work on the initial programming problem. This lack of focus has gotten so bad that it resulted in severe burn-out and I've basically resigned myself to not pursuing a career in computer programming related jobs, even though that's what I love to do.
Not having to look up things, so I do not need to task-switch (as much) and keep focused on the programming, well I tried so many other things, I really must give this a shot.
When I read about SRS, there were a couple of programs available, is Anki generally considered to be the best (free) one to start with?
But can't you also use SRS with larger chunks? I think that's what the All Japanese All The Time author does and he's fairly confident about the results.
Putting whole sentences into an SRS definitely beats isolated words, but it's still not a replacement for actual reading.
The AJATT guy was big on SRS, but he was also watching hours of Japanese TV each day and even playing Japanese music and various other MP3s in his sleep. He got the requisite comprehensible input. He achieved a good level of language skill, but for the time he put in it wasn't exceptional.
L2 acquisition linguists generally agree that the important thing is "massive comprehensible input". This could be reading or listening. There is disagreement on whether input alone is enough -- hardliners such as Krashen would say yes, others would say no. However, there's little disagreement that input is the most important factor. And an SRS will never keep up with actually reading a book when it comes to input.
3 years ago, I was that guy with a huge SRS deck, reading AJATT and writing my own blog about it. If you're unconvinced with the above, all I can say is make your huge SRS deck, do it daily and then come back in a few years and let me know how it went.
Haha, no, I believe you! Just curious to hear another perspective on it... and I had forgotten that he was getting hours of passive input in addition.
It just always sounded interesting because I had reasonable success with flashcards for single words. In retrospect, that was before I had the ability to read even short stories, so maybe I am remembering the utility of them wrong, or they seemed useful because the alternative was nothing.
This is a good post that I'm glad to see refers to a lot of the work done on the subject of memorisation and it would be very useful for vocabulary building in language acquisition (amongst other things).
I can't see the utility for programming languages (compared to human languages) however as the grammar and vocabulary of programming languages are tiny in comparison. The best way to learn them is to write something with them (IMHO). The available libraries are of course broader but the subset you use tends to differ from person to person and actual use will tend to reinforce what you most often use.
The "vocabulary" in a programming language isn't really the reserved words.
It's the libraries. Deep recall of the standard library of any major language would probably improve programming fluency a lot -- and it might improve code quality too.
I wrote an honours research proposal on this; email me (see profile) if you want a copy.
Having seen a lot of code that duplicates basic functionality of libraries used in the project I'm working on, just writing code might even be counterproductive as opposed to a systematic way of learning the libraries and their usage in the project, reinforcing bad behaviour.
Right. In my proposal on the topic, I proposed dividing a freshman CS class into two groups. One group would use spaced repetition, the second wouldn't.
To compare the groups, I proposed using a few common metrics (SLOC, cyclomatic complexity, Halstead's metrics) to get a gauge of the different size of solutions. My guess was that a more "fluent" student would write shorter and simpler programs simply by not needing to reinvent.
But (in reference to the parent topic of learning libraries vs. learning language primitives) shouldn't freshman CS students be mostly NOT using library functions? For example the classic exercises in sorting, learning various sorting algorithms, or manipulation of tree structures, matrices, etc. If you're just using library functions you aren't really learning what's going on.
Not really commenting on spaced repetition here, that should be helpful in either case.
At my university the freshman course was, for two years, taught in Haskell. To my eternal shame I managed to dodge most of the first year with a fistful of RPLs, so I never got to sample it.
The result was that lots of students dropped out of computer science and went elsewhere. More than had when being taught something else.
So they switched back to an intro course based on Java (which is their main teaching language for the first two years, C is the second language which is picked up in the 2nd half of the first year). Then you go on to bog-standard Data Structures & Algos / Computational Structures courses, Algo course etc etc which is where you learn the ins and outs of sorting, trees and so forth. Or, in my case, you barely do so because you are a lazy student who didn't do his homework.
The upshot is that you wind up spending most of the intro course teaching the mechanics of Java and motivating students with "interesting" examples. Various graphical geegaws, basically. In such cases students usually aren't using the standard lib, they're using a provided library.
I submitted 4 proposals; I wound up doing a different one.
The others were:
* a robust user-tracking protocol for websites (this is the project I went with). Inspired by a business idea I had in 2008; currently the basis of my "big" startup project.
* a "whole-machine" architecture proof-of-concept -- basically a blog app targeting a VPS. Gives you a lot more design options if you can control the architecture from the OS up. Inspired by my eye-blistering hatred for Wordpress.
* "A model of player Agency for software-created, interactive, just-in-time plot generation". There's simply no way this would have fit into a one year project, but it would be very interesting to pursue some day. I was particularly proud of my little taxonomy of plot generation mechanisms. Inspired by chatting to mates about what's wrong with MMORPGs.
I don't see use when learning to program but if I memorised, say, the PHP standard library with all its warts, it would make me far, far more efficient at work. Whenever I write in PHP, there's always going to be a lookup for something in the standard lib, whether it's the signature of a function, how it should be used or whether it exists at all.
PHP is a good example as it is very hard to remember all its functions, as it uses a mix of naming standards. (For example, the case-insensitive version of strcmp is strcasecmp, and the case-insensitive version of strstr is stristr)
Whenever I write PHP I look up a lot more methods than when I write Ruby, even though I have more experience in PHP (Ruby cheats, a little bit, by having multiple names for many methods, making a wild guess much more likely to be correct)
I've been doing this for almost a year and I agree it accelerates programming languages/libraries learning. Not having to switch between my text editor and the language/library reference lets me focus on the hard problems.
As a programming languages geek, a few months ago I realized that every time I came back to a language that I had learned earlier, I'd forgotten a good chunk of it, rendering all my efforts in vain. So I started using Anki. Here are a few tips from my experience:
- Koans[0] are great as source material. They're available in almost every mainstream language, the problems are usually solvable in less than a minute and you can just copy & paste.
- Always open your repl and type your answer. Otherwise you're not really thinking about the problem but just memorizing the answer.
Interested, in particular the Python ones! Do they go over the Python standard library and "pythonic" idioms?
Maybe you could put them on github, perhaps?
Only a few weeks ago I first read about the spaced repetition technique, and my first thought was "what would I want to memorize? maybe I could use this to learn the parameters and usage of Python (or some other language)'s standard library" ... and there it is, on HN, including someone offering a set of flash cards, saving me that work :)
Although creating one's own flash cards is a good first step in learning, I wonder how good it is (in that particular regard) to someone already reasonably familiar with the standard library, just wanting to save time by less looking up in the docs?
I have been programming for the last 15 years and I still sometimes look things up, even in languages I use daily. I would not really change this, since if I use it often I will remember it -- and if not, it's just one search away. I think memorizing random facts is a really bad way to become a better programmer -- much better ways are to solve problems and read expert code (e.g. on GitHub).
Regarding the memorization technique I am unsure if this is the most effective way of remembering things. The book "Moonwalking with Einstein: The Art and Science of Remembering Everything" presents a lot of tricks on how to memorize things that could be more effective than the method described in this article.
If anything the most important skills are the "meta" ones - being able to judge search results quickly for their credibility and knowing what vocabulary to use.
And it is a good idea to look things up every now and then because they change. In the olden days if you wanted to open a file you just did it. Nowadays you have to consider things like TOCTOU, process credentials, character sets, permissions to require/set, virtual locations (e.g. "My Documents", localization), quotas, race conditions, fsyncs and renaming of temporary files etc. Heck, even adding two integers can result in security flaws if not done carefully.
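As a rough illustration of two items on that list (TOCTOU, and the fsync-plus-rename dance for temporary files), here is a sketch in Python. The helper names are hypothetical, and it deliberately ignores permissions, quotas, and syncing the containing directory.

```python
import os
import tempfile


def read_if_present(path):
    """Open directly and handle failure, instead of checking existence first.

    An os.path.exists() check followed by open() leaves a window in which
    the file can be removed or replaced (a TOCTOU race).
    """
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return None


def atomic_write(path, text):
    """Write to a temporary file, fsync it, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)  # atomic rename on the same filesystem
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Readers of `path` then see either the old contents or the new contents, never a half-written file.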
Absolutely about the change point. It's also valuable to understand why things have changed; memorization is not going to help you with that, but you should be able to at least notice that things have changed and be able to conceptualize why they have.
Once you know a few languages, picking one up in the same family isn't that hard. The biggest challenges, for me, when picking up a new language are to:
1) Learn to write idiomatic code.
2) Learn the name of the library functions, so your flow isn't constantly interrupted by looking for what you need.
1) I think, can only be done through reading other people's code. But I've previously thought about doing 2) by using spaced repetition.
2) Learn the name of the library functions, so your flow isn't constantly interrupted by looking for what you need.
Eh, this isn't necessary. I've been programming (and continuously making a concerted effort to expand my programming knowledge) for more than a decade, and memorization of anything simply hasn't been very helpful.
For example I have no idea what Python's regex syntax/library function is, even though I've used it a dozen times in the past. But that doesn't matter: alt-tab to chrome, ctrl-N, "python regexp", stackoverflow pops up with an example, copy-paste, done. Five to ten seconds, max. Far from breaking my flow, it's become an integral part of it.
Zed is the same way, for what it's worth. He talks about it in a Peepcode screencast. "I memorize concepts, not names."
> But that doesn't matter: alt-tab to chrome, ctrl-N, "python regexp", stackoverflow pops up with an example, copy-paste, done.
I find, though, that this is one of the most insidious causes of bugs in my code. All the little idiosyncrasies in those functions - do they throw an exception when they fail or return a code? Are regexes multiline by default or not? Does re.match() match the whole string to the pattern or just part of it? Time and time again it's my assumptions about these subtle behaviors that create bugs in my code, and I've actually come to the conclusion that I need to do some SRS type learning to get newer languages into my head.
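For what it's worth, those three questions have short answers in Python's standard `re` module, and each would fit on a flashcard. A small sketch (the sample string is made up):

```python
import re

text = "first line\nsecond line"

# A failed match returns None; it does not raise an exception.
# (An invalid pattern is a different case and does raise re.error.)
no_match = re.match(r"\d+", text)

# re.match anchors only at the start of the string, so a prefix is enough.
prefix = re.match(r"first", text)

# re.fullmatch requires the whole string to match the pattern.
whole = re.fullmatch(r"first", text)

# re.search scans for a match anywhere in the string.
anywhere = re.search(r"second", text)

# By default ^ matches only at the start of the string, not of each line.
default_caret = re.search(r"^second", text)
multiline_caret = re.search(r"^second", text, re.MULTILINE)

print(no_match, whole, default_caret)  # None None None
print(prefix is not None, anywhere is not None, multiline_caret is not None)
# True True True
```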
If you don't memorize anything you'll be going to stack overflow for everything. If you spend 10 seconds every time you need to get the first char of a string, a substring, create an array etc you'll be wasting an awful amount of time. Knowing the core functions is a huge timesaver. Python's regexp functions are arguably not 'core' library functions.
If you spend 10 seconds every time you need to get the first char of a string, a substring, create an array etc you'll be wasting an awful amount of time.
Not at all. You only need to do it once for the current program you're writing. Now you have a working example to refer back to, within the program. So the lookup time only comes into play the first time you need to do a certain thing. It's a negligible constant factor (about a couple minutes in total per program, and without interrupting flow).
Also, being prejudiced against those who have a bad memory is not helpful.
Knowing the core functions is a huge timesaver.
Not really. By using Google effectively, it's possible to be productive in any language without knowing any of the core functions.
This seems counter-productive to me -- you're going to be spending a lot of time memorizing functions you'll never use.
If you just look up functions when you need them, you'll memorize the most important ones you use over time, no special practice needed.
So I really can't see how this isn't the worst form of "premature optimization". I guess it's fun if you've got nothing better to do, but I'd never recommend it.
(But I'm not saying you shouldn't read through a language and its libraries so you know what's there conceptually. That's a good thing to do, so you know what to look up later.)
Seems like a less than efficient way to learn and remember a programming language. Unlike natural languages, we can write programs with programming languages. We can run these programs, and if they still work, then they're probably still correct*. You can't do that with written/natural language.
As far as remembering programming language constructs and vocabulary, the way to do it is do projects. You use what you learned, and naturally, you're going to come back to it when you have to fix a bug or implement a new feature. All the while, you're building something good, too.
Another thing. If you're using a well-designed language, the concepts will incrementally build on themselves. They won't just be a bucket of orthogonal concepts, (like PHP, which seems to be mentioned in the article). Therefore, by learning fundamentals, you can more easily understand and decipher more advanced concepts. So, maybe you did forget what something does. But after a few minutes entering commands into a repl, you fully understand it again, because of your crystallized knowledge of other concepts.
I can't imagine a worse way to learn a new programming language than brute memorization.
What you say is true. But the way the human memory works makes the PHP library thing a non-issue. Even if it's badly designed, even if it's not well made, you will probably remember it in the same time frame. Maybe a little slower but not much. That's the flexibility of the human brain for you. It's not picky at all.
I'm a C++ programmer but I remember myself learning and remembering the PHP functions very fast a few years ago. The PHP library is just a bunch of stuff put together and yet I still remember a lot of them today. Even the parameter names.
Even in spoken languages this applies. In French, you have to memorize the spelling of every word if you want to write without making too many mistakes, because the spelling of most words often doesn't make any sense. Sometimes the word is coherent with the sound, sometimes the word has mute characters, sometimes the word is bizarre because it comes from Old French... You have to remember everything. Ok maybe I shouldn't have picked French because we do have a big problem with dyslexia and the average teen douchebag can't even spell the word poker.
Oddly enough, I've tried SR multiple times in the past, with very bad outcomes. For example, if I added 10 new English words on day 1 (English is my second language), I would only recall ~10-20% the day after.
Then a few days ago I started reading The Memory Book, which really emphasizes visualizing things, making correlations, etc. The effort of visualizing and using imagination really made a difference: in one hour I memorized the list of 50 states forward and backward with almost 100% accuracy after 2-3 days (I had never heard of some states before).
Quickly after I realized that my limited English vocabulary was also the weakness in this system, so I started visualizing and adding new words to Anki once again. The result is that my recall after 24 hours is close to 80-90% for new words, which is encouraging.
I've been doing this for ~1 week, so take it with a grain of salt, but for me it seems that visualizing the words and using SR is helping. It takes 3 times the effort, but as long as I remember things, I don't care.
The idea of applying the system to programming sounds interesting, and definitely worth a try.
It's a well respected way of storing information. But (drawing a terrible analogy) it's a linked list, not a hash.
It's hard to perform a random lookup on something 2/3rds of the way through your mental tour.
SRS works on the well-understood phenomena of repeated exposure and tested recall. It's very good at imprinting atomic information even if that information has no context whatsoever. The original research was done with randomly chosen letters.
To solve the hash problem you can use association and substitution. For example, while learning the word bowler the other day I imagined a bowl that served as a "tana" (my first language translation) for animals. It's a ridiculous association but it kinda works. The nice thing is that I then put words on Anki too.
When I was 12 years old I had a family friend - let's call him James. James was the son of my dad's best friend, and I saw him as a bit of an elder brother figure. I followed him around and tried to copy anything I saw him do, usually unsuccessfully. One day I saw James make a website. Naturally, I had to start making websites, so I whined at my dad until he bought me the HTML Black Book, and read the thousand-page tome cover to cover in a weekend - pretty much the exact opposite of the "find a project first" approach. Then I started making websites.
Having read the entire book was incredibly helpful. Anytime I thought of a feature I wanted to add, or wanted to rearrange the site in a certain way, I knew how to do it - or at least, I knew that there was some pattern that would easily solve exactly what I wanted to do. I made sites for myself, for my school, for my friends - and they were decent. Not fabulous, but my sites were a lot easier to update and maintain than the site my school's $250/hour consultant had built (tables were visibly different on each content page!)
I used the same approach when I finally started writing code professionally 10 years later. In one case I solved a problem in 17 minutes that my coworkers had spent months trying to solve, because I knew that there was an existing function that did exactly what they were looking for. These things were surprisingly hard to find in Google, but having near-encyclopedic knowledge of the language's features made it a lot easier. Instead of searching for a general problem in an obscure language, I knew the exact function name I was searching for and only had to look up syntax.
Memorization is incredibly useful, even in a problem-solving discipline like programming. And having an encyclopedic knowledge of your language is extremely rewarding when you remember a simple way to solve a complex problem :)
Frequency analysis of human language shows that word usage follows a Power law distribution (Zipf's Law) which, for example, means that:
* the 75 most common words make up 40% of occurrences
* the 200 most common words make up 50% of occurrences
* the 524 most common words make up 60% of occurrences
* the 1257 most common words make up 70% of occurrences
* the 2925 most common words make up 80% of occurrences
Speculatively, similar frequency analyses of library calls in various codebases for a programming language may reveal a similar long-tail distribution. This would provide a list of high frequency library routines that would be worth memorising - with clearly diminishing returns as the vocabulary grows.
This kind of analysis would also point at important targets for optimisation, simplification and parallelisation as well as direct the efforts of library implementers for new languages.
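As a rough sketch of what such a frequency analysis could look like for Python code, the snippet below counts call names with the standard `ast` module and reports how much of all call sites the top N names cover. The helper names (`call_name`, `count_calls`, `coverage`) and the tiny `SAMPLE_SOURCES` corpus are made up for illustration; a real study would need a large body of code and would have to resolve which library each name belongs to.

```python
import ast
from collections import Counter
from typing import Iterable, Optional


def call_name(node: ast.Call) -> Optional[str]:
    """Return a short name for the called function, or None if unclear."""
    func = node.func
    if isinstance(func, ast.Name):        # e.g. len(x)
        return func.id
    if isinstance(func, ast.Attribute):   # e.g. items.append(x)
        return func.attr
    return None


def count_calls(sources: Iterable[str]) -> Counter:
    """Count how often each call name appears across source strings."""
    counts: Counter = Counter()
    for source in sources:
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.Call):
                name = call_name(node)
                if name is not None:
                    counts[name] += 1
    return counts


def coverage(counts: Counter, top_n: int) -> float:
    """Fraction of all call sites covered by the top_n most common names."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(count for _, count in counts.most_common(top_n)) / total


# Toy corpus, purely illustrative.
SAMPLE_SOURCES = [
    "print(len(x))\nprint(len(y))\nitems.append(len(z))\n",
    "print(sorted(data))\n",
]

counts = count_calls(SAMPLE_SOURCES)
```

Plotting `coverage(counts, n)` against `n` for a real codebase would show whether the curve flattens the way the word-frequency figures above do.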
I'm working on a coder app that uses SRS.
More like 'Twitter for code, with SRS'. I want to 'follow' people who produce good cards, and get their feedback on the cards I make. This would make it superior to just having a snippets.txt file that I eventually import into Anki. Killer search, with tags, would make it better than github gist (although this will be integrated)...
Oh, and the scheduler is tailor-made for code (code chunk usage is different from word usage, which means the equation parameters optimized for word learning won't work).
Contact me if you want to know more (see profile).
Interleaved and spaced practice is a very powerful learning and knowledge retention technique. Some recent findings in cognitive science have given credence to this approach. If anyone is interested here is the link to one of the research papers.
I think memorization doesn't really help understand the fundamentals of any topic. Rote memorization can only get you so far. As concepts get harder and more complex, dependence on mere memorization leaves a lot of holes in the understanding of any subject matter.
Specifically in programming, syntax and semantics form just a minor part of the overall problem-solving exercise. Efficient and effective programming requires repeated use of concepts or paradigms in solutions for diverse problems. Our brain is much more effective at registering and recalling facts or knowledge when that piece of knowledge is exercised in diverse scenarios.
I am a co-founder of Lymboo Math (http://www.lymboo.com), an online math practice program for elementary school children. We built the curriculum that follows the natural learning sequence of math concepts. On top of the comprehensive curriculum we implemented a practice structure that relies on interleaved and spaced practice. Students practice daily on individual topics until they are proficient in that topic. Then they move on to subsequent topics along the prerequisites-based curriculum. Throughout the program the system automatically incorporates spiral reviews of previous topics at regular intervals of time to effectively cement all the acquired knowledge.
What we have found is that children easily forget what they learned just weeks earlier, and their performance degrades in the initial mixed spiral reviews. However, as they continue the cycle of (learn--practice--review)*, they show improved performance in subsequent spiral reviews.
Mixed spiral reviews model interleaved (mix of topics) and spaced (in time) practice to enhance our context-switching skills. The neurons in the brain make new connections and store patterns that aid in quick and fluent recall of knowledge.
Interleaved and spaced learning techniques are more than just for memorization.
> I think memorization doesn't really help understand the fundamentals of any topic. Rote memorization can only get you so far. As concepts get harder and more complex, dependence on mere memorization leaves a lot of holes in the understanding of any subject matter.
I look at the role 'rote memorization' plays very differently from what spaced and interleaved repetition does. Rote memorization is short-lived because the patterns and associations that the brain builds internally are not that strong. Our brains 'remember' well when a piece of knowledge (for lack of a better word) is repeatedly encountered in diverse settings. The more associations, the better we remember and recall.
It is true that without a good understanding of fundamentals the advanced concepts will be difficult to comprehend. But, rote memorization should not be construed as something that lays a strong foundation of fundamentals. Even if we are able to recall basic concepts 'learned' via rote memorization, their applicability to understanding advanced concepts is very limited.
Very interesting idea, I think it could be very useful. Hopefully we'll see some community spring up around this idea and start sharing some anki decks for their favorite languages. I know I'll certainly start working on a couple of my own.
One idea for further expansion: go beyond just having specific language syntax, and also include cards on more generic algorithms and data structures. You could key them by time and space complexity, invariants, maybe also a pseudocode implementation.
It's pretty important to learn the material yourself first, then use the cards as a reminder of what you learned. When you go through someone else's deck, there's no context.
But I could see how, if you already knew a language, then going through someone else's quiz questions might keep you on your toes more than your own quiz questions.
...this could only make sense if you want to maintain fluency in programming languages that you are not using regularly. And why on earth would you want to maintain fluency in a language you are not using?! It's not as if you'll forget the basic concepts.
IMHO, if you want to maintain fluency in a programming language you're not using, say to keep your knowledge of C and C++ fresh while you're working in a higher-level language, you should keep contributing to a bunch of open-source projects written in that language, just as for natural languages you need to keep conversing with people who speak them, or at least watching movies in them. Otherwise you'll end up knowing an artificial subset of that language that's useless for real work... And the plus is that having the open-source contributions on your resume will keep you employable for future work in that language.
There is a website that uses the same concept to teach you languages, world capitals, even Vim or shell commands, and you can create your own courses. I use it to learn Mandarin, it's really useful. (http://memrise.com)
I prefer http://exponwords.com/ Lean, free, open source, web based, works well on mobile. I use it every day, mostly to learn German. (I know the author.)
I do not see the point of using SRS in something like programming.
Unlike a human natural language, there is little value in trying to memorise extreme details without the trial of placing them in an application context. As for new concepts, the value flows from the mere effort of initial understanding and any consequent usage. Meanwhile, the core of a programming language is almost always easily learned just by using it directly.
Of course, an SRS usage as described could be very useful for other contexts, e.g. remembering programming code snippets or concepts for the purpose of interviews.
I'm going to give it a try. There are certain programming languages that I use rather infrequently, like Python or bash scripts. But when it comes to the point that I need to use them, I always realize that I have forgotten so many useful commands and API calls, so I have to look 'em up again and again.
My take on this phenomenon is that when I've just looked up a (in my eyes) complicated Unix command, for example, I know at that very moment that I won't remember it two or three days later. This tool might be a good way to note such commands down and help me memorize them.
I've been memorizing parts of Ruby with this method while reading "The Well-Grounded Rubyist." You can import all my Ruby flashcards into Anki. Hope they help. Please email me if they do (email is in profile).
Great article.
I've made a Chrome extension that makes it easy to capture what you learn from the internet and do spaced repetition on it. Check out http://memobutton.com
Nice - is it possible to combine it with existing SRS systems? (I'm using Mnemosyne, so it would need to be possible to export the data in the Mnemosyne XML format or as a tab-separated text file)
The examples you give are idiosyncrasies that shouldn't be in your code base in the first place (IMHO). These are things that vary from language to language and trusting yourself and others who are going to be reading your code to keep track of them correctly is not a good idea.
a = 5 + '5'
Ignoring the lack of a sigil, in PHP, a = 10.
In Python, this raises a TypeError at runtime.
In JavaScript, as you note, a = '55'.
(I used an interpreter for each of these languages to verify this)
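For reference, here is a small Python sketch of the three behaviours. Python itself refuses the mixed-type addition, so the PHP-like and JavaScript-like results have to be requested with explicit conversions. The helper names `add_like_php` and `add_like_javascript` are made up for this illustration and only approximate those languages' rules for this one expression.

```python
def add_like_php(left, right):
    """Numeric addition after explicit conversion (rough PHP analogue)."""
    return int(left) + int(right)


def add_like_javascript(left, right):
    """String concatenation after explicit conversion (rough JS analogue)."""
    return str(left) + str(right)


# Plain Python: int + str is rejected with a TypeError at runtime.
try:
    a = 5 + '5'
except TypeError as exc:
    error_message = str(exc)

php_result = add_like_php(5, '5')          # 10
js_result = add_like_javascript(5, '5')    # '55'
```

Writing the conversion out makes the intent visible to the reader, which is the point made above about not relying on everyone remembering each language's coercion rules.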
For me, I could potentially see the usefulness of this for two things: the various meanings of the 'this' keyword in JavaScript and the positioning system for CSS. I always have to look that stuff up. Never seems to stick.
Otherwise, the 'factoids' I have 'memorized' are all out of necessity and come from repeated usage. I don't consciously memorize anything.