Hacking language learning (rinik.net)
188 points by Void_ on Oct 29, 2012 | 52 comments

The idea of 'hacking language learning' puts me in mind of Kato Lomb [1], a Hungarian linguist who worked for the UN as one of the world's first simultaneous translators and knew around a dozen languages fluently. One of the most interesting things about her is that she didn't take any interest in languages before her adult life had begun; before that she was a physicist.

In one of her books she advocates reading foreign language texts without using a dictionary at all, just trying to figure out the meaning from what little you already know (of course you have to be at a certain level to even try this, although you'd be surprised how little you need to know to start.)

This process forces you to think logically about how words are constructed, and to use your own current knowledge to figure out the meanings of words. For example, you might already know a small part of a compound word in German, and use this along with the context of the sentence to figure out the meaning of the whole word.


This seems to be based on the idea that learning a language is mostly about learning words. That has not been my experience as a language learner. (As disclosed in my HN user profile, I am a native speaker of English who learned Chinese to the proficiency that I was able to work as a translator and as a Chinese-English interpreter.)

Words in one language do not have a one-to-one mapping with words in any other language. As the saying goes, "The map is not the territory." Each language has its own peculiarities of dividing up the Universe of experience into words, and especially each language has a different approach to arranging words into sentences and longer utterances with grammar and syntax.

I know of a Web page that lists some language-learning resources, especially useful for the case of learning one Indo-European language (like the blog post author's native language) while already knowing another.


The best single bit of advice I can give for someone who wants to learn a language thoroughly is to do a lot of what the blog post author is doing: reading in the target language. The section "Suggestions for Study" in the front matter of John DeFrancis's book Beginning Chinese Reader, Part I, which I first used to learn Chinese back in 1975, has great advice: "Fluency in reading can only be achieved by extensive practice on all the interrelated aspects of the reading process. To accomplish this we must READ, READ, READ" (capitalization as in original).

祝你好運。(Good luck!)

AFTER EDIT: The comment _feda_ posted before this one is correct: it is important to read target-language text for meaning beyond one's current reading level, using context rather than a dictionary to figure things out. That has much to do with improving understanding of the second language: just grappling with the language directly a lot, not always relying on bilingual reference books.

> This seems to be based on the idea that learning a language is mostly about learning words. That has not been my experience as a language learner.

My experience, as a native English speaker who has learned German as an adult, is that different stages in language learning benefit from different kinds of study. There have been times where memorizing grammar rules has been enormously helpful to me, others where reading a lot is what I needed, and still others where holes in my vocabulary were holding me back. After a certain level of proficiency, the only new things you encounter in the language are unknown words and phrases, so it makes sense to focus on learning them.

> Words in one language do not have a one-to-one mapping with words in any other language.

As you have mainly studied non-European languages, this might be more true in your experience than for those who are focusing on European languages. There are certainly many words that map one-to-one between German and English, especially in common usage.

What I like about the author's technique is that it seems like a smart way of priming the pump for real and detailed learning: when he encounters a new word in the text, he has a general idea of what it means from the definition, but he can sharpen that meaning with the particulars of the context where he finds it. Otherwise, he might get only the vaguest sense of what a word means from context, or have no clue at all.

I want to add that vocab would have helped me much more than the hours and hours of French verb drills I did in high school and university. I wish they had thrown so much vocab at you in school that retaining even 70-80% would be considered top notch.

> Fluency in reading can only be achieved by extensive practice on all the interrelated aspects of the reading process. To accomplish this we must READ, READ, READ

I have little experience learning languages, but it seems that this sentence is pretty meaningless; we can replace the word "read" with any verb ("jump" is an amusing one) and retrieve the old adage "practice makes perfect". Is it your opinion that an hour spent reading Chinese is better than an hour spent practising with a speaker, or writing translations from English?

> This seems to be based on the idea that learning a language is mostly about learning words. That has not been my experience

Er, sure, there's tons of other important aspects to learning language, but words are really important. At some point (once you're somewhat competent with basic grammar and conversation flow, and have the core vocabulary down), lack of sufficient vocabulary can be a big hurdle for conversing with adults about non-trivial subjects. Educated conversation uses an amazing number of words... ><

Of course you're right that it's really good to learn words in context (e.g. by reading), not in isolation (on flash cards or whatever).

[Memory is also weird: for a huge number of words, I remember the exact context (e.g. book title/page/sentence or conversation topic) where I first learned it; I hate to think of the number of brain-cells all that info is using up, but it gives these words a nice ... familiar feel.]

I agree with you that bilingual reference books are often not terribly helpful when learning a foreign language. I also think that you're spot on when you write that reading extensively is an excellent way to improve your knowledge of a language. But in my experience, you can only do this once you have reached a certain stage in learning the language. Before that stage, it's critical to build up a sufficient vocabulary and (secondarily) to learn the grammar. I've learned several languages as an adult, and I made the most progress in the beginning by committing to learn at least 10 new words or phrases a day. Once you reach around 500 words (that's a wild guess, but it's in the ballpark), you should be able to start reading certain carefully-selected books and learning new words through context. In my personal experience, it's only at this stage of language learning that "immersion" is really effective.

So yes, learning a language is far more than learning words. In fact, to underline what you wrote above, most trained translators explicitly attempt to map ideas, not words (although there are in fact multiple "translation paradigms" on how exactly to go about this). So absolutely, many words and phrases are specific to a given language, and don't map one-to-one.

Nonetheless, as an adult learner, basic vocabulary -- combined with grammar -- provides you with a foundation upon which to build your language learning: nouns like "man, woman, head, shoes, etc" and verbs like "go, speak, find, search". Without a basic vocabulary, it's pointless to try to build up oral fluency, or to read books. No serious language learner that I know willingly picks up a book in a foreign language and starts to study it without some preparation, or dives into in-depth spoken interactions with a speaker of the target language with no background in that language. Most try to establish the most common words of a language, memorize them, find a good grammar, study it, and only then do they start to interact with the spoken and/or written language.

(Background: native English speaker who learned one foreign language as an adolescent and several others as an adult / ATA-certified translator with a decade of experience doing technical/medical translations. Currently: transitioning back to professional web development!)

In a similar vein, here's a Python script which someone wrote that does a word analysis on Chinese texts:


It provides a breakdown of the words in the text, and even does lookups to supply definitions and frequency in an external corpus, so you know how common that word is in a broad range of texts. You can then take the output and paste it into a spreadsheet program, and from there import it into an SRS flashcard system like Anki for long-term memorization. Kind of a DIY solution but it's pretty handy for serious learners of Chinese.
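For anyone who wants to try the spreadsheet-to-Anki step by hand, here is a minimal sketch. It is not the script mentioned above. It assumes the text has already been segmented into words, and that `definitions` and `corpus_rank` are lookup tables you supply yourself; every name in it is hypothetical. Anki can import tab-separated text files, so the output is written in that form.

```python
import csv
import io
from collections import Counter


def build_rows(words, definitions, corpus_rank):
    """Return one row per distinct word: (word, definition, count in text, corpus rank).

    `words` is an already-segmented list of words; `definitions` and
    `corpus_rank` are hypothetical lookup dicts supplied by the caller.
    Missing entries are left as empty strings.
    """
    counts = Counter(words)
    rows = []
    for word, count in counts.most_common():
        rows.append((word, definitions.get(word, ""), count, corpus_rank.get(word, "")))
    return rows


def to_tsv(rows):
    """Serialise rows as tab-separated text, one card per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    words = ["你", "好", "你", "運"]
    definitions = {"你": "you", "好": "good", "運": "luck"}
    corpus_rank = {"你": 7, "好": 30}  # made-up ranks, for illustration only
    print(to_tsv(build_rows(words, definitions, corpus_rank)))
```

Sorting by count within the text puts the words you will meet most often in that particular book at the top of the deck.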

It's also available as a web frontend here, under the "Wordlist" link:


With full end-to-end integration with Anki and a mobile e-reader, it would be very powerful indeed.

(I should also mention that the mobile app Pleco has a reader component that allows directly adding words from an ebook into its flashcard system, along with high-quality definitions. Pleco's probably the fullest Chinese-learning suite available on smartphones.)

edit: For mouseover definitions of Chinese and Japanese words in your browser, there's the extremely useful Perapera-kun Firefox addon. It allows you to add words to a wordlist which you can then export as a text file along with definitions. Tada, more cards for your Anki deck.

Anki is awesome! I've been using it to learn Finnish (http://ankisrs.net/)

I like Anki as well but I had several issues running it on android. I got through the issues on my phone but I still can't use it on my Nexus 7. Also, I was looking at German vocab and some of the translations were in Spanish, while all the rest were English.

Thanks! This is great. I've been trying to use a method like this while learning Chinese and it's working pretty well.

This reminds me of an excellent article on Wired called 'Want to Remember Everything You'll Ever Learn? Surrender to This Algorithm' [1] in which Piotr Wozniak describes how he used the spacing effect to learn English, among other things.

In 1985 (!) he wrote SuperMemo, a piece of software that utilized this spacing effect:

"SuperMemo is based on the insight that there is an ideal moment to practice what you've learned. Practice too soon and you waste your time. Practice too late and you've forgotten the material and have to relearn it. The right time to practice is just at the moment you're about to forget. Unfortunately, this moment is different for every person and each bit of information."

The full article is a bit long but if you're interested in this stuff, it's well worth the read.

SuperMemo seems to have fallen behind as far as software goes, but there are great alternatives, like Anki [2] that use the same method.

1. http://www.wired.com/medtech/health/magazine/16-05/ff_woznia...

2. http://ankisrs.net/

I'm using Anki to learn Japanese, but since I started from scratch, I'm learning the words here first[0]. Start with a word from there, use an English-<Language> dictionary[1] and Google Translate together to disambiguate, then search on Google Images in <Language> to find an image of the term. Searching in the target language double checks that you have the correct term. After that, just practice in Anki.

The method I described is from a Lifehacker post earlier this year[2].

As an aside, Anki is excellent. In the past few months version 2.0, a new version deserving of a major version number, was released for Windows/Mac/Linux, and just recently for iOS. The corresponding Android version is a little behind but I imagine it will be out soon too.

[0] http://www.towerofbabelfish.com/Tower_of_Babelfish/Base_Voca... [1] http://www.csse.monash.edu.au/~jwb/cgi-bin/wwwjdic.cgi [2] http://lifehacker.com/5903288/i-learned-to-speak-four-langua...

Neat! I sympathize, trying to read books in Russian. Tracking progress with individual words reminds me of some of Michael Walmsley's work at the University of Waikato: http://www.cs.waikato.ac.nz/genquery.php?linklevel=4&lin...

His software actually picks things for you to read based on words that it knows you still need to learn.

Once you have a database of words that you know, you could use it to help select the next book to read (a book that does not have too large a percentage of words that you don't know). Several times I bought a book only to find out that the language was way too advanced for my level. Very frustrating.
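A rough sketch of how that selection could work, assuming you have a set of known words and a text sample from each candidate book. The function names, the naive tokenizer and the 5% default threshold are all placeholders of mine, not anything from the article.

```python
import re


def tokenize(text):
    """Lowercase the text and pull out runs of letters (naive; assumes a space-delimited language)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def unknown_ratio(text, known_words):
    """Fraction of running words in `text` that are not in `known_words`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    unknown = sum(1 for token in tokens if token not in known_words)
    return unknown / len(tokens)


def pick_books(books, known_words, max_unknown=0.05):
    """Return titles whose unknown-word ratio is at most `max_unknown`, easiest first.

    `books` maps a title to a text sample. The 5% default is an arbitrary
    placeholder, not a recommended threshold.
    """
    scored = [(unknown_ratio(sample, known_words), title) for title, sample in books.items()]
    return [title for ratio, title in sorted(scored) if ratio <= max_unknown]
```

This counts inflected forms as separate words, so for a heavily inflected language you would probably want to reduce words to their dictionary form first.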

Not to steal your thunder, but this is a pretty amazing story that may inspire your own tool creation:


Spaced repetition seems pretty powerful. It seems like doing this while reading a book might be more repetitive than spaced repetition techniques recommend, but on the other hand you also have ongoing repetition of some of the more common words as you read the chapter. Thanks for the interesting article!

I think Spaced Repetition is, in the context of languages, precisely a tool for the words you don't get enough contact with, or constant enough contact with, through reading, writing, listening and speaking. For words that you do, you don't really need it, since you'll memorize them from actual use/need.

Very cool!

At https://github.com/darius/spaced-out I tried to do something vaguely similar: from an aligned parallel corpus, automatically make a prioritized spaced-repetition deck for language learning. (I think I used Europarl.) So you get examples of the words in context, plus they're sorted with the most frequent ones first.
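To make the idea concrete, here is a small sketch of building such a deck. It is not the code in the repository linked above. It assumes the corpus is already available as a list of aligned `(target_sentence, translation)` pairs, and the function name and punctuation handling are my own simplifications.

```python
from collections import Counter


def build_deck(sentence_pairs):
    """Build cards from aligned (target_sentence, translation) pairs.

    Returns a list of (word, example_sentence, example_translation) tuples,
    most frequent words first. The first sentence in which a word appears
    is used as its example.
    """
    counts = Counter()
    first_example = {}
    for target, translation in sentence_pairs:
        for word in target.lower().split():
            word = word.strip(".,;:!?\"'()")  # crude punctuation removal
            if not word:
                continue
            counts[word] += 1
            first_example.setdefault(word, (target, translation))
    return [(word, *first_example[word]) for word, _ in counts.most_common()]
```

A real version would probably prefer short example sentences over simply taking the first one seen.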

(There's also an SM2-based flashcard reviewer in Python. It's all very crude; I decided I didn't want to learn Swedish enough.)
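For readers who haven't seen SM-2, the scheduling rule is short enough to sketch here. This follows the commonly published description of the algorithm as I understand it and is not taken from the repository above. The `Card` class and `review` function are hypothetical names, and implementations differ on details such as whether the new interval uses the easiness value from before or after the update.

```python
from dataclasses import dataclass


@dataclass
class Card:
    repetitions: int = 0      # consecutive successful reviews
    interval_days: int = 0    # days until the next review
    easiness: float = 2.5     # SM-2 "E-Factor", never allowed below 1.3


def review(card, quality):
    """Update `card` after a review graded 0 (complete blackout) to 5 (perfect recall)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality < 3:
        # Failed recall: restart the repetition sequence.
        card.repetitions = 0
        card.interval_days = 1
    else:
        if card.repetitions == 0:
            card.interval_days = 1
        elif card.repetitions == 1:
            card.interval_days = 6
        else:
            # This sketch uses the easiness from before this review's update.
            card.interval_days = round(card.interval_days * card.easiness)
        card.repetitions += 1
    # Adjust easiness from the grade, with a floor of 1.3.
    card.easiness = max(
        1.3,
        card.easiness + 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02),
    )
    return card
```

With perfect answers the intervals grow roughly geometrically (1 day, 6 days, then multiplied by the easiness each time), which is what lets a deck of thousands of cards stay manageable.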

This is being done, not with books, but with news stories, which for me is one step up from books-- I'm reading all this stuff anyways, may as well do it in Spanish... Only in Spanish-English right now, but the in-line translation is pretty good: http://www.nulu.com/

Thanks for the link, it's great! I see they are using the spacing algo in their flashcards...

I think the idea of the database that knows what words you know is kind of a fascinating idea in a way, in that it goes beyond personal information into personal knowledge, a digital reflection of your actual understanding which corrects itself continuously to reflect it more accurately.

Anyway, if this was a service I could actually use online, I most definitely would, and I might even pay for it (as I once did with smart.fm which became paid for a year or two ago). I'm currently learning German quite intensively, and anything that makes this highly laborious process (that of cramming new knowledge into my mind and trying to make it stick) more efficient is extremely useful to any language learner.

Quick tip for anyone on OS X learning foreign words on Memrise, or anywhere else you need to type with a different keyboard layout:

System Preferences -> Language and Text -> Input Sources. Select the languages you want in the list. Click "Keyboard Shortcuts" and enable whatever keyboard shortcut you desire to swap layouts.

ex: now I can press CMD+Space to switch keyboard layouts between Russian, French and Spanish.

Memrise.com does this, at least for Mandarin Chinese. I've been enjoying it.

Looks cool. I think also Quizlet.com deserves to be mentioned here -- helped me so much during high school.

Quizlet sucks. It's all fun and games until some assholes come along and start tagging their worthless sets with tags you're following. "The Trolls of Quizlet", really? This crap doesn't happen on Memrise, and the Memrise interface is not set up to make it terribly rewarding for people to try it.

I would highly recommend http://www.silinternational.com/lingualinks/LANGUAGELEARNING...

Interviews with successful language learners give a great set of language learning techniques and methodologies.

> The code is a mess, so I'll keep it to myself for now


It's not always about looking at code; some people could benefit from OP's solution right now.

I've been developing a similar system for accelerating my Japanese study, and I haven't been able to release it yet either. It has nothing to do with the quality of the code, it's that right now it's designed specifically for myself. It practically hardcodes my username and other details like that. Maybe other programmers could use it, but preparing such a package for general consumption takes lots more time and energy.

Same situation here as well. I started with eventual distribution in mind (yet another Japanese study app) but it's still a long way away from being what I'd consider releasable. Too many features coded in that I forced myself to try out and will eventually be modified, reconfigured, or whatever. I'll run out of excuses soon enough ;)

Totally. I would pay $10 for the code if it works for Kindles.

Off-topic: It's interesting that the word 'countenance' appeared at the top in his screenshot. That's the only word I still remember from my word-memorization spree while prepping for the GRE.

For those wondering: It means face (as in a person's face, or like the 'face' in Facebook). Countenance-book, anybody? :)

I tried something similar with the text of classic video games (I speak English, learning French), since things like Zelda often use archaic terms. It worked quite well, even if you can only get text from other games in the series rather than the one you're playing.

Nice work. For almost all learning, there is a better solution out there . . . it just takes individuals to go after a creative solution using their brain . . . which is really language/dialect independent! Bravo.

Nice app! I built myself a little CLI program a while back that would do some of the same things; this definitely has a much prettier interface though :) Also, this has a learning adventure, which is a good additional touch; I just used a feedback loop on my previous choices and what I had previously marked as "unsure".

I usually ended up being too lazy to actually use it, though. I know exactly what you're talking about with conversational vs. literary language: I use German every day at work but still don't get half of what Thomas Mann or Goethe are trying to say.

It is funny that many people seem to have this idea at the same time. Must be that the time has come for this.

A while back I wrote a bunch of Python scripts for myself that do the same thing for subtitles.
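In case it helps anyone trying the subtitle route, here is a guess at what the first step might look like. These are not the scripts mentioned above. The sketch assumes SubRip (.srt) input, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Matches SubRip timing lines such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3}\s*-->\s*\d{2}:\d{2}:\d{2},\d{3}")


def subtitle_text(srt):
    """Return only the spoken lines from the contents of a SubRip (.srt) file.

    Caveat: a spoken line consisting only of digits is mistaken for a cue
    number and skipped.
    """
    lines = []
    for line in srt.splitlines():
        line = line.strip()
        if not line or line.isdigit() or TIMING.match(line):
            continue  # skip blank lines, cue numbers and timing lines
        lines.append(re.sub(r"<[^>]+>", "", line))  # drop tags like <i>
    return lines


def word_counts(srt):
    """Count how often each word occurs in the subtitle text."""
    counts = Counter()
    for line in subtitle_text(srt):
        counts.update(re.findall(r"[^\W\d_]+", line.lower()))
    return counts
```

From there, `word_counts(...).most_common()` gives the words worth learning before watching the episode.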

Regarding your problem with Goethe and Mann: Maybe you shouldn't try to start with "the masters". I read books in various languages and have come to the conclusion that after a hard day at work I just can't read a Nobel Prize winner in a foreign language. What I can read, however, is a crime novel, or something funny (maybe even a comic book).

Here is a book that I would recommend if you want to read an important, well-known but easy German novel:

"Der Schatz im Silbersee" by Karl May: This is an escapist western written in the 19th century and actually one of the most read German novels of all time. Rest assured that every famous German you know (including Einstein and the bad one with the ridiculous beard) has read this in their youth.

You hit it right on the money with the Nobel Prize winners :) I have to (guiltily) admit reading a Krimi or bestseller is fun: you know the words, you're done after a couple of train rides, and it was usually exciting. Of course then you forget it, because the contents were basically nothing. I have liked Max Frisch, though, and Kafka is also surprisingly easy when you consider people study him. Also, thanks for the Karl May suggestion; I'll put it on my to-read list. The thing is, sometimes you want something to think about, and I know there are many great German authors and I'd like to be able to "get" them like I do Borges :)

Here's a tool that I wrote for practising French verb conjugation that also uses Mac OS X speech synthesis. https://github.com/euoia/ReVerb

This is great, because it integrates learning with something that you want to do anyway (reading), thus making it easier to come up with the motivation for learning.

Just a quick hint for anyone learning French: The Kindle reader comes with an inbuilt French dictionary. Hovering over a word will show its definition (in French).

Are there libraries that can turn voice into phonemes? It would be cool if I had to say a phrase and the computer scored me on how close I got to it.

Inspired by your idea, I created this web page: http://vocabulate.me

Incredible. Thank you so much!

Any thoughts on non-spaced languages such as Japanese (which I am learning) and Chinese (which I speak)?
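One simple approach for text without spaces is dictionary-based forward maximum matching: at each position, take the longest dictionary word that fits. The sketch below assumes you already have a word list for the language. The function name and the `max_word_length` default are mine, and this is only a baseline, not how any particular tool does it.

```python
def segment(text, vocabulary, max_word_length=4):
    """Split unspaced `text` into words by forward maximum matching.

    At each position the longest entry of `vocabulary` that matches is taken;
    characters with no match become single-character tokens.
    """
    tokens = []
    position = 0
    while position < len(text):
        for length in range(min(max_word_length, len(text) - position), 0, -1):
            candidate = text[position:position + length]
            if length == 1 or candidate in vocabulary:
                tokens.append(candidate)
                position += length
                break
    return tokens
```

The greedy choice can go wrong. With the vocabulary `{"祝", "你好", "好運"}`, the sentence 祝你好運 from upthread comes out as 祝 / 你好 / 運 rather than 祝 / 你 / 好運, which is why serious segmenters also use word frequencies or statistical models.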

Brilliant! I believe this is how Rosetta Stone is teaching as well.

No, but this is how Rosetta should be teaching language. Instead they hawk their "immersive" language learning. Immersion is great for children, but adults learn differently (I say this as someone who learned a foreign language as a child and two as an adult). It's not sexy, and it's not pretty, but to build a foundation for learning a language as an adult, rote memorization of vocabulary is by far the most effective route in the beginning. Once you have a good base (say 500 of the most common words), you can start learning grammar and then, slowly, start to practice speaking, reading and writing. But continuing to build up a vocabulary is vital to continuing to learn the language. Once you reach about 1000 of the most common words, things will start falling into place if you have adequate exposure to the language. In my opinion, it's only at this point that Rosetta Stone becomes worth using, and then only if you don't have access to friends, family, or TV channels in the language you're learning.

From my experience, Rosetta Stone does not work. I've also found, from reading about other people's negative experiences with Rosetta Stone, that I'm not alone in this. It's a nice method in theory, but in practice it fails. Here is one well-written argument for why it doesn't work: http://language101.com/reviews/rosetta-stone/

Does anyone know the most common 500 words? Is it language neutral or does it vary from language to language? I definitely agree that learning the 500 or so most common words is a great start - even if some of the words are conjugated verbs and you don't have the background to understand the conjugation.

It varies from language to language, but those word lists are all around. https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists is a good place to start.

I am not a believer in rote word memorization, but rather putting each word in a sentence and using the sentence as the memorization route. Rote memorization stores the word in a different part of the brain than the language center, and using the sentence as the unit of flash card seems to override this.

I still see use for flash cards to re-trigger words in memory, but not for actually learning new words. I can't tell you how much I struggled to de-link certain words because I "memorized" them at the same time and so mixed up the meanings of the two words.

You can use the Collins API instead of that crappy synthesizer! ;)

well done, man, well done. It warms my heart to see people who understood what computers are for.

It's a wonderful idea to memorise some important words.
