Language Learning with Netflix (languagelearningwithnetflix.com)
216 points by impoppy on May 6, 2023 | 103 comments



I don't know if this extension fixes it, but I've found learning a language through Netflix difficult because the subtitles and dialog don't match, for either the dubbed or the original audio.


I find YouTube to be vastly better for language learning. Good subtitles are surprisingly easy to come by, the variety of content is much greater, and you can also download the transcripts for offline study. And, perhaps even better yet, the videos tend to be short. Repetition is super important for rapid progress, and it's much easier for me to watch a 10 minute video two or three times than it is something that's 30 minutes or longer.

If you're willing to shell out some money, YouTube + LingQ (which has a plugin for automatically ripping audio+transcripts into lessons) is so effective it's almost like cheating.


YouTube appears to be using ML STT subtitles most of the time, and it tends to trip over simple things like homonyms and at worst throws up its hands and just skips over difficult (noisy, cross-talk, etc.) segments. I say this as someone taking a 2nd language course where we'll watch a video together and do a worksheet and class discussion in that language after. Sometimes the instructor's reaction at the end will be "wow those subtitles were bad!"

Edit for clarity: the subtitles are in the foreign language, not English so it's not an issue of machine translation.


There's a lot of inconsistency. Channels like Tom Scott have remarkably good subtitles (done by a team of humans), with a different color for each speaker, closed captions, and even colorful descriptions of sounds.

Auto-generated subtitles work a lot better for video essays and podcasts than for TV-style content (and it's really inconsistent: Spanish <-> English works surprisingly well as long as there aren't crazy accents in the way).


[Music]


This plugin (Language Reactor) works with YouTube as well; you should try it. + Target and native captions in parallel. + Translate any word on hover (more on click). + Hotkeys to repeat, move to previous/next segment. + Pronounce single word. - Moderate downside: it's not possible to select and translate multiple words. I hardly use LingQ now that I have this plugin; it's more immersive and lets you focus even more on listening (which is good).


Really? Most transcripts and translations I encounter are so bad. Do you mean specific videos or channels?


Right. It's not that most channels have good hand-edited subtitles; it's that it's not too hard to find channels that have them. Usually once you find an interesting video with subtitles, most of the other videos on that channel will also have them, so that can easily be tens or hundreds of hours of subtitled content to work with.

It gets even easier if you set up a separate account that's dedicated to your target language so the algorithm's not just feeding you endless content in your primary language.


To this day I can't understand YouTube's asinine decision to not let me deactivate autotranslation of titles, and the fact that it isn't consistently triggered makes it so much worse.

Their multilingual UX is terrible


> I don't know if this extension fixes it, but I've found learning a language through Netflix difficult because the subtitles and dialog don't match, for either the dubbed or the original audio.

100 000 times this. I don't understand why it's like that but they simply often don't match. And it's not some automated translation that went wrong: it's as if the subtitles didn't match exactly the final "script". They don't match but the subtitles are still totally correct. Sometimes the sentences are formulated differently.

It's honestly both a mystery and a gigantic WTF for me. Are these only meant for deaf people? And how did they manage to get "correct but non-matching" subtitles?


They typically hire two different companies to do the translations, and the translations are optimized for different goals. Subtitles are just meant to be easy to read. With dubs they try to make what's being said at least vaguely line up with what the actors' lips are doing in an effort to avoid the infamous "1970s kung fu movie" effect.


The English subtitles for Italian shows on Netflix are so bad. They just mistranslate words or sentences for some reason.


Afaik, they are done by two different teams.

Plus, dubbing is sorta kinda trying to match the length of time actors need to say stuff. You can't have sound going while the actors' mouths are not moving at all. Nor the opposite: the translated line is done and the actor's mouth is still moving. And those movements can't look completely odd. Written subtitles have no such limitations, which results in a different translation.


Good insight. In this case, why don't they display the dubbing text when dubbed audio is playing?


Call me strange, I actually like the effect. I feel that parallel translations can provide a richer context of what's being said than a single translation. For example, idiomatic phrases are frequently split where the text will provide the meaning of the idiom and the speech will transliterate the words. The cultural exposure feels richer to me.


Interesting. I watch everything with English subtitles/cc and get irritated when the subtitles/cc don't match what is being said. But maybe - as you said - I am a minority.


If I had to guess, it just does not exist in subtitle form; no one ever added timing information to the dubbing translation. Otherwise you would have it in the subtitle options alongside CC.

Some shows have two versions of subtitles available, one with CC and the other without. Likely, the majority of consumers are not specifically learning a language and are just watching the show, and normal subtitles are superior in that case.


I always assumed the opposite: the translated subtitles reflect exactly what the script says, while the actor may have remembered an approximation of the exact line, which is normally good enough not to bother with another take.


I would assume it's the same as book translations: the point isn't to translate it directly but in a way that makes sense in the target language. Although maybe a lot of subtitles for lesser TV and movies don't have a lot of human input and the handler just goes with the software's suggestion a lot of the time.


Exactly. Japanese translations to English almost never match sentence-by-sentence, but then again a direct translation wouldn't be what an English-speaking person would say anyway.


Sometimes the subtitles leave out filler words, I presume to shorten the subtitle.


They translate it idiomatically (which sometimes means completely different sentences) and are constrained by length. They might also start from a voice translation that tries to match the lips.


Language learning via hearing comprehension of content not produced in the target language is almost impossible, because the subtitles never match.

However, there's a difference between CC (closed captions) and subtitles, with the former being the verbatim representation (including sfx, music etc.) in my experience.

I already commented [0] on this 2 years ago.

[0] https://news.ycombinator.com/item?id=27420959#27435311


Correct, and you can find CCs more likely on movies and shows that were shot in the respective language itself. For example, the stuff from https://www.netflix.com/browse/genre/100396 is much more likely to have 100% accurate captions if your goal is to learn Spanish


I found dubbed shows significantly easier to listen to than native shows. It is actually easier to learn from those than from native shows. Dubbing is almost always better pronounced and less mixed with background sounds.

Also, the claim that it is impossible to learn if you don't have perfect CC subtitles in the target language is absurd. You can use subtitles in your own language to get the meaning.


CC is also space constrained so won’t match word for word.

Subtitles and CC are not transcripts.
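
One rough way to put a number on how far a subtitle line drifts from the spoken line is a word-level similarity ratio. The following is a minimal sketch using Python's standard `difflib`; the function names and the two example lines are made up for illustration, and a real comparison would need a verbatim transcript from somewhere.

```python
import difflib
import re


def words(line: str) -> list[str]:
    """Lowercase a line and split it into word tokens, dropping punctuation."""
    return re.findall(r"\w+", line.lower())


def word_match_ratio(subtitle: str, spoken: str) -> float:
    """Return a similarity between 0 and 1, comparing the lines word by word."""
    matcher = difflib.SequenceMatcher(None, words(subtitle), words(spoken))
    return matcher.ratio()


# Hypothetical example: the subtitle drops filler words to save space.
spoken = "Well, you know, I really don't think that is a good idea."
subtitle = "I don't think that's a good idea."

print(word_match_ratio(spoken, spoken))    # identical lines give 1.0
print(word_match_ratio(subtitle, spoken))  # somewhere below 1.0
```

A low ratio does not mean the subtitle is wrong, only that it is not a transcript, which is the distinction being made here.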


It still helped me because even though it didn't match 100%, it at least gave me an idea of what the original dialog was about. Then I could derive the content from it. And it was fun to figure out the differences between what was written and what was said.


Toucan is better for language learning. It replaces words in a page with your target language.

https://jointoucan.com/


Alas, Toucan is a security nightmare (or was when I tried it last year): it was sending all the URLs I was visiting to the server, even localhost stuff I was running on my machine. I checked that by using a local proxy and looking at all requests made to the Toucan servers.

While I love the idea, I don't trust the company with my complete surfing history. What they should have done is have me opt in to each website that I want to use Toucan on, and not do anything if I visit others.
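
The per-site opt-in described above comes down to a host check that runs locally before anything is sent anywhere. Here is a minimal sketch of that logic in Python (a real browser extension would be written in JavaScript); the allowlist contents and the name `should_process` are hypothetical, not anything Toucan actually ships.

```python
from urllib.parse import urlsplit

# Hypothetical per-user allowlist: only these hosts were explicitly opted in.
ALLOWED_HOSTS = frozenset({"en.wikipedia.org", "news.ycombinator.com"})


def should_process(url: str, allowed_hosts: frozenset[str] = ALLOWED_HOSTS) -> bool:
    """Return True only if the page's host was explicitly opted in."""
    host = urlsplit(url).hostname  # None when the URL has no network location
    return host is not None and host in allowed_hosts


print(should_process("https://en.wikipedia.org/wiki/Toucan"))  # True
print(should_process("http://localhost:3000/admin"))           # False
print(should_process("https://bank.example/account"))          # False
```

With a check like this, URLs for sites that were never opted in (including localhost) would never need to leave the machine.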


Crazy idea, if local LLMs are good enough to translate languages reliably, an open source extension that translates every page you visit would be so incredibly useful. You don't even have to change your habits, just carry on like normal, but while becoming a language sponge.

I guess you can already do this with Chrome's built-in translation, but that built-in translation leaves a lot to be desired, doesn't it?


Hm, I really don't get Toucan.

So it doesn't appear to even attempt to show the grammar of the target language, including the very basics, such as the word order.

Even for vocabulary acquisition, how is one going to learn noun classes (genders), case endings and articles, things like German separable verbs etc?


Is that the difference between subtitles and closed captioning?

If I recall, one is made from the original script, one is typed up from the actually spoken audio.


I think language study just got an overpowered AI teacher. This works more or less for any pair of languages.

I am using GPT4 to reformat text from English to Japanese in easy reading mode. It is very good for language study using topics of interest.

> 私は (Watashi wa) [I am] GPT4を使って (GPT4 o tsukatte) [using GPT4] 英語から (Eigo kara) [from English] 日本語へ (Nihongo e) [to Japanese] 簡単な読み物 (Kantan na yomimono) [easy reading mode] に変換します。 (ni henkan shimasu) [to reformat] それは (Sore wa) [It is] 興味深いトピック (Kyoumi bukai topikku) [interesting topics] を使って (o tsukatte) [using] 言語学習 (Gengo gakushuu) [language study] にとても良い (ni totemo yoi) [very good] です。 (desu) [is]

Same, but in German:

> Ich benutze (I am using) [ikh benoot-se] GPT4 (GPT4) [ge-pe-te-fear] um Text (to reformat text) [oom tekst] aus Englisch (from English) [aus engl-ish] zu Japanisch (to Japanese) [tsoo yap-an-ish] in einfachem Lesemodus (in easy reading mode) [in ine-fakh-em leh-se-moh-dus] umzuformatieren (to reformat) [oom-tsoo-for-ma-teer-en]. Es ist sehr (It is very) [es ist zehr] gut für (good for) [goot fuhr] Sprachstudium (language study) [shprakh-shtoo-dee-oom] mit interessanten (using interesting) [mit int-er-es-sant-en] Themen (topics) [tay-men].

The prompt I used:

Create a Japanese easy reading mode version of the given English, breaking it into 2-4 word chunks, providing romaji and English translations in brackets for each phrase. This is intended for language study purposes.

~~

Of course this is just a reader prompt, we could also have chat mode, asking clarifying questions, asking for more examples of a phrase, generate quizzes, etc.

I am at this weird point where I know phonetically much more than I can read. This formatting helps a lot because you get to see the Kanji first, then you use romaji and English only when necessary. Being different scripts helps separate them visually so as not to read the romaji before I want to.
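
The reader prompt quoted above can be turned into a small template so the target language and the reading aid are parameters. This is only a sketch: the names `PROMPT_TEMPLATE` and `build_reader_prompt` are made up, the sample sentence is invented, and how the resulting string is sent to a model is deliberately left out.

```python
# The wording follows the prompt quoted in the comment; only the language
# and the reading aid (romaji, in the original) are turned into placeholders.
PROMPT_TEMPLATE = (
    "Create a {language} easy reading mode version of the given English, "
    "breaking it into 2-4 word chunks, providing {reading} and English "
    "translations in brackets for each phrase. "
    "This is intended for language study purposes."
)


def build_reader_prompt(language: str, reading: str, text: str) -> str:
    """Fill in the template and append the English text to be reformatted."""
    instructions = PROMPT_TEMPLATE.format(language=language, reading=reading)
    return f"{instructions}\n\n{text}"


prompt = build_reader_prompt("Japanese", "romaji", "I am reading a book about trains.")
print(prompt)
```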


Do you usually use GPT4 to translate from English, or was that just for the example in your comment? Because the translated output highlights the major problem of this learning-through-AI approach: The generated output can just simply be wrong, like it is here (both the Japanese and German).


I think it is good enough to "break into" the text. It's not the most literary translation but you could just start from a Japanese text if that's what you wanted. I went for modding the English text I am currently reading as a language exercise.


The concern is the reader has zero idea the translation they're reading is wrong


As long as learners are aware, I agree. Getting you to the point where you can start reading "regular" texts in a language is worth it even if some of what you think you know at the point you get there is wrong.


The Ich -> ikh transliteration is wrong. Of course there are a bunch of German phonetics which you can't translate into direct English transliterations because the sound inventory is different.

That being said, GPT is still pretty powerful for language learning but you really have to verify more than you trust.


and "ish" would do instead of "ikh", just to also offer the solution to the riddle.


I guess "ish" is closer than "ikh" but still not correct.


The 'ch' sound doesn't exist in isolation in English, just as the Sean Connery 'sh' doesn't exist in German. (That is, ch as in ich or nicht, not ch as in ach.)

Very close to the ich-ch would be the ch from much, just without the t in front of it, mut-sh, but not quite either.

All languages have sounds that simply don't exist in other languages, and some of them are hard to learn after a certain age. My favorites were the glottal plosives from Arabic.

But thanks for the pointer; yes, ish too is only an approximation.


The beginning of the word "cute" is about as close as you can get to the ch in "ich" in English.


The closest English equivalent I can think of is the first sound in the word "huge" if it's pronounced with a lot of exaggeration. Even that's not a perfect match.


No. Absolutely not.

German 'ch' in ich is a fricative, like sh in fish, or the chute in parachute.

cute starts with a plosive, a click/plop sound, like check, TikTok, curfew etc.

edit: I'm talking about the German "ich" meaning I, self-reference, not about how to read that word as if it was an English word, rhyming with ick(y); there you're of course correct.


Check out the browser extension Yomichan. It does something very similar.


I just need a slick UI / Product to adopt AI now. C'mon Duolingo



Hello. Me and Ognjen made this, the main site is now: https://languagereactor.com/

Audio and subs usually don't match for dubbed audio tracks (it's usually ok if you watch a French movie with French subs etc.). I have code that processes the Netflix audio with Whisper ASR, it works very well. Hopefully Netflix don't mind us adding this (my email is in my profile), I think we'll bring this online in a couple of weeks. If any other video provider is interested in having Language Reactor support their site, mail me.

Also, we're rolling out a 'virtual conversation partner' starting today: https://forum.languagelearningwithnetflix.com/t/chat-feature...


Prompted by this post I tried language reactor and have a few notes:

Can you really only sign in with google?

I tried the phrase thingy. I don’t get it. It asks me about a word and shows me a sentence and a translation of the sentence. What’s the point here? Shouldn’t the translation be hidden so I have a chance to translate it myself?

There’s a lot of buttons I can click and I don’t get what they do. Shouldn’t I just choose “I understood”, “I didn’t understand” and perhaps a “don’t show this one again”?

How do I report that a sentence doesn’t have the word that the app suggests I should learn?

How do I report typos in sentences?


>> Can you really only sign in with google?

We've been planning to change auth lib before enabling other login methods, haven't gotten around to it.

>> I tried the phrase thingy. I don’t get it.

Probably the best info is here, should improve situation: https://forum.languagelearningwithnetflix.com/t/update-learn...

>> How do I report that a sentence doesn’t have the word that the app suggests I should learn?

Some words are lemmatised in unexpected ways (el / se are the same lemma). Sometimes the lemmatisation is wrong, despite using sota libs. There's more work to do.

>> How do I report typos in sentences?

It's not implemented. I'm not aware of the typo issue.

Useful feedback. The project is still a work-in-progress.


I tried it out for Chinese. Why does it only look up single characters instead of words, which are often composed of multiple characters? Also, some of the pronunciations are wrong. For example: 乾净 gan1jing4, 乾 showed up as qian2, which is wrong.
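
The difference between character lookup and word lookup can be shown with a greedy longest-match segmenter. This is a toy sketch, not how Language Reactor works: the four-entry `LEXICON` is made up for illustration, and a real tool would use a full dictionary and a proper segmentation library.

```python
# Toy lexicon, made up for illustration: entry -> pinyin with tone numbers.
LEXICON = {
    "很": "hen3",
    "乾": "qian2",         # the reading reported for the lone character
    "净": "jing4",
    "乾净": "gan1 jing4",  # the word-level entry carries the intended reading
}
MAX_WORD_LEN = max(len(entry) for entry in LEXICON)


def segment(text: str) -> list[str]:
    """Greedy forward longest-match segmentation against LEXICON."""
    result = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for length in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in LEXICON or length == 1:
                result.append(candidate)
                i += length
                break
    return result


def lookup(text: str) -> list[tuple[str, str]]:
    """Pair each segmented word with its reading, or '?' if unknown."""
    return [(word, LEXICON.get(word, "?")) for word in segment(text)]


print(lookup("很乾净"))  # [('很', 'hen3'), ('乾净', 'gan1 jing4')]
```

Looking up 乾 on its own would return qian2 here, while the two-character match returns the reading for the whole word.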


are you planning to migrate the virtual conversation partner to languagereactor.com? Confusing with the "Netflix" part


The forum runs on the old domain still, the chatbot is on the new page (languagereactor.com).


I'd love to see video with subtitles in both the source language and the target language, with edges between the corresponding terms.

(Not perfect of course, translations never are, but for me (at least) it would ease understanding.)


Trying to learn French I exported all the subtitles for an episode of a French tv show ‘Lupin’ and then worked my way through reading it first and then watched it but unfortunately felt no easier. Probably because reading it once through really isn’t enough.

Would be fun if there was an entire 10 week course that worked up to an episode of real TV, so that by the time you get to it you can follow it fluently.


There's a podcast I follow occasionally; the presenter does a brief news-style interest story on a random subject. The first read through is sometimes incomprehensible (to me) but then he breaks down each phrase, and explains them (in French mostly). At the end he reads the piece in full again, and it's mostly comprehensible ... but that's a 2 minute piece of speech. Trying to do that for a whole movie would be way too much for me.

Perhaps that would suit you too:

Learn French with daily podcasts https://www.chosesasavoir.com

RSS address: https://feeds.megaphone.fm/FODL4957050068

I use AntennaPod installed with F-Droid, far and away the best podcast player I've found.

(No affiliations or associations, just what I've found useful)


You might be interested in RFI's "easy" French news: https://francaisfacile.rfi.fr/fr/podcasts/journal-en-fran%C3...

They have articles and audio+transcript news if I remember correctly. I'm French myself but I found a lot of good opportunities to learn things organically in other languages when following the news (in Japanese for me that was NHK Easy: https://www3.nhk.or.jp/news/easy/ )

Good luck with your learning!


Just a quick note of thanks for your serendipitous comment - as someone who wants to improve their French, uses Android, likes RSS and wants a non-shit podcast app preferably with free software principles, this has pretty much nailed it


That's great!


For Spanish, Destinos is something like that. It's a reasonably interesting telenovela that starts out with beginner-level language and quickly works up to more interesting dialogue.

https://www.learner.org/series/destinos-an-introduction-to-s...


French is such a nightmare to listen to. Even when I know every word I often still can't make them out.

I understand spoken Spanish and German much better despite far less education. I have put a lot of work into French and it's very frustrating.


What worked for me was to read a scene in advance and then watch that scene. Or watch with subtitles and then again without them. Whole show is too long.


I think that's called translinear translation. I like the idea, but hard to find the content. I never thought to look for film though. I wonder if there is a site where you can download movie subtitles.


I think you mean "interlinear" translation, also known as a "gloss":

https://en.m.wikipedia.org/wiki/Interlinear_gloss

In particular, see the Taiwanese example in the Structure section.


You'd want something more powerful than subtitles - otherwise you get n^2 scaling with the number of languages.
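
The arithmetic behind the n^2 remark, as a small sketch. The pivot scheme below is just one assumed alternative (every language aligned to a single shared reference); the comment itself does not say what the more powerful representation would be.

```python
def pairwise_alignments(n: int) -> int:
    """Alignment files needed if every unordered language pair gets its own."""
    return n * (n - 1) // 2


def pivot_alignments(n: int) -> int:
    """Alignment files needed if each language is aligned to one pivot only."""
    return max(n - 1, 0)


for n in (2, 5, 30):
    print(n, pairwise_alignments(n), pivot_alignments(n))
# 2 1 1
# 5 10 4
# 30 435 29
```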


See my other post


This extension scratches that exact itch — having audio in German and captions both in German and English for example


This might actually be a useful application for these language bots.


Anyone got good Mandarin content to recommend? Have a goal to be fluent by 40...

Am watching Three Body but the dialogue is sometimes a little tricky to follow.

When I think about how I assimilated Japanese from anime, the reason it's so doable (IMO) is that the language used is reasonably straightforward and they actually talk quite slowly. I noticed this once I started watching movies in Japanese and found it harder to pick out what people were saying.


Seems like a good time to plug my project https://lazybug.ai; perhaps it could help a bit with Three-Body. It has interactive subtitles for that show in particular (https://lazybug.ai/player/threebodytencent/1/1) and many others that I've extracted with purpose-built OCR. It's 100% free and open source, runs client-side but with free server syncing of your data, with the obligatory Anki export.

Most of the shows on the list are ones I've seen recommended by others at some point, so if Three-Body is too tricky perhaps you could find some inspiration there.


I live in Taiwan, I've been learning Mandarin for 8 years and use it daily.

I kind of can't stand the mainland accent, and I can't understand simplified characters. You may have an easier time finding interesting content if you are learning mainland Chinese.

The big problem with Mandarin is that there's a content drought. I ask all my friends here what TV shows, games, YouTubers, movies they recommend. It's kind of hard to find good content in Mandarin.

Off the top of my head, though, there are a few really good ones. Here in Taiwan, though, the best stuff is either mixed with Taiwanese (Southern Hokkien) or purely in Taiwanese. That's a bit inaccessible for L2 folks such as myself, but I'm doing the best I can.

If you're not interested in Taiwanese content, sorry if these are of no use to you.

In no order: https://en.wikipedia.org/wiki/On_Children https://en.wikipedia.org/wiki/Marry_My_Dead_Body https://zh.wikipedia.org/wiki/%E6%A4%8D%E5%8A%87%E5%A0%B4%E2... (Taiwanese heavy, maybe 100% Taiwanese) https://en.wikipedia.org/wiki/Man_in_Love_(2021_film) (I think 100% Taiwanese, sorry!)

I think the content/culture drought is one of the most difficult factors in learning Chinese, personally. I used to study Japanese very seriously, and there was always a never-ending stream of amazing anime, games, TV shows, music, you name it.

I will say, though, if you're into music, the Taiwan indie music scene is world-class. Really good stuff.


There's lots more good stuff if you go further back in time. I haven't watched many Taiwanese shows, but I remember https://zh.wikipedia.org/zh-tw/%E5%BE%AE%E7%AC%91PASTA from 2006 as quite entertaining – I watched it via https://www.viki.com/tv/1775c-smiling-pasta


I know enough Hokkien and Teochew that Taiwanese won't sound completely alien. I'll add it to the list haha - thanks!


> hard to find good content in Mandarin.

what does this mean? aren't almost all the shows in Mandarin?


It means it all sucks.


Here's what I tried and recommend:

https://www.youtube.com/c/DANLIAOFreeToLearn - "Free To Learn Chinese", teacher uses a "natural method" where all of the content is in Chinese and most difficult words are paraphrased (HSK1-6)

https://www.youtube.com/@ShuoshuoChinese - similar but HSK1-4 and parts of the contents are in Chinese


Here's my list, the three stars indicate the rating on douban.com: https://www.jiong3.com/gradedwatching/

I would personally recommend: Secret, On Children, A Sun, Light the Night, Office Girls, Scissor Seven, Ancient Detective, Back to 1989


Many Disney movies have great Mandarin dubs on Disney+.


That’s very recent. Six months or so ago they had no mandarin dubs on Disney+ at all. We were pleasantly surprised to recently find that they added Chinese dubs for most of their content, even Encanto’s songs are redone.

Netflix is hit or miss. You can also just sub to various Chinese TV services.


Aside from WorkAudioBook I consider this the greatest language learning tool to exist.

I watched Breaking Bad dubbed in Spanish (the subs don't match), pausing at every sentence to try to guess what they were saying, and did tons of lookups. This took 600 hours. Yes, about ten hours per episode. By the end I was ready to find a 100% MONOLINGUAL teacher and communicate with her through the whole of every class (incredible feeling btw) to learn things formally. Learning becomes incredibly exciting and productive when you're ready to go full monolingual with it.

((Was I a false beginner? No. I wasn't primed in school, I had about 1 month Spanish in 9th grade before being forcibly expelled from the class with an F. I was definitely racist against the language age 13-17. I found the teacher playing and making us sing Macarena especially distasteful and that was part of what made me hateful.))


you could literally speak to a human speaker over video chat? there are very affordable services like this


I'm confused, what does video chat have to do with anything? this all took place in Colombia


All I want is Star Trek: The Next Generation in LATAM Spanish. I practically know that series by heart.


If you have a Roku or Android TV box you should check out Pluto TV. It's a free live TV streaming app. There is a channel on there that is only Star Trek in Spanish. Personally I've never seen anything on that channel that wasn't TNG.


My friend's startup is similar but for YouTube videos: polyglatte.com


It supports youtube and other platforms

https://www.languagereactor.com/


Chrome states that the new version of Language Reactor has been disabled, because the extension can read and change data on all Amazon sites.


Happy to see my favorite extension on HackerNews! I have been using this extension to learn French, and it works quite well for me.


Migaku is a paid service that does this, and then lets you record individual segments into Anki flash cards, and works for a variety of video services. Awesome!

https://www.migaku.io/


Love this on the desktop, but is there a way to use it on an iPad, where I watch most movies? Or an app that will show subtitles in a different language than the spoken track?


Great idea, but I wish this supported Greek.

With new LLM tools I wonder if it would be possible to export the subtitles from Netflix and then translate them to the target language.


I’ve found Language Reactor supports Greek for the limited choices Netflix provided. Also if you haven’t used it, Language Transfer is a phenomenal tool for learning Greek. I’m open to any suggestions you may have!


Not only possible, but likely trivial.


Well, it likely depends on how hard it is to actually extract the subtitles from the Netflix stream, since I know there's DRM on their videos.


I tried "Language Reactor" (Chinese) and it was full of errors and only looked up words as one character, which is borderline useless.


I recently switched to https://www.trancy.org/


Thanks for sharing. It looks very cool. May I ask which Trancy tools are you using and how is it going?

I just gave it a quick try and wasn't impressed by the current state of things.

1a. The Language Reactor-style plugin. On YouTube, it seems to require that I disable AdBlock on the channels I want to watch. Why is that even a requirement?! I didn't like the idea and gave up.

1b. On Netflix, the first film I tried (Weißbier im Blut) failed with "no machine subtitles" or something to that effect. The show does have subtitles, and LR works, so I am not sure what the problem is.

2. I asked it to proofread a paragraph from "Die Schatzinsel". What I got back was a mix of German, English and gibberish (truncated for brevity):

<trancy>

At dabei first ru,h Iig always weiter thought ra "uchthete dead. man Der's Kap chestit"än st mustarr bete the ihn same eine large We suitcaseile in the an front, room dann, sch andlug in er my wieder nightmares mit der, Fa I hadust associated auf den this T thought withisch the, one st-leggedarr sailorte. sch Butär by nowfer und, br weach had long schließlich since stopped mit paying einem sche any attentionuß tolichen the, lyrics geme ofinen the Fl songuch, in and die on this W eveningorte aus,: it " wasR onlyuhe new, to Ihr Dr da. dr Livesüayben,."

andScore it seemed: to9 make/ no10 good

impressionIncorrect on him words because and he phrases: looked None quite

annoyedSuggestions before: continuing None his

conversationPol withishing the:

old- gardener Anfang Taylors about d aachte new rhe ich immerumat,ism....

</trancy>


I only watch Netflix on my Apple TV. How can I use this? Will it work if I Airplay from my MacBook to my TV?


Tried it. Didn't really work for me much.

Taking a slower approach using Youtube worked better.



But I watch my Netflix primarily on the TV :(


I found you could use a Steam Deck + dock to output to a TV, then use a cheap wireless keyboard such as a Rii to control the setup from your couch. Then you can use the Language Reactor plugin in Chrome to watch Netflix on your TV! You can substitute any PC with HDMI out for the Steam Deck and dock.


also worth saying that this site has a couple of other tools that are cool. my favourite was phrase pump, but I got lazy



