Show HN: Building an open-source language-learning platform (librelingo.app)
159 points by kantord 4 months ago | 68 comments

Hi kantord, congratulations on the site! I tried something similar about 10 years ago, called wikibabel. I then joined with another site called wikiotics.org. We were very focused on libre everything. In the end, we gave up after several years, but our content is still up and we've been looking for a new home for it. Please take a look and see if any of it would be useful for you (we had built a bunch of cards very similar to your demo lesson). We had stuff in a few dozen languages, including audio recordings (mostly of low quality). I've moved on, but I still love the goal, and would be happy to share some feedback on the experience if you want (contact on my profile page).

In short:

- focus on what the user wants, not the tech.

- is English->Spanish the best language pair to start with? You're directly competing with the giants like Duolingo. I tried to learn Burmese a few years back, I would have loved to find something like this for that niche.

Good luck!

Cantonese also has a ton of speakers across the world, and the best I could find a few years back was the FSI tapes

I would also love Cantonese in a Duolingo style, please!

Can't agree more. It is amazing how few resources there are for learning this amazing language.

That's a pretty neat site (Wikiotics, I mean - wikibabel appears to go to multiple places).

As someone who is learning Spanish at the moment, here is some feedback.

- I think a single click is sufficient to choose the correct answer and submit it; alternatively let me double click to choose and submit.

- I don't think a "Continue" step is necessary

- Without selecting an answer and pressing Back on the browser, the UI freaks out for a moment (Firefox 72.0.2)

- I love cheatsheets, so having a section for these would be awesome (finding good, consistent cheatsheets on the web is hard)

It is understandably a little sparse at the moment. Take a look at an app called Memrise - I think the style of its exercises would fit well here. I particularly like the "Fast Review" exercise which lets me do a speed review of all the words and phrases I have learned so far.

As others have mentioned already, it would be nice to be able to mark words you already know and stop them from showing. Additionally, it would help to be able to mark words you find difficult so that they appear more frequently.

I will keep an eye on this project; good work so far!
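
The "mark as known / mark as difficult" idea above could be sketched roughly as weighted sampling. This is only an illustration and not how LibreLingo works: the mark names, the `WEIGHTS` values and the `pick_words` helper are all assumptions made up for the example.

```python
import random
from typing import Dict, List, Optional

# Hypothetical weights: known words never appear, difficult words are
# drawn about three times as often as unmarked ones.
WEIGHTS = {"known": 0.0, "normal": 1.0, "difficult": 3.0}


def pick_words(
    marks: Dict[str, str], k: int, rng: Optional[random.Random] = None
) -> List[str]:
    """Draw k words for a session, respecting the learner's marks."""
    rng = rng or random.Random()
    # Drop words with zero weight so they can never be drawn.
    candidates = [word for word, mark in marks.items() if WEIGHTS[mark] > 0]
    if not candidates:
        return []
    weights = [WEIGHTS[marks[word]] for word in candidates]
    # random.choices samples with replacement, so repeats are possible.
    return rng.choices(candidates, weights=weights, k=k)


marks = {"hola": "known", "oso": "normal", "murciélago": "difficult"}
session = pick_words(marks, k=10)
```

Sampling with replacement means a difficult word can show up several times in one session, which may or may not be what a learner wants; a real scheduler would probably cap repeats.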

I built a cheatsheet for my iOS language app for Spanish:


I’ve also created some Spanish Word Search PDFs. Most are on GitHub:


I was going to create a little book of 20 Spanish Word Search puzzles.

What’s your idea for Spanish Language cheatsheets? I’ve tossed around a few ideas myself.

Verbs, ser/estar, por/para, saber/conocer

I find grammar cheatsheets quite useful, for example:


Anything that explains grammatical rules and gives me a "template" of sorts is great because it doesn't give me the answer straight away but it engages my brain to adapt the word I want to the given rule. Topics like tenses, conjugation, pronouns are what I like having cheatsheets for.

Just tried it and love the idea. Agree with the comments here. Single click without needing to "submit" the answer is more than enough and then skip the "continue" step for correct answers. This would make it so much smoother and faster to use.

> I love cheatsheets, so having a section for these would be awesome (finding good, consistent cheatsheets on the web is hard)

Curious, do you mean for just languages, or in general?

I was recently thinking that a site with user-uploaded cheat sheets for all sorts of subjects would be pretty rad.

Neat idea! I mostly like them for languages (spoken, not programming) but I can see it being great for programming languages too. I guess there are loads of topics that could fit into this.

Do you envision that your site will enforce a particular layout(s) and style for the cheatsheets? Or will it just be more of a catalogue?

Enforcing a style/layouts is an interesting idea. Having that kind of consistency could be a good thing. Maybe having composable templates and a style guide would work.

I was thinking that it could be a catalogue with broad categories that would allow people to surf through for cheat sheets they might want to read, but there would also be room for more esoteric subjects.

For example, this cheat sheet on yeast for fermentation that I have bookmarked deserves a better format: https://www.reddit.com/r/firewater/comments/b1hu1h/my_genera...

Also, it'd be cool if it could be a platform where people can sell their cheat sheets. People could post free ones, too, and would be encouraged to do so because it'd be made easy for them to make an awesome infographic-y cheat sheet, but being able to make some money would incentivize quality too.

I agree with your points, controlling the app entirely with the keyboard seems pretty nice though

When my wife and I first met, we tried to build a language learning app (don't sign up, we stopped working on it) - https://llip.io/landing/

To get users, we started a meetup group to teach people Korean. https://www.meetup.com/San-Jose-Korean-Language-and-Culture-...

We stopped working on the app, but continued with the meetup group (3 years now and almost 1000 members).

Our conclusion is that the most effective way to learn a new language is a simple commitment to showing up regularly to language events. Meetup.com or your local public library will usually have regular (and free) events. Once you have that commitment, picking a great app will help expedite your learning.

Excited to see an open source language learning app!

What if you live in a region where you can't find people speaking or learning the language you want to learn?

I've had a lot of success using a Discord server to learn French. The big gaps in my self-learning knowledge were pronunciation and conversation. (It's one thing to be able to read/write the language at your own pace but different to be able to "think" and react in that language in conversation.) Discord chat rooms are great and thanks to the voice channels I've almost got the 'r' down. Most of the servers even have user "flair" so you can let others know if you'd like to be corrected on your mistakes or if you prefer others let them slide.

That's what I originally thought too (couldn't find an existing group) but shamelessly I started one with my elementary grade Korean. Students much better than me came and I was able to ask them for help. Have you thought about starting a group at a local library or coffeeshop?

Great landing page. I’m working in the language learning space but lack the design skills.

Buy one from themeforest.com or some other landing-page template site for $20; that saves you time.

Then spend your time on the content to answer the questions your users are gonna have. Provide screenshots of your app.

If you are a self starter, I wouldn't recommend building an entire landing page yourself.

I’ve been working on my own iOS language apps for quite some time:


My original idea was to do lots of little games in one app to keep it interesting: Hangman, Word Search, 4 Pics 1 Word, ...

Recently, I’ve been breaking out the games into smaller apps. I think that’s better. Take your data and try to make some fun little games.

Also, I think noun gender is important. Should learn it at the same time as the noun. It can change the meaning of the word.

Finally, is anyone doing verbs? I have a simple Spanish verbs app. There are usually many rules to help make it easier.

I was going to open source the rules that I have but there doesn’t seem to be much interest:


Need to extract the data from my site:


One small piece of feedback: my (UK) keyboard doesn't have accented characters on it, and I don't know how to type them (well, I do, but many/most users won't), so typing "leon" instead of "león" for Lion wouldn't let me continue, as it counted as a spelling mistake. Ideally, have an on-screen keyboard for accented characters, or instructions on how to type them, or less ideally allow spelling mistakes on such characters.
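
For the "allow spelling mistakes on such characters" option, a minimal sketch of an accent-tolerant check using only the Python standard library might look like the following. The `check_answer` function and its three result labels are hypothetical names for illustration, not part of LibreLingo.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def check_answer(given: str, expected: str) -> str:
    """Return 'correct', 'accent_only' or 'wrong' (hypothetical labels)."""
    # Normalize case and Unicode form so equivalent spellings compare equal.
    given_n = unicodedata.normalize("NFC", given.strip().casefold())
    expected_n = unicodedata.normalize("NFC", expected.strip().casefold())
    if given_n == expected_n:
        return "correct"
    # Same letters once accents are removed: accept, but warn the learner.
    if strip_accents(given_n) == strip_accents(expected_n):
        return "accent_only"
    return "wrong"


result = check_answer("leon", "león")
```

One caveat with this approach: it also treats "ñ" as "n", so "ano" would be accepted with a warning for "año". A course would probably want to decide per language which marks are forgivable.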

thanks! If you have a GitHub account, feel free to create an issue for that! https://github.com/kantord/LibreLingo/issues/new

Cool! Please make sure to fix what I consider the major problem of Duolingo: it asks so many stupid questions (things I already know perfectly; even if I'm going to forget them tomorrow, it is still nonsensical to repeat them that much during the same session) that I get bored and start clicking too fast, so I make mistakes out of pure inattention and get even more stupid questions as a result.

Maybe your ideal way of learning or type of platform is different.

I have thought of experimenting with the possibility of having different ways of presenting the same content. For example, instead of a repetitive software solution, exporting the course material as a printable book.

No. I like the repetition approach (it works and it provides instant gratification to support the flow state unless the repetition rate is too intense) and what I see in the current version of LibreLingo is ok. Just don't overdo it. Or make repetition intensity configurable.

Maybe what you want are options like AnkiDroid does: https://lh3.googleusercontent.com/QFBCiyA_6WBKfr8Xoo1MqOrQw1...

At the bottom there are 3 buttons that sort of mean this:

  - repeat card more frequently
  - repeat card
  - repeat card less frequently

Android app: https://play.google.com/store/apps/details?id=com.ichi2.anki
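
The three buttons described above could be modelled roughly like this. To be clear, this is not AnkiDroid's actual scheduling algorithm, just a minimal sketch of the idea; the `Card` class, the rating names and the multipliers in `FACTORS` are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    prompt: str
    interval_days: float = 1.0  # days until the card is shown again


# Hypothetical multipliers for the three self-rating buttons.
FACTORS = {"more": 0.5, "same": 1.0, "less": 2.0}


def reschedule(
    card: Card, rating: str, min_days: float = 1.0, max_days: float = 365.0
) -> Card:
    """Return a copy of the card with its interval scaled by the rating."""
    scaled = card.interval_days * FACTORS[rating]
    # Clamp so a card is never due more than once a day or lost forever.
    clamped = min(max(scaled, min_days), max_days)
    return Card(card.prompt, clamped)


card = Card("oso", interval_days=4.0)
later = reschedule(card, "less")   # learner asked to see it less often
sooner = reschedule(card, "more")  # learner asked to see it more often
```

Making `FACTORS`, `min_days` and `max_days` user settings would be one way to get the configurable repetition intensity asked for earlier in the thread.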

One way to distinguish mistakes of inattention from "actual" mistakes is to leave marking the mistakes to the user. So if you get inattentive, you simply forget to mark yourself wrong.

My personal system works that way, though not for any particularly clever reason: I just have no idea how to automate the checks. I don't recommend that anyone try it unless they want to spend a few hours setting it up, but there's a screenshot in the README: https://github.com/Yorwba/alphabet-soup

Looks interesting but you should really add a European language. I'm looking forward to learning Japanese and Mandarin at some point but don't feel like doing it right now, and I'm probably not alone. The LibreLingo author has made quite a wise decision to choose Spanish for the example language - it's the easiest and makes a lot of sense for everybody to learn (given how many countries use it).

Well, I've chosen Japanese because I'm learning it. I'm not really looking for users right now, I just thought others might find some inspiration in what I'm doing.

You can progress faster in Duolingo if you want. It depends how you use the features.

Good idea.

I personally love Duolingo. But I fear that the course data, which is mostly user contributed, will eventually disappear if the company goes under...

There's a lot of politics involved in that course data, too. That an open-source approach exists is a very good thing.

same, that's my biggest fear

Some hero should scrape it.

Would that be legal? I wonder who holds the copyright to user-contributed data in Duolingo (assuming it's copyrightable).

If some of the user-generated content isn't copyrightable, or was contributed by users willing and able to share it with a FOSS project, could only that data be scraped, or would it be too difficult to identify?

One way is to get Premium and download the course. I haven't looked at it, but I assume they haven't bothered to do any copy protection on those data packages. Not sure if they contain account-bound watermarks of any kind.

I'm not sure if it's a joke or not, but that web page's title has a typo. It says "langauge" instead of "language".

I also wanted to congratulate you on your efforts. I'm a daily and long-time user of Duolingo and have thought of libre alternatives for a while.

I'd agree with some sentiments in this comment section, that you might want to find a niche, as competing with Duolingo or Babbel would be difficult. Duolingo doesn't do too well with "smaller" languages and different scripts.

Are you using a TTS engine for the voice (I assume)? I was looking for TTS for a smaller language I'm studying, but I couldn't find anything. I hope that something comes out of Mozilla's Common Voice project.

I still think it makes more sense to implement languages that more people want to learn.

However, the long term goal is making it easy for the community to build courses, so once the project is mature, it should be possible to include a way larger number of languages.

I am also thinking of things like conlang enthusiasts being able to create courses for their own conlangs.

I just wanted to chime in and say I think there's another good reason to focus on the more widely-spoken languages first: It's not just that there are more people who want to learn them, it's also that there are more people who can help contribute to the materials.

which makes it possible to create courses that have more content as opposed to being just brief introductions

There are plenty of other places to learn the common languages though. If you have good content for less common languages there are not many choices and you become the default place to go.

I have some experience with Duolingo's course builder. I'd be interested in exploring better ways to do this.

I'd be interested in hearing about your experience with the Duolingo course builder

It was a couple of years ago, November of 2015 according to my notes. I was involved with the Hungarian language course for a while, but there were some in-group political problems and I ended up getting shut out rather abruptly.

The Duolingo course builder is a rather slick UI, but I found it brittle. Like many UIs, it does what it does, and then it stops. There's no direct access to the underlying database, meaning you can't do any kind of search on problem phrases for instance. You can't do any bulk updates of any kind. All you can do is navigate through the course structure to the sentence you want, and use a custom editor to make changes to it. Very clunky.

They have some kind of mechanism in place to flag sentences where learners have problems, but I wasn't involved in Hungarian after it went live, so I don't know how it works.

There were systematic problems with the course material (it had been adapted from the English course for Hungarians), and I finally ended up writing a spider that walked the course and recorded the data in my own database just so I could do queries against it to highlight where certain issues needed to be fixed.

My #1 takeaway, and this would automatically be addressed in an open-source context, is that the database has to be exposed. I'd be happiest if the canonical course were actually defined in a document that had an independent existence from the live database entirely and could be version controlled. (Obviously multimedia resources would have to live outside that document, but you could have a descriptor for each one and use it within the course definition.)

By all means provide a builder UI to smooth the process - but at volume, you'll want some way to just work on things in text or you'll end up buried in technical debt. You might even want to model the builder on a Wiki, for instance - a set of documents that could be considered a single book made up of articles.
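
As a rough illustration of the "course as a version-controlled text document" idea, here is a small sketch. The JSON schema, the field names and the helper functions are all invented for the example and are not Duolingo's or LibreLingo's format; the point is only that a plain-text definition makes bulk queries trivial.

```python
import json
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical course document. In practice this would live in a file
# under version control; audio is referenced by a descriptor (a path),
# not embedded in the document.
COURSE_TEXT = """
{
  "language_pair": "es-en",
  "skills": [
    {"name": "Animals",
     "phrases": [
       {"source": "el oso", "target": "the bear", "audio": "audio/el_oso.ogg"},
       {"source": "el león", "target": "the lion", "audio": null}
     ]}
  ]
}
"""


@dataclass(frozen=True)
class Phrase:
    skill: str
    source: str
    target: str
    audio: Optional[str]


def load_phrases(text: str) -> List[Phrase]:
    """Flatten the course document into a list of phrases."""
    course = json.loads(text)
    return [
        Phrase(skill["name"], p["source"], p["target"], p.get("audio"))
        for skill in course["skills"]
        for p in skill["phrases"]
    ]


def missing_audio(phrases: List[Phrase]) -> List[Phrase]:
    """Example bulk query: every phrase that has no recording yet."""
    return [p for p in phrases if p.audio is None]


phrases = load_phrases(COURSE_TEXT)
todo = missing_audio(phrases)
```

A query like `missing_audio` is the kind of search that, per the comment above, required writing a spider against the live course; with a text definition it is a few lines, and ordinary diff and review tools apply to every change.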

My time is limited (isn't everybody's) or I would promise you the moon in terms of cooperation on this project - I've wanted to see it for a long time. I hope I'll have the time at least to use the platform and provide some constructive criticism.

I totally agree with this. I'm an avid Duolingo user with an over 2000 day streak but I would happily use another tool/game/platform if I could learn Malayalam.

My family is Malayalee and I know the entire language in my brain because I can understand them speaking to me, but I respond in English. I would pay a huge amount of money for an English<->Malayalam instructional book and I'd pay a huge amount of money for a Duolingo-like Malayalam learning experience. Niche and rare languages have little to no representation in the popular apps.

Great effort - my thoughts as a learner of Finnish and perpetual searcher for internet learning content is that the biggest issue facing any learning platform (commercial or open source) is content, content, content.

It takes a huge amount of effort to build decent course material - most open source material comes from a community of people that are also using the material - will this be the case with language learning?

One way to get content would be Tatoeba (https://tatoeba.org). For example, to teach "oso" = "bear", there are already ten different example sentences with audio recordings: https://tatoeba.org/eng/sentences/search?query=oso&from=spa&... (Okay, one of them is actually "oso" = "dare".)

The Nordic Language Processing Laboratory has a sizeable collection of open source parallel texts: http://opus.nlpl.eu

Click on any of the headings in [brackets] on the top of the page to see the language pairs available for download for that collection.

The problem with Tatoeba is that the content doesn’t seem to be curated. I checked the translation of "I love you" in Japanese recently and more than half of the listing is hot garbage that would never be uttered by a native speaker. It’s as if these sentences were contributed by learners who had just finished their first lesson.

Japanese is a special case, because most of the Japanese sentences come from the Tanaka Corpus, which... consists of sentences written by learners of either English or Japanese who likely weren't all that qualified for the task.

However, the content is actively getting curated. E.g. https://tatoeba.org/eng/sentences/show/137571 is a translation of "I love you." owned by a native speaker. The search form doesn't make it easy to find native-speaker translations, though. The search option to limit results to native-speaker sentences only works if you're searching in the target language. E.g. if you want usage examples for "愛": https://tatoeba.org/eng/sentences/search?query=%E6%84%9B&fro...

If you're only using Tatoeba as a data source for automatic creation of course material, though, you can filter using whatever criteria you want.
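
Filtering "using whatever criteria you want" could be as simple as the sketch below. The `Sentence` record and its fields are assumptions for illustration and do not mirror Tatoeba's actual export format, and the sample sentences are made up rather than taken from the site.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Sentence:
    text: str
    lang: str
    by_native_speaker: bool  # hypothetical flag
    has_audio: bool          # hypothetical flag


Criterion = Callable[[Sentence], bool]


def select(sentences: Iterable[Sentence], *criteria: Criterion) -> List[Sentence]:
    """Keep only the sentences that satisfy every criterion."""
    return [s for s in sentences if all(check(s) for check in criteria)]


# Made-up sample data, not real Tatoeba entries.
SAMPLE = [
    Sentence("Veo un oso.", "spa", by_native_speaker=True, has_audio=True),
    Sentence("El oso duerme.", "spa", by_native_speaker=False, has_audio=True),
    Sentence("No oso decirlo.", "spa", by_native_speaker=True, has_audio=False),
]

usable = select(
    SAMPLE,
    lambda s: s.by_native_speaker,
    lambda s: s.has_audio,
    lambda s: len(s.text.split()) <= 4,  # short sentences for beginners
)
```

Because each criterion is just a function, a course builder could add stricter checks later (for example a word-frequency cutoff) without touching `select`.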

yes. It's gonna be a lot of work to make that happen, but I imagine the course editor platform being open to more people, with different levels of involvement.

Volunteers on Duolingo are working on a Finnish course, and it is expected to go online this year, maybe April.

Completely tangential:

The Rosetta Stone / Duolingo style of language learning has never really worked for me; I generally have more success and an easier time staying engaged with the "comprehensible input" approach.

Are there any open platforms based on this method out there? Maybe something along the lines of LingQ, though perhaps without all the attempts at gamification.

I think the best approach is a few weeks of Duolingo-style learning just so you get SOME grammar and a FEW words. Once you can understand things, input is best, but if you don't understand, input isn't going to do much good. If you can find English [any language you know] subtitles, all input is comprehensible, and it is best - but also the hardest and most frustrating.

If someone could just create a list of YouTube videos with subtitles in my language it would be huge. Doesn't matter if it is a blacksmith explaining how to shape metal, or a knitter showing some pattern, I just need input.

TBH, I don't even personally like Duolingo-style programs for getting down the base vocabulary, because, while it's a generally more pleasant experience than other options, I find the approach to be relatively slow and inefficient.

Here's a video (in French) that I found reflects my own experience quite well: https://www.youtube.com/watch?v=-KQ1qRJ2wQc

In a nutshell, I've got two complaints. First, while flashcarding is wicked efficient (I'd say essential) for review, it's not a particularly good tool for learning things in the first place. And an overly flashcard-oriented approach has real problems with a sort of streetlight effect: You tend to get stuck on what's easiest to illustrate with pictures, which often isn't what's most useful for communicating with others.

edit: Also, for what it's worth, the existing research seems to indicate that, while subtitles in your target language can be hugely beneficial, subtitles in a language you already speak actually inhibit language acquisition: https://journals.plos.org/plosone/article?id=10.1371/journal...

(There was another study with Brazilian English language learners that I found more compelling, but I'm unable to find it right now.)

The point is that what works depends on where you are. If you need a new language, getting that first 100 (1000?) words is important to do fast, and it doesn't matter too much which words you get. Then you need to hear the real language, and subtitles help you understand. As soon as you start to understand something, though, you need to switch them off (or to the target language).

It also depends, perhaps, on the method you're using. If you're trying to use a method based on comprehensible input, then the point of contention would be that any materials that you can't understand without a translation are still too advanced for you, and you'll learn more efficiently if you instead seek materials that you really can comprehend.

That's where I like LingQ so much, at least in principle: It's theoretically a clearinghouse where you can find lots and lots of these kinds of materials for learning. (The truth is, it can be a bit difficult to sift through.)

I found some pretty useful courses on Udemy. Most of the time you can find them at a good discount. Usually it's like a class recording. Your mileage may vary depending on your learning style.

I am far from the first to say this, but please do push hard for languages that aren't commonplace (or well-developed) on other platforms.

My language of choice would be Farsi.

Good work and the roadmap looks pretty promising. I hope we will see native mobile apps. Would be a cool case to play around with SwiftUI and Jetpack Compose.

Love it. Seems a lot like duolingo, but with a lighter UI. Can you please add shortcuts, so you can type 1, 2 or 3 instead of clicking?

you can already do that. Maybe it's not very clear from the UI, but it basically has the same keyboard shortcuts

You're right. Vimium (browser extension) was the problem :)

Wonderful initiative. Would have loved to get involved but I really don't want to use GitHub. Any idea how people may contribute using another platform?

See if the maintainer is willing to accept email patches. You can clone the repo from the github link, make changes, and generate a patch which you can send to a maintainer.

Unfortunately it's a bit tough to avoid github, it's like the facebook of programmers. I don't see any particular reason why it's bad though? If anything, it might improve with microsoft now owning it.

I can also merge from GitLab (or other server) forks too, I think? I've never tried doing that, but it sounds like something that should be trivial

Git is designed to be distributed, you could push to someone else’s repo and they can then push to GitHub for you.
