Hacker News new | past | comments | ask | show | jobs | submit login
A web app to read Latin texts with inline translations (adi.earth)
122 points by realaleris149 2 days ago | hide | past | favorite | 67 comments





After finishing DuoLingo’s Latin course, I wanted to read some Latin texts, but I didn't find easy enough texts and going back and forth to a dictionary was cumbersome.

So I created this app for reading basic Latin texts. The idea of the app is to have a Latin text with translation of each word under the paragraph line, which makes it easy to grasp the meaning but also focuses on reading the original Latin text.

It only has one book, if I finish it I might add others.

I used OpenAI to do the translation which looks pretty good for me, with the caveat that... well... I do not know Latin. This approach will not probably work for more complex texts.

This mode of reading works for me, not sure if is of interest to anyone else.

The app source is on GitHub if you are interested:

https://github.com/aleris/duplex-lectio

A couple of details about the dev process:

https://adi.earth/posts/duplex-lectio-read-latin-bilingual/


Duolingo really isn't good for learning languages. When I learned Latin, I used the immersive method of just starting to read simple texts to more complex ones. The principal book of this form is called Lingua Latina Per Se Illustrata, I have the color PDF if anyone so needs it. It's a book that starts off with very simple sentences and gradually introduces more complex topics like tenses and declensions as the story goes along.

> It's a book that starts off with very simple sentences and gradually introduces more complex topics like tenses and declensions as the story goes along.

I'm not sure this is actually an approach I'd recommend. I was recently asked to give some supplemental English tutoring to a Chinese brother and sister, 9 and 5 years old. The 5-year-old could already use and understand 'simple' sentences such as "what do you see?" and "where is your brother?", though I'll note that the subject-auxiliary inversion required by a question of that form isn't exactly a simple concept.

I got them a copy of The Cat in the Hat, and their mother objected that it was too advanced for either of them, because most of the verbs in The Cat in the Hat are in the past tense, which apparently isn't covered within the first four years of Chinese English instruction.

You can't learn what you're not exposed to, but you can learn a lot of what you are exposed to in a language.


Lingua Latina is for adults who already have some base level knowledge of tenses in their own language, preferably a Romance or Germanic language (as I believe some languages don't have tenses), not for children who have no concept of them. Once you start reading the book, it really does start to make sense while teaching you the various forms. It's on Internet Archive if anyone wants to read it: https://archive.org/details/lingva-latina-per-se-ilustrata-p...

> preferably a Romance or Germanic language (as I believe some languages don't have tenses)

A couple of points:

- If you natively speak a Romance language, learning Latin by example is going to be really easy for you. This doesn't belong in a comparison with anything else.

- Germanic speakers have no special advantage over any other Indo-European speakers.

- You might be interested to know that while Mandarin verbs don't inflect for tense, the negative particle does, so you have to observe a distinction between past and present tense whenever you're negating a verb.

我不理解 I don't understand

我没有理解 I didn't understand


I think there is a big difference between learning a living language versus a classical language like Latin or Ancient Greek. The point of learning a living language is to learn practical ways to communicate. While there are tiny communities of people who enjoy talking in classical languages, the vast majority of people learning a classical language do so to read real classical texts in the original.

I'm not sure how this responds to my comment?

Meh, when you say Duolingo isn't great, you should compare it to something in its same class. I don't think Duolingo competes with sitting down with a grammar book. The whole reason I used Duolingo, which got me up to speed with Spanish enough to read books, is that I could do it anywhere and any time rather than bust out a book for serious study. I was never going to do that, so Duolingo was strictly better for learning a language than a book.

People lose track of that every time they dunk on Duolingo.


>People lose track of that every time they dunk on Duolingo.

This, people want so badly to shit on stuff online that they don't consider that different people have different wants and needs.


I just wish Duolingo would move faster.

I do not speak latin, although I studied it for two years in high school and I'm a native speaker of a romance language, so my understanding of latin is pretty much basic to guesswork.

This is a really cool tool -- I often read latin texts with the original on one page and the translation on the other, just because I think it's interesting to see how they wrote/spoke at the time, but for the most part certain words or declinations throw me off guard. Inline literal translations really help there.

That being said, I noticed whilst reading some of the texts that the inline literal translations are still in latin, e.g. in "Part IV. I Some Barbarous Customs", most of the translated text is just latin. I guess OpenAI won't take all our jobs just yet!

I do have one suggestion for improvement though. Many of these texts have translations that are already in the public domain (older translations). It would be helpful to display the original Latin and a fluent English translation side by side, whilst still being able to toggle the literal translation on or off. This setup would make it easier to compare the original text with a fluent English translation, similar to the format used in some bilingual books.


If you'd like to try to use Wiktionary instead of OpenAI, you may find this useful:

https://dictionary.nuenki.app/get_definition?language=LatinC...

The code is here: https://github.com/Alex-Programs/nuenki-dictionary

However, feel free to use the API for small scale usage; the API can handle ~5 orders of magnitude more requests than it currently receives.

You can see that each word has many different definitions. It's very difficult to do a word-by-word translation that takes context into account, though I'm going to attempt it at some point using a small LLM that merely picks from Wiktionary data for a https://nuenki.app hover mode.


Thank you for the details: I really like the minimalist UI, well done! And I like to see you sharing the tools that you build to help others learning.

That being said, I'm surprised though, that the translations are GPT generated, so not sure how trustworthy this actually is. Domain foreign users have to be able to trust that learning resource are proof-read / accurate.

Not to say it's worthless, but you may want to note that the translations were done automatically and may contain errors.


I think it’s pretty cool, but with Latin a simple word by word translation will quickly become unmanageable because you need to re-order and break up sentences to make them understandable in other languages. If you take Cicero for example, it is common to have periods that last one full page of text.

“Translation of each word” is also called a “gloss”, and I think it’s absolutely vital for trying to read works in translation.

Lots of words and phrases have multiple meanings and connotations in their origin language and it’s not usually possible for a translation to bring the richness into the target language.

(I’m going to butcher this because I don’t have the text in front of me, but) Thomas Aquinas composed several hymns for the feast of Corpus Christi, one of which is “O salutaris hostia”, which contains a reference to “fer auxilium” which is often translated to “bring help”.

The choice of the word “fer” isn’t the most obvious choice for “bring”, though. Some translators have speculated that Aquinas chose “fer” because of how close it is to “ferculum”, which is a litter or wooden frame upon which spoils are carried, which refers to the crucifixion.

.... I think that’s right.

Anyway if you have a gloss along with a translation, it’s easier to include context like that as footnotes on individual words/phrases.


> The choice of the word “fer” isn’t the most obvious choice for “bring”, though.

Well, if you asked me how to say "bring" in Latin, that would be my first choice, and the irregularity of the verb tells us that it's very common in general, though not necessarily for this.

Lewis and Short has "In general, to bear, carry, bring"; "In particular, to move, bring, lead, conduct, drive, raise", which seems to hit the concept of "bringing help" squarely in the center. https://www.perseus.tufts.edu/hopper/morph?l=fero&la=la#lexi...


I love this mode of reading very much. I think this could be great for any language, not just Latin, and any target language, not just English. The only difficulty is sourcing good texts that are both fun to read and also suitable for language learning at various levels of skill.

I've been working on a tool to display the structure (and glosses) of a text. All the structural and definitional annotation is provided by me based on personal knowledge. So far I have most of a single short fable in Mandarin Chinese done. ;D

Does this sound like something you'd be interested in checking out?


Thank you for creating this. I took Latin in HS (over 30 years ago!) and would like to brush up.

>I used OpenAI to do the translation which looks pretty good for me, with the caveat that... well... I do not know Latin

OpenAI also does not know Latin. This is either a tool to troll people that can read Latin or a tool to help people “learn” a made up vaguely Latin-shaped set of gibberish that ChatGPT nondeterministically generated. This only works for a definition of “Latin” that is a sort of vibe wholly detached from structure, syntax, or vocabulary.


  vaguely Latin-shaped set of gibberish that ChatGPT nondeterministically generated
OP hadn't used AI to generate Latin, but to generate English.

It doesn’t matter which set of text you generate with ChatGPT in this case. Using it in either makes the output useless as a tool to learn anything about both sets of text. This issue is compounded further when, as the OP readily admits, there is no quality check involved (OP does not know Latin)

  Using it in either makes the output useless as a tool to learn anything about both sets of text
OP is using it as a tool to improve their comprehension of Latin. The tool shows a word by word translation, which is jarring to read linearly, but works well for filling in gaps.

It's far from useless as an aid to comprehend the Latin text.

It's been over thirty years since I last studied Latin, but I still remember enough to be able to tell that this tool would be useful for a learner, even without perfect accuracy.


I do know Latin (quite well), and gpt-4o does an extremely good job at translating from English to Latin and vice versa.

Note how one opinion provides the warrant for the other.

You're severely overstating the case here, as if it just produces random text like a markov generator or something.

A great collection of Latin texts you can use to find your next books: https://thelatinlibrary.com/

I suggest using ruby annotations for the translation part.

I am building a sanskrit reader[1] and needed a feature that allowed users to tag words with notes/meaning. ruby annotations work wonderfully for that.

[1] Site: https://www.adhyeta.org.in/

Backend: https://github.com/s-i-e-v-e/adhyeta

(The vocabulary features require a login, which is not open to the public as of yet.)


I can see some texts in Sanskrit but I couldn't find any ruby annotations, nor a way to add/suggest any.

I am still working on the app part of the website (https://app.adhyeta.org.in/). Will enable registration/login in a week or two.

This is how the interface looks like/works, when logged in.

https://www.adhyeta.org.in/a/images/z85.png (ruby annotations can be seen floating above the source text)

https://www.adhyeta.org.in/a/images/z88.png

The "app" is laid out exactly the same way as the www version is. The only difference is that it provides tools to manipulate the status of words. Login is required because vocabulary progress has to be tracked on a per-user basis.


How... how does it work?

Ruby annotations?[1] They were originally meant to provide pronunciation hints for Asian languages. The hints naturally float above or below the corresponding word. You do not have to play around with tables/grids/flex to make this happen.

[1] https://developer.mozilla.org/en-US/docs/Web/HTML/Element/ru...


+1 to this, they're an excellent little feature. Just bear in mind that the rendering differs substantially by browser, and some fonts (for the ruby text) work better than others. I'd recommend overriding as much as possible, including the font of the ruby text.

I primarily test with Chrome and Firefox (Epiphany, occasionally). And my usecase requires that I provide downloadable fonts as not all fonts have nice-looking Devanagari renderings. So, the overriding happens naturally.

Yeah, I also target Chrome and Firefox. I'll occasionally test on Edge.

I had a user report that pinyin was looking funny on Reddit. Turns out they use a system font stack which includes Verdana (and so only shows up on Windows), and Verdana does really poorly with Pinyin ruby (iirc it was a generic ruby thing, not just Pinyin).

I just replaced it with a custom system font stack without Verdana.

Not a difficult fix, but something to be aware of.


What's the function of the non-rendered rp-elements?

Backwards compatibility for browsers that do not support this feature. You put your opening and closing parenthesis tokens within the "rp" tags and such browsers will show the hints within those parentheses.

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/rp


This is nice! I do use ChatGPT a fair amount as I’m still crap at Latin and Greek, but a lot of scholarship just plonks it in untranslated. It has always worked fine for getting the gist. I am extremely lucky in that while moving house my wife allowed me to buy the complete Loeb collection but until then I spent a lot of time on Tufts’ Perseus library. The Latin and English sources (and Greek for that matter) are all in XML on GitHub, so some enterprising soul could probably hook up a wider set of books from here:

https://github.com/PerseusDL/canonical-latinLit/tree/master


Perseus seems to be slowly falling apart; I think it's unmaintained. I also think there's supposed to be a more current replacement. But that's as far as my knowledge goes; do you know more?

It does seem under resourced. Version 5 never appeared and I don’t know what 6 is supposed to look like. It’s still a great online resource, but having my own library, and access to the Hellenic and Roman Library in London, I’m happy to take things slow these days.

I built a similar thing a while back: https://github.com/katspaugh/interlinear.io

Useful when you're at a very low level of a language. Past B1 or so, it's mostly looking up individual words.


This is nice! Have been building something similar for Sanskrit (links in my other comment).

And yes, this is not helpful beyond a point. But most people never reach that point. So, any help is useful.


The jargon is „interlinear translation“.

The best article on the subject in English is maybe still Ernest Blum's "The New Old Way of Learning Languages" https://theamericanscholar.org/the-new-old-way-of-learning-l... but note that it's quite imperfect. For example, there's no mention of the fact that interlinears went mainstream in France in the century between Locke's efforts and Hamilton's. There's also a subreddit https://www.reddit.com/r/interlinear , but you may still have to message a mod to be allowed to post.


The web design's relatively minimalistic and clean. Saw the ruby annotation thing a bit ago in the HTML spec. Seems like there's lots of applications for them that don't get used.

Anybody happen to know of a Latin OCR software that could grab the text parts from Latin documents? Have an interest in translating the Etymologiae of bishop Isidore of Seville, since its one of the main reference texts for education in the middle ages. Just 280 pages of Latin's a bit much for hand copying to text files.

[1] https://en.wikipedia.org/wiki/Etymologiae

[2] Archive.org https://archive.org/details/etymologiaeaddde00isid


I have been using LLMs to restore some of my very rusty Russian reading skills. I give them a text in Russian and ask them to explain it word by word and sentence by sentence, and I ask follow-up questions about parts I am still unsure of. It works quite well, and I haven’t noticed any mistakes yet.

I just tried the same thing with the linked Latin text and Gemini Experimental 1206. The results appear below. Perhaps someone who knows Latin can tell us how accurate the glosses, translations, and grammatical explanations are.

https://www.gally.net/temp/20250102interlinearlatin.html


I used this method too and found several misstakes, it can be pretty gross mistakes where tone shift is not detected by choosing the wrong words. Idioms and fixed phrases are the hardest I guess. Usually I can get an explanation if I ask four-ten follow up questions it can be really hard to get it right. I hope you are not missing mistakes, and that it is just how I use it that makes it such a burden for me.

I have the worst time with transcripts, and email conversations.


The accuracy of the results might depend on the language you’re reading, the LLM you’re using, the nature of the text, and the amount of context you give to the LLM. When I’ve tested the best models with more or less standard texts, such as excerpts from novels or news articles, in English, Japanese, and Russian, the results have been extremely good. The latest versions of ChatGPT, Claude, and Gemini are able to explain the meetings of words quite well, and they also get the grammar correct. (I say this as a long-time language teacher and lexicographer. I have written and edited many textbooks and dictionaries for learners of English and Japanese; LLMs come close to my ability and maybe exceed it sometimes.)

They are not always so good, however, with more granular aspects of language, particularly the way words are written or pronounced—the problem the models have with the word “strawberry” is well known. I’ve also seen them struggle with the meanings of words and sentences in isolation, as a lack of context can confuse them (as it can confuse people).

In the case of emails or transcripts, the text might contain mistakes or non-standard language that might trip up the LLMs as well.

In any case, at least for major languages and non-critical applications, I think LLM’s are a great way to understand what is written in another language.


As a learning tool it works ok as long as you keep vigilant. When lose the original text though or become over reliant on the LLM results you will make mistakes. I will agree as long as you have lots of contexts the mistakes will be few, but in real human communication context might change fast and span multiple medias.

This is really nice, and I hope you continue with additional texts, as I have been trying once again to pick up Latin basics, using YouTube videos and Wheelock's Latin (which is one of the best foreign language books I've ever seen).

It would be nice to be able to add the macrons or even acute accent characters to show emphasis syllables as another toggle-able option -- ChatGPT seemed to do OK with that task for me just now.


There's some intensive spoken Latin course in Rome, I remember reading about it at the time and it sounded totally immersive. It was a follow on from what Fr Foster had run previously (https://www.smithsonianmag.com/history/father-reginald-foste...).

On my bucket list whenever I win the Powerball and can stop working full time


I suppose you've alreard heard of it, but Lingua Latina per se Illustrata is also supposed to be very good.

Especially after having had to learn Latin the "old fashioned way", which is basically like learning English via analyzing and translating Shakespeare word by word, musing about the meaning of each words position. All while hardly being able to produce your own sentences


It is very good, that's how I learned Latin. For those wanting to speak it, there are online chat rooms I used to use.

Wow, I like this design. Somehow you picked just the right font size combination so that I can focus on the Latin (but still the English is accessible)

I can imagine a click interaction (click on a word/phrase, learn more about the grammar, etc). What kind of interactions would people like for reading texts like this?

(this is so useful for Chinese too!)


Nice, and didn't realise Latin behaves like German by swapping the order of verbs at the end.

I really enjoyed this. You made a great choice in typography and the relative size where the English translations do not interfere with the original Latin text. I hope you can continue.

TIL the sentence structure of Latin is wild

Latin’s sentence structure is much more flexible than languages such as English. For example, Latin uses cases for nouns to indicate which is the subject and which is the object, whereas English uses word order. This allows Latin sentences to be constructed in various orders without affecting meaning. It also leads to Latin’s reputation for trying the patience of students such as myself (studied for five years at school many years ago).

People might find it interesting to know that English still has remnants of its case system in its personal pronouns, for example I/me/my/mine, he/him/his, she/her/her, who/whom/whose.

Some people struggle with who/whom but in fact it’s the same as “he” versus “him”: if you replace “who” with “he” in the sentence and it sounds wrong then you should be using “whom”.


This is also quite common in some more synthetic languages, like most slavic ones for example.

And the latin case is made muuuch more extreme due to the fact that we read poetry and stuff written in the yambic hexameter, which would have also been quite alien sounding to an average roman citizen on the street.


I remember in school one trick was always to start finding the verb and then use that to build a broader context of the phrase

It’s like passing data around in a list or tuple, versus key-value objects or dictionaries.

the difference, if you're looking to dive deeper into this, is that English is a more analytic language, and Latin is a more synthetic language. In Latin you mostly alter the form of the words to match their meaning. In English you mostly change order and add auxiliary words to change meaning. I'm qualifying these with "mostly" because nothing is all one way or all another, generally.

Word order does matter a bit in Latin, but much less than it does in English. "Dog bites man" and "man bites dog" are different due to word order. In Latin you can write "canis hominem mordet" in _any_ order and have it mean "dog bites man" because that's the only thing those nouns in those cases can mean. To do "man bites dog" you have to say "homo canem mordet" (again, in any order). Conventionally, though, you end a sentence or a subordinate clause with its verb, and the subject that matches up with a verb is usually the one closest to space-wise.


I think it's a useful start, but I for one still prefer consulting a dictionary on the side (e.g. Whitaker's Words -- https://latin-words.com/) to make sure I get the nuances and make sure I parse the individual words correctly.

The first thing that stood out to me with the translation is that it goes word-by-word, and doesn't have any room for ambiguity.

For instance, in the first line, "profectus est" is the third-person singular perfect form of the deponent verb "proficiscor" based on context, but it could also be the perfect passive form of "proficio" (which my brain initially gravitated toward, as one of the many derivatives of "facio"). I'd be a bit worried about picking the wrong one if given the word out of context. Or even just picking the wrong translation for a word: for the second sentence, using the word-for-word translation, I might try for "He long was spent at Periander, king of the Corinthians" when "He long dwelled at the house of Periander, king of the Corinthians" would read more naturally, using different translations for "apud" and "versatus erat".

Further, if you don't treat these pairs as holistic verb forms, you get very confused by just reordering words: "Arion, after he is having traveled abroad, ..." vs "Arion, after he traveled abroad, ..." And, it can cause some issues with relative ordering of events (where it's common to move between verb tenses to indicate that some events happened further in the past -- pluperfect vs perfect especially).

And, if you treat "erat" as "was" all the time (as is the case with "versatus erat"), you'll interpret pluperfect (which implies finality) as imperfect.

Later in the first paragraph, I'd run into a little bit of trouble with the ablative absolute ("Ingentibus opibus ibi comparatis"): "Great wealth there acquired" would more literally be translated as "With great wealth having been acquired", or, taking some liberties with the translation, "After he acquired great wealth, ..."

Moving on to the second heading: yes, "ut" is most commonly used as part of a result or a purpose clause with the subjunctive, but a newer reader might not understand why we use the subjunctive here instead of the infinitive. The most literal translation might be "The sailors make a plan such that they might rob and kill that man", but yes, once you're more used to the language, you'd translate it simply as "The sailors make a plan to rob and kill him".


Learned something new. It's incredible that you have this kind of detail... do you think there's a UX that can help readers learn more/explore this kind of ambiguity?

I think the interlinear is realy good at showing one translation (and isn't super distracting) -- do you imagine more like footnotes/etc type things?


I know various e-readers have the ability to look up unfamiliar words or phrases in a dictionary by tapping on them, so you can focus on only looking up things that you don't know, or look for another definition for some word if the definition you have doesn't quite fit. For instance, I would probably need to revisit my parse in that first sentence, "Arion, since he (or perhaps some unspecified deed? but why not the ablative absolute here?) was accomplished abroad, ..." using proficio as "has been accomplished" instead of proficiscor for "departed"); looking up profectus in a dictionary would yield the other base word.

I also wonder about jotting down translation notes while reading -- when seriously trying to translate a text, writing down notes in the margins or between lines (if there's appropriate spacing) helps a ton. At least until you're familiar with the grammar and the constructions, laboring through the translation is a huge help.

For someone like me who is reasonably familiar with the grammar, but might be rusty on the actual vocabulary (having last needed to use it in any semi-serious context more than a decade ago), I could see myself referring to the existing app's translation on occasion to give me some ideas regarding the actual words. But I don't know how helpful it is for someone newer to the language, who'd need more than just vocab.

Another commenter also mentions the possibility of pairing the Latin text with a well-established English translation. That might also be interesting; I could certainly muddle my way through a translation without knowing the grammar, but I would make plenty of mistakes if I didn't stop to think about cases and verb forms. What I think would help most is to perform a surface reading, followed by a refinement to make the text more idiomatic.

For instance, taking one of the sentences in the second paragraph: "Pecunia omni nautis oblata, vitam deprecatus est." My surface reading would look something like... "With all money offered to sailors, he begged his life." I might then refine that to "Having offered the sailors all of his money, he begged for his life."




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: