Faulty machine translations litter the web (techxplore.com)
4 points by Brajeshwar 8 months ago | 5 comments



A faulty translation to an obscure language is better than not having certain information available at all in that language, so I think "litter" is an unfair label.

'That means that as trillions of bits of data are ingested for AI training operations, regions under-represented on the web, such as African nations and other countries with more obscure languages, will face greater challenges in establishing reliable—and grammatical—large language models. With few native resources to draw upon, they must heavily rely on tainted translations flooding the market.'


Exactly. Something is better than nothing, and machine translations have recently improved significantly. Combined with human proof-reading, it's a great thing.


Well, human translations aren't necessarily better. I recently wanted to train an LLM to act as a translator. As my training data, I used crime novels that were written in one language and had been translated into the other. When preparing the dataset, I encountered many places where whole sentences were missing, or the meaning of the original had been clearly changed by the translator. Apparently, nobody had checked.


Human translation quality varies a lot, but even a good fiction translation does not always stay close to the original. If you translated fiction as literally as you would a legal document, it would probably not be an enjoyable read, and enjoyment is the main point of most fiction. So translators often make changes, such as using idioms from the target language, which rarely map 1:1 to idioms from the source language, or making other adjustments to keep the style and mood of the original, even if they have to sacrifice minor facts.


That is completely correct. What I had seen, however, was that the translator left out whole details for no apparent reason, for example that the protagonist feeds the cat before saying good night to his kid. Stuff like that. My guess is that the translator simply overlooked that sentence. And I'm surprised that nobody noticed during editing.



