
I have no doubt that the writer is better at translating than AI, but I have to say that AI translation has gotten so good that I'm not sure how much longer translation work will be there, or rather it might end up being more about auditing.

For example, I just read the Lawrence Ellsworth translation of The Three Musketeers, which I very thoroughly enjoyed. I don't speak or read French, but from my understanding Ellsworth's translation is considered one of the more accurate translations of the work.

Out of curiosity, I sic'd Claude Fable on the original French version of The Three Musketeers and told it to translate accurately, but also try and keep the same jovial tone as the original and do not censor anything. After it was done, I didn't read the entire output, but I did compare a few individual chapters between the Ellsworth translation and the Fable translation.

They were honestly remarkably similar. As far as I could tell, nothing was substantially different between the Ellsworth translation and the Fable translation. I do think that the prose for the Ellsworth translation was a bit better, but the prose for the Fable one was actually perfectly readable. Again, I don't speak French so I cannot say for sure, but I do not believe that I would have gotten a significantly different experience had I read the Fable version instead of the Ellsworth version.

Now, it's possible (and likely) that this is somewhat self-fulfilling; Fable might have been trained using Ellsworth's translation and as such it's very directly able to crib from it; sadly since I do not speak any language outside of English, there's sort of a catch-22: the only way I can compare the accuracy of a translation is to compare against other translations, but if other translations exist then that will likely influence the results, and if a translation doesn't already exist then I have no way of auditing it.

I'm still going to continue reading through Ellsworth's translations for the subsequent stories simply because that feels more canonical, and as I said I do think the prose was a bit better.




> Out of curiosity, I sic'd Claude Fable on the original French version of The Three Musketeers and told it to translate accurately, but also try and keep the same jovial tone as the original and do not censor anything. After it was done, I didn't read the entire output, but I did compare a few individual chapters between the Ellsworth translation and the Fable translation.

This isn’t a great test, because Claude almost certainly has multiple translations of The Three Musketeers in its training data.


Read the last two paragraphs :)

The thing is, this is almost certainly what's happening.

You can (could, maybe they 'fixed' it by now) get sota LLMs to reproduce entire novels near verbatim.

The idea of giving it parallel texts of those novels in different languages, to train it on translation, is so obvious it'd just be strange if the AI labs didn't do it.

In fact DeepL was doing basically that more than 10 years ago.


Oops, I legitimately missed the second-to-last paragraph.

I still think there are better tests you could do. Ideally, you would choose a book that was published recently—after the model’s cut-off date—which is considered to be a good translation. But even something like The Girl With the Dragon Tattoo, which is not particularly new and by no means obscure, would be better than a famous work of literature like The Three Musketeers that has many translations.


Almost certainly correct, though I've noticed that these LLMs like to complain when you give them stuff that is still in copyright. The Three Musketeers is thoroughly public domain everywhere, so in that sense it's a good test, but because it's public domain everywhere there are lots of translations to crib from. I acknowledge it's not a great test, because the training data almost certainly contains a competent translation.

Even if Fable didn't have Ellsworth's translation, it certainly has the William Barrow translation, which would still get it like 80+% of the way there.

My wife speaks Spanish, I should get her to do some kind of comparison with a Spanish book that doesn't have English translations.


They say "yes, I admit it, this is all invalid".

No, they are a disclaimer that it's possible that the data isn't conclusive. Not the same thing as saying "it's all invalid".

> I did compare a few individual chapters between the Ellsworth translation and the Fable translation.

I'm pretty sure the Ellsworth translation is in the corpus. You basically instructed Claude to regurgitate it.

The LLMs all have the more famous books memorized. You can trick them into reciting them more or less word for word.
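
If you want something more than eyeballing to check for cribbing, one rough option is to count how many of the model's word 5-grams also appear in an existing translation. This is only a sketch: the function names, the choice of n=5, and the sample sentences are all made up for illustration, and high overlap suggests copying without proving it.

```python
import re

def word_ngrams(text, n=5):
    """Return the set of lowercase word n-grams in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_ratio(candidate, reference, n=5):
    """Fraction of the candidate's n-grams that also occur in the reference."""
    cand = word_ngrams(candidate, n)
    if not cand:
        return 0.0
    return len(cand & word_ngrams(reference, n)) / len(cand)

# Placeholder sentences invented for this example, not real translation excerpts.
reference = "the young man rode into the town on a yellow horse that morning"
copied = "the young man rode into the town on a yellow horse that morning"
reworded = "that morning a youth arrived in town riding a horse of yellow color"

print(overlap_ratio(copied, reference))    # identical text shares every 5-gram
print(overlap_ratio(reworded, reference))  # a free rewording shares none here
```

Two independent translations of the same source will naturally share some short phrases, so the number only means something relative to a baseline, for example the overlap between two human translations of the same chapter.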


I mentioned this specifically in my comment :)

... yet you still conclude "AI translation has gotten so good", so which is it?

I do think it's gotten pretty good. I'm just acknowledging my limitations in the matter. It's not a contradiction.

You acknowledge your test is fundamentally flawed but then still use it as the basis of your conclusion. It’s worse, not better, that you were aware of this.

Try translating some prose from English to another language, then, in a different model, back to English.

I tried this with the original comment in the thread. Guaranteed to not be in the corpus, references a few terms that also wouldn't be in the corpus (Claude Fable), and long enough to be more than a sentence or two while short enough to compare in a discussion like this.

I did this with entirely local models I have sitting around on my laptop. Minimax M2.7 at a 3 bit quant with 8 bit quantized KV cache for English -> French, Gemma 4 31B QAT (4 bit quant) MTP for French -> English.

It's perfectly readable, but there are a few places where the phrasing is a bit more awkward after the double translation ("auditing" to "revision" in particular is a bit off). Gemma did comment on not knowing what Claude Fable was in its thought process: "The author compares Ellsworth's translation with one produced by "Claude Fable" (likely a misspelling of "Claude" or a specific version of Claude)."

Here's the double translation:

"I have no doubt that a writer is better at translating than AI, but I must say that AI translation has become so good that I'm not sure how much longer the profession of translation will exist—or rather, it may become more a matter of revision.

"For example, I just read Lawrence Ellsworth's translation of The Three Musketeers, which I enjoyed immensely. I neither speak nor read French, but from what I understand, Ellsworth's translation is considered one of the most faithful translations of the work.

"Out of curiosity, I asked Claude Fable to translate the original French version of The Three Musketeers; I asked it to translate faithfully, but also to try to maintain the same playful tone as the original and to censor nothing.

"Once it was finished, I didn't read the entire result, but I compared a few individual chapters between Ellsworth's translation and Fable's.

"They were honestly remarkably similar. As far as I can tell, nothing was substantially different between Ellsworth's translation and Fable's. I think the prose in Ellsworth's translation was slightly better, but Fable's was actually perfectly readable. Again, I don't speak French, so I can't say for certain, but I don't believe I would have had a significantly different experience if I had read Fable's version instead of Ellsworth's.

"It is possible (and probable) that this is partly a self-fulfilling prophecy; Fable may have been trained using Ellsworth's translation and can therefore draw directly from it. Unfortunately, since I don't speak any language other than English, there is a sort of vicious circle: the only way to compare the fidelity of a translation is to compare it to other translations, but if other translations already exist, that will likely influence the results, and if a translation doesn't exist yet, I have no way of verifying it.

"I am going to continue reading Ellsworth's translations for the following stories simply because it feels more canonical to me, and as I said, I think the prose was slightly better."
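
For anyone who wants to quantify the drift instead of reading both versions side by side, here is a minimal sketch using Python's standard `difflib` to diff the two texts word by word. The helper names are my own invention, and the sample strings are just the closing clause of the first paragraph before and after the round trip. Word-level similarity says nothing about whether the meaning survived, so treat it as a way to find spots worth a human look.

```python
import difflib
import re

def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"\w+(?:'\w+)?", text.lower())

def round_trip_report(original, round_tripped):
    """Return (similarity ratio, list of (original words, new words) changes)."""
    a, b = tokenize(original), tokenize(round_tripped)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    changes = [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
    return matcher.ratio(), changes

original = "or rather it might end up being more about auditing"
round_tripped = "or rather, it may become more a matter of revision"

ratio, changes = round_trip_report(original, round_tripped)
print(f"word-level similarity: {ratio:.2f}")
for old, new in changes:
    print(f"  {old!r} -> {new!r}")
```

Running it over the whole comment rather than one clause would list every span that changed, including the "auditing" to "revision" shift mentioned above.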


This is terrible. I never use em dashes!

I'm sorry but it's just such a glaring caveat. And the fact that you don't speak French...

As somebody who regularly reads translated works, including the occasional machine translation (MTL), they (MTL) suck. You got a hugely biased result, which you recognize.

Translation is hard. If you're familiar with reading translations from specific languages, MTL works have a very specific smell to them. It's a bit hard to describe, but it's there. A good translation is miles (kilometers, for those outside of the US) above MTL.

That's not to say that the latest LLMs won't have better translation abilities, but they are generally crap currently. Maybe they are fine for something very short, but absolutely not for longer content.


I read genres where MTL is somewhat commonly used. But good quality human translations take remarkable effort. And even artistic choices. Like choices between transliterating and translating. Or maybe in some cases just doing both for a single name or term. And then keeping these choices consistent over substantial works.

And it is not like transliteration is a consistent thing. Some cases would prefer the old way. Or an existing, already common one. Even across entirely different works from different authors.


It definitely takes a lot of work. I've read that it takes a good writer themselves to translate well, since it's such an artistic endeavor.

> As far as I could tell, nothing was substantially different between the Ellsworth translation and the Fable translation.

Crucially the full translation was part of ChatGPT’s training set. Recall is a pretty solved problem in machine learning.

How well does it translate a French novel published yesterday? Where neither the original novel nor any translations are in the training set yet? Or might not even exist!

I tried asking ChatGPT to translate a letter I wrote in Slovenian this weekend. It got the general gist but missed a lot of the nuance. Completely missed several of the little touches of tone where the right choice of synonym conveys a whole bunch of information.


Did no one actually finish reading my comment?

I feel like that wasn’t there when I started writing my comment. I also have a bad habit of quickly posting and then adding over a few minutes.

Glad we agree :)


Guess I have no way of proving it, but I pinky swear that I didn't edit it in later!

But yeah, I broadly do agree; if I read other languages I could find a book that hadn't been thoroughly translated to English and then I could give a proper analysis on how good the translation is, but since I'm a very stereotypical American I know exactly one language (and sometimes my comprehension of even that is questionable).


> I could find a book that hadn't been thoroughly translated to English and then I could give a proper analysis on how good the translation is, but since I'm a very stereotypical American I know exactly one language

So you actually cannot give a proper analysis.


Welcome to the internet

> Fable might have been trained using Ellsworth's translation and as such it's very directly able to crib from it

The `cp` program on my computer also has the remarkable ability to produce a faithful translation of The Three Musketeers when provided one as input.


Not necessarily. If you are using macOS and APFS, it will just make a link, it won't actually make a copy.

> Again, I don't speak French so I cannot say for sure

This reminds me of the adage that ChatGPT is really great at everything except my own work.


Yeah, that's why I put the caveat in there. I have no real way to verify the result outside of checking against "known good" translations, though if the known-good translation exists then there's not exactly a lot of reason to do the AI translation in the first place.

I suspect if I knew another language I would be able to find errors in the translation.


Yes, it is another variation on the Gell-Mann Amnesia Effect. I have a number of non-developers in my circle of friends who think Claude is about to put me out of work. They think it is just a great tool for them, not a replacement. Of course!

> I have to say that AI translation has gotten so good

> I do not speak any language outside of English

So you are entirely unqualified to judge this, and you acknowledge yourself that your test is flawed to the point of being completely useless. Yet you make grandiose statements about the quality.


> but I have to say that AI translation has gotten so good that I'm not sure how much longer translation work will be there, or rather it might end up being more about auditing

It's functional? I wouldn't say it's poetic, I wouldn't want any AI translator translating art, like say a book or poem, I'd be so uncertain that it would correctly bridge the concepts

A good translator can make stylistic choices that elevate the work and make it fit in their language

(Having read lots of well translated manga and anime, also from what I understand there's a few books I've been told by my bilingual friend are just chef's kiss quality translations)

Considering translating meaningful art is of some value, on that score I don't think we're there yet



I had the suspicion that this was more of a problem of missing context than lacking metalinguistic abilities. A text can be translated as "what's the meaning of the words" and "what would a person/character say in another language in the same situation", and it's not in the prompt which one the user wants, but only in their head.

So I asked my free ChatGPT, mentioning that it is for a book, but it failed too:

> For a book, a natural English translation would be:

> “Just three words: you are not alone.”

https://chatgpt.com/share/6a2d06a3-a3b4-83ed-9e0a-8ec07e05e3...

It even repeated the context. So it seems to me now that it indeed still lacks metalinguistic abilities. (I don't think that this proves anything meaningful about AI.)


I wonder if “Just 3 words: you’re not alone” would have been acceptable. :)

The Empire Strikes Back: "I'm your dad."

That's still 4 words, imo.

Honestly, translations of fiction are themselves creative works, and the translator needs to really understand both cultures and needs to write cohesively throughout the work. I'm not sure this is even really a question of "can it translate" so much as "can it create a good work of fiction" which is a much higher bar. So maybe the model can mimic the style (especially given that it was probably trained on existing translations) but could it really do so from scratch in a way that is actually compelling? I'm not so sure.

Of course as for the poor OP... is this a majority of what working translators are paid to do?

I suspect a lot of translation is just grunt work - technical and business documents. The lack of a cohesive voice with considered style is perhaps not really much of an issue in those. The expectations are just much lower; text that conveys the basic meaning is a much lower bar to clear.

She's probably better than a bot at that stuff, at least for now, but my concern is that it won't be "enough" better for businesses to justify her continued employment. And this is my general feeling about this stuff across society, in basically all domains.


I see the difficulties more in other areas, such as technical translations, specialist books, user manuals, and translating UIs, where contextual information and a back and forth with the client is needed to clarify details, and (for user manuals and UIs) the translator has to put themselves in the mind of the user and has to consider the possible contexts and use cases.

You're very likely to get a somewhat circular reference; the key (for me) is that for 90% of the usages, "standard translation LLMs" are just fine - I still recommend a translator but they're more of a proof-reader for both languages, catching where something slipped through.

This is sort of missing the point: people who don't deal with linguistics don't understand that there are multiple types of translation. There's word for word (which is what you're talking about) and sense for sense. If you let an LLM do all of your translation you're letting it interpret huge amounts of intent and context it doesn't (and probably can't) access. The ways in which this impacts the translation will forever be unknown to you and in the worst case lost forever.

So I guess in the end it just matters how important the work is.


Actually I was talking about tonally as well.

A raw "word for word" translation (which I also tried) made the story somewhat hard to follow and very dry, but just asking it to keep the same kind of jovial swashbuckling tone of the original made something pretty similar to Ellsworth's translation.

Again, before someone decides to "correct" me on this, I am aware that it's very likely that the Ellsworth translations are part of the training set so it's not directly a fair comparison.


> If you let an LLM do all of your translation you're letting it interpret huge amounts of intent and context it doesn't (and probably can't) access.

Assuming lots of material local to the context one is wanting to translate is included, why couldn't it potentially access that additional context?


> If you let an LLM do all of your translation you're letting it interpret huge amounts of intent and context it doesn't (and probably can't) access.

What’s the intent and context that a human translator of a text is typically privy to that an LLM is not?


Your example doesn't prove your point because you can't even tell it's translation, but also because you said it was not better and are not using it.

LLMs are now being aggressively manipulated for propaganda purposes. Powerful people have realized that people believe LLMs, and treat them as authoritative sources of fact.

The number of lies, lies by omission, deceptive distortions, and fallacious argument tactics they generate is absurd, and increasing rapidly. Translation, when done as a service you are paid for, can't be entrusted to propaganda bots.


Do you have examples?

Tons. To pick the most recent example:

I was asking about some allegations relating to the Epstein files, and it used the slogan "Satanic Panic" in a weird way that gave me a vibe of discrediting victims. I'm too young to know much about it, so I asked some things about it. It explained the McMartin case in a way that seemed too absurd to be real. I asked some follow-up questions about what the strongest evidence was, and how it was explained.

The first deception was omission. Initially, it didn't even mention what was arguably the most significant evidence in the case, which was the presence of tunnels under the school. ChatGPT mentioned the tunnels, and how an archaeologist named E. Gary Stickel found evidence of tunnels. Here's what it said about that:

> However, that conclusion has been repeatedly challenged and is not treated as settled fact in the academic or forensic archaeology literature.

> Other archaeologists and later reviewers reinterpreted the same physical findings differently. One major counter-analysis (W. Joseph Wyatt’s review) argued that what Stickel identified as tunnels was more plausibly explained as pre-existing trash pits and construction-related disturbance from before the school was built in the 1960s.

The first lie was by omission: it didn't even mention this when I asked about the most important evidence. The next misleading piece was the framing. Dr. Stickel is a PhD archaeologist, and doing this sort of analysis is his area of expertise. He used nine criteria as a basis for determining the presence of tunnels, and all nine were met. He found "conclusive" evidence of tunnels, and that they matched the expected locations described by the victims. Dr. Stickel was the only expert to review the site before significant construction made such an analysis impossible.

The "major counter-analysis (W. Joseph Wyatt’s review)" was done by psychologist Joseph Wyatt, who never physically visited the site, and who is not an expert in anything related even loosely to archaeology. ChatGPT presented this guy in a way that made it seem that Stickel had been debunked by a comparable expert.


I learned last year that "translation" can be a very tricky thing. Because there's never a one-to-one correlation between one language's words, phrases, structures and metaphors, and another language's equivalent stuff. And LLM translations may not be the actual translation you want, or need.

I wrote up my experiences of translating Lorca and Cavafy poems here[1]. tl;dr: I have developed a massive new respect for translators; however much they're being paid, they probably need to be paid more!

[1] - https://rikverse2020.rikweb.org.uk/blog/adventures-in-poetry...


> … considered one of the more accurate translations of the work.

I think you’re missing a big point of translating literary works. A purely “accurate”, phrase-by-phrase translation is often not very good; the actual literary style, the feeling and the allusions and references, often get lost that way. A good translation of literary work requires a lot of deliberate choices by the translator to deviate from literal translations in ways that convey the style of the original, or an extra layer of meaning that would be lost by an “accurate” translation of a phrase. Also, being consistent with these choices matters a lot, which OP claims LLMs are less good at.



Already mentioned in the comment lol.

This moment is coming for software developers too

Yeah almost certainly, especially the ones who made a career out of "copypaste from StackOverflow", which is most engineers.

But even the good engineers should likely be a little worried.


why would it be different for other people if you already said senior level is not writing code but planning things out?

is there something about planning that LLMs cannot do being your crux of the argument?

what do you believe about your jobs or function that you think will be immune from AI replacing you?

If anything it seems your role is not that dissimilar to those translating languages or business requirements.

I am struggling to see what it is about this planning you do that cannot be done by AI. It seems to me that's not where the moat is; rather, I find the middleman jobs to be the most vulnerable to AI immediately, much more than people writing code.

Because at least someone is watching the outputs from AI and understands the code and can communicate it easily back to the stakeholders without the middleman gatekeeping and applying their "taste".

I have a feeling that anyone in your shoes is going to be working with code soon or they won't have much to offer anymore to the business. A stakeholder could easily replace the middle layer with AI and even as a business owner myself I do not see any need to add any more humans at that layer anymore unless they write code.


More specifically, it is coming for coders. If you make your living by banging out lines of code all day, then you may want to be looking at adjusting your career trajectory. But if that is your job, you are either very junior, or a bit foolish for getting into that situation.

so what is a software developer doing if writing code is not part of their job

I don't see how not writing code is being offered as a moat, it seems like that is just translating business/stakeholder requirements to architecture/biz processes which is exactly the type of low hanging fruit that AI will capture first

or was it your point that the position sits closer to the stakeholders (relatively compared to those lifting) thus immune from replacement by AI

or is your argument that your taste is so exquisite that no AI will be able to match it like it already has with software so far and it will not improve beyond the current state


I kind of view it this way. Yes, non-technical people can prompt and write code. But technical people can certainly also do the job of product people. So then, who would you want to do the end-to-end? A SWE, or a product person? Probably a SWE.

As software engineers, our job will expand horizontally. We will shift left, and right. But that’s really fine, because I’ve found SWE can be really good at that.

They’re good at requirements engineering. They’re good at quality assurance. They’re good at technical support. So, why not pay a SWE to be that person?

Or, at least, some SWEs are good at that. The ones that aren’t will struggle I think.


If you get to senior level then most of your job probably is not writing code, but planning things out. The code is largely an implementation detail.

At least that's how it was for me, maybe other peoples' careers are different.


Yes, my career has been different. At my workplaces seniors still have to code because they don't want to hire juniors

The "planning things out" has moved to another layer, called "architects"


> If you get to senior level then most of your job probably is not writing code, but planning things out.

If they're so good at banging out code now, they're coming for that too, you know.


I don't necessarily disagree, but there's gotta be a name for some kind of "infinite extrapolation" fallacy, where you assume that the current rate of progress will continue indefinitely.

That might happen, but I don't think it's implied, at least given literally every other bit of technology that has ever happened in history ever.


> I don't necessarily disagree, but there's gotta be a name for some kind of "infinite extrapolation" fallacy, where you assume that the current rate of progress will continue indefinitely.

I am not assuming they'll continue indefinitely, but it's a small step from writing code to planning out the code to write, and another small step from planning a coding project to planning a software project, etc.

These are all small steps, and because the act of specification + planning paid less than specification + planning + programming, what reason do you have for thinking that specification + planning is valuable enough to keep the salaries the same as specification + planning + programming?


I think with a fixed size problem, no we wouldn't be able to demand the same salaries that we get now.

I dispute that the problem is fixed size. The people who are senior engineers now will learn how to think at a higher level with the AI models.


> The people who are senior engineers now will learn how to think at a higher level with the AI models.

I think my argument is that, if they were going to do that, they would have done so by now - they already say that actual coding was only a small percentage of their work anyway.


Same thing architects do if drawing lines gets automated: architecture.

Would you trust living in a high rise designed by AI?

Designing a system that survives production is the job.


So what is a lab researcher doing if typing articles is not part of the job?

Well--well look. I already told you: I deal with the god damn customers so the engineers don't have to. I have people skills; I am good at dealing with people. Can't you understand that? What the hell is wrong with you people?

https://www.reddit.com/r/ProductManagement/comments/uy1ot1/w...


I think this collapses a global, complex hierarchy of software engineering workers into a single monolith and serves only to advertise for frontier LLM providers. The point where you no longer need engineers is not going to be reached by making LLMs better and better.

I think it is going to be a long time before all of the obscure knowledge of a decent software developer can be completely replaced by AI. Though the job is going to change beyond recognition. It already has in many ways.

But not before a huge crash in optimism about their capabilities. Specifically wrt accuracy, reliability, efficiency, and organization/architecture.


