
I don't get why people are so optimistic about machine translation. Computers can get explicit meaning across – that's obvious to anyone who understands linear algebra, information theory, and linguistics. But many aspects of translation (puns, tone, cultural context) aren't just about mapping from one vector space to another. A human, no matter how fluently bilingual, would have to think about the problem, and the current systems don't think.

If we keep working on them, these systems will likely get better and better at low-level translation (including translating idioms), but no machine translation system currently in existence could translate the 逆転裁判 games to Ace Attorney games. Perhaps computers can do it – I don't see a theoretical reason they shouldn't be able to – but it would take a fundamentally different approach.




Have you actually used GPT-4 for translation?

Seriously all this talk about only getting explicit meaning across would be easily dispelled in an afternoon if you only bothered to try.


Agreed. A lot of these responses read like they haven't actually tried it yet.

Which is also interesting. I myself actively put off trying it until I eventually gave in. It seems a lot of us are doing the same; maybe it's a case of "how good could it actually be?"


Not trying it yet is fine. Making declarative statements on a product you haven't even used is just absurd.

Dude clearly hasn't used GPT for translation before, and his next reply is telling me the ways GPT should fail based on his preconceived notions of its abilities. Except I have actually extensively tested (publicly too) LLMs for translation (even before GPT-4), and basically everything he says is just plain wrong.

I'll never understand why people behave like this.


Apparently GPT-4 can't handle "all this talk about only getting explicit meaning across would be easily dispelled in an afternoon if you only bothered to try.", which isn't as simple as "Ace Attorney", but I'd think it's still a bit of a stretch to say "everything he says is just plain wrong".


I think you're trying to translate to Japanese?

Anyway, this is what I got:

実際に翻訳のためにGPT-4を使ったことがありますか?本当に、使ってみたら簡単に明確な意味だけを伝えるという話は消えるでしょう。


aaand we got a new one... in chronological order:

  1) 本当に、明示的な意味だけを伝えるという話が、試してみるだけで簡単に解決できるなんて、冗談じゃないですか。 
  2) 本気で、伝えたい意味だけを伝えるという話は、ちょっと試してみれば簡単に解決できると思うのですが。
  3) 本当に、試してみるだけで簡単に払拭できると思うのに、この「明確な意味だけが伝わる」話ばかりで。 
  4) 本当に、使ってみたら簡単に明確な意味だけを伝えるという話は消えるでしょう。 
1) is literally the opposite of the intent: it shrugs off the idea that the talk would clear up. 2) can be interpreted as someone discussing keeping the scope of a topic. 3) is not so literal, and also turns the sentence inside 「」 into a sort of imperative. 4) ... I'm not sure what it's trying to say ...

  本当に、/使ってみたら/簡単に/明確な/意味/だけ/を伝える/という話/は消えるでしょう。/
  "Really,/  if used   /simply/clear/meaning/only/is conveyed/that story/will disappear./"
... Machine translations used to be like that when I was installing game demos from CD.


I'm not sure what you tried to do? You tried to translate to another language, or?


GPT-4 can (1) translate, (2) plagiarise, and (3) feedback ("thinking out loud").

Its ability to feedback (3) allows it to execute algorithms, but only a certain class of algorithms. Without tailored prompting, it's further restricted to (a weak generalisation of) algorithms spelled out in its corpus. This is very cool, but this is a skill I possess too, so it's rarely useful to me.

Its ability to plagiarise (2) can make it seem like it has capacity that it doesn't possess, but it's usually possible to poke holes in that facade (if not even identify the sources it's plagiarising from!).

It is genuinely capable of explicit translation (1) – though a dedicated setup for translation will work better than ChatGPT-style prompting, even on the same model. A sufficiently-large, sufficiently well-trained model will be genuinely capable of translating idiomatic language (for known idioms), for the same reason it can translate grammatical structures (for known grammar).
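
For concreteness, a minimal sketch of what I mean by a "dedicated setup", as opposed to ChatGPT-style prompting: pin the model to the single task of translation and turn sampling temperature down. This assumes the OpenAI chat completions API; the model name, system prompt wording, and parameters are illustrative choices only, not a recommendation.

  # Minimal sketch: a translation-only setup instead of open-ended chat.
  # Assumes the openai Python package and an OPENAI_API_KEY in the environment.
  from openai import OpenAI

  client = OpenAI()

  SYSTEM_PROMPT = (
      "You are a professional Japanese-to-English translator. "
      "Translate the user's text faithfully, preserving tone and idiom. "
      "Output only the translation, with no commentary."
  )

  def translate(text: str) -> str:
      resp = client.chat.completions.create(
          model="gpt-4",
          temperature=0,  # keep the output as stable as possible
          messages=[
              {"role": "system", "content": SYSTEM_PROMPT},
              {"role": "user", "content": text},
          ],
      )
      return resp.choices[0].message.content

  print(translate("実際にGPT-4を翻訳に使ったことがありますか？"))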

It can only perform higher-level, "abstract" translations – like those necessary to translate a Phoenix Wright game – if it's overfit on a corpus where such translations exist. (https://xkcd.com/2048/ last graph) This is not a property you want from a translation model: it gives better results on some inputs, sure, and confident-seeming but very wrong results on other inputs. These are two sides of the same coin (2).

When the computer can't translate something, I want to be able to look at the result and go "this doesn't look right; I'll crack out a dictionary". I can't do that with GPT-4, because it doesn't give faithfully-literal translations and it isn't capable of giving complete translations correctly: it's not fit for this purpose.


Ok so you haven't used it then. I don't care about your whack theories on what it can and can't do. I care about results.

You're starting from weird assumptions that don't hold up on the capabilities of the model and then determining its abilities from there. It's extremely silly. Next time, use a product extensively for the specified task before you declare what it is and isn't good for.

Literally, everything you've said is just wrong. Can't generate "abstract" translations unless overfit. Lol okay. I've translated passages of fiction across multiple novels to test.


> Ok so you haven't used it then.

Not only have I used it, I have made several accurate advance predictions about its behaviour and capabilities – some before GPT-4 was even published. I can model these models well enough to fool GPT output detectors into thinking that I am a GPT model. (Give me a writing task that GPT-4 can't be prompted to perform, and I can prove that last fact to you.)

My theories aren't whack. Perhaps I'm not communicating my understanding very well? I'm not saying GPT-4 can't do anything I haven't listed, but that its ability is bounded by what's demonstrated in its corpus (2): the skill is not legitimately due to the model, and you should not expect a GPT-5 to be any better at the tasks. (In fact, it might well be worse: GPT-4 is worse than GPT-3 at some of these things.)


>Not only have I used it, I have made several accurate advance predictions about its behaviour and capabilities – some before GPT-4 was even published.

No, you actually haven't. That's what I'm trying to tell you. Your advance predictions are not accurate. What you imagine to be problems are not problems. Your limits are not limits. You say it can't make good abstract translations unless overfit to the translation; that's just false. I know because I've tested translation extensively on numerous novels and other works.

>I can model these models well enough to fool GPT output detectors into thinking that I am a GPT model. (Give me a writing task that GPT-4 can't be prompted to perform, and I can prove that last fact to you.)

Lmao. Okay mate. The notoriously unreliable GPT detectors with more false positives than can be counted. It's really funny you think this is an achievement.

>(In fact, it might well be worse: GPT-4 is worse than GPT-3 at some of these things.)

What is 4 worse than 3 at? Give me something that is benchmarkable and can be tested.


>no machine translation system currently in existence could translate the 逆転裁判 games to Ace Attorney games

Maybe it's already in the training set, but GPT-4 does give that exact translation.

I've found that GPT-4 is exceptionally good at translating idioms and other big picture translation issues. Where it occasionally makes mistakes is with small grammatical and word order issues that previous tools do tend to get right.


> Maybe it's already in the training set, but GPT-4 does give that exact translation.

The corpus includes Wikipedia, so yes, it's in there. That's the kind of thing I'd expect it to be good at, along with idioms, when the model gets large enough.

I meant that no machine translation system could translate the games. Thanks to an early localisation decision, you have to do more than just translate words into words for this series, making it a hard problem: https://en.wikipedia.org/wiki/Phoenix_Wright:_Ace_Attorney

> While the original version of the game takes place in Japan, the localization is set in the United States; this became an issue when localizing later games, where the Japanese setting was more obvious.

Among other things, translators have to choose which Japanese elements to keep and which to replace with US equivalents, while maintaining internal consistency with the localisation decisions of previous games. Doing a good job requires more than just linguistic competence: there's nothing you could put in the corpus to give a GPT-style system the ability to perform this task.


Can you try this[0]? I have no access to the -4...

  Have you actually used GPT-4 for translation? Seriously all this talk about only getting explicit meaning across would be easily dispelled in an afternoon if you only bothered to try. 
Bing Chat:

  GPT-4を翻訳に使用したことがありますか?本当に明示的な意味しか伝えられないという話は、試してみれば午後には簡単に反証できます。
  (Have you utilized GPT-4 for translations? The story that only really explicit meaning can be conveyed, can be easily disproved by afternoon if tried.) 
Google:

  実際にGPT-4を翻訳に使ったことはありますか? 真剣に、明示的な意味だけを理解することについてのこのすべての話は、あなたが試してみるだけなら、午後には簡単に払拭されるでしょう.
  (Have you actually used GPT-4 for translation? Seriously, This stories of all about understanding solely explicit meanings are, if it is only for you to try, will be easily swept away by afternoon.)
DeepL:

  実際にGPT-4を使って翻訳したことがあるのですか?明示的な意味しか伝わらないという話は、やってみようと思えば、午後には簡単に払拭されるはずです。
  (Do you have experience of actually translating using GPT-4? The story that only explicit meaning is conveyed, if so desired, can be easily swept away by afternoon)
If I were to do it:

  GPT-4を翻訳に使ったことがあって言ってる? 真面目に言って、表層的な意味しか取れないとかないって暇な時にやってみれば分かると思うんだけど。
  (Are you saying having used GPT-4 for translation? Seriously speaking, I think that it only gets superficial meaning isn't [true] if [you] would try [it] when [you'd] have time.)
0: https://news.ycombinator.com/item?id=35530380


GPT4

実際にGPT-4を翻訳に使ったことがありますか?本当に、明示的な意味だけを伝えるという話が、試してみるだけで簡単に解決できるなんて、冗談じゃないですか。


WHAT. It's got the second half wrong.

Google: Have you actually used GPT-4 for translation? Really, it's a joke that the story of only conveying explicit meaning can be easily solved by just trying.

DeepL: Have you actually used GPT-4 for translation? Really, it's a joke that all this talk about conveying only explicit meaning can be easily solved by just trying it out.

Mine: Have you actually used GPT-4 for translations? That you can really just, try and easily solve that story that to convey explicit meaning, is such a joke.


Here's a couple more from GPT4 (since it's random every time because of temperature; there's a quick sketch of why after these):

GPT-4を翻訳に実際に使ったことがありますか?本気で、伝えたい意味だけを伝えるという話は、ちょっと試してみれば簡単に解決できると思うのですが。

実際にGPT-4を翻訳に使ったことがありますか?本当に、試してみるだけで簡単に払拭できると思うのに、この「明確な意味だけが伝わる」話ばかりで。
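
To make the "random every time because of temperature" part concrete, here's a toy next-token sampler (a sketch in Python; the candidate tokens and scores are made up, and real models do this over tens of thousands of tokens at every step):

  # Toy illustration of sampling temperature: the same scores give varying
  # picks at temperature 1.0 and always the same pick at temperature 0.
  import math, random

  def sample(logits, temperature):
      if temperature == 0:
          return max(logits, key=logits.get)  # greedy: always the top-scored token
      weights = {t: math.exp(score / temperature) for t, score in logits.items()}
      r = random.random() * sum(weights.values())
      for token, w in weights.items():
          r -= w
          if r <= 0:
              return token

  logits = {"払拭": 2.0, "解決": 1.8, "消える": 1.5}  # made-up scores for candidate tokens
  print([sample(logits, 1.0) for _ in range(5)])  # varies from run to run
  print([sample(logits, 0.0) for _ in range(5)])  # always the same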


  本気で、伝えたい意味だけを伝えるという話は、ちょっと試してみれば簡単に解決できると思うのですが。
"In seriousness, I think the story that [subject] tells the meaning [it/he/they] wants to tell, should be easily solvable by trying a bit."

or "Seriously, the story of telling the meaning [subject] wants to tell, should be easily solvable by trying a bit."

  本当に、試してみるだけで簡単に払拭できると思うのに、この「明確な意味だけが伝わる」話ばかりで。
 
"Really, I think it'll be easily swept away by just trying, but there are so much of this 'only clear meaning is conveyed' stories."

I'm almost feeling that GPT-4 should be eligible for human rights. It's especially astonishing that it dropped the explicit mention of "afternoon", which doesn't work well. But it's also interesting that it fails to keep the intent of the whole sentence, unlike 3.5 and even more primitive NN translation engines.


> But also interesting it's failing to keep the intent of the whole sentence unlike 3.5

It's because it "knows too much". To anthropomorphise a little: it follows its "expectations" of what should be there. To anthropomorphise less: GPT-4 is overfitted. GPT-style language models are pretty amazing, but they're not a complete explanation of human language, and can't quite represent it properly.

> I'm almost feeling that GPT-4 should be eligible for human rights,

Like, UDHR rights? How would that work, exactly?

---

(I've run into the Hacker News rate limit, so posting here.) For anyone who wants an example of "non-obvious meaning" to play with. From The Bells of Saint John (Doctor Who episode, https://chakoteya.net/DoctorWho/33-7.htm):

> CLARA [OC]: It's gone, the internet.

> CLARA: Can't find it anywhere. Where is it?

> DOCTOR: The internet?

> CLARA [OC]: Yes, the internet.

> CLARA: Why don't I have the internet?

> DOCTOR: It's twelve oh seven.

> CLARA: I've got half past three. Am I phoning a different time zone?

> DOCTOR: Yeah, you really sort of are.

> CLARA [OC]: Will it show up on the bill?

> DOCTOR: Oh, I dread to think.


(GPT4)

In this script from a Doctor Who episode, Clara and the Doctor are having a conversation about the internet. Doctor Who is a British science fiction television series that follows the adventures of the Doctor, a Time Lord from the planet Gallifrey, who travels through time and space in the TARDIS, a time-traveling spaceship.

Clara, the Doctor's companion, is trying to access the internet but is unable to find it. She asks the Doctor about its whereabouts, and the Doctor seems to be confused by the question, as the internet is not something that can be physically found.

The Doctor then mentions the time as "twelve oh seven," while Clara's clock shows "half past three." This discrepancy in time indicates that they are likely in different time zones, as the Doctor implies. In the context of Doctor Who, this could also mean they are in different points in time, since the Doctor can travel through time.

Clara is concerned about whether the time difference will affect her phone bill, to which the Doctor replies that he dreads to think about the potential cost. This adds a bit of humor to the scene, as the Doctor often has a nonchalant attitude towards everyday human concerns.

Overall, this script showcases the humorous and whimsical nature of Doctor Who, with the characters engaging in a lighthearted conversation that intertwines elements of science fiction and everyday life.


The middle three paragraphs are completely wrong. Clara isn't the Doctor's companion in this episode, that's not why the Doctor is confused, the Doctor is giving a year (not a time), Clara doesn't know about “the time difference” (indeed, that is the joke)…

That aside: I was suggesting this as an example of something existing machine translation systems can't translate. The 1207 / 12:07 wordplay could be “understood” by the model (I'm disappointed, albeit not very surprised, that GPT-4 didn't), but producing an adequate translation in a case like this requires actual thought and consideration.


Yes, I also expected GPT4 to get the joke, as I've seen it understand similar jokes.


(GPT-4 plus a regular expression)

In this script from a Garfield comic, Jon and Garfield are having a conversation about the internet. Garfield is an American comic strip and multimedia franchise that follows the adventures of Garfield, a cat from the planet Earth, who enjoys lasagna in Jon Arbuckle's house, a suburban domicile.

Jon, Garfield's owner, is trying to access the internet but is unable to find it. He asks Garfield about its whereabouts, and Garfield seems to be confused by the question, as the internet is not something that can be physically found.

Garfield then mentions the time as "twelve oh seven," while Jon's clock shows "half past three." This discrepancy in time indicates that they are likely in different time zones, as Garfield implies. In the context of Garfield, this could also mean Jon's clock is wrong, since Garfield is usually right.

Jon is concerned about whether the time difference will affect his phone bill, to which Garfield replies that he dreads to think about the potential cost. This adds a bit of humor to the scene, as Garfield often has a nonchalant attitude towards everyday human concerns.

Overall, this script showcases the humorous and whimsical nature of Garfield, with the characters engaging in a lighthearted conversation that intertwines elements of fantasy and everyday life.
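
For what it's worth, a guess at the "plus a regular expression" part: run GPT-4's Doctor Who summary through a handful of string substitutions. The mapping below is a hypothetical reconstruction for illustration, not the one actually used.

  # Sketch: mechanically swapping Doctor Who entities for Garfield ones.
  # The SWAPS table is guesswork; longer keys come first so they win the match.
  import re

  SWAPS = {
      "a Doctor Who episode": "a Garfield comic",
      "Doctor Who": "Garfield",
      "the Doctor's companion": "Garfield's owner",
      "the Doctor": "Garfield",
      "Clara": "Jon",
      "Time Lord from the planet Gallifrey": "cat from the planet Earth",
  }

  pattern = re.compile("|".join(re.escape(k) for k in SWAPS))

  def garfieldify(text: str) -> str:
      return pattern.sub(lambda m: SWAPS[m.group(0)], text)

  print(garfieldify("Clara and the Doctor are having a conversation about the internet."))
  # -> "Jon and Garfield are having a conversation about the internet."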



