Hacker News

Thanks for the reply.

> This can be mostly solved by context. T

Right, as I said. It's not too bad, but it's easier when you can just translate word for word.

かみにいのる gives me "pray to bite" on Google Translate; as you say, it suggests the right kanji, but that's precisely my point. It needs you to disambiguate before it can be sure.

I'm not saying this is an insurmountable problem, I'm contrasting the difficulty.

> There are very few situations in normal speech where you'd hear "kami" and not know if they're talking about 神 (god) or 髪 (hair).

I ran into it recently in music. Babymetal has a song that starts:

伝説の黒髪を華麗に乱し

When you listen to the song, it'd be easy to momentarily think she might be saying "black god" or "black paper", since the pronunciation isn't identical but it's pretty close. Since I'm human, I figured out pretty quickly what she was saying, but in the equivalent English phrase there's no such issue: "black hair" and "black paper" sound nothing alike.

This is admittedly not "normal speech", but I could see it popping up there too.

I've seen confusion over 神/髪 in other situations too. Those were deliberate puns, so they probably don't count, but they do demonstrate that it's possible to have situations where it's at least somewhat ambiguous.

> I find the whole homophone thing a bit overrated

I'm sure it's exaggerated for me because my Japanese is pretty atrocious, but I think my point is valid: any time a language has homophones, it's more difficult to set up a system that listens to speech and translates it. Japanese seems to have more homophones than English, and if that's true, it is proportionally more difficult to translate in that regard.
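
To make the "it needs you to disambiguate" point concrete, here's a toy sketch in Python of what resolving かみ by context might look like. Everything in it is an assumption for illustration: the cue words are hand-picked by me, `CANDIDATES` and `disambiguate` are made-up names, and a real speech or translation system would use a statistical language model rather than a lookup like this.

```python
# Toy illustration only: the cue words are hand-picked assumptions,
# not data from any real dictionary or speech system.
CANDIDATES = {
    "かみ": {
        "神": {"いのる", "じんじゃ"},       # god: pray, shrine
        "髪": {"くろ", "みだす"},           # hair: black, dishevel
        "紙": {"くろ", "かく", "おる"},     # paper: black, write, fold
    }
}

def disambiguate(reading, context_words):
    """Return every kanji candidate that ties for the most context cues."""
    context = set(context_words)
    scores = {
        kanji: len(cues & context)
        for kanji, cues in CANDIDATES[reading].items()
    }
    best = max(scores.values())
    return [kanji for kanji, score in scores.items() if score == best]

print(disambiguate("かみ", ["いのる"]))          # one clear winner
print(disambiguate("かみ", ["くろ"]))            # "black" alone leaves a tie
print(disambiguate("かみ", ["くろ", "みだす"]))  # more context breaks the tie
print(disambiguate("かみ", []))                  # no context: everything ties
```

With enough context there's a single winner, but with "black" alone the sketch can't choose between hair and paper, which is the extra step an English-only system never has to take for those words.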




Also, from what I understand, certain homophones are differentiated in practice by different accenting (raising/lowering) in speech. This is, however, region-specific.

-----


This is known as pitch accent: http://en.wikipedia.org/wiki/Japanese_pitch_accent

-----


> I'm not saying this is an insurmountable problem, I'm contrasting the difficulty.

Fair enough. I'm not claiming there's no homophone ambiguity either, just that it's a relatively easy problem compared to, say, the stuff Microsoft is doing.

Yeah, when I say "normal speech" I don't include pop music lyrics.

-----



