
My current client is specialising in speech recognition, speech synthesis and automatic translation. They have something similar, focused on enterprise customers. I find this subject very interesting.

I am a Ruby guy and I only marginally get in contact with their C++ code, but from what I learned so far this stuff is extremely memory and CPU hungry. It also depends on having been fed the right amounts of input. That's why Google Translate is so good. They have tons and tons of data from all the websites they parse, and in many cases the content can be obtained in different languages. Corporate pages are often translated paragraph by paragraph by humans which results in perfect raw data to train these algorithms. Also for example all documents that the European Parliament produces are translated into the languages of all member states.

Everything that has to do with translation has to do with context. I think the software right now is as smart as a six year old kid, except that it has a much bigger vocabulary. But if you say "The process has stalled. Let's kill it." it probably only makes sense if you know you are talking about computers.

It's hard to imagine that computers one day might really understand everything we say. But just by using Google Translate I think they really might. Это является удивительным. (I don't speak Russian. I hope I didn't insult anyone now. ;))




> Corporate pages are often translated paragraph by paragraph by humans which results in perfect raw data to train these algorithms.

Actually this may be one of the reasons why Google's Japanese translations are so terrible. The why isn't really relevant here[0] (perhaps you already know anyway) but there are times when the raw data becomes the most misleading.

[0] Obviously I still mean those actually translating by hand, not the companies which just throw all of their material into Google Translate and consider it a finely proofed document. There are plenty of the latter which makes for an amusing loop in the system.


Google Translate's English to Chinese is plain awful as well. At my last job, many of the employees only spoke Mandarin. If I had to communicate with them, I would use Google Translate and copy/paste both the Chinese and the English for CYA. There were two people who spoke good English and Mandarin, so they were able to correctly translate the meaning. From what they told me, the translations were about 60 to 70% accurate and often comical. I guess the translations never asked my coworkers on a date or cursed them out, but ultimately they didn't serve much purpose beyond signaling urgency: I only wrote when something was important, so a message from me alerted the recipients that there was a pressing issue. It was better than zero communication. A lot of work still has to be done in the field.

I also disagree that human translations on web pages are a good source. One only needs to find Asian companies with direct translations to see how bad the human translations are. Just as those companies generally don't have good English writers, I think it is likely that American companies don't have outstanding Japanese / Chinese / et al. writers.


I speak bad Japanese after having worked there for a while. But Japanese is extremely context dependent. They leave out a lot of words and you just deduce the meaning. (Don't hit me if that's wrong, but that is what I remember thinking back when I was learning it.)


The way you phrase that is a bit misleading. It's not that Japanese speech has any less info than most languages, it's just that you can set topics (add state to a stack, to put it in geek terms) that carry over into subsequent phrases. The correct translation of an individual sentence may thus depend on that previous context.

A simple but famous case is "Watashi wa hamburger desu". The sentence has no subject, so with no context that would get translated as "I am a hamburger", but if you fill in a previously defined subject, it could be "I [order] a hamburger", "My [favorite food] is a hamburger", etc.
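To make the stack analogy concrete, here is a rough Python sketch. The topic labels and English templates are made up for illustration and are not how any real translator works; the only point is that the same sentence comes out differently depending on what topic was set earlier.

```python
# Toy model of topic carry-over. The topic names and templates below are
# invented for this example; a real system would not use a lookup table.
TEMPLATES = {
    None: "I am a {noun}",                       # no topic has been set
    "order": "I'll have a {noun}",               # someone asked what to order
    "favorite food": "My favorite food is a {noun}",
}

def translate_x_wa_y(noun, topic_stack):
    """Render 'watashi wa <noun> desu' using the most recent topic, if any."""
    topic = topic_stack[-1] if topic_stack else None
    template = TEMPLATES.get(topic, TEMPLATES[None])
    return template.format(noun=noun)

topics = []                                       # nothing said yet
no_context = translate_x_wa_y("hamburger", topics)

topics.append("order")                            # "What will you have, Bob?"
ordering = translate_x_wa_y("hamburger", topics)

print(no_context)
print(ordering)
```

With an empty stack the function falls back to the literal reading; once "order" is pushed, the identical input is rendered as an order.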


To clarify further for those unfamiliar with the language, a super literal translation of "watashi wa hanbaga desu" is something like:

    (concerning/as for) myself, (it's) hamburger.
On its own, if Bob says this, it basically comes out as "I (am) (a) hamburger".

However, if Sally has just said something like "I'll have a salad...what about you Bob?" then it makes sense as Bob's order is the implied subject and it becomes "My order is hamburger." or "I'll have hamburger."

I know very little about linguistics but I think there are a bunch of other things that make Japanese-English difficult to translate via software as well.

There is the whole aspect of culture embedded in it. あなた could mean "you" or something like "dear/sweetie" depending on the context. There is also the question of how to translate "you" (etc.) in English text to Japanese, as you have to consider politeness and so on. If you are just translating a business web page it's probably safe to stick with polite forms, but if you are translating, say, the dialogue in a TV show, you want to preserve the tone of the characters.

In terms of voice recognition, Japanese seems to have a lot of homophones to me when compared to English. It may just be my imagination, but here are some I ran into recently:

舶, 錘, 頭, 摘む, 積む, 詰む, and 紡錘 are all pronounced つむ and mean completely different things. Or 六, 碌, and 録 are pronounced ろく. 上, 神, 紙, 髪, and 加味 are all pronounced かみ.

I seem to run into things like that regularly; when just hearing it spoken you need the context to figure out what they mean.


> 上, 神, 紙, 髪, and 加味 are all pronounced かみ.

This can be mostly solved by context. There are very few situations in normal speech where you'd hear "kami" and not know if they're talking about 神 (god) or 髪 (hair). Also, it's not particularly hard to code that knowledge. E.g. try かみにいのる (pray to god) and かみのけをきった (cut hair) on Google Translate. It will suggest the correct kanji in both cases.
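A very rough Python sketch of what "coding that knowledge" could look like. The romanized cue words are hand-picked for this example (inoru = pray, jinja = shrine, kiru = cut, ke = hair, kaku = write, oru = fold), and this is only an assumption about the general idea, not how Google Translate actually does it.

```python
# Hypothetical cue lists for the reading "kami". A real system would learn
# these associations from data rather than hard-coding them.
KAMI_SENSES = {
    "神 (god)": {"inoru", "jinja"},
    "髪 (hair)": {"kiru", "ke"},
    "紙 (paper)": {"kaku", "oru"},
}

def disambiguate(context_words, senses=KAMI_SENSES):
    """Return the sense(s) whose cue words appear most often in the context.

    Returns an empty list when no cue word matches, i.e. the context
    gives no help and the word stays ambiguous.
    """
    words = set(context_words)
    scores = {sense: len(cues & words) for sense, cues in senses.items()}
    best = max(scores.values())
    if best == 0:
        return []
    return [sense for sense, score in scores.items() if score == best]

pray = disambiguate(["kami", "ni", "inoru"])        # かみにいのる
cut = disambiguate(["kami", "no", "ke", "kiru"])    # かみのけをきった, roughly
bare = disambiguate(["kami"])                        # no context at all

print(pray, cut, bare)
```

The empty result for the bare word is the interesting case: without surrounding cues, nothing in this approach can pick a kanji.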

Anyway, I'm not a native Japanese speaker, but I find the whole homophone thing a bit overrated. As far as I can recall, the only pairs of homophones that cause trouble in normal speech are 科学/化学 (both pronounced kagaku, meaning science/chemistry) and 私立/市立 (shiritsu, private/municipal).


Thanks for the reply.

> This can be mostly solved by context.

Right, as I said. It's not too bad, but it's easier when you can just translate word for word.

かみにいのる gives me "pray to bite" on Google Translate; as you say, it suggests the right kanji...but that's precisely my point. It needs you to disambiguate for it to be sure.

I'm not saying this is an insurmountable problem, I'm contrasting the difficulty.

> There are very few situations in normal speech where you'd hear "kami" and not know if they're talking about 神 (god) or 髪 (hair).

I ran into it recently in music. Babymetal has a song that starts:

伝説の黒髪を華麗に乱し

When you listen to the song, it'd be easy to momentarily think she might be saying "black god" or "black paper" since while the pronunciation wouldn't be identical, it's pretty close. Since I'm human, I figured out pretty quickly what she is saying...but in the equivalent English phrase there's no issue there...it's "black hair" or "black paper".

This is admittedly not "normal speech", but I could see it popping up there too.

I've seen confusion over 神/髪 in other situations too, though those were deliberately puns so probably don't count, but demonstrate it's possible to have situations where it's at least somewhat ambiguous.

> I find the whole homophone thing a bit overrated

I'm sure it's exaggerated to me because my Japanese is pretty atrocious, but I think my point is valid: any time you have homophones in a language it makes things more difficult to set up a system that listens to speech and translates. Japanese seems to have more homophones than English, and if that's true it is proportionally more difficult to translate in that regard.


Also from what I understand, certain homophones are differentiated in practice by differing accenting (raising/lowering) in speech. This is however region specific.



> I'm not saying this is an insurmountable problem, I'm contrasting the difficulty.

Fair enough. I'm not claiming there's no homophone ambiguity either, just that it's a relatively easier problem compared to, say, the stuff Microsoft is doing.

Yeah, when I say "normal speech" I don't include pop music lyrics.


As someone already pointed out, you should have written the translation as "I - hamburger", not "I hamburger." This implies a pause in speech. I am a native Russian speaker, and Russian also has this case where the meaning of a sentence has to be deduced from the context of the previous sentence. But in written Russian you could look at that sentence alone and understand that someone is likely replying to something. I don't practice my Russian daily, as my day-to-day communication is only in English. I tend to forget words when I communicate with my Russian friends via email, so I use Google Translate a lot to translate from English to Russian. I actually find that Google is pretty good at translating the formal sentence structure you'd see in literature and absolutely abysmal at everything else.


Japanese has noticeably fewer phonemes than most languages (IIRC something like 21 compared to a "normal" 24-28) so it makes sense that there are more homophones. One interesting effect is that puns and innuendo are easier in Japanese. Of course it's easy enough to disambiguate in normal conversation.


This is not a unique feature of Japanese. You can do exactly the same in Russian using a dash (which translates to a short pause in speech): "я — гамбургер" (I — hamburger). In general "X — Y" means "X is Y" or another relationship between X and Y as indicated by preceding context.


Yep I have tried the Google English-Japanese translation before and it was just plain terrible.


> That's why Google Translate is so good. They have tons and tons of data from all the websites they parse, and in many cases the content can be obtained in different languages. … Also for example all documents that the European Parliament produces are translated into the languages of all member states.

This can backfire. I remember hearing that, back in the day, "Baile Átha Cliath" (the Irish for "Dublin", the capital city of Ireland) would sometimes get translated as "London", the capital of the UK. This is due to Google Translate trying to match up laws in Ireland (in the Irish language) with UK laws (which would be very similar or potentially based on the same original law). However, where the UK law says "London", the Irish law says "Baile Átha Cliath", so the two names end up paired.

Here's an example of it: http://translate.google.com/#ga/en/L%C3%A1%20alainn%20inniu%...


You did not insult me, but you sound like a robot because the English sentence structure was preserved.


Funnily, translated to Polish it has a perfectly natural structure. Это удивительным would be correct for Russian, yes?


Это удивительно.


Oh, right, forgot to fix the suffix.


> and in many cases the content can be obtained in different languages

I wonder if there's some feedback loop caused by websites that used Google Translate itself to offer the alternative versions :)


If I recall correctly, that is one of the reasons that Google made the translation API a paid service. Regardless, in those cases, I suspect Google would be able to flag their own translations and avoid using them. It would be much more interesting if there were a couple of companies doing translations on many websites without coordination, or if one of them tried to sabotage the others with subtly bad translations.


Speech recognition would probably be best fed in the same way: find neutral-sounding speeches for which transcripts exist.

Best would be parliamentary speeches with transcripts, and the closed captioning for national news programs. The main constraint is storage space/computational power.



