
Anyway, this is proof that Siri is a pure cloud service and as such may work even on a 5-year-old Sagem...



Not exactly. The text-to-speech is done in the cloud, but the hard part (algorithmically speaking) is natural language processing, which apparently (I don't know for sure) is still done on the phone.

I don't know what Apple's excuse is, but limited processing power is certainly not the problem.


I think you have it exactly backwards: with the iPhone 4S and Siri, speech-to-text and natural language processing are done in the CLOUD. The text-to-speech is done on the phone itself. My (non-Siri, of course) iPhone 4's Voice Command stuff is COMPLETELY on the phone itself, and would do TTS of my contact list and artist names, etc.


The article says, "The iPhone 4S really sends raw audio data". At least for Siri, TtS occurs in the cloud - not sure where the text-processing-to-API step occurs, though.


Sending raw data and _receiving_ raw data are NOT the same thing. It's been clear that the iPhone 4S sends raw data to the cloud, based on people sniffing the network shortly after release.


From the article:

> The iPhone 4S really sends raw audio data. It’s compressed using the Speex audio codec, which makes sense as it’s a codec specifically tailored for VoIP.
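For anyone curious what that looks like in practice, encoding raw 16-bit PCM with libspeex is roughly the sketch below. To be clear, this only illustrates using the codec; the wideband mode, quality level, and send_to_server() helper are my own assumptions, not details pulled from the actual Siri traffic.

    /* Minimal sketch: compress 16-bit PCM frames with libspeex before upload.
     * Link with -lspeex. The wideband mode, quality level, and
     * send_to_server() are illustrative assumptions, not anything
     * reverse-engineered from Siri. */
    #include <speex/speex.h>
    #include <stdio.h>

    /* Hypothetical transport stub -- stands in for whatever Siri actually uses. */
    static void send_to_server(const char *buf, int len)
    {
        (void)buf;
        printf("would upload %d compressed bytes\n", len);
    }

    static void encode_and_send(spx_int16_t *pcm, int num_frames)
    {
        SpeexBits bits;
        void *enc;
        int frame_size, quality = 8;
        char out[512];

        speex_bits_init(&bits);
        enc = speex_encoder_init(&speex_wb_mode);    /* 16 kHz wideband: an assumption */
        speex_encoder_ctl(enc, SPEEX_SET_QUALITY, &quality);
        speex_encoder_ctl(enc, SPEEX_GET_FRAME_SIZE, &frame_size);

        for (int i = 0; i < num_frames; i++) {
            speex_bits_reset(&bits);
            /* Each call consumes exactly frame_size samples of raw PCM. */
            speex_encode_int(enc, &pcm[i * frame_size], &bits);
            int n = speex_bits_write(&bits, out, sizeof(out));
            send_to_server(out, n);                  /* ship the compressed frame */
        }

        speex_encoder_destroy(enc);
        speex_bits_destroy(&bits);
    }

    int main(void)
    {
        spx_int16_t silence[320 * 4] = {0};   /* 4 frames of wideband silence (320 samples each) */
        encode_and_send(silence, 4);
        return 0;
    }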


I don't know why this is being downvoted -- I guess others are reading something different than I am?

There are three parts to Siri:

1. Speech-to-text (parent has it backwards but that's what he means, obviously)

2. Text-to-intent (referred to by parent as NLP)

3. Intent-to-API calls

Obviously, (1) happens in the cloud and (3) happens on the device. It is still unclear where (2) happens, but if the cloud service only responds with text, it seems that (2) happens on the device.

And (2) is still a hard problem by itself.
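To make that breakdown concrete, here's a toy sketch of the three stages as plain code. None of the names, types, or stubbed behaviour come from Apple; it just restates the comment above, with (1) stubbed as a cloud round-trip, (2) as the hard NLP step wherever it actually lives, and (3) as a local API call.

    /* Toy sketch of the three Siri stages described above, restated as code.
     * Everything here is made up for illustration; it only mirrors the
     * breakdown in the comment. */
    #include <stdio.h>
    #include <string.h>

    typedef struct {
        const char *action;   /* e.g. "create_reminder" */
        const char *object;   /* e.g. "buy milk" */
    } Intent;

    /* 1. Speech-to-text: done in the cloud -- raw Speex audio goes up, text
     *    comes back. Stubbed here with a canned transcript. */
    static const char *speech_to_text(const void *audio, size_t len)
    {
        (void)audio; (void)len;
        return "remind me to buy milk";
    }

    /* 2. Text-to-intent (what the parent calls NLP): where this runs is
     *    unclear; if the cloud only returns text, presumably on the device. */
    static Intent text_to_intent(const char *utterance)
    {
        Intent i = { "unknown", utterance };
        if (strncmp(utterance, "remind me to ", 13) == 0) {
            i.action = "create_reminder";
            i.object = utterance + 13;
        }
        return i;
    }

    /* 3. Intent-to-API calls: has to happen on the device, where the
     *    reminders, calendar, contacts, etc. actually live. */
    static void dispatch_intent(Intent in)
    {
        printf("calling local API: %s(\"%s\")\n", in.action, in.object);
    }

    int main(void)
    {
        Intent in = text_to_intent(speech_to_text(NULL, 0));
        dispatch_intent(in);
        return 0;
    }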



