Why Siri just might work (thisismynext.com)
44 points by thisisblurry on Oct 5, 2011 | 30 comments



Here's how to think about Siri: it's a second user interface that's better than touch for many of the most commonly-used, tap-heavy features of iOS.

I need to remember to call my mom when I get home. Perfect use for iOS 5's new Reminders app, right? But to use Reminders for this, here are the steps today:

  Press the Home button, swipe to unlock.
  Navigate to the Reminders app icon (may take a press of the Home button and/or a few swipes), tap it.
  Tap "+" to add a new entry.
  Type "call mom <return>". [note that this step practically requires using two hands]
  Tap the just-created item.
  Tap "Remind Me".
  Switch "At a Location" to "On".
  Tap location, choose "Home". Tap the back button.
  Tap "When I Arrive".
  Tap "Done", "Done".
  Look back up from my phone that I've been staring at for the last 30 seconds.
Or, with Siri:

  Hold down the Home button for a couple of seconds.
  Say "Remind me to call my mom when I get home."
The Siri workflow is far easier (and safer) when driving, true, but that's almost incidental - it's simpler and "lighter" than touch even when you're sitting on your couch with your feet up.


Not to mention it doesn't need to interrupt your current activity. Currently, if you're doing some work in an app, you can either close the app, create the reminder, go back to your app, and try to get your brain back into the mindset of using it, or tell yourself that you'll create the reminder once you've finished what you're doing in the current app (in which case you might forget to create the reminder altogether). I think Siri will help a lot with performing actions in the mobile/tablet space, where multitasking is just not as simple as on the desktop.


I always wished I had the Android-style "stack" of applications on my iPhone for just this reason: short one-off tasks. But I also knew Apple would never do it. This seems to be the solution they are going to push.


How much of this is just speculation and how much was baked into Siri prior to Apple's acquisition of it?

They can make anything look wonderful and intuitive in those promotional videos, since they focus on the optimal use cases. I want to know if the product is that smooth for everyday users (i.e., people who haven't memorized the specific vocabulary) during everyday use.


I thought the same initially, especially since I know a thing or two about Natural Language Processing. So I know that this is a very hard problem, to say the least. But then I remembered that we are talking about a company that is known to make some of the most usable products in the world. I trust that Apple is smart enough not to shoot itself in the foot by releasing a half-baked product like that.


So the downloadable Siri was half-baked?


I think Siri is really cool. It seems Apple took a human-first design and brought a lot of user information together to make it easier for humans to get things done.

My biggest frustration, having used voice recognition across multiple platforms, is still getting the machine in question to clearly understand what I'm saying. I feel like if I'm in a car I have to turn off the music, wind up the windows, and tell the passengers to be quiet. I also hope that there's minimal road noise.

I can't wait to see if Apple's cracked this problem for me.


Does anybody know where Siri gets its local search results from? The independent version of the app has been pulled from the App Store, and the only "answers" I can find to my question online are low quality. If they somehow manage to produce good results, they'll stand with Google Places as one of only two generally reliable local search resources.

If somebody has some insight on this, and they'd be willing to discuss it in greater detail or at greater length than would be suitable in an HN post, please contact me (see my profile).


Two questions:

As far as I understand, Siri uses Nuance's Dragon NaturallySpeaking for the actual speech recognition. Is that correct? If so, what is the added value of Siri/Apple? Is it going to be better than Dragon?

Has anybody tried this? Does it really understand normal English, or English with a bit of an accent? There are so many speech recognition products on the market and none of them work well. Meaning they work like 50% or maybe 80% of the time, so you just give up.


A little off topic, but why, when I go to http://www.apple.com/iphone/features/siri.html and try to watch the video, am I prompted to download QuickTime? I thought Apple was all about HTML5 and killing off browser plugins?


That video is done with an HTML5 video tag. The thing about the HTML5 spec is that it doesn't actually specify what format the video should be in, so different people use different formats and AFAIK no browser supports all the common choices out of the box. It's an unfortunate state of affairs, but it seems essentially irreconcilable — this is why WebM could have been a big deal. Apple chose to use the QuickTime file format here, presumably on the assumption that iPhone users will have iTunes installed and QuickTime with it.


Is this the first time you've seen a self-serving inconsistency? Apple as a company represents many things, and a keen PR team with excellent wording is one of their strengths. Apple isn't "all about HTML5" and "killing off browser plugins"; rather, they are about their own best interests. Nixing Flash kept the Apple-controlled App Store the dominant gateway for applications on the device. QuickTime is another Apple technology, so it is pretty much expected that they will use it (if for nothing other than cross-promotion/market share).


M$ - Silverlight; Adobe - Flash; Google - WebM

Calling Apple out as if they are somehow evil here is disingenuous. All the players are pushing their own formats.

Doesn't Safari work without QuickTime and support the <video> tag in HTML5?

How about calling Adobe out for "ruining the internet" with Flash? I'm glad Apple kept them from ruining the iPhone (like they are attempting with Android).

[Edit: correct Flash to Adobe]


Nixing Flash improved our lives (especially in the long term).


When I visit, I get served H264 video in an MPEG-4 container via the <video> tag. QuickTime fallback is offered for browsers that don't support MP4 natively.


The version of Chrome I'm using must have removed H264 support. QuickTime as a fallback is rather disappointing, as Flash seems to have a much larger install base and a smaller disk space footprint.


See the demo at http://labs.divx.com/html5/ to check whether your browser supports H264. Both Chrome and IE9 do for me, and I am still asked to install QuickTime.


In fact, my Chrome plays MP4 video fine, so this appears to be just Apple pushing QuickTime onto Chrome users.


It could be a broken test for support, or UA sniffing or something. I'll file a bug against the site.
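
For what it's worth, here is a minimal sketch of what a capability test (as opposed to UA sniffing) could look like. This is not Apple's actual page logic, which I haven't inspected. The file names, the `pickSource` helper, and the `h264Browser` stand-in are all hypothetical; in a real page the second argument would wrap `document.createElement("video").canPlayType`, which returns `""`, `"maybe"`, or `"probably"`.

```typescript
// Mirrors the three possible answers of HTMLMediaElement.canPlayType():
// "" means no support; "maybe" or "probably" means it is worth trying.
type CanPlayAnswer = "" | "maybe" | "probably";

interface VideoSource {
  url: string;  // hypothetical file name, not Apple's real asset
  mime: string; // MIME type with an optional codecs parameter
}

// Return the first source the browser claims it can play, or null if none.
// Only the null case should lead to a plugin fallback prompt.
function pickSource(
  sources: VideoSource[],
  canPlayType: (mime: string) => CanPlayAnswer
): VideoSource | null {
  for (const source of sources) {
    if (canPlayType(source.mime) !== "") {
      return source;
    }
  }
  return null;
}

// Hypothetical candidate list: H.264 in MP4 first, then WebM.
const sources: VideoSource[] = [
  { url: "siri.mp4", mime: 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"' },
  { url: "siri.webm", mime: 'video/webm; codecs="vp8, vorbis"' },
];

// Stand-in for a browser that supports MP4/H.264 but not WebM.
const h264Browser = (mime: string): CanPlayAnswer =>
  mime.indexOf("video/mp4") === 0 ? "probably" : "";

const chosen = pickSource(sources, h264Browser);
console.log(chosen !== null ? chosen.url : "offer plugin fallback");
```

If the page asks the browser about the wrong MIME string, or decides by user agent instead, a browser that can actually play the file would still get the plugin prompt, which would match what people are seeing here.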


I too think Siri can be ported to, say, Mac OS X!

Deriving the meaning from words and previous context is beyond difficult and I can only imagine the processing power required for it. But if the A5 can pull it off, so too should an x86 CPU.


The A5 isn't pulling it off. As with Google's voice stuff, it's sending a low quality audio stream to a datacenter where the heavy lifting is happening. Note that the presentation pointed out that Siri requires WiFi or 3G access.


Not sure ’bout that. During the event they only explicitly pointed out that voice recognition is happening on Apple’s servers when they talked about dictation. They introduced that as if it were something new and unique for the dictation part of Siri, not the commands.

I’m really quite confused about the details but the iPhone 4S will start selling soon so we will be able to find out how Siri works in more detail.


That's the way WP7 does it as well for anything other than simple commands.


Because it alleviates typing on an awkward keyboard in some situations.


But as he points out, it's voice only. No text input to Siri.

I never understood why an old girlfriend would always txt me. Then I rode around in a very loud car with her, and her friends. I actually wound up sending txts to communicate with her, even though she was sitting right next to me.

To wit, there are some situations where voice is not practical or optimal. What if I want to type on an awkward keyboard?


Loud venue? Not useful. Trying to keep secrets? Not useful. Trying to text your friends in the flyest street slang? Not useful.

Hands dirty from cooking and you need to text someone back? Useful.


When voice control works with my thick Slavic accent, then I'll be impressed.

Until then let me just stick to my lovely dependable keyboards.


They did mention that it learns your voice and vocab patterns over time. We'll see if that's truly the case.


Nuance's technology really does adapt to your own voice: even if the first recognitions fail, after a few tries (5 to 10) it really nails what you ask for.


This article is even more fascinating on the real implications of Siri: http://5by5.tv/criticalpath/9



