
A Possible API for Siri - nevanking
https://notes.nevan.net/an-api-for-siri-831abf62dc73
======
nailer
Switched back to iOS this year. Totally dumbfounded that Siri doesn't work with
Maps, Outlook, Spotify or any of the other popular apps. That, and Siri
confusing 'four' with 'for' all the time ('set alarm 415' sets an alarm for
3PM), make it basically unusable.

~~~
garblegarble
>'set alarm 415' sets an alarm for 3PM

Have you tried "set an alarm for 415"? That should disambiguate that you mean
"four" and not "for" the second time (tried that a few times, not sure if it's
the extra "for" or that it makes me clearly enunciate "four")

------
jonah
There are also similar API services available, such as Houndify:

[https://www.houndify.com/](https://www.houndify.com/)

------
Uehreka
This is a good analysis of how Apple might implement a Siri API. I have some
points I'd like to add too, based on my own (cursory) research into this
topic.

First of all, I feel confident that whatever API Apple exposes will probably
resemble one of the existing APIs we've seen before, such as:

- Windows/Kinect's Speech API[1] - I have no experience with this one, but one
of my friends used it a few years ago and described it as "loading grammars
for Kinect to use in parsing speech".

- The HTML5 Web Speech API[2] - I prefer the version exposed by Annyang.js[3],
which is geared towards creating "event handlers" that fire when a certain
command is spoken. The Web Speech API gives you no access to how the audio is
parsed into words; however, the implementation in Chrome is actually pretty
awesome at detecting proper nouns, place names and the like.

- The Alexa Skills Kit[4] - This example is interesting because Alexa is
essentially an "OS-level" assistant that has to support all the services that
have been registered with Amazon.
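To make the "event handler" style concrete, here is a rough sketch in Python. It is not Annyang's actual API (that is a JavaScript library); `CommandRouter`, `on` and `dispatch` are names made up for illustration, and it assumes the speech engine has already turned the audio into a text transcript.

```python
import re
from typing import Callable, List, Optional, Tuple


class CommandRouter:
    """Hypothetical phrase-to-handler router, loosely in the spirit of Annyang."""

    def __init__(self) -> None:
        self._routes: List[Tuple[object, Callable[..., str]]] = []

    def on(self, phrase: str, handler: Callable[..., str]) -> None:
        # Words starting with ":" become named placeholders, e.g. "play :song".
        parts = []
        for word in phrase.split():
            if word.startswith(":"):
                parts.append(f"(?P<{word[1:]}>.+?)")
            else:
                parts.append(re.escape(word))
        pattern = re.compile(r"\s+".join(parts) + r"$", re.IGNORECASE)
        self._routes.append((pattern, handler))

    def dispatch(self, transcript: str) -> Optional[str]:
        # Fire the first handler whose phrase matches the whole transcript.
        text = transcript.strip()
        for pattern, handler in self._routes:
            match = pattern.match(text)
            if match:
                return handler(**match.groupdict())
        return None  # nothing registered for this utterance


router = CommandRouter()
router.on("play :song", lambda song: f"playing {song}")
router.on("set an alarm for :time", lambda time: f"alarm set for {time}")
print(router.dispatch("Play Yellow Submarine"))
```

Note that the handlers only ever see the finished transcript, which matches the point above: this style gives you no say in how the audio becomes words.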

I'd also like to point out some of Apple's philosophy (as well as some of
their already completed work on Siri) and how it might influence a Siri API:

- Starting in iOS 8, Apple gave developers the ability to create
"extensions". "Siri Abilities" would probably manifest as an extension, and
would follow the lead of existing extensions in terms of limitations. This
means users would have to opt in to your app's Siri abilities, and the option
would not be super obvious to them unless they went looking for it.

- In an interview on John Gruber's The Talk Show[5], Eddy Cue talked about how
Siri for tvOS has to be able to distinguish when Spanish speakers are throwing
an English movie title into the middle of a sentence. My guess is that their
speech parsing logic is super complicated, and they won't give 3rd party
developers a hook into it.

- Apple doesn't like to build things that require tons of configuration. They
probably won't give users the ability to configure which app should answer
"Send a message to mom". This would mean having a screen that listed every
command Siri supports. Even if it only listed the commands that had
"conflicts", listing out all of Siri's abilities to anyone (even just
developers in a piece of documentation) is something Apple will never do.

- Apple's gotten better about this over the years, but they still tend to make
choices that privilege their apps over 3rd party apps (even if it's just a
side effect of a bigger policy).

So where does that leave us? I think Apple's Siri API will most closely
resemble the Alexa Skills Kit.

Developers will have to register a "namespace" for their app. The "namespace
registry" will be like the URL protocols mentioned in TFA. Developers will
scramble to register their name, as it'll be what users have to say to
indicate which service Siri should use to fulfill the request. Once inside a
namespace, developers would have total freedom to write any commands they
want.

In the end, I predict that Siri commands (if used with the wake word) will
sound like this: "[Hey Siri], tell [WhatsApp] to send a message to [Mom] that
says [I'll be home for Thanksgiving]." Not the "Tea, Earl Grey, Hot"-est thing
in the world, but still really useful! (And this would be feature parity with
Amazon Echo)
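For what it's worth, the routing I'm imagining could be sketched roughly like this in Python. Everything here is hypothetical (`NamespaceRegistry`, `register`, `route` and `whatsapp_handler` are invented names, not anything Apple or Amazon ships), and it assumes the wake word has already been stripped and the speech has already been transcribed to text.

```python
import re
from typing import Callable, Dict, Optional


class NamespaceRegistry:
    """Hypothetical first-come, first-served registry of app names."""

    def __init__(self) -> None:
        self._apps: Dict[str, Callable[[str], Optional[str]]] = {}

    def register(self, name: str, handler: Callable[[str], Optional[str]]) -> None:
        key = name.lower()
        if key in self._apps:
            # This is why developers would scramble for their name.
            raise ValueError(f"namespace already taken: {name}")
        self._apps[key] = handler

    def route(self, utterance: str) -> Optional[str]:
        # Expect "tell <app> to <command>"; the assistant only picks the app.
        match = re.match(
            r"tell\s+(?P<app>.+?)\s+to\s+(?P<command>.+)$",
            utterance.strip(),
            re.IGNORECASE,
        )
        if match is None:
            return None
        handler = self._apps.get(match.group("app").lower())
        if handler is None:
            return None
        # Inside the namespace, the app parses the command however it likes.
        return handler(match.group("command"))


def whatsapp_handler(command: str) -> Optional[str]:
    match = re.match(
        r"send a message to (?P<to>.+?) that says (?P<body>.+)",
        command,
        re.IGNORECASE,
    )
    if match is None:
        return None
    return f"WhatsApp -> {match.group('to')}: {match.group('body')}"


registry = NamespaceRegistry()
registry.register("WhatsApp", whatsapp_handler)
print(registry.route(
    "Tell WhatsApp to send a message to Mom that says I'll be home for Thanksgiving"
))
```

The split is the point: the assistant owns the "tell X to" part and the name registry, and the app owns everything after that.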

If Apple sees the above command as too awkward or cumbersome, I predict that
they will not release a Siri API. I can't imagine them releasing an API that
gives developers more control. I do think TFA is right though about "Voice
Search in Apps" being a possible halfway point.

[1] [https://msdn.microsoft.com/en-us/library/jj131034.aspx](https://msdn.microsoft.com/en-us/library/jj131034.aspx)

[2] [https://www.google.com/intl/en/chrome/demos/speech.html](https://www.google.com/intl/en/chrome/demos/speech.html)

[3] [https://www.talater.com/annyang/](https://www.talater.com/annyang/)

[4] [https://github.com/amzn/alexa-skills-kit-js](https://github.com/amzn/alexa-skills-kit-js)

[5] [http://daringfireball.net/thetalkshow/2016/02/12/ep-146](http://daringfireball.net/thetalkshow/2016/02/12/ep-146)

