
Annyang.js – Let visitors control your site with voice commands - danso
https://www.talater.com/annyang/
======
borplk
Is there free/open software like this but for the entire desktop
environment? I'm looking for something that is highly scriptable so I can have
some simple magic phrases and link them to specific scripts. If not, any
non-free ones? (I know about Dragon but not much)

I want things like "next" "back" "exit" kind of commands. If the trigger is
there I can write the scripts myself.

Ideally it would be context aware, so it could check if a browser is the
active window and, if so, have "next" and "back" go to the next and previous
tabs or something. If an IDE is active, toggle through files, etc.
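
As a rough sketch of the context-aware part (TypeScript, entirely hypothetical: the `bindings` table, the context names and `dispatch` are made up for illustration, and both the speech trigger and the active-window detection are left out), the lookup could be a per-context table with a global fallback:

```typescript
// Hypothetical sketch; none of these names come from an existing tool.
type Action = () => string;

// Commands per context. "*" is the fallback used when the active
// application has no binding of its own for the phrase.
const bindings = new Map<string, Map<string, Action>>([
  ["browser", new Map<string, Action>([
    ["next", () => "switch to next tab"],
    ["back", () => "switch to previous tab"],
  ])],
  ["ide", new Map<string, Action>([
    ["next", () => "open next file"],
    ["back", () => "open previous file"],
  ])],
  ["*", new Map<string, Action>([
    ["exit", () => "close active window"],
  ])],
]);

// Resolve a recognized phrase against the active context, then the fallback.
// Returns undefined when nothing is bound to the phrase.
function dispatch(context: string, phrase: string): string | undefined {
  const key = phrase.trim().toLowerCase();
  const action = bindings.get(context)?.get(key) ?? bindings.get("*")?.get(key);
  return action ? action() : undefined;
}
```

Here each action just returns a description; in a real setup it would call whatever script you wrote for that phrase.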

~~~
rjbwork
You might be interested in watching this talk:

[https://www.youtube.com/watch?v=8SkdfdXWYaI](https://www.youtube.com/watch?v=8SkdfdXWYaI)

Although, if I recall correctly, he uses Dragon NaturallySpeaking, which is a
commercial product.

------
ddod
This idea was similarly proposed by a couple people when I posted my voice-
recognition resume/website ([https://benwasser.com](https://benwasser.com)) to
Hacker News. Glad to see it taking off.

Also, I didn't see it mentioned, but I found that I needed to use SSL to get
Chrome to not re-ask for permission after a brief timeout of silence. I
haven't had a chance to test out their implementation, but my guess is that
SSL is a big unstated requirement for decent usage.

~~~
brentjanderson
The SSL requirement is built into Chrome; it's one of the security
requirements for safe microphone usage. Annyang and anything else using the
microphone will always be subject to this requirement.
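
A minimal sketch of guarding on this before starting recognition (TypeScript). Current browsers expose their own verdict as `window.isSecureContext`, which is the thing to check in a real page. The helper below, `looksLikeSecureOrigin`, is a hypothetical approximation so the rule can be shown outside a browser, and the loopback exception is an assumption about development setups:

```typescript
// Rough approximation only; in a real page, prefer window.isSecureContext.
function looksLikeSecureOrigin(pageUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(pageUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol === "https:") return true;
  // Assumption: loopback hosts are treated as trustworthy during development.
  return url.hostname === "localhost" || url.hostname === "127.0.0.1";
}
```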

------
fredbfr
I remember, in 2006, doing a full multimodal (touch, mouse and voice) web
application with W3C standards: VoiceXML, XHTML and a glue dialect called X+V
(meaning "xhtml plus voicexml"). The browser at the time was Opera, and the
voice recognition was handled client side, with a free IBM ViaVoice plugin
that Opera could download with a single click. It already had many things
built-in to sync voice events to DOM events. It's nice to finally see it
embedded in major browsers. It is not clear to me, however, whether it works
offline or uses a web service.

------
kumarishan
This is really awesome. Sometimes I get annoyed hunting for settings or
controls on a website to perform certain tasks, or going through lots of
clicks. This API can certainly make them easier and fun too.

If it can do something like "Facebook deactivate my account with reason its
temporary" then this lib can certainly redefine user experience too.

Looking forward to integrating it in my app soon.

------
robertnealan
First few queries worked, but when I told it "Show me tacos" it honestly
responded with "Searching for porn...".

Either way, interesting idea - I'm curious how this might be used not only for
interacting with websites in novel ways but for ADA purposes where the user
has difficulty controlling a physical input device but can still speak.

~~~
oso2k
You weren't looking for pink?

On a more serious note, I still wish we had things like this combined with the
intelligence and semantic deciphering that Ubiquity did [1]. I'm disappointed
that Ubiquity didn't go further.

[1] [https://blog.mozilla.org/labs/2008/08/introducing-ubiquity/](https://blog.mozilla.org/labs/2008/08/introducing-ubiquity/)

------
onassar
Love seeing this. The other day there was a thread about how there is so much
focus on supporting IE browsers with relatively low market share, yet not much
on accessibility issues within all browsers.

I think this is a great step towards highlighting those issues, and offering
up a solution to address some of them programmatically.

------
RyanMcGreal
I really hope they stop development at version 0.7734.

~~~
Kiro
Why?

~~~
ErnestedCode
Arrested Development reference -- 07734 => hello => Annyong

------
delgaudm
Chrome on Android 4.4.2 asks for permission to use the microphone and plays
the "listening" beep à la Google Now, but does not seem to respond to or
recognize anything. Anyone getting this to work on a mobile device?

------
uptown
I'm not in a place where I can try this right now, but how does this work? Is
it constantly listening, or do you need to trigger / toggle on/off the
listening function?

~~~
eggbrain
It uses the Webkit Speech API[1], and (from what I can tell) it's constantly
listening[2]

[1][http://updates.html5rocks.com/2013/01/Voice-Driven-Web-Apps-...](http://updates.html5rocks.com/2013/01/Voice-Driven-Web-Apps-Introduction-to-the-Web-Speech-API)
[2][https://github.com/TalAter/annyang/blob/master/annyang.js#L9...](https://github.com/TalAter/annyang/blob/master/annyang.js#L91)
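
For a sense of what "constantly listening" can look like, here is a minimal sketch (TypeScript). It is not taken from annyang's source. `RecognizerLike` is a hypothetical trimmed-down interface that mirrors only the `continuous`, `onend`, `start()` and `stop()` members of the Web Speech API's `SpeechRecognition`, and `keepListening` is a made-up helper. The idea is to ask for continuous mode and start again whenever the recognizer ends on its own:

```typescript
// Hypothetical minimal shape of the SpeechRecognition members used here.
interface RecognizerLike {
  continuous: boolean;
  onend: (() => void) | null;
  start(): void;
  stop(): void;
}

// Start listening and restart whenever recognition ends on its own.
// Returns a function that stops listening for good.
function keepListening(recognizer: RecognizerLike): () => void {
  let active = true;
  recognizer.continuous = true;
  recognizer.onend = () => {
    // Only restart if the caller has not asked us to stop.
    if (active) recognizer.start();
  };
  recognizer.start();
  return () => {
    active = false;
    recognizer.stop();
  };
}
```

In Chrome the object passed in would be a `webkitSpeechRecognition` instance; its typings differ from this trimmed interface, so a cast or a thin adapter may be needed.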

------
borplk
Does the web speech API expose the raw audio feed?

------
VikingCoder
Interesting... there's a project I was thinking about that this might work
for. Or at least, I can learn a lot from their code!

------
tbh
The Arrested Development reference in the footer makes me wonder if the name
ought to be Annyong instead.

~~~
whatthemick
Perhaps they named it for Annyong but changed it for the easier search
results?

------
notastartup
Sounds like "Hello" in Korean, except that it's commonly spelled as Annyung or
even Annyong

