Some of that stuff isn't too hard if you can narrow down the domain of words you need to recognize. For example, let's say you wanted a hot word of "computer", like in Star Trek: you can literally filter the output of the recognizer (e.g. pocketsphinx) with grep and sed, and it works not too badly. For the natural language part, you can get pretty far with a simple parser like the old Infocom games used, especially if your domain is limited. I'm making an open source multiplayer networked starship bridge simulator, kind of like Star Trek, using pocketsphinx for speech recognition, and it's working OK (not perfect, but OK). Here is a demo: https://www.youtube.com/watch?v=tfcme7maygw
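To make the hot word plus simple parser idea concrete, here is a rough sketch in Python rather than grep and sed. It assumes the recognizer hands you one plain-text hypothesis per utterance; the vocabulary, the filler list, and the function names are made up for illustration and are not taken from the simulator or from pocketsphinx.

```python
import re

HOT_WORD = "computer"  # hot word from the comment above

# Hypothetical vocabulary for a small starship-bridge domain.
VERBS = {"set", "raise", "lower", "fire", "engage"}
NOUNS = {"shields", "phasers", "warp", "course", "torpedoes"}
FILLER = {"the", "a", "an", "to", "please"}


def strip_hot_word(hypothesis):
    """Return the words after the first hot word, or None if it is absent."""
    words = re.findall(r"[a-z']+", hypothesis.lower())
    if HOT_WORD not in words:
        return None
    return words[words.index(HOT_WORD) + 1:]


def parse_command(words):
    """Pick the first known verb and first known noun, ignoring filler words."""
    words = [w for w in words if w not in FILLER]
    verb = next((w for w in words if w in VERBS), None)
    noun = next((w for w in words if w in NOUNS), None)
    if verb is None or noun is None:
        return None
    return (verb, noun)


def handle(hypothesis):
    """Turn one recognizer hypothesis into a (verb, noun) pair, or None."""
    words = strip_hot_word(hypothesis)
    if words is None:
        return None  # not addressed to the computer, so ignore it
    return parse_command(words)
```

Anything without the hot word, or without a verb and noun the parser knows, comes back as None, which is what makes a limited domain forgiving of recognition errors: stray words are simply dropped.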



Actually, somebody even cross-compiled pocketsphinx to JavaScript with Emscripten for this purpose:

https://syl22-00.github.io/pocketsphinx.js/live-demo.html

This works pretty well, all in the browser, especially if you drop in some better acoustic models.


Yeah, I wouldn't call that "pretty well": I said "not a number" and it output "one two one" on the digits example.

Maybe it just wasn't trained well enough to reject non-number inputs, but yeah, that doesn't exactly change my experience that Sphinx is awful.


You have to use a decent acoustic model, not the one in the demo. If you do, I think it works 'pretty well' as a proof of concept. That said, I'm not recommending Sphinx as a recognition framework; it is way behind the times in 2016. But this is the only 'in the wild' demo of this I've seen on the web, so I felt it was worth mentioning.



