
I was trying to work out something like this about a month ago but had to put it aside for later. Running my speech recognition inside a virtual machine was a dealbreaker for me, though it's not all that uncommon for people doing this sort of thing. I really, really wanted to get Julius[1] running on OS X, but after a couple of tries I couldn't get it to build (problem on my end; this is a good reminder to get it sorted out). If you're looking for an alternative to CMU Sphinx that's still FOSS, you really should check Julius out. There are plenty of docs on getting it running with languages other than Japanese. If you're curious about how well it can work, check out this[2] demo (requires Chrome).

[1] http://julius.sourceforge.jp/en_index.php
[2] http://www.workinprogress.ca/KIKU/dictation.php




> If you're looking for an alternative to CMU Sphinx that's still FOSS, you really should check Julius out. There are plenty of docs on getting it running with languages other than Japanese. If you're curious about how well it can work, check out this[2] demo (requires Chrome).

> [2] http://www.workinprogress.ca/KIKU/dictation.php

It seems like this demo is not using Julius, though the page sends mixed messages. The bottom of the page says "Service provided by Google Inc.", but the link right next to it (for downloadable software, also apparently called "kiku"?) mentions Julius.


That does work well. I'm happy to pay for Dragon, but I find the Windows version so superior to the OS X software that I refuse to run it on OS X...


The OS X "version" is a nightmare. It's guaranteed to break with every major OS release, and Nuance takes months to release working versions. When it does work, it's hostile to any other apps that use the accessibility hooks, such as TextExpander, Alfred, etc., which would be awesome with speech input.

The history of the Mac version (acquisition of a company that licensed the Dragon engine) means that it and the Windows versions are very likely permanently divergent. Given the relative market sizes, the Windows version has the best development, the best recognition, and the least schizophrenic product support.

I am glad that dictation (apparently powered by Nuance's engine anyway) is to be included in Mavericks, including a disconnected (i.e., non-Siri) mode. Maintaining an application with a skeleton crew and relying on system services that change at a fundamental level every couple of years is not a path to customer satisfaction.


> I am glad that dictation (apparently powered by Nuance's engine anyway) is to be included in Mavericks

I'd missed that, very interesting. I need a disconnected mode: being able to dictate only short passages is a pain, especially with an online system that doesn't learn from corrections.



