
What are the state-of-the-art open solutions for local voice recognition? Preferably ones with available models that a small org can also train itself, without millions in hardware.



I'll add https://github.com/coqui-ai/STT, which is a continuation of DeepSpeech. Also, I've been messing around with https://github.com/ideasman42/nerd-dictation, which runs on a VOSK backend; accuracy is decent, especially with the bigger model.



Kaldi, K2, ESPNet.


NeMo



