
PocketSphinx+PG+JavaScript Voice/Text Experiment - vmorgulis
https://vmorgulys.github.io/voice-text-sync.html
======
vmorgulis
Original video:
[https://www.youtube.com/watch?v=0KR2MSFROLI](https://www.youtube.com/watch?v=0KR2MSFROLI)

CMUSphinx:
[http://cmusphinx.sourceforge.net/](http://cmusphinx.sourceforge.net/)

~~~
detaro
So, what am I looking at? It seems like you fed the audio in PocketSphinx to
get time-tagged text and the site basically shows said text as subtitles to
what was said, is that the gist of it?

~~~
vmorgulis
> ... is that the gist of it?

Yes, it is.
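
For anyone curious how the subtitle side can work, here is a minimal sketch of looking up the word to highlight for a given playback time. The data shape (`text`, `start`, `end` in seconds) and the function name `wordAt` are assumptions for illustration, not the page's actual code.

```javascript
// Hypothetical time-tagged words, sorted by start time (seconds).
const words = [
  { text: "hello", start: 0.0, end: 0.42 },
  { text: "world", start: 0.5, end: 0.95 },
];

// Binary search for the word whose [start, end) interval contains t.
// Returns null when t falls in a gap between words.
function wordAt(words, t) {
  let lo = 0;
  let hi = words.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (t < words[mid].start) {
      hi = mid - 1;
    } else if (t >= words[mid].end) {
      lo = mid + 1;
    } else {
      return words[mid];
    }
  }
  return null;
}
```

In a browser this would typically be called from the audio element's `timeupdate` event with `audio.currentTime`.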

I'd like to improve the speech recognition and was hoping for some advice on
that.

Another possibility is to add a semantic layer with NLP, or to switch to
another library such as Kaldi ([http://kaldi-asr.org/](http://kaldi-asr.org/)).

One more detail: the WAV file is serialized as JSON (as an array).
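
A rough sketch of that round trip, assuming the array holds 16-bit signed PCM samples (it could just as well hold raw bytes); the helper names are made up for illustration:

```javascript
// Assumption: audio is 16-bit signed PCM held in an Int16Array.
// JSON.stringify on a typed array yields an object with index keys,
// so convert to a plain array first.
function samplesToJson(samples) {
  return JSON.stringify(Array.from(samples));
}

// Parse the JSON array back into a typed array usable for playback.
function jsonToSamples(json) {
  return Int16Array.from(JSON.parse(json));
}

const pcm = Int16Array.of(0, 1200, -1200, 32767, -32768);
const json = samplesToJson(pcm);
const restored = jsonToSamples(json);
```

The trade-off is size: each sample becomes several characters of text instead of two bytes, so the JSON ends up noticeably larger than the WAV it came from.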

