
Yeah, it is interesting, and it could also be a big boost to plain olde speech-to-text in cases where you have video, if the errors were uncorrelated (which I wasn't able to determine from skimming the README).

edit: now I see it is being used to match audio samples, not to generate text, so it wouldn't create a value independent of the audio in this arrangement. The exception is something like speaker attribution, which they mentioned.



