For the subtitles I use the browser's native JavaScript speech recognition API (the Web Speech API).
That's not really its intended use case, but it works pretty well.
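
Here is a rough sketch of how that kind of setup can look. It is not my exact code: the names `resultsToSubtitle`, `startSubtitles` and `onSubtitle` are made up for illustration, and I'm assuming continuous recognition with interim results turned on. `SpeechRecognition` is still exposed as `webkitSpeechRecognition` in some browsers, so the sketch checks for both.

```javascript
// Hypothetical helper: turn a SpeechRecognitionResultList-like object into
// subtitle text. Each result is an array-like list of alternatives, and
// `isFinal` marks phrases the recognizer will no longer change.
function resultsToSubtitle(results, startIndex = 0) {
  let finalText = "";
  let interimText = "";
  for (let i = startIndex; i < results.length; i++) {
    const transcript = results[i][0].transcript; // best alternative
    if (results[i].isFinal) {
      finalText += transcript;
    } else {
      interimText += transcript;
    }
  }
  return { finalText: finalText.trim(), interimText: interimText.trim() };
}

// Hypothetical wiring: `onSubtitle` is a callback you supply, for example
// one that writes the text into a subtitle element on the page.
function startSubtitles(onSubtitle, lang = "en-US") {
  const Recognition =
    typeof window !== "undefined" &&
    (window.SpeechRecognition || window.webkitSpeechRecognition);
  if (!Recognition) {
    return null; // API not available in this environment
  }

  const recognition = new Recognition();
  recognition.continuous = true;     // keep listening across phrases
  recognition.interimResults = true; // show text while it is still changing
  recognition.lang = lang;

  recognition.onresult = (event) => {
    // event.resultIndex is the first result that changed in this event
    onSubtitle(resultsToSubtitle(event.results, event.resultIndex));
  };

  recognition.start();
  return recognition;
}
```

Something like `startSubtitles(({ finalText, interimText }) => { subtitleEl.textContent = finalText || interimText; })` would then update a (hypothetical) `subtitleEl` element as speech comes in. Keep in mind that the API listens to the microphone, so it transcribes whatever the mic picks up.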