It's not open source, but PotPlayer supported it in December before VLC's announcement, so I'm not the first.
Incidentally, I added this feature in October.
I will borrow the good ideas from other players that implement ASR.
Thanks. As for translation, it currently translates one subtitle at a time, so accuracy is low because it cannot use the context before and after each subtitle.
I'm working around this by supporting dual subtitles. Translation is meant to be used only as a complement to the original text.
I would like to improve accuracy by preserving context, but I haven't found a good way to do this at the moment.
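One direction that might be worth trying is a sliding window: send each subtitle to the translator together with a few neighboring lines that are marked as context only. The Python below is just a rough sketch of the windowing step, not what the player does today. The names `ContextWindow` and `build_context_windows` and the window sizes are made up for illustration, and no translation API is called.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ContextWindow:
    before: List[str]  # preceding subtitle lines, passed as context only
    target: str        # the one line that should actually be translated
    after: List[str]   # following subtitle lines, passed as context only


def build_context_windows(
    subtitles: List[str], before: int = 2, after: int = 1
) -> List[ContextWindow]:
    """Pair every subtitle with up to `before` previous and `after` next lines."""
    windows = []
    for i, line in enumerate(subtitles):
        start = max(0, i - before)  # clamp at the start of the list
        windows.append(
            ContextWindow(
                before=subtitles[start:i],
                target=line,
                after=subtitles[i + 1 : i + 1 + after],  # slicing clamps at the end
            )
        )
    return windows
```

Each window could then be formatted into a single request for whatever translation backend is in use. For subtitles produced by live ASR the following lines may not exist yet, so `after=0` would probably be needed in that case.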
As for transcription accuracy, it is very good if you use a large model.
At least, Whisper's accuracy is far superior to YouTube's subtitle generation!
Thanks for the feedback!
It is possible to have an external dictionary tool speak the text via the clipboard, but supporting many languages that way seems difficult.
It would be easy to implement using the Microsoft UWP speech API, but there may be quality issues.
I will look into whether there is a good-quality way to do the playback locally. Thanks.
https://github.com/rambod-rahmani/ffmpeg-video-player