Can this be used for real-time transcription, or is it too slow for that?
Curious what anyone is using these days for real-time transcription. It doesn't have to be perfect, just good enough.
My kids watch some YouTube videos where people make a mod that converts their talking to text, looks for keywords, and spawns a boss in Terraria if you say the wrong keyword, etc.
I made a clone of that with the .NET System.Speech.Recognition library. It... works, but I have two problems. #1: it waits until you are done speaking before it converts the speech to text in the callback, so there is too much of a delay for it to be fun; the whole point is that it should be checking a continuous stream of chatter. #2: the recognition is pretty crap. It's nearly good enough for my silly purpose, but it's still pretty bad.
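For the keyword-checking half of that problem, here is a rough sketch in Python of scanning a stream of transcript fragments as they arrive, rather than waiting for a finished utterance. Everything in it is an assumption for illustration: `KeywordSpotter` and `feed` are made-up names, and it assumes the recognizer hands over new text incrementally (engines that revise earlier partial results would need extra handling).

```python
import re
from typing import Iterable, List

_WORD = re.compile(r"[a-z0-9']+")


class KeywordSpotter:
    """Hypothetical helper: finds keywords in text that arrives in chunks."""

    def __init__(self, keywords: Iterable[str]) -> None:
        # Normalize keywords the same way incoming text is normalized.
        self.keywords = {" ".join(_WORD.findall(k.lower())) for k in keywords}
        self.keywords.discard("")
        # Longest keyword, in words; decides how much context to carry over.
        self.max_len = max((len(k.split()) for k in self.keywords), default=1)
        self.tail: List[str] = []  # last few words of the previous chunks

    def feed(self, chunk: str) -> List[str]:
        """Return keywords completed by this chunk, in order of appearance."""
        new_words = _WORD.findall(chunk.lower())
        words = self.tail + new_words
        hits: List[str] = []
        # Only consider phrases that END inside the new words, so a keyword
        # already reported from an earlier chunk is not reported again.
        for end in range(len(self.tail) + 1, len(words) + 1):
            for n in range(1, self.max_len + 1):
                if end - n < 0:
                    break
                phrase = " ".join(words[end - n:end])
                if phrase in self.keywords:
                    hits.append(phrase)
        # Keep just enough words for a multi-word keyword to span chunks.
        keep = self.max_len - 1
        self.tail = words[-keep:] if keep > 0 else []
        return hits


spotter = KeywordSpotter(["boss", "Eye of Cthulhu"])
first = spotter.feed("I think the eye of")         # nothing complete yet
second = spotter.feed("Cthulhu is a tough boss!")  # phrase spans two chunks
print(first, second)
```

Carrying over only the last few words keeps the work per chunk small, so the check itself should add no noticeable delay; the latency still depends on how quickly the recognizer emits text.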
It might require too much work for what you are looking for, but the wav2letter library is the best real-time transcription OSS I have found by a considerable margin.
If your family uses Apple devices, Apple offers free on-device speech recognition. Only caveat is that it needs to be restarted every minute due to whatever stupid limitation (or bug) they've introduced.
The base model seems to run faster than real time on my machine. The “medium” model is larger and runs more slowly - roughly real time or maybe slightly slower.
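To put a number on "faster than real time", one common measure is the real-time factor: processing time divided by audio duration, where anything below 1.0 keeps up with live speech. A minimal sketch follows; `transcribe` is a placeholder for whatever model call you use, not a real library function, and the 12 s / 30 s figures are made up for illustration, not measurements.

```python
import time
from typing import Any, Callable


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio length; < 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(transcribe: Callable[[Any], Any], audio: Any,
                audio_seconds: float) -> float:
    """Time one call to a (hypothetical) transcribe function and return its RTF."""
    start = time.perf_counter()
    transcribe(audio)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, audio_seconds)


# Illustrative numbers only: 30 s of audio processed in 12 s.
print(real_time_factor(12.0, 30.0))
```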