Buzz: Transcribe audio from your microphones in real-time using OpenAI's Whisper (github.com/chidiwilliams)
79 points by rahimnathwani on Oct 20, 2022 | 8 comments



I'm deaf as a post and rely on speech-to-text tools and features (Otter, Google Meet), usually paid, and constantly feel a sense of digital precarity: at any moment, these tools may be removed, changed, or experience outages, all of which swing my world from inclusion to exclusion. I hope this works well, because it would give me control over my own accessibility tools in a way that my workarounds don't.


Online tools also violate your privacy, as they have the contents of your conversations.


For me, it's a tough tradeoff between privacy and accessibility – without these tools, I have zero accessibility for audio-based mediums. My best bridge for now is Google Meet's captions, and when Meet isn't where the call's taking place, piping the audio into Meet as a virtual audio source.


The last section of the linked GitHub page says it can be built and run locally, although it doesn't specify whether it uses external services.


Yes, I've been waiting for an app like this for quite some time. No need for a 3rd party cloud SaaS solution/subscription.


I ran this for the first time today, after reading about it on another HN thread. On my computer (running Ubuntu 20.x), it seems to work fine except that the stop button does nothing, so I can't stop the program and select+copy the displayed text.


How well does this work without a CUDA-capable GPU? When Whisper first showed up, I tried it on a 12th-gen i7 laptop in CPU mode and found that the larger, more accurate models took an enormous amount of time.

I have been contemplating setting up a desktop machine with a discrete GPU so that I can play around with this further. I'd be curious whether smaller sample sizes improve the performance, or whether the smaller models are sufficient for voice transcription.
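
On the sample-size question, one rough way to check on your own hardware is to measure the real-time factor (processing time divided by audio duration) at a few chunk lengths. Below is a minimal sketch of that measurement, not anything taken from Buzz or Whisper itself: `split_into_chunks`, `real_time_factor`, and `fake_transcribe` are hypothetical names, and `fake_transcribe` is a stand-in you would replace with a real model call. The 16 kHz sample rate is what Whisper models expect.

```python
import time
from typing import Callable, List, Sequence

SAMPLE_RATE = 16_000  # Whisper models expect 16 kHz mono audio


def split_into_chunks(
    samples: Sequence[float],
    chunk_seconds: float,
    sample_rate: int = SAMPLE_RATE,
) -> List[Sequence[float]]:
    """Split a buffer of samples into consecutive chunks of chunk_seconds each.

    The last chunk may be shorter than the others.
    """
    size = int(chunk_seconds * sample_rate)
    if size <= 0:
        raise ValueError("chunk_seconds must be positive")
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def real_time_factor(
    transcribe: Callable[[Sequence[float]], str],
    chunks: List[Sequence[float]],
    sample_rate: int = SAMPLE_RATE,
) -> float:
    """Return processing time divided by audio duration.

    A value below 1.0 means transcription keeps up with live audio.
    """
    audio_seconds = sum(len(chunk) for chunk in chunks) / sample_rate
    if audio_seconds == 0:
        raise ValueError("no audio to measure")
    start = time.perf_counter()
    for chunk in chunks:
        transcribe(chunk)
    elapsed = time.perf_counter() - start
    return elapsed / audio_seconds


def fake_transcribe(chunk: Sequence[float]) -> str:
    """Hypothetical stand-in for a model call; swap in a real transcriber."""
    return ""


if __name__ == "__main__":
    audio = [0.0] * (SAMPLE_RATE * 60)  # one minute of silence as test input
    for seconds in (5.0, 10.0, 30.0):
        chunks = split_into_chunks(audio, seconds)
        rtf = real_time_factor(fake_transcribe, chunks)
        print(f"{seconds:>4.0f} s chunks: {len(chunks)} calls, RTF {rtf:.4f}")
```

As far as I know, Whisper pads or trims its input to 30-second windows, so chunks much shorter than that may not reduce the cost of each call; they mostly reduce latency. Comparing model sizes with the same harness is probably the more informative experiment.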


I'm waiting for the eventual OBS plugin that is configured to work with these kinds of models.

On another note, it's crazy to think that we are on the cusp of such powerful machine learning models going mainstream.



