Why do voice transcription apps charge monthly when Whisper runs locally? (lucidvoice.app)
47 points by metalogical 3 days ago | 78 comments




It's why everything is a subscription these days: you make more money.

Consumers undercount the true total cost, and X% of people will forget they're subscribed and keep paying forever.

If every month you had to either consent to a recurring charge on your card or unsubscribe, I'm sure billions in revenue would evaporate overnight from people mass unsubscribing.

(I wish there was regulation that required companies to automatically pause monthly subscriptions if you haven't logged in to or used the service in any way for 3+ months. Though that would create some weird incentives)


I think requiring notification would be better than automatically pausing (or maybe let people choose?)

My elderly parents have a cheap VoIP landline that they never use but keep for peace of mind. It'd be less than ideal if that got automatically "paused" and then it didn't work the one time they tried to use it to call 911.

Sure, the scenario would mean their cell phones are not working, or they're suffering from some cognitive issue, so it's unlikely -- but still plausible.


That's a good example. In that case though I'd say that every month the VoIP line is providing a service. Something like a Netflix subscription that you haven't logged into for 6 months is more unequivocally providing no value.

Mandatory monthly notifications about charges seem better though and wouldn't lead to weird perverse incentives. (Like Netflix spamming your email with auto-login links so if you click one they can claim you did use their service)


Virtual credit card numbers are a great way to combat this.

For example, the Wall Street Journal pricing is pretty wild ($8 a month for the first 3 months, then it jumps much higher), so I use a virtual card which expires right before the planned price hike.

For other services I like to either use a virtual card with a single transaction limit, or just buy the service and cancel right away, which is typically equivalent to just paying for a month.


Agreed! I'm a long time privacy.com customer. It completely flips the script on subscriptions. I'll create a new card with only a budget for 1 month of the subscription. If I actually care about it I'll see the 2nd month's charge fail and quickly fix it. Also great for making sure free trials don't become forever subscriptions.

I tried to cancel a virtual card to cancel a service (that only allowed me to change anything by phone call, so this would have been far more convenient and less confrontational) and they tallied up a "delinquent balance" and threatened to sue if I didn't pay everything I owed in order to cancel.

Canceling the card does not work for predatory companies. Maybe for well-meaning ones that automatically cancel when a charge declines.


I switched home insurance away from Liberty Mutual after my term was up and did not renew. Three weeks after my coverage with them lapsed I received a notice from a collection agency for a late fee for coverage I never purchased. FUCK automatic billing and non-consensual subscriptions.

Take them to small claims court. You would win

It's too late for that now, but maybe if this happens again I will.

That's actually a great tip. Unfortunately I can't set them to expire at will, but using a 24-hour one (the usually available option here) is enough to get a one-month subscription without worrying about the price hike.

Exactly this. When I looked at Wisprflow at $12/month, I realized over 2 years I'd pay ~$290 for software that runs entirely on my Mac with near-zero server costs on their end.

The "forgot to cancel" revenue model works, but (like you implied) it's predatory when the software doesn't need ongoing infrastructure.
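The arithmetic behind that comparison can be sketched in a few lines. This is only an illustration: the $12/month figure and the $20 one-time price are the ones quoted in this thread, and the function names are made up for the example.

```python
import math


def subscription_total(monthly_price: float, months: int) -> float:
    """Total paid after `months` of a flat monthly subscription."""
    return monthly_price * months


def break_even_months(one_time_price: float, monthly_price: float) -> int:
    """First month in which the subscription has cost at least the one-time price."""
    return math.ceil(one_time_price / monthly_price)


two_year_cost = subscription_total(12, 24)   # $12/month for 24 months
months_to_match = break_even_months(20, 12)  # $20 one-time vs $12/month
print(two_year_cost, months_to_match)
```

At $12/month the two-year total is $288, which is where the "~$290" comes from, and a $20 one-time purchase is matched within the second month of such a subscription.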


You might want to check out https://whispernotes.app - it's a one-time purchase, no subscription. For offline apps with no ongoing server costs, I think buy-once should at least be an option alongside subscriptions.

It's not that you make more money with subscriptions. It's because you make some money to survive at all if you're not a big company. People who run a small business understand this.

I get that subscriptions help small businesses survive. But when the software runs 100% locally and doesn't need servers, one-time seems fairer. That's what I'm testing.

Whisper runs fine locally. So why are Willow ($144/year), Wisprflow ($120/year), and SuperWhisper ($120/year) all subscriptions?

I got frustrated paying monthly for something that could run on my Mac, so I built Lucid Voice:

- 100% offline (Nvidia Parakeet + Llama)

- $20 one-time (mainly to cover Apple's notarization costs)

- Runs on surprisingly low-end hardware (M1 base models work fine)

- No cloud, no data collection

Open to feedback on anything - pricing, the tech stack, or if this should just be free: https://lucidvoice.app


Thanks for sharing. As someone who used Dragon NaturallySpeaking, which did fine-tuning back in the 90s, I was genuinely surprised at the implementation gap.

The pricing is good for the customer.

At the same time, I think you shouldn't give away "Lifetime updates" at the same pricing tier. Are you planning to support it for the next 10+ years and across the next 5-10 Mac hardware generations and OS versions without any new license cost?


Honestly not trying to make serious money from this - more validating if people want local-first tools. If it takes off, might add a support tier, but ultimate goal isn't profit. Just wanted something that works without monthly charges.

Can you build a Linux version? :-)

Still haven't managed to find one that works as well as MacWhisper.


Get your favourite coding agent to make one. These things are incredibly simple. You really only need a few ingredients:

- `sherpa-onnx` bindings for your favourite language

- package for capturing your mic input

- package for hotkey capture

- package for clipboard management (or shell out to `xclip`)

- shell out to `xdotool key --clearmodifiers "ctrl+v"` to paste

Tell it to go research all the above and then assemble into whatever form you want. I had Claude write a Go daemon that loads parakeet and runs as a systemd user service listening for Alt-Space in about 20 minutes.
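For anyone curious what the glue looks like, here is a minimal Python sketch of the last two ingredients (clipboard plus paste). It assumes `xclip` and `xdotool` are installed and that you are on X11. The `record` and `transcribe` callables are hypothetical placeholders for the mic capture and `sherpa-onnx` parts, which are not shown, and this is not the Go daemon described above.

```python
import subprocess
from typing import Callable

# Shell commands for the clipboard and paste steps listed above.
XCLIP_CMD = ["xclip", "-selection", "clipboard"]
XDOTOOL_CMD = ["xdotool", "key", "--clearmodifiers", "ctrl+v"]


def paste_text(text: str, run: Callable[..., object] = subprocess.run) -> None:
    """Copy `text` to the clipboard via xclip, then send Ctrl+V via xdotool."""
    run(XCLIP_CMD, input=text.encode("utf-8"), check=True)
    run(XDOTOOL_CMD, check=True)


def dictate_once(
    record: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    run: Callable[..., object] = subprocess.run,
) -> str:
    """Record, transcribe, and paste one utterance.

    `record` and `transcribe` are hypothetical stand-ins for the
    mic-capture package and the speech model bindings.
    """
    audio = record()
    text = transcribe(audio).strip()
    if text:  # skip the paste when nothing was recognised
        paste_text(text, run=run)
    return text
```

You would call `dictate_once` from whatever hotkey package you pick. Passing `run` in as a parameter is just a convenience so the flow can be exercised without touching the real clipboard.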


Check out SpeechNote, works great on both CPU based and GPU based machines depending on the model you use: https://github.com/mkiol/dsnote

> Can you build a Linux version? :-)

Generally speaking, it is the hardware not the OS that makes it easier to build for Macs right now.

Apple Neural Engine is a sleeping giant, in the middle of all this.


Parakeet still runs at 5x realtime on a middle-of-the-road CPU; it should be quite doable (at the cost of some battery life).

Mac-only for now but might port to Linux if there's enough demand. What desktop environment are you using?

Gnome on Debian. I'm a simple man.

Good idea. I was hoping to at least see an overview of this from my phone, but when I opened the link, it said it’s for desktop only, and I lost interest.

Just pushed an update for mobile - should work now. Happy to give you a free license key if you want to try it on your Mac. Would love feedback

I have been looking for this on Linux (Gnome) and would love something like it. I can't find a simple way to get system-level voice input.

Yeah, hearing this a lot. I'll definitely look into Linux support.

Please talk me into buying your product! :-) I currently use the free version of Superwhisper on Mac.

Hey Steve! SuperWhisper free is solid - main reason to switch is avoiding subscriptions entirely. For $20 one-time you get AI formatting (removes filler words, cleans up corrections) that SuperWhisper charges $120/year for. I'm building the features users ask for (e.g. file transcription), and all future updates are included. Runs light on base M1s. Not trying to maximize profit here - just testing if people actually want local-first tools vs subscriptions. If it's not worth it after a week, email support@lucidvoice.app for a refund :)

I'm pretty sold. Downloaded it earlier and had an issue with Lucid not automatically appearing in the list of apps to permit accessibility permission, but after a coffee I remembered that apps can be manually added (so you can ignore my support email! :-) ). My minor niggle would be no trial option; that always puts me off slightly, although I appreciate your refund offer. Oh hell, I'll just buy it. It's worth it to reward the slick onboarding process! :-)

> - No cloud, no data collection

How can this be guaranteed if it is closed source?

Other than that, great project.


You can block it with Mac's firewall and verify zero network calls. Privacy policy at https://lucidvoice.app/privacy

Might open source if there's demand - testing that now


Some developers seem to do okay by writing open source software and selling it on the App Store. I’ve even paid for software like that :)

Does this do the kind of formatting that Wisprflow does?

Yes - makes corrections, removes filler words, formats naturally. That's what the Llama post-processing does

Can you open source this

you mean give it away for free? how would he require payment?

Handy https://handy.computer/ is a free and open source voice transcription app. Works on macOS, Linux, and Windows, and you can pick which model. Everything runs locally. There is no API.

I use the app constantly, all day long.


I just fired this up at work, and it was so seamless and my few small tests worked well.

I have it bound to a mouse button. Something to try! Also I have "enter" bound to another mouse button. I hold down one to talk, then when it's done transcribing I press enter. I use an MMO Mouse, the $50 Corsair Scimitar.

MacWhisper [0] is a long-running project with a lot more features, what's the main difference, just the price?

[0] https://goodsnooze.gumroad.com/l/macwhisper


After trying a dozen or more paid and open source apps, I keep going back to MacWhisper. The dictation feature is in advanced beta but works well. The only thing it lacks that I want is the ability to use different models for different tasks at the same time: one model for each drop folder, a different one for dictation, and then another general-purpose one for drag-and-drop. I have the memory for it, and MacWhisper can flush a model after a certain amount of idle time anyway.

Right now it's dictation-first vs MacWhisper's file transcription focus. Planning to add meeting recording and batch processing eventually.

Heard some feedback about reliability issues with MacWhisper as well - trying to build something more stable from the ground up.


it's been a PITA though, lots of features not working, files disappearing and the developer blaming the OS (!), and separate payment for the iOS app.

For a premium UI around the open-source Whisper core, there's also MacWhisper: https://goodsnooze.gumroad.com/l/macwhisper

I'm deaf, so I test a lot of speech to text and transcription apps from an accessibility point of view.

My answer to "why have a monthly subscription" would be that you need capabilities that Whisper doesn't handle well, like real-time transcription in noisy environments.

That's not the niche you're targeting here, though. :)

My experience is that Whisper - not being built for real time speech to text - isn't as good at it as other tools are. You can hack something together by stacking together progressively more audio frames to feed to Whisper to give it context, but IME, you're going to get better results from a model that's designed for real-time STT in the first place, or by using a service like Azure Speech to Text which has excellent noise resilience... but which is also an ongoing cost which would justify a subscription. Real-time Whisper also devours your battery quickly.
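To make the "stacking progressively more audio frames" hack concrete, here is a small sketch of the windowing logic only. The cap on window length (`max_chunks`) is an assumption added for the example, and the actual transcription call is left out, since each window would simply be handed to whatever Whisper binding you use.

```python
from collections import deque
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def growing_windows(chunks: Iterable[T], max_chunks: int) -> Iterator[List[T]]:
    """Yield a progressively longer window of recent chunks, capped at `max_chunks`.

    Each yielded window is what would be re-sent to the model so that it
    sees earlier audio as context for the newest chunk.
    """
    window: deque = deque(maxlen=max_chunks)
    for chunk in chunks:
        window.append(chunk)  # oldest chunk drops out once the cap is hit
        yield list(window)


def total_chunks_processed(n_chunks: int, max_chunks: int) -> int:
    """Count how many chunks the model ends up processing in total."""
    return sum(len(w) for w in growing_windows(range(n_chunks), max_chunks))
```

With 10 incoming chunks and a cap of 5, the model processes 40 chunks' worth of audio instead of 10, which illustrates why this approach is hard on the battery.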

That said - while I've had very good experiences with Parakeet in MacWhisper, I'm curious if you evaluated Apple's SpeechAnalyzer APIs at all. It's unfortunately limited to macOS/iOS/iPadOS 26+ since it's a new API, but it's on device, has comparable quality of results to Whisper Large v3 Turbo and Parakeet, and seems to be better on battery usage.


This drive to turn one-time sales into repeating subscription fees has soured me even more on the concept of 'buying' (which really comes down to 'renting') software and makes it far less likely that I will ever pay to use these products. I'll go rather great lengths to avoid any software which comes with a price tag not even so much for the actual price - if I count time invested in finding, building, installing and maintaining free (as in freedom as well as beer) software alternatives to paid proprietary products it probably comes out even or more 'expensive' - but to avoid the whole licence/renew/upsell/'gold-silver-platinum-Pro-whatever' dance and the accompanying lock-in. It is a bit like how the onslaught of online advertising has turned me from being somewhat tolerant towards banner advertisements into a rabid content blocker who makes sure not a single piece of advertising ever gets to pollute my eyes or ears. Squeeze too hard and you'll find your hands empty before you realise it.

The same reason any software costs money even when you run it locally I suppose. Local software having a subscription instead of a single price seems to just be increasingly common these days.

I'd assume there are good free alternatives though. If not I'd have a non-zero motivation to build one, having dabbled enough with whisper and running several of my own distributed automatic transcription systems


It's been years since I touched transcription professionally, but my memory is that they started off as a mechanical turk operation.

Lawyers usually would purchase transcription devices, and then either they would have a pool of transcribers (I remember installing foot pedals for forward/back playback operation) or pay a subscription to the manufacturer for mysterious, likely offshore people to transcribe for them.

People have a hard time letting go of revenue, but I am betting most of the same people are still in business and want to pied piper consumers of transcription services to the same business model that now costs them pennies instead of wages.



I use basically this but as a button on my MMO mouse.

This is great! I’ve been diving deep into local models that can run on this kind of hardware. Been building this exact same thing, but for complete recordings of meetings and such because, why not? I can even run a low-end model with ollama to refine and summarize the transcription. Even combining with smaller embedding models for a modern, semantic search. It has surprised me how well this works, and how fast it actually is locally.

Hopefully we will see even more locally run AI models in the future with a complete package.


I’m using an Intel Macbook with hardly any GPU capacity to speak of, so running locally isn’t straightforward for me.

I’ve found that Whispering [https://github.com/EpicenterHQ/epicenter/tree/main/apps/whis...] plus Groq pay-as-you-go is a great combination. Not quite free, but cheap enough that it isn’t a consideration.


Owners of an iPhone 12 or later don't even have to pay: they can use the built-in transcription in the Voice Memos app for the most popular languages.

https://support.apple.com/guide/iphone/view-a-transcription-...


I made WhisperType [1].

The price tag is $30/YEAR. The current MRR is about $700 and I'm paying $7/mo for Groq Whisper Turbo.

These apps really don't have any reason to be so pricey, it's all just margin.

1. https://whispertype.com/
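A quick sketch of the margin arithmetic behind those numbers. It uses only the figures quoted above ($700 MRR, $7/month for Groq, $30/year per subscriber) and assumes every subscriber is on the $30/year plan. Other costs such as payment fees or development time are not included, so treat it as a rough illustration.

```python
def gross_margin(monthly_revenue: float, monthly_cost: float) -> float:
    """Share of revenue left after the stated inference cost."""
    return (monthly_revenue - monthly_cost) / monthly_revenue


def implied_subscribers(monthly_revenue: float, yearly_price: float) -> float:
    """Rough subscriber count, assuming everyone pays the same yearly price."""
    return monthly_revenue / (yearly_price / 12)


margin = gross_margin(700, 7)           # $700 MRR vs $7/month for Groq
subscribers = implied_subscribers(700, 30)
print(f"{margin:.0%} margin, about {subscribers:.0f} subscribers")
```

On those inputs the inference bill is about 1% of revenue, spread across roughly 280 subscribers.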


This is an ad for a speech-to-text app that you need to pay for.

Spokenly is free (one time fee of $0) and does the same (and even more)


Whisper runs so well locally on recent hardware, I've embedded it directly into hobbyist applications to provide STT-based commands.

I find the Windows included one (Win+H) very good.

Okay, but have you used the large Whisper model? Sure, voice typing has been around for 10 or 20 years. And it's great if you have a good mic and enunciate, but these new models are insane. You can just mumble something from across an entire room, with peanut butter in your mouth, and it won't miss a single word.

It might make up a bunch of words, like "subtitles by soandso", when there's silence though... /s

I went to try it, but unfortunately it requires a license to be activated before you can try it out.

Should have a trial set up - working on that for the next update. For now, buy it and if you don't like it after a week, email support@lucidvoice.app and I'll refund immediately. Just testing what people actually want right now :)

Why do shops charge for strawberries when they grow for free in my garden?

They do not. (charge monthly for strawberries that have already been sold and "run locally")

I get what you're saying but I don't think that's the right analogy here.

Part of the cost is the amortization of the development cost of building and training the model. Perhaps that's why there is a monthly subscription component. That would make economic sense, although I suspect it's more psychological in that you want people to use your product and a subscription makes that easier. If you think about the cost of each generation then you're not going to have a good time using it.

My point was more though that you pay for the convenience and the inference cost. I can also make my own bread, but my local bakery or supermarket can make it much more efficiently at scale and cheaper.


Then you are arguing a point no one tried to make. They didn't say "Why do I have to pay for those people's work?"

It sounds like a valid argument that you did not articulate was "You can buy a house, but if a house costs way too much to buy, then another option is you can rent it." The house is a fixed static good like a local piece of software, but it just costs so much in total that you can't afford to buy it and have to rent access to borrow it. You can't copy a house for free, so it's still not quite there, but the essence is.

So maybe the model costs so much to create that if you were to buy it, it would have to cost ... Well the Chinese say they made a model for $6M. So that could be as little as $1 per person if it goes popular. Let's make it $100 just to be over the top generous. So maybe the analogy and the excuse for the subscription still doesn't wash.


[flagged]


How is this relevant to the app?

It’s relevant because it answers the “headline” question of why the market is charging for/gatekeeping free software. The answer is obviously because that’s what capitalism incentivizes.

That's why I'm building Knowii Voice AI. It's a fork of Handy (handy.computer) in which I'm going to add fun features, exploring different areas of what we can do with voice on a computer.

It's local first, privacy first, one-time payment. You buy it and get lifetime updates.

Currently available for Windows and very soon for MacOS and Linux. I'm working on Wayland/Hyprland support because I'm using Omarchy;-)

https://voice-ai.knowii.net


so you're asking for $50 for a fork of a free app, on the premise that you're "going to" add to it

I have already rebuilt the UI to be fully responsive (great for tiling window managers), added new features, improved the history, fixed hotkey handling for Wayland and many other things. I'm a solopreneur, and if I want my business to survive I need this endeavor to be sustainable

People who run tiny side businesses understand this is the only way to go. Yes I could have started from scratch, but what's the point.

If you look at Handy's website, you'll see that the author encourages forks anyway.

I'm also offering support for my customers and will build what they want. It's a different game.


I wrote a post with more background here: https://www.knowii.net/c/announcements/knowii-voice-ai

It's literally just the free and open source Handy app with a slightly darker theme. Even your sales pitch is just lifted from Handy.

This is just not true. My landing page doesn't have much if anything in common with Handy's and that was never the goal. I stated explicitly that I forked Handy for a reason, it's not a secret and I'm not ashamed of it.

Of course, most of what my app does now is very similar to Handy, but isn't it normal when you fork something? Discrepancies grow over time, not overnight. I've already implemented many things differently, and am working on features that will probably never be in Handy anyway. I have different goals and ideas.

Some people only see evil in starting from an open source project and building something proprietary. But isn't the whole point of the MIT license to have full freedom? I love open source and I actually intend to contribute back to Handy.

People who request features from open source projects might never get what they want or need if it doesn't align with the maintainers' vision or if they don't have the bandwidth. Most of the time, open source software comes without any guarantees, without any support, ... Which is perfectly fine since it all comes for free. What I'm doing is building a commercial product, with actual support, and long-term commitment to my customers.

I'm a solopreneur, working hard on the side, trying to build a sustainable business. And working on a project like this for a long time without any revenue is not sustainable unless you have enough runway. I did that a few years back and don't intend to make the same mistake again (https://www.dsebastien.net/2021-01-04-20-months-in-2k-hours-...).

As an example, I'm very focused on Knowledge Management & Obsidian. Integrating first-class support in Knowii Voice AI for interacting with Obsidian is one of the short-term goals I have in mind. It's not something that would make sense to add to Handy, it's too niche. But it does make sense for my app and my customers because many of them are also into knowledge management and have been following me for a long while.

Anyways. I'll build my project, find people who want to support my work, and do my best to deliver what they want and need. Sorry if it goes against the common ideas that forking an open source project to build something proprietary is wrong, that forks should be open source and that they should be vastly different from day one or be free.


When you've added a few more features it will be fine. Obsidian integration sounds cool


