Hacker News
Show HN: I launched a super cheap and simple to use OCR tool for macOS (textcapture.app)
23 points by auden_pierce 1 day ago | 41 comments
1. Click capture text 2. Select an area on screen with text 3. Paste the text anywhere

Are there other solutions out there? Yes. The best one I've found is Text Sniper, but it's $8, so I decided to learn SwiftUI and release Text Capture for $0.99. It uses the macOS built-in Vision API under the hood, so it should also improve with new macOS releases. Would love to hear your feedback!






> Are there other solutions out there?

Yes, you can just do cmd+shift+4 to take a screenshot, then open the screenshot in the popup that appears, and macOS will automatically OCR it (OCR button in the bottom right). This is built-in functionality in macOS.


I disabled that function because it gives the false illusion that docs and images can be saved with text and then will be indexable and searchable in the Finder and other apps; they are not. When I open a PDF, I need to know that it has native text actually saved in the file. If it doesn't, then I will OCR it so it is for sure indexable and searchable.

Interestingly the macOS one is not very accurate. I took a screenshot of your comment and macOS OCR read the "cmd+shift+4" as "cod+shift+4".

I wonder why that is? Could it mean that Apple trained their OCR tool to favor nontechnical text, meaning the tool determined that “cod” was more likely than “cmd”?

Interestingly, iOS corrected “cmd” to “cod” when I first typed it out.
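
To make the hypothesis above concrete, here is a toy sketch of how a word-frequency prior could flip a reading from "cmd" to "cod". All scores below are made up for illustration, and this is not Apple's model or how Vision is known to work internally. (Vision does expose a `usesLanguageCorrection` option on its text recognition request, which is the kind of step this imitates.)

```python
# Toy illustration only: every number here is hypothetical,
# chosen to show the effect, not measured from any real OCR engine.

def pick_reading(candidates, visual_score, word_prior, prior_weight=1.0):
    """Pick the candidate with the highest visual_score * prior**prior_weight.

    prior_weight=0.0 ignores the language prior entirely
    (roughly what turning language correction off would mean).
    """
    def combined(word):
        prior = word_prior.get(word, 1e-6)  # tiny default for unknown words
        return visual_score[word] * (prior ** prior_weight)
    return max(candidates, key=combined)

# Assumed: the glyph shapes slightly favor "cmd" ...
visual_score = {"cmd": 0.6, "cod": 0.4}
# ... but "cod" is far more common in everyday text than "cmd".
word_prior = {"cmd": 0.0001, "cod": 0.01}

candidates = ["cmd", "cod"]
with_correction = pick_reading(candidates, visual_score, word_prior)
without_correction = pick_reading(candidates, visual_score, word_prior, prior_weight=0.0)

print("with language prior:   ", with_correction)
print("without language prior:", without_correction)
```

With the prior applied, the more common word wins even though the pixels favor the other one. That would also be consistent with the keyboard autocorrect doing the same thing.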


The thing is, if the linked app is using Apple's Vision API, it will perform the same.

Good point. From the list of supported languages [1] it looks like it is in fact using the Vision API in fast mode (as accurate mode seems to support more languages).

[1] https://www.textcapture.app/#faq


It correctly OCR'd it for me.

I have been using this one for quite a while, it works well for me:

https://github.com/schappim/macOCR

(I'd say my number one use is snagging urls out of Zoom presentations, quicker and easier than a screenshot)


Agreed. But I do wonder if this product provides a better enough UX to be worth its current price. In my case, it doesn’t support the languages I use so I’ll be sticking with the default Mac feature.

I've been doing this for a while and find that the OCR performance is fantastic.

Works for images in Preview and even in Safari too. Super handy.

Works in Photos.app for searching for text in your photo albums too.

macOS OCR behavior extends to most similar things in iOS too.


Which makes Photos.app a surprisingly good recipe book.

Also as a rolodex. I just take pictures of business cards and you can long press to OCR the phone number and dial from that immediately with no need to even create contact entries unless it becomes a repeat relationship, and if you do, you can usually insta-create the contact card in full with just a long press on the image.

You can even search for text in images in Safari. I was dumbfounded the first time I searched for some text in a page and Safari found it in an image on the page.

The moment I realized this was now a table-stakes feature for a GUI OS, for me, was when I’d been reading and copy-pasting from an image for a couple minutes before realizing it wasn’t a PDF.

This looks wonderful! Just a small heads up, you have a meta tag listing @marc_louvion as the creator (assuming this landing page is built on one of his templates?). I figure you may want to update that so it has your info instead.

  <meta name="twitter:creator" content="@marc_louvion">
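
If you want to check for leftover template tags like this yourself, a minimal sketch using only the Python standard library is below. The `MetaCollector` class and `collect_meta` helper are names I made up for the example, and the sample HTML is just the tag quoted above, not the full page.

```python
from html.parser import HTMLParser

class MetaCollector(HTMLParser):
    """Collects <meta name=... content=...> (or property=...) pairs."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Twitter cards use name=..., Open Graph tags use property=...
        key = attributes.get("name") or attributes.get("property")
        if key and "content" in attributes:
            self.meta[key] = attributes["content"]

def collect_meta(html):
    """Return a dict of meta tag name -> content found in the HTML string."""
    parser = MetaCollector()
    parser.feed(html)
    return parser.meta

# Sample input: only the tag quoted in the comment above.
sample = '<head><meta name="twitter:creator" content="@marc_louvion"></head>'
tags = collect_meta(sample)
print(tags.get("twitter:creator"))
```

Feeding it the real page source instead of `sample` would list every meta tag, which makes it easy to spot any that still point at the template author.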

Thanks!

If you have installed Microsoft's PowerToys on Windows [1], you can press Win+Shift+T and select any area on screen, and Windows will OCR it and store the text on your clipboard.

It's not SOTA AI powered OCR, but works great for copying a link on a streamed tech talk or text from an application / website that tries to make text not-selectable.

[1] https://learn.microsoft.com/en-us/windows/powertoys/


What makes it better than screenshot and Preview? The built-in OCR is pretty great on macOS.

If it's something you'd have to screenshot to use OCR on (i.e. it doesn't just let you directly select the text), this (and the other options) is a bit faster than having to take a screenshot then select the text from it (you select the region like when taking a screenshot and the text is OCRed and copied to the pasteboard in one go).

I use EasyDict, which also does translations with multiple services. Open source. https://github.com/tisfeng/Easydict

The system does this automatically on macOS and iOS in screenshots and stuff.

Any decent alternative for Linux or Ubuntu-based OS? Thanks.


Anyone know of good windows alternatives for this?


Windows 11 native screenshot tool does OCR for me.

You can also get it free with PowerToys from Microsoft and press WIN+SHIFT+T.


> Windows 11 native screenshot tool does OCR for me.

The win+shift+s command / snip & sketch tool? It doesn't appear to have any OCR option before or after capturing


After taking a screenshot with Win+Shift+S, it displays a notification with the preview of the image for me, on the lower right corner of the screen.

When I click that preview, it opens the "Snipping Tool":

https://support.microsoft.com/en-us/windows/use-snipping-too...

In this tool there's a button that does OCR: https://i.imgur.com/GtYUvSS.png

There's also PowerToys, which is another free option, also from Microsoft.

I hope it helps. Let me know if it worked for you or not.


The screenshot and the OCR are two different commands and keybinds. Check PowerToys.

https://screenotate.com/ is one windows example

Snipping Tool

dpScreenOCR

Cleanshot has OCR built in as a feature too

This is built into the OS itself. I don’t get it. What am I missing? I can select text in any image or screenshot seamlessly and very accurately, for $0.00 up front and $0.00 per month.

Native OS is limited to a few apps, so not seamless?

Screenshots auto-do OCR and you can screenshot anything.

Yes, screenshots are one of those cases, but it's not seamless: seamless "auto-do OCR" is selecting an area on the screen and getting text in your clipboard without other side effects. So no extra screenshot files created, and no need to navigate another interface to select text in the screenshot.

Neat and quick—what are the next couple of features you'd incorporate?

Thanks! Probably more languages, and barcode/QR code detection. I'm currently collecting feedback, would love to hear your suggestions -> https://insigh.to/b/textcapture


