Ask HN: How do you OCR your receipts and bills in 2018?
62 points by Anonanonym on July 24, 2018 | 47 comments
How do you get your personal financial documents from paper form into your spreadsheets or accounting programs? There always seems to be a backlog of receipts and bills on my desk. I'm thinking that there has to be a better way than typing all these in. Please recommend what works for you.


I created Smart Receipts for receipt tracking and OCR a few years ago. I worked as a traveling consultant and needed to track receipts on a weekly basis, and I got tired of manually typing in the data.

It's available on both Android and iOS for download here:

https://smartreceipts.co/download

It's also open source:

https://github.com/wbaumann/SmartReceiptsLibrary

https://github.com/wbaumann/SmartReceiptsiOS


For the hardware side, I use a Fujitsu ScanSnap - I believe my model is an S1300i - to scan paper documents. I use the bundled OCR software (ABBYY FineReader) which seems to work fairly well, though it's highly dependent on the resolution you scan at.

As for software, I haven't found anything I'm truly happy with. I've put some effort into using Mayan EDMS[0] to store the documents. It works well, but isn't perfect. Version 3.0 added a bunch of UI changes that should significantly improve matters, but I haven't gotten around to upgrading it yet.

I've also tried Alfresco Community Edition[1], but I think this tool is overkill for the job.

[0] https://www.mayan-edms.com/

[1] https://www.alfresco.com/alfresco-community-editions


+1 for Fujitsu ScanSnap

I believe mine is the same model. Small enough to be portable.


Most receipts in my country have a QR code at the bottom with a link to the receipt in human- and machine-readable format. Next year this will be mandatory for everyone. Russia.


I was going to ask who came up with the dumb idea of using a link (i.e. a URL) to access receipt data over the Internet, but fortunately that doesn't seem to be the case. The QR code on the receipt posted by 'savant_d[0] reads:

  t=20170221T095900&s=30.00&fn=99990789659&i=501&fp=4189225230&n=1
--

[0] - https://news.ycombinator.com/item?id=17601617
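
For anyone who wants to pull that payload into a spreadsheet, here is a minimal Python sketch. It assumes the payload is an ordinary `key=value&...` query string, that `t` is a timestamp in `YYYYMMDDTHHMMSS` form and that `s` is the receipt total; those readings are guesses from the shape of the sample above, not from any specification. The remaining fields are kept as raw strings, and `parse_receipt_qr` is a name made up for this example.

```python
from datetime import datetime
from decimal import Decimal
from urllib.parse import parse_qs


def parse_receipt_qr(payload: str) -> dict:
    """Split a receipt QR payload into fields (field meanings are assumptions)."""
    # parse_qs returns a list of values per key; each key appears once here.
    raw = {key: values[0] for key, values in parse_qs(payload, strict_parsing=True).items()}
    return {
        # Assumed: "t" is the purchase time, "s" is the total amount.
        "timestamp": datetime.strptime(raw["t"], "%Y%m%dT%H%M%S"),
        "total": Decimal(raw["s"]),
        # Everything, including the fields not interpreted here, stays as text.
        "raw": raw,
    }


receipt = parse_receipt_qr("t=20170221T095900&s=30.00&fn=99990789659&i=501&fp=4189225230&n=1")
print(receipt["timestamp"].isoformat(), receipt["total"])  # 2017-02-21T09:59:00 30.00
```

`Decimal` is used for the total so that amounts do not pick up floating-point rounding on the way into an accounting tool.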



That is an excellent idea. I hope it catches on in the US.


Hardware: Fujitsu ScanSnap ix500, and a run-of-the-mill paper shredder

Software:

  on macOS: DEVONthink Pro Office, or MS OneNote, or Google Drive

  on Windows: MS OneNote or Google Drive


How are you liking DevonThink? It’s one of those bits of software I think about trying out from time to time but getting started with it looks like a bit of a time investment.


Yes. You’d have to invest some time to get used to it, but I think it’s worth it. Especially on macOS I don’t see many good options. That said, I can’t wait for them to release a major update (v3).


What do you use on Linux?


vim. backend/infra dev that doesn't require a GUI. Seriously.

I gave up on Linux on the desktop many years ago.

It became more of a time sink to find alternative apps and/or workarounds for things that mainstream (often paid) apps do on macOS/Windows with ease.

Or, to put it another way, I got tired of the constant tinkering and just wanted to get work done.


Scanbot (iOS and Android app) https://scanbot.io/en/index.html . I like the backup to Google Drive. OCR is there and seems fine, but I haven't used it much.


I’ll chime in for Scanbot as well. I used to use a work multi page scanner for my documents, but unless it’s more than 10 pages I just use Scanbot now. It’s got great OCR and it automatically backs up to Google Drive or Dropbox.


I think this is the one I use as well, or a similar one that does basically the same thing. Much more convenient than taking a scanner everywhere.


Scanbot is fantastic, I've been using it for years now.


I don't eliminate the typing, but I use Evernote to grab a picture of the receipt, then tag it with the trip that I am on.

This at least reduces the burden of figuring out what receipt goes with what and I have a copy of it at all times in case of issues.


I don't have that many. But when I need to OCR a receipt or in general get text from an image, I use the Copyfish extension and then copy & paste from there to the accounting software:

https://addons.mozilla.org/en-US/firefox/addon/copyfish-ocr-...

https://chrome.google.com/webstore/detail/copyfish-%F0%9F%90...


A full-duplex scanner is worth the investment. I've configured mine so that it automatically drops the file on a network share for storage.

Receipts I ignore (the credit card transaction is caught by my online budgeting service). Random bills / statements get uploaded to Google Drive. There's a "silent OCR" that runs on the PDFs... you can't access the text directly or by API, but if you search for anything it will appear with high accuracy.

I've quit trying to organize anything since everything is easily findable.


I adore my Fujitsu. What OCR software are you using?


That's the point... I don't use any. Google just runs its OCR magic on uploaded docs and makes them searchable.


I didn't know that. I don't need to pay for Evernote anymore!


I used Shoeboxed [1] for about a year several years back. Really loved their service and they have great tech (they were pulling a remarkable amount of data out of most receipts, even years ago when I last used the service). I haven't convinced myself that I need the service enough to budget for rejoining it, but every so often I'm reminded of how useful it was as a service and debate it.

[1] https://www.shoeboxed.com/


I take a picture of my receipts and email them to a specific address; my accountant then makes sure each one is valid (etc.) and processes it in their software. It's really not that expensive, and it's one of those things that I'd rather do the moment I receive a receipt to avoid stacking them up. Also, it's one of those things that I really want to delegate to a professional, so I'm happy to pay for that service.


Surprised to still see so much hardware usage -- I'm curious where people find the advantages there?

Over the past few years I have found that paper recognition in iOS apps works great, both in third-party apps like JotNot and in what Apple introduced in the last iOS update.

For workflow I use one of those options, share to Dropbox, and then store those files in an encrypted disk image in Dropbox.


+1 for this question, especially if the folks here know anything about a decent document management system that's both cross-platform and made for home users. The scanner is the easiest part of the discussion.


Totally agree. I've a 50 sheet, dual sided document scanner sitting in a box doing nothing simply because I can't settle on a nice system that silently takes the scanned TIFFs or PDFs, OCRs them, builds a DB and surfaces the whole lot in a web GUI for searching.

There's plenty of Enterprise grade Document indexing systems out there, but they all cost 10x as much as I'm willing to pay.

What I can't find is FOSS or even cheap commercial software that does all of the above.
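
Not a product recommendation, but the "take text from scans, build a DB, search it" part can be sketched in a few lines of Python with the standard-library `sqlite3` module. This is only a rough sketch under stated assumptions: the OCR step is left as a caller-supplied function (`extract_text` is a placeholder, and the demo fakes it with a dictionary), the function and table names are invented for the example, and there is no web GUI here.

```python
import sqlite3
from pathlib import Path
from typing import Callable, Iterable, List


def build_index(db: sqlite3.Connection, files: Iterable[Path],
                extract_text: Callable[[Path], str]) -> None:
    """Store the extracted text of each scanned file in a small SQLite table."""
    db.execute("CREATE TABLE IF NOT EXISTS docs (path TEXT PRIMARY KEY, body TEXT)")
    for path in files:
        # extract_text stands in for whatever OCR tool you actually use.
        db.execute("INSERT OR REPLACE INTO docs (path, body) VALUES (?, ?)",
                   (str(path), extract_text(path)))
    db.commit()


def search(db: sqlite3.Connection, term: str) -> List[str]:
    """Return paths whose text contains the term (plain substring match)."""
    rows = db.execute("SELECT path FROM docs WHERE body LIKE ? ORDER BY path",
                      (f"%{term}%",))
    return [row[0] for row in rows]


# Demo with made-up OCR output instead of real scans.
fake_ocr = {
    "scan-001.pdf": "ACME invoice, total 30.00",
    "scan-002.pdf": "Electricity bill, July",
}
db = sqlite3.connect(":memory:")
build_index(db, [Path(name) for name in fake_ocr], lambda p: fake_ocr[p.name])
print(search(db, "invoice"))  # ['scan-001.pdf']
```

A substring `LIKE` query is the simplest thing that works; a larger archive would probably want a proper full-text index, and the search function could sit behind whatever web front end you prefer.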


If you can live with some manual work and no OCR, the digital version of a filing cabinet is scanning everything as (multi-page) PDFs and naming the files like "2018-07-24 #invoice #ACME #business.pdf". Then you can easily find the right document with just Spotlight or similar. Fully cross platform and likely to be readable by any device now and in the future.
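
If Spotlight is not available, the same naming scheme is easy to query with a short script. A minimal Python sketch, assuming every file is named exactly as `YYYY-MM-DD #tag #tag.ext` with single spaces between tags; `parse_name` and `find_by_tags` are names invented for this example, and files that do not follow the scheme are simply skipped.

```python
import re
from datetime import date
from pathlib import PurePath
from typing import Iterable, List, Set, Tuple

# Date, then zero or more " #tag" groups, e.g. "2018-07-24 #invoice #ACME".
NAME_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2})((?: #[^\s#]+)*)$")


def parse_name(filename: str) -> Tuple[date, Set[str]]:
    """Return the date and the lowercased tags encoded in a file name."""
    match = NAME_PATTERN.match(PurePath(filename).stem)
    if match is None:
        raise ValueError(f"not in 'YYYY-MM-DD #tag ...' form: {filename!r}")
    day = date.fromisoformat(match.group(1))
    tags = {tag.lower() for tag in match.group(2).split(" #") if tag}
    return day, tags


def find_by_tags(filenames: Iterable[str], *wanted: str) -> List[str]:
    """Keep the file names that carry every wanted tag (case-insensitive)."""
    needed = {tag.lower() for tag in wanted}
    hits = []
    for name in filenames:
        try:
            _, tags = parse_name(name)
        except ValueError:
            continue  # ignore files outside the naming scheme
        if needed <= tags:
            hits.append(name)
    return hits


names = [
    "2018-07-24 #invoice #ACME #business.pdf",
    "2018-07-20 #receipt #hotel #business.pdf",
    "notes.txt",
]
print(find_by_tags(names, "business", "invoice"))
```

Because the date comes first in ISO order, a plain alphabetical sort of the file names is also a chronological sort.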


Expenses: a Xerox MFD emails scans to my mailbox (RFID card scan, so no need to manually enter an email address), and the scans arrive in my inbox as email attachments. I then attach them to an ERP expense claim. The claim is approved in the ERP system by whoever owns the cost code I'm seeking approval for, then auto-submitted to finance once approved. It is pretty smooth.

Original stamped documents are filed for 9 years for audit purposes. YMMV on what type of document needs to be stored for what period of time.


I don't understand why organizations are still requiring retention of original documents. Scans/Faxes have been recognized as legally adequate copies for quite a while. Are there still regulations that require original documents?


The need for original copies does depend a lot on country.



I know about Receipts, which also can export its data to a few accounting programs. Unfortunately, it's a Mac-only app and I don't own a Mac. If anybody knows something similar for Linux, I'd be very happy.

https://www.receipts-app.com/index.html


What about plain old store receipts? Like on little scraps of thermal paper. What do you all do with those?


I can't imagine a scenario where I'd have to prove that I bought, say, a donut.

So 98% of store receipts go straight in the trash. Before I've even left the store, if possible.


Is this a shout-out to Mitch Hedberg?

reference: https://www.youtube.com/watch?v=fVTLFoB6yHk


I've been using https://foreceipt.com/ (iOS and web app, Android coming soon). It keeps all your data on your Google Drive account.


Do they do anything with your data? Some of these things analyze your spending and sell it to third parties.


Fujitsu N7100 Network Scanner, which I picked up open-box.

It scans to PDF with built-in OCR, then either emails the file or drops it on my NAS.

This is a great scanner: double-sided, has double-feed detection, fast, and does all the processing onboard.



There's Scanning Cabinet, part of Perkeep:

https://perkeep.org/app/scanningcabinet


I use TurboScan for Android for portable document scans. Don't use OCR much, so can't vouch for how good it is.

At home I use an old flatbed scanner.


I use the Microsoft Office Lens Android app. It uploads to OneDrive, which has a pretty basic UI but a quick full-text search.


Receipt Bank via email, Dropbox sync or direct upload via iOS app or web app.


A phone and the Microsoft Office Lens app.


Stay away from any websites or apps (there are numerous) powered by Metabrite's SDK. They data-mine and resell [1] the scanned receipts.

[1] https://www.metabrite.com/use-cases/


Office Lens and the Certify app.



