Features that I want for myself / nearly have:
- My mom should be able to install it.
- Scan from phone, upload to central server / nas
- Everything AES-encrypted when not logged in, decrypted on use (including the SQLite database, which is encrypted on shutdown)
- Documents can be decrypted with plain openssl, without having the original program.
- Full-text search through invoices (by extracting the PDF text and putting it in SQLite full-text indexes)
- Built-in PDF reader / printer
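For the full-text search item, here is a minimal sketch of what the indexing side might look like. It assumes the PDF text has already been extracted by some other tool, and that the local SQLite build includes the FTS5 extension (most do, but not all). The table name `invoices` and the helper names are made up for illustration.

```python
import sqlite3


def build_index(docs):
    """Index a {path: extracted_text} mapping in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # Assumption: FTS5 is compiled into this SQLite build.
    # `path` is stored but not searched; only `body` is tokenized.
    conn.execute("CREATE VIRTUAL TABLE invoices USING fts5(path UNINDEXED, body)")
    conn.executemany(
        "INSERT INTO invoices (path, body) VALUES (?, ?)", list(docs.items())
    )
    conn.commit()
    return conn


def search(conn, query):
    """Return the paths of matching documents, best match first."""
    rows = conn.execute(
        "SELECT path FROM invoices WHERE invoices MATCH ? ORDER BY rank", (query,)
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    index = build_index(
        {
            "2016/electricity.pdf": "Invoice electricity March total due",
            "2016/phone.pdf": "Invoice mobile phone plan",
        }
    )
    print(search(index, "electricity"))
```

In a real tool the database would live on disk next to the documents, and the encryption mentioned above would have to wrap it; that part is left out here.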
I've been working on a similar tool, but command-line based. The idea is to create predefined profiles (e.g. bills go into a specific folder, with specified tags and a specified resolution) and then just issue "$ scan bill" on the command line. Never finished that one so far though. If Paperwork works for me, I probably won't :)
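To make the profile idea concrete, here is a rough sketch, not the actual unfinished tool. The profile names, folders, tags and file naming are all hypothetical. It only builds a `scanimage` command line and a destination path; `--resolution` is a backend-specific option, so whether a given scanner accepts it is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Profile:
    folder: str       # subfolder of the archive the scan goes into
    tags: tuple       # tags to attach to the scanned document
    resolution: int   # scan resolution in DPI


# Hypothetical profiles; a real tool would probably read these from a config file.
PROFILES = {
    "bill": Profile(folder="bills", tags=("bill", "finance"), resolution=300),
    "letter": Profile(folder="letters", tags=("letter",), resolution=150),
}


def plan_scan(name, root, today):
    """Return (argv, destination, tags) for the named profile without scanning."""
    profile = PROFILES[name]
    dest = PurePosixPath(root) / profile.folder / f"{today.isoformat()}-{name}.png"
    # scanimage writes the image to stdout, so the caller would redirect it to `dest`.
    argv = ["scanimage", "--format=png", f"--resolution={profile.resolution}"]
    return argv, dest, profile.tags


if __name__ == "__main__":
    argv, dest, tags = plan_scan("bill", "/archive", date.today())
    print(" ".join(argv), ">", dest, "tags:", ", ".join(tags))
```

Keeping the planning step separate from actually running the scanner makes the profiles easy to check without any hardware attached.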
Edit: By the way, OCR-ed PDF archives in a directory structure can be searched nicely using pdfgrep (https://pdfgrep.org/).
I find the Linux/SANE scanner drivers unbearable. I use a network scanner that uploads the PDFs to an FTP server, which syncs to Dropbox. Works like a charm. Just need a tool to annotate and tag those PDFs.
Not saying that other solutions would not be better, but it all comes down to what one prefers to use :)
I'm going to use it to build an iPhoto replacement. What I've got so far is web-based, but I'll package it up with Electron.
I then have various Hazel (OS X) rules to automatically sort things into folders organized by date: receipts, bank statements, bills, etc.
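For anyone without Hazel, the same kind of rule can be approximated in a few lines of script. This is only an analogue under assumptions, not Hazel's own rule format: the keywords, folder names and the year/month layout are invented for the example, and matching is done on the file name only.

```python
from datetime import date
from pathlib import PurePosixPath

# Hypothetical rules: the first keyword found in the file name picks the folder.
RULES = [
    ("receipt", "receipts"),
    ("statement", "bank-statements"),
    ("bill", "bills"),
]


def target_path(filename, scanned_on, rules=RULES, fallback="unsorted"):
    """Return folder/YYYY/MM/filename for a scan, based on keyword rules."""
    lowered = filename.lower()
    folder = next((dest for keyword, dest in rules if keyword in lowered), fallback)
    return PurePosixPath(folder) / f"{scanned_on:%Y}" / f"{scanned_on:%m}" / filename


if __name__ == "__main__":
    print(target_path("Electric-Bill.pdf", date(2016, 3, 5)))
```

Moving the file is left to the caller (for example with `shutil.move`), so the rules can be tried out without touching anything on disk.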
Works somewhat well! You can do a search in Finder, which will also search contents, etc.
I would totally use this, but right now I have a hard time justifying actually having a scanner in my house. I just don't have enough paper coming in (well, I do, but I throw it away because I know I won't look at it).
The best (most accurate/flexible) OCR software I have ever used is ABBYY, which works so well I can only infer that it is powered by magic. Unfortunately that magic is proprietary, somewhat expensive (though not so bad really) and Windows-only. I used it to help my mother digitize hundreds of pages of salary data for a consulting job she was doing. The text was formatted oddly, and even so we only had a handful of errors in about 800 pages.
It doesn't have the awesome organizational stuff that Paperwork does, however, which is what I've really been wanting for a while. This reminds me of an awesome app I used when I had a Mac: DEVONthink. It's basically a personal document database, Mac-only and extremely useful, and it was one thing I definitely missed. I use [the excellent, highly recommended for academics] Mendeley to organize PDF journal articles and such, but it's not so great for scans.
I doubt Evernote wrote their own OCR engine. Any idea what they use?