I use recoll[1] as my main homedir index/search tool. It supports many document formats and digs recursively into archives. It has a no-nonsense GUI and a simple CLI. The search syntax is easy and flexible, allowing searches by many kinds of metadata in addition to plain full-text search. It's slow with my collection of ~2.8 million files and the index on spinning rust, but it's thorough and reliable.
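For anyone who wants to try the CLI side, here is a rough sketch of a wrapper. The helper name `rq` and the example paths are made up, and it assumes your `recoll` build accepts `-t` (command-line mode, no GUI) and `-b` (print only result URLs); check `recoll -h` before relying on it.

```shell
# Hypothetical helper: run a Recoll query from the terminal and list matching files.
# Assumption: this recoll build supports -t (no GUI) and -b (URLs only).
rq() {
    recoll -t -b "$@"
}

# Example queries using Recoll's field syntax (shown as comments, not run here):
#   rq 'ext:pdf invoice'                 # full-text "invoice", PDF files only
#   rq 'author:smith dir:/home/me/work'  # author metadata, limited to one directory
```

Keeping the flags inside a function means the query itself stays the only thing you type.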
For email, I store everything in mboxes, and index/search with Mairix[2]. It's wickedly fast and the search capabilities make Gmail blush. I use a little script to search:
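#!/bin/sh
# Write the matches to a scratch results file named after this shell's PID, then browse it in mutt.
if mairix -o $$ "$@"; then
    mutt -e 'set quit=yes' -f ~/Mail/$$
fi
# Clean up the scratch file whether or not mairix succeeded.
rm -f ~/Mail/$$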
It's very common that I want to find an email from sometime in the last month, so my most common search is something like:
search d:1m- alice
And I'll have an instant view of all emails from alice in the last month.
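If you want to narrow that kind of search to the sender, a small wrapper is one way to do it. This is only a sketch: `recent_from` is a made-up name, it assumes the `search` script above is on your PATH, and it relies on mairix's `f:` prefix (match the From header) and its relative dates (`3d`, `2w`, `1m`, `1y`).

```shell
# Hypothetical helper built on the `search` script: mail from one sender
# within a recent window.
#   $1 = word to match in the From header
#   $2 = how far back to look, in mairix relative-date form (defaults to 1m)
recent_from() {
    search "f:$1" "d:${2:-1m}-"
}

# Examples (shown as comments, not run here):
#   recent_from alice       # From matches "alice", last month
#   recent_from alice 2w    # same sender, last two weeks
```

The trailing `-` in the date term leaves the end of the range open, so it runs up to now.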
There are great CLI tools already in this thread, but for some of my side-gig work I'm searching large piles of PDFs, doc formats, and ePubs with a GUI word processor open, and I need to reference the source by page/graf number. For those I use DocFetcher[1], a quirky and intermittently updated Java app that indexes file contents and provides rudimentary relevance searching along with regex. I index my docs, put the database it generates into a read-only shared directory, and point systems across OSes at that db so I can search quickly regardless of which box (or where) I'm working from. I can also toss the app, db, and docs onto a thumbdrive for portability.
There's a commercial version that prioritized bugfixes, making the free and open version less attractive than it used to be. But it's still one of the better tools for the job when you want more than a grep-equivalent.
At my previous job I created something similar for a recruitment sister company. They had a ton of CVs in all kinds of formats (Word, Excel, PDF, RTF, plain text, etc.). I used Lucene.NET to do the indexing.
Both companies no longer exist, and since then I've needed to find some text in docs of my own. If I had a bit of time I could recreate the app pretty easily.
On Windows I use Agent Ransack. I don’t know if it’s the best, but it works well and predictably, unlike the built-in Windows search, where I still don’t understand how to reliably search for something.
ripgrep has been a godsend for my bash workflow. It feels impossibly fast when used on git repositories. The caveat is that by default it skips files matched by .gitignore, hidden files and directories, and binary files.
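When those defaults get in the way, ripgrep has flags to switch each filter off. A minimal sketch follows; the function name `rg_all` is made up, while the flags are ripgrep's own.

```shell
# Hypothetical helper: search with ripgrep's automatic filtering turned off.
rg_all() {
    # --no-ignore : do not honor .gitignore / .ignore files
    # --hidden    : include dotfiles and dot-directories
    # --text      : search binary files as if they were text
    rg --no-ignore --hidden --text "$@"
}

# Examples (shown as comments, not run here):
#   rg_all 'TODO' .
#   rg -uuu 'TODO' .   # built-in shorthand that also disables the filtering
```

Each flag can also be used on its own, for example `rg --hidden` if you only want dotfiles back.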