Invaluable tool. Can handle massive collections of files. For power users it feels more appropriate to have a dedicated search engine application like recoll rather than embed search in the desktop (for a long time I was disabling baloo on KDE desktops as it would make the machine unusable while it was indexing).
The recoll UI could be improved, and in general the integration, e.g. with Python scripting or other tools, could be made easier, but this is a much appreciated project and it is good to see it keeps being developed.
In an ideal universe projects such as this should converge with other desktop apps to create truly empowering information tools (repatriating the agency that has been relinquished to the "cloud")
Resource use has been the death of every full-text desktop search I have tried. How lightweight is Recoll? I don't see much about this on the website, other than that I have to use a third-party CPU limiter if I want to limit CPU use.
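For anyone wanting a quick way to keep an indexer from hogging the machine, here is a minimal sketch that starts a command at a lower scheduling priority. It is not a Recoll feature and not a hard CPU cap: it only raises the process's nice value, so the indexer yields to interactive work. The helper name `run_low_priority` and the default niceness are my own choices, and it assumes a POSIX system.

```python
import os
import subprocess


def run_low_priority(cmd, niceness=10):
    """Run `cmd` with its nice value raised by `niceness` (POSIX only).

    A higher nice value means lower scheduling priority. This does not cap
    CPU usage; it only makes the process give way when others need the CPU.
    """
    return subprocess.run(
        cmd,
        # Runs in the child just before exec, so only the child is affected.
        preexec_fn=lambda: os.nice(niceness),
        capture_output=True,
        text=True,
        check=False,
    )
```

Assuming `recollindex` is on your PATH, `run_low_priority(["recollindex"])` would run an indexing pass this way and return the completed process.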
Both recoll and Xapian (the index engine) are written in C++. The document filters are in Python but they only run once per document and tend to be simple and fast (easy to add your own too btw). For my use case of 15GB or so PDF files it has been lean and fast. It has a pragmatic Unix tool feel. I am only using the CLI though so can't speak for the GUI.
Another Xapian-based tool I use is notmuch (email) and that one is very snappy too.
What is quite handy usability-wise: incremental index updates (after adding new files) are fast and can be done on the fly while the desktop is in full use.
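To illustrate why incremental updates can be cheap, here is a toy sketch of the general idea: remember a signature (modification time and size) per file and only re-read files whose signature changed. This is not how Recoll or Xapian implement it. The function name `incremental_update`, the in-memory dict index, and the restriction to `*.txt` files are all assumptions made for the example.

```python
from pathlib import Path


def incremental_update(root, index):
    """Update `index` in place and return the sorted list of re-indexed paths.

    `index` maps a file path to {"sig": (mtime_ns, size), "words": set_of_words}.
    Files whose signature is unchanged are skipped; entries for files that
    no longer exist are dropped.
    """
    seen = set()
    changed = []
    for path in Path(root).rglob("*.txt"):
        key = str(path)
        seen.add(key)
        st = path.stat()
        sig = (st.st_mtime_ns, st.st_size)
        entry = index.get(key)
        if entry is not None and entry["sig"] == sig:
            continue  # unchanged since the last pass, nothing to do
        text = path.read_text(encoding="utf-8", errors="replace")
        index[key] = {"sig": sig, "words": set(text.lower().split())}
        changed.append(key)
    # Purge entries for files that have been deleted.
    for key in list(index):
        if key not in seen:
            del index[key]
    return sorted(changed)
```

On the second and later passes only new or modified files are read, which is why adding a handful of documents to a large collection does not require a full re-index.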
Recoll is not just useable for desktop applications, it can also be used as a local web search engine through recoll-webui [1] (link goes to my own repo which has some modifications to make it work with the Searx/SearxNG engine) which in turn can be used as an "engine" in Searx and SearxNG through the recoll engine [2] (which has been merged so it is no longer necessary to pull it from my repo).
This last option makes Searx/SearxNG useable for all types of searches, both local as well as remote. I've been using this exclusively for many years now over a large collection of documents (about 600.000 entries) with good results.
It is also available for macOS, and I was wondering what benefits Recoll would give there. From the website:
> It seems that Recoll will sometimes find data that Spotlight misses (especially inside pdfs apparently, which is probably more to the credit of poppler than recoll itself).
I have an additional text search tool for macOS. It was slightly better than Spotlight, but I used it so rarely that I forgot its name. The thing I miss most often on macOS is "which movie file has that scene that I remember so clearly". Isn't it time already to do something like that? That would be a noticeable breakthrough indeed. No?
I'm not aware if it exists in software form yet, but presently your use case can be solved by having a mutually advantageous social transaction with a /r/tipofmytongue user who is happy to help by recalling the title of the movie (or similar) you're thinking of.
Thank you for the suggestion. Not all my video files are publicly available movies, though. And sometimes I know the movie, but am struggling to find the exact time the episode takes place.
Use a browser extension like SingleFile to save pages you want to refer back to later to local HTML, then let Recoll index them.
If you have something doing this to every page you visit, and Recoll can see it, then Recoll can index it.
Regarding automatically saving every page you visit, there are multiple tools that do this. One I played with and liked, though I can't remember the name: it's 5 numbers and refers to a port you can type with localhost to search through all recorded pages. That or something like it would work really well with Recoll.
I use this script to make recoll produce pdfgrep-like output so that I can use it with Emacs and pdfgrep.el. This gives a nice interactive way to wade through thousands of pdf files.
On Windows I have been using Listary because it integrates into all file dialogs and Windows File Explorer. The killer feature is that it will navigate in file dialogs to where you are in the file explorer. Makes saving files so much easier.
It doesn't index contents, just filenames, so it is fast.
I am a bit surprised by the downvotes. I guess people are interested in getting full-text search, which for me has never really worked well because results are too noisy (thinking back at least to solutions such as Copernic).
Yeah, there are a number of options for searching by file name etc.; Everything works well for me.
Searching within file contents seems to lack good options right now. Historically you had X1 desktop search, Google had a desktop search product, and I think Copernic. But most seem to be out of date.