I feel that most people gave up on organizing data and just went with the concept of searching. This is a shame, because searching wastes everyone's time filtering out false positives, while the small effort of tagging new content goes a long way toward enabling discoverability. Web sites that let you filter content on desired and undesired tags give you an optimal way to recover information.
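As a toy illustration (my own sketch, not any real site's API), the desired/undesired filter is just two set operations, which is part of why it's so effective:

```python
def filter_items(items, wanted, unwanted):
    """Keep items that carry every wanted tag and none of the unwanted ones."""
    wanted, unwanted = set(wanted), set(unwanted)
    return [name for name, tags in items.items()
            if wanted <= tags and not (unwanted & tags)]

# Hypothetical tagged library
library = {
    "a_new_hope.mkv": {"scifi", "starwars", "movie"},
    "alien.mkv":      {"scifi", "horror", "movie"},
    "holiday.jpg":    {"photo", "family"},
}

print(filter_items(library, wanted={"scifi"}, unwanted={"horror"}))
# → ['a_new_hope.mkv']
```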
A very interesting feature in Tagsistant is tag relations. It enables tree hierarchies for tags ("anything starwars-related is also scifi-related"). Kind of ironic how they wanted to get away from the tree structure of files, and then implemented a tree structure for tags. Perhaps a tagging system for organizing your tags would be better? :)
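For what it's worth, the "starwars implies scifi" idea is essentially a transitive closure over a parent relation. Here's a minimal sketch (my own, not Tagsistant's actual schema) of how a query for one tag can be expanded to include everything it implies:

```python
# Hypothetical relation map: child tag -> set of parent tags
parents = {"starwars": {"scifi"}, "scifi": {"fiction"}}

def expand(tag):
    """Return the tag plus every ancestor implied by the relation tree."""
    seen, stack = set(), [tag]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, ()))
    return seen

print(sorted(expand("starwars")))  # → ['fiction', 'scifi', 'starwars']
```

So anything tagged "starwars" would match a search for "scifi" or "fiction" without the user ever tagging it as such.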
This meta-structure of data is fascinating. Is there a good resource that systematizes the area, with best practices and implementation tips?
Then I realised that they built it on top of FUSE and SQL, and breathed a sigh of relief.
((EDIT: Perhaps this is a little harsh? I didn't mean to be harsh, just precise, but perhaps I went a little over the top -- I apologize to the authors if I did.))
I investigated the FUSE/db option a year or two ago, and personally I don't see it as an interesting or compelling solution to the file<->tag problem. Because users move and rename files, pathnames are potentially semantically meaningless to the tag system. The contents of files change often and arbitrarily (given things like MS Word's formats, which are literal memory dumps of whatever Word is doing, not to mention encrypted files, etc.), so file hashes are potentially semantically meaningless to the tag system as well.
In other words, basic file operations (reading, writing, renaming) will break your tags unless significant work goes into keeping the file<->tag relation consistent. You can try to mitigate this with systems that keep track of files (inotify, etc.), but that introduces a runtime cost and has technical difficulties of its own. It's 'designed' (albeit unintentionally) to break from the start, and the developer has to exert a large amount of effort to stop it from breaking. To me the effort didn't seem worth it; the innate flaws weren't worth surmounting. Unfortunately, to keep this from being a 'debbie downer' post I'd have to talk about the alternative approach, which I don't really have the space (or, right now, the time) to do here.
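To make the breakage concrete, here's a toy model (my own sketch, not Tagsistant's schema) of a path-keyed tag store. A plain `mv` happens entirely outside the database, so the tags are silently orphaned:

```python
import sqlite3

# Path-keyed tag store, the naive design
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tags (path TEXT, tag TEXT)")
db.execute("INSERT INTO tags VALUES ('/docs/report.doc', 'work')")

# The user renames the file with mv; the filesystem doesn't tell us.
# Looking it up under the new name finds nothing:
rows = db.execute(
    "SELECT tag FROM tags WHERE path = ?", ("/docs/report-final.doc",)
).fetchall()
print(rows)  # → []  — the 'work' tag is now orphaned
```

This is exactly the gap that inotify-style tracking tries to paper over, at the runtime cost mentioned above.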
If you want to support hard links, you can decide whether to associate tags with files or with inodes, depending on whether you want all linked files to share the same set of tags.
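A quick sketch of the inode option (assuming a POSIX system where hard links are available): key the tag store by `(st_dev, st_ino)` instead of by path, and every hard link to the same inode automatically shares one tag set.

```python
import os
import tempfile

tags = {}  # (device, inode) -> set of tags

def inode_key(path):
    st = os.stat(path)
    return (st.st_dev, st.st_ino)

with tempfile.TemporaryDirectory() as d:
    a = os.path.join(d, "a.txt")
    b = os.path.join(d, "b.txt")
    open(a, "w").close()
    os.link(a, b)  # b is now a hard link to a

    tags.setdefault(inode_key(a), set()).add("important")
    shared = tags[inode_key(b)]  # look it up via the *other* name

print(shared)  # → {'important'}
```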
That would make things slightly better, but it's really not how things are supposed to look from userspace. You still have the problem of tags not being preserved across file copies, or across filesystem boundaries (though the latter is an almost universal problem in this space, I guess).
Tagsistant has been around for ages... here's it being discussed on HN in 2011:
At the moment the program's internals have been worked through and the documentation written up, and I'm just plugging the user-level interface together, so maybe a few days to a week to run through everything. I'm still not comfortable with my application-level testing either, so we'll see, I guess.
I guess a fun way of saying it would be, software is an artisanal craft so out of respect for what I'm building and for the users, I don't really want them to see something that is obviously imperfect, until I've smoothed those over.
The less fun way is that I don't want the responsibility of someone running it in production, then for things to go belly-up, haha
Would this later be connected to something like Spotlight on OSX?
Moreover, such a tagging system belongs in an OS-wide service; it's not something that should be implemented on its own by every application dealing with a particular media type.
(Also, (1) even if "should cover most of the use cases" is correct, why not cover all of the use cases?; and (2) the mere presence of more powerful tools can encourage ingenuity to use them in ways that wouldn't have been imagined if we only had purpose-built, specialised tools.)
You see, it is a big effort to manually input all this data.
What would be useful are reliable, public source entries.
You also want a documented data format for future compatibility, so you can migrate to the next format.
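One possible shape for that (my invention, not an existing standard): a versioned JSON sidecar, where the version field is what lets a future tool recognize old records and migrate them to the next format.

```python
import json

# Hypothetical v1 sidecar record for a tagged file
record = {"format": "tags-sidecar", "version": 1,
          "tags": ["scifi", "starwars"]}

def migrate(rec):
    """Upgrade a v1 record to a hypothetical v2 that stores tag objects."""
    if rec["version"] == 1:
        rec = {"format": rec["format"], "version": 2,
               "tags": [{"name": t} for t in rec["tags"]]}
    return rec

# Round-trip through JSON to show the format is plain serialized data
v2 = migrate(json.loads(json.dumps(record)))
print(v2["version"], [t["name"] for t in v2["tags"]])
```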
This has the further advantage that applications can use the metadata. Although I suppose the filesystem abstraction achieves the same?
I might be a bit too paranoid of file loss to install it though...
That is, some users won't do that, but by adapting everything to these users we are making everything worse for everyone, including them.
Instead of requiring every file to be properly tagged, why not take the Google approach altogether?
I'm not trying to argue against search; I'm arguing against everyone having to rely on search because UX designers decided to remove every other way of finding files, just because someone might someday forget to tag a file or whatever the procedure is.