
TMSU: a tool born out of frustration with the hierarchical nature of filesystems - goblin89
http://tmsu.org/
======
zaphar
I'm actually not frustrated with the hierarchical nature of filesystems. I'm
most frustrated with the state of filesystem search these days. I don't want
to tag and curate my files. I want to search them.

I've strung together something using bleve full-text search and some OCR libs
to scratch my particular itch but it still doesn't quite get all there.

~~~
kpil
I agree.

I actually like the hierarchical organisation, and I don't like the 10 year
usability trend, particularly driven by Microsoft's attempts to patch over
their horrid structure with even worse workarounds.

The solution to teaching my mother where to find her documents is not hiding
the place 15 levels deep and having 10 symlinks to it.

Fast, unobtrusive indexing and a good structure are all that is needed. Humans
organise and memorise things in hierarchies, and while categories are another
helpful abstraction, they tend to be confusing if they do not point to a fixed
position in a hierarchy, in my opinion.

You can also explore a hierarchy; not so with a cloud of things connected to a
bunch of tags.

~~~
blakeyrat
> Humans organise and memorise things in hierarchies,

Humans also track a lot of things based on spatial memory, which no OS since
Mac Classic has even _tried_ to make use of, which is a shame.

~~~
therein
How did Mac Classic make use of that?

~~~
chongli
In Mac Classic each folder you opened created a new window which remembered
all of its settings for size, position on screen, scroll position, icon
size/layout, etc. This window is an explicit and exclusive representation of
that particular folder and attempting to open the same folder again does not
create a new window, it simply shifts the focus to the already-open window.
The change in Mac OS X towards a browser for navigating the filesystem
hierarchy caused an active debate over spatial[0] vs. navigation (or
browser)[1] file managers.

[0]
[https://en.wikipedia.org/wiki/Spatial_file_manager](https://en.wikipedia.org/wiki/Spatial_file_manager)

[1]
[https://en.wikipedia.org/wiki/File_manager#Navigational_file...](https://en.wikipedia.org/wiki/File_manager#Navigational_file_manager)

~~~
blakeyrat
Ars did an amazing series of articles about the (IMO) missteps made in OS X,
and presented a theoretical design that would satisfy both Mac Classic fans
and fans of "browser-based" windowing systems.

[http://arstechnica.com/apple/2003/04/finder/2/](http://arstechnica.com/apple/2003/04/finder/2/)

Any software developer interested in designing usable software should read and
digest this. Alas, nobody working at Apple did, so I'm no longer an Apple
customer.

~~~
chongli
Thank you for this! I vaguely alluded to this (active debate) in my comment
but I had forgotten the source. I'm going to read through this now! :)

------
Mister_Snuggles
Stuff like this is cool, but it really points out just how poor the filesystem
is for organizing certain types of data. The "Why does TMSU not detect file
moves and renames?" question in the FAQ really highlights how this isn't
helped along by the filesystem at all.

One comment mentioned BFS, which had some really cool stuff. There's an Ars
Technica article that touches on some of it[0].

The secret to BFS, in my mind, is that applications use it. The Haiku Mail
app, as noted in the article, used the filesystem as its email database by
attaching its own attributes to messages. This is also an example used in the
"Practical Filesystem Design with the Be Filesystem" book[1].

Unless the metadata becomes a first class citizen in the filesystem, any
attempts to layer it on top will have problems. Either applications won't
understand it or normal filesystem operations will cause the metadata database
to become de-synced with the filesystem data.

[0] [http://arstechnica.com/information-technology/2010/06/the-be...](http://arstechnica.com/information-technology/2010/06/the-beos-filesystem/)

[1] [http://www.letterp.com/~dbg/practical-file-system-design.pdf...](http://www.letterp.com/~dbg/practical-file-system-design.pdf#75)

~~~
JadeNB
> Unless the metadata becomes a first class citizen in the filesystem, any
> attempts to layer it on top will have problems.

Remember when it seemed like Mac OS might give us a modern era of rampant
metadata
([http://arstechnica.com/apple/2005/04/macosx-10-4/6](http://arstechnica.com/apple/2005/04/macosx-10-4/6))?
Ah, those were the days.

~~~
Mister_Snuggles
Yup.

MacOS does an OK job of helping the user find things with Spotlight, but it's
not a full metadata system like BFS had.

Mail.app, for example, keeps each message in a separate file[0] (and probably
has a cache or separate database of this to make displaying mailboxes
quicker). This makes it easy for Spotlight to index, but all of the stuff that
you'd think of as metadata is actually just regular data inside the .emlx
file.

If Apple made a huge effort to start treating the metadata (assuming the
infrastructure described in the Ars Technica article still exists) as a first
class citizen and using it like BeOS did, maybe we can get there. This would
be a drastic rethink though. It feels like files, in some ways, are becoming
second-class citizens in the Mac world. Photos, for example, are managed in
the Photos app - you do not go into the filesystem and organize your photos.

One big problem with filesystem metadata is how do you transfer it? The Ars
article showed a sidecar file (._filename) being created when the file was
copied to a non-HFS volume. Now the metadata is detached from the file and
we're back to the same problem.

[0]
[http://mike.laiosa.org/2009/03/01/emlx.html](http://mike.laiosa.org/2009/03/01/emlx.html)

------
seagreen
A great idea! I think tags are definitely a better way to organize most
personal data than trees.

Also I like that they describe what data they actually change on your computer
right on the homepage: "TMSU does not alter your files in any way: they remain
unchanged on disk, or on the network, wherever you put them. TMSU maintains
its own database and you simply gain an additional view, which you can mount,
based upon the tags you set up."

Unfortunately building on a foundation of sand (meaning not TMSU's code, but
Unix filesystems) has downsides:

[https://github.com/oniony/TMSU/wiki/FAQ#why-does-tmsu-not-de...](https://github.com/oniony/TMSU/wiki/FAQ#why-does-tmsu-not-detect-file-moves-and-renames)

" Why does TMSU not detect file moves and renames?

To detect file moves/renames would require a daemon process watching the file
system for changes and support from the file system for these events. As some
file systems cannot provide these events (e.g. remote file systems) a
universal solution cannot be offered. Such a function may be added later for
those file systems that do provide file move/modification events but adding
support for this to TMSU is not a priority at this time.

The current solution is to periodically use the repair command which will
detect moved/renamed files and also update fingerprints for modified files.
(The limitation of this is that files that are both moved/renamed and modified
cannot be detected.) "

Ouch.
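
To make the limitation concrete, here is a minimal sketch of fingerprint-based
move detection. It is not TMSU's actual code: the index format (one
`sha256sum` line per file) and the function names `index` and `repair` are
assumptions made for this example, and `sha256sum` from GNU coreutils is
assumed to be installed.

```sh
# index DIR: print "<sha256>  <path>" for every regular file under DIR.
index() {
    find "$1" -type f -exec sha256sum {} + | sort
}

# repair INDEX_FILE DIR: for each indexed path that no longer exists,
# look for a file under DIR that still has the same fingerprint.
repair() {
    current=$(mktemp)
    index "$2" > "$current"
    while read -r sum path; do
        [ -e "$path" ] && continue
        # 64 hex digits + 2 separator characters, so the path starts at column 67.
        new=$(grep "^$sum " "$current" | head -n 1 | cut -c67-)
        if [ -n "$new" ]; then
            echo "moved: $path -> $new"
        else
            echo "missing: $path"
        fi
    done < "$1"
    rm -f "$current"
}

# Demo in a throwaway directory (file names are made up).
work=$(mktemp -d)
mkdir -p "$work/data/old" "$work/data/new"
echo "jazz" > "$work/data/old/summer.mp3"
echo "folk" > "$work/data/old/spring.mp3"
index "$work/data" > "$work/index.txt"

mv "$work/data/old/summer.mp3" "$work/data/new/summer.mp3"   # moved only
mv "$work/data/old/spring.mp3" "$work/data/new/spring.mp3"
echo "edited" >> "$work/data/new/spring.mp3"                 # moved and modified

report=$(repair "$work/index.txt" "$work/data")
echo "$report"
```

The file that was only moved is found again through its unchanged fingerprint.
The one that was both moved and modified has neither its old path nor its old
fingerprint, so it can only be reported as missing, which is the case the FAQ
calls out.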

~~~
kbenson
Hmm, seems like they could have gone the other way, thrown everything into a
DB, and then written a FUSE plugin to access it all through traditional file
system mechanics. That would have allowed for gating direct access such that
moves and renames could be dealt with accordingly. Of course, there are other
problems with that approach, but probably not as many as you might think (the
file system _is_ a database, so you're really just choosing a back-end that is
less likely to be directly accessed).

~~~
seagreen

> they could have gone the other way, throw everything into a DB, and then
> wrote a fuse plugin to access it all through traditional file system

This is the Camlistore strategy!

> Of course, there are other problems with that approach

Could you elaborate more on these? I've never worked with FUSE.

~~~
kbenson
The other problems I was alluding to weren't really with FUSE, but one that
does pertain to FUSE is speed, since FUSE imposes overhead through a daemon
running in user space, and the associated context switches because of that.
From just looking into it again, this may have been mitigated to some larger
or smaller degree with some FUSE performance enhancements in 2012.

Specifically, I was referring to the different off the shelf database systems
which could be used. Each will have its own benefits and drawbacks to storing
large chunks of data per-record. Benefits might include (relatively) easy
sharding or replication. Drawbacks might include not being space efficient for
removed files, not being as resilient to corruption due to crashes or
corruption affecting more than the files in use, or overly aggressive use of
memory to function efficiently.

If a custom database was developed, you could tailor to your exact needs, but
then you have much more work to do, and a period of immaturity.

Off the top of my head, if I were designing a general purpose system for
tagging files where people were expected to use it as a regular file system
and some overhead from FUSE was acceptable, I think I would leverage the file
system but in a different way. I would set up a specialized directory for the
files themselves, and store them hashed within it, and have a Berkeley DB
database relate filename to hash and tags, and use FUSE to do direct file
access. But that's my 5 minute assessment, so I reserve the right to change it
completely given someone pointing out the obvious problems. :)
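
A rough sketch of that layout, under loud assumptions: a tab-separated text
file stands in for Berkeley DB, there is no FUSE layer at all, and `STORE`,
`store_add` and `store_find` are hypothetical names. It only shows the
hash-addressed blob directory plus an index relating filename to hash and
tags, and assumes `sha256sum` is available.

```sh
# Blob directory plus index, created in a throwaway location.
STORE=$(mktemp -d)
mkdir -p "$STORE/blobs"
: > "$STORE/index.tsv"

# store_add FILE TAG...: copy FILE into the store under its hash and
# append an index row: name, hash, ",tag1,tag2," (commas act as sentinels).
store_add() {
    file=$1; shift
    hash=$(sha256sum "$file" | cut -c1-64)
    cp "$file" "$STORE/blobs/$hash"
    printf '%s\t%s\t,%s,\n' "$(basename "$file")" "$hash" \
        "$(echo "$*" | tr ' ' ',')" >> "$STORE/index.tsv"
}

# store_find TAG: print "name<TAB>blob path" for every file carrying TAG.
store_find() {
    awk -F '\t' -v tag=",$1," -v dir="$STORE/blobs" \
        'index($3, tag) { print $1 "\t" dir "/" $2 }' "$STORE/index.tsv"
}

# Demo with made-up files.
inbox=$(mktemp -d)
echo "jazz audio" > "$inbox/summer.mp3"
echo "tax notes"  > "$inbox/taxes.txt"
store_add "$inbox/summer.mp3" music big-jazz
store_add "$inbox/taxes.txt" finance

hit=$(store_find music)
echo "$hit"
```

Because blobs are named by content hash, renaming a file only touches the
index row, and identical files are stored once. Tags containing spaces, tabs
or commas are not handled in this sketch.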

------
fh973
This idea is rediscovered every few years in a new project. I tried this with
StorageBox many years ago and even did some UX research in this context. Turns
out many users don't like query-only access, as they feel they cannot search
their data exhaustively, and might "lose" some of their data this way.

Also, for non-tech-savvy users, folders have the nice interaction pattern of
question-response via menu selection. They see something, click, see
something, click, without realizing that they are navigating a folder
hierarchy.

~~~
JoeAltmaier
Not convinced. Why wouldn't a 'tag browser' work just like that too? Even
better, since what I'm clicking on are meaningful tags, instead of 'directory
names'

------
AceyMan
I empathize with the OP's view that there's a usability gap with out of the
box file systems/namespace (short of bash/sed/awk/perl wrangling, natch), so I
applaud this project.

I have a couple of points/concerns after a quick read.

1) So 'tag' is the verb that updates the records, but 'tags' is a read?

Poor taxonomy, IMHO. If I'm using "tmsu" I _know_ I'm working with tags, so
I'd think natural switches would be the way to go (tmsu add, tmsu ls|list,
etc.). Or at least don't make the "create" verb and "read" verb differ only by
the plural 'S.'

2) Is there a way to list all existing tags? —not just the ones already bound
to a file but all available in the database (with regex filtering, of course).

That's what I'd need to pick from my 'tag palette' _before_ actually tagging
so I could avoid creating synonyms accidentally that'd later require a merge.

------
mywittyname
The reason hierarchical structures are used in file systems is because they
are a pretty intuitive and, most importantly, _generic_ way of classifying and
storing information. For the most part, just about any file you have on your
computer can be stuffed into some sort of folder structure.

Specialized files like music, movies and images are a solved problem. iTunes
and other software do a great job at organizing this information and making it
easy to use. There's also software like Quicken for dealing with the other
common mountain of data people have.

What _might_ be useful is a piece of software that is capable of extracting
and managing metadata automatically. Think of a tool like iTunes: you feed
it a collection of files and it uses some form of ML to extract metadata,
create a database, and build logical ontologies for this data. The big problem
with this kind of tool is finding a large, complex dataset that an individual
has, but that has not been organized by a specialized piece of software. I
doubt these exist in numbers significant enough to justify creating a project.

tl;dr: Directory structures are low-effort, generic, and discoverable ways of
dealing with files that are not managed by other applications. It will be hard
to improve on them without sacrificing one of those three attributes.

~~~
JoeAltmaier
I don't think it's intuitive at all. We don't ask questions like "What parent
folder did I put that document in?" We ask "Where is that document I printed
yesterday? The one that I got via email."

With liberal use of tags, and the ability to browse them fluidly, we could ask
those sort of questions.

~~~
mywittyname
> My Documents [Sort by Date]

At the top.

The way that one person manages their documents probably isn't going to be the
same as another, but generally a person is consistent with all their files.

In this case, it's either a document that you have several types of, in which
case you would have an existing folder structure, or it's a one-off that you
dump in My Documents along with all your other one-offs. Even if you lose it,
file systems all keep date-time information, so you can easily search for
all the files last modified yesterday.

The problem with liberal tagging is that it requires a bunch of up-front
effort that you're never going to perform. In this case, if it were an email,
you'd just use your email browser (specialized software!) to find the document
again based on the things you remember about it (date, source, size, etc).

~~~
JoeAltmaier
I don't remember it by date or size or maybe even source. That's just
parroting what we do now. I remember that I printed it. That could easily be a
tag. The tags need not be created by me; the tools could be promiscuously
tagging persistent data all the time, with useful clues. I'd learn some clues,
learn to use them.

That 'my documents' thing - you could easily create a 'view' on tags that
yielded that result. Without relying on Microsoft or whomever to do it for
you.

------
Skunkleton
How is this different than something like this:

    
    
        mkdir -p ~/tags/{music,big-jazz,mp3}
        ln -s /path/to/summer.mp3 ~/tags/music/
        ln -s /path/to/summer.mp3 ~/tags/big-jazz/
        ln -s /path/to/summer.mp3 ~/tags/mp3/
    

From there, you can use all normal filesystem tools to interact with your
'tags'. You could extend this with a simple script that handles duplicate file
names in the same tag by sticking a hash of the file before the extension.
Having a separate database for this information seems unnecessary.
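
The "simple script" could look roughly like the sketch below. `TAG_ROOT` and
`tag_file` are hypothetical names, source paths are assumed to be absolute,
and `sha256sum` is assumed to be installed.

```sh
# Where the tag directories live; override as needed.
TAG_ROOT=${TAG_ROOT:-$HOME/tags}

# tag_file ABSOLUTE_PATH TAG...: symlink the file into each tag directory.
# If a different file already uses the same name inside a tag, insert a
# short content hash before the extension instead of overwriting.
tag_file() {
    src=$1; shift
    base=$(basename "$src")
    for t in "$@"; do
        mkdir -p "$TAG_ROOT/$t"
        dest="$TAG_ROOT/$t/$base"
        if [ -e "$dest" ] || [ -L "$dest" ]; then
            # Same file already tagged: nothing to do.
            if [ "$(readlink "$dest")" = "$src" ]; then continue; fi
            hash=$(sha256sum "$src" | cut -c1-8)
            case $base in
                *.*) dest="$TAG_ROOT/$t/${base%.*}.$hash.${base##*.}" ;;
                *)   dest="$TAG_ROOT/$t/$base.$hash" ;;
            esac
        fi
        [ -L "$dest" ] || ln -s "$src" "$dest"
    done
}

# Demo: two different files that share a name, both tagged 'music'.
demo=$(mktemp -d)
TAG_ROOT=$demo/tags
mkdir -p "$demo/a" "$demo/b"
echo one > "$demo/a/summer.mp3"
echo two > "$demo/b/summer.mp3"
tag_file "$demo/a/summer.mp3" music
tag_file "$demo/b/summer.mp3" music
ls "$TAG_ROOT/music"
```

The first file keeps its plain name and the second gets a name of the form
`summer.<8 hex digits>.mp3`. Two files with identical content and name would
still map to the same link, which this sketch does not try to resolve.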

~~~
noonespecial
This supplies you with only one relation between the tags: "And Then". Ex.
'Path' (and then) 'to' (and then) 'summer.mp3'. Or 'tags' (and then)
'big-jazz'.

What you want is to also have AND, OR and NOT available. How would you find
'music' (and) 'mp3' (and not) 'big-jazz'?

~~~
dsp1234
I'm not saying this would be simple, or as nice as the tool given, but it's
not impossible.

First, do an 'ls' directory dump of the 3 folders. Then 'cut' out the softlink
destination from each entry. This will leave you with 3 files, each listing
softlink destinations. From there, 'grep' the music and mp3 files together to
get a list of softlinks that are in both music and mp3. Then you can do an
inverse 'grep' to remove softlinks that are in big-jazz.

Even if the system creates hard links instead of softlinks, the same could be
done via inode numbers.
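
A minimal sketch of those steps, with two substitutions stated plainly:
`readlink` replaces the 'ls' plus 'cut' step, and the file names and the
`tag` and `targets` helpers are made up for the demo. It answers the query
'music' (and) 'mp3' (and not) 'big-jazz'.

```sh
# Build a small symlink-based tag tree in a throwaway directory.
root=$(mktemp -d)
mkdir -p "$root/files" "$root/tags/music" "$root/tags/mp3" "$root/tags/big-jazz"
for f in summer.mp3 winter.mp3 notes.txt; do
    : > "$root/files/$f"
done

# tag FILE TAG...: one symlink per tag.
tag() {
    file=$1; shift
    for t in "$@"; do
        ln -s "$root/files/$file" "$root/tags/$t/"
    done
}
tag summer.mp3 music mp3 big-jazz
tag winter.mp3 music mp3
tag notes.txt  music

# targets TAG: sorted list of link destinations for one tag.
targets() {
    for link in "$root/tags/$1"/*; do
        [ -L "$link" ] && readlink "$link"
    done | sort
}
targets music    > "$root/music.lst"
targets mp3      > "$root/mp3.lst"
targets big-jazz > "$root/jazz.lst"

# AND: keep lines of music.lst that appear verbatim in mp3.lst.
# AND NOT: drop lines that appear verbatim in jazz.lst.
result=$(grep -Fxf "$root/mp3.lst" "$root/music.lst" | grep -vFxf "$root/jazz.lst")
echo "$result"
```

Only `winter.mp3` survives both filters. OR would be `cat` of two lists piped
through `sort -u`.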

------
prewett
I feel like I must be missing something, but I feel like tags are just a
band-aid. If you have enough files, you will end up with too many tags to manage
(think of a tag directory filled with tag files, instead of a home directory
filled with actual files), so you'll need hierarchical tags. Or kludge it with
"tag/subtag" which looks a lot like a directory. GMail added nested tags,
which seems to me like an admission that flat tags are not sufficient.

One problem is that regular people don't know how to organize a hierarchy.
General -> specific works really well, but that requires the ability to
generalize.

The only use I can see for tags is if you want files to be a member of more
than one directory. Other than music, everything I have is generally created
for a specific purpose, so tags are not particularly helpful.

------
JoeAltmaier
It's been a generation since the hierarchical file system became obsolete. It's
lame to have to hang every file on the ceremonial file tree like some
Christmas ornament. Hardly any app wants data organized like that.

In fact, nearly every large app does something to avoid it. They create their
own representations of a log, or a mail folder, or a document, or an image
(and on and on) and manage the details themselves. Because 'file systems' are
so lame and underpowered.

This tool begins to help. Creating flexible groupings (tags) resembles a
relational database. That's a start. I'd like to replace the OS file system
with something like that.

Instead of renaming files when you bring another copy onto your persistent
storage, you could just add a version tag to them. Leave the names alone! I
can tell my build system (or document store, or mail tool) what version I want
to deal with e.g. tag='version' value='2.5'. No collisions any more. No
requirement by the 'file system' to mash them into some file tree so they can
still be found, but don't collide.

In fact this system can do everything the hierarchical file system can do, and
more. Just add the tag 'parent directory name' and voila! You have a file tree
(if you want).

------
crescentfresh
So tag order is important?

i.e. `music mp3 folk` results in a different virtual file system than `folk
music mp3`?

I often think of tags as an unordered set, rather than an ordered list.

~~~
GordonS
An ordered list smells a bit like a hierarchy to me...

------
qwertyuiop924
This is actually kind of akin to BeFS. Although BeFS had greater capabilities
in some ways, being an actual filesystem. For the uninitiated, BeFS was the
native filesystem of BeOS, and allowed for metadata attributes that allowed
for querying and indexing capabilities akin to a relational DB. Or at least,
that's what Wikipedia says.

------
auganov
I just keep all my work-files (documents, downloads etc) in my desktop folder
(it's the "top-level" folder in Windows when you Alt-Up, have it symlinked at
/desktop/ too).

My default view is a detail view sorted by "date accessed" (descending) which
is what I need 99% of the time. Especially handy when uploading random images,
quick edits etc from the browser.

btw I highly recommend
[https://pathcopycopy.codeplex.com/](https://pathcopycopy.codeplex.com/) for
those that use a terminal and win explorer at the same time a lot.

------
gumby
Why not use file attributes and provide a nice interface for managing them?
(e.g. extend find to search them, etc)? A parallel database is fragile -- it
can trivially get out of synch.

~~~
codezero
I thought the same thing, but in practical terms you'd need to maintain a
separate db for portability anyways I guess.

------
swalsh
I seem to recall reading somewhere that one of the reasons Vista was a "bad"
OS was that it originally had much higher, more optimistic ambitions to build
a SQL-like file system (I suppose similar to this). It however had issues, and
a decision to scrap it delayed the Vista project, and reduced the scope of
"cool" things it was supposed to deliver.

~~~
rincebrain
The thing you're thinking of was indeed supposed to be in Vista, then
post-Vista, and then scrapped altogether - WinFS.

[https://en.wikipedia.org/wiki/WinFS](https://en.wikipedia.org/wiki/WinFS)

~~~
marcosdumay
It was promised for Cairo (that became Win95), under the name of Object File
System.

Different name, exact same promises.

------
deprave
UI nitpicking:

These:

$ tmsu tag summer.mp3 music big-jazz mp3

$ tmsu tag --tags "music mp3" foo.mp3 bar.mp3

$ tmsu tag spring.mp3 year=2003

Are confusing. Very non-Unix. They should be:

$ tmsu tag music,mp3,year=2003 summer.mp3

~~~
RubyPinch
I have to ask, how are they non-unix?

~~~
deprave
They're inconsistent. Sometimes the file name comes before something and
sometimes after something. It's a lot easier to remember commands if their
structure is uniform, and Unix commands usually have the general structure of
"program [options] [files]"

------
jandrese
This is an interesting idea, but it seems like an awful lot of work on my part
to go through and organize and tag every file. While some of it could be
automated (pulling ID3 tags out of MP3s), a lot of it seems to depend on me
figuring out good names for everything I make.

The biggest problem is that I have to figure out which tags are going to be
useful to me in the future and where to add them. This is relatively easy for
music (but even there can explode in complexity depending on how granular you
want to be), but more difficult for things like photos or papers.

IMHO, fully general tagging systems never work because the complexity explodes
as the number of potential tags increases. You need to narrow the scope down
to a specific domain so your tags can be limited to human scale.

------
pnathan
Very cool attack on the problem.

I've been working on ideas for managing filesystems and tools for thought for
a while off and on, it's starting to coalesce into a set of design ideas as
well as prototypes.

What I won't do is set up a filesystem: I don't think that adds value to me;
it mostly sounds technically complex and hard to figure out. And, I think that
driving the whole business through manual tagging is a lost cause. Manual
tagging can be _useful_, but actual attempts to derive semantic knowledge will
be _more_ useful. I have many thousands of documents - manual tagging ain't
gonna happen.

I need _semantic_ search and _semantic_ cross-referencing; something like a
Xanadu or a (much better) wiki/hypertext system.

------
edward
You can do the same thing with git-annex:
[https://git-annex.branchable.com/tips/metadata_driven_views/](https://git-annex.branchable.com/tips/metadata_driven_views/)

------
DannyBee
This reminds me of BeOS's BFS, which let you do all this as part of the FS
itself :)

------
jprzybyl
I can't believe this, but I created something like this in an ad-hoc way. I
have a large amount of music, and I needed to be able to transfer it to
devices (my phone) based on some sort of tag - I need work-out music, I need
driving music, etc. Genres are inappropriate for this. The easiest way to
transfer this to the device is have a directory full of links that point to
the right files / directories.

So, depending on the need, I will either have two directories (files and tags)
or a number of file directories and a tag directory. File directories can have
whatever they want, and tag directories have either only tag directories (like
workout, driving, etc) or soft links.

Tagging a file / directory is easy - just link to it. Untagging is just as
easy. The links don't take up much space, especially next to the music. When
I'm transferring the files, I either use a script to make a directory with the
links replaced with their file counterparts, or I transfer with something like
rsync that can do that itself.
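
A small sketch of that export step, using `cp -RL` to replace the links with
their file counterparts. The directory layout and file names are made up, and
the `rsync` destination in the trailing comment is a placeholder.

```sh
# Throwaway library: one real file and one tag directory linking to it.
lib=$(mktemp -d)
mkdir -p "$lib/files" "$lib/tags/workout" "$lib/export"
echo "audio data" > "$lib/files/run.mp3"
ln -s "$lib/files/run.mp3" "$lib/tags/workout/run.mp3"

# -R recurses and -L follows symlinks, so the copy holds regular files.
cp -RL "$lib/tags/workout" "$lib/export/"
ls -l "$lib/export/workout"

# rsync can dereference links during the transfer itself with -L
# (--copy-links), for example:
#   rsync -aL "$lib/tags/workout/" phone:/Music/workout/
```

The exported directory can then be copied to a device that does not
understand symlinks, while the tag directory itself stays cheap.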

I'm amazed how similar this project is to my own solution. It's nice to have a
dedicated script for the whole thing, but the solution itself is very simple,
and easy to script with.

This isn't born out of frustration - hierarchical filesystems are perfectly
adequate for most tasks. But they have not been trees for a long time - we
have links, which let us make any graph we want out of those trees.

------
caseymarquis
This is just PDM. I built a system that does all of this, more, and integrates
into our company's existing products for managing manufacturing data. While my
version only watches the directories you tell it to, it also detects file
movement and changes via a service and through the hashing of files. I'd be
more impressed if someone built this into a file system directly. Combine
advanced tag based indexing with zfs and you've got something impressive.

------
padator
An old solution to this problem was the Logic File System
[https://en.wikipedia.org/wiki/Logic_File_System](https://en.wikipedia.org/wiki/Logic_File_System)
(disclaimer: I am one of the authors).

------
lucb1e
I looked for a tagging system months ago, did fairly extensive research into
existing solutions (rather than writing a FUSE layer myself) and TMSU was the
leading result. I installed it and it's all ready to use.

Today, it's still all ready to use. I haven't touched it. I'm actually quite
happy with the way my filesystem works, I just had this idea how great it
would be to work with tag selections instead.

The only reason I might still use a tagging system is to tag some files I want
to back up manually (if at all), like a 50GB disk image or some temporary big
download, but in general I create one or two symlinks a year and I'm good. The
hierarchy works fine.

~~~
stordoff
> in general I create one or two symlinks a year and I'm good. The hierarchy
> works fine.

Same here. I have a few things that tags would be nice for, but it's
infrequent enough that current filesystems are fine. Couple of examples

* Tagging the source [CD/iTunes/Amazon etc.] of my music - tags would be nice (and possibly doable as ID3 tags etc.) but "/music/source/artist/album" or "/music/artist/album [source]" works fine

* Multiple paths to the same file - I have various media (movies, music, books, TV shows etc.) related to a single series in one folder. Those should also be in my main music etc. folders. Again tagging with the series name to make a virtual "series" folder would be nice, but symlinks solve that and it happens infrequently enough that it isn't an issue.

Other than those edge cases, I'd say most of my data fits pretty well into a
hierarchical structure.

------
mysecrets
Reminds me of ReiserFS's original goals.

[https://reiser4.wiki.kernel.org/index.php/Future_Vision](https://reiser4.wiki.kernel.org/index.php/Future_Vision)

------
zepto
Seems like the Tags feature that has been in MacOS X since 10.9

~~~
ghshephard
Was just going to post this - you can also access them from the CLI with the
mdfind command: mdfind tag:jazz

~~~
dbm5
[https://github.com/jdberry/tag](https://github.com/jdberry/tag)

------
tmaly
I could see where a tool like this would be a huge help to legal and
compliance departments.

There is a frequent need to produce documents for discovery purposes. However,
you generally have to review the documents or send them to outside counsel
first for review before they are produced. This can take hundreds of man hours
and cost thousands of dollars.

Building something on top of TMSU could be a great solution for this task.

------
Chris2048
It would seem to me a better idea to order the VFS around queries - a
command-line command returns a query id, which represents a directory in the
VFS (such that the command is a bit like a mkdir for the VFS), the dir might
then be '/foo/tmsumountpoint/<queryid>/', and contains symlinks for all files
found in the query.

I'm sure FUSE can do this.

------
Zikes
Wasn't Windows 7 supposed to be built on a revolutionary new filesystem based
on a rdbms? I was pretty excited about that because it would have natively
enabled a lot of the features listed here. Unfortunately it was one of the
features they cut when Win7 went over budget and over deadline, and they never
brought it back for subsequent releases.

------
AlphaWeaver
This reminds me of something that would be useful in implementing Desktop Neo
[https://news.ycombinator.com/item?id=10932378](https://news.ycombinator.com/item?id=10932378)

------
toolslive
Spotlight can do the same thing, no?
[https://en.wikipedia.org/wiki/Spotlight_%28software%29](https://en.wikipedia.org/wiki/Spotlight_%28software%29)

------
leanderleeco
Hey guys, you should look at diamond.io, we're building something that is a
tool to help you solve this problem and the problem of organizing information
in general. Check us out!

~~~
seagreen
"It's backed by a powerful Artificial Intelligence."

Ahhhhhhh!

You're tackling a laudable goal, but it would help to link to a page with more
technical details.

~~~
leanderleeco
Haha, I don't want to go too much off topic of this thread, but essentially we
can analyze actual file contents, source and metadata to find patterns between
file structures and associate them to user and "common" labels.

On top of that we try to get further accuracy by using user information to
help us find the context. We call it a Personal System, a repo of knowledge
and learned preferences heavily tailored around each individual user. We're
just in beta now but definitely put your email down if you want to try it out
eventually!

~~~
seagreen
I put my email down to spy on you:)

My personal feeling is that (A) you're totally right that we need better
_personal_ organization systems, but (B) the bottom layer (tagging, schemaing,
relationships) should not involve fuzzy processes but should be totally
understandable by the average user.

I know (B) is a weird opinion though so I look forward to seeing how far you
can get with (A) plus machine learning & whatever other tricks you guys plan:)

~~~
leanderleeco
Absolutely, I think (B) is quite a big concern for people. While it's nice to
have a fuzzy AI match things _most_ of the time, it's absolutely critical the
user can always go in and override and take control of the situation.

So for us, we very clearly distinguish "a user labelled this item" vs "we
guessed it was this label".

I'll keep you posted!

------
vgnanand
For some reason, I knew this would be about tagging files before reading the
article.

------
e12e
> TMSU does not alter your files in any way: they remain unchanged on disk, or
> on the network, wherever you put them. TMSU maintains its own database and
> you simply gain an additional view, which you can mount where you like,
> based upon the tags you set up.

I'm not so sure this is a great design decision. Now your tags are only in
your sqlite file, and you'll have to work extra hard to get a copy of the
relevant tags when you backup/copy etc.

I think storing tags in extended attributes[x] would be better, possibly with
a separate utility that maintains an index. The index hopefully shouldn't be
needed just for the tags, but it might help with a) exposing file-level tags
(like ID3, EXIF, file type (magic number) etc.), and b) allowing for automatic
organization based on full text and other content-based indexing.

It appears that, on a *nix system, the only major reason to stay away from
extended attributes (apart from the limit on size of tag data) is NFS. But
Samba should (AFAIK) work fine with extended attributes.

As far as I can gather, Gnome Beagle is dead, and Gnome Tracker[t] has taken
its place. But it's not crystal clear if Tracker will index tags placed in
files' extended attributes or not. If I understand correctly, Tracker's own
tagging utility, will only place/edit tags in the Tracker database/index. But
the indexers will certainly honour file-level tags for some files.

I don't really use full Desktop environments, but some kind of system with
inotify support, and a Xapian or similar back-end (like Tracker), does seem
like a good idea. It would certainly be nice to see such a system implemented
in Go, but I think an _architecture_ along the lines of Tracker is probably
worth keeping: A database daemon, an indexer and a set of query/view tools
(I'm not a fan of the centralized tag database, though).

Another alternative to Tracker would be Recoll:

[http://www.lesbonscomptes.com/recoll/index.html](http://www.lesbonscomptes.com/recoll/index.html)

[t] [https://github.com/GNOME/tracker](https://github.com/GNOME/tracker)

[x]
[http://www.lesbonscomptes.com/pages/extattrs.html](http://www.lesbonscomptes.com/pages/extattrs.html)

Btw, for editing/automating ID3 tags, I recommend "Ex Falso", the tag-editor
for Quod Libet (which is an audio player):
[http://quodlibet.readthedocs.io/en/latest/](http://quodlibet.readthedocs.io/en/latest/)

------
bossman702
This is really complicated

------
awesomerobot
This just seems like a different problem.

------
arnoooooo
I wish it would support tag hierarchies.

------
35bge57dtjku
Nice logo. D:

------
x5n1
It seems to me most of this stuff can be done via unix command line if you are
so adept... without this app.

~~~
zyxley
What existing tools can automatically generate a virtual filesystem based on
tag metadata?

~~~
x5n1
no not virtual filesystem, i meant querying your files by date or tag... find,
grep and something to print the tag would do the job.

~~~
marklgr
But where is the metadata? With OP's tool, it is in some db, so you don't have
to tamper with your files or their name.

~~~
x5n1
well with mp3 files the meta data is in the file, it's called an id3 tag. for
other files, depends on the file.

------
bossman702
Hello

