I want to have both options because traditional folders and a tag based file sys...

blunte · on June 11, 2019

In macOS (for some number of years now) you store files in a standard folder hierarchy, and you can add tags to files. With or without tags, you can use Spotlight (cmd-space) to quickly find files.

yoodenvranx · on June 11, 2019

Yes, I know that there are approaches which use the normal file system.

But what I want is different:

I want a universal "DB for binary files" where I can store binary data and all its metadata.

Then I can use this DB to build an app for picture galleries, music collections and tons of other things.

This DB should also support:

* Automatic checksumming, so that I can detect data corruption

* Some sort of version history, so that I can store multiple versions of a file

* Optionally, built-in replication which I can use to see the same data (or parts of it) on all my devices
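To make the first two bullets concrete, here is a minimal sketch in Python, assuming SQLite as the backing store. The names `open_store`, `put`, and `get` and the `versions` table are hypothetical, made up for illustration; replication is not covered at all.

```python
import hashlib
import sqlite3


def open_store(path=":memory:"):
    """Create (or open) a hypothetical blob store that keeps every version."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS versions (
               name    TEXT NOT NULL,
               version INTEGER NOT NULL,
               sha256  TEXT NOT NULL,
               data    BLOB NOT NULL,
               PRIMARY KEY (name, version)
           )"""
    )
    return db


def put(db, name, data):
    """Store `data` as a new version of `name` and return the version number."""
    row = db.execute(
        "SELECT COALESCE(MAX(version), 0) FROM versions WHERE name = ?", (name,)
    ).fetchone()
    version = row[0] + 1
    digest = hashlib.sha256(data).hexdigest()
    db.execute(
        "INSERT INTO versions (name, version, sha256, data) VALUES (?, ?, ?, ?)",
        (name, version, digest, data),
    )
    db.commit()
    return version


def get(db, name, version=None):
    """Return the bytes of a version (latest by default), verifying the checksum."""
    if version is None:
        row = db.execute(
            "SELECT sha256, data FROM versions WHERE name = ? "
            "ORDER BY version DESC LIMIT 1",
            (name,),
        ).fetchone()
    else:
        row = db.execute(
            "SELECT sha256, data FROM versions WHERE name = ? AND version = ?",
            (name, version),
        ).fetchone()
    if row is None:
        raise KeyError(name)
    digest, data = row
    # Recompute the hash on every read so silent corruption is detected.
    if hashlib.sha256(data).hexdigest() != digest:
        raise IOError(f"checksum mismatch for {name!r}")
    return data
```

Calling `put` twice with the same name keeps both versions, and `get` raises instead of returning bytes whose SHA-256 no longer matches the stored digest.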

ViViDboarder · on June 11, 2019

Is this common enough a use case that it ought to be a file system feature?

Plenty of applications do just this today by using the file system plus an index in SQLite. Is that method insufficient?
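For reference, the "file system plus an index in SQLite" pattern can be as small as the following sketch. It assumes the files themselves stay in ordinary folders and only a path-to-tag mapping lives in the database; the function names and the `file_tags` table are hypothetical, not taken from any particular application.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open a hypothetical tag index that sits next to the normal file system."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS file_tags ("
        " path TEXT NOT NULL, tag TEXT NOT NULL, PRIMARY KEY (path, tag))"
    )
    return db


def tag_file(db, path, *tags):
    """Attach one or more tags to a file path; repeated tags are ignored."""
    db.executemany(
        "INSERT OR IGNORE INTO file_tags (path, tag) VALUES (?, ?)",
        [(path, t) for t in tags],
    )
    db.commit()


def files_with_tags(db, *tags):
    """Return the paths that carry all of the given tags, sorted by path."""
    if not tags:
        return []
    marks = ",".join("?" * len(tags))
    rows = db.execute(
        f"SELECT path FROM file_tags WHERE tag IN ({marks}) "
        "GROUP BY path HAVING COUNT(DISTINCT tag) = ? ORDER BY path",
        (*tags, len(set(tags))),
    )
    return [r[0] for r in rows]
```

The weak point of this approach is that the index knows nothing about renames or deletions done outside the application, so each app has to rescan or watch the folders itself.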

sfopdxnonstop · on June 11, 2019

You asked: "Is this common enough a use case?"

Then you answered your own question: "Plenty of applications do just this today".

So yes, it is a common enough case that it could/should be built into the OS.

theamk · on June 12, 2019

Well, the OS could certainly offer some support, like file change detection, but the main indexing is often too application-specific.

A photo album wants to do face recognition. A music player wants BPM detection. Should those be done by the OS? I do not think so.

sjy · on June 11, 2019

Sounds a lot like ZFS.

__jal · on June 11, 2019

Exactly - ZFS offers everything on the bullet list.

vondur · on June 11, 2019

I think the old BeOS supported what you are describing.

waddlesplash · on June 11, 2019

And Haiku, its open-source successor, does also.

outadoc · on June 11, 2019

I think we'll start to see that happening as machine learning is used to tag the files. See Google Photos for example, where you can basically use search in place of any sorting.

theamk · on June 12, 2019

I think most of the photo organizers offer this? I remember using digiKam back around 2006 or so, and it already had this feature.

I think tags have limited scope. They are great for photos. They are OK for music, but strictly in addition to the main hierarchy. You could use them for text docs, but folders + full text search is much better. And file-level tags are completely useless for code/programming.

maxerickson · on June 12, 2019

If you have a rich set of tags, you can have more than one main hierarchy.

Artist sort, year sort, genre sort, etc. Genre is really hard though.

There isn't much reason to use tags as the only way of organizing things though; they work great as views.

theamk · on June 12, 2019

Agreed: views, but not the main organizing principle.

For example, my music collection is big and diverse, and both "year sort" and "album sort" are kinda useless now, because there are actually multiple disjoint subsets. There is no point ever in showing me audiobooks for year 2010 and regular music for year 2010. I always only want a subset of it.

This is what I meant by "strictly in addition to main hierarchy" -- let me keep my folders, and maybe when I want to go deep enough, I want to browse by tag. But even then they would not be the hashtag-like tags that the original page refers to.

ViViDboarder · on June 11, 2019

You can achieve this in existing systems with a hardlink.

TuringTest · on June 11, 2019

A hardlink is a (poor, partial) implementation of a tag in the file system. A tag will recover a collection of all the elements filed under the same label, and a link can't do that.

girzel · on June 11, 2019

I think hard links could. To take the up-thread example, you have one folder called "Vacation2018", and another called "BestDogPics". The photo of your dog on vacation lives in both folders, but hardlinked together.
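A minimal sketch of that setup in Python, assuming a file system that supports hard links. The file name `dog_on_beach.jpg` and the `demo` helper are made up for illustration.

```python
import os
import tempfile


def link_into_folders(root):
    """Create one photo that appears in two folders via a hard link."""
    vacation = os.path.join(root, "Vacation2018")
    dogs = os.path.join(root, "BestDogPics")
    os.makedirs(vacation)
    os.makedirs(dogs)

    original = os.path.join(vacation, "dog_on_beach.jpg")  # hypothetical name
    with open(original, "wb") as f:
        f.write(b"fake image bytes")

    linked = os.path.join(dogs, "dog_on_beach.jpg")
    os.link(original, linked)  # second directory entry for the same data
    return original, linked


def demo():
    """Return (same_file, link_count, bytes_left_after_deleting_one_name)."""
    with tempfile.TemporaryDirectory() as root:
        original, linked = link_into_folders(root)
        same = os.path.samefile(original, linked)
        link_count = os.stat(original).st_nlink
        os.remove(original)  # the data survives under the other name
        with open(linked, "rb") as f:
            survivor = f.read()
    return same, link_count, survivor
```

Both names point at the same data, and removing one leaves the other intact. What the file system does not give you cheaply is the reverse lookup: given one name, there is no standard call that lists every other folder the file is linked into, which is roughly the gap TuringTest describes.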

yellowapple · on June 12, 2019

I think the key thing here is that files can indeed be categorized in different ways, and could - and should - exist simultaneously in different collections with different structures.

I therefore think it'd be worth separating the concept of a "filesystem" from a "folder system" or "index system". That is: keep the file storage itself flat (e.g. in a relational database table), then have different categorical "views" that could be relational and/or hierarchical pointing to these files. Naturally, those collections will have their own sets of metadata for that file.
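As a rough sketch of what I mean (Python with SQLite, with every table and function name here being hypothetical): files live in one flat table, views form a tree, and a join table carries the per-view metadata, in this case just a sort key.

```python
import sqlite3

SCHEMA = """
CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE views (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES views(id),
    name      TEXT NOT NULL
);
CREATE TABLE view_entries (
    view_id  INTEGER NOT NULL REFERENCES views(id),
    file_id  INTEGER NOT NULL REFERENCES files(id),
    sort_key REAL NOT NULL,  -- per-view metadata: position within this view
    PRIMARY KEY (view_id, file_id)
);
"""


def open_db():
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def add_file(db, name):
    """Add a file to the flat store and return its id."""
    return db.execute("INSERT INTO files (name) VALUES (?)", (name,)).lastrowid


def ensure_view(db, *path):
    """Walk or create a hierarchical view such as ("Pets", "Fido", "Photos")."""
    parent = None
    for part in path:
        row = db.execute(
            "SELECT id FROM views WHERE name = ? AND parent_id IS ?", (part, parent)
        ).fetchone()
        if row:
            parent = row[0]
        else:
            parent = db.execute(
                "INSERT INTO views (parent_id, name) VALUES (?, ?)", (parent, part)
            ).lastrowid
    return parent


def add_to_view(db, view_id, file_id, sort_key):
    """Reference a file from a view; the file itself is not copied."""
    db.execute(
        "INSERT OR REPLACE INTO view_entries (view_id, file_id, sort_key) "
        "VALUES (?, ?, ?)",
        (view_id, file_id, sort_key),
    )


def list_view(db, view_id):
    """Return the file names in a view, ordered by that view's own sort key."""
    rows = db.execute(
        "SELECT f.name FROM view_entries e JOIN files f ON f.id = e.file_id "
        "WHERE e.view_id = ? ORDER BY e.sort_key",
        (view_id,),
    )
    return [r[0] for r in rows]
```

The same file id can sit in any number of views, each with its own ordering, and deleting a view entry never touches the stored file.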

So for example, you have a file named "img-8675309.png" in your camera's storage. The operating system presents a view of said camera storage, in the form of a flat list of files and some basic metadata like creation date (plus perhaps camera-specific metadata if, say, the camera driver is the thing generating the view). You could then open up views for your 2019 Bavaria vacation ("Vacations" → "Bavaria 2019" → "Photos") and your dog ("Pets" → "Fido" → "Photos"), set a sorting field for each view (for the vacation, probably chronological; for your dog, by however you define "best"), drag/drop the camera file into those views (in the latter case, maybe even drag it into the spot where you want it to show up ranking-wise), and the operating system would then add references to that file automatically (almost certainly copying it into a local cache) in both your opened views and potentially in some system-maintained views (e.g. "Local Files" → "Photos").

One of the slick things here is that file access could be entirely transparent to how those files are stored. For example, those views will of course include your device's internal storage, but might also include external devices (like the camera in the above example) or even remote services (like, say, your social media account). If you accidentally delete your prized Fido photo on your local machine, the "Pets" → "Fido" → "Photos" view could still have a reference to the copy on the camera, or a copy in your social media posts, or a copy in the system backup that automatically ran last Sunday, and thus retrieve it and re-cache it locally (or prompt you to plug your camera or your external USB drive back in so it can check there).
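The fallback part could look something like this sketch, assuming each storage location can be modeled as a simple mapping from file id to bytes. Plain dicts stand in for the camera, the backup, and so on, and `retrieve` is a hypothetical helper.

```python
def retrieve(file_id, cache, locations):
    """Return (source_name, data), re-caching locally when a fallback is used.

    `cache` is the local store; `locations` is an ordered list of
    (name, store) pairs to try when the local copy is gone.
    """
    if file_id in cache:
        return "local", cache[file_id]
    for name, store in locations:
        data = store.get(file_id)
        if data is not None:
            cache[file_id] = data  # re-cache so the next access is local
            return name, data
    raise FileNotFoundError(file_id)
```

A real system would also need to handle locations that are currently unplugged or offline, which is where the "prompt you to plug your camera back in" step would come in.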

wuliwong · on June 11, 2019

I would like both as well. I can see value in tags but I am thinking from a developer perspective, I'm not sure how I would target a specific file in my code without a tree structure? Maybe I just haven't thought about this enough but it seems necessary.