Show HN: The Concept of a Personal File System

jessmartin · on June 2, 2022

At Fission, we built a library called WebNative File System[1] that effectively solves exactly this problem: every user gets their own file system stored in IPFS[2].

I've been building a few apps with WNFS over the past few months. It's a mind-bending architectural model coming from client-server architecture, but it has some clearly superior qualities, as you point out in your documentation. Feel free to swing by the Fission Discord if you want to talk about it[3].

[1] https://guide.fission.codes/developers/webnative/file-system...

[2] https://ipfs.io/

[3] https://fission.codes/discord

Rygian · on June 3, 2022

Thanks for sharing, awesome work! Indeed, this has some similarities to the problem I'm addressing. I wouldn't go as far as saying it "solves exactly the problem."

A couple of points where it diverges: it is backed by IPFS, so the user does not know and cannot decide where their data is physically stored; I consider that knowledge and control a requirement. Another requirement, which I agree is more on the utopia side of the spectrum, is that all my devices (as a regular user) should use WNFS for all my user content by default, at the OS level (rather than me having to use dedicated applications that support WNFS). Reliance on IPFS, rather than a local-first approach, would be a blocker for this kind of low-level integration.

The very interesting parts for me are the DAG representation for the filesystem and the versioning capability (which comes naturally through IPFS, I guess).
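To make the DAG-plus-versioning idea concrete, here is a minimal sketch of a content-addressed store. It is not how WNFS or IPFS actually encode data; the names `store`, `put`, and `put_dir` are hypothetical, and SHA-256 over JSON is just an assumption for illustration. The point is only that every directory is identified by a hash of its children, so each change yields a new root while old roots stay readable.

```python
import hashlib
import json

# Hypothetical in-memory block store: content hash -> raw bytes.
store: dict[str, bytes] = {}


def put(data: bytes) -> str:
    """Store a blob under the SHA-256 hash of its content and return the hash."""
    digest = hashlib.sha256(data).hexdigest()
    store[digest] = data
    return digest


def put_dir(entries: dict[str, str]) -> str:
    """Store a directory node mapping names to child hashes; return its hash."""
    return put(json.dumps(entries, sort_keys=True).encode())


# Two versions of the same directory. Only notes.txt changes.
todo = put(b"buy milk")
root_v1 = put_dir({"notes.txt": put(b"hello"), "todo.txt": todo})
root_v2 = put_dir({"notes.txt": put(b"hello world"), "todo.txt": todo})
```

Both roots remain in `store`, and the unchanged `todo.txt` is stored once and shared between versions, which is where the "free" versioning comes from.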

swizzler · on June 2, 2022

I haven't read all these words yet, but I recognize enough of them to know I have similar interests. I was excited to see https://bazil.org/2014/04/24/introducing-bazil/. Of course, now defunct.

Rygian · on June 3, 2022

"Bazil separates knowledge of a file from the contents of the file, letting the laptop know about all of the files, without having to store the contents of the file." Yes, it hits the nail on the head!

This approach is probably at the right level of abstraction, presenting itself as an actual filesystem that could become the default e.g. for a /home partition (computer) or a /storage/data partition (Android phone).

Too bad development looks to be halted indeed.

jschveibinz · on May 28, 2022

Wow, lots of good ideas in here. Thank you for sharing!

mikewarot · on May 28, 2022

There is a small overlap between a ton of things I've been considering recently, and your ideas. We both want persistent storage of versioned files that can be shared directly with others.

I've been reading up on the Memex, as described by Bush in 1945. He saw it as the most important challenge facing mankind, and I think he was right. I think we blew it, big time. HTML and the Web aren't close to what he was describing. They're different in some key ways.

The Memex was a personal store of knowledge. All storage was local, and hyperlinked. There was a feature for exporting a "trail" of documents to an external storage medium. (Microfilm, in his case... he held several patents in that area, so it was very familiar to him)

The main reason for having the Memex was to hyperlink documents, and store "trails" through them. Giving things context, and seeing the connections between parts of a very complex idea, was where the Memex would be most useful. Saving a single web page with all of its included images in a folder isn't even close. It's only one document, not a web of them. All of the links outside of the saved document will eventually break. DRM prevents saving some content.

HTML doesn't let you mark up Hypertext. It stores the "markup" in with the document, emulating a piece of paper. This prevents many different markups of the same document, as the locations of the underlying actual text in the document move with each change, no matter how trivial.

A better way to mark up documents is to store the markup in a layer separate from the underlying text. If you must have it all in the same file, then store it in a zip folder structure, as has become common with OpenOffice documents, etc. The base layer should never change with markup. This small change then allows you to use things like Git to version it and track the changes.

If a system of transforms of the text is used, instead of the line-oriented approach of Git, you can then refer to anything in a document with a range. Those ranges can be transformed to match document versions with some simple math. This is impossible with HTML, in the long run.
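The "simple math" could look something like the sketch below. It assumes each change to the text is recorded as a single splice (delete some characters at a position, insert some others); the `Edit` type and `transform_range` function are hypothetical names, and the tie-breaking rules (what happens when an edit touches a range boundary) are one possible policy among several.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """Replace `deleted` characters starting at `pos` with `inserted` new ones."""
    pos: int
    deleted: int
    inserted: int


def transform_range(start: int, end: int, edit: Edit) -> tuple[int, int]:
    """Map a half-open range [start, end) in the old text onto the new text."""

    def shift(offset: int) -> int:
        if offset <= edit.pos:
            return offset  # at or before the edit point: unchanged
        if offset >= edit.pos + edit.deleted:
            return offset + edit.inserted - edit.deleted  # after the edit: shifted
        return edit.pos  # inside the deleted span: collapse to the edit point

    return shift(start), shift(end)


old = "The quick brown fox"
new = "The very quick brown fox"  # "very " inserted at offset 4
start, end = transform_range(10, 15, Edit(pos=4, deleted=0, inserted=5))
```

Here `old[10:15]` and `new[start:end]` both select "brown", so an annotation anchored to that range survives the edit. Chaining one transform per recorded edit carries a range across any number of versions.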

If you store markup separate from text, then you can easily use a different type of pointer to refer to images, video, audio, or other media in a manner similar to the text markup. It becomes possible to annotate all media with a few relatively small changes.

I've become convinced that the best way to deal with the web as it now stands is to ingest HTML, extract out the text layer, and then separate/transform the markup to a set of layers in a new format. All markup would point to the underlying text by range, instead of relying on its implicit address in the stream of HTML. Once this is done, it then becomes practical to turn the range into a resource locator.
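As a rough illustration of that ingestion step, the sketch below uses Python's standard `html.parser` to split a fragment into a plain text layer and a list of markup records that point into it by range. The `LayerExtractor` class, the `split_layers` helper, and the `(start, end, tag, attrs)` record shape are all assumptions made up for this example, not an existing format. Void elements such as `<br>` that never get a closing tag are simply ignored here.

```python
from html.parser import HTMLParser


class LayerExtractor(HTMLParser):
    """Collect text into one layer and tags into a separate range-based layer."""

    def __init__(self):
        super().__init__()
        self.chunks = []     # pieces of the plain text layer
        self.length = 0      # current offset into the text layer
        self.open_tags = []  # stack of (tag, attrs, start_offset)
        self.markup = []     # finished records: (start, end, tag, attrs)

    def handle_starttag(self, tag, attrs):
        self.open_tags.append((tag, dict(attrs), self.length))

    def handle_endtag(self, tag):
        # Close the nearest open tag with the same name, if there is one.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i][0] == tag:
                name, attrs, start = self.open_tags.pop(i)
                self.markup.append((start, self.length, name, attrs))
                break

    def handle_data(self, data):
        self.chunks.append(data)
        self.length += len(data)


def split_layers(html: str):
    """Return (text_layer, markup_layer) for an HTML fragment."""
    parser = LayerExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks), parser.markup


text, markup = split_layers('<p>See <a href="https://ipfs.io/">IPFS</a> now</p>')
```

The text layer comes out as plain "See IPFS now", and the link becomes a record covering offsets 4 to 8 with its `href` attached, so the text can be versioned on its own while the markup refers to it by range.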

I propose reading everything you personally browse from the internet into a stream of such documents and layers. Each and every object would have an automatic reference count and be included on a list of all available local objects. You could then garbage collect files that were unused after an appropriate amount of time. This would automatically save everything that you created a hyperlink to locally. You could then move the documents that were in this log, and linked to, to a more permanent home, and an archive.
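A minimal sketch of that bookkeeping, assuming an in-memory index and explicit timestamps passed in by the caller; `LocalStore`, `StoredObject`, and the `max_age` parameter are hypothetical names for illustration only. Anything with at least one local hyperlink pointing at it is kept, and unreferenced objects are dropped once they are older than the chosen age.

```python
from dataclasses import dataclass


@dataclass
class StoredObject:
    object_id: str
    last_seen: float   # time the object was last ingested (seconds)
    ref_count: int = 0  # number of local hyperlinks pointing at it


class LocalStore:
    def __init__(self):
        self.objects: dict[str, StoredObject] = {}

    def ingest(self, object_id: str, now: float) -> None:
        """Record a browsed object, or refresh its timestamp if already known."""
        obj = self.objects.setdefault(object_id, StoredObject(object_id, now))
        obj.last_seen = now

    def link(self, object_id: str) -> None:
        """A local document created a hyperlink to this object."""
        self.objects[object_id].ref_count += 1

    def unlink(self, object_id: str) -> None:
        """A hyperlink to this object was removed."""
        obj = self.objects[object_id]
        obj.ref_count = max(0, obj.ref_count - 1)

    def collect(self, now: float, max_age: float) -> list[str]:
        """Delete unreferenced objects older than max_age; return their ids."""
        expired = [
            oid for oid, obj in self.objects.items()
            if obj.ref_count == 0 and now - obj.last_seen > max_age
        ]
        for oid in expired:
            del self.objects[oid]
        return expired


cache = LocalStore()
cache.ingest("page-a", now=0.0)
cache.ingest("page-b", now=0.0)
cache.link("page-a")  # the user linked to page-a from their own notes
removed = cache.collect(now=100.0, max_age=30.0)
```

After the collection pass only `page-a` remains, because it is the one object something local links to.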

Making access available to friends and family should be done via a capability object. Usernames and passwords shouldn't be used. This capability would be signed and represent both the address of, and access to, the relevant document(s). A list of capabilities given would be maintained locally, and checked for revocation by the underlying access system.
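One way such a capability could be sketched is shown below. For brevity it signs with an HMAC from the standard library, which is an assumption on my part: a real system would more likely use public-key signatures so that others can verify a capability without holding the secret. `CapabilityIssuer` and its field names are hypothetical.

```python
import hashlib
import hmac
import json
import secrets


class CapabilityIssuer:
    """Issues signed capabilities and keeps the local list of grants."""

    def __init__(self, secret: bytes):
        self._secret = secret
        self._revoked: set[str] = set()
        self.issued: dict[str, dict] = {}  # local list of capabilities given

    def _sign(self, body: dict) -> str:
        payload = json.dumps(body, sort_keys=True).encode()
        return hmac.new(self._secret, payload, hashlib.sha256).hexdigest()

    def issue(self, address: str, grantee: str) -> dict:
        """Create a capability naming both the document address and the grantee."""
        body = {"id": secrets.token_hex(8), "address": address, "grantee": grantee}
        cap = {**body, "sig": self._sign(body)}
        self.issued[body["id"]] = cap
        return cap

    def revoke(self, cap_id: str) -> None:
        self._revoked.add(cap_id)

    def check(self, cap: dict) -> bool:
        """Accept only capabilities with a valid signature that are not revoked."""
        body = {k: v for k, v in cap.items() if k != "sig"}
        valid = hmac.compare_digest(str(cap.get("sig", "")), self._sign(body))
        return valid and body.get("id") not in self._revoked


issuer = CapabilityIssuer(secret=secrets.token_bytes(32))
cap = issuer.issue(address="docs/holiday-photos", grantee="grandma")
```

The holder presents `cap` instead of a username and password; the access layer calls `check`, which fails if the capability was altered or if its id has been revoked.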

Wow... that's a lot of stuff... I hope some of it makes sense to you, and that you find these ideas useful in some manner. Thanks for sharing, and your time and attention in reading this reply.

Rygian · on May 28, 2022

Thanks for sharing! I'm glad to discover there's overlap, even if small, with your thinking.

I was not aware of the Memex, and I will definitely dive into it right away.

Regarding the concept of "document", I should make it clearer that a document does not need to contain any text. Photos, images, videos, music, multimedia experiences, … are also first-class documents in a Personal File System. As such, the considerations about text vs markup would form a subset of the general problem of separating meaning from form, which I don't really address.

And another distinction I'd preserve is that the Personal File System focuses first and foremost on filing an average user's own production of documents and the immediate exchanges with their entourage (friends and family), as opposed to "things seen on the web" or the production of scientific communities. Those also belong in the Personal File System, but do not drive or define its utility.