
Rethinking the filesystem as global mutable state, the root of all evil - stargrave
https://www.devever.net/~hl/globalfs
======
mrfredward
Every piece of hardware on a computer is a piece of global mutable state.
There are many good reasons to hide that fact behind an abstraction, but I
can't help but think hiding global mutability is better handled at the
application level than the OS level, because there are too many cases where
the abstraction becomes extremely limiting.

As an example, most people would want to be able to import an image into a
word processor regardless of where that image is located (local drive, network
drive, floppy disk, etc.). To support that, most end user programs would want
to be offered access to the entire filesystem. The moment two applications do
this at once, you have all the shared mutable state problems we do now.

~~~
satyenr
It doesn't have to be that way. Consider the following scenario:

1\. Every file is owned by an application.

2\. A file-store application is the custodian of files meant to be used by
multiple applications.

3\. An application that needs to edit a file must take ownership of the file
for the duration of the editing.

4\. An application that needs to read a file must borrow the file.

5\. A file can be borrowed by multiple applications, but owned by only one.

6\. Applications can provide the ability for shared ownership, but those would
be specialized applications capable of handling and merging simultaneous
changes.
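
To make rules 1 to 5 concrete, here is a minimal sketch of such a custodian. `FileStore` and its method names are hypothetical, and it assumes that borrowing and ownership exclude each other (the list above does not say so explicitly):

```python
class FileStore:
    """Hypothetical custodian: one owner (editor) or many borrowers (readers) per file."""

    def __init__(self):
        self._data = {}        # name -> bytes
        self._owner = {}       # name -> app currently editing the file
        self._borrowers = {}   # name -> set of apps currently reading the file

    def put(self, name, data):
        self._data[name] = data

    def borrow(self, app, name):
        # Rules 4 and 5: any number of readers, but (assumption) not during an edit.
        if name in self._owner:
            raise PermissionError(f"{name} is owned by {self._owner[name]}")
        data = self._data[name]
        self._borrowers.setdefault(name, set()).add(app)
        return data

    def release(self, app, name):
        self._borrowers.get(name, set()).discard(app)

    def take_ownership(self, app, name):
        # Rules 3 and 5: exactly one editor, and (assumption) only when nobody is reading.
        if name in self._owner or self._borrowers.get(name):
            raise PermissionError(f"{name} is in use")
        self._owner[name] = app

    def commit(self, app, name, data):
        # Only the current owner may write; committing hands the file back.
        if self._owner.get(name) != app:
            raise PermissionError(f"{app} does not own {name}")
        self._data[name] = data
        del self._owner[name]
```

Under this exclusive reading, a writer and a reader cannot hold the same file at once, so a different policy (or rule 6) would be needed for concurrent access.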

This is a straw-man solution, and I am sure multiple problems will have to be
solved before such a system can become a reality.

> As an example, most people would want to be able to import an image into a
> word processor regardless of where that image is located (local drive,
> network drive, floppy disk, etc.). To support that, most end user programs
> would want to be offered access to the entire filesystem. The moment two
> applications do this at once, you have all the shared mutable state problems
> we do now.

I don't see why the custodian of photos can't be a photo management
application. In fact, a filesystem is the lowest common denominator. It is
possible to build higher level abstractions like PhotoStore, MusicStore,
MovieStore, CodeStore etc. which accommodate and make use of the properties of
individual data types to offer an enhanced experience.

~~~
heavenlyblue
Your example would fail to accommodate the most basic application of files:
logging and reading file logs at the same time.

~~~
satyenr
Who says logs have to be written to a file — at least as far as the user is
concerned? Logs are a series of events — a LogStore will let you read events
as they are appended. And even if you do want to store logs in “files”, the
solution proposed here lets you borrow them from the owning application.

~~~
heavenlyblue
My question goes more like this: “why the hell do you want to re-approach the
idea of a filesystem if the new approach can’t even accommodate something as
simple as storing log files?”.

It may be time for something completely different. Just stop calling it a
filesystem.

~~~
satyenr
Again, why do you care about files? Logs have nothing to do with files. Files
are just the interface used by the filesystem to show the event stream that is
logs, so you can use file-based tools to access them.

I am not calling it a filesystem. Let’s call it a data bank?

~~~
heavenlyblue
It already exists in the form of databases.

The first that comes to mind is etcd.

------
naasking
This was first explained by the capability security community. Plan 9's
private namespaces are an approach to capability secure file systems, with the
default being the empty namespace. I'm surprised the article didn't mention
Plan 9 actually since it discusses capabilities.

~~~
ryanjshaw
I just realised it's 2019 and capabilities are still misunderstood, and the
ACL-capability-equivalency myth continues to result in poor solutions to
security problems.

For anybody who is curious, the general problem here is described in two great
papers as "the confused deputy" [1] and "designation without authority" [2].

Roughly put, systems built with ACLs as the primitive mechanism for
authorization can never produce practically secure systems.

[1]
[http://zoo.cs.yale.edu/classes/cs422/2010/bib/hardy88confuse...](http://zoo.cs.yale.edu/classes/cs422/2010/bib/hardy88confused.pdf)

[2]
[http://srl.cs.jhu.edu/pubs/SRL2003-02.pdf](http://srl.cs.jhu.edu/pubs/SRL2003-02.pdf)

~~~
boapnuaput
I fear that it's because we don't know how to make globes. [0]

[0]
[https://corbinsimpson.com/words/globe.html](https://corbinsimpson.com/words/globe.html)

~~~
ryanjshaw
Reading that felt surprisingly familiar! I pottered about for 2 years trying
to build a GUI for a Globe-based world and gave up.

------
JoeAltmaier
I've thought for years that 'file systems' are an abomination. We use a
cobbled-together schema of parent-directory, filename, some random date/time
stamps and maybe a three-letter extension. Why? Because we inherited that
from the DOS days.

Why not a collection of immutable UUID-labelled resources with an arbitrary
schema of attributes? Like a relational database or some such.
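
A rough sketch of what that could look like, using only the Python standard library. `ResourceStore` and its methods are hypothetical names, not an existing OS service:

```python
import uuid
from types import MappingProxyType


class ResourceStore:
    """Hypothetical store of immutable, UUID-labelled resources with free-form attributes."""

    def __init__(self):
        self._resources = {}  # UUID -> (bytes, read-only attribute mapping)

    def add(self, content, **attrs):
        # Each resource gets a fresh random UUID; content and attributes are frozen.
        rid = uuid.uuid4()
        self._resources[rid] = (bytes(content), MappingProxyType(dict(attrs)))
        return rid

    def get(self, rid):
        return self._resources[rid]

    def find(self, **query):
        # Return the IDs of all resources whose attributes match every key/value given.
        return [
            rid
            for rid, (_, attrs) in self._resources.items()
            if all(attrs.get(key) == value for key, value in query.items())
        ]
```

A "change" in this model would be a new resource with a new UUID, which is what lets any tool query by attribute (for example `store.find(kind="mail")`) without knowing which application wrote the data.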

Overkill? Doesn't every major app have to do this itself, in some container
(document/mail folder/image definition/contact list and on and on)?

Designers of operating systems have made no effort to capture this kind of
service and provide it as a fundamental OS feature. And OSs are exactly where
this belongs - where you can carefully get it right and everybody writing apps
can depend on it.

~~~
zrm
> Doesn't every major app have to do this itself, in some container
> (document/mail folder/image definition/contact list and on and on)?

It does, but it's also something that can be provided by a library as easily
as the OS. So e.g. Firefox uses sqlite, but it can use a portable library for
that, and then it isn't different on Windows than Linux or Mac.

And then the application gets to choose something with an appropriate level of
complexity. Sometimes all you need is Maildir or xattr(7) or JSON, sometimes
you need a full on SQL database. One size fits all rarely does in practice.

~~~
JoeAltmaier
Except, the results are opaque. I can't operate on the data without using
their tools, which might not be what I want.

Imagine if the OS let you browse all the app data, see all the relations (that
you were authorized to see), and write tools to operate over it. That would be
a most Open system, compared to everything we have which is essentially walled
off to us.

~~~
zrm
It's not really opaque as much as it is application-specific. But then how do
you fix that? If you give applications the ability to tag things then one mail
program uses a tag called "unread mail" and another uses a tag called "message
read" with the opposite value and another uses a tag called "mail flags" with
a bitmask where one of the bits is whether the message has been read or not
etc. Smells like Windows registry.
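
To illustrate the divergence, here is a sketch of what a generic tool would need just to answer "has this message been read?" across those three conventions. The tag names come from the paragraph above; the bit position `MAIL_FLAG_READ` is an assumption made up for the example:

```python
MAIL_FLAG_READ = 0x01  # hypothetical: which bit of "mail flags" means "read"


def is_read(tags):
    """Normalize three incompatible per-application encodings of the same fact."""
    if "unread mail" in tags:
        return not tags["unread mail"]      # app 1: stores the opposite value
    if "message read" in tags:
        return tags["message read"]         # app 2: stores it directly
    if "mail flags" in tags:
        return bool(tags["mail flags"] & MAIL_FLAG_READ)  # app 3: one bit in a mask
    raise KeyError("no known read-state tag")
```

Every new mail program with its own convention adds another branch, which is the documentation burden described here.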

Either you somehow enforce a high degree of uniformity, which implies a pretty
serious lack of flexibility, or everybody gets to make their own decisions and
then everybody makes different ones. And the second one seems better as long
as the individualized thing they're doing is sufficiently well documented.

~~~
JoeAltmaier
The alternative, right now, is nobody can do anything at all like this.
There's not even the opportunity to create conventions in apps.

Opening up application data to OS tools opens up a whole new world of
opportunities (and hazards) for app developers.

------
pjc50
Interesting, and should definitely be read with the companion
[https://www.devever.net/~hl/objectworld](https://www.devever.net/~hl/objectworld)

Given the same title, I would write a rather different article. So many of our
traditional filesystem problems center around concurrency: what kind of
read-after-write or durability semantics are guaranteed. A lot of effort goes into
this, which is necessary for databases to work on top of a general-purpose
filesystem on top of a disintermediated storage system that may be a disk or
SSD with a variety of block sizes. But for most operations it's unnecessary
overhead.

Many games ship with a "packfile", which is a pseudo-filesystem that appears
as a giant blob to the OS. Usually it's faster to seek individual small items
out of the blob than if they were separate files.
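
For readers who have not seen one, a packfile can be sketched in a few lines. The layout below (a 4-byte length, a JSON index, then concatenated data) is an assumption for illustration, not any particular game's format:

```python
import io
import json
import struct


def build_pack(files):
    """Pack {name: bytes} into one blob: 4-byte index length, JSON index, then data."""
    index, body, offset = {}, io.BytesIO(), 0
    for name, data in files.items():
        index[name] = (offset, len(data))  # where each entry lives inside the body
        body.write(data)
        offset += len(data)
    header = json.dumps(index).encode("utf-8")
    return struct.pack("<I", len(header)) + header + body.getvalue()


def read_entry(pack, name):
    """Read one entry from an open binary file-like object by seeking to it."""
    pack.seek(0)
    (header_len,) = struct.unpack("<I", pack.read(4))
    index = json.loads(pack.read(header_len))
    offset, length = index[name]
    pack.seek(4 + header_len + offset)  # skip the length field and the index
    return pack.read(length)
```

The OS sees a single file and performs a single open; a real implementation would keep the parsed index in memory, so each lookup would be one seek and one read.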

Further to that is the security problem; we've moved from "apps are all
trusted, but you need to watch the files closely between multiple users" to
"there's only one user, but you can't trust the apps".

Note how the cloud has avoided both of these by moving storage to three
different areas with different semantics. Apps can speak to "blob storage"
(S3) which has a transactionality/security granularity of one blob. Or they
can speak to a database (which has intelligence of its own), or separate raw
block storage if neither of those suits.

What if we moved from "everything is a file" to "everything is a URL"?
Possibly adding a system-default packfile mechanism. So an app would be
allowed access to everything that it ships with, but nothing outside except
what it could request as URLs with various sorts of security mechanism.

~~~
T-hawk
> Many games ship with a "packfile", which is a pseudo-filesystem that appears
> as a giant blob to the OS.

Games get to do this because they don't care about interoperability. The
packfile is intended to be a closed environment accessible only to the game
code itself, with no need for anything else to care about it.

This could be interpreted as a particular case of "can't trust the apps" - a
game has no need or desire to trust anything else.

~~~
andrekandre
interestingly enough that seems close to how alan kay talks about objects
behaving... only they know how to open and read their own data, and sharing
would be done by the object itself through its ‘interface’. i guess in this
case the “object” is a whole game, but interesting nonetheless...

------
mnw21cam
As anyone who has put together a chroot knows, it isn't as simple as just
preventing a process from accessing the filesystem. Most programs _need_ to
access _a_ filesystem, just so they can load libc and the various other
libraries they might need.

~~~
AstralStorm
Do they now? Usually the dynamic linker, known on Linux as ld.so, does all the
loading (even when you use dlopen).

Lazy loading can thunk through the kernel so that an ld.so service loads things
into the process - and preferably strict loading should be used.

~~~
mnw21cam
Indeed. But ld.so runs in the context of the process that is starting up, so
if it doesn't have access to anything, it can't.

~~~
AstralStorm
It doesn't actually have to, but you'd need a fun bit of stub code in place of
unloaded function pointers etc.

------
gugagore
One thing missing from this rethinking, as far as I can tell, is names.

You can't move pointers between different processes in general, because each
process has its own "memory address namespace". But an absolute path generated
by one program does make sense to another program.

~~~
hlandau
That's true. I mainly cover this point - filesystems as a means of connecting
different applications together - in another article:
[https://www.devever.net/~hl/nexuses](https://www.devever.net/~hl/nexuses)

------
perlgeek
This is a fascinating idea, and totally obvious in retrospect :-)

I've noticed features that work towards cutting off access to the global file
system. Containers are mentioned in the article, but there are other things
like systemd giving easy access to per-service private temp files, blocking
access to /home, whitelisting paths that are allowed at all etc.

The problem really is that this global mutable state has existed for 50+
years, so lots and lots of things now rely on it: dynamic loading of libraries
(could be separate objects from files), configuration (doesn't need to be
stored in files), UNIX sockets that are accessed by file name etc.

And not only have many programs started to rely on files, but many have relied
on them being mutable. Webservers can be told to re-read their config from a
running process, things like that.

So, is there any way forward for UNIX-based systems that doesn't break the
world?

Edit: another interesting observation is that iOS has already done away with
the global file system, at least from the perspective of the apps.

------
specialp
Web assembly is a good step in the elimination of the globally shared FS.
Dynamically linked libraries are explicitly provided with an index only that
links to the library location.
[https://webassembly.org/docs/dynamic-linking/](https://webassembly.org/docs/dynamic-linking/)

WASI
[https://hacks.mozilla.org/2019/03/standardizing-wasi-a-webas...](https://hacks.mozilla.org/2019/03/standardizing-wasi-a-webassembly-system-interface/)
will provide access to files given to the function by the
runtime, not vice versa as in operating systems where a syscall is sent to ask
permission for a file.
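
The inversion can be sketched in plain Python. This is an analogy only, not the WASI API: `run_sandboxed` and `count_lines` are hypothetical names, and Python itself does not stop the guest from calling `open()` on its own:

```python
def count_lines(infile):
    """Guest code: receives an already-open handle and never sees a path."""
    return sum(1 for _ in infile)


def run_sandboxed(func, path):
    """Hypothetical runtime: it alone resolves the path and grants a read-only handle."""
    with open(path, "r", encoding="utf-8") as handle:
        return func(handle)
```

The guest function has no way to name any other file through the handle it was given; the authority to open things stays with whoever holds the path.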

------
x3ro
This reminded me of this [1] talk from 32c3, which talks about the same issues
and presents an overview of some approaches ppl have come up with. Somewhat
BSD centric, but I found it interesting back then.

[1]:
[https://media.ccc.de/v/32c3-7231-cloudabi](https://media.ccc.de/v/32c3-7231-cloudabi)

------
skdotdan
One thing that has always astounded me is the fact that the following script
can be executed without any special permission (apart from execution, read and
write):

    
    
        rm -rf ~
    

A key to a more modern file system would be a much more granular permission
system than "rwx" for a certain group of users.

~~~
nybble41
What permissions do you think that command _should_ require? You could add a
separate "delete" capability, but the fact that you have permission to write
to the file already means that you can delete or overwrite its contents, which
is effectively the same from the user's point of view. (Granted, the fact that
you can delete a read-only file if you have write permission on the
_directory_ is a bit odd, but that situation doesn't come up often.)

I don't think more granular permission can solve this problem. What does solve
it, however, is read-only copy-on-write snapshots. No matter how files are
renamed, deleted, or modified, you can always recover the original version
from the snapshot.
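
A toy model of why snapshots help, assuming an in-memory store (`SnapshotStore` is a hypothetical name, not a real filesystem API). A snapshot copies only the name table; the contents are shared, and later writes replace references instead of mutating data:

```python
class SnapshotStore:
    """Toy copy-on-write store: snapshots share content with the live tree."""

    def __init__(self):
        self._live = {}       # name -> bytes (bytes objects are immutable)
        self._snapshots = []  # each snapshot is a frozen copy of the name table

    def write(self, name, data):
        # Rebinds the name to new bytes; any snapshot still points at the old bytes.
        self._live[name] = bytes(data)

    def delete(self, name):
        del self._live[name]

    def read(self, name):
        return self._live[name]

    def snapshot(self):
        # Copies references only, so taking a snapshot is cheap.
        self._snapshots.append(dict(self._live))
        return len(self._snapshots) - 1

    def restore(self, snap_id, name):
        # Bring one file back from a snapshot into the live tree.
        self._live[name] = self._snapshots[snap_id][name]
        return self._live[name]
```

In this model an `rm -rf`-style wipe of the live tree removes names but cannot reach the snapshot, so every file can be restored afterwards.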

------
ncmncm
This is a very important idea, but it will take a long time to become
practical for typical system design environments.

Best get started.

------
asimpletune
Isn’t this sort of what sandboxing iOS apps already feels like?

------
pfdietz
> root of all evil

I see what you did there.

