As an example, most people would want to be able to import an image into a word processor regardless of where that image is located (local drive, network drive, floppy disk, etc.). To support that, most end-user programs would want access to the entire filesystem. The moment two applications do this at once, you have all the shared-mutable-state problems we have now.
1. Every file is owned by an application.
2. A file-store application is the custodian of files meant to be used by multiple applications.
3. An application that needs to edit a file must take ownership of the file for the duration of the editing.
4. An application that needs to read a file must borrow the file.
5. A file can be borrowed by multiple applications, but owned by only one.
6. Applications can provide the ability for shared ownership, but those would be specialized applications capable of handling and merging simultaneous changes.
This is a straw-man solution, and I am sure multiple problems will have to be solved before such a system can become a reality.
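As a toy illustration of rules 1-6 (everything here is hypothetical -- a made-up `FileRegistry` with one exclusive owner and any number of read-only borrowers, not any real OS API):

```python
class FileRegistry:
    """Toy model of the proposal: one owner per file, many read-only borrowers."""
    def __init__(self):
        self.owner = {}      # file -> owning application
        self.borrowers = {}  # file -> set of applications reading it

    def take_ownership(self, file, app):
        # Rule 3: editing requires exclusive ownership for its duration.
        if self.owner.get(file) not in (None, app):
            raise PermissionError(f"{file} is owned by {self.owner[file]}")
        self.owner[file] = app

    def release(self, file, app):
        if self.owner.get(file) == app:
            del self.owner[file]

    def borrow(self, file, app):
        # Rules 4-5: any number of readers may borrow simultaneously.
        self.borrowers.setdefault(file, set()).add(app)

reg = FileRegistry()
reg.take_ownership("report.txt", "editor")
reg.borrow("report.txt", "previewer")  # borrowing while owned is fine
```

Rule 6's merge-capable shared owners would need something beyond this sketch, e.g. a transaction log.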
> As an example, most people would want to be able to import an image into a word processor regardless of where that image is located (local drive, network drive, floppy disk, etc.). To support that, most end-user programs would want access to the entire filesystem. The moment two applications do this at once, you have all the shared-mutable-state problems we have now.
I don't see why the custodian of photos can't be a photo management application. In fact, a filesystem is the lowest common denominator. It is possible to build higher level abstractions like PhotoStore, MusicStore, MovieStore, CodeStore etc. which accommodate and make use of the properties of individual data types to offer an enhanced experience.
It may be time for something completely different. Just stop calling it a filesystem.
I am not calling it a filesystem. Let’s call it a data bank?
The first thing that comes to mind is etcd.
> Applications can provide the ability for shared ownership, but those would be specialized applications capable of handling and merging simultaneous changes.
Finally, as I said, it is a straw-man proposal -- I haven't spent time working out all the kinks.
So flock() and file permissions then?
There are improvements that could be made here -- app-level permissions in addition to user-level permissions for example. But it's still fundamentally a filesystem.
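For what it's worth, the own/borrow split maps fairly directly onto flock(): an exclusive lock for the owner, shared locks for borrowers. A minimal POSIX-only sketch via Python's fcntl module (advisory locks, so every participating app has to cooperate):

```python
import fcntl

def edit_exclusively(path, data):
    """Take an exclusive ("owner") lock, write, release.
    Borrowers would take fcntl.LOCK_SH shared locks instead,
    and any number of them can hold one at once."""
    with open(path, "wb") as f:
        fcntl.flock(f, fcntl.LOCK_EX)   # blocks while another owner holds it
        f.write(data)
        fcntl.flock(f, fcntl.LOCK_UN)   # relinquish ownership
    return data
```

The big gap versus the proposal is exactly the "advisory" part: nothing stops a non-cooperating app from ignoring the lock.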
We are talking about what the user/application sees and is capable of accessing.
But how is that different than a filesystem?
Suppose we add application-based ACLs to file permissions. Then the app does open("/path/to/file", O_RDONLY) as ever. If the app has permission to the file, it gets the new fd. If it doesn't, it gets EACCES as usual. Or the OS displays a dialog asking whether the app should have permanent or one-time access to that file, and then the call doesn't return until the user chooses one.
I don't see a fundamental change here. The application wouldn't necessarily even have to be modified.
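From the app's side, that flow is just the error handling it already does around open(); a rough sketch, with the app-ACL decision assumed to happen inside the kernel or OS dialog before the call returns:

```python
import os, errno

def read_if_permitted(path):
    """Ordinary POSIX open. An app-level ACL layer would surface denial
    through the same EACCES path the application already handles."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except PermissionError as e:
        assert e.errno == errno.EACCES  # denial looks exactly like today
        return None
    try:
        return os.read(fd, 4096)
    finally:
        os.close(fd)
```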
Historically, mostly because of swap: the OS can move a page from memory to disk and then back to a different physical memory location without modifying the application's pointers. On large systems running 32-bit applications it was advantageous because the system may have had more memory than 32-bit pointers can address, and each application could still have its own address space. Nowadays, ASLR too.
But filesystems already have the equivalent abstraction. If you run out of space on /dev/sda you can add /dev/sdb, copy /home to it and then mount /dev/sdb1 /home and the application that reads /home/alice/file is blissfully unaware that anything has changed. Heck, half the time you're not even reading from the physical drive, the data is cached in memory and you're really reading it out of the page cache.
3. Ownership cannot be transferred. A file remains in the same application throughout its lifecycle.
4. An application can give another application read-only access to a namespace containing one or more files.
6. An application can give another application read/write access to a namespace containing one or more files through a transaction mechanism.
Do you see any advantages or disadvantages to either approach, from your point of view?
Who owns all the unassigned stuff? The file system. Give it a different name and you change nothing.
Compressed files are containers for files -- they are nothing on their own.
> Applications themselves are files.
Sure -- they go into the ApplicationStore -- which is tailored for, say, quick application discovery and launch.
> Who owns all the unassigned stuff? The file system.
What exactly is the unassigned stuff? Files taken from other machines? I don't see why they can't be put into an appropriate store based on type.
> I create my own file formats.
I assume you create your own file formats because you develop your own applications? In that case, your application will be the owner of the file in question.
Is the application store the only way to start an app? If I bind a global hotkey to start an app, how is that handled?
I use this, for example, to open links in mpv with a single line in my config file for the addon, tridactyl:
bind V hint -W ! mpv
This means: show a hint on links, and for the given hint, send the chosen link to mpv; mpv will look at the link and, if it can read it, use youtube-dl to download it to a temp dir and display the video.
Who owns what there?
Do I have to access my photos via a single photo manager? I can imagine images accessed in 17 different contexts by 30 different apps. Does each of them need permission to access each dir which contains images, or just generic permission to access images?
Does this mean that a single permission would control both access to the browser's cached image of the ycombinator logo on this page and someone's nude selfies?
I just don't think you can cleanly map apps -> files beyond the trivial cases without making something inflexible that sucks to use.
What does any of this have to do with a GUI? And for transporting, extract the data into a portable bundle, or take the entire store.
> Is the application store the only way to start an app? If I bind a global hotkey to start an app how is that handled?
Again, what does any of this have to do with how the applications are stored? Can you see the files that make up an iOS app? You can still start them, can’t you? A macOS app is a folder called Something.app, the actual binary lies somewhere inside. Do you typically need to poke inside the .app folder?
As far as the computer is concerned, you are merely starting a process. Again, what does that have to do with how applications are stored? They can be triggered however you like, wherever they are stored.
> Do I have to access my photos via a single photo manager? I can imagine images accessed in 17 different contexts by 30 different apps. Does each of them need permission to access each dir which contains images, or just generic permission to access images?
This is already handled on iOS and Android. The Photos app is merely the default interface to the PhotoStore; you can access your photos from any app.
> Does this mean that a single permission would control both access to the browser's cached image of the ycombinator logo on this page and someone's nude selfies?
Namespaces inside individual stores, or separate stores.
2. An actual filesystem allows you to run things that aren't in an app store bundle
3. Which app owns the files when an app starts an addon, which starts a process, which runs an app, which accesses a file? Does the last app in the chain own the file? This is challenging because plenty of apps can do things based on arguments passed in that involve modifying the filesystem.
4. The way iOS and Android handle two apps accessing the same file is only acceptable if someone has envisioned, to some degree, the way you want this to happen on both ends. A file picker works on any type of file, and an OS that can't have a file picker seems objectively worse, whereas adding an image picker would be an upgrade. I enjoy using calibre to manage how ebooks are stored and beets for music, but neither is bounded by an underlying system designed by others, and neither locks said files into said structure or limits access according to it. It's trivial to call out to calibre, including via the CLI, to get the full path to a book, for example, and do something with it.
It seems fairly clear to me that there are several layers of filesystem access.
Applications that should never have access to your filesystem because they are malicious. Avoiding malware in the first place, by installing only from trusted sources, is the best possible line of defense. This also makes it possible to get a list of applications that ought to be revoked and communicate it to users.
Apps that run as the user, on the user's behalf, and access the filesystem.
You appear to want to pile a layer on top of the last. This appears to only work for the simpler cases and I'm not even clear what the benefits are supposed to be.
Where does the file come from? Suppose you create a file in your terminal application, then it lives inside the terminal and you can use your terminal to run it.
> 2. An actual filesystem allows you to run things that aren't in an app store bundle
Neither a global namespace (file system) nor an app store is required to execute a program.
> Which app owns the files when an app starts an addon which starts a process which runs an app which accesses a file?
This should be at the discretion of the application developer. This is also the way browsers already work with sandboxed addons.
> This appears to only work for the simpler cases and I'm not even clear what the benefits are supposed to be.
If you don't put everything in the same global namespace, you get more security, cohesion and compatibility.
That's why we use VMs, containers, sandboxes, and various user accounts instead of doing everything in a single filesystem using the root user.
It's not even optimally secure. Why, for example, would your image editor need access to all your image files instead of just the ones passed in via a secure system dialog?
In that instance the dialog would be the checkpoint, not some weird filesystem borrow checker.
It's a tenuous analogy. DRAM, disks, caches... effectively the entire memory hierarchy is implemented as message passing. Flip-flops can be, well, flipped, but those are super low-level. Almost all the rest requires you to send messages. It's kind of ironic that all those messages basically orchestrate the illusion of global mutable state.
Presumably because for certain things it was easier to reason about when things were mutable...
And now we're taking this facade of mutable state and trying to add a veneer over it to make it seem immutable.
It's circular :)
For anybody who is curious, the general problem here is described in two great papers as "the confused deputy" and "designation without authority".
Roughly put, systems built with ACLs as the primitive mechanism for authorization can never produce practically secure systems.
Why not a collection of immutable UUID-labelled resources with an arbitrary schema of attributes? Like a relational database or some such.
Overkill? Doesn't every major app have to do this itself, in some container (document/mail folder/image definition/contact list and on and on)?
Designers of operating systems have made no effort to capture this kind of service and provide it as a fundamental OS feature. And OSs are exactly where this belongs - where you can carefully get it right and everybody writing apps can depend on it.
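The UUID-plus-attributes idea is small enough to sketch with an ordinary relational database. Everything below (table names, the `put` helper) is illustrative, not any real OS API:

```python
import sqlite3, uuid

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Immutable blobs keyed by UUID, free-form attributes alongside.
    CREATE TABLE resources (id TEXT PRIMARY KEY, body BLOB);
    CREATE TABLE attrs (id TEXT, key TEXT, value TEXT);
    CREATE INDEX attrs_kv ON attrs (key, value);  -- system-level index
""")

def put(body, **attrs):
    """Store an immutable resource plus arbitrary attribute pairs."""
    rid = str(uuid.uuid4())
    db.execute("INSERT INTO resources VALUES (?, ?)", (rid, body))
    db.executemany("INSERT INTO attrs VALUES (?, ?, ?)",
                   [(rid, k, v) for k, v in attrs.items()])
    return rid

rid = put(b"...jpeg bytes...", type="photo", taken="2019-06-01")
# Apps query by attribute, not by path:
rows = db.execute(
    "SELECT id FROM attrs WHERE key='type' AND value='photo'").fetchall()
```

The point is that the lookup side (the index) lives with the store, not with each app.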
IMHO, what's needed is something like a document database (in that it allows arbitrary schema) but even more importantly, it needs to have some system-level indexing. Right now, applications that deal with large amounts of small pieces of data each roll their own, e.g., embedded database, because (1) most filesystems aren't capable of adequate performance with very large numbers of small objects and (2) you can't maintain an app-independent index anyway, so... These applications then can't interact with each other without some app-specific API, which is often costly to develop against (at least, compared with a common filesystem API) and with only the capabilities that the app developer sees fit to give you. Which they often have very little incentive to do--their real incentives are usually to keep things proprietary and customers captive.
The absence of these two capabilities makes "the unix way" where "everything is a file" an impossible data model for these types of applications.
IMHO, the right data model is probably some combination of user (UUIDs or strings doesn't matter) and content-hash indexing with versioning and conflict resolution similar to Git (and CouchDB).
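A toy version of that Git/CouchDB-style model, just to make the shape concrete (the class and its structure are mine, not from either system):

```python
import hashlib

class VersionedStore:
    """Toy content-addressed store: blobs live under their SHA-256,
    and a name is just a pointer to a history of hashes, Git-style."""
    def __init__(self):
        self.blobs = {}     # hash -> bytes (immutable once written)
        self.history = {}   # name -> list of hashes, newest last

    def put(self, name, data):
        h = hashlib.sha256(data).hexdigest()
        self.blobs[h] = data                      # identical content dedupes
        self.history.setdefault(name, []).append(h)
        return h

    def get(self, name, version=-1):
        """Default is the newest version; older ones stay reachable."""
        return self.blobs[self.history[name][version]]

s = VersionedStore()
s.put("notes", b"v1")
s.put("notes", b"v2")
```

Conflict resolution would sit on top: two divergent histories for one name are simply two hash chains to merge.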
Actually they've made several. See BeOS and WinFS, for starters, though the idea is even older than that.
It does, but it's also something that can be provided by a library as easily as the OS. So e.g. Firefox uses sqlite, but it can use a portable library for that, and then it isn't different on Windows than Linux or Mac.
And then the application gets to choose something with an appropriate level of complexity. Sometimes all you need is Maildir or xattr(7) or JSON, sometimes you need a full on SQL database. One size fits all rarely does in practice.
Imagine if the OS let you browse all the app data, see all the relations (that you were authorized to see), and write tools to operate over it. That would be a most Open system, compared to everything we have which is essentially walled off to us.
Either you somehow enforce a high degree of uniformity, which implies a pretty serious lack of flexibility, or everybody gets to make their own decisions and then everybody makes different ones. And the second one seems better as long as the individualized thing they're doing is sufficiently well documented.
Opening up application data to OS tools, opens up a whole new world of opportunities (and hazards) for app developers.
I've spent time crawling around the SCSI/SAS/storage layers of systems like that, and wrote dirty hacks for end-user clients to cope with incompatibilities. E.g. multiple versions with the same filename are totally fine: filenames needn't be, and aren't, unique in such systems; the name is just another column in the DB. How do you tell a Windows FTP client which of the 20 different "report.txt" files it should download? (Filezilla actually has 'VMS' options for this; it's more common than you think.)
If all you ever deal with is relatively simple, well-defined, CRUD-type applications on entirely abstract systems in some remote infrastructure, it's easy to think that's how all development should look. It doesn't, and it shouldn't.
The fact is, you can do these queries if you allow flexible attributes. With a fixed file system (almost all file systems) you can't even ask. In practice, you've already overwritten the one you want, or one app conflicted with another, or you just can't do what you want (have two versions of an app installed at the same time).
Given the same title, I would write a rather different article. So many of our traditional filesystem problems center around concurrency: what kind of read-after-write or durability semantics are guaranteed. A lot of effort goes into this, which is necessary for databases to work on top of a general-purpose filesystem on top of a disintermediated storage system that may be a disk or SSD with a variety of block sizes. But for most operations it's unnecessary overhead.
Many games ship with a "packfile", which is a pseudo-filesystem that appears as a giant blob to the OS. Usually it's faster to seek individual small items out of the blob than if they were separate files.
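A packfile can be as simple as payloads followed by a trailing index, so reading an item is one seek into the blob instead of a per-file open(). A minimal sketch -- the format below is made up for illustration, not any real engine's:

```python
import io, struct

def write_pack(items):
    """Pack {name: bytes} into one blob: payloads first, then
    (offset, size, name-length, name) index records, then the
    index's own offset as the last 4 bytes."""
    buf, index = io.BytesIO(), []
    for name, data in items.items():
        index.append((buf.tell(), len(data), name))
        buf.write(data)
    index_at = buf.tell()
    for off, size, name in index:
        n = name.encode()
        buf.write(struct.pack("<III", off, size, len(n)) + n)
    buf.write(struct.pack("<I", index_at))
    return buf.getvalue()

def read_item(pack, wanted):
    """Scan the trailing index, then do a single slice into the blob."""
    (index_at,) = struct.unpack_from("<I", pack, len(pack) - 4)
    pos = index_at
    while pos < len(pack) - 4:
        off, size, nlen = struct.unpack_from("<III", pack, pos)
        pos += 12
        name = pack[pos:pos + nlen].decode()
        pos += nlen
        if name == wanted:
            return pack[off:off + size]
    raise KeyError(wanted)

pack = write_pack({"hero.png": b"\x89PNG...", "level1.dat": b"tiles"})
```

A real engine would keep the index sorted or hashed rather than scanning it, but the access pattern is the same.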
Further to that is the security problem; we've moved from "apps are all trusted, but you need to watch the files closely between multiple users" to "there's only one user, but you can't trust the apps".
Note how the cloud has avoided both of these by moving storage to three different areas with different semantics. Apps can speak to "blob storage" (S3) which has a transactionality/security granularity of one blob. Or they can speak to a database (which has intelligence of its own), or separate raw block storage if neither of those suits.
What if we moved from "everything is a file" to "everything is a URL"? Possibly adding a system-default packfile mechanism. So an app would be allowed access to everything that it ships with, but nothing outside except what it could request as URLs with various sorts of security mechanism.
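A sketch of what such a broker might look like from an app's perspective -- the scheme names and API here are invented for illustration; the point is just that the app can only dereference URLs it was explicitly granted:

```python
from urllib.parse import urlparse

class ResourceBroker:
    """Toy 'everything is a URL' broker: per-scheme handlers,
    and an app-side grant set acting as the security mechanism."""
    def __init__(self):
        self.handlers = {}    # scheme -> callable(url) -> bytes
        self.granted = set()  # URLs this app may dereference

    def grant(self, url):
        self.granted.add(url)

    def fetch(self, url):
        if url not in self.granted:
            raise PermissionError(url)
        return self.handlers[urlparse(url).scheme](url)

broker = ResourceBroker()
# Hypothetical scheme for the app's own packfile contents:
broker.handlers["pack"] = lambda u: b"bytes from the app's own bundle"
broker.grant("pack://self/assets/logo.png")
data = broker.fetch("pack://self/assets/logo.png")
```

A grant could equally be a capability token embedded in the URL itself, which is closer to how pre-signed S3 URLs work.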
Oh, Hello RDF. (Granted, in RDF everything is a URI but that's a minor detail here.)
At one point in my life I spent two years trying to work with RDF based data store modeling. Shoehorning everything into the subject-predicate-object worldview is probably beautiful from a purely mathematical point of view, but utter insanity in the real world.
Now, s-p-o model does open up a world of very interesting graph-theory approaches, especially if you're trying to build an inference engine. But at least in my experience it's not a realistic way to build general applications.
And if you thought XML as a data interface format was bad enough...
Do you mean "everything not on my machine is a URL"? We're getting somewhat close to that already.
Or do you mean "everything on my machine is a URL"? Then I don't see how it's different from "everything is a file with a full path", except that someone else is now responsible for the security mechanism.
"Everything is a URL" would give you a unified view of on-machine and off-machine resources...
Games get to do this because they don't care about interoperability. The packfile is intended to be a closed environment accessible only to the game code itself, with no need for anything else to care about it.
This could be interpreted as a particular case of "can't trust the apps" - a game has no need or desire to trust anything else.
I'm not sure what you're accomplishing, exactly. Something would have to serve said URL, and you are just moving your security problems into it. You can accomplish many of the same things with file security, with some difficulty.
Lazy loading can thunk through the kernel so that an ld.so service loads things into the process -- though preferably, strict loading should be used.
As far as I'm aware, it's quite common in microkernel-based OSes to implement even ELF loading, dynamic linking, etc. inside system libraries loaded inside the process creating the new process. So the parent process basically constructs the child process, which means you could easily design an OS such that the parent needs access to the library files, but the child doesn't.
It's also pointless. If a program can cause damage by loading harmful code it can also cause damage directly.
Restricting what programs can do is a great way to prevent experimentation and hinder progress.
>It's also pointless. If a program can cause damage by loading harmful code it can also cause damage directly.
It is not pointless, but it is also not perfect. That's why we have defense in depth. Where instead of having one perfect moat to protect the castle, you also have alligators and witches that turn people into frogs. :P
What do you mean by loading files "as code". Do you mean setting the execute bit on memory pages? You need to have the appropriate permissions to do that, and a locked down system wide policy can prevent programs from doing that as well.
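Concretely, on Unix-like systems the execute bit on pages is something you have to ask for at map time, and a hardened policy (e.g. SELinux's execmem restrictions) can refuse the request. A minimal Unix-only sketch via Python's mmap module (may fail on hardened systems, which is exactly the point):

```python
import mmap

# An anonymous mapping is only executable because PROT_EXEC was
# explicitly requested; without it, jumping into this page would fault.
page = mmap.mmap(-1, mmap.PAGESIZE,
                 prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
page.write(b"\xc3")  # x86-64 'ret' -- inert bytes until mapped executable
```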
You can't move pointers between different processes in general, because each process has its own "memory address namespace". But an absolute path generated by one program does make sense to another program.
I've noticed features that work towards cutting off access to the global file system. Containers are mentioned in the article, but there are other things, like systemd giving easy access to per-service private temp files, blocking access to /home, whitelisting the paths that are allowed at all, etc.
The problem really is that this global mutable state has existed for 50+ years, so lots and lots of things now rely on it: dynamic loading of libraries (could be separate objects from files), configuration (doesn't need to be stored in files), UNIX sockets that are accessed by file name etc.
And not only have many programs started to rely on files, but many have relied on them being mutable. Webservers can be told to re-read their config from a running process, things like that.
So, is there any way forward for UNIX-based systems that don't break the world?
Edit: another interesting observation is that iOS has already done away with the global file system, at least from the perspective of the apps.
WASI https://hacks.mozilla.org/2019/03/standardizing-wasi-a-webas... will provide access to files that the runtime hands to the function, rather than the other way around as in current operating systems, where a syscall asks for permission to a file.
rm -rf ~
I don't think more granular permission can solve this problem. What does solve it, however, is read-only copy-on-write snapshots. No matter how files are renamed, deleted, or modified, you can always recover the original version from the snapshot.
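The idea in miniature (real CoW filesystems like ZFS or Btrfs do this by sharing blocks; the dict copy below is just the toy version):

```python
class SnapshottingStore:
    """Toy illustration: a read-only snapshot survives later
    deletes, renames, and modifications."""
    def __init__(self):
        self.files = {}      # live, mutable view
        self.snapshots = {}  # name -> frozen view

    def snapshot(self, name):
        # A real filesystem shares unchanged blocks instead of copying.
        self.snapshots[name] = dict(self.files)

s = SnapshottingStore()
s.files["thesis.tex"] = b"final draft"
s.snapshot("before-cleanup")
del s.files["thesis.tex"]  # the `rm -rf ~` moment
recovered = s.snapshots["before-cleanup"]["thesis.tex"]
```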
Best get started.
I see what you did there.