I'm the original author. I hadn't planned to publicize this yet. There are still some incomplete parts, broken links, and missing screenshots. But the Internet wants what it wants.
Just to clarify a few things.
I just joined Mozilla Devrel. None of this article has anything to do with Mozilla.
I know that none of the ideas in this article are new. I am a UX expert and have 25 years of experience writing professional software. I personally used BeOS, Oberon, Plan 9, Amiga, and many others. I read research papers for fun. My whole point is that all of this has been done before, but not integrated into a nice coherent whole.
I know that a modern Linux can do most of these things with Wayland, custom window managers, DBus, search indexes, hard links, etc. My point is that the technology isn't that hard. What we need is to put all of these things into a nice coherent whole.
I know that creating a new mainstream desktop operating system is hopeless. I don't seriously propose doing this. However, I do think creating a working prototype on a single set of hardware (RPi3?) would be very useful. It would give us a fertile playground to experiment with ideas that could be ported to mainstream OSes.
And thank you to the nearly 50 people who have signed up to the discussion list. What I most wanted out of this article was to find like-minded people to discuss ideas with.
Thanks for writing this! I have especially been waiting for the next release of some OS to have something revolutionary for people who use computers as workstations, but at this point most updates seem to be about voice automation, accessibility, sync, and other "mobile" features. I have wanted a DBFS for a long time, for both personal and work files and my huge multimedia collection. Secondly, the ability to pipe data from one GUI app to another would help us immensely; it's the main reason I feel more productive while using the CLI.
I feel the answer, at least for desktop, is going to be some sort of hybrid between the app store model of smartphones and the upstream packager model of Linux distros.
What I'd really like to see is some data viz and machine learning tools to analyze the dependencies of open source software, and then intelligently cut extra strings. Fewer deps means more reliable software.
Imagine a full system running on 10,000 LoC. I think this could be a step forward.
Also, this blurs, if not throws away, the distinction between "desktop" and "web (remote)" applications, because if integration of remote objects is sandboxed but still transparent, you get improved usability.
Also, I think you don't go far enough. Databases for the file system are fine, but I think the idea of widgets or UI libraries altogether is not feasible anymore.
The system has to adapt to the individual; people have different needs and workflows.
Highly adaptable and conversational interfaces are needed.
They haven't talked about it much, but the idea of no initial permissions is a good one. Today the OS must fundamentally not trust any application. Permissions at install time are a good idea. Permissions at use time are an even better idea. Provenance also helps. But nothing beats sandboxing the app as much as possible.
There's no silver bullets here, but we might be able to silver plate a few.
> Today the OS must fundamentally not trust any application.
I entirely agree.
> Permissions at install time are a good idea.
I'm actually not so sure about this. I think the "iOS model" of asking for permissions when necessary is much better than the "Android model"* of a big list of permissions on install, preventing use of the app if any of them aren't granted (leading to users giving apps' permissions less scrutiny than they would under the iOS model).
* I believe some recent versions of Android (M?) may support the "iOS model" in some form.
Recent Android versions have a pretty good system in place. The apps still work even without some permissions (for example Instagram can work for browsing, but won't be able to take any pictures without the camera permission), and you can easily review the permissions you have granted to any app, and revoke them after the fact.
Now, the real problem is that permissions are too general. The "access/read/write files" permission is all grouped up in one place, so you end up with tons of directories for every app in your root directory (which don't get deleted when uninstalling the apps that generated them), and you allow unnecessary access to other files as well. Or the network permission, which could lead to all sorts of traffic, while many developers just need it for ads.
There's an insoluble issue in that making permissions super granular - what you really need for good security - makes them unusable for Joe public, because he can't understand them. Heck, most people probably don't understand the very broad ones that Android has now.
Maybe what's needed is more of a trust model. Users could ask "what would Bruce Schneier do", for example. If Bruce [substitute trusted person of your choice] would install this app, then I'm happy to do it as well.
I think the macOS sandbox - "apps can only access their own files, unless the OS file open dialog is used to select others" - is a really clever solution, and could be extended, possibly to URLs too. On install an app could list "home domains"; everything else requires a confirmation or a general permission (for web browsers).
I agree. On Linux, Flatpak actually does the same thing: apps only get permission to access resources when the user chooses to work with those resources by making selections in the UI.
Yes, it is still true. You can disable Internet access if you're on some Android based devices (I know you can with lineageos, the heir to CyanogenMod). However, if you have Google play services installed you may still see ads in the app.
Of course, you have a firewall if you're rooted, but I'm not rooted when on a Nexus device.
Yes, since Android Marshmallow (6.0), which is almost 3 years old, there has been a granular permissions system. Most apps support it now, but there is the problem that only 45.8% of all Android devices have M or later.
Can we eliminate or reduce the need for permissions altogether?
E.g. OSes have an `open()` system call, which can potentially access any part of the filesystem, and then they layer an ever growing permission system to restrict this.
Can we design a system where there is no `open()` call at all? Instead, the program declares "I accept an image and output an image", and when it is invoked, the OS asks the user for an image, making it clear which program is asking for it. Then the OS calls the program, passing in the selected file.
This model has other advantages, such as introspection, composability, validation, and isolation (e.g. a program that declares it takes an image and outputs an image has no access to the network anyway and cannot send your image over the network).
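A toy Python sketch of what I have in mind (every name here is invented for illustration; this isn't any real OS API): the program declares its input and output types, and only a trusted picker supplied by the OS can produce the data it runs on.

```python
# Hypothetical capability-style invocation: programs declare typed I/O
# instead of calling open(). All names (Program, invoke, picker) are invented.

class Program:
    """A program declares what it accepts and produces; it never opens files."""
    def __init__(self, name, accepts, produces, run):
        self.name, self.accepts, self.produces, self.run = name, accepts, produces, run

def invoke(program, trusted_picker):
    # The OS (not the program) asks the user for an input of the declared
    # type, making it clear which program is asking for it.
    data = trusted_picker(program.name, program.accepts)
    return program.run(data)

# A toy "image" program: it only ever sees the bytes handed to it.
invert = Program(
    name="invert",
    accepts="image",
    produces="image",
    run=lambda img: bytes(255 - b for b in img),
)

# In a real system the picker would be a trusted OS dialog; here we fake it.
def fake_picker(prog_name, kind):
    print(f"'{prog_name}' requests a {kind}")
    return bytes([0, 128, 255])

result = invoke(invert, fake_picker)
```

Note that `invert` has no way to name a file, a path, or a socket; its whole world is the argument it was given.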
I don't see how this would work for command line tools (there are other applications where an extreme version wouldn't work, but for command line tools I really don't see a good workaround except giving broad permissions).
If command lines were built in systems designed around this principle, they would work slightly differently from what we're used to. For instance, when invoking this 'program' on the command line, the system would discover it needs an image and use the command line itself to ask the user for one.
Alternatively, there could be a standard way to pass an image (or any other input) to a program, similar to a command line arg in current systems, for instance.
You would define rgrep as taking a stream of files. Generating the stream of files is outside the capability of any program (since we don't have `open` or even `listdir`). Instead you'd use a primitive of the storage system itself to define a stream of files that you pass to rgrep. Something like `rgrep(/my/path/, text)`. So it becomes impossible for any program to access a file without the user's explicit indication.
That still doesn't help with a find + xargs combination, or with any kind of problem where you can currently store file names in a file and use that for later processing.
You can't have a find program because you can't discover files; you must be provided them. But you can have a `filter` program that takes a stream of files and outputs another stream of files matching a filter. You can then pipe the output of filter into another program.
Yes, you cannot store filenames, but you could store some other serialized token generated from a file, and the token could be used to recreate the file object. Alternatively, if you have an image-based system, you don't have to convert the file object to a token explicitly: you just hold references to the file objects and they're automatically persisted and loaded.
Correct - ls couldn't be a separate program - it would be replaced with a primitive that lets you explore the filesystem.
The point of such a system would be that programs cannot explore or read the filesystem as there is no filesystem API. But programs can operate on files explicitly given to them. So exploring the filesystem is restricted to some primitives that have to be used explicitly. The guarantee then is if I invoke a program without giving it a file or folder, I know it absolutely cannot access any file.
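Here's a rough Python sketch of that guarantee (all names invented): enumeration lives in a trusted storage primitive, and untrusted programs like `filter` and `rgrep` are pure stream transformers that only see what the user explicitly hands them.

```python
# Sketch of the "no open(), no listdir()" model. FileObject, storage_stream,
# filter_files, and rgrep are all hypothetical names for illustration.

from dataclasses import dataclass
from typing import Iterator

@dataclass
class FileObject:
    name: str
    contents: str

# Toy backing store for the trusted primitive.
_STORE = [FileObject("a.txt", "hello grep"),
          FileObject("b.txt", "nothing here"),
          FileObject("c.log", "grep me too")]

def storage_stream(path: str) -> Iterator[FileObject]:
    # Trusted primitive of the storage system itself; stands in for
    # `/my/path/` in `rgrep(/my/path/, text)`. Only the user invokes it.
    yield from _STORE

# Untrusted programs: pure stream transformers with no filesystem access.
def filter_files(files, predicate):
    return (f for f in files if predicate(f))

def rgrep(files, text):
    return [f.name for f in files if text in f.contents]

# The user composes them explicitly; no program discovered a file on its own.
matches = rgrep(filter_files(storage_stream("/my/path/"),
                             lambda f: f.name.endswith(".txt")),
                "grep")
```

Invoking `rgrep` without piping a stream into it gives it exactly nothing to read.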
Define "explicitly", because if it means that I can't just type it in a shell then that disqualifies it from being a practical solution, and if I can't put it in a function/script that I can call from a shell that disqualifies it as well.
But if I can do those things (especially the second), then that seems to open at least some attack vectors (that would obviously depend on the actual rules).
You should be able to type it in a 'shell', and you should be able to set it in a function/program you call from the shell. But you can't download and run a program that automatically references a file by path. This system is different enough from a unix-style system that I'll try to roughly describe some details (with some shortcuts) of how I imagine it. It is a messaging-based data flow system (which could be further refined, of course):
- The programs behave somewhat like classes - they define input and output 'slots' (akin to instance attributes). But they don't have access to a filesystem API (or potentially even other services, such as network). Programs can have multiple input and output slots.
- You can instantiate multiple instances of the program (just like multiple running processes for the same executable). Unlike running unix processes, instantiated programs can be persisted (and are by default) - it basically persists a reference to the program and references to the values for the input slots.
- When data is provided to the input slot of an instantiated program (let's call this data binding), the program produces output data in the output slot.
- You can build pipelines of programs by connecting the output slot of one program to the input slot of another. This is how you compose larger programs from smaller ones. A pipeline could even contain control and routing nodes so you can construct data flow graphs.
- Separately, there are data stores; these could be filesystem-style, relational, or key/value.
The shell isn't a typical shell - it has the capability to compose programs and bind data. It also doesn't execute scripts at all - it can only be used interactively to compose and invoke the program graphs. A shell is bound to a data store - so it has access to the entire data store, but is only used interactively by an authenticated user.
So interactive invocation of a program may look something like this:
> /path/to/file1 | some_program | /path/to/file2
# this invokes some_program, attaches file1 to the input slot, saves the output slot contents to file2.
You could save the instantiated program itself if you want.
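A very rough Python sketch of the slot model above (class and method names are made up; a real system would persist instances rather than keep them in memory):

```python
# Hypothetical slot-based program instances: binding data to the input slot
# produces output; connecting slots composes a pipeline.

class ProgramInstance:
    def __init__(self, func):
        self.func = func          # the program's transform
        self.input_slot = None    # bound data
        self.output_slot = None   # produced when input is bound

    def bind(self, data):
        # "Data binding": providing input triggers production of output.
        self.input_slot = data
        self.output_slot = self.func(data)
        return self

    def connect(self, downstream):
        # Pipe this instance's output slot into another instance's input slot.
        downstream.bind(self.output_slot)
        return downstream

# Two tiny programs composed as: /path/to/file1 | upper | exclaim | /path/to/file2
upper = ProgramInstance(str.upper)
exclaim = ProgramInstance(lambda s: s + "!")

store = {"/path/to/file1": "hello"}          # toy stand-in for a data store
upper.bind(store["/path/to/file1"]).connect(exclaim)
store["/path/to/file2"] = exclaim.output_slot
```

The shell is the only thing with access to `store`; the program instances themselves never touch it.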
I disagree with you on a lot of things here. Files in two places will confuse people. Expecting a computer to know what hand gestures you're making requires a camera, which can be more mistake-prone than voice, and, like watching your eyes, a lot of people will just find it creepy. Limiting people to drag-n-drop or cut-n-paste will aggravate half of the userbase (I use either one depending on the situation).
A lot of your requirements for a "modern" OS are pie-in-the-sky or just seem very particular to your taste. I didn't find much here that you want that I'd also prefer, so outside of the bloat (especially Windows and Ubuntu requiring GPUs to process 3-D effects), I see more disadvantages with your changes than otherwise.
The BeOS stuff you mention is sort of true, but not really. BeOS once had a full-on database-based file system, but that's not BFS. The OFS (Old File System, as it was called within BeOS) had full-on, no-holds-barred database storage. But the performance sucked, syncing the metadata between the file system and database (which were apparently different structures) was a bit flaky, and the way the filesystem worked was fundamentally incompatible with integrating other file systems into the OS. Dominic Giampaolo and Cyril Meurillon ended up creating a whole new base for the file system with a VFS layer, etc. As part of that, the BFS file system was created. This incorporated a subset of what the file system used to be able to do: it had extended attributes (for adding arbitrary data to a file), a query layer to allow querying those attributes, and a mechanism for live queries that would allow the data to update dynamically. But it wasn't really a database in the way WinFS was meant to be - i.e. built on top of SQL Server.
If you want even more people to sign up, you may want to edit your post to include the URL to the list[1] in your original post. It took me quite some time to find it as you called it a discussion list here on HN and a “group” on your blog post.
Thanks for writing this. I think many of your observations are correct (and agree with them). I'm not as sold on your solutions though. Here's some thoughts:
Abstractions are a correct way to do things when we don't know what "correct" is or need to deal with lots of unknowns in a general way. And then when you need to aggregate lots of different abstractions together, it's often easier to sit yet another abstraction on top of that.
However, in many cases we have enough experience to know what we really need. There's no shame in capturing this knowledge and then "specializing" again to optimize things.
In the grand ol' days, this also meant that the hardware was a true partner of the software rather than an IP-restricted piece of black magic sealed behind 20 layers of firewalled software. (At first this wasn't entirely true; some vendors like Atari were almost allergic to letting developers know what was going on, but the trend reversed for a while.) Did you want to write a pixel to the screen? Just copy a few bytes to the RAM addresses that contained the screen data, and the next time the screen was drawn it would show up.
Sometime in the late 90s the pendulum started to swing back, and now it feels like we're almost at full tilt the wrong way again, despite all the demonstrations that it was the wrong way to do things. Paradoxically, this seemed to happen after the open source revolution transformed software.
Meanwhile, layers and layers and layers of stuff were built, and now the bulk of software that runs is some kind of weird middleware that nobody even remotely understands. We're sitting on towers and towers of this stuff.
Here's a demo of an entire GUI OS with web browser that could fit in and boot off of a 1.4MB floppy disk and run on a 386 with 8MB of RAM. https://www.youtube.com/watch?v=K_VlI6IBEJ0
I would bet that most people using this site would be pretty happy today if something not much better than this was their work environment.
People are surprised when somebody demonstrates that halfway decent consumer hardware can outperform multi-node distributed compute clusters on many tasks, and all it took was somebody bothering to write decent code for it. Hell, we already have the tools to do this well today.
There's an argument that developer time is worth more than machine time, but what about user time? If I write something that's used by or impacts a million people, maybe spending an extra month writing some good low-level code is worth it.
Thankfully, and for whatever reasons, we're starting to hear some lone voices of sanity. We've largely stopped jamming up network pipes with overly verbose data interchange languages, the absurdity of text editors consuming multi-core and multi-GB system resources is being noticed, machines capable of trillions of operations per second taking seconds to do simple tasks and so on...it's being noticed.
Here's an old post I wrote on this in more detail; keep in mind I'm a lousy programmer with limited skills at code optimization, and the list of anecdotes at the end of that post has grown a bit since then.
Well, I suggest leaning back, reading the feedback points again, and trying to understand what makes people think that way. I bet a UX-focused person can take that into consideration and craft a final version of this article that answers these kinds of questions before they even come up. ;-)
> My point is that the technology isn't that hard.
I disagree - the technology is extremely hard. You're talking centuries of staff-hours to make your OS, if you want it to be a robust general-purpose OS and not a toy. Just the bit where you say you want the computer to scan what's in your hands and identify it? That in itself is extraordinarily difficult. You mischaracterise the task at hand by pretending it's simple.
I think the article gave that as an example of what the system should enable, rather than a prerequisite for launch. The system wide message bus and semantic shortcuts features should make it really easy for developers of advanced peripherals to plug into the system.
For example, a Kinect would be a lot more useful in this ideal OS: you could bind gestures to window manager commands.
Unfortunately, it's wholly impractical to do real work with message busses alone if you want to maximize performance.
See: every high-performance inter-process system ever...
Could we cover a number of cases with copy-on-write semantics and system transactional memory? Sure, but the tech isn't broadly available yet, and it wouldn't cover everything.
Sometimes you just need to share a memory mapping and atomically flip a pointer...
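For instance, a minimal double-buffer sketch in Python (the layout is invented; a real implementation would live in C with proper memory barriers): the writer fills the inactive buffer, then flips a one-byte index — that flip is the whole "atomically flip a pointer" move.

```python
# Two fixed-size buffers in one shared segment, plus a one-byte index at
# offset 0 telling readers which buffer is current. A single-byte store is
# atomic on mainstream CPUs, though a real system would add fences.

from multiprocessing import shared_memory

BUF = 16  # bytes per buffer; byte 0 holds the active-buffer index (0 or 1)

shm = shared_memory.SharedMemory(create=True, size=1 + 2 * BUF)
shm.buf[0] = 0  # readers start on buffer 0

def write_frame(data: bytes):
    inactive = 1 - shm.buf[0]
    start = 1 + inactive * BUF
    shm.buf[start:start + len(data)] = data   # fill the buffer nobody reads
    shm.buf[0] = inactive                     # the "pointer flip"

def read_frame(n: int) -> bytes:
    start = 1 + shm.buf[0] * BUF
    return bytes(shm.buf[start:start + n])

write_frame(b"frame-1")
first = read_frame(7)
write_frame(b"frame-2")
second = read_frame(7)
shm.close(); shm.unlink()
```

The reader never blocks on the writer: it always sees the last complete frame, which is exactly what a message bus semantic struggles to deliver at this cost.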
It doesn't have to be implemented as a message bus. It should just be a message bus semantically. Under the hood we could implement all sorts of tricks to make it faster, as long as the applications don't notice the difference.
I would expect that to work to a point, but coherence and interlocking are eventually going to rear their ugly heads.
I've created and lived with multiple inter-process high-performance multimedia/data systems (e.g. video special effects and editing, real-time audio processing, bioinformatics), and I've yet to encounter a message passing semantic that could match the performance of manually managed systems for the broad range of non-pathological use-cases, not to speak of the broader range of theoretically possible use-cases.
If something's out there, I'd love to see it. So far as I know, nobody has cracked that nut yet.
Thanks, Josh