> Even on the desktop, the square peg is not the correct shape: we know that the system will be used by a single person and that the system does not need to protect the user from non-existent other users.
I don't know if this is really true. Many desktops are shared (e.g. by members of a family). And of course data centre machines are shared by multiple users, although often those users are all served by processes running under the same OS credentials.
Wouldn't it be interesting if processes running on my behalf within Facebook or HN couldn't access other users' private data, rather than relying on the programmers at FB or HN to get it right?
I think desktop-style devices are going more in the direction of being personal. And where they aren't, probably in the majority of cases you don't want the OS to mediate access poorly. I'm not saying that there aren't any counterexamples, though.
What you say about running processes "on behalf of you" is really quite interesting. There is no reason you should trust the application programmer to get it right, yet that's what the OS currently gives -- you can run your db as user "db" and httpd as user "httpd", but it doesn't do much good in terms of the actual user. So, some radical thinking is required. The editor of ;login: actually tried to point me in the direction you mention when we were working on the article, but I couldn't formulate clear enough thoughts on the subject to include in the article. Maybe someone else here has already thought about it and can put it into writing?
For many people you're probably right. I do think that there's tremendous value in segmenting out one's various personæ. There's no particularly good reason why I should give a binary game blob access to the same user data that contains my financial data, passwords &c.
A finer-grained system would be nice, no doubt, but OS users are pretty time-tested.
> Maybe someone else here has already thought about it and can put it into writing?
Well, in principle capabilities systems can do a lot of this already. As myself, I can give a capability to a server, and it can use that capability to execute work on my behalf; once I've received the result, I can (presumably) revoke that capability. Capabilities can even be used to implement filesystems: my process might have a filesystem root capability, which permits it to see a single directory, which is itself a list of capabilities to directories and files, &c. Pretty neat stuff.
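To make the grant/use/revoke cycle concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not how any real capability system is implemented: the names `Capability`, `read_my_notes` and `server_word_count` are made up, and Python itself does not enforce the isolation (a real system would do that in the kernel or the language runtime).

```python
class Revoked(Exception):
    """Raised when a revoked capability is used."""


class Capability:
    """A revocable reference to exactly one operation (hypothetical sketch)."""

    def __init__(self, operation):
        self._operation = operation  # the only thing this capability permits

    def invoke(self, *args):
        if self._operation is None:
            raise Revoked("capability has been revoked")
        return self._operation(*args)

    def revoke(self):
        # After this, every holder of the capability loses access.
        self._operation = None


def read_my_notes():
    # Hypothetical private resource owned by "me".
    return "buy milk"


def server_word_count(cap):
    # The server has no ambient authority: it can only use what it was handed.
    return len(cap.invoke().split())


cap = Capability(read_my_notes)
result = server_word_count(cap)  # the server works on my behalf
cap.revoke()                     # once I have the result, access is withdrawn
```

Calling `server_word_count(cap)` again after `cap.revoke()` raises `Revoked`, which is the property the paragraph above is after.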
There's been some interesting work with capabilities done in EROS, its successor Coyotos & Tahoe-LAFS.
I skimmed your thesis—really interesting work!
Mmmhmm. From what I understand, Android runs each Android application on the system as its own user and handles application permissions by making each permission its own group. Linux's user isolation is pretty good.
Other than lack of manpower and lack of interest, there's no reason why a Linux distro couldn't put in the medium-to-large amount of work it would take to make wrappers to do similar things for their most popular Linux applications. :)
Android runs Android applications each as a different Linux user.
I think the last two paragraphs of "Conclusions" are the most interesting part of the thesis.
No, because HN comments are public and the only reason to share data with Facebook is to make it available to (some) other people. It might be interesting for Dropbox or Google Drive, but even for Google Drive, the killer app and value-add over Microsoft Office is realtime collaboration with (specific) other users.
Such techniques are common in hardware platforms for building fault-tolerant systems but are not commonly used in software solutions. This is mainly because conventional languages do not permit different software modules to co-exist in such a way that there is no interference between modules. The commonly used threads model of programming, where resources are shared, makes it extremely difficult to isolate components from each other — errors in one component can propagate to another component and damage the internal consistency of the system.
...In our system "processes" and "concurrency" are part of the programming language and are not provided by the host operating system. This has a number of advantages over using operating system processes:
Concurrent programs run identically on different OSs—we are not limited by how processes are implemented on any particular operating system. The only observable difference when moving between OS’s, and processors should be due to different CPU speeds and memory sizes etc. All issues of synchronization, and inter-process interaction should be the same irrespective of the properties of the host operating system.
Our language based processes are much lighter-weight than conventional OS processes. Creating a new process in our language is a highly efficient operation, some orders of magnitude faster than process creation in most operating systems, and orders of magnitude faster than thread creation in most programming languages.
Our system has very little need of an operating system. We make use of very few operating system services, thus it is relatively easy to port our system to specialised environments such as embedded systems.
I don't read Armstrong's commentary as a criticism of the concept of an OS. I read it as the assertion that if you have specialized needs, the general-purpose code in most OS's is not-infrequently a poor fit for your application.
Erlang still heavily relies on the hardware abstraction and device drivers provided by modern OS's, and still makes use of OS services, it just makes use of fewer of those services than some other software projects.
That's quite funny to read this while the VW 'software cheat' is unfolding..
More seriously, reading this and https://news.ycombinator.com/item?id=10334579 in the same day makes for a quite interesting day.
This isn't why unprivileged users are unable to bind to ports lower than 1024. Software binding to those ports is assumed to be trusted system software. Preventing unprivved users from binding to those ports prevents a malicious unprivved user from finding a way to crash one of these trusted daemons and standing up his own, malicious copy in its place.
"It is no longer a catastrophe if an unprivileged process binds to transport layer ports less than 1024."
except that -from a security perspective- it kind of is. :)
Edit: To put a finer point on this and make it more explicit: Permitting unprivileged users to bind to "privileged" ports changes a service crash bug from a DoS into a complete service takeover.
1. You're eliminating entire classes of root-level processes.
2. You're restricting all other non-root processes from accessing those port(s).
I was double-checking to ensure that you understood that my original comment was addressing only the author's assertion,  and that I did not intend to assert that the only way to achieve the same security properties of the current way of doing things was the way things were currently done. :)
> 1. You're eliminating entire classes of root-level processes.
Eh... Given that the way this sort of thing is currently handled is to bind as root, then drop privs and fork, handing the FD to the bound socket to the forked child, nothing competently written ends up running as root that doesn't need to run as root for other reasons. (For example: sshd needs to switch to any user on the system, so it runs as root. Apache only needs to access things that httpd can access, so it runs as httpd.)
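A rough sketch of the bind-then-drop pattern in Python, for anyone who hasn't seen it. This is a simplification under stated assumptions: it drops privileges in the same process rather than forking and handing the FD to a child, and `bind_then_drop` and its `uid`/`gid` parameters are hypothetical names, not anything Apache or sshd actually uses.

```python
import os
import socket


def bind_then_drop(port, uid=None, gid=None):
    """Bind a listening TCP socket, then optionally drop root privileges.

    uid/gid are hypothetical target ids (e.g. those of an "httpd" user).
    When uid is None, or the process is not root, nothing is dropped.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", port))  # needs root if port < 1024 on most systems
    sock.listen(8)
    if uid is not None and os.geteuid() == 0:
        os.setgroups([])            # drop supplementary groups first
        os.setgid(gid if gid is not None else uid)
        os.setuid(uid)              # from here on, root cannot be regained
    return sock


# Port 0 asks the kernel for any free port, so this runs without root.
listener = bind_then_drop(0)
bound_port = listener.getsockname()[1]
```

The ordering is the point: the privileged step (the bind) happens first, and everything that touches untrusted input runs after the drop, with the already-bound socket still open.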
> 2. You're restricting all other non-root processes from accessing those port(s).
Linux already does this, but in a more general way. If you don't pass SO_REUSEPORT, you can't bind to a port that's already in use. If you do use SO_REUSEPORT, only code with the same EUID as the first code to bind to that port can bind to that port. See  for details.
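If you want to poke at this yourself, here is a small Python sketch, assuming a Linux box where `socket.SO_REUSEPORT` is available. The helper `try_bind` is just an illustrative name. It only exercises the same-user case; checking the EUID restriction would need a second user account.

```python
import errno
import socket


def try_bind(port, reuseport):
    """Return a TCP socket bound to 127.0.0.1:port, or the errno on failure."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuseport:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    try:
        sock.bind(("127.0.0.1", port))
    except OSError as exc:
        sock.close()
        return exc.errno
    return sock


# Without SO_REUSEPORT: the second bind to the same port is refused.
first = try_bind(0, reuseport=False)      # port 0 = let the kernel pick
port = first.getsockname()[1]
second = try_bind(port, reuseport=False)  # expected: errno.EADDRINUSE

# With SO_REUSEPORT on both sockets (same EUID): both binds succeed.
shared_a = try_bind(0, reuseport=True)
shared_port = shared_a.getsockname()[1]
shared_b = try_bind(shared_port, reuseport=True)
```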
Edit 2: This might also interest you: http://stackoverflow.com/questions/413807/is-there-a-way-for...
 Which was: "It is no longer a catastrophe if an unprivileged process binds to transport layer ports less than 1024." (Emphasis mine.)
The user-specificity of ports I'm speaking of would require that all access to a port be through a specified UID. Port reuse would only prevent attaching to already active ports. Nothing about keeping an otherwise, say, unused SMTP port 25 from getting snaked, no?
And there are many incompetent sysadmins out there. ;) Any system that has admin-configurable rules is bound to be misconfigured by some portion of its userbase.
Not that SELinux's or GRSecurity's MAC systems are easy to configure, but I think that they do a better job of what you're trying to do here than either of us would be likely to do with our first couple of iterations.
> The user-specificity of ports I'm speaking of would require that all access to a port be through a specified UID.
Right. That's obvious. I hope you didn't think that I thought otherwise.
> Nothing about keeping an otherwise, say, unused SMTP port 25 from getting snaked, no?
Nothing except the fact that -on almost every Linux system, and OS X system, and modern Windows system- 25 is in the privileged range, which requires that someone with root privs run the code that binds to the port.  :)
Generally, if you have root, you get to do whatever you want. So, if an untrusted user is running code as root, they're likely going to be able to either
* reconfigure whatever system either you or I cook up to prevent them from binding their code to a particular port
* run their malicious code with whatever EUID is required to bind to the port that they want
 Or for the system admin to have marked the binary with the right cap bits to override the privileged port restriction.
SELinux is a marvelous pain in the ass, though, isn't it? Haven't messed with GRSecurity.
Also, clarifying, services which aren't running as root cannot be compromised at root level directly from the outside.
Which I hear happens.
This is so obvious that I seriously don't understand why you're bringing it up.
> SELinux is a marvelous pain in the ass, though, isn't it?
Flexible MAC systems are necessarily complex. The configuration for a complex system is bound to also be complex. This is rather unavoidable.
And, yes, mapping of complex realities onto interfaces for mediating those realities typically results in complex interfaces. Those which aren't sufficiently complex have simply squeezed the actual complexity elsewhere.
It's not a matter of whether hardware companies can get their firmware right, it's a matter of
* Who keeps developing the firmware after the company abandons the hardware?
* Who bothers to realize that the code for $HARDWARE_COMPANY's 200+ devices can be refactored down to a single module with a few, very minor tweaks for each device?
History tells us (and the explosion of model-specific drivers on Windows, for which there exist single-driver-with-minor-model-specific-tweaks on Linux, demonstrates) that hardware manufacturers generally have little to no desire to do either thing.
"Even on the desktop, the square peg is not the correct shape: we know that the system will be used by a single person and that the system does not need to protect the user from non-existent other users."
As I've intimated elsewhere, an (ostensibly single-user) modern PC has a bunch of unrelated daemons running to serve the logged-in user. User isolation, along with running each daemon as a different system user, is what keeps those daemons from messing with the actual computer user's files and processes. This isolation is a pretty important property. Extending that isolation further by -perhaps- writing wrappers that run each application as a different system user might be quite desirable. :)
"The current cloud trend is gearing towards unikernels, a term coined and popularized by the MirageOS project..."
Maybe. As Linux's ability to sandbox a given process from the rest of the system gets better and better, I suspect that this sandboxing will be the way that a lot of people choose to secure their systems. The sandstorm.io folks report that they have had great success just by using libseccomp on the software that they deploy.
First, recall that the advantage of universal computers is that they can simulate anything, including better computers. Objects themselves were supposed to be fully universal computers that got work done through message passing.
Second, the people at PARC had a "no centers" philosophy. They recognized that the key to building scalable systems is to keep responsibility widely distributed among the components.
Thus it's easy to see why you wouldn't need or want an OS:
* An OS is just a simulation of a nicer computer running atop a not-so-nice computer (i.e. the hardware). But, real objects can give you the same thing, and it just so happens that modern hardware components are real objects (i.e. universal computers that get work done by passing messages).
* Having centers in your system makes it hard to scale. And by "scale", we're not necessarily talking about things like "number of simultaneous users/processes", but rather about things like making sure that the Nth modification to the system is just as painless as the first one. The OS is clearly an unnecessary center because the hardware components are now better at handling the responsibility for their respective functionality, but the OS is also an undesired center because its opaqueness and rigidity make it hard to modify big systems. Fortunately, since an OS is a simulation of a universal computer, it can also simulate other universal computers, so we're able to abstract away the OS by using it to run simulations of better computers with names like "Erlang", "Java", "Python", etc.
Finally, while I enjoyed the article and found it interesting, I do disagree with the author's tacit assumption that getting rid of the OS implies more opaque and locked down systems, or that locking down the system implies better reliability and security. Firstly, message passing is already a secure medium; stupid or malicious parser implementations are the biggest cause of the Internet's insecurity. Secondly, it's an empirical fact that the best way to achieve systemic reliability is with redundancy. This means having components that perform the same function, but that are produced independently by isolated teams that use different technologies and techniques. So reliability means having communication standards of some kind (i.e. protocols). Indeed, as long as companies continue to sell computers (i.e. things designed to let me simulate what I think is a better computer), we'll end up with more freedom to tinker. What we really have to worry about is companies selling things that they claim are computers, but that lack that crucial facility (which is why the trend of calling an iPhone a computer, or OSX's SIP, can be a little disquieting).
"The author's" assumption is that getting rid of unnecessary moving parts is an improvement. Having many slightly different copies of the same functionality in the stack is not redundancy, it's just silly -- "all problems with operating systems can be solved by removing layers of indirection"
I'm not too worried about companies selling tools instead of computers, just like I'm not upset that I didn't get a metallurgy plant when I bought a hammer. Yes, the philosophy of computation is a different thing (as Dr. Kay often points out), but sometimes you just need a tool instead of poetry. (nb. I'm not making a statement about whether or not everyone should understand computers)
Maybe if you think about a driver as a unit which performs a computation, the article will make more sense. I like to call drivers "protocol translators", but translating a protocol from one representation to another is really just a computation. The idea is to slowly liberate those units of computation from the clutches of the "centers", and then improve them, while still keeping the world functioning.
The author of this paper had all of the facts necessary to come to this realization:
"The early time-sharing systems isolated users from other users. [. . .] The time-sharing system also isolates the system and hardware components from the unprivileged user."
"The hypervisor provides the necessary isolation and controls guest resource use. Since the hypervisor exposes only a simple hardware-like interface to the guest, it is much easier to reason about what can and should happen than it is to do so with containers."
Those two paragraphs are just the same things written in different ways: isolation of users from each other, hardware abstraction, and resource-sharing. Those are the fundamental tasks of an operating system. If we call the operating system a "hypervisor" and the user an "application", it makes no difference.
Hypervisors have a place in allowing non-communicating users to run (different) existing kernels (and thus operating systems) on the same hardware. Great for VPS's.
However, using them for multiple potentially cooperating (but not necessarily mutually-trusting) applications is missing the point. Operating systems have all of the necessary tools (hardware hooks) to properly isolate different users ("applications"); if they currently don't do a proper job of it then that's just a sign we need redesigned operating systems, not an additional layer which has real hardware costs!
The author of this paper laments that "when you virtualize, it is more difficult to optimize resource usage, since applications do not know how to play along in the grand ecosystem", but this is one problem which operating systems have been solving fairly well for decades, or at least a lot of effort has been put into solving these hard problems, and there's no sense in splitting that effort into hypervisor solutions.
 which is really just the intersection of the other two; each provides an abstraction to the user that it is the only consumer of the hardware resources—equivalently, an abstraction to the application that its virtual machine is the only consumer of the hardware resources.
 Yes, hypervisors do some different tasks than OS, but I think that's largely for two reasons, which are accidental as opposed to fundamental: 1) In order to make existing operating systems work without (much) modification, the hypervisor has to do hardware abstraction in a less abstract way, so to speak, than the OS can offer to processes. 2) At the CPU level, designers of the virtualization interface had both hindsight and freedom to improve the interfaces for hypervisor–OS interaction as compared to OS–application interaction.
 Examples of hardware costs are more memory usage by separate kernels and additional context switch cost between kernels.
With regard to doing a better job of it (and thus removing the current need for hypervisors for this task), I think moving away from POSIX/UNIX could help; many of the points the author raises against operating systems are really only valid criticisms of *NIX.
 Any claim that hypervisors offer greater isolation between users/applications is either 1) a small result of the improvements in technology discussed in , or 2) a misunderstanding of the situation, where less (visible) work has been put into compromising hypervisors than OSes, but we have seen hypervisor exploits nonetheless.