
The Rise and Fall of the Operating System [pdf] - jsnell
http://www.fixup.fi/misc/usenix-login-2015/login_oct15_02_kantee.pdf
======
zeveb
Fascinating article—a really good read. I definitely want to check out the
work at rumpkernel.org. I do have one quibble with this line, though:

> Even on the desktop, the square peg is not the correct shape: we know that
> the system will be used by a single person and that the system does not need
> to protect the user from non-existent other users.

I don't know if this is really true. Many desktops are shared (e.g. by members
of a family). And of course data centre machines are shared by multiple users,
although often those users are all using processes operating using the same OS
credentials.

Wouldn't it be interesting if processes running on my behalf within Facebook
or HN _couldn't_ access other users' private data, rather than relying on the
programmers at FB or HN to get it right?

~~~
anttiok
(author. thanks for the kind words).

I think desktop-style devices are going more in the direction of being
personal. And where they aren't, probably in the majority of cases you don't
want the OS to mediate access poorly. I'm not saying that there aren't any
counterexamples, though.

What you say about running processes "on behalf of you" is really quite
interesting. There is no reason you should trust the application programmer to
get it right, yet that's what the OS currently gives -- you can run your db as
user "db" and httpd as user "httpd", but it doesn't do much good in terms of
the actual user. So, some radical thinking is required. The editor of ;login:
actually tried to point me in the direction you mention when we were working
on the article, but I couldn't formulate clear enough thoughts on the subject
to include in the article. Maybe someone else here has already thought about
it and can put it into writing?

~~~
zeveb
> I think desktop-style devices are going more in the direction of being
> personal.

For many people you're probably right. I do think that there's tremendous
value in segmenting out one's various personæ. There's no particularly good
reason why I should give a binary game blob access to the same user data that
contains my financial data, passwords &c.

A finer-grained system would be nice, no doubt, but OS users are pretty time-
tested.

> Maybe someone else here has already thought about it and can put it into
> writing?

Well, in principle capabilities systems can do a lot of this already. As
myself, I can give a capability to a server, and it can use that capability to
execute work on my behalf; once I've received the result, I can (presumably)
revoke that capability. Capabilities can even be used to implement
filesystems: my process might have a filesystem root capability, which permits
it to see a single directory, which is itself a list of capabilities to
directories and files, &c. Pretty neat stuff.
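
As a rough illustration of the grant, use, revoke cycle described above, here is a minimal Python sketch. The names `Capability`, `RevokedError` and `server_word_count` are hypothetical and exist only for this example. Plain Python cannot make a handle truly unforgeable, so this shows the shape of the idea rather than a real capability system.

```python
class RevokedError(Exception):
    """Raised when a revoked capability is invoked."""


class Capability:
    """A handle wrapping exactly one permitted operation (illustrative only)."""

    def __init__(self, operation):
        self._operation = operation
        self._revoked = False

    def invoke(self, *args):
        # Every use goes through the handle, so revocation takes effect at once.
        if self._revoked:
            raise RevokedError("capability has been revoked")
        return self._operation(*args)

    def revoke(self):
        self._revoked = True


def server_word_count(read_cap):
    """A 'server' that can only reach what the capability exposes."""
    return len(read_cap.invoke().split())


# The user owns the data and hands out a read-only capability to it.
private_note = "meet me at noon"
cap = Capability(lambda: private_note)

result = server_word_count(cap)  # the server works on the user's behalf
cap.revoke()                     # the user withdraws access afterwards

try:
    server_word_count(cap)
    still_usable = True
except RevokedError:
    still_usable = False
```

The server never sees the user's other data; it only ever holds the one handle it was given, and that handle stops working once revoked.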

There's been some interesting work with capabilities done in EROS, its
successor Coyotos & Tahoe-LAFS.

I skimmed your thesis— _really_ interesting work!

~~~
simoncion
> A finer-grained system would be nice, no doubt, but OS users are pretty
> time-tested.

Mmmhmm. From what I understand, Android runs each Android application on the
system as its own user and handles application permissions by making each
permission its own group. Linux's user isolation is pretty good.

Other than lack of manpower and lack of interest, there's no reason why a
Linux distro couldn't put in the medium-to-large amount of work it would take
to make wrappers to do similar things for their most popular Linux
applications. :)

~~~
cwyers
Anyone who's ever nervously handed their kid their phone so the kid could play
Angry Birds for a bit can tell you why Android's method of handling users
isn't perfect.

~~~
nulltype
I'm not sure what software level protections can stop your phone from being
dropped down the stairs.

~~~
cwyers
Fair enough, but I was more referring to "changing my wallpaper and
rearranging my widgets" kind of stuff.

------
lkrubner
The best criticism of the concept of an "Operating System" is implicit in Joe
Armstrong's thesis about Erlang:

---------------------------

Such techniques are common in hardware platforms for building fault-tolerant
systems but are not commonly used in software solutions. This is mainly
because conventional languages do not permit different software modules to co-
exist in such a way that there is no interference between modules. The
commonly used threads model of programming, where resources are shared, makes
it extremely difficult to isolate components from each other — errors in one
component can propagate to another component and damage the internal
consistency of the system.

...In our system "processes" and "concurrency" are part of the programming
language and are not provided by the host operating system. This has a number
of advantages over using operating system processes:

Concurrent programs run identically on different OSs—we are not limited by how
processes are implemented on any particular operating system. The only
observable difference when moving between OS’s, and processors should be due
to different CPU speeds and memory sizes etc. All issues of synchronization,
and inter-process interaction should be the same irrespective of the
properties of the host operating system.

Our language based processes are much lighter-weight than conventional OS
processes. Creating a new process in our language is a highly efficient
operation, some orders of magnitude faster than process creation in most
operating systems, and orders of magnitude faster than thread creation in most
programming languages.

Our system has very little need of an operating system. We make use of very
few operating system services, thus it is relatively easy to port our system
to specialised environments such as embedded systems.

[http://www.erlang.org/download/armstrong_thesis_2003.pdf](http://www.erlang.org/download/armstrong_thesis_2003.pdf)

~~~
simoncion
Note: I'm having a _lot_ of fun with Erlang. I'm _very_, very far from being
an Erlang expert, but I'm having a lot of fun working on my little Erlang
project.

I don't read Armstrong's commentary as a criticism of the concept of an OS. I
read it as the assertion that if you have specialized needs, the general-
purpose code in most OS's is not-infrequently a poor fit for your application.

Erlang still _heavily_ relies on the hardware abstraction and device drivers
provided by modern OS's, and _still_ makes use of OS services, it just makes
use of fewer of those services than _some_ other software projects.

------
renox
> Would smart but non-open hardware be a disaster? We can draw some
> inspiration from the automobile industry. Over the previous 30 years, we
> lost the ability to fix our cars and tinker with them. People like to
> complain about the loss of that ability. Nobody remembers to complain about
> how much better modern cars perform when they are working as expected.

It's quite funny to read this while the VW 'software cheat' is unfolding.

More seriously, reading this and
[https://news.ycombinator.com/item?id=10334579](https://news.ycombinator.com/item?id=10334579)
in the same day makes for a quite interesting day.

------
simoncion
"It is no longer a catastrophe if an unprivileged process binds to transport
layer ports less than 1024. Everyone should consider reading and writing the
network medium as unlimited due to hardware no longer costing a million
dollars, regardless of what an operating system does."

This isn't why unprivileged users are unable to bind to ports lower than 1024.
Software binding to those ports is assumed to be trusted system software.
Preventing unprivved users from binding to those ports prevents a malicious
unprivved user from finding a way to crash one of these trusted daemons and
standing up his own, malicious copy in its place.

~~~
dredmorbius
You could accomplish the same "don't allow Joe Random UserID to bind to port
X" by specifically mapping _designated_ UIDs to ports. This _doesn't_ require
that accounts be root, avoids the security issues of running network services
as root, and prevents arbitrary processes from binding to ports.

~~~
simoncion
Sure, from a security perspective, there's no requirement that _0_ be the only
UID that can bind to privileged ports. But by doing what you suggest, you've
created a privileged user that can create privileged processes. The author
states:

"It is no longer a catastrophe if an unprivileged process binds to transport
layer ports less than 1024."

except that -from a security perspective- it kind of is. :)

Edit: To put a finer point on this and make it more explicit: Permitting
unprivileged users to bind to "privileged" ports changes a service crash bug
from a DoS into a _complete_ service takeover.

~~~
dredmorbius
If _one and only one UID_ can bind to a given port (much the way other service
access is limited by UID or GID), then you're _reducing_ overall attack
surface:

1. You're eliminating entire classes of root-level processes.

2. You're restricting _all other_ non-root processes from accessing those
port(s).
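
A sketch of the policy check this would amount to, in Python. The table `PORT_OWNERS`, the UIDs in it, and the function `may_bind` are assumptions made up for illustration; this is not an existing kernel interface.

```python
# Hypothetical policy: each reserved port has exactly one owning UID.
PORT_OWNERS = {
    25: 101,  # assumed UID of a mail service account
    80: 102,  # assumed UID of a web service account
}

PRIVILEGED_LIMIT = 1024  # ports below this are traditionally root-only


def may_bind(uid, port):
    """Return True if `uid` may bind `port` under a one-UID-per-port rule."""
    if port >= PRIVILEGED_LIMIT:
        return True  # unreserved ports stay open to everyone
    # Reserved ports: only the designated UID may bind. A reserved port with
    # no entry in the table can be bound by nobody. UID 0 is deliberately not
    # special-cased in this sketch; a real design would have to decide that.
    return PORT_OWNERS.get(port) == uid
```

With this table, `may_bind(101, 25)` is allowed while `may_bind(102, 25)` and `may_bind(1000, 22)` are refused.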

~~~
simoncion
I mostly agree. I guess you misunderstood what I wrote.

I was double-checking to ensure that you understood that my original comment
was addressing _only_ the author's assertion, [0] and that I did not intend to
assert that the _only_ way to achieve the same security properties of the
current way of doing things was the way things were currently done. :)

> 1. You're eliminating entire classes of root-level processes.

Eh... Given that the way this sort of thing is currently handled is to bind as
root, then drop privs and fork, handing the FD to the bound socket to the
forked child, _nothing_ competently written ends up running as root that
doesn't _need_ to run as root for _other_ reasons. (For example: sshd needs to
switch to _any_ user on the system, so it runs as root. Apache only needs to
access things that httpd can access, so it runs as httpd.)
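
For readers who haven't seen the pattern, here is a minimal Python sketch of bind-then-drop. The function name `bind_then_drop` and its parameters are made up for this example, and the sketch leaves out the fork and FD hand-off described above: it drops privileges in the same process that did the bind.

```python
import os
import socket


def bind_then_drop(port, uid, gid, host="0.0.0.0"):
    """Bind `port` while privileged, then permanently drop to uid/gid.

    For port < 1024 the process normally has to start as root. When it is
    not running as root, the drop step is skipped and only the bind happens.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(16)

    if os.geteuid() == 0:
        os.setgroups([])  # drop supplementary groups first
        os.setgid(gid)    # group before user: setgid still needs privilege
        os.setuid(uid)    # after this call, root cannot be regained
    return sock
```

The socket stays bound to the privileged port after the drop, which is why the serving code never needs to run as root just to listen there.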

Edit:

> 2. You're restricting all other non-root processes from accessing those
> port(s).

Linux already does this, but in a more general way. If you don't pass
SO_REUSEPORT, you can't bind to a port that's already in use. If you _do_ use
SO_REUSEPORT, only code with the same EUID as the first code to bind to that
port can bind to that port. See [1] for details.
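
A small Python sketch of the first half of that behaviour, assuming a Linux machine where `socket.SO_REUSEPORT` is available. The helper names `bound_socket` and `try_second_bind` are made up for this example. It only shows the same-user case; demonstrating the EUID check would need a second user account.

```python
import errno
import socket


def bound_socket(port, reuseport):
    """Create a listening TCP socket on loopback, optionally with SO_REUSEPORT."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuseport:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", port))
    s.listen(1)
    return s


def try_second_bind(reuseport):
    """Bind an ephemeral port, then try to bind it again from a second socket.

    Returns "ok" if the second bind succeeds, otherwise the errno name.
    """
    first = bound_socket(0, reuseport)  # port 0: let the kernel pick one
    port = first.getsockname()[1]
    try:
        second = bound_socket(port, reuseport)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    else:
        second.close()
        return "ok"
    finally:
        first.close()
```

Calling `try_second_bind(False)` should report the address as already in use, while `try_second_bind(True)` should succeed because both sockets set the option and belong to the same user.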

Edit 2: This might also interest you:
[http://stackoverflow.com/questions/413807/is-there-a-way-for...](http://stackoverflow.com/questions/413807/is-there-a-way-for-non-root-processes-to-bind-to-privileged-ports-1024-on-l/414258#414258)

[0] Which was: "It is no longer a catastrophe if _an unprivileged_ process
binds to transport layer ports less than 1024." (Emphasis mine.)

[1] [https://lwn.net/Articles/542629/](https://lwn.net/Articles/542629/)

~~~
dredmorbius
Root-and-drop _was_ a nuance I considered mentioning. There's a lot of
incompetent code though. I may have written some of it.

The user-specificity of ports I'm speaking of would require that _all_ access
to a port be through a specified UID. Port reuse would only prevent attaching
to _already active_ ports. Nothing about keeping an otherwise, say, unused
SMTP port 25 from getting snaked, no?

~~~
simoncion
> Root-and-drop was a nuance I considered mentioning. There's a lot of
> incompetent code though.

And there are _many_ incompetent sysadmins out there. ;) Any system that has
admin-configurable rules is bound to be misconfigured by some portion of its
userbase.

Not that SELinux's or GRSecurity's MAC systems are _easy_ to configure, but I
think that they do a better job of what you're trying to do here than either
of us would be likely to do with our first couple of iterations.

> The user-specificity of ports I'm speaking of would require that all access
> to a port be through a specified UID.

Right. That's obvious. I hope you didn't think that I thought otherwise.

> Nothing about keeping an otherwise, say, unused SMTP port 25 from getting
> snaked, no?

Nothing except the fact that -on almost every Linux system, and OS X system,
and modern Windows system- 25 is in the privileged range, which requires that
someone with root privs run the code that binds to the port. [0] :)

Generally, if you have root, you get to do whatever you want. So, if an
untrusted user is running code as root, they're likely going to be able to
either

* reconfigure whatever system either you or I cook up to prevent them from binding their code to a particular port

or

* run their malicious code with whatever EUID is required to bind to the port that they want

[0] Or for the system admin to have marked the binary with the right cap bits
to override the privileged port restriction.

~~~
dredmorbius
Just noting, yes, we're pretty close to understandings here.

SELinux _is_ a marvelous pain in the ass, though, isn't it? Haven't messed
with GRSecurity.

Also, clarifying, services which aren't running as root cannot be compromised
_at root level_ directly _from the outside_.

Which I hear happens.

~~~
simoncion
> ...services which aren't running as root cannot be compromised at root level
> [remotely, barring sploit-an-error-on-the-local-machine-to-gain-root-
> sploits].

This is so obvious that I _seriously_ don't understand why you're bringing it
up.

> SELinux _is_ a marvelous pain in the ass, though, isn't it?

Flexible MAC systems are necessarily complex. The configuration for a complex
system is bound to _also_ be complex. This is rather unavoidable.

~~~
dredmorbius
One thing age and experience have taught me is how often stating the bleeding
obvious is in fact necessary. If the point _is_ obvious to you, simply noting
it is sufficient.

And, yes, mapping of complex realities onto interfaces for mediating those
realities typically results in complex interfaces. Those which _aren't_
sufficiently complex have simply squeezed the actual complexity elsewhere.

------
simoncion
"The solution for hardware device drivers is to push the complexity where it
belongs in 2015, not where it belonged in 1965. Some say they would not trust
hardware vendors to get complex software right, and therefore the complexity
should remain in software running on the CPU. As long as systems software
authors cannot get software right either, there is no huge difference in
correctness."

It's not a matter of whether hardware companies can get their firmware right,
it's a matter of

* Who keeps developing the firmware after the company abandons the hardware?

* Who bothers to realize that the code for $HARDWARE_COMPANY's 200+ devices can be refactored down to a single module with a few, very minor tweaks for each device?

History tells us (and the explosion of model-specific drivers on Windows, for
which there exist single-driver-with-minor-model-specific-tweaks on Linux,
demonstrates) that hardware manufacturers generally have little to no desire
to do _either_ thing.

"Even on the desktop, the square peg is not the correct shape: we know that
the system will be used by a single person and that the system does not need
to protect the user from non-existent other users."

As I've intimated elsewhere, an (ostensibly single-user) modern PC has a
_bunch_ of unrelated daemons running to serve the logged-in user. User
isolation, along with running each daemon as a different system user is what
keeps those daemons from messing with the _actual_ computer user's files and
processes. This isolation is a _pretty_ important property. Extending that
isolation further, perhaps by writing wrappers that run each application as
a different system user, might be _quite_ desirable. :)

"The current cloud trend is gearing towards unikernels, a term coined and
popularized by the MirageOS project..."

Maybe. As Linux's ability to _sandbox_ a given process from the rest of the
system gets better and better, I suspect that this sandboxing will be the way
that a _lot_ of people choose to secure their systems. The sandstorm.io folks
report that they have had _great_ success just by using libseccomp on the
software that they deploy.

------
noblethrasher
Obviating the OS was a natural consequence of one of the original goals of OOP
as conceived by Kay et al.:

First, recall that the advantage of universal computers is that they can
simulate anything, including better computers. Objects themselves were
supposed to be fully universal computers that got work done through message
passing.

Second, the people at PARC had a "no centers" philosophy. They recognized that
the key to building scalable systems is to keep responsibility widely
distributed among the components.

Thus it's easy to see why you wouldn't need or want an OS:

* An OS is just a simulation of a nicer computer running atop a not-so-nice computer (i.e. the hardware). But, real objects can give you the same thing, and it just so happens that modern hardware components are real objects (i.e. universal computers that get work done by passing messages).

* Having centers in your system makes it hard to scale. And by "scale", we're not necessarily talking about things like "number of simultaneous users/processes", but rather about things like making sure that the Nth modification to the system is just as painless as the first one. The OS is clearly an unnecessary center because the hardware components are now better at handling the responsibility for their respective functionality, but the OS is also an undesired center because its opaqueness and rigidity make it hard to modify big systems. Fortunately, since an OS is a simulation of a universal computer that means that it can also simulate other universal computers, so we're able to abstract away the OS by using it to run simulations of better computers with names like "Erlang", "Java", "Python", etc.

Finally, while I enjoyed the article and found it interesting, I do disagree
with the author's tacit assumption that getting rid of the OS implies more
opaque and locked down systems, or that locking down the system implies better
reliability and security. Firstly, message passing is already a secure medium;
stupid or malicious parser implementations are the biggest cause of the
Internet's insecurity [1]. Secondly, it's an empirical fact that the best way
to achieve systemic reliability is with redundancy. This means having
components that perform the same function, but that are produced independently
by isolated teams that use different technologies and techniques. So
reliability means having communication standards of some kind (i.e.
protocols). Indeed, as long as companies continue to sell computers (i.e. a
thing designed to let me simulate what I think is a better computer), we'll
end up with more freedom to tinker. What we really have to worry about is
companies selling things that they claim are computers, but that lack that
crucial facility (which is why the trend of calling an iPhone a computer, or
OSX's SIP, can be a little disquieting).

[1] [http://langsec.org/](http://langsec.org/)

~~~
anttiok
I think we're in rough agreement.

"The author's" assumption is that getting rid of unnecessary moving parts is
an improvement. Having many slightly different copies of the same
functionality in the stack is not redundancy, it's just silly -- "all problems
with operating systems can be solved by _removing_ layers of indirection"

I'm not too worried about companies selling tools instead of computers, just
like I'm not upset that I didn't get a metallurgy plant when I bought a
hammer. Yes, the philosophy of computation is a different thing (as Dr. Kay
often points out), but sometimes you just need a tool instead of poetry. (nb.
I'm not making a statement about whether or not everyone should understand
computers)

Maybe if you think about a driver as a unit which performs a computation, the
article will make more sense. I like to call drivers "protocol translators",
but translating a protocol from one representation to another is really just a
computation. The idea is to slowly liberate those units of computation from
the clutches of the "centers", and then improve them, while still keeping the
world functioning.

------
belovedeagle
I truly don't understand how someone could write this and not see the answer.
I've long been bothered by this problem, but here the author manages to
completely miss it: The very things we're trying to get out of hypervisors
today are _identical_, or nearly so, to the things that we wanted to get out
of operating systems. That is, what we call a "hypervisor" today is really
just an OS.

The author of this paper had all of the facts necessary to come to this
realization:

"The early time-sharing systems isolated users from other users. [. . .] The
time-sharing system also isolates the system and hardware components from the
unprivileged user."

"The hypervisor provides the necessary isolation and controls guest resource
use. Since the hypervisor exposes only a simple hardware-like interface to the
guest, it is much easier to reason about what can and should happen than it is
to do so with containers."

Those two paragraphs are just the same things written in different ways:
isolation of users from each other, hardware abstraction, and resource-
sharing[1]. Those are the fundamental tasks of an operating system. If we call
the operating system a "hypervisor"[2] and the user an "application", it makes
no difference.

Hypervisors have a place in allowing non-communicating users to run
(different) _existing_ kernels (and thus operating systems) on the same
hardware. Great for VPS's. However, using them for multiple potentially
cooperating (but not necessarily mutually-trusting[4]) applications is missing
the point. Operating systems have all of the necessary tools (hardware hooks)
to properly isolate different users ("applications"); if they currently don't
do a proper job of it then that's just a sign we need redesigned operating
systems, not an additional layer which has real hardware costs![3]

The author of this paper laments that "when you virtualize, it is more
difficult to optimize resource usage, since applications do not know how to
play along in the grand ecosystem", but this is one problem which operating
systems have been solving fairly well for decades—or at least a lot of effort
has been put into solving these hard problems, and there's no sense in
splitting that effort into hypervisor solutions.

[1] which is really just the intersection of the other two; each provides an
abstraction to the user that it is the only consumer of the hardware
resources—equivalently, an abstraction to the application that its virtual
machine is the only consumer of the hardware resources.

[2] Yes, hypervisors do some different tasks than an OS, but I think that's
largely for two reasons, which are accidental as opposed to fundamental: 1) In
order to make existing operating systems work without (much) modification, the
hypervisor has to do hardware abstraction in a less abstract way, so to speak,
than the OS can offer to processes. 2) At the CPU level, designers of the
virtualization interface had both hindsight and freedom to improve the
interfaces for hypervisor–OS interaction as compared to OS–application
interaction.

[3] Examples of hardware costs are more memory usage by separate kernels and
additional context switch cost between kernels. With regards to doing a better
job of it (and thus removing the current need for hypervisors for this task) I
think moving away from POSIX/UNIX could help; the author has many points
raised against operating systems that are really only valid criticism of *NIX.

[4] Any claim that hypervisors offer greater isolation between
users/applications is either 1) a small result of the improvements in
technology discussed in [2], or 2) a misunderstanding of the situation, where
less (visible) work has been put into compromising hypervisors than into
OSes, but we have seen hypervisor exploits nonetheless.

