
Domesticating applications, OpenBSD style - fcambus
http://lwn.net/SubscriberLink/651700/624e2249bacc2f67/
======
vezzy-fnord
The spender character is an example of why I stay away from LWN comments. Too
many narrow-minded, Linux-centrist commentators who are willfully ignorant or
even have vendettas against anything outside their Linux bubble.

Otherwise, kudos to LWN for reporting this.

~~~
gillianseed
Wait, so when Theo says 'Some BPF-style approaches have showed up. So you need
to write a program to observe your program, to keep things secure? That is
insane.'

That is fine and dandy, but when someone criticizes Theo's work in the same
scornful manner, it's suddenly 'narrow-minded', 'willfully ignorant' and part
of a 'vendetta'.

Pot, meet kettle.

~~~
glass-
Because Theo is right: having to write programs to observe programs is insane.

Spender, on the other hand, is wrong for multiple reasons. First, /var is
mounted nosuid on OpenBSD by default, so their hypothetical "httpd chmod'ing
to setuid" doesn't work ("nix" points this out to them but isn't aware of
OpenBSD's defaults in this area). Second, they are decrying the whole system
because of a fairly weak attack scenario (it requires a local user account),
and that attack is already possible today. tame() doesn't protect against that
specific thing, but it does protect against others. Spender is just saying:
nope, it doesn't protect against this specific scenario, therefore it's
useless.

Also, if you keep reading spender's comments, they go on to claim OpenBSD's
devs are delusional plagiarists, so "vendetta" does seem appropriate.

~~~
_yy
Spender's point is that a capabilities-based approach does not really work for
real-world scenarios, and he probably has a point. The setuid thing was more
an example as I understood it.

While Spender's criticism isn't usually very diplomatic, he's often right.

Here's an example: this is QEMU's seccomp whitelist.
[http://git.qemu.org/?p=qemu.git;a=blob_plain;f=qemu-seccomp....](http://git.qemu.org/?p=qemu.git;a=blob_plain;f=qemu-seccomp.c;hb=HEAD)

As you can see, it's mostly worthless since QEMU has to do lots of
potentially-dangerous operations by design (even reading/writing to raw
devices). A proper mandatory access control framework actually restricts which
device files it can access, for example - as implemented in the
AppArmor/SELinux sVirt drivers. Theo's approach is similarly flawed.

~~~
vezzy-fnord
_capabilities-based approach_

tame() isn't a capabilities-based approach. Don't confuse POSIX capabilities
with actual capability-based security. The former hijacked an existing term to
refer to a different thing.

It's pretty obviously a limited API, but it also makes privilege dropping
absolutely trivial. Which is the point.

------
Pyxl101
Recently discussed on HN:

[https://news.ycombinator.com/item?id=9909429](https://news.ycombinator.com/item?id=9909429)

[https://marc.info/?l=openbsd-tech&m=143725996614627&w=2](https://marc.info/?l=openbsd-tech&m=143725996614627&w=2)

------
batou
I really like this solution. Far better than pissing around with SELinux
configuration and it's well contained.

~~~
bodyfour
I'm... conflicted.

On one hand, a _simple_ opt-in way for applications to give away the ability
to do operations is something I've wanted for sandboxing for a long time. From
an API perspective, tame() looks great.

However, the kernel-side implementation is way too ugly. Hardcoding pathnames
needed to use things like DNS, NIS, etc is just way too inflexible and
fragile.

I think the kernel should just support a set of fs/network/etc. restrictions
that can be grown but not shrunk by the process, i.e. basically what
seccomp-bpf gives you on Linux. What we really need is user-friendly libraries
to make using seccomp-bpf as easy as using tame()!
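
A rough sketch of that "can be grown but not shrunk" property, as a plain
Python model rather than real kernel code. The class name `RestrictionSet` and
the operation labels are made up for illustration; nothing here talks to
seccomp-bpf or tame().

```python
class RestrictionSet:
    """Toy model of process restrictions that only grow (hypothetical, not a kernel API)."""

    def __init__(self):
        self._denied = frozenset()

    def restrict(self, *ops):
        # Adding restrictions is always allowed: the denied set only gets bigger.
        self._denied = self._denied | frozenset(ops)

    def relax(self, *ops):
        # Lifting a restriction is never allowed once it is in place.
        raise PermissionError("restrictions cannot be shrunk")

    def allows(self, op):
        return op not in self._denied


sandbox = RestrictionSet()
sandbox.restrict("net.connect", "fs.write")  # made-up operation labels
print(sandbox.allows("fs.read"))
print(sandbox.allows("fs.write"))
```

The point of the one-way `restrict()` is that a compromised process cannot
talk its way back out of the sandbox it opted into.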

~~~
batou
I think it's quite simple really. Sure it's a bit ugly but most APIs are
behind the scenes. As long as the API itself is clean, that's not necessarily
a problem.

In this case though, the problem with growing the permission set is you then
need rules to define the sandbox "size" outside the process, then tools to
administer it, then the policies to be deployed with the program.

If you're using tame(), which is restrictive, then going by how they do
privsep in OpenBSD, you're forking tasks off from a control process and then
removing privileges. So the program defines the policy internally, very
granularly and correctly.

Doing that with seccomp-bpf and forked child processes would be vastly more
complex. Consider OpenSMTPd, for example, which consists of (I can't remember
the exact separation) a master process, smtp listener, queue, and writer.
These are all forked from the master process and talk to each other over
pipes. How would you mandate the call limitations efficiently if this was
externalised? Also, how would you ensure that a misapplied external policy
doesn't compromise your system?

~~~
bodyfour
The problem isn't that it's behind the scenes, it's what part of it is in
kernel space.

The user of tame() quite sensibly wants to specify a simple filter based on
what things they might need to do in the future. i.e. "I need to be able to
look up details about local users". At the system-call level this implies a
number of things should be allowed i.e. "can open /etc/group for read-only
access, can read NIS configuration files, ..."

In the OpenBSD implementation of tame() all of these rules are _hardcoded into
the kernel_. Using LDAP instead of NIS? Well, you'll have to patch and
recompile I guess.

Since ultimately the userland process is opting-in to the restrictions there's
no security reason that it can't specify the rules to the kernel. (Assuming
here that care is taken so it can't DoS the kernel by pushing down a billion-
entry list or something)

Again, what OpenBSD gets right here is the API -- it _should_ be as simple as
possible for a process to say "I need to look up users, use DNS, and read
files; block all else". Making that a simple call like tame() is a great step
forward. I hope the seccomp-bpf people do something similar.

However, tame() should be a library function, not a syscall. The kernel
shouldn't know or care what "I need to use DNS" means, it should just be told
what files I can open and what sockets ops I'm allowed to do.
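
To make the library-not-syscall idea concrete, here is a minimal sketch in
Python. Everything in it is an assumption for illustration: the table
`REQUEST_RULES`, the function `expand_requests`, the request names, and the
socket-op labels are invented, and the paths are just common Unix defaults. It
is not how OpenBSD's tame() works.

```python
# Hypothetical userland table: high-level request -> low-level rules.
# Paths are common Unix defaults, used purely as examples.
REQUEST_RULES = {
    "getpw": {
        "read_paths": {"/etc/passwd", "/etc/group"},
        "socket_ops": set(),
    },
    "dns": {
        "read_paths": {"/etc/resolv.conf", "/etc/hosts"},
        "socket_ops": {"udp_send", "udp_recv"},  # made-up labels
    },
}


def expand_requests(requests, table=REQUEST_RULES):
    """Flatten high-level requests into the file and socket rules a kernel would be handed."""
    read_paths, socket_ops = set(), set()
    for name in requests:
        rules = table[name]  # an unknown request name raises KeyError
        read_paths |= rules["read_paths"]
        socket_ops |= rules["socket_ops"]
    return {"read_paths": sorted(read_paths), "socket_ops": sorted(socket_ops)}


policy = expand_requests(["getpw", "dns"])
print(policy["read_paths"])
print(policy["socket_ops"])
```

In this model, supporting a different lookup backend is a matter of adding a
table entry in userland; the kernel side only ever sees the flattened lists.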

The OpenBSD implementation is simultaneously too rigid for some things (since
things like pluggable nsswitch.conf modules can't work at all) and too loose to
support other useful ones (per-application policies like "block any attempt to
open a file for writing unless it's an append to $HOME/.ssh/known_hosts").
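
As an illustration of that last kind of per-application policy, a small Python
sketch of the check itself. The function name `allow_open` and its signature
are hypothetical; it ignores symlinks and flags such as O_TRUNC, and it is
only a userland model of the rule, not an enforcement mechanism.

```python
import os


def allow_open(path, flags, home):
    """Hypothetical rule: reads pass, writes only as appends to known_hosts."""
    write_bits = os.O_WRONLY | os.O_RDWR
    if not flags & write_bits:
        return True  # read-only opens are not restricted by this rule
    known_hosts = os.path.normpath(os.path.join(home, ".ssh", "known_hosts"))
    is_append = bool(flags & os.O_APPEND)
    return is_append and os.path.normpath(path) == known_hosts


home = "/home/user"  # example value
print(allow_open(home + "/.ssh/known_hosts", os.O_WRONLY | os.O_APPEND, home))
print(allow_open(home + "/.ssh/known_hosts", os.O_WRONLY, home))
```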

In UNIX there is a very long tradition of _not_ baking policies like this into
the kernel, and for good reason.

~~~
angersock
_The kernel shouldn't know or care what "I need to use DNS" means, it should
just be told what files I can open and what sockets ops I'm allowed to do._

Is this perhaps an artifact of the OpenBSD development model, where programs
are _assumed_ to work tightly coupled to the kernel?

~~~
bodyfour
It is probably more acceptable to OpenBSD than it would be in Linux, but it
still doesn't feel like a good idea.

------
agumonkey
Logical time based "NX", interesting.

