I wish he'd acknowledge and discuss prior, effective work in this space instead of saying things "showed up" and that they're "insane". For instance, a direct comparison to either seatbelt or seccomp-bpf would make it clear that the distinction between initialization vs. steady state is well-explored in production systems using those mechanisms (like sandboxed Chrome renderers) and not novel.
I agree it's not better than seccomp-bpf plus a library to supply these kinds of common policies in userland, with the caveat that I don't know of any such library in common use on Linux.
pcwalton's gaol, which has both seccomp-bpf and seatbelt backends, has a concept of "profiles," which I think matches the general idea here:
To be fair I wouldn't use it in production yet, but it's quite a bit more reviewed than tame(2) is right now.
When seccomp is actually being used to implement the sandbox semantics, the application usually needs to be designed around it. It's very difficult to apply it this way to an existing application. I just don't think there's a strong use case for coarse control when it's so far from being the real hard problem.
Seccomp is really hard to integrate into existing programs, especially if they are monolithic and don't have privilege separation already. You can't narrow down the syscall footprint in complex programs enough to get decent protection unless you already have something along the lines of privilege separation. Third party libraries cause headaches too because their syscall footprint might conflict (now or in the future!) with the main program's seccomp configuration.
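For reference, the moving parts look roughly like this with libseccomp (a sketch only; the allow-list here is far smaller than any real program needs, which is exactly the problem):

    #include <seccomp.h>   /* libseccomp */
    #include <stdlib.h>

    /* Sketch: allow only a handful of syscalls, kill the process on anything
     * else. A real program's footprint is much larger and shifts whenever
     * libc or a third-party library changes how it does I/O. */
    static void install_filter(void)
    {
        scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_KILL);  /* default action */
        if (ctx == NULL)
            abort();
        seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(read), 0);
        seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 0);
        seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(exit_group), 0);
        if (seccomp_load(ctx) != 0)
            abort();
        seccomp_release(ctx);
    }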
Capsicum looks interesting. The challenge with fancy security mechanisms is that many people contributing to the codebase may be unaware or care only about use-cases where fine-grained capabilities aren't needed.
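For comparison, a hedged sketch of the Capsicum pattern on FreeBSD (the path is illustrative): limit the rights on descriptors you already hold, then enter capability mode, after which global namespaces like the filesystem are off limits:

    #include <sys/capsicum.h>
    #include <fcntl.h>
    #include <err.h>

    int main(void)
    {
        int fd = open("/var/db/app/data", O_RDONLY);   /* illustrative path */
        if (fd < 0)
            err(1, "open");

        cap_rights_t rights;
        cap_rights_init(&rights, CAP_READ, CAP_SEEK);  /* this fd: read/seek only */
        if (cap_rights_limit(fd, &rights) < 0)
            err(1, "cap_rights_limit");

        if (cap_enter() < 0)                           /* enter capability mode */
            err(1, "cap_enter");

        /* From here on, open("/etc/passwd", ...) fails with ECAPMODE; only
         * the descriptors we already hold remain usable. */
        return 0;
    }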
At the end of the day, all these mechanisms can work well but it's best to use them from the start and make them one of the key concerns that all developers know about.
The main missing feature on Linux is the ability for programs to apply their own path-based MAC policy as seccomp can't be used to do string comparisons and couldn't realistically be extended to offer it. It's one of the main reasons that robust seccomp sandboxes require so much work.
Because, if so, that makes a whole lot of sense. (Adding security "for free" generally does).
This could conflict with on-the-fly upgrades, though. If it turns out that some later version of your program does in fact require <x>, then you'll have to kill and restart the process as opposed to upgrading on-the-fly. Perhaps not the end of the world, but worth noting.
This might not be useful in practice. The inspiration behind tame() is the observation that a program usually needs many more rights at initialisation-time than in its main-loop. Thus, tame()ing a program before it even starts is unlikely to be practical - if you give it enough permissions to successfully start (including whatever is needed for dynamic linking), you may not have reduced its capabilities much.
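For what it's worth, the intended pattern looks something like the sketch below (flag names and header are taken from the tame(2) proposal and may differ; the config path is made up): do the privileged setup first, then drop to a narrow set before the main loop.

    #include <sys/tame.h>   /* assumed header per the proposal: tame() and TAME_* */
    #include <stdio.h>

    int main(void)
    {
        /* Initialisation: dynamic linking is already done, but we still open
         * config files, resolve names, bind sockets, drop uid, and so on. */
        FILE *cfg = fopen("/etc/example.conf", "r");   /* illustrative path */
        /* ... parse config, set up listening sockets ... */
        if (cfg)
            fclose(cfg);

        /* Steady state: from here on, only stdio-style operations on
         * already-open descriptors are permitted. */
        if (tame(TAME_STDIO | TAME_MALLOC) == -1)
            return 1;

        for (;;) {
            /* main loop: any stray open(2)/connect(2)/execve(2) is now fatal */
            break;
        }
        return 0;
    }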
That's the best part about OpenBSD -- the APIs may not be binary or even source-code compatible between the releases, but the source code is usually as readable and as clear as it gets.
What I really like about a lot of the OpenBSD initiatives is that they don't overthink their solutions: they make them as simple as possible, but no simpler. Signify, which avoided the entire web-of-trust/PKI complication, is another example.
There's a tool like this for Android phones. It not only can turn privileges off for an application, but also offers the option to provide apps fake info for things they don't need. You can, for example, deny address book access; if the app tries to access the address book, it gets a fake empty one. You can deny camera access; the app gets some canned image. This allows you to run overreaching apps while keeping them from overreaching.
The sandboxing was flawed, though; it caused problems for some applications. For example, Gang Garrison 2, a game I have worked on, fails to update itself if it's installed under Program Files and not run as admin.
On an unrelated note, I've always had respect for how Theo de Raadt is both the project leader of a complete BSD system, yet also an active hacker. Contrast to Linus Torvalds, who's mostly a manager nowadays.
Not really different than hibernation.
EDIT: Here it is: CryoPID (https://github.com/maaziz/cryopid)
CryoPID allows you to capture the state of a running process in Linux and save it to a file. This file can then be used to resume the process later on, either after a reboot or even on another machine.
Modern and more sophisticated equivalents are CRIU and DMTCP.
> You need to deal with file handles / etc, but that can be done too.
That's actually the hard part. To get a real image of that process in time, you need to snapshot the full filesystem state, too. Or it could change out from beneath your program. Even more complicated: network state.
I know this kind of stuff is being worked on so VMs/containers/namespaces can be moved around but it seems to be one of those things that gets really complicated when you try to do it transparently for userspace.
How about if it's listening on a TCP port -- what happens if that port is in use by another process when the original one is thawed?
Could you please expand on your reasoning here? We're talking about restoring processes at arbitrary points in the future. That means we're not just talking about handles to files that were deliberately deleted while the process was running, but also anything that the process had open that was frozen that may have been subsequently deleted. That would seem to include any log file that gets rotated, which is not exactly rare, plus a ton more things.
I also think that treating network sockets as if they were disconnected is likely to go better than treating files that way - existing programs probably make more assumptions about disk state not changing unexpectedly than about network state not changing unexpectedly (even if both are technically not well founded).
I can't remember exactly where in the podcast they discussed it, but I believe it was just before the part where you could hear brains exploding in the background.
(Case in point: you can have a system hibernate, have a supposedly locked file change, and have the system resume.)
Privileges (seems to fit the post):
Programming with Privileges Example:
In particular, the Solaris privileges model allows a program to gracefully degrade functionality and drop and reinstate privileges at different points of execution.
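A hedged sketch of that pattern with the Solaris priv_set(3C) interface (the specific privilege here, net_privaddr, is just an example):

    #include <priv.h>
    #include <stdio.h>

    int main(void)
    {
        /* Drop the privilege from the effective set for normal operation.
         * It stays in the permitted set, so it can be reinstated later. */
        if (priv_set(PRIV_OFF, PRIV_EFFECTIVE, PRIV_NET_PRIVADDR, NULL) != 0)
            perror("priv_set off");

        /* ... ordinary, unprivileged work ... */

        /* Reinstate it only around the one call that needs it. */
        if (priv_set(PRIV_ON, PRIV_EFFECTIVE, PRIV_NET_PRIVADDR, NULL) != 0)
            perror("priv_set on");
        /* bind(2) to a port below 1024 here */
        priv_set(PRIV_OFF, PRIV_EFFECTIVE, PRIV_NET_PRIVADDR, NULL);

        return 0;
    }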
For example, if your process has TAME_GETPW, opening /var/run/ypbind.lock enables TAME_INET. The reasoning behind this makes sense, but it now means that yp always has to open that file before it can do its thing. The behaviour of yp always opening that file before accessing the network is now required by the kernel.
The saving grace is that OpenBSD (and the other BSDs) are developed as a unified system, so if yp ever changes to no longer use that file, that change will only come as part of a version upgrade that includes the kernel, etc.
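To make the ordering concrete, here is a hedged userspace sketch of what the kernel now expects of YP-using code (header and flags assumed per the tame(2) proposal):

    #include <sys/tame.h>   /* assumed header per the proposal */
    #include <fcntl.h>
    #include <unistd.h>

    void getpw_over_yp_sketch(void)
    {
        tame(TAME_STDIO | TAME_GETPW);

        /* This open is the trigger the kernel looks for: without it, the
         * socket/connect the YP library does next would be fatal. */
        int lock = open("/var/run/ypbind.lock", O_RDONLY);

        /* ... the YP lookup may now talk to ypbind over the network ... */

        if (lock != -1)
            close(lock);
    }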
Chrome uses this for its renderer-process sandbox.
 - https://farm9.staticflickr.com/8669/16418068728_b8dd8aa200_c...
If not, does anyone want to join forces to create one? An ultra-simple library that provides tame()-like functionality on all capable platforms should make writing secure software a lot easier.
 https://stackoverflow.com/questions/31373203/drop-privileges... if anyone's curious
This page has some details:
pr->ps_tame = parent->ps_tame
$ time cvs -d firstname.lastname@example.org:/cvs export -rHEAD src/sys/kern/kern_fork.c
0m2.73s real 0m0.06s user 0m0.04s system
1. it works
2. they're used to it
3. there isn't enough reason to change
4. lots of infrastructure would need to be rebuilt if they changed
> Once set, the flag is inherited by future children processes, and may not be cleared.
And it mentions that specific new APIs should be used to manage processes through capabilities.
+ u_int ps_tame;
Makes sense now.
The stuff I'm thinking of that would plug into Firefox or Photoshop probably does things that would already be allowed (read/write files, allocate memory, access the network).
Either way, this seems like an extremely simple way to lock down all the little command-line utilities and small programs that make up a working Unix system, so that if someone does get arbitrary command execution through one of them, it becomes much more difficult to chain exploits.
Of course, if you download an infected plugin, you're going to have a bad time. But that is likewise a problem if you download an infected program and run it. It's not an attack vector this syscall is meant to prevent.
On the other hand, a well designed plugin interface could set default permissions. For example the plugin interface could have a SQL method, so that a plugin does not need to talk to a socket directly.
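As a hedged illustration (all names here are made up), the host could expose a narrow callback table so a plugin never touches sockets or files itself:

    /* Hypothetical plugin API: the host owns the database connection and the
     * network; a plugin only ever sees these callbacks, so tame()/seccomp
     * restrictions in the host apply to plugin code for free. */
    struct host_api {
        /* Run a query through the host's connection; no sockets in the plugin. */
        int  (*sql_query)(const char *query,
                          void (*on_row)(const char **cols, int ncols));
        /* Structured logging instead of open()/write() on arbitrary paths. */
        void (*log)(const char *message);
    };

    /* A plugin exports one entry point and works only through the table. */
    int plugin_main(const struct host_api *host)
    {
        host->log("plugin started");
        return host->sql_query("SELECT 1", 0);
    }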
Protocol-based interactions (which require a clear API) do a better job of isolation than modules.
In a certain way, Linux's requirement that drivers run in kernel space, and the adaptations made so that drivers can access resources, have made parts of the kernel's internal API kludgy. On the other hand, it is true that using modules without a stable internal API lets Linux present an external API that does not change while the kernel's behaviour and internals can still be modified.
(I guess there is a price to pay for everything, but not everybody is a genius like Torvalds)
I guess a root process could remain privileged, the main restricted process could be a child of that, and that main process could ask the root process to spawn plugins. But, that'd weaken the model a bit.
But assuming nginx supported tame, you would at least know explicitly what the process can and cannot do. If a zero-day attack were discovered in nginx one day, an nginx running under tame would have a smaller security impact, at least in theory.
If anything, the place to implement it statically could be either a virtual machine jit, or at the compilation stage.
Very interested to see how this works out.
How screwed am I?
Especially the development tools: the Morris worm enabled portability by distributing itself in source code form then building its binary on its target hosts.
My sister once read a novel about some very traditional, strictly religious people who fastened their shirts with string ties as they felt buttons were hooks that the Devil could use to grab hold of you.
I feel much the same way about files. I don't know what tomorrow's zero-day will look like, but the chances are quite good that it will depend on a file that is installed by default. Cliff Stoll wrote in "The Cuckoo's Egg" of a subtle bug in a subprogram used by GNU emacs for email. Had the Lawrence Berkeley Laboratory used vi rather than emacs, they would not have been vulnerable. ;-D
Yes, it is a step in the right direction not to run daemons or Windows services you don't need, but it's even better to remove them.
In 1990 I wrote an A/UX 2.0 remote root exploit to drive home my objection to one single file having incorrect permissions. Its source was about a dozen lines. That particular file was required, but our default installs have many files we don't really need.
Also, if you can read (not just execute) the binary of any program or library, then your malware can load it into its memory and execute it. We have no way of knowing who is going to do that tomorrow, but we do know there are many binaries we do not really need.
If you develop code for your server, install the same distro in a VM on your desktop box and compile it there.
Taken to its logical conclusion, you sort of end up with a unikernel system, like Mirage OS: only the code necessary for the execution of the service is compiled into the kernel. These systems don't even have a shell.
While it is helpful that hardware memory management will protect against erroneous and malicious code, even better is for the code to be correct and ethical.
This is because the MMU hardware takes up power, costs money, generates heat, and uses real estate. Also, the software is complex and uses a lot of memory for page tables and complex allocation schemes.
The Oxford Semiconductor 911, 912, and 922 didn't even have a kernel, nor did they have dynamic memory allocation; just stack and static memory, with an infinite loop operating a state machine. A huge PITA to debug, but the memory and flash were quite cheap because there wasn't very much of either.
maybe this is better:
what that means is that you can run the program but you cannot read it as a regular file.
To delete or create a file you must have write permission to the directory it is or will be found in.
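For instance (a sketch; the path is illustrative), stripping the read bits while keeping execute gives mode 0111, and creating or deleting the file is then further governed by the write bit on the containing directory:

    #include <sys/stat.h>
    #include <stdio.h>

    /* Sketch: make a binary execute-only. exec(2) still works, but open(2)
     * for reading fails with EACCES, so other code can't simply read the
     * file and run it from memory. (Root can still read it, and interpreted
     * scripts can't use this trick since the interpreter must read them.) */
    int main(void)
    {
        const char *path = "/usr/local/bin/example";   /* illustrative */
        if (chmod(path, S_IXUSR | S_IXGRP | S_IXOTH) != 0) {
            perror("chmod");
            return 1;
        }
        return 0;
    }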
Yes it's a PITA to take away your own permissions but your server is not the box you take with you when you hang out at Starbucks.