Now the kernel itself has made the mistake, proving that the whole idea was not practical: it breaks too easily, and it breaks even when used in the most restricted ways, or when the whole (non-kernel-provided) userspace has been designed for it (which was not a practical condition to begin with).
IMO seccomp should be phased out entirely and eventually replaced by something else. Trying to "fix" it will lead nowhere: it has been broken by design from the start.
Then you would only have to update the library to work with new kernel versions.
But there isn't really anything that Linux can replace it with. pledge works because OpenBSD controls both libc and the kernel.
You can sometimes work around this, though. Unless you actually need fork's copy of the current memory (i.e. as long as you only care about spawning new processes), you can fork a "spawner" process early that is just a thin proxy for pipe->exec commands. You apply seccomp after the spawner is ready and you're all good.
Seccomp, like so many Linux interfaces, is the “fuck it, here’s an exhaustive yet half-baked set of tools, you can do anything!” approach. This barely works out in general-purpose programming, but is always an unmitigated disaster in anything security related.
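A minimal sketch of that spawner pattern, in Python for brevity. The names `start_spawner` and `spawn`, the JSON-lines framing over the pipes, and the point where the seccomp filter would go are all assumptions made for illustration; a real version would also have to deal with errors, the environment and file descriptor passing.

```python
import json
import os
import subprocess


def start_spawner():
    """Fork the helper early, before any seccomp filter is installed."""
    req_r, req_w = os.pipe()  # parent -> spawner: one JSON argv per line
    res_r, res_w = os.pipe()  # spawner -> parent: one JSON result per line
    pid = os.fork()
    if pid == 0:
        # Child: the unsandboxed spawner. It only reads argv lists and runs them.
        status = 1
        try:
            os.close(req_w)
            os.close(res_r)
            with os.fdopen(req_r, "r") as requests, os.fdopen(res_w, "w") as results:
                for line in requests:  # loop ends when the parent closes its end
                    argv = json.loads(line)
                    done = subprocess.run(argv, capture_output=True, text=True)
                    reply = {"rc": done.returncode, "out": done.stdout}
                    results.write(json.dumps(reply) + "\n")
                    results.flush()
            status = 0
        finally:
            os._exit(status)  # never fall back into the parent's code
    os.close(req_r)
    os.close(res_w)
    return pid, os.fdopen(req_w, "w"), os.fdopen(res_r, "r")


def spawn(to_spawner, from_spawner, argv):
    """Ask the spawner to run argv and wait for its reply."""
    to_spawner.write(json.dumps(argv) + "\n")
    to_spawner.flush()
    return json.loads(from_spawner.readline())


if __name__ == "__main__":
    pid, tx, rx = start_spawner()
    # Assumption: this is where the main process would install its seccomp
    # filter (not shown). It no longer needs fork/exec itself from here on.
    print(spawn(tx, rx, ["echo", "hello"]))
    tx.close()
    os.waitpid(pid, 0)
```

The trade-off is that the spawner itself stays unsandboxed, so it should accept as little as possible over the pipe.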
Pledge can be inherited by child processes too.
By comparison I've never gotten a complaint about pledge support (or the FreeBSD equivalent), which we also support.
Most aren't going to do that, though. So, their boxes are still nice targets.
End users only see your code broken by SELinux, and assume you haven't done your job. On the other hand, SELinux is codifying rules about UNIX that never existed and that amount to emotional heuristics about the risk of Internet domain sockets being inherited around the system by the wrong process. The effect isn't to prevent Internet domain sockets from being inherited by privileged processes, but to break all sockets. That's garbage design, hidden behind marketing that suggests that, because the NSA contributed some code, the problem couldn't possibly be SELinux.
Most people first encounter it when it breaks their program in opaque and surprising ways. They then dig in... to find that the solution is non-obvious. So they turn it off and remember it as that thing that breaks stuff.
Now, depending on which libraries are in use, finding out which syscalls are actually required often has to be determined with strace or trial and error through all code paths. And with another version of the library, this set of system calls could even change without any changes to API or ABI, leaving the responsibility to track this to the application programmer.
So maybe there are some contexts where it kind of works (until it doesn't, like here), but another model that works in more cases would be more useful...
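One way to make the strace step a bit less manual is to collect the syscall names from a saved trace. The sketch below assumes the usual `name(args) = result` line shape, optionally prefixed by a PID as `strace -f` does, and does not handle timestamp prefixes; the helper name `syscalls_seen` and the sample lines are made up for illustration. It only sees code paths that were actually exercised, which is exactly the limitation described above.

```python
import re

# Optional "[pid N] " or "N " prefix, then the syscall name and an opening paren.
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")


def syscalls_seen(strace_lines):
    """Return the set of syscall names that start a line of strace output."""
    names = set()
    for line in strace_lines:
        match = SYSCALL_RE.match(line)
        if match:  # signal and exit lines ("--- ... ---", "+++ ... +++") are skipped
            names.add(match.group(1))
    return names


# Made-up sample input, shaped like typical strace output.
sample = [
    'openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 3',
    '[pid  4242] read(3, "127.0.0.1 localhost\\n", 4096) = 20',
    '4242 write(1, "ok\\n", 3) = 3',
    "--- SIGCHLD {si_signo=SIGCHLD, ...} ---",
    "+++ exited with 0 +++",
]
print(sorted(syscalls_seen(sample)))
```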
You could avoid some of that complexity by building something more opinionated, but it turns out that 99.9% of users do something that only 0.1% of users do. The odds that you break enough to generate the same negative sentiment SELinux has, but without the tools to dig those users out, are quite high.
Maybe a middle ground could be achieved, but honestly I prefer simple proven approaches to grand overcomplicated designs. And it has to be in the kernel (or at least shipped and installed by it) and not just one more userspace layer on top of seccomp or the like, because otherwise I can't update the kernel without the risk of breaking everything, which in the traditional GNU/Linux distro world is an important workflow.
Really, the Linux kernel being so decoupled from userspace is one of the points that made it so successful. A technology that drops or reduces that characteristic so much is, by definition, not going to be used everywhere... which is a big lost opportunity compared to practical solutions.
First, Android is a Linux system, and underneath the app framework is really a pretty normal one. So I'm not sure what distinction you're drawing here.
Second, the grouping of related syscalls is something SELinux already does: open, for example, doesn't care whether you call open() or openat(). That grouping lives in the kernel as well. So that would appear to do what you wanted?
Third, pledge does not do what you described. Although it restricts access to syscalls, the more important thing is how it restricts the behavior of those syscalls. That policy is something you could write in SELinux, but not in seccomp. And of course, pledge isn't capable of doing the useful things SELinux and seccomp can, like forbidding certain ioctls but allowing others.
Fourth, SELinux was not bitten by the vDSO issue.
Fifth, both pledge and SELinux get hit by the third-party library issue in exactly the same way: you have a policy which was sufficient under the old version of the library, but something has changed and now it needs something else. The only difference is that with SELinux the person who might know that the third-party library had changed is also responsible for the policy, whereas with pledge you don't have that visibility.
Finally, I don't know what you mean by your last paragraph at all. The kernel has a contract with userspace and kernel developers broke it; is that what you mean by decoupling here? I'm not sure it was a good thing, and the kernel folks seem to agree...
Also, I was never really talking about SELinux because I don't know much about it -- the article was initially talking about seccomp and so was I. If SELinux groups syscalls, then that's good and way better than seccomp on that topic, in my opinion. BUT: I'm not a fan of having "security policies" potentially separated from applications and potentially written by yet another party, at least not on that level. In most cases that makes no sense IMO: only the application (1) authors know, or are best placed to specify, what is needed; and if further restrictions are wanted, they ought to be tunable by the end user with a nice GUI and a few on/off controls.
About libraries evolving under your feet: if the libraries are not insane, and the groups of syscalls (or "subsyscalls" if needed, like the ioctl case) are good enough, then you should avoid virtually all problems.
At least, for sure, you will avoid trivial problems like the vDSO one.
EDIT (1): "application" in the GNU/Linux sense. Android applications are completely different beasts.
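To make the grouping argument concrete, here is a toy sketch of a policy written against groups rather than individual syscalls. The group names and their contents are hypothetical, only loosely in the spirit of pledge promises, and not taken from any real kernel or library; `is_allowed` is likewise just an illustrative helper.

```python
# Hypothetical groups: the application names a capability, and whoever ships
# the grouping (ideally the kernel side) keeps the syscall lists current.
SYSCALL_GROUPS = {
    "read_files": {"open", "openat", "read", "pread64", "fstat", "close"},
    "clock": {"clock_gettime", "clock_gettime64", "gettimeofday", "time"},
}


def is_allowed(granted_groups, syscall):
    """True if any granted group contains the syscall."""
    return any(syscall in SYSCALL_GROUPS[group] for group in granted_groups)


policy = {"read_files", "clock"}
# A library switching from open() to openat(), or a clock read that ends up
# as a real syscall instead of being served from the vDSO, stays inside the policy.
print(is_allowed(policy, "openat"))
print(is_allowed(policy, "clock_gettime"))
print(is_allowed(policy, "socket"))
```

The application-side policy stays two words long; what changes between kernel or library versions is only the contents of the groups.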