
Running an eBPF program may require lifting the kernel lockdown - jgehrcke
https://gehrcke.de/2019/09/running-an-ebpf-program-may-require-lifting-the-kernel-lockdown/
======
zokier
One of the issues that came to mind is how limited and overloaded the Linux
error codes are, and EPERM might be one of the worst offenders. While
obviously not expecting Linux to change that due to compatibility concerns,
it'd be wonderful if we had a unique identifier for any code path (or nearly
so) that causes an error to be returned. Of course it'd be useful to have some
grouping, so that applications that do not care do not need to know every
possible error code, but for investigations it'd be useful to know exactly why
something failed. For EPERM especially (but also others) there might be some
security concerns about leaking more information about why the request was
denied, so in some cases some discretion is needed. But I'd be surprised if
that is frequently a true concern.

I do wonder how much stuff would break if you patched Linux to return error
numbers with high bits set to autogenerated values. Obviously libc (and Go if
used) would need patching, but how many places call syscalls directly _and_
check error codes carefully? Probably a surprisingly long tail, but it could
still be a fun experiment.
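
To make the "high bits" idea concrete, here is a minimal user-space sketch of
how a classic errno could share one integer with a per-code-path identifier.
The names `pack_error`, `unpack_error` and `site_id` are hypothetical and not
part of any kernel or libc interface; the 12-bit split assumes only that Linux
errno values stay below 4096 (MAX_ERRNO is 4095).

```python
import errno
from typing import Tuple

ERRNO_BITS = 12                      # Linux errno values fit below 4096
ERRNO_MASK = (1 << ERRNO_BITS) - 1   # low bits carry the classic errno


def pack_error(base_errno: int, site_id: int) -> int:
    """Combine a classic errno with a hypothetical code-path identifier."""
    if not 0 < base_errno <= ERRNO_MASK:
        raise ValueError("base errno out of range")
    if site_id < 0:
        raise ValueError("site id must be non-negative")
    return (site_id << ERRNO_BITS) | base_errno


def unpack_error(value: int) -> Tuple[int, int]:
    """Split a packed value back into (classic errno, site id)."""
    return value & ERRNO_MASK, value >> ERRNO_BITS


# An application that does not care masks off the high bits and sees plain
# EPERM; an investigator looks at the site id as well.
packed = pack_error(errno.EPERM, 0x2A)
base, site = unpack_error(packed)
```

Callers that compare the raw value against EPERM without masking are exactly
the ones that would break in such an experiment.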

~~~
quotemstr
If error codes were instead some kind of exception packaged with ancillary
data, you could easily stick a kernel stack in the event payload and get the
localization you're discussing that way. The ability to add context is what
makes error codes lose so badly to exceptions.

~~~
cyphar
Generating a kernel stacktrace for every syscall error return seems like it
would be needlessly wasteful on the return path (not to mention that it would
change between kernel releases, and wouldn't be useful to most developers).

A richer error system could be as simple as giving some more information about
why a syscall returned -EINVAL (because checking all possible flag bits to see
which one is not supported is really not a fun exercise).
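
The flag-probing exercise mentioned above looks roughly like the following
sketch. `call` is a hypothetical wrapper that performs the syscall with the
given flags and raises OSError on failure; it is an assumption for
illustration, not an existing API. The sketch tests each bit on its own, so it
cannot detect flags that are only invalid in combination.

```python
import errno
from typing import Callable, List


def find_rejected_flags(call: Callable[[int], None], flags: int) -> List[int]:
    """Return the individual bits of `flags` that `call` rejects with EINVAL."""
    rejected = []
    bit = 1
    while bit <= flags:
        if flags & bit:
            try:
                call(bit)            # retry with just this one flag set
            except OSError as exc:
                if exc.errno != errno.EINVAL:
                    raise            # a different failure; do not guess
                rejected.append(bit)
        bit <<= 1
    return rejected
```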

------
theamk
I was super surprised by the ability to lift kernel lockdown programmatically,
using the /proc/sysrq-trigger file. I think it completely defeats the entire
point of the lockdown - it is like a safe with a spare key duct-taped to the
side: annoying, but useless against any adversary.

The original kernel patch had a facility to disable programmatic lockdown
lifting, but this apparently did not make it into the kernel he is using.
Hopefully this was intentional, to make this less annoying for users during
the testing period.

~~~
londons_explore
The sysrq subsystem supports permissions - the distro maker could have
prevented it if they had wanted to.
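
For reference, the sysrq permissions are expressed as a bitmask in
/proc/sys/kernel/sysrq. Below is a small sketch that decodes it, using the bit
meanings from the kernel's sysrq documentation; the function names are made up
for illustration. One caveat: that documentation describes the mask as
governing keyboard-invoked SysRq, and says writes to /proc/sysrq-trigger by a
privileged user are always allowed, so whether the mask blocks the path
discussed here may depend on the kernel and its patches.

```python
SYSRQ_BITS = {
    0x2: "control of console logging level",
    0x4: "control of keyboard (SAK, unraw)",
    0x8: "debugging dumps of processes",
    0x10: "sync command",
    0x20: "remount read-only",
    0x40: "signalling of processes (term, kill, oom-kill)",
    0x80: "reboot/poweroff",
    0x100: "nicing of all RT tasks",
}


def describe_sysrq(value: int) -> list:
    """Translate a /proc/sys/kernel/sysrq value into enabled function groups."""
    if value == 0:
        return ["disabled"]
    if value == 1:
        return ["all functions enabled"]
    return [name for bit, name in SYSRQ_BITS.items() if value & bit]


def read_sysrq(path: str = "/proc/sys/kernel/sysrq") -> int:
    """Read the current mask (Linux only; the file holds a decimal number)."""
    with open(path) as fh:
        return int(fh.read().strip())
```

For example, `describe_sysrq(176)` lists the sync, remount read-only and
reboot/poweroff groups.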

------
btown
This is not the first time that security patches have caused eBPF to behave
oddly:
[https://blog.cloudflare.com/ebpf-cant-count/](https://blog.cloudflare.com/ebpf-cant-count/)
is an _amazing_ anecdote about how side-channel mitigations + a bug in the BPF
verifier caused arithmetic bugs to appear.

I hope that by the time things begin to hit mainline, Cloudflare's engineers
will chime in with ideas that allow (e)BPF to continue to run, as they seem to
use it widely internally.

