It is inherently buggy in numerous ways. It hardcodes the number of arguments a syscall has incorrectly. It screws up compat handling. It doesn't robustly match entries to returns. It has an utterly broken approach to handling x32 syscalls. It has terrifying code that does bizarre things involving path names (!). It doesn't handle containerization sensibly at all. I wouldn't be at all surprised if it contains major root holes. And last, but certainly not least, it's eminently clear that no one stress tests it.
If you really want to use it for production, invest the effort to fix it, please. (And cc me.) Otherwise do yourself a favor and stay away from it. Use the syscall tracing infrastructure instead.
eBPF is great, and we plan to support it as our collection mechanism, but not today and maybe not soon. When we wrote go-audit, there were few (if any?) distros that had eBPF kernel support. Now Ubuntu 16.04 has support, as do many others, and that is great!
Releasing a tool that will work for most people today is very different to releasing one that will work for most people in a couple of years. We are sharing this imperfect tool that uses an imperfect syscall monitoring mechanism, both of which were written by imperfect humans.
We have found plenty of problems with auditd during development (some real facepalm stuff, to be sure), but it also does many useful things. We have worked around limitations and very definitely stress tested it in our environment. A lot. A lot lot.
If you don't trust auditd, that's a totally valid opinion to hold. I welcome criticism of our approach, but a PSA that doesn't involve having tested our tool is rather unfair.
I know basically nothing about auditd. It's the kernel code I don't trust. go-audit may well be fantastic, but treating the kernel part as a reliable black box seems unwise to me.
Edit: you might not need eBPF to get something better. Plain old "perf script" and the underlying ringbuffer API should work decently well on older kernels. There's a performance hit, but Steven Rostedt has a fix, and it should get backported to RHEL at least.
> support. Now Ubuntu 16.04 has support, as do many others, and that is great!
This! I'm rather tired of people preaching about eBPF when in reality many of us have to run long-term-supported kernels/distros like RedHat or Ubuntu LTS.
What is said is "If Slack really want to use syscall auditing for production, invest the effort to fix it, please. (And cc me.)", which seems totally reasonable to me.
Slack's answer is "We have worked around limitations and very definitely stress tested it in our environment. A lot. A lot lot." As I understand it, this means "No, Slack will not invest any effort to fix Linux syscall auditing in the upstream kernel, because we have already worked around limitations." Which is kind of sad, and expected.
Why not try to fix those limitations and push the fixes upstream?
I was reading about eBPF and how its instruction set is easily JITable across several architectures. I was wondering if it would be good as a VM for retro games. Something like Pico-8, but where one could write games in this restricted C or eBPF-ASM.
As I understand it, the VM calls specific functions within the limits of its calling convention (10 64-bit registers shared between arguments, return values, etc.). The VM can also expose helper functions. It seems quite capable for such a purpose.
Where could I find more materials to explore this side of eBPF? I found a user space eBPF VM, which is probably a good start.
EDIT: In case anyone is wondering: Why? For fun!
Also, from memory, there aren't real stack frames right now. (This is a limitation of the implementation and the verifier, not a fundamental issue.)
> It doesn't robustly match entries to returns
Do you mean it fails to match the call/return (annoying), or that it may mismatch them? (pretty bad...)
> It doesn't handle containerization sensibly at all.
Do you mean it's just oblivious to namespaces (but works as far as I know) or something actually not working?
> Do you mean it fails to match the call/return (annoying), or that it may mismatch them? (pretty bad...)
__audit_syscall_entry() and __audit_syscall_exit() have some interesting checks in them, and I was never convinced that every entry would get paired with the corresponding exit and logged correctly.
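To make the concern concrete, here is a toy user-space sketch of what "pairing" means and which anomalies a checker could flag. It is not the kernel's logic; the `Event` record and the anomaly labels are made up for illustration, and it assumes each thread has at most one syscall in flight.

```python
from collections import namedtuple

# Hypothetical event record: kind is "entry" or "exit".
Event = namedtuple("Event", "kind tid syscall")

def pair_events(events):
    """Pair each exit with the pending entry on the same thread.

    Returns (pairs, anomalies) so unmatched or mismatched events are visible
    instead of being silently dropped.
    """
    pending = {}  # tid -> entry still waiting for its exit
    pairs, anomalies = [], []
    for ev in events:
        if ev.kind == "entry":
            if ev.tid in pending:
                # A second entry arrived before the previous one exited.
                anomalies.append(("entry_without_exit", pending[ev.tid]))
            pending[ev.tid] = ev
        else:
            entry = pending.pop(ev.tid, None)
            if entry is None:
                anomalies.append(("exit_without_entry", ev))
            elif entry.syscall != ev.syscall:
                # The "pretty bad" case: an exit attributed to the wrong entry.
                anomalies.append(("mismatched_pair", entry, ev))
            else:
                pairs.append((entry, ev))
    # Entries that never saw an exit by the end of the stream.
    anomalies.extend(("entry_without_exit", e) for e in pending.values())
    return pairs, anomalies
```

The "annoying" case shows up as `entry_without_exit` or `exit_without_entry`; the "pretty bad" case shows up as `mismatched_pair`.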
>> It doesn't handle containerization sensibly at all.
> Do you mean it's just oblivious to namespaces (but works as far as I know) or something actually not working?
As far as I know, there is one global audit daemon and audit log, and you have to be globally privileged to use it.
Anyways this invites the question, are you allowing your production servers to make outbound internet connections? Generally, I would proxy outbound connections and/or use internal mirrors and repos for the installation of software.
Sounds like overkill, but pretty cool I must say.
I don't fully get the argument for why on-host filtering is undesirable. Of course naively filtering for curl-originated connections isn't a solid detection scheme for rootkit-installs! That's just a naive filter, which a naive user could mis-use in a centralized way or in a distributed way.
As for event correlation (#2 of the pros), it can be done on-host too. And back-testing (#3) of new rules is indeed a highly valuable feature! But you certainly don't have to log everything centrally to get that capability. E.g. in the case of Falco, you can capture trace files and re-run any number of rules/filters on them.
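As a rough illustration of back-testing against a captured trace, the sketch below replays recorded events through a set of rules. This is not Falco's rule language or file format; the event dictionaries, field names, and rule names are all hypothetical.

```python
def backtest(rules, captured_events):
    """Replay a recorded trace and return {rule_name: [matching events]}."""
    hits = {name: [] for name in rules}
    for event in captured_events:
        for name, predicate in rules.items():
            if predicate(event):
                hits[name].append(event)
    return hits

# Hypothetical captured trace; a real one would be loaded from a file.
trace = [
    {"syscall": "execve", "exe": "/usr/bin/curl", "uid": 0},
    {"syscall": "connect", "exe": "/usr/bin/curl", "uid": 0},
    {"syscall": "execve", "exe": "/bin/ls", "uid": 1000},
]

# Hypothetical rules, each a predicate over one event.
rules = {
    "curl_exec": lambda e: e["syscall"] == "execve" and e["exe"].endswith("/curl"),
    "root_connect": lambda e: e["syscall"] == "connect" and e["uid"] == 0,
}

hits = backtest(rules, trace)
```

A new rule can be added to `rules` and run over the same stored trace, whether that trace lives on the host or in a central store.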
I do agree with the point on rules being exposed to an attacker.
[Disclaimer: author of the initial version of Sysdig Falco]
But not all syscalls need to be logged. You don't need to audit all read()/write() calls for example. open/connect/exec/setid should give you plenty of information already. I know at least Fedora had audit included, but inactive, and I haven't heard of any terrible performance degradation there.
(it looks like the penalty was ~40ns in 2014 http://permalink.gmane.org/gmane.linux.kernel/1639528)
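A minimal sketch of restricting auditing to a handful of syscalls by generating `auditctl`-style rule lines. The syscall list is only my reading of "open/connect/exec/setid", and the key name `watchlist` and the helper `audit_rule` are made up.

```python
def audit_rule(syscalls, key, arch="b64"):
    """Build one auditctl-style rule line that logs the given syscalls on exit."""
    parts = ["-a", "always,exit", "-F", f"arch={arch}"]
    for name in syscalls:
        parts += ["-S", name]
    parts += ["-k", key]
    return " ".join(parts)

# Assumed mapping of "open/connect/exec/setid" to concrete syscall names.
WATCHED = ["open", "openat", "connect", "execve", "setuid", "setgid"]

rule = audit_rule(WATCHED, key="watchlist")
```

High-volume calls such as read() and write() are simply never listed, so they generate no audit records.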
Are you guys hiring?
It's not for the faint of heart - the volume of events to filter through requires some serious infrastructure - but it's an important component of a mature secops program.
I started tooling up a basic version of this, but I need to change the PTRACE flags and switch from reading /proc/[PID]/mem to `process_vm_readv(2)`.
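For reference, a minimal sketch of calling `process_vm_readv(2)` through `ctypes`, assuming Linux with glibc. The helper name `read_process_memory` is made up, and it reads a single contiguous range only.

```python
import ctypes
import os

class IOVec(ctypes.Structure):
    # Mirrors struct iovec from <sys/uio.h>.
    _fields_ = [("iov_base", ctypes.c_void_p), ("iov_len", ctypes.c_size_t)]

def read_process_memory(pid, address, length):
    """Read `length` bytes starting at `address` in process `pid` (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    fn = libc.process_vm_readv
    fn.restype = ctypes.c_ssize_t
    fn.argtypes = [ctypes.c_int,
                   ctypes.POINTER(IOVec), ctypes.c_ulong,
                   ctypes.POINTER(IOVec), ctypes.c_ulong,
                   ctypes.c_ulong]
    buf = ctypes.create_string_buffer(length)
    local = IOVec(ctypes.addressof(buf), length)   # where the bytes land
    remote = IOVec(address, length)                # where they come from
    n = fn(pid, ctypes.byref(local), 1, ctypes.byref(remote), 1, 0)
    if n < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return buf.raw[:n]  # may be shorter than requested on a partial read
```

Reading your own process (`os.getpid()`) is a quick sanity check. Reading another process is subject to the same kind of permission check as attaching with ptrace, so expect `EPERM` without the right privileges.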