
Syscall Auditing at Scale - knoxa2511
https://slack.engineering/syscall-auditing-at-scale-e6a3ca8ac1b8#.1lnibz30j
======
amluto
As the kind-of-sort-of maintainer of Linux's syscall infrastructure on x86, I
have a public service announcement: the syscall auditing infrastructure is
awful.

It is inherently buggy in numerous ways. It hardcodes the number of arguments
a syscall has incorrectly. It screws up compat handling. It doesn't robustly
match entries to returns. It has an utterly broken approach to handling x32
syscalls. It has terrifying code that does bizarre things involving path names
(!). It doesn't handle containerization sensibly at all. I wouldn't be at all
surprised if it contains major root holes. And last, but certainly not least,
it's eminently clear that no one stress tests it.

If you really want to use it for production, invest the effort to fix it,
please. (And cc me.) Otherwise do yourself a favor and stay away from it. Use
the syscall tracing infrastructure instead.

~~~
rhuber
Hi - Ryan from the blog post here.

eBPF is great, and we plan to support it as our collection mechanism, but not
today and maybe not soon. When we wrote go-audit, there were few, (if any?),
distros that had eBPF kernel support. Now Ubuntu 16.04 has support, as do many
others, and that is great!

But...

Releasing a tool that will work for most people today is very different to
releasing one that will work for most people in a couple of years. We are
sharing this imperfect tool that uses an imperfect syscall monitoring
mechanism, both of which were written by imperfect humans.

We have found plenty of problems with auditd during development (some real
facepalm stuff, to be sure), but it also does many useful things. We have
worked around limitations and very definitely stress tested it in our
environment. A lot. A lot lot.

If you don't trust auditd, that's a totally valid opinion to hold. I welcome
criticism of our approach, but a PSA that doesn't involve having tested our
tool is rather unfair.

~~~
gbrown_
> When we wrote go-audit, there were few, (if any?), distros that had eBPF
> kernel support. Now Ubuntu 16.04 has support, as do many others, and that is
> great!

This! I'm rather tired of people preaching about eBPF when in reality many of
us have to run long-term-supported kernels/distros like Red Hat or Ubuntu LTS
releases.

~~~
sanxiyn
Where did amluto "preach" eBPF?

What is said is "If Slack really wants to use syscall auditing for production,
invest the effort to fix it, please. (And cc me.)", which seems totally
reasonable to me.

Slack's answer is "We have worked around limitations and very definitely
stress tested it in our environment. A lot. A lot lot." As I understand it, this
means "No, Slack will not invest any effort to fix Linux syscall auditing in
upstream kernel, because we have already worked around limitations." Which is
kind of sad and expected.

------
jtakkala
Great idea. I always thought that it's essential to log events in real time to
a remote system that is secure and harder to compromise, so that the logs can't
be modified post-intrusion. Way back in the day it was suggested to do this to an entirely
offline system by cutting the rx pins on a parallel cable, thereby only
allowing the one-way transmission of logs to the log server. I don't know if
anyone ever did that in practice though.

Anyway, this raises the question: are you allowing your production servers to
make outbound internet connections? Generally, I would proxy outbound
connections and/or use internal mirrors and repos for the installation of
software.

~~~
henrygrew
> it was suggested to do this to an entirely offline system by cutting the rx
> pins on a parallel cable, thereby only allowing the one-way transmission of
> logs to the log server.

Sounds like overkill, but pretty cool, I must say.

------
henridf
Looks like a nice tool, and it's great to see syscalls getting more attention.

I don't fully get the argument for why on-host filtering is undesirable. Of
course naively filtering for curl-originated connections isn't a solid
detection scheme for rootkit-installs! That's just a naive filter, which a
naive user could mis-use in a centralized way or in a distributed way.

As for event correlation (#2 of the pros), it can be done on-host too. And
back-testing (#3) of new rules is indeed a highly valuable feature! But you
certainly don't have to log everything centrally to get that capability. E.g.
in the case of Falco, you can capture trace files and re-run any number of
rules/filters on them.
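
To make the back-testing point concrete, here is a minimal Python sketch of replaying previously captured events through a new rule. The event fields (`syscall`, `exe`) and the rule itself are hypothetical, made up for illustration; this is not Falco's trace-file format or rule language.

```python
from typing import Callable, Iterable, List

Event = dict  # hypothetical event shape: {"syscall": ..., "exe": ...}
Rule = Callable[[Event], bool]


def backtest(rule: Rule, captured: Iterable[Event]) -> List[Event]:
    """Replay captured events through a rule and return the ones it would flag."""
    return [event for event in captured if rule(event)]


def connect_from_tmp(event: Event) -> bool:
    """Hypothetical new rule: a binary living under /tmp makes a connection."""
    return (
        event.get("syscall") == "connect"
        and str(event.get("exe", "")).startswith("/tmp/")
    )


# Stand-in for a trace captured earlier, on the host or centrally.
captured_trace = [
    {"syscall": "connect", "exe": "/usr/bin/curl"},
    {"syscall": "connect", "exe": "/tmp/x"},
    {"syscall": "open", "exe": "/tmp/x"},
]

hits = backtest(connect_from_tmp, captured_trace)
```

The replay only needs the stored events, so it works the same whether the capture lives on the host or on a central server.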

I do agree with the point on rules being exposed to an attacker.

[Disclaimer: author of the initial version of Sysdig Falco]

~~~
akadien
Regarding on-host filtering (edge analytics), my experience has been it's
because of performance, and I agree with the security angle, too.

------
nwmcsween
The issue with enabling syscall auditing is the overhead it introduces: IIRC
around two orders of magnitude, as in 200000/s -> 3000/s. I would just
use seccomp-bpf filters on a per program basis as the overhead there according
to benchmarks is much less.
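
For scale, a quick back-of-the-envelope calculation in Python using the two rates above. They are recalled from memory, so treat the result as illustrative; it also assumes the rates describe a single thread issuing syscalls back to back.

```python
import math

baseline_rate = 200_000  # syscalls per second without auditing (figure quoted above)
audited_rate = 3_000     # syscalls per second with auditing (figure quoted above)

baseline_cost_us = 1e6 / baseline_rate      # microseconds per syscall, unaudited
audited_cost_us = 1e6 / audited_rate        # microseconds per syscall, audited
slowdown = baseline_rate / audited_rate     # ratio between the two rates
orders_of_magnitude = math.log10(slowdown)  # the same ratio on a log scale

print(f"{baseline_cost_us:.1f} us -> {audited_cost_us:.1f} us per syscall")
print(f"slowdown {slowdown:.1f}x, about {orders_of_magnitude:.2f} orders of magnitude")
```

So the quoted rates work out to a slowdown of roughly 67x, which is a bit under two full orders of magnitude.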

~~~
viraptor
seccomp-bpf is awesome, but it has to be really baked into the app to be
useful. Not being able to filter by deref memory is pretty limiting.

But not all syscalls need to be logged. You don't need to audit all
read()/write() calls for example. open/connect/exec*/set*id should give you
plenty of information already. I know at least Fedora had audit included, but
inactive, and I haven't heard of any terrible performance degradation there.
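
A minimal Python sketch of that kind of selective logging, assuming the list above means open, connect, the exec family and the set*id family. The pattern list and the `worth_logging` helper are hypothetical; a real deployment would express this as audit rules rather than filter after the fact.

```python
from fnmatch import fnmatchcase

# Assumed reading of the list above; the patterns are illustrative only.
INTERESTING = ("open", "connect", "exec*", "set*id")


def worth_logging(syscall_name: str) -> bool:
    """True if the syscall name matches one of the interesting patterns."""
    return any(fnmatchcase(syscall_name, pattern) for pattern in INTERESTING)


observed = [
    "read", "write", "open", "execve",
    "setuid", "setresgid", "connect", "settimeofday",
]
kept = [name for name in observed if worth_logging(name)]
```

Name globbing is crude: `open` here does not cover `openat`, and `set*id` also matches `setsid`, so an explicit list is probably safer in practice.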

(it looks like the penalty was ~40ns in 2014
[http://permalink.gmane.org/gmane.linux.kernel/1639528](http://permalink.gmane.org/gmane.linux.kernel/1639528))

~~~
amluto
It was an order of magnitude higher at one point. These days it's not _that_
bad under most workloads.

------
akadien
This is seriously cool, in my opinion. In fact, I've been working on something
kind of similar in Go with more support for network monitoring and
vulnerability management that would feed into GrayLog.

Are you guys hiring?

------
jvehent
If you're wondering why this is useful, or if you need it at all, ask yourself
if you would want to get an alert when nginx execs a shell, or opens
/etc/shadow. Syscall auditing gives you the lowest possible interface to
capture these events.
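
As a rough sketch of those two alerts in Python: the event shape (`caller_exe`, `syscall`, `path`), the nginx install path and the shell list are all assumptions for illustration. Real audit records are laid out differently and would need parsing and correlation first.

```python
from typing import Optional

SHELLS = frozenset({"/bin/sh", "/bin/bash", "/bin/dash"})  # assumed shell paths
NGINX = "/usr/sbin/nginx"  # assumed install path


def alert_reason(event: dict) -> Optional[str]:
    """Return a reason if the (hypothetical) event should alert, else None."""
    # caller_exe: the binary that issued the syscall (hypothetical field).
    if event.get("caller_exe") != NGINX:
        return None
    syscall, path = event.get("syscall"), event.get("path")
    if syscall == "execve" and path in SHELLS:
        return f"nginx executed a shell: {path}"
    if syscall in ("open", "openat") and path == "/etc/shadow":
        return "nginx opened /etc/shadow"
    return None


shell_event = {"caller_exe": NGINX, "syscall": "execve", "path": "/bin/sh"}
shadow_event = {"caller_exe": NGINX, "syscall": "open", "path": "/etc/shadow"}
benign_event = {"caller_exe": "/bin/login", "syscall": "open", "path": "/etc/shadow"}
```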

It's not for the faint of heart - the volume of events to filter through
requires some serious infrastructure - but it's an important component of a
mature secops program.

~~~
amluto
The even more mature secops programs might wonder whether the kernel syscall
auditing infrastructure itself is a problematic attack surface.

------
peterwwillis
If you want similar functionality to the first questions without audit, try
netfilter. It's still shitty logs, unfortunately, but so is most monitoring.

------
packetized
Similar to
[https://github.com/gdestuynder/audisp-json](https://github.com/gdestuynder/audisp-json)
and [https://github.com/mozilla/audit-go](https://github.com/mozilla/audit-go)

------
valarauca1
Couldn't you just do this in user-land with `PTRACE_SYSEMU`? Then your tracing
process also has to make 1 system call to unlock and allow the other
process to run?

I started tooling up a basic version of this but I need to change PTRACE
flags, and change from reading /proc/[PID]/mem to `process_vm_readv(2)`.

~~~
SEJeff
You know that ptrace messes up the child-parent relationship (this is why you
can't strace a process that is already being strace'd or gdb'd; it goes 1
level deep only) and has a serious performance impact, whereas syscall
tracing doesn't, right?

------
aliakhtar
Would this be useful in a containerized architecture where everything is run
as a container?

~~~
viraptor
At the host level - yes. All the applications still make the same requests to
the kernel. A different namespace doesn't matter.

------
ejholmes
Great article. I'd also recommend people take a look at ThreatStack for
aggregating syscall events:
[https://www.threatstack.com/](https://www.threatstack.com/).

------
mburns
[duplicate of
[https://news.ycombinator.com/item?id=13009796](https://news.ycombinator.com/item?id=13009796)]

~~~
dang
True, and of
[https://news.ycombinator.com/item?id=13007726](https://news.ycombinator.com/item?id=13007726)
before it. But there seems to be interest, so we've moved the comments to the
post currently ranked highest (that would be this one).

------
mistertrotsky
This is super great.

~~~
ms4720
Yes it is

