
Modern Linux service isolation - viraptor
https://defenceforstartups.com/posts/modern-linux-service-isolation.html
======
cyphar
I'm confused why the author doesn't mention cgroups, but did mention rlimits.
rlimits have a whole host of issues that are not really solvable without going
straight to cgroups. cgroups also have many more knobs than rlimits.

The odd thing is that they explicitly mention RLIMIT_NPROC, but not the
pids cgroup (which has effectively the same systemd interface). I'm quite
biased of course (my first kernel contribution was the pids cgroup
implementation).
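
For comparison, here is a rough sketch of where the two limits live for a
running process. It assumes a cgroup v2 (unified) hierarchy mounted at
/sys/fs/cgroup; the helper names are hypothetical and only for illustration.
In a systemd unit the corresponding directives are LimitNPROC= and TasksMax=.

```python
import resource
from pathlib import Path

CGROUP_ROOT = Path("/sys/fs/cgroup")  # assumption: cgroup v2 mounted here


def cgroup_v2_path(proc_cgroup_text):
    """Return the cgroup v2 path from /proc/<pid>/cgroup content, or None."""
    for line in proc_cgroup_text.splitlines():
        # The v2 entry has hierarchy ID 0 and no controller list: "0::/path"
        if line.startswith("0::"):
            return line[len("0::"):]
    return None


def pids_max_file(cgroup_path, root=CGROUP_ROOT):
    """Location of the pids controller limit for a given cgroup path."""
    return root / cgroup_path.lstrip("/") / "pids.max"


def describe_limits():
    """Read both limits for the current process (Linux only)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    path = cgroup_v2_path(Path("/proc/self/cgroup").read_text())
    pids_max = None
    # The root cgroup has no pids.max file, so check before reading.
    if path is not None and pids_max_file(path).exists():
        pids_max = pids_max_file(path).read_text().strip()  # "max" or a number
    return {"rlimit_nproc": (soft, hard), "pids.max": pids_max}
```

Note that RLIMIT_NPROC is checked against the count of processes owned by the
real user ID, while pids.max counts the tasks that are actually in the cgroup.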

~~~
viraptor
You're right. Cgroups could be added, but I left them out on purpose. One
reason is that they're a pretty big topic and would deserve a post on their
own. Another is that as far as security is concerned, I don't think they
provide more restrictions than rlimits. (they do make process management
easier though!)

For sharing compute resources - they're great. But apart from resource
consumption, do you think they make any specific attack vector impossible, or
harder? I couldn't come up with a specific example, but happy to learn
otherwise!

~~~
cyphar
> I don't think they provide more restrictions than rlimits.

They provide knobs which actually reflect backing resources rather than
limiting something arbitrary like the number of open files. Not to mention
that you can limit the amount of kernel memory used by a process (not possible
with rlimits).

Also, rlimits have interesting interactions once a process starts changing its
user ID or session ID. cgroups are actually linked directly to the set of
processes in the cgroup, and it's completely transparent what's going on.
rlimits are not transparent and have odd semantics that aren't described in
the man page.

Not to mention that rlimits use _signals_ for communicating when you've
started to hit some of the limits. This is not a good idea (Unix signals are
just bad in so many ways it's not funny). So you can't be sure that your
process will ever figure out that it's hitting a limit.
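
To illustrate the signal-based reporting, here is a minimal sketch using
RLIMIT_CPU, where the kernel sends SIGXCPU once the soft limit is reached.
The function name is hypothetical, and it assumes Linux with fork available.
The child never installs a handler, so it simply dies without ever "noticing"
the limit.

```python
import os
import resource
import signal


def run_until_cpu_limit(soft_seconds=1):
    """Fork a child that spins until its soft RLIMIT_CPU is hit.

    Returns the number of the signal that terminated the child, or None.
    """
    pid = os.fork()
    if pid == 0:
        # Child: never return into the caller's code.
        try:
            # Make sure the default action (terminate) is in effect.
            signal.signal(signal.SIGXCPU, signal.SIG_DFL)
            _, hard = resource.getrlimit(resource.RLIMIT_CPU)
            resource.setrlimit(resource.RLIMIT_CPU, (soft_seconds, hard))
            while True:  # burn CPU; nothing here checks for a limit
                pass
        finally:
            os._exit(1)
    # Parent: wait for the child and report how it ended.
    _, status = os.waitpid(pid, 0)
    return os.WTERMSIG(status) if os.WIFSIGNALED(status) else None
```

A process that wants to react has to install a SIGXCPU handler itself, with
all the usual async-signal caveats.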

> do you think they make any specific attack vector impossible, or harder

Kernel memory exhaustion is harder because you can be sure that your limits
are actually limiting the backing resource you're trying to protect. The
rlimits for open files and similar things are just not useful for things like
that.
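
As a sketch of what setting such a limit looks like: on cgroup v2, as far as I
understand it, kernel memory is charged against the same memory.max limit as
userspace memory (cgroup v1 had a separate memory.kmem.limit_in_bytes file).
The helper name and the example values below are hypothetical.

```python
from pathlib import Path


def apply_cgroup_limits(cgroup_dir, limits):
    """Write each limit to its control file inside an existing cgroup directory.

    `limits` maps control-file names (e.g. "memory.max") to values.
    """
    cgroup_dir = Path(cgroup_dir)
    written = {}
    for name, value in limits.items():
        control_file = cgroup_dir / name
        control_file.write_text(f"{value}\n")
        written[name] = control_file
    return written


# Hypothetical limits for a service: 256 MiB of memory and at most 64 tasks.
EXAMPLE_LIMITS = {"memory.max": 256 * 1024 * 1024, "pids.max": 64}
```

Applying this to a real cgroup directory under /sys/fs/cgroup needs suitable
privileges, and the memory and pids controllers must be enabled for it.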

Also, there are things like the blkio cgroup which don't have an rlimit
analogue. And then there's the fact that RLIMIT_CPU is completely useless (you
shouldn't care about how many seconds of _CPU time_ a process accumulates over
its lifetime, you should be limiting what fraction of the total CPU power it
can take up from the system).
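
To make the contrast concrete: RLIMIT_CPU is a one-shot budget of CPU seconds,
whereas a cgroup CPU quota is a rate. A minimal sketch, assuming the cgroup v2
cpu.max file (format "<quota> <period>" in microseconds); the helper name is
hypothetical.

```python
def cpu_max_value(cpu_fraction, period_us=100_000):
    """Format a cgroup v2 cpu.max value ("<quota> <period>", in microseconds).

    cpu_fraction=0.5 allows half of one CPU per period; 2.0 allows the
    equivalent of two full CPUs.
    """
    if cpu_fraction <= 0:
        raise ValueError("cpu_fraction must be positive")
    quota_us = round(cpu_fraction * period_us)
    return f"{quota_us} {period_us}"
```

For example, cpu_max_value(0.5) gives the string to write into cpu.max so the
group gets at most 50ms of CPU time in every 100ms period.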

In general, POSIX rlimits are just underpowered as a resource control
mechanism IMO. Especially since many are more historical artifacts than
anything else.

~~~
viraptor
Thanks. The kernel memory limit is interesting. I'll have to read some more
details there.

------
arpa
Was expecting generic, run-of-the-mill "just containerize using Docker/LXC",
found a refreshing perspective on dealing with services without explicit
containerization. Awesome post and thank you!

