So the idea is, instead of having a central firewall managing all the host rules, each service defines its own firewall policy? How do I override a policy?
I may be missing something, but somehow I'm not sure it's the right place to do this.
I'll end up joining the camp of "SystemD does too much and breaks a lot of POSIX semantics, making Linux systems hard to debug."
Lately it's been getting more and more in my way. Things I've had problems with lately: DNS, cgroups and namespaces. Every time, I've lost a considerable amount of time because of poorly documented and mostly unexpected SystemD behavior.
Color me annoyed.
Edit: Hum, well, it wasn't supposed to, but it ended up as a rant.
Some are complaining it breaks POSIX. That does not bother me that much. What does is that their changes are weakly documented and some of them make systems very hard to debug.
1. if you stripped systemd back down to one function, it would be this one (management of services through activation sockets.) Activation sockets are, in fact, the core thing that the systemd/launchd/upstart paradigm offers over traditional init systems, and were the core reason that Linux distros switched to these init systems (because activation sockets allow for concurrent service startup—leading to faster initlevel-change times, impacting boot, shutdown, sleep/wake, dock/undock, etc.)
2. Activation sockets (and a few other core features, like targets and timers) are the totality of what “systemd” by itself means. The other components are a part of the systemd project, but are not what you get if you just download and compile the repo called “systemd.” You get an activation-socket-based init(8).
3. Even before systemd, Linux (and many other POSIX systems) had much of this same functionality already implemented in the form of inetd/xinetd. And since inetd only makes sense to run on certain kinds of POSIX systems (specifically: ones that have motd(5), uname(1), nsswitch(5), services(5), and a bunch of other above-POSIX.2 utilities, and whose init systems work like sysvinit/initrcd), you could group these concerns together and call them all “the base system.” Back then, every distro had its own “base system”, but that didn’t mean that the components of them were any less numerous or that those components didn’t need to evolve in precise lockstep. The components that were shared upstreams between multiple distros (like inetd) simply didn’t evolve—because there was no multilateral place for those evolution discussions to take place—and thus gradually code-rotted and were reduced to obsolescence, rather than keeping up with the times as something you’d actually want to plug new services into.
Systemd (the project, not the core “init system” component of it) is just a multilateral implementation of a base system, with “the systemd project” being the multilateral forum where distro vendors can propose and discuss the changes to base-system components like inetd that were previously fixed+rotting due to lack of ability to coordinate. The results are no more and no less than what you’d expect to happen when a working group, composed of a bunch of vendors who make their money off of the needs of enterprise customers, get together to evolve the Linux base system.
I’m not saying systemd (the project) is good or bad; I’m saying that there’s no alternative that sits in the same ecological landscape of players in the Linux space, that wouldn’t result in the same agendas being expressed through it. Systemd (the project) is inevitable; it’s a Nash equilibrium. (The only other one being the one we were in before, where we had upstream components like sysvinit and inetd and they never changed for ~30 years despite the demands of the distros and their customers.)
Point 2 is moot. I get one SystemD package from my distribution anyway. The "SystemD project" renders my system hard to debug. To the point of having the fchmodat syscall working and the chmod syscall not working, and having no f*g clue why. And nobody can help on IRC because nobody, even the most experienced, understands what is going on. It just is not acceptable!
Point 3 is a recurring argument. Stating that SystemD simply coalesces base system binaries is misleading. It goes way beyond that. Does setting a BPF filter in a SystemD unit result in a line stating so in the log? I bet not. How the hell do I know then? Seriously, for real, how do you debug that kind of system?!
I'm very sorry to say that over the last 3-4 years, the 4 hardest-to-debug issues I've had all had their source in "SystemD, the project".
I have no doubt the SystemD project has good intentions. I also have no doubt they do not care one bit about the cognitive load and workload they _impose_ on others.
The road to hell is also paved with good intentions.
The value is that the service configuration knows which exact network behavior a service has. The global iptables state is not context aware unless you tag things by PID. And anyway it's a cleaner approach to bundle the firewall with the service instead of manipulating global iptables state.
You can override the BPF firewall by adding a drop-in service file which either appends an additional filter with `IPIngressFilterPath=filter` or deletes all previously configured filters with `IPIngressFilterPath=`.
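For illustration, here is a minimal sketch of what such a drop-in might contain, written as a small Python helper that renders the file text. The unit name `example.service`, the drop-in file name `10-firewall.conf` and the filter path `/sys/fs/bpf/my-filter` are all hypothetical; as far as I understand the option, the value is expected to be the path of a BPF program pinned in a BPF filesystem.

```python
from pathlib import Path


def render_dropin(filter_path=None, reset=False):
    """Build the text of a drop-in that adjusts a unit's ingress BPF filters."""
    lines = ["[Service]"]
    if reset:
        # An empty assignment clears filters configured earlier.
        lines.append("IPIngressFilterPath=")
    if filter_path:
        # A non-empty assignment appends one more filter.
        lines.append(f"IPIngressFilterPath={filter_path}")
    return "\n".join(lines) + "\n"


def dropin_path(unit, name="10-firewall.conf", root="/etc/systemd/system"):
    """Conventional location of a drop-in for `unit` (file name is hypothetical)."""
    return Path(root) / f"{unit}.d" / name


# Hypothetical example: drop the packaged filters, then add a local one.
print(dropin_path("example.service"))
print(render_dropin("/sys/fs/bpf/my-filter", reset=True), end="")
```

After writing the rendered text to that path, a `systemctl daemon-reload` and a restart of the unit should be needed for it to take effect.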
As much as I loved the unit conf + socket activation concept at first and supported SystemD, it's become an unruly teenager. It has severely hindered my productivity on several occasions, but I'm stuck with it on Debian. One more glitch and I'll probably start hating its guts badly...
However, you're not stuck with systemd on Debian. Install sysvinit-core and remove systemd-sysv, and install libpam-elogind and elogind instead of libpam-systemd if you need it. You'll very likely still end up with some of the libraries installed - it's not the end of the world and you can install Devuan if you really don't want that, but systemd won't be running as PID 1 and you don't get rubbish like journald or resolved.
Edit : Also been bitten by the DNS mingling.
How does one get a global view though? What am I allowing on this host and, more importantly, how do I fix it when things go wrong? (They always do at some point!)
If you're not a SystemD developer and/or don't follow extremely closely what they are doing, you end up with an unmanageable system before you know it. Just upgrade your distro for security issues and bam, a lot of things stop working and you don't know why. It's getting to the point where it's ridiculous.
Parent comment raises a very valid question. How do you manage the firewall policy with this system? Netfilter configuration is already a mess on Linux (iptables? nftables? iptables-persistent package? netfilter-persistent package? some custom shell script that calls individual iptables rules? rules dynamically inserted by scripts?)
Each of these tools/methods has its flaws, but it becomes completely unmanageable if two are used at the same time.
When this doesn't work as expected, how should a sysadmin handle the situation? Where do you even start debugging? Is he/she expected to inspect every single unit file in search of the one that is amiss?
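To make the "inspect every single unit file" part concrete, here is a rough sketch of a scanner that collects the per-unit IP filtering directives in one place. The directive names are the ones documented in systemd.resource-control(5); the search paths and the file-suffix list are assumptions and will differ between distros. It only reports what is written in the files, not what is actually loaded in the kernel.

```python
from pathlib import Path

# IP-filtering directives documented in systemd.resource-control(5).
FIREWALL_KEYS = {
    "IPAddressAllow",
    "IPAddressDeny",
    "IPIngressFilterPath",
    "IPEgressFilterPath",
}

# Assumed search paths; the real list varies by distro and systemd version.
UNIT_DIRS = (
    "/etc/systemd/system",
    "/run/systemd/system",
    "/usr/lib/systemd/system",
    "/lib/systemd/system",
)

# Unit types that accept resource-control settings, plus ".conf" for drop-ins.
SUFFIXES = {".service", ".socket", ".slice", ".scope", ".mount", ".swap", ".conf"}


def find_firewall_settings(unit_dirs=UNIT_DIRS):
    """Return sorted (file, key, value) tuples for each IP-filtering line found."""
    hits = []
    seen = set()
    for unit_dir in unit_dirs:
        root = Path(unit_dir)
        if not root.is_dir():
            continue
        for path in root.rglob("*"):
            if path.suffix not in SUFFIXES or not path.is_file():
                continue
            real = path.resolve()
            if real in seen:  # skip symlinked duplicates (e.g. *.wants/)
                continue
            seen.add(real)
            try:
                text = real.read_text(errors="replace")
            except OSError:
                continue
            for line in text.splitlines():
                line = line.strip()
                if line.startswith(("#", ";")) or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                if key.strip() in FIREWALL_KEYS:
                    hits.append((str(path), key.strip(), value.strip()))
    return sorted(hits)
```

Calling `find_firewall_settings()` and printing each tuple gives a crude global view of which units carry their own policy. It does not merge drop-ins in the order systemd would, and it says nothing about packet counters.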
Can one get a list of all currently loaded rules? (Preferably with counters for matched packets, and ideally with the possibility to log a packet matched by a rule.)
systemd making this 'easy' to use may be a bad idea. In my eyes, it's just giving users more rope to hang themselves with.
The unit file format is often touted as an asset, because it's much simpler than the shell-goo you would find on most distributions (Debian and derivatives for example provided a skeleton file for people to write their own init.d services. Just the boilerplate was almost 100 lines of sh. Contrast with OpenBSD where most scripts to configure service startup are only a couple of lines).
Having a key=value format was touted as a plus, as it made things easier. Turns out that's not exactly true, because some settings will have the expected effect only if you also enable a different setting at the same time. In my mind, this translates to an if/else, which makes me think the systemd.unit(5) format is not INI-style configuration, but a small programming language masquerading as configuration.
Anyhow, this turned out longer than I expected. The fact that it would be possible to do with sysvinit does not mean it would be a good idea to do this with sysvinit.
And no, I don't have the answer; maybe using Lisp like Guix does?
I spent a whole week trying to figure out why I had a filesystem permission issue with the chmod syscall and not with fchmodat. The culprit was SystemD trying to be clever with namespacing, which could have been fine if the reported error had been semantically correct. It gets in the way and provokes untraceable issues.
And yes, namespaces are a kernel feature. But it's SystemD that sets them up...
Edit: typo and clarification
There is more crazy shit that we can do. Like set up entire service meshes with load balancers for your systemd units. Very neat.
I think Snabb also has XDP/BPF support, and you can do similar things using Lua.
Prior to BPF there was the "Enet Packet Filter", then the Ultrix Packet Filter, then something under SunOS before it became BPF. BPF was created in 1990 (at Berkeley), which was very much BSD-oriented.
BPF, BSD — notice the first letter is the same ;-)
Also, many systems (some BSDs, SmartOS, macOS, Windows 10, …) have an in-kernel VM for running dtrace bytecode.
BPF is very interesting. One thing I remember is that programs are very small and have no loops, but I don't understand its use case for firewalls yet.
Second, I don't actually agree. JIT compilation is a generally accepted approach -- who cares what the IR is? And the bpf runtime is ULTRA constrained, so you can't slip a `system("rm -rf /")` in there (by a long shot).
So it's a question of interpreter vs. JIT, and a JIT makes it more feasible to use an ultra-simple language with aggressive verification, without losing too much speed.
Every feature has an attack surface, but you have to compare against the alternative, not the lack of feature.
(This assumes that you can't force it to use a kernel interpreter despite the JIT existing. Otherwise perhaps that is the part that should be disabled.)
AFAIK, BPF has an in-kernel verifier that checks the safety of a BPF program. It is conservative and will reject even some safe programs, let alone the really dangerous ones.
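A toy sketch of why "no loops" makes this kind of check tractable: if every jump goes strictly forward and stays in bounds, a program of n instructions finishes in at most n steps. The instruction format below is made up for illustration and is not real BPF encoding, and the real eBPF verifier checks far more than this (register state, memory access, and so on; newer kernels also accept bounded loops).

```python
# Hypothetical instruction format: (opcode, operand). Not real BPF encoding.
#   ("ld", n)     load a value
#   ("jmp", k)    skip the next k instructions unconditionally
#   ("jcond", k)  skip the next k instructions if some condition holds
#   ("ret", n)    stop and return n
KNOWN_OPS = {"ld", "jmp", "jcond", "ret"}


def verify(program):
    """Accept only programs that provably terminate by reaching a 'ret'."""
    if not program:
        return False
    for pc, (op, arg) in enumerate(program):
        if op not in KNOWN_OPS:
            return False
        if op in ("jmp", "jcond"):
            target = pc + 1 + arg
            # Reject backward jumps (possible loops) and jumps past the end.
            if arg < 0 or target >= len(program):
                return False
    # Execution only moves forward, so it must end up at the last instruction.
    return program[-1][0] == "ret"


accept_or_drop = [("ld", 0), ("jcond", 1), ("ret", 0), ("ret", 1)]
countdown_loop = [("ld", 3), ("jcond", -2), ("ret", 0)]

print(verify(accept_or_drop))
print(verify(countdown_loop))
```

The second program is rejected purely because it contains a backward jump, whether or not the loop would have terminated in practice. That is the conservative behaviour described above: some safe programs are refused so that no unsafe one gets through this particular check.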
https://medium.com/@tyanir/understanding-bpf-check-alu-op-vu... has a longer description of one of them. Namely, you can convince the verifier that a certain part of the program is dead code and therefore doesn't need to be verified, when in fact it does, and thereby get arbitrary unverified eBPF into the kernel.
There won't be an exploit for bpf. It's kind of a different layer and its own system. "Exploit in bpf" is about the same level as "exploit in C". There's just no such general thing.