
Systemd's DynamicUser feature is currently dangerous - pwg
https://utcc.utoronto.ca/~cks/space/blog/linux/SystemdDynamicUserDangerous
======
CaliforniaKarl
This is the first time I'm learning about DynamicUser, and I'd appreciate it if
someone could check whether the following line of thought is valid:

• We have compute environments that use a central LDAP service for end-user
account info, with `sssd` or `nslcd` as the client on each system (the only
locally-defined accounts are system accounts and special admin accounts).

• We do have accounts with UIDs in the range that systemd uses for the
DynamicUser feature.

• If systemd starts a DynamicUser service before the LDAP client (`sssd` et
al) is up and running, it might allocate a UID that is in use by LDAP.

• The systemd documentation does not specify if UIDs are chosen sequentially,
at random, or via some other method. So, one must assume that any UID in the
range—if (apparently) free—may be allocated.

Does that thinking make sense? It seems like we will have a pretty big problem
to deal with when we decide to move to a distro version which includes
DynamicUser support (for example, Ubuntu 18.04). It also doesn't look like
it's possible to set a custom range for DynamicUser UIDs.

~~~
leni536
Looks like the range is compiled in.

[https://github.com/systemd/systemd/blob/5a8b16409240bc95d95f...](https://github.com/systemd/systemd/blob/5a8b16409240bc95d95f3c19d6f5cd9366e6ccf4/meson.build#L663)

[https://github.com/systemd/systemd/blob/6f663594bc5ee087cb15...](https://github.com/systemd/systemd/blob/6f663594bc5ee087cb15175cb691eaebcc45a0a1/meson_options.txt#L159)

From the man page:

> Dynamic users/groups are allocated from the UID/GID range 61184…65519. _It
> is recommended to avoid this range for regular system or login users._

It looks like you will have to compile systemd with a different range to
behave well on your system.
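
A minimal sketch of the check this implies: given the UIDs a directory hands out, list the ones that fall inside the default range quoted from the man page. The function name and the sample UIDs are hypothetical; only the range bounds come from the quote above.

```python
# Default DynamicUser range, as quoted from the man page above.
DYNAMIC_UID_MIN = 61184
DYNAMIC_UID_MAX = 65519


def uids_in_dynamic_range(uids):
    """Return the sorted, de-duplicated UIDs inside the range (bounds inclusive)."""
    return sorted(u for u in set(uids) if DYNAMIC_UID_MIN <= u <= DYNAMIC_UID_MAX)


if __name__ == "__main__":
    # Hypothetical sample; in practice, feed in the UIDs exported from LDAP.
    sample = [1000, 61183, 61184, 62000, 65519, 65520]
    print(uids_in_dynamic_range(sample))
```

Any UID this reports is one that a DynamicUser service could plausibly be given while the directory client is not yet answering lookups.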

Edit: Apparently the UID allocation is randomish:

[https://github.com/systemd/systemd/blob/7426028b7a649bdfa473...](https://github.com/systemd/systemd/blob/7426028b7a649bdfa4737fe4a355a6bd2e3d4543/src/core/dynamic-user.c#L187)
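
To make "randomish" concrete, here is an illustrative sketch. It is not systemd's code, just one way such an allocator could look: draw random candidates from the range and take the first one that does not appear to be in use. `pick_free_uid`, `known_to_nss` and the `is_in_use` callback are hypothetical names.

```python
import pwd
import random

DYNAMIC_UID_MIN = 61184
DYNAMIC_UID_MAX = 65519


def known_to_nss(uid):
    """True if the local name service currently resolves this UID."""
    try:
        pwd.getpwuid(uid)
        return True
    except KeyError:
        return False


def pick_free_uid(is_in_use=known_to_nss, rng=None, max_tries=100):
    """Return a random UID in the range that is_in_use() rejects, or None."""
    rng = rng or random.Random()
    for _ in range(max_tries):
        candidate = rng.randint(DYNAMIC_UID_MIN, DYNAMIC_UID_MAX)  # inclusive
        if not is_in_use(candidate):
            return candidate
    return None  # gave up: every candidate tried looked taken
```

The sketch also shows the hazard raised at the top of the thread: the result is only as good as `is_in_use`. If the lookup cannot see LDAP yet, an LDAP-owned UID looks free.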

~~~
AnIdiotOnTheNet
> Looks like the range is compiled in.

Of course it is. UNIX devs...

~~~
zaarn
You should check out suckless projects, they compile in everything. There is no
config. Or rather, the #define statements are the config.

I think the only time I legit used that is on an Arduino where both code and
data have severe size limits.

~~~
qu4z-2
Alright, but they're designed for you to read through config.h (note that
these settings aren't inlined) and adjust them for your system before building
a binary. They're very explicitly not designed for a "grab your vendor binary
and you'll be fine" model. It's a different paradigm.

EDIT: The programs themselves are intended to be written in a programmer-
modifiable way (they host a list of common "plugins" in patch format). I think
the last thing you can say about systemd is that it encourages you to jump in
and modify its behaviour to suit your needs. That's not a Supported
Configuration(tm).

~~~
zaarn
>That's not a Supported Configuration(tm).

Depends on what you modify, but to my knowledge it's supported if your patches
didn't cause the bug or behaviour problem. Plus there are plenty of ways to
configure systemd at compile time; you can toggle a lot of switches in
systemd.

------
throw2016
This would be quite difficult to debug for the average user. A lot of systemd
functionality appears to be designed for the military environments served by
Red Hat, for instance journalctl's binary logs for security auditing. But this
is not required by the vast majority of Linux users.

Offloading complexity to everyone to serve a specific use case is bad design.
It's like implementing high security military procedures in the average
office, not needed and a waste of time and resources.

Shouldn't security features that are over-engineered by design, like a time
daemon launched by a dynamic user in a new mount namespace, be left to user
choice? Surely those who need that level of security should take the
responsibility to enable it, accept the debt and deal with the complexity,
rather than imposing it on everyone else. In this case ntpd is a better
solution for average users.

Most distributions voted for an init system. An init has a limited role.
Systemd is proving to be anything but.

~~~
djsumdog
I agree entirely, and the complexity of systemd has been one of the major
concerns.

I do like having a standardized way of managing processes. Systemd does make
packaging deb/rpm files way easier, but I don't really like the price.
Everything is abstracted to systemd. Mounts. udev. Multi-user/logins
(consolekit).

I like fstab with UUIDs. I like manually mounting a USB stick when I insert
it. I like having the options of using an automounter or not using an
automounter.

At home I stick to Gentoo and Void. runit is super simple and I like the
concept behind it (although it does lack in some exceptions/logging issues).

I think ideally on my hosted solutions, the best thing going forward is a thin
Alpine with Docker and running all services as docker containers.

I really wish the FreeBSD port of Docker was still maintained. I'd switch
everything to FreeBSD+Docker if I could.

~~~
derefr
> I think ideally on my hosted solutions, the best thing going forward is a
> thin Alpine with Docker and running all services as docker containers.

That sounds like a lot more trouble than it's worth, compared to CoreOS or
Ubuntu Core. If everything is running in Docker, why does the "hypervisor's"
use of systemd matter? It's not using it _for_ anything.

------
ealexhudson
Most security features which try to lock things down properly, especially when
doing no-access-by-default, cause problems in unforeseen cases.

I don't think it's dangerous to develop better implementations that improve
security, even if they go wrong occasionally. Feels more dangerous to me
shooting down the attempts of people trying to raise the security bar.

~~~
simion314
I prefer Linus's philosophy of not breaking userspace, so they should try to
find a solution that does not break things. (I assume someone will comment
that this is impossible in this case; if so, please add an attempt to
prove/motivate it.)

~~~
LukeShu
Things are broken in this case _because of a bug_, not because of design
decisions. All of the features "locking things down" are opt-in in the service
file (to avoid "breaking userspace"), and the service file for
systemd-timesyncd opted in.

~~~
qu4z-2
The "bug" being systemd making all sorts of undocumented assumptions about the
environment it's running in?

------
api
I see that systemd is still junk.

I love Alpine Linux. If we (ZeroTier) had our infrastructure to do over we'd
use it instead of CentOS for servers. It dumps systemd and countless other
pieces of over-engineered cruft that you don't need. It's a thing of beauty.
If you appreciate clean, well designed, fast, and parsimonious systems check
it out.

FreeBSD is also worth checking out for the same reason. It lacks a bit on the
hardware front but it's clean and fast and does not have systemd cancer.

Over-engineering is the plague of all modern software.

~~~
tannhaeuser
Completely agree, but what about Alpine using musl rather than GNU libc? AFAIK
this has caused trouble in the past (such as
[https://github.com/gliderlabs/docker-alpine/issues/11](https://github.com/gliderlabs/docker-alpine/issues/11)),
and will eventually do so again, since third-party packages don't test against
non-glibc Linux, or do they? I'm wondering if it's prime time for Devuan.

~~~
djsumdog
So long as Docker works on Alpine and Docker+musl bugs/issues are addressed
and fixed, Alpine could be a great base for running a container
infrastructure. Within the containers you could use regular Ubuntu/glibc based
images.

~~~
tannhaeuser
I'd be careful with such assumptions. Docker forwards implementation details
of the host system/userspace and isn't a VM after all. A Docker image is _not_
forward-compatible with future Docker or host versions. Which raises the
question of whether people use Docker for the wrong reasons if their intention
is to obtain future-proof reproducible builds etc., rather than merely to
increase image-per-machine density.

~~~
danudey
Docker provides a lot more future-proof reproducibility than other deployment
strategies, though. It's not 100% guaranteed, but whether a given docker image
is forwards-compatible doesn't matter as long as the reproducibility of
_creating_ the Docker image doesn't change substantially.

In other words, as long as I can get (or make) an ubuntu:xenial docker image
and apply the same (or similar) transformations to it to make the end result,
it doesn't matter nearly as much whether specifically this Docker instance
works across all versions of Docker forever.

------
cryptonector
> The user is automatically allocated from the UID range 61184–65519, by
> looking for a so far unused UID.

WAT. That's not OK. The UID (and GID) namespace is not that big (32-bit), but
it's big enough to avoid conflicts with existing uses: just use a range within
the larger range between (uid_t)(1UL<<31) and (uid_t)(-2).
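
For scale, the arithmetic behind that suggestion can be worked through in a few lines. This is a sketch that assumes `uid_t` is a 32-bit unsigned integer, as it is on Linux; the variable names are illustrative.

```python
UID_BITS = 32  # assumption: uid_t is a 32-bit unsigned integer

low = 1 << 31                    # (uid_t)(1UL<<31)
high = (-2) % (1 << UID_BITS)    # (uid_t)(-2) wraps around to 2**32 - 2
proposed_span = high - low + 1   # number of UIDs in the proposed range

dynamic_span = 65519 - 61184 + 1  # number of UIDs in the current range

print(low, high)        # 2147483648 4294967294
print(proposed_span)    # 2147483647
print(dynamic_span)     # 4336
```

The upper bound stops at `(uid_t)(-2)` because `(uid_t)(-1)` is conventionally used as a "leave unchanged" sentinel by calls such as `chown`.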

Solaris 11+ and Illumos do this for dynamically assigning UIDs and GIDs to
SIDs that are not mapped by name to Unix users/groups.

~~~
LukeShu
Inside of a container (with user namespacing enabled), you won't have the full
32-bit range, and this must all work inside of containers; I'm not sure about
other container managers off the top of my head, but systemd-nspawn only gives
containers a 16-bit subrange.

~~~
cryptonector
The container UID namespaces should be the same size, damnit. (That is how it
is in Solaris/Illumos zones...)

~~~
LukeShu
Linux user namespaces work as a 1-to-1 mapping of UIDs. Every UID in the
container has to map to a UID on the host, so the UID range of the container
is necessarily smaller than the UID range of the host (unless of course the
map is the identity, but then what's the point of having a separate
namespace?).
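
A small sketch of that 1-to-1 mapping, modelled on the three-column layout of `/proc/<pid>/uid_map` (inside start, outside start, length). The function name and the sample host offset of 524288 are hypothetical.

```python
def to_host_uid(container_uid, uid_map):
    """Translate a container UID to a host UID, or return None if unmapped.

    uid_map is a list of (inside_start, outside_start, length) tuples,
    mirroring the lines of /proc/<pid>/uid_map.
    """
    for inside_start, outside_start, length in uid_map:
        if inside_start <= container_uid < inside_start + length:
            return outside_start + (container_uid - inside_start)
    return None


# Hypothetical container that was given a 16-bit subrange of host UIDs.
sixteen_bit_map = [(0, 524288, 65536)]

print(to_host_uid(61184, sixteen_bit_map))    # 585472: the DynamicUser range fits
print(to_host_uid(1 << 31, sixteen_bit_map))  # None: a high range has no mapping
```

With only 65536 UIDs mapped, anything above 65535 inside the container has no host UID behind it, which is the constraint being described.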

Maybe that is bad design, but if it is: it's Linux's fault, not systemd's.

------
exikyut
Since something's probably going to be said eventually, I'll do it this time.

> _...how timesyncd is supposed to get access through an inaccessible
> directory. I'll quote the explanation for that:_

> > _[Access through /var/lib/private] is achieved by invoking the service
> > process in a slightly modified mount name-space: it will see most of the
> > file hierarchy the same way as everything else on the system ([...]), except
> > for /var/lib/private, which is over-mounted with a read-only tmpfs file
> > system instance, with a slightly more liberal access mode permitting the
> > service read access. [...]_

Reading this, I didn't quite _completely_ facepalm, but...

This solution - the high-level general architecture/approach; the ideas used -
is, IMO, frankly insane.

It means your running system's state can no longer be easily and
straightforwardly reasoned about: no longer can you run a few commands and get
a high level idea of what's configured (with respect to filesystems) and how
everything's set up, see what files are where, and immediately know what a
given file's permissions are.

Instead, it seems you're now being expected to consider any arbitrary, given
filesystem path you're puzzling over from the perspective of every FS
namespace as viewed by each process (to be clear, this means every file *
every namespace * every process). No sysadmin/devops type is going to do that;
it's not sustainable.

This architecture is bizarre enough that few tools will be built to do
adequate introspection unless developers ( _glares at one in particular_ )
actually extend and build on this further and additional even more wonderful
breakage happens as a result, meaning that the tools _must_ be created in
order to keep systems manageable. Hopefully things don't get that bad? - but
in the meantime said tools don't exist, so people get to reverse-engineer
PID 1 ( _AHEM_ ) the Fun™ way, and keep all the half-square, half-circle
pieces they discover along the way.

Looking further afield, I'm more hesitant about the future of Linux as a
viable trustworthy platform to have confidence in. I say that both from the
perspective of straightforward enjoyable maintenance (which Linux is already
struggling with) and from the perspective of _reasonably_ consistent and
surprise-free mental modelling to aid security best practice. UNIX was based
on the idea of "everything's a file". Not, IMHO, the best/most efficient
model; but okay. This... this blows that model out the window, because
suddenly we have architectural interestingness being built on building blocks
that exceed the scope of the original file model (look at a file, see the
permissions of that file), _but without pivoting/extending the basic building
blocks of the system to incorporate the new models_. Linux is still known as a
UNIX clone, and the UNIX standards ("everything's a file" being fundamental)
haven't changed anytime recently, so this is... not _dishonest_, but
definitely a potential source for a lot of confusion. And kind of technically
dishonest.

Furthermore, there's no defined direction for this new... standard? that seems
to be appearing. I can't effectively model this seemingly byzantine
architecture; I can't intuit landmarks or similarities from other systems
(although I'll admit I've only used Linux, Windows and DOS).

I do understand mount, PID and network namespacing. These concepts are not
that difficult to reason about, in isolation. But they can be combined in very
very unintuitive ways that make state analysis very difficult, and what I'm
trying to express here is that I don't consider the architecture presented to
be intuitive, easy to debug, or effective. (I never envisaged namespacing
being used like this, of course.) Perhaps it was the simplest solution, in
isolation, but it doesn't feel well-designed or thought through (with respect
to sane diagnostics and transparent low-level housekeeping).

Part of my freakout is that the tools available to examine namespaces are very
target-specific; they don't consider the system as a whole. The question is
whether the developers ( _briefly resumes glaring_ ) nearest the namespace
bits would be willing to maintain tools to help introspect at a holistic level.
That may be needed soon.
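
As a rough idea of what a whole-system view could start from, here is a sketch that groups processes by mount namespace using the `/proc/<pid>/ns/mnt` symlinks on Linux. The function names are hypothetical, and this is only a starting point, not a full introspection tool; util-linux's lsns(8) reports similar information.

```python
import os
from collections import defaultdict


def read_mount_namespaces(proc="/proc"):
    """Map pid -> mount-namespace identifier, e.g. 'mnt:[4026531840]' (Linux only)."""
    result = {}
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            result[int(entry)] = os.readlink(os.path.join(proc, entry, "ns", "mnt"))
        except OSError:
            continue  # process exited, or we may not inspect it
    return result


def group_by_namespace(ns_by_pid):
    """Invert pid -> namespace into namespace -> sorted list of pids."""
    groups = defaultdict(list)
    for pid, ns in sorted(ns_by_pid.items()):
        groups[ns].append(pid)
    return dict(groups)


if __name__ == "__main__":
    for ns, pids in group_by_namespace(read_mount_namespaces()).items():
        print(ns, pids)
```

Processes that share an identifier see the same mount table; a service running with a private mount namespace shows up in a group of its own. Reading other users' entries generally requires root.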

I guess the other part is that it feels Linux is getting really complicated. I
_think_ , based on my understanding of psychology, that this may be because
I've been using Linux for a few years now (a decade or so), and my usage of it
has perhaps become ingrained and rusted in place. Maybe so. But I do also
wonder if the bazaar has scaled to the point where nobody can keep track of
all the pieces as they move forward.

~~~
FooBarWidget
Can't you say the same thing about containers? Each container has its own
filesystem, PID and network namespace. I don't see people experiencing
containers as byzantine.

~~~
lmm
Containers are byzantine, but presumably you can at least run your own process
inside the container and debug from there? How can you hope to even start to
debug if your debugger has a different view of the filesystem from the process
you're debugging?

~~~
LukeShu
You can use the nsenter(1) command to enter the namespace of another process.
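
A minimal sketch of what that looks like in practice, written as a small helper that builds the command line. `nsenter_mount_command` and the PID 1234 are hypothetical; `--target` and `--mount` are nsenter(1) options, and actually running the result requires root.

```python
import shlex


def nsenter_mount_command(pid, argv):
    """Build an nsenter(1) command that runs argv in pid's mount namespace."""
    return ["nsenter", "--target", str(pid), "--mount", "--"] + list(argv)


if __name__ == "__main__":
    # Hypothetical PID of the service whose view of the filesystem we want.
    cmd = nsenter_mount_command(1234, ["ls", "-ld", "/var/lib/private"])
    print(shlex.join(cmd))
    # To execute it for real (as root): subprocess.run(cmd, check=True)
```

The command is resolved after entering the target namespace, so the program being run has to exist in that namespace's view of the filesystem.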

------
JdeBP
For a Hacker News discussion of this feature when it was first announced, see
[https://news.ycombinator.com/item?id=15419100](https://news.ycombinator.com/item?id=15419100).

------
CaliforniaKarl
See also
[https://news.ycombinator.com/item?id=17720980](https://news.ycombinator.com/item?id=17720980)

~~~
JdeBP
[https://news.ycombinator.com/item?id=17714360](https://news.ycombinator.com/item?id=17714360)
had the correct title from the original. (-:

