> The real problem here is that it is impossible to diagnose or debug this situation. Simply to get this far I had to read the systemd source code (to find the code in timesyncd that printed this specific error message) and then search through 25,000 lines of strace output. And I still don't know what the problem is or how to fix it.
This seems to be really common practice in modern software development, and it sucks.
There's a rush to rewrite old software and to match some or most of that software's feature set, enough to get people to adopt it, but useful error messages and other diagnostics represent a lot of extra work and so nobody bothers.
Not enough programmers are spending time as systems administrators, trying to keep plates spinning, and so they aren't learning to appreciate the value of software that helps you troubleshoot it when something goes wrong. It's turning into the long-running conflict between mechanics and automotive engineers, where mechanics are tired of things like, "To replace [simple part that needs to be replaced every 5 years]: begin unbolting the entire front of the car. The front of the car uses 23 similar, but different, types of fasteners; keep track of them."
It needs to be a rule that if somebody's going to rewrite some software, they've got to match or improve the logging and error handling in it. Error handling needs to start getting the kind of attention and shaming for bad behavior that bottom-of-the-barrel security practices get.
The arguments for systemd were just "boot faster" and "it's newer". Everyone knew it was going to be broken, buggy, and lacking features, as those were never actual goals.
In fact, the goals had features never intended for init. So we could say they were rushing for feature anti-parity.
See the table at the end. All the features systemd has show "no" for init. And "verbose debug", something you would consider essential in a piece of software that can leave your machine broken and unusable, just shows "no" for systemd, and nobody ever cared, because: boot faster (or so they say).
systemd is a mistake. But I guess now it is a permanent mistake.
systemd has been adopted by many different distros, however this is not because of "top down" pressure (From who, Red Hat? Most distros don't answer to Red Hat), and most of them don't have a (B)DFL who could have imposed that decision on them.
There are many arguments for systemd other than "boot faster" and "it's newer". In fact, the author of the article this is a thread for has written a pretty good list of them: https://utcc.utoronto.ca/~cks/space/blog/linux/SystemdRight
The article you linked... is bad for very many reasons. In particular, it makes no acknowledgement that what it refers to as "init" (http://www.nongnu.org/sysvinit/) is only really half of an "init system" as the term is commonly used today, and all distros would be pairing it with something else on top of it. The table at the bottom is particularly ridiculous, and I take issue with almost every row; a lot of that comes down to the silliness of comparing systemd to sysvinit-by-itself, rather than sysvinit+initscripts or sysvinit+OpenRC or sysvinit+LSB-init-scripts.
Often it was a "matter of fact" thing. If the ecosystem under the distro changes to use systemd, distros will follow. There might be more advantages to it, but it being required to run Gnome 3 without additional effort, and other services in general starting to rely on it, made the pro-systemd decision the easy one for distro maintainers. You can see that point being made in the Debian discussion on the issue.
Till they hit the showstopper bugs and can't solve them.
Debian maintained their own init-scripts framework (and still does, for kFreeBSD and other kernels), so that certainly didn't move out from under them.
Arch Linux maintained their own initscripts framework, so that wasn't moving out from under them.
I suppose there is a case to be made that distros were pushed to systemd by Gnome 3's reliance on systemd-logind's D-Bus API.
Systemd solves real problems that have to do with virtualisation and/or containerisation. Red Hat is the company that wrote Systemd and they did it for commercial reasons (i.e. they needed these facilities in order to fulfil contracts with paying customers). Red Hat contributes greatly to Gnome. I haven't checked, but I believe they are far and away the biggest contributor. They have financial reasons for doing so, but I don't really understand what they are.
The connection between Systemd and Gnome is real, but not quite what you think. Systemd does make some stuff in Gnome easier to do, but it isn't really necessary at all. Systemd was not developed for Gnome -- it was adopted by Gnome. It's a bit of a dog food problem. Most Gnome developers (i.e. Red Hat developers) have a vested interest in Systemd being successful. They want to make sure they are using it where they can. They saw no particular reason to make Gnome 3 compatible with any other init system because: 1. It would be more work and why should they pay for work that isn't in their interest 2. They want more people to adopt Systemd because that puts Red Hat in a good position to sell more contracts.
It's totally understandable and not really underhanded. I personally hate it with a passion, but things I hate are not always evil :-)
systemd predates the industry love affair with containerization and virtualization for workload management, although work on containerization within systemd may have helped increase awareness of containerization's value. I don't remember early reasons to use systemd mentioning either; the rationales trotted out in favor of systemd in its early days were reducing boot time and replacing shell scripts used during boot, shell scripts labeled things like "old" and "fragile", despite multiple decades of shell scripts being successfully used to reliably manage the boot process on open source and proprietary Unixes, alike.
According to Wikipedia, the initial release was in March 2010. The industry started courting virtualization (as it is practiced today) with the appearance of EC2, in August 2006, and the love affair came shortly afterwards.
"Reducing boot time" is of great importance for short-lived containers - if it takes 60 seconds to boot a server, and you only run it for 120 seconds, then you have 50% overhead. If it takes 6 seconds, you're down to 5% overhead; If it takes 150ms (e.g. Intel Clear Container project), it's completely negligible.
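To make the arithmetic above concrete, here is a minimal sketch that computes boot time as a fraction of the useful run time. The function name is made up for illustration; the numbers are the ones quoted in the comment (60 s, 6 s and 150 ms of boot against a 120 s run).

```python
def boot_overhead(boot_seconds: float, run_seconds: float) -> float:
    """Return boot time as a fraction of the useful run time."""
    return boot_seconds / run_seconds


RUN_SECONDS = 120.0  # lifetime of the short-lived container in the example

for label, boot in [("60 s", 60.0), ("6 s", 6.0), ("150 ms", 0.150)]:
    # ".3%" formats the fraction as a percentage with three decimals.
    print(f"boot {label}: {boot_overhead(boot, RUN_SECONDS):.3%} overhead")
```

With these inputs the 150 ms case works out to roughly an eighth of a percent, which is the "completely negligible" end of the scale.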
Those scripts are, in fact, old and fragile, and while they did manage the boot process, "reliable" is a stretch. I've personally filed 2 bugs against the horrible hodge-podge of isc-dhcp that, on rare occasions, randomly failed (which is a bit of a problem if your DHCPd box is unattended and a few hours' drive away).
Through the years, many systems much better than init scripts have been proposed and implemented - personally I like the daemontools family, which I think is inherently better thought out.
Although it definitely helped to have Red Hat backing, systemd won for technical reasons: it's much better than the init scripts; it's not inferior to any other system; and it comes with simple integration with network/resolvers/etc (by virtue of including those services) -- something that saves distributions a lot of work integrating various parts.
Containers don't tend to run an init.
Then again, that notion brings to mind a certain comic about a self-ddos...
But when containerization caught the zeitgeist, they were quick to trot out nspawn as an alternative to, say, Docker (never mind that Docker and RH devs had a bit of a spat around the same time).
Since then they have been shifting the goalposts towards whole system management, perhaps best likened to Windows Active Directory, by adding more and more sub-systems that frankly have crap all to do with booting.
On top of all this it seems the people involved regularly end up pulling grsec style "fixing" of systems that work in real life but are "broken" according to some spec or other (sometimes ignoring decades of hard earned experience in the process, as was the case with their DNS reimplementation).
Namespaces and cgroups are provided by the kernel. Any init system can use them. Cgroups are mounted at /sys/fs/cgroup and while there continues to be some friction on how these are mounted and used with systemd going its own way, other init systems like OpenRC manage them perfectly well.
The rest of it is standard networking, mounts and autostart services started by the container managers like LXC, Docker, libvirt for VMs and these can be done by any init system.
> This seems to be really common practice in modern software development, and it sucks
I have this discussion with the PM and CEO every now and then. It is impossible to give meaningful error messages for all errors. For known errors you can give a meaningful error message, but for errors not yet discovered it's more luck than anything if you give a meaningful error message. The best you can do is make the software robust so it'll work if a non-critical subsystem/plugin fails.
The next part of the discussion is usually them suggesting to just allocate time to figure out most errors and write meaningful error messages for them. But the problem is that this isn't a mechanical system, but a system several orders of magnitude more complex, and even if we did spend multiple man-years doing it, the cost in terms of added complexity would be enough to get any management directly responsible fired.
That said, I completely agree that developers should spend time supporting and running systems so they understand the impact of some of those decisions. In particular, it should be expected that when you encounter errors that aren't bugs in the source code, you account for them.
But we don't get that in modern software. We get "Something went wrong :(".
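As a rough illustration of the difference between "Something went wrong :(" and an error that accounts for a known failure mode, here is a minimal sketch. `ConfigError` and `load_config` are hypothetical names, not from any project discussed here; the point is only that the message says which file, which operation, and what the OS reported, and that the original exception stays attached.

```python
class ConfigError(Exception):
    """Raised when a configuration file cannot be loaded."""


def load_config(path: str) -> str:
    """Read a config file, turning OS-level failures into a descriptive error."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError as exc:
        # Say what we were doing, on which file, and what the OS reported.
        # "from exc" keeps the original exception available for debugging.
        raise ConfigError(
            f"cannot read config file {path!r}: {exc.strerror} (errno {exc.errno})"
        ) from exc
```

This does nothing for errors nobody anticipated, which is the point made above, but it is cheap for the failures you already know can happen.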
Complex problems are not easily made simple or easily reasoned about. Something like systemd (or any non trivial piece of linux) could never be made simple or, with your hands down in the guts of it, easily understood.
Breaking systemd's functionality into a million separate pieces wouldn't make the complexity go away, it would necessarily increase it.
I don't know what the solution is, but "making things simpler and easier to understand" is a bit hollow. Its problem space was never simple or easy to understand (unless you think systemd just runs some things at startup).
With older *nix setups that was fully possible because daemons were pretty much (I am likely to get dogpiled on this) a user process without a shell to send output to.
With systemd on the other hand the state inside a systemd-managed unit, and the state outside of said "black box", may be wildly different. And the person debugging it has to attempt to maintain a massive mental image of state.
Frankly the kernel may well be the closest approximation, and it may well be better instrumented at this point. Never mind that building a second kernel in userspace is not a kudos point.
This problem was, however, largely non-existent back when most of the init code was shell code. It was easy to trace, easy to debug.
(Edit: obviously, "easy" if you know sh and the idioms. They are, nonetheless, far easier than stracing stuff)
It certainly had a lot of other problems (hence systemd's popularity) but debugging was definitely not one of them.
Not for end-users, no, because you're faced with a lot of future unknowns about who they are and what they were thinking of doing.
For developers, on the other hand... IMO this has similarities to the "exceptions or return-type checking" debate. Certain architectural choices will have a strong impact on whether the inevitable runtime/production failures are mysterious or not.
I recently deployed an upgrade of an "enterprise-type" jboss application, and it failed. The log file was literally 200kb of stack traces with some "INFO" entries sprinkled here and there.
I sent it to the vendor, and their response was "everything looks okay"
With systemd, a difficult bug leads to an attack on systemd itself, and questioning whether it should even exist.
I like systemd, it's a pity some vocal people aren't enthusiastic about it.
Just like with politics and the news, when I read of something being attacked, I think "who is saying this, do they have an underlying bias against this thing they are attacking"?
I guess my question would be, is this really a general issue about systemd being difficult to debug, or about some specific component subsystem of systemd? systemd is a big place.
Why would someone speak ill of (attack) something they didn't have a bias against and were neutral on?
The question is not does a bias exist, it's if a bias is illegitimate. Many people who have issues with systemd have them because of their own experience, which they document. And often the very act of documenting their issues brands them as biased. When people say systemd is a regression and list their reasons, they are labeled as not having enough experience with systemd, and if they'd only give it a try, they'd see it's better. But that's the exact thing that's occurring, someone tried to do something that used to be obtainable, and their experience with systemd has been things being difficult, obscured, badly documented, undiscoverable.
If we're going to merely ask if attackers have a bias and use that as a reason to downplay or ignore their contrarian arguments, we should be asking if proponents have a bias and also use that to downplay and ignore their supportive arguments. "Are systemd developers in favor of systemd because Red Hat signs their paychecks?" is just as much an invalid assertion of bias, assuming the developers have made meaningful supportive cases that are independent of the bank account that funds them.
Before systemd the bundled logger wrote to a file; it didn't direct people to journald, which lags like a champ when a log file is > 1 meg.
Before systemd, errors in config files were reported on specific lines.
I don't have a bias against systemd, I have a bias against its flaws.
I like its config file. I hate its documentation.
I like that it has a cron-like feature, I hate that it's pretty much impossible to debug.
_what's my bias_? I have >>5k machines to look after, I don't want to think about systemd. (and no, not all of them drain into splunk. that costs money.)
Worst of all, I hate the attitude: every problem report is treated as a deep and personal criticism.
It's not, in the same way that a bug bringing down your entire estate at 4am isn't a personal attack on me. It's just terribly annoying, and I want it fixed.
In fact, I run both distros with and without systemd and broken init scripts are a constant headache, while I've yet to encounter a single issue caused by systemd.
For example, atm I need to have 3 services. Two of these services need to run at the same time; they can run separately but then they are not useful. The third service can only run with both services online and is first run 15 minutes after boot (not earlier!) and then every hour, but the hourly run MUST NOT run before the first run after boot (i.e. the very first run may happen 15 minutes after boot; even if the hourly run would have happened in that time, it cannot).
This is not trivially achievable with cron and classical init other than resorting to complicated shell scripts in both.
With systemd I can cut down on the complexity by declaring this dependency in the service files, and the timers in systemd are sufficiently configurable to achieve what I need.
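For what it's worth, the timing requirement can be modelled in a few lines. This is only a sketch of the schedule itself (first run a fixed delay after boot, each later run a fixed interval after the previous one), not of how systemd implements timers. In timer-unit terms I believe this is roughly what `OnBootSec=15min` combined with `OnUnitActiveSec=1h` expresses, but treat that mapping as an assumption; the function name and the three-hour horizon are made up for the example.

```python
def timer_runs(first_run_after_boot_s: int, interval_s: int, horizon_s: int) -> list[int]:
    """Return the seconds-since-boot at which the job runs, up to horizon_s.

    The first run happens a fixed delay after boot; every later run happens
    a fixed interval after the previous one, so nothing can fire before the
    first post-boot run.
    """
    runs = []
    t = first_run_after_boot_s
    while t <= horizon_s:
        runs.append(t)
        t += interval_s
    return runs


# First run 15 minutes after boot, then hourly, simulated for three hours.
runs = timer_runs(15 * 60, 60 * 60, 3 * 60 * 60)
print([f"{t // 60} min" for t in runs])
```

The "both other services must be online" half of the requirement is a dependency declaration rather than a schedule, so it is not modelled here.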
Sure, systemd isn't flawless, but it's a mile better than the average init script I'd have to hack together otherwise. Journald isn't pretty either but with the right configuration you can cut down on the lag.
And before systemd problems like the above wouldn't have errors in specific lines either, some shell script would be subtly wrong and the entire charade would collapse in an instant.
You don't have to think about systemd at all. Ansible and other modern tooling can abstract managing the machines for you (atm I'm managing about 200 VMs in Ansible playbooks and it works great).
Ansible abstracts some stuff (ensure running, running as user, etc) but for more complex stuff (process type, dependency management etc) you're on your own shipping system files.
My main beef is actually three things:
1) crap error messages,
2) crap validation,
3) journalctl being a crap VIM wrapper.
Of all of them, journalctl is the most obnoxious. Instead of deferring to the system viewer it enforces their crap viewer. It truncates by default, and you can't change that without restarting the viewer or forcing it through less. (Why didn't they use less by default? Less code to maintain..)
journalctl has no pager implementation of its own, and if no pager is installed then none is used.
I give you validation and error messages, systemd is a bit sparse on that front. Would be easier if there was a way for programs and services to signal back error messages to report.
You can actually enforce wrapping by setting SYSTEMD_LESS to FRXMK (the S option that is otherwise included causes truncation; if it is omitted you get wrapping).
This can all be configured in /etc/profile, .bashrc and friends
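If you'd rather not touch shell profiles, the same setting can be applied per invocation. A minimal sketch, assuming `journalctl` is on the PATH; the helper names are hypothetical, and the only systemd-specific pieces are the `SYSTEMD_LESS` variable and the `FRXMK` value mentioned above.

```python
import os
import subprocess


def journalctl_env(less_flags: str = "FRXMK") -> dict[str, str]:
    """Copy the current environment and set SYSTEMD_LESS without the S flag."""
    env = dict(os.environ)
    # Leaving out "S" means long lines are wrapped instead of chopped.
    env["SYSTEMD_LESS"] = less_flags
    return env


def show_unit_log(unit: str) -> None:
    """Page through one unit's log with wrapping enabled (needs journalctl)."""
    subprocess.run(["journalctl", "-u", unit], env=journalctl_env(), check=True)
```

Calling `show_unit_log("systemd-timesyncd.service")` on a systemd machine should then open the pager with long lines wrapped.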
What I see is an appalling amount of arrogance that comes with the "I'll solve all your current and future problems".
Systemd actually doesn't and in fact can't do what it pretends. The evidence is in the never ending list of posts like these that pop up all the time.
And the arrogance of pretending to be the top of the init.d hierarchy of init scripts is not sustained by the competence of actually fixing the problems that people keep reporting.
There is a reason for the existence of the systemd opposition, and it is similar to the opposition you find in any field where the top of the hierarchy is abusively taken by force instead of by competence. This in fact will be the reason for systemd's downfall and I'm confident to say that it will take with it everything that sustained it in its position.
So posts hating on something are proof that something doesn't work?
The open-source way used to be "come up with an alternative and win on merit". These days, since what systemd is trying to do is complex, it seems to be "complain about it online but keep using it" which leads to a super toxic user base, which might explain why bugfixing it (since you have to interact with said users) is so hard.
I read the first blog post about systemd when it was written, and thought it was really interesting. Then I read the second blog post and thought "eh this is getting a bit complicated". And since then of course systemd has relentlessly taken over and added much more ... so of course practically any low-level system bug, randomly distributed, has a really high chance of being in systemd. And again, it does lots of stuff that was never needed before - in this case it's un-sharing a mount namespace for extra isolation of a small trusted service, which is overkill and complicated, and this struggle to figure out what the hell happened seems particularly unnecessary.
That's not correct. It has been adopted by all major distros due to the ease of use and maintainability of unit files. That is very clear from nearly all mailing list discussions.
Also, people can and do use gnome3 without systemd. (https://wiki.gentoo.org/wiki/GNOME/GNOME_Without_systemd).
Or rather, because unit files work the same across all systemd-using distros, they can actually be upstreamed to the application developers and distros just package them, which is a huge time-saver compared to having to maintain init scripts for hundreds of daemons.
The init system simply has to fire it up, take it down, and report on whether it's still running. Any further elaborations are features of the init system.
All the problems mentioned here were failures of the init system in doing things that the init system should be doing, but were hard to diagnose because this particular init system does not and never has put "easy to debug" as a high priority item.
Then there's a reasonable amount of isolation, which would be to run it under its own user, which can only write to one or two directories, and can't read user directories or non-world-readable system config files. This is already common practice. (Creating this user at package install time, or in the initial base distro config, is completely sufficient. And simpler.)
"as much as possible" is really not worth it (and sucks resources away from implementing reasonable security practices elsewhere).
But if you do believe that the identity of a speaker is more important than the arguments they make, note that the author of this blog post has a history of praising systemd (e.g. https://utcc.utoronto.ca/~cks/space/blog/linux/SystemdRight) There is no anti-systemd bias motivating this post.
I'm not condoning ad-hominem attacks. In fact I am pointing out that often it is systemd itself that is the victim of ad-hominem attacks, which is to say that often people attack systemd, not its behaviors.
It's understandable that people don't like critique, especially delivered harshly as it so often is on the internet. But that only applies to those who worked on systemd. What I don't understand is why it seems to have fanboys.
> I think "who is saying this, do they have an underlying bias against this thing they are attacking"?
Whenever I see a post that does nothing but dismiss an argument based on who made it, I do question the poster. There is a chance it is warranted, but in those cases evidence is at least presented. This post didn't even offer that.
There are so many systemd haters out there that I immediately wonder about the motives when I read a new attack on systemd. Pointing this out is not attacking the author.
To give some context as to why, for non-C++ developers systemd makes things very difficult to debug and understand in general.
Moreover the biggest issue /I/ personally have is the fact it was essentially forced on me by the mainstream distro maintainers.
When you discuss the merits and drawbacks of systemd it’s common for people to either discuss its merits only against sysvinit, or imply that issues that have affected me personally are not such a big deal and I should be quiet because: progress.
It’s a frustrating and exhausting position to be in. It’s like everyone shoving Windows registry on Linux down your throat, with all the opaque, obtuse and stifling qualities that would come with such a solution and telling you and your ilk to be quiet and suck it up.
Edit: I’m getting pounded by downvotes. But I guess offering any kind of other opinion about systemd for any reason isn’t met with kindness. However I can say this: neither Google nor Amazon use systemd to host AWS or GCP. At least consider why that is before you bury your head in the sand.
Some decisions cannot be unmade. Short of starting from scratch with Gentoo (which is a Herculean effort at scale) I’m not sure what you’re saying. I cannot be disgruntled at choices because distro maintainers’ time is worth more than mine? Fair. But how many people have to be in my shoes for it not to be fair?
Not RH-based, but have you checked out Devuan (systemd-free Debian)? I've always valued RHEL's (up until 6) stability, but I'm seeing Debian/Ubuntu becoming more and more used in enterprise roles where before these had already been strong in web roles. I'm also seeing a need for a systemd-free minimal O/S for container (not container host) roles.
Well why are you paying for a distro which doesn't meet your wishes then? Why not choose another one?
One thing I like about Gentoo that no other mainstream distro has is something like this https://packages.gentoo.org/packages/net-proxy/haproxy.
That page says that you can install any version of haproxy that suits you. You don't have to upgrade/downgrade the OS just because you need an older or newer piece of software.
But, as always, no distro is perfect. Gentoo has its issues too.
The two wretched sisters.
> C/C++ bad
> C is stupid and breaks things
> C++ has serious mental health issues
Based on this alone I would say that yes, the difference is material and relevant.
The number of people who are C++ programmers and can deal with strace vs. C programmers who can deal with strace is really that significant? I don't buy it. The two groups intersect significantly.
The point is that other people who are not C or C++ programmers have to deal with this kind of stuff from systemd.
This is just pedantry for the sake of it.
Systemd would be only marginally harder to debug if it was in C++ or heck, even Rust, or Go.
No one is opening the source code here.
So pretty much the same as any other decision distro maintainers make, if you disagree with it then.
Prefer postfix or exim over sendmail? Sure, completely replaceable.
Prefer another cron system or logging system? Prefer another /anything/, and it used to be possible to swap.
Not any longer.
Characterizing a bug report with a detailed write up and follow up solution as 'attacks' and 'bias' is inexplicable.
It's kind of fascinating to watch software become so polarizing. Like, it's technology. It doesn't inherently have feelings or bias. But we make it so.
The bug is the lack of any error handling in the DynamicUser feature of systemd. The original failure is not logged anywhere - and launching timesyncd just goes ahead even though the setup for it fails completely. Naturally timesyncd then fails to start.
(Or perhaps the bug is the whole DynamicUser feature. From reading the bug report, it looks like it will fail any time anyone has a user-only FUSE mount point on their system...)
> The machine that this happened on is an NFS client and (as is usual) its UID 0 is mapped to an unprivileged UID on our fileservers. On this machine there were some FUSE mounts in the home directories of users who have their $HOME not world readable (our default $HOME permissions are owner-only, to avoid accidents). When systemd was setting up the 'slightly modified mount name-space' it attempted to access these FUSE mounts as part of binding them into the namespace, but it failed because UID 0 had no permissions to look inside user home directories.
Basically, to get around root-only readable directories, a special mount namespace is created when Dynamic Users are used. That fails when there are mount points that systemd can't access, which was due to FUSE filesystems mounted inside owner-only home directories, combined with an NFS mount.
This is the second time in a few days that I've heard about systemd-related user / ownership problems. The other was related to when a systemd-managed socket needs to be owned by a user that is created by a daemon that relies on networking to already be up (ldap / activedirectory / whatever). systemd wants to create the socket at the same time as networking is starting, so the user doesn't exist yet -- and there isn't a simple way to tell systemd to delay creating that socket.
Instead, the post-start script of the user daemon can chown the socket, and services relying on it will need to be started after that -- and there is no generic way to say to start a service after the auth provider is up, so you have to be specific, making the dependency fragile.
Edit: However, there is an interesting problem related to that: when you use socket activation for the user-providing daemon itself, you'll get a boot deadlock. For instance, one of the common LDAP solutions, nss-pam-ldapd, uses NSS and PAM modules that use a local socket to talk to a daemon (nslcd) that handles actually speaking LDAP. If you adjust that daemon & its service file to use socket activation (a trivial patch), the system will deadlock! Systemd will start the nslcd socket, but then when systemd starts dbus.service as a special hack it will then call dbus_init() to register itself on the bus. This will cause dbus-daemon to use the NSS module and hit the nslcd socket. Since this happens very early, systemd probably hasn't started nslcd itself yet. Because systemd is single threaded, and it's waiting on dbus-daemon to reply, it won't be able to handle the request to start nslcd, and the boot process deadlocks. Eventually, something will time out, and it will hobble along in to a half-working system. You can hack around this by manually ordering nslcd.service before dbus.service.
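The workaround at the end of that comment (ordering nslcd.service before dbus.service) is just an explicit ordering constraint. Here is a minimal sketch of how such constraints resolve into a start order, using Python's standard `graphlib`. It is a toy model, not systemd's job engine; the dictionary and function name are hypothetical, and only the two unit names come from the comment.

```python
from graphlib import TopologicalSorter


def start_order(must_start_after: dict[str, set[str]]) -> list[str]:
    """Return a start order in which every unit follows its prerequisites.

    Keys are units; values are the units that have to be started first.
    graphlib raises CycleError if the constraints contradict each other.
    """
    return list(TopologicalSorter(must_start_after).static_order())


# The manual fix from the comment: dbus must not start before nslcd.
constraints = {
    "dbus.service": {"nslcd.service"},
    "nslcd.service": set(),
}
print(start_order(constraints))
```

Without the explicit edge nothing forces nslcd first, which is the situation in which the described deadlock can occur.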
When religious preferences come up, I prefer s6 because it’s scalable like daemontools/runit. Old-school upstart was meh and ancient style rc scripts rarely did process supervision.
Debugging and simplicity are prime requirements for inits.
PS: Logs aren’t files... they’re often streams of structured messages that are unfortunately usually destructured into text lines. Log rotation and logging to local files are horrible kludges that don’t scale for high-volume servers... a log-structured log device, log sourcing or a time-series structured db are more natural fits. Keeping log data in structured form instead of broken up in arbitrary line formats makes custom parsing unnecessary and log mining/transformation (ETL) much easier and more usable. “Grepping” is simply looking at a text view of desired fields, like a full-text db search. IOW, historical logs become random-access msgpack/protobufs streams while app logs are just always appending... and one string per log message for backwards compat, with a “logger”-type command for logfile-less 12-factor apps.
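To illustrate the "keep it structured, query by field" idea, here is a minimal sketch. It uses JSON lines purely as a stand-in for the msgpack/protobuf streams mentioned above, and the helper names, field names and sample records are all made up for the example.

```python
import io
import json


def write_record(stream, **fields):
    """Append one structured log record as a single JSON line."""
    stream.write(json.dumps(fields, sort_keys=True) + "\n")


def select(stream, **wanted):
    """Yield records whose fields match exactly; no regexes or line parsing."""
    for line in stream:
        record = json.loads(line)
        if all(record.get(key) == value for key, value in wanted.items()):
            yield record


# An in-memory stream stands in for whatever log transport is used.
log = io.StringIO()
write_record(log, unit="timesyncd", level="error", msg="namespace setup failed")
write_record(log, unit="sshd", level="info", msg="listening")
log.seek(0)

errors = list(select(log, level="error"))
print(errors)
```

The "grep" here is a field comparison, so it keeps working when the message text changes, which is the main argument for not flattening records into lines.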
> systemd will tell me to reproduce on the latest version, Ubuntu will ignore it as always
Amen to that. Distro bug reporting/handling is a pretty sad story on its own, and so far dealing with Ubuntu has been the worst. The most insane instance was when I reported a bug in their ati x11 driver. It was about two weeks after a new non-LTS came out, and I even tracked down the bug and found that it was already fixed upstream only 2 commits after the one they shipped. So they didn't need to do anything but apply that single patch. No response ever. 6 months later the next release just shipped newer drivers and the problem was gone, but had I not recompiled the fixed driver myself, I wouldn't have been able to switch vt for half a year without crashing X. I really like open source and all and make an effort to produce usable bug reports to help improve things for everyone, but instances like this make me wonder what the f*ck is wrong with people maintaining such projects. <end unrelated rant>
You're pretty much stuck with systemd-resolved when you use systemd-networkd.
And systemd-networkd is a godsend. It's the first network setup tool that I've found that is:
- simple to set up for the 99% use cases like "just do DHCP whenever a physical interface goes UP"
- available and works the same on all major distros
- not a Lovecraftian monstrosity like network-manager
You tried voice. Depending on your loyalty, you may think of exiting.
I don't get it. What's so bad about that?
That is a ludicrous design decision and is indicative of the developers having zero empathy towards their users.
Then came Puppet, Chef and a shitload of other stuff because ‘devops’, which make the task easier once and then fuck up your environment at every single upgrade by changing defaults, changing dependencies and changing the environment you run.
I swear, what could be done the same way repeatedly for a decade is now a frail mess that requires yearly fiddling to keep it running and current.
No wonder companies need so many devops people and so much devops time these days; they need them to keep up with the frail shit that’s dropped into stable distros.
Systemd is just one more thing that you have to reconfigure at every release because it changes defaults subtly.
So you think that systemd-timesyncd's service file shouldn't choose the uid at service-start time.
That's not really the issue though. The issue is that for service isolation reasons (related to choosing the uid at service-start time), the service file wants to run systemd-timesyncd in a private mount namespace, but it is failing to properly set up that namespace. That's a more general issue, as there are a number of things that could trigger running the service in a private mount namespace; not just DynamicUser.
As I understand the bug, the problem is in systemd's version of the `mount --make-rslave /` step, which (for a reason unknown to me) tries to stat (as root) the mountpoint first, and if the mountpoint is on a remote FS that might fail, and systemd isn't correctly handling that.
Ironically, the alternative approach here is more descriptive and less imperative package management.
Witness manifest files on the old BSD pkg system. Rather than every package maintainer writing an install/deinstall script to invoke the relevant account database tools, packages place a @newuser or @newgroup directive in the file and the packaging system handles creating and deleting the account at appropriate times.
But that's not all it does, and systemd's own dependencies, alongside other software making systemd a dependency, result in a very different observable reality.
Obviously there's so much inertia in the existing system that it could realistically never happen at this point, but one can dream, right?
Kind of like e.g. /etc/services but with UUIDs.
Some of the more interesting ideas that people have had, in my view, have been:
* ID systems that introduce hierarchies, allowing (say) a user to create multiple sub-users (one for running the WWW browser, one for running the office suite, one for running the chat program, ...);
* proper nonce ID creation with segregation guarantees (c.f. nonce SIDs in the Windows NT world); and
* IDs that are reference counted, accessible via descriptors, passable from process to process via descriptor-passing mechanisms, and explicitly supplied in system calls for opening/creating things.
My point wasn't so much that I had the perfect solution and that I'd thought of everything. It's that there are much better systems and that we should actually strive to get there.
I manage hundreds of stand-alone edge devices, mostly hardened PCs which have to bear environments of up to 50 deg Celsius from time to time; I've observed 5-10 times that RTCs were wrong by 5-10 hours in some direction after a power loss, and openntpd fails miserably, timesyncd is also not so good. ntp does ok. It's been much better since I started updating the RTC every 10 minutes from the system clock - most distros update from the RTC on startup and to the RTC on shutdown, but if the shutdown comes after 6 months due to power loss, you're left with RTC drift, which -- in a harsh environment -- can be significant.
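For anyone wanting to try the periodic RTC update described above, here is a minimal sketch. The comment doesn't say how it was actually done, so this is an assumption: it shells out to util-linux's `hwclock --systohc` (which needs root) every ten minutes. The names `sync_rtc` and `sync_loop` are made up for illustration.

```python
import subprocess
import time

# util-linux `hwclock --systohc` sets the hardware clock from the system clock.
HWCLOCK_CMD = ["hwclock", "--systohc"]
SYNC_INTERVAL_S = 600  # ten minutes, the interval quoted above


def sync_rtc(run=subprocess.run):
    """Write the system time to the RTC once; return True on success."""
    result = run(HWCLOCK_CMD, check=False)
    return result.returncode == 0


def sync_loop(iterations=None, interval_s=SYNC_INTERVAL_S,
              run=subprocess.run, sleep=time.sleep):
    """Sync the RTC repeatedly and return the number of successful writes.

    iterations=None loops forever (daemon use); a finite count is for testing.
    """
    successes = 0
    done = 0
    while iterations is None or done < iterations:
        if sync_rtc(run):
            successes += 1
        done += 1
        sleep(interval_s)
    return successes
```

Calling `sync_loop()` with no arguments runs until killed. A cron entry or a timer unit that runs the same `hwclock` command every ten minutes would probably be the more usual way to get the same effect.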
I am right now testing chrony, which seems to be a winner, but it's too early to tell.
Did `systemctl status` not report "degraded" for this failure? Maybe the problem is that there was no monitoring set up for failing services?
I'm pretty sure, from the way the OP described the problem, that both the service state _and_ the product of the service are being monitored.
Journald is pretty stupid when you think about it. Either your setup is really small, and logging to text files by default would be simpler to use. Alternatively, your setup is large enough that you have a log-host/log-analytics, and logging to text files would be a sane default, as you need to convert the binary log to text anyway to do log-shipping.
It's hard to argue that writing startup scripts hasn't improved. I would much rather write a service file than an Upstart script.
Edit: Wow, just realized that my debian laptop is running timesyncd. I run the same debian version on all my servers too but I installed ntpd on them. I guess I need to check how they're conflicting, if at all...
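On the "degraded" question: systemd summarises its overall state via `systemctl is-system-running`, which prints `degraded` when at least one unit has failed. A rough sketch of a monitoring probe built on that follows; the function names and the injectable `run` parameter are illustrative assumptions, not anything the OP described using.

```python
import subprocess


def system_state(run=subprocess.run):
    """Return systemd's overall state string, e.g. "running" or "degraded".

    `run` is injectable so the function can be exercised without systemd.
    """
    # `is-system-running` exits non-zero for any state other than "running",
    # so check=False keeps a degraded system from raising an exception.
    result = run(
        ["systemctl", "is-system-running"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


def is_degraded(run=subprocess.run):
    """True when systemd itself reports the system as degraded."""
    return system_state(run) == "degraded"
```

This only covers the service-state half. As the comment above says, the product of the service (here, the actual clock offset) has to be monitored separately.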
Mountain out of a molehill.
Lack of debuggability? It's not like this is some threaded golang program you'll lose your mind attempting to make sense of in strace. systemd is a single-threaded C program; if you consider this not debuggable, it's your failing, not systemd's.
There are valid complaints to be made about systemd, but from what I've seen the crux of the problem is generally the haste with which its features are conceived and developed. For the most part it's a relatively conservative codebase, in a simple and easy to understand language without much of a runtime interfering with relating the code to the underlying system calls performed.
This project doesn't have very many core contributors. It's somewhat incredible how much has been achieved with so little. But as a result, it's not perfect.
I have no idea if it has changed, as I gave up trying to get things fixed.
The one issue I still have with systemd is the way it handles the kernel's command line.
I work with Chromebooks a lot, using software that isn't ChromeOS. The cool thing is that they can use FIT images so you can load multiple kernels and device tree bindings into one image. So I can use the same sdcard on the Acer tegra Chromebook, Samsung's Chromebook, as well as the ASUS Flip Chromebook. However, to boot from sdcard you have to enable developer mode. When you enable developer mode, it adds a flag to the kernel command line, "debug" - systemd sees that and thinks you want systemd in debug mode (wat?). Unfortunately there is no way to override this. I've tried passing the log level as info, which sort of works, until journalctl starts, and IT parses the debug flag in the command line and ignores the flag that sets the log level to info.
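To check whether a given boot is in this situation, something like the following sketch can split the kernel command line into bare flags and `key=value` pairs. It only inspects the command line and does not model how systemd itself resolves the conflict. The parameter name `systemd.log_level` is an assumption about which option was passed, and the example command line in the comments is made up.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into (bare flags, key=value pairs)."""
    flags = set()
    values = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep:
            values[key] = value   # e.g. systemd.log_level=info
        else:
            flags.add(key)        # e.g. debug
    return flags, values


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read().strip()


def has_debug_conflict(cmdline):
    """True when a bare `debug` flag sits next to an explicit log level."""
    flags, values = parse_cmdline(cmdline)
    return "debug" in flags and "systemd.log_level" in values


# Hypothetical command line of the kind described above:
#   has_debug_conflict("console=tty1 debug systemd.log_level=info")
```

On a live machine, `has_debug_conflict(read_cmdline())` reports whether both settings are present at once.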
No amount of explaining this helped and I (and many others who have had various issues) gave up on trying to help make systemd better.
The only workaround to this is to stop using 1 sdcard amongst all of my Chromebooks and use an individual sdcard per machine, each with a u-boot built for that Chromebook that does not pass the debug flag.
I like systemd and many of its features are helpful, but when you run into issues, it's almost always an uphill battle to fix them, even when you provide a patch.
A reasonable assumption, given that systemd is taking care of the whole OS, and if you want the kernel in debug mode, you’d reasonably want the system in debug mode too. But it’s not obviously correct, either, and ultimately it boils down to “who owns the kernel command line flags”. There was some kerfuffle about exactly that between the kernel and systemd people, but they eventually worked it out – I think systemd no longer tries to assert ownership of the "debug" kernel command line parameter.
Then again, the other guy in the systemd team, Sievers, has a long history of pissing the kernel people off with out-of-the-blue unilateral decisions.
I just worry when their buddy GKH is given the reins of the kernel when Torvalds steps down...
With a script-based startup, the entire system was understandable by anyone with some rudimentary knowledge. It wasn't complicated, and it worked well. It did have its shortcomings, but the accessibility to modification and the ability to effectively debug were not amongst them. In comparison, systemd's behaviour is a black box with the unit files as input. The actual internal state is opaque, and it's inaccessible even to experts.
Sounds like time synchronization is not particularly important for you. We have alerts set up for high clock drift that trigger at abs(drift) >= 300ms.
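As a rough illustration of that alert condition, assuming the measured offset is already available in milliseconds from whatever time daemon or monitoring agent is in use (the function name is hypothetical):

```python
DRIFT_ALERT_THRESHOLD_MS = 300  # the threshold quoted above


def drift_alert(drift_ms, threshold_ms=DRIFT_ALERT_THRESHOLD_MS):
    """Return True when abs(drift) has reached the alert threshold.

    The sign of the drift is ignored: a clock running fast is treated
    the same as a clock running slow.
    """
    return abs(drift_ms) >= threshold_ms
```

An offset of -450 ms trips the alert just as +450 ms does, while 299 ms in either direction does not.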