3. I don't understand the desire to trim seconds off of boot-up time (even assuming that systemd does this; it didn't for me). The goal should be to restart less often, not to restart more quickly.
5. The systemd documentation is indeed very good, and that's probably one of the biggest drivers behind its adoption. However, systemd itself is also difficult. A big part of the pushback from people over systemd is that it also replaced syslog, and did so with its own custom binary log format. To quote from a forum thread I started shortly after updating my old system to systemd: "Getting smbd/nmbd to work again was a real adventure. Like other users reported, it would just silently fail when starting it from systemd. No error message when issuing the start command, and only a vague 'failed' in status. I ended up having to track down Lennart's blog post on 'systemd for administrators' to figure out how in blazes to extract anything useful from that cussed binary log system he invented. My first half-dozen or so attempts to get anything useful out of the log journal got exactly zero results; I finally got lucky on another approach..." (I ended up abandoning that distribution altogether after that and a number of other frustrations, and the response from the forums.)
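For anyone stuck in the same spot: what eventually worked for me boiled down to asking journalctl for a single unit's messages. A minimal sketch; the unit name is illustrative (it's smb.service on some distros, smbd.service on others), and the fallback branch is just so this degrades gracefully on systems without systemd:

```shell
#!/bin/sh
# Pull one service's messages out of the binary journal.
# The unit name is illustrative; adjust for your distro.
show_unit_log() {
    if command -v journalctl >/dev/null 2>&1; then
        # -u limits output to one unit, -b to the current boot
        journalctl -u "$1" -b --no-pager
    else
        echo "journalctl not available on this system"
    fi
}

show_unit_log smbd.service
```

None of which helps, of course, if you don't already know these flags exist when the service is down.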
13. The problem for BSDs isn't so much that systemd is or isn't portable to them; it's that some upstream software is beginning to require systemd, making that software difficult (or impossible) to port to BSD.
14. It seems weird to me to hear someone else decide for other people what is or isn't a "negligible" amount of work.
15. So ... systemd is in fact Linux-only by design. How does that jibe with 13 again?
19. systemd may not "force" you to do anything, up until your distribution adopts it, pushes it as an update, and then you find yourself spending hours trying to figure out how to troubleshoot a problem that didn't exist before the update. Then it certainly is forcing you to do something.
Here's the problem in a nutshell as I see it: if systemd had been the default in Linux for the last ten years, the toolchain around it would probably be mature enough to meet everyone's needs, we would all be accustomed to the specific commands needed to control, interact with, and debug systemd stuff, and if someone came along and proposed replacing everything with a syslog daemon and a pile of init scripts, there'd be rage and outcry. That is to say, I don't see anything inherently bad about systemd.
But, what is enormously frustrating is to have something that works, and be well adapted to it, so that if something breaks I know exactly where to look, and then have all of that be replaced by a foreign system that breaks old things in new ways and requires hours spent trying to figure out what the hell happened.
If the replacement system offers serious benefits over the old system, that offsets the pain slightly. In this case, I've yet to see what the actual benefits are; I have no idea what problems systemd is attempting to solve which are so severe, so immediate, so intractable that they require a jarring change to some of the fundamental parts of the operating system.
Regarding your last point, the main thing that systemd excels at is managing the mishmash of power events, other ACPI events, service management and integration with DBus (which everything seems to require these days).
However, I run Linux in three locations and here's where systemd fits for me:
1. Servers. No use whatsoever. There are two power states: off and on. None of this is really an improvement over SVR4 init. It's just another damn tool to learn.
2. Desktops: No use whatsoever. I run mine like servers simply because if I need remote access, trusting stuff like WOL and ACPI is just silly on Linux.
3. Laptops: No use whatsoever. This is simply down to the fact that the power management on Linux i.e. hibernate/sleep support is a bag of shit. I run mine as power states off and on, much as 1 and 2.
Regarding boot speed - my laptop boots in about 14 seconds (SSD). Who cares about making it faster?
I find that a shell script is far more useful i.e. it doesn't enforce constraints on you which have to then be added as features to systemd. Plus shell scripts are generic tools i.e. learning how to write them will be of more use globally than learning systemd guts.
So basically, screw it.
The problem that this outlines is that Linux's power management, event and ACPI state management is crappy. We don't need another layer of crap over the top of it to make it work properly.
Regarding (1), I'll echo everything meaty said, and add that basic shell scripting is a valuable and generally useful tool to have anyway, so I'm skeptical of the benefits of replacing them with another, slightly different configuration file format.
For example: we use backuppc for automated network and remote server backups. The backuppc data volume is a TrueCrypt-encrypted drive. Creating an init script that checked to make sure that the volume was present and mounted before launching the backup daemon was fairly straightforward. I understand that I could still do this in systemd, but the catch is that I would have to do it with a shell script -- the same way I am now, except for a different, alien interface -- because the native system doesn't have support for things like that. (This is assuming that some combination of unit config parameters couldn't do it; the commands for finding the drive and checking its status were a little fiddly, and I honestly haven't tried to do this the systemd way. Still though: once you understand shell scripting, you can do anything on Linux.)
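For what it's worth, the check itself is small either way; the sticking point is only where it lives. A rough sketch of the idea (the paths and the daemon invocation are hypothetical placeholders, not our actual setup), which could be called from an init script or from a systemd ExecStartPre= just the same:

```shell
#!/bin/sh
# Refuse to start the backup daemon unless its data volume is mounted.
# Paths and the daemon command below are illustrative placeholders.

volume_mounted() {
    # True only if $1 is itself a mount point: df -P reports the mount
    # point of the filesystem containing $1 in the last column.
    [ "$(df -P "$1" 2>/dev/null | awk 'NR==2 { print $NF }')" = "$1" ]
}

start_backuppc() {
    if volume_mounted "$1"; then
        echo "volume mounted, starting daemon"
        # exec /usr/share/backuppc/bin/BackupPC -d   # real start would go here
    else
        echo "volume $1 not mounted, refusing to start" >&2
        return 1
    fi
}

# The data-volume path is illustrative and likely not mounted on your box:
start_backuppc /var/lib/backuppc || echo "daemon not started"
```

(The df trick is POSIX and portable, though it falls over on mount points containing spaces; `mountpoint -q` does the same job on Linux.)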
1. Not really. Most sysadmins have a decent template init file sitting around or you just steal another one on the machine and modify it. They are also more flexible as some startup processes aren't quite as simple as systemd thinks (consider lockfiles, temp data purging, permissions etc).
2. ulimit / selinux - per process. cgroups, whilst funky-looking, are YET ANOTHER disposable mechanism which will no doubt get canned in 5 years like ipchains, ipfwadm, etc.
3. service --status-all
4. Concurrent startup yes. That really doesn't make much difference on a server with large IO and CPU capacity.
I'm not a Linux admin anymore, but when I was, I used various service status software thingies only to find out whether the system thinks the service is running, because they tend to be wrong, and just annoy you (not starting a crashed server because it "is already running" or something similar). I think Debian doesn't even track service states, but I'm not sure.
> The goal should be to restart less often, not to restart more quickly.
No! These are not mutually exclusive. The goal should be to restart less often, but restart faster when it is necessary. Systems have to restart. Well run/maintained/designed systems may not need to restart often, but they do need to restart.
Consider: Many moons ago I was responsible for (among other things) the company's CRM systems. These were a couple of mid-range Suns that were extremely reliable. On an annual basis, the business operations department issued downtime cost estimates for business-critical systems (these were used to develop inter-departmental SLAs), and estimated that the CRM database server cost the company about $250K per hour of downtime, or almost $4200 per minute. Or read another way: a 10-seat license for our software in a saturated (niche) market.
The point is, outages happen (no you can not account for all possible failure scenarios), and startup times matter.
While startup time is critical, how much time is being lost by imposing complexity on the humans? At least in my opinion, that bit about binary logs makes the entire thing a no-go. The absolute last thing I want in any sort of time-critical situation is to complicate the process of reading logs.
Very true, and, to be honest, I'm not advocating for systemd. Though according to the original article this isn't quite the issue that it's claimed to be (I'm not actually familiar enough with systemd to be able to have an opinion on it), see point 20.
My assessment on the systemd debate, though, is that it feels a bit like neophobia and the technical justifications (against systemd) are a bit on the weak side.
You're right, it is mostly neophobia. I sort of tried to say that in my top comment; I don't think that systemd is technically inferior, I just can't yet justify switching to it.
But, I've been meaning to write about this for a while: I think neophobia is starting to become an unnecessarily religious article of faith in the technical community. I run a small consulting shop; we support (or try to support) just about everything under the sun. What this means is that every single time there's a significant hardware change, my hardware guy has to be on top of it; every time there's a new mobile device UI change, or platform, or service or software, he has to get to know it right away; every time PHP or MySQL or Linode or any of a number of aspects of Linux changes, I have to be on top of it; every time a new operating system version is released, we have to be familiar with it.
And every single one of those little ecosystems assumes that you have the time to sit down and read and digest their documentation or play with their new way of doing things. And, if you don't, or if you miss something, you're chastised by the community (or, worse, by your customers).
It is exactly like being in college and having every one of your professors assign 3 hours' work each night and then wonder why you're making a big deal of it.
To make things worse, when something goes wrong, you know as well as I do that you rely heavily on familiarity. You don't want to be looking things up in a man page trying to remember the specific incantation for a particular thing while your budget shrinks by the second. I was lucky that the smbd/nmbd fault happened on my personal system; if it had happened to one of our clients, who has everyone using a centralized smbd share, then smbd failing to start would have stopped work for the entire company.
So, yes, it's true that I am not a fan of change for change's sake. Lennart argues that that's not what's happening with systemd. Maybe he's right. But, it is still another massive change in something that I use and support on a daily basis, that is going to irretrievably consume a little bit more time out of my limited life, that is going to force me to throw out all of the old familiar tactics ("hmm ... new client, their MySQL isn't starting, dunno where MySQL logs are on this system, let's start with `grep -R mysql /var/log/*` ... oh wait, this is a systemd system, that doesn't work").
From that standpoint, I feel like the technical justifications in favor of systemd are a bit on the weak side.
Systemd is actually way better at logging why a service did not start. With sysvinit I often had to figure out the exact command the shell script would start the daemon with to see the stderr output. Systemd just logs this by itself and you can see this output in e.g. systemctl. This saves loads of time.
Sure, but given that developer time is a bounded resource, we can't really expect to get both fewer restarts and faster restarts. I sympathize with your example, but I would expect that you had already seriously optimized your startup process, and since systemd doesn't make services start more quickly (it only does a better job of parallelizing them), I doubt that it would have made a huge difference in your case. Assuming even a very generous 5-second startup difference, that would have only made a $350 difference per incident.
> but given that developer time is a bounded resource, we can't really expect to get both fewer restarts and faster restarts.
You're going to have to clarify, as this statement doesn't make any sense.
You're right though. In this instance, system startup was already highly optimized. In fact, the order of startup was shifted around to get Oracle (the DB in this example) up and operational as early as possible.
On the other hand, with the introduction of SMF, that "need to optimise" largely went away. Sure, SMF didn't make much of a difference in this particular instance, but parallelisation does make a difference when you're talking about total system (not service) startup time. Further, it can make a difference on systems that run multiple services. If 5 services have the same dependency set, then a parallel startup means that services 2-5 become available earlier versus a linear startup.
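The arithmetic behind that claim is easy to sketch. With made-up start times for five independent services sharing one dependency set, a linear startup pays the sum while a parallel startup pays only the slowest:

```shell
#!/bin/sh
# Hypothetical per-service start times, in seconds.
linear=0
parallel=0
for t in 4 2 6 3 5; do
    linear=$((linear + t))              # one after another: total is the sum
    if [ "$t" -gt "$parallel" ]; then   # all at once: total is the maximum
        parallel=$t
    fi
done
echo "linear startup:   ${linear}s"     # prints 20s
echo "parallel startup: ${parallel}s"   # prints 6s
```

With these (invented) numbers, services 2-5 are all available within 6 seconds instead of queueing up behind one another for as long as 20.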
Finally, shaving 5 seconds off of service startup _does_ make a difference. While the above example means $350, there's a few things to consider.
1) A $350 loss is still a loss. From a business perspective, a loss, no matter how small, is still to be avoided. There are other considerations as well, such as the impact of the extra 5 seconds on my budget (the SLAs mean that the loss comes out of my budget).
2) In the realm of big enterprise, this is a relatively small outage cost. In fact, I've managed systems with an order of magnitude larger failure cost (SLAs are a bitch).
3) It's difficult to factor in soft costs: things such as customer confidence, indirect productivity loss, etc. 5 seconds, again, means services are available sooner, lessening the impact of these soft, and difficult to quantify, costs.
I've always architected systems such that restart time didn't matter - the systems were able to fail over. However, in-flight operations would fail, which meant that a restart would, in fact, cost >$50k - regardless of the length of time it took for that machine to come back.
Faster restart time is irrelevant to me. I'm already designed to deal with outages. Reducing the impact of a restart is extremely important (making them fewer in number, smaller in scope).
This usually depends on how business critical it is. If some service is not very business critical, then how do you justify (or get budget for) the extra resources you need to make it way more reliable?
But even when not business critical, it is nice that you can bring it up quickly.
As someone who's administered Solaris machines rather extensively, I find SMF (the init system on Solaris which, in part, inspired systemd) to be much easier to manage than sysvinit. Specifically, it makes it easier to add new services than either writing new shell scripts or adapting old ones. It makes it easier to make services which depend on one another. And it makes it easier to know which services are running and what state they are in.
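For readers who haven't touched Solaris: the day-to-day SMF tools are `svcs` (list and inspect) and `svcadm` (enable/disable/restart). A guarded sketch of that last point, since these commands exist only on Solaris/illumos:

```shell
#!/bin/sh
# Show which services are running and what state they're in, the SMF way.
# Falls back gracefully on non-Solaris systems.
smf_overview() {
    if command -v svcs >/dev/null 2>&1; then
        svcs -a    # every service instance and its state (online, maintenance, ...)
        svcs -x    # explain any service that is not running, and why
    else
        echo "SMF tools not available on this system"
    fi
}

smf_overview
```

`svcs -x` in particular has no real sysvinit equivalent: it walks the dependency graph and tells you which unsatisfied dependency is holding a service offline.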
I see other developers/sysadmins here saying that sysadmins usually have a template initscript which they can use to author new ones, but that's really still a lot of work if you want to do anything not covered by the template and it leads to duplicate code. (It's literal cut-n-paste programming.) Initscripts expose a lot of implementation details. Even worse, they're different for every Linux distro so the initscript for your application has to be different for every distro you want to be on.
I think the most telling thing here is that the distro maintainers, who are the ones who write the bulk of the initscripts for Linux systems, are the ones pushing for adoption. It's making their lives easier.
Regarding your response to 3: I agree that system stability is a very important goal. However, there are also plenty of use cases for faster boot times, so it shouldn't be disregarded as a valid pursuit in system engineering.