No! These are not mutually exclusive. The goal should be to restart less ofter, but restart faster when it is necessary. Systems have to restart. Well run/maintained/designed systems may not need to restart ofter, but they do need to restart.
Consider: Many moons ago I was responsible for (among other things) the company's CRM systems. These were a couple of mid-range Suns that were extremely reliable. On an anual basis, the business operations department issues downdown cost estimates for business critical systems (these were used to develop inter-departmental SLAs), and estimated that the CRM database server cost the company about $250K per hour of down time, or, almost $4200 per minute. Or read another way: a 10 seat license for our software in a saturated (niche) market.
The point is, outages happen (no you can not account for all possible failure scenarios), and startup times matter.
My assessment on the systemd debate, though, is that it feels a bit like neophobia and the technical justifications (against systemd) are a bit on the weak side.
But, I've been meaning to write about this for a while: I think neophobia is starting to become an unnecessarily religious article of faith in the technical community. I run a small consulting shop, we support (or try to support) just about everything under the sun. What this means is that every single time there's a significant hardware change, my hardware guy has to be on top of it; every time there's a new mobile device UI change, or platform, or service or software, he has to get to know it right away; every time PHP or MySQL or Linode or any of a number of aspects of Linux changes, I have to be on top it; every time a new operating system version is released, we have to be familiar with it.
And every single one of those little ecosystems assumes that you have the time to sit down and read and digest their documentation or play with their new way of doing things. And, if you don't, or if you miss something, you're chastised by the community (or, worse, by your customers).
It is exactly like being in college and having every one of your professors assign 3 hours' work each night and then wonder why you're making a big deal of it.
To make things worse, when something goes wrong, you know as well as I do that you rely heavily on familiarity. You don't want to be looking things up in a man page trying to remember the specific incantation for a particular thing while your budget shrinks by the second. I was lucky that the smbd/nmbd fault happened on my personal system; if it had happened to one of our clients, who has everyone using a centralized smbd share, then smbd failing to start would have stopped work for the entire company.
So, yes, it's true that I am not a fan of change for change's sake. Lennart argues that that's not what's happening with systemd. Maybe he's right. But, it is still another massive change in something that I use and support on a daily basis, that is going to irretrievably consume a little bit more time out of my limited life, that is going to force me to throw out all of the old familiar tactics ("hmm ... new client, their MySQL isn't starting, dunno where MySQL logs are on this system, let's start with `grep -R mysql /var/log/∗` ... oh wait, this is a systemd system, that doesn't work").
From that standpoint, I feel like the technical justifications in favor of systemd are a bit on the weak side.
Your're going to have to clarify, as this statement doesn't make any sense.
You're right though. In this instance, system startup was already highly optimized. In fact, the order of startup was shifted around to get Oracle (the DB in this example) up and operational as early as possible.
On the other hand, with the introduction of SMF, that "need to optimise" largely went away. Sure, SMF didn't make much a difference in this particular instance, but parallelisation does make a difference when you're talking about total system (not service) startup time. Further it can make a difference on system that run multiple services. If 5 services have the same dependency set, then a parallel startup means that services 2-5 become available earlier versus a linear startup.
Finally, shaving 5 seconds off of service startup _does_ make a difference. While the above example means $350, there's a few things to consider.
1) a $350 loss is stil a loss. From a business perspective, a loss, no matter how small, is still to be avoided. There's other considerations as well, such as the impact of the extra 5 seconds on my budget (the SLAs mean that the loss comes out of my budget).
2) In the realm of big enterprise, this is a relatively small outage cost. In fact, I've managed system with an order of magnitude larger failure cost (SLAs are bitch).
3) It's difficult to factor soft costs. Things such as customer confidence, indirect productivity loss, etc. 5 seconds, again, means services are available sooner, lessening the impact of these soft, and difficult to quantify, costs.
Faster restart time is irrelevant to me. I'm already designed to deal with outages. Reducing the impact of a restart is extremely important (making them fewer in number, smaller in scope).
But even when not business critical, it is nice that you can bring it up quickly.