
Immortal – A cross-platform, OS-agnostic supervisor for Unix-like platforms - tete
https://immortal.run/
======
dsr_
This appears to be aware that it is solving the same problem as daemontools /
runit / nosh / s6... but doesn't seem to offer anything new or better.

And it presents a solved problem (some daemons fork themselves off and the
parent exits, so tell them not to do that or use fghack) as a new thing that
Immortal can solve.

Is there something actually new and interesting about this that I'm missing?

~~~
JdeBP
Here are some differences, from a cursory reading:

* There is no composable toolset, but rather one monolithic program that does only a small part of the job. envdir and setuidgid are reduced to the -e and -u options of the supervise analogue that is that monolithic program, "immortal". All of the functionality of the other daemontools tools, such as envuidgid and softlimit, let alone the likes of runit's chpst, is missing: not only are there no composable tools, the monolithic program simply lacks their functionality entirely. The composable toolsets can be extended with further tools, handling everything from scheduling classes to control groups. The monolithic program lacks all of this.

* There is no equivalent of the service restart/finalization mechanisms of runit, perp, s6, or nosh, which permit taking various actions depending on how a service program terminates.

* Nor is there an equivalent of the at-stop and at-start mechanisms of s6 or nosh.

* There is no service readiness mechanism, as s6 and systemd have.

* s6, nosh, and systemd all have the concept of "oneshot" services. immortal does not cater for them.

* The svscan analogue, named "immortaldir", lacks an equivalent to the "down" file mechanism of daemontools, runit, perp, s6, et al. _All_ services _always_ come up at bootstrap. There is no enable/disable mechanism at all.

* The author hasn't borne in mind the design lessons from svscan, the reasons _why_ it works as it does, such as only having symbolic links in the scan directory, not the actual service definitions themselves. There is a race condition between copying or saving the YAML file into its directory and immortaldir trying to open and read it, which can result in immortal reading a part-written configuration file. There is also the potential for hidden files to be recognized as services, something that svscan explicitly prohibits.

* The author hasn't borne in mind the design lessons from s6-svscan. immortaldir wakes up and polls every few seconds instead of being event-driven, which as M. Bercot notes is a consideration on systems that want to save power.

* immortal does not have logger services, as perp, s6, daemontools, daemontools-encore, freedt, nosh, and runit all do. Instead, it has only the options of the service manager itself writing the log file, or a non-supervised non-manageable logging command. It is thus not possible to control logging through service management.

* There is no equivalent to multilog's rotation scripts, filtering, multiple log directories, or automatic timestamping.

* The log rotation uses the old .1 .2 system. The author hasn't borne in mind the design lessons from multilog, the reasons _why_ it did the things that it did. So it doesn't ensure that logs are safely written to disc at rotation time, for starters.

* The service management API is not compatible with daemontools, daemontools-encore, runit, and nosh, which all share a common binary-compatible subset (the daemontools service management API). This API is, rather, a WWW server. Where daemontools et al. send simple one-byte commands, this API uses JSON and the Internet Message format, meaning that a simple command to send a signal is actually an entire HTTP/TCP transaction, with all of that parsing and marshalling going on under the covers. This flies in the face of the "do not parse at runtime" dictum followed by the other toolsets. There is an entire HTTP parser and server running with superuser credentials inside the service manager here. Ironically, this is a criticism that people vehemently (but erroneously) level at systemd. _This_ service manager _actually does_ do this, though.

* It also contains an entire YAML parser.

* There is no service interdependency mechanism at all. There aren't even the tools for the old "thundering herd" approach of starting all services and having them manually check their dependencies in their "run" scripts. There is nothing there akin to s6-rc or to the service bundle dependency mechanism in nosh.

* There's no mention of portability to the Hurd, Illumos, OpenIndiana, the Windows NT POSIX subsystem, the Windows NT Linux subsystem, and suchlike. Several of the other systems address the fact that "portable" to any "UNIX-like platform" means more than merely FreeBSD, macOS, OpenBSD, and Linux. s6 has notes on how to build it on Solaris, for example.

* The new and interesting thing is not the fact that it focusses on the solved problems of fghack and PID files, which ironically have been going away over the past decade. The new and interesting thing is spinning up service and supervise directories on demand from a YAML definition. nosh has the ability to spin up a service directory from a systemd service/socket unit file definition; but the other toolsets leave actually constructing the service and supervise directories from scratch to third parties. That said, such directories are simple enough to construct by hand directly, on the same order of simplicity as the YAML definitions here, in fact. It is notable that the YAML definitions again seem to have features weighted towards PID files and suchlike which are problems that are in the process of disappearing.

* One of the fundamental design concepts of the daemontools family is that the service manager knows the process ID of the daemon because it remembers it from when it spawned the daemon, and knows that the ID does not go away until wait(). immortal instead employs the old, broken idea that this design replaced. The code reads process IDs from PID files and tests whether they are valid by sending them signal number 0, in the race-prone and risky way that the world has known not to do for decades. Witness [https://github.com/immortal/immortal/blob/9ed9087a8ef1e64d64...](https://github.com/immortal/immortal/blob/9ed9087a8ef1e64d64e6b86bbaae9729413e3485/daemon.go#L78) and [https://github.com/immortal/immortal/blob/9ed9087a8ef1e64d64...](https://github.com/immortal/immortal/blob/9ed9087a8ef1e64d64e6b86bbaae9729413e3485/daemon.go#L85) .

There is a lot of prior art here that this design is just ignoring. Overall,
it is a retrograde step, to a system that is less capable than runit or
daemontools, less robust, and less well designed. A better approach
is, if one wants to drive services from a WWW browser and to create service
definitions from a configuration file, to layer a WWW server and a service
definition creation tool over something like s6, which is more complete and
more widely portable than immortal.

I also recommend that the author read _at minimum_ Daniel J. Bernstein's
design doco for daemontools and Laurent Bercot's design doco for s6. M. Bercot
goes into some depth when it comes to the reasons _why_ things are engineered
as they are. Additionally, the first chapter in the _nosh Guide_ enumerates
some of the design fundamentals that underpin it, which daemontools-family
systems generally share.

~~~
skarnet
I agree with Jonathan on everything here, with the additional caveat that the
Go runtime is particularly heavy, and thus not really suitable for low-level
system software (which should use resources sparingly, leaving them for
production applications). So I don't think Go is a suitable language choice
for a supervisor in the first place; and apart from the language choice, all
of Jonathan's criticisms apply.

I strongly recommend the author of immortal to join the supervision mailing-
list on skarnet.org, follow the discussions there and ask questions if needed.
We have about 19 years of experience with process supervision, there has been
a lot of prior art and successive refinements, and we hold extensive knowledge
about what constitutes good practice. If someone aims to design a process
supervisor, even if it's a completely new take with a specific spin, it would
be a mistake not to tap into our pool of experience. We always want users to
benefit from good, rigorous software.

~~~
nbari
Hi, so assuming immortal had been created using the best-in-class programming
language for its purpose, and let's say it followed all the existing best
practices, I still have one question that has never really found a proper
answer, which is:

For cases where it is required to monitor services like unicorn/gunicorn/nginx
that daemonize themselves and have the capability to fork, how do you make the
supervisor aware of those changes so that it continues monitoring the new
process without entering into a race condition? To be clearer, here is an
example of this problem
[https://asciinema.org/a/80371](https://asciinema.org/a/80371), and
workarounds here
[http://serverfault.com/a/587065/94862](http://serverfault.com/a/587065/94862).

Immortal and systemd seem to fix this problem, but based on JdeBP's comments,
they are employing the "old broken idea". Witness
[https://github.com/immortal/immortal/blob/9ed9087a8ef1e64d64...](https://github.com/immortal/immortal/blob/9ed9087a8ef1e64d64e6b86bbaae9729413e3485/daemon.go#L85)

So I am wondering: what is the best approach or solution for these cases?

Or could it be that Immortal is not that wrong after all? It is solving a
problem; broken or not, it is working and could certainly be improved:

[https://asciinema.org/a/80360](https://asciinema.org/a/80360)

As a reference, from the systemd.service man page for the relevant options:

Type= If set to forking, it is expected that the process configured with
ExecStart= will call fork() as part of its start-up. The parent process is
expected to exit when start-up is complete and all communication channels are
set up. The child continues to run as the main daemon process. This is the
behavior of traditional UNIX daemons. If this setting is used, it is
recommended to also use the PIDFile= option, so that systemd can identify the
main process of the daemon. systemd will proceed with starting follow-up units
as soon as the parent process exits.

PIDFile= Takes an absolute file name pointing to the PID file of this daemon.
Use of this option is recommended for services where Type= is set to forking.
systemd will read the PID of the main process of the daemon after start-up of
the service. systemd will not write to the file configured here, although it
will remove the file after the service has shut down if it still exists.
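
Put together, a unit file for such a traditional forking daemon combines those two options like this (a hypothetical service; the binary name, flag, and paths are illustrative, not taken from any real package):

```ini
[Service]
Type=forking
ExecStart=/usr/sbin/mydaemon --daemonize
PIDFile=/run/mydaemon.pid
```

systemd waits for the ExecStart process to exit, then reads PIDFile to learn which surviving child is the main process, which is exactly the PID-file dependence being criticized upthread.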

~~~
skarnet
There is no good solution for servers that, mistakenly, background themselves;
the practice of backgrounding oneself comes from a time when supervision was
not as widely known as it is today, and nowadays there is _no excuse_ for a
daemon to fail to at least provide an option to avoid backgrounding itself.

The best thing to do is to contact the daemon's authors and insist they change
that. In nginx's case, they think it's ok if they perform their own
supervision, but it would be nice if they also could integrate with existing
supervision systems.
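
For the specific daemons mentioned upthread, such options do in fact exist: gunicorn only daemonizes when passed -D, and nginx can be kept in the foreground with a single configuration line:

```nginx
# in nginx.conf: do not fork into the background, so the
# supervisor remains the parent of the master process
daemon off;
```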

In the meantime, as a workaround, the accepted practice for supervising
forking daemons is known as "fghack": maintaining a process keeping track of
the forking daemons via a few pipes that close when the daemons die. s6, for
instance, provides such a tool:
[https://skarnet.org/software/s6/s6-fghack.html](https://skarnet.org/software/s6/s6-fghack.html)
. AFAIK, so does nosh.

------
zie
This is pretty much a non-problem for us anymore. We use nomad
(www.nomadproject.io) as a distributed process manager (and scheduler). It
does lots of other cool things as well, but it also happens to handle this
use case, as a sort of side effect of what it does.

------
michaelmior
Curious why I would use this over supervisord. It's a shame there's no
comparison.

~~~
interfixus
If for no other reason, because Python.

It's an aesthetic thing mainly, and a bit silly I'm sure, but dammit, I find
it unsettling to have my Python 3 stuff running, and then the legacy Python 2,
simply for the sake of supervisord.

Likewise, I don't need any Python on my Go-server, please.

~~~
IgorPartola
That is sort of a misguided attitude. If you run Ubuntu, you run plenty of
systems stuff on Python. Also Perl and bash and C. If the piece of software
works and doesn't require you to do extra shit to set it up, who cares what
it's written in?

I can see caring about performance. A Python process will be more memory
hungry than something hand-coded in C. But for the rest of it, who gives a
crap if it is packaged well? And if it's not, it's probably not worth using
at all.

~~~
interfixus
I give a crap. Also, I don't run Ubuntu.

As I said, it's a question of aesthetics, and probably quite irrational,
possibly OCD, but I do like my servers tight.

------
norswap
What would have made this a smash hit: Windows support. There is still no
reasonable truly cross-platform way to deploy daemons.

~~~
zie
nomad
([https://www.nomadproject.io/downloads.html](https://www.nomadproject.io/downloads.html))
supports Windows.

~~~
norswap
"A Distributed, Highly Available, Datacenter-Aware Scheduler" is a bit more
than what I want. I just want a cross-platform way to install/remove (and
query the status of) a daemon process (i.e. run your process as a Windows
service, launchd service, or linux equivalent: systemd, ...).

~~~
zie
I agree, it's a little overkill for your particular use case, but it works
well, and it can be run in -dev mode, so you don't have to use the whole
distributed, HA and datacenter part(s).

I don't have much experience with it on windows, but I use it regularly on
*nix platforms and it works wonderfully.

------
rektide
I really like that it has a user-mode incarnation that looks in ~/.immortal.
One of the things I really enjoy about Master Control Processes is having a
well-declared place where one can expect to find a manifest of processes.

------
aerioux
Is there more comprehensive documentation anywhere? I want to know what
options to pass, and more detailed descriptions of what things do. The
initial docs page looks promising, but after clicking through, it seems like
the pages all link to each other... so in summary I can't find anything I
want :/

That is currently the primary barrier to usage -- the docs aren't very
helpful.

~~~
nbari
Maybe this can help,
[https://immortal.run/post/ansible/](https://immortal.run/post/ansible/)

------
agnivade
Ah, nice. Something fresh and written in Go. I was wary that I would have to
learn systemctl for Ubuntu 16.10 after having learned upstart.

I want to use Immortal instead of systemctl. Just a question - is this mature
enough to be used in production systems?

~~~
nbari
To avoid issues, try to compile from source on your target platform; this is
due to cgo:
[https://golang.org/cmd/cgo/](https://golang.org/cmd/cgo/)

The readme file has some instructions on how to do it:
[https://github.com/immortal/immortal/](https://github.com/immortal/immortal/)

Regarding whether it is mature enough: strictly speaking it is not yet a
stable version, the current one being `0.11.0` rather than a `1.x.x`, but the
more you can test and evaluate it, the better.

Please feel free to raise any issues if needed here:
[https://github.com/immortal/immortal/issues](https://github.com/immortal/immortal/issues)

~~~
agnivade
Thanks !

------
vor1968
These supervisor projects are always a chuckle.

Is anyone ignorant enough to believe that any monitor can survive general
process failure or contribute to recovery of a system with abnormal function
or resource starvation?

And that is the basic reason for these monitors! Garbage. My last experience
was with bluepill, which was a mess of Ruby nonsense. Anyone with any
experience knows you can't trust a system to correct its own failure. Why are
we hammering this nail over and over to catch a 30% recoverable
transient...management.

~~~
skarnet
You need to read the archives of the supervision mailing-list at
list.skarnet.org, as well as
[http://skarnet.org/software/s6/overview.html](http://skarnet.org/software/s6/overview.html)
\- there is a lot of value in process supervision, it's just that most people
who whip up their own solution end up doing it wrong. Like crypto, in a less
dramatically noticeable way.

