
Signalfd is useless - tptacek
https://ldpreload.com/blog/signalfd-is-useless?reposted-on-request
======
kentonv
Another weirdness about signalfd: read() from a signalfd returns signals for
the calling process, regardless of what process created the signalfd (e.g. it
could have been inherited through fork()). That's arguably usually what you
want, but is inconsistent with usual file descriptor semantics, which say that
it doesn't matter who read()s.

One place where the inconsistency gets weird is when you use signalfd with
epoll. The epoll will flag events on the signalfd based on the process where
the signalfd was registered with epoll, not the process where the epoll is
being used. One case where this can be surprising is if you set up a signalfd
and an epoll and then fork() for the purpose of daemonizing -- now you will
find that your epoll mysteriously doesn't deliver any events for the signalfd
despite the signalfd otherwise appearing to function as expected. That took me
a day or two to debug. :(

With all that said, at the end of the day I disagree with Geoff. I would
rather use signalfd than signal handlers. The "self-pipe trick" is ugly,
involves a lot of unnecessary overhead, and runs the risk of deadlocking if
you receive enough signals to fill the pipe buffer before you read them back
(which can be solved with additional synchronization, but ick). In fact, in my
own code, on systems that don't have signalfd or any similar mechanism, I tend
to block signals except when I'm about to call poll(), and then siglongjmp()
out of the signal handler to avoid the usual race condition. (See pselect(2)
for discussion of said race condition.)

I think it's just a fact of life that you need to clear your signal mask
between fork() and exec(), and yeah no one does this, whoops.
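
A minimal sketch of that fork()/exec() step, assuming a POSIX system. The helper name `spawn_with_clean_mask` is made up for illustration and does not come from the thread or the linked post; it only shows where the mask reset would go.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): fork, give the child an
 * empty signal mask, then exec argv[0]. Returns the child's PID in the
 * parent, or -1 if fork() fails. */
pid_t spawn_with_clean_mask(char *const argv[])
{
    pid_t pid = fork();
    if (pid != 0)
        return pid;                 /* parent, or fork() failure */

    /* Child: the blocked-signal mask survives both fork() and exec(),
     * so unblock everything before handing control to the new program. */
    sigset_t empty;
    sigemptyset(&empty);
    sigprocmask(SIG_SETMASK, &empty, NULL);

    execvp(argv[0], argv);
    _exit(127);                     /* only reached if exec failed */
}
```

The parent would still collect the child with waitpid(); the sketch only covers the mask reset.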

BTW, for the specific problem of dealing with child processes, I really hope
Linux adopts the Capsicum interface as FreeBSD has:

[https://www.freebsd.org/cgi/man.cgi?query=pdfork&sektion=2](https://www.freebsd.org/cgi/man.cgi?query=pdfork&sektion=2)

Until then, you simply can't expect to reap children via signals. You use the
signal to let you know that it's time to call wait().

~~~
geofft
Does running the self-pipe trick on a separate thread solve that issue? It
seems like it's basically equivalent to signalfd (neither worse nor better,
unless you're worried about platform-specific thread bugs): you end up with a
signal mask on your main thread, but you also avoid EINTR on your main thread.
Any possible pipe lockup just happens on the signal-handling thread, so the
mainloop can keep running and eventually dequeue signals.

~~~
nunwuo
> Does running the self-pipe trick on a separate thread solve that issue? It
> seems like it's basically equivalent to signalfd (neither worse nor better,
> unless you're worried about platform-specific thread bugs): you end up with
> a signal mask on your main thread, but you also avoid EINTR on your main
> thread. Any possible pipe lockup just happens on the signal-handling thread,
> so the mainloop can keep running and eventually dequeue signals.

There's no need for threads. Set the pipe to non-blocking and ignore the
write() error if it's EAGAIN/EWOULDBLOCK. See my response above for why
dropping writes if a byte already exists in the pipe is okay.
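
A rough sketch of the non-blocking variant described here, assuming POSIX. The names `self_pipe_install`, `self_pipe_drain` and `wake_fds` are invented for illustration. The handler writes one byte and ignores a failed write, on the reasoning that a full pipe already guarantees the reader will wake up.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static int wake_fds[2] = { -1, -1 };    /* [0] = read end, [1] = write end */

/* Signal handler: queue one byte. If the pipe is full the write fails
 * with EAGAIN, which is ignored because a wakeup is already queued. */
static void wake_handler(int signo)
{
    int saved_errno = errno;
    unsigned char byte = (unsigned char)signo;
    ssize_t ignored = write(wake_fds[1], &byte, 1);
    (void)ignored;
    errno = saved_errno;
}

/* Hypothetical setup helper: create the pipe, make both ends
 * non-blocking, and install the handler for signo. Returns the read end
 * (to be added to poll/select/epoll), or -1 on error. */
int self_pipe_install(int signo)
{
    if (pipe(wake_fds) == -1)
        return -1;
    for (int i = 0; i < 2; i++) {
        int flags = fcntl(wake_fds[i], F_GETFL);
        if (flags == -1 ||
            fcntl(wake_fds[i], F_SETFL, flags | O_NONBLOCK) == -1)
            return -1;
    }

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = wake_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(signo, &sa, NULL) == -1)
        return -1;
    return wake_fds[0];
}

/* Empty the pipe after the event loop reports it readable.
 * Returns the number of bytes that were queued. */
int self_pipe_drain(void)
{
    unsigned char buf[64];
    int total = 0;
    ssize_t n;
    while ((n = read(wake_fds[0], buf, sizeof buf)) > 0)
        total += (int)n;
    return total;
}
```

The byte count is only a wakeup hint. After draining, the main loop should check every condition the signal might stand for rather than assume one byte per event.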

~~~
geofft
Sure, but that doesn't solve the EINTR problem. If you accept signal-handler
interruptions on threads where you actually do work, then you risk
interrupting system calls on those threads, and even SA_RESTART isn't
guaranteed to work all of the time. That's what a separate thread (or
signalfd) wins you.

(But yes, setting it non-blocking is correct.)

------
caf
If you're going to create a dedicated signal-handling thread as the author
recommends (which _is_ one of the best ways to handle signals in a pthreads
application), you don't need to use signal handlers at all; you should just
mask the signal(s) and have the signal-handling thread loop around
sigwaitinfo().

To his broader point, the mistake is to assume you will be able to get one
signal delivered per signal raised. That's just not how (classic) UNIX signals
work (POSIX realtime signals are different, and _are_ queued) - they
fundamentally need to be treated as level-triggered, not edge-triggered. For
the SIGCHLD example, when a SIGCHLD is received (no matter whether through
signal handler, self-pipe trick, signalfd() or sigwaitinfo()) you need to loop
around waitpid() with the WNOHANG flag until it stops returning child PID
statuses.
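
The two suggestions above can be sketched together, assuming POSIX and that SIGCHLD is already blocked in the calling thread. The function names are illustrative only. The reap loop treats SIGCHLD as "some children may have exited", not as a count.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reap every child that has already exited, without blocking.
 * Returns how many children were reaped. */
int reap_children(void)
{
    int reaped = 0;
    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, WNOHANG);
        if (pid > 0) {
            reaped++;               /* status could be inspected here */
            continue;
        }
        if (pid == -1 && errno == EINTR)
            continue;
        break;  /* 0: children remain but none exited; -1: no children */
    }
    return reaped;
}

/* Body of a signal-handling loop: sleep in sigwaitinfo() until SIGCHLD
 * is pending, then reap. Assumes the caller has blocked SIGCHLD.
 * Returns the number reaped (possibly 0), or -1 on error. */
int wait_for_sigchld_and_reap(void)
{
    sigset_t set;
    siginfo_t info;
    sigemptyset(&set);
    sigaddset(&set, SIGCHLD);
    while (sigwaitinfo(&set, &info) == -1) {
        if (errno != EINTR)
            return -1;
    }
    /* info describes only one child; others may have coalesced into
     * the same signal, so ignore it and ask waitpid() instead. */
    return reap_children();
}
```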

------
ajross
Money quote seems to be:

> So you have to be very careful to reset any masked signals before starting a
> child process

I don't see what this has to do with signalfd. That statement is true
generically. Non-default Unix signal handling and subprocess management have
_never_ cooperated cleanly. The point of signalfd is to provide a simpler
mechanism to integrate signals (which are a legacy API in almost all cases)
with existing synchronization and event handling architectures, not to
magically make them not suck.

~~~
geofft
Yup. A lot of people, myself included, were under the impression that signalfd
_takes over_ from the normal signal-handling pathway: it doesn't.

It's particularly bad because the only way to get notified on a child exiting,
in an event-handling architecture, is to wait for SIGCHLD notifications. (You
can't call wait/waitpid because that's blocking; at best you can call it in a
separate thread.) So even if all you're trying to do is write a program that
runs a handful of children asynchronously, you have to incorporate signals
into your architecture. And signalfd taunts you by providing siginfo with each
notification, so you think you know which child exited -- but in fact, those
siginfos could have coalesced, so this data is useless.

A friend claimed to me that siginfo is only useful for so-called synchronous
signals (SIGSEGV, SIGILL, etc. -- stuff that you can't handle in an event loop
anyway), which I'm inclined to believe, the more I think about it. So there's
no reason for signalfd to have included siginfo.

~~~
jsnell
waitpid() has a non-blocking mode (WNOHANG); plain wait() does not take
options. So use SIGCHLD to decide when to check the status of the child
processes, and waitpid() to actually do it.

~~~
jacquesm
There is a not-so-nice but _very_ effective trick: send the process you wish
to check a signal of 0 (using kill(pid, 0)); if that fails with ESRCH, the
process no longer exists.

This is kind of nasty in case your pids tick over very fast so you'd have to
do this with a fairly high frequency to make sure you don't hit the same pid
twice.

Fortunately this is hardly ever a problem but it is something worth thinking
about when using the trick.

See also:

[http://unixhelp.ed.ac.uk/CGI/man-cgi?kill+2](http://unixhelp.ed.ac.uk/CGI/man-cgi?kill+2)

No actual signal is sent, it's just asking the kernel to check if the signal
_could_ have been sent.
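
A small sketch of this probe, assuming POSIX; `process_exists` is a made-up name. One detail worth making explicit: kill() can also fail with EPERM, which means the process exists but the caller is not allowed to signal it, so only ESRCH is treated as "gone".

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* Probe a PID with the null signal.
 * Returns 1 if some process currently has this PID, 0 if none does,
 * and -1 on any other error. */
int process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;
    if (errno == EPERM)
        return 1;       /* exists, but we may not signal it */
    if (errno == ESRCH)
        return 0;
    return -1;
}
```

The PID-reuse caveat from the comment still applies: a result of 1 says that some process has that PID now, not that it is the same process as before. An unreaped zombie also counts as existing.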

------
sreque
I've used signalfd before and consider it an improvement over normal
signal-handling. Signal coalescing happens regardless of signalfd.

The only major thing you have to remember when using signalfd is to mask the
signals you want to only receive via signalfd and then unmask these signals in
any child processes before calling exec*() functions.
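
A minimal Linux-only sketch of the parent side of that recipe. The helper names are hypothetical. The signal is blocked first so that it is delivered only through the descriptor, and the descriptor is opened non-blocking so it can sit in an event loop.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Block signo in the calling thread and return a non-blocking signalfd
 * that reports it, or -1 on error. */
int open_signalfd(int signo)
{
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, signo);
    if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1)
        return -1;
    return signalfd(-1, &mask, SFD_NONBLOCK | SFD_CLOEXEC);
}

/* Read one pending signal from the descriptor. Returns its number, or
 * -1 if nothing is pending (EAGAIN) or the read failed. */
int read_signalfd(int fd)
{
    struct signalfd_siginfo si;
    ssize_t n = read(fd, &si, sizeof si);
    if (n != (ssize_t)sizeof si)
        return -1;
    return (int)si.ssi_signo;
}
```

Two raises of the same standard signal before a read produce a single record, which is the coalescing mentioned above. In a child, the mask would be reset with sigprocmask(SIG_SETMASK, ...) before exec.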

------
treve
Question from someone not knowing much about low-level programming and dealing
with signals:

If signals are so problematic, why rely on them? Is the functionality useful
for things other than dealing with 'emergencies'?

One thing I can see that is useful, is that it allows a program to gracefully
deal with a kill, but many applications seem to have a 'graceful stop'
mechanism that doesn't need signals.

~~~
geofft
A lot of functionality is _only_ available via signals. For instance, there's
no way other than SIGCHLD to be asynchronously notified when a process exits
(unless you want to dedicate a thread to running wait()). There's no way other
than SIGWINCH to be notified when your terminal gets resized.

You could certainly imagine some kernel extensions that take all of this
useful functionality and make it available in ways other than signals, leaving
just signals for things you have to deal with immediately like SIGSEGV (so you
can print a nice error message before quitting), but they don't exist yet. I
imagine some of the intent behind signalfd was to do this all at once for all
signals, but it didn't quite work.

~~~
osandov
> You could certainly imagine some kernel extensions that take all of this
> useful functionality and make it available in ways other than signals,
> leaving just signals for things you have to deal with immediately like
> SIGSEGV (so you can print a nice error message before quitting), but they
> don't exist yet.

In the SIGCHLD case, there's a proposed CLONE_FD flag to clone which would
return a file descriptor instead of a PID. This fd could be poll'd on and
read from, which is much nicer than dealing with SIGCHLD. See
[http://lwn.net/Articles/638613/](http://lwn.net/Articles/638613/)

So those kernel extensions are happening :)

~~~
quotemstr
clonefd is a very limited solution. What we really need is the ability to open
a file descriptor handle to _any_ process. That ability solves all sorts of
race conditions. Conveniently, we already have an interface to open file
descriptors for processes: /proc. We just need to extend its semantics
slightly.

~~~
colin_mccabe
Maybe I'm misunderstanding, but wouldn't opening a file descriptor to a
process via /proc have the same race condition issues with process id
wraparound? After all, processes in /proc are opened by process ID (the only
exception I can think of is /proc/self... maybe I missed some other
exceptions?)

Overall, it seems easier to avoid process ID wraparound attacks via using the
full 32-bit number space for PIDs. There may be a few programs that need to be
changed because they did something silly like cast pid_t to short, but I think
overall most programs would work just fine. As far as I can remember, the
reason for using low numbers was because people didn't want to type longer
ones at the shell. Internally the kernel and libraries store everything as
32-bit, at least on Linux.

~~~
quotemstr
> Maybe I'm misunderstanding, but wouldn't opening a file descriptor to a
> process via /proc have the same race condition issues with process id
> wraparound?

Absolutely. But once you've opened the file descriptor, the kernel would
guarantee that its corresponding process ID would remain unused until you
closed the file descriptor. (For example, it could keep the process a zombie
if it exits.)

This way, it's possible to write a reliable killall: walk /proc, call
openpid() on each entry, and, _with the PID FD open_, examine the process's
user, command line, or whatever else, kill the process if necessary, and close
the process file descriptor.

No race.

~~~
colin_mccabe
_But once you've opened the file descriptor, the kernel would guarantee that
its corresponding process ID would remain unused until you closed the file
descriptor. (For example, it could keep the process a zombie if it exits.)_

That seems like it would open you up to a trivial denial-of-service attack
where some attacker just spawns a bunch of processes and never closes the
/proc handles. Then you can't start any more processes because there are no
more process IDs available. The only workaround is to have a larger PID space,
which raises the question: why not just have a larger PID space in the first
place and skip the new, non-portable API?

~~~
quotemstr
It works out all right on Windows, which uses exactly the approach I advocate.
And you can _already_ DoS the system in myriad ways. If you're still worried:
we have ulimits for other resources. We can have a ulimit for this one too.

~~~
colin_mccabe
I agree that there are already many ways to DoS the system-- for example, the
age-old fork bomb. But that is not a good reason to add more flaws. People are
working on ways to fix the old flaws, such as cgroups.

I don't think a ulimit would be very effective here at preventing
denial-of-service. Let's say I set it to 100... I can just have my 100
children each spawn and hold on to 100 children of their own, and so on and so
forth. If I just go with a bigger process ID space all these headaches go
away, plus existing software works without modification.

~~~
quotemstr
32 bits is still too small. I wouldn't be comfortable relying on the size of
the PID space to avoid collisions until we made it 128 bits or so. I think
you're still seriously overestimating the danger of a DoS here: whatever
limits apply to forked processes can apply to process handles. Whatever
mitigates fork bombs will also mitigate handle-based attacks.

The advantages of process handles outweigh this small risk.

~~~
Dylan16807
In what scenario would you run out of 64 bit PIDs? How many per second for how
many centuries?

~~~
quotemstr
It's not a matter of running out of PIDs: it's about the probability of
accidental collision.

------
dllthomas
What I'd like to see is a _write_ signalfd, so I can send signals without a
race condition that might lead to my signalling the wrong process.

------
amelius
I'd like to see other concepts also covered by file descriptors. Example:
semaphores, mutexes, condition variables, timers.

~~~
geofft
Timers are available via timerfd (man timerfd_create).

There used to be a FUTEX_FD, but it got removed. I think you can mostly
achieve the effect with eventfd, though.
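
For the timer case, a short Linux-only sketch of what timerfd looks like next to poll(). The helper names are invented for illustration; the calls are the ones documented in timerfd_create(2).

```c
#define _GNU_SOURCE
#include <poll.h>
#include <stdint.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/* Create a one-shot timer fd that becomes readable after delay_ms
 * milliseconds. delay_ms must be > 0 (a zero value disarms the timer).
 * Returns the fd, or -1 on error. */
int open_oneshot_timer(long delay_ms)
{
    int fd = timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC);
    if (fd == -1)
        return -1;

    struct itimerspec spec = {0};
    spec.it_value.tv_sec = delay_ms / 1000;
    spec.it_value.tv_nsec = (delay_ms % 1000) * 1000000L;
    if (timerfd_settime(fd, 0, &spec, NULL) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Wait up to timeout_ms for the timer with poll(), then read the
 * expiration count. Returns 0 on timeout or error. */
uint64_t wait_timer(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    if (poll(&pfd, 1, timeout_ms) <= 0)
        return 0;

    uint64_t expirations = 0;
    if (read(fd, &expirations, sizeof expirations)
            != (ssize_t)sizeof expirations)
        return 0;
    return expirations;
}
```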

------
worik
If you misuse signals.

I am rusty on low level programming but I have done enough to know that this
poster is whining a bit too much.

Signals should only be used in the general case for exceptional circumstances,
like killing a programme. A signal handler's job is to deal with the crisis,
e.g., gracefully exit.

In lower level cases signals mean there is an urgent event, something that
must be done now or it is useless to bother.

If you try to use signals for general purpose IPC then you get what you
deserve - chaos.

~~~
geofft
As I mentioned in another comment, there are cases (like SIGWINCH) where the
only interface the kernel gives you for general-purpose IPC is signals. In any
case, if you restrict yourself to using signals for urgent respond-immediately
events, then signalfd is still useless, since you want to handle those
synchronously. :)

(That said, I would definitely agree that the _kernel_ is misusing signals --
SIGWINCH should just be some form of metadata on the terminal fd, not a
process-wide signal.)

~~~
caf
This would probably be because by the time resizeable terminals became common,
the semantics of the tty device were long settled. It probably would have made
sense to create some kind of 'ttyaux' device at this point, though.

