
How Not to Write a Signal Handler - nkurz
http://741mhz.com/signal-handler/
======
vezzy-fnord
I think it's a shame (and painful) how the major Unices have all significantly
diverged on things like event handling, I/O multiplexing and file system
watching.

I find the BSD kqueue(2) interface to be much more elegant than any of Linux's
*fd() functions or epoll(7). On the other hand, Linux's inotify(7) is, I find,
cleaner to use for file system event notifications than kqueue(2). But then
there's also fanotify(7) on Linux, which seems to have sort of been neglected
since its initial hype. Then OS X has FSEvents.

I guess this is why one may need libraries like libevent and libuv.
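
Python's standard `selectors` module is a small instance of the same idea: `DefaultSelector` picks the best mechanism the platform offers (epoll on Linux, kqueue on the BSDs and OS X) behind one interface. A minimal sketch, where `wait_readable` and the pipe are only illustrative stand-ins for real descriptors:

```python
import os
import selectors


def wait_readable(timeout=1.0):
    """Register a pipe with the platform's default selector and wait on it."""
    sel = selectors.DefaultSelector()  # epoll, kqueue, poll or select
    read_fd, write_fd = os.pipe()
    try:
        sel.register(read_fd, selectors.EVENT_READ, data="pipe")
        os.write(write_fd, b"x")  # make the read end readable
        ready = [(key.data, os.read(key.fd, 1))
                 for key, _events in sel.select(timeout)]
        # Report which backend was chosen, plus what became readable.
        return type(sel).__name__, ready
    finally:
        sel.close()
        os.close(read_fd)
        os.close(write_fd)
```

This only covers readiness of file descriptors; file system watching has no such wrapper in the standard library.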

~~~
fafner
I think the kqueue interface is a lot worse than the Linux way. With kqueue
everything is forced through one interface. In Linux it's just file
descriptors and epoll to deal with them. It is a nice extension of the
"everything is a file" mechanism and it embodies the "do one thing and do it
right" philosophy.

This becomes apparent when you look at fs notifications. You've already
mentioned how inotify is superior to kqueue. I mean seriously, the kqueue fs
notifications are designed in such an incredibly bad way, it almost looks like
a trolling attempt. E.g., when you try to watch a directory the kernel knows
which files are manipulated, but the API has no way of telling you. The
documentation usually suggests keeping a file list and updating it yourself,
but that is complicated and racy and will almost certainly result in buggy
behaviour.
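
To make the "keep a file list and update it" workaround concrete, here is a rough sketch of what user code ends up doing after each directory-level event. The names `snapshot` and `diff_snapshots` are made up for illustration; no kqueue call is shown, only the bookkeeping:

```python
import os


def snapshot(path):
    """Map each entry name in `path` to its modification time in ns."""
    entries = {}
    with os.scandir(path) as it:
        for entry in it:
            try:
                entries[entry.name] = entry.stat().st_mtime_ns
            except FileNotFoundError:
                # The entry vanished between listing and stat():
                # exactly the kind of race being complained about.
                pass
    return entries


def diff_snapshots(before, after):
    """Guess what happened between two snapshots of the same directory."""
    created = sorted(after.keys() - before.keys())
    deleted = sorted(before.keys() - after.keys())
    modified = sorted(name for name in before.keys() & after.keys()
                      if before[name] != after[name])
    return created, deleted, modified
```

Anything created and deleted between two snapshots is never seen, and a rename shows up as an unrelated delete plus create, which is where the buggy behaviour comes from.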

And that really shows that having one interface do everything is a bad
approach. The kqueue designers had to cover every event type that could
happen. And thus they added an API for something they apparently didn't
understand properly. And since it's stuck now in their API they would have to
deprecate parts of it, if they ever have the interest in fixing it.

The inotify API is not perfect. But it does its job pretty well and it is the
best fs notification API I have seen. fanotify solves a different use case.
And that's really the flexibility that the Linux interface has. They can
easily come up with new event APIs for new use cases because all they have to
do is provide a file descriptor.

I wish the BSD folks would simply implement inotify. But it seems they are
unwilling to fix their API and if they did then they'd probably design their
own interface simply out of spite. Right now their crappy fs notification
interface is a real pain. And since the lowest common denominator is very
influential when it comes to portability libraries like glib, Qt, etc.,
everybody is suffering because of BSD. And they only get away with it because
OSX has the same API. I haven't looked at FSEvents. Is it a new API or simply
based on kqueue?

</rant>

~~~
kev009
There's been talk (at BSDCam) of implementing inotify which is needed for the
Linuxulator and exposing it in the FreeBSD API too. So your wish isn't so
far-fetched, but I'm not sure anyone is actively working on it. Patches welcome :)

~~~
fafner
That would be a good decision. So far I've only seen an attempt to implement
inotify on top of kqueue, which of course will be buggy and incomplete.

------
lkrubner
They write:

"It is the 2013th year in the Common Era at the moment of this writing and you
might think that people should have came up with something better in terms of
signal handling at this time. The truth is that they did. It is just not that
well known yet due to a huge momentum of outdated information still
overflowing the Internet."

Then they talk about kqueue.

The above paragraph emphasizes how much Unix programming still relies on old
ideas.

Likewise, sockets are an old idea, and newer libraries, such as ZeroMQ, do a
lot to fix the old problems. ZeroMQ is often described as "Sockets on
steroids". It implements a lot of patterns that Unix itself does not give us:

"It gives you sockets that carry atomic messages across various transports
like in-process, inter-process, TCP, and multicast. You can connect sockets
N-to-N with patterns like fan-out, pub-sub, task distribution, and request-
reply. "

But I am left wondering, what would the world of programming be like if we had
a new operating system that incorporated some of the new ideas and patterns
that have developed over the last 25 years? Instead of depending on libraries,
would we not be better off if we had an OS built around these newer ideas?

~~~
sillysaurus3
We wouldn't be better off, no. The newer OS would likely be less secure, less
reliable, and probably less performant. It would also require all of the tools
we take for granted to be ported to the new OS, which would be difficult since
the above proposal is to change the API of the new OS.

It doesn't seem like there's anything wrong with libraries.

~~~
wtetzner
I don't see why it would be less secure, reliable, or performant. It might be
that the new ideas, and new abstractions, make it harder to do things in
insecure or unreliable ways.

~~~
nic0lette
It's just that new programs often have more bugs in them, and bugs can
sometimes end up opening an exploit vector.

~~~
adamnemecek
> It's just that new programs often have more bugs in them...

That sure is a broad brush that you are using.

~~~
chris_wot
I dunno. In general that's true.

------
achivetta
As far as "modern signal handling" goes: On OS X, you can also just create a
dispatch source of type DISPATCH_SOURCE_TYPE_SIGNAL. If your code already uses
dispatch, this is much easier than trying to wire up to kqueue directly.

------
lelf
> _you might think that people should have came up with something better in
> terms of signal handling at this time. The truth is that they did_

… and then the article completely omits sigqueue+sigwaitinfo (realtime
signals).

------
qznc
Why does the signal handler have to interrupt a thread? Why doesn't the OS
just create a new thread to run the signal handler in? This would avoid this
whole reentrancy problem and make the concurrency explicit.

~~~
teacup50
The OS has to pause a thread that triggers a signal; how can it proceed if
doing so would just segfault again?

~~~
fulafel
Only a subset of signals come from mapping CPU faults to signals, the so-
called synchronous signals. This example was about SIGINT coming from the
usual source (Ctrl-C from terminal).

It's actually possible to handle it in a thread using existing semantics
without kqueue or signalfd: block SIGINT from all threads except your
dedicated SIGINT-handling thread.
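
A minimal sketch of that approach using Python's `signal` module, which wraps `pthread_sigmask` and `sigwait`. It uses SIGUSR1 instead of SIGINT so it can be run safely, and it assumes no other thread in the process has the signal unblocked; `wait_for_signal_in_thread` is an illustrative name:

```python
import os
import signal
import threading


def wait_for_signal_in_thread(signum=signal.SIGUSR1):
    """Block `signum` everywhere and consume it in one dedicated thread."""
    # Block first: threads created afterwards inherit this mask.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    received = []

    def waiter():
        # sigwait() takes the pending signal synchronously; no handler runs.
        received.append(signal.sigwait({signum}))

    thread = threading.Thread(target=waiter)
    thread.start()
    os.kill(os.getpid(), signum)  # stand-in for Ctrl-C from the terminal
    thread.join()
    signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return received
```

The handling code runs as ordinary thread code, so the async-signal-safety restrictions do not apply to it.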

~~~
teacup50
That requires that you have control over all threads, which is just not
something that can be guaranteed with modern operating systems and libraries
that regularly spawn their own threads.

Thread-interrupting delivery of signals is necessary in a world where you're
relying on that interruption of your single thread; a replacement that worked
for the async signals is useful, but distinct from the needs of sync signal
handling.

------
digisign
I looked around but couldn't find a great answer on how to do this correctly
in Python on Linux. I have a daemon that attempts to shut down cleanly when it
gets a number of signals and it appears to be working correctly. But, it's
written like the first example.

~~~
whopa
Python does the right thing under the hood for you. The signal handler you
register in Python isn't run in the dangerous context. There's an internal
signal handler which just tells the interpreter a signal happened and when
control is returned back to the interpreter outside of the signal handler it
knows what to do:

[https://hg.python.org/cpython/file/c7d45da654ee/Modules/sign...](https://hg.python.org/cpython/file/c7d45da654ee/Modules/signalmodule.c#l169)
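
So for a daemon, something along these lines should be fine. This is only a sketch: `run_until_signalled` and its `trigger` parameter (which sends a signal to the process itself so the example terminates) are made up for illustration, and it has to run in the main thread because `signal.signal` requires that:

```python
import os
import signal
import time


def run_until_signalled(signums=(signal.SIGTERM, signal.SIGHUP), trigger=None):
    """Loop until one of `signums` arrives, then return its number."""
    stop = {"signum": None}

    def handler(signum, frame):
        # Runs in normal interpreter context, so ordinary Python is allowed.
        stop["signum"] = signum

    previous = {s: signal.signal(s, handler) for s in signums}
    try:
        if trigger is not None:
            os.kill(os.getpid(), trigger)  # hypothetical test trigger
        while stop["signum"] is None:
            time.sleep(0.05)  # placeholder for the daemon's real work
    finally:
        # Restore whatever was installed before (None means "not from Python").
        for s, old in previous.items():
            signal.signal(s, old if old is not None else signal.SIG_DFL)
    return stop["signum"]
```

The clean shutdown itself then happens after the loop, outside the handler.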

------
majke
I still don't understand how queueing signals works.

For example, signalfd() is really cool, and you can indeed read() a signal
from it, but only one of a kind.

If you get, say, two SIGINTs, you may well only be able to read() one. I
guess it's an implementation detail that some signals are queued while others
aren't. In practice it means that receiving a signal is an edge, and in the
signal handler you must check how many times the signal occurred. To me this
sounds inherently racy.
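
A small experiment that shows the difference, assuming Linux (it relies on `sigtimedwait` and `SIGRTMIN`, which are not available everywhere). `count_deliveries` is an illustrative name: it blocks a signal, sends it several times, then counts how many instances can be dequeued:

```python
import os
import signal


def count_deliveries(signum, sends=3):
    """Send `signum` to ourselves `sends` times while it is blocked,
    then count how many pending instances the kernel kept."""
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    try:
        for _ in range(sends):
            os.kill(os.getpid(), signum)
        delivered = 0
        # Timeout 0 polls: sigtimedwait returns None when nothing is pending.
        while signal.sigtimedwait({signum}, 0) is not None:
            delivered += 1
        return delivered
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

On Linux, a standard signal such as SIGUSR1 collapses into a single pending instance, while a real-time signal such as SIGRTMIN is queued once per send.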

~~~
general_failure
What's a good use case for knowing how many times a signal was raised? A
signal, once raised, carries no more context than the signal itself.

~~~
rurounijones
If the signal is received more than once it means it is more important :p

"Oh shit!"

"^C^C^C^C^C^C~C^C"

~~~
lucozade
In my experience it usually means that you need some form of time reversal,
e.g. unsend an email or put a BEGIN TRAN before the SQL you just executed.

Reasonably sure that can't be done in C even with all that undefined
behaviour.

However, if your IO monad implementation is sufficiently slow, you could ^C
between when you thought the code had executed and when it actually bothered
to get around to it.

I can't believe that none of the functional ninjas thought to add time travel
as a benefit. Most remiss...

~~~
taejo
> I can't believe that none of the functional ninjas thought to add time
> travel as a benefit. Most remiss...

Ahem:
[https://hackage.haskell.org/package/tardis](https://hackage.haskell.org/package/tardis)

~~~
lucozade
With much delight, I stand corrected.

Haskell perpetually astonishes me. Not only is it, more or less single-
handedly, keeping the Holy Wars alive, it now appears to supply punchlines to
jokes. Amazing.

~~~
steveklabnik
Wouldn't you do the same if you had a time machine?

------
userbinator
I've always found it interesting that the list of safe functions includes
read(), write(), and most of the filesystem operations; so while printf is not
safe, you could use write() if you really wanted to output.

~~~
gryph0n
write() is an OS system call, while printf() is a C function call.

C libraries can (and do) choose to buffer printf, while read/write are usually
unbuffered.

Of course, both mechanisms modify some global state (your terminal's state),
and cannot be relied on to be free of race conditions.

------
jheriko
this seems like yet another case where a thread safe lockless queue would be
good. setting a volatile atomically is fine for the simplest cases, but what
if you want to handle every signal raised independently and not just do
something if any signal is raised?

the real answer here i think is to think about code being run concurrently and
how to do that safely... where a signal handler is a special case that,
actually, requires no extra special treatment.
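
one existing take on this is the self-pipe trick, where the handler only writes the signal number into a pipe and normal code reads it back later, so the pipe acts as the queue. CPython exposes this as `signal.set_wakeup_fd`. a rough sketch (main thread only; `collect_signals_via_pipe` is an illustrative name, and the signals are sent to the process itself just to have something to read):

```python
import os
import signal


def collect_signals_via_pipe(to_send):
    """Send each signal in `to_send` to ourselves and read them back,
    in order, from the wakeup pipe."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)  # set_wakeup_fd needs a non-blocking fd

    def noop(signum, frame):
        pass  # a handler must exist, or the default action kills us

    previous = {s: signal.signal(s, noop) for s in set(to_send)}
    old_fd = signal.set_wakeup_fd(write_fd)
    try:
        for s in to_send:
            os.kill(os.getpid(), s)
        # One byte per delivered signal, holding the signal number.
        data = os.read(read_fd, len(to_send))
    finally:
        signal.set_wakeup_fd(old_fd)
        for s, old in previous.items():
            signal.signal(s, old if old is not None else signal.SIG_DFL)
        os.close(read_fd)
        os.close(write_fd)
    return list(data)
```

the read end is a plain file descriptor, so it can sit in the same select/epoll/kqueue loop as everything else.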

------
pmorici
Is there a reason they are using sleep(1) instead of pause()? Seems like a
waste to use sleep if all you are doing is waiting for a signal.
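
For comparison, a sketch of the pause() version using Python's `signal.pause`. The repeating interval timer is only there so the example gets a signal to wake up on, and so a SIGALRM that slips in just before pause() cannot leave it hanging; `pause_until_alarm` is an illustrative name:

```python
import signal


def pause_until_alarm(interval=0.1):
    """Sleep in pause() until SIGALRM arrives instead of polling with sleep()."""
    fired = []
    previous = signal.signal(signal.SIGALRM,
                             lambda signum, frame: fired.append(signum))
    # Repeating timer: if one alarm lands before pause(), the next wakes us.
    signal.setitimer(signal.ITIMER_REAL, interval, interval)
    try:
        while not fired:
            signal.pause()  # returns after a handler has run
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel the timer
        signal.signal(signal.SIGALRM,
                      previous if previous is not None else signal.SIG_DFL)
    return fired[0]
```

With a one-shot signal, the gap between checking the flag and calling pause() is a real race, which may be one reason to prefer a bounded sleep.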

~~~
spoiler
Well, you can't make a funky comment about `pause` being essential, and
`sleep` _is_ essential.

