Sigaction: see who killed you, and more

geocar · on April 17, 2018

There's a lot of cool things you can do with sigaction.

Trapping sigsegv is a great way to do memory management.

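    void sa(int _,siginfo_t*s,void*__){
      /* round the faulting address down to its page, then map fresh memory there */
      mmap((void*)(~(uintptr_t)(PAGESZ-1)&(uintptr_t)s->si_addr),PAGESZ,
        PROT_READ|PROT_WRITE,MAP_FIXED|MAP_PRIVATE|MAP_ANONYMOUS,-1,0);
    }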
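For anyone who wants to try this, here is a slightly fuller sketch of the same idea. It is not geocar's code: the names (`on_segv`, `demand_page_demo`, `region_lo`, `region_hi`) are made up for illustration, it assumes Linux-style behaviour where touching a `PROT_NONE` page raises SIGSEGV with `si_addr` set, and it relies on `mmap()` working inside a handler even though POSIX does not promise that.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static uintptr_t page_size, region_lo, region_hi;  /* set before any fault */
static volatile sig_atomic_t pages_mapped;

static void on_segv(int sig, siginfo_t *si, void *ctx) {
    (void)ctx;
    uintptr_t addr = (uintptr_t)si->si_addr;
    if (addr < region_lo || addr >= region_hi) {
        /* Not our region: restore the default action so the retried
           instruction crashes (and dumps core) as it normally would. */
        signal(sig, SIG_DFL);
        return;
    }
    uintptr_t page = addr & ~(page_size - 1);
    /* mmap() is not on the POSIX async-signal-safe list. */
    if (mmap((void *)page, page_size, PROT_READ | PROT_WRITE,
             MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0) == MAP_FAILED)
        _exit(1);
    pages_mapped++;
}

/* Reserves 16 inaccessible pages, writes to two of them, and returns how
   many pages the handler had to map (or -1 on error). */
int demand_page_demo(void) {
    page_size = (uintptr_t)sysconf(_SC_PAGESIZE);
    assert((page_size & (page_size - 1)) == 0);  /* must be a power of two */

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);

    size_t len = 16 * page_size;
    volatile char *base = mmap(NULL, len, PROT_NONE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) return -1;
    region_lo = (uintptr_t)base;
    region_hi = region_lo + len;
    pages_mapped = 0;
    if (sigaction(SIGSEGV, &sa, &old) != 0) return -1;

    base[0] = 'a';                   /* first fault */
    base[5 * page_size + 7] = 'b';   /* second fault */
    int ok = base[0] == 'a' && base[5 * page_size + 7] == 'b';

    sigaction(SIGSEGV, &old, NULL);
    munmap((void *)base, len);
    return ok ? (int)pages_mapped : -1;
}
```

The range check is the part that keeps an unrelated bug from being silently "fixed" by the handler; without it, any wild pointer gets a fresh page.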

Want automatic logging of a big on-disk data structure? mmap some pages read-only with MAP_PRIVATE, trap writes and log the faulting address (then remap the page read+write). Then at "save time", write out the dirty page contents to a replay log, fsync/fdatasync/fcntl(F_FULLFSYNC), then rewrite the file using pwrite/write.
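A rough sketch of the write-trapping half, under some loud assumptions: anonymous memory stands in for the file mapping, `mprotect()` is swapped in for the remap step, and every name (`on_write_fault`, `dirty_pages`, `dirty_tracking_demo`) is hypothetical. The replay log and the fsync dance are left out.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAX_DIRTY 64
static uintptr_t pg, lo, hi;
static uintptr_t dirty_pages[MAX_DIRTY];   /* the "log" of written pages */
static volatile sig_atomic_t dirty_count;

static void on_write_fault(int sig, siginfo_t *si, void *ctx) {
    (void)ctx;
    uintptr_t addr = (uintptr_t)si->si_addr;
    if (addr < lo || addr >= hi || dirty_count >= MAX_DIRTY) {
        signal(sig, SIG_DFL);   /* unrelated fault: crash normally on retry */
        return;
    }
    uintptr_t page = addr & ~(pg - 1);
    dirty_pages[dirty_count++] = page;
    /* Open the page for writing so the retried store succeeds.
       mprotect() is not on the POSIX async-signal-safe list either. */
    if (mprotect((void *)page, pg, PROT_READ | PROT_WRITE) != 0) _exit(1);
}

/* Returns the number of distinct pages written, or -1 on error. */
int dirty_tracking_demo(void) {
    pg = (uintptr_t)sysconf(_SC_PAGESIZE);
    size_t len = 8 * pg;
    /* Assumption: anonymous memory stands in for the real file mapping. */
    volatile char *base = mmap(NULL, len, PROT_READ,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) return -1;
    lo = (uintptr_t)base;
    hi = lo + len;
    dirty_count = 0;

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_write_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) return -1;

    char before = base[0];     /* reads never fault */
    base[1 * pg] = 'x';        /* fault: page 1 logged */
    base[3 * pg] = 'y';        /* fault: page 3 logged */
    base[3 * pg + 1] = 'z';    /* page 3 already writable: no fault */

    sigaction(SIGSEGV, &old, NULL);
    assert(dirty_count <= MAX_DIRTY);
    int ok = before == 0 && dirty_count == 2 &&
             dirty_pages[0] == lo + pg && dirty_pages[1] == lo + 3 * pg;
    munmap((void *)base, len);
    return ok ? (int)dirty_count : -1;
}
```

At save time you would walk `dirty_pages`, write those pages to the log, and flip them back to read-only.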

You can also implement big sparse hashmaps this way: Every time a server gets a request, it allocates a cell and tells the client about it. Those cells are read+write. If a cell gets checked in, the server simply decrements the counter, but if there's a page fault (read-only) the server can then take the slower path of finding the server that actually responds to it.

And so on.

wahern · on April 17, 2018

I once implemented a specialized contiguous stack structure that used this trick to automatically reallocate the stack. (The stack was state for a compiler-generated backtracking NFA.) The handler would longjmp to code that realloc'd the stack and restarted the NFA. (No internal pointers so resetting the NFA state and restarting was trivial.) I forget the exact performance improvement, but compared to the original code with explicit boundary checks I think it was several multiples faster.

Unfortunately signal handlers are process-wide. These tricks are a bad idea for code that needs to play nice with other code (i.e. libraries, modules, etc) unless you know the process won't be multithreaded or you can own SIGSEGV.

In the above hack I used sigaltstack, which is thread-local, to communicate per-thread state to the handler, allowing the scheme to work multi-threaded and to detect unintended faults. (The alt stack memory region used special page-aligned offsets so I could derive a thread-local object that didn't overlap with the signal stack.) Still, such a scheme is too ugly to hide inside a blackbox component which is why these tricks rightly don't see much usage. Your code and my code could have never run in the same process.
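For readers who haven't used sigaltstack, a minimal sketch of just the mechanism (not the page-aligned-offset trick described above): register an alternate stack for the current thread, install a handler with SA_ONSTACK, and confirm the handler really runs there by checking the address of a local. The names and the 64 KiB size are arbitrary choices for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static char *alt_base;
static size_t alt_size;
static volatile sig_atomic_t ran_on_alt_stack;

static void on_signal(int sig) {
    (void)sig;
    char probe;                       /* lives on whatever stack we run on */
    uintptr_t p = (uintptr_t)&probe;
    ran_on_alt_stack = p >= (uintptr_t)alt_base &&
                       p < (uintptr_t)alt_base + alt_size;
}

/* Returns 1 if the handler ran on the alternate stack, 0 if not, -1 on error. */
int altstack_demo(void) {
    alt_size = 64 * 1024;             /* arbitrary, comfortably large */
    assert(alt_size >= (size_t)MINSIGSTKSZ);
    alt_base = malloc(alt_size);
    if (!alt_base) return -1;

    stack_t ss = { .ss_sp = alt_base, .ss_size = alt_size, .ss_flags = 0 };
    stack_t old_ss;
    if (sigaltstack(&ss, &old_ss) != 0) return -1;   /* per-thread setting */

    struct sigaction sa, old_sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sa.sa_flags = SA_ONSTACK;         /* ask for delivery on the alt stack */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old_sa) != 0) return -1;

    ran_on_alt_stack = 0;
    raise(SIGUSR1);                   /* handler has run by the time this returns */

    sigaction(SIGUSR1, &old_sa, NULL);
    sigaltstack(&old_ss, NULL);
    free(alt_base);
    return ran_on_alt_stack;
}
```

Because the alternate stack is per-thread while the handler is process-wide, the stack address is one of the few things a handler can use to work out which thread's state it is looking at.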

jcranmer · on April 17, 2018

> Unfortunately signal handlers are process-wide. These tricks are a bad idea for code that needs to play nice with other code (i.e. libraries, modules, etc) unless you know the process won't be multithreaded or you can own SIGSEGV.

It's safe to handle processor trap-caused signals, since those are guaranteed to be delivered to the current thread. (SIGSEGV, SIGILL, SIGFPE, SIGBUS, SIGTRAP). The trickier part is that you can still send SIGSEGV and the like from another process, which means you have to distinguish between "is this signal caused by the processor or somebody else?" not to mention the general "is this signal in the code where I should handle it."
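One way to make the "processor or somebody else?" distinction is to look at `si_code`. POSIX says a value of SI_USER, SI_QUEUE, or anything less than or equal to zero means a process generated the signal; positive codes such as SEGV_MAPERR come from the fault itself. A small sketch with hypothetical names, using `raise()` to fake a software-sent SIGSEGV:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t sent_by_process = -1;

static void on_segv(int sig, siginfo_t *si, void *ctx) {
    (void)sig; (void)ctx;
    /* Sent via kill()/raise()/sigqueue() rather than caused by a real fault? */
    sent_by_process = (si->si_code == SI_USER ||
                       si->si_code == SI_QUEUE ||
                       si->si_code <= 0);
}

/* Returns 1 if the handler classified the signal as process-sent. */
int sender_check_demo(void) {
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) return -1;

    raise(SIGSEGV);   /* software-generated: no memory fault actually happened */

    sigaction(SIGSEGV, &old, NULL);
    assert(sent_by_process == 0 || sent_by_process == 1);
    return sent_by_process;
}
```

A real handler would treat the process-sent case as "not mine" and fall through to whatever the default behaviour should be.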

Of course, a saner OS (like Windows) would give you a more stack-based approach to handling hardware traps, so that a function can say "if I, or anything I call, gets a #DE, deliver SIGFPE to this location."

wahern · on April 17, 2018

But an installed signal handler is process-wide. There's no way for a thread to install a signal handler without potentially stepping on the toes of some other thread wanting to install its own handler. Contrast that with signal masks which are thread-local, permitting libraries to set and restore masks without disturbing global process state.

jcranmer · on April 17, 2018

If you code your signal handler right, you could dispatch the signals you don't want to handle to another signal handler.
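A sketch of what that dispatching can look like: keep the `struct sigaction` that was installed before yours and call into it for anything you don't claim. Everything here is illustrative; `claim_next` is a stand-in for whatever test a real handler would use (fault address in my region, and so on).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>

static struct sigaction previous;      /* whoever was installed before us */
static volatile sig_atomic_t claim_next, handled_here, handled_by_previous;

static void older_handler(int sig) { (void)sig; handled_by_previous++; }

static void chained_handler(int sig, siginfo_t *si, void *ctx) {
    if (claim_next) {                  /* hypothetical "is this one mine?" test */
        claim_next = 0;
        handled_here++;
        return;
    }
    if (previous.sa_flags & SA_SIGINFO) {
        previous.sa_sigaction(sig, si, ctx);
    } else if (previous.sa_handler != SIG_DFL &&
               previous.sa_handler != SIG_IGN) {
        previous.sa_handler(sig);
    }
    /* SIG_DFL would need the default action re-raised; omitted in this sketch. */
}

/* Returns handled_here * 10 + handled_by_previous, or -1 on error. */
int chaining_demo(void) {
    struct sigaction first, second, original;
    memset(&first, 0, sizeof first);
    first.sa_handler = older_handler;
    sigemptyset(&first.sa_mask);
    if (sigaction(SIGUSR1, &first, &original) != 0) return -1;

    memset(&second, 0, sizeof second);
    second.sa_sigaction = chained_handler;
    second.sa_flags = SA_SIGINFO;
    sigemptyset(&second.sa_mask);
    if (sigaction(SIGUSR1, &second, &previous) != 0) return -1;
    assert(previous.sa_handler == older_handler);

    handled_here = handled_by_previous = 0;
    claim_next = 1;
    raise(SIGUSR1);    /* claimed by chained_handler */
    raise(SIGUSR1);    /* passed on to older_handler */

    sigaction(SIGUSR1, &original, NULL);
    return handled_here * 10 + handled_by_previous;
}
```

This only works while every party pushes handlers and nobody resets the disposition behind the others' backs.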

I'm not arguing that the POSIX signals API is in any way sane (it's not). It is possible to build signal handlers correctly, but it is far from easy and certainly far harder than it has any right being.

wahern · on April 17, 2018

Yes, but my point was that these tricks don't work well with libraries, which don't normally (if ever) expose a public API for chaining signal handlers, and often couldn't do so as a practical matter anyhow (see my example below).

If you control the entire process and all the code, then sure. But even then one should avoid deviating from idiomatic, modular programming patterns without very good reason. Because signal handlers are process-wide they're global state; sharing global state is not a friendly path when it comes to code maintenance and long-term project management.

Not too long ago both OpenBSD and macOS lacked sigtimedwait(), so for my Lua Unix module I wrote a "portable" implementation by longjmp'ing from a SIGALRM handler. It worked well and was correct but with the caveat that it shouldn't be used from multithreaded programs. I usually try to avoid multithreaded programming myself, but others embrace it. As both a user and author of libraries, caveats like this are a real PITA.

cryptonector · on April 17, 2018

Yes, you can "push" and chain handlers, and as long as all bits of code setting a handler only ever set a handler (never resetting disposition) then it will work, for some value of work. I still wouldn't recommend setting a signal handler in a library, even though in a pinch I would.

geocar · on April 18, 2018

libsigsegv[1] has a different manner of dealing with the problem you describe - another way to have your cake and eat it too, but I probably would have just moved everything into a separate process.

I don't view a process as a multi-tenancy: These tricks are for people who control main(). The fetish for threads and libraries gains little and costs much, so I have a real aversion to play.

close() isn't even thread-safe[2]. Serious. If you can't get that part of the house right, I don't want to stay the night, let alone live there.

[1]: https://www.gnu.org/software/libsigsegv/

[2]: http://geocar.sdf1.org/close.html

pm215 · on April 17, 2018

You have to be a little careful with trapping SIGSEGV to implement funky behaviour though -- it's easy to write a handler that works for the expected case, but harder to write one that doesn't result in incredibly confusing behaviour if the rest of the program runs into a bug that would normally be reported as SIGSEGV->program core dumps...

(Also, mmap() isn't async-signal-safe in POSIX, so this is veering into OS-specifics and reliance on implementation details.)

geocar · on April 18, 2018

> it's easy to write a handler that works for the expected case, but harder to write one that doesn't result in incredibly confusing behaviour if the rest of the program runs into a bug that would normally be reported as SIGSEGV->program core dumps...

Of course. I only write a glib example because there isn't a lot of space here.

> (Also, mmap() isn't async-signal-safe in POSIX, so this is veering into OS-specifics and reliance on implementation details.)

close() isn't thread-safe[1], so more people are in OS-specifics and relying on implementation details than we probably realise. I don't know of a single unixish system that gets mmap in a signal handler wrong.

[1]: http://geocar.sdf1.org/close.html

MawKKe · on April 17, 2018

Also the last void* argument of the signal handler points to a ucontext_t object (man getcontext), so you can do something crazy like jumping over a NULL-dereference by modifying the instruction pointer and then "returning" with setcontext().

cryptonector · on April 17, 2018

This is all very slow on modern systems. I would not recommend it. But it's true that these are neat hacks!

quotemstr · on April 17, 2018

Signal handlers don't have to be process-wide. See my detailed proposal for fixing the situation here: https://www.facebook.com/notes/daniel-colascione/toward-shar...

The trouble is that nobody is actually interested in fixing the core API. When I've raised the issue (and my proposal) with libc maintainers, the response has been a bizarre insistence that signals are somehow illegitimate, that the things people do with signals are wrong, and that everyone should just stop with signals despite there being no viable replacement mechanism.

As a result of this head-in-the-sand attitude, we're left with an awful decades old signals API and an absolute mess of ad-hoc libraries (which conflict with each other) to work around the worst of the problems.

This is how good systems ossify and eventually die.

cryptonector · on April 17, 2018

Signals are the weakest part of Unix/POSIX.

The only sane way to portably handle signals in programs is to write(2) a byte to a "self-pipe" in the handler, which pipe is waited for in a main I/O event loop, thus turning the signal into a standard I/O event.
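For anyone who hasn't seen it written down, a minimal sketch of the self-pipe pattern. Names are made up; the handler also sets a `sig_atomic_t` flag and deliberately ignores a full pipe, on the assumption that the loop is already due to wake in that case.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static int wake_fd[2] = { -1, -1 };
static volatile sig_atomic_t signal_seen;

static void on_signal(int sig) {
    int saved_errno = errno;            /* write() may clobber errno */
    unsigned char b = (unsigned char)sig;
    signal_seen = 1;
    if (write(wake_fd[1], &b, 1) < 0) {
        /* EAGAIN means the pipe is full, so a wakeup is already queued. */
    }
    errno = saved_errno;
}

static int set_nonblocking(int fd) {
    int fl = fcntl(fd, F_GETFL);
    return fl < 0 ? -1 : fcntl(fd, F_SETFL, fl | O_NONBLOCK);
}

/* Raises SIGUSR1 three times and returns how many bytes the "event loop"
   drained from the pipe afterwards (or -1 on error). */
int self_pipe_demo(void) {
    if (pipe(wake_fd) != 0) return -1;
    if (set_nonblocking(wake_fd[0]) || set_nonblocking(wake_fd[1])) return -1;

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old) != 0) return -1;

    signal_seen = 0;
    raise(SIGUSR1);
    raise(SIGUSR1);
    raise(SIGUSR1);

    /* The event loop side: the signal is now just a readable descriptor. */
    struct pollfd pfd = { .fd = wake_fd[0], .events = POLLIN };
    int ready = poll(&pfd, 1, 1000);
    unsigned char buf[16];
    ssize_t n = read(wake_fd[0], buf, sizeof buf);
    assert(n <= (ssize_t)sizeof buf);

    sigaction(SIGUSR1, &old, NULL);
    close(wake_fd[0]);
    close(wake_fd[1]);
    return (ready == 1 && signal_seen) ? (int)n : -1;
}
```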

There is no sane way to handle signals in libraries. You just have to avoid generating SIGPIPE (there are several not-very-portable ways) or hope the program ignores it. A library can risk installing a SIGPIPE handler or ignoring it, but... it is risky.

What we really need to do is standardize file descriptor flags and I/O system calls that allow writing without fear of SIGPIPE, and other things which allow libraries to never have to care about signals. There's not much need to further improve signals themselves because the self-pipe trick is plenty good enough.
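As one example of the existing per-call workarounds, sockets already have `MSG_NOSIGNAL` for `send()`. It is in POSIX.1-2008 and on Linux, but as far as I know macOS uses the `SO_NOSIGPIPE` socket option instead, which is the portability problem in a nutshell. A sketch with a hypothetical function name:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes to a socket whose peer is gone. Returns the errno from send()
   (expected: EPIPE), or 0 if the send unexpectedly succeeded, or -1 on
   setup failure. No SIGPIPE is raised because of MSG_NOSIGNAL. */
int send_without_sigpipe_demo(void) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
    assert(sv[0] >= 0 && sv[1] >= 0);
    close(sv[1]);                                  /* peer goes away */

    ssize_t n = send(sv[0], "x", 1, MSG_NOSIGNAL); /* error, not a signal */
    int err = errno;
    close(sv[0]);
    return n == -1 ? err : 0;
}
```

Nothing comparable exists for plain `write()` on a pipe, which is why libraries end up touching the SIGPIPE disposition at all.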

quotemstr · on April 17, 2018

People are unreasonably scared of doing things in signal handlers. It's very much not the case that the only sane thing to do is writing to a pipe --- for starters, there's an AF_UNIX socket. Or longjmp.

There's a lot of bad advice floating around about what is and is not "portable" based on half-remembered bugs in old systems nobody uses anymore. Examine the situation from first principles based on the world today, not dubious advice from an irrelevant era.

wahern · on April 17, 2018

What people often don't understand is that a signal handler never has to return, and indeed as far as the kernel is concerned (traditionally speaking) never does return. To deliver a signal all the kernel really does is reset the program counter. Okay, that's not the whole truth--first it sets up a trampoline so that if the handler does return the program counter and stack pointer are restored, plus potentially some other magic. But the important thing is that a signal is fundamentally nothing more than an unscheduled goto, as opposed to an asynchronous call/return.
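To make the "never has to return" point concrete, here is a small sketch where the handler leaves via `siglongjmp()` and control resumes at the `sigsetjmp()` call. The function names are invented; passing 1 as the second argument to `sigsetjmp()` is what gets the signal mask put back, since the signal is blocked while its handler runs.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf resume_point;

static void on_signal(int sig) {
    (void)sig;
    siglongjmp(resume_point, 1);   /* the handler never returns */
}

/* Returns the number of attempts made before the handler jumped back
   (expected: 1), or -1 if something went wrong. */
int escape_handler_demo(void) {
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old) != 0) return -1;

    volatile int attempts = 0;     /* volatile: modified between setjmp and longjmp */
    if (sigsetjmp(resume_point, 1) != 0) {
        /* Arrived here from the handler; the saved signal mask is restored. */
        sigset_t now;
        sigprocmask(SIG_SETMASK, NULL, &now);
        int still_blocked = sigismember(&now, SIGUSR1);
        sigaction(SIGUSR1, &old, NULL);
        assert(attempts == 1);
        return still_blocked ? -1 : attempts;
    }

    attempts++;
    raise(SIGUSR1);
    return -1;                     /* not reached: the handler jumped away */
}
```

The usual warning applies: this is only sane when you know exactly what code can be interrupted, because whatever was running gets abandoned mid-flight.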

This is why the kernel is never in danger of recursing and only a small, fixed amount of kernel state is needed to deliver signals (ignore real-time signals). Maintaining this O(1) space and time characteristic explains a lot about the semantics of signals, abstractly and in concrete implementations.

Knowing this doesn't make it any less dangerous in terms of the potential pitfalls, but IMO understanding how it works can help to make it clear how signals can be usefully employed. And it also shows why simply writing to a pipe is problematic--a pipe can overflow and cause an application to miss signals that smarter code wouldn't. signalfd was born precisely because of this limitation. (Unlike BSD kqueue, signalfd only supports a single listener--including signal handlers--per signal, which means signalfd only solves the very narrow problem of overflowing a pipe.)
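For reference, a minimal Linux-only sketch of signalfd usage: block the signal, create the descriptor, and read a `struct signalfd_siginfo` once the signal is pending. The function name is made up, and a threaded program would use `pthread_sigmask()` rather than `sigprocmask()`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Returns the signal number read from the signalfd, or -1 on error. */
int signalfd_demo(void) {
    sigset_t mask, old;
    sigemptyset(&mask);
    sigaddset(&mask, SIGUSR1);
    /* The signal must be blocked, otherwise the default action or a
       handler would consume it before the descriptor ever sees it. */
    if (sigprocmask(SIG_BLOCK, &mask, &old) != 0) return -1;

    int fd = signalfd(-1, &mask, SFD_CLOEXEC | SFD_NONBLOCK);
    if (fd < 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    raise(SIGUSR1);                   /* stays pending because it is blocked */

    struct signalfd_siginfo info;
    ssize_t n = read(fd, &info, sizeof info);   /* dequeues the pending signal */
    int signo = (n == (ssize_t)sizeof info) ? (int)info.ssi_signo : -1;
    assert(signo == -1 || signo == SIGUSR1);

    close(fd);
    sigprocmask(SIG_SETMASK, &old, NULL);
    return signo;
}
```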

cryptonector · on April 17, 2018

Writing to a pipe is not a problem! Make it non-blocking and ignore EAGAIN. The point is to wake the event loop, and if the pipe is full then the event loop will wake eventually -- the event loop has fallen behind, it might yet catch up. If you think it won't, then _exit() instead on EAGAIN when writing to the pipe.

wahern · on April 17, 2018

Imagine a series of SIGPIPEs filling the pipe, followed by a SIGTERM or SIGHUP that gets dropped on the floor. At best that's a broken application, at worst it could be a security vulnerability.

Every modern Unix (BSDs, Linux, macOS, Solaris) provides a pollable signal notification mechanism. But you can also always carefully make use of signal masks and 1) use either pselect as the root[1] of your event loop or 2) use a special thread to catch and coalesce signals.

[1] BSD kevent(), Linux epoll(), and Solaris port_wait() all support waiting on a kqueue, epoll, or port descriptor, respectively. That is, you can have a tree of event descriptors. Though with those you also always have a better [passive] signal notification system.
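A sketch of option 1, with hypothetical names: keep the signal blocked everywhere except inside `pselect()`, so the handler can only run while the loop is waiting. A signal that arrives while the loop is busy stays pending and interrupts the next wait instead of being lost.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t got_usr1;

static void on_usr1(int sig) { (void)sig; got_usr1 = 1; }

/* Returns 1 if the pending signal was delivered only inside pselect(). */
int pselect_demo(void) {
    sigset_t block, orig;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &orig) != 0) return -1;

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old) != 0) return -1;

    got_usr1 = 0;
    raise(SIGUSR1);                 /* blocked, so it just becomes pending */
    int seen_early = got_usr1;      /* still 0: the handler has not run */

    /* Atomically unblock SIGUSR1 for the duration of the wait only. */
    sigset_t wait_mask = orig;
    sigdelset(&wait_mask, SIGUSR1);
    struct timespec timeout = { .tv_sec = 1, .tv_nsec = 0 };
    int rc = pselect(0, NULL, NULL, NULL, &timeout, &wait_mask);
    int interrupted = (rc == -1 && errno == EINTR);
    assert(rc <= 0);                /* no descriptors were being watched */

    sigaction(SIGUSR1, &old, NULL);
    sigprocmask(SIG_SETMASK, &orig, NULL);
    return !seen_early && interrupted && got_usr1 == 1;
}
```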

cryptonector · on April 17, 2018

First, as to SIGPIPE, I just don't want even to generate it (and try hard not to), much less catch it -- so if I can then I set its disposition to SIG_IGN.

Second, as to exit conditions, I always set a sig_atomic_t variable to indicate that an exit has been requested via a signal, and I check it in the event loop, so even if the pipe is full, the event loop will see this and cannot miss it.

Third, if need be I use multiple pipes, one per-signal. I've only ever done this for SIGCHLD and SIGUSR1/2 -- that is, I'll have a pipe for all the exit-request signals (e.g., SIGTERM), one for SIGCHLD (if the process creates children processes), and one for SIGUSR1 and/or SIGUSR2.

Fourth, the pollable signal notification interfaces have caveats. I don't even need to think about or know them as long as I have a reliable way to handle signals (which the self-pipe pattern is).

Fifth, signals are not reliable. Generally, signal events can be coalesced (yes, I know...) -- coalescing them in the self-pipe pattern by simply ignoring EAGAIN conditions is just as well (provided you leave evidence of the signal in a sig_atomic_t if you need that evidence).

Sixth, select()/poll() suck, a lot, but they are portable, and more so than pselect()/ppoll(). And if you're using select() or poll() then the self-pipe pattern works very well and yields portable code.

Of course, whenever possible I prefer to use kqueue, epoll, or Solaris event ports. Since I have to work with Linux (sadness) I have to work with epoll (much much sadness), and for all of epoll's warts, it's fantastically better than select()/poll(). Even so, I prefer to self-pipe than to use signalfd() or what have you. I've been using the self-pipe trick for well over a decade and it's never been a problem.

It's true that I don't usually implement language run-times where I need to handle synchronous signals such as SIGSEGV or SIGBUS in creative ways, and if I did I might well make use of longjmp() and such in signal handlers -- I'd probably write an async-signal-safe stack walker even, and stack trace printer, and I've looked into that in the past and know it can be and has been done. I do maintain a run-time, jq's, but it's a toy by comparison to, say, a JVM, and it does not need to play such games.

quotemstr · on April 18, 2018

Why does this have to be so hard? You don't need to worry about EAGAIN or the pipe buffer or whatever. Here's a simple waitable semaphore that never fills the pipe.

  struct waitsem {
    pthread_mutex_t down_lock;
    atomic_int count;  // atomic: waitsem_up may run in a signal handler
    int fd_r;
    int fd_w;
  };

  int
  waitsem_init(struct waitsem* ws)
  {
    memset(ws, 0, sizeof (*ws));
    if (pthread_mutex_init(&ws->down_lock, NULL)) {
      return -1;
    }
    int pipefd[2];
    // Instead use pipe2 and O_CLOEXEC if your OS supports it
    if (pipe(pipefd)) {
      pthread_mutex_destroy(&ws->down_lock);
      return -1;
    }
    ws->fd_r = pipefd[0];
    ws->fd_w = pipefd[1];
    return 0;
  }

  // This function is async signal safe.
  void
  waitsem_up(struct waitsem* ws)
  {
    if (!atomic_fetch_add_explicit(&ws->count, 1, memory_order_relaxed)) {
      // We transitioned 0->1, so it's our job to write a byte to the
      // pipe and wake up concurrent waiters.
      if (TEMP_FAILURE_RETRY(write(ws->fd_w, ws, 1)) != 1) {
        abort();  // POSIX prohibits failure.
      }
    }
  }

  // When ws->fd_r is readable, ws can probably be downed. Does not
  // block. (To perform a blocking down, wait on ws->fd_r.) Not async
  // signal safe.
  bool
  waitsem_try_down(struct waitsem* ws)
  {
    // N.B. We need ws->down_lock to bound the number of bytes in the
    // pipe. If we didn't take it, an arbitrary number of concurrent
    // waitsem_try_down callers could stall between decrementing
    // ws->count and reading the pipe byte, each allowing an
    // interleaving waitsem_up to write a byte and, in theory,
    // eventually overflow the pipe buffer. You might think "well,
    // waitsem_up() would just block until one of the concurrent
    // waitsem_try_down functions got around to finishing its read", but
    // you'd be wrong, since waitsem_up() can run on a thread also
    // running waitsem_try_down and deadlock!  With the mutex in place,
    // we can have at most two bytes in the pipe at one time, well below
    // the POSIX minimum PIPE_BUF of 512.

    // N.B. nothing in this function sleeps, so pthread_mutex_lock doesn't
    // actually block for any meaningful time. Thus, this function
    // behaves as a non-blocking function even though read(2) can, in
    // general, sleep.

    if (pthread_mutex_lock(&ws->down_lock)) {
      abort();  // POSIX prohibits failure
    }

    int old_counter = atomic_load_explicit(&ws->count, memory_order_relaxed);
    bool success = false;

    do {
      assert(old_counter >= 0);
      if (old_counter == 0) {
        goto out;
      }
    } while (!atomic_compare_exchange_weak_explicit(&ws->count,
                                                    &old_counter,
                                                    old_counter - 1,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed));

    if (old_counter == 1) {
      // We transitioned 1->0, so it's our job to read the byte out of
      // the pipe. We know the byte is in the pipe, so we don't actually
      // block here.
      char dummy;
      if (TEMP_FAILURE_RETRY(read(ws->fd_r, &dummy, 1)) != 1)
        abort();  // POSIX prohibits failure.
    }

    success = true;

  out:
    if (pthread_mutex_unlock(&ws->down_lock)) {
      abort();  // POSIX prohibits failure.
    }
    return success;
  }

  // Blocks until we can successfully down WS.
  void
  waitsem_down(struct waitsem* ws)
  {
    while (!waitsem_try_down(ws)) {
      struct pollfd pfd = {ws->fd_r, POLLIN, 0};
      poll(&pfd, 1, -1);
    }
  }

cryptonector · on April 18, 2018

This is... way more complicated than just ignoring EAGAIN because the implied coalescing is fine anyways.

cryptonector · on April 17, 2018

I didn't downvote you, fyi. You're right that longjmp() is on the POSIX async-signal-safe list [0]. I/O to sockets using system calls like sendmsg(2) is also permitted in signal handlers, because those calls are on the same list.

But it's still the case that the best thing to do (in 99% of cases) is to "self-pipe". If you don't have an event loop, then maybe you want to do something else, but it can get hairy right quick, and I'd much rather give advice most programmers can follow that will keep them out of trouble.

[0] http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2...

carterschonwald · on April 17, 2018

My limited knowledge on this topic seems to indicate that some signals are per thread in terms of handler registration rather than process wide. At least if you register the handler in that thread. Am I incorrect ?

quotemstr · on April 17, 2018

Signals are process wide. There is no confusion on this subject; people who claim there is are probably thinking of the delivery of process-directed signals, which is an entirely different subject. There is no current system for which sigaction acts per thread.

cryptonector · on April 17, 2018

POSIX is not clear on this. They are generally process-wide, but on Linux it depends on the pthread library.

EDIT: What would be nice is a way to set signal disposition on a process-wide basis as a "default" that could be overridden on a per-thread basis. But there is no flag for distinguishing "the process" from "this thread" in sigaction(2). Another detail to address is what happens when you pthread_create() a new thread from a thread with signal disposition "overrides": does the new thread inherit these, or not? Another detail: on fork(), should the calling thread's disposition "overrides" become the process-wide dispositions for the new process?

nrclark · on April 17, 2018

Sigaction is pretty cool. Signals still suck though, if you need to stick to the POSIX specs.

It's crazy that so many standard library functions will fail with an EINTR errno if your program gets a signal while it's in a library function. Basically any library call that ever does any I/O of any kind (that means common stuff like open/dup/read/write, but also less obvious stuff like getpwuid).

Sigaction provides the SA_RESTART flag which _should_ auto-restart most calls, but it doesn't work everywhere. And the functions which do and don't auto-restart aren't documented anywhere that I've ever found.

As a result, any high-reliability program that installs signal-handlers basically needs to have a retry mechanism for every libc call that can result in an EINTR. Some functions might not need it, but I've never found a good way to tell which ones do and which ones don't. Select(), for example, doesn't support SA_RESTART. Neither does wait(). And that means they all need to be checked every time, if you're concerned with portability and reliability. And that's nuts.
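The retry mechanism usually ends up looking like the wrapper below (glibc's `TEMP_FAILURE_RETRY` macro does the same job). This is only a sketch with invented names: the demo installs a SIGALRM handler without SA_RESTART, lets a timer interrupt a blocking `read()`, and has the handler itself write the byte that the retried read then picks up.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

static int demo_pipe[2];
static int interruptions;   /* how often read() came back with EINTR */

static void on_alarm(int sig) {
    (void)sig;
    int saved_errno = errno;
    if (write(demo_pipe[1], "!", 1) < 0) { /* ignored in this sketch */ }
    errno = saved_errno;
}

/* read() that retries when a signal handler interrupted the call. */
ssize_t read_retry(int fd, void *buf, size_t len) {
    ssize_t n;
    while ((n = read(fd, buf, len)) == -1 && errno == EINTR)
        interruptions++;
    return n;
}

/* Returns 1 if the read completed despite being interrupted, else -1. */
int eintr_demo(void) {
    if (pipe(demo_pipe) != 0) return -1;

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sa.sa_flags = 0;                 /* deliberately no SA_RESTART */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old) != 0) return -1;

    struct itimerval tv = { .it_value = { .tv_sec = 0, .tv_usec = 50000 } };
    if (setitimer(ITIMER_REAL, &tv, NULL) != 0) return -1;

    char c = 0;
    ssize_t n = read_retry(demo_pipe[0], &c, 1);   /* blocks until the alarm */
    assert(interruptions >= 0);

    sigaction(SIGALRM, &old, NULL);
    close(demo_pipe[0]);
    close(demo_pipe[1]);
    return (n == 1 && c == '!') ? 1 : -1;
}
```

Setting `sa.sa_flags = SA_RESTART` would make this particular `read()` restart on its own, but the wrapper is still needed for the calls that SA_RESTART doesn't cover.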

If I could change one thing about sigaction in POSIX, it would be to add an SA_NOINTR flag to the sigaction API, which would let any POSIX library calls finish before presenting a signal.

That, plus I'd also add Linux's signalfd() to the POSIX standard so that we could do away with "this one weird trick from a DJB" that everybody uses now (and I'd make signalfd use the new SA_NOINTR).

cryptonector · on April 17, 2018

If you have a single-threaded program and you want to set a `do_exit` variable in your handler and hope the program notices it soon, you'll want your program not to be blocking in some system call for an I/O event that might never come. This is the reason that EINTR exists.

It might have been better to have "restart-on-signal" / "interrupt-on-signal" be an argument to every system call that can block, but it's a bit late for that. So EINTR it is, and you just have to know -- it's all part of the cognitive load of dealing with POSIX/Unix/Linux/whatever. Win32 has its own things to know about (and plenty of them).

The cognitive load is, I'll say, quite heavy, but you can always use a higher-level programming language whose run-time hides all of this for you.

noselasd · on April 17, 2018

The system calls that are not restarted even when SA_RESTART is set are documented here for Linux: http://man7.org/linux/man-pages/man7/signal.7.html

Look in the "Interruption of system calls and library functions by signal handlers" chapter after the "The following interfaces are never restarted after being interrupted" text.

nrclark · on April 18, 2018

That's actually pretty helpful! Do you know if there's an equivalent in the POSIX spec? I looked around for a while and didn't find one. I want my daemon to work on more than just Linux, so I'm trying hard to stick to pure POSIX.

iforgotpassword · on April 17, 2018

Agreed. Years ago I wrote a daemon for Linux with glibc only, so I assumed I can just ignore EINTR everywhere, until I needed to use valgrind and suddenly had EINTRs everywhere.

Rather than making signalfd POSIX, if we play make-a-wish let's just go for kqueue. Signalfd feels somewhat hackish, even though its simplicity admittedly has some appeal too.

xenadu02 · on April 17, 2018

Signals are an example of false simplicity: they themselves are very simple at the cost of foisting the complexity on you and every library you use.

cryptonector · on April 17, 2018

+1 to kqueue. Burn epoll in a bonfire.