Why isn't memset() async-signal-safe? (conman.org)
42 points by spc476 58 minutes ago | 23 comments





The post makes it a bit unclear, so for the record: there's no theoretical reason the direction flag should be a problem. Just like any other callee-saved register, the kernel needs to save it to the stack and restore it when returning from the signal handler; there's nothing about the direction flag that makes it harder to do so, even if a signal handler is interrupting another signal handler or whatnot. And before branching to the signal handler function, just like it sets the registers used for function parameters to the correct values for the function signature, just like it aligns the stack pointer, the kernel (or a libc stub) needs to set the direction flag to 0. This is all defined by the ABI specification.

It's just that some kernels fail to do this (or did in the past), probably because the direction-flag requirement is less well known and most programs won't crash if it's neglected.

Couple of things.

First, I don't think that's true nowadays. The discussion points out that this (not restoring DF) is a kernel bug and should be fixed: https://lkml.org/lkml/2008/3/5/531

Second, the kernel tries to avoid doing too much work in the signal return code. It tries hard to _avoid_ a heavy xsave and just preserve the needed registers.

There was a similar discussion about restoring the SS (stack segment) register in signal code:

First proposed by Bryan Ford: https://lkml.org/lkml/2005/10/5/176

Then 10 years later by Andy Lutomirski: https://lkml.org/lkml/2014/7/11/564

The gist: if you modify the SS register in the signal handler you are screwed. The only way around it is to install a trampoline using "the famous dosbox iret hack" described here:

http://www.x86-64.org/pipermail/discuss/2007-May/009913.html

(sadly the site is down, can anyone find a mirror?)

Something that's always baffled me: why don't CPUs have "save ALL state" and "restore ALL state" instructions? Why does every new set of CPU registers seem to require an OS update to save them on context switches?

They do, now. On current Intel CPUs, you can use xsave and xrstor to save and load the complete state, including all new state information. Ring 0 code can ask the CPU for the size of that state (via CPUID leaf 0xd), and allocate the appropriate amount of space per task.
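For anyone who wants to see the number, here is a minimal sketch of the CPUID leaf 0xd query, run from an ordinary user-space program rather than ring 0. It assumes GCC or Clang on x86 (for `<cpuid.h>` and `__get_cpuid_count`); the function name `xsave_area_max_size` is made up for this example, and it simply returns 0 on other architectures or when the leaf is unavailable.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical helper: size in bytes of the XSAVE area needed for every
 * state component this CPU supports, or 0 if it cannot be determined. */
uint32_t xsave_area_max_size(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 0xd, sub-leaf 0: EBX is the size for the components currently
     * enabled in XCR0, ECX the size for all components the CPU supports. */
    if (!__get_cpuid_count(0x0d, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return ecx;
#else
    return 0; /* not x86: nothing to report */
#endif
}
```

Printing the result with `printf("%u\n", xsave_area_max_size());` gives the per-task buffer size the comment above refers to; the value differs between CPU generations.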

Oh wow. When did this change? I vaguely seem to remember that as recently as Windows 7 (or was it 8.0?) there was trouble with AVX2 or something, but I can't find the info anywhere at the moment.

Perhaps you're thinking of this debugger issue?

https://randomascii.wordpress.com/2013/03/11/should-this-win...

An OS can still use xsave incorrectly, such as by hardcoding the expected size rather than detecting it at runtime, or by getting some aspect of the CPUID leaf 0xd enumeration wrong. I wouldn't find it surprising if an initial implementation got one of the details wrong, resulting in a bug that wouldn't manifest until the next time the xsave layout changed.
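A rough sketch of the "detect it at runtime" side, assuming the size has already been obtained from CPUID leaf 0xd and is passed in by the caller. The name `alloc_xsave_area` is hypothetical; the 64-byte alignment is what XSAVE requires of its save area, and `aligned_alloc` assumes a C11 libc.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a save area of the size the CPU reported,
 * instead of a compile-time constant. Returns NULL on failure. */
void *alloc_xsave_area(size_t reported_size)
{
    const size_t align = 64; /* XSAVE needs a 64-byte aligned buffer */

    if (reported_size == 0)
        return NULL;

    /* aligned_alloc wants the size to be a multiple of the alignment. */
    size_t rounded = (reported_size + align - 1) / align * align;
    return aligned_alloc(align, rounded);
}
```

The point of the design is only that no size constant appears in the code, so a CPU with a larger layout gets a larger buffer without a rebuild.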

When the number of state variables gets bigger, that buffer needs to get bigger. Need some cooperation from the operating system to increase the size of the buffers, that's all.

That doesn't mean the CPU can't report the size necessary using another instruction though. There's no need for the OS code to change.

The more state you save, the larger the latency in handling the interrupt. And the amount of state in modern CPUs can be quite large indeed.

Any idea how much it is? (just as a guess, I'd guess like maybe 8 KiB?)

After some careful reading of the linked bug report, apparently saving the direction flag wasn't the issue. Rather, clearing it upon entering a signal handler was the issue.

Note that (a) the issue was with clearing the direction flag on signal handler entry, not saving it; and (b) it's since been fixed in the kernel to conform to the ABI (which GCC blindly trusted) [1].

And after reading that thread, I'm not as convinced as I was a few minutes ago that this was an obvious kernel issue. Yes, the kernel mismatched the published ABI, but callee-vs-caller save and setup is not always consistent, and for good reason.

E.g. registers are usually callee-save, since there are many registers and small functions use few of them. (This is what the kernel assumed.) But rarely-set/often-used flags (such as the flag in question) may make more sense as caller-save and even caller-setup, since this reduces save/setup overhead to only the cases where the flag is actually set. (This is what the ABI dictated and GCC assumed.)

[1] https://lkml.org/lkml/2008/3/5/306

Seems like a good argument against an architecture having stateful flags that affect the execution of other instructions. Or, at least, against having such flags and not including them in the state saved and restored when switching contexts (including to signals).

Related: Unix signals are deemed "unfixable" by some: https://lwn.net/Articles/414618/

That's an extreme perspective. I've made extensive use of signals in programs that have shipped to hundreds of millions of users. If you're careful, they work fine. Yes, POSIX needs something like NT's vectored exception handlers that could allow multiple users of the same signal to cooperate: but that's an API problem, not something inherently "unfixable" about a program being interrupted and temporarily doing something else for a bit.
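To make the "multiple users of the same signal" problem concrete, here is a minimal sketch of the manual chaining that programs do today with plain `sigaction`. Everything named here (`earlier_handler`, `chaining_handler`, the two install functions and the counters) is invented for the example; it assumes a POSIX system and a single-threaded process, and it is not a substitute for a real vectored-handler API.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t earlier_calls; /* bumped by the first user  */
static volatile sig_atomic_t chained_calls; /* bumped by the second user */
static struct sigaction previous_action;    /* what was installed before */

/* First user of the signal: a plain one-argument handler. */
static void earlier_handler(int signo)
{
    (void)signo;
    earlier_calls++;
}

/* Second user: does its own work, then forwards to whoever came first. */
static void chaining_handler(int signo, siginfo_t *info, void *ctx)
{
    chained_calls++;
    if (previous_action.sa_flags & SA_SIGINFO) {
        if (previous_action.sa_sigaction != NULL)
            previous_action.sa_sigaction(signo, info, ctx);
    } else if (previous_action.sa_handler != SIG_DFL &&
               previous_action.sa_handler != SIG_IGN) {
        previous_action.sa_handler(signo);
    }
}

int install_earlier_handler(int signo)
{
    struct sigaction sa = {0};
    sa.sa_handler = earlier_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

int install_chained_handler(int signo)
{
    struct sigaction sa = {0};
    sa.sa_sigaction = chaining_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    /* The old action is saved so the handler above can forward to it. */
    return sigaction(signo, &sa, &previous_action);
}

int earlier_call_count(void) { return (int)earlier_calls; }
int chained_call_count(void) { return (int)chained_calls; }
```

This only works if every user of the signal is this polite, and nothing stops a later `sigaction` call from dropping the chain, which is the API gap being described.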

Where "some" includes everyone who has ever had to use them.

It isn't POSIX, but Linux does have the very useful signalfd, which creates a file descriptor that accepts signals. This is a good solution for some types of programs.
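For reference, a minimal sketch of the signalfd pattern, assuming Linux and a single-threaded program. The helper names `open_signal_fd` and `read_signal` are made up here; `signalfd`, `sigprocmask` and `struct signalfd_siginfo` are the real interfaces.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: block `signo` and return a descriptor that delivers
 * it as readable data. Returns -1 on failure. */
int open_signal_fd(int signo)
{
    sigset_t mask;

    sigemptyset(&mask);
    sigaddset(&mask, signo);

    /* The signal must be blocked first; otherwise it is still delivered
     * asynchronously and never shows up on the descriptor. */
    if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1)
        return -1;

    return signalfd(-1, &mask, SFD_CLOEXEC);
}

/* Hypothetical helper: read one pending signal from the descriptor and
 * return its number, or -1 on a short read or error. Blocks if none. */
int read_signal(int fd)
{
    struct signalfd_siginfo info;
    ssize_t n = read(fd, &info, sizeof info);

    if (n != (ssize_t)sizeof info)
        return -1;
    return (int)info.ssi_signo;
}
```

Because the signal arrives through `read`, the code that handles it runs in normal context and can call `memset` or anything else; the descriptor can also be handed to `poll` or `epoll` alongside sockets.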

But as pointed out in this old HN story, it is useless in this case.

https://news.ycombinator.com/item?id=9564975

I like this site, it's like one of those really classic hacker (in the woz sense) sites from 1995.

Tricky! Even writing your own memset would experience the same behavior, since the compiler will assume the direction flag is unset.
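One defensive workaround, sketched here as an assumption rather than anything the article or the kernel thread prescribes: clear the direction flag yourself at the top of the handler before touching any buffer. The names `clear_direction_flag` and `handler_safe_zero` are hypothetical, the inline assembly assumes GCC or Clang, and on non-x86 targets the `cld` step compiles to nothing.

```c
#include <stddef.h>

/* Hypothetical helper: force DF = 0 so that any string instructions the
 * compiler emits afterwards run in the direction the ABI promises. */
static inline void clear_direction_flag(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("cld" ::: "cc");
#endif
}

/* Hypothetical handler-side replacement for memset(buf, 0, len). */
void handler_safe_zero(volatile unsigned char *buf, size_t len)
{
    clear_direction_flag();

    /* Volatile byte stores keep the compiler from turning this loop back
     * into a memset call or a rep stos sequence. */
    for (size_t i = 0; i < len; i++)
        buf[i] = 0;
}
```

On a kernel that already clears DF on handler entry the `cld` is redundant but harmless.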

I agree with colanderman; the kernel should be saving every register when the signal handler is entered.

This was a bug in the POSIX standard that has since been fixed.

