The post makes it a bit unclear, so for the record: there's no theoretical reason the direction flag should be a problem. Just like any other callee-saved register, the kernel needs to save it and restore it when returning from the signal handler; there's nothing about the direction flag that makes this harder, even if a signal handler is interrupting another signal handler. And before branching to the signal handler function, the kernel (or a libc stub) needs to set the direction flag to 0, just as it sets the registers used for function parameters to the correct values for the function signature and aligns the stack pointer. This is all defined by the ABI specification.
It's just that some kernels fail to do this (or did in the past), probably because the direction-flag requirement is less well known and most programs won't crash if it's neglected.
I'm confused too. On x86, EFLAGS (of which the direction flag is bit 10) is absolutely preserved across signal delivery. It's the spot where the comparison result bits go, so if the issue in the linked blog post was real, it would be impossible to deliver a signal across a test/jump pair, making basically all compiled code signal-unsafe.
I don't know what historical architectures may have had a bug with this, but it's not true of x86. If memset isn't signal-safe on modern x86 Linux, it's surely not because of EFLAGS state management.
There are a whole bunch of almost-correct comments here along with a surprising number of "it's been so long it must be fixed".
The alleged bug is that Linux didn't clear DF on signal entry. (This has nothing to do with what is saved or restored. Flags have to be saved and restored and, AFAIK, always were.) The x86 ABI is crystal clear: C functions are called with DF clear. Neither glibc nor Linux cleared it before calling a signal handler, so there was a bug.
But the bug was fixed in March 2008 for Linux 2.6.25. [1]
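For the curious, here's a minimal probe sketch (my own illustration, assuming x86-64 Linux and GCC-style inline asm, not something from the bug report): deliberately set DF, raise a signal, and check what the handler sees. On a pre-2.6.25 kernel the handler would have observed DF still set; on a fixed kernel it observes DF clear, as the ABI requires.

#include <signal.h>
#include <stdio.h>

static volatile sig_atomic_t df_seen_in_handler;

static void handler(int sig)
{
    unsigned long flags;
    (void)sig;
    __asm__ volatile ("pushf\n\tpop %0" : "=r" (flags));
    df_seen_in_handler = (flags >> 10) & 1;   /* DF is bit 10 of EFLAGS */
}

int main(void)
{
    signal(SIGUSR1, handler);
    __asm__ volatile ("std");   /* deliberately violate the ABI for the test */
    raise(SIGUSR1);
    __asm__ volatile ("cld");   /* restore the state compiled code expects */
    printf("DF seen on handler entry: %d\n", (int)df_seen_in_handler);
    return 0;
}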
POSIX doesn't guarantee that memset is async-signal-safe, that much is clear, and flags like this are even a reasonable rationale. But it sounds like post-2008 memset is, and should be, async-signal-safe on Linux. Are there relevant systems on which it isn't? IME it's impractical to write code that will work on all POSIX-compliant systems, as the standard leaves too much undefined and there is no test suite that will let you determine when your code depends on something beyond the minimum requirements of POSIX.
> As for GNU/Linux, can you assure memset works that way in all hardware platforms supported by Linux?
I don't know, that's why I'm asking.
I'm not going to try to write code portable to all theoretically possible unices - I doubt anyone has managed that for nontrivial programs, and it's too hard to tell. Hell, I'm content to exclude a fair few systems I know exist (CHAR_BIT != 8, non-IEEE FP). If it's supported on all the platforms I've heard of, I'll do it - it's just not practical to hope to be POSIX-compliant in a language-lawyer sense.
> I doubt anyone has managed that for nontrivial programs, and it's too hard to tell.
In the early 2000s, the company I worked for was deploying UNIX-based software for GNU/Linux, FreeBSD, HP-UX, AIX, Solaris and Windows (yes, Windows, not a typo).
Sure, writing a program that works on 6 actually existing unices with test instances (or even just documentation) is relatively easy. Writing one for all possible unices including some that don't exist at the time of writing is a lot harder.
Note that (a) the issue was with clearing the direction flag on signal handler entry, not saving it; and (b) it's since been fixed in the kernel to conform to the ABI (which GCC blindly trusted) [1].
And after reading that thread, I'm not as convinced as I was a few minutes ago that this was an obvious kernel issue. Yes, the kernel mismatched the published ABI, but callee-vs-caller save and setup is not always consistent, and for good reason.
E.g. registers are usually callee-save, since there are many registers and small functions use few of them. (This is what the kernel assumed.) But rarely-set/often-used flags (such as the flag in question) may make more sense as caller-save and even caller-setup, since this reduces save/setup overhead to only the cases where the flag is actually set. (This is what the ABI dictated and GCC assumed.)
First, I don't think that's true nowadays. The discussion points out that this (not restoring DF on signal return) is a kernel bug and should be fixed: https://lkml.org/lkml/2008/3/5/531
Second, the kernel tries to avoid doing too much work in the signal return code. It tries hard to _avoid_ a heavy XSAVE and to preserve only the needed registers. This is the job of the sigreturn(2) syscall, btw: http://man7.org/linux/man-pages/man2/sigreturn.2.html
The gist: if you modify the SS register in the signal handler you are screwed. The only way around it is to install a trampoline using "the famous dosemu iret hack" described here:
> Last one was merged into mainline. Then reversed. Sadly.
> Then applied again: https://github.com/torvalds/linux/commit/6c25da5ad55d48c41b8....
>
> The gist: if you modify the SS register in the signal handler you are screwed. The only way around it is to install a trampoline using "the famous dosemu iret hack" described here:
I'm not sure what you mean. This issue is fixed, so SS works exactly the way you would expect it to, unless you do very strange things indeed in which case you might need to fiddle with uc_flags.
Strange or not, Linux did not restore SS on sigreturn in 64-bit mode. The point of explaining that is to emphasize that the issue described in the blog post is not the only one in this code (the code handling entry to and exit from signal handlers).
Okay, I read your post as meaning that it was still broken.
FWIW, I don't consider it sad that my first attempt was reverted. I inadvertently broke some assumptions that a real program (DOSEMU) was making, and Linux takes backwards compatibility quite seriously. The second version was better.
The site is down again? What's wrong with that site? A while back when I wanted to look something up in the x86_64 ABI the site was down too, for at least a week (I eventually gave up checking). At some point later it came back. I wonder how long it's been down this time.
Interestingly, Xcode's OS X docs used to link to x86-64.org for the AMD64 ABI, but looking right now, the link is removed (but the text remains), I guess because the site was down for so long.
I mean, kernel bug or no, if that's how it actually works, then it isn't actually safe to handle signals while executing these functions. If people are still working to change the code and the standards we hold the code to, then you can't push changes that rely on the new behavior without introducing some truly bizarre race conditions.
I mean, that's pretty optimistic, but perhaps: At some point you can assume that, if people get your updated code in 2017, they'll have an update from many years ago.
Depends on how releases are managed, but that might be much much more practical an assumption.
Seems like a good argument against an architecture having stateful flags that affect the execution of other instructions. Or, at least, against having such flags and not including them in the state saved and restored when switching contexts (including to signals).
No disagreement there. In a new OS that doesn't need 100% POSIX compatibility, I'd eliminate the standard signal model, and only have signalfd (plus a SIGKILL equivalent and a separate stop/continue mechanism similar to SIGSTOP/SIGCONT).
How would you handle being able to trap and fix bad memory access (i.e. hooking SIGSEGV/SIGBUS) with this scheme? Lots of programs use this for various reasons; off the top of my head at least a few generational GCs implement their write barrier by setting pages read-only and noticing the trap and resetting RW on SEGV...
(Yes, I believe something like card marking is more performant than this in several ways on modern CPUs, but then you need to insert card marking instructions into your code generator... trapping SEGV is easy and only happens in the GC.)
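A bare-bones sketch of that write-barrier trick (my own, with made-up names; Linux, a single page, error handling omitted): mark the page read-only, catch the fault in a SIGSEGV handler, record the page as dirty, flip it back to read-write, and let the faulting store restart.

#define _DEFAULT_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096                 /* assumed, not queried */

static char *heap;                     /* hypothetical GC-managed page */
static volatile sig_atomic_t dirty;    /* a one-page "card" */

static void on_segv(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)ctx;
    uintptr_t page = (uintptr_t)si->si_addr & ~(uintptr_t)(PAGE_SIZE - 1);
    dirty = 1;                                        /* remember the write */
    mprotect((void *)page, PAGE_SIZE, PROT_READ | PROT_WRITE);
    /* returning restarts the faulting store, which now succeeds */
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, NULL);

    heap = mmap(NULL, PAGE_SIZE, PROT_READ,           /* start read-only */
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    heap[0] = 42;                                     /* triggers the barrier */
    printf("dirty = %d, heap[0] = %d\n", (int)dirty, heap[0]);
    return 0;
}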
Also there's all the debugging and process inspection stuff that works via signals, all of which depends on being able to really interrupt code rather than just stuff a message in a queue.
For trapping memory accesses, either something like userfaultfd, or otherwise handling segfaults via a signalfd from another thread with its own independent stack. (I'd also eliminate limitations about using a signalfd only within the same process that would have received the signals.)
Or, if you just need a "dirty page" bit, add a dedicated mechanism for that, which can use hardware features to run much faster without having to trap.
For debugging, we can do much better than ptrace. Linux already has dedicated syscalls to read and write another process's memory. Add some mechanisms to read and write registers, and extend the process stop/continue mechanism to allow single-stepping and stop-on-event (such as stop-on-syscall, or BPF-based filtering). I don't see any reason why debugging a process needs to incorporate signals.
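For concreteness, the "read another process's memory" syscalls mentioned above are process_vm_readv/process_vm_writev; here's a rough sketch (mine, Linux-specific) of reading a few bytes from a target PID without ptrace or signals:

#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/uio.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    if (argc < 3) {
        fprintf(stderr, "usage: %s <pid> <hex-address>\n", argv[0]);
        return 1;
    }
    pid_t pid = (pid_t)atoi(argv[1]);
    void *remote_addr = (void *)(uintptr_t)strtoull(argv[2], NULL, 16);

    char buf[64];
    struct iovec local  = { .iov_base = buf,         .iov_len = sizeof buf };
    struct iovec remote = { .iov_base = remote_addr, .iov_len = sizeof buf };

    ssize_t n = process_vm_readv(pid, &local, 1, &remote, 1, 0);
    if (n < 0) { perror("process_vm_readv"); return 1; }
    printf("read %zd bytes from pid %d\n", n, (int)pid);
    return 0;
}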
How so? The other thread would handle the signal from a well-defined point (reading the signalfd), and the thread that faulted would stop at the point of the fault until the thread processing the signal let it continue.
What if the stopped thread was holding a mutex? The thread handling the signalfd can only safely call async signal safe functions, exactly like a signal handler.
edit: technically of course full reentrancy (implied by async signal safety) is not strictly required, "only" a fully non-blocking implementation of every function called by the signalfd thread.
Ah, I see your concern. Right, if you called a function that took locks and then faulted, the thread handling the fault can't attempt to take the same locks. That should result in far fewer restrictions than "async-signal-safe", though.
You do not know which lock it was holding though; when handling a segfault, for example, you need to be pessimistic and assume that you can't touch any lock (think of the allocator lock).
Async signal safety implies both reentrancy and non-blocking algorithms [1]. You might not need reentrancy, but you do need non-blocking. That's really a significant restriction, as libraries with non-blocking guarantees are rare.
Depending on the nature of your segfault handler, you could either make sure you have any data structures you need already allocated, or allocate out of a separate arena. Or, alternatively, handle the signal in another process entirely.
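A tiny sketch of the "already allocated" approach (names are mine): everything the fault-handling thread might need is carved out of a buffer reserved up front, so it never touches malloc() or any lock the faulting thread could be holding.

#include <stddef.h>

static unsigned char fault_arena[64 * 1024];   /* reserved before any fault */
static size_t fault_arena_used;

/* Only ever called from the single fault-handling thread, so no locking. */
static void *fault_alloc(size_t n)
{
    n = (n + 15) & ~(size_t)15;                /* keep 16-byte alignment */
    if (fault_arena_used + n > sizeof fault_arena)
        return NULL;                           /* degrade gracefully, never block */
    void *p = fault_arena + fault_arena_used;
    fault_arena_used += n;
    return p;
}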
That's an extreme perspective. I've made extensive use of signals in programs that have shipped to hundreds of millions of users. If you're careful, they work fine. Yes, POSIX needs something like NT's vectored exception handlers that could allow multiple users of the same signal to cooperate: but that's an API problem, not something inherently "unfixable" about a program being interrupted and temporarily doing something else for a bit.
Unix signals are an OS attempt at giving the user a portable abstraction over software interrupts (HW interrupts & traps). They confusingly employ much of the same jargon (masking, priorities, bottom halves...).
My intuition, knowing the HW/ASM and the code but not really how POSIX works (I've read Stevens, but I don't have all my answers), is that unices reimplement in SW what the HW does with wires.
Maybe this could be fixed at the HW level (though I guess that would require kernel privileges)?
Something that's always baffled me: why don't CPUs have a "save ALL state" and "restore ALL state" instructions? Why does every new set of CPU registers seem to require an OS update to save them on context switches?
They do, now. On current Intel CPUs, you can use xsave and xrstor to save and load the complete state, including all new state information. Ring 0 code can ask the CPU for the size of that state (via CPUID leaf 0xd), and allocate the appropriate amount of space per task.
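For example, user-space code (or a kernel, using the same leaf) can ask how big the XSAVE area needs to be; a quick sketch assuming GCC/Clang's cpuid.h:

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid_count(0x0D, 0, &eax, &ebx, &ecx, &edx)) {
        puts("CPUID leaf 0xD not supported");
        return 1;
    }
    /* Sub-leaf 0: EBX = bytes needed for the features enabled in XCR0,
       ECX = bytes needed if every supported feature were enabled. */
    printf("XSAVE area, currently enabled features: %u bytes\n", ebx);
    printf("XSAVE area, all supported features:     %u bytes\n", ecx);
    return 0;
}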
Oh wow. When did this change? I vaguely seem to remember that as recently as Windows 7 (or was it 8.0?) there was trouble with AVX2 or something, but I can't find the info anywhere at the moment.
An OS can still use xsave incorrectly, such as by hardcoding the expected size rather than detecting it at runtime, or by getting some aspect of the CPUID leaf 0xd enumeration wrong. I wouldn't find it surprising if an initial implementation got one of the details wrong, resulting in a bug that wouldn't manifest until the next time the xsave layout changed.
After some careful reading of the linked bug report, apparently saving the direction flag wasn't the issue. Rather, clearing it upon entering a signal handler was the issue.
When the number of state variables gets bigger, that buffer needs to get bigger. Need some cooperation from the operating system to increase the size of the buffers, that's all.
I don't remember how many there were for AVX, but let's say there are 8 512-bit registers (512 bytes)? and then the equivalent in other kinds of registers, so roughly 1 KiB. Just an order-of-magnitude estimate, nothing I expect to be too accurate.
I'm also not an expert, but I think the bigger cost here is memory latency. Size correlates with, but scales differently from, the switching cost of registers, because it's zero-sum: a register saved is a register waited on, twice. There's also the cost of decoding and executing the instructions to store/restore the register values on both ends.
I don't get it, you have to save and restore all the state on context switches either way. The only question is whether you're doing it with a generic instruction or through some other more specialized instructions. It's not a question of whether they should be saved and restored at all.
That said, I'm confused why you replied to my comment above, since it's totally off-topic. This thread chain was asking how much state exists; maybe you were trying to reply to another thread?
> I don't get it, you have to save and restore all the state on context switches either way.
Actually, no, you don't, at least not on _all_ context switches. What you have to save is what the switched-to routine will overwrite. For something like an interrupt handler (where latency often matters very much), if you know it will only modify, for example, eflags and EAX, then you only need to save eflags and EAX on entry and restore them on exit from the handler. The registers that are not modified remain identical from entry to exit, and time is saved by not pushing/popping them needlessly.
> I don't get it, you have to save and restore all the state on context switches either way.
Actually, you don't. I'm not up to date on what's done in this direction by current kernels, but a while back, a patch was merged that disabled floating point and MMX operations by default for a process, and only enabled them once some FP or MMX instruction led to an exception, which allowed the kernel to avoid saving and restoring FP state for processes that weren't actually doing any FP/MMX operations.
This article seems flat out wrong. Since the direction flag is correctly saved and restored, everything is cool.
When some code is interrupted and the signal or interrupt handler calls memmove, that memmove will set up the direction flag for itself correctly. Its entire execution is nested within the handler. If it is interrupted by a nested interrupt, that nested one will restore the flag.
Now if an implementation of the memset function happens not to care about that flag, so that it changes direction from call to call, that's not a signal or interrupt problem! The flag can have an arbitrary value on entry to memcpy in an ordinary situation not involving threads or signals.
The moral is: never use the looping primitives on Intel without setting up the direction flag, if you care about reproducibility.
It goes without saying that the flag is part of the machine state; any machine context saving mechanism (for async situations) is broken if it neglects that flag.
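To make that rule concrete, a small sketch of mine (x86-64, GCC-style inline asm, not how any particular libc does it): a hand-rolled use of the string primitives establishes DF itself rather than trusting whatever value it inherited.

#include <stddef.h>

static void zero_forward(void *dst, size_t len)
{
    __asm__ volatile (
        "cld\n\t"            /* explicitly clear DF: count upward */
        "rep stosb"          /* store AL into [rdi], rcx times */
        : "+D" (dst), "+c" (len)
        : "a" (0)
        : "memory", "cc");
}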
> The flag can have an arbitrary value on entry to memcpy in an ordinary situation not involving threads or signals.
It cannot have an arbitrary value on entry. The x86-64 ABI mandates that the flag be clear before any function is called (3.2.1 Registers and the Stack Frame: "The direction flag in the %eflags register must be clear on function entry, and on function return.")
OK, there we go, then. A compiler that conforms to the ABI generates code which clears the flag, if necessary. Interrupts preserve the flag. Thus, the flag should not be surprisingly set on entry into memcpy. If it is, there is a bug.
> A compiler that conforms to the ABI generates code which clears the flag, if necessary.
I'm sorry, I do not follow. A conforming compiler expects functions to be called with the flag clear. The non-conforming kernel, on the other hand, does not clear the flag before calling the signal handler, which breaks that assumption.
The terrifying thing is that while the programmer might know that memset() isn't async-signal-safe and not use it in a signal handler, the compiler may blissfully and silently optimize the code to use memset() anyway. Odd crashes or, worse, security leaks may result.
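A quick illustration of the hazard (my own example, not from the article): the handler below never names memset(), but at higher optimization levels GCC and Clang are allowed to recognize the zeroing loop and lower it to a memset() call, so the handler ends up depending on memset()'s async-signal-safety anyway.

#include <signal.h>
#include <stddef.h>
#include <stdio.h>

static char scratch[4096];
static volatile sig_atomic_t fired;

static void handler(int sig)
{
    (void)sig;
    for (size_t i = 0; i < sizeof scratch; i++)   /* may be lowered to memset() */
        scratch[i] = 0;
    fired = 1;
}

int main(void)
{
    signal(SIGUSR1, handler);
    raise(SIGUSR1);
    printf("handler ran: %d\n", (int)fired);
    return 0;
}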
Reminds me a bit of old floating point implemented in software. Some implementations had global scratchpad registers. Very weird things happened if you did any floating point operations in multiple threads.
If memset is actually not async-signal-safe on some target, and a compiler targeting the same does that transformation, the end results are very unsound...
It isn't POSIX, but Linux does have the very useful signalfd, which creates a file descriptor that accepts signals. This is a good solution for some types of programs.
It is an easy solution to the problem discussed above though. Yeah, all the mechanics of setting up your signals still are a PITA, but at least you don't have to worry about async-safe functions anymore.
Really all signalfd does is provide an easy way to handle signals in an epoll driven application. If you have a different kind of main loop, then you are probably back to stuffing messages into queues, or dealing with all of the async-safe BS.
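For anyone who hasn't used it, the pattern looks roughly like this (Linux-only sketch, error handling trimmed): block the signal for normal delivery, then read it as ordinary data from the descriptor.

#include <sys/signalfd.h>
#include <signal.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGTERM);

    /* Block normal delivery so the signal stays queued for the fd. */
    sigprocmask(SIG_BLOCK, &mask, NULL);

    int sfd = signalfd(-1, &mask, 0);
    if (sfd < 0) { perror("signalfd"); return 1; }

    struct signalfd_siginfo si;
    ssize_t n = read(sfd, &si, sizeof si);   /* an ordinary read(); no handler runs */
    if (n == (ssize_t)sizeof si)
        printf("got signal %u from pid %u\n", si.ssi_signo, si.ssi_pid);
    return 0;
}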
Making sure the direction flag is clear on entry and return is a somewhat hidden requirement of stdcall (the Windows calling convention), but setting it in a WindowProc and forgetting to clear it didn't crash anything until Windows XP.
More recently, Vista had (accidentally?) 16-byte-aligned stacks when using OpenMP with MinGW, but the exact same binary crashes on Windows 10 when it tries to use unaligned SSE instructions.
I was under the impression that signals are delivered to a process only when the process is executing a syscall. Given that memset and memmove don't execute any syscalls, they will not be interrupted by a signal handler. Am I wrong?
Async signal safety refers to the functions being called in the handler, not the interrupted code. Also signals can interrupt the code just as normal thread preemption would, not only while the thread is inside a system call.
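A tiny demonstration of that point (my own sketch): SIGALRM lands in the middle of a pure busy loop that never enters the kernel.

#include <signal.h>
#include <unistd.h>
#include <stdio.h>

static volatile sig_atomic_t interrupted;

static void on_alarm(int sig) { (void)sig; interrupted = 1; }

int main(void)
{
    signal(SIGALRM, on_alarm);
    alarm(1);                        /* deliver SIGALRM in about a second */

    unsigned long spins = 0;
    while (!interrupted)             /* no syscalls inside this loop */
        spins++;

    printf("loop was interrupted after %lu spins\n", spins);
    return 0;
}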
Returning from the signal handler occurs in user space. It's the C library's handling of signal save and return that matters here, not what the kernel does.
You're right; "sigreturn" wasn't in early UNIX systems but appeared in 4.3BSD. That job can also be done in user space, but there's a timing window against other signals if done that way.
This (a signal) is the de-facto mechanism for notifying a program that an unexpected event just occurred (typically a SEGV). This mechanism needs to interrupt the normal program flow (because you usually cannot restart an invalid access), possibly interrupting a C-library call (such as malloc, which is why malloc is generally not async-signal-safe). What would you do instead? On Windows there is a built-in "__try / __except" mechanism ("structured exception handling", SEH), which provides exception-like catching in plain C via a proprietary extension. It's probably not that much better a method.
It actually really, really is. At least from a technical perspective on x64. Having language-level lexical scoping of system-level exceptions is incredibly useful and once you've grokked how NT's trap handler, PE .xdata sections, prologues and epilogues all work in concert, signals just seem barbaric.
Here's an example trapping an access violation:
//
// Prefault the page.
//

TRY_MAPPED_MEMORY_OP {
    TraceStore->Rtl->PrefaultPages(PrefaultMemoryMap->NextAddress, 1);
} CATCH_EXCEPTION_ACCESS_VIOLATION {
    //
    // This will happen if servicing the prefault off-core has taken longer
    // for the originating core (the one that submitted the prefault work)
    // to consume the entire memory map, then *another* memory map, which
    // will retire the memory map backing this prefault address, which
    // results in the address being invalidated, which results in an access
    // violation when we try and read/prefault it from the thread pool.
    //
    TraceStore->Stats->AccessViolationsEncounteredDuringAsyncPrefault++;
}
Or a page fault that has occurred against a memory map backed file (because, say, the underlying network drive has been disconnected):
TRY_MAPPED_MEMORY_OP {
    //
    // Copy the caller's address range structure over.
    //
    __movsq((PDWORD64)NewAddressRange,
            (PDWORD64)AddressRange,
            sizeof(*NewAddressRange) >> 3);
    //
    // If there's an existing address range set, update its ValidTo
    // timestamp.
    //
    if (TraceStore->AddressRange) {
        TraceStore->AddressRange->Timestamp.ValidTo.QuadPart = (
            Timestamp.QuadPart
        );
    }
    //
    // Update the trace store's address range pointer.
    //
    TraceStore->AddressRange = NewAddressRange;
} CATCH_STATUS_IN_PAGE_ERROR {
    //
    // We'll leak the address range we just allocated here, but a copy
    // failure is indicative of much bigger issues (drive full, network
    // map disappearing) than leaking ~32 bytes, so we don't attempt to
    // roll back the allocation.
    //
    return FALSE;
}
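For readers who haven't seen SEH before: the TRY_/CATCH_ macros above are presumably thin wrappers around MSVC's __try/__except keywords, roughly along these lines (my sketch, hypothetical names):

#include <windows.h>
#include <stdio.h>

static int touch(volatile int *p)
{
    __try {
        *p = 1;                                   /* may fault */
        return 1;
    }
    __except (GetExceptionCode() == EXCEPTION_ACCESS_VIOLATION
                  ? EXCEPTION_EXECUTE_HANDLER
                  : EXCEPTION_CONTINUE_SEARCH) {
        return 0;                                 /* lexically scoped recovery */
    }
}

int main(void)
{
    printf("touch(NULL) returned %d (0 = access violation caught)\n", touch(NULL));
    return 0;
}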
POSIX cares not one jot about Intel's CPU implementation; the assumption that memset is not async-signal-safe because of something specific to x86 is ludicrous. As of IEEE Std 1003.1-2008, 2016 Edition, memset() and lots of others besides are now async-signal-safe, see http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2.... That's in contrast to IEEE Std 1003.1, 2004 Edition, which has a much shorter list that excludes memset(), see http://pubs.opengroup.org/onlinepubs/007904975/functions/xsh...
Any time anybody tells you that POSIX isn't terrible, the subject of this article is the sort of thing you need to bear in mind.
(For a long time, I thought that literally the only thing that POSIX didn't fuck up is the way they don't have any analogue to MAXIMUM_WAIT_OBJECTS. But then, after I gained more experience with POSIX, I realised that even if I don't understand how, this too was probably also something they got wrong.)
Posix is terrible. But I'm not sure we have anything better.
MAXIMUM_WAIT_OBJECTS is also terrible of course, but even apart from that, Win32 is not great at all. The docs contain so little detail (or, worse, are sometimes even false) that when you start asking serious questions the real documentation is ReactOS, Wine, or even the NT/2000 source leaks, and then IDA. Compared to that, POSIX is actually documented, and implemented mostly correctly by tons of OSes. It's easy to ship an API with basically no spec (so nobody can tell you there's a bug when they find one - actually, even if there were a real spec for Win32, I don't think there's any public way to report bugs in it to MS!) and that doesn't have to be compatible with any other implementation...