Hacker News
Compiler bug? Linker bug? Windows Kernel bug (randomascii.wordpress.com)
246 points by janvdberg 10 months ago | 92 comments

I used to be surprised when people came across these bugs. I would think: a kernel bug! They must have truly hit an edge case for something like that to slip into production. But then I read Showstopper: The Race to Build Windows NT, specifically the part where that OS shipped with 65,000 known defects (and it was considered one of the most stable software releases of all time), and it became clear to me that these bugs are probably known by the team when the OS ships but just not considered high priority enough to fix. Same with the recent articles about iOS and engineers not getting time to fix P2 bugs. Long story short, my entire perspective on kernel bugs (and partly software development) changed and I began to think of them more as known flaws that weren’t fixed because of schedules rather than unknown bugs that were uncovered in the wild.

> that OS shipped with 65,000 known defects

Issue tracking at Microsoft (and presumably all large tech companies) includes things like:

* localization errors

* typos in user-visible text

* suggestions for rewording user-visible text

* issues related to icons and other graphics

* issues related to sounds that ship with the product

* feature requests

* feature enhancements

* tracking tickets for work to be done or work in progress

* tickets for unverified, incorrect, and unreproducible bugs

So a. 65,000 tickets, not 65,000 defects. b. "65,000 tickets" is meaningless without knowing more about them.

I found my own one: Windows "structured exception handling" relies on code in the kernel that unwinds call stacks. This makes a whole bunch of assumptions about the layout of the code as well. On CE/ARM, we found that Visual Studio was sometimes generating code that used "bl [r3]" constructs which could not be unwound. The workaround was to use a different version of cl.exe that came with the CE dev kit and not the one that came with Visual Studio.

Oh, and if you have more than about 512 bytes of local variables in a stack frame and try to use CE remote debugging, that stops working as well.

It's the same on x64, although I think it's better documented/understood on that platform.

Yeah that's one of the things Cutler played an important role in -- formalizing x64 prologue and epilogue constructs. The epilogue stuff is interesting, there's a small set of instructions that are allowed because the unwind code literally does opcode analysis to figure out if the current function is in its epilogue or not.

Heh - that was partly me. The kernel team (i.e. mostly DaveC) mostly prescribed which instructions were allowed in prologues/epilogues. I was dev lead of VC's CRT team at the time (2000-ish), and got assigned to work on that and come up with an x64 version of the prologue/epilogue descriptors needed to unwind. I looked at what was done for Itanium, and simplified that waaaay down to what was needed for x64. I remember designing the ability to split prologues into multiple parts, to do something called shrink-wrapping, where you might not want to push all your saved registers all the time - there might be a subset of a function's code where a register needs to be pushed/popped. DaveC needed convincing that was a good idea.

This is one aspect of OS development in which I think open source wins: users are not at the mercy of the OS developers for bug tracking and/or fixes.

Are we not? If I find a bug, I sure hope they fix it soon. I do not have the time and energy to work myself into the kernel to fix the bug (unless it is a small bug that does not require a lot of knowledge about the kernel itself).

Sure, as an individual, your direct powers are limited, but they're a lot less limited than if your kernel is closed. You can offer a cash bounty to anyone who wants to fix the bug (not just those employed by the kernel-vendor) and you can deploy their fix to your own systems independently of the kernel vendor releasing an update (or, accepting the fix at all) because you can build the kernel from source yourself (or, again, you can pay someone to do this for you). If you can't afford a big enough bounty to get anyone to bite, you can team up with others affected by the bug to pool your resources, and you can even use the kernel bug tracker to coordinate this effort.

At the very least you can write a bug report. I tried to write one for Microsoft Office and couldn't even find where to submit it.

First off, true, reporting bugs to Microsoft is hell.

But on the other hand reporting feature suggestions to Microsoft is much more pleasant than anything open source has e.g. https://office.uservoice.com/forums/285186-general

In Open Source world, because by the nature of it you're non-paying, asking for features either gets back "submit your pull request" or "go away." Which makes sense, those who contribute the most time get the most say, and that would be fine if those that contribute the most time had the same goals and taste as those that didn't (see the failure of Linux Desktop as an example).

Looks like people really want ClipArts back.

It was nice having a license free library of art assets even if they were overused. There are websites that offer that these days but as some of the commentators pointed out, that requires legal to sign off on in some businesses or those sites could be hosting stolen images.

> It was nice having a license free library of art assets even if they were overused.

I guess by now Emoji fill that role to some extent. Including the overused part ;-)

I think they consider that a feature, with their 1 billion (not-really technical) users :)

How many times has this actually happened?

Here's a bunch of >$1000 bounties posted: https://www.bountysource.com/bounties/search?direction=desc&...

None for the Linux kernel, though, but it does have >20,000 forks on GitHub: https://github.com/torvalds/linux/network/members

Good points. Thanks!

Yeah, but if you're a professional it's just part of your job to patch the kernel when you have to. In any case, the kernel isn't as scary as most people think. Nor is almost any other "arcane" software in your stack.

I think in reality it's more nuanced than that: the number of people who use computers who can fix a kernel bug is relatively small, and the process of getting a bug accepted, patched, and shipped is not trivial. I would not want to bet on a bug which Linus disagrees with getting patched faster than, say, a report on some company's enterprise support agreement.

(disclaimer: I've been running Linux and other open-source OSes for over two decades. I like OSS but it's not perfect and staffing for projects is an under-solved problem.)

There are many small bugs in the kernel that are not hard to fix but are not priorities for many kernel developers, and the maintainers are usually happy to take your fixes if you create them. Proof: I've submitted a number of small fixes over the years that have been accepted: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

Are data corruption defects treated the same way as other things? i.e., would kernels get shipped with known defects that cause data corruption under load? I could see kernels being shipped with known cases of hanging, crashes, out of memory, etc., but the same for data corruption under load would be a little hard to swallow.

Bugs are prioritized. Data corruption would be higher than some other bugs, but it also depends on frequency

Yes, but do you know of any large software project that only ships when they hit zero bugs? Actually, this would probably extend to all products. Creating a product with no known defects is not possible AFAICT.

If 250kLOC is “large”, then the Space Shuttle’s onboard software is one (the only?) example.


This is the case with all major software releases. The bug tracker is filled with bugs for any major release. It's just a matter of team lead, project manager and higher ups deciding what the priorities are. Not every feature makes it and not every bug gets fixed.

The key is to hit the milestones and get the product working as much as possible.

Enterprise Software Motto: "Sometimes good enough is good enough". Amen.

Think about your code. Now imagine a codebase 1000 times as large written by thousands of engineers with deadlines. There are bugs or potential vulnerabilities in every codebase, but you’ll never encounter the vast majority of them.

The "Avoid Windows kernel bug using Python hack" commit (for people with JS disabled):


And this looks like the lld-link change: http://llvm.org/viewvc/llvm-project?view=revision&revision=3...

The llvm commit on GitHub (which shows whole diff at once):


Thanks for the link. I meant to track that down and I appreciate your doing that for me.

Huh. There's a memory-mapped I/O bug I've known about for years, I wonder if this is related to it? It's something along the following lines, but I forget the exact details, so some bits (no pun intended) might be wrong: if you create a memory mapping and then close the underlying file handle, you can even delete the file at that point, but the memory mapping will still hang around and pretend to work, and you won't know it's not working. I'm not sure exactly what happens under the hood but I recall it doesn't behave sanely.

IIRC this isn't a bug but a compatibility hack. The pages really are unmapped in your process, but if you try to access them the kernel will remap them for you and lie when you try to remap that file (as it already had the handle open and assigned to your process). This was done because a few well-known pieces of software did this back before protected mode. So MS preserves the hack to keep them working. I'm pretty sure this goes away if you list the latest OS in your manifest and have DEP and NX turned on for your app.

This doesn't seem to be the case. On the latest Windows 10 I tried out this test program

  	HANDLE mapping = CreateFileMapping(file, NULL, PAGE_EXECUTE_READWRITE, 0, 1, NULL);
  	void *ptr = MapViewOfFile(mapping, FILE_MAP_ALL_ACCESS, 0, 0, 0);
  	*static_cast<unsigned char *>(ptr) += 1;
with /NXCOMPAT and even /DYNAMICBASE and with the embedded manifest

  <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
  <assembly manifestVersion="1.0" xmlns="urn:schemas-microsoft-com:asm.v1" xmlns:asmv3="urn:schemas-microsoft-com:asm.v3">
  	<assemblyIdentity type="win32" name="Microsoft.Windows.Foo" version=""></assemblyIdentity>
  	<trustInfo xmlns="urn:schemas-microsoft-com:asm.v3">
  				<requestedExecutionLevel level="asInvoker" uiAccess="false"></requestedExecutionLevel>
  	<compatibility xmlns="urn:schemas-microsoft-com:compatibility.v1">
  			<supportedOS Id="{8e0f7a12-bfb3-4fe8-b9a5-48fd50a15a9a}"></supportedOS>
but it still seems to modify the file even after the CloseHandle call.
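For anyone who wants to poke at the same idea without a Windows toolchain, here is a rough portable analogue in Python. It is only a sketch of the observable behaviour, not the test program above: `bump_first_byte_after_close` and the temp file are made up for illustration, and CPython's `mmap` object keeps its own reference to the file, so this does not isolate what the kernel does with the section object.

```python
import mmap
import os
import tempfile


def bump_first_byte_after_close(path):
    """Map the file, close the file object, then modify it through the mapping."""
    f = open(path, "r+b")
    view = mmap.mmap(f.fileno(), 0)   # length 0 maps the whole file
    f.close()                          # the file object is gone; the mapping remains
    view[0] = (view[0] + 1) % 256      # write through the mapping after the close
    view.flush()
    view.close()
    with open(path, "rb") as check:
        return check.read(1)


# Hypothetical one-byte scratch file, used only for this demonstration.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"A")
os.close(fd)
first_byte = bump_first_byte_after_close(demo_path)
os.remove(demo_path)
```

If the write through the mapping lands in the file, `first_byte` comes back one higher than what was originally written.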

When I get time I might actually debug this to see what's going on. I'm quite curious what happens when you access the pointer again: whether the kernel really is remapping the pages, as the anecdotes from Raymond suggest it might be, or whether it just hasn't reclaimed them yet. VMMap and Process Explorer should give the story.

That's undefined behaviour then, right? Even if it's not the sane behaviour.

I imagine that might be reason enough for them to treat it as low priority.

I'm not sure... UB in general seems to be a language-level construct. OSes (and below) don't really have much UB to speak of, and I assume this is because they are expected to provide security and stability regardless of misbehaving user programs. And also, UB is usually said in response to a program expecting a certain behavior, but in this case I'm not worried about the program itself, but about the system and where exactly the black hole that the physical memory is mapped to is (given that it is no longer a file). In this case, I wouldn't be surprised if there are serious stability or even security issues lying behind this... I just haven't bothered to investigate.

It would rather be classified as undocumented behaviour. Undocumented behaviour is frequently maintained in OSes because programs become reliant on it, to the point that sometimes bugs need to be emulated to avoid breakage.

It's been a long time since I've looked at Windows documentation, but I wonder if it's reasonable to treat their documentation as the actual contract offered by the OS?

I had the impression that poor documentation forced a lot of Windows programmers to understand Microsoft's APIs via experimentation.

> UB in general seems to be a language-level construct

Not always. Various standard library functions can cause UB, and not just things like C's div function. In C++, calling front() on an empty std::vector<T> causes UB, for instance. http://www.cplusplus.com/reference/vector/vector/front/

I agree that from a system perspective, it doesn't look like good behaviour.

> Not always. Various standard library functions can cause UB, and not just things like C's div function. In C++, calling front() on an empty std::vector<T> causes UB, for instance.

std::vector et al. are parts of the C++ language. "Language" != pure syntax.

I think I see your point - that despite the way C++ libraries can and do expose UB if not used as intended, it doesn't mean that Windows does the same.

For what it's worth, I read that POSIX does have UB in the spec (as in, distinct from how the compiler is allowed to break things in terms of language-level UB), at least in some ghastly signal handling situations. I wasn't able to find at a glance whether Windows does too, or if it categorically doesn't.

> According to POSIX, the behavior of a process is undefined after it ignores a SIGFPE, SIGILL, or SIGSEGV signal that was not generated by kill(2) or raise(3).

A call to unlink removes the name from the filesystem, but keeps the file around as long as it's still opened by a process; and mmap keeps the file opened, even if you close the original file descriptor.

> A call to unlink removes the name from the filesystem, but keeps the file around as long as it's still opened by a process; and mmap keeps the file opened, even if you close the original file descriptor.

Are you familiar with Windows or just assuming all OSes work like *nix systems? That's definitely not at all close to how Windows file systems behave.

Not even all UNIXes.

It is implementation defined on the UNIX OS variant and mounted file system.

The beauty of POSIX is how it appears to be portable, while leaving quite a few things implementation-defined.

Not sure which UNIX variant you are thinking of, but from POSIX:

> The mmap() function adds an extra reference to the file associated with the file descriptor fildes which is not removed by a subsequent close() on that file descriptor. This reference is removed when there are no more mappings to the file.

I'm sure there have been Unix variants that got it wrong, and it might not even have been in POSIX from the beginning (it has been there since at least SUSv2); still, I think this is traditional Unix behavior.
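A minimal sketch of that POSIX behaviour, in Python and for POSIX systems only (on Windows the unlink would be expected to fail). The helper name is made up, and note that CPython's `mmap` object may also hold its own duplicate descriptor, so this shows what a program observes rather than isolating the mapping's reference.

```python
import mmap
import os
import tempfile


def read_mapping_after_unlink(payload):
    """POSIX only: map a file, close its descriptor, unlink its name, then read the mapping."""
    fd, path = tempfile.mkstemp()
    os.write(fd, payload)
    view = mmap.mmap(fd, len(payload), access=mmap.ACCESS_READ)
    os.close(fd)          # per POSIX, close() does not drop the mapping's reference
    os.unlink(path)       # removes the name; the file lives on while it is mapped
    name_exists = os.path.exists(path)
    data = view[:]        # still readable through the mapping
    view.close()          # once the last reference goes away the file can be reclaimed
    return name_exists, data


name_exists, data = read_mapping_after_unlink(b"still here")
```

The name is gone from the directory, yet the bytes are still readable through the mapping until it is closed.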

Today I learned!

Your comment contains no details whatsoever, then you described perfectly sane mmap behavior, so what do I know?

Windows doesn't have the concept of an inode, so there is nothing for a file handle (what Windows calls a file descriptor) to bind to except a file name. This means that a filename cannot be removed while it is being accessed somehow.

This is the cause of a lot of woes on Windows, such as random access denied errors when attempting to delete stuff, and it is what forces a reboot when you update nearly anything on Windows.

One of my favourite things related to this is that the Windows XP explorer had a bug where clicking on a file would generate a dynamic preview of an audio/video file in the sidebar, which would lock the file, rendering it impossible to delete because you had to click on it first....

This reminds me how updating libraries in Linux can be kind of dangerous depending on how you perform it. For example, if you unlink the library and then create a new library with the same name, the system probably behaves how you want it to: existing processes use the original library until they are restarted. However, if you open the existing library with O_TRUNC and overwrite it, existing processes will use the new code, which is super dangerous. By default cp uses the O_TRUNC behaviour and tar uses the unlink behaviour. It's interesting to think that stuff might be accidentally working for some people because of this choice of behaviour in tar.

I haven't actually tested this with a real library. This is what I've assumed happens from playing around with mmap(PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE).
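Here is a small Python sketch of the difference between the two update styles, assuming a POSIX system such as Linux. It is not a test with a real shared library: `libdemo.so` is a made-up name for a plain data file, and Python's `ACCESS_READ` creates a shared read-only mapping rather than the MAP_PRIVATE mapping a loader would use, so treat it as an illustration of the inode behaviour only.

```python
import mmap
import os
import tempfile


def map_readonly(path):
    """Return a read-only mapping of the whole file."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def write_file(path, data):
    with open(path, "wb") as f:   # "wb" opens with O_TRUNC, the cp-style overwrite
        f.write(data)


workdir = tempfile.mkdtemp()
lib = os.path.join(workdir, "libdemo.so")   # hypothetical library name

# Case 1: overwrite in place. Same inode, so the existing mapping sees the new bytes.
write_file(lib, b"version-1")
old_view = map_readonly(lib)
write_file(lib, b"version-2")
seen_after_overwrite = old_view[:]
old_view.close()

# Case 2: unlink, then create a new file under the same name. New inode, so a
# mapping made before the unlink keeps the bytes it was created with.
old_view = map_readonly(lib)                # maps the current "version-2" file
os.unlink(lib)
write_file(lib, b"version-3")
seen_after_unlink = old_view[:]
old_view.close()

os.remove(lib)
os.rmdir(workdir)
```

In the first case the already-mapped process ends up looking at the new bytes; in the second it keeps the old ones until it remaps the file.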

Both of those are the wrong way to do it. :)

O_TRUNC for the reasons you stated, and unlink because it opens a race between the unlink and the copy: a new process that tries to use the library in that window will either fail because the file doesn't exist or get a copy of the library that isn't completely written yet.

The correct way to do it is to write the new file to the same filesystem at a different path and then, once it's completely written, move it to the intended path. Move (i.e. rename(2)) is atomic on sane filesystems so anything that opens the file will get one complete copy or the other.
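The write-then-rename approach can be sketched in Python roughly as follows. This is a minimal illustration, assuming a POSIX filesystem where rename(2) is atomic; `atomic_replace` and `libdemo.so` are made-up names, and a real installer would also need to set the right permissions on the new file.

```python
import os
import tempfile


def atomic_replace(path, data):
    """Write data to a temp file in the same directory, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)   # same filesystem as path
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())    # have the bytes written out before the rename
        os.replace(tmp_path, path)    # rename(2) on POSIX: openers get old or new, not a mix
    except BaseException:
        os.unlink(tmp_path)           # clean up the temp file if anything went wrong
        raise


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "libdemo.so")   # hypothetical library name
atomic_replace(target, b"version-1")
old_reader = open(target, "rb")                # stands in for a process that has the file open
atomic_replace(target, b"version-2")
old_contents = old_reader.read()               # still the file that was opened earlier
old_reader.close()
with open(target, "rb") as f:
    new_contents = f.read()
leftovers = os.listdir(workdir)
```

Anything that opened the file before the rename keeps reading the old copy, anything that opens it afterwards gets the complete new one, and no temp file is left behind.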

Are PE executable files special in Windows, or could this bug have been hit with normal IO?

I'm not sure in what manner they would need to be special for this, but I would expect the significant part was the fact that it was memory-mapped (which isn't normal I/O), not that they were PE files.

Edit (in response to comments): Actually, I don't think it has anything to do with PE files being optimized in some special fashion, because they say loading via LoadLibrary(Ex) can cause this issue too, and those are very much user-mode constructs, and hence can't mess with the caching/paging behavior (which are lower, in the kernel). AFAIK the only kernel-mode component to them is to just create a file mapping (NtCreateSection) and after that all they do is a bunch of user-mode juggling (fixing up relocations, loading other libraries that are depended on, calling entrypoints, etc.).

Since it's only observed by execution, it's entirely possible that the PE loader has been optimized in a low-level fashion leading to bugs that are not exposed in general I/O facilities.

My reading is that PE images are memory mapped into RAM instead of being read from disk. This is why they require write locks on the files during execution: the program image is backed from the corresponding file instead of the normal page file.

So my speculation is that this bug is a race condition between the cleanup of a dirty memory mapped section of a file and the reading of the same file contents in a separate mapping.

Indeed, but being loaded as code seems important as well. It sounds like the two used together, in certain extreme conditions, cause the issue.

Both parts are special. As the post says, there are three conditions: 1) Write the file with memory-mapped I/O 2) Read the file as a PE file (CreateProcess, LoadLibrary, etc., I'm not positive of the exact requirements but it's something special that I believe only PE files can hit) 3) Very heavy load

> memory-mapped (which isn't normal I/O)

Memory mapped I/O is very much normal I/O. Normal, as in "common" or "usual".

>> memory-mapped (which isn't normal I/O)

> Memory mapped I/O is very much normal I/O. Normal, as in "common" or "usual".

You know well what I meant.

What you said left an impression memory mapped I/O is somehow unusual or special, while it's actually a fairly common type of I/O.

So while I did know what you meant, I found your comment misleading to someone who isn't familiar with the subject.

Too many programmers misunderstand and are scared of memory mapping as it is.

Still a linker bug to me. Where did it become a kernel bug? Looks like wrong usage of memory-mapped I/O, not like a bug in memory-mapped I/O itself.

The linker did nothing wrong. It wrote to a file using memory mapped I/O. Shortly afterwards the build system tried to run the just-linked file. This is supposed to work. It usually does. Due to this bug it occasionally doesn't.

The linkers are getting a compatibility hack (FlushFileBuffers call) but this is not supposed to be necessary, by the file-system contract.

I just wish they'd fix this bug with incorrectly restored window states. E.g., when the monitor layout changes, maximized windows become windowed but still think they are maximized. It requires clicking the maximize button twice to fix it.

Been there since Windows 98, at least.

really love reading bug hunting stories

So basically if you write and read immediately, Windows will give you bad data in some cases. Not a good sign for the underlying design.

No, basically it's not like that.

From TFA:

    if a program writes a PE file (EXE or DLL) using memory mapped file I/O and 
    if that program is then immediately executed (or loaded with LoadLibrary or LoadLibraryEx), and 
    if the system is under very heavy disk I/O load
    then a necessary file-buffer flush may fail

And also this only happens on multi-socket (not just multi-core) systems. So the underlying problem is likely related to cache coherency issues across NUMA nodes.

I'm assuming that the linker exits before the program it linked is executed. I'd hope so! If so then this is clearly an OS bug.

How does your comment contradict the parent one?

The parent uses weasel words https://en.wikipedia.org/wiki/Weasel_word that, while technically correct, imply something false. The comment is refuting the implication.

Here the implication is that there is a fundamental design flaw in the OS that may cause widespread issues, when really this is a bug in quite a narrow edge case: it may happen when building colossal projects repeatedly on very high spec computers. The article suggests that the team at Microsoft investigating the bug couldn't reproduce it since they couldn't build Chrome as quickly as the author could.

Memory-mapped I/O is a lot more precise a description of "writing" and more unusual than simply writing out, and execution similarly is not a normal read path for a just-written file.

The parent made it sound like you could just write a loop with for(;;) { write_file("x"); read_file("x"); } and expect discrepancies. That's not the case and it's wilfully misleading to suggest it is.
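To make that loop concrete, here is a minimal Python version. `write_file` and `read_file` are the hypothetical helpers from the one-liner above, implemented with ordinary buffered file I/O (no memory mapping, no execution of the file), and the iteration count is arbitrary.

```python
import os
import tempfile


def write_file(path, data):
    with open(path, "wb") as f:
        f.write(data)


def read_file(path):
    with open(path, "rb") as f:
        return f.read()


def count_mismatches(iterations):
    """Write a file with ordinary I/O, read it straight back, and count any differences."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    mismatches = 0
    try:
        for i in range(iterations):
            payload = f"x{i}".encode()
            write_file(path, payload)
            if read_file(path) != payload:
                mismatches += 1
    finally:
        os.remove(path)
    return mismatches


mismatches = count_mismatches(1000)
```

A loop like this exercises none of the three conditions from the article, which is why it is not a reproduction of the bug.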

The parent said "in some cases". I guess it could be misinterpreted, but I took that to mean "in some very specific access patterns" (rather than, say, "in any access pattern some random small proportion of the time").

Um, also, in some cases, I might win the lottery.

Writing a binary executable file is not a common situation, only compilers do that. If you add the unusual heavy IO, this is a fringe case.

> Writing a binary executable file is not a common situation, only compilers do that.

Not only compilers. Decompressors (.zip, etc.), installers and auto-updaters. There are more use cases for writing out executables than just compilers.

Although these cases probably write sequentially and don't need to memory map the file and update sections in a random access fashion.

Look, there are a handful of details in the description that, for the right person, clearly and right away indicate that it's a very, very specific, even fringe, case. Saying "basically it means..." as if it were the general case is terribly misguided, because it's not. There are already a few comments explaining just that. Insisting on a misunderstanding is not very useful.

[Edit: whoops, never mind, thanks. Might be worth adding the memory-mapped part to your comment to clarify for others too.]

You're right that it is an example of a case that matches the "basic" description of the conditions to hit the bug, but it probably won't satisfy any of the three conditions when they are considered in detail:

* The files being written probably won't be using memory mapped IO (I'd expect them to use the equivalent of fwrite instead) as the installer won't need to make complex changes to the binaries in the same way that a compiler/linker would

* There will likely be a significant delay between writing the file and running it compared with a build process that uses its own build output as part of the build (even in the build process case, the bug went away for a year, possibly due to an additional delay between the write and execute steps)

* Once the installer has run, the system will probably not be under heavy load (the author was running a highly parallel build on a 24-core machine, and this bug was still only hit in 3% of builds)

Really the earlier post was just misleading, as it implies the bug is much easier to hit than it really is.

There are other use cases where this might occur, though – JITs that store JITed binaries on disk.

That’s rare (usually there you’d read it back in and execute in your process space), but it’d be at least a second realistic scenario.

Not the same. Notice the memory mapped file bit.

I agree. But the comment I was referring to said "in some cases"; it never claimed it was a common case.

I don't know if you're trolling.

> if a program writes a PE file (EXE or DLL) using memory mapped file I/O and


> if that program is then immediately executed (or loaded

Read immediately

> then a necessary file-buffer flush may fail

You get bad data in memory.

You've abstracted very specific conditions to the point of absurdity. I don't see any reason for that except for pulling a strawman and taking a stab at Windows.

memmove in glibc produces wrong results in some cases (that was a fun over-Christmas debugging session). There are bugs in all software.

You might as well go all the way and simplify it down to "so basically if you run Windows you might get corrupt data" to make your point clearer.

You might as well go further and simplify it down to "if a computer is involved, it will corrupt your data".

HN Guidelines: "Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith."

Answer is: "none of the above".

Coherency is covered by the WinAPI documentation (check the remarks section). Always RTFM.


Impressive. It would be difficult for you to be more wrong, and reading the article and noting that

> [ex-coworkers at microsoft] confirmed that my fix should mitigate the bug (I’d already noted that it had allowed ~600 clean builds in a row), and promised to create a proper fix in Windows.

would have saved you the bother of nerd-sniping.

The coherency notes in the article you link are irrelevant because they're about concurrent access. In TFA, the file is created (through mapping), closed, then executed. Sequentially. Also no ReadFile or WriteFile involved (in fact though I don't know how Windows implements it one would expect PE loading to use memory mappings, and thus be covered under coherence guarantees)

Sure, you can call your "ex-coworkers at Microsoft" (most of whom are irrelevant to debugging/verifying this issue) - or, you can just spend some time reading the documentation, as previously suggested.

The coherency issues are your "tread lightly" warning. The docs regarding FlushViewOfFile expand on this specific issue:

> Flushing a range of a mapped view initiates writing of dirty pages within that range to the disk. Dirty pages are those whose contents have changed since the file view was mapped. The FlushViewOfFile function does not flush the file metadata, and it does not wait to return until the changes are flushed from the underlying hardware disk cache and physically written to disk. To flush all the dirty pages plus the metadata for the file and ensure that they are physically written to disk, call FlushViewOfFile and then call the FlushFileBuffers function.

So, after carefully reading this documentation, we can clearly conclude OP's pull-request is actually missing a call to FlushViewOfFile prior to calling FlushFileBuffers.

Closing the file is supposed to ensure coherency. It generally does. That it occasionally does not is a bug, that Microsoft will fix.

This is incorrect. FlushViewOfFile and FlushFileBuffers only guarantee that the contents are written to the physical media. It is not necessary (well, it is not supposed to be necessary) to call either of these functions for other processes to be able to see the changes.

The whole point of a cache is that if A writes something, and then B reads it after A finishes writing it, B should see the results of A's modification. Whether it has been flushed to the physical media is irrelevant, because if it hasn't the cache manager can still serve the contents directly from the cache. And a call to Flush is not necessary for that.

So, the OP's change is not missing anything. As the article explains, it is in fact the Windows kernel that is behaving incorrectly (which was also confirmed by people who work on the windows kernel).
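For reference, the flush sequence being argued about looks roughly like this in Python. It is only a sketch of the general pattern, not the actual linker or build-system change: `write_via_mapping` and the payload are made up, and the mapping of Python calls to Win32 calls in the comments is my understanding of CPython's behaviour rather than something stated in the article. As the comments above point out, none of this should be needed just for a later reader to see the data; it is a durability step that happens to work around the bug.

```python
import mmap
import os
import tempfile


def write_via_mapping(path, data):
    """Write data through a memory mapping, then flush the view and the file."""
    with open(path, "w+b") as f:
        f.truncate(len(data))              # size the file first; the mapping covers exactly this range
        view = mmap.mmap(f.fileno(), len(data))
        view[:] = data                     # the write goes through the mapping
        view.flush()                       # FlushViewOfFile on Windows, msync on POSIX
        view.close()
        os.fsync(f.fileno())               # _commit (FlushFileBuffers) on Windows, fsync on POSIX
    # The file is closed here, before anything tries to load or execute it.


fd, out_path = tempfile.mkstemp()
os.close(fd)
write_via_mapping(out_path, b"MZ-not-a-real-PE-image")   # placeholder bytes, not a valid executable
with open(out_path, "rb") as f:
    read_back = f.read()
os.remove(out_path)
```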
