
How Dirty COW works from the Linux kernel’s perspective - arunc
https://chao-tic.github.io/blog/2017/05/24/dirty-cow
======
kbart
I wonder how somebody discovers such bugs. Is there some strict methodology,
or are there tools? I find it non-trivial just to follow this step-by-step
guide, even though I've programmed the Linux kernel daily for a few years.

~~~
BugsBunnySan
It might have something to do with what the article's author writes as the
moral 'lying is bad'... And understanding _why_ it is bad if code lies about
stuff.

I can imagine someone looking at that code and the part where the kernel says
one thing (I don't need write permissions) when that is not actually true and
thinking that's a bad situation, that smells bad.

And the fact that the kernel does this is bad enough, but it's really just a
symptom of an underlying problem. That problem is that the kernel code _has_
to do this lying to prevent a loop, which seems like bad code design (or the
result of a hundred other factors that make it look, at first glance, like bad
design). And the workaround was done in a sloppy manner: dropping the write
flag, instead of being honest and doing the little bit of extra work, which is
what the fix for the exploit does with its extra flag.
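
To make the "lying" concrete, here is a minimal user-space model of the flag handling, not kernel code. The flag names FOLL_WRITE, FOLL_FORCE and FOLL_COW are the kernel's (FOLL_COW being the extra flag added by the fix); the numeric values, `struct pte_model` and the three function names are made up for illustration, and the real lookup logic has many more cases.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names mirror the kernel's; the numeric values here are arbitrary. */
enum { FOLL_WRITE = 1 << 0, FOLL_FORCE = 1 << 1, FOLL_COW = 1 << 2 };

/* Hypothetical stand-in for a page table entry. */
struct pte_model {
    bool present;   /* false once the entry has been zapped */
    bool writable;  /* false for a page in a read-only mapping */
    bool dirty;     /* true on the private copy produced by COW */
};

/* Old behaviour after a COW fault: forget that the request was a write. */
unsigned after_cow_old(unsigned flags)
{
    return flags & ~FOLL_WRITE;
}

/* Fixed behaviour: keep FOLL_WRITE and record that COW has happened. */
unsigned after_cow_fixed(unsigned flags)
{
    return flags | FOLL_COW;
}

/* May the page lookup hand back this page for the given request? */
bool can_follow(const struct pte_model *pte, unsigned flags)
{
    assert(pte);
    if (!pte->present)
        return false;   /* caller has to fault the page in and retry */
    if (!(flags & FOLL_WRITE))
        return true;    /* looks like a read: any present page will do */
    if (pte->writable)
        return true;
    /* Forced write into a read-only mapping: only a COWed copy qualifies. */
    return (flags & FOLL_FORCE) && (flags & FOLL_COW) && pte->dirty;
}
```

In this model, if the private copy disappears between the COW fault and the retry, the old flags accept the original file page because the request now looks like a read. With the fixed flags the same lookup is rejected and the write fault is taken again.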

And then maybe if you have experience with this stuff you might recognize that
within this code there's that race condition.

And from those two things you might go see if other things nearby in the code
can help you exploit it and it probably evolves from there into this very
complex exploit.

The thing is, I think, one person looks at that code and is like, yeah the
kernel lies a bit here, but whatever, sure it's grand. And 99.999% of the
time it is, and you're fine with that and walk on.

And another person, who has Hacker Nature, might think like, a) this is bad as
the tip of an iceberg is bad for a ship and more importantly b) I wonder what
happens in the 0.001% of the time where this isn't fine...

Or, of course, maybe someone was standing on the edge of their toilet hanging
a clock, the porcelain was wet, they slipped, hit their head on the sink, and
when they came to they had a vision of this exploit. ;)

~~~
mannykannot
Perhaps the talk[1] given by Bryan Cantrill in 2015, and mentioned in footnote
1 of the article, alerted someone to the possibility that there might be
something exploitable here? In addition, any of the other online discussions
of madvise(,,MADV_DONTNEED) mentioned in that talk may have attracted
someone's interest.
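
For readers who have not met the madvise behaviour being discussed, the following is a small sketch of it on Linux, assuming a writable /tmp. The scratch file name and the function name `private_write_discarded` are invented for the example. It writes to a private file mapping (which gives the process its own copy of the page), calls madvise with MADV_DONTNEED, and checks whether the private change is gone on the next read.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 1 if the private change vanished after MADV_DONTNEED,
 * 0 if it survived, -1 on a setup error.
 */
int private_write_discarded(void)
{
    char path[] = "/tmp/dontneed-demo-XXXXXX";  /* hypothetical scratch file */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);  /* the file lives on until fd and the mapping are gone */

    long page = sysconf(_SC_PAGESIZE);
    if (ftruncate(fd, page) != 0 || pwrite(fd, "original", 8, 0) != 8) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    memcpy(p, "modified", 8);  /* write to a private mapping: page is COWed */
    int changed = memcmp(p, "modified", 8) == 0;

    if (madvise(p, page, MADV_DONTNEED) != 0) {
        munmap(p, page);
        return -1;
    }

    /* The next access faults the page back in from the file. */
    int reverted = memcmp(p, "original", 8) == 0;
    munmap(p, page);
    return (changed && reverted) ? 1 : 0;
}
```

On Linux this is expected to return 1: the advice is destructive for private pages, so the COWed copy is thrown away instead of merely being marked as reclaimable.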

This exploit seems to lie at the intersection of several things that I guess
might indicate a higher-than-average risk of exploitability: shared memory,
special provisions for debugging access to running processes, complicated and
unintuitive (if not broken) semantics, and concurrency.

[1] https://www.youtube.com/watch?v=bg6-LVCHmGM&feature=youtu.be&t=59m8s

~~~
taneq
> This exploit seems to lie at the intersection of several things that I guess
> might indicate a higher-than-average risk of exploitability: shared memory,
> special provisions for debugging access to running processes, complicated
> and unintuitive (if not broken) semantics, and concurrency.

I think this is a great general rule. Any one of these things is a reason to
tread lightly. All of them at once? Here be dragons.

------
caf
It's not made completely explicit here, but one reason you want debuggers to
be able to write even to read-only mappings in the debuggee is to be able to
insert software breakpoints, which entails writing to the mapping of the
executable file.
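
A rough sketch of that behaviour, using /proc/self/mem in place of a real debugger: it maps a file read-only and privately, forces a write through /proc/self/mem, and checks that the change is visible in the mapping while the file keeps its contents. This assumes a 64-bit Linux system with /proc mounted and a kernel configuration that still permits forced writes through this interface; the scratch file name and `forced_write_is_private` are made up for the example.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 1 if the forced write landed in a private copy and the file kept
 * its contents, 0 if the write was refused or /proc/self/mem is unavailable,
 * 2 if the file itself changed, -1 on a setup error.
 */
int forced_write_is_private(void)
{
    char path[] = "/tmp/force-demo-XXXXXX";  /* hypothetical scratch file */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    long page = sysconf(_SC_PAGESIZE);
    if (ftruncate(fd, page) != 0 || pwrite(fd, "original", 8, 0) != 8) {
        close(fd);
        return -1;
    }

    /* Read-only, private mapping: a plain store through p would fault. */
    char *p = mmap(NULL, page, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    int result = 0;
    int mem = open("/proc/self/mem", O_RDWR);
    if (mem >= 0) {
        /* The file offset in /proc/self/mem is the virtual address. */
        if (lseek(mem, (off_t)(uintptr_t)p, SEEK_SET) != (off_t)-1 &&
            write(mem, "patched!", 8) == 8 &&
            memcmp(p, "patched!", 8) == 0) {
            char on_disk[8] = {0};
            if (pread(fd, on_disk, 8, 0) != 8)
                result = -1;
            else
                result = memcmp(on_disk, "original", 8) == 0 ? 1 : 2;
        }
        close(mem);
    }

    munmap(p, page);
    close(fd);
    return result;
}
```

A return value of 1 is the intended outcome: the write succeeds, but only against a COWed copy. Dirty COW was a way of getting the equivalent of outcome 2 on a file the user could not write to.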

~~~
AstralStorm
That is supposed to be limited to processes with ptrace rights, which are
highly privileged.

Generally what happened here is that a piece of state got lost in the layers
of abstraction. This combined with a race condition is enough.

~~~
caf
In common setups ptrace rights aren't highly privileged; the normal situation
is that users can ptrace their own processes.

To be clear, I'm just adding a bit of context around why a page in a read-only
mapping is COWed when these debugging interfaces are used to write to it,
rather than failing with an exception. The bug existed in the handling of this
edge-case.

------
_pmf_
Programming language support for COW (at the actual page level, not via
internal copy-on-write) would be a nice idea; maybe something for Jai, since
it seems to be the language that is most open towards integrating memory model
abstractions.

