
A Skeleton Key of Unknown Strength (CVE-2015-7547) - cookiecaper
http://dankaminsky.com/2016/02/20/skeleton/
======
tptacek
Reading this, you'd think getaddrinfo() was the first glibc vulnerability ever
discovered. "Look at how many things are affected! Even sudo! How will we ever
detect them all?" Let's hope the author has some benzos on hand before they
learn about kernel vulnerabilities.

Look: not only is this flaw not unprecedented, like, at all, but just last
year we had a glibc vulnerability in gethostbyname() --- the most common libc
DNS resolution function. You may remember the flaw by its brand name, "GHOST".

GHOST did not ruin the Internet or justify what I can best describe as 4000
words of concerned back-patting. And it was exploitable on day one!

Just patch the stupid thing, like everything else, and get on with your life.

I do object strongly to the bit at the end, about how ASLR, NX, CFG, &c are
effective only at showing us who the best exploit developers are. Horse. Shit.
There are today memory corruption bugs that are _not exploitable_ because of
runtime constraints. Moreover, when Linux and OS X bugs turn out to be widely
and reliably exploitable, the reason tends to be that they hit something that
isn't fully covered by runtime protections.

If you want to argue that critical, exposed network services shouldn't be
written in C in 2016, you'll get no argument from me, or, for that matter, 98%
of Hacker News. We should definitely get moving on porting stuff like DNS
servers to things like Go and Rust.

~~~
pcwalton
> If you want to argue that critical, exposed network services shouldn't be
> written in C in 2016, you'll get no argument from me, or, for that matter,
> 98% of Hacker News.

Not 98% of Hacker News. There are an awful lot of people who think modern C++
is memory safe.

~~~
nly
C++ isn't 'memory safe', but what it does is give you the ability to write
memory and resource safe types, and then lean on the type system as a means of
ensuring memory safety.

C doesn't let you do that. It's just incapable, and it's a handicap that leads
to bad APIs and bad client code. Take gethostbyname() as a historical example.
It returns a pointer to static data, which avoids memory leaks at the cost of
being thread-unsafe. Want a fix? Use gethostbyname_r()... ok, great, except
now you're dealing with a function with _six_ parameters and even more ways to
shoot yourself in the foot. Ok well, they're both deprecated anyway... use
getaddrinfo()... just don't forget to call freeaddrinfo(), because this
function allocates a linked list, and don't forget to check for null pointers
as you enumerate it. C++ brings value semantics, iterators, and RAII to the
table, and these problems largely go away.

All this is why C++ is better than C. In C++ you can have memory safety,
thread safety, type safety, _and_ a simple API. It's not perfect, and it's not
easy (designing APIs never is), but at least it's a possibility.

~~~
pcwalton
> C++ isn't 'memory safe', but what it does is give you the ability to write
> memory and resource safe types, and then lean on the type system as a means
> of ensuring memory safety.

No, it doesn't, because it isn't memory safe.

> All this is why C++ is better than C. In C++ you can have memory safety,
> thread safety, type safety, and a simple API.

No, you can't. C++ is not memory safe. Use after free happens in C++ all the
time, because smart pointers do not protect you from dangling pointers and
null pointer dereference, among other things.

Instead of copy and pasting examples to show this, I'll just link to other
posts of mine from the past two months:
[https://news.ycombinator.com/item?id=11111987](https://news.ycombinator.com/item?id=11111987),
[https://news.ycombinator.com/item?id=11054630](https://news.ycombinator.com/item?id=11054630),
[https://news.ycombinator.com/item?id=11055020](https://news.ycombinator.com/item?id=11055020),
[https://news.ycombinator.com/item?id=10819501](https://news.ycombinator.com/item?id=10819501).

~~~
nly
Move your for-loop into an algorithm that takes the Container by const& and your
resize() / iterator invalidation problem becomes a compile error. If you
actually need to resize your array while you iterate, then a language like
Rust is going to catch your pointer invalidation, but it's not going to make
the impossible possible and suddenly make your algorithm do something
sensible.

I never claimed C++ statically enforced memory safety. Clearly that can't
happen while it's a superset of C. I just claim that it's steadily reaching the
point where the most concise and elegant code with the best API is the safest.

And I'd argue that's the problem with C. The language is too weak to write
safe code _concisely_ , so you end up with too much cognitive load, get lazy,
and then end up with sprawling unauditable messes like we see in glibc lately.

------
nkurz
Toward the end of the essay there's an excellent and provocative summary of
the way forward:

    
    
      My concerns are not merely organizational.  I do think we   
      need to start investing significantly more in mitigation 
      technologies that operate before memory corruption has 
      occurred.  ASLR, NX, Control Flow Guard – all of these 
      technologies are greatly impressive, at showing us who
      our greatly impressive hackers are.  They’re not actually 
      stopping code execution from being possible.  They’re
      just not.
    
      Somewhere between base arithmetic and x86 is a sandbox 
      people can’t just walk in and out of.  To put it bluntly, 
      if this code had been written in JavaScript – yes, really – 
      it wouldn’t have been vulnerable.  Even if this network 
      exposed code remained in C, and was just compiled to 
      JavaScript via Emscripten, it still would not have been 
      vulnerable.  Efficiently microsandboxing individual 
      codepaths is a thing we should start exploring.  What can 
      we do to the software we deploy, at what cost, to actually 
      make exploitation of software flaws actually impossible, as 
      opposed to merely difficult?

~~~
bcook
if ($password == blah.crap) is insecure in any lang.

What I mean is, we solve one vector, but is always another (logic flaw vs
stack flaw). Is there a truly "perfect" way to code? Is there any academic
philosophy that is impossible to exploit?

I ask from a _very_ ignorant perspective. I can barely program.

~~~
cbd1984
We can eliminate entire classes of attacks by using certain languages and
technologies.

Simple, forgotten example: You can't just patch a running OS kernel from an
application program anymore. There's hardware in the CPU called an MMU, or
Memory Management Unit, which inspects all attempts to access memory, read or
write, and checks them against a policy the kernel set which aims to disallow
all unsafe memory accesses. (This is like nine kinds of oversimplified, but
it's not actually wrong... ) The MMU will alert the kernel if an application
program attempts to access memory in a way contrary to policy, and the kernel
typically kills the program dead right there. That's what a segfault is.

My point is, prior to the MMU, the only possible response to "Applications can
modify running OS kernels willy-nilly." was "Don't do that then." There was no
way to _enforce_ that policy. That's why MS-DOS, which ran pretty much
exclusively on hardware either without an MMU or with the MMU disabled, had no
effective security policy: Any application could modify the only thing
attempting to enforce that policy at any time, and nothing could stop it. In
the immortal words of Bokosuka Wars, "WOW ! YOU LOSE !"

We take MMUs for granted now. We take OSes which use MMUs for granted now. We
no longer have to rely on the care and kind nature of strangers to enforce the
basic policy.

The trade-off is speed: Adding an MMU to the path to RAM inevitably makes
accessing RAM slower. There's no way around it. We see it as such a rock-simple
win-win tradeoff that we've almost forgotten there even _is_ a tradeoff, but
our computers would run faster without MMUs. We've just, as a hardware and
software culture, decided that it's worth it.

So. The discussion here is, "Which other tradeoffs are worth it?" Because
there _are_ other technologies we could adopt, hardware and software and a mix
of both, which could completely seal off other classes of attack vectors, and
we need to decide which of those technologies are worth implementing.

~~~
bcook
Was this exact conversation not had decades ago regarding C & assembly-type
languages? Tomorrow, Lua vs C, then Go vs Java, then spoken language vs
whatever. (I surely got the specific comparatives wrong)

Tomorrow we will take terabytes & terahertz for granted.

I see no end. Old-school vs new-school ad-infinitum.

~~~
pjmlp
Yes, once upon a time C shared a spot with the other systems programming
languages, many of them safer than it, and on home micros it was seen as a
"managed language", which many used as a cheap macro assembler via the inline
assembly extensions.

Now its compilers are praised for speed, given 30 years of optimization
efforts.

------
capitalsigma
> Somewhere between base arithmetic and x86 is a sandbox people can’t just
> walk in and out of....Efficiently microsandboxing individual codepaths is a
> thing we should start exploring.

wat

The turtles need to end somewhere --- at the end of the day, you are writing
something that becomes machine code, and you've lost the benefit of your high-
level abstractions. The glibc bug is a perfect example --- glibc is an
abstraction layer to make life safer that virtually all (excluding Debian and
some other distros, but near enough) Linux code links against.

No matter where you put it, your stack is going to have a layer like that in
it somewhere, and sometimes you're going to find bugs in that layer. Maybe
next time it's a bug in your C-to-JS compiler that emits insecure code, or a
bug in your `microsandboxing framework` that allows RCE. Shit, maybe there's a
bug in your CPU architecture that allows RCE.

The glibc bug is an example of the sort of incredibly rare screwup you can't
escape no matter how hard you try, and we're fortunate that 'the good guys'
caught it before it had a chance to develop in the wild.

~~~
dakami
What I'm saying is that a lot of energy has gone into "Assuming an attacker
has gotten us into an undefined state, let's try to prevent them from pushing
us into a chosen redefined state." And what I'm saying is, maybe we can create
an environment where we don't end up in undefined states, or at least, there
are bounds to how undefined they can be.

For example, I'm exploring ending use after free bugs by just not freeing
memory. This sounds ridiculous until you realize that on 64 bit, leaking
virtual memory (and therefore never recycling pointers) is actually not an
insane idea, particularly for browsers that get to kill processes outright
because they feel like it. Also, lots of UaF in there.

~~~
tptacek
When you indict things like ASLR as being little more than bait for exploit
developers, and later suggest that part of the solution might be a hack
involving free() creating zombie addresses, you give the impression of having
said "exploit mitigations aren't working, unless they're my exploit
mitigations".

(I also don't think yours is a good plan, but I'll wait for you to publish
more details before criticizing it further).

~~~
dakami
Very specifically, I'm interested in exploit mitigations that eliminate
undefined states, rather than just hope an attacker doesn't know enough to
redefine them. One can show that "zombie pointers" (fine, we've got lots of
space in 64-bit land) will never allow an attacker to exploit a UaF much more
easily than we can show that memory is randomized enough.

At the end of the day hard bounds checking (however slow it might be) also
falls into this category of "approaches that do not try to survive falling
into undefined states". I'm not saying ASLR et al isn't useful, just that we
should put more energy intostaying within well defined states.

That's ultimately what "better" languages promise, after all. I'm curious if
there are approaches that don't require rewrites, and very interested in
actually measuring what does and doesn't absolutely suppress vulnerability, at
what performance cost. We're not doing enough of that.

------
codeisawesome
Is patching this bug on any server a matter of running `sudo apt-get update
&& sudo apt-get upgrade` (or the equivalent for the Linux flavor in question) -
and then rebooting afterwards?

EDIT: From AWS
([https://aws.amazon.com/security/security-bulletins/cve-2015-7547-advisory/](https://aws.amazon.com/security/security-bulletins/cve-2015-7547-advisory/)):

""" We have reviewed the issues described in CVE-2015-7547 and have determined
that AWS Services are largely not affected. The only exception is customers
using Amazon EC2 who’ve modified their configurations to use non-AWS DNS
infrastructure should update their Linux environments immediately following
directions provided by their Linux distribution. EC2 customers using the AWS
DNS infrastructure are unaffected and don’t need to take any action. """

~~~
INTPenis
Yes, when I woke up the morning after the advisory all my servers were already
patched. I only had to reboot them. Thanks to things like yum-cron and
unattended-upgrades.

~~~
tie_
First, until you reboot your servers, they are not really patched. Second, you
are happy about unattended core system upgrades to production machines? I
don't think this is the right feeling to have :)

~~~
jlgaddis
As a general rule, I would agree with you (with regard to point two). However,
you don't know the details of INTPenis' infrastructure so you can't know.
Perhaps automated / unattended upgrades / reboots would totally hose your
environment but that's not the case in every instance.

------
al2o3cr
"The hard truth is that if this code was written in JavaScript, it wouldn’t
have been vulnerable."

I'd be very, very careful making broad "interpreted language X would prevent
this bug" statements - not only is it dependent on the VM not having remotely-
exploitable issues, there's also the matter of VM-to-host leakage. For
instance, it's possible to exploit the row-hammer behavior of the host
system's DRAM from Javascript:

[https://github.com/IAIK/rowhammerjs](https://github.com/IAIK/rowhammerjs)

~~~
dakami
No argument from me about the seriousness of issues like Rowhammer. Computers
are built in layers, and layers require reduction to logical assumptions. The
degree to which Rowhammer destroys logical assumptions is astonishing.

However, we've got lots and lots of common flaws that our present coding
patterns aren't quite covering. This exact code written in JS wouldn't have
been a problem. It wouldn't have been a problem even if it was just transpiled
to JS and kept in a well defined sandbox (which isn't how we're doing
sandboxing right now).

------
FracMat
"A network where devices eventually become existential threats is a network
that eventually ceases to exist."

I imagine this only if these devices can't be disconnected. Life creates its
own existential threats in a lot of parts of the "network", but they are
contained or fixed eventually. Diversity is the key, and even though all life
we know has the same fundamental building blocks, so far it worked out. The
internet is not that fundamentally different. So instead of working on ways to
increase maintenance, continue to make the internet a highly diverse place.

------
yyin
Back in 2008 after the cache poisoning hype, I developed my own method of
resolving names without using caches (it is very fast); I use only
authoritative servers. I still use this method daily.

"... but we can set the tc "Truncation" bit to force an upgrade to the
protocol with more bandwidth."

dnsq does not do TCP queries. Sorry.

I also developed a few systems for resolving all the names I needed in advance
so I did not need to use DNS at all, except when periodically updating the
list of IP addresses. I am glad I did that work. (But nowadays there are
resources like scans.io)

When someone publishes a vulnerability in dnsq from djbdns (it does not send
recursive requests), I'll have to dream up another solution to the problem of
"DNS". I doubt that's going to happen _, but I could be wrong.

_ There are too many other easier targets.

~~~
duskwuff
1\. Hitting authoritative servers for every DNS query and refusing to cache
results will make the operators of those servers _hate_ you. You will probably
find that some sites (or even potentially entire TLDs!) will end up blocking
your requests entirely after a while, as this is an incredibly "unfriendly"
behavior.

2\. Supporting TCP queries is not optional. Some DNS servers will refuse to
answer certain types of queries over UDP. In particular, ANY queries are often
TCP-only, as they are a potential vector for DNS amplification attacks.

~~~
yyin
1\. The truth is, I make far fewer queries than the average web user, because
I have the IP addresses I need stored permanently, and I only update those
files periodically. Today's websites and graphical web browsers (which I do
not use) perform astounding quantities of unnecessary _daily_ or _hourly_ DNS
requests that I never make. Maybe you think I am resolving every registered
name in existence? If that were true, then yes, I think that is unreasonable.
But the fact is I am only resolving the names I need, which, in the context
of the total number of names registered, is very, very few. However, scans.io
and other scanning projects do not seem to be labeled as "unfriendly" nor the
target of "hate"; perhaps your views are not based on actual experience?

2\. This is a personal solution. I am not writing software for anyone else. I
do not have to use TCP for DNS queries and I have never found an authoritative
server that refused to accept a UDP query. dnsq does not do TCP queries; I
guess you could complain to the author he's violating some rule? If I am not
mistaken, amplification problems happen because of ideas like open resolvers
and enormous UDP packets, like those required for EDNS0 and DNSSEC. I am not a
user of either of those ideas.

~~~
chei0aiV
I would encourage you to publish your solution and get it included in various
software distributions.

