Libcurl remote code execution (volema.com)
129 points by marshray on Feb 8, 2013 | 60 comments



This is nasty but fortunately it only affects fairly recent versions of curl, specifically curl 7.26.0 up to and including 7.28.1. That means Debian Stable and Ubuntu 12.04 aren't affected.

As a general rule, if you use libcurl in an application and follow redirects, you should probably restrict CURLOPT_REDIR_PROTOCOLS to just HTTP and HTTPS (and maybe FTP). Otherwise a nefarious site could redirect curl to, for example, an IMAP, POP, or SMTP URL, which is potentially undesirable even without this vulnerability. If you're just using curl to talk HTTP(S), you really don't need any of these other protocols.
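
For libcurl users, that looks something like this (a minimal sketch; the URL is just a placeholder):

    #include <curl/curl.h>

    int main(void)
    {
        CURL *curl = curl_easy_init();
        if (curl) {
            curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/");
            curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
            /* Initial request: HTTP(S) only. */
            curl_easy_setopt(curl, CURLOPT_PROTOCOLS,
                             (long)(CURLPROTO_HTTP | CURLPROTO_HTTPS));
            /* Redirects: HTTP(S) only, so a hostile Location: header
               can't bounce the client to pop3://, imap://, smtp://, etc. */
            curl_easy_setopt(curl, CURLOPT_REDIR_PROTOCOLS,
                             (long)(CURLPROTO_HTTP | CURLPROTO_HTTPS));
            curl_easy_perform(curl);
            curl_easy_cleanup(curl);
        }
        return 0;
    }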


Why does curl even allow that?


wow, thanks for the heads up about version numbers - I was about to stay up late patching... with this information patching can wait til morning.


Regarding version numbers, I'll mention something here that I used to have to tell customers in the web hosting biz all the time: stuff sometimes gets backported, and version numbers don't tell the whole story. For instance, RHEL 5 has "curl-7.15.5-15.el5" right now, and that would suggest it doesn't support either of the CURLOPTs required to disable this.

However, the actual build loops in a patch called curl-7.15.5-CVE-2009-0037.patch, and that adds in all of the CURLPROTO_* magic required to lock down an application. I discovered this tonight when updating my client-side code to restrict redirects and found that it would build fine on RHEL 5 even though I expected it to die. A little digging around in the source RPM explained it.

So, if you're on an OS like RHEL and you think you might not be able to use this feature, try looking in your curl.h file. You might actually have support courtesy of some backported patch from your distributor.
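
One portable trick: the CURLPROTO_* names are plain #defines in curl.h, so you can test for them directly (a sketch):

    #include <curl/curl.h>

    static void restrict_redirects(CURL *curl)
    {
    #ifdef CURLPROTO_HTTP
        /* CURLPROTO_* is present, possibly via a distro backport, so
           the protocol-restriction option is available. */
        curl_easy_setopt(curl, CURLOPT_REDIR_PROTOCOLS,
                         (long)(CURLPROTO_HTTP | CURLPROTO_HTTPS));
    #else
        (void)curl;  /* old libcurl: no way to restrict redirect protocols */
    #endif
    }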


Somebody wrote new code which uses strcat() in 2012 (the commit which introduced that bug was written in June 2012)? That's.. wow.. unbelievable.
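
For anyone wondering why that's such a red flag, the classic failure mode looks like this (illustrative only, not curl's actual code):

    #include <string.h>

    void build_command(const char *user_input)
    {
        char buf[64] = "USER ";
        /* strcat copies until it hits the NUL in user_input, with no
           idea how big buf is: anything longer than 58 bytes writes
           past the end of the buffer and smashes the stack. */
        strcat(buf, user_input);
        /* ... */
    }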


Is this kind of stuff not code reviewed?


It's an open source project; you are free to do so if you have the time or the need, and to contribute back any corrections.


This is only a partially accurate description of how most open source projects work. While review of the core developers' own code is rare, patches from outside contributors typically are scrutinized before being applied. The curl vulnerability was introduced by an outsider's patch, so I am troubled that it wasn't noticed before being committed.


I guess I'm just idealistic, but I would think that for something this security-sensitive or important (see "Fixing" TWO in glibc), they would require code review before anything could be committed.

In this case, an automated check rejecting anything that uses the non-"n" versions of the C string functions (strcat vs. strncat, etc.) would have caught it.
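
With GCC or Clang you can get a crude version of that check for free (a sketch; the pragma makes any later mention of these names a hard compile error):

    #include <string.h>  /* must come first: poisoned identifiers may not
                            appear after the pragma, even in headers */

    /* Any later use of these unbounded functions fails to compile. */
    #pragma GCC poison strcat strcpy sprintf gets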


If you expect code reviews and automated tests for an open-source product that's developed in spare time for no money, you can go to this page (http://curl.haxx.se/donation.html), click the PayPal button, and be very, very generous.


If there's anyone using my Requests for PHP library [0], just an FYI that it is not vulnerable to this as it handles redirection outside of cURL and double-checks all URLs. That said, I'll be pushing some commits momentarily to ensure that this is also forced via CURLOPT_PROTOCOLS inside the cURL transport.

[0]: http://requests.ryanmccue.info/


I've been reading more about security lately, and one thing I don't understand is how buffer overflows allow for true RCE these days. I believe all modern processors support having non-executable pages; why is any program-writable page executable, especially in C, where there's no JIT? It also seems like address space randomization would make RCE pretty hard.

Obviously they can still arbitrarily overwrite the instruction pointer, which is very bad and possibly bootstrappable into full RCE, but I still see more conventional RCE described in recent books and wonder if, and why, it's still doable.


Here you can read about defeating the protections you mentioned: http://www.blackhat.com/presentations/bh-europe-09/Fritsch/B...


You should distinguish between RCE and the ability to mount working, reliable exploits. RCE is possible here, and it's best to consider that a fatal security failing in and of itself, rather than hoping a particular set of countermeasures will be completely effective.

NX pages make some classes of RCE exploit difficult or impossible (such as including shellcode directly in the POP3 response and returning directly to it), but they don't help against many other classes (return-to-libc, or finding or installing 'gadgets' elsewhere in executable pages).

ASLR is a working countermeasure for other classes of exploit, as long as addresses are unpredictable enough to make a search hard.


You can redirect the control-flow of the program by overwriting a return address or a vtable. Once you've done that, it's easiest if you can redirect control to code you've written, in executable heap; but if that's not possible, you can still use return-to-libc, or potentially "return-oriented programming" strategies.


So apparently cURL speaks POP3. I had no idea until seeing this. This might be useful to know for future projects.


FYI, `curl --version` shows all the protocols it supports


On my Debian Squeeze I get the following:

    $ curl --version
    curl 7.21.0 (x86_64-pc-linux-gnu) libcurl/7.21.0 OpenSSL/0.9.8o zlib/1.2.3.4 libidn/1.15 libssh2/1.2.6
    Protocols: dict file ftp ftps http https imap imaps ldap ldaps pop3 pop3s rtsp scp sftp smtp smtps telnet tftp 
    Features: GSS-Negotiate IDN IPv6 Largefile NTLM SSL libz

Looks pretty impressive.


Thus illustrating, once again, that the C programming language is unfit for network-facing software and that using it in that role is irresponsible.


The long history of exploitable buffer overflows in C programs is something to keep in mind when people hyperventilate about how recent vulnerabilities in a popular framework reflect on the security of the language ecosystem as a whole.


It's been a long time since somebody tried to convince me that a buffer overflow was an intentional feature of the program they were developing...


Assuming you and your parent are both coyly referring to the recent YAML parsing related Ruby and Rails vulnerabilities, I think the analogy is better than you realize. Raw strcat is to the built-in YAML parser as strncat is to the various whitelist-based YAML fixes[1]. Both strcat and the YAML parser work as intended, but should never be exposed to data controlled by an external source. Buffer overflows aren't intentional, but uses of strcat are, even when they are wrong.

1. https://github.com/dtao/safe_yaml


That's a silly thing to say.

All the networking code for Python, Ruby, Perl, Haskell, Java, Scala, Clojure, Erlang, and just about every other language in existence eventually makes calls to the sockets API, which is written entirely in C.

Where were the "Ruby is unfit for networking code" comments for all the recent Rails remote code execution exploits?

Lazy developers will write unsafe code in any language.


So the language should by default be safe, with extra effort required to perform unsafe operations. Then lazy programmers will write safe code!


We'll still have to deal with industrious but stupid ones.


What is a suitable alternative?


There are many actions that can be taken:

The first and most important is to assume any tool that reaches out to the internet and other machines is suspect no matter what language it's written in, and to treat the entire machine as tainted.

The industrial data approach to this is to have a sacrificial machine, almost completely network-isolated, from which all FTP, curl, wget, et al. jobs are run. Once it's set up 'securely', image the machine so it can be wiped and reset on a regular basis. Use that machine to fetch all data from foreign sites and have it dump incoming files to an incoming queue folder. Network security should be such that only one other machine on the network can read from that folder, and any other network traffic related to the fetch machine rings alarms.

Another approach, when using C to develop your own in-house tools, is to understand that 'C strings' are not and never have been part of the language. Go read the spec: in the latter half, after the language specification, it mentions that a particular byte pattern is referred to as a string in a C context; it's just a label of convenience. Don't use the C stdlib functions that manipulate C string patterns; think of them as just an early example of the kinds of things you can build with C. When dealing with tainted data (any data from a network, a user, a foreign source), use hard blocks and hard sizes, as in the sketch below. Don't fall into the tacit trap of assuming input is "well behaved" .. it never is.
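
A minimal sketch of what "hard blocks and hard sizes" can look like (the names here are illustrative, not from any particular library):

    #include <stddef.h>
    #include <string.h>

    /* A counted buffer: capacity travels with the data, so every write
       can be checked against it. */
    struct buf {
        char   data[256];
        size_t len;                /* bytes currently used */
    };

    /* Append n bytes of possibly-tainted input; refuse anything that
       doesn't fit instead of silently truncating or overflowing. */
    static int buf_append(struct buf *b, const char *src, size_t n)
    {
        if (n > sizeof(b->data) - b->len)
            return -1;             /* caller must handle oversized input */
        memcpy(b->data + b->len, src, n);
        b->len += n;
        return 0;
    }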


Turbo Pascal.

It has a similar set of capabilities to C (including pointers, inline ASM, arbitrary memory access, etc.) and the ability to call C code, so it integrates nicely with C libs.

It has slightly better type handling - e.g. you can declare "feet" and "metres" types based on float, but the compiler won't let you assign one to the other (unless you force a cast).

The major safety advantage is that by default access to strings, arrays and memory management is bounds checked. You can turn this off with pragmas in the code for performance sensitive bits.

In fact, standard practice in C now is to use bounds-checked functions (the strn* functions, etc). Except instead of the compiler keeping track of data sizes for you, you have to do it manually!

Normal string handling is also faster in TP since it stores the length of the string, so it doesn't have to scan through to find the zero char during string operations.

Of course, all this is possible in C, but it's not part of the standard language and API. This has two disadvantages: 1) Lazy or inexperienced programmers will write the most dangerous code. Yikes! 2) Each careful programmer will solve the safety problems in their own way; code from different sources will not necessarily be compatible, forcing library APIs to fall back to the unsafe standard structures.


C++. I hate it, it has its own host of problems, and memory corruption is certainly still an issue there, but the most common issues that plague C apps won't plague properly written C++ apps/libraries. C's string handling (or lack thereof) has cost the world an immeasurable amount of time and money.

The biggest advantage that C++ has over just about everything else, is that you can use it to write libraries usable from everything else, and you can do it very easily. You can expose a C-style API trivially, and bind it to everything you want; that advantage can't be overstated.

While I'd love for everything to be written in pure safe, managed code, that's not viable right now. C++ is the best alternative we have, when safe languages aren't usable for the task.


> the most common issues that plague C apps won't plague properly written C++

or properly written C, either. :-)


Not that I agree with the comment above, but functional languages that limit and rigorously encapsulate memory operations can eliminate a large number of these problems (the access itself is localized to monads, and the rest is strictly typed and must be explicitly reasoned about). They can still be DoS'd (too many open handles, out of memory, etc.), to be sure. As long as there are fixed buffers there will be buffer overflows, but even using a functional style in C/C++ can often provide some of the same benefits (even just using const wherever possible...).


I agree, but using functional languages won't let you replace C in the vast majority of cases. The performance hit is minimal, but the ability for code to be used from anywhere (as C libraries can be) is extremely important.

Also, grouping C and C++ together in the context of security is a Bad Thing (TM). They have completely different security issues that plague them, and while C++ may be (mostly) a superset of C, it's truly a completely different language. Things like buffer overflows in string handling are next to nonexistent in proper C++ code, while they exist all over the place in C.


Yes and no.

C and C++, as you note, permit a similar lowest-common-denominator programming style and so a similar class of low-level bugs. While the C++ libraries (std::string, etc) and other differences can help, they can be subverted through ignorance or will, and often are. The strict typing in C and C++ helps eliminate bugs over the same task written in assembly (or Forth, etc). In the same way, goto might best be used sparingly, since the sharper the tool, the more likely the damage done by accident. C and C++ sit at approximately the same "danger" level (potential for low-level access) whereas I think functional languages can be automatically "safer" (admittedly ill-defined), assuming you are willing to subjugate yourself to them and can trust the implementation (which is not always a fair assumption), since compilation boils down to a relational proof. (That's not to say it's my preferred style.)

Furthermore, Haskell and OCaml, for example, can both be compiled to linkable objects (C-interfaceable), so I don't see the loss of interoperability you suggest. A Haskell .o looks like any other.


> The performance hit is minimal, but the ability for code to be used from anywhere (as C libraries can be) is extremely important.

I'd argue about the performance hit. (And I program in Haskell for a living.) But FFI, on the other hand, isn't too unpleasant in functional languages.


C with a proper string and buffer library, such as libdjb. http://www.fefe.de/djb/


I'm not sure, but I think we should spearhead an effort to rewrite the Linux TCP/IP stack in Forth.


Let's add some more obscurity to this security. I suggest APL.


More generically than the other answers: Anything with managed strings gets you out of buffer-overflow land.

If you're really concerned about security, something that does not support "eval" is also a good idea. Replacing a buffer overflow, which still requires some skill to exploit, with the opportunity to create a "Please tell me what code you would like to execute, in source code form" exploit isn't exactly a good trade. You'd think it would be easy to prevent users from executing code, but evidence suggests you'd be wrong.


Perhaps D?



A garbage collected language created by people ideologically opposed to dynamic linking? How is that ever going to replace dynamic libraries written in C that can be used from dozens of other languages?

Go is not a C replacement.


My mistake, I thought the question was about standalone applications in C, not libraries.


I hate the fact that most of the C standard library needs to be ignored due to buffer overflow/security issues. I nearly always use the cross-platform "Safe C Library" any time I do something in C: http://sourceforge.net/projects/safeclib/ http://www.drdobbs.com/cpp/the-safe-c-library/214502214
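
For reference, it provides C11 Annex K-style bounds-checked calls, so the pattern looks roughly like this (a sketch; check header name and error codes against the safeclib docs):

    #include "safe_str_lib.h"   /* Safe C Library string functions */

    void build_greeting(const char *untrusted)
    {
        char buf[64];
        errno_t err;

        /* The _s variants take the destination size explicitly and
           return an error instead of running off the end of buf. */
        err = strcpy_s(buf, sizeof(buf), "Hello, ");
        if (err == EOK)
            err = strcat_s(buf, sizeof(buf), untrusted);
        if (err != EOK) {
            /* input too big or bad arguments: handled, not overflowed */
        }
    }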


I guess I'm just a dumbass, but does the linked PoC do more than just overwrite EIP with 'A'? Is the payload left as an exercise for the reader or is this a working example and am I missing something clever?


The stack was overwritten with 'A's, and thus upon returning from a function the return address consisted entirely of 'A's, which is why EIP now holds 0x41414141.

A clever hacker would use this to jump into some malicious code instead; that's how stack overflow exploits usually work.

So, yes, the actual payload is left as an exercise for the reader, but this shows that everything needed to make it possible is already there.


Oh OK, then I did see correctly after all. But still, on many modern systems (e.g. with non-executable stack) it would require fairly sophisticated trickery in the payload to make this do anything. I mean, getting control over EIP is a very good first step, but this isn't 1995.


Control over the program counter allows return-into-libc or other ROP attacks.


Having access to EIP means you have full control. From there it is very simple to mount return-to-libc or ROP attacks. EIP is the holy grail; control that and you win.


Generally it is considered good form to release PoC exploit code that illustrates the weakness without causing significant harm.

The buffer overflow is the demonstration. There is enough other work that shows how to create payloads.


Seems OS X 10.8.2 is unaffected.

    curl --version
Gives me 7.24 or so.


Mitigation: wget.


This is a flaw in libcurl, and last I checked wget was just a binary, not a cross-platform library.


Not having wget installed stops about 60% of the PHP shell scripts I've seen dropped on servers.


Not having PHP installed stops 100% of them.


Less snarky and more portable option: avoid webserver-writeable directories.


Not all of us get to be so lucky.


curl is still the best tool for file downloads. wget lacks some fairly important features, like limiting the size of downloads, which is worth checking if you are worried about bandwidth or disk quota.


curl --max-filesize: "NOTE: The file size is not always known prior to download, and for such files this option has no effect even if the file transfer ends up being larger than this given limit. This concerns both FTP and HTTP transfers."

wget --quota: "Note that quota will never affect downloading a single file. However, quota is respected when retrieving either recursively, or from an input file."

These options are not that much different in practice. If single-file size is uber-important, pipe it through something that will break the pipe after the limit point. For ad-hoc scripting usage, wget is almost always more usable than curl IMO. Of course, wget isn't a library.


If you're using HTTP, you can use the range option in curl to restrict the number of bytes you download.
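
In libcurl terms that's CURLOPT_RANGE (a sketch; as the reply below notes, the server has to honor byte ranges):

    #include <curl/curl.h>

    int main(void)
    {
        CURL *curl = curl_easy_init();
        if (curl) {
            curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/big.iso");
            /* First MiB only; servers that don't support Range
               requests will ignore this and send the whole file. */
            curl_easy_setopt(curl, CURLOPT_RANGE, "0-1048575");
            curl_easy_perform(curl);
            curl_easy_cleanup(curl);
        }
        return 0;
    }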


Sure, if the server supports it.



