Hacker News
PCRE Heap Overflow in Regex Processing Lets Users Execute Arbitrary Code (securitytracker.com)
84 points by muraiki on June 4, 2015 | 25 comments

I'm not that familiar with PCRE, but I think this engine is used in a variety of programming languages, though not in Perl itself, since PCRE exists to provide Perl-compatible regexes elsewhere. PHP's manual for PCRE links to pcre.org, so it seems that at least PHP is vulnerable.


Edit: Here's a copy of the actual CVE, which demonstrates the vulnerability using PHP: http://www.openwall.com/lists/oss-security/2015/06/01/6

The CVE mentions that PCRE is used in Flash, Apache, and Nginx.

Edit2: Could a mod change the article url to the openwall CVE? It seems that securitytracker.com is not the most responsive website. Sorry, I should have linked to the CVE directly.

Anyone compiling regex strings in PHP from user input should take care to use preg_quote(): http://php.net/manual/en/function.preg-quote.php Otherwise, you're in risky business trusting what a user could send you.
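
A minimal sketch of the same idea, written in Python rather than PHP so it runs without a PCRE build: `re.escape()` plays roughly the role that `preg_quote()` plays in PHP. The helper name `build_search_pattern` and the sample inputs are made up for illustration.

```python
import re


def build_search_pattern(user_text: str) -> re.Pattern:
    """Compile a pattern that matches user_text literally (hypothetical helper)."""
    # re.escape() backslash-escapes regex metacharacters, so the user's
    # input can no longer introduce groups, quantifiers, or backreferences.
    return re.compile(re.escape(user_text), re.IGNORECASE)


# Hostile-looking input is treated as plain text, not as regex syntax.
hostile = "(?P=B)((?P=B)"
pattern = build_search_pattern(hostile)
print(pattern.search("log line containing (?P=B)((?P=B) verbatim") is not None)
print(pattern.search("ordinary text") is not None)
```

The point is that the regex compiler only ever sees an escaped literal, so whatever the user typed never reaches it as pattern syntax.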

Especially since you could otherwise inject the "/e" modifier, which turns a preg_replace() call into an eval()!

SERIOUSLY!? eval() from regexes!?

Jebus... Just burn it all down, we're screwed.

From the documentation for regex modifiers: "DEPRECATED as of PHP 5.5.0 and REMOVED as of PHP 7.0.0". Not for much longer.

Here the text is formatted better: https://bugs.exim.org/show_bug.cgi?id=1636

On yum-based systems, you can find all installed packages that require PCRE, and all packages in the yum repos that require PCRE, with:

    repoquery --whatrequires pcre --installed
    repoquery --whatrequires pcre

The apt equivalent seems to be `apt-cache rdepends libpcre3`

Flash fell for a PCRE-related exploit in March. http://googleprojectzero.blogspot.com/2015/02/exploitingscve...

If I'm reading this right, the exploit is in the regex compilation stage, not the data matching stage. So remote exploitation would require the server to be set up to compile attacker-provided regexps, not just to run attacker data through an admin-configured regexp?

If that's the case, a typical nginx/apache config shouldn't be remotely vulnerable, right? Though I could see some shared hosting scenarios having some issues.
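
To make the compile-versus-match distinction concrete, here is a small Python sketch. Python's `re` module is not PCRE, so this illustrates the data flow only, not the bug itself. `ADMIN_PATTERN`, `route_request`, and `search_with_user_regex` are hypothetical names.

```python
import re

# Configured by the administrator and compiled once at startup.
# Attackers never reach the regex compiler through this path.
ADMIN_PATTERN = re.compile(r"^/static/(?P<name>[\w.-]+)$")


def route_request(path: str):
    """Safer shape: attacker-controlled data only reaches the matching stage."""
    m = ADMIN_PATTERN.match(path)
    return m.group("name") if m else None


def search_with_user_regex(user_regex: str, text: str) -> bool:
    """Risky shape: attacker-controlled text reaches the compile stage."""
    return re.search(user_regex, text) is not None


print(route_request("/static/logo.png"))
print(route_request("/static/../etc/passwd"))
```

Under the reading above, only code shaped like `search_with_user_regex` would expose a compile-stage bug to remote input.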

Simple searches could be an issue, however.

Not everyone quotes the things they put into regular expressions. Before, that would only have resulted in incorrect behavior; now it's a security vulnerability.

I build nginx with PCRE, and I'm curious if this vulnerability would impact nginx in some way. My hope is that since nginx is using it internally and not accepting randomly-supplied regex strings, then there might be minimal impact. Can anybody with more knowledge of nginx + PCRE comment on this?

I guess this means that regular expressions are Turing complete, at least until a patch arrives?

If you take regex as user input, you should use RE2: https://github.com/google/re2

Even without this vulnerability, some regexes can be painfully slow.
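
As a rough illustration of how slow a pattern can get on a backtracking engine, the sketch below times the textbook pattern `(a+)+$` against inputs that almost match. It uses Python's `re` module as a stand-in for backtracking engines in general, which is an assumption, not a statement about PCRE internals. The helper `time_match` and the input sizes are arbitrary choices, and timings will vary by machine.

```python
import re
import time

# Nested quantifiers: a backtracking engine may try an exponential number
# of ways to split the run of "a"s before concluding there is no match.
NESTED = re.compile(r"(a+)+$")


def time_match(n: int):
    """Return (matched, seconds) for n 'a's followed by a 'b' (hypothetical helper)."""
    subject = "a" * n + "b"
    start = time.perf_counter()
    matched = NESTED.match(subject) is not None
    return matched, time.perf_counter() - start


for n in (8, 12, 16, 20):
    matched, seconds = time_match(n)
    print(f"n={n:2d} matched={matched} seconds={seconds:.4f}")
```

The match fails every time, but the time to find that out typically climbs steeply as `n` grows. RE2 is designed to avoid this class of blowup by matching in time linear in the input size.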

Now you have three problems.

A regular expression processing library seems like a great candidate for something that could be implemented in a "safe" language - Haskell, OCaml, (Rust?), etc.

wow, I was expecting an odd looking, weird, larger regex, but look at this..

is that `WGXCREDITS` the command to execute?

Definitely not. It is a HEAP overflow exploit. It allows you to write arbitrary data outside the bounds of the allocated memory in the heap.

The string is likely only important due to its length. Using a different 10-character string triggers the same error:

    ~ $ php -a
    Interactive shell

    php > preg_match("/^(?P=B)((?P=B)(?J:(?P<B>c)(?P<B>a(?P=B)))>WGXCREDITS)/","ADLAB",$arr);
    *** Error in `php': free(): invalid next size (normal): 0x0000000002ff7a10 ***
    ~ $ php -a
    Interactive shell

    php > preg_match("/^(?P=B)((?P=B)(?J:(?P<B>c)(?P<B>a(?P=B)))>AAAAAAAAAA)/","ADLAB",$arr);
    *** Error in `php': free(): invalid next size (normal): 0x00000000020e5a10 ***

What does that regex even match?

It is quite possible that the regex doesn't match anything useful. From the looks of it, I would say it was generated using a fuzzing tool, much as fuzzing turned up several of the follow-up Shellshock bugs.

What does this mean to someone with a public facing website that uses PHP w/ Apache or Nginx?

It means that if you have a PHP script or application (WordPress, Drupal, etc.) on your server, and code in it calls one of the `preg_*` functions with a regular expression built from user input, then an attacker can theoretically run arbitrary code, and therefore any Unix command, on the server. This means your user information (including any passwords in text files) is vulnerable, and it puts the attacker in a great position to gain full access to the server.

Until PCRE or PHP releases a patch for this, you remain vulnerable. You'd want to defend against this at the web server level -- think `mod_security` rules that scan requests, look for known "bad" regular expressions, and then stop those requests from reaching the PHP application. If you have a good hosting company, hopefully they're already doing this for you.

My guess is that nothing unless you let the user input REs. (Or use REs that can be modified by user input)

Thank you :)
