This problem occurs due to IA-32's 80-bit floating point arithmetic. The simple fix: add a "-ffloat-store" flag to your CFLAGS.
The problematic function, zend_strtod, seems to parse the mantissa (2.225...011 part) and the exponent (-308 part) separately, calculate an approximation of m*10^e and successively improve that approximation until the error becomes less than 0.5 ulp. The problem is that this particular number causes an infinite loop (i.e. the iteration does not improve the error at all) in 80-bit FP, but does not in 64-bit FP. Since x86-64 in general uses the SSE2 instruction set (with 64-bit FP) instead of the deprecated x87, it does not have this problem.
It's hardly what I would call "PROBLEM SOLVED". The source should either not compile without the flag with a nice explanation, or it should be changed to work properly without the flag too. Otherwise, someone will run into it again in the future (or the flag will be removed X years from now by someone trying to achieve something completely different).
A better fix would be to use "-mfpmath=sse" to disable x87 math - this also makes your program slightly faster, whereas -ffloat-store can make it much slower.
.. or not use an algorithm that depends on the hardware implementation of floating point math? How about just comparing the previous & current iteration values? If you're not making any progress, you're done.
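A minimal Python sketch of that suggestion, not the zend_strtod code: `refine` and `step` are made-up names, and the iteration cap is an extra assumption added so that a sequence bouncing between two values also terminates.

```python
def refine(step, x0, max_iter=1000):
    """Apply x = step(x) until an iteration makes no progress.

    Hypothetical helper for illustration; returns (value, iterations_used).
    """
    prev = x0
    for i in range(1, max_iter + 1):
        cur = step(prev)
        if cur == prev:
            # The previous and current values agree: stop instead of spinning.
            return cur, i
        prev = cur
    # Never settled (e.g. the values oscillate); give up after max_iter steps.
    return prev, max_iter


if __name__ == "__main__":
    # Newton's iteration for sqrt(2), used here only as a stand-in refinement step.
    value, used = refine(lambda x: (x + 2.0 / x) / 2.0, 1.0)
    print(value, used)
```

The equality test alone does not catch a loop that alternates between two candidates, which is why the cap is there.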
I don't know enough about the intricacies of decimal-to-floating point conversion to have an opinion, but it seems strange to me that there is any sort of approximation process involved at all. Can someone shed light on whether this is a good way of doing things that happened to go wrong on a pathological case or totally crazy or something in between?
This is the nature of floating point numbers: they're not actually "exact" at all. Converting a fixed fraction decimal number into a floating point means turning an exact number into its best approximation. In order to get the approximation as close as possible to the original number, a floating point conversion algorithm will perform several runs until the error between the original number and the floating point representation is smaller than some very small value. This leads to problems when either the error can't get smaller than the required precision, or when the error doesn't decrease per iteration. In both cases an algorithm that doesn't have error detection will be stuck in an infinite loop.
"floating point conversion algorithm will perform several runs until the error between the original number and the floating point representation is smaller than some very small value"
Yes yes yes yes yes, but my question is not "what happens in the PHP" but why does that happen. Shouldn't there just be an algorithm that just runs down the number and produces the correct float without any approximation step? Floats and doubles may be inaccurate but the inaccuracy in this case is purely deterministic.
Here's the strtod.c in tcl, with various exciting things in the copyright: http://www.opensource.apple.com/source/tcl/tcl-14/tcl/compat... Unless I've missed it, there's no convergence testing there, it just gets on with the conversion; all the if, while, and gotos seem to be focused on the matter of parsing, not checking error sizes. Is the PHP way faster? Is the linked code somehow inaccurate? Is the PHP way just insane?
almost every floating point value given as a string in decimal notation will not be representable in binary given a floating point format with fixed precision. if the number to convert lies within the range of your binary fp precision (eg ieee 754, 52 bits of mantissa, 11 bits exponent), it will in general be sandwiched between two numbers that are exactly representable in binary. the task is to implement an algorithm which chooses between these 2 numbers in the best possible way (eg without introducing systematic bias).
the apple engineers simply rely on the compiler library: they collect the mantissa digits into up to 2 integers, thus generating an exact binary representation for mantissae of up to 18 decimal digits.
thereafter they coalesce these integers into a double variable having an implicit cast from int to double - that's where the library or compiler-generated sequence of machine instructions takes over.
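To see the "sandwich" for a concrete input, here is a small Python sketch (Python 3.9+ for `math.nextafter`; `neighbours` is a made-up name). It leans on Python's own string-to-float conversion and on exact rational arithmetic, so it illustrates the problem statement rather than how any strtod implementation solves it.

```python
from fractions import Fraction
import math


def neighbours(decimal_text):
    """Return (lo, hi): the adjacent doubles that bracket the exact decimal value.

    If the decimal value is exactly representable, lo == hi.
    """
    exact = Fraction(decimal_text)   # exact rational value of the string
    near = float(decimal_text)       # Python's own conversion to a double
    if Fraction(near) == exact:
        return near, near
    if Fraction(near) < exact:
        return near, math.nextafter(near, math.inf)
    return math.nextafter(near, -math.inf), near


if __name__ == "__main__":
    for text in ("0.1", "0.5", "2.2250738585072011e-308"):
        lo, hi = neighbours(text)
        print(text, "lies in", (lo, hi))
```

The conversion routine's job is then to pick whichever of `lo` and `hi` is closer to the exact value.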
the php guys actually do the same thing. however, after this step they continue to adjust the result to obtain the best approximation in the sense above. it can be proven mathematically ([1]) that this task _cannot_ be performed for all inputs using arithmetic with any given precision. [1] also provides a near-optimal non-iterative solution which the author of the php conversion routine has improved upon. however the computations involve floating point (processor) arithmetic which suffers from a specific error in intel fpus. due to this error intermediate results are altered in an unfortunate way so that the adjustment algorithm no longer terminates in spite of the theoretical guarantee.
some more words about the 18-digit mantissa mentioned at the beginning. the apple code cuts off the mantissa after 18 decimal digits and its authors claim this operation won't affect the result. that would not be a problem if the input strings were given with normalized mantissae (first digit not a zero), as the error was introduced in the 18th decimal / approximately 60th binary digit while ieee 754 double format only caters for 52 mantissa digits. as the strtod.c docs do not detail the 'internal fp representation', this might be problematic (it would be eg. for 80-bit extended precision on x87 coprocessors featuring 64 mantissa bits).
however, as they allow non-normalized mantissa values, their conversion runs foul on strings with 'many' leading zeros, ie they'd convert 0.0000000000000000009999E+0 to 0.
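A Python sketch of that failure mode, assuming a parser that behaves as described above (it keeps only the first 18 mantissa digits and counts leading zeros among them). `truncating_parse` is a made-up name and this is not the Tcl source; it only handles unsigned, well-formed input.

```python
from fractions import Fraction


def truncating_parse(text, max_digits=18):
    """Parse an unsigned decimal string, keeping only the first max_digits
    mantissa digits (leading zeros count). Hypothetical sketch, not the Tcl code."""
    mantissa, _, exponent = text.lower().partition("e")
    int_part, _, frac_part = mantissa.partition(".")
    digits = int_part + frac_part
    kept = digits[:max_digits]
    dropped = len(digits) - len(kept)
    # Power of ten that scales the kept digits back to the right magnitude.
    exp10 = (int(exponent) if exponent else 0) - len(frac_part) + dropped
    return float(int(kept) * Fraction(10) ** exp10)


if __name__ == "__main__":
    # Built programmatically to avoid miscounting the zeros.
    tiny = "0." + "0" * 18 + "9999E+0"
    print(truncating_parse(tiny), float(tiny))
```

Skipping leading zeros before starting to count digits would avoid losing the significant part of such an input.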
hope all this makes kind of sense.
greets, carsten
[1] William D. Clinger, How to read floating point numbers accurately, ACM SIGPLAN Notices, v.25 n.6, p.92-101, Jun. 1990. http://dx.doi.org/10.1145/989393.989430
If you have any thoughts on what the bug is, please let me (or PHP) know.
Here's what you do. Step 1: compile PHP with debugging symbols. Then run the test case in GDB:
$ gdb `which php`
(gdb) set args testcase.php
(gdb) run
<program hangs>
Then hit C-c, and see where the program is:
^C
Program received signal SIGINT, Interrupt.
0x0000000000703898 in ?? ()
#0 0x0000000000703898 in ?? ()
#1 0x00000000006aae40 in execute ()
#2 0x00007ffff4400116 in ?? () from /usr/lib/php5/20090626/suhosin.so
#3 0x000000000068290d in zend_execute_scripts ()
#4 0x000000000062e1a8 in php_execute_script ()
#5 0x000000000071317a in ?? ()
#6 0x00007ffff5475c4d in __libc_start_main (main=<value optimized out>, argc=<value optimized out>, ubp_av=<value optimized out>,
init=<value optimized out>, fini=<value optimized out>, rtld_fini=<value optimized out>, stack_end=0x7fffffffe9c8)
at libc-start.c:228
#7 0x000000000042d4b9 in _start ()
Now you have some idea of where to look. (Note: this is not the actual bug, as I can't reproduce it on my machine. This is <?php while(1){} ?>, which is just as good for demonstration purposes. Also, no debugging symbols, so we don't really know what's going on.)
It might be if you grew up in a C/C++ environment, but many people these days start with scripting languages and don't venture out (myself included, but learning C is my new year's resolution).
You can do all this in any decent scripting language too. In ruby:
require 'rubygems'
require 'ruby-debug'
Debugger.start
Signal.trap('INT') { debugger } # this line makes the debugger run when you hit ctrl-c
# YOUR CODE HERE INSTEAD!
loop {}
Simply hit ctrl-C (sometimes twice) when your code is hung or whatever and then issue the `where` command to rdb. Just make sure you have the ruby-debug gem installed.
It's actually easier, as you obviously don't have to worry about debugging symbols. You can also attach rdb to running interpreter processes and all that good stuff.
EDIT: I've just realised that this will obviously only catch bugs in ruby code. For bugs in the ruby interpreter itself, like the php bug under discussion, you will need to use gdb on the interpreter binary like jrockway showed. Still the point is that you shouldn't consider this stuff over your head just because you only hack dynamic languages :)
It might be now, but it wasn't back when C was the norm in college. Everyone knew how to step through visual studio in 101, or they failed out. If you didn't have windows, you learned gdb pdq :)
that's true, but back then we probably all used fax machines for sending messages. times move on, and so does the level of interaction with the machines. eventually the number of people who understand this will be even smaller than now, that's not a bad thing, it's a sign of progress.
Bullshit. Fax machines are obsolete because we have email now, but C and debuggers still haven't been replaced. "Level of interaction with the machines"? No, kernels and upper level language runtimes aren't magic, they're made of C by programmers just like you or me.
fax machines are obsolete because the newer techs are more convenient, in just the same way that not having to worry about type in a loosely typed language is more convenient than having to worry about it in a strictly typed language.
that doesn't mean that we won't ever need fax again, or morse code, it still exists and has its place, but for the majority of users it's not needed, and the same will happen with all technology, including programming. eventually almost no one will need or use something low level like C, just like most people right now don't need or use assembly, or even lower level, machine code, or even lower level... I don't even know what's lower level. Point being as things move along the lower level, while still being there, is understood and used by fewer and fewer people.
That's still partially rubbish, because you don't build technology on top of a fax machine. Many of the technologies at the higher level are directly written in C, and it's going to be a while before the C goes away.
I.e.: it's like a world where all our cellphones/email still, under the covers, run over fax, and we still need a fax machine in every house to make this possible. =P
Without people to write the languages in C, and without people to hack on the kernel in C, there will be no more languages, and existing C based languages will get no more features.
The best you can do now is write one high-level language in another, and that tends to be reasonably slow.
Just because you don't understand something, or because you don't know anybody who does, that doesn't mean that technology is disappearing, it just means you're in a selective circle.
I was specific to not imply any level of the levels of complexity would ever go away. That's like saying assembly has gone away. It has for the vast majority, but not for those who actually code compilers or reverse engineer software. That group is extremely small compared to the rest, hopefully we can agree on that. I expect that eventually languages like C will be similar to the way cobol is now :)
As the author said, it does hang in zend_strtod.c, and it seems to happen in 32-bit only.
Debug trace:
#0 0x0832257f in mult (a=0xe1931e82, b=0x8781590)
at /usr/src/php-5.3.3/Zend/zend_strtod.c:720
#1 0x08322757 in pow5mult (b=0x8781590, k=1)
at /usr/src/php-5.3.3/Zend/zend_strtod.c:803
#2 0x08324443 in zend_strtod (s00=0xb7a7d01d "e-308;\n?>\n", se=0x0)
at /usr/src/php-5.3.3/Zend/zend_strtod.c:2352
#3 0x082e03ce in lex_scan (zendlval=0xbf94dd34, tsrm_ls=0x8648050)
at Zend/zend_language_scanner.l:1382
#4 0x082fa849 in zendlex (zendlval=0xbf94dd30, tsrm_ls=0x8648050)
at /usr/src/php-5.3.3/Zend/zend_compile.c:4942
#5 0x082dcc47 in zendparse (tsrm_ls=0x8648050)
at /usr/src/php-5.3.3/Zend/zend_language_parser.c:3280
#6 0x082dd232 in compile_file (file_handle=0xbf9502d0, type=8,
tsrm_ls=0x8648050) at Zend/zend_language_scanner.l:354
#7 0x081ad3cc in phar_compile_file (file_handle=0xbf9502d0, type=8,
tsrm_ls=0x8648050) at /usr/src/php-5.3.3/ext/phar/phar.c:3393
#8 0x0830acc5 in zend_execute_scripts (type=8, tsrm_ls=0x8648050, retval=0x0,
file_count=3) at /usr/src/php-5.3.3/Zend/zend.c:1186
#9 0x082b660f in php_execute_script (primary_file=0xbf9502d0,
tsrm_ls=0x8648050) at /usr/src/php-5.3.3/main/main.c:2260
#10 0x08388893 in main (argc=2, argv=0xbf9503b4)
at /usr/src/php-5.3.3/sapi/cli/php_cli.c:1192
Bug appearing at my Core 2 Duo / Win7 / PHP 5.3.0.
This is really serious. In fact, I've just tested whether the problem happens for values passed via GET, and it does. Not all data passed to a website is treated as a number, so not every website running an affected PHP version and configuration will be vulnerable, but there is definitely going to be a huge number of websites that are. This is really scary.
I hope the PHP team patch it soon.
Meanwhile, a possible workaround would be adding this line at the very top of the execution of php website:
if (strpos(str_replace('.', '', serialize($GLOBALS)), '22250738585072011')!==false) die();
This will stop execution if any decimal version of the number is passed as a parameter. Note that 222.50738585072011e-310 causes problems too, as does any other way of writing the same number.
Do you know if there are any other possible ways to write the number that causes trouble too?
Only if that parameter is treated as a number and not as a string. And only if the PHP version/configuration has that bug. It seems to be a problem when converting from a decimal string to a number.
On Windows it leaves a zombie process pinning the CPU at 100%, so it doesn't seem to be a nice thing...
It's going to be a very, very small subset of websites running PHP that could be hung...
- Must be one of the PHP 5.3.3 versions with the bug, where very very few web hosts are running such a recent version (5.0, 5.1 and 5.2 branches are much more common)
- Must be a 32-bit version, no bug in 64-bit
- The PHP program must try to use the input as a number
My experience matches yours: the bug occurs in i686 but not x86_64. I have a variety of boxes with different Linux distributions, and tried it on each.
Ubuntu 10.04, i686, "PHP 5.3.2-1ubuntu4.5 with Suhosin-Patch" has the bug.
Ubuntu 10.10, i686, "PHP 5.3.3-1ubuntu9.1 with Suhosin-Patch" has the bug.
Debian Lenny with 2.6.26-1-amd64 kernel, i686, "PHP 5.2.6-1+lenny9 with Suhosin-Patch" has the bug.
Debian Lenny with custom kernel build, x86_64, "PHP 5.3.3-5 with Suhosin-Patch" does not have the bug.
So the common thread seems to be 32- vs 64-bit: the bug occurs on all the 32-bit boxes I tested, but not on any of the 64-bit boxes.
Though the author doesn't seem to be malicious in any way, he really should have reported it to the PHP core team as a security vulnerability before writing a blog post. This could easily lead to denial-of-service attacks.
Aren't a huge percentage of software bugs potential security vulnerabilities by this standard? I understand the wisdom of trying to get patches pushed out before publicly disclosing major exploits in mission critical software, but it seems unreasonable to expect users to not make statements of the form "program X crashes when I do such-and-such" in public discourse.
It just depends on how the crash is triggered. Most things that users run into in normal usage don't have very much potential to become a serious attack vector for lots of reasons -- many programs are not particularly network-aware or may not provide any service an attacker is interested in denying, the attack may require local access (which is still serious in some cases, but obviously less than something that can be exploited over the network by sending a malformed query), and so on.
I think that the vast majority of stuff doesn't qualify as something that should be reported to a security team first. However, from a brief glance at the article, it appears that a remote user of a web application may be able to crash PHP entirely if he can find the right place to drop the number, denying access to all programs that depend on that PHP installation. That is a very serious bug and one incident can theoretically take out hundreds or thousands of sites and cost a lot of people a lot of money, not to mention time or frustration. Definitely seems like it should have hit the security group first to me.
Yes, I understand that bugs in the "platform software" used by many internet services have the potential to rapidly affect a lot of users. However, I believe that it is unreasonable to expect users who experience a crash in php, or mysql, or any important application you can name, to not talk about the crash that happened. Software bugs are just too common to impose that restrictive a standard. In general, I believe statements of the form "this information is too dangerous to be shared publicly" are suspect, and we should maintain a strong presumption that as a general rule, it is OK to talk about problems publicly. When it comes to security issues, there is a difference between releasing deliberately crafted 0-day arbitrary remote code execution techniques and just making a public report of a crash you experienced.
Perhaps the real issue is that the growing reliance upon internet services makes fault-tolerant engineering and fallback plans for handling failures very important. We need to make sure that hospitals/police/everything aren't dependent on systems that might break down completely because of bugs like this.
That would be all well and fine in the ideal world, but this is the PHP core team we're talking about here. They don't give a shit about security, and you have to publicly humiliate them into taking things like this seriously and doing anything about security.
He did submit it to PHP. However, it seems the fastest way to get bugs patched on projects like PHP is to publicly announce the bug and let the public push the devs to fix it. Otherwise the report just languishes in the bug tracker and never gets fixed.
He asks "Is this serious?" halfway down the page, so I'm guessing he just didn't think about it. I'm sure it will be patched shortly, and equally sure some PHP core team member is like "F--!" right now.
When a problem is described as an optimizer issue, it usually means that the code is making some unsound assumptions that happen to be true when compiling without optimization. Of course it's easier on the programmer's ego to blame it on the compiler...
(I haven't looked at this particular issue; maybe it really is a compiler bug. But the odds are against it.)
"I'm not a real programmer. I throw together things until it works then I move on. The real programmers will say "Yeah it works but you're leaking memory everywhere. Perhaps we should fix that."" -Rasmus Lerdorf
When you attempt to make an argument from authority, at least choose someone who has some credibility to use as an authority.
We still need to fix the code to make it immune to compiler switches.
In other words, "We know that our code is buggy and it is relying on undefined behavior."
I.e., this developer knows it's a problem with his code, not the compiler, but he phrases it as if the "compiler switches" are the things causing the problems. This is classic developer arrogance and immaturity.
Thank God I don't rely on his software for anything important.
"For all the folks getting excited about my quotes. Here is another - Yes, I am a terrible coder, but I am probably still better than you :)" - Rasmus Lerdorf
That is because with x87 floating point, when the number is in registers, it is 80-bit, but when the compiler spills it to memory, it is 64-bit, which is why -ffloat-store works.
The copying from the 80-bit register to the 64-bit memory cell is causing the result to be corrupted. When we use volatile it tells the compiler not to apply the optimisation which in this case was putting it in a register.
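The effect of keeping an intermediate in a wider format and narrowing it later can be shown without x87 hardware. The Python sketch below uses exact fractions and a made-up helper, `round_to_precision`, to round the same value once directly to 53 significant bits (a double) and once via 64 bits (the x87 extended significand) first. The input is chosen by hand to expose the difference; it is not the value from the PHP bug.

```python
from fractions import Fraction


def round_to_precision(x, bits):
    """Round a positive Fraction to `bits` significant binary digits,
    ties to even. Hypothetical helper; ignores exponent range limits."""
    # Find e with 2**e <= x < 2**(e + 1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    ulp = Fraction(2) ** (e - bits + 1)  # spacing of representable values near x
    return round(x / ulp) * ulp          # round() on a Fraction is half-to-even


# Slightly above the midpoint between 1 and the next double, 1 + 2**-52.
x = 1 + Fraction(1, 2**53) + Fraction(1, 2**78)

direct = round_to_precision(x, 53)                        # straight to double
via_ext = round_to_precision(round_to_precision(x, 64), 53)  # extended, then double

print(direct == 1 + Fraction(1, 2**52), via_ext == 1)
```

Rounding to 64 bits first discards the small term that put `x` above the midpoint, so the second rounding sees an exact tie and goes the other way. Whether a value gets narrowed at a given point is what flags like -ffloat-store or a volatile qualifier change.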
But doesn't that make it a compiler bug? If the root cause is that the compiler truncates data it shouldn't, does that mean the fundamental issue is a compiler bug?
Awesome, this catches HostGator's PHP 5.3.3 (which isn't used by default, have to turn it on yourself) too. I knew there was yet another good reason for always casting expected-int input before doing anything with them... Something as simple as
aquilax@zelda /tmp> php -v
PHP Warning: PHP Startup: Unable to load dynamic library '/usr/lib/php5/20090626+lfs/adodb.so' - /usr/lib/php5/20090626+lfs/adodb.so: cannot open shared object file: No such file or directory in Unknown on line 0
PHP 5.3.3-6 with Suhosin-Patch (cli) (built: Dec 7 2010 18:23:49)
Copyright (c) 1997-2009 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2010 Zend Technologies
with Xdebug v2.1.0, Copyright (c) 2002-2010, by Derick Rethans
with Suhosin v0.9.32.1, Copyright (c) 2007-2010, by SektionEins GmbH
Good advertisement in a way for more type safe languages, given I'm passing something as a string into JSON then using it as a string but PHP still converts it to a double which triggers this error.
This isn't really a bug that originates from PHP's handling of types. If any floating point variable containing this number crashes the process, it doesn't matter whether the type declaration was implicit or not. Also, since you're using JSON as an example (why?), it's worth noting that type-safe languages require some sort of mapping for JSON fields as well, any of which could be FLOAT or DOUBLE and would thus be vulnerable. Until today there was no reason to assume the floating point types weren't safe, either. Finally, it has already been pointed out that the problem is caused by a GCC optimization bug, so the implication is that any recent GCC-compiled executable may be vulnerable to erratic behavior upon encountering this number.
I was using JSON as an example because it's taking something in JSON specified as a string, using quotation marks, and automatically converting it to a double.
I wasn't referring to this being the cause, just in other languages if you pass a string through JSON it would never end up being decoded to a double, just because syntactically it is a double.
> taking something in JSON specified as a string, using quotation marks, and automatically converting it to a double.
I can't see anything wrong with this behavior. Double isn't supposed to be any less safe than any other type.
> I wasn't referring to this being the cause, just in other languages if you pass a string through JSON it would never end up being decoded to a double, just because syntactically it is a double.
I believe there is a fundamental misunderstanding here. Other languages would indeed convert to double if so instructed, or if the value presented was a (high) floating point value. And once again, I can see no inherent problem with a JSON parser that looks at a floating point number and interprets it as a double. The only problem would be the memory space of a double vs that of a float or smaller type, but since PHP doesn't make that distinction the point is moot. I don't see the evilness of it, nor do I see how static typing would avoid bugs such as this.
More generally speaking, if one of the basic types of a language is defectively handled, there is no way this bug goes away if you declare that type beforehand. It has quite simply nothing to do with it. I guess an argument could be made that less code would be vulnerable on account of having less instances of doubles around, but it would still be a huge problem. And it's not like floating point numbers are somehow rarely used.
This bug could potentially hit any language. The issue lies in the parsing of the number. The fact that php triggers that parsing implicitly is incidental. It could just as well happen in e.g. Java using an explicit Double.valueOf(vulnerableString)