Follow-up: PROBLEM SOLVED.

This problem occurs due to IA-32's 80-bit floating point arithmetic. The simple fix: add a "-ffloat-store" flag to your CFLAGS.

The problematic function, zend_strtod, seems to parse the mantissa (2.225...011 part) and the exponent (-308 part) separately, calculate the approximation of m*10^e and successively improve that approximation until the error becomes less than 0.5ulp. The problem is that this particular number causes the infinite loop (i.e. the iteration does not improve the error at all) in 80-bit FP, but does not in 64-bit FP. Since x86-64 in general uses the SSE2 instruction set (with 64-bit FP) instead of the deprecated x87 it does not have this problem.
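
Here is a tiny self-contained illustration of the underlying effect (not the zend_strtod code itself; whether the first message prints depends on your compiler, optimization level, and flags - try gcc -m32 -O2 with the default x87 math, then again with -ffloat-store or -mfpmath=sse):

    #include <stdio.h>

    int main(void)
    {
        /* the value from the report, just below the smallest normal double */
        double d = 2.2250738585072011e-308;

        double third = d / 3.0;         /* with x87 math this may stay in an 80-bit register */
        volatile double stored = third; /* forcing it through a 64-bit memory slot rounds it */

        if (third != stored)
            puts("excess 80-bit precision: register value differs from its stored 64-bit copy");
        else
            puts("intermediates were rounded to 64 bits (SSE2, -ffloat-store, or no excess precision kept)");
        return 0;
    }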




It's hardly what I would call "PROBLEM SOLVED". The source should either refuse to compile without the flag (with a clear explanation of why), or be changed to work properly without the flag too. Otherwise, someone will run into it again in the future (or the flag will be removed X years from now by someone trying to achieve something completely different).


"PROBLEM ROOT-CAUSED" is not as sexy


A better fix would be to use "-mfpmath=sse" to disable x87 math - this also makes your program slightly faster, whereas -ffloat-store can make it much slower.


.. or not use an algorithm that depends on the hardware implementation of floating point math? How about just comparing the previous & current iteration values? If you're not making any progress, you're done.


You'd probably want to compare to an epsilon AND exact compare, otherwise you could oscillate between two values.


Ah, your point is correct. In my defense, I was just trying to be conservative.
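
For illustration, a progress guard along those lines, with a toy Newton iteration standing in for the real refinement step (none of this is zend_strtod's actual code; an epsilon test could be layered on top as suggested above):

    #include <stdio.h>

    /* stand-in refinement step: Newton's method for sqrt(target) */
    static double refine(double x, double target)
    {
        return 0.5 * (x + target / x);
    }

    int main(void)
    {
        double target = 2.0;
        double prev2 = 0.0, prev = 0.0, cur = 1.0;

        for (int i = 0; i < 100; i++) {   /* hard cap as a last-resort safety net */
            prev2 = prev;
            prev  = cur;
            cur   = refine(cur, target);
            /* stop when no progress is made, or when we bounce between two values */
            if (cur == prev || cur == prev2)
                break;
        }
        printf("sqrt(%g) ~= %.17g\n", target, cur);
        return 0;
    }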


I don't know enough about the intricacies of decimal-to-floating point conversion to have an opinion, but it seems strange to me that there is any sort of approximation process involved at all. Can someone shed light on whether this is a good way of doing things that happened to go wrong on a pathological case or totally crazy or something in between?


This is the nature of floating point numbers: they're not actually "exact" at all. Converting a fixed-fraction decimal number into floating point means turning an exact number into its best approximation. To get the approximation as close as possible to the original number, a floating point conversion algorithm will perform several refinement passes until the error between the original number and the floating point representation is smaller than some very small value. This leads to problems when either the error can't get smaller than the required precision, or when the error doesn't decrease per iteration. In both cases an algorithm that doesn't detect the lack of progress will be stuck in an infinite loop.
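
For a concrete taste of that inexactness (a trivial standalone example, nothing to do with PHP's code):

    #include <stdio.h>

    int main(void)
    {
        /* 0.1 has no exact binary representation; printing the double with
           extra digits exposes the nearest representable value instead */
        printf("%.20f\n", 0.1);   /* prints 0.10000000000000000555... */
        return 0;
    }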


"floating point conversion algorithm will perform several runs until the error between the original number and the floating point representation is smaller than some very small value"

Yes yes yes yes yes, but my question is not "what happens in the PHP" but why that happens. Shouldn't there just be an algorithm that runs down the number and produces the correct float without any approximation step? Floats and doubles may be inaccurate, but the inaccuracy in this case is purely deterministic.

Here's the strtod.c in tcl, with various exciting things in the copyright: http://www.opensource.apple.com/source/tcl/tcl-14/tcl/compat... Unless I've missed it, there's no convergence testing there, it just gets on with the conversion; all the if, while, and gotos seem to be focused on the matter of parsing, not checking error sizes. Is the PHP way faster? Is the linked code somehow inaccurate? Is the PHP way just insane?


almost every floating point value given as a string in decimal notation will not be representable in binary given a floating point format with fixed precision. if the number to convert lies within the range of your binary fp precision (eg ieee 754, 52 bits of mantissa, 11 bits exponent), it will in general be sandwiched between two numbers that are exactly representable in binary. the task is to implement an algorithm which chooses between these 2 numbers in the best possible way (eg without introducing systematic bias).

the apple engineers simply rely on the compiler library: they collect the mantissa digits into up to 2 integers, thus generating an exact binary representation for mantissae of up to 18 decimal digits. thereafter they coalesce these integers into a double variable via an implicit cast from int to double - that's where the library or compiler-generated sequence of machine instructions takes over.
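
as a rough sketch of that digit-collection idea (illustrative only - it is not the apple/tcl code: signs, exponents, the two-integer split and the 18-digit cutoff are all omitted, and the final power-of-ten scaling adds its own rounding, which is the kind of residual error php's extra adjustment step is there to remove):

    #include <ctype.h>
    #include <stdio.h>

    /* hypothetical helper, not the apple/tcl code: accumulate the decimal
       digits into an integer (exact), convert once to double, then scale. */
    static double naive_strtod(const char *s)
    {
        unsigned long long digits = 0;
        int frac = 0, seen_point = 0;

        for (; *s; s++) {
            if (*s == '.') { seen_point = 1; continue; }
            if (!isdigit((unsigned char)*s)) break;
            digits = digits * 10 + (unsigned long long)(*s - '0');
            if (seen_point) frac++;          /* count digits after the point */
        }

        double result = (double)digits;      /* the integer is exact; this cast
                                                is where the first rounding happens */
        while (frac-- > 0)
            result /= 10.0;                  /* each division may round again */
        return result;
    }

    int main(void)
    {
        printf("%.17g\n", naive_strtod("2.2250738585072011"));
        return 0;
    }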

the php guys actually do the same thing. however, after this step they continue to adjust the result to obtain the best approximation in the sense above. it can be proven mathematically ([1]) that this task _cannot_ be performed for all inputs using arithmetic with any given fixed precision. [1] also provides a near-optimal non-iterative solution which the author of the php conversion routine has improved upon. however the computations involve floating point (processor) arithmetic, which behaves differently on intel x87 fpus because of their 80-bit intermediate precision. due to this, intermediate results are altered in an unfortunate way so that the adjustment algorithm no longer terminates in spite of the theoretical guarantee.

some more words about the 18-digit mantissa mentioned at the beginning. the apple code cuts off the mantissa after 18 decimal digits and its authors claim this operation won't affect the result. that would not be a problem if the input strings were given with normalized mantissae (first digit not a zero), as the error would be introduced at the 18th decimal / approximately 60th binary digit, while the ieee 754 double format only caters for 52 mantissa bits. as the strtod.c docs do not detail the 'internal fp representation', this might be problematic (it would be, eg, for 80-bit extended precision on x87 coprocessors featuring 64 mantissa bits). however, as they allow non-normalized mantissa values, their conversion runs afoul of strings with 'many' leading zeros, ie they'd convert 0.0000000000000000009999E+0 to 0.

hope all this makes kind of sense.

greets, carsten

[1] http://dx.doi.org/10.1145/989393.989430 William D. Clinger, How to read floating point numbers accurately, ACM SIGPLAN Notices, v.25 n.6, p.92-101, Jun. 1990

[2] http://en.wikipedia.org/wiki/Double_precision_floating-point... fp formats for starters


Congrats, you were quoted in an article

http://www.theregister.co.uk/2011/01/04/weird_php_dos_vuln/



We have slapped together a quick workaround that can be found here: http://www.aircraft24.com/en/info/php-float-dos-quickfix.htm

It's a quick+dirty fix for site owners who cannot immediately upgrade PHP.


does it work when the number is entered without scientific notation?



