999999999999999 - 999999999999997 (google.com)
185 points by unalone on Aug 24, 2009 | hide | past | favorite | 101 comments



Maybe I'm just getting incurably academic, but I think "I found a bug!" is not nearly as interesting as "I found a bug!" and one of:

1) It affects millions of people!

2) It affects large amounts of money!

3) It is a great example of a new or rare class of bug. Here's how you can avoid introducing similar bugs into your code! Here's how we can detect these things automatically! etc.

Here, the only thing that springs to mind is IEEE floating point being used internally by Google Calculator for some reason, which I find hard to believe. Where's the insight?


What about: 4) It's not obvious how this bug could happen! Let's think about it!


Well, apparently, it was obvious. AFAICT, no one has come up with a better explanation than using IEEE floating point where bignums should be used, which I pointed out in my original comment.


The guy who was the maintainer of Google Calculator from 2005 to 2007 confirmed that this is the correct explanation. He said he wanted to switch it to use bignums, but wasn't able to (and not due to technical reasons).

ZorbaTHut (http://mandible.net) said:

  it's 80-bit ieee floats internally, but there's some specialized code
  to clamp it to zero when it looks like it ought to be zero.
  as far as I know it should otherwise behave exactly like 80-bit ieee
  (it wouldn't surprise me if the clamping code is being a little overzealous
  in this case)


If the reason wasn't technical, what was the reason?

The first thing that springs to my mind is that Google calculator is free, and more precision isn't.


Except that neither double nor single precision IEEE floats behave the same way as whatever numbers Google is using.


The algorithm probably first decides how to do the math and then does the math.

EDIT: single-precision does behave this way:

    -bash-3.2$ cat precision.c
    #include <stdio.h>
    int main() {
      float f1 = 999999999999999;
      float f2 = 999999999999997;
      printf("%f\n", f1 - f2);
      return 0;
    }
    -bash-3.2$ gcc precision.c
    -bash-3.2$ ./a.out
    0.000000
I am saying that if it's not consistent with single-precision across multiple inputs, there's probably code that looks at the numbers involved and decides what kind of numeric type to use to do the calculations. In this case, single-precision floating-point happens to be a bad choice.

Second edit: replacing the 7 with a 6 still prints 0. The parent is right: something else is going on here. I like the theory espoused elsewhere in this discussion that it's a result of some truncation to avoid returning nonzero when the result should be zero.
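The single-precision behavior in the edits above can be checked without writing C by round-tripping through Python's struct module (this mirrors the C program, not anything about Google's actual implementation):

```python
import struct

def to_float32(x):
    # Round-trip a Python float through IEEE single precision.
    return struct.unpack('f', struct.pack('f', x))[0]

# A 32-bit float has ~7 decimal digits of precision, so all three
# 15-digit operands collapse onto the same value and the differences
# come out 0.0 -- consistent with both edits.
a = to_float32(999999999999999.0)
print(a - to_float32(999999999999997.0))  # 0.0
print(a - to_float32(999999999999996.0))  # 0.0
```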


What exactly do you mean, and how does it make "using floating point arithmetic with insufficient precision" a viable explanation for this, given that none of the usual floating-point formats behave in this way?


I agree. My first thought was "who cares?" After reading all the comments, my thought was "who cares?" Now, my thought is "who cares?"


Spoken like a true "business guy". It's technically interesting behavior.



Bing gets it wrong if you add enough digits:

http://www.bing.com/search?q=999999999999999999999999999+-+9...



I was curious as to what caused this. Posting to HN seemed like the fastest way to get an answer.


Interestingly, you also created a top Google result for this search term (indexed in around 10 minutes):

http://www.google.com/search?hl=en&client=safari&rls...


Google loves Hacker News.


Not hard to get a page indexed with only 700 other competing pages. lol



Clearly someone wasn't using 'bignum' when they should have.

Pretty sloppy to not even detect overflow on the input.

another one:

http://www.google.com/#q=999999999999999999+-+90000000000000...


It's not really overflow. It's more like insufficient precision.


Insufficient precision implies too few bits to store the number in.

These are unsigned integers; they could have been represented in a 64-bit unsigned value. If the number of bits in the variable destined to hold the input is not enough and you keep working as though it is, that's overflow all right.

It should have simply thrown an error: operand too large.

I have a dark brown feeling that it won't be long before it does though :)


My definition of "overflow" is along the lines of "value is too big to be represented by said type". If they had been using integers, ok. But they're using floats, and floats can hold numbers bigger than that.

It all boils down to what type you consider the entries have, opposed to what Google considers.


I did not realize they use floats to represent all types; 64.64 fixed point would have been an excellent choice for this and would go a long way toward solving the issue reported.


This bug has been there for years. So I don't think they'll fix it all that fast.

They use floats for everything, not ints.


That would explain it quite nicely: the decimal64 format of IEEE floating point tops out at 16 digits.

edit: actually, it doesn't, the input above is 15 digits long.
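The edit's point can be checked by emulating decimal64's 16 significant digits with Python's decimal module (a stand-in for the format, not Google's code): the 15-digit operands fit with a digit to spare and subtract exactly.

```python
from decimal import Decimal, Context

# Emulate decimal64's 16 significant digits with an explicit context.
ctx = Context(prec=16)
diff = ctx.subtract(Decimal(999999999999999), Decimal(999999999999997))
print(diff)  # 2 -- so decimal64 precision doesn't explain the bug
```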


...also known as Arithmetic underflow (http://en.wikipedia.org/wiki/Arithmetic_underflow)


The correct term is quantization, iirc.


I guess it's just a reminder not to trust a search engine with your fiduciary responsibilities?


http://www.brillig.com/debt_clock/

I think you still have a few orders of magnitude to go before that becomes a real problem.


I remember a programmer once griping that you still need to be careful with floats, and that double precision still did not have enough significant digits to store the dollar to Turkish lira conversion rate.


If you need floats you don't understand your problem ;)


unless you're doing realtime graphics


No, you can do realtime graphics just fine with fixed point arithmetic.

You can do nice matrix and vector math entirely in 16.16 fixed point. It's not as fast as having a co-processor, but there was a time that those were not standard items.
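As an illustration of the idea (not any particular graphics library), 16.16 fixed point is just integers scaled by 2^16:

```python
# Minimal 16.16 fixed-point sketch: values are integers scaled by 2**16.
FRAC_BITS = 16
ONE = 1 << FRAC_BITS

def to_fixed(x):
    return int(round(x * ONE))

def fixed_mul(a, b):
    # A product of two 16.16 values carries 32 fractional bits;
    # shift back down by 16 to renormalize.
    return (a * b) >> FRAC_BITS

def to_float(a):
    return a / ONE

print(to_float(fixed_mul(to_fixed(1.5), to_fixed(2.25))))  # 3.375
```

All operations reduce to integer multiplies and shifts, which is why this was viable before floating-point co-processors were standard.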


The problem is not with precision, but in using floats to represent money.

Why in the fuck would you try to do financial math with anything but fixed-point or rational numbers?
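The standard trick, sketched here in Python: keep money in integer minor units (cents), so every intermediate value is exact.

```python
# Money as integer cents: every value and every sum is exact.
price_cents = 1999           # $19.99
total_cents = price_cents * 3
print(f"${total_cents // 100}.{total_cents % 100:02d}")  # $59.97

# Versus binary floats, where even simple sums drift:
print(0.1 + 0.2)  # 0.30000000000000004
```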


This is clearly a floating point precision problem. It happens when there are no bits small enough in the mantissa to express the difference between the two values.

Check out: http://www.google.com/search?q=999999999999999+*+1 this overflows into the next integer as well.


In the early 90's the Pentium floating point error could fit your criteria:

http://www.cs.niu.edu/other/pentium.html

I remember taking the chip into Digital at the time. They replaced it for me.


This is a pretty good description of what's (probably) going on:

http://www.stata.com/statalist/archive/2002-10/msg00290.html


Except that the number of digits in this case is only 14.


... or so ...


Wolfram Alpha tells me the answer is two :p


. . . but it's copyrighted, so you're not allowed to tell anyone.


That's fair use.


As long as it's 4 words or less http://news.ycombinator.com/item?id=737986 :)


I don't know. Wolfram might have sued you for writing "110 proved Turing Complete!" a few years ago.


This is definitely not just a matter of using double where something longer would have been better. 999999999999999 and 999999999999997 are both exactly representable as 64-bit IEEE floating-point numbers (i.e., doubles).

They're not using single-precision floats either; with one 9 fewer in each number, everything's fine, but those numbers are way out of range for 32-bit IEEE floats.

Doesn't look like fixed-precision decimals, either: reduce the second number by 1 and you correctly get a difference of 3.
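This is easy to verify in Python, whose float is a 64-bit IEEE double: every integer up to 2^53 is exactly representable, and both operands are well under that limit.

```python
# Every integer up to 2**53 has an exact double representation.
print(2**53)  # 9007199254740992, about 9.0e15

# Both 15-digit operands are below that limit, so the subtraction is exact:
assert float(999999999999999) == 999999999999999
assert float(999999999999997) == 999999999999997
print(999999999999999.0 - 999999999999997.0)  # 2.0
```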


Tweaking that second number, I find that when the difference is 2 or less we get 0, but otherwise we get the right answer. More evidence that it's not simply a FP imprecision problem. Further, if I approximately halve them both (just change the initial 9 to a 4) then a difference of 1 or less produces 0 and other results are correct.

I think what's going on is more likely this: whoever implemented this wanted to avoid giving nonzero answers to calculations whose correct answer is zero, and therefore put in some sort of deliberate truncation somewhere. (It's not clear to me exactly what -- maybe every addition or subtraction that produces an answer much much smaller than its inputs gets clamped to 0, or something.)
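A guess at what such a clamp might look like (entirely hypothetical; the relative threshold is reverse-engineered from the cutovers observed above, not from anything published about Google's calculator):

```python
# Hypothetical clamp: treat results tiny relative to the operands as noise.
REL_EPS = 2.5e-15  # guessed to match the observed cutover between 2 and 3

def clamped_sub(a, b):
    result = a - b
    if abs(result) < REL_EPS * max(abs(a), abs(b)):
        return 0.0  # assume it's floating-point noise
    return result

print(clamped_sub(999999999999999.0, 999999999999997.0))  # 0.0 (clamped)
print(clamped_sub(999999999999999.0, 999999999999996.0))  # 3.0 (survives)
print(clamped_sub(499999999999999.0, 499999999999998.0))  # 0.0 (clamped)
print(clamped_sub(499999999999999.0, 499999999999997.0))  # 2.0 (survives)
```

Note that this toy reproduces both observations: with ~1e15 operands the threshold falls between 2 and 3, and with the operands roughly halved it falls between 1 and 2.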

I don't think the truncation is happening purely at the output stage: if I ask it to do the original calculation and multiply the result by 100, e.g., I still get 0. (I had to write it like that because otherwise HN's asterisks-mean-italics hack scrambles it. Even though the supposedly matching asterisk is in a different paragraph. PG, if you're reading this, I'd love it if you were to change that behaviour.)

In other words, I think we're seeing an unfortunate effect of the same feature that makes it say that cos(2*atan(1)) is 0 rather than reporting it as about 6.123 x 10^-17 (the actual result in IEEE doubles).
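That residue is easy to reproduce: in doubles, 2*atan(1) is the closest double to pi/2, and its cosine is tiny but nonzero, so a calculator that wants to print 0 there has to clamp somewhere.

```python
import math

# cos of the double nearest pi/2 is ~6.12e-17 on typical platforms, not 0.
print(math.cos(2 * math.atan(1)))
```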


Well you can also go somewhere meant for doing computations and get an accurate result.

http://www.wolframalpha.com/input/?i=999999999999999+-+99999...


Obviously not using Lisp on the back-end.


I like Lisp too, but there's nothing especially accurate about its arithmetic that you can't get in other civilized languages.


http://gigamonkeys.com/book/numbers-characters-and-strings.h...

One of the reasons Lisp is a nice language for math is its numbers behave more like true mathematical numbers than the approximations of numbers that are easy to implement in finite computer hardware. For instance, integers in Common Lisp can be almost arbitrarily large rather than being limited by the size of a machine word. And dividing two integers results in an exact ratio, not a truncated value. And since ratios are represented as pairs of arbitrarily sized integers, ratios can represent arbitrarily precise fractions.

On the other hand, for high-performance numeric programming, you may be willing to trade the exactitude of rationals for the speed offered by using the hardware's underlying floating-point operations. So, Common Lisp also offers several types of floating-point numbers, which are mapped by the implementation to the appropriate hardware-supported floating-point representations. Floats are also used to represent the results of a computation whose true mathematical value would be an irrational number.

Finally, Common Lisp supports complex numbers--the numbers that result from doing things such as taking square roots and logarithms of negative numbers. The Common Lisp standard even goes so far as to specify the principal values and branch cuts for irrational and transcendental functions on the complex domain.
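For comparison, Python grew much of the same numeric tower: its integers are arbitrary precision, and the fractions module gives exact ratios.

```python
from fractions import Fraction

# Arbitrary-precision integers: no overflow, no rounding.
print(10**20 - (10**20 - 2))            # 2, exactly

# Exact rationals: dividing integers keeps a precise ratio.
print(Fraction(1, 3) + Fraction(1, 6))  # 1/2
```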


These days quite a few languages already support bignums, rationals and even complex numbers. CL's support for these was unique back then, but I guess it has lost its edge in this domain.

Still, support for strict read/write invariance of floating-point numbers (e.g. as described in Burger & Dybvig 1996 and Clinger 1990) doesn't seem to have spread widely outside the CL/Scheme camp. The same goes for the concept of exact/inexact numbers.


Like Ruby.


... about 40 years later ;-)


IEEE 754 double precision can represent numbers with about 16 decimal digits of precision; however, Google calculator (from anecdotal evidence) seems to only reliably represent numbers with 14 digits of precision (ex. http://www.google.com/search?q=99999999999999+-+999999999999...)

Interestingly, for pure decimal operations it seems some other floating point representation is used. For example, http://www.google.com/search?q=1.0+-+0.2+-+0.2+-+0.2+-+0.2+-... returns the mathematically correct result of zero. If double precision were being used, it would return the mathematically incorrect result of 0.0000000000000000555 because of floating point representation error.
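The contrast described above can be reproduced with Python's decimal module, which implements decimal floating point (whether Google uses something similar is speculation):

```python
from decimal import Decimal

# Binary doubles cannot represent 0.2 exactly, so the error accumulates:
print(1.0 - 0.2 - 0.2 - 0.2 - 0.2 - 0.2)    # tiny nonzero residue

# Decimal floating point represents 0.2 exactly and returns 0:
print(Decimal("1.0") - 5 * Decimal("0.2"))  # 0.0
```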


MS Excel had a similar issue (though probably an unrelated bug) a while back that caused a furor:

http://www.appscout.com/2007/10/excel_bug_exterminated.php

Imagine being the accountant who learned about this too late...


Here is what seems to have happened: Note that whenever the answer is not accurate, Google uses scientific notation. For example, 1e15 + 30 = 1.0 × 10^15. We accept this as true because Google only claims it is true to two sig figs.

However, in this case, Google means to say 0.0 × 10^15, indicating two sig figs of accuracy. Google's mistake is probably that the internal notation used does not allow it to distinguish between "0" and "0.0 x 10^15", and so it does not report that the result is only accurate to two sig figs as it should.

So, if Google somehow reports the "zero" result in scientific notation with two sig figs, then it will be consistent with the rest of their calculator and clear that it may not be an accurate result.
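A sketch of the fix the parent proposes (the function and format here are illustrative, not Google's code): render every large result, zero included, to two significant figures in scientific notation.

```python
# Hypothetical: always report results of large-number arithmetic to two
# significant figures in scientific notation, zero included.
def two_sig_figs(x):
    return f"{x:.1e}"

print(two_sig_figs(1e15 + 30))  # '1.0e+15', matching the rounding above
print(two_sig_figs(0.0))        # '0.0e+00', an honest "zero to 2 sig figs"
```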



I bet the guys at blekko don't have this problem when they launch.

http://www.skrenta.com/2008/04/microsoft_bias_in_msn_search_...


In Python:

    >>> 999999999999999 - 999999999999997
    2L


The bug, in Python (needs more digits because Python's "float" is C's "double"):

    >>> float(999999999999999999) - float(999999999999999997)
    0.0
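The collapse here happens at parse time: near 1e18, adjacent doubles are 128 apart, so both 18-digit literals round to the same value before the subtraction ever runs.

```python
import math

# Spacing between adjacent doubles near 1e18 (math.ulp needs Python 3.9+):
print(math.ulp(1e18))  # 128.0

# Both operands round to the same double, so their difference is 0.0:
print(float(999999999999999999) == float(999999999999999997))  # True
```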


Not the same bug. Try chopping off a nine from each number, first in the above in Python, and then in the Google version.


Out of curiosity...

  C:\>ruby --version
  ruby 1.8.6 (2008-08-11 patchlevel 287) [i386-mswin32]

  C:\>irb
  irb(main):001:0> 999999999999999 - 999999999999997
  => 2
  irb(main):002:0> 9999999999999999 - 9999999999999997
  => 2
  irb(main):003:0> 999999999999999.0 - 999999999999997.0
  => 2.0
  irb(main):004:0> 9999999999999999.1 - 9999999999999997.1
  => 2.0


999 999 999 999 999 - 999 999 999 999 996 = 3 + or - 2 margin of error


Rebol 2:

>> 999999999999999 - 999999999999997 == 2.0

Rebol 3:

>> 999999999999999 - 999999999999997 == 2


On my HP48X RPN calculator I get 0.

The HP35 scientific calculator also gives 0 as an answer.

Weird.


Is there anything that can do infinity - (infinity-1)?


Even in non-standard models of arithmetic, where you could use infinity as an operand, the value of that expression would have to be undefined (not 1, as you seem to expect).


Yeah ... I was suggesting that a more sophisticated algorithm for arithmetic on very large integers could eliminate substantial similarities, leaving the 'zone of difference'. A limited 'bit window' needs to be more versatile.

For example, a while back a man got a bill for $35 billion ... that wouldn't happen with more intelligence built into routines. (OUR minds look at 999...999 and 999...997 and see quickly what can be eliminated.)


What about limit as x goes to infinity of x - (x - 1)? We do the algebra first, right? Now, that's a special case of y - (x - 1) where y = x, so can we get a different residue by going about the limit in two dimensions?


http://www.wolframalpha.com/input/?i=limit+of+x+-+(x+-+1)+as...

Also, http://www.wolframalpha.com/input/?i=infinity+-+(infinity+-+...

If nothing else, Wolfram Alpha is useful for math. I'm impressed that it understood my syntax for limits that I made up on the spot, first try.


When I asked it to show its work, it demonstrated that it did the algebra first. I was asking if that's legal here.


Maybe we could just define "infinity = 999999" or something like that ...


Maybe we could just define "zero = 123456789" or something like that while we're at it. If we're going to break math we may as well go whole hog.


One of my favorite pranks on Squeak was to do "true := false" in a workspace window.

Sadly, it no longer works...


I've actually seen that kind of definition in the source of one game.


999 999 999 - 999 999 997 = 2 according to the calculator widget on the Dashboard in OS X 10.5


Well, they have fixed it.


Nope, still equals zero.


Oops my mistake!


in bash:

  bash-3.1$ echo "(999999999999999 - 999999999999997)" | bc
  2


I just want to know why you were even trying to calculate that figure in the first place?


I found it on 4chan.


I don't even want to know why it was posted there...


The thing about 4chan is that there's no order. That means not even chaos is an absolute must. Pictures of transsexuals and underage girls and people covered in fecal matter go alongside programming jokes, debates about Obama, and discussions of existentialism. This was closer to the latter.

I don't go on 4chan often - once or twice a year at most - but when I do, it's to get that completely random mix of things all at once. Every time, I come away with 3-4 things I didn't know/hadn't thought of before. This was one of the things I found this time.


I think I could have done without those mental images, thank you.


The ability to ignore and avoid thinking about certain things is essential when visiting 4chan. If you practice, you can read sentences like the above without having mental images of the described event. More advanced filters can prevent you from thinking about it with imagery even after seeing something you wish you hadn't seen. ;)


Impressive. I find that extremely hard, it's like things play on an internal monitor that I'm forced to watch and can not ignore no matter how much I want to.

I run a webcam site and over the years we've had some pretty weird stuff happening there and I really wished I had not seen any of that.


It's a combination of ignorance and desensitization. My generation grew up with tubgirl and 2girls1cup. I first saw tubgirl/lemon party when I was 13. When I was I think 17, I came across goatse for the first time; by then, a man * his wide wasn't enough to faze me.

I wrote an essay about how this widespread desensitization will affect society and culture. It'll be interesting to see the world in twenty years.


After browsing around your site for a few minutes, I didn't find this essay. Link?


I have to remember to shut that site down. I removed almost all my stuff back in June. It's all archived, but the archives aren't online yet. I'm trying to figure out how best to put it all up again.

My new essays are all on Facebook at the moment.


This is probably a good place to start learning that ability. http://www.vipassana.com/meditation/mindfulness_in_plain_eng...

When I read those sentences I didn't even come close to thinking about underage girls, girls covered in fecal matter, or underaged girls covered in fecal matter. ;)


Hey man, I have a hard enough time getting my head wrapped around learning (in no particular order) django, python and lisp right now :)

But I appreciate the pointer and I've filed it for future reference.

One thing though, my 'vivid imagination' really comes in to its own while reading books, I literally see the scene in front of me as though completely immersed in it.


Unfortunately, I cannot tell you exactly how to develop this ability, because I only discovered I had it by trying, but I remember not having the ability to NOT think about a pink elephant, so it can be developed. Somehow. :)

Edit: Apparently rms knows how; see his comment.


Just imagine you are Searle's "Chinese room" :).

http://en.wikipedia.org/wiki/Chinese_room


I would rather have never been to /b/ and not have that ability and miss out on all the counterculture, because that ability is also a curse.


Really? How so? I don't find any difficulty imagining things, just an absence of difficulty in choosing not to.


So much of what we do is subconscious. Turning off an automatic tool means you have to devote cognitive load to enabling it manually.


Because it looks cool, when it is in fact an utterly unremarkable case of insufficient precision/float vs double.

But if you're not a programmer you don't know a float from a double, and it just looks so neat and funny that google screwed up.

And because it's so funny it showed up on 4Chan, then on reddit and now here too.


Yet it's being remarked upon.

Perhaps it's not a unique case - icey's link explained the cause, which incidentally I hadn't known before posting this - but it's still one that's perhaps worth a conversation for some people.


lol


indeed



