Hacker News new | comments | show | ask | jobs | submit login
0.30000000000000004 (30000000000000004.com)
380 points by shawndumas 87 days ago | hide | past | web | 130 comments | favorite

Once I wrote a library for double-to-string conversion and vice versa, which handles such roundings nicely: https://github.com/mkupchik/dconvstr

Key idea is not just to map binary floating point value X to a decimal floating point value Y, but instead (in extended precision, with 64-bit mantissa) compute an interval of decimal floating point values [Y1, Y2] which maps back to X (in standard precision, with 53-bit mantissa). Then choose such Y from [Y1, Y2] that Y has the shortest decimal representation.

The state-of-the-art is the Errol algorithm of Adrysco, Jhala and Lerner (2016), which is proven to be always correct: https://cseweb.ucsd.edu/~lerner/papers/fp-printing-popl16.pd...

I'm the author of Grisu. (http://github.com/google/double-conversion and http://florian.loitsch.com/publications/dtoa-pldi2010.pdf)

Grisu is also proven to be always correct. As mentioned in another comment: you have to define "correct". Grisu is always accurate, but may bail out if you need the shortest or closest result.

Basically: Grisu will always print a number that reads back to the input. However, it may not be able to print the shortest or closest number (when two decimal outputs have the same shortest length). In that case, it knows this, though, and can tell the user.

I would say that the combination of Grisu and Errol is currently the state of the art.

Errol is not the fastest, nor is it (afaik) proven to always return the closest result.

However, combining Grisu with Errol (as fallback) would probably yield the fastest complete algorithm.


> Our original evaluation of Errol against the prior work of Grisu3 was erroneous. The evaluation indicates a 2x speed improvement over Grisu3. However, corrected performance measurements show a 2x speed loss to Grisu3.

Yeah, it's unfortunately not the fastest algorithm known, despite what was originally thought, but it is still the fastest accurate one.

Grisu isn't accurate?

Author here.

Grisu is accurate, but not always optimal. However, it figures out when that happens and lets you bail out to a slower algorithm when that happens. If you want full speed you should use Grisu, and then use Errol as a bailout.

A bit more context: there are 3 important properties when printing floating-point numbers.

- accurateness: you want the printed number to read back to the same number.

- shortness: often you want the shortest decimal representation: "0.3" and not "0.2999999999999999889". The latter is more precise, but reads back to the same number as "0.3".

- closeness: given two decimal representations of the same length, you prefer the one that is closer to the actual number.

I call the combination of these three properties "optimal".

Note that shortness and closeness are not always crucial. For example, a json-encoder could easily drop those, if the encoding is faster without them. Also, some languages drop shortness in favor of printing something closer to the input.

Grisu always produces accurate results. However, it sometimes can't tell if it already has the shortest and closest number. However (by being conservative) it can tell the user when that happens. In that case, one can fall back to another complete algorithm.

The double-conversion library (http://github.com/google/double-conversion) does exactly that. It uses the slower bignum algorithm in these cases.

Python does this, for example (but likely in a more efficient way): http://bugs.python.org/issue1580

See also http://www.netlib.org/fp/ http://web.archive.org/web/20060908072403/http://ftp.ccs.neu...

So for instance 0.29999999999999993 + 0.00000000000000003 -> 0.3, but 0.30000000000000002 -> 0.30000000000000004

Note that this will still not solve the 0.1 + 0.2 problem from the OP, since the nearest float to 0.3 is not actually the same as the nearest float to 0.1 + 0.2.

> Note that this will still not solve the 0.1 + 0.2 problem from the OP, since the nearest float to 0.3 is not actually the same as the nearest float to 0.1 + 0.2.

Interval arithmetic can be used, although that requires double the storage for all the intermediate results, and extra work.

For example, .1 can be represented by x=[x0, x1], where x0 and x1 are floating point values bracketing .1, and similarly .2 can be represented by y=[y0, y1].

Then z=[z0, z1] can be computed by "z=x+y", where the interval z respects the sum of both intervals x and y.

When printing, you can determine the smallest representation within the interval z.

I'm not sure if it's worth all the trouble though :)

Interval arithmetic is far from trivial. 1/(1/x+1/y) is not the same as xy/(x+y). It breaks things more than normal floating point math does.

This unfortunately breaks with something like [+1, +1] / [-1, +1] which yields (-∞, -1] and [+1, +∞), the result falls into two separate intervals. Now you can either turn this into (-∞, +∞) or use sets of intervals instead of single intervals. The first option gives pretty useless results, the second one can become really slow and consume a lot of memory. You can make this work, but you have to be really careful what you are calculating and how you are doing it.

> The first option gives pretty useless results

Does it though? Wouldn't a result of `(-∞, +∞)` be a solid indication to go and reduce the range of your input parameters?

But Python also warns about it in the documentation, and provide the factions (https://docs.python.org/3.1/library/fractions.html) module and the decimal module thttps://docs.python.org/3.1/library/decimal.html) in the stdlib to avoid the problem.

One thing I really wish Python had is decimal literals. I mean, it does have imaginary literals, which is arguably more of a corner case...

> the nearest float to 0.3 is not actually the same as the nearest float to 0.1 + 0.2.

Or even more precisely, the nearest float to 0.3 is not actually the same as the nearest float to (the nearest float to 0.1) + (the nearest float to 0.2).

That's basically how Gay's dtoa works, and you'll find it copied into many projects—Python, for example, uses it for repr().


I've long ago stopped using C's standard library floating point routines and use Gay's code. Main reason was inconsistencies between compilers.

There is also http://github.com/double-conversion (of which I'm the author).

It's based on http://florian.loitsch.com/publications/dtoa-pldi2010.pdf

I'm obviously biased, but I think the API is easier, and the algorithm is faster.

However, it's in C++ (which might not suit your needs), and it requires more code (especially with the precomputed constants).

You might be interested in interval arithmetics: https://en.wikipedia.org/wiki/Interval_arithmetic

Previous discussions:

https://news.ycombinator.com/item?id=10558871 (1.5 years ago, 240 comments)

https://news.ycombinator.com/item?id=1846926 (6.5 years ago, 128 comments)

Someone should figure out the mean length of time between article repeats on HN. And then find some correlation with university degrees and job tenure.

Do you have a hypothesis about what the correlation might be?

I haven't seen it yet... what's with that final four (common to all languages)?

Floats can only represent fractions of powers of two. 0.3 is a fraction of powers of two and five.

See IEEE754



> See IEEE754

Only in the original 1985 version, however: IEEE 754-2008 actually specifies decimal floating point types; decimal arithmetic was already specified in IEEE 854-1987. Of course, the default representation is almost always one of the binary formats, but that is no longer the only option.

It's actually pretty simple. When you have a base 10 system (like ours), it can only express fractions that use a prime factor of the base.

In a way, not so simple (obvious to you? not to me)

A maths professor, during their lecture, says "obviously, we now have equation 42" when a student interrupts, asking "how is that obvious?"

The professor looks back at the blackboard, starts to speak, but then remains silent as a consternated look falls across their face. After ten minutes of silent pondering, they erase three blackboards and manically fills them with equations, derivations, and other expressions. After another half-hour of furious scribbling--eventually filling both sides of two more free-standing chalkboards--they exclaim, "AHA! It is obvious!"

It's phrased incredibly badly, but basically you can only write a fraction with a finite number of decimals if the denominator divides 10^k (for some k). This means the denominator can't have prime factors other than 2 and 5 (which are the two prime factors of the base). This last statement isn't exactly trivial, but is reasonably easy to prove.

Modern, lets say, post turn of the century popular languages all support some kind of rational data type allowing native fraction arithmetic.



It turns out that humans like numbers that are evenly spaced like 1/10, 2/10 (1/5), 3/10, 4/10. That all seems pretty evenly spaced to us, but its actually totally arbitrary that we have 10 handy (ugh the puns) appendages. If we chill with space aliens they will think numbers are evenly spaced like 1/14, 2/14 (aka 1/7), 3/14, 4/14 (aka 2/7)... much like jamming a metric socket on an imperial nut to fix your car, the amount of error varies wildly base on size. You can get away with a 9/16 socket on a 10 mm nut most times, but a 15/16th socket will probably require a hammer to fit a 24mm nut. Or just round off all the nuts with a vise-grip pliers. Anyway the space alien's 7/14 is actually a perfect match to our 5/10 but the amount of error varies from not much, to quite a bit, in the other attempts at interplanetary standardization. Likewise if you think of the computer as a space alien (not too far from truth, I sometimes feel) it uses binary and thinks numbers like 1/8, 2/8 (1/4), 3/8, are evenly spaced. Or as some specific example expressing our 1/2 in floating point is pretty easy for most computers while expressing 1/5 or 1/3 is somewhat less successful, or somewhat far in error compared to our closest 1/10th based numbers.

In the old days like half a century ago there were competing software floating point designs with varying levels of success vs speed. Everything is super boring today and standardized. 72 bit pdp-10 floating point was from a romantic adventurous era, like an Indiana Jones movie (well, one of the good ones). IEEE 754 is modern and better in all regards, yet has all the panache and style of a modern econobox commuter car. Think of the glory of PDP8 Fortran floating point spread unevenly across three words of four octal digits, 2 bits of signs, 2 bits MSB of mantissa, and 8 bits of exponent in the first 12 bit octal word, now that is something glorious to wake up to in the morning.

Thirty years ago the BASIC languages of the 8-bit computers used non-IEEE 754 floating point math (most typical I've found is 5-byte version---one byte for sign, one for exponent, three for mantissa).

> appendages

a.k.a. digits!

If you can write n=p/q with p and q relatively prime using d digits after the decimal point in base 10, 10^d times n is an integer. We also have

  10^d*n = 10^d*p/q
so 10^d*p must be divisible by q. Since p and q are relatively prime, 10^d must be divible by q. That's only possible if all prime factors of q are 2 or 5.

I get angry every time I read about floating point being discussed by people who don't understand how numbers work, but I am going to try not to rant and be constructive.

Think of if this way. You can express the fraction 1/2 in decimal (0.5) and binary because 2 is evenly divisible into both 10 and 2. You can't represent 1/3 in either but you could in trinary (0, 1, 2 - it would be 0.1) Now, why programmers feel the numbers you can't represent exactly in binary are somehow worse than the ones in decimal baffles me. I guess we are just used to them and anything else seems wrong. But they are just as legitimate.

If you word it the right way, it can be very simple/'obvious'.

"Moving the decimal point divides by 10. You can ONLY make fractions of 10 via the decimal point."

That plus a couple examples should do fine.

No need to mention prime numbers. No need to use variables.

It's actually pretty simple, if you've taken prime number theory.


> It's actually pretty simple, if you've taken prime number theory.

Or if it's explained a bit more intuitively.

A terminating decimal fraction 0.xyz (in base 10) is basically

  xyz / 1000
Now, 1000 = 2x5 x 2x5 x 2x5.

So, if you have any fraction n / d, and the denominator d has only factors 2 and 5, then you can "fill up" the remaining 2's or 5's to get a power of 10, and write it as a terminating decimal fraction:

  3 / 25 = 3 * 4 / (25 * 4) = 12/100 = 0.12

  7 / 8 = 7 * 125 / (8 * 125) = 875/1000 = 0.875

But, if the denominator d has any factors other than 2's and 5's, there is just no way you can multiply any further integers with it to get a power of ten.

A decimal floating point would then be not only 0.875, say, but 0.0000875 or 875000. So, now you say: 875/1000 x 10^E, where the 875 is your (3-digit) mantissa and the E your exponent. But that's still always an integer divided by a power of 10. So, if what we try to represent is a fraction with a denominator that has prime factors other than 2 and 5, there's no way we can represent it as a floating point decimal fraction.

Similarly, a IEEE 754 floating point is basically

  10011010101 / 2^53 * 2^E
(with some abuse of notation, where the first part is a binary mantissa...). But, that's fundamentally still an integer divided by a power of 2. So, if what we try to represent is a fraction with a denominator that has prime factors other than 2, there's no way we can represent it as a floating point binary fraction, and therefore not as a Float64.

Edit: use x for multiplication to avoid italics...

It’s not obvious that 1/7 cannot be expressed in decimal, or 1/5 cannot be expressed in binary?

>I thought people learned this sort of thing in grade school.

Along with tact and general respect for others?

This is a discussion of the vagaries of floating point math, on a forum for programmers. I would expect that most folks learned something about how to carry out decimal long division in ~3th–6th grade sometime.

If we can’t assume any kind of common base-line level of background experience but need to re-hash all of school mathematics in every conversation, then it’s hard to have a technical discussion.

There’s absolutely nothing wrong with not having an understanding of this point, or forgetting grade school arithmetic, and I’m not trying to be condescending, but it seems weird for the top-level poster to call out the the article’s language. The idea that some fractions don’t have a terminating decimal expansion is “pretty simple” in the context of any reasonable standard for discussion among programmers.

Here’s the language from the article. Is it really that confusing?

> When you have a base 10 system (like ours), it can only express fractions that use a prime factor of the base. The prime factors of 10 are 2 and 5. So 1/2, 1/4, 1/5, 1/8, and 1/10 can all be expressed cleanly because the denominators all use prime factors of 10. In contrast, 1/3, 1/6, and 1/7 are all repeating decimals because their denominators use a prime factor of 3 or 7. In binary (or base 2), the only prime factor is 2. So you can only express fractions cleanly which only contain 2 as a prime factor. In binary, 1/2, 1/4, 1/8 would all be expressed cleanly as decimals. While, 1/5 or 1/10 would be repeating decimals.

If the top-level poster was confused and wanted help, he/she could have said something like “Can someone explain this point to me? I don’t understand what it means for 1/7 to be a ‘repeating decimal’, or what a ‘prime factor of the base’ means.”

>If we can’t assume any kind of common base-line level of background experience but need to re-hash all of school mathematics in every conversation, then it’s hard to have a technical discussion.

We can have any kind of discussion we want, but whatever the participant's level of understanding, talking down to them is unacceptable. You don't need to dumb things down here, but when someone doesn't understand what you meant, you can ignore it or help out, just don't be condescending.

Apparently the explanation in the linked article originally came from a Hacker News comment a year and a half ago, https://news.ycombinator.com/item?id=10560130

Whenver IEEE 754 and its quirks are discussed a potential alternative (but so far without hardware support) called Unum - Universal Numbers - should not go unmentioned:


Previous discussions (>2y old):

https://news.ycombinator.com/item?id=9943589 and


A slideset: https://www.slideshare.net/insideHPC/unum-computing-an-energ...

William Kahan, who's the person mainly responsible for IEEE 754, has a critique of Gustafson's proposal:


IEEE 754 is not a trivial standard (it earned Kahan a Turing award). Error modes for IEEE 754 are precisely defined, even though it requires a lot of effort to understand what they mean. (For example, overflow triggers an exception, but gradual underflow is allowed.) Going beyond it requires some serious effort, and unums seem not to be the solution.

A good book to understand the IEEE 754 standard is the book by Michael Overton. The quoted article is unfortunately an example of floating-point "scaremongering". Floating point arithmetic is not approximate, it is accurately defined with precise error modes. Base-10 arithmetic is not however, the model to understand it, but rather "units in last place".

Just a heads up - we're moving on from the unums standard.


A note on the posit (aka sigmoid) standard that Gustafson discusses in the video:

1. Main feasibility benefit: behaves like IEEE 754 numbers in terms of accuracy.

2. Main performance benefit: able to match IEEE 754 precision & accuracy in fewer bits (~1/2) than IEEE 754.

3. When the standard is finalized, will specify behavior that in IEEE 754 allows for a degree of interpretation, meaning that behavior may not be reproducible across runs and machines.

4. IEEE 754 error-handling, that requires extra silicone or cycles, is not implemented. Overflow/underflow not considered errors leading to Inf or NaN, but rounded to nearest representation, hence has uniform treatment (rounding) across numbers with no exact representation. Illegal operations to be handled as general error à la integers rather than giving rise to NaN.

4a. Performance benefit: 4. should mean that less silicone or cycles required to perform arithmetic compared to IEEE 754, so Gustafson believes, though his team has only just finished with an ALU/FPU circuit schematic for posits.

4b. Programming using posits should feel more like programming using integers, versus handling NaN, Inf.

A few notes:

1. Main feasibility benefit: behaves like IEEE 754 numbers in terms of accuracy.

I think you mean "...in terms of allocation and storage"

2. I would like to disclaim that posits improve over IEEE 754 precision & accuracy over some value ranges. Over others, it's worse. The argument though is that you're more likely to be using those value ranges than the ones where you want. It's also not nearly 2x. For 64-bits, for example you'll realistically get about 6-7 extra bits of precision around 1. The standard DOES require features like "exact dot product" operation, which can allow you to do things that are "impossible" with the standard 754 spec; for example, exact solutions of systems of linear equations using convergent methods.

The way I've been thinking about it of late, is that floats and posits are a 'lossy compression of the real number line'; posits make better choices about how to compress this information, but there are of course tradeoffs.

The rest of it is great!

Any time you're generating percentage data that should sum to 100, not appreciating floating point math will burn you.

For those interested, the largest remainder method (https://gist.github.com/hijonathan/e597addcc327c9bd017c) is useful for dealing with this.

Using tricks to make numbers total what you think they "should" other than using rounding to proper amount of precision is lazy and deceitful.

Depends on the application of this, i.e. whether the total is more important or the individual values are more important.

Example: You're filling out a timesheet for a contracting job, and you worked 8 hours on several different tasks for your client, but your client's software rounds things to the nearest hour, then it would make sense to use an algorithm like this if your pay was going to be determined by this data entry. If your pay was not determined by this data entry then it may make sense to just round normally.

You should just round the percentages to the nearest value of the appropriate precision, and let them sum to slightly more or less than 100% if that’s how the numbers work out. If you want, add an asterisk and a “note, numbers do not sum to 100% because of rounding” at the bottom.

> Your language isn't broken, it's doing floating point math.

Yes, your language is broken if you have no way out of floating point math. JavaScript, for example, is fundamentally broken.

There are a huge pile of libraries for doing bignum, rational, fixed point, decimal, complex, vector, polynomial, symbolic, etc. math in Javascript. If you want you can even write one for 7-adic arithmetic, Eisenstein integers, numbers expressed in the golden ratio base, or points on the surface of a torus.

There’s no language that can build in all possible number systems, and it’s not really the place of a language to try.

It would be nice if JavaScript made a better type distinction between integers and floats though.

Yes, every Turing complete language can build integers and other types of numbers atop floating point numbers if necessary. That doesn't make it any less broken.

Here's something that would probably help lessen the confusion: don't allow or atleast warn about implicit conversions in binary floats in source code literals.

    float foo = 0.3;    // warning: invalid binary float value
    float bar = 0.3f;   // ok, converts into nearest bin float
    decimal qux = 0.3m; // ok
    decimal feh = 0.3;  // why not; will be exactly 0.3
All in all, languages go for decimal floats by default but then the implicit conversions ruin it for everybody who's still learning.

Good suggestion, but it still wouldn't prevent beginners from just learning the 0.3f notation as an idiom and skipping the explanation. The trouble with floats is that understanding them depends on knowledge of binary, scientific notation, etc. It's a lot for someone who is only just getting to grips with how a program works.

C# is missing an example using decimal type:

Decimal uses 128 bits to hold it's data and seems to handle rounding differently to System.Double, e.g. 0.1 and 0.2 using decimals evaulates to 0.3:

> 0.1M + 0.2M 0.3

You can read more about decimal here: https://msdn.microsoft.com/en-us/library/364x0z75.aspx

For people using C or C++ I can recommend using decimal floating point (which may be added to C++20 in the standard).

Contrary to the "default" floating point, which are base 2 and do not support precise presentations of certain numbers, you can use a decimal floating point system which uses base 10 and therefore allows exact and precise presentation and solves the problem mentioned on the page.

Also important: since 2008 this is standardized in IEEE 754-2008 which added support for decimals.

Explanation of decimal floating points: https://en.m.wikipedia.org/wiki/Decimal_floating_point





C++ Standard Proposals:



Then people will just make novelty websites to point out that 1/7 + 2/7 ≠ 3/7.

This is solved by the Scheme numerical tower, which prefers exact representation​s (including exact rationals) unless inexact representation is explicitly requested or forced by an operation that doesn't support exact results.

The problem is remembering to only input rationals :) So instead of doing (iota 100 0.1 0.1), you do (iota 100 (/ 1 10) (/ 1 10)). That does not work in chicken for some reason.

I don't know if precomputation is ever guaranteed for those things, but otherwise it would be neat to be able to input rationals directly into the source.

Edit: So, this is scheme standard discovery week: apparently inputing 1/10 works just fine. I can't believe I missed this.

But in decimal floating point this is also solved: 1/7 + 2/7 = 3/7

Please research decimal floating point first..

That particular example happens to work, but 1/7 + 1/7 != 2/7.

DecFP is not magic. You still need to know that you're dealing with limited precision numbers under the hood.

You are right, that example doesn't work. I thought the rounding and normalization described in the standard may fix all these cases by itself but there I was wrong.

But at least all problems that could happen on a financial application are solved with decimal floating points (where you will only want to use rationals in finite decimal form like 0.01)

And even your example can be made working pretty easily:

  #include <decimal/decimal>

  int main(int /*argc*/, char **/*argv*/)
    using namespace std;

    using namespace decimal;
    decimal128 d1 = 1;
    decimal128 d2 = 2;
    decimal128 d7 = 7;
    uint64_t conversionFactor = 10000000000000000000ull;
    cout << decimal128_to_long_long(d1/d7 * conversionFactor) << endl;
    cout << decimal128_to_long_long(d2/d7 * conversionFactor) << endl;
    cout << (decimal128_to_long_long((d1/d7+d1/d7) * conversionFactor) ==
      decimal128_to_long_long((d2/d7) * conversionFactor) ? "yes" : "no") << endl;


That was done using the gcc included decimal types. And if you look e.g. into the intel dfp library readme, you see lots of functions which will allow you to do the comparison you wanted to do: https://software.intel.com/sites/default/files/article/14463...

> But at least all problems that could happen on a financial application are solved with decimal floating points (where you will only want to use rationals in finite decimal form like 0.01)

You still get cancellation if the magnitudes differ by large enough an amount. This is a problem inherent to floating point arithmetic, using a decimal format instead of binary does not save you from that.

I absolutely disagree.

If you want to represent currency, for example, you should not use decimal floating point. You should integers, specifically you should use an integer number of tenths of cents, which is pretty widely agreed as the standard unit of currency in a computer (or tenths of yen, for example). You need to be extremely careful about overflow, but you need to be anyway, and should almost certainly just use arbitrary-precision integers.

I absolutely disagree with your disagreement. Please try writing an ERP system where you have a quantity of a billionth of an item price of a billion euros (or an item price of a billionth Euro and a quantity of billions) and tell me which integer type you deem sufficient for this.

Additionally please research decimal floating points before disagreeing.

> Please try writing an ERP system where you have a quantity of a billionth of an item price of a billion euros (or an item price of a billionth Euro and a quantity of billions) and tell me which integer type you deem sufficient for this.

For what it's worth, values on the order of 10^18 are exactly representable in int64; decimal64, however, only has 16 significand digits, so there you are already affected by rounding, and it's not far from where the distance between consecutive numbers is greater than 1.

In general, with floating points, you either waste lots of space for excessive precision around zero, or you quickly lose precision when numbers grow. On top of that, you get all the other quirks of floating point numbers such as cancellation.

I'm having trouble visualizing this. The closest I could get is a single unique part for an A380 super-jumbo (typical selling price: $435 million USD)

Imagine a contract which says the contract partner receives e.g. 0.001% of the total revenue for the fiscal year (which could be 10bn Euros)

Or you do have a gram price and are selling literally thousands of tons to a big business

> Imagine a contract which says the contract partner receives e.g. 0.001% of the total revenue for the fiscal year (which could be 10bn Euros)

I'm imagining it, and I don't see why storage or computation time would be major obstacles to a calculation you run once a year. Use whatever big integers you want?

Suppose we dedicated one $100 hard drive (per year!) to storing all the relevant data. I feel like there would be plenty of space left over, and the budget would cover it.

But which advantage would this have over using IEEE 754-2008 decimal floating point numbers?


This is clearly why you should price everything in picodollars :) https://www.tarsnap.com/

My personal view is decimal arithmetic should be the default and floating-point should be the library in scripting languages. And probably also Java/C#/Go-type languages.

My personal view is that the programmer needs to know the difference because there are important tradeoffs to consider.

I do however think it should be easier to use decimal arithmetic. I've written financial apps in both Java and JavaScript and I wish both languages had native syntax and support for decimal numbers and arithmetic. JavaScript has nothing built-in and Java's BigDecimal class is clunky to use.

Decimal only takes care of a subset of cases. I.e. those where the source data nicely lines up in decimal. Good for finance (although I think Britain made a mistake when it gave up the old penny and the shilling). But even there, you sometimes have to divide by numbers like 12 and 365.

The point is NO representation is going to be idiot proof. So you just need to make a sufficient set of representations easy to use, and then try to teach the "idiots" what the underling issues are.

Decimal does only take care of a subset of cases, true. But 1) it's a very common subset, and 2) when it breaks down, people generally understand how and why, because they've dealt with it elsewhere.

Why? Is decimal faster than arbitrary precision? If not, arbitrary precision should probably be the default (or binary).

Nah, arbitrary-precision real arithmetic should be the default.

>Perl 6, unlike Perl 5, uses rationals by default

I'm not sure this is such a good idea. I love rational datatype, but it's too easy to shoot yourself in the foot with simple numerical procedures resulting in gigantic bignum denominators.

Rational numbers are horrible for most numerical computing. The denominators double in length with every multiplication, so computation speed gets slow exponentially.

Any time there’s any degree of uncertainty about a quantity (e.g. it comes from a physical measurement) there’s also no longer any advantage to using rational arithmetic. This turns out to encompass most practical situations.

Rational arithmetic also breaks down entirely in the face of square roots or trig functions, unless you go for a fully symbolic computation environment, which gets even much slower.

Rational arithmetic is mostly nice when the problems have been carefully chosen so the operations will stay rational and the answers will work out nicely, e.g. in high school homework.

default doesn't mean only, from memory Rakudo smooshes Rats into floating point Nums when the denominator gets above 2 to the 64th or so.

FatRat are an infinite precision rat. FatRats, like RATs are Cool (which is a double entendre, because Cool means it knows how to be a string, which is apparently "Cool") The default recently has not been fatrat but one of the many benefits of a language under development for 20 years is that quite possibly that default was changed last weekend. Or maybe fatrat were default for a painful week back in '02. Probably not, but it could have happened.

Perl does the right thing but if you're really unhappy about not using floats you can force a .num conversion into floating point or you could express a number in scientific notation to force floats. Perl really wants to make a division problem into a Rat unless you work hard to stop it.

In general, for the past 30 years or so, if you're doing something weird, Perl will work and probably give faster and possibly more accurate results than most other tools, but the rest of the world will not be interested in debugging perl6 but will instead be asking "why you use perl6 instead of Gnu-R or mathematica or ?" or whatever is trendy momentarily today. Perl will never be trendy yet the whole world runs on it.

Rexx has decimal arithmetics so 0.1+0.2 is 0.3 ...


Unsurprising considering the IBM affiliation of Rexx and decimal arithmetic being IBM’s pet issue.

The fact that we do binary floating point by default is an artifact of RAM being expensive way back in the day while arithmetic coprocessors became cheap. Single-precision is 4 bytes and can represent practically every quantity you care about precisely enough.

While IBM has been harping about this for ages, BCD (binary coded decimal) was sufficiently important that it had special instructions on even the earliest microprocessors. The 6800/8080/Z80 series had DAA (decimal adjust accumulator) and the 6502 had SED/CLD (set/clear decimal mode). The 8008 is notable in not having a specific BCD instruction.

And the 4004, which started it all, was a BCD oriented microprocessor from the start.

And the 8086 (on up), the 68000 family, and the VAX also had BCD instructions. When I wrote my own 6809 emulator [1] the DAA instruction was the hardest one to get right [2]

[1] https://github.com/spc476/mc6809

[2] Specifically the condition code results.

The Common Lisp answer is 0.3 and not 0.30000000000000004 because those are single float literals: for double float literals, we have (+ 0.1d0 0.2d0) => 0.30000000000000004d0.

I would imagine this is the case for several of the other examples, too.

You can tell Common Lisp to read double-floats by default.

    CL-USER 113 > (+ 0.1 0.2)

    CL-USER 114 > *read-default-float-format*

    CL-USER 115 > (setf *read-default-float-format* 'double-float)

    CL-USER 116 > (+ 0.1 0.2)
Thus if one wants to make sure which float format Lisp should be reading,

a) specify the default format, as above

b) or use s, f, d, and l to specify the format (short float, single float, double-float, long-float). Example 1.0s0, 1.0f0, 1.0d0, 1.0l0

These examples all seem to be using double precision floating point, which might be misleading. For example, the C++ snippet is adding two double literals, not floating point literals. The result is even more inaccurate using single precision arithmetic:

    #include <iostream>
    #include <iomanip>

    int main() {
        std::cout << std::setprecision(17) << 0.1  + 0.2  << '\n'; // 0.30000000000000004
        std::cout << std::setprecision(17) << 0.1f + 0.2f << '\n'; // 0.30000001192092896

Side anecdote: Many, many years ago, I wrote a lease payment plan calculator for a big bank-linked leasing company when they moved from AS/400 to .Net. I made sure all rounding was policy driven, not only in precision, but also in where roundings would occur during the calculations. 1 cent differences do matter in such environments. Somehow it passed all the extensive regression tests that compared to their mainframe code out of the box with the config based on nothing more than my guesses. AFAIK the code is still in production to this date.

It's stuff that should be a part of very beginning of any CS101. Float arithmetics != Decimal arithmetics

It is.

Only if you're trying to scare the newbies away. This stuff does need to be learned at some point but cs101 should be about getting a feel for programming in general.

Focusing on the basics of objects/classes functions (with mandatory source control) is smart before killing their will to learn with nasty edge cases

I don't think fumbling around in a trial and error manner is the best start. The basic data types and the set of values they can represent should come early.

Base 2 floating point is perfectly OK. The problem is people using floating point not knowing that IEE754 floating point representation is binary, and not decimal, so most decimal constants will not have an exact match. Said that, many implementations do buggy binary <-> decimal conversions, not because of the limit accuracy in the conversion, but because of "creative" rounding criteria.

More interesting is exactly why floating point 0.1 + 0.2 != 0.3, and it's due to the way rounding is defined in the IEEE-754 standard:

> An implementation of this standard shall provide round to nearest as the default rounding mode. In this mode the representable value nearest to the infinitely precise result shall be delivered; if the two nearest representable values are equally near, the one with its least significant bit zero shall be delivered.

If we convert from decimal to double precision (64-bit) floating point, here is how they are represented in hexadecimal and binary:

    0.1 -> 0x3FB999999999999A = 0011 1111 1011 1001 1001 1001 1001 1001 1001 1001 1001 1001 1001 1001 1001 1010
    0.2 -> 0x3FC999999999999A = 0011 1111 1100 1001 1001 1001 1001 1001 1001 1001 1001 1001 1001 1001 1001 1010
    0.3 -> 0x3FD3333333333333 = 0011 1111 1101 0011 0011 0011 0011 0011 0011 0011 0011 0011 0011 0011 0011 0011
                ^^^^^^^^^^^^^                  ^^^^ ^^^^ ^^^^ ^^^^ ^^^^ ^^^^ ^^^^ ^^^^ ^^^^ ^^^^ ^^^^ ^^^^ ^^^^
Taking 0.1 as an example, here is what its binary representation actually means:

    sign exponent    mantissa (marked using ^ in the table above)
       0 01111111011 1001100110011001100110011001100110011001100110011010
The exponent is encoded as its offset from -1023, so in this case we have 01111111011 which is decimal 1019, making the exponent 1019-1023 = -4.

The mantissa (BBBB…) is an encoding of the binary number 1.BBBB…, so with an exponent of -4 that makes the actual number 0.0001BBBB….

Applying this for each of these numbers:

    decimal  binary
    0.1      0.00011001100110011001100110011001100110011001100110011010
    0.2      0.0011001100110011001100110011001100110011001100110011010
    0.3      0.010011001100110011001100110011001100110011001100110011
Then if we add 0.1 + 0.2, this is the result:

    + 0.0011001100110011001100110011001100110011001100110011010
However we only have 52 bits to represent the mantissa (again marked with ^), so the result above has to be rounded. Both possibilities for rounding are equidistant from the result:

      0.010011001100110011001100110011001100110011001100110100 <==
So according to the specification, the option with the least significant bit of zero is chosen.

Converting this back to floating point format we get 0x3FD3333333333334. Note that the least significant four bits of the mantissa are 0100, which corresponds to the trailing 4 in the hexadecimal representation.

This is not equal to 0x3FD3333333333333 (the result of conversion from decimal 0.3, and also what would have been the result here if the rounding was specified the other way.)

Therefore, floating point 0.1 + 0.2 != 0.3.

TXR Lisp:

  > (+ 0.1 0.2)
Why is that? Because I made a "populist" decision purely for the sake of better "optics". The default printing precision for floating-point values is set at just enough digits to truncate things in this manner:

  2> *print-flo-precision*
The cost to doing that is that the printed representation of floats isn't guaranteed to have read/print consistency: what we print doesn't always read back as the same value which in other words means that we don't have a unique printed representation for each possible value. That default of 15 isn't my favorite language design decision. It's justifiable and I believe in it, but not as wholeheartedly as in some other design decisions.

In any case, language users who need to record every bit of the IEEE 754 double in the printed representation are not just hung out to dry: they can set or override this special variable to 17.

  3> (let ((*print-flo-precision* 17)) (prinl (+ 0.1 0.2)))
(One value here is the `prinl` output, under the dynamic scope of the special variable override; the other is the result value, printed in the REPL, with the top-level binding of it, still at 15 digits.)

Oh, and by the way, we dohn't have to add 0.1 to 0.2 to demonstrate this. Either of those values by itself will do:

  4> (let ((*print-flo-precision* 17)) (prinl 0.1))
  5> (let ((*print-flo-precision* 17)) (prinl 0.2))
Why 15 is used the default precision rather than, say, 14, is that 15 still assures consistency in one direction: if any number with up to 15 digits of precision is input into the system, upon printing it is reproduced exactly in all 15 digits. And 15 is the highest number of digits for which this is possible with an IEEE 64 bit double.

For this reason R has a function for comparing results of floating point math: all.equal()

it is quite tricky as it might show that .1+.2 is 0.3, but this is only due to the fact that default output is a subject of default formatting (i.e. occulting more precision numbers)

> Your language isn't broken, it's doing floating point math.

Specifically, it's doing binary floating point math, which is an odd choice for decimal literals that's​ common only because very few languages choose correctness and clarity over performance when dealing with numbers, which leads to all kinds of problems with the most obvious approaches to addressing common problem domains in industrially popular languages. If I had a dollar for, well, every dollar of errors I'd seen because binary floating point values were used in a monetary calculation, I could afford to stop having to deal with them.

"If I had a dime for every time I've seen someone use FLOAT to store currency, I'd have $999.997634"


How does Postgres "work" properly? I was wondering this the other day. Is something magical going on behind the scenes that I should know about? I added a ::decimal to be safe, but I wasn't sure

If you set extra_float_digits=3 then the output will always be large enough to contain the full precision. pg_dump et al. do so automatically.

Not sure if I like or dislike Perl6 decision to use rationals by default; so 0.1 is saved as 1/10 and 0.2 as 2/10 and the calculation is precise. But also possibly slow.

I'm not sure why the round-trip formatter is used in the C# example (https://msdn.microsoft.com/en-us/library/dwhawy9k(v=vs.110)....). If you omit it then you get the hoped-for result of 0.3.

The fun thing is that in the case of an "accurate" result (0.3), you can't be sure whether the language arrived there by somehow doing accurate floating point math - or by following the PHP way of first rounding in base 2, then rounding again in base 10, thus making the result even more inaccurate.

For all those interested in Floating Point Arithmetic, I'd strongly recommend reading this paper:


As a (kind of) related matter, thought it'd be interesting to mention JavaScript is most likely getting arbitrary precision ints soon https://github.com/tc39/proposal-integer

The C++ code could use std::numeric_limits<double>:: max_digits10 instead of the magical 17!

Similarly, for C it could be:

  printf("%.*f\n", DBL_DECIMAL_DIG, .1+.2)
In both cases, support for a relatively new language standard (C++11, C11) would be required.

2011 was six years ago.

In terms of C and C++, that's fairly new. Lots of older code using pre-C11/C++11 standards is out there.

Hence the phrase "relatively new".

That reads like the code equivalent of involuntarily burping in polite company.

The link given to Rust rationals is broken. It should be: http://rust-num.github.io/num/num_rational/

Use decimals in c# they are used for exact calculation ( and are slower)

PS. It says: Console.WriteLine("{0:R}", .1 + .2);

But when you do .1 + .2 it is 0.3 :) . With the 0:R he actually converts it to double which give the false result

That "digits=18" in R example probably bothers me more than it should.

I was literally wondering why R gave an "accurate" result for 0.2 + 0.1 the other night, despite using doubles. The printer has an algorithm to round then stringify

I wonder if some of the examples exhibit fixed-point precision calculation, and if they do, it would be great to highlight it.

I am surprised that domain was still available

Whats going on in the second D example?

The "f" suffix means they are single-precision (32-bit) FP numbers.


    * (+ (/ 1 10)(/ 2 10))

It's great that your language of choice has fraction support - but real-world engineering problems rarely involve simple rational fractions exclusively. At some point you're going to run into roundoff error.

Plus, fractional computing has severe problems with expanding denominators, which grow very quickly with non-trivial operands. Although computers are fast, the overhead of dealing with hundred-digit-long arbitrary precision integers eventually makes rationals unwieldy.

It's a trade-off between speed and accuracy. Not all software deals with real-world engineering problems, and there are plenty of applications where you would prefer accuracy.

I'd rather make that trade-off myself than have the language decide for me. Most of the time I would start with correctness and if speed is needed and the profiler shows that the bottleneck lies in fractional arithmetic, I'll switch to floating point numbers.

A similar trade-off exists with bignums versus 32 or 64 bit integers. There again, I prefer a language that uses bignums by default. I'll convert the bottlenecks to use fixnums after I've verified that it is safe to do so.

With lisps you can chose whether you want floating point or rationals.

Why does PHP produce 0.3 instead?

edit: Not sure how I missed that. Thanks!

Third example in the article:

PHP converts 0.30000000000000004 to a string and shortens it to "0.3". To achieve the desired floating point result, adjust the precision ini setting: ini_set("precision", 17).

I remember years ago learning something like this and coming to a horrible realization : "My god, they're all strings!"

PHP converts 0.30000000000000004 to a string and shortens it to "0.3". To achieve the desired floating point result, adjust the precision ini setting: ini_set("precision", 17).

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | DMCA | Apply to YC | Contact