Hacker News new | past | comments | ask | show | jobs | submit login

> Performing arithmetic operations against money in floating point is the dangerous part, as error can accumulate beyond an atomic unit.

A good example of this is trying to compute the sales tax on $21.15 given a tax rate of 10%. The exact answer would be $2.115, which should round to $2.12.

IEEE 64-bit floating point gives 2.1149999999999998, which is hard to get to round to 2.12 without breaking a bunch of other cases.

Here are three functions that try to compute tax in cents given an amount and a rate, in ways that seem quite plausible:

  def tax_f1(amt, rate):
    tax = round(amt * rate,2)
    return round(tax * 100)
  
  def tax_f2(amt, rate):
    return round(amt*rate*100)
  
  def tax_f3(amt, rate):
    return round(amt*rate*100+.5)
On these four problems:

   1% of $21.50
   3% of $21.50
   6% of $21.50
  10% of $21.15
the right answers are 22, 65, 129, and 212. Here are what those give:

  tax_f1:  21  65 129 211
  tax_f2:  22  64 129 211
  tax_f3:  22  65 130 212
Note that none of the get all four right.

I did some exhaustive testing and determined that storing a money amount in floating point is fine. Just convert to integer cents for computation. Even though the floating point representation in dollars is not exact, it is always close enough that multiplying by 100 and rounding works.

Similar for tax rates. Storing in floating point is fine, but convert to an integer by multiplying by an appropriate power of 10 first. In all the jurisdictions I have to deal with, tax rate x 10000 will always be an integer so I use that.

Give amt and rate, where amt is the integer cents and rate is the underlying rate x 10000, this works to get the tax in cents:

  def tax(amt, rate):
    tax = (amt * rate + 5000)//10000
    return tax
I'm not fully convinced that you cannot do all the calculations in floating point, but I am convinced that I can't figure it out.





> IEEE 64-bit floating point gives 2.1149999999999998, which is hard to get to round to 2.12 without breaking a bunch of other cases.

Your issue is on how to print the float, not with the precision of fp. For instance, `21.15 * 0.1` can be print both as 2.115 or 1.12 depending on how many decimal digits of precision you set your print function. I manage to get those results with printf using `%.3f` and `%.2f`, respectively.

To produce one cent (0.0x) error with the default FP rounding, it takes more than 1 Quadrillion of operation. Each operation can only introduce 1*10^17/2 error.

The "you shouldn't be using float to do monetary computation" is likely one the most spread float point misinformation.

The issues with your others examples is that you are rounding the data (therefore, discarding information). If you don't do any manual round, the result should be correct (I haven't test thought).


> Your issue is on how to print the float, not with the precision of fp. For instance, `21.15 * 0.1` can be print both as 2.115 or 1.12 depending on how many decimal digits of precision you set your print function. I manage to get those results with printf using `%.3f` and `%.2f`, respectively.

I get 2.115 with %.3f and 2.11 with %.2f. Here's my test program. Same result on my Mac with clang and my Debian 8 server with gcc.

  #include <stdio.h>
  
  double tax_on(double amt, double rate);
  
  int main(void)
  {
      double amt = 21.15;
      double rate = 0.1;
      double tax = tax_on(amt, rate);
      printf("%.3f\n", tax);
      printf("%.2f\n", tax);
      return 0;
  }
  
  double tax_on(double amt, double rate)
  {
      return amt * rate;
  }

The thing is that if 2.115 represents a calculated dollar figure, such as the value of some transaction or the cost of something or whatever, then we should round it to 2.12. (Unless we are working in a financial domain that deals with fractions of a cent.) Now in floating-point, we don't exactly have the exact value 2.12, but we have something that is extremely close. So close that if we happen to print it to %.3f, we better get 2.120, and if we print it to %.4f, we better see 2.1200.

That some monetary calculation works out to $2.115 (and is left that way) instead of being correctly rounded $2.12 doesn't add up to a valid argument against using floating-point for money.

I think piadodjanho does have a point there in the grandparent comment; "don't use floating-point for money" may just be a repeated mantra that doesn't entirely hold water. If extremely accurate engineering and scientific calculations can be done with floating-point, surely we can get floating-point values to measure stacks of pennies with the proper care in the programming.


> If extremely accurate engineering and scientific calculations can be done with floating-point, surely we can get floating-point values to measure stacks of pennies with the proper care in the programming.

That was for a long time my position. I definitely have commented before either here or in /r/programming to the effect that floating point is fine for money as long as you are aware that it is not exact and not associative, and take that into account when doing your calculations.

Any intermediate result in a calculation chain might be off a tiny amount from the exact value, but if you just rounded to the nearest 0.01 before you accumulated enough error to not < 0.005 off, you'd be fine.

I think that's probably true for addition of money amounts. If you have a large number of costs to add up, for example, you should be able to add thousands of them, round to nearest 0.01, and get the right result.

But for tax calculations, such as 10% of $21.15, 0.1 x 21.15 = 2.1149999999999998 in 64-bit IEEE floating point, and rounding the nearest 0.01 gives 2.11, not the 2.12 that we want. A call to fesetround(FE_UPWARD) makes that come out 2.115, and then rounding to the nearest 0.01 gives the desired 2.12.

Will FE_UPWARD make this work for all amounts and tax rates, or are there amounts and rates where we need FE_TONEAREST or FE_DOWNWARD? If so, how do we tell which one we need? Like I said earlier:

> I'm not fully convinced that you cannot do all the calculations in floating point, but I am convinced that I can't figure it out.

PS: calculating tax in cents given double amt, rate, using this method:

  tax = amt * rate;
  cents_tax = round(100 * tax);
almost works if the rounding mode is FE_UPWARD. For all amounts from 0.01 through 99.99, and all tax rates from 0.01% through 10.99% in increments of 0.01% it works except for 3.75% of $67.60 and 7.5% of $33.80.

> but if you just rounded to the nearest 0.01 before you accumulated enough error to not < 0.005 off, you'd be fine.

And in run-of-the-mill, everyday finance, there simply isn't enough calculation stuffed in between the concrete monetary points that are recorded in the ledger.

> If you have a large number of costs to add up, for example, you should be able to add thousands of them, round to nearest 0.01, and get the right result.

Exactly.

> But for tax calculations, such as 10% of $21.15, 0.1 x 21.15 = 2.1149999999999998 in 64-bit IEEE floating point, and rounding the nearest 0.01 gives 2.11, not the 2.12 that we want.

This problem will be there even if we use integers for the currency amounts, but floating-point only for these fractional calculations.

Luckily for us Canadians, I'm pretty sure the Canada Customs and Revenue Agency won't care which way you call this rounding. They also don't collect or refund overall discrepancies of less than around two dollars in a single tax return. I think I've been mostly rounding taxes down over the years, and tax credits up. E.g. if a tax credit is $235.981..., I make it 235.99.

The myth that has been foisted on programmers is that if you use floating-point for numbers, the actual ledgers won't balance, and sum totals of columns of figures will appear incorrect if verified by pencil-and-paper arithmetic. That will certainly be true if the math is done very carelessly; and it's true that it's easier to get it right with less care using integers.

A percentage calculation whose rounding is called the wrong direction will, in and of itself, not cause such a problem. E.g. if we split some sum of money into two complementary percentages, we can do it such that the two add up to the original.

You have to be careful not to do this as two independent percentages. Like, dont take 10% of 21.15 and then 90% of 21.15, individually round them to a penny, and then expect them to add up to 21.15. It has to be centround(21.15 - centround(.1 * 21.15)) to get the 90% residue.


The trick is that by default rounding happens using banker's rounding. Programming languages use this because this is what CPUs use. When you want to round your way, you need an extra digit and round manually:

    def tax_f4(amt, rate):
      tax = round(amt * rate * 1000)
      return tax // 10 + (tax % 10 > 4)

That works for 10% of $21.15, giving the desired 212.

However, for 10.14% of $21.15, it gives 215, but it should be 214. Another example is 3.5% of $60.70, for which it gives 213 but correct is 212.


You're right. My remainder calculation in my code snippet is incorrect. It should've been a floating point remainder instead.

    import math
    
    def tax_f5(amt, rate):
        t = amt * rate * 1000
        return round(t) // 10 + ((math.fmod(t, 10.0) - 5.0) > -1e-7)
But then since there's now an epsilon, it raises the question of how many digits of precision the tax rates typically need. This is indeed a difficult problem.

Some exhaustive testing on all amounts from $0.01 through $999.99 in $0.01 increments and all taxes from 0.01% through 99.99% in increments of 0.01% show that this is the minimum that does the trick (switching to C from Python for speed):

  unsigned long tax = (unsigned long)round(amt * rate * 1000000);
  return tax/(10000) + (fmod(tax, (double)(10000)) - (double)(5000) > -1e-5 ? 1 : 0);
(Yes, I see that I goofed in translation your code to C and typed -1e-5 instead of -1e-7. It looks like the results are the same with -1e-7).

I also tested that up through $9999.99 with taxes up to 12%, and no problems.

Adding another 0 to the 1000000, the two 10000's, and the 5000 works. And another, and another. Past that it starts to fail, but not the simple off-by-one failures you get when you don't use enough digits. These are way way off, so I'm guessing its running into some new class of problem. I haven't looked to see what that is yet.




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: