Sure, I'll embarrass myself and use myself as an example, since this is how I actually came to learn about this in the first place.

When I started working in finance straight out of school (not in HFT at the time), I naively accepted the dogma that money should never be represented using floating point. Everywhere I went I would read, in bold letters: don't use floating point! Don't use floating point! So I just accepted it as true and didn't question it. When I wrote my first financial application I used an int64 to represent currencies with a resolution of 10^-6, because that was what everyone said to do.
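Roughly, that first version looked something like this (a reconstruction rather than the actual code; the class and method names are invented):

    // Simplified sketch of the original fixed-point approach (hypothetical names).
    // Amounts are signed 64-bit counts of 10^-6 currency units ("micros").
    public final class Money {
        private final long micros;             // e.g. $1.25 -> 1_250_000

        public Money(long micros) { this.micros = micros; }

        public static Money ofMajorUnits(long units) {
            return new Money(Math.multiplyExact(units, 1_000_000L)); // overflow-checked
        }

        public Money plus(Money other) {
            return new Money(Math.addExact(this.micros, other.micros));
        }

        public long micros() { return micros; }
    }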

And well... all was good in life. Then one day I extended my system to work with currencies other than U.S. dollars: Mexican pesos, Japanese yen, and in general any currency X where 1 X is worth either much less or much more than 1 USD.

Then things started to fail. Rounding errors became noticeable, especially when doing currency conversions, and things fell apart badly.
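To make the failure mode concrete (the rates and amounts here are made up for illustration, not what I was actually dealing with): a single fixed 10^-6 tick is too coarse for a currency whose unit is worth far more than a dollar, and it burns through the int64 range for one worth far less.

    // Hypothetical illustration of where one fixed 10^-6 tick per currency breaks down.
    public class FixedTickDemo {
        public static void main(String[] args) {
            long priceMicrosUsd = 10_000L;      // 0.01 USD expressed in micro-USD
            double usdPerBtc = 40_000.0;        // invented conversion rate

            // 0.01 USD is 0.25 micro-BTC, which a 10^-6 grid rounds to zero.
            long priceMicrosBtc = Math.round(priceMicrosUsd / usdPerBtc);
            System.out.println(priceMicrosBtc); // prints 0 -- the amount vanished

            // At the other extreme, micro-units of a low-value currency eat the int64 range:
            // Long.MAX_VALUE micro-yen is only about 9.2 trillion yen.
            System.out.println(Long.MAX_VALUE / 1_000_000L);
        }
    }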

So I took the next logical step and used a BigDecimal. Now my rounding issues were solved, but the performance of my applications suffered immensely across the board. Instead of storing a 64-bit int in a database, I'm now storing a BigDecimal in Postgresql, which slowed my queries dramatically. Instead of just serializing raw 64 bits of data across a network in network byte order, I now have to convert my BigDecimal to a string, send it across the wire, and then parse the string back. Every operation I perform now potentially requires allocating memory on the heap, whereas before everything was minimal and blazingly fast.

I feel like there is a general attitude that performance doesn't matter, premature optimization is evil, programmer time is more expensive than hardware, and so on... but honestly nothing feels more demoralizing to me as a programmer than having an application run really, really fast one day and really, really slow the next. Performance is one of those things that, once you have it and know what it feels like, you don't want to give up.
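To give a concrete sense of the serialization difference I mean (a sketch, not my actual wire format): the fixed-point amount is a single 8-byte write, while the BigDecimal path goes through a string in both directions and allocates along the way.

    import java.io.*;
    import java.math.BigDecimal;

    // Sketch of the two wire formats described above (hypothetical helper names).
    class WireFormats {
        static void writeFixedPoint(DataOutputStream out, long micros) throws IOException {
            out.writeLong(micros);                 // 8 bytes, network byte order, no allocation
        }

        static void writeDecimal(DataOutputStream out, BigDecimal amount) throws IOException {
            out.writeUTF(amount.toPlainString());  // variable length, allocates a String
        }

        static BigDecimal readDecimal(DataInputStream in) throws IOException {
            return new BigDecimal(in.readUTF());   // parse it back, allocating again
        }
    }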

So, not knowing any better, I decided to come up with a scheme to regain the lost performance... I realized that for U.S. dollars, 10^-6 was perfectly fine. For currencies worth much less than the U.S. dollar I needed fewer decimal places, so for yen, 1 unit would represent 10^-4 yen. For currencies worth more than the U.S. dollar, 1 unit would represent 10^-8...

So my 'genius' younger self decided that my Money class would store both an integer value and a 'scale' factor. When doing operations, if the scale factors were the same, I could operate on the values as is. When the scale factors differed, I had to rescale the coarser value to the finer scale and then perform the operation.
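In code, the scheme was roughly this (a reconstruction with made-up names, not the original):

    // Sketch of the value-plus-scale scheme: 'value' counts units of 10^-scale
    // of the currency, so (value = 1_250_000, scale = 6) means 1.25.
    public final class ScaledMoney {
        private final long value;
        private final int scale;   // number of decimal places

        public ScaledMoney(long value, int scale) {
            this.value = value;
            this.scale = scale;
        }

        public ScaledMoney plus(ScaledMoney other) {
            if (this.scale == other.scale) {
                return new ScaledMoney(Math.addExact(this.value, other.value), this.scale);
            }
            // Rescale the coarser value to the finer scale, then add.
            ScaledMoney fine = this.scale >= other.scale ? this : other;
            ScaledMoney coarse = this.scale >= other.scale ? other : this;
            long rescaled = Math.multiplyExact(coarse.value, pow10(fine.scale - coarse.scale));
            return new ScaledMoney(Math.addExact(fine.value, rescaled), fine.scale);
        }

        private static long pow10(int n) {
            long p = 1;
            for (int i = 0; i < n; i++) p = Math.multiplyExact(p, 10L);
            return p;
        }
    }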

This actually worked, to a degree: I regained a lot of my speed and didn't have any precision issues. But all I had done was reinvent a crappy floating-point system in software without knowing it.

Eventually I ended up reading about actual floating point, and I could see clearly the relationship between what I was doing and what actual experts had realized was the proper way to handle values whose scales can vary wildly.

And once I realized that, I could sympathize with why people were against using binary floating-point values for money. But the solution wasn't to abandon them; it was to actually take the time to understand how floating point works and then use it properly for my domain.

So my Money class does use a 64-bit floating-point value, but instead of (double)(1.0) representing 1 unit, it represents 10^-6 units. And it doesn't matter what currency I need to represent: from currencies as small as the Russian ruble to ones as large as Bitcoin, it all just works, and works very fast.

64-bit doubles give me 15 digits of guaranteed precision, so as long as my values are within the range 0.000001 up to 999999999.999999 I am guaranteed to get exact results.

For values outside of that range, I still get my 15 digits of precision but I will have a very small margin of error. But here's the thing... that margin of error would have been unavoidable if I used fixed decimal arithmetic.
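For what it's worth, the current representation boils down to something like this (again a sketch with invented names):

    // The double counts micro-units: 1.0 means 10^-6 of the currency,
    // 1_000_000.0 means one whole unit, regardless of which currency it is.
    public final class FloatMoney {
        private final double micros;

        private FloatMoney(double micros) { this.micros = micros; }

        public static FloatMoney ofMicros(long micros) { return new FloatMoney(micros); }

        public FloatMoney plus(FloatMoney other) {
            return new FloatMoney(this.micros + other.micros);
        }

        // Whole micro-unit counts up to 2^53 (about 9.0e15) are represented exactly,
        // which covers the 0.000001 .. 999999999.999999 range mentioned above.
        public long toNearestMicros() {
            return Math.round(micros);
        }
    }

Sums and differences of whole micro-unit counts stay exact as long as they stay below 2^53; rounding really only enters when you multiply by a conversion rate, and that is where the 15 significant digits come in.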

Now I say this as a personal anecdote but I know for a fact I'm not the only one who has done this. I just did a Google search that led to this:

https://stackoverflow.com/questions/224462/storing-money-in-...

The second answer from the top, with 77 points, yells in bold letters about not using floating point for currency and then suggests a scheme almost identical to the one I described, where you store a raw number and a scaling factor: basically a poor man's implementation of floating point.




Thanks for the detailed explanation of your thought process.

I guess you fall into the few percent of developers who are NOT meant to be addressed by the general rule of "don't use float for money". Like all "general rules" in software development, it is intended to guide the >90% of devs who don't want to, and sometimes aren't able to, fully grasp the domain they're working in and the inner workings of the technology they use. The majority of devs need simple, clear guidelines that prevent them from making expensive mistakes while wielding technology that's made up of layers and layers of abstractions, some (or even most) of which are entirely black boxes to them.

"Playing it safe" often comes with other shortcomings, like worse performance, and if those are not acceptable in a specific scenario, you need a developer who fully grasps the problem domain and the technology stack, and who is thus capable of ignoring the "general rules" because he knows exactly why they exist and why he won't run into the trouble they are meant to protect you from.

I have long thought that every developer under the sun should strive to reach this point, and I still think it's an admirable goal, but I have come to understand that not all developers share it, and that even someone who tries to learn as much as possible about every technology he comes in contact with will never reach this state of deep understanding in every technical domain imaginable; there are just too many of them nowadays. We all, no matter how smart we are, sometimes need to rely on "general rules" in order to not make stupid mistakes.


You know there's a NUMERIC type in Postgres? Not quite arbitrary precision, but large enough for pretty much all practical purposes: "up to 131072 digits before the decimal point; up to 16383 digits after the decimal point"

https://www.postgresql.org/docs/current/static/datatype-nume...


NUMERIC (without a precision or scale) is a variable-length datatype, and suffers all the performance problems mentioned above.

If you specify a precision and scale, performance of NUMERIC improves quite a bit, but now you can’t store values with vastly different magnitudes (USD and JPY) in the same column without wasting tons of storage on each row. You’re back to square one.


Thanks a lot for this answer -- the real-world scenario including your thought process is very insightful.



