
Is Cobol holding you hostage with Math? - mbellotti
https://medium.com/@bellmar/is-cobol-holding-you-hostage-with-math-5498c0eb428b
======
simonbyrne
There is a little more to it than "languages don't provide decimal or fixed
point". Both Java and Python provide decimal implementations as part of their
standard libraries, and languages don't provide fixed-point as it's largely
trivial: "just do integer operations and scale them appropriately".

The problem is that fixed point is ambiguous: there are multiple ways to do
rounding (unlike floating point which has been largely standardised since the
early 90s). In fact, the COBOL rounding rules are rather complicated:
[https://stackoverflow.com/a/30215718/392585](https://stackoverflow.com/a/30215718/392585)
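The ambiguity is easy to demonstrate with Python's decimal module (a small illustration of rounding-mode choice, not COBOL's actual rules):

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN

# The same value rounds to a different whole number under two common modes.
x = Decimal("2.5")
print(x.quantize(Decimal("1"), rounding=ROUND_HALF_UP))    # 3
print(x.quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))  # 2
```

Multiply that choice across every intermediate step of a calculation and two "correct" implementations can disagree.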

This has some interesting consequences:

I know of an insurance company where there were difficulties trying to
replicate the exact premium calculation (which was done on a mainframe in
COBOL), part of which involved taking 1.01 to the power of a positive integer
(which depended on various risk factors of the policy).

COBOL was not really intended for numerical work, and so doesn't have a "pow"
function (or at least this version didn't): instead, it turns out that the
programmer had used a simple loop which would iteratively multiply a variable
by 1.01, incurring round-off at each iteration. So the only way to emulate it
exactly was to use the same arithmetic _and_ the same ugly hacks used in the
original software.
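The effect can be sketched in Python. This is a hypothetical reconstruction, not the insurer's code: the original was COBOL fixed point with its own rounding rules, and the 2-decimal rounding here is deliberately coarse to make the drift visible.

```python
from decimal import Decimal, ROUND_HALF_UP, getcontext

def compound_loop(n):
    """Multiply by 1.01 n times, rounding to 2 decimal places after each
    step, the way a naive fixed-point loop would."""
    x = Decimal("1.00")
    for _ in range(n):
        x = (x * Decimal("1.01")).quantize(Decimal("0.01"),
                                           rounding=ROUND_HALF_UP)
    return x

getcontext().prec = 50
exact = Decimal("1.01") ** 100   # about 2.7048

print(compound_loop(100))        # 2.50, well short of the exact 2.7048...
```

Every per-step rounding is individually tiny, but the loop compounds them; replicating the result means replicating the loop, not the formula.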

~~~
jarfil
Why replicate the same calculation with the same errors? Shouldn't it be
specified as a pure math formula somewhere, probably in the contacts? So they
did it slightly wrong for a long time, but why would that be the reference
instead of the contract.

~~~
wruza
It hurts even when there is no explicit contract. After integrating a couple
of factories we found out that precision is not king; stability is. Too many
processes depend on each other, from small to big, and if you manage to mess
something up without immediate notice from the workers (which happens rarely,
since they provide good friction on this slippery surface), the cost of fixing
it may outweigh the precision gains for years. Essentially, no matter what the
hw/sw platform is, the business “ABI” is locked into the algorithms used, and
rounding/floating/fixed is only one of them. I’ve seen [edit: algorithm-wise]
wrong cost-price distributions, penalty calculations and so on that were the
foundation for chains of table-based and gut-based decision making. Things
simply don’t add up for everyone, not to mention how hard it is to flip a
switch for an entire org that runs e.g. 24/7 and/or has unfinished production
shifts.

That was knowledge I wish I’d already had back then, since these sorts of
projects brought a lot of frustration, insecurity and expense that you could
never imagine while working as a software half-PM/half-coder. At some point
you start to doubt whether you’re skilled for this job or just a loser with a
keyboard.

Minor/simple processes lack that cohesion and can be fixed or left as-is
without much trouble though.

~~~
LeifCarrotson
The requirements doc or contract generated to create the software almost
always has a lot less thought and a lot less debugging and adjusting in it
than the actual software and processes subsequently built around it that have
been running that way for the last decade or three.

Can you imagine if, suddenly, an insurer had a 1% shift in premiums overnight
because someone fixed the rounding algorithm to comply with the 1983 spec that
had been mis-implemented in COBOL?

------
hyperman1
I had many adventures interfacing other stuff with mostly MicroFocus COBOL.
Some personal observations:

* The COBOL knowledge shortage is a myth. I got pretty good at reading their code and giving the COBOL guys small bugfixes, most of the time without even having the compiler at my disposal. Learning COBOL the language is pretty easy, even if actually doing something with it is slow, verbose and bureaucratic. Training new devs will never be the problem if a willing company and a willing person can be found.

* But training is the problem, politically. Why would a dev learn a language that is a bad mark on the resume while being paid less? Why would management want to risk their career by pushing forward an evolutionary dead end?

* And the language is a dead end. NLS in Cobol is really weak, which means it's a no-go for anything global. Fixed width everywhere causes massive technical debt, as there is a tendency to 'cheap out on bytes', after which you're completely locked in by the required data migration everywhere. Cobol culture is rife with obsolete practices. One program equals one file (plus some libs managed by other Cobol devs) most of the time, so methods of a few thousand lines are the norm.

* The real value of the COBOL devs I know is that they are business people first, technical second. They know their business, deep. They've seen everything and smell if an idea is good or bad. Because of this, they run rings around younger devs and analysts when you look at business value, especially if said devs/analysts are outside consultants that know basically nothing and just code what's been fed to them. While training for Cobol is easy, training for the actual business is close to impossible.

* Another Cobol upside is being a technical dinosaur: the Enterprise Architecture Astronauts or bungee-boss CEO won't touch it with a 10-foot pole, so there is no grand new architectural vision every other year. Lava flow architecture is not that huge a problem. Besides, your code base has lived more than 40 years and is already ugly as hell. The technological investment and training cost were made long ago. Cobol devs can just convert problem to program without weird technological surprises, integration hell, or architectural distractions.

Don't get me wrong. I'm very glad I don't do COBOL. But it's not burn-it-
down-and-start-over-now bad either.

~~~
specialist
Keen observations.

I worked on the periphery of a mainframe COBOL modernization project.
Basically an attempted rewrite. Utter failure.

My hunch at the time was the better technical strategy would have been to run
all that stuff within an emulator (to get off the nearly dead mainframe) and
to focus on process improvements (eg source control, repeatable builds, schema
management). But otherwise leave the legacy code in place.

------
otabdeveloper2
Daily reminder: floats are emulated reals.

Money values are rationals, not reals.

Languages need to support the full number stack, including the rationals!

"Floating point decimal" is wrong and the worst of both worlds. Unless you
live in the USA, you need to do currency conversions anyways, and those aren't
decimal.

~~~
boomlinde
Floating point numbers are all rational. Given that rational numbers are a
countable subset of real numbers, I think it's inaccurate to say that floats
are somehow "emulated reals" as opposed to rationals. The radix of a floating
point number relates to the quotient of a rational number. For example in
decimal floating point significand-base-exponent form, 1.23 is 123 * 10^-2 or
123/100.
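Python's fractions module makes this concrete: every binary float is exactly some integer times a power of two, and `Fraction` recovers that exact rational value.

```python
from fractions import Fraction

print(Fraction(1.5))   # 3/2
print(Fraction(0.1))   # 3602879701896397/36028797018963968, i.e. not 1/10
```

The second line shows the real issue is not "floats aren't rationals" but that a binary radix cannot represent 1/10 exactly.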

~~~
taejo
All floating point numbers are rational, but not all rational numbers are
floating-point, including very simple ones like 1/3, and ones common in
finance like 1/100, which is the point.

~~~
boomlinde
1/3 and 1/100 are perfectly representable using floating point. Sure, I've
never seen fp in base 3 in the wild, but base 10 isn't unusual for the exact
reason you're pointing out.

So what I see is a conclusion that floating point is "emulated reals" as
opposed to rationals, despite all floating point numbers being rational
numbers and that decimal floating point is wrong and "the worst of both
worlds", but I don't see a valid argument for it.

------
jillesvangurp
Java has BigDecimal in the standard library. That gives you arbitrary
precision. Also, using a double instead of a float gives you a bit better
precision. The performance hit of using double instead of float on modern
hardware is not something that should be a showstopper.

I don't get the argument about libraries and performance. We're talking about
companies using emulators of ancient hardware to run decades old software.
Pulling in some library is a complete non issue from a performance point of
view.

Speaking of hardware, that is magnitudes faster than anything imaginable when
most cobol was first written. Performance is not the key concern here.

~~~
fpoling
Ergonomics of using a feature plays a big role. There was a famous example
where a programmer did something like

bigDecimal.doubleValue() + bigDecimal2.doubleValue()

just because Java does not have operator overloading, and using methods
instead of the familiar operators was ugly.
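The mistake can be illustrated in Python terms (an analogue of the Java snippet, not the original code): converting exact decimals to binary floats before adding reintroduces exactly the rounding the decimal type exists to avoid.

```python
from decimal import Decimal

a, b = Decimal("0.10"), Decimal("0.20")
print(a + b)                # 0.30 (the decimal add is exact)
print(float(a) + float(b))  # 0.30000000000000004 (the detour via floats is not)
```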

~~~
sorokod
Are you saying that personal sense of esthetics trumps correctness?

Anyway, BigDecimal has an add() method (overloaded for rounding behaviour)

bdResult = bd1.add( bd2 )

~~~
romwell
>BigDecimal has an add() method

Using it for formulas makes for wonderful code.

Try

    
    
        y = (5x^3 - 3x^2 + 7x + 1)/(3x^2 - 7x)
    

See how readable that is using methods like .add()

Side note: the notation we have is not that old; but as soon as people
actually started working with polynomials, they invented notation that makes
it sane.

That's to say, Java's .add() is _so_ 1400's.

~~~
sorokod
There is no argument from me that a + b is more readable than a.add(b). Java's
founding fathers (Gosling?) decided against operator overloading, citing the
mess it allowed to be created in other languages. The mathematical notation
for spelling out polynomials has been optimized over a long time; Java, being
a general-purpose programming language, has other fish to fry.

There are languages that allow for a cleaner way to express mathematical
statements, this is valid Julia:

    
    
        julia> p(x) = (5x^3 - 3x^2 + 7x + 1)/(3x^2 - 7x)
        p (generic function with 1 method)
    
        julia> p(1 + 2im)
        2.4 + 0.2im

------
rossdavidh
While much of this was interesting and perhaps true, I don't really think it's
why COBOL is still being used. It's still being used because it works for
cases where the "BO" part of "COBOL" is relevant, in domains where the
potential downside of migrating is very, very large (if it goes badly), and
the potential upside is actually pretty limited. Best case, your migration
doesn't get noticed as causing any problems, and now your programmers keep
leaving for other industries because their skills are more general. That's a
whole lot of scary scenarios on one side of the scale, against a pretty meager
upside on the other side of the scale.

~~~
wyldfire
> upside is actually pretty limited.

You're selling the upside short. Having a pool of software devs to draw from
who are familiar with the language and typical deployments is a big deal.

The next Y2K-style crisis for this industry will come in a decade or three,
when there are few people left who have experience here.

~~~
gaius
_Having a pool of software devs to draw from_

Well, in a world where everyone is self-taught from what’s trending at the
moment. In the old days if companies needed someone with particular skills
they would train them and make an effort to retain them. Now the cost and the
risk has all been pushed onto the employee.

It should be a simple equation: cost of training vs cost of rewriting every
2-3 years...

------
jim_lawless
I'd like to point out that COBOL generates rather efficient fixed-point math
code on IBM mainframes because those mainframes have a dedicated set of
machine-level instructions that deal with fixed-point math.

The data type used is "Packed Decimal" where each nybble in a string of bytes
represents a digit, except the last nybble. The last nybble describes the sign
of the overall number. It's similar to BCD with a sign-nybble at the end.

Here's a list of the Packed Decimal instructions with a description of each.

[http://faculty.cs.niu.edu/~byrnes/csci360/notes/360pack.htm](http://faculty.cs.niu.edu/~byrnes/csci360/notes/360pack.htm)
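A minimal Python sketch of the format described above (illustrative only; real COBOL COMP-3 has further details such as alternative sign codes):

```python
def pack_decimal(n: int, length: int) -> bytes:
    """Encode an integer as IBM-style packed decimal: one digit per nybble,
    with the final nybble holding the sign (0xC = +, 0xD = -)."""
    sign = 0xC if n >= 0 else 0xD
    digits = str(abs(n)).rjust(length * 2 - 1, "0")  # all nybbles but the sign
    nybbles = [int(d) for d in digits] + [sign]
    return bytes((nybbles[i] << 4) | nybbles[i + 1]
                 for i in range(0, len(nybbles), 2))

def unpack_decimal(b: bytes) -> int:
    """Decode a packed-decimal byte string back to an integer."""
    nybbles = [x for byte in b for x in (byte >> 4, byte & 0xF)]
    value = int("".join(str(d) for d in nybbles[:-1]))
    return -value if nybbles[-1] == 0xD else value

print(pack_decimal(1234, 3).hex())              # 01234c
print(unpack_decimal(pack_decimal(-1234, 3)))   # -1234
```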

~~~
IshKebab
Interesting, but I assume those machines are dead?

~~~
jim_lawless
No, these instructions are still present in IBM's z/Architecture.

------
dmitriid
The main reason COBOL is still around is not math.

The main reason is that business and domain logic exists only as COBOL logic.
COBOL devs are ~60 on average, and many people who wrote system requirements
are probably already dead. Existing code runs millions or billions of dollars
worth of transactions often based on arcane financial rules and internal
regulations. Good luck untangling that without stopping the business and
without losing that functionality.

Relevant links:

\- [https://uk.reuters.com/article/uk-usa-banks-cobol-
idUKKBN17C...](https://uk.reuters.com/article/uk-usa-banks-cobol-
idUKKBN17C0DZ)

\- Reverse engineering a factory: <if someone can find a link to this
fascinating story, please help me :)>

------
Animats

        PIC 9(3)V9(15).
    

is really shorthand for

    
    
        PICTURE 999V999999999999999.
    

Fixed point numbers are declared like that.

COBOL is a decent language for business logic. Especially when money amounts
are involved. Certainly better than PHP.
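A PIC clause like that is essentially a scaled integer: 3 digits before an implied decimal point and 15 after, so the value can be stored as an integer scaled by 10^15. A rough Python sketch of the idea, for non-negative values (the helper names here are made up for illustration):

```python
from decimal import Decimal

SCALE = 10 ** 15  # PIC 9(3)V9(15): 15 implied fraction digits

def to_fixed(s: str) -> int:
    """Parse a decimal string into the scaled-integer representation."""
    return int(Decimal(s) * SCALE)

def fixed_to_str(n: int) -> str:
    """Render a scaled integer with the implied decimal point restored."""
    return f"{n // SCALE}.{n % SCALE:015d}"

x = to_fixed("1.01")
print(x)                 # 1010000000000000
print(fixed_to_str(x))   # 1.010000000000000
```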

~~~
pjc50
The "picture" syntax for records was an _excellent_ idea that should be tried
again in more modern languages.

------
mastax
C#/.NET has the built-in decimal type which is technically floating point, but
it has a 96-bit mantissa and a base-10 exponent which makes it behave
similarly to fixed point numbers.

~~~
wvenable
The same type existed in classic Visual Basic as well. It's part of the set of
basic COM types in Windows.

------
slaymaker1907
The statement about Decimal being an import is a garbage, misleading
statement. Decimal IS PART OF THE STANDARD LIBRARY!!!! It might not be in the
global context, but that is a far cry from having to install an external
library, which is what this article makes it seem like.

Also, performance-wise COBOL might be faster on a single machine, but good
luck trying to scale it out. Plus, a distributed architecture can make it
easier to deploy software and hardware upgrades, since you can generally take
out a few nodes without bringing down the whole system.

I’m not saying all this code should be rewritten. If you have some code out in
the wild that works, you should always weigh carefully the risks in terms of
cost and potential new bugs before doing a rewrite. However, lack of fixed
point is a lousy excuse to not upgrade.

~~~
gaius
_good like trying to scale it out_

COBOL demonstrably scales sufficiently to run the entire global economy, so no
luck required.

~~~
mmt
To be fair, the parent did specify scaling "out" and went on to claim other
benefits of a distributed architecture, which are not even unique to that
architecture [1].

Of course, that comment's definition of "single machine" ignores the
intrinsically distributed nature of mainframe architectures, on which one
might reasonably suppose COBOL would run.

[1] e.g. a "single machine" only needs a single replica for those benefits.
Sure, even that's a _form_ of distributed architecture, potentially subject to
at least some of the Fallacies Of Distributed Computing, but I'm confident
that's not what was meant.

~~~
gaius
Indeed. I am no fan of IBM the company due to the poor way they treat workers
they are laying off, and the way they keep winning big government contracts
despite their many many high profile failures is deeply suspicious. And their
cloud offerings are a joke.

But I won’t deny that Sysplex has been proven to work in the field.

------
tonysdg
Couldn't you at least port your COBOL to C? It's also statically typed,
compiled, there are high-performance libraries that support arbitrary levels
of numeric precision (GNU GMP), and every engineer trained in the last 20
years at least has some C experience.

~~~
gaius
_every engineer trained in the last 20 years at least has some C experience._

I think that’s a massive exaggeration. Most people these days think JavaScript
without a framework is “bare metal”.

~~~
chii
One should be hesitant to call those people engineers.

~~~
meddlepal
One should be hesitant to label computer programmers as "engineers".

Gatekeeping is fun.

~~~
skolemtotem
Firstly, it's not unjustified to say that someone who thinks vanilla
JavaScript is "bare metal" obviously doesn't know what they're talking about.
Secondly, as a software-engineer-in-training, I agree - software "engineering"
doesn't have much in common with other engineering, but I think that's a good
thing for software.

~~~
stevew20
Could you imagine if software engineers DID do rigorous engineering practices?

It would be mayhem! Projects would just NEVER FINISH.

~~~
chii
But when they do finish, it will stand the test of time!

------
ashton314
I'm curious to see how languages with support for rational types would handle
this problem. (E.g. Scheme, Clojure, Haskell) Seems to me that they would
eliminate the round-off problem entirely for a great number of commonly-faced
applications in the finance world. (So easy to use `x/100`.)
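For sums of exact cents the difference is easy to show with Python's `Fraction` (standing in for the Scheme/Clojure/Haskell rationals mentioned above): binary floats drift, rationals do not.

```python
from fractions import Fraction

cents = Fraction(1, 100)            # exactly one cent
print(sum([cents * 10] * 100))      # 10: a hundred dimes make exactly $10
print(sum([0.1] * 100) == 10.0)     # False: binary 0.1 is only approximate
```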

~~~
lispm
In Common Lisp:

    
    
      (defun r1 (y z)
        (- 108 (/ (- 815 (/ 1500 z)) y)))
    
      (defun r (n &aux (l '(4 425/100)))
        (loop for i from 2 upto (1+ n)
              do (setf l (append l
                                 (list (r1 (elt l (- i 1))
                                           (elt l (- i 2)))))))
        l)
    
      CL-USER 68 > (loop for e in (r 20) and  i below 20
                         do (format t
                                    "~%~3a~20,15f"
                                    i
                                    (float e 1.0d0)))
    
      0     4.000000000000000
      1     4.250000000000000
      2     4.470588235294118
      3     4.644736842105263
      4     4.770538243626063
      5     4.855700712589074
      6     4.910847499082793
      7     4.945537404123916
      8     4.966962581762701
      9     4.980045701355631
      10    4.987979448478392
      11    4.992770288062068
      12    4.995655891506634
      13    4.997391268381344
      14    4.998433943944817
      15    4.999060071970894
      16    4.999435937146839
      17    4.999661524103767
      18    4.999796900713418
      19    4.999878135477931

------
zengid
I'm doing an internship at a logistics company, and we're building projects in
.NET. They're keeping us away from the Cobol, but part of me really wants to
learn it. There are a lot of Senior Devs who only work in Cobol, and they're
getting close to retiring. I feel like it might be a good trade to be an
expert in Cobol and .NET, since most enterprises are going to have some of
both.

Any Cobol freelancers hanging out here? What are your job prospects like? Are
you filthy rich?

~~~
nathan_f77
I investigated this recently after I read another article about COBOL [1]. My
conclusion is that it's a much better idea to learn modern languages and
frameworks, and work with web/mobile applications, or machine learning. COBOL
does not pay very well for the average developer. I saw COBOL jobs advertised
with salaries around $50k to $80k. Another article [2] talked about how
"experienced COBOL programmers can earn more than $100 an hour" to fix up old
bank software. That's a reasonable hourly rate but nothing amazing. An
experienced web/mobile developer will earn more than that, and the work will
be much more enjoyable: modern language and tooling, modern software
development practices (e.g. test-driven development), plenty of libraries,
lots of answers on StackOverflow, etc.

[1] [https://qz.com/email/quartz-
obsession/1316505/](https://qz.com/email/quartz-obsession/1316505/)

[2] [https://www.reuters.com/article/us-usa-banks-cobol-
idUSKBN17...](https://www.reuters.com/article/us-usa-banks-cobol-
idUSKBN17C0D8)

~~~
guitarbill
This is the dirty secret, and I think it's largely because COBOL devs don't
create that much value. What I mean is, those systems are in place, have
largely worked for years, and only need a bug fix here and there. For those,
you can get some retired guy back as a contractor and it's still way cheaper
than a dedicated COBOL dev. I don't think I know of anywhere that is writing
big new projects in COBOL (I'd love to hear of counter-examples). Instead it
seems like they've found ways of bolting on extensions as part of some
middleware.

So what's interesting instead is: where are the rumours of a COBOL dev
shortage coming from?

------
le-mark
Good article, very informative. You used to hear the "can't do the math"
excuse a lot around y2k with respect to migrating these systems to something
(anything?) else. You don't hear as much nowadays. I think a lot of that
boiled down to properly handling rounding, which is non-trivial, as other
posters have mentioned.

A lot of commenters here are missing the point about decimal library support.
The point is not that language x does or doesn't have some sort of support for
decimal math, the point is it's not _native support_. Even if language x
supports a decimal type, back by a byte[] (for example) there is still
function/method call overhead for basic operations (+-/*). For high volume
stuff she's talking about, it adds up, fast.

I think there are a few languages with similar native decimal support; Ada
and PL/I, IIRC.

Cobol is an albatross that's going to be with us for a loooong time to come.

~~~
Coding_Cat
>there is still function/method call overhead for basic operations (+-/*). For
high volume stuff she's talking about, it adds up, fast.

With the right style of C++ programming (trait/templated focused instead of
inheritance) or Rust's static-dispatch-by-default it should be no more of a
performance impact than 'native' Cobol calls after the compiler optimizes it.

------
kwccoin
Confusion about many of the comments.

COBOL can be used for online systems, and performance-wise even a small
mainframe with 16 MB (M, not G) of memory (PC on P5!) can support 1000 users
easily. Whilst it is totally dated, nobody supports it and IBM charges a lot.
None of that is inherent to COBOL, or to CICS or IMS...

For the floating point part... they don't use it. Most of the attention goes
to handling, and agreeing upon, exact calculation, including remainders. No
rounding per se in the system. Nothing lost. Not even one cent. Hence, no
floating point...

And changing it to C, Ada etc.: COBOL is an English-like programming
language. The hard part is to translate and test, and to explain that decimal
is used as an exact number type.

... lots of projects have successfully migrated. But unless you get that
right (COBOL is not slow, and it is exact numeric computation combined with
in-depth business know-how), good luck.

------
sampo
# Short version

Muller’s Recurrence is a mathematical problem that will converge to 5 only
with exact arithmetic. This has nothing to do with COBOL, and nothing to do
with floating versus fixed point arithmetic as such. The more precision your
arithmetic has, the closer you get to 5 before departing and eventually
converging to 100. Python's Decimal package has 28 decimal digits of
precision by default, whereas normal 64-bit floating point has about 16
decimal digits. If you increase your precision, you can linger longer near 5,
but eventually you will diverge and then converge to 100.

# Long version

What's going on here is that someone has tried to solve the roots of the
polynomial

    
    
        x^3 - 108 x^2 + 815 x - 1500,
        

which is equal to

    
    
        (x - 3)(x - 5)(x - 100).
        

So the roots are 3, 5 and 100. We can derive a two-point iteration method by

    
    
        x^3 = 108 x^2 - 815 x + 1500
        x^2 = 108 x - 815 + 1500/z
        x   = 108 - (815 - 1500/z)/y
    

where y = x_{n-1} and z = x_{n-2}. But at this point, we don't know yet
whether this method will converge, and if yes, to which roots.

This iteration method can be seen as a map F from R^2 to R^2:

    
    
        F(y,z) = (108 - (815 - 1500/z)/y, y).
        

The roots of the polynomial are 3,5 and 100, so we know that this map F has
fixed points (3,3), (5,5) and (100,100). Looking at the derivative of F
(meaning the Jacobian matrix) we can see that the eigenvalues of the Jacobian
at the fixed points are 100/3 and 5/3, 20 and 3/5, 1/20 and 3/100.

So (3,3) is a repulsive fixed point (both eigenvalues > 1), any small
deviation from this fixed point will be amplified when the map F is applied
iteratively. (100,100) is an attracting fixed point (both eigenvalues < 1).
And (5,5) has one eigenvalue much larger than 1, and one slightly less than 1.
So this fixed point is attracting only when approached from a specific
direction.

Kahan [1, page 3] outlines a method to find sequences that converge to 5. We
can choose beta and gamma freely in his method (Kahan has different values for
the coefficients of the polynomial, though) and with lots of algebra (took me
2 pages with pen and paper) we can eliminate the beta and gamma and get to the
bottom of it. What it comes down to, is that for any 3 < z < 5, choose y = 8 -
15/z, and this pair z,y will start a sequence that converges to 5. But only if
you have precise arithmetics with no rounding errors.

For the big picture, we have this map F, you can try to plot a 2D vector field
of F or rather F(x,y) - (x,y) to see the steps. Almost any point in the space
will start a trajectory that will converge to (100,100), except (3,3) and
(5,5) are stable points themselves, and then there is this peculiar small
segment of a curve from (3,3) to (5,5), if we start exactly on that curve and
use exact arithmetics, we converge to (5,5).

Now that we understand the mathematics, we can conclude:

Any iteration with only finite precision will, at every step, accrue rounding
errors and step by step end up further and further away from the mathematical
curve, inevitably leading to finally converging to (100,100). Using higher
precision arithmetics, we can initially get the semblance of closing in to
(5,5), but eventually we will reach the limit of our precision, and further
steps will take us away from (5,5) and then converge to (100,100).

The blog post is maybe a little misleading. This has nothing to do with COBOL
and nothing to do with fixed point arithmetics. It just happens that by
default Python's Decimal package has more precision (28 decimal places) than
64bit floating point (53 binary places, so around 16 decimals). Any iteration,
any finite precision no matter how much, run it long enough and it will
eventually diverge away from 5 and then converge to 100.

Specifically, if you were to choose floating point arithmetic that uses higher
precision than the fixed point arithmetic, then the floating point would
"outperform" the fixed point, in the sense of closing in nearer to 5 before
going astray.
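The analysis above can be checked directly in Python: the same recurrence converges to 100 under binary floating point but stays near 5 under exact rational arithmetic (a sketch, using the starting pair 4, 4.25 from the Common Lisp example elsewhere in this thread).

```python
from fractions import Fraction

def muller(x0, x1, n):
    """Iterate x_k = 108 - (815 - 1500/x_{k-2}) / x_{k-1} and return x_n."""
    z, y = x0, x1  # z = x_{k-2}, y = x_{k-1}
    for _ in range(n - 1):
        y, z = 108 - (815 - 1500 / z) / y, y
    return y

print(muller(4.0, 4.25, 40))                            # converges to 100.0
print(float(muller(Fraction(4), Fraction(17, 4), 40)))  # still close to 5
```

With `Fraction`, every step is exact (at the cost of ever-growing numerators and denominators), so the sequence follows the mathematical curve toward 5; the floats fall off it and get swept to the attracting fixed point at 100.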

[1]
[https://people.eecs.berkeley.edu/~wkahan/Math128/M128Bsoln09...](https://people.eecs.berkeley.edu/~wkahan/Math128/M128Bsoln09Feb04.pdf)

~~~
sampo
In the light of the mathematics above, it is really quite artificial to
declare that 5 would be the correct answer. This is not an optimization
problem, but with some misuse of language I am tempted to say that 100 is the
global optimum, and only by balancing on a knife's edge is it even possible to
approach the local optimum of 5. And 3 is the tip of a needle.

So we could just say that arithmetic with less precision is better here,
because it reaches the global optimum quicker? Whereas higher-precision
arithmetic "lingers" in the "orbit" of 5 a little longer.

------
sanxiyn
Delphi has Currency type, and I heard that it is largely responsible for
Delphi's hold on financial software.

------
codeisawesome
Wow. I didn't even finish reading the whole post, but just the beginning gave
me a much better functioning intuition about how floating point works! Thank
you, Marianne!

------
incompatible
If it's decimal currency, why not process amounts as integer cents?

~~~
gaius
_why not process amounts as integer cents?_

Trading is actually done in “basis points” or “pips” which are 1/100 of a cent

~~~
incompatible
Fine, integer centicents or microcents or whatever is required for the
application at hand.

~~~
gaius
Right, my point being that some of this stuff is non-obvious and there is
often a reason “why not”

------
tzs
I recently went through our Perl, Python, PHP, and JavaScript code making sure
sales tax and VAT calculations were right, particularly when the sale amount
and tax rate were both in floating point (damn legacy code...). During the
course
of this, I found some good test cases. They are given below.

Let R = the tax rate x 10000, in a jurisdiction where the tax rate is an
integral multiple of 0.0001. Note that R is an integer.

Let A = the sale amount x 100, in a jurisdiction where prices are an integral
multiple of 0.01. I.e., in the US, A is the sale amount in cents. Note that A
is an integer.

Let T = the tax * 100, or in US terms, the tax in cents. Note that T is an
integer.

If you can arrange to keep your amounts in cents and your rates in the x 10000
form (or whatever is appropriate for the jurisdiction), then you only need
integers and things are simple:

    
    
      def tax(A, R):
        T = (A * R + 5000)//10000
        return T
    

You probably have to go from cents to dollars somewhere, such as when
informing the user of prices, taxes, and totals. I believe that integer/100,
rounded to 2 places in all cases and printed in all of the above languages
will be correct, but I kind of cheated and for display results I treated it as
a string manipulation problem, not a number problem (which also takes care of
making sure amounts less than $1 have a leading 0, and that multiples of 0.1
have a trailing zero) [1].

If you don't have the amount and rate in the nice integer forms above, but
rather have them in floating point such as you get from parsing a string like
12.34 (for a price of $12.34) or 0.086 (for a tax of 8.6%), here are three
functions to return the tax in cents that might seem reasonable, and you might
think are properly handling rounding:

    
    
      def tax_f1(amt, rate):
        tax = round(amt * rate,2)
        return round(tax * 100)
      
      def tax_f2(amt, rate):
        return round(amt*rate*100)
      
      def tax_f3(amt, rate):
        return round(amt*rate*100+.5)
    

Alas, they are all flawed.

    
    
        input        f1  f2  f3
      ------------- --- --- ---
       1% of $21.50  21  22  22 ( 22 is right)
       3% of $21.50  65  64  65 ( 65 is right)
       6% of $21.50 129 129 130 (129 is right)
      10% of $21.15 211 211 212 (212 is right)
    

It does work to convert from floating point to the x 100 and x 10000 integer
form, and then use the integer function given earlier:

    
    
      def tax_f4(amt, rate):
        amt = round(amt * 100)
        rate = round(rate * 10000)
        tax = (amt * rate + 5000)//10000
        return tax
    
      def tax_f5(amt, rate):
        amt = int(amt * 100 + .5)
        rate = int(rate * 10000 + .5)
        tax = (amt * rate + 5000)//10000
        return tax
    

Both of those are right in the test cases above, and I believe in all other
cases (well, all other cases where everything is positive...). For Python I've
done brute force testing of all combinations of amount = 0.01 to 25.00 in
steps of 0.01 and rate = 0.0001 to 1.0000 in steps of 0.0001 to verify that.

I've also done a brute force C test that involved sscanf(..., "%lf",...) of
strings of the form "0.ddd...ddd", where the 'd' are decimal digits and there
are up to 9 of them. In all cases, multiplying the resulting double by 10^k,
where k is the number of digits after the decimal point, and calling round()
on that gave the correct integer. Assuming that Python, PHP, etc. use IEEE
754 when they do floating point, the results should be the same in all of
those, which is why I believe that tax_f4 and tax_f5 should work for all
cases, not just the ones I actually tested in the Python brute force test.

I did another C test, over the same range as the sscanf test, to verify that
given an integer I in the range [0, 10^k], for positive k up to 9, if you
compute (double)I/10^k, then multiply that by 10^k and round(), you get back
I.
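The same round-trip property is easy to check in Python, which uses the same IEEE 754 doubles (a sketch of the test just described, with the ranges trimmed so it runs quickly):

```python
# For each k, I/10^k stored as a double, scaled back up and rounded,
# should recover I exactly.
for k in range(1, 10):
    scale = 10 ** k
    for i in range(0, min(scale, 10_000) + 1):
        assert round((i / scale) * scale) == i
```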

My conclusions (assuming IEEE 754 or something that behaves similarly):

1\. It is OK to store money values and tax rates in floating point, at least
as long as you have 9 or fewer digits after the decimal point. Just avoid
doing calculations in this form.

2\. Converting from a floating point representation to an integer x 10^k
representation by doing a floating point multiply by 10^k and a round to
nearest integer works, at least as long as k <= 9.

3\. sscanf '%lf' (and, I expect, most reasonable string-to-float parsers),
applied to a decimal string of the form 0.ddd... with up to 9 digits after
the decimal point, will work as expected, in the sense that it will give you
a floating point number that, when converted to integer x 10^k representation
as described in #2, yields the integer you expect and want.

4\. I did not do any tests of floating point amounts that had large integer
parts. With a large enough integer part, the places where I mention k <= 9
above might need to have that 9 lowered.

[1] e.g., in PHP:

    
    
      function cents_to_dollars($cents)
      {
        // Left-pad to at least three digits, so e.g. 5 becomes "005" -> "0.05".
        $cents = strval($cents);
        while (strlen($cents) < 3)
          $cents = '0' . $cents;
        // Insert the decimal point before the last two digits.
        return substr($cents,0,strlen($cents)-2) . '.' . substr($cents,-2);
      }
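The same conversion is a one-liner in Python using integer division and modulo (a sketch; non-negative cents assumed):

```python
def cents_to_dollars(cents):
    # 1234 -> "12.34", 5 -> "0.05"; %02d zero-pads the cents part
    return f"{cents // 100}.{cents % 100:02d}"
```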

------
dwheeler
Ada also has decimal fixed point built in, without the overhead the author is
worried about. But I agree with the author that built-in, easy, and
_efficient_ support of decimal arithmetic is not so common.

------
wedesoft
Many languages have rational numbers (fractions) and big integers in their
standard libraries. This has the advantage that one has full control over
where and when to trade off performance against accuracy.
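For example, Python's standard `fractions` module keeps the arithmetic exact until the one deliberate rounding step at the end (a small sketch):

```python
from fractions import Fraction

# 3% of $21.50, kept exact as rationals, then rounded to whole cents.
tax_cents = Fraction("21.50") * Fraction("0.03") * 100  # exactly 129/2
cents = int(tax_cents + Fraction(1, 2))                 # round half up
print(cents)  # 65
```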

------
stevew20
You really should write a new opening line, Marianne; fractions are legit, and
tons of people like them. Like most people who can do simple division without
a calculator... Starting off with "no one likes fractions" is not a way to
make friends or impress people.

------
nitwit005
It's not exactly difficult to just write your own fixed-point library in Java.
I'm sure it'd be trickier to get exactly the same behavior as the old code,
but it still doesn't seem like it should be a huge time sink on a large
project.

------
eyphka
Have found similar to be true in my experience with COBOL, and older banking
systems that rely on COBOL.

------
krick
Does Rust have solid support for fixed-point decimals, BTW? Would be quite a
selling point, I suppose.

~~~
Thiez
Not really, although I imagine this might become possible once 'const
generics' stabilizes. Then you'd be able to define a generic type with a
certain precision (of course you can do this today but you have to define a
different type for different numbers of decimals). Overloading the arithmetic
operators can be done. The only problem I see would be not having syntactic
sugar for defining such a number, unlike integers and doubles.

~~~
krick
Actually, I have absolutely no idea how it is represented internally.
Shamefully, I'm so used to thinking that there are 8/16/32/64-bit
signed/unsigned ints, IEEE 754 floats, and a bunch of other "basic" stuff,
and that pretty much anything more sophisticated is done with those plus some
union/record/ADT/pointer magic on top, that I'm entirely unable to imagine
how introducing another fundamental data type would work. Would Rust's
compiler backend ever be able to make it as "native" and efficient as floats?
Would your typical Intel/AMD/Qualcomm processor even know how to deal with
this stuff, or would it turn out as "efficient" as manual string-based
arbitrary-precision calculations? And, regardless, is there any difference
between how Rust compiler technology could represent it and how gcc could?

------
crb002
Time for COBOL to get its Elixir.

------
dmead
Marianne, I'd like to chat with you about this article, not in public. What's
the easiest way to do that?

------
flossball
Isn't this more that floating point is not the solution for everything, and
most 'computer scientists' have little clue? Posits for the win (within a
larger range of win, at least)!

