Hacker News new | past | comments | ask | show | jobs | submit login
Decimal to fraction (leancrew.com)
91 points by ingve on Aug 15, 2023 | hide | past | favorite | 47 comments



Here's a good algorithm: https://web.archive.org/web/20111027100847/http://homepage.s...

Algorithm To Convert A Decimal To A Fraction by John Kennedy Mathematics Department Santa Monica College 1900 Pico Blvd. Santa Monica, CA 90405 http://homepage.smc.edu/kennedy_john/DEC2FRAC.PDF


Why would you do this instead of just converting the digits to a fraction over a power of 10 and then reducing the fraction (or power of 2, if it's a floating point datatype)? I was thinking it was faster, but the recursive procedure involves a division step, so I would assume that calculating GCD using the binary algorithm (which is just uses shifts and subtraction) would be faster? I guess this is if your numeric data-type can't fit the size of the denominator?


If you assume/know the decimal you got is the result of rounding some computation, and are more interested in a short description than in the most accurate one.

For example 0.142857 would be 142,857/1,000,000 in your algorithm, but ⅐ in one using best rational approximations (which I assume that paper does, as they’re easy to calculate and, according to a fairly intuitive metric, best)


Common Lisp gives you this choice (this being HN, a Lisp mention is obligatory :-)

The CL function #'rational converts a float to a rational assuming the float is completely accurate (i.e. every decimal digit after the last internally-represented one is assumed to be 0), while #'rationalize assumes the float is only accurate to the limit of float precision (i.e. the decimal digits after the last internally-represented one can be anything that would round to the last one).

Both functions preserve the invariant that

  (float (rational x) x) ==  x
and

  (float (rationalize x) x) ==  x
...which is another way of saying they don't lose information.

In practice these tend to be not very useful to me because they don't provide the ability to specify which decimal digit should be considered the last one: They take this parameter from the underlying float representation, which sometimes causes unexpected results (the denominator can end up being larger than you expected) because the input number can effectively have zeros added at the end if the underlying float representation is large enough to capture more bits than you specified when you typed in the number.

In addition, what I really need much of the time is the ability to limit the size of the resulting denominator, like "give me the best rational equivalent of 0.142857 with at most a 3-digit denominator". That function is not built in to Common Lisp but one can write it using the methods in TFA. It loses information of course so a round trip from float->rational->float won't necessarily produce the same result.


> assuming the float is completely accurate (i.e. every decimal digit after the last internally-represented one is assumed to be 0)

Careful though - the float is probably not actually a set of decimal digits but most likely binary ones, so it would be assuming that every binary digit after the last one is zero.

Just because you wrote ‘0.1’ in your source code that doesn’t mean you only have a single significant figure in the float in memory. It’s going to be 0.0001100110011… (repeating 0011 to the extent of your float representation’s precision).

Although the Common Lisp language doesn’t actually appear to require that the internal float radix be 2 - a floating point decimal type would be valid implementation.


> if the underlying float representation is large enough to capture more bits than you specified when you typed in the number.

If you typed in `0.142857` (or however IEEE floats are syntaxed), the correct rational is 142857/1000000. If you want to rationalize relative to number of decimal significant digits, that should be something like `(rationalize "0.142857")` or `(rationalize (radix-mp-float 10 '(1 4 2 8 5 7) -1))`.


> correct rational is 142857/1000000

That would be correct if the underlying representation were decimal. If it's binary, as it is in most Common Lisps,

  (rational 0.142857) --> 9586971/67108864
because the underlying representation is binary, and that denominator is 0x4000000.

My original comment about the representation being large enough to capture more bits than you specified was wrong; the unexpected behavior comes from the internal binary representation and the fact that 9586971/67108864 cannot be reduced.

If you start with integers you get

  (/ 142857 1000000) --> 142857/1000000
so you could write a version of #'rational that gives the expected base-10 result, but you'd first have to convert the float to an integer by multiplying by a power of 10.



> Common Lisp gives you this choice (this being HN, a Lisp mention is obligatory :-)

Yes, we should probably implement a rule. If a post has 100+ comments, it cannot remain on front page unless at least one that mentions Lisp.

dang?


Leave this meta shit on reddit please.

The SNR of this place isn't what it used to be to start with, let's not go out of our way to ADD EVEN MORE noise.


nice language


You might want a simpler representation (smaller denominator) than that will give you.

An example is converting musical pitch ratios to simple fractions (e.g. as used in just intonation). Literally yesterday I was writing code to do this. The mediant method works beautifully for this, and it can be optimized further than what's presented in the article.


> can be optimized further than what's presented in the article

Fascinating, do you have any links about this? Or more generally about the intersection of music, fractions and programming?

I found some code in Audacity that detects the notes in audio, I'll see if I can edit this comment later.


The basic idea to optimize the algorithm (which I got from this [1] SO answer) is to recognize that you only ever update the numerator and denominator of either bound in a linear fashion. So rather than potentially repeatedly update a given bound, you alternate between the two, and use algebra to determine the multiple by which the bound should be updated.

Regarding the intersection of fractions and music, "just intonation" is the term you want to research.

[1] https://stackoverflow.com/a/45314258


How would you get to 355/113 from pi using your method?


I don't think it's that good. At the very least I implemented it and got different answers for the test case of 0.263157894737 = 5/19 depending on the choice of numerical representation.

float64 ended up at 87714233939/333314088968 while decimal and fraction ended up at 87719298244/333333333327.

Here is the code.

    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument("Q", nargs="?", default="0.263157894737")
    parser.add_argument("--method", choices = ("float", "decimal", "fraction"), default="float")
    args = parser.parse_args()

    if args.method == "float":
        print(f"Using float64 for {args.Q!r}")
        one = 1.0
        Q = float(args.Q)
        int2float = float
    elif args.method == "decimal":
        print(f"Using decimal for {args.Q!r}")
        import decimal
        one = decimal.Decimal(1)
        Q = decimal.Decimal(args.Q)
        int2float = decimal.Decimal
    elif args.method == "fraction":
        print(f"Using fractions for {args.Q!r}")
        import fractions
        one = fractions.Fraction(1)
        Q = fractions.Fraction(args.Q)
        int2float = fractions.Fraction

    def to_frac(X):
        Da = 0
        Db = 1
        Za = X
        while 1:
            Zb = one / (Za - int(Za))
            Dc = Db * int(Zb) + Da
            N = round(X * Dc)
            frac = N / int2float(Dc)
            print(f"  {N}/{Dc} = {frac} (diff: {X-frac})")
            if float(N) / float(Dc) == float(X):
                return (N, Dc)
            Da, Db = Db, Dc
            Za = Zb

    print("solution:", to_frac(Q))
Here's the output for 0.263157894737

  % python frac.py 0.263157894737 --method float
  Using float64 for '0.263157894737'
    1/3 = 0.3333333333333333 (diff: -0.0701754385963333)
    1/4 = 0.25 (diff: 0.01315789473700002)
    4/15 = 0.26666666666666666 (diff: -0.003508771929666643)
    5/19 = 0.2631578947368421 (diff: 1.5792922525292852e-13)
    87714233939/333314088968 = 0.263157894737 (diff: 0.0)
  solution: (87714233939, 333314088968)

  % python frac.py 0.263157894737 --method decimal
  Using decimal for '0.263157894737'
    1/3 = 0.3333333333333333333333333333 (diff: -0.0701754385963333333333333333)
    1/4 = 0.25 (diff: 0.013157894737)
    4/15 = 0.2666666666666666666666666667 (diff: -0.0035087719296666666666666667)
    5/19 = 0.2631578947368421052631578947 (diff: 1.578947368421053E-13)
    87719298244/333333333327 = 0.2631578947370000000000030000 (diff: -3.0000E-24)
  solution: (87719298244, 333333333327)

  % python frac.py 0.263157894737 --method fraction
  Using fractions for '0.263157894737'
    1/3 = 1/3 (diff: -210526315789/3000000000000)
    1/4 = 1/4 (diff: 13157894737/1000000000000)
    4/15 = 4/15 (diff: -10526315789/3000000000000)
    5/19 = 5/19 (diff: 3/19000000000000)
    87719298244/333333333327 = 87719298244/333333333327 (diff: -1/333333333327000000000000)
  solution: (87719298244, 333333333327)

For the pi = 3.14159265359 test case the solutions are all 226883371/72219220 . The sequence diverges at the point marked with "<---- here".

  Using float64
    22/7 = 3.142857142857143 (diff: -0.0012644892671427321)
    333/106 = 3.141509433962264 (diff: 8.321962773605307e-05)
    355/113 = 3.1415929203539825 (diff: -2.667639824593948e-07)
    103993/33102 = 3.1415926530119025 (diff: 5.780975698144175e-10)
    104348/33215 = 3.141592653921421 (diff: -3.3142111277584263e-10)
    208341/66317 = 3.1415926534674368 (diff: 1.2256329284809908e-10)
    312689/99532 = 3.1415926536189365 (diff: -2.893640882462023e-11)
    833719/265381 = 3.141592653581078 (diff: 8.922196315097608e-12)
    1146408/364913 = 3.141592653591404 (diff: -1.403765992336048e-12)
    5419351/1725033 = 3.1415926535898153 (diff: 1.8474111129762605e-13) <---- here
    6565759/2089946 = 3.141592653590093 (diff: -9.281464485866309e-14)
    11985110/3814979 = 3.141592653589967 (diff: 3.2862601528904634e-14)
    18550869/5904925 = 3.1415926535900116 (diff: -1.1546319456101628e-14)
    30535979/9719904 = 3.1415926535899943 (diff: 5.773159728050814e-15)
    49086848/15624829 = 3.141592653590001 (diff: -8.881784197001252e-16)
    226883371/72219220 = 3.14159265359 (diff: 0.0)
  solution: (226883371, 72219220)
For 0.10000000000000002 it gives 562949953421313/5629499534213129 (for float64) or 500000000000000/4999999999999999 (using fractions):

  Using float64 for '0.10000000000000002'
    1/9 = 0.1111111111111111 (diff: -0.011111111111111086)
    1/10 = 0.1 (diff: 1.3877787807814457e-17)
    562949953421313/5629499534213129 = 0.10000000000000002 (diff: 0.0)
  solution: (562949953421313, 5629499534213129)

  Using fractions for '0.10000000000000002'
    1/9 = 1/9 (diff: -4999999999999991/450000000000000000)
    1/10 = 1/10 (diff: 1/50000000000000000)
    500000000000000/4999999999999999 = 500000000000000/4999999999999999 (diff: -1/249999999999999950000000000000000)
  solution: (500000000000000, 4999999999999999)
FWIW, the exact solution is:

  >>> import fractions
  >>> fractions.Fraction("0.10000000000000002")
  Fraction(5000000000000001, 50000000000000000)


The pascal code cuts when it hits a certain accuracy (passed in as an argument).


The main issue I see is that the algorithm - unlike the mediant version - does not generate maximally successive accurate approximations.

Yes, you can stop the algorithm at a certain accuracy, but that doesn't mean you can't get better for that given accuracy.

Consider the pi = 3.14159265359 case, and you want it precise to 1/1,000,000. The float64 algorithm gives 1146408/364913 or 5419351/1725033 because:

     ...
    1146408/364913 = 3.141592653591404 (diff: -1.403765992336048e-12)
      --- want a solution here ---
    5419351/1725033 = 3.1415926535898153 (diff: 1.8474111129762605e-13)
     ...
The mediant method, on the other hand, gives an intermediate solution:

  >>> import fractions
  >>> fractions.Fraction("3.14159265359").limit_denominator(1000000)
  Fraction(3126535, 995207)
  >>> float(fractions.Fraction("3.14159265359").limit_denominator(1000000))
  3.1415926535886505
That's a difference of -1.3495871087343403e-12 which is more accurate than 1146408/364913, and is not a solution found by the other algorithm.

Or, if you want a denominator of 364913 or 1725033 you can do that with mediants:

  >>> fractions.Fraction("3.14159265359").limit_denominator(364913)
  Fraction(1146408, 364913)
  >>> fractions.Fraction(3.14159265359).limit_denominator(1725033)
  Fraction(5419351, 1725033)
Another issue is the numerical range. Consider the input "1.5e-318". It causes overflow in the float, decimal, and fraction implementations I gave:

  % python frac.py 1.5e-318 --method float
  Using float64 for '1.5e-318'
  Traceback (most recent call last):
    File "frac.py", line 40, in <module>
      print("solution:", to_frac(Q))
    File "tmp.py", line 31, in to_frac
      Dc = Db * int(Zb) + Da
  OverflowError: cannot convert float infinity to integer


  % python frac.py 1.5e-318 --method fraction
  Using fractions for '1.5e-318'
   1/666666666... many 6s removed ...6666 = 1/666666666... many 6s removed ...6666
      (diff: -1/666 .. more 6s removed ... 6600.. even more zeros removed ...00)
  Traceback (most recent call last):
    File "frac.py", line 40, in <module>
      print("solution:", to_frac(Q))
    File "frac.py", line 35, in to_frac
      if float(N) / float(Dc) == float(X):
  OverflowError: int too large to convert to float
while with the mediant solution I don't need to worry about the input range beyond infinity and NaN:

  >>> from fractions import Fraction
  >>> Fraction(1.5e-318).limit_denominator(1000000)
  Fraction(0, 1)
(I used the float() calls to ensure the fraction and decimal methods stop at the limits of float64. If I remove them I still end up with numbers like 3/2E319 which would not be representable as a Turbo Pascal integer, while the Turbo Pascal mediant implementation would not have a problem.)

Finally, the mediant solution is easier to implement.


I had a similar problem a few years ago, when I was making gears (and gear shaped objects) in a machine shop on old gear driven machines. We had most, but not all, of the change gears, and when making Worm or Bevel Gears, you'd often have a ratio, and have to work backwards to figure out a set of gears to get as close as possible to that ratio.

I wrote the simplest possible pascal/windows GUI application[1] to simply try all the possible gear combinations, and keep the lowest. I was shocked when it returned an answer in less than a second. I thought for sure I was going to be digging into algorithms like this, but nope... cheap compute saved me. I'd quickly get the gears, and the error in parts per million.

I do feel myself getting Nerd Sniped though... must resist.

[1] https://github.com/mikewarot/GearFinder


The GitHub link you posted appears to 404.


Oops... didn't realize it was private, changed to Public, made GPL 3.0 licensed.

and added a screen shot with some directions

and created my first ever "release" on github


I have a background in linguistics (computational) and part of what was mind blowing when my schooling approached the deep dive on some aspects of it was the sheer extent to which much of what we put into language is artificially constrained by the limits of any given language being unable to fully express an infinite range of concepts/thoughts/etc… that some things are difficult to put into words because there are no words for every single gradation of a a phenomenon (colors, with their inherent connection to gradients, come to mind). Maybe this seems trivial, I kind of had an intuitive “of course” understanding of it before deep dive work in linguistics, but that’s when it really became abundantly clear how limited we are by lexical entries.

All of that is a preamble to say that it seems like the same thing is true of mathematics. No single number system can completely encompass all of mathematical expressions. This post sort of touches on it with the varying methods of expressing something in decimal vs fractional notation. 1/3 cannot be fully expressed (well, expanded) in decimal notation but it is trivial in fractional notation. More esoteric examples exist when entering the realm of hyperreal numbers.

It is both amazing and daunting to confront these limits, which may be a product of the limitations of the human mind.(or not, really I don’t know)

And maybe I’m just rambling… I don’t have a strong background in number theory… it is all just so beautiful and humbling though.


For the continued-fraction convergents, instead of Mathematica, here's a web page of mine that does the same thing: https://shreevatsa.net/fraction/best (try it with the 3.213432 from the post, and a large limit for the denominator).

Also, the mediants approach mentioned in the post is not unrelated to continued fractions; the latter can be seen as an "accelerated" version of the mediants (specifically, counting how many times to take certain mediants, and directly getting there), as explained in this great article "Continued fractions without tears": https://www.maa.org/programs/maa-awards/writing-awards/conti...


That article is a truly marvelous article and very worth reading for anyone interested in either continued fractions or rational approximations.

For those who are curious, a simple algorithm for generating these good approximations and continued fractions is to take the mediant of a Farey pair that enclose the given number, such as [3/1, 4/1], here 7/2, and then choose the sub-interval that contains the desired number. Repeat as desired. This first step has [3/1, 7/2] as the next interval to look at as 3/1 < 3.213432 < 7/2. When doing this to generate the continued fraction representation, start with the nominal interval [0/1, 1/0]. A few steps: 0/1 : 1/1 : 1/0 R 1/1 : 2/1 : 1/0 R 2/1 : 3/1 : 1/0 R 3/1 : 4/1 : 1/0 L 3/1 : 7/2 : 4/1 L 3/1 : 10/3 : 7/2 L 3/1 : 13/4 : 10/3 L 3/1 : 16/5 : 13/4 R 16/5 : 29/9 : 13/4 L 16/5 : 45/14 :29/9 L 16/5 : 61/19 : 45/14 R 61/19 : 106/33 : 45/14 R ... The number 106/33 is 3.2121... and is generated by easy computation of the mediants.

Each time the direction of selection is changed, this results in having an ultra-best approximation. If we count the number of steps in between the switches, we get the continued fraction representation: [3; 4, 1, 2, ...].

This is also related to the Stern-Brocot tree, for those interested.

Playing around with these ideas led me to the idea of defining a real number as a number that, given a rational interval, says Yes or No depending on whether it ought to be in that interval. To define the space of such objects, I used properties that the Yes/No answers have to satisfy. The key property is that if you split a Yes interval into two pieces, one of the pieces should be Yes and the other No (unless the splitting point is the real number).

If curious for details, my paper repository on this matter is at https://github.com/jostylr/Reals-as-Oracles where I not only prove that it all works out as it should, but also give an example of doing continued fraction arithmetic in section 7.9 of the main paper for those who just love continued fractions.


He points out that the YouTube video "uses mediants" to compute rational approximations to real numbers. He then says that the continued fraction expansion is better. However, the mediant-based method is totally identical to the continued fraction expansion method. The two are related via the Stern-Brocot tree (https://en.m.wikipedia.org/wiki/Stern%E2%80%93Brocot_tree).

The fractions that one gets from iterating the mediant process are exactly the "semiconvergents" one gets from the continued fraction process. Truncating the continued fraction gives the "convergents"; these are special mediants which are right before switching direction as one zig zags on the tree.


Pretty good Mathologer video about continued fractions:

https://www.youtube.com/watch?v=CaasbfdJdJg

One neat trivium I like to quiz the mathematically-minded on is: what number is the most irrational, i.e., toward which number do rational approximations converge the slowest?

The answer can be determined from continued fractions and is discussed in the video near the end.


Also got me thinking about what fraction is most "pi efficient"? That is, what fraction is closest to pi compared to how many digits you must use to represent the fraction (or perhaps only digits in the numerator or denominator)?


Seems they answered that question in TFA.


Using continued fractions in Mathematica. Maybe there's enough information in the article to say that this is enough to have answered the question. I don't know enough about all of the methods to say, personally. Also, I never outlined what was the best metric to evaluate how well accuracy compares to the number of digits needed.


As others have said, the convergents of the simple continued fraction representation give the best rational approximation [1] relative to the size of the fraction, in the strongest relevant metric: that of absolute error bound. Proofs can be found in any book on Diophantine Approximation; the classic (and readable) reference is Khinchin's Continued Fractions book.

[1] https://en.wikipedia.org/wiki/Continued_fraction#Best_ration...


The answer is already there. If you extend the sequence given in the article longer, the ratio of the number of digits in the rational representation to the decimal representation gets worse, not better, so you know you reached the inflection point early.


If you don't want to do this yourself, use Python's built-in fractions module. Fractions have a built-in method to compute the best fraction given a maximum denominator:

  >>> from fractions import Fraction
  >>> x = Fraction("3.213432")
  >>> int(x)
  3
  >>> (x % 1).limit_denominator(9999)
  Fraction(375, 1757)
The mediant-based implementation is at https://github.com/python/cpython/blob/607f18c89456cdc9064e2...


Often it will be preferable to limit the error instead of the denominator. "Find the fraction with the smallest denominator that's within 0.005 of 0.33." Does python support this as well?


Not directly that I know of, but that's definitely a straight-forward mediant calculation. I'll do it all in integer ratios and assume the answer is somewhere between 0.0 and +inf, exclusive:

    from fractions import Fraction

    mid = Fraction("0.33")
    delta = mid * Fraction("0.005")
    lo = (mid - delta)
    lo_p, lo_q = lo.as_integer_ratio()
    hi = (mid + delta)
    hi_p, hi_q = hi.as_integer_ratio()

    p0, q0, p1, q1 = 0, 1, 1, 0
    while True:
        p = p0 + p1
        q = q0 + q1

        if p * lo_q < lo_p * q: # p/q < lo_p/lo_q
            p0, q0 = p, q
            continue

        if p * hi_q > q * hi_p: # p/q > hi_p/hi_q
            p1, q1 = p, q
            continue
        
        break # in range

    frac = Fraction(p, q)
    print(f"Found: {p}/{q}")
    print(f"{lo} <= {frac} <= {hi}? {lo <= frac <= hi}")
For your test case:

  Found: 22/67
  6567/20000 <= 22/67 <= 6633/20000? True
A grep of the CPython distribution found no use of the word "mediant".


For anyone else who was puzzled at this result (rather than 1/3): The above algorithm is finding something with the relative error of at most .005 times the input value 0.33. 1/3 is the best value for the absolute error of at most .005.


Ohh, indeed. I was thinking of measurement tolerances.

Replace:

   delta = mid * Fraction("0.005")
with

  delta = Fraction("0.005")
and I get:

  Found: 1/3
  13/40 <= 1/3 <= 67/200? True


I definitely meant an absolute error of 0.005, since the error approach was a generalization of "produces 0.33 when rounded to 2 decimal digits".


I started studying Lisp recently and it blew my mind it can do rational fractions, also virtually infinite length integers, OOTB. I used and programmed computers since 286s and have never seen this before. Now I don't even understand why do people still use calculators (including calculator apps) when we have Lisp which is easier and more powerful.


There's a nice calculator app bundled with Windows 10, and I even have Mathematica installed, and of course Desmos is useable everywhere, but... when I need to compute anything that's more than simple elementary arithmetic, I just open a Python REPL. It's simple to type, it's readable, and way easier to use than a calculator for persistence (ever tried to use variables on a scientific calculator? It works, but it sucks)


> when I need to compute anything that's more than simple elementary arithmetic, I just open a Python REPL. It's simple to type, it's readable

Indeed. It also is fairly hard (to me at least) to enter a long (longer than 2 operands) sequence into a classic calculator without making a typo. Whenever I need even to just add 4 numbers I don't use a calculator to enter 1 + 2 + 3 + 4 nor a spreadsheet to enter the data in separate cells, select them and see the sum (I used this way until recently) - I open IdeOne.com, choose Common Lisp and enter (+ 1 2 3 4). Needless to day I don't only use it for formulae this simple, it is of great help in complex cases.


There are a couple of well-known algorithms to find the best rational approximation to a given real number. A good and short book is "Diophantine Approximations" by I. Niven. For code, you can also check out this repo of mine: https://github.com/alidasdan/best-rational-approximation .

Note that decimal numbers represented as a floating point numbers on a computer are actually rational numbers due to limited precision. So converting decimal numbers to fractions, both stored on a computer, means converting a fraction with potentially large numerator and/or denominator to a simpler fraction, say, one with a far smaller denonimator.

For example, an approximation to pi is 3.14159265358979323844, which is the same thing as 314159265358979323844/10^20 (trivial approximation). Using these algorithms we can covert this fraction to a simpler fraction under different approximation criteria such as a bound on the approximation error or a bound on the magnitude of the denominator. For this example, we then get various approximations to pi such as 22/7, 355/113, ...


Fascinating. I'm not very good at math, but one of the first things I programmed was bruteforce search for the fraction closest to Pi.

It's nice to see the "proper" way of doing things! I might revisit my program and see what the efficiency savings are like! (99.99%?)


One could use bc for this [1] commonly installed on Linux and may also be on Mac I believe. Unsure if WSL installs it.

[1] - https://superuser.com/questions/377492/how-to-do-division-wi...


In the video this guy uses web vpython but doesn't share what website he is using. Google points me to glowscript.org, but it's clear he is using a different web vpython tool (based on the look and icons) Anyone know what tool/website he is using to code python?


The video description links to his trinket.io account that he's using for it.


That was it, thank you!


how do i become a hacker???




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: