Floating Point Visually Explained (fabiensanglard.net)
451 points by alxmdev on Sept 28, 2017 | 106 comments

Let's represent the number 42,643,192, or 10100010101010111011111000₂, in different "floating point" representations.

Scientific notation with 5 significant figures:

    4.2643 × 10⁷
Scientific notation in base 2 with 17 significant binary figures:

    1.0100010101010111₂ × 2²⁵
Let's pack this in a fixed-length datatype. Note that 011001₂ is the binary encoding of 25.

    1 0100010101010111 011001
    1     mantissa      exp.
This doesn't suffice because

a. We're wasting a bit on the leading 1.

b. We want to support negative values.

c. We want to support negative exponents.

d. It would be nice if values of the same sign sorted by their representation.

The leading 1 can be dropped and replaced with a sign bit (0 for "+", 1 for "-"). The exponent can have 100000₂ subtracted from it, so 011001₂ represents 25-32, or -7, and 111001₂ represents 25. Sorting can be handled by putting the exponent before the mantissa.

Thus we get to a traditional floating point representation.

    0 111001 0100010101010111
    ±  exp.      mantissa
Real floating point has a little more on top (infinities, standardised field sizes, etc.) but is fundamentally the same.
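A minimal Python sketch of the packing steps above, in case it helps to poke at it. The function name `encode_toy` is made up for illustration, it only handles positive integers, and it truncates the mantissa instead of rounding:

```python
def encode_toy(n: int) -> str:
    """Pack a positive integer into the toy layout: sign, 6-bit exponent, 16-bit mantissa."""
    if n <= 0:
        raise ValueError("this sketch only handles positive integers")
    exponent = n.bit_length() - 1  # position of the leading 1
    if exponent > 31:
        raise ValueError("exponent does not fit in 6 bits with a bias of 32")
    # Keep the 16 bits that follow the leading 1 (truncating, not rounding).
    if exponent >= 16:
        mantissa = (n >> (exponent - 16)) & 0xFFFF
    else:
        mantissa = (n << (16 - exponent)) & 0xFFFF
    biased = exponent + 32  # store the exponent with 100000 (binary) added
    return f"0 {biased:06b} {mantissa:016b}"


print(encode_toy(42_643_192))  # 0 111001 0100010101010111
```

The leading 1 is dropped by the `& 0xFFFF` mask, and the sign bit is hard-coded to 0 because the sketch rejects negative input.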

> Sorting can be handled by putting the exponent before the mantissa.

Don't denormal numbers prevent simply sorting this way? I could use a diagram like the OP's to remind myself how they work...

Using the article's "window" analogy: denormals are in the smallest "window", so the sort order between the "windows" (exponents) is kept. Their special property is that they don't have the implicit one to the left of the point (it's an implicit zero instead); the order within the mantissa is still kept.

That is:

    0 000000 0100010101010111
    ±  exp.      mantissa
Here, the value is 0.0100010101010111₂ × 2^-31 (denormals share the exponent of the smallest normal number, which in this format is 1-32)

As a bonus, the zero value comes naturally in this approach: it's a denormal with a mantissa of zero. Without denormals, the implicit one would get in the way.

> Don't denormal numbers prevent simply sorting this way? I could use a diagram like the OP's to remind myself how they work...

No, there is nothing really special about subnormal numbers other than that they have a smaller than normal precision.

For example 0x00000001 < 0x00400000 in both floating point and normal integer arithmetic. The only weird bit about sorting floats as integers is that the order is reversed for negative numbers; e.g. -0f is represented by the smallest signed integer (0x80000000), while -Inf is represented by the larger 0xFF800000.
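For anyone who wants to check this, here is a small Python sketch using the standard `struct` module to get at the single-precision bit patterns. The helper name `float32_bits` and the sample list are mine, picked only for illustration:

```python
import struct


def float32_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x as an unsigned int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


# Non-negative values, including a few subnormals (anything below about 1.18e-38).
positives = [2.0, 1e-40, 0.5, float("inf"), 0.0, 1e-38, 1.5, 3.0e38, 1.0, 1e-45]

by_value = sorted(positives)
by_bits = sorted(positives, key=float32_bits)
print(by_value == by_bits)  # same order either way

# Negative values: the bit patterns grow as the value gets more negative.
print(hex(float32_bits(-0.0)))           # 0x80000000
print(hex(float32_bits(float("-inf"))))  # 0xff800000
```

So subnormals sort fine, and the only special handling needed is for the sign bit.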

Best simple explanation I've seen so far.

I highly recommend Fabien's Game Engine Black Book. I'm halfway through it, and it's really fun. I've only been a software dev for 6 years, so looking at how things could be hacked around in the 90s to squeeze every drop of performance out of very constrained devices is fascinating.

I dunno all this talk of "windows" and "buckets" etc doesn't seem particularly simple to me.

I guess I'm not math-y enough so that I intuitively understand the simple math formula.

On the other hand, I like his window image : e.g. I understand better how, the further you get from 0, the more your precision goes down, because "bigger windows, divided in the same number of buckets".

So the question is: if you feel like you don't get the math formula, but you get the windows-and-buckets explanation of it, is it still possible that your understanding doesn't match the true underlying concept? Because that is the pitfall with a lot of intuitive explanations, that unless you are sure that the explanations are equivalent to the true thing, you might end up understanding an idea that is close but slightly off.

So a puzzle: if two positive numbers in exactly the same window are subtracted, what is the worst-case rounding error you can get in the result?

To confuse analogies with formal reasoning is a misuse of analogies. They are stepping stones, a kind of 'sniff test' or 'sizing up' of the mind to set the stage for more formal understanding. Is this your point?

Because, to re-use your argument, the pitfall with many formal definitions is that they are not easily understandable, whose payoff is often not much more than the analogy for the learner, but requires significantly more resources.

So, which would you prefer as an introduction? A "good enough" understanding, or no understanding because the bar of entry was too high?

> you might end up understanding an idea that is close but slightly off

Just like most floating point numbers

ba dum tss

Error relative to the original numbers or to the result?

Is it 2(1/2^23)(window size)?

It should be zero, exactly zero, i.e., no rounding error. Because if you do long subtraction on the significands, the number of digits you need to store the result is at most the number of digits you had, because the numbers were within a factor of two from each other (so no shifting left or right).

Just to check, with sbv in haskell, using the Z3 SMT solver:

    λ> f (x::SFloat) (y::SFloat) = fpIsNormal x &&& fpIsNormal y &&& y/2 .<= x &&& x .<= 2*y ==> fromSFloat sRNE (x - y) .== (fromSFloat sRNE x :: SDouble) - (fromSFloat sRNE y :: SDouble)
    f :: SFloat -> SFloat -> SBool
    λ> prove (forAll_ f)
Or without rounding through double, using the formula for exact rounding error of a sum:

    λ> g (x::SFloat) (y::SFloat) = fpIsNormal x &&& fpIsNormal y &&& 0 .<= x &&& y .<= x &&& x .<= 2*y ==> (x - ((x-y) + y)) .== 0
    g :: SFloat -> SFloat -> SBool
    λ> prove (forAll_ g)
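If you don't have an SMT solver handy, a randomized spot check in Python is a rough substitute. To be clear, this is a test on Python's doubles rather than a proof, and the helper name `subtraction_is_exact` is made up here. It compares the float result against exact rational arithmetic from the standard `fractions` module:

```python
import random
from fractions import Fraction


def subtraction_is_exact(x: float, y: float) -> bool:
    """True if the float result x - y equals the exact rational difference."""
    return Fraction(x) - Fraction(y) == Fraction(x - y)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Both values come from the same window [1, 2], so they are within a factor of two.
samples = [(rng.uniform(1.0, 2.0), rng.uniform(1.0, 2.0)) for _ in range(10_000)]
all_exact = all(subtraction_is_exact(x, y) for x, y in samples)
print(all_exact)
```

Every sampled pair subtracts without rounding, which matches the claim above.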

It's a fantastic book. Fabien has a gift for taking complex topics and presenting them in a concise and approachable way. I hope he continues on with the project and does books on Doom and Quake.

> People who really wanted an hardware floating point unit in 1991 could buy one. The only people who could possibly want one back then would have been scientists (as per Intel understanding of the market). They were marketed as "Math CoProcessor". Performance were average and price was outrageous (200 USD in 1993 equivalent to 350 USD in 2016.). As a result, sales were mediocre.

Actually, that's only partly true. My father owned a company that outfitted large manufacturing shops (MI company, you can imagine who his customers were). As a result, he used AutoCAD. The version of AutoCAD he used had a hard requirement on the so-called "Math Co-processor", so he ended up having to purchase one and install it himself. That was my first taste of taking a computer apart and upgrading it and I credit that small move with my becoming interested in building PCs, which led to my dad and I starting a business in the 90s doing that for individuals and businesses. There were definitely more reasons for that kind of add-on than just scientific fields; anyone in the computer aided drafting world at that time needed one as well.

That's why there were software emulators for the math coprocessor. They worked fine on a 386SX.

Okay, fine, I agree that sometimes mathematical notation is bad and we are all computer people here, not math people, so we get really scared of mathematical notation.

But is (-1)^S 1.M 2^(E-127) so bad that it required a whole blog post to explain it? Except for the "1.M" pseudo-notation to explain the mantissa with the implicit on bit, all of those symbols are found in most programming languages we use.

I don't think the value of the blog post was explaining the notation. We all knew what operations to perform when we saw it. The value seems to lie more in thinking of the exponent as the offset on the real line and the mantissa as a certain window inside that offset.

Personally, though, this still doesn't seem like a huge, deep insight to me, but maybe I'm just way too used to floating point and have forgotten how hard it was to learn this. I did learn about mantissa, exponents, and even learned how to use a log table in high school, but maybe I'm just old and had an unusual high school experience.

I don't really understand the significance of this insight either. A floating point number is just stored in scientific notation in base 2. That's it. And kids learn scientific notation in 7th grade? 8th tops. I mean, swap base 2 for base 10 in this image, and the effect changes from "woah" to "duh, obviously".

>But is (-1)^S 1.M 2^(E-127) so bad that it required a whole blog post to explain it?

Well, for one the expression doesn't tell us anything, could as well describe alien gravity -- even if we know math. You still need to explain what S, M, and E are used for to understand it.

S is sign. M is mantissa. E is exponent.

The only term there you might not remember from high school math is mantissa, and a search engine will precisely explain that in two sentences, in terms you do remember from high school math.

> S is sign. M is mantissa. E is exponent.

That doesn't actually get you any closer to knowing what the formula means, though. You need to also define that M is unsigned & 23 bits and E is unsigned & 8 bits. And that alone still doesn't get across the resulting limitations like that precision reduces by half every time the exponent bumps. Sure you can reason about it if you devote the brainpower to applying the limitations to the formula, but it's not explained by the formula itself.
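To make the field widths concrete, here is a rough Python sketch that pulls S, E and M out of a single-precision value and plugs them back into the formula. The function name `decode_float32` is hypothetical, and the sketch deliberately refuses zero, subnormals, infinities and NaN, since the plain formula doesn't cover them:

```python
import struct


def decode_float32(x: float) -> tuple:
    """Split x (as a float32) into S, E, M and rebuild it from the formula."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    s = bits >> 31            # 1 sign bit
    e = (bits >> 23) & 0xFF   # 8 exponent bits, unsigned, biased by 127
    m = bits & 0x7FFFFF       # 23 mantissa bits, unsigned
    if e == 0 or e == 255:
        raise ValueError("zero, subnormals, inf and NaN need special handling")
    value = (-1) ** s * (1 + m / 2**23) * 2.0 ** (e - 127)
    return s, e, m, value


print(decode_float32(0.15625))  # (0, 124, 2097152, 0.15625)
print(decode_float32(-6.5))     # (1, 129, 5242880, -6.5)
```

The "1.M" part of the formula is the `1 + m / 2**23` term: M is an integer count of 2^-23 steps past the implicit leading 1.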

Yes, jordigh did post, mistakenly or deliberately, an incomplete explanation of IEEE754 single precision floats. A number line with alternative labels showing an approximate mantissa and exponent for a given float is certainly one possible way to explain the application of the spec, but obviously, not only does it give less information than the specification, it is not obvious that it is a generally better explanatory tool. For example, it certainly helps Fabien, I assume it helps you and coldtea, but it leaves me completely bewildered (I know what it says, but on its own, I'm completely lost as to why).

edit: jordigh, not coldtea, first brought up the formula in his argument

>not only does it give less information than the specification

That's generally considered a plus (if not a sine qua non) for a simplified explanation, and the larger the specification, the bigger plus.

Else, one might as well read the 100s of pages of specs.

A search engine will also show that the use of the word "mantissa" in relation to floating point numbers is discouraged in favor of "significand".

"it is an abuse of terminology to call the fraction part a mantissa, since this concept has quite a different meaning in connection with logarithms"[1] –DEK

[1] https://en.wikipedia.org/wiki/Significand#Use_of_.22mantissa...

How is this issue unique to strings? Nobody can look at a random graph and know what it stands for either. Otherwise there would be no need for labeled axes. You still need to explain what is it that we're looking at

This isn't to say that visualizations aren't useful, of course.

>How is this issue unique to strings?

Who said anything about strings?

>Nobody can look at a random graph and know what it stands for either.

Well, the parent wrote: "is (-1)^S 1.M 2^(E-127) so bad that it required a whole blog post to explain it?".

And the answer is, yes it is.

We can look at y = 2*x + 5 and understand it immediately. Sums, limits, and lots of other equations too.

Floating point though is an encoding/representation scheme, so it needs more than the mere equation to be understood. I don't just want to know I replace three numbers and do a calculation, I need to know what these numbers are supposed to represent.

I've mentioned strings, and so did you now.

What I don't understand is what's supposed to be uniquely puzzling about strings. Maybe I'm under the influence of the submission, because the guy said he had an allergy for notation. But even after putting a floating point expression in analogy with some geometrical constructs, the whole thing could still be a model of alien gravity, because neither the expression nor the geometric figures carry any inherent meaning in themselves. The only reason why we know that it is in fact just a pure representation of how to approximate real numbers is because the post itself tells us so.

“Who said anything about {x}?” is an idiomatic equivalent to “Why did you bring up {x} unprompted?”

I think you're trying to say that unlabeled formulas are no better or worse than unlabeled graphs and the like, but calling a formula a “string” is muddling it up. (If that is what you meant.) Because we're talking about floating-point data types, people are likely to think of the string data type.

> But is (-1)^S 1.M 2^(E-127) so bad that it required a whole blog post to explain it?

What is the value of this question? If you don't need a blog post to understand it, then don't read it; no harm was done. Obviously the author felt that there was enough potential for confusion that an explanation might be needed, and other commenters here have appreciated it. That's good for some people and harm for no-one, which seems a pretty clear net win.

Because I get frustrated with people. I feel like they refuse to even attempt to understand. They say to themselves "I'm a visual learner" (everyone is) or they seek solace in groups when mathematical notation is difficult to understand. It's ok if I don't understand this, and I don't even have to try, because lots of other people don't get it either!

If even an idiot like me can understand mathematical notation, anyone can. In particular, if you're able to understand code, you can understand mathematics. I get frustrated with the mental barriers people set up for themselves when learning mathematics.

I think it may even be cultural. I have heard that in China, for example, they don't have this aversion to anything that looks mathematical.

> Because I get frustrated with people. I feel like they refuse to even attempt to understand.

Many people do refuse to understand. My parents & grandparents do that about anything computer related at all, not to mention math. Lots of people, especially guitar players, avoid learning music notation. Most people gloss over anything in a foreign or technical vocabulary.

I agree with you, and I get frustrated too, and I like to dig into mathy stuff, and I also find myself glossing over things and refusing to "get it" only to discover later it wasn't that hard.

So: does it help to get frustrated? Aside from expressing incredulity, what can you do to help someone more easily understand, or get motivated to understand?

It seems like for at least a subset of people, calling it "easy" and making pictures really truly does help them understand it, and/or spend the time being willing to understand it. What's wrong with that?

> I think it may even be cultural.

It might. I'd buy that Asian students get more math and have a slightly lower aversion in general. But I also know older Asians that refuse to learn simple computer concepts exactly the same way my parents do, so on that anecdata alone, I'd speculate that refusing to learn things that seem foreign or technical is part of the human condition. I'd even guess that this is an evolutionary advantage, that our brains actively avoid hard to understand things because it uses too many of our limited cognitive resources.

> Because I get frustrated with people. I feel like they refuse to even attempt to understand. They say to themselves "I'm a visual learner" (everyone is) or they seek solace in groups when mathematical notation is difficult to understand. It's ok if I don't understand this, and I don't even have to try, because lots of other people don't get it either!

But surely attempting to convince someone who is afraid to learn a given piece of mathematics that it is actually within his or her comprehension is a good thing, not a bad? One need not convince a die-hard visual learner that he or she should give up learning that way, only that he or she can, and probably should, strive to learn how to convert mathematics into a more visual form?

(EDIT: I just noticed that dahart (https://news.ycombinator.com/item?id=15362069) said much the same, and probably better.)

Incidentally, I think that I disagree with the statement that everyone is a visual learner. Without getting into the technical meaning of the term, if there is one, I believe that I don't learn by picturing images the way that I think people who say that they are visual learners believe that they are learning.

I'm quite certain you can understand a histogram a lot better than a long list of numbers, that a continuous function will make more sense if you're able to look at its graph, that you've doodled or scrawled on a board some sort of diagram to help you understand something.

We're not digital computers. We have a well-developed visual cortex and we have to use it in order to learn.

(-1)^S 1.M 2^(E-127) is the concept not the implementation and the implementation is the important part, especially as a programmer. The blog is mostly about the details. How the bits are allocated, how they influence precision, etc...

And exactly none of that is included in "(-1)^S 1.M 2^(E-127)"

But that's not what the blog post is about. Every other explanation of floating point numbers describes how bits are allocated. The visual part of this article is a drawing of the window and offset thing. Which in this case, is exactly what the expression is saying.

Some of us are visual thinkers and understand things easier if there are images attached :-). Personally i think almost completely in images so i'd have to somehow visualize something to understand it :-P someone doing it for me helps immensely.

Of course i can also think more abstract but it feels as if i'm trying to send megabytes of data to another city through TCP/IP with one byte per send().

Everyone is a visual learner but everyone can also understand a simple arithmetic expression. Everyone thinks almost completely in images. The only difference is that mathematicians have had more practice learning how to transform expressions into visual images. It's a bit like looking at sheet music and being able to hum it in your head.

Learning how to read sheet music isn't something that shouldn't even be attempted because you're an "auditory learner". I also think that spending a little extra effort to learn how to read an arithmetic expression is worthwhile and a skill we should foster in society.

>Everyone is a visual learner but everyone can also understand a simple arithmetic expression. Everyone thinks almost completely in images.

I think it is hard to separate your experience from the collective here. Much of the time I am working on math or programming, I don't think in images; I am just manipulating symbols in a patterned fashion that I have already convinced myself is sound (in a logic sense).

I definitely have never thought about floating point as anything other than binary scientific notation. I am quite comfortable with the understanding of exponent, significand, and sign.

You've never drawn a linked list? A Venn diagram to think about your logical sets? A Hasse diagram for showing set inclusion? A function to see its growth? Never visualised numbers as being on a real line, with in-betweenness properties?

The visual cortex is one of the most developed part of our brains. Even blind people use it for mathematics:


Even when you're doing symbol pushing "blindly", you're still using visual skills to place symbols, different fonts, you align variables, superscripts, subscripts. There's a certain geometry to mathematical notation itself.

> You've never drawn a linked list? A Venn diagram to think about your logical sets? A Hasse diagram for showing set inclusion? A function to see its growth? Never visualised numbers as being on a real line, with in-betweenness properties?

I never said that I don't visualize things. Just that it isn't the only (or main) way that I think about things.

By all means, methods of visualization can be very powerful. Designing different ways to visualize things can make more ideas easily accessible. Many things are also illuminated by logical proofs, sound visualizations (for lack of a better word), or narrative.

I think it is hard to extrapolate from a single experience what the best way to understand is for most people. You may be a very visual learner, and not realize that many other people don't gain as much from visualizations as you do.

> Even when you're doing symbol pushing "blindly", you're still using visual skills to place symbols, different fonts, you align variables, superscripts, subscripts. There's a certain geometry to mathematical notation itself.

The particular symbols and syntax matters little. I certainly am not considering different fonts when I work things out. Often manipulations I am doing are only in my head and don't have a concrete image until I need to write it down.

I'll step in here. I've met people who are visual like me, and do math using their visual thinking. For me, algebra and such is like tetris or a legos but with symbols. To this day, when I do algebra in my head, it involves visualizing symbols as if they are on a page and manipulating that picture.

Others however are not visual, but yet they can do algebra fine using something like logical rules, and to be honest, often tend to be more accurate with it. It sounds like you're one of those people, so it's unfair for what the above poster said regarding everyone being "visual". I think it's not really a matter of ability, it's more about what methods people prefer and some people just don't prefer visualization.

Funny side story: I think I can do the logical thing but it isn't my usual modus operandi, and I had to work on sharpening my "logical" side. I remember when I was in particle physics, and my office mate was so good at the algebra in QFT class and always very accurate. I on the other hand frequently made many little errors, but I'd always be able to "see the forest for the trees." I remember him once showing me the correct way to derive some transition amplitude that was in the homework, and just seeing it for the first time, I said "ah, that reduces to 1 + blah" because I could see that immediately, while he hadn't realized that even after working on it for a couple of hours. I think both types of people (left and right brain I guess) sort of complement each other in research, and we both have to approach math in our own ways.

The value of the post is that, yes, I could understand the mathematical notation, but I never had the need to dive into how floating point numbers actually work.

Now it's on HN in an accessible format, and now I know how IEEE 754 works, and I've made an intuitive mental model of it thanks to Fabien's great teaching skills.

Math notation is a compact representation of a concept, not a way to _explain_ it.

There's no need to be condescending.

Here's everything you need to know about Floating Point, as briefly as I can write it.

1. Floating points are simply "Binary Scientific notation". The speed of light is 2.998E8... which in "normal form" is written 299,800,000. An IEEE 754 Single has 8-bits for the exponent (E8 in the speed of light), and 24-bits for the mantissa (the 2.998 part). There's some complicated stuff like offset shifting here, but this is the "core idea" of floating point.

2. "Rounding" is forced to happen in Floating Point whenever "information drops off" the far side of the mantissa. The mantissa is only 24-bits long, and many numbers (such as .1) require an infinite number of bits to represent! As such, this "rounding error" builds up exponentially the more operations you perform.

3. Subtraction (cancellation error) is the biggest single source of error and the one that needs to be most studied. "Subtraction" can occur when a positive and a negative number are added together.

4. Because of this error (and all errors!), Floating point operations are NOT associative. (A + B) + C can give a different value than A + (B + C). The commutative property remains for multiplication and addition (A+B == B+A). If you require "bit-perfect" and consistent floating-point simulations, you MUST take into account the order of all operations, even simple addition and multiplication.

For example: Try "0.1 + 0.7 + 1" vs "1 + 0.1 + .7" in Python, and you'll see that these two orderings lead to different results.
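Spelled out with explicit parentheses, assuming CPython with ordinary IEEE 754 doubles (the variable names are just for illustration):

```python
# Same three numbers, two different groupings.
left = (0.1 + 0.7) + 1
right = (1 + 0.1) + 0.7

print(left, right)
print(left == right)  # False: addition is not associative here

# Swapping the operands of a single addition does not change anything.
print(0.1 + 0.7 == 0.7 + 0.1)  # True: addition is still commutative
```

The first grouping rounds 0.1 + 0.7 to a value just under 0.8 before the 1 is added, and that small shortfall survives into the final result.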


Once you fully know and understand these 4 facts, then everything else is just icing on the cake. For example, to reduce the rounding error that builds up when summing many numbers, you can sort the numbers by magnitude, and then add them up from smallest magnitude to largest magnitude.
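Here is a small Python sketch of the sort-then-sum trick. The input list is contrived on purpose, and strictly speaking what it demonstrates is small terms being swallowed by a large running total rather than cancellation. `math.fsum` from the standard library is included as a correctly rounded reference:

```python
import math

# One huge value followed by ten small ones.
values = [1e16] + [1.0] * 10

# Left to right: each 1.0 is lost, because doubles near 1e16 are spaced 2 apart.
naive = 0.0
for v in values:
    naive += v

# Smallest magnitude first: the ones add up to 10.0 before meeting 1e16.
ascending = 0.0
for v in sorted(values, key=abs):
    ascending += v

exact = math.fsum(values)  # correctly rounded sum of the whole list

print(naive)      # 1e+16
print(ascending)  # 1.000000000000001e+16
print(exact)      # 1.000000000000001e+16
```

Sorting is cheap insurance, but if the standard library is available, `math.fsum` avoids the problem without needing to reorder anything.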

This is an excellent short list of the main points I learned as an undergrad, and what I've retained today. One other point is that 64 bit floating points (aka doubles) really are about double the precision of 32 bit floating points, in terms of significant digits. An interesting implementation detail is that they spend a higher fraction of their "bit budget" on the mantissa to get this (52 out of 64 bits, vs 23 out of 32).

23 bits for the mantissa, and you're forgetting the leading sign bit.

No, it's a 24-bit significand, you're forgetting the not-explicitly-stored leading 1. (https://en.wikipedia.org/wiki/Single-precision_floating-poin...)

Are you sure rounding error builds up exponentially? I may be wrong but I thought it was sqrt(N) where N is the number of operations.

I like the "window/offset" concept. I wrote an extended blog article with yet different visual aids: http://blog.reverberate.org/2014/09/what-every-computer-prog...

Excellent post. I like it better than the one posted here.

I really wish he didn't make [0,1] one of the windows, because in floating point arithmetic the range [0,1] contains approximately as many floating point numbers (a billion or so in Float32) as the range [1,∞). There are "windows" [2^k,2^(k+1)] for positive as well as negative k. Just creates unnecessary scope for further confusion.

Looks like he corrected it.

In 32 bit float, about 50% of the possible values (no denormals) fall between -0.5 and 0.5 (~2 000 000 000); the space on each side between 0.5 and 1 adds only about 8 million new values (2^23 = 8,388,608).
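These counts are easy to check, because for non-negative floats the bit patterns are consecutive integers. A quick Python sketch (the helper names `float32_bits` and `count_between` are made up for illustration):

```python
import struct


def float32_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x as an unsigned int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def count_between(lo: float, hi: float) -> int:
    """Number of float32 values v with lo <= v < hi, assuming 0 <= lo <= hi."""
    return float32_bits(hi) - float32_bits(lo)


smallest_normal = 2.0 ** -126

print(count_between(0.5, 1.0))              # 8388608
print(count_between(smallest_normal, 0.5))  # 1048576000
print(count_between(0.0, 0.5))              # 1056964608, zero and denormals included
```

Doubling the middle figure for both signs gives roughly 2.1 billion values out of 2^32, so just under half.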

IMO, the best way to explain floating point is to play with a tiny float. With an 8-bit float (1 bit sign, 4 bits exponent, 3 bits mantissa, exponent bias 7), there are only 256 possible values. One can write by hand a table with the corresponding value for each of the 256 possibilities, and get a feel to how it really works.

(I got the 1+4+3 from http://www.toves.org/books/float/, I don't know if it's the best allocation for the bits; but for didactic purposes, it works.)
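If writing the table by hand sounds tedious, here is a rough Python sketch that decodes the 1+4+3 layout with bias 7. I'm assuming IEEE-style conventions for the special cases (all-zero exponent means denormal, all-ones exponent means infinity or NaN), which may not match every minifloat description, and the name `decode_minifloat` is mine:

```python
def decode_minifloat(byte: int) -> float:
    """Decode an 8-bit float: 1 sign bit, 4 exponent bits (bias 7), 3 mantissa bits."""
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0x0F
    mantissa = byte & 0x07
    if exponent == 0x0F:
        # Assumed IEEE-style specials: infinity or NaN.
        return sign * float("inf") if mantissa == 0 else float("nan")
    if exponent == 0:
        # Denormal: implicit 0 instead of 1, exponent fixed at 1 - bias.
        return sign * (mantissa / 8) * 2.0 ** (1 - 7)
    return sign * (1 + mantissa / 8) * 2.0 ** (exponent - 7)


print(decode_minifloat(0b0_0111_000))  # 1.0
print(decode_minifloat(0b0_0111_001))  # 1.125
print(decode_minifloat(0b0_1110_111))  # 240.0
print(decode_minifloat(0b0_0000_001))  # 0.001953125

# The whole table, one line per bit pattern:
# for b in range(256): print(f"{b:08b}", decode_minifloat(b))
```

Under these assumptions the largest finite value is 240 and the smallest positive one is 2^-9.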

> Since floating point units were so slow, why did the C language end up with float and double types ? After all, the machine used to invent the language (PDP-11) did not have a floating point unit! The manufacturer (DEC) had promised to Dennis Ritchie and Ken Thompson the next model would have one. Being astronomy enthusiasts they decided to add those two types to their language.

Wait, what was the alternative? No floats? How the heck would people calculate things with only integers?

edit: AFAIK bignums are even slower, and fixed-point accumulates error like crazy

Embedded developer here, using 8-bit microcontrollers, so all our calculations are done in fixed-point/integers.

Here's a chunk of real code that converts results from the analog-digital converter to appropriate units. A lot of precalculated constants and things like using shifts instead of division as much as possible.

    // - motor current from ACS723 (2.5V is 0A, +/- 400mV/A)
    while (ADC_RUNNING()) {}
    uint8_t result_current = ADCH;
    // result(val) * 5V/256(val) * 1000mA/0.4V => result*48.828 = current in ma
    // e.g. 5A -> 4.5V out = 230, less 2.5V (128) = 102
    // 102 * 48.828125 = 4980 mA
    // for max precision, max multiplier = 32767/128 = 255,
    // use (value * (48.828 << 2)) >> 2. -> 195
    // value_ma = (result_adc * 195) >> 2
    current_ma = (((int16_t)(result_current) -127) * 195) >> 2;
    // - motor output speed feedback signal (filtered PWM, 16% (1/6.1) of actual value, so 5V in == 30.5V)
    while (ADC_RUNNING()) {}
    uint8_t result_speed = ADCH;
    // result(val) * 6.1 = actual speed setting
    // result / 256 * 6100 = actual in mV
    int16_t speed_mv = (uint16_t) result_speed * 119;
    #if DEBUG
        // calculate target speed mv
        uint16_t output_speed_mv = ((uint16_t) motor_current_speed * 195) >> 2;

I've definitely read a ton of code like this.

I often do it in simulation as well to help speed up calculations. Basically one is still doing floating point, but manually. i.e. You changed the unit (a.k.a. exponent a.k.a. window) such that the desired value can be accurately quantized by an unsigned int (a.k.a. mantissa a.k.a offset).

The big caveat being one needs to be sure the calculation doesn't over/under flow the window, of course.
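For readers who haven't seen the multiply-then-shift trick before, here is the current conversion from the snippet above redone as a Python sketch so the numbers can be checked interactively. The names are made up, and Python integers don't overflow, so the 16-bit limit the original has to respect isn't modelled:

```python
SCALE_SHIFT = 2  # keep 2 fractional bits, i.e. work in quarters

# 48.828125 mA per ADC count, pre-multiplied by 4 and rounded to an integer.
MULTIPLIER = round(48.828125 * (1 << SCALE_SHIFT))  # 195


def adc_to_milliamps(adc_delta: int) -> int:
    """Convert an ADC reading (already offset from mid-scale) to mA using integers only."""
    return (adc_delta * MULTIPLIER) >> SCALE_SHIFT


print(adc_to_milliamps(102))  # 4972
print(102 * 48.828125)        # 4980.46875, the floating point answer
```

The integer result is off by about 8 mA here, which comes from rounding 195.3125 down to 195. More fractional bits would shrink that error but eat into the headroom before overflow.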

This is exactly what I was imagining! x>>2 rather than x/2. Brilliant, thanks.

x >> 2 replaces x/4.

Oops, you're absolutely right.

>Wait, what was the alternative? No floats? How the heck would people calculate things with only integers?

I find working with fixed-point numbers actually much easier than (hardware) floats, because of the even distribution. Also note that some libraries like OpenGL ES provide fixed-point types for extra performance. I do believe IEEE 754 is next to impossible to use for any practical application, unless all your values are between -1 and 1. It's truly a horrible standard.

>How the heck would people calculate things with only integers?

Easily -- it happens all the time in industries which demand specific precision -- e.g. integers are used to calculate monetary values in languages that don't have a decimal/big number type. You just need to format them to the precision you want when you show them to the user.

It would just be slow.

Sorry, by bignum I didn't mean "big numbers", I meant arbitrary-precision, i.e. 1/3 instead of 0.333333333, is this what you mean? I don't see how you get around stuff like many divisions or nth roots. You'd lose precision during the calculation operation, right? Whether you format it at the end has little bearing, since it'll be garbage by that time.

You can just as easily lose precision in FP arithmetic if you're not careful. But for engineering purposes fixed point works quite well because having arithmetic much more precise than manufacturing tolerance is no use.

Fixed point was good enough to go to the moon: https://www.netjeff.com/humor/item.cgi?file=ApolloComputer

AND core-rope memory. Wow. Thank you for this.

>I meant arbitrary-precision, i.e. 1/3 instead of 0.333333333, is this what you mean?

No, more like having in mind the precision you want and multiplying with that (scaling your number, so to speak). But that's for fixed precision -- it just gives you access to decimal values. (You could emulate arbitrary precision in software too, but it would be way slower).

>You'd lose precision during the calculation operation, right?

Yes, but that's the case with FP too. You can't just use floats for financial calculations for example.
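A tiny Python sketch of the scaling idea for money, keeping everything in integer cents and only formatting at the end. The helper `format_cents` is hypothetical and only handles non-negative amounts:

```python
def format_cents(cents: int) -> str:
    """Render a non-negative amount of cents as a decimal string."""
    return f"{cents // 100}.{cents % 100:02d}"


price_cents = 1999  # 19.99 stored as an integer number of cents
quantity = 3
total_cents = price_cents * quantity

print(format_cents(total_cents))  # 59.97

# The classic float surprise, and the same sum done in cents.
print(0.1 + 0.2 == 0.3)  # False
print(10 + 20 == 30)     # True
```

Division and percentages still need an explicit rounding rule, which is where the precision loss mentioned above comes back in.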

A logarithmic number system is in my opinion far superior to floating point in many ways.


Imagine a ruler with all floating point values on it, each time the mantissa comes at its maximum, you increase the exponent, so the space between farther float values doubles.

The number of mantissa values being constant for each exponent value, the exponent describes some kind of "zoom level".

Float values on a ruler would sort of looks like this:

   ... x x x x   x   x   x   x   x   x...
              ^ exponent increases, spacings are doubled
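You can see the doubling directly in Python 3.9 or later, where `math.ulp` returns the gap between a double and the next larger one. A short sketch:

```python
import math

# Spacing between neighbouring doubles at the start of a few windows.
for x in [1.0, 2.0, 4.0, 8.0, 1024.0]:
    print(x, math.ulp(x))

# Each time the exponent increases, the spacing doubles.
print(math.ulp(2.0) == 2 * math.ulp(1.0))  # True
print(math.ulp(4.0) == 2 * math.ulp(2.0))  # True

# Inside one window the spacing stays the same.
print(math.ulp(1.0) == math.ulp(1.999))    # True
```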

I see how some people just get the math, but I don't see why programmers here say they find it difficult to understand the window / offset explanation the article gives.

A "window" is a common programming term for a range between two values.

An "offset" is a common term for where a value falls after a starting point.

In simpler decimal and equidistant terms, the idea is to split a range of values into windows, divide each window into N values, and store an FP number by storing which window it falls in and which index inside that window (0 to N-1) it lands on.

The FP scheme actually uses powers of 2 instead of equal-sized windows (so the granularity becomes coarser as the numbers become bigger) but the principle is the same.
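A toy version of that scheme in Python, assuming power-of-two windows and only N = 8 slots per window so the rounding is easy to see (real single precision has 2**23 slots). encode and decode are hypothetical names, and only positive values are handled.

```python
import math

N = 8  # assumed number of slots per window

def encode(value: float) -> tuple:
    """Return (window, index); window k covers [2**k, 2**(k+1))."""
    _, e = math.frexp(value)          # value == m * 2**e with 0.5 <= m < 1
    window = e - 1                    # so 2**window <= value < 2**(window + 1)
    low = math.ldexp(1.0, window)     # start of the window
    index = int((value - low) / low * N)   # slot at or just below the value
    return window, index

def decode(window: int, index: int) -> float:
    """Rebuild the value stored in a given slot of a given window."""
    low = math.ldexp(1.0, window)
    return low + index * low / N

print(encode(3.14))             # (1, 4)
print(decode(*encode(3.14)))    # 3.0
print(decode(*encode(100.0)))   # 96.0
```

With so few slots, 3.14 lands on 3.0 and 100.0 lands on 96.0: the slots in the [64, 128) window are 8 apart, while those in [2, 4) are only 0.25 apart.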

I'm guessing because if you put those words together, you automatically assume the offset means the position of the window. It's clear enough with the illustration, but using terms that are in very common use for other concepts can make it confusing at first glance.

Also the window explanation hides the fact that it's able to represent really small numbers. (graph implies that 0-1 is the smallest window)

Smaller is an ambiguous term when it comes to numbers. Is it the lesser value? Or is it the value nearest to 0? It depends on the context.

Either way, they cover the sign bit, and the brackets on the diagram show the graphs are not using the sign, just exponent and mantissa. So I'd assume the same principles stand but for 0 to -1, -1 to -2, and so on.

I mean smaller as in closer to zero as in exponent lower than 127. That part is obvious in the formula, but not from the graphs.

The interesting part of the explanation is that it shows very clearly the effect of the 1.M in the formula. Unfortunately it then does not answer the resulting question: how zero is represented. ( https://en.wikipedia.org/wiki/Signed_zero does a pretty good job at that though)
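For the curious, the bit patterns can be printed with Python's struct module. float32_bits is a helper name made up for this sketch. Zero is the all-zero exponent and mantissa, a case where the implicit leading 1 does not apply, and the sign bit still works, hence two zeros.

```python
import struct

def float32_bits(x: float) -> str:
    """Return the IEEE 754 single-precision pattern as 'sign exponent mantissa'."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    b = f"{raw:032b}"
    return f"{b[0]} {b[1:9]} {b[9:]}"

print(float32_bits(0.0))    # sign 0, then all zero bits
print(float32_bits(-0.0))   # sign 1, then all zero bits
print(0.0 == -0.0)          # True
```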

I think the example values at https://en.wikipedia.org/wiki/Minifloat are most useful for intuitively understanding how floating point works --- especially the "all values" table, which shows how the numbers are spaced by 1s, then 2s, then 4s, etc. meaning the same number of values can represent a larger range of magnitudes, but sacrificing precision in the process.
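A rough sketch of such a table in Python. The layout is an assumption made for this example: 4 exponent bits, 3 mantissa bits, and a bias chosen so that the smallest step is exactly 1. minifloat_value is a made-up name, and the all-ones exponent (infinities and NaNs) is left out.

```python
def minifloat_value(exp_field: int, mantissa: int) -> int:
    """Value of a positive toy minifloat (4 exponent bits, 3 mantissa bits)."""
    if exp_field == 0:
        return mantissa                        # subnormal: implicit leading 0
    return (8 + mantissa) << (exp_field - 1)   # normal: implicit leading 1

for e in range(4):
    print(e, [minifloat_value(e, m) for m in range(8)])
# exponent fields 0 and 1 step by 1, field 2 steps by 2, field 3 steps by 4
```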

Obligatory: What Every Computer Scientist Should Know About Floating-Point Arithmetic [1] (pdf)

[1]: http://www.itu.dk/~sestoft/bachelor/IEEE754_article.pdf

A much easier and better way to understand floating point is to just do it in base 10.

Ooooh I see. I like it, thanks !

See also: An interactive floating point visualization: https://evanw.github.io/float-toy/

As a complete layman with only a cursory knowledge of programming, as well as a complete lack of math skills above Algebra 2, (I didn't even complete that, tbh, once they threw graphing into the equation. I did get slope-intercept form down, but that's it.)

I ended up finding this easier to understand than I expected, and a great read.

I love explanations like these, with a visual breakdown. It really helps it "click."

as long as I glazed over the math formulas and didn't let the numbers overwhelm me.

This is what I took away: the exponent "reaches" out to the max value of the [0,1] [2,4] etc, and the number represented tends to be like 51-53% of the way down the line of the mantissa.

It "clicked" a bit for me, see? Am I way off?

This is the way I always learned math the best in school, an alternate explanation that helps it "click."

Very good explanation, from my point of view, of how floating point numbers work and what they even are.

That's a nice feeling for someone like me who is pretty bad at math and finds formulas like the one shown in the article to be, frankly, indecipherable.

But now I (sort of) understand how floating point numbers work, (sort of) what they are, why they are important, and what role they play.

Could I program anything using one? No. But, I could learn someday, and explanations like these give me some hope that I just might be able to learn a programming language if I put the effort in. That I could learn the math required of me, even!

Why did they fix the bit-width for the mantissa and exponent? It would be nice to have more bits for the mantissa when you are near 1, and then ignore the mantissa entirely when you're dealing with enormous exponents, and very far from one. Granted, there would be some overhead (e.g. a 3-bit field describing the exponent length, or something) but it would be a useful data-structure.

You may like: https://youtu.be/aP0Y1uAA-2Y

It turns out the math is slightly harder but it's faster than IEEE FP in hardware, probably because of fewer conditionals in the spec.

There is better performance in terms of numerical accuracy at the cost of substantially more difficult error analysis.

Disclaimer: I'm in the video.

That exists, it's called arbitrary precision floating point (as opposed to single/double precision etc).

> Instead of Exponent, think of a Window between two consecutive power of two integers.

I know what an exponent is, or if you want "order of magnitude". Sorry, but "A window between two consecutive power of two integers" doesn't make it easier to think about.

And the implication is that the windows all have different sizes, which I do not find intuitive at all.

It does for others, who deal with windows and buckets in programming every day.

window washers?

I loled.

But they wouldn't qualify for the "in programming" part.

The last bits of trivia are very nice.

The x87 coprocessor makes me wonder about the days when each chip changed your system. It was such a different mindset that videogame consoles had parallel routes between the board and the cartridge themselves to allow hardware extension per game.

>I wanted to vividly demonstrate how much of a handicap it was to work without floating points.

So, did he manage to demonstrate that in the book? Because the page linked here, while explaining how floating points are represented in memory, does not explain how computers perform operations on them, or what purpose an FPU serves (how it differs from an ALU).

Does anyone have an alternate link? For some strange reason this link appears to be blocked at my work.

Off topic: The response of dragontamer is one example about why down votes alone are not enough. It was downvoted to the dead level and now nobody can reply to it. But also nobody gave the reason why what he was saying is incorrect.

It wasn't downvoted, it was killed by software because that user is banned. We ban accounts that violate the site guidelines (https://news.ycombinator.com/newsguidelines.html).

In such cases future comments are dead unless someone vouches for them, which gives good comments a pathway back to visibility and means that bannage isn't a one-way street.

I agree. I read HN with dead comments not suppressed and I estimate about 50% of the time I consider them legit. Not always what I agree with, but sometimes I even want to reply to them.

If you think a dead comment should be seen by the community, vouch for it. It's a way for the community to help moderate the discussion by allowing habitual offenders to still input relevant discussion.

Once it's marked dead (and not simply downvoted to 0) I don't see a way to upvote it back to life nor a way to reply to it. Perhaps other people do. Otherwise I would definitely do so!

Try clicking on the time ("6 hours ago") to open the comment by itself. For me that makes the "vouch" option appear.

Thanks, I’ll do that!


I think you posted this to the wrong thread.

Derp. I did indeed.

Wrong post.

I was hoping for something akin to a xkcd or SMBC comic, this isn't really much better than what my asm prof said when he explained it for MIPS programming. Maybe it's because I don't get what he means by offset and window, but this wasn't really that helpful.

The window limits the value the offset can represent -- minimum and maximum bounds. The window doubles each time you increment it since it's binary / power-of-two (0, 1, 2, 4, 8, …).

Within any given window there are 2^23 possible values. If you imagine those values as an array, the offset is the array index.

So in the example of 3.14, the window is [2, 4] since that's the power-of-two pair that contains the value. The offset is (value - minimum) / (maximum - minimum) = 0.57, which you multiply by 2^23 to get the array index.

Because the window doubles each time it's incremented, as the window gets bigger you lose precision -- in his example, [0, 1] gives 15 decimal places but [2048, 4096] only gives 4 decimal places.
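Those numbers can be pulled straight out of the bits. A small Python sketch for single precision follows; window_and_offset and step are hypothetical helper names, and only positive normal values are handled.

```python
import struct

def window_and_offset(x: float):
    """Split a positive normal float32 into (low, high, offset index)."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    exponent = (raw >> 23) & 0xFF      # 8-bit biased exponent field
    index = raw & 0x7FFFFF             # 23-bit mantissa field = array index
    low = 2.0 ** (exponent - 127)      # start of the window
    return low, 2 * low, index

def step(low: float, high: float) -> float:
    """Distance between neighbouring values inside one window."""
    return (high - low) / 2 ** 23

low, high, index = window_and_offset(3.14)
print(low, high)                  # 2.0 4.0
print(round(index / 2 ** 23, 2))  # 0.57
print(step(1.0, 2.0))             # about 1.2e-07
print(step(2048.0, 4096.0))       # about 2.4e-04
```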

Thank you. I am certain the article said that somewhere but that didn't click until your explanation.

The window is an index into a consecutive block of numbers (like a 4K page), and the offset is the offset into that block. [edit: the other poster highlighted the important distinction: the window doubles in size each time, so your fixed offsets into that window become further apart. If you had 1024 offsets into a window of size 1024, you get index 0 for 0, index 1 for 1, index 2 for 2, etc. But when you double that window, each index now covers two steps: #1 for 2, #2 for 4, #3 for 6. Then you add that onto the start of the second window, which comes right after the previous one (0-1023 in my example), so add 1024 for the second block. But a diagram would be really helpful at this point.]

Now, I don't see the huge benefit of introducing a whole extra topic (pointer arithmetic) to explaining floating point. But if it works for someone, more power to them!

[edit: Also, since pointers don't normally swell/scale/double like floating point, then even if you explain using the pointer analogy... you then have to NOTE the difference. So I still think it's a really bad analogy to use. Even worse when tons of modern day programmers know almost nothing about system programming and addressing modes.]

Anyway, best of wishes. The guy certainly put a lot of work into writing an article he thought would be helpful.

Personally, I'm grateful that it isn't a comic. Between some stick figures plus a diagram and just the plain diagram, I'll take the latter. Comics might help writers order their thoughts because of the constrained and sequential nature of speech bubbles, but generally they don't add much to the explanation itself.

Yeah it's really not a great explanation, IMO.
