The Cult of Posits (cornell.edu)
105 points by FullyFunctional on April 1, 2022 | hide | past | favorite | 72 comments



Having investigated posits by running RTL implementations through synthesis on 7 and 28 nm nodes, I don't buy the claim that a posit FPU is smaller than an IEEE float FPU. The implication is probably more that one could use a 32 bit word size posit rather than a 64 bit word size float, or similar, for many applications, which would make this true. This is still TBD for many classical HPC problems, I think.

On an equivalent word size basis, the maximum precision of a posit (assuming reasonable exponent scale) is much larger than an IEEE float at a given word size. Adders and multipliers must be sized to handle this (e.g., a multiplication between two posits with maximal precision (1 <= x < 2) requires the full multiplier). Multipliers have a quadratic dependency on the significand size. Subnormal handling in IEEE adds a lot of complexity, but not as much as a significantly larger multiplier.
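To put rough numbers on the quadratic dependency, here is a minimal sketch. It assumes the regime takes its minimum of 2 bits at maximal precision, an exponent field of `es = 2` bits, and a multiplier whose area grows with the square of the significand width. The function names are hypothetical, and real synthesis results will depend on much more than this.

```python
def posit_max_significand_bits(n, es=2):
    """Widest significand of an n-bit posit, hidden bit included.

    Assumes 1 sign bit, the shortest possible regime (2 bits) and es
    exponent bits; everything left over is fraction.
    """
    return (n - 1 - 2 - es) + 1


def relative_multiplier_area(bits_a, bits_b):
    """Area ratio under the assumption that area grows as width squared."""
    return (bits_a / bits_b) ** 2


# IEEE 754 significand widths (hidden bit included): binary32 = 24, binary64 = 53.
for n, ieee_bits in ((32, 24), (64, 53)):
    posit_bits = posit_max_significand_bits(n)
    ratio = relative_multiplier_area(posit_bits, ieee_bits)
    print(f"{n}-bit: posit {posit_bits} vs IEEE {ieee_bits} bits, area ratio ~{ratio:.2f}")
```

Under these assumptions the posit multiplier comes out roughly 30 to 40 percent larger at the same word size.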


I think you are spot on and it's IMO unfortunate that John overreaches wrt. the benefits of posit as it distracts from the actual major advantages:

* A much more consistent and sane floating point (most arithmetic rules _do_ apply for posits unlike for IEEE Std 754 and you never round to infinity).

* Much greater range and precision for the same bits.

posit32 falls somewhere between floats and double in precision and the hardware implementation will reflect this (which isn't a bad thing given the much higher space efficiency).

I'm not sure about the quire as it looks expensive to me, but I haven't tried implementing it. However what it does provide is pretty remarkable: zero rounding errors for reasonable sized dot-products (IIRC < 2^32 elements).
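As an illustration of what "zero rounding errors" buys you, here is a small sketch. It uses Python doubles and `fractions.Fraction` as a stand-in for a quire, not real posit arithmetic, and the function names are made up for the example.

```python
from fractions import Fraction


def dot_float(xs, ys):
    """Ordinary dot product: every multiply and add rounds to a double."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def dot_exact(xs, ys):
    """Quire-like dot product: accumulate exactly, round once at the end."""
    total = Fraction(0)
    for x, y in zip(xs, ys):
        total += Fraction(x) * Fraction(y)  # exact: every double is a rational
    return float(total)


xs = [1e17, 1.0, -1e17]
ys = [1.0, 1.0, 1.0]
print(dot_float(xs, ys))  # the 1.0 is absorbed by 1e17 and lost
print(dot_exact(xs, ys))  # the 1.0 survives
```

The exact accumulator keeps the small term because nothing is rounded until the final conversion, which is the property the quire provides in hardware.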


The energy requirement for a quire is very high unless you are talking less than 12 bit posits or so (in which case the strategy actually becomes superior from what I've seen due to the lack of needing to convert back to a float/posit via rounding). >500 flops to hold state, or a >500 bit RAM for a (32, 2) posit quire uses a crapload of power compared to just a 32 bit register.
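For a sense of where the ">500 bit" figure comes from, here is a back-of-the-envelope sketch. It assumes the quire is a fixed-point register wide enough for the product of any two posits, padded up to a power of two, with the slack serving as carry-guard bits. The function names are hypothetical.

```python
def quire_min_bits(n, es):
    """Bits needed to hold any product of two posits as fixed point, plus sign.

    maxpos = 2**((n - 2) * 2**es) and minpos = 1 / maxpos, so products span
    2 * (n - 2) * 2**es bits on each side of the binary point.
    """
    one_side = 2 * (n - 2) * (1 << es)
    return 2 * one_side + 1


def quire_padded_bits(n, es):
    """Round up to a power of two; the slack acts as carry-guard bits."""
    bits = quire_min_bits(n, es)
    return 1 << (bits - 1).bit_length()


for n, es in ((8, 0), (16, 1), (32, 2), (64, 3)):
    print(f"posit({n}, {es}): at least {quire_min_bits(n, es)} bits, "
          f"padded to {quire_padded_bits(n, es)}")
```

For a (32, 2) posit this gives 481 bits before padding and 512 after.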

Pipelining becomes difficult too, as carries mean that potentially every bit held in the quire needs to be updated, so in the naive implementation you need to wait for the latency of a >500 bit adder.

You can solve this partially by bucketing based on where in the quire a posit value expanded into fixed point should be added as fixed point addition is associative (ignoring overflow) so in a single cycle you don't need to wait for everything and can re-order as needed, but there's still the potential that a carry would need to propagate over the entire quire. Also, this means additional flop/RAM state and more bookkeeping which burns even more energy. Maybe you can get more exotic with asynchronous logic (like Nvidia did a paper on recently for handling logarithmic addition), but good luck verifying timing and everything for that.
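The bucketing idea can be sketched in software roughly like this. It is only a model: the limb width, the class name and the restriction to non-negative values are all assumptions of the example, and Python's unbounded integers stand in for limbs with spare guard bits.

```python
LIMB_BITS = 64
LIMB_MASK = (1 << LIMB_BITS) - 1


class DeferredCarryAccumulator:
    """Fixed-point accumulator split into limbs whose carries are resolved late.

    Python integers are unbounded, so each list entry stands in for a limb
    with spare guard bits on top that absorb carries between resolves.
    """

    def __init__(self, total_bits=512):
        self.limbs = [0] * (total_bits // LIMB_BITS)

    def add(self, value, shift):
        """Add value << shift, for 0 <= value < 2**LIMB_BITS.

        Only the one or two limbs under the shifted value are touched.
        """
        index, offset = divmod(shift, LIMB_BITS)
        shifted = value << offset
        self.limbs[index] += shifted & LIMB_MASK
        high = shifted >> LIMB_BITS
        if high:
            self.limbs[index + 1] += high

    def resolve(self):
        """Propagate all pending carries and return the full-width integer."""
        total = 0
        for i, limb in enumerate(self.limbs):
            total += limb << (i * LIMB_BITS)
        return total


acc = DeferredCarryAccumulator()
terms = [(LIMB_MASK, 0), (1, 0), (12345, 100), (LIMB_MASK, 60)]
for value, shift in terms:
    acc.add(value, shift)
print(acc.resolve() == sum(v << s for v, s in terms))
```

Each `add` is cheap and order-independent; the cost of the long carry chain is paid only in `resolve`, which is also where the extra state and bookkeeping mentioned above come from.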

It's been quite a while since I've run numbers on this, but I wouldn't be surprised if you could get 4-10+ posit non-quire FMAs in the same power budget as a single posit FMA using a quire, and fit a lot more onto a chip.

Don't get me wrong, I think the posit is a wonderful idea (make the bits you are storing in memory more meaningful), but I think a lot of Gustafson's proposals hinge upon the quire being available too ("the end of error"), which is a strong ask save for the few applications that actually need it.


The main problem I see with the quire idea is that John tries to use it as an argument that fma isn't necessary, and while a quire is strictly more useful, you won't be able to use multiple of them at the same time (due to hardware constraints). For applications taking dot products, this isn't a problem, but for fast evaluation of polynomials, interleaving multiple fmas is essential for good performance. As such, I think that quires are probably a really good idea for 8 and 16 bit posits, but for 32 and 64 bit, I think having an fma instruction is pretty much necessary.


Actually they have changed their position on this in the latest, now ratified spec; the quire is now just a data type. How you implement it is up to you, but you can certainly have more than one.

A lot of the material about posits is out of date, including the Cult of Posits. The +/- Inf has been replaced with NaR. As an aside: interestingly MININT maps to and from NaR when converting between posits and integers.
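The MININT correspondence is easy to see at the bit level. A minimal sketch, assuming NaR is encoded as a 1 followed by all zeros (helper names are made up for the example):

```python
def nar_pattern(n):
    """Bit pattern of NaR: a 1 followed by n - 1 zeros."""
    return 1 << (n - 1)


def as_signed(pattern, n):
    """Reinterpret an n-bit pattern as a two's complement integer."""
    return pattern - (1 << n) if pattern >> (n - 1) else pattern


print(as_signed(nar_pattern(32), 32))  # same bits as the most negative int32
```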


good to know, but even if you can have more than 1 semantically, in hardware, they take 512 bits for 32 bit, or 2048 for 64 bit, and most CPUs only have 1 (occasionally 2) units of vector math per core, so I think it is unlikely that they would be able to efficiently work with multiple quires. 32 bit quires are possible, but 64 bit almost certainly aren't. Also, for vectorization, modern cpus are capable of doing 16x 32 bit fma at a time, but processing 16x 512 bits for a quire in a similar amount of time is totally out of the question.


> Also, for vectorization, modern cpus are capable of doing 16x 32 bit fma at a time, but processing 16x 512 bits for a quire in a similar amount of time is totally out of the question.

I'd be worried about space, but I'm not sure why time would be an issue? I'd assume a big quire would have some kind of delayed carry mechanism.


even with a delayed carry, you still need to pump the data through the CPU. For a basic place where this becomes a problem, consider taking exp of each element in a vector. A vectorized version of this code will spend most of its time computing a polynomial which is just 8 fmas (in 2 chains of 4). With a dedicated fma instruction, the cpu can do each of those fmas in 4 cycles, and overlap the chains, leading to a total time of 17 cycles (assuming 4 cycle fma which is fairly standard). Doing the same computation with quires would require moving around 16x as much data, which is impossible to do as quickly.
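For readers who haven't seen the "2 chains of 4" trick, here is a rough sketch. It uses Taylor coefficients as a stand-in for whatever a real exp kernel would use, a degree-9 polynomial chosen so that each chain has exactly 4 steps, and plain `a * b + c` rather than a true fused operation. All of that is assumed for illustration.

```python
import math

# Taylor coefficients 1/k! for k = 0..9, an illustrative stand-in for the
# tuned coefficients a real exp kernel would use.
COEFFS = [1.0 / math.factorial(k) for k in range(10)]


def poly_two_chains(x, coeffs=COEFFS):
    """Evaluate sum(coeffs[k] * x**k) as two independent Horner chains.

    Each chain has 4 multiply-adds in x*x and does not depend on the other,
    so a pipelined FMA unit could interleave them.
    """
    x2 = x * x
    even = coeffs[8]
    odd = coeffs[9]
    for k in (6, 4, 2, 0):
        even = even * x2 + coeffs[k]      # chain 1: even powers
        odd = odd * x2 + coeffs[k + 1]    # chain 2: odd powers
    return even + x * odd                 # combine the two halves


print(poly_two_chains(0.5), math.exp(0.5))
```

The point is that the two chains only meet at the final combine, so each step needs just two scalar registers of state, which is what makes the interleaving cheap with an fma and expensive with a quire per chain.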


If you leave the quires in place you don't have that issue, though. Do your eight multiplies, feed them to a single adder, collapse the result.


You can't keep that much memory in registers easily. Just storing all those quires would take 16 512-bit registers (which on x86 is all you get).


Hence why my initial comment was "I'd be worried about space, but I'm not sure why time would be an issue?"


In your RTL synthesis did you use "hidden -2 bit" for negative posits? Assuming you are "cheating" IEEE by not implementing subnormals or NaN... This is one key insight that makes posit sizes much smaller, but the algebra that you have to do to get the correct circuits is a bit trickier!


Can you expand on this? This sounds really interesting.


If you're familiar with how the hidden bit works for IEEE floating point, use a hidden '10' in front of the fraction for negative numbers for posits and suddenly a whole bunch of math falls out. This is equivalent to having the fraction be added to a -2 value which pins the 'overall value' of the fraction to be between -1 and -2, (like how for positive values the hidden bit is 1 and the 'overall value' of the fraction is pinned to between 1 and 2).
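A small sketch of the hidden '10' idea, using an 8-bit fraction and a bare significand (no regime or exponent). The field width and function names are assumptions for illustration, not anyone's actual circuit.

```python
FRAC_BITS = 8  # illustrative fraction width


def significand(sign, fraction):
    """Signed significand: hidden 1 for positives, hidden -2 for negatives."""
    hidden = -2 if sign else 1
    return hidden + fraction / (1 << FRAC_BITS)


def negate_twos_complement(sign, fraction):
    """Negate the fixed-point word <sign bit, hidden bit, fraction>.

    Only valid for a nonzero fraction; a zero fraction (exactly 1.0) also
    needs an exponent adjustment, which this sketch leaves out.
    """
    width = FRAC_BITS + 2
    hidden_bit = 0 if sign else 1  # positive: 01.f, negative: 10.f
    word = (sign << (FRAC_BITS + 1)) | (hidden_bit << FRAC_BITS) | fraction
    word = (-word) & ((1 << width) - 1)
    return word >> (FRAC_BITS + 1), word & ((1 << FRAC_BITS) - 1)


sign, fraction = negate_twos_complement(0, 64)   # start from +1.25
print(sign, fraction, significand(sign, fraction))
```

Negating +1.25 (`01.01000000`) in two's complement gives `10.11000000`, which reads directly as -2 + 0.75 = -1.25 with no separate sign-magnitude step.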


>> Having investigated posits by running RTL implementations through synthesis on 7 and 28 nm nodes...

Is it possible to implement 32 and 64 bit posits in similar area to floating point? Can the calculations be had in the same number of cycles or fewer?

I feel like these are the two most important questions. If the answers are yes, then I think it may be worth doing implementations for the increased precision and simplicity (lack of NaN and other IEEE quirks).


I don’t understand why not having NaNs would increase simplicity. Isn’t that just moving the problem of detecting various forms of over/underflow from the CPU to the programmer using it?


It's not quite right to say that Posits don't have NaN. They have a NaR (not a real) value that's fairly similar. The simplification is that they don't have 2^52 of them, and don't have -0, Inf, -Inf, or the mess with subnormals (which are necessary but not always well supported). Also, NaR compares equal to itself, fixing one of the biggest bugs in floating point.


> Also, NaR compares equal to itself, fixing one of the biggest bugs in floating point.

I thought that was by-design? Just like how NULL != NULL in SQL?


As far as I can tell (see https://stackoverflow.com/questions/1565164/what-is-the-rati...), the only reason for it was that early chips had no other single instruction that could be used to implement isnan, so they decided to use == for that purpose.


No, a NaN not being equal to itself is the correct behavior for an order relation that is a partial order, not a total order.

The bug is neither in the FP standard nor in the implementations.

The bug is in almost all programming languages, which annoyingly provide only the 6 relational operators needed for total order relations.

To deal with partial order relations, which appear not only in operations with floating-point numbers, but also in many other applications, you need 14 relational operators.

Only when they are provided by the programming language do you no longer need to test whether a number is a NaN, because you can always use the appropriate relational operator, depending on what the test must really do.

The 2 relational operators "is equal to" and "is either equal to or unordered", are distinct operators, exactly like "is equal to" and "is equal to or greater" are distinct.

If you do not have distinct symbols for "is equal to" and "is either equal to or unordered", the compiler cannot know whether the test must be true or false when a NaN operand is encountered. If you do not know which of the 2 operators must be used, you must think more about it, because your program will be wrong in exceptional cases.

Specifying a stupid rule like a NaN being equal to itself and claiming that the correct behavior is a bug shows a dangerous lack of understanding of mathematics, which is not acceptable for someone designing a new number representation method (even if a single NaN value is used, it can be generated as a result of different, completely unrelated computations, and there is no meaningful way in which those results can be considered equal).

In general almost all programming languages are designed by people without experience in numerical computation, who completely neglect to include in the language the features required for the operations with floating-point numbers, resulting in a usually dreadful support for the IEEE floating-point arithmetic standard, even after almost 40 years since it became widespread.

It is not the standard which must be corrected, but the programming languages, which do not provide good access to what the hardware can do.

Also the education of the programmers is lacking because most of them are familiar only with the 6 relational operators for total orders, instead of being equally familiar with the 14 operators for partial orders.

Partial orders are encountered extremely frequently. If only the 6 relational operators are available in such cases, then they must always be preceded by a test for unordered operands. Forgetting the additional test guarantees errors. To avoid the double tests that are needed in most programming languages, it may be possible to define macros corresponding to all the 14 relational operators, but they cannot be as convenient as proper support in the language.


IMO, this only makes sense if you assume that NaNs can propagate useful information. In the real world, seeing a NaN basically just means that your math is wrong. From my perspective, the only 2 possibly useful semantics for NaN are to either just always throw an error (and not have NaN), or to say that it's equal to itself. Any other option means that == isn't an equivalence relation which pretty much completely breaks math.


On all CPUs it is possible to enable the undefined operation exception.

In that case, no NaN will ever appear, but all undefined operations will throw an exception.

So one of the behaviors that you want is available in any computing device and it should always be selected by all programmers who want to avoid testing whether an operand is a NaN. The only problem appears when you write some function in a library for which there is a policy that it should not require a certain FPU configuration to work, but it should behave correctly regardless of the FPU configuration selected by the user.

However the second behavior desired by you is not good, because that is the one that completely breaks math.

On the set of all floating-point numbers, "==" is a partial equivalence relation. It is not a total equivalence relation. Attempting to forcibly define it as a total relation leads only to inconsistent results.

On any set where there is only a partial equivalence relation and a partial order relation instead of total relations, the result of the comparison of 2 set elements can have 4 values: "==", "<", ">" and "unordered", instead of only 3 values, like for total relations.

To the 3 comparison values of total relations, there are 2^3 - 2 = 6 corresponding relational operators.

To the 4 comparison values of partial relations, there are 2^4 - 2 = 14 corresponding relational operators (e.g. "not less than" is not the same as ">=" but it is "equal, greater or unordered", so to the 6 operators of total orders, their 6 negations are added, plus other 2 operators, "unordered" and "not unordered").

Working correctly with partial relations is not really more complex than working with total relations. They are unfamiliar only because for some reason they are neglected in school, where only total relations are discussed in detail.


You keep saying that == on floating point is a partial relation, but that's only true because of the way NaN currently compares. Floating point exists as a model of the real numbers, on which equality is total, so a system modeling them should absolutely keep that if possible, which it is. It's fine to have a non-reflexive operation that behaves like == does for floats, but making that the primary equivalence relation is absolutely bonkers, since 2 NaN values with the same bit-pattern are completely indistinguishable from each other, yet compare non-equal.

Furthermore, I can't think of any reason that Inf==Inf should be true, while NaN==NaN is false. In both cases, they can be modeling different numbers (although that is true of literally any floating point number).

If there were a major advantage to making == non-reflexive, that would be one thing, but I've been programming for most of my life, (including a lot of floating point), and I've yet to come across a place where I was glad that NaN!=NaN.


I'm not sure what you're talking about by saying that this is the correct behavior for a partial order. In every definition of a partial order that I've ever seen, there is a pre-existing notion of equality (which is an equivalence relation, so reflexive -- that is, we would need NaN = NaN), and secondly the order relation in the partial order must be reflexive (that is, we would need NaN <= NaN).

Neither equality nor inequality is reflexive, so floats with <= do not form a partial order.

  >>> float('nan') == float('nan')
  False
  
  >>> float('nan') <= float('nan')
  False

Could you point me to a reference about the 14 relational operators and what this has to do with partial orders? I've never seen anything about this in theoretical mathematics.


> Could you point me to a reference about the 14 relational operators and what this has to do with partial orders? I've never seen anything about this in theoretical mathematics.

In a partial order:

There are four relations that a left side item and a right side item can have. Equal, less, greater, unordered with.

A relational operator could look for any combination of those relations, so 2^4=16 operators, except 'any' and 'none' don't make sense so there's 14.

==, <, >, <=, >=, <> make six.

Six more are "== or unordered with", "< or unordered with", etc. Negating one of the first six operators gives you one of these.

The last two are "unordered with" and its opposite "equal to or less than or greater than".
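The counting argument can be checked mechanically. A minimal sketch, with a hypothetical four-way `compare` and each operator modelled as the set of outcomes it accepts:

```python
import math
from itertools import combinations

OUTCOMES = ("less", "equal", "greater", "unordered")


def compare(a, b):
    """Four-way comparison of two floats."""
    if math.isnan(a) or math.isnan(b):
        return "unordered"
    if a < b:
        return "less"
    if a > b:
        return "greater"
    return "equal"


# One operator per non-empty, non-full set of accepted outcomes.
OPERATORS = [
    frozenset(subset)
    for size in range(1, len(OUTCOMES))
    for subset in combinations(OUTCOMES, size)
]


def holds(operator, a, b):
    """True when the comparison outcome is one the operator accepts."""
    return compare(a, b) in operator


GREATER_OR_EQUAL = frozenset({"greater", "equal"})
NOT_LESS = frozenset(OUTCOMES) - {"less"}

nan = float("nan")
print(len(OPERATORS))                      # 4 + 6 + 4 subsets
print(holds(GREATER_OR_EQUAL, nan, 1.0))   # NaN is not >= 1.0
print(holds(NOT_LESS, nan, 1.0))           # but "not less than" accepts unordered
```

The last two lines show why ">=" and "not <" are different operators once "unordered" is a possible outcome.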


An equivalence relation can also be either total or partial, like an order relation. So the reflexivity may be true only for a subset of the complete set on which the equivalence relation is defined.

I have mentioned this also in another reply, but the following are the differences between total relations and partial relations.

On a set where total equivalence and order relations are defined, when 2 elements are compared, the result is 1 of 3 possible values: "equal", "less", "greater".

To the 3 values correspond 2^3 - 2 = 8 - 2 = 6 relational operators. In C notation: == != < >= > <=.

On a set where partial equivalence and order relations are defined, when 2 elements are compared, the result is 1 of 4 possible values: "equal", "less", "greater", "undefined" (a.k.a. "unordered").

To the 4 values correspond 2^4 - 2 = 16 - 2 = 14 relational operators.

In both cases, the 2 operators that do not exist, so they are subtracted from 8 and 16, correspond to the always true and always false operators.

For total order relational operators, "!= >= <=" are the negations of "== < >".

For partial order relational operators, that is no longer true, because e.g. the negation of "<" is no longer ">=", but "greater, equal or unordered".

So none of the 6 operators "== != < >= > <=" is the negation of another of these 6. Their 6 negations are distinct relational operators, which results in 6 + 6 = 12 relational operators.

The extra 2 operators until 14 operators are "is unordered" (i.e. isNaN for the case of numbers) and its negation.

I am sorry, but I do not remember any references right now. However, everything about partial relations is just a trivial consequence of the fact that the 3 possible values of a comparison become 4 possible values.


Thanks for taking the time to explain your terminology. I've found a couple of things about partial equivalence relations (one neat one was on nlab about how the construction of the reals can either be a quotient of a subset of sequences or a subset of a quotient of sequences, and partial equivalences let you reason about the latter), but everything else seems to be in textbooks about CS-specific theory.

When you've talked about "partial orders" and "total orders" I'm surprised that you haven't disambiguated that you're talking about something completely different from what pretty much every mathematician would call a "partial order" or "total order." You could very well know about all this, but when you say "claiming that the correct behavior is a bug shows a dangerous lack of understanding of mathematics" but also don't mention that you're aware of the usual notion of a partial order, it seems a bit odd...

> that is the one that completely breaks math.

> Attempting to forcibly define it as a total relation leads only to inconsistent results.

Where's the inconsistency?

Or, a stronger question, why can't a NaN be the bottom of a total order (in the math sense) for floats? That is, make it be so that for all floats x, NaN <= x?
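To make the question concrete, here is one way such an order could look in code. This is just a sketch of the "NaN as bottom" idea with hypothetical helper names, not anything taken from IEEE 754 or the posit standard.

```python
import math


def total_key(x):
    """Sort key that places NaN below every other float, -inf included."""
    return (0, 0.0) if math.isnan(x) else (1, x)


def total_le(a, b):
    """A reflexive, total <= on floats with NaN as the bottom element."""
    return total_key(a) <= total_key(b)


nan = float("nan")
values = sorted([2.0, nan, -math.inf, 1.0], key=total_key)
print(values)
print(total_le(nan, nan), total_le(nan, -math.inf), total_le(1.0, nan))
```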


> an order relation that is a partial order, not a total order.

x == x (aka reflexivity) is part of the definition of a partial order[0][1]

There's a case to be made that (bitwise-)distinct NaNs should compare unordered, but that's generally not what people mean when they complain about incorrect floating-point equality for NaNs.

0: https://en.wikipedia.org/wiki/Reflexive_relation

1: https://en.wikipedia.org/wiki/Partial_order


The definition is just saying that "<=" is reflexive. When it says it's "a homogeneous relation on a set P that is reflexive, antisymmetric, and transitive," the meaning of P being a set is that it's an object described by set theory, and sets come with a reflexive, symmetric, and transitive equality relation -- the "native" one that comes from the underlying logical theory. (ZFC is built on a first-order logical theory with two relations: = and ∈.)


(Sorry a1369209993, I got confused about HN threading -- somehow I thought you were responding to me.)


> Also, NaR compares equal to itself, fixing one of the biggest bugs in floating point.

Wait, so with posits, sqrt(-7) == sqrt(-4)? And 1e42^42 == 0.0/0.0? And this is supposed to be a good thing?

The semantics around NaN, +/- 0/Inf are really important. I feel like most of the proponents of posits don't understand why it's so vitally important that these constructs operate the way that they do. Maybe there are some useful things to be learned from posits, but it's difficult to take posits seriously when two values can compare equal when they are quite obviously not.


In floating point 2^54 == 1+2^54 and 1/0 == 2/0 == -ln(0). Every floating point value is an ambiguous result of infinitely many calculations with real numbers. When you do an invalid calculation, the most important thing to know is that you did something invalid.
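The first claim is easy to check with Python doubles. Python raises `ZeroDivisionError` for `1.0 / 0.0`, so this sketch builds infinities with `math.inf` instead of dividing by zero.

```python
import math

big = 2.0 ** 54
print(big == big + 1.0)          # doubles near 2**54 are 4 apart, so the 1 is lost
print(math.inf == 2 * math.inf)  # different "infinite" results compare equal
```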


That's a recent change, right? They switched inf to more generic nar?


I think it's been there from the beginning, but I'm not 100% sure. Realistically, Inf and -Inf are pretty much never useful in the real world, and having them makes -0.0 almost seem reasonable (since branch cuts almost always result in Inf/-Inf).


No, originally there were -/+ inf. NaR replaced this at some point, but semantically not much changed.


In the initial proposal for the IEEE floating-point standard, whether to have 1 or 2 values for infinities was configurable, like the rounding mode.

However before the standard was adopted it was decided to reduce the number of configurable modes and the single-value infinity was dropped.


> I think it's been there from the beginning, but I'm not 100% sure

The 2017 Gustafson paper linked in the article has only inf, no nar

https://posithub.org/docs/Posits4.pdf


It's way out of date


What is the recommended latest equivalent of that document?



Have you read the NaN rules? Know when NaN is converted to the various forms? It's maddening, truly. A posit implementation is definitely simpler from the point of view of dealing with that, but outside the NaN and Inf numbers, it's no easier (and technically a wee bit more work to deal with the regimes).


I will say one thing: I pity the person (grad student?) that has to do error propagation analysis on a research project using posits (I'm the original implementor)


Yeah, I pretty much think that posit only makes sense for 32 bit and smaller, and that you want your 64 bit numbers to be closer to float64 (although with the posit semantics for Inf/NaN/-0).


Oh I actually only think it's useful for machine learning. I have some unpublished, crudely done research showing that the extended accumulator is only necessary for the Kronecker delta stage of the back propagation (posits trivially convert to higher precision by zero-padding). You can see what I'm talking about in the Stanford video.

Fun fact: John sometimes claims he invented the name, but this is untrue; my old college website talks about building "positronic brains" and it's long been a goal of mine to somehow "retcon" Asimov/tng data, into being a real thing, and when this opportunity for some clever wordsmithing arrived I coined the term with the hopes that someone would make a posit-based perceptron, or "positron".


>the maximum precision of a posit (assuming reasonable exponent scale)

Having run into this problem in computing the partition function of a quantum system, 10^300 is not always enough for everyone. So I don't agree with Gustafson's attempt to mimic the dynamic range of IEEE floats: give us more, since you have it. If that makes the implementation cheaper, all the better.


> On an equivalent word size basis, the maximum precision of a posit (assuming reasonable exponent scale) is much larger than an IEEE float at a given word size.

What's 'much' here? If I'm doing the math right for a 64 bit number, it's 53 bits of precision vs. 59?


Whenever the precision of some number representation format is said to be higher than the precision of another representation format, it must be kept in mind that this claim can be true only in a limited interval.

When a number representation uses a fixed number of bits, e.g. 64 bits, there are a fixed number of points on the number axis that can be represented, e.g. 2^64.

Any other representation has the same number of points. If you increase the density of the points in some interval, to increase the precision, you must take them from another interval, where fewer points will remain, resulting in a lower precision.

The traditional floating-point numbers are distributed so that the relative error is almost constant as long as there is neither overflow nor underflow.

For most computations that belong to simulations of complex physical systems, the physical quantities may have values varying within many orders of magnitude and a constant relative error is what is desired for maximum accuracy in the final results.

The posit representation increases the precision for numbers close to 1, with the price of decreasing the precision for large or small numbers.

There are applications that can benefit from the increased precision around 1, but there are also others whose precision would be seriously affected if posits were used.

So it is wrong to claim that posits have greater precision in general. They have greater precision only for the applications that use only numbers that are neither too large, nor too small.

It is likely that the largest benefits from posits are available only for the applications that use only small number sizes, i.e. no more than 32 bits. Such small number formats have too small an exponent range to be used in applications where very large or very small numbers are common, so if 32-bit or smaller floating-point numbers are already used, it is likely that the application belongs to those that might benefit from posits, due to the limited range of the numbers handled by it.

An application that really needs 64-bit numbers, e.g. the simulation of a semiconductor device, is more likely to have its accuracy worsened than improved by posits.
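The trade-off can be tabulated with a short sketch. It assumes the usual posit layout (sign, run-length regime with terminating bit, `es` exponent bits, then fraction) and `es = 2` for the 64-bit case; the function name is made up for the example.

```python
def posit_fraction_bits(n, es, scale):
    """Fraction bits left in an n-bit posit for values near 2**scale.

    Assumes the usual layout: 1 sign bit, a run-length regime with its
    terminating bit, es exponent bits, then fraction.
    """
    k = scale // (1 << es)                 # regime value (floor division)
    regime_len = k + 2 if k >= 0 else -k + 1
    return max(0, n - 1 - regime_len - es)


FLOAT64_FRACTION_BITS = 52

for scale in (0, 16, 64, 128, -128):
    bits = posit_fraction_bits(64, 2, scale)
    print(f"2**{scale}: posit64 {bits} vs float64 {FLOAT64_FRACTION_BITS}")
```

Near 1 the posit has more fraction bits than a double, and the count falls off steadily as the magnitude moves away from 1 in either direction.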


Everything you're saying about precision is fine but I'm really confused about why you responded to me to say this. This was a comment chain about how big of a multiplier you need.

> An application that really needs 64-bit numbers, e.g. the simulation of a semiconductor device, is more likely to have its accuracy worsened than improved by posits.

That really depends on what you're storing. For something like voltages, I would expect posits to have better precision all the time. 64 bit posits don't have less precision until you go over 2^21 or under 2^-21.


Sorry, by mistake I have replied to the text quoted by you, which claimed a higher precision without specifying the conditions when that is true.

I should have replied one level above.

Regarding voltages, an application that would just add voltages would use numbers in a limited range, which could be well represented by posits.

However no such applications exist. The real applications multiply and divide different kinds of physical quantities before adding them, and they might mix in computations voltages in kilovolts with currents in picoamperes, capacitances in femtofarads, the Boltzmann constant, the elementary charge, and so on.

The intermediate quantities that appear in various computations may have an exponent range at least 2 or 3 times larger than the exponent range present in the input data or output data, so it is frequent to exceed the exponent range provided by single-precision floating-point numbers, i.e. 10^(+/-38).


Yep. This works out to be on the order of 60x6x2 adders which is honestly not that much.


How do you compute with posits? I can imagine vaguely how an FPU would work, by applying the rules of algebra to m*2^e and cleaning up afterwards to keep things normalized, but I do not know what operations on the projective circle are homeomorphic to addition or multiplication on the real line.



posits can be also projected to m*2^e, just with larger m and e.
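A decoder makes the m*2^e view concrete. This is a minimal sketch, assuming the common layout (sign, regime run plus terminating bit, `es` exponent bits, fraction) with negative values stored as the two's complement of the positive pattern. `decode_posit` is a hypothetical helper, not any library's API.

```python
from fractions import Fraction


def decode_posit(pattern, n=8, es=2):
    """Return the exact value of an n-bit posit pattern, or None for NaR."""
    mask = (1 << n) - 1
    pattern &= mask
    if pattern == 0:
        return Fraction(0)
    if pattern == 1 << (n - 1):
        return None  # NaR
    negative = pattern >> (n - 1) == 1
    if negative:
        pattern = (-pattern) & mask          # two's complement negate
    bits = format(pattern, f"0{n}b")[1:]     # drop the sign bit
    first = bits[0]
    run = len(bits) - len(bits.lstrip(first))
    k = run - 1 if first == "1" else -run    # regime value
    rest = bits[run + 1:]                    # skip the terminating bit
    exp_bits = rest[:es].ljust(es, "0")      # truncated exponent bits read as 0
    frac_bits = rest[es:]
    exponent = int(exp_bits, 2) if es else 0
    scale = k * (1 << es) + exponent         # this is e
    fraction = Fraction(int(frac_bits, 2), 1 << len(frac_bits)) if frac_bits else Fraction(0)
    value = (1 + fraction) * Fraction(2) ** scale   # m * 2**e with 1 <= m < 2
    return -value if negative else value


for pattern in (0x01, 0x40, 0x44, 0x60, 0x7F, 0xC0):
    print(f"{pattern:#04x} -> {decode_posit(pattern)}")
```

The only difference from decoding a float is that the split between regime, exponent and fraction moves with the value, so `m` and `e` have to be sized for the widest case.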


Converting to floats, computing and converting back would hardly result in a smaller FPU, unless I am missing something about it.

Edit: a dead comment replies with,

> I'm the implementor for posits (deactivated my primary hn account) -- I built circuit diagrams for computation with posits. So, we have them -- and they are smaller. There is also a key insight into the representation of posits (negative numbers have a "hidden -2" instead of a "hidden 1") that I cracked early on in my fiddling with circuits that was completely missed by all other implementers until earlier this year, even after I communicated it to them through email correspondence several times.


I'm the implementor for posits (deactivated my primary hn account) -- I built circuit diagrams for computation with posits. So, we have them -- and they are smaller. There is also a key insight into the representation of posits (negative numbers have a "hidden -2" instead of a "hidden 1") that I cracked early on in my fiddling with circuits that was completely missed by all other implementers until earlier this year, even after I communicated it to them through email correspondence several times.


> negative numbers have a "hidden -2" instead of a "hidden 1"

Isn't this just (what I assumed was) the standard implementation technique for FPUs that aren't[0] stuck with IBM/Intel's braindead sign-magnitude junk?

0: so (eg, for a 1.7.8 float) -1.996..-1.000 would be 3F00-3FFF, and +1.000..+1.996 would be C000-C0FF, making 0x[-2].00-0x[-2].FF and 0x[+1].00-0x[+1].FF (with implicit -2/+1 corresponding to 3F/C0)


I can believe that for not too large number sizes a posit implementation might be smaller than for the traditional FP format.

However "smaller" must be qualified, because the size of a FPU varies enormously depending on the speed target. A completely different size results when the target is to do a fused multiply-add in 4 cycles at 5 GHz than when the target is to do it in 100 cycles at 200 MHz.

So unless you give more information about what you have compared, we cannot know whether it is true that a posit implementation can be smaller.


I don't know how you can claim to be "the" implementor as there are many implementations. However, your explanation doesn't have enough context. Do you have a paper describing this in better detail?


Sorry, should have specified: I'm the original implementor. I'm on the paper with John Gustafson, and presenter/live-demoer of the second half of the Stanford video.

There is a paper coming out with details on the yonemoto -2 hidden bit method... I don't know if it's still preprint or embargoed or what but it is recent. I'm really not involved in the project anymore so my knowledge of the existence of this paper is only due to the courtesy of the authors.


Is the source code available? It would be amazing to have the best possible implementation available as Verilog. There are quite a few pretty good IEEE 754 implementations.


I don't mean converting to floats, I mean treating the exponent and mantissa separately. Processors already do something similar for floats. For addition, (for example), you de-normalize the smaller input by an amount to make the exponents line up, add and re-normalize. The steps would be pretty much exactly the same for posits, the only difference is how you calculate the difference in exponents (which just boils down to a few shifts based on the useed value).
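The align, add, renormalize steps look roughly like this on integer (significand, exponent) pairs. It is a sketch only: it shifts left so the sum stays exact, whereas hardware would shift the smaller operand right and then round to the target format, and `add_scaled` is a made-up name.

```python
def add_scaled(m1, e1, m2, e2):
    """Add m1 * 2**e1 and m2 * 2**e2 exactly; m and e are integers.

    Hardware would shift the smaller operand right and round; this sketch
    shifts left instead so that nothing is lost.
    """
    e = min(e1, e2)
    m = (m1 << (e1 - e)) + (m2 << (e2 - e))   # align, then add
    while m != 0 and m % 2 == 0:              # renormalize
        m //= 2
        e += 1
    return m, e


print(add_scaled(3, -1, 1, 2))  # 1.5 + 4.0
```

Nothing here depends on whether the inputs came from a float or a posit; the format only matters when unpacking the operands and when rounding the result back.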


Is it true that these are just a curiosity (albeit a compelling one) and not practical until hardware support for them comes along, because until then they will always be slower than floating-point on supporting hardware?


You could say that about anything at some point, not least RISC-V.

However, there _are_ hardware implementations of posit available, but given the inertia of existing code and data that assumes IEEE-754, posits are more likely to see adoption in specialized areas where the higher information density is enough of a win. Or in green field applications without concerns for legacy.


I think one all-around-good use case/feature is the fact that multiplying a very large number by a very small number has significantly less error in posits vs. IEEE754.


I don't think this is true. In floats, multiplying a big number by a small number has the same accuracy of any other non-overflowing/underflowing multiplication. In posits, the multiplication will be exact, but only because the big and small numbers were less accurate in the first place. If you have something like f(x)*g(y) where f(x) is really big and g(y) is really small, floats will give more accuracy.


I'm going off the chart shown in Figure 15 of the cited Cornell post about it, which seems to disagree with your assertion


That's exactly what I was saying. The result is exact, but it has lower accuracy.


ML training is bottlenecked on arithmetic performance. Computational fluid dynamics is bottlenecked on arithmetic performance. There are others, but those are the two obvious ones.

Nobody in any bottlenecked field is sufficiently motivated to generate a chip that is potentially worth lots of millions of dollars? Really? That's a pretty strong statement.

And, if posits can't break those bottlenecks, are they really better?

I would cite 3D graphics, but I suspect that memory bandwidth is more the problem as it seems that people are willing to double or quadruple the computational workload to avoid a memory stall or roundtrip. And I don't have enough background to discuss whether HFT is bottlenecked on arithmetic performance.


Is there any good documentation about the algorithms for adding and multiplying posits? Most of the stuff I have found focuses on how well they represent numbers, and less on how to manipulate the bits.


The basics of the algorithms are essentially the same as for floating point. You separate the mantissa, exponent, and sign and then do the calculations in the same way.


I was beginning to think posits had been forgotten.

I wonder about the prospects for (maybe 16-bit) posits in AI engines. It seems like a natural fit not so constrained by history, unlike traditional languages.


Nice!

Where can I get the posit tee-shirt? Asking for others :)



