
Fast Approximate Logarithms, Part I: The Basics - r4um
http://www.ebaytechblog.com/2015/05/01/fast-approximate-logarithms-part-i-the-basics/
======
sdenton4
Awesome writeup.

I've occasionally thought that there's a lot of time to be saved in machine
learning systems by making the floating point operations a bit more noisy. Use
maybe eight bit (or even four bit) numbers for the matrix entries, with the
philosophy that the overall number of degrees of freedom in the system is enough to make
up for this local rigidity, while you gain a lot in terms of both storage
space and evaluation time.
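For illustration, the sort of thing I have in mind, as a rough C sketch: a symmetric linear 8-bit quantizer with one scale per weight matrix. The names and the scheme here are just illustrative, not taken from any particular framework.

    #include <math.h>
    #include <stdint.h>

    /* Quantize n float weights to int8 codes with a single shared scale.
       Returns the scale needed to reconstruct w[i] ~= q[i] * scale. */
    static float quantize_weights(const float *w, int8_t *q, int n) {
        float max_abs = 0.0f;
        for (int i = 0; i < n; i++)
            if (fabsf(w[i]) > max_abs) max_abs = fabsf(w[i]);
        float scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;  /* grid step */
        for (int i = 0; i < n; i++)
            q[i] = (int8_t)lrintf(w[i] / scale);  /* round to the nearest code */
        return scale;
    }

    /* Approximate reconstruction of one weight. */
    static float dequantize_weight(int8_t qi, float scale) {
        return (float)qi * scale;
    }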

Along these lines, I saw a great demo of a robot a while ago which used 256
binary devices for movement. Each of these devices is the diagonal of a
quadrilateral; when the device is in its 'on' state, the diagonal is long,
and when it's in the 'off' state the diagonal is short. Chain a lot of these
together, and you can get pretty good accuracy aiming for specific points in a
two-dimensional space. The demo I saw was a three-dimensional version of the
same concept, of course.

So likewise, we chain together lots of linear operations to build (say) a deep
learning system. The weight space is very high-dimensional, and the decision
space is the much lower dimensional classification/regression problem: one
probably needs a lot less than 64 bits of accuracy in the weight space to
approximate a point in the decision space.

~~~
benanne
Indeed, 64 bits of accuracy is overkill for a lot of ML algorithms, where
there is so much noise that the additional quantization noise due to low
precision is negligible. Most deep neural nets are already being trained in
single precision because that is what NVIDIA GPUs can do the fastest. There
has been quite a bit of research on reducing the bit depth further over the last couple of years; see e.g.
[http://arxiv.org/abs/1502.02551](http://arxiv.org/abs/1502.02551) and
[http://arxiv.org/abs/1412.7024](http://arxiv.org/abs/1412.7024).

The current generation of GPUs already has limited support for half-precision
operations, and the tools for using these operations in neural networks (dot
product, convolution implementation) are starting to become available as well,
which is awesome (see e.g.
[https://github.com/NervanaSystems/nervanagpu](https://github.com/NervanaSystems/nervanagpu)).

NVIDIA themselves have also noticed, and better hardware support for low-precision arithmetic is coming in Pascal:
[http://techreport.com/news/27978/nvidia-pascal-to-feature-
mi...](http://techreport.com/news/27978/nvidia-pascal-to-feature-mixed-
precision-mode-up-to-32gb-of-ram)

~~~
ot
I also really liked this ICML paper [1], which is more theoretically principled than the two you reference (and, weirdly, they do not cite it).

[1] [http://www.eecs.tufts.edu/~dsculley/papers/round-model-
icml....](http://www.eecs.tufts.edu/~dsculley/papers/round-model-icml.pdf)

------
tagrun
From a practical point of view, in case you have Mathematica (which is proprietary software, unfortunately), you can do this easily for any function in a general way:
[http://reference.wolfram.com/language/FunctionApproximations...](http://reference.wolfram.com/language/FunctionApproximations/tutorial/FunctionApproximations.html)

(MiniMaxApproximation minimizes the relative maximum error)

For the case discussed in the write-up, it gives

MiniMaxApproximation[Log2[x + 1], {x, {1, 2}, 2, 0}]

log2(x+1) ~ 0.177392 + 0.943626 x - 0.120232 x^2

with a maximum error of -0.000786282.

In the 4th order (that is, (4,0)), it gives

0.0399284 + 1.27027 x - 0.391233 x^2 + 0.0909038 x^3 - 0.00986308 x^4

max error: -4.83031 10^-6.

However, if the division isn't too expensive, a (2,2) approximant also works, with even better precision:

(0.00383323 + 1.42412 x + 0.336161 x^2)/(1 + 0.704317 x + 0.0598015 x^2)

max error: -1.33902 10^-7

As for the [0.75, 1.5) interval, I get

0.108715 + 1.05621 x - 0.165315 x^2

which gives a max error of -0.000651425; the max relative error is around 0.0006, which is again way better than what he gives in the write-up, although the error is nonzero at the points 0.75 and 1.5 (zero error there is not really a necessary constraint, but an artifact of the particular polynomial he chose). The overall plot of the relative error over a wider range is smooth.

Bottom line: 1) Unless you are an expert in numerical methods, consult software, an article, or a book by an expert in the field (or the expert themselves, if you can) at some point. 2) While I appreciate his efforts and the write-up, I wouldn't recommend using his polynomials in practice.

(BTW, one can use frexp instead of bit-twiddling to make the C code friendlier
since not everyone is familiar with the bit layout of IEEE 754 numbers.)

~~~
tagrun
I realized that I missed that he's talking about log2(x) rather than log2(x+1), which changes the interval.

Since MiniMaxApproximation gets quirky when the function is 0 in the given interval (it minimizes the relative error, so a division by 0 occurs unless you cancel it), I'm using the interval [0.25, 0.5] with t=1/4 instead:

-3.64631 + 7.93796 x - 5.30406 x^2

This gives a maximum relative error of 0.00334047, which is still an order of magnitude better than his.

As for the (2,2) approximant: (-5.83261 - 16.7012 x + 22.394 x^2)/(1 + 11.2638 x + 7.81124 x^2), with a max rel error of 1.69763 10^-6.
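To make the frexp remark from my previous comment concrete, here is a minimal, untested C sketch of how the (2,0) fit on [0.25, 0.5) above could be plugged into the usual argument reduction; frexpf returns a mantissa in [0.5, 1), so the sketch halves it and bumps the exponent by one:

    #include <math.h>

    /* Rough log2 built on the (2,0) minimax fit over [0.25, 0.5) above.
       Assumes x is positive, finite and normal; purely illustrative. */
    static float approx_log2(float x) {
        int e;
        float m = frexpf(x, &e);   /* x = m * 2^e with m in [0.5, 1) */
        m *= 0.5f;                 /* move the mantissa to [0.25, 0.5) ... */
        e += 1;                    /* ... and compensate in the exponent */
        return (float)e + (-3.64631f + (7.93796f - 5.30406f * m) * m);
    }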

I checked several other values of t, and the max rel error for the (2,0) approximant is roughly the same. This leads me to believe that the reason he is getting a 10 times larger error is that he is imposing that the errors should vanish at certain points. Not only does this increase the max rel error, it also distorts the smoothness of the function in a bad way (the kinks in the plots are a bad sign in general).

~~~
gjm11
Your polynomial, since it doesn't get it exactly right at 1/4 & 1/2, will have
an enormous relative error when you use it for computing logs of numbers very
close to 1.

Goldberg's initial function b ignores this issue, and a glance at the graph of
its absolute errors will confirm that it's the minimax poly for absolute error
or very close to it. His next candidate t half-fixes this, and again looking
at the graph of the relative error makes it clear that it's right. (Note that
avoiding intervals containing zero will indeed allow you to get better max
relative errors -- keeping the relative error finite where the function is
zero effectively reduces your polynomial degree by 1.) Then finally he fully
fixes it with the polynomial g, solving the more interesting problem of
minimizing the max relative error for the log function everywhere, and chooses
a better functional form by shifting the interval he reduces to with the
polynomial s.

(The fact that his earlier polynomials do the wrong thing is clearly not a mistake but a pedagogical strategy: start with something simple and incrementally improve it.)

My apologies for spelling all this out, but it really appears that you're
accusing him of making elementary mistakes without actually understanding what
he's doing.

"Imposing that errors should vanish at certain points" is not a mistake, it's
a necessity if you're implementing log by argument reduction plus local
approximation and you care about relative error in the resulting approx_log
function, and you are getting better numbers than Goldberg by solving the
wrong problem.

Your (2,2) rational approximant will be much more expensive to compute than
the polynomial and won't yield the speed gains Goldberg is looking for. And,
again, it will give you a log function with infinite relative error at 1.

[EDITED to fix a typo and clarify something.]

~~~
tagrun
And sorry for spelling this out for you, but maybe it is you who is "accusing
of making elementary mistakes without actually understanding what he's doing."

Here is why:

1) > Your polynomial, since it doesn't get it exactly right at 1/4 & 1/2, will
have an enormous relative error when you use it for computing logs of numbers
very close to 1.

In case you missed it, Mathematica's (not mine) polynomial gives a maximum relative error of 0.00334047. And the points with 0.00334047 error are actually the edge points (which do _not_ include 1).

The important point is, it is _not_ enormous. In fact, his polynomial gives a _10 times_ larger error at certain points, so it is he who has an enormous error. And Mathematica's polynomial is continuous, so even when you get slightly outside of the intended range, you're still getting an error on the order of 10^-3.

2) > Your (2,2) rational approximant will be much more expensive to compute
than the polynomial and won't yield the speed gains Goldberg is looking for.

Again, since you apparently missed it, the (2,2) approximant I gave is for the [0.25,0.5] interval with t=1/4, so yes, bad things can and will happen if you use that polynomial outside of that range (the same thing happens with his polynomials too, obviously). Instead, what one should do is move that 1/4 part into the exponent and stick to [0.25,0.5].

(About the speed vs accuracy trade-off, I already pointed that out in my
original post by saying "if the division isn't too expensive"; I'm not sure
why you are emphasizing it again here.)

Going back to your statement: when computing logs very close to 1, you subtract 2 from the exponent and instead compute on [0.25,0.5] using that polynomial.

3) > "Imposing that errors should vanish at certain points" is not a mistake,
it's a necessity if you're implementing log by argument reduction plus local
approximation and you care about relative error in the resulting approx_log
function, and you are getting better numbers than Goldberg by solving the
wrong problem.

"argument reduction plus local approximation" means range reduction to [t,2t)
for the log function, and this has nothing to do with the behavior of the approximation polynomial at the edges; you may very well have non-zero errors at t and 2t, and there is nothing wrong or alarming about that (see the Handbook of Floating-Point Arithmetic, Chapter 11, for example). The point is that the max rel error should be kept small, which Mathematica does, 10 times better.

Such an artificial condition is necessary only when the relative error function becomes discontinuous and starts making sudden jumps (see Figs. 6-7 from the top in the write-up). He is trying to remedy a problem that appeared due to his choice of t, at the cost of losing degrees of freedom in the polynomial (see also the comments here
[http://reference.wolfram.com/language/FunctionApproximations...](http://reference.wolfram.com/language/FunctionApproximations/tutorial/FunctionApproximations.html)
about what to do when the function becomes zero in the interval, in which case you need to deal with those points by making them special -- note from the example that this doesn't have to mean you lose 2 degrees of freedom to avoid a single point).

So, in short, I _am_ solving the right problem.

4) Which brings me to the other unmentioned problem with some of his polynomials: the derivative is very bumpy and discontinuous, which spells disaster in general math applications. (Clearly, the derivative of his function will have enormous minimax (L^\infty) and RMS (L^2) errors, unlike Mathematica's polynomial.)

Summary:

1) Mathematica's polynomial gives a ~10 times smaller maximum error than the write-up's within the range.

2) Mathematica's polynomial gives a 10x smaller RMS error within the range.

3) Mathematica's polynomial is smooth. (I hope by now it is clear that Mathematica's polynomial nowhere gives any enormous error; quite the opposite, the error is _much_ lower.)

4) If you use approximant polynomials only within their intended range and move the remaining factors into the exponent, everything works out.

5) My pointing out a bad mathematical practice wasn't personal, so please don't steer the discussion in that direction. I understand that you respect him, which may motivate this. My credentials in maths aren't shabby at all (I have more peer-reviewed research papers than him, and in higher-impact journals, etc.), and if Mathematica says one thing and a human says another, in most cases Mathematica is correct, but credentials don't matter in front of solid results. In fact, accepting a scientific result (partly) based on credentials is considered toxic in science. Which brings me back to my original point. I stand by what I said earlier: I don't recommend using the polynomials in the write-up.

~~~
gjm11
> a maximum 0.00334047 relative error

On the interval from 1/4 to 1/2. But the whole point of this exercise is to make an approximation for the log function over its whole domain. If you take Mathematica's/your polynomial and do that with it, you get an infinitely large relative error at 1, because log(1)=0 but your approx_log(1) is not 0.

(This is the _central point_ I was trying to make in my comment, and the fact
that you missed it suggests I wasn't clear enough. I'm sorry about that; in
particular, clearly I should have made it more explicit that I understand how
argument reduction works -- it hadn't occurred to me that that was in
question, since it's a standard technique, described perfectly adequately in
the article we're discussing.)

> if you use the polynomial outside of that range

I never for an instant suggested using the polynomial outside that range. I am assuming the obvious argument-reduction scheme where you say log2(2^a * b) = a + log2(b) and arrange for b to lie within some interval [t,2t). In other words, I am assuming the same thing that Goldberg described and the same thing that you are assuming (but apparently missing a key consequence of).

So. Suppose you compute log2(1) using your polynomial. You argument-reduce: log2(1) = 2 + log2(0.25). Then you use your polynomial: log2(1) = 2 + (-3.64631 + 7.93796/4 - 5.30406/16) ~= 0.0066. The actual value you are looking for is log2(1) = 0. Your relative error is infinite.

(You can use the other end of the interval instead. Then you get -0.0035;
again, the relative error is infinite.)

Goldberg avoids having this happen by forcing the absolute error to be zero at
the endpoints (when he uses a polynomial on [1,2)) or at the relevant point in
the middle (when he uses one on [0.75,1.5)). This extra constraint is what forces him to accept larger relative errors _on his [t,2t) interval_; that's the price he has to pay for finite relative error _on the whole domain of the log2 function_.
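(To put the same point in symbols -- my notation, not the article's: with the reduction $x = 2^a b$, $b \in [t, 2t)$, the approximation is $a + p(b)$, so its relative error is

$$\frac{a + p(b) - \log_2 x}{\log_2 x}.$$

As $x \to 1$ the denominator goes to 0 while the numerator tends to the absolute error of $p$ at the corresponding reduced argument; unless that absolute error is exactly 0, the relative error blows up.)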

> and this has nothing to do with the behaviour of approximation polynomial at
> the edges

Indeed, the fact that the point where you run into trouble is at the end of
the interval is irrelevant, and it would be an interior point that mattered if
you did as Goldberg did and used an interval like [0.75,1.5).

> So, in short, I _am_ solving the right problem.

No, really, you're not. Goldberg wants his relative error for the approx log2
function not to blow up when computing log2(1), which requires that it
evaluate to exactly 0 when fed exactly 1. Your polynomial doesn't have that
property.

(Of course it's questionable whether relative error is what one ought to care
about for an approximate logarithm function. Maybe it isn't. I don't know what
applications Goldberg actually has in mind, and quite possibly he isn't at
liberty to tell us. But the problem he's stated is to compute an approximation
to log2 with reasonable relative error, and it's simply incorrect to say he's
done that badly because you can get reasonable relative error on a smaller
interval, if your relative error goes to infinity elsewhere. Which it does.)

> the derivative is very bumpy and discontinuous

Yeah, I dislike that too. Again, how much this matters depends on the
applications you have in mind.

> My pointing out a bad mathematical practice wasn't personal, so please don't
> steer the discussion into that direction. [...] accepting a scientific
> result (partly-)based on credentials is considered to be toxic in science.

I never said anything even slightly personal; I have made no comment on either
Goldberg's credentials or yours (or for that matter mine); only one person in
this discussion has appealed to credentials, and it's you. Twice. (Once
earlier: "Unless you are an expert in numerical methods ...". Once just now,
comparing your credentials with Goldberg's.)

Anyway: of course I strongly agree: credentials are less important than
results. In this case, Goldberg made an approximation to log2 whose relative
error is bounded on its entire domain of applicability, and you didn't. In
particular, #4 from your summary is flatly incorrect.

~~~
tagrun
> If you take Mathematica's/your polynomial and do that with it, you get an
> infinitely large relative error at 1 because log(1)=0 but your approx_log(1)
> is not 0.

So what?

Instead of accepting that his polynomial is more erroneous, you are steering
this whole discussion in a totally synthetic direction.

Essentially your whole argument is that the _global_ relative minimax error must be finite at every single point in the domain.

I don't know where you got this idea/mantra, but the point here is to have a rough and fast sketch with small overall error in applications, not mathematical rigor, which you can't have with a 2nd-degree polynomial anyway. His polynomials give infinitely huge L^1 and L^2 errors near the origin; the minimax is finite (1), but that doesn't make an infinitely big error OK. Which is why such simple-minded mantras are wrong and bound to fail.

While minimax is the only thing that gets mentioned there, the world is a bigger place. There are many error measures. The concept of minimax approximation breaks down at the points where the function is 0 (you can have an arbitrarily small error and we're still doomed according to minimax) -- this is not the end of the world; in many applications, the L^2 (RMS) error is more relevant, in which case Mathematica's approximant is again 10 times better.

Do I want to make my errors 10 times larger over the _whole domain_, and do I want a bumpy approximant, just to make minimax happy at _one single_ point in the whole floating-point range? No, thanks.

If we're going to pick nits and be rigorous, his function fails big time at x=0 and x<0, and for x=\pm \infty as well as NaN. Now, if this is a real concern, people just put a conditional there and live on happily. You can do that for the x==1 case if it is a serious concern in your application -- even though it is very unlikely (this shouldn't bother you since you're OK with bumpy functions, and it's going to be a small bump in this case anyway, and it still beats his polynomial by an order of magnitude globally).

The whole point here is to have some rough sketch of log that works fast and
yields small errors overall in the actual calculation. If I wanted rigor, I'd
use libm log2 or
[http://www.netlib.org/fdlibm/e_log.c](http://www.netlib.org/fdlibm/e_log.c)
which is also quite fast.

Understand this: the relative flavor of minimax is a measure which is rarely directly relevant, and it is hardly suited to the log function. It's not God's absolute commandment, and the requirements of domain-specific rough approximants depend on the actual usage: whether the errors are going to end up being additive, multiplicative, etc.

Anyway, I'm not going to waste more time on this. Here are the unwavering facts since the beginning: 1) His polynomial has a 10x worse relative error, and a worse RMS error, over the whole domain. 2) Your whole argument "but what about x==1" is misplaced, since the approximant you have been fiercely defending here itself suffers from infinite pathological errors at x <= 0; if it bothers you so much nevertheless, just put an if(x==1) return 0, and there you go, I still have a 10x better function.

~~~
gjm11
> So what?

So "produce a fast approx log function with small relative error everywhere"
is the problem Goldberg was trying to solve from the outset. (He wants to be
able to give guarantees like "accurate to 7 bits".) The fact that he gets
larger errors than you is completely explained by this.

_Of course no one else is obliged to be interested in solving that problem, but that's the problem he's trying to solve._

(If you'd said "I think Goldberg is solving the wrong problem; no one cares
about relative error rather than absolute error in an approximate logarithm
function" I'd have been inclined to agree. But no, it was "haha, he computed a
crappy polynomial; that's what happens when you let amateurs do work that
should be left to the professionals" with no sign of any understanding of what
constraints he was working with that led him to compute what he did.)

> Instead of accepting that his polynomial is more erroneous, you are steering
> this whole discussion in a totally synthetic direction.

As well as agreeing that his polynomial gives larger errors on the range to which it's applied directly (which I _have_, explicitly, agreed; go back and check if you don't believe me), I am pointing out that it does (infinitely) better than yours according to _the criterion Goldberg was actually using_, namely relative error over the whole domain of the log function it produces.

> mantra

Wow.

(I got it from the start of Goldberg's article, where he explains what problem
he's trying to solve.)

> infinitely huge L^1 and L^2 errors near origin

(I take it you mean in the derivative.) Yours does too; it is discontinuous at
every integer power of 2. Obviously any piecewise-polynomial scheme like this
one is going to have discontinuities in some derivative (and you can push up
which derivative at the usual cost of either using higher-degree polynomials
or getting worse errors).

> There are many error measures.

It appears that you think I am much more ignorant in this field than I
actually am. (Or perhaps that you're pretending to for rhetorical effect, but
I hope not.)

> just to make minimax happy at _one, single_ point

If the relative error is infinite at x=1 then it is also very large when x is
very close to 1.

> fails big time at x=0 and x<0

You're complaining that he isn't generating NaNs and infinities when passed
inputs that don't have a finite log? Fair enough, I guess, but in a post
subtitled "Part I: the basics" it seems enough to me that he has the following
warning sentence: "The code does not check that x>0, much less check for
infinities or NaNs. This may be appropriate for a fast version of log." (Which
he does.)

> fiercely defending

I think you may be mistaken about where the fierceness is coming from in this
discussion.

> just put an if (x==1) return 0 and there you go, I still have a 10x better
> function

Nope. You've got rid of the _actually infinite_ relative errors but you still
have _extremely large_ relative errors for x very close to 1 but not equal.

Again: by all means argue that bounding the relative error in the return value
of a log function is the Wrong Thing for almost all purposes. I will probably
agree. But that isn't what you did.

------
minimax
The author of this post, David Goldberg, is the same guy that wrote "What
Every Computer Scientist Should Know About Floating-Point Arithmetic."

[http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.ht...](http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html)

~~~
bumbledraven
Great find. There's an archive of his posts on the eBay Tech Blog. Looks like
interesting reading.

[http://www.ebaytechblog.com/author/dgoldberg/](http://www.ebaytechblog.com/author/dgoldberg/)

------
malkia
I've learned to be extra careful with fast log approximations. I was the tools programmer at a game studio, and a few years ago I was given the task of speeding things up a bit. There was a process that was summing neighbouring pixels, doing some filtering/pre-processing, then calculating a value through log. I profiled it, found that this was the slowdown, and decided to provide a faster log version. After some googling and testing, it seemed everything should've worked.

A few days later, black mipmaps started to appear. It turns out I was doing the testing the wrong way: I thought I was comparing new to old, but I was actually comparing old to old output created by another machine (slightly different results are to be expected). All because I forgot to turn off caching.

And my approximation of log2 did not really work for most of the numbers involved. It was using floats, not doubles, to begin with, and was producing a lot of NaNs and infinities (hence the black colors).

Anyway, it was reverted, and I had to reconvert everything to get back to normal.

------
TheLoneWolfling
Another interesting idea is ditching the format of a classic polynomial. Multiplication isn't the fastest thing ever.

I wonder what would happen if you ran the same minimax scoring scheme through a general instruction synthesizer with the same complexity cap?

------
boulos
This is a pretty nice introduction, and makes me want to return to my side
project where I used to play with these approximations but for SIMD math:
[https://github.com/boulos/syrah/blob/master/src/include/syra...](https://github.com/boulos/syrah/blob/master/src/include/syrah/FixedVectorMath.h)
.

I hope in a later post he mentions Sollya ([http://sollya.gforge.inria.fr](http://sollya.gforge.inria.fr)); I found it invaluable for playing around with different precision vs. speed trade-offs.

------
jgalt212
Why is eBay using the log function, and why is it used so heavily that it causes a performance bottleneck for them? Is this an inherited code base from their Hunch purchase?

------
me2you
As in Math Toolkit for Real-Time Programming by Jack Crenshaw

------
rg2004
If your bottleneck is log() and you're not memory-limited, why not use a
lookup table?

~~~
walshemj
which is how pre calculators you used to do it by hand using Log tables

~~~
Houshalter
Computing log tables is what the would-be first computer was intended to do.

------
brandmeyer
Boost used to have a minimax approximation algorithm in its codebase. Unfortunately, nobody knows enough about how it works to maintain it, and you have to look in much older releases of Boost to get it.

~~~
alnsn
This one? #include <boost/math/tools/remez.hpp>

I used some bits from it to help me in building my own log approximation but I
didn't have time to dig into it.

I ended up doing a Chebyshev approximation over the [1, 2) interval, but I extended the interval slightly to make sure that the error term at 1.0 is minimal (you can also minimise the error term at 2.0 if you want some smoothness between the [1, 2) and [2, 4) intervals).
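For anyone curious, here is a bare-bones C sketch of a plain Chebyshev fit of log2 over [1, 2] -- without the interval-stretching trick described above -- roughly following the classic chebft/chebev recipes; the coefficient count and the little driver are arbitrary choices, so treat it as illustrative only:

    #include <math.h>
    #include <stdio.h>

    #define NCHEB 8   /* number of Chebyshev coefficients, chosen arbitrarily */
    #define PI 3.14159265358979323846

    /* Chebyshev coefficients of f on [a,b] from its values at the Chebyshev nodes. */
    static void cheb_fit(double a, double b, double c[NCHEB], double (*f)(double)) {
        double fv[NCHEB];
        for (int k = 0; k < NCHEB; k++) {
            double y = cos(PI * (k + 0.5) / NCHEB);        /* node in [-1, 1] */
            fv[k] = f(0.5 * (b - a) * y + 0.5 * (b + a));  /* node mapped to [a, b] */
        }
        for (int j = 0; j < NCHEB; j++) {
            double s = 0.0;
            for (int k = 0; k < NCHEB; k++)
                s += fv[k] * cos(PI * j * (k + 0.5) / NCHEB);
            c[j] = 2.0 * s / NCHEB;
        }
    }

    /* Evaluate the truncated Chebyshev series with the Clenshaw recurrence. */
    static double cheb_eval(double a, double b, const double c[NCHEB], double x) {
        double y = (2.0 * x - a - b) / (b - a);   /* map x back to [-1, 1] */
        double d = 0.0, dd = 0.0;
        for (int j = NCHEB - 1; j >= 1; j--) {
            double sv = d;
            d = 2.0 * y * d - dd + c[j];
            dd = sv;
        }
        return y * d - dd + 0.5 * c[0];
    }

    int main(void) {
        double c[NCHEB];
        cheb_fit(1.0, 2.0, c, log2);
        for (double x = 1.0; x <= 2.0; x += 0.125)
            printf("%.3f  approx=%.8f  err=%+.2e\n", x,
                   cheb_eval(1.0, 2.0, c, x),
                   cheb_eval(1.0, 2.0, c, x) - log2(x));
        return 0;
    }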

It's easy to calculate new interval values for Chebyshev approximation and it
shouldn't be a problem to modify Remez but I couldn't find any Remez
implementation that would let me do what I needed.

Usually, the interval ends in the Remez algorithm are fixed, while internal points can be moved by the algorithm. I needed a way to fix one or two internal points (1.0 and optionally 2.0) but let the algorithm move the interval ends within certain limits.

If anyone knows of any tool that would let me do it, I'd be happy to play with it and publish my code. My approach is a bit similar to [1], but I'd more likely use Intel assembly or DynASM.

[1] [http://gallium.inria.fr/blog/fast-vectorizable-math-
approx/](http://gallium.inria.fr/blog/fast-vectorizable-math-approx/)

PS The author of [1] uses remez from the Sollya tool, but it doesn't let me fix internal points. I looked at the implementation, but it's more obscure than boost/remez.

UPDATE: it's not clear from the above what my objectives were. I wanted to produce exact values for integer powers of 2, including log2(1). This approach also gives a small relative error around x=1.

~~~
brandmeyer
Yes, that one. It comes with a driver program using the Boost.Test framework
that allows you to change the range and fix the values at the interval ends if
that is an objective.

Their framework does not allow you to arbitrarily fix internal values as per your spec, however.

------
paulpauper
But why is this important? Can't computers already do this? Even my 1978 TI
pocket calculator can do logs very fast. It's not like this is an important
unsolved mathematics puzzle.

~~~
pjscott
From the first sentence:

> Performance profiling of some of the eBay code base showed the logarithm
> (log) function to be consuming more CPU than expected.

There are a lot of different types of "very fast", and these guys needed a
faster one.

~~~
TheLoneWolfling
I wonder why eBay is using the log function?

~~~
pmiller2
I'm more curious why they use it so extensively that tripling the speed of it
is significant to them.

~~~
TheLoneWolfling
That's a much better way to phrase that, thanks.

------
Houshalter
I tried evolving a good approximation with Eureqa:
[https://i.imgur.com/1CTTjYD.png?1](https://i.imgur.com/1CTTjYD.png?1)

~~~
dalke
It looks like all of them (they use sqrt() and x^y) will be slower than
calling log() directly.

~~~
Houshalter
Here it is without those operations:
[https://i.imgur.com/xhyNnU4.png?1](https://i.imgur.com/xhyNnU4.png?1)

EDIT: Without sin/cos:
[https://i.imgur.com/mTwgKKO.png](https://i.imgur.com/mTwgKKO.png)

You could do better by splitting the function into different ranges and using
a lookup table for a formula that fits each range the best. This is how many
fast approximations are done. See how the residual error fits nicely into a
few different curves? Eureqa doesn't support this though.

~~~
dalke
The original posting ended with a second-order polynomial, a few bitwise
operations, and an if statement.

All of the ones you just posted are more computationally difficult. The non-
trig ones, for example, include a division, and the best fit is a higher-order
polynomial. There's no reason to believe it will be as fast to evaluate, so
the question would be, is it more accurate and is the accuracy worth the
additional time? It would be ill-advised to use one of those if it ended up
being both less accurate than log() and slower.

Concerning lookup tables, as tgbrter answered in this thread, they are likely
slower. It still needs the scaling, and also needs two value lookups followed
by an interpolation. The cost of doing the interpolation will be about the
same as evaluating the second order polynomial, but the pipeline will be
waiting on the two memory lookups.
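For concreteness, a minimal C sketch of that two-lookups-plus-interpolation scheme -- single precision, positive normal inputs only, table size picked arbitrarily; whether it actually beats a polynomial depends entirely on where the table ends up in the cache hierarchy:

    #include <math.h>
    #include <stdint.h>
    #include <string.h>

    #define LOG_TABLE_BITS 8
    #define LOG_TABLE_SIZE (1 << LOG_TABLE_BITS)

    static float log2_table[LOG_TABLE_SIZE + 1];

    /* Call once: tabulate log2 over the mantissa range [1, 2]. */
    static void init_log2_table(void) {
        for (int i = 0; i <= LOG_TABLE_SIZE; i++)
            log2_table[i] = log2f(1.0f + (float)i / LOG_TABLE_SIZE);
    }

    /* Scaling, two table lookups, one linear interpolation.
       Assumes x is positive, finite and normal. */
    static float table_log2(float x) {
        uint32_t bits;
        memcpy(&bits, &x, sizeof bits);               /* reinterpret IEEE-754 bits */
        int e = (int)((bits >> 23) & 0xff) - 127;     /* unbiased exponent */
        uint32_t m = bits & 0x007fffffu;              /* 23-bit mantissa */
        uint32_t idx = m >> (23 - LOG_TABLE_BITS);    /* top bits -> table index */
        float frac = (float)(m & ((1u << (23 - LOG_TABLE_BITS)) - 1u))
                   / (float)(1u << (23 - LOG_TABLE_BITS));
        return (float)e + log2_table[idx]
             + frac * (log2_table[idx + 1] - log2_table[idx]);
    }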

Assuming everything is in L2 cache, that might not be a problem. But depending on the problem, it may be in L3 or (shudder) main memory, which is eons away. So, as tgbrter also pointed out, you may end up with variable calculation time depending on the cache.

~~~
Houshalter
Eureqa lets you choose what functions you use and what cost each has. I just need to know what the expense is for each operation, and I can feed it in. I removed division and increased the cost of multiplication:
[https://i.imgur.com/8C9OBEY.png?1](https://i.imgur.com/8C9OBEY.png?1) (I'm using the author's error metric of relative error, which is what the 0 = abs(y-...)/y means.)

Here it is without mod which I think might be expensive: pg1:
[https://i.imgur.com/SMvj68W.jpg](https://i.imgur.com/SMvj68W.jpg), and pg2:
[https://i.imgur.com/ACw321W.png?1](https://i.imgur.com/ACw321W.png?1)

And of course the best fit has extra complexity; that's why it gives you many choices along the complexity tradeoff.

I wasn't suggesting a lookup table to approximate the function, but to choose the formula itself. E.g. you might have a function which fits the first half of the function very well, and another which fits the second half very well. I think that branching code is slower, so alternatively you can do a lookup for the constants of the formula. Some constants will be optimized for one part of the function, and another set of constants will be optimized for another part.

I don't think that kind of lookup would be too slow. You only need to fit a
few elements in the cache, and interpolation is a simple operation. It's just
a subtraction, a multiplication, and an addition.

~~~
Houshalter
EDIT: In the last paragraph I mean that I don't think a _complete_ lookup table and interpolation, like you mentioned, would be too slow. I'm not referring to the lookup table of formulas I suggested, but to computed log values.

~~~
dalke
I am confused about why you need to _think_ that certain operations are/are
not expensive, or why you _think_ a table is faster.

I think they are expensive, and it won't be faster. Hence we're at an impasse. That block was predictable, and easily solved by taking the equation you have and testing it against the reference code.

If you test it out, you'll know the answer. There's no need to think you know
the answer when you can just measure it, and overrule any incorrect beliefs I
may have.

~~~
Houshalter
As I said, I don't have a table of how fast each operation is. I don't think
any of my assumptions were unreasonable, but I did explicitly state they were
assumptions. And I don't have any way of testing it.

I am interested in function approximation mainly for finding functions when there is no closed-form solution, not so much for tiny optimization.

