
SLEEF Vectorized Math Library - gbrown_
https://sleef.org/
======
svantana
Interesting, but a bit weird that the benchmark doesn't include the standard
math functions - the question most potential users would ask is, how much
performance could I gain from this?

Personally I do a lot of approximations of math functions, which usually give
me about a 10x speed increase relative to <math.h>. From the looks of it, this
isn't quite that fast.

~~~
celrod
Recent versions of GCC as well as the Intel compilers can autovectorize with
the right flags (-ffast-math with GCC, -fast with Intel), and yield better
performance with the <math.h> functions than you'll get from SLEEF on x86_64.

LLVM doesn't have a vector math library, so SLEEF could help you there.

If you're using single precision and AVX512, a >10x speed increase is likely.
Otherwise, you'll probably get less than that. These functions are very
accurate, most to within 1 ULP (unit in the last place). That is, if the answer
they provide isn't the correctly rounded floating point answer, then it'll be
either the next or previous representable floating point number.
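
To make the "within 1 ULP" claim concrete, here is a minimal Python sketch (not SLEEF code). The helper name `ulp_error` is made up for illustration, and it measures against a double-precision reference value rather than the exact real result, which is a simplification.

```python
import math

def ulp_error(approx: float, reference: float) -> float:
    """Distance between approx and reference, in ULPs of the reference."""
    return abs(approx - reference) / math.ulp(reference)

# Pretend 1.5 is the correctly rounded answer of some function.
correct = 1.5

# The next and previous representable doubles around it.
next_up = math.nextafter(correct, math.inf)
next_down = math.nextafter(correct, -math.inf)

print(ulp_error(correct, correct))    # exact answer: 0 ULP
print(ulp_error(next_up, correct))    # one step above: 1 ULP
print(ulp_error(next_down, correct))  # one step below: 1 ULP
```

A function with a 1 ULP bound may return any of those three values, but nothing further away. `math.ulp` and `math.nextafter` need Python 3.9 or newer.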

There is a lot of room for giving up accuracy in the name of speed (e.g., using
fewer terms in the polynomials).
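
A rough illustration of that trade-off, as a sketch only: it uses plain Taylor coefficients for exp on [-0.5, 0.5], whereas production libraries typically use tuned (minimax-style) coefficients plus range reduction. The function names are hypothetical.

```python
import math

def exp_poly(x: float, terms: int) -> float:
    """Truncated Taylor series of exp(x) with `terms` terms, via Horner's rule."""
    acc = 0.0
    for k in range(terms - 1, -1, -1):
        acc = acc * x + 1.0 / math.factorial(k)
    return acc

def max_abs_error(terms: int, samples: int = 1001) -> float:
    """Largest absolute error against math.exp on [-0.5, 0.5]."""
    xs = [-0.5 + i / (samples - 1) for i in range(samples)]
    return max(abs(exp_poly(x, terms) - math.exp(x)) for x in xs)

# Fewer terms means fewer multiply-adds per call, but a larger error.
for terms in (4, 8, 12):
    print(terms, max_abs_error(terms))
```

Each dropped term saves one multiply-add per evaluation, so the cost scales with the term count while the error grows quickly as terms are removed.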

~~~
svantana
Do you have a benchmark showing that gcc is faster than sleef?

Sorry I wasn't clear: by "10x" I meant 10x faster than the standard library
with the fastest compiler options (-O3 -ffast-math -mavx512f), but I've only
tested with clang.

------
gameswithgo
I use this library as an optional feature in simdeez. It was a relief to find
it, and a Rust wrapper for it. It's hard stuff to port; it is so arcane. Big
thanks to SLEEF!

~~~
celrod
I've been relying on a Julia port of version 2 (version 3 is out now). Version
3 added a lot of new functions, and I believe it improved performance on many
of the old ones.

It is much faster (when vectorized) than what you get in base Julia, but lags
behind gcc (glibc) and the Intel compiler's vectorized math libraries in
performance.

~~~
improbable22
Curious whether you've compared it to
https://github.com/chriselrod/LoopVectorization.jl? If I understand right,
that is a pure-Julia attempt to use many of the same tricks.

~~~
celrod
Compare my username with that of the author of that library ;).

For special functions, LoopVectorization relies on SLEEFPirates.jl, which is a
fork of SLEEF.jl, a Julia port of version 2 of SLEEF. Most of the changes in
SLEEFPirates are so that it works when you use llvm-vectors as arguments, but
I also switched to using Estrin's rather than Horner's method of evaluating
polynomials for a few functions (which more recent SLEEF versions did as
well).
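
For readers unfamiliar with the two schemes, a minimal Python sketch of both follows. This is not the SLEEFPirates or SLEEF code; the function names and coefficient layout (lowest degree first) are assumptions made for the illustration.

```python
def horner(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) as one long chain of multiply-adds."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def estrin(coeffs, x):
    """Evaluate the same polynomial by pairing terms, squaring x each round."""
    level = list(coeffs)
    power = x
    while len(level) > 1:
        if len(level) % 2:
            level.append(0.0)  # pad so every term has a partner
        # Each pair (a, b) becomes a + b * power; pairs are independent.
        level = [level[i] + level[i + 1] * power
                 for i in range(0, len(level), 2)]
        power *= power
    return level[0] if level else 0.0

coeffs = [1.0, 2.0, 3.0, 4.0, 5.0]  # 1 + 2x + 3x^2 + 4x^3 + 5x^4
print(horner(coeffs, 2.0), estrin(coeffs, 2.0))
```

In Horner's method every step depends on the previous one, while Estrin's pairs within a round do not depend on each other, which lets hardware overlap them. Pure Python will not show that speedup; the sketch only shows that both orderings compute the same polynomial. The two can differ slightly in rounding for general inputs.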

The code is pure Julia (or Julia + LLVM call; either way it does not need any
external dependencies aside from Julia itself). It does need performance work,
but I have many higher priorities at the moment.

~~~
improbable22
Hah! Sorry, didn't cross my mind. And thanks for the details.

