
Statistical shortcomings in standard math libraries (and how to fix them) - jwmerrill
http://www.evanmiller.org/statistical-shortcomings-in-standard-math-libraries.html#footnotes
======
jwmerrill
Unfortunately, it looks like the Cephes library of mathematical functions
linked from this article is released with a poorly specified license, and
contains code that is copyrighted by multiple organizations:
[http://www.netlib.org/cephes/readme](http://www.netlib.org/cephes/readme)

~~~
gone35
[ _Edit: I overlooked the fact that, being free under the GPL, the GSL
implementations do not suit the author's intent. (See jwmerrill's comment
below.) Leaving the comment anyway just for reference, but my original point
is moot._]

This. The author should have at least mentioned the ubiquitous and reasonably
well-maintained GNU Scientific Library (GSL) [1], which already implements all
the functions listed in the article [2,3,4,5], among many, many other things
[6].

[1] [http://www.gnu.org/software/gsl/](http://www.gnu.org/software/gsl/)

[2] [http://www.gnu.org/software/gsl/manual/html_node/Incomplete-...](http://www.gnu.org/software/gsl/manual/html_node/Incomplete-Gamma-Functions.html#Incomplete-Gamma-Functions)

[3] [http://www.gnu.org/software/gsl/manual/html_node/Incomplete-...](http://www.gnu.org/software/gsl/manual/html_node/Incomplete-Beta-Function.html#Incomplete-Beta-Function)

[4] [http://www.gnu.org/software/gsl/manual/html_node/The-Gaussia...](http://www.gnu.org/software/gsl/manual/html_node/The-Gaussian-Distribution.html#The-Gaussian-Distribution)

[5] [http://www.gnu.org/software/gsl/manual/html_node/Bessel-Func...](http://www.gnu.org/software/gsl/manual/html_node/Bessel-Functions.html#Bessel-Functions)

[6] [http://www.gnu.org/software/gsl/manual/html_node/](http://www.gnu.org/software/gsl/manual/html_node/)

~~~
jwmerrill
The GSL is a great resource, but I don't think it suits the author's purpose.
He is talking about adding new functions to the C standard library, for anyone
to use in commercial or non-commercial software. It isn't practical to stick
GPL code into C standard libraries that are meant to be used broadly in
commercial as well as free software.

~~~
gone35
You are absolutely right. Now I see why the author didn't mention it. That was
a knee-jerk reaction on my part.

------
Bootvis
The meat isn't in the footnotes, so if a mod could update the link it would be
nice.

~~~
jwmerrill
Sorry about that.

------
cjslep
We are still using FORTRAN routines written decades ago to generate random
numbers from a variety of distributions. Having support for the inverse
standard normal CDF would have been incredibly helpful, especially when I had
to go back in and add support for correlated uniformly distributed variates.
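
For readers wondering why the inverse normal CDF matters here, the following is a minimal Python sketch, not the FORTRAN code described above. It assumes Python 3.8+ for `statistics.NormalDist`. The function names `normal_from_uniform` and `correlated_uniform_pair` are hypothetical, and the Gaussian copula is only one common way to get correlated uniform variates.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def normal_from_uniform(u):
    """Inverse transform sampling: map u in (0, 1) to a standard normal variate."""
    return _STD_NORMAL.inv_cdf(u)


def correlated_uniform_pair(rho, rng=random):
    """Return two uniform(0, 1) variates coupled through a Gaussian copula.

    rho is the correlation of the underlying normals, not of the uniforms
    (an assumption of this sketch; the two differ slightly).
    """
    z1 = rng.gauss(0.0, 1.0)
    # Mix an independent normal into z1 so that corr(z1, z2) == rho.
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    # Pushing each normal through its own CDF yields uniform marginals.
    return _STD_NORMAL.cdf(z1), _STD_NORMAL.cdf(z2)
```

Both directions are needed: the CDF turns normals into uniforms, and the inverse CDF turns uniforms back into normals.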

------
mindcrime
I see that the author of TFA wants this stuff put in the various standard
libraries, and I agree that that would be good. But in the meantime, at least
there are a variety of libraries one can use for accessing statistical
functions. In Java-land there is Commons Math[1] which provides a number of
statistical functions (in addition to other numerical/maths functions).

Things like QuantLib and JQuantLib also provide a lot of
mathematical/numerical algorithms. Yeah, it's a drag having to pull in another
library, but at least these things exist. :-)

[1]: [http://commons.apache.org/proper/commons-math/index.html](http://commons.apache.org/proper/commons-math/index.html)

------
eliteraspberrie
I wouldn't have guessed these functions were so popular. I think those people
who need them, need them often, but they are very few, so standards committees
just ignore their needs.

If I have time this summer, I'll implement the Gamma and Beta functions, and
their inverses, with a permissive license.
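
To give a sense of what such an implementation involves, here is a rough Python sketch of the regularized lower incomplete gamma function P(a, x), using its standard power series. The name `regularized_lower_gamma` and the `tol` and `max_terms` parameters are hypothetical. This is an illustration only: the series converges slowly when x is much larger than a, and production implementations typically switch to a continued fraction in that regime.

```python
import math


def regularized_lower_gamma(a, x, tol=1e-15, max_terms=10_000):
    """P(a, x) = gamma(a, x) / Gamma(a) via its power series (a > 0, x >= 0)."""
    if a <= 0.0 or x < 0.0:
        raise ValueError("requires a > 0 and x >= 0")
    if x == 0.0:
        return 0.0
    # First term of sum_{n>=0} x^n / (a (a+1) ... (a+n)).
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    # Prefactor x^a e^{-x} / Gamma(a), computed in log space to avoid overflow.
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))
```

Two identities make handy sanity checks: P(1, x) equals 1 - exp(-x), and P(1/2, x) equals erf(sqrt(x)).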

------
awalton
The problem with this is that most programmers don't need statistics on a
regular enough basis. That's how things get into standard libraries: people
need these things so regularly that there's no point in reimplementing them
forever.

You really think we need stats functions in libc more than we need, say,
properly written accelerated vector and matrix functions? Which do you think
comes up more regularly in practice?

You're better off reimplementing this handful of functions in a BSD-licensed
copy-lib and forgetting about it. None of these are particularly tricky to
implement.

~~~
jwmerrill
> The problem with this is that most programmers don't need statistics on a
> regular enough basis.

Empirically, many language environments (scripting languages, for example)
don't have easy access to all of these functions, but do have easy
access to most of the rest of libm. The author is saying that if these were
part of libm, then language implementers would include them in their
respective standard libraries, and then people who care about stats (and many
of us should care about stats!) could write good stats software in the
general-purpose languages that we might have chosen for some independent reason.
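
As a small illustration of the gap, Python's `math` module exposes `erfc`, which C99's libm also provides, so the normal CDF is a one-liner. The helper name `normal_cdf` is hypothetical. C's `math.h` has no inverse error function to match, which is why the quantile function is the harder half.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Normal CDF built only from erfc, which libm (and Python's math) provides."""
    # erfc keeps precision in the lower tail, where 1 + erf(...) would cancel.
    return 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0)))
```

For example, `normal_cdf(1.96)` is roughly 0.975, the familiar two-sided 95% cutoff.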

> You really think we need stats functions in libc more than we need, say,
> properly written accelerated vector and matrix functions?

Can't we have both?

~~~
thisrod
The (dense) vector and matrix functions have lived in LAPACK and BLAS for 40
years, longer than there has been a libc. Every serious language has a
standard interface for them, and the issues discussed in the article haven't
caused problems for decades.

