

Exact Numeric Integration - ssp
http://conal.net/blog/posts/exact-numeric-integration/

======
gjm11
It took me a little while to work out what he's doing here, so here are a few
notes for interested readers.

0\. Key point: the goal here is to do infinite-precision integrals. Not merely
"arbitrary precision", where you get to choose a number of decimal places and
then do the integral, but "infinite precision", where the result of the
integration is an object you can get arbitrarily many decimal places out of.

1\. His key data structure is a "lazy number" (not his term) that represents
some specific exact real number, and which you can query somehow (he doesn't
really say how) to get an arbitrarily good approximation to its value. For
instance, you might be able to say "give me an interval of length at most x
that the number is guaranteed to lie within".

2\. So he does everything with interval arithmetic. In particular, you need to
have a way to do an interval evaluation of the function you're trying to
integrate. That is, given an interval [a,b] you can get values lo,hi such that
lo <= f(x) <= hi whenever a <= x <= b, and if you let these intervals shrink
to a point your lower and upper bounds converge on the value of f at that
point. (Note: he never states that last point, but it's important.) Let's call
these lb(f,a,b) and ub(f,a,b) for "lower bound" and "upper bound".

3\. Clearly the following two things are true. (a) The integral of f between a
and b -- call this I(f,a,b) -- lies between (b-a).lb(f,a,b) and
(b-a).ub(f,a,b). (b) I(f,a,b) = I(f,a,m) + I(f,m,b) where m = (a+b)/2. (Or
indeed where m = anything else.)

4\. (3a) gives us a crude way to evaluate I(a,b) roughly. Typically it's very
crude, but it's guaranteed correct and as a,b get closer together it converges
on the right answer. (3b) gives us a way to evalute I(f,a,b) more accurately,
by using integrals over shorter intervals.

5\. So, he defines I(f,a,b) = (3a) intersect (3b), where that's a lazy
intersection on "lazy numbers". The idea is that when you query it, it uses
(3a) if that's enough to get the accuracy you need, and otherwise it uses (3b)
to do better.

6\. There are a whole lot of details being swept under the rug here. For
instance, when you take the (3b) branch, how much accuracy do you need from
the two subproblems to get the requested accuracy for the final result? (This
question is relevant only if the unspecified functionality provided by his
lazy numbers really is "give me an interval of length at most L containing the
correct value". If instead what you get is "tell me whether the correct value
is above or below X", there's a different problem, which is that if X happens
to _be_ the correct value then the computation will continue for ever until
you run out of memory or terminate it.) One possible answer is that you demand
twice as accurate a result from each sub-computation. I think that in this
case the algorithm will in general not actually terminate, but I haven't
thought it through carefully.

7\. So the whole thing is a bit like the following (with apologies for absence
of formatting):

integral(f,a,b,tolerance) = let lb,ub = bounds(f,a,b) in {if abs(b-a)abs(ub-
lb) <= tolerance then (b-a)[lb,ub] else let m = (a+b)/2 in
integral(f,a,m,tolerance/2) + integral(f,m,b,tolerance/2)}

except that (a) it uses lazy evaluation to make it so that you call
integral(f,a,b) and then repeatedly feed tolerance values to that, (b) because
of this it's presumably able to do less redundant recomputation when you do
the calculation repeatedly with different tolerances, and (c) the actual way
in which the tolerances are handled may be different.

~~~
samstokes
Thanks for this - helped my understanding of the original article.

