Hacker News

A quick note for those upvoting this ...

Thank you. I'm casting about for more things to write about like this. There are lots of things hackers will have read, but not seen the details for, and sometimes the details are very accessible.

If you liked this one, what else would you like? For example:

* Between every two rationals there's an irrational

* Between every two rationals there's a rational

* The rationals can be listed, in full, exactly once each

That kind of stuff. What assertions or claims have you seen made that you'd like explained in more detail?

Feel free to reply here, or email - address in my profile.

Edited for layout.




I don't know how accessible it can be made, but the Khinchin-Lévy constant is astonishing, both in its existence and its value [1].

[1] http://mathworld.wolfram.com/Khinchin-LevyConstant.html


I can give a handwavy explanation for this. I'm afraid it's a bit heavy going in places, but there's nothing too difficult and the more technical bits can largely be skipped.

First, a reminder of what we're going to kinda-sorta-prove: Take any number and compute its continued fraction, giving approximations p1/q1, p2/q2, etc. Then for "almost all" numbers it happens that qn^(1/n) tends to the magical number exp(pi^2 / (12 log 2)).
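Before diving in, here's a hedged numerical sanity check of that claim (not part of the argument below). Since we can't hold a random real number in a computer, a random rational with a 600-bit denominator stands in for a "typical" real: its first couple of hundred continued-fraction coefficients behave statistically like those of a random number in (0,1). The denominators qn come from the standard recurrence q_k = a_k*q_{k-1} + q_{k-2}.

```python
import math
import random
from fractions import Fraction

def levy_estimate(x, n):
    """Estimate log q_n(x)^(1/n) from the first n continued-fraction
    denominators of x (a Fraction in (0,1)), computed exactly with the
    standard recurrence q_k = a_k*q_{k-1} + q_{k-2}."""
    q_prev, q = 0, 1               # q_{-1} and q_0
    for _ in range(n):
        a = math.floor(1 / x)      # next coefficient
        x = 1 / x - a
        q_prev, q = q, a * q + q_prev
    return math.log(q) / n         # log of q_n^(1/n)

# A random 600-bit rational plays the role of a "typical" real.
random.seed(0)
x = Fraction(random.getrandbits(600) | 1, 2**600)
est = math.exp(levy_estimate(x, 200))
# est should land near exp(pi^2 / (12 log 2)) ≈ 3.2758, give or take
# the statistical fluctuation you'd expect at n = 200
```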

And a brief summary of the idea. The process of computing continued fractions is basically one of applying a certain function again and again, and the number qn^(1/n) can be related to a certain average involving the numbers you get along the way. A very general and powerful thing called the "ergodic theorem" tells us that when you do this, the values you get look a lot like random numbers chosen with a particular distribution. The corresponding average (over numbers chosen with that probability distribution) is a fairly simple integral, equal to an infinite series whose value is the thing we're looking for.

CONTINUED FRACTIONS

OK. So, how do you find the continued fraction for a number x? You repeatedly subtract off the integer part and then take the reciprocal; the integer parts you subtract off are the c.f. coefficients.

Forget about the integer parts for a moment and just consider the following function: x -> fractional_part(1/x). Call this F. We'll apply it only to numbers between 0 and 1. If you start with any x, compute fractional_part(x) and then repeatedly apply F, the c.f. coefficients are exactly the bits you throw away when computing the fractional parts.
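Here's a minimal float sketch of that process. (Floating point only gets the first dozen or so coefficients right, since the map amplifies rounding error at each step.)

```python
import math

def cf_via_F(x, n):
    """Read off n continued-fraction coefficients of x by iterating
    F(x) = fractional_part(1/x); the integer parts thrown away at
    each step are exactly the coefficients."""
    x = x - math.floor(x)          # start from the fractional part
    coeffs = []
    for _ in range(n):
        y = 1 / x
        a = math.floor(y)          # the bit we throw away: a c.f. coefficient
        coeffs.append(a)
        x = y - a                  # one application of F
    return coeffs

# sqrt(2) = [1; 2, 2, 2, ...], so after dropping the integer part every
# coefficient is 2:
# cf_via_F(math.sqrt(2), 10) → [2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
```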

Now suppose you pick a random x and compute F(x), F(F(x)), etc. These points will be scattered over the interval (0,1) somehow. In this sort of setting it often turns out that for almost all x, the density with which these points turn up is independent of x. Let me be more specific.

ERGODIC THEORY

Suppose you've got a probability distribution on (0,1) with the following magical property: picking a number according to this probability distribution and applying F to it gives you a number with the same probability distribution. For some F you can do this, and for others you can't. If you happen to have done anything with Markov chains and run across the notion of an "invariant measure" or "stationary distribution", this is the same idea but in a continuous rather than a discrete setting. "Measure", "distribution", and "probability distribution" all mean more or less the same thing here.

Well, anyway, if you have your function F and an invariant measure, and if another technical condition called "ergodicity" applies (it says that F mixes things up enough that you can't find nontrivial subsets that F maps to themselves), then something wonderful follows: for almost all x, the sequence x, F(x), F(F(x)), ... "samples the space fairly" according to your probability distribution. There are several different "ergodic theorems" that make this rigorous with different meanings of "sample fairly". Here's one: let f be any well-behaved function; then for almost all x, the averages [f(x)+f(F(x))+...+f(F^(n-1)(x))]/n tend to the same value, which is the average value of f(x) when you pick x from that magic probability distribution.
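You can watch this theorem at work numerically for our F. A hedged illustration, not a proof: take f(x) = x, whose average under the density (1/log 2)/(1+x) (which turns out to be the invariant one for this F, see the next section) is (1/log 2) * integral_0^1 x/(1+x) dx = 1/log(2) - 1 ≈ 0.4427. A float orbit isn't the true orbit, but empirically it samples the same statistics.

```python
import math

def birkhoff_average(f, x, n):
    """Average f over the orbit x, F(x), F(F(x)), ... where
    F(x) = fractional_part(1/x)."""
    total, steps = 0.0, 0
    while steps < n and x != 0.0:  # x == 0 would mean a rational orbit ended
        total += f(x)
        steps += 1
        x = 1.0 / x
        x -= math.floor(x)         # apply F
    return total / steps

# Starting from an arbitrary point, the orbit average of f(x) = x should
# come out close to 1/log(2) - 1 ≈ 0.4427
est = birkhoff_average(lambda t: t, math.pi % 1, 10**6)
```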

THE INVARIANT MEASURE

If you're still awake after all that, it will not surprise you that the particular F we're talking about -- mapping x to frac(1/x) -- does indeed have an invariant measure. Its probability density function is nice and simple: (1/log2) 1/(1+x). Proving that this is indeed invariant under F is pretty straightforward.
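If you'd rather see it than prove it, here is a hedged numerical check. The preimage F^{-1}([0,t]) is the disjoint union over k >= 1 of the intervals [1/(k+t), 1/k], so invariance says the total measure of those intervals equals the measure of [0,t], where mu([a,b]) = (log(1+b) - log(1+a)) / log 2 for our density.

```python
import math

def mu(a, b):
    """Measure of [a, b] under the density (1/log 2) * 1/(1+x)."""
    return (math.log1p(b) - math.log1p(a)) / math.log(2)

def preimage_measure(t, kmax=10**5):
    """Measure of F^{-1}([0, t]) = union over k of [1/(k+t), 1/k],
    truncated at kmax (the tail is of order t/kmax)."""
    return sum(mu(1 / (k + t), 1 / k) for k in range(1, kmax + 1))

t = 0.37                           # arbitrary test point in (0, 1)
# preimage_measure(t) should agree with mu(0, t) up to the truncated tail
```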

So all we need to do is to set things up so that the thing we're interested in is of the form [f(x)+f(F(x))+...+f(F^(n-1)(x))]/n, and then compute the average of f(x) with respect to our invariant measure. How do we do that?

CONTINUED FRACTIONS AGAIN

(There's a bit of magic here, but it's fairly easy magic.) Suppose the k'th convergent to x is pk(x)/qk(x). Then it's not hard to show that pk(x) = q{k-1}(Fx). So the following identity is trivial because it's just multiplying by a lot of 1s:

1/qn(x) = 1/qn(x) pn(x)/q{n-1}(Fx) p{n-1}(Fx)/q{n-2}(FFx) ... p2(F^{n-2}x)/q1(F^{n-1}x).

Now "shift the numerators left by one space":

1/qn(x) = pn(x)/qn(x) p{n-1}(Fx)/q{n-1}(Fx) ... p1(F^{n-1}x)/q1(F^{n-1}x).

Now remember that pk(y)/qk(y) is a pretty good approximation to y (at this point in an actual proof there'd need to be some computation of what the errors are, but I'll spare you), so

1/qn(x) ~= x F(x) F(F(x)) ... F^{n-1}(x)

which is beginning to look a lot like what we need. Take n'th roots and logs:

log qn(x)^(1/n) ~= -[log x + log F(x) + ... + log F^{n-1}(x)]/n
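A hedged numerical check of this last step, done in exact rational arithmetic (as before, a random 600-bit rational stands in for a typical real; the two sides should agree up to an O(1/n) error from the convergent approximations):

```python
import math
import random
from fractions import Fraction

random.seed(1)
x = Fraction(random.getrandbits(600) | 1, 2**600)

n = 200
q_prev, q = 0, 1                       # q_{-1}, q_0
log_sum = 0.0
for _ in range(n):
    log_sum += math.log(x)             # log of F^k(x) at this step
    a = math.floor(1 / x)              # next c.f. coefficient
    x = 1 / x - a                      # apply F
    q_prev, q = q, a * q + q_prev      # q_{k} = a_k*q_{k-1} + q_{k-2}

lhs = math.log(q) / n                  # log qn(x)^(1/n)
rhs = -log_sum / n
# lhs and rhs should agree up to an O(1/n) error
```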

NOW IT'S JUST INTEGRATION

Aha. So now (thanks to the ergodic theorem) we just need to know the average of log x when we pick x according to our probability distribution. That's

integral {0..1} of (1/log2) . log x . 1/(1+x)

which, given the form of the answer we're looking for, is crying out to be done by expanding log x as a series. More specifically, first an integration by parts shows that -(that integral) equals

integral {0..1} of log(1+x)/x

which equals

integral {0..1} of (x-x^2/2+x^3/3-x^4/4+...)/x

which, integrating term by term, equals 1 - 1/2^2 + 1/3^2 - 1/4^2 + ..., and the fact that this equals pi^2/12 is basically the same as the slightly more famous fact that 1 + 1/2^2 + 1/3^2 + ... equals pi^2/6. And now we're done.
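For the numerically inclined, the last two steps check out on a computer:

```python
import math

# The alternating series 1 - 1/2^2 + 1/3^2 - ... sums to pi^2/12, and
# exponentiating pi^2/(12 log 2) gives the Khinchin-Levy constant itself.
series = sum((-1) ** (k + 1) / k**2 for k in range(1, 200001))
levy = math.exp(math.pi**2 / (12 * math.log(2)))
# series ≈ 0.8224670 ≈ pi^2/12,  levy ≈ 3.2758...
```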



The fact that there are algebraic numbers that can't be expressed in radicals still bugs me somewhat, but I'm not sure there's a suitably elementary explanation.


I'd love to see an exposition of your third example, 'the rationals can be listed in full exactly once each'. The standard proof of their countability demonstrates that Z^2 (the set of ordered pairs of integers) is countable, and then says the rationals are in effect a subset of Z^2, so must be countable, somehow, as well. I've always wondered about a function that enumerates the rationals without duplicates.


Consider the sequence S_0, S_1, S_2, ... = 1, 1, 2, 1, 3, 2, 3, 1, 4, ..., where S_i is the number of ways to write i as a sum of powers of 2, with each power appearing at most twice. Then you can enumerate the positive rationals without duplicates as Q_n = S_n / S_(n+1). So Q_0 = 1, Q_1 = 1/2, Q_2 = 2/1, Q_3 = 1/3, etc. You can get all the nonzero rationals by inserting each one's negation after it: 1, -1, 1/2, -1/2, 2/1, -2/1, etc. See http://en.wikipedia.org/wiki/Calkin%E2%80%93Wilf_tree
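There's also a known compact recurrence, due to Moshe Newman, that steps through the same Calkin-Wilf sequence S_n/S_(n+1) directly without computing the S_i at all. A hedged sketch:

```python
import math
from fractions import Fraction

def calkin_wilf(n):
    """First n terms of the Calkin-Wilf enumeration of the positive
    rationals, via Newman's recurrence next(q) = 1/(2*floor(q) + 1 - q).
    Each positive rational appears exactly once."""
    q, out = Fraction(1), []
    for _ in range(n):
        out.append(q)
        q = 1 / (2 * math.floor(q) + 1 - q)
    return out

# calkin_wilf(5) → [1, 1/2, 2, 1/3, 3/2], and no value ever repeats
```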



