Hacker News
Trapping a Transcendental: There are reals that are not roots of polynomials (penzba.co.uk)
50 points by ColinWright on Feb 16, 2013 | hide | past | favorite | 24 comments



This is a nice proof showcasing an important idea, but for this particular theorem I prefer the beautiful constructive proof found by Liouville. It can be sketched in a few paragraphs, so let me do that.

0. The basic idea is to show that an algebraic number -- i.e., one that's a root of a polynomial with integer coefficients -- can't be very well approximated by rational numbers ... and then to construct a number that can be very well approximated by rational numbers.

1. Suppose you've got a polynomial p, of degree d, with integer coefficients. Let x be one of its roots, and let a/b be any rational number. How close can it be to x? Well, let's ask a slightly different question: how close can p(a/b) be to 0? Answer: it's always (some integer)/b^d, and unless a/b is itself a root of p that integer isn't zero, so is >=1 or <=-1; so abs(poly(rational)) >= 1/denominator^degree.
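If you want to see step 1 in action, here's a quick sketch with exact rational arithmetic (the polynomial p(x) = x^2 - 2, with root sqrt(2), is just an example; the helper name is mine):

```python
from fractions import Fraction

def p(x):
    # Example integer polynomial of degree 2, with root sqrt(2)
    return x * x - 2

# For a rational a/b that isn't a root, p(a/b) = (nonzero integer)/b^2,
# so its absolute value is at least 1/b^2.
for a, b in [(1, 1), (3, 2), (7, 5), (17, 12), (577, 408)]:
    v = p(Fraction(a, b))
    assert abs(v) >= Fraction(1, b * b)
    print(f"p({a}/{b}) = {v},  and 1/b^2 = 1/{b * b}")
```

The better the approximation a/b (those are convergents to sqrt(2)), the closer p(a/b) hugs the 1/b^2 floor.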

2. OK, so how close can a/b be to x? Well, roughly speaking the ratio between p(a/b)-0 (which we know isn't too small) and a/b-x (which is what we care about) is close to p'(x), provided a/b is close to x. So (after filling in the details) we find that: if x is a root of an integer polynomial of degree d then there's a constant C such that every rational number a/b has abs(a/b - x) >= C / b^d.

3. Now let's construct a number that is better-approximated by rationals than that allows. Specifically, take a decimal that looks like 0.110001000... where the spacing between the 1s grows very fast. The traditional thing is to take 10^-1! + 10^-2! + 10^-3! + ... so let's do that. Then truncating after 10^-k! leaves an error that's approximately 10^-(k+1)!; in other words we have a rational number with denominator N=10^k! and an error on the order of N^-(k+1), so for any integer polynomial of degree d, once k gets somewhat larger than d this approximation is "too good", so this number can't be a root of that polynomial. So it can't be the root of any integer polynomial; in other words, it's transcendental.
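Step 3 can be replayed with exact rationals; a sketch (a long partial sum stands in for the full number, and the helper name is mine):

```python
from fractions import Fraction
from math import factorial

def liouville_partial(n):
    """Partial sum 10^-1! + ... + 10^-n!, an exact rational
    whose denominator is 10^(n!)."""
    return sum(Fraction(1, 10 ** factorial(k)) for k in range(1, n + 1))

L = liouville_partial(6)  # stands in for the full sum; later terms are negligible here
for k in range(1, 5):
    N = 10 ** factorial(k)          # denominator of the k-th truncation
    err = L - liouville_partial(k)  # roughly 10^-(k+1)!
    # The truncation is "too good": the error even beats 1/N^k, hence beats
    # C/N^d for any fixed degree d once k is large enough.
    assert err < Fraction(1, N ** k)
    print(f"k={k}: denominator N = 10^{factorial(k)}, error < 1/N^{k}")
```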


That is beautiful and elegant. It's also powerful.

The article submitted here easily shows that in any interval on the real line there is at least one transcendental. Your proof here shows that you can create a transcendental "close" to any given rational. Both have a power beyond just "transcendentals exist".

The reason I like the one submitted is that it's more visual and less algebraic, hence easier to explain to those less comfortable with polynomials and the like, but certainly both have their merits.

Thanks for showing us this one.


What is the difference between "any interval" and "close to a given rational"? Rationals are dense, after all.


The difference is in the way it's being described, and how that description feeds into the subsequent definition and proof.

The submitted item says: take an interval, here's how to construct a transcendental inside your chosen interval.

The proof that gjm11 gives says: Here's a proof that if something is unreasonably close to a rational then it's not algebraic, and here's how to construct something like that.

The concept of "unreasonably close" then has to be expressed in terms of sizes of denominators, and hence is a relative term. That forms a key part of the proof.


This is a really nice proof, and I love this kind of "you could have discovered X if you'd just had this core idea!" kind of exposition.

Perhaps a hard ask, but do you know of any similarly explicable proofs for transcendental-ness of "well known" numbers?


As Colin says, transcendence proofs tend to be hard except for numbers carefully constructed to make transcendence proofs easy :-).

His mention of the (beautiful but difficult) Gelfond-Schneider theorem reminds me of the following much easier, much less deep but rather pretty observation.

Question: Is it possible to take an irrational number, raise it to an irrational power, and get a rational number?

Answer: Yes. Write s=sqrt(2) -- which is of course irrational -- and consider s^s. If it's rational then we're done. Otherwise, (s^s)^s is an irrational number to an irrational power; but it equals s^(s^2) = s^2 = 2, which is rational!
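A quick floating-point sanity check of the arithmetic (not a proof, of course; it just confirms that (s^s)^s comes out to 2):

```python
import math

s = math.sqrt(2)    # irrational
t = s ** s          # rational or not -- the argument never needs to know
u = t ** s          # equals s^(s*s) = s^2 = 2
print(u)            # ~2.0, up to floating-point error
```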

Thanks to Gelfond and Schneider, we know that actually s^s is not only irrational but transcendental.

(In 1919, the great mathematician David Hilbert mentioned three big unsolved problems in number theory: the Riemann Hypothesis, Fermat's "last theorem", and the transcendence of 2^sqrt(2). He said RH would probably be proved within a few years, FLT maybe in his lifetime, and the transcendence of 2^sqrt(2) probably not in the lifetime of anyone in his audience. In fact he got the order exactly wrong: the transcendence of 2^sqrt(2) was proved in 1930, FLT in 1995, and RH still isn't done despite its great importance.)


Another answer I've seen to the irrational^irrational question is sqrt(10)^log(4)=2, where log is the common logarithm. This avoids upsetting the constructivists.
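That identity is easy to check numerically too: sqrt(10)^log10(4) = 10^(log10(4)/2) = 10^(log10(2)) = 2.

```python
import math

# sqrt(10)^log10(4) = 10^(log10(4)/2) = 10^log10(2) = 2
val = math.sqrt(10) ** math.log10(4)
print(val)   # ~2.0, up to floating-point error
```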


But upsetting the constructivists is the whole point! (That is: what's so cute about that proof is that it tells you that there's a solution to irrational^irrational=rational, and that it's one of two things, but it doesn't tell you which. Also, I think it's a little shorter and more elegant than using sqrt(10)^log10(4). Don't you?)


Transcendence results for specific well-known numbers such as e and pi are hard to come by.

http://en.wikipedia.org/wiki/Transcendental_number#Numbers_p...

The main tools are the Gelfond-Schneider theorem and the Lindemann-Weierstrass theorem:

http://en.wikipedia.org/wiki/Gelfond%E2%80%93Schneider_theor...

http://en.wikipedia.org/wiki/Lindemann%E2%80%93Weierstrass_t...

gjm11 will know much more about these than I, and is better placed to offer insight into them.


e can be done reasonably with material from a good first year college calculus course. Such a proof is given around page 437 of the 3rd edition of Spivak's "Calculus".


Felix Klein's lectures made Lindemann's transcendence proof for pi and Hermite's for e accessible and interesting to German `HS' students, and were my own first persuasion. It was a Dover soft cover: Felix Klein, Famous Problems of Elementary Geometry, Dover, 1956.


A quick note for those upvoting this ...

Thank you. I'm casting about for more things to write about like this. There are lots of things hackers will have read, but not seen the details for, and sometimes the details are very accessible.

If you liked this one, what else would you like? For example:

* Between every two rationals there's an irrational

* Between every two rationals there's a rational

* The rationals can be listed, in full, exactly once each

That kind of stuff. What assertions or claims have you seen made that you'd like explained in more detail?

Feel free to reply here, or email - address in my profile.

Edited for layout.


I don't know how accessible it can be made, but the Khinchin-Lévy constant is astonishing, both in its existence and its value [1].

[1] http://mathworld.wolfram.com/Khinchin-LevyConstant.html


I can give a handwavy explanation for this. I'm afraid it's a bit heavy going in places, but there's nothing too difficult and the more technical bits can largely be skipped.

First, a reminder of what we're going to kinda-sorta-prove: Take any number and compute its continued fraction, giving approximations p1/q1, p2/q2, etc. Then for "almost all" numbers it happens that qn^(1/n) tends to the magical number exp(pi^2 / (12 log 2)).

And a brief summary of the idea: the process of computing continued fractions is basically one of applying a certain function again and again, and the number qn^(1/n) can be related to a certain average involving the numbers you get; a very general and powerful thing called the "ergodic theorem" tells us that when you do this, the values you get look a lot like random numbers chosen with a particular distribution; the corresponding average (over numbers chosen with that probability distribution) is a fairly simple integral, equal to an infinite series whose value is the thing we're looking for.

CONTINUED FRACTIONS

OK. So, how do you find the continued fraction for a number x? You repeatedly subtract off the integer part and then take the reciprocal; the integer parts you subtract off are the c.f. coefficients.

Forget about the integer parts for a moment and just consider the following function: x -> fractional_part(1/x). Call this F. We'll apply it only to numbers between 0 and 1. If you start with any x, compute fractional_part(x) and then repeatedly apply F, the c.f. coefficients are exactly the bits you throw away when computing the fractional parts.
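That process is only a few lines of code; here's a sketch (floating point, so only the first several coefficients are trustworthy, and the helper name is mine):

```python
import math

def cf_coefficients(x, n):
    """a0 plus the first n continued-fraction coefficients of x, found by
    repeatedly taking the reciprocal and stripping off the integer part
    (i.e. iterating the map F)."""
    coeffs = [math.floor(x)]        # a0: the integer part of x itself
    x -= coeffs[0]
    for _ in range(n):
        if x == 0:
            break
        x = 1 / x
        a = math.floor(x)
        coeffs.append(a)
        x -= a                      # x is now F(previous x)
    return coeffs

# The golden ratio is [1; 1, 1, 1, ...] and sqrt(2) is [1; 2, 2, 2, ...]
print(cf_coefficients((1 + math.sqrt(5)) / 2, 8))
print(cf_coefficients(math.sqrt(2), 8))
```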

Now suppose you pick a random x and compute F(x), F(F(x)), etc. These points will be scattered over the interval (0,1) somehow. In this sort of setting it often turns out that for almost all x, the density with which these points turn up is independent of x. Let me be more specific.

ERGODIC THEORY

Suppose you've got a probability distribution on (0,1) with the following magical property: picking a number according to this probability distribution and applying F to it gives you a number with the same probability distribution. For some F you can do this, and for others you can't. If you happen to have done anything with Markov chains and run across the notion of an "invariant measure" or "stationary distribution", this is the same idea but in a continuous rather than a discrete setting. "Measure", "distribution", and "probability distribution" all mean more or less the same thing here.

Well, anyway, if you have your function F and an invariant measure, and if another technical condition called "ergodicity" applies (it says that F mixes things up enough that you can't find nontrivial subsets that F maps to themselves), then something wonderful follows: for almost all x, the sequence x, F(x), F(F(x)), ... "samples the space fairly" according to your probability distribution. There are several different "ergodic theorems" that make this rigorous with different meanings of "sample fairly". Here's one: let f be any well-behaved function; then for almost all x, the averages [f(x)+f(F(x))+...+f(F^(n-1)(x))]/n tend to the same value, which is the average value of f(x) when you pick x from that magic probability distribution.

THE INVARIANT MEASURE

If you're still awake after all that, it will not surprise you that the particular F we're talking about -- mapping x to frac(1/x) -- does indeed have an invariant measure. Its probability density function is nice and simple: (1/log2) 1/(1+x). Proving that this is indeed invariant under F is pretty straightforward.
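You can also check the invariance numerically: the preimages of y under F are the points 1/(n+y) for n = 1, 2, ..., and invariance says the density at y equals the sum of the densities at the preimages, each weighted by 1/|F'| = x^2. A sketch (helper names are mine):

```python
import math

def gauss_density(x):
    # The invariant density for F(x) = frac(1/x): (1/log 2) * 1/(1+x)
    return 1 / (math.log(2) * (1 + x))

def pushforward(y, terms=200000):
    """Density of F(X) at y when X is drawn from the Gauss density:
    sum over the preimages x = 1/(n+y), weighted by |1/F'(x)| = x^2."""
    return sum(gauss_density(1 / (n + y)) / (n + y) ** 2
               for n in range(1, terms + 1))

for y in (0.1, 0.5, 0.9):
    print(y, gauss_density(y), pushforward(y))   # the two columns agree
```

(In fact the sum telescopes and the agreement is exact; the truncation at a finite number of terms is the only source of error.)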

So all we need to do is to set things up so that the thing we're interested in is of the form [f(x)+f(F(x))+...+f(F^(n-1)(x))]/n, and then compute the average of f(x) with respect to our invariant measure. How do we do that?

CONTINUED FRACTIONS AGAIN

(There's a bit of magic here, but it's fairly easy magic.) Suppose the k'th convergent to x is pk(x)/qk(x). Then it's not hard to show that pk(x) = q{k-1}(Fx). So the following identity is trivial because it's just multiplying by a lot of 1s:

1/qn(x) = 1/qn(x) × pn(x)/q{n-1}(Fx) × p{n-1}(Fx)/q{n-2}(FFx) × ... × p2(F^{n-2}x)/q1(F^{n-1}x).

Now "shift the numerators left by one space":

1/qn(x) = pn(x)/qn(x) × p{n-1}(Fx)/q{n-1}(Fx) × ... × p1(F^{n-1}x)/q1(F^{n-1}x).

Now remember that pk(y)/qk(y) is a pretty good approximation to y (at this point in an actual proof there'd need to be some computation of what the errors are, but I'll spare you), so

1/qn(x) ~= x × F(x) × F(F(x)) × ... × F^{n-1}(x)

which is beginning to look a lot like what we need. Take n'th roots and logs:

log qn(x)^(1/n) ~= -[log x + log F(x) + ... + log F^{n-1}(x)]/n
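If you'd like to sanity-check the approximate identity, here's a sketch in exact rational arithmetic (the starting value is just a rational stand-in for sqrt(2)-1, and the helper name is mine). The ratio of the two sides in fact always lies between 1/2 and 1, which is more than enough for the n'th roots to agree in the limit:

```python
from fractions import Fraction

def product_vs_qn(x, n):
    """Compare x * F(x) * ... * F^(n-1)(x) with 1/q_n(x), where
    F(y) = frac(1/y), using exact rational arithmetic throughout."""
    y, prod = x, Fraction(1)
    q_prev, q_cur = 0, 1                      # q_{-1} and q_0
    for _ in range(n):
        prod *= y
        inv = 1 / y
        a = inv.numerator // inv.denominator  # next cf coefficient
        y = inv - a                           # y -> F(y)
        q_prev, q_cur = q_cur, a * q_cur + q_prev
    return prod, Fraction(1, q_cur)

x = Fraction(414213562373095, 10 ** 15)       # a rational close to sqrt(2) - 1
prod, inv_qn = product_vs_qn(x, 12)
# The ratio of the two sides stays in (1/2, 1), so their n'th roots converge
# to the same limit.
print(float(prod), float(inv_qn), float(prod / inv_qn))
```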

NOW IT'S JUST INTEGRATION

Aha. So now (thanks to the ergodic theorem) we just need to know the average of log x when we pick x according to our probability distribution. That's

integral {0..1} of (1/log2) . log x . 1/(1+x)

which, given the form of the answer we're looking for, is crying out to be done by expanding log x as a series. More specifically (dropping the 1/log2 factor for now), an integration by parts shows that -(that integral) equals

integral {0..1} of log(1+x)/x

which equals

integral {0..1} of (x-x^2/2+x^3/3-x^4/4+...)/x

which, integrating term by term, equals 1 - 1/2^2 + 1/3^2 - 1/4^2 + ..., and the fact that this equals pi^2/12 is basically the same as the slightly more famous fact that 1 + 1/2^2 + 1/3^2 + ... equals pi^2/6. And now we're done.
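And the constant really does emerge empirically. Here's a sketch: random rationals with huge denominators stand in for "almost all" reals, Euclid's algorithm delivers the cf coefficients in exact integer arithmetic, and the standard recurrence builds q_n (helper names are mine):

```python
import math
import random

LEVY = math.exp(math.pi ** 2 / (12 * math.log(2)))   # ~3.27582

def qn(num, den, n):
    """q_n for the continued fraction of num/den in (0,1): Euclid's algorithm
    yields the coefficients, and q_k = a_k*q_{k-1} + q_{k-2} builds q_n."""
    q_prev, q_cur = 0, 1
    p, q = den, num
    for _ in range(n):
        if q == 0:
            break    # rational expansion ended early (very unlikely here)
        a, p, q = p // q, q, p % q
        q_prev, q_cur = q_cur, a * q_cur + q_prev
    return q_cur

random.seed(1)
n, den = 60, 10 ** 60
samples = [qn(random.randrange(1, den), den, n) ** (1 / n) for _ in range(100)]
print(LEVY, sum(samples) / len(samples))   # the average should be close to LEVY
```

Individual samples fluctuate, but the average over a hundred of them lands within a few percent of exp(pi^2 / (12 log 2)).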



The fact that there are algebraic numbers inexpressible in radicals still bugs me somewhat, but I'm not sure there's a suitably elementary explanation.


I'd love to see an exposition of your third example, 'the rationals can be listed in full exactly once each'. The standard proof of their countability demonstrates that Z^2 (the set of ordered pairs of integers) is countable, and then says the rationals correspond to a subset of Z^2 so must be countable, somehow, as well. I've always wondered about a function to enumerate the rationals without duplicates.


Consider the sequence 1,1,2,1,3,2,3,1,4... where the ith element (starting from 0) represents the number of ways to write i as a sum of powers of 2, with each power appearing at most twice. Then you can enumerate the positive rationals without duplicates as Q_n = S_n / S_(n+1). So Q_0 = 1, Q_1 = 1/2, Q_2 = 2/1, Q_3 = 1/3, etc. You can get all the nonzero rationals by inserting each one's negation after it: 1, -1, 1/2, -1/2, 2, -2, etc. See http://en.wikipedia.org/wiki/Calkin%E2%80%93Wilf_tree
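Here's a sketch of that enumeration (the recurrence is the one for Stern's diatomic sequence, which this is, shifted by one; helper name is mine):

```python
from fractions import Fraction

def stern_shifted(limit):
    """S_0, S_1, ... where S_i counts the ways to write i as a sum of powers
    of 2 with each power used at most twice (Stern's sequence, shifted)."""
    s = [1, 1]
    for i in range(2, limit):
        if i % 2 == 0:
            s.append(s[i // 2] + s[i // 2 - 1])   # S_{2k} = S_{k-1} + S_k
        else:
            s.append(s[i // 2])                   # S_{2k+1} = S_k
    return s

s = stern_shifted(1000)
rationals = [Fraction(s[n], s[n + 1]) for n in range(999)]
print(rationals[:8])   # 1, 1/2, 2, 1/3, 3/2, 2/3, 3, 1/4
assert len(set(rationals)) == len(rationals)   # no duplicates so far
```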


Stuff like this is amazing. I am so happy I switched out of comp-sci freshman year of college, and took a 10 year break from programming to study math and physics. I think if I had stuck with comp-sci there is a big chance I would hate computers, and always wish I had gone into science. Now that I am back to programming all the time it feels like I am a kid again, and I don't have a regret in the world. My career might be slightly fucked, but the joy of knowing this kind of stuff, and still loving computers makes it worth it.


Thank you. Got it: the ratio of algebraics to transcendentals is nothing. This has been bugging me lately... can you tell me why the transcendental e signs its name in the first significant digits past the decimal point...

...where s3 is sqrt(3), and gg is the Euler-Mascheroni constant, 0.5772...:

e^(s3 times gg)/(e times s3 times gg) = 1.00000002718?

Flirting? Fear of patent trolls, proprietary pride, just dumb luck, what?

Note: HN edit coughed when I first used '*' for 'times'


Just dumb luck, I think. In general, anything that depends essentially on working in base 10 tends to be just coincidence :-).

If you compute some more places, you get 1.0000000271816967980437100574683393 so the digits of e don't continue for very long.

The thing that's more surprising on the face of it is that the result is so close to 1. Let's rearrange it a little:

exp(s3 gg) ~= exp(1) s3 gg

which I prefer to write as

exp(s3 gg - 1) ~= 1 + (s3 gg - 1)

because the series for exp looks like exp(x) = 1 + x + ... -- so what gives you the value close to 1 is that s3 gg - 1 is rather close to 0, so that the first couple of terms of the series are close to the actual value of exp(); in other words, that gg is close to 1/s3. So far as I know, this too is just a coincidence. (There are lots of simple-ish constants between, say, 0.1 and 10; it's not really that surprising to find a pair that are approximately reciprocals.)

The e-like digits that follow amount (after a little simple algebra) to the fact that (s3 gg - 1)^2/2 ~= e/10^8 or, if you prefer, 1 - s3 gg ~= sqrt(2e)/10^4. Which smells very, very coincidental to me. It's a basically random smallish number; why shouldn't it be close to sqrt(2e) / a power of 10?
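For anyone who wants to replay that little bit of algebra numerically (the value of the Euler-Mascheroni constant is hard-coded, since the math module doesn't provide it):

```python
import math

GG = 0.5772156649015329          # Euler-Mascheroni constant (hard-coded)
s3 = math.sqrt(3)

val = math.exp(s3 * GG) / (math.e * s3 * GG)
print(val)                        # ~1.0000000272

x = s3 * GG - 1                  # ~ -2.33e-4: gg happens to be close to 1/sqrt(3)
print(x * x / 2, math.e / 1e8)   # the "e-like digits" are just x^2/2 ~= e/10^8
print(1 - s3 * GG, math.sqrt(2 * math.e) / 1e4)
```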


Somewhat more generally, the space of interesting numbers you can write compactly with traditional notation ("4", "sqrt(2)", "log(7)"), combined with the ways you can compactly write a relation between two numbers (+, /, Ackermann's function), combined with allowing two or three or maybe four numbers (even before they start showing up multiple times as you show there), combined with the number of ways in which a result may be "interesting", results in an inevitable population of combinations of numbers and operations that have an "interesting" result that aren't "really" interesting, or perhaps rather, are only interesting because of the selection function being used ("looks interesting to a human").

You may enjoy http://en.wikipedia.org/wiki/Mathematical_coincidence .


"...that have an "interesting" result that aren't "really" interesting, or perhaps rather, are only interesting because of the selection function being used ("looks interesting to a human")."

No sir, you are mistaken. My linux netbook's perl regexes determined this is interesting. I myself, a human, merely scan a move-to-front list that suits this week's curious fancy.


I'll look at that and come back to it tomorrow - I don't have time now to look into it. But I will reply with what I know (which may be nothing).

Also, I think if you put spaces around the * sign you're OK.

(Added in edit: Feels like a coincidence, but strange things do happen, as evidenced in my calculations for the birthday problem, where sqrt(ln(4)) surfaced.)



