
How many zeros are there in 2^n?  - wglb
http://www.blog.republicofmath.com/archives/4750
======
qntm
Consider the final _k_ digits of 2^n. (If 2^n has fewer than _k_ digits, pad
with zeroes.) Now repeatedly multiply by 2, and then truncate mod 10^k.

For k = 1, you get the repeating sequence "2", "4", "8", "6", "2", ....

For k = 2 you get the repeating sequence "04", "08", "16", "32", "64", "28",
"56", "12", "24", "48", "96", "92", "84", "68", "36", "72", "44", "88", "76",
"52", "04", ....

2^(n+4) ≡ 2^n (mod 10^1)

2^(n+20) ≡ 2^n (mod 10^2)

2^(n+100) ≡ 2^n (mod 10^3)

2^(n+500) ≡ 2^n (mod 10^4)

...

Because there are only 10^k possible k-digit strings, the final k digits of
successive powers of 2 must ultimately enter a loop.
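Those cycle lengths are easy to check empirically. A quick brute-force sketch (mine, in Python rather than anything from the thread) that finds the eventual period of 2^n mod 10^k:

```python
def cycle_length(k):
    """Eventual period of the sequence 2^n mod 10^k."""
    m = 10 ** k
    seen = {}           # value -> first index at which it appeared
    x, n = 1, 0
    while x not in seen:
        seen[x] = n
        x = x * 2 % m
        n += 1
    return n - seen[x]  # distance back to the first repeated value

print([cycle_length(k) for k in range(1, 5)])  # [4, 20, 100, 500]
```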

In practice, that loop seems to grow in size by a factor of 5 each time. (I
_assume_ there is a number theoretic reason for this, but I am not sure
because I am terrible at number theory.) So let's assume (possibly
fallaciously) that the loop contains 4 * 5^(k-1) distinct elements. (Notice how
this is substantially less than the 10^k possible elements.)

Assuming that the digits are distributed randomly - which, again, is possibly
fallacious - the probability of a random k-digit number containing no zeroes
is (9/10)^k.

Therefore, the probability of a k-digit number containing at least one zero is
1-(9/10)^k.

Therefore, the probability of all 4 * 5^(k-1) elements containing at least one
zero is [1-(9/10)^k]^[4 * 5^(k-1)].

Which rapidly tends to 0. Hoom. Well, I hoped to show it tended to 1, but
never mind.
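To see the collapse numerically, a sketch (mine) evaluating the log of that expression, since the raw probability underflows a double almost immediately; the increasingly negative values show it racing to 0:

```python
import math

def log_prob_all_contain_zero(k):
    # natural log of [1 - (9/10)^k]^(4 * 5^(k-1)),
    # under the (possibly fallacious) randomness assumption
    return 4 * 5 ** (k - 1) * math.log1p(-0.9 ** k)

for k in [2, 4, 6, 8]:
    print(k, log_prob_all_contain_zero(k))
```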

~~~
waterhouse
> In practice, that loop seems to grow in size by a factor of 5 each time. (I
> assume there is a number theoretic reason for this, but I am not sure
> because I am terrible at number theory.)

I can explain this reason. Consider 2^n mod 10^k. If n ≥ k, then we can
rewrite this as:

    
    
      2^k * (2^(n-k) mod 5^k)
    

And then the Euler-Fermat theorem states that, if a and m are coprime, then

    
    
      a^(φ(m)) ≡ 1 (mod m)
    

where φ(m) is Euler's totient function, the number of integers from 1 to m
that are coprime to m. And φ(p^k) = (p-1) * p^(k-1) for any prime p, so we get
in particular that φ(5^k) = 4 * 5^(k-1), which gives us the (maximum possible
value of the) cycle length.
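Both facts are cheap to verify numerically. A Python check (mine, not from the comment) of the CRT-style identity, plus a confirmation that the order of 2 mod 5^k is exactly 4 * 5^(k-1), i.e. that 2 is a primitive root modulo 5^k:

```python
# Verify 2^n mod 10^k == 2^k * (2^(n-k) mod 5^k) for n >= k,
# and that the multiplicative order of 2 mod 5^k is exactly phi(5^k).
for k in range(1, 5):
    for n in range(k, k + 40):
        assert pow(2, n, 10 ** k) == 2 ** k * pow(2, n - k, 5 ** k) % 10 ** k

    t = 1
    while pow(2, t, 5 ** k) != 1:   # smallest t with 2^t ≡ 1 (mod 5^k)
        t += 1
    assert t == 4 * 5 ** (k - 1)

print("ok")
```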

------
impendia
Nice data, good conjecture with solid probabilistic reasoning.

But beware: none of this is even the start of a proof.

------
tzs
Maybe some thinking about individual digit positions could shed some light.
Let's call the last digit d_0, the next to last d_1, and so on.

d_0 follows a simple pattern: 1, 2, 4, 8, 6, 2, ..., and so never has a zero.
It ends up in this pattern [2, 4, 8, 6], and throws two carries over to d_1.

d_1 goes 0, 0, 0, 0, 1, 3, 6, 2, 5, 1, 2, 4, 9, 8, 6, 2, 4. It ends up
repeating [8, 6, 2, 4], and throwing two carries to d_2.

In general, d_n starts off with a sequence of 0, until it gets the first carry
from d_{n-1}, then tries to fall into 1, 2, 4, 8, 6, 2, ..., but the carries
from the previous column perturb that. However, the carries come in on a
pattern, and so the result will be that d_n ends up in a cycle.

I haven't looked past the first couple of columns, but I'd not be surprised if
there is a predictable pattern in these cycles and their phases that would
allow someone to work out a considerable amount about the occurrences of any
specific digits in powers of two.

EDIT: I missed a carry when doing d_1. d_1 goes
0,0,0,0,1,3,6,2,5,1,2,4,9,9,8,6,3,7,4,8,7,5,0,0 and repeats
[1,3,6,2,5,1,2,4,9,9,8,6,3,7,4,8,7,5,0,0]. It throws 10 carries into d_2.
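The column-by-column process is easy to simulate. A Python sketch (mine) that doubles a digit list in place, carry by carry, and records d_1 reproduces the corrected sequence:

```python
def tens_digits(count):
    """d_1 (the tens digit) of 2^0, 2^1, ..., 2^(count-1), via explicit carries."""
    digits = [1]                      # digits of 2^0, least significant first
    out = []
    for _ in range(count):
        out.append(digits[1] if len(digits) > 1 else 0)
        carry = 0
        for i in range(len(digits)):  # double each column, absorbing the carry
            carry, digits[i] = divmod(2 * digits[i] + carry, 10)
        if carry:
            digits.append(carry)
    return out

print(tens_digits(24))
# [0, 0, 0, 0, 1, 3, 6, 2, 5, 1, 2, 4, 9, 9, 8, 6, 3, 7, 4, 8, 7, 5, 0, 0]
```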

------
Almaviva
It's interesting to me that he has an article a few places below this one on
"Contrived versus genuine mathematical problems". Anything that looks at
string properties of integers' base ten representations as important seems
fairly contrived. Project Euler suffers a lot from this type of problem as
well.

------
waterhouse
Beginning of an outline of a proof...

    
    
      arc> (dedup:mapn [mod-expt 2 _ 100] 2 100)
      (4 8 16 32 64 28 56 12 24 48 96 92 84 68 36 72 44 88 76 52)
    

That's the cycle of powers of 2 mod 100. (I removed the beginning 1 and 2,
which don't repeat.) The first two numbers will show up as 04 and 08 in our
number (except when the number _is_ 4 or 8). Thus, there's always a 0 when n
is 2 or 3 mod 20. (Note that there are 20 numbers up there.)

    
    
      arc> (no:prn:dedup:mapn [string:num->digs (mod-expt 2 _ 1000) 10 3] 3 1000)
      (008 016 032 064 128 256 512 024 048 096 192 384 768 536 072 144 288 576 152 304
      608 216 432 864 728 456 912 824 648 296 592 184 368 736 472 944 888 776 552 104
      208 416 832 664 328 656 312 624 248 496 992 984 968 936 872 744 488 976 952 904
      808 616 232 464 928 856 712 424 848 696 392 784 568 136 272 544 088 176 352 704
      408 816 632 264 528 056 112 224 448 896 792 584 168 336 672 344 688 376 752 504)
    

That's the cycle of powers of 2 mod 1000, starting with n=3. There are 100
numbers. (Incidentally, the Euler-Fermat theorem and some thinking can tell
you that the cycle of powers of 2 mod 10^k will have length that divides
phi(5^k), which is 4 * 5^(k-1), which is 4, 20, 100, 500, etc.) I'll make my
computer figure out which numbers contain 0.

    
    
      arc> (let u (dedup:mapn [string:num->digs (mod-expt 2 _ 1000) 10 3] 3 1000) (keep [pos #\0 u._] (range 0 dec:len.u)))
      (0 1 2 3 7 8 9 14 19 20 39 40 59 60 76 79 80 85 99)
    

Which is 19 out of the 100 different things mod 1000. Now let's write a
function to deal with 10^k.

    
    
      (def ass (k)
        (withs (pt (expt 10 k)
                u  (dedup:mapn [string:num->digs (mod-expt 2 _ pt) 10 k]
                               k (+ k (* 4 (expt 5 dec.k)))))
          (list (count [pos #\0 _] u)
                len.u)))
    

To explain the above code: I let "pt" = 10^k, then I take the numbers from k
to k + 4 * 5^(k-1), find 2^n (mod pt), write out its digits in base 10 (and
force there to be k digits, by prepending 0's if necessary), put
the digits together into a string, make a list of all these resulting strings,
remove duplicates just to be sure, and give the resulting list the name "u";
then I return a list of two numbers, the number of strings in the list that
contain 0, and the total number of strings in the list. From what we've seen
above, we'd expect (ass 2) to return (2 20), and (ass 3) to return (19 100).

    
    
      arc> ass.2
      (2 20)
      arc> ass.3
      (19 100)
    

That's a good sign. Now let's proceed to higher numbers. I'll make it print
out the resources it's using, too.

    
    
      arc> time:ass.4
      time: 30 gc: 0 mem: 2458488
      (136 500)
      arc> time:ass.5
      time: 338 gc: 169 mem: -21400960
      (862 2500)
      arc> time:ass.6
      time: 905 gc: 141 mem: -2376232
      (5129 12500)
      arc> time:ass.7
      time: 4333 gc: 227 mem: 43046240
      (29330 62500)
      arc> time:ass.8
      time: 28885 gc: 1797 mem: 74515640
      (163232 312500)
    

Well, we've passed the halfway mark, but my computer is feeling the pressure.
If you rewrote this in C or something, you might be able to go up a few
numbers higher, but probably not much. (Hmm, actually, I'm kind of repeating a
bunch of work: I already know that "2 and 3 mod 20" numbers always have a 0,
and similar for "3 4 5 6 10 ... mod 100 [add 3 to my third list of numbers]".
However, if I ignored those, that would only cut out half the work for k=8.
But maybe a lot more will get cut out in higher numbers, and the number of
things to deal with will become manageable?)
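For what it's worth, a port to Python (my sketch, leaning on the built-in three-argument pow for modular exponentiation) reproduces the same counts with little fuss:

```python
def count_zero_strings(k):
    """(number of the cycle's k-digit strings containing '0', cycle size)."""
    strings = {str(pow(2, n, 10 ** k)).zfill(k)          # pad to k digits
               for n in range(k, k + 4 * 5 ** (k - 1))}  # one full cycle
    return sum('0' in s for s in strings), len(strings)

for k in range(2, 7):
    print(k, count_zero_strings(k))
# matches the (2 20), (19 100), (136 500), ... figures above
```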

Well, in principle at least, this could be solved by computer.

Edit: Hmm, qntm seems to have said much of this already. Well, at least I have
done some grunt work to illustrate his arguments.

~~~
impendia
+1. At first I thought that none of these statements could even remotely be
proved, but I've changed my mind. I totally buy this.

In particular, it looks like this would prove "Every sufficiently large power
of 2 contains at least k zeroes", for any value of k. The computational time
would probably be prohibitive, but this should definitely work in principle.

As a working analytic number theorist, I suspect the asymptotic formula is out
of reach, for similar reasons to why you can't count twin primes. I'd be very
pleased if someone persuaded me I'm wrong.

But, please, none of this mumbo-jumbo about "higher order Fourier analysis".
Gowers et al. developed it to find progressions in arithmetic sequences, which
is not related to the present problem as far as I can see.

~~~
Someone
_it looks like this would prove "Every sufficiently large power of 2 contains
at least k zeroes", for any value of k. The computational time would probably
be prohibitive, but this should definitely work in principle._

Only if you accept constructive proofs that say nothing about their running
time and that may not even halt.

 _I suspect the asymptotic formula is out of reach, for similar reasons to why
you can't count twin primes._

I am not aware of any result on the _inherent_ difficulty of counting twin
primes. Are you hinting at one? If so, can you give a reference?

~~~
impendia
By "in principle" I am excluding any consideration of running time, or
computing resources. For example, if k = 10, perhaps you would need a computer
with 10^10000 bytes of RAM working for an equal number of years. It's a finite
computation. ;)

"may not even halt" corresponds to my "it looks like". Such algorithms not
working would be equivalent to me and the parent commenter being wrong --
which is not out of the question. Probabilistically, such an algorithm should
work (I think -- have not checked the details), so it ought to work unless
there is some unforeseen reason why it wouldn't. A "conspiracy" against this
happening, if you will.

As far as the difficulty in counting twin primes -- well, to give an easier
example, look at Cojocaru and Murty's book on sieve methods and read about the
sieve of Eratosthenes-Legendre. You know that 1/2 of numbers aren't divisible
by 2, 2/3 aren't divisible by 3, 4/5 aren't divisible by 5, etc., so the
proportion of numbers that are prime is equal to (1/2)(2/3)(4/5)(6/7)...,
which you can show with a little bit of effort is equal to 0.
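To make the "equal to 0" claim concrete: by Mertens' theorem the partial products over primes p ≤ N shrink like e^(-γ)/log N, i.e. to zero but very slowly. A small sieve-based sketch (mine):

```python
def prime_density_product(limit):
    """Product over primes p <= limit of (1 - 1/p): (1/2)(2/3)(4/5)(6/7)..."""
    sieve = [True] * (limit + 1)
    prod = 1.0
    for p in range(2, limit + 1):
        if sieve[p]:                  # p is prime
            prod *= 1 - 1 / p
            for q in range(p * p, limit + 1, p):
                sieve[q] = False      # cross out multiples of p
    return prod

for n in (10, 100, 10 ** 4, 10 ** 6):
    print(n, prime_density_product(n))
```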

With some work, you can turn this into a proof. _In theory_, you can estimate
the number of primes < X, as a function of X. This estimate would be roughly
along the lines of the original post. But as Cojocaru and Murty (among others)
point out, the error terms get bad quickly. This type of phenomenon is
extremely common in analytic number theory.

In the case of prime counting, you can get better results by other methods,
most prominently the Riemann zeta function. But even then you can only get the
best results if you prove the Riemann Hypothesis. Million dollar bounty out on
that one.

------
Sniffnoy
If we ask instead how many _nonzero_ digits there are in 2^n, there's actually
a known lower bound on this, due to Cam Stewart, though it is unfortunately
quite tiny - on the order of (log log n)/(log log log n), much smaller than
what empirically appears to be the case.

In fact, the statement is much more general: Say you have two bases A and B
(such that no power of A is a power of B, to prevent triviality), a base-A
digit a, and a base-B digit b. For any n let L_A,a(n) denote the number of
non-a digits in the base-A expansion of n, and L_B,b(n) similarly with B and
b. Then for any n, L_A,a(n)+L_B,b(n)>(log log n / (log log log n + C)) - 1,
where C is a constant depending only on A and B. So while it's certainly
impossible to say "large numbers have complicated base-A expansions", it is in
some sense possible to say that they can't _simultaneously_ have simple base-A
and base-B representations!

In particular you get a lower bound on the number of nonzero digits (or more
generally non-b digits) of powers of A written to base B by just taking n to
be powers of A and taking a=0.
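To get a feel for how small that guarantee is: for n = 2^m the bound gives only about (log log n)/(log log log n) nonzero digits, ignoring the constant C. A quick evaluation (mine):

```python
import math

def stewart_scale(m):
    """(log log n)/(log log log n) for n = 2^m, ignoring the constant C."""
    ll = math.log(m * math.log(2))    # log log (2^m)
    return ll / math.log(ll)

for m in (10 ** 3, 10 ** 6, 10 ** 9):
    print(m, round(stewart_scale(m), 2))
```

Even a billion-bit power of 2 is guaranteed only a single-digit number of nonzero digits by this bound.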

Paper:
[http://www.math.uwaterloo.ca/PM_Dept/Homepages/Stewart/Jour_...](http://www.math.uwaterloo.ca/PM_Dept/Homepages/Stewart/Jour_Books/J-reine-
ange-Math-1980.pdf)

Unfortunately, not only is the lower bound tiny even asymptotically, but the
constant involved is absolutely huge. (The paper doesn't explicitly state how
to compute this constant, but once I was trying to bound below the number of
nonzero digits in powers of 2 written to base 3, so I traced through all the
computations and references, and it's ridiculously large, much too large for
what I was trying to do... I forget just how large, except that it was the
sort of number most conveniently described as being "about 10^[something]"
rather than in words.) So again, not close to what appears to be true
empirically.

------
qntm
Second attempt.

2^n has approximately n log10 2 ≈ 0.301 n digits. Assuming that they are
distributed randomly, the probability that none of these digits is 0 is
(9/10)^(n log10 2).

For n = 50,000, this is already unbelievably tiny. For n = 50,001, it is even
tinier. And so on.

In fact, summing all of these tiny probabilities from n = 50,000 to n =
infinity is about the same as integrating (9/10)^(x log10 2) from x = 50,000
to x = infinity, which is, again, only slightly above zero.

So, statistically at least, there is an excellent chance that every power of 2
above 50,000 has at least one zero.
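The sum is a geometric series, so it can be evaluated exactly; working in log10 dodges underflow (the raw probabilities are far below the smallest representable double). A sketch (mine):

```python
import math

LOG10_2 = math.log10(2)
r = 0.9 ** LOG10_2                                 # ratio between consecutive terms
log10_first = 50000 * LOG10_2 * math.log10(0.9)    # log10 of the n = 50,000 term
log10_tail = log10_first - math.log10(1 - r)       # log10 of the full geometric sum

print(log10_first)   # about -689: the single-term probability
print(log10_tail)    # the whole infinite tail is barely bigger
```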

~~~
xyzzyz
_Assuming that they are distributed randomly_

Well, this is precisely what needs to be proven. You cannot just assume that
-- they're not distributed randomly in base 2, so why should they be in base
10? They most likely are, but if you assume it, you are assuming the
conclusion.

------
rounak
The approximate linearity is amazing!

~~~
JoachimSchipper
For a random number m, you'd expect about 1/10 of its digits to be zero. Why
is it surprising that 2^n would behave somewhat similarly?

~~~
qntm
The numbers 2^n aren't random. For example, the final digit is never 0. You
might expect other patterns to appear on larger scales.

------
leif
n

~~~
arethuza
That's exactly what I thought, until I read the article, which does state
that they are talking about base ten.

~~~
forinti
For base 2, you can use Pascal's triangle to find out how many numbers with x
zeroes (or ones) exist up to 2^n:
[http://alquerubim.blogspot.com/2010/10/combinacoes-de-
digito...](http://alquerubim.blogspot.com/2010/10/combinacoes-de-digitos.html)
(Portuguese only, I'm afraid).

From base 3 onwards, I would expect to see some "subparticles" inside each
number in the triangle. I would start from there.

------
Qz
Something about the font they use there makes 0 look like () on Firefox for
me.

------
marshray
My quick estimate, on average, for large numbers: n/(log2 10)/10, which is
about n/33.

He says it way mo better tho.

