A prime number whose binary representation looks like a giraffe (reddit.com)
264 points by svenfaw on Jan 20, 2018 | hide | past | favorite | 76 comments

The last line of the picture contains some 'noise':


So the idea is that you produce any 63x64 binary image, then find a number up to 2^64 that, added to the number represented by the image, gives a prime. It's not exactly amazing, but clever enough.
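A minimal sketch of that search in Python (not the author's actual code; this version clears the image's low bits and tries odd values there, which is an equivalent search, and uses a stdlib Miller-Rabin test in place of a bignum library like gmpy2):

```python
import random

def is_probable_prime(n: int, rounds: int = 20) -> bool:
    """Miller-Rabin: a composite slips through with probability < 4**-rounds."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for _ in range(rounds):
        x = pow(random.randrange(2, n - 1), d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def image_prime(image_bits: str, noise_bits: int = 64) -> int:
    """Clear the low `noise_bits` bits of the image number, then try
    odd values there until the whole thing is (probably) prime.
    `image_bits` is the 0/1 string of the picture, read row by row."""
    base = (int(image_bits, 2) >> noise_bits) << noise_bits
    for noise in range(1, 1 << noise_bits, 2):  # last bit must be 1
        if is_probable_prime(base | noise):
            return base | noise
    raise ValueError("no prime found in the noise range")
```

For a real 63x64 giraffe you would feed in the 4032-bit row-major string; on a toy 16-bit "image", `image_prime("1010110011010110", noise_bits=4)` perturbs only the last four pixels.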

> The last line of the picture contains some 'noise'

Note that if the picture is prime, then the last line must always contain noise... the last digit must be a 1!

I also feel like it's slightly cheating that the first line doesn't contain noise, or a leading 1 digit... the number doesn't naturally fit into 64x64 bits because of all the leading zeroes.

If you "draw" with zeros instead of ones, then you don't have the last-digit-in-foreground problem.

It also solves the leading-zeros problem. That's potentially a lot of extra computation to handle, though, since the resulting number in the giraffe case would be something like 2^500 times larger. I guess the question of whether that is worth it depends on your individual value for image-shaped primes.

Or you can move the image on canvases of different size until you find a position/size where it is prime.

Several other methods can work... Primes are not rare at all - average prime gaps are fairly small in that domain...

I don't see why that would work. If the last digit is 0 then the number is not prime. Even if you make the bottom right corner a 1, it is not clear to me that any image will give a prime for sufficiently large canvas. The reason the original method works is that the density of primes around n is roughly 1/log n, so there are lots of primes around. If your image has k pixels the density is approx 1/k, so you'd only have to change the last log k pixels on average.

Isn't the largest gap for any prime sqrt(n) ?

not even close. the average gap is closer to log(n).

The average gap is log(n), but the maximal gap is not well understood.

According to https://en.wikipedia.org/wiki/Cram%C3%A9r%27s_conjecture, current results are

Unconditional: n^0.525

Conditional on RH: log(n)*sqrt(n)

Conjecturally: log(n)^2

> It is still probably true that for every constant c > 2, there is a constant d > 0 such that there is a prime between x and x + d(log x)^c.

That's quite fascinating.

where log(n) is the natural logarithm of n


Any log base is just a multiple of the natural log. When you talk about asymptotic growth, those factors are irrelevant.

It wasn't clear to me that the problem as framed was about asymptotic growth. I was curious if/when the factors might become relevant, so I started sketching this out...

The 63x64 binary canvas is made of integers on the order of 2^4032. (Please forgive my approximation notation.)

    ln(2^4032) ~= 2795
    log_10(2^4032) ~= 1214
So, the difference between natural log (for this canvas size) and log base 10 is only one binary digit (min 11, max 12). The "noise" you need to introduce to the image will definitely keep to the bottom right corner.
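Those figures are easy to check in Python, using the identity log_b(2^k) = k * log_b(2):

```python
import math

bits = 63 * 64                    # the canvas holds 4032-bit integers
nat = bits * math.log(2)          # ln(2^4032)
dec = bits * math.log10(2)        # log10(2^4032)
print(round(nat), round(dec))     # 2795 1214

# Bits needed to search ~ln(n) candidates for the bottom-right noise:
noise_bits = math.log2(nat)       # falls between 11 and 12
print(round(noise_bits, 2))
```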

So, I'm curious how tall the phone-width image needs to be before the required noise exceeds 64 bits (aka a single row). And the answer is... Too tall to be calculated quickly with my brute force method of "try bigger numbers until you get too impatient to wait for your desktop to calculate the natural log".

I'm sure if I spent enough time drawing the parameters out, I'd eventually agree that there's no representation of this problem where those factors ever become relevant.


If the problem we were calculating involved log base 1.0001 instead of the natural log, those factors would become relevant at much smaller canvas sizes. I would love to hear an intuitive explanation of logarithms that would convince me to not bother making calculations like this in the future!

As the post you replied to said, it’s a constant factor. You’ll find:

    ln(2^n)     = n * ln(2)
    log_10(2^n) = n * log_10(2)
Which, substituting x for 2^n, gives

    log_10(x) = ln(x) * (log_10(2) / ln(2))
You can replace 2 by any positive number here. So, the value of

    log_10(x) / ln(x)
is independent of x. A bit of thinking will show it to be equal to log_10(e) = 1/ln(10) ≈ 0.434.

And indeed,

    1214 * ln(10) ~= 2795.338
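A quick numerical check of the change-of-base identity and the figure above:

```python
import math

# The ratio log10(x) / ln(x) is the constant log10(e), whatever x is:
assert math.isclose(math.log10(123.0) / math.log(123.0), math.log10(math.e))

# And the two canvas figures quoted earlier are related by that constant:
print(1214 * math.log(10))   # ~= 2795.338
```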

No, I understand the conversion factor is constant.

The subthread started when someone thought it was necessary to point out that the log everyone was talking about was the natural log.

Independent of what the actual problem is, say someone tells you that something is log(n). It is a safe assumption that they mean ln(n), but if you are speaking out loud, that is still pronounced "log n".

What I'm looking for is better intuitions on F/ln(n) versus F/log_x(n) for some completely arbitrary function F. The intuitions for x > e are very different than the intuitions for x < e. (This is just the nature of exponents. Maybe there is no further intuition for me to develop here, other than to just practice more.)

ln 2⁶⁴ⁿ > 2⁶⁴

64n × ln 2 > 2⁶⁴

n > 2⁵⁸ ∕ (ln 2) ≈ 4.2E+17
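The same threshold, checked numerically (n here is the hypothetical image height in 64-bit rows):

```python
import math

# Solving 64*n*ln(2) > 2^64 for n gives the height at which the
# expected noise search range, ~ln of the number, exceeds one full row:
n = 2**64 / (64 * math.log(2))   # = 2^58 / ln(2)
print(f"{n:.3e}")                # roughly 4.2e17 rows
```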

Such beautiful superscripts and ≈, but then you used * and / instead of × and ÷…

I like to also put plenty of THIN SPACE ( ) in to give it space to breathe, e.g. 2 ⁶⁴ ⁿ and 64 n instead of 2⁶⁴ⁿ and 64n. I make my thin spaces with Compose+Space+'.

THIN SPACE (U+2009) is also good to use to group digits, and is in fact the standard way to group digits in the SI system instead of "," or ".".

That looks nice and gets rid of the confusion over numbers like "1,234" and "1.234". In the US, the former is the integer 1234 and the latter is the rational number 1234/1000. In Germany or France, these would be the other way around. In SI, "1.234" and "1,234" would both be 1234/1000, and the integer 1234 would be "1 234" with the space there being a thin space.

Unfortunately, HN does not support THIN SPACE as far as I can tell.

Reddit is a little better. You can enter it in a comment and it will be correctly saved with the comment and it will display correctly in the browser. As long as you don't need to make any edits to your comment afterwards, it is fine.

If you edit the comment thin spaces are converted to regular spaces when it loads the editing widget, so unless you go through and put all the thin spaces back you will lose them when you save.

HN supports THIN SPACE just fine. I used it in my comment, and so did you.

I’m not at all fond of the thin space digit grouping practice; to my Australian eyes it tends to look fairly terrible in most places, like bad kerning.

Your comment also shows another catastrophic weakness of using a regular thin space for digit grouping without additional magic: you need a non-breaking thin space, but there isn’t one. On my comments page, your comment shows the 1 on one line, and the 234 on the next.

I love my YYYY-MM-DD dates (I use them everywhere the date format isn't prescribed, and thus get the occasional odd look on forms and such from people who aren't used to that form of date; so you can see I'm not afraid of deviating from local conventions to prefer superior, or sometimes merely international, ones), but I hate and consequently don't use thin space digit grouping in general.

OK, this is odd. It looks like something browser dependent is going on.

Your earlier comment shows up with thin spaces on Chrome for me, but not on Safari or Firefox.

I know Firefox handles thin spaces, because it handles this Reddit comment just fine: https://www.reddit.com/r/Physics/comments/67g28i/because_gra...

I'm going to paste the examples from there here and see what happens:

1​234​567 (U+200B, ZERO WIDTH SPACE)

1 234 567 (U+200A, HAIR SPACE)

1 234 567 (U+202F, NARROW NO-BREAK SPACE)

1 234 567 (U+2009, THIN SPACE)

1 234 567 (U+2006, SIX-PER-EM SPACE)

1 234 567 (U+2008, PUNCTUATION SPACE)

1 234 567 (U+2005, FOUR-PER-EM SPACE)

1 234 567 (U+2004, THREE-PER-EM SPACE)

1 234 567 (U+0020, SPACE)

1 234 567 (U+2000, EN QUAD)

1 234 567 (U+2002, EN SPACE)

1 234 567 (U+2007, FIGURE SPACE)

1 234 567 (U+2003, EM SPACE)

1 234 567 (U+2001, EM QUAD)

Edit: they displayed with the various different space sizes in Chrome, and with fixed space sizes in Firefox (1 regular space for all except 200B and 202F which showed up with no space).

Unlike Reddit, editing does not lose information.

Also displays right in Edge, and in Firefox on Windows. So it looks like it is just Firefox Mac and Safari that are not handling the various space sizes for me.

Looks good with Firefox on Linux too.

Given chrismorgan's comment about breaking, it seems that NARROW NO-BREAK SPACE would be a better choice for digit grouping, right?

Yeah, NARROW NO-BREAK SPACE is probably better.

As far as the display problem on Firefox and Safari on Mac goes, I've done some experimenting. It looks like it is a font thing. HN has "font-family:Verdana, Geneva, sans-serif;" as part of the style sheet.

I made a minimal test page with just the different space examples, and an internal style sheet that just set the font-family for the body, and it shows the problem. Changing it to "Arial, sans-serif" makes it work.

I've never looked seriously into fonts for browsers, so have no idea what is going on at this point. If there is a problem with the Mac versions of Verdana and/or Geneva, then why is it only affecting Firefox and Safari, and not Chrome, on my Mac?

If anyone else wants to try to figure this out, here is the test page I've been using:

  <!DOCTYPE html>
  <html>
    <head>
      <meta charset="utf-8"/>
      <title>space test</title>
      <style>
        body { font-family:Verdana, Geneva, sans-serif; }
        xbody { font-family:sans-serif;}
        xbody { font-family:Arial, sans-serif;}
      </style>
    </head>
    <body>
      <p>
  1&#x200B;234&#x200B;567 (U+200B, ZERO WIDTH SPACE)<br/>
  1&#x200A;234&#x200A;567 (U+200A, HAIR SPACE)<br/>
  1&#x202F;234&#x202F;567 (U+202F, NARROW NO-BREAK SPACE)<br/>
  1&#x2009;234&#x2009;567 (U+2009, THIN SPACE)<br/>
  1&#x2006;234&#x2006;567 (U+2006, SIX-PER-EM SPACE)<br/>
  1&#x2008;234&#x2008;567 (U+2008, PUNCTUATION SPACE)<br/>
  1&#x2005;234&#x2005;567 (U+2005, FOUR-PER-EM SPACE)<br/>
  1&#x2004;234&#x2004;567 (U+2004, THREE-PER-EM SPACE)<br/>
  1&#x0020;234&#x0020;567 (U+0020, SPACE)<br/>
  1&#x2000;234&#x2000;567 (U+2000, EN QUAD)<br/>
  1&#x2002;234&#x2002;567 (U+2002, EN SPACE)<br/>
  1&#x2007;234&#x2007;567 (U+2007, FIGURE SPACE)<br/>
  1&#x2003;234&#x2003;567 (U+2003, EM SPACE)<br/>
  1&#x2001;234&#x2001;567 (U+2001, EM QUAD)
      </p>
    </body>
  </html>

It’s common for fonts to not include glyphs for all the fancy spaces, which will leave it to a fallback mechanism where a different font is used.

It’s also common for web fonts especially to possess glyphs for some of the fancier spaces, but as duplicates of SPACE, i.e. the wrong size.

Face it: fonts (to a degree the technology, but mostly the actual fonts) are generally pretty bad.

Changed * to × (U+00D7 MULTIPLICATION SIGN) and / to ∕ (U+2215 DIVISION SLASH).

I don't like ÷, and I can't be bothered about whitespace. Sorry!

Thank you for writing out that extremely clear sequence of inequalities. I love a good Unicode style debate, but regardless you absolutely nailed it with conceptual clarity.

I always forget about DIVISION SLASH and that it’s a thing, distinct from FRACTION SLASH (which I use regularly). I wonder if I should add it to my compose key. Nah, I think ÷ and FRACTION SLASH are enough for me.

In computational complexity, one typically ignores multiplicative constants, but in number theory, one often doesn't.

The prime number theorem is of this stronger form, and it is in fact only the natural logarithm that makes the usual statement true. It is considerably more difficult to prove this asymptotic result (first proven 1896) than to get "within a multiplicative constant" as in big-O notation (first proven 1848--50).

I'm just disappointed that the noise isn't hidden in the pattern on the giraffe.

Then the background cannot be zeros.

Sorry, I mean the spots of the giraffe. You'd need a 'higher resolution prime' for that, but still.

That may have been how the author came up with a prime number, but any extremely large odd number is probably prime, since the primes get further and further apart as you go further along the number line. So it may have been a lucky guess.

This logic is backward. If you want empirical evidence you can ask WolframAlpha:

There are 25 primes in [1, 100]. There are 16 in [1001, 1100]. There are 6 in [1000001, 1000100].

I agree with your example, and it's what I said. But you're right, I'm wrong. By my own reasoning, any large odd number is likely to be composite. I must have been tired when I wrote my reply.

I think the more interesting way to do this would be to create a border that's roughly circular to frame the giraffe, and fill that border with noise. Then do the incrementing thing, and your 12 or so bits of noise at the end will blend in better, making it harder for the average person to reverse engineer.

I'm imaging something like this, but with a giraffe instead: https://6d4be195623157e28848-7697ece4918e0a73861de0eb37d0896...

(Sorry for unreadable CDN link)

I had the same sort of idea, and just implemented it in Python using gmpy2 for the primality check. Here's a prime smiley face:

As a number, it's

This took 487 random border attempts to find, which seems pretty small!

Note that efficient primality tests are probabilistic, so there's a small chance this isn't actually prime. I expect it's the same with the giraffe one in the link.

Edit: Here's the code (improved a bit over what made the picture above): https://gitlab.com/snippets/1694391

It is provably prime; I got Mathematica to generate a certificate for it in about five minutes using `PrimeQCertificate`.

Sometimes it's a good idea to use a faster probable-prime test when searching, and then use a slower test to verify the result, as has been done here.

Lenstra and Pomerance's variant of the AKS test is O(log(n)^6), and deterministic. So polynomial, and not a huge exponent. Still not as fast as the probabilistic algorithms, but not as slow as pre-AKS methods.

"not as slow as pre-AKS methods." only in terms of provable asymptotic bounds. It's much slower than APR-CL and ECPP, both of which existed before AKS, and are the methods used daily.

Pari/GP's APR-CL took 4 seconds to prove it. Perl's ntheory module took 2.6 seconds to prove with ECPP (including generating and verifying a certificate). That's over 1000x slower than BPSW but still not bad.

Playing with the wrap width (in this case 94), you'll see a small herd of giraffes instead. (This is the prime from the reddit article, rather than your smiley face. Easier to see white on black.)


Nice results at 192 (also three giraffes) and at 256 (four giraffes).
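A throwaway helper along these lines (hypothetical; `show` and its '#' rendering are my own naming, not the script actually used):

```python
def show(n: int, width: int) -> str:
    """Render n in binary, wrapped at `width` columns: '#' for 1 bits,
    spaces for 0 bits (white-on-black, give or take)."""
    bits = format(n, "b")
    rows = (bits[i:i + width] for i in range(0, len(bits), width))
    return "\n".join(r.replace("1", "#").replace("0", " ") for r in rows)

# show(p, 64) for the original layout; show(p, 94) or show(p, 192)
# for the multi-giraffe effects described above.
```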

Great, another demonstration of the fundamental principles of digital computing - representation and interpretation of data. It's been my experience teaching CS courses that many students, despite being taught bits and bytes and binary and ASCII and so forth, don't really "get it" until they see things like this.

This particular one only looks like a giraffe when it's interpreted as a 1bpp big-endian bitmap, but you could probably do the same with a colour (24bpp, in whatever order you wish) photo.

Here's another fun idea: write a short program that checks if a file is a prime number. Now you can get into the principles of CRCs, Reed-Solomon, and other error-correction codes...

"write a short program that checks if a file is a prime number."

What do you mean? I tried googling but failed.

Hint: Interpret the file’s bits as a single number

Thanks for the clarification.

Was that a downvote? For asking a question? Jeez.

Here's another challenge: find a number whose binary representation looks like Mickey mouse, and find the index where it occurs in the binary representation of pi. Bonus: that index should have a binary representation that looks like a copyright symbol.

We don't have the digits of pi for the second part. We only have 22,459,157,718,361 digits of pi, so any index into them fits in ~45 bits: not enough to encode the image with bits left over for moving it around in pi.

How is that only 45 bits?

Cause that's how the math works. log_2(22,459,157,718,361) ~= 44.35. Powers of two get big really really fast after all.


Shouldn't it be log_2(10^22,459,157,718,361)?

That answers the question of how many bits of pi we know. If you want to be able to take an arbitrary n-bit string and find its position in a set of n-bit strings, you need to have something like 2^n strings in your set.

So to find the number of bits you can support for an algorithm like this, taking the log of the number of digits of pi we know gives you a rough estimate.

The person I was responding to had a two-ish step problem: 1) find a number with a binary representation that looks like Mickey; 2) find an index for that number in pi that looks like the copyright symbol. Both are conceivably possible because pi is infinite and its decimal places presumably contain all numbers somewhere, but we only know so many actual digits. Since we're dealing with the index of the number from step 1, the number of bits we have to find the copyright symbol in is only log_2(22,459,157,718,361) ~= 45, because it's the index we're after.

That gives a roughly 7x7 grid to draw © in, which is pretty limited and might be the minimum viable for a clean ©. (And that's been a touch generous by allowing 0 padding on the front of the number.)

No, the index of the 22-trillionth digit only needs 45 bits.

Found the link in the comments to the "Walk made with the first million digits in prime numbers" very interesting!


Previously on HN, someone posted the The Corpus Christi Prime blog post [1], which was inspired by the Numerphile Trinity Hall Prime video [2].

See also Zachary Abel's Prime Portraits article [3].

[1]: https://news.ycombinator.com/item?id=15208317

[2]: https://www.youtube.com/watch?v=fQQ8IiTWHhg

[3]: http://archive.bridgesmathart.org/2016/bridges2016-359.pdf

See also Tupper's Self-Referential Formula [0] and the Numberphile explanation [1] for generating bitmap plots of itself.

[0]: http://mathworld.wolfram.com/TuppersSelf-ReferentialFormula....

[1]: https://www.youtube.com/watch?v=_s5RFgd59ao

Cool but old. It can be done in any number base, e.g. in decimal: https://www.youtube.com/watch?v=fQQ8IiTWHhg

I think it would look much better had he put a frame around it with binary ones. It would not only hint about how to decode it, but it would also make it more likely to be prime (ending with huge number of 1s) and hide the ugly noise.

I would also like to conjecture that there are infinitely many primes that look like a giraffe in binary, so it's nothing that unusual.

I guess I should comment here, but I don't have much to add. Most of the questions that arise naturally about how this was done, and about primes in general, have already been asked and answered.

Still, given my username I should add something ...

So if the size was prime x prime, then the fact that the number of bits is a semiprime tells you the size of the image, even if all you have is a stream of digits. Then the "noise" at the end will tell you what the least significant bits are, and hence you get an image without very much ambiguity.

Just thought I'd mention that.

Neat. Reminds me of illegal primes: https://en.m.wikipedia.org/wiki/Illegal_prime

It’s pretty much the same thing - take some arbitrary data, and add some bits so that the result is prime. Not surprising from a mathematical point of view since primes are very common, but it’s a fun thing to see.

> primes are very common

Isn't it thought that they get more and more rare the further you go away from zero in the positive direction?

By the way I just realized why no negative numbers are prime and that's because -1 is a factor in all of them. Quite obvious but I just never thought about it before.

That's not exactly true. E.g. -1 is also a "factor" of 3: -3 * -1 = 3

In number theory we define primes as a subset of the natural numbers, which excludes negative integers. But when you move on to abstract algebra, the definition of a prime element in a commutative ring is an element p, such that p|ab implies p|a or p|b. Using this definition, we can conclude that the prime elements in the ring of integers are both the positive and negative primes.

Yes, the primes become more sparse, but because there are countably infinitely many of them, and the natural numbers (of which they are a subset) are merely countably infinite, this counts as "common".

Our intuitions about this stuff are not very helpful in mathematics. For example, _most_ numbers are normal in any base (if you wrote them out as a decimal fraction they'd have an evenly distributed amount of all the different digits, forever), and we can prove this is true of the set of real numbers. But even though we know it's true, we have almost no definite examples of such numbers: the closest is that we think pi might be normal but can't prove it, and we know Chaitin's constants are normal for sure, but by definition there's no way to write one of those down...

-1 is not a prime, so that doesn't work. The actual reason is a little more boring - negative numbers are not prime by the definition of a prime number: "a natural number greater than 1 that cannot be formed by multiplying two smaller natural numbers" (Wikipedia).

There's another definition that doesn't have to exclude one explicitly. I think it's a natural number not divisible by any natural number smaller than itself.

*other than one

damn what was it though ... can't remember where I read it.

Seems like an idea for a (lighter) proof of work: find a prime number in a certain format (and the harder it is, the less noise it should have).

Did somebody stumble upon this by accident, or did they deliberately look for a prime number and play with the layout until they saw a picture?

Either way, this is really cool.

They made/got the picture, converted it to a number (by interpreting it as binary), then incremented the number until they found the next prime. The incrementing is what made the "noise" in the bottom right.

Thanks for explanation. It does take away the magic, though.

It's still way cool, but how much cooler would it have been if a mathematician had stumbled upon this by accident...

In my mind, math + giraffes =


There shouldn't be two answers for that equation. That is just too weird of a coincidence.

Mods, please add ["Contact" (the novel, not the movie) spoiler alert] to the title.
