Hacker News
Storing binary data in playing cards (2014) (timwarriner.com)
77 points by vmoore 9 months ago | 36 comments



It's always been pretty mind-blowing to me that you can store over 225 bits of information just in an ordering of one deck of cards. A standard way to store data in this way is a factoradic encoding like Lehmer coding:

https://en.wikipedia.org/wiki/Lehmer_code

The python library 'permutation' has some functionality around this that's fun to play with:

https://permutation.readthedocs.io/en/stable/#permutation.Pe...

I found new joy in shuffling a deck of cards, after learning that every (proper) shuffle that every human's ever done has returned a unique deck that nobody's ever held before.

edit: I just remembered a guy who made a javascript demo that encodes small strings into a card deck order: https://jerde.net/peter/cards/cards.html (explanation page linked)
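For the curious, the permutation-to-number mapping can be sketched in a few lines of Python. This is a hedged illustration of the Lehmer-code idea, not the linked library's API; the function names here are made up:

```python
from math import factorial

def perm_to_index(perm):
    # Lehmer code: for each position, count how many later
    # elements are smaller, and weight by a factorial.
    n = len(perm)
    index = 0
    for i in range(n):
        smaller = sum(1 for j in range(i + 1, n) if perm[j] < perm[i])
        index += smaller * factorial(n - 1 - i)
    return index

def index_to_perm(index, n):
    # Inverse mapping: peel off factoradic digits and pick
    # from the remaining elements in order.
    remaining = list(range(n))
    perm = []
    for i in range(n):
        digit, index = divmod(index, factorial(n - 1 - i))
        perm.append(remaining.pop(digit))
    return perm

# A 52-card deck order maps to a number below 52!, i.e. ~225 bits.
print(perm_to_index([2, 0, 1]))  # → 4
print(index_to_perm(4, 3))       # → [2, 0, 1]
```

With n = 52 the index fits in a Python big integer, so the round trip works on a full deck too.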


Yeah, combinatorics is neat and frightening.

It's crazy that things have sensible probabilities at all; surprisingly often, though, the effective numerator and denominator are of comparable size, despite the magnitudes involved.


What? No, you can store 52! unique states with a pack of playing cards. Just map each state to a data set. /j


`52!` is indeed between 225 and 226 bits of information.

    > 52!
    = 80,658,175,170,943,878,571,660,636,856,403,766,975,289,505,440,883,277,824,000,000,000,000

    > 2^225
    = 53,919,893,334,301,279,589,334,030,174,039,261,347,274,288,845,081,144,962,207,220,498,432

    > 2^226
    = 107,839,786,668,602,559,178,668,060,348,078,522,694,548,577,690,162,289,924,414,440,996,864
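The same check is quick in any language with big integers; a Python one-off, for example:

```python
from math import factorial

n = factorial(52)
assert 2**225 < n < 2**226
# bit_length() is floor(log2(n)) + 1 for positive n
print(n.bit_length())  # → 226
```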


I used this fact in an interview ages ago. The interviewer wanted a function, in Java, that shuffled a deck of cards such that every permutation was equally likely. I pointed out this was not possible using the standard library random functions since the seed is a long (akshually... it's 48 bits).

They inexplicably hired my know-it-all ass...


You can show this with an easy one-liner in a lot of languages, e.g. Julia:

  $ julia -E 'log2(factorial(big(52)))'
  225.581003123702761946342444376665612911126819036757601937313805865615023286654
(It's also just math, of course, but it's nice to have a quick way to check these things.)


Since log(xy) = log(x) + log(y), you can simply calculate sum(log2(i) for i in range(1, 53)) :) It might be a nicer formulation for when you don't have arbitrary-precision support.
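In Python that log-sum formulation looks like this (no big integers needed, just floating-point log2):

```python
from math import log2

# log2(52!) = log2(1) + log2(2) + ... + log2(52)
bits = sum(log2(i) for i in range(1, 53))
print(bits)  # ≈ 225.581
```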


Or you can use the `bc` that's shipped with macOS.

    echo 'l(f(52))/l(2)' | bc -l
    225.58100312370276194868
(`l(x)` is natural log, hence the `/l(2)` to convert to `log2`)

Annoyingly, GNU `bc` doesn't have `f(x)`.


This reminds me of Pontifex, the playing card based cipher designed by Bruce Schneier for Neal Stephenson's Cryptonomicon.

https://www.schneier.com/academic/solitaire/


The author draws a comparison right on the front page:

> My method stores data within a deck of cards, whilst his uses the deck as a key to encrypt data kept elsewhere. His method is more useful if you want to encrypt a lot of data, while mine could be more useful if you want to smuggle a shorter hidden message without any possibility of detection


I don't understand the claim that this is more efficient; it's trivial to find a method that's information-theoretically optimal: choose an arbitrary order for the cards, and encode log(52!)/log(2) bits as the lexicographic order of the permutation, doing a binary search each time.

Shouldn't be too hard to do even with pen and paper since the 2-adic eval of 52! is large.


OP's code still only has 52! different binary strings that can be encoded, but they vary in length from 52 to 1378 bits (and are the leaves of a binary tree). This is handy for easily encoding sequences drawn from a non-uniform distribution, like strings of English letters, by giving each letter a fixed-length binary codeword. It's sort of flipped from the more common method of giving symbols a varying-length binary codeword (e.g. with Huffman coding) and encoding a fixed-length binary string in a deck of cards (or anything else).


For people like me who didn't know what a "2-adic eval" is, it's a fancy way of saying that 52![1] can be divided by two 49 times and still give an integer.

If you factor the number 52! (https://wolframalpha.com/input?i=factor+52%21), you can see the term 2^49. If you divide 52! by two 49 times, you are still left with an integer.

[1]: 52 factorial, 52*51*50*...*2*1, the number of different orderings of the 52 cards in a standard deck


And "2-adic eval" is

    The 2-adic valuation of a number is the largest number of times you can divide it by 2 and still get an integer. 
https://math.stackexchange.com/questions/434931/elementary-q...


He isn't making the claim that it can encode more bits than the information-theoretic maximum you suggest _on average_. On average his method is similar to or worse than optimal. But some messages are effectively compressed in this encoding, and so more plaintext bits can be encoded. These better-than-average messages "borrow" bits from the worse-than-average messages, similar to how a compression algorithm works.


> Shouldn't be too hard to do even with pen and paper since the 2-adic eval of 52! is large.

Could you elaborate?


https://en.wikipedia.org/wiki/P-adic_valuation

It's nothing fancy, get the prime power decomposition of your number and pick the exponent of p.

There's a clever way to do that for a factorial, but I have the Pari/GP app on my phone so I just did:

    valuation(52!,2)
which gives the answer 49, so 52! is divisible by 2 forty-nine times. Interestingly chatgpt4 turbo got it right with no extra prodding needed.


One can do this mentally easily enough. 52! = 52*51*50*...*3*2*1, and there are 26 even numbers in this product, so we have 26 powers of 2. Taking them out, 13 of those even factors are divisible by 4, but we already took out the first 2 from each, so we have 13 more 2's, giving 26 + 13 = 39. Now on to factors divisible by 8: they are half of those that are divisible by 4, so half of 13 rounded down, giving 6 more 2's. Thus so far we have 39 + 6 = 45 two's in the factorization of 52!. On to numbers up to 52 that are divisible by 16: that's half those divisible by 8, so 3 more, getting us to 48. Finally there is the factor 32 = 2^5 of 52!, giving us one more 2, hence 49.

In general, for p a prime, the largest k such that p^k divides n! is given by k = Floor(n/p) + Floor(n/p^2) + ... + Floor(n/p^t), where p^(t+1) > n.
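That closing formula (Legendre's formula) is easy to turn into code; a quick Python sketch, with a hypothetical function name:

```python
def valuation_of_factorial(n, p):
    # Exponent of prime p in n!: floor(n/p) + floor(n/p^2) + ...
    # The loop stops once p^k exceeds n, matching the formula's cutoff.
    k = 0
    power = p
    while power <= n:
        k += n // power
        power *= p
    return k

print(valuation_of_factorial(52, 2))  # → 49
```

The terms for p = 2 and n = 52 are 26 + 13 + 6 + 3 + 1, the same counts as the mental computation above.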


Does not seem right, the number is way too low.. after all, just the last factor (52) can be divided by 2 at least 5 times.

My calculator says 225 bits, and text suggests the same. Looks like chatgpt4 was wrong as usual:)


The 2-adic valuation is about how often 2 is a prime factor of a number.

For just 52 for example 2 is a prime factor twice, because (52/2)/2 = 13, which is no longer divisible by 2.

Or in other words 52! / (2^49) is an integer, but 52! / (2^50) is not, thus 49 is the correct answer.


When you see something that doesn't look right it's good to engage and work things out, but it's also courteous to check that you haven't misunderstood. I see how you could arrive at the understanding you had, that "how many times you can divide by 2" is equivalent to the base-2 logarithm. It's not the right interpretation, however, and in context it's clear.

Could I recommend phrasing this kind of comment as a question in future? (Notwithstanding the lifehack that making a false statement on the internet is the shortest path to an answer.)


Fair, I should have rephrased the comment to more directly reference the thread-starter, which is encoding bits "using lexicographic order of the permutation, doing a binary search each time." It's not that your computation of 2-adic decomposition is wrong, it's the idea that using 2-adic decomposition produces the number that is too low.

Let me elaborate:

I am not 100% sure what user qsort meant by "binary search", but one of the simplest manual algorithms I can think of is to use input bits as decision points in binary-search-like input state split: you start with 52 cards, depending on first input bit you take top or bottom half of the set, then use 2nd input bit to select top or bottom of the subset, and so on, repeat until you get to a single card. Then place it in the output, remove from input stack, and repeat the procedure again. Note there is no math at all, and this would be pretty trivial to do with just pen & paper.

What would be the resulting # of bits encoded this way? With 52 cards, you'd need to consume 5 to 6 bits, depending on input data. Once you are down to 32 cards, you'd need 5 bits exactly, 31 cards will need 4-5 bits depending on the data, and so on... If I've calculated this correctly, that's at least 208 bits in the worst case, way more than the 51 bits mentioned above.

(Unless there is some other meaning to "51" I am missing? But all I see in the thread are conversations about bit efficiency...)


To be clear I agree with your interpretation about how much data you can store in the deck permutation and how to search it, my previous comment was only about p-adic valuations. I can't actually see how the 49 is relevant either.


And 50 of the factors of 52! are greater than 2.


The basic method would be to assign a number, 0 through 52!-1, to each permutation in lexicographic order. Because 52! is not a power of 2, if you want to encode binary bits, you can only use 2^N permutations, where 2^N is the largest power of 2 less than 52!. You cannot losslessly encode more than N bits; that's a hard bound, they just won't fit.

If you wanted to turn this into an actual protocol, you would presumably flag some permutations as invalid and use the other ones. You would then encode one bit at a time doing a binary search of the set of valid permutations.

Because 52! has a large number of 2s in its factorization, for a careful choice of the valid permutations it should be practical (or at least not significantly more impractical than the OP's proposed method) to perform this by hand because you would be able to "eyeball" most splits of the binary search.


They seem to be ignoring another useful aspect of a deck of cards. Each card could be placed face-up or face-down in the deck.


And another 51 bits of info for whether the picture of the puppy on the back of the card is right-side-up or upside-down! (Or, absent puppies, any other asymmetrical pattern or image that you chose for the message deck).

Because your recipient has to be able to determine the reference orientation of the deck, you get 51 bits of extra information from puppy orientation, and another 50 bits of extra information from face-up/face-down orientation.

To place the deck in correct orientation, in preparation for decoding, ensure that the top and bottom card are face up, and that the puppy on the back of the top card isn't upside-down.

An extra 101 bits of information is significant!


There’s additional information because at least some faces can be rotated - the 3 of spades has to have two spades ♠ up and one down, or vice versa.

And some face cards are not mirror images, but that depends on the deck.


With an asymmetrical back design, the orientation of the face carries no extra information, since orientation information is already carried by the orientation of the design on the back.

For decks with a symmetrical back design, the following cards have asymmetrical faces:

- The seven of diamonds.

- The ace, three, five, seven and nine of hearts, clubs and spades.

In my deck (a standard design), none of the face cards are asymmetrical.

So there are sixteen cards that carry orientation information, one of which must be used to define the reference orientation of the deck, yielding 15 additional bits of information.


This reminded me of the De Bruijn Sequence card trick.

https://sites.math.washington.edu/~billey/classes/562.winter...


This reminded me of T-Codes, a form of variable-length self-synchronising codes for communicating based on letter and word frequency.

https://web.archive.org/web/20180329104930/http://tcode.auck...


This is cool (especially the encryption), but certain sequences of bits can’t be encoded. If you have n items left in the sequence then you cannot encode a run of n zeros. So it’s not a general encoding scheme.


Start with the deck in standard order. Before each step, number the remaining cards from 0 to N. Each card encodes a digit in a mixed radix number:

    a0 + 52*(a1 + 51*(a2 + 50*(a3 + ...)))
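That nested expression can be evaluated with Horner's rule from the inside out; a minimal Python sketch (function name invented here), where digit a_i ranges over 0 .. (52 - i) - 1:

```python
from math import factorial

def digits_to_index(digits):
    # Evaluates a0 + 52*(a1 + 51*(a2 + 50*(a3 + ...)))
    # by working from the innermost digit outward.
    index = 0
    for i in range(len(digits) - 1, -1, -1):
        index = index * (52 - i) + digits[i]
    return index

# All digits at their maximum give the largest index, 52! - 1.
top = digits_to_index([51 - i for i in range(52)])
print(top == factorial(52) - 1)  # → True
```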


First thing that came to mind: https://www.imdb.com/title/tt0060581

Of course that information would be redundant.


There is also the possibility to use Braille to encode binary data in cards, even though it was designed mainly for reading and writing.




