The author claims they used binary decision diagrams, but I haven't figured out how they were able to apply them to this problem. I'd be interested to know if anyone knows how you might approach this.
The space of possible solutions is truly astronomical. I was able to write a program (https://github.com/jimrybarski/autogram) that I estimate could exhaustively enumerate all solutions for a given "prefix" in about a month (all other solutions I found do a probabilistic search). After 12 hours it found what I would argue is the best possible bar trivia team name:
"This bar trivia team name has seven a's, two b's, one c, two d's, thirty-two e's, five f's, one g, five h's, ten i's, one j, one k, two l's, three m's, twenty-two n's, seventeen o's, one p, one q, five r's, twenty-two s's, twenty-two t's, one u, nine v's, eleven w's, one x, five y's and one z."
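For anyone who wants to check a candidate like the one above, here is a minimal sketch of a verifier. It is not taken from the linked repository; the names `claimed_counts` and `mismatches` are made up for illustration, and it assumes counts are spelled in English, stay below one hundred, and are written as "seven a's" or "one c".

```python
import re
from collections import Counter

SMALL = ("one two three four five six seven eight nine ten eleven twelve thirteen "
         "fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()
VALUE = {w: i + 1 for i, w in enumerate(SMALL)}
VALUE.update({w: 10 * (i + 2) for i, w in enumerate(TENS)})

# A word (optionally hyphenated), a space, a single letter, an optional "'s".
CLAIM = re.compile(r"\b([a-z]+(?:-[a-z]+)?) ([a-z])(?:'s)?(?![a-z'])")

def word_value(word):
    """Value of a number word such as 'thirty-two', or None if it is not one."""
    parts = word.split("-")
    if any(p not in VALUE for p in parts):
        return None
    return sum(VALUE[p] for p in parts)

def claimed_counts(sentence):
    """Map each letter to the count the sentence claims for it."""
    claims = {}
    for m in CLAIM.finditer(sentence.lower()):
        value = word_value(m.group(1))
        if value is not None:
            claims[m.group(2)] = value
    return claims

def mismatches(sentence):
    """Return {letter: (claimed, actual)} for every letter where the two differ.

    An empty dict means the sentence is a valid autogram.
    """
    claimed = claimed_counts(sentence)
    actual = Counter(c for c in sentence.lower() if c.isalpha())
    letters = sorted(set(claimed) | set(actual))
    return {l: (claimed.get(l, 0), actual[l])
            for l in letters if claimed.get(l, 0) != actual[l]}
```

Calling `mismatches(...)` on the full team name should print an empty dict if every count is right; any letter that shows up is one where the claim and the actual count disagree.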
You can first divide the solution in two parts, 1) the letter selection and 2) finding a valid permutation.
There's a fairly small set of letters that you need to enumerate, determined by both your initial 'seed' text and the letters that appear in the number words, especially if you treat 'one [a-z]' as seed text. That is to say, the number of letter sets that differ in which number words you can spell with them is small.
If you know which letters you have to write out, and you know all of those must have 2 or more instances, you can add the "'s" suffixes and a single instance of each of those letters to the seed text. At that point you just need to worry about selecting numbers that, including their own letter counts, sum to the seed text constants. This can be easily expressed as an ILP.
The ILP defines an array of 26 booleans for each number we can spell, representing whether that number is the count of that letter. We add a constraint that, for each letter, exactly one number is active (the booleans for that letter sum to 1). Next we add constraints that tie each letter's chosen count to the known constant from the seed text plus the letter counts of all the active number words.
But as it turns out that is overkill. A more traditional and flexible solution is a tree search that keeps track of the lower and upper bounds of the letter counts and picks number words within those bounds without associating them with a specific letter yet; you do the association at the end. Runtime is about 15 minutes in Python for most numeric bases/seed texts/variations.
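One way to write that ILP down, as a sketch only. The symbols are my own notation, not the commenter's: $x_{\ell,n}$ is the boolean "letter $\ell$ occurs $n$ times", $s_\ell$ is the number of times $\ell$ occurs in the seed text, and $c_\ell(n)$ is the number of times $\ell$ occurs in the English word for $n$.

```latex
\begin{align*}
  & x_{\ell,n} \in \{0,1\}
      && \text{for each letter } \ell \text{ and each spellable number } n \\
  & \sum_{n} x_{\ell,n} = 1
      && \text{for each letter } \ell \\
  & \sum_{n} n \, x_{\ell,n} \;=\; s_\ell + \sum_{\ell'} \sum_{n} c_\ell(n) \, x_{\ell',n}
      && \text{for each letter } \ell
\end{align*}
```

The last line says that the count chosen for $\ell$ has to equal what the seed text contributes plus what every chosen number word contributes.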
Your solution(s) are at least 1000x funnier and more beautiful than R. H. Hardin's.
Every one of his solutions starts with, "This sentence" and ends with a semi-predictable `permutation`.
Your manual solution-interestingness-validation step, paired with its chaotic letter-space-fitness-searcher-generator (c), gave more interesting word combos. Way cool! Thanks!
Your ability to work more freely within the same word space shows you have a better grasp of the spatial information density, and you've put together a better handle on solving it.
Like you said though, "the space of possible solutions is truly astronomical."
I think the only reason why this is even possible or remotely effective is that you can, in parallel, do high depth searches on a machine. One would be dead in the water without a searchlight, having to find a solution at any more primitive a level.
But, I wonder if you could operationalize other things with an algorithm for this search space and put it into a library which would have multiple dependent packages, solving many use cases.
I'd imagine this work could adapt to other geometric, high-depth, higher-value searchable spaces. It seems similar to Markov-chain Monte Carlo, "pre-mined" / "rainbow tabled"... it has multiple solutions.
Perhaps a good place to start would be in constructing these lookup tables using lexicographical matrices, from a dictionary. I believe Rust has a powerful library to do this: https://github.com/rust-ndarray/ndarray
$ echo -n 'The SHA256 for this sentence begins with: one, eight, two, a, seven, c and nine.' | sha256sum
182a7c930b0e5227ff8d24b5f4500ff2fa3ee1a57bd35e52d98c6e24c2749ae0
z yields a finite sentence, a fixed point of the procedure of finding all the instances of that letter and enumerating their positions. Some find non-trivial fixed points, like
'f' is the first and seventh letter in this sentence.
My procedure for 'l' remains finite but finds no fixed point, switching between
'l' is the first and twenty-eighth letter in this sentence
'l' is the first and twenty-seventh letter in this sentence
neither of which is true. 'c' cycles through (1st, 20th and 56th), (1st, 52nd), (1st, 22nd, 44th), none of which are true.
Others diverge, like 'n' and 'e'
'e' is the first, sixth, [... no fixed point found ...]
where the tail of the sentence just explodes on every iteration. What happens is also sensitive to the [presumed] end of the sentence 'letter in this sentence'.
The most interesting one is 'r', which doesn't explode in length after 100 iterations but (if it loops, which I haven't established) has a very long period.
I also found a few fixed points like
'a' is the first and twelfth letter of this sentence
which doesn't come from iterating on "'a' is the first letter of this sentence"; I think it's an Eden.
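The iteration described above can be sketched roughly as follows. This is my reconstruction, not the commenter's code. It assumes positions count letters only (spaces and punctuation skipped), the ending is 'letter in this sentence', the list is joined with commas and a final "and", and the search gives up once a position passes 999. The helper names are hypothetical.

```python
ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "_ _ twenty thirty forty fifty sixty seventy eighty ninety".split()
IRREGULAR = {"one": "first", "two": "second", "three": "third", "five": "fifth",
             "eight": "eighth", "nine": "ninth", "twelve": "twelfth"}

def cardinal(n):
    """English words for 1..999, e.g. 28 -> 'twenty-eight'."""
    parts = []
    if n >= 100:
        parts.append(ONES[n // 100] + " hundred")
        n %= 100
    if n >= 20:
        parts.append(TENS[n // 10] + ("-" + ONES[n % 10] if n % 10 else ""))
    elif n > 0:
        parts.append(ONES[n])
    return " ".join(parts)

def ordinal(n):
    """English ordinal for 1..999; only the last word changes form."""
    words = cardinal(n)
    cut = max(words.rfind(" "), words.rfind("-")) + 1
    head, last = words[:cut], words[cut:]
    if last in IRREGULAR:
        last = IRREGULAR[last]
    elif last.endswith("y"):
        last = last[:-1] + "ieth"
    else:
        last += "th"
    return head + last

def sentence(letter, positions, ending="letter in this sentence"):
    names = [ordinal(p) for p in positions]
    listing = names[0] if len(names) == 1 else ", ".join(names[:-1]) + " and " + names[-1]
    return f"'{letter}' is the {listing} {ending}"

def positions_of(letter, text):
    """1-based positions of `letter`, counting alphabetic characters only."""
    letters = [c for c in text.lower() if c.isalpha()]
    return tuple(i + 1 for i, c in enumerate(letters) if c == letter)

def iterate(letter, max_steps=100, ending="letter in this sentence"):
    """Start from "'x' is the first letter ..." and re-describe until it settles."""
    history = [(1,)]
    for _ in range(max_steps):
        nxt = positions_of(letter, sentence(letter, history[-1], ending))
        if nxt == history[-1]:
            return "fixed", [nxt]
        if nxt in history:
            return "cycle", history[history.index(nxt):]
        if nxt[-1] > 999:  # beyond what ordinal() spells; treat as diverging
            return "too long", [nxt]
        history.append(nxt)
    return "undecided", [history[-1]]

print(iterate("f"))  # ('fixed', [(1, 7)])
print(iterate("l"))  # ('cycle', [(1, 28), (1, 27)])
```

Passing `ending="letter of this sentence"` changes the outcome for some letters, which is the sensitivity to the sentence ending mentioned above.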
On a white board at work a while back, someone had drawn pictures of a few little things like flowers, houses, people. Next to each was a number, like "3 flowers".
Then someone wrote, "The sum of all numbers on this board is 15"
Then 15 got crossed out and replaced by "30"
And so on. You can see where this goes.
Anyway, reading this, it seems that self-enumerating sentences that tell you the sum of all numbers in the sentence aren't possible: the stated sum counts itself, so everything else has to add up to zero. That only works if there is just 1 number, or all zeros, or balanced positive and negative numbers that sum to zero.
It could work if you allow arithmetic operators and/or their English names, or if you allow any ways of naming numbers other than with their decimal digits alone.
A simple example might be: "4 flowers, 4 cats, 4 dogs. The sum of all the numbers on this board is 4 squared."
Or: "4 flowers, 4 cats, 2 dogs. The sum of all the numbers on this board is 4²."
I find things like this fascinating. Ever since discovering quines, I've thought that it would be neat to use some kind of quine as humanity's radio "calling card" in space. As a puzzle/joke for other beings to enjoy.
Just for fun I've mocked up what such a message could look like [0]. I'm not convinced it totally works... But in any case, I'd thoroughly recommend checking out the inspiration [1] if you haven't seen it already.
There is also another kind of metacircularity that does not bind meaning with letters but with the enunciation context.
Two examples coming from underground 90s hip hop:
- Smoked Out Productions, Bok Bok: "Where my niggas at ? In the range of my voice". This is very faint, but it's here.
- More convincingly, Alps Cru, Words From My Thoughts: "As my thoughts are merged, these are those words."
- Here is a poemnigma I wrote to a friend:
______Qui suis je ?______
M E V O I C I S U L T A N
N U I T I S M E V O C A L
S I L I C E M O U V A N T
I O V E N T M U S I C A L
L E V O I C I M U S A N T
M O T I N C L U S A V I E
C E M O T I V A L I N S U
L A V O U S E M I N C I T
C U L T I V O N S A M I E
S I I L V A U T C E N O M
S O N V A L M U E T I C I
Indice:
S I N U L V I C E A M O T
V U I C I S L A M E T O N
N O M C E S T V I A L U I
V A I N C U T E L O M I S
Translation:
___Who am I ? ____
Here I am Sultan !
Vocal nightism
Moving silica
Io, musical wind !
Here it is drifting
This lifetime word
This hidden motive
Here it slices you thin
Let's cultivate my friend
If it equates this name
Its silent vale here
____Hint____
If no word-vice
As seen here slams your
Name it's through it
Vanquished like this, omitted
Explanation:
All the lines are built up using the same set of letters, and every line either hints at the encryption mechanism (anagrams) or at the nature of the key (a person's name). I can't wrap my head around the "cryptographic setting" here, but it looks like a two-level system where both the encrypted and clear forms of the message have a meaning. The meaning of the encrypted form relies on common sense, patience and a certain sense of observation to convey to Bob both what the encryption algorithm is (anagrams) and what kind of key it uses (a person's name), like any public algo. However, the key appears to be... the message's true target!
Of course it's not cryptography strictly speaking, and I'm not even sure it's conveying any meaning. At most I can cause Alice to raise her hand among the crowd of message readers, but that's all. It's like cryptography, but it works at another level: it's not about concealing the meaning of the message, but concealing its enunciation context. This is about catching someone's attention by broadcasting a message among a crowd without them knowing who is targeted and without prior coordination with the target.