Hacker News new | comments | ask | show | jobs | submit login
The Soviet license plate game and Kolmogorov complexity (johndcook.com)
168 points by weinzierl 14 days ago | hide | past | web | favorite | 32 comments

Huh, I thought I was the only one playing this weird game in my head. I used to live in a country where license plates are all numeric digits. But with 4 digits and my limited knowledge of math operations (I was a child), it was hard to find plates that "had solutions". So I gave myself a little wiggle room by imagining the numbers were written as they were on digital alarm clocks, where certain numbers (like {2,3,5} and {6,9}) required the same number of "bars" to represent [1]. I added a rule that the bars can be rearranged to form different numbers (e.g. 2, 3 and 5 can be rearranged as each other).

Eventually, I added another rule that allowed the bars to be "concatenated," for example, "1" with two bars and "2" with 5 bars can be added to form "8", which requires 7 bars.

[1] https://goo.gl/images/t2njyy

Trying to understand if a plate number is a prime also works quite well :) Though local congestions are rarely large enough to allow checking for factors over 11 or 13.

The article falls a bit short I think. What if you consider each function as one character? What about automatic generation of solutions? (Maybe by dynamic programming?) IMHO it proposes an interesting problem and brushes it off.

The problem of complexity is solved neatly in Prolog and Erlang¹, by counting “reductions”. For simple tasks like in the article,

one function call ≅ one reduction.

Loops? There are no loops in Erlang, they are usually tail-recursive function calls.

¹) in Erlang it plays a very important role because the VM uses that information for scheduling.

> In theory you can enumerate all Turing machines of a given length, or all Python programs of a given length, and find the shortest one that does a given task, but the list grows exponentially with length.

Another problem with this strategy is that it generally requires solving the halting problem: you can only know whether a program "does a given task" when you know whether it halts or not.

That's only true if you care about knowing when you are done enumerating.

The definition of enumeration commonly used in computability does not require that you know when you are done enumerating.

> The definition of enumeration commonly used in computability does not require that you know when you are done enumerating.

Sorry, but this is a non-sequitur. In order to find (by enumeration) the shortest program producing a certain output, you absolutely do need to know when you're done, and thus you do need to solve the halting problem.

Ah, you're right. I was focusing too much on the statement itself and not the context of it.

How can you make progress if your first generated program does not halt?

Run things in parallel.

The idea behind most enumeration algorithms looks like this:

    machines = {}
    next_program = 0
    loop forever:
        for machine in machines:
            if machine.halted():
                if machine has output we want:
                    yield machine

An interesting application of the Kolmogorov complexity to measure the similarity of various languages etc.

"Abstract—A new class of distances appropriate for measuring similarity relations between sequences, say one type of similarity per distance, is studied. We propose a new “normalized information distance,” based on the non computable notion of Kolmogorov complexity, and show that it is in this class and it minorizes every computable distance in the class (that is, it is universal in that it covers all computable similarities). We demonstrate that it is a metric and call it the similarity metric. This theory forms the foundation for a new practical tool. ..."


The issue with trying to apply Kolmogorov complexity to extremely small data (4 digits here) is that the universal Turing machine itself might take more than 4 digits to write down. In principle, one could extend it to any sufficiently long pair of n digits, and then the question would be mathematically robust (though still uncomputable).

Also, for what it's worth, for large enough n, probably there are no interesting short answers. C(x|y) - the conditional complexity of any string x given any other string y is close to maximum (approximately the length of x) for most pairs x and y. This means, for most pairs x and y, there will be no simple short equations connecting x to y, regardless of the mathematical operators we take.

"In theory you can enumerate all Turing machines of a given length, or all Python programs of a given length, and find the shortest one that does a given task, but the list grows exponentially with length."

Can it be proven that the shortest program is always the fastest?

This is the problem I had when studying mathematics and still have with mathematics to this day: mathematics when stripped and laid bare is just clever transformations, "pouring from hollow into empty", as the old folks' saying goes. What is or are the practical applications of Kolmogorov complexity, how might I be able to turn this to my advantage?

It is rarely the case that the shortest program is the fastest.

To see a simple such phenomenon (very roughly) in the case of Kolmogorov complexity, consider two programs to print a sequence of 0x1000000 zeros.

Program 1:

   for i = 1 to 0x10000000:
     print 0
Program 2:

  i = 0x1 << 28;
  for j=1 to i
    print 0
(I may be off by a constant, but you get the idea). Program 2 is shorter, since it takes about log 28 bits (the other code has basically constant length), while Program 1 takes about 32 bits (the remaining code being of a different constant length). They both produce the same string. Program 1 is longer, but "faster", since it does not have the left-shift loop.

Philosophically: Program 1 has identified that the string to be printed is a very simple one. Program 2 has exploited the additional fact that the length of the string to be printed, itself is a very simple number, and has a short description. But then it needs additional time to figure out how many zeroes to print, since it needs to decode the length first.

This effect can be amplified with larger constants involved.

Kolmogorov complexity is a "well-defined" notion, hence mathematically nice, it is just that it is not computable. That's bad, but not catastrophic. The halting problem is undecidable, C++ compilation with templates is undecidable [1] - that does not make coding useless.

About the larger problem of dry mathematical textbooks, I agree with you. Some of it just boils down to the fact that the pure and the applied side are often tightly segregated, and very few mathematicians care about both. When you talk to practising mathematicians, you often get a livelier picture.

[1] https://stackoverflow.com/questions/794015/what-do-people-me...

In most cases the shortest program is the fastest, because most bitstrings are random, and the shortest program just prints the bitstring.

On the other hand, for compressible bitstrings, the shortest program is most likely not going to be the fastest, because you need at least one operation to output each final bit, so since the shortest program is going to be doing more than just printing the output, it's probably going to take longer than just printing the output.

I'm reading a 5 page paper from “German Aerospace Center Aviation and Space Psychology” titled “DLR - Basic Knowledge”. Under “Training Mathematics” section it says “Try to quickly first add, then multiply the digits of the licence plates of the cars in front of you when waiting at a traffic light”. So, I guess they wrote this because of Lev Landau.

One solution to the problem of measuring the length of the Python program is to turn it into machine code and measure that instead:


Doesn't this just kick the problem down the road? Now you've introduced the optimisation of the compiler abd the expressiveness of your processor's machine language. The problem of knowing you have the "best" solution remains.

I think if you are trying to come up with a canonical languahe to measure kolmogorov complexity in you can do better, like Binary Lambda Calculus.


The same mathematical problem or operation can have several different Kolmogorov complexities, depending on how efficient an algorithm is when solving the problem.

I don't think there's any way around it. If I create a Turing Machine that calculates 1+1 by counting all the atoms in the universe, it's going to have a high complexity.

The compiler might optimize out the entire calculation because it won’t change at runtime.

And, of course, one must always ask what is the least number of bits required to explain Kolmogorov complexity.

> In theory you can enumerate all Turing machines of a given length, or all Python programs of a given length, and find the shortest one that does a given task, but the list grows exponentially with length.

That works as long as you solve the halting problem first.

You don't have to do anything impossible, just something exceedingly impractical. Because we know that the function doesn't halt if its runtime exceeds the busy beaver number for that length. Now, we will obviously not know that for any non-trivial length, but in theory it's fine.

> Now, we will obviously not know that for any non-trivial length, but in theory it's fine.

It's not fine in theory because we can't know the busy beaver number for any Turing machine larger than the one doing the enumeration.

You can't know the busy beaver number without solving the halting problem.

For that, you can replace Kolmogorov complexity with Levin complexity, which penalizes the Kolmogorov complexity by adding the log of the execution time.

Since sufficiently advanced Turing machines are all equivalent up to a multiplicative factor, Levin complexity is defined up to a constant, as is Kolmogorov complexity.

> √74 = sec arctan sec arctan … √44.

He meant "sec arctan sec arctan … √74 = √44." (sent him a message)

Also why not just applying the power 0 (assuming numbers are different than 0) both sides. Would be a school-level operation and very likely the shortest.

You can't write down new numbers in the formula.

Oh, I read too quickly, the ° is uses is the degree "operator" (it really isn't an operator IMHO).

You could argue that the degree operator is just another name for multiplication by the constant pi/180.

Some pretty advanced high school math operators for this game to work

In a standard curriculum this should all be covered by 10th grade.

Applications are open for YC Summer 2019

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact