Kolmogorov Complexity (2014) [pdf] (informationtheory.weebly.com)
73 points by rfreytag on May 18, 2016 | 21 comments



You might also find this video relevant: it's titled "Kolmogorov Music" and attempts to apply the ideas from the paper to musical notation.

https://www.youtube.com/watch?v=Qg3XOfioapI


Excellent talk with great pointers to other material.

If anyone has a pointer to the Fogus talk he mentioned, titled "Macronomicon", please share it; that sounds intriguing.



Cool. Can anyone tell me what keywords may be useful in exploring this topic? Textbooks?


The standard reference is "An Introduction to Kolmogorov Complexity and Its Applications" by Ming Li and Paul Vitányi. It is the must-read text for the subject.

An introductory chapter is also available in "Elements of Information Theory" by Thomas Cover and Joy Thomas.

The field is called "Algorithmic Information Theory", and some of the big names in it are Ray Solomonoff, Gregory Chaitin, Ming Li, Paul Vitányi, Chris Wallace, David Dowe, Marcus Hutter, Jürgen Schmidhuber, and Jorma Rissanen, among others. All of them are great writers with breakthrough ideas!

If you are interested in "real world" applications of Kolmogorov complexity, I recommend taking a look at MDL (Minimum Description Length) and MML (Minimum Message Length).
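
Purely as a toy illustration of the two-part MDL idea (my own sketch, not from any of those references): choose the model minimizing total description length, here a crude BIC-style approximation with (k/2) log n nats for the k parameters plus (n/2) log(RSS/n) nats for the residuals:

    import numpy as np

    def mdl_score(y, y_hat, k):
        # Two-part code length (in nats): model cost + data-given-model cost.
        n = len(y)
        rss = float(np.sum((y - y_hat) ** 2))
        return 0.5 * k * np.log(n) + 0.5 * n * np.log(rss / n)

    # Toy model selection: which polynomial degree best "compresses" the data?
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 50)
    y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(50)

    scores = {d: mdl_score(y, np.polyval(np.polyfit(x, y, d), x), d + 1)
              for d in range(1, 10)}
    print(min(scores, key=scores.get))  # degree with the shortest description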

I hope you will enjoy the references! :)


Yet, having studied the chapter in Cover and Thomas, worked with people who have tried to apply the idea as an inference tool, and listened to talks by David Dowe explaining MML and its relation to MDL, I have come away with the impression that the intellectual interest in Kolmogorov Complexity is much, much greater than its actual usefulness.


That does tend to happen with incomputable functions. I once got an interesting hobby project out of an AIT paper, though: https://github.com/eligottlieb/Calumbda


True. I was compelled to write because I've noticed people being seduced by the notion of KC. It tends to be a dead end.


Computable things include CSSR (http://bactra.org/CSSR/), the considerably more bullshit Lempel-Ziv complexity (yeah, like LZ compression), and so on.

Lloyd (http://web.mit.edu/esd.83/www/notebook/Complexity.PDF) has a supposedly non-exhaustive list, but it is probably exhaustive enough for any purpose. Of those, the useful and non-banal ones, in my opinion, include Fisher information, Rényi entropy, VC dimension, the many ordinary computational complexities (of course), metric entropy, correlational structure, and channel capacity.


Hmm. You raise an interesting point. The LZ77 paper does indeed refer to an earlier work by Lempel and Ziv, which measures the complexity of a sequence by counting how many new subsequences it contains (roughly speaking). That earlier LZ paper in turn refers directly to Kolmogorov Complexity (hereafter, "KC").

Case closed, right?

But take a look at that paper (Jan. 1976 IEEE IT Trans. -- http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=1055501). The third paragraph acknowledges the KC approach of using the length of a hypothetical computer program, and refers to the usual suspects. But the fourth paragraph then specifically rejects that approach, and says they are taking a new and simpler approach to the problem, which counts prefix sequences.

See what I mean? They were able to get somewhere by explicitly not following the KC path (which involves the hypothetical sequence-generating computer program). LZ went with a weaker notion that was actually computable in practice. I think that weakening has characterized the more successful work that uses the general idea of complexity.
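
To make the contrast concrete, that phrase-counting notion fits in a few lines. A quick sketch of mine, following the 1976 paper's idea of counting new phrases rather than its exact formalism:

    def lz76_phrase_count(s):
        # Rough LZ76 complexity: number of phrases in a left-to-right
        # parsing where each phrase is the shortest prefix not yet seen
        # as a substring of the preceding text (overlap allowed).
        n, i, phrases = len(s), 0, 0
        while i < n:
            j = i + 1
            # Grow the phrase while it still occurs in the text
            # before its last character.
            while j <= n and s[i:j] in s[:j - 1]:
                j += 1
            phrases += 1
            i = j
        return phrases

    print(lz76_phrase_count("0001101001000101"))  # irregular: many phrases
    print(lz76_phrase_count("0" * 16))            # highly regular: 2 phrases

Unlike KC, this runs in finite time on any input, which is precisely the weakening I mean.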

The list you link to simply catalogs all approaches to measuring "complexity". Measuring complexity isn't the issue; the issue is that attempting to use KC as a complexity measure in real settings tends to lead nowhere. (Or to induce charlatanry, or both.)

And I categorically reject a couple of your examples. VC dimension has nothing to do with Kolmogorov Complexity. The term does not even appear in the index of Vapnik's magnum opus. And, having gone through Vapnik's proofs very closely during my PhD, been to his talks, driven him to lunch in my car, etc., etc., I can say that they make no reference to KC. None.

I similarly reject your comparison to Fisher information. It is the expected value of the negative Hessian of the log-likelihood. Sure, you could construct a specific problem where there is a relationship between FI and KC, but is KC really helping anyone understand FI?
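
For the record, the quantity I mean (one-parameter case, under the usual regularity conditions):

    I(θ) = E[(∂/∂θ log f(X; θ))²] = −E[∂²/∂θ² log f(X; θ)]

Nothing about program length appears anywhere in that definition.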


I was referring to measures of _complexity_, variously and ill-defined, which may or may not have anything to do with KC. FI and VC dimension are indeed not closely related to KC, but they are useful measures of complexity nonetheless: I was answering the question about useful complexity measures in general, not only those strictly related to KC. If I gave the impression that Vapnik was poking about with KC when he was thinking about VC dimension, I regret the misunderstanding.

But of course, they can be related to KC, because anything you say about anything computable can be related to KC: the question reduces to "is this complexity measure related to Turing machines?", and the answer had better be yes, unless you like writing fiction about complexity. But that "yes" is nearly always a vacuous "yes", a Turing tarpit.

As in other areas, if there were really a good definition of complexity, people would have standardized on it. Of the ones listed in Lloyd's note, the ones you would expect a normal programmer to know are Shannon information and computational complexity, which are outrageously useful. Fisher information is sometimes used in neuroscience, and VC dimension is used in ML, in both cases as rough measures of complexity that are attractive because they are computable.
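
For contrast with KC, empirical Shannon entropy is the kind of thing you can compute in five lines. A throwaway sketch:

    from collections import Counter
    from math import log2

    def empirical_entropy(s):
        # Shannon entropy (bits/symbol) of the empirical symbol
        # distribution of s: computable, unlike Kolmogorov complexity.
        counts = Counter(s)
        n = len(s)
        return -sum(c / n * log2(c / n) for c in counts.values())

    print(empirical_entropy("abab"))        # 1.0 bit/symbol
    print(empirical_entropy("abcabcabca"))  # ~1.57 bits/symbol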

There is, of course, a lot of charlatanry in the general measurement of complexity, and this is a thing not under dispute.


OK, I see where you're coming from. Thanks for clarifying.

The reason I pushed back is that I think KC has drawn disproportionate interest and/or tends to invite speculative nonsense. But, re-reading, I see your point as well.


Sipser's undergraduate textbook on the theory of computation[1] has a single, very nice, introductory chapter on the topic.

If you want to get technical and deep, you probably want [2] (An introduction to Kolmogorov complexity and its applications). It's pretty hard, though. Not for the faint of heart.

[1] http://www.amazon.com/Introduction-Theory-Computation-Michae...

[2] http://www.amazon.com/Introduction-Kolmogorov-Complexity-App...


I did a uni project based on that second book. Despite the title, it's hard to actually get to practical applications.


>If you want to get technical and deep, you probably want [2] (An introduction to Kolmogorov complexity and its applications). It's pretty hard, though. Not for the faint of heart.

Basically, learn real analysis first, with emphases on topology and measure theory. Then tackle Kolmogorov complexity theory.


I like this website a lot for gentle introductions to new topics: https://jeremykun.com/2012/04/21/kolmogorov-complexity-a-pri...


While I don't have a textbook for you, I found this article by Gregory Chaitin (a notable academic in the same sphere as Kolmogorov et al.) to be very exciting and enlightening:

http://inference-review.com/article/doing-mathematics-differ...


Kolmogorov-Chaitin Complexity is another term for the same concept (as per the OP Wikipedia article).


Badii and Politi 1997 is fun and interesting reading. Try Lempel-Ziv complexity for some hilarious idiocy, and statistical Grassberger complexity for some less idiotic thought. See also logical depth and symbolic dynamics.


A definitive introduction to the field: https://www.cs.auckland.ac.nz/~chaitin/cup.pdf


Kolmogorov, Hofstadter, Autechre, and LISP.

This talk made my life. Thank you.




