Norvig's experiment doesn't show anything. Flipping bits at random in any string at all, be it DNA or Shakespeare, will push the amount of "information" towards the maximum: that of a completely random string.
DNA is code. Imagine that I compressed mygame.cpp and mygame.lisp with LZW and claimed, "Aha! The C++ version is more complex because it compresses to more bytes!"
And then I'd change a random character in the code and claim that the information content has increased.
Nonsense!
Not "any string at all". Do it to a maximal-entropy string (e.g., a genuinely random one) and you won't see an increase.
You're using "information" in the colloquial sense, where random junk is not information. Norvig is using it in the information-theoretic sense, where random junk has more information than anything else of the same length. The information-theoretic sense is not "nonsense"; it's just not the same as the colloquial one.
(Motivation for the terminology: the "information" in a string is the minimal number of bits -- i.e., the minimal amount of information -- it takes you to tell me what the string is.)
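In symbols, that's Shannon entropy, the average number of bits an optimal code needs per outcome:

    H(X) = -\sum_x p(x) \log_2 p(x)

For a string drawn uniformly from the 2^n strings of length n this works out to H(X) = n bits: no description scheme beats sending the raw string, which is why a maximal-entropy (random) string sits at the top of the scale.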