XiXiDu comments on Kevin T. Kelly's Ockham Efficiency Theorem - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (81)
I don't think it's been thoroughly explained in the sequences, but why not start with the Wikipedia page? Or this post.
On second thought, this is actually a very simple idea that can be explained in a comment. Let me try:
The Kolmogorov complexity of a finite string of bytes is the size (in bytes) of the shortest Python program that outputs that string and then halts. Or you could use any other language in place of Python and get a different value of complexity - but it cannot differ by more than a constant as the string grows, because you could always implement a constant-size Python interpreter in the other language to "convert" one description into another. So complexity is not unique, but all definitions of complexity sorta "agree" as we go to infinity.
For example, if a string consists of "a" repeated a million times, its Kolmogorov complexity is quite small because you can write a very short program that outputs a million a's and then halts. But this is not the case for all possible strings. Here's a simple way to see why: there can't be more Python programs of length N than there are different byte strings of length N, because any program is itself a byte string. Therefore, you cannot encode all strings of length N using programs of length N/2 or something, because there's not enough of them. Actually this shows that most strings of length N must have complexity close to N. (Obviously it can't be much larger than N, because any string of length N can be output by a program of length N+c that "hardcodes" that specific string.)
Another important observation. Imagine you have a huge, random-looking string of bytes. How can you calculate its Kolmogorov complexity? Turns out that you can't, because there's no easier way to do that than, essentially, simulating all programs up to length N and seeing what they output. But this is impossible because any given program can (say) hang idly for a million years and only then start outputting symbols, and you have no way of learning that reliably from the program text (this is known as the "halting problem"). So K-complexity is largely a theoretical notion, I have no idea why people love it so.
Thanks, great! I think this might be another introductory explanation.
Here is more: Information vs. Meaning