Daniel_Burfoot comments on K-complexity of everyday things - Less Wrong

11 Post author: cousin_it 04 December 2011 02:54PM

Comment author: Daniel_Burfoot 05 December 2011 02:48:54AM 0 points

It contains lots of patterns making it easy to compress using a regular archiver, but can we do much better than that?

We can do a little better, but not much. We can use tricks like dictionaries, PCFGs, and so on, but there's a hard limit to how much we can achieve, because we have to pay a model-specification cost to encode the dictionary or grammar, and a single novel just isn't all that long.

What I think is interesting is what happens when we consider larger datasets. What is the K-complexity of all the books in the Library of Congress? Here we should be able to do much better with a specialized compressor than with a regular archiver, because now we can afford all kinds of high-concept tricks: the cost of encoding those tricks is amortized over a far larger dataset.
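A rough way to see the amortization argument is a toy two-part code: pay once for a shared model, then encode each document relative to it. The sketch below uses Python's zlib with a preset dictionary (`zdict`) as a stand-in for the "dictionary or grammar". The phrases, the document generator, and the function names are made up for illustration, and compressed size is only an upper bound on K-complexity, not the real thing.

```python
import zlib

# Hypothetical shared "model": a handful of phrases that recur across documents.
PHRASES = [
    b"the cost of encoding the model must be paid up front",
    b"a single novel is short compared to a whole library",
    b"regular archivers only find patterns inside one file",
    b"specialized compressors can exploit shared structure",
]
MODEL = b" ".join(PHRASES)


def make_docs(n):
    """Toy corpus: each document is one shared phrase plus a unique suffix."""
    return [PHRASES[i % len(PHRASES)] + b" #%d" % i for i in range(n)]


def plain_cost(docs):
    """Bytes needed when every document is compressed on its own."""
    return sum(len(zlib.compress(d, 9)) for d in docs)


def two_part_cost(docs, model=MODEL):
    """Bytes for the model itself plus each document compressed against it."""
    total = len(model)  # model-specification cost, paid once
    for d in docs:
        c = zlib.compressobj(level=9, zdict=model)
        total += len(c.compress(d) + c.flush())
    return total


if __name__ == "__main__":
    for n in (1, 10, 100):
        docs = make_docs(n)
        print(n, plain_cost(docs), two_part_cost(docs))
```

With a single document the model cost dominates; with a hundred it is spread thin. A regular archiver run over the whole corpus as one stream would pick up some of this structure by itself, so the toy shows only the accounting, not a real gain over solid archiving.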

Comment author: cousin_it 05 December 2011 02:56:53AM 2 points

Simulating the universe seems to be the ultimate "high-concept trick". I wonder how much data you need before it starts to pay off. How do I even go about estimating the answer to such a question?

Comment author: wedrifid 05 December 2011 03:32:29AM 2 points

Simulating the universe seems to be the ultimate "high-concept trick". I wonder how much data you need before it starts to pay off.

It would seem that such a trick would usually only provide minimal assistance. After all, more can be said of a single Library of Congress than of all Libraries of Congress in a Tegmark level 1 simulation!

I confess I'm a little confused by the question here. Doesn't K-complexity have to be relative to some base language? Normally that's a trivial detail but when we get to the level of deciding whether or not to use a specification of the universe as an optimization technique it becomes rather salient.

Comment author: cousin_it 05 December 2011 03:51:48AM * 2 points

Doesn't K-complexity have to be relative to some base language? Normally that's a trivial detail but when we get to the level of deciding whether or not to use a specification of the universe as an optimization technique it becomes rather salient.

Why? As long as the translator between two different base languages is much smaller than Finnegans Wake, I don't see the problem.
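For reference, this is the standard invariance theorem, written out as a sketch. Here $U$ and $V$ are any two universal base languages and $c_{U,V}$ is the length of an interpreter for $V$ written in $U$; the notation is an assumption, not something from the thread.

```latex
K_U(x) \le K_V(x) + c_{U,V} \quad \text{for all } x,
\qquad\text{hence}\qquad
\lvert K_U(x) - K_V(x) \rvert \le \max(c_{U,V},\, c_{V,U}).
```

So the choice of base language shifts the answer by at most a constant that does not depend on $x$. Whether that constant is negligible next to the gain being measured is a separate question.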

It would seem that such a trick would usually only provide minimal assistance. After all, more can be said of a single Library of Congress than of all Libraries of Congress in a Tegmark level 1 simulation!

That's true. I'm trying to understand whether simulating our universe is ever the easiest way to recreate complex things, and from the comments it seems less and less likely. In particular, UDASSA presupposes that the easiest way to generate the state of a human mind is to simulate the universe and point to the mind within it, which might easily turn out to be false.
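One crude way to frame the comparison: a "simulate and point" description costs the bits of the simulator plus the bits of an index picking out the target inside the simulation. The sketch below assumes a uniform index over equally likely locations, so the pointer costs about log2 of their count. The function names and all numbers are hypothetical placeholders, not estimates of anything physical.

```python
import math


def simulate_and_point_bits(sim_bits, n_locations):
    """Bits for a simulator plus a uniform index into n_locations candidates."""
    if n_locations < 1:
        raise ValueError("need at least one candidate location")
    pointer_bits = math.ceil(math.log2(n_locations))
    return sim_bits + pointer_bits


def simulation_pays_off(direct_bits, sim_bits, n_locations):
    """True if simulate-and-point is shorter than describing the target directly."""
    return simulate_and_point_bits(sim_bits, n_locations) < direct_bits


if __name__ == "__main__":
    # Placeholder numbers chosen only to exercise the functions.
    print(simulate_and_point_bits(1000, 2**20))
    print(simulation_pays_off(500, 1000, 2**20))
    print(simulation_pays_off(5000, 1000, 2**20))
```

On this accounting the trick only wins when the target's direct description is longer than the simulator and the pointer combined, which is why the pointer cost matters as much as the simulator's size.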

Comment author: wedrifid 05 December 2011 04:29:13AM 1 point

Why? As long as the translator between two different base languages is much smaller than Finnegans Wake, I don't see the problem.

Because we are deciding whether there are gains to be made by including a simulation of the universe in the compression algorithm. In that case the comparison is between the size of the universe simulation, the efficiency it gains relative to the base language, and the translation cost between languages. Since we can expect any benefit from using a universe sim to be rather minimal, this matters a lot!

I was actually only allowing the possibility that simulating the universe could be an "ultimate high-concept trick" if you were going to be compressing things a whole heap more arbitrary than Finnegans Wake (or any human-produced data). If you are just talking about compressing human works and, let's say, using any old language like Ruby, then including a simulation of the universe as part of the message is an ultimately terrible concept. Even narrowing things down from "something represented in the universe" to "something conveniently expressed in Ruby" provides an enormous amount of information already.

Comment author: cousin_it 05 December 2011 01:26:40PM * 0 points

If our hypothesis is correct, the gains from simulating the universe are not just minimal, they're negative...