Wei_Dai comments on Open Thread: October 2009 - Less Wrong

5 Post author: gwern 01 October 2009 12:49PM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (425)

You are viewing a single comment's thread. Show more comments above.

Comment author: Daniel_Burfoot 01 October 2009 03:42:00PM *  4 points [-]

But the No Free Lunch theorems suggest that this means it will be suboptimal compared to any algorithm tailored to any specific world.

This is a subtle point. The NFL theorem does prohibit any algorithm from doing well over all possible worlds. But Solomonoff induction does well on any world that has any kind of computable regularity. If there is no computable regularity, then no prior can do well. In fact, the Solomonoff prior does just as well asymptotically as any computable prior.

As is often the case, thinking in terms of codes can clear up the issue. A world is a big data file. Certainly, an Earth-specific algorithm can get good compression rates if it is fed data that comes from Earth. But as the data file gets large, the Solomonoff general-purpose compression algorithm will achieve compression rates that are nearly as good; in the worst case, just has to prepend the code of the Earth-specific algorithm to its encoded data stream, and it only underperforms by that program size.

The reason AIXI doesn't work in practice is that the "efficient approximations" aren't really efficient, or aren't good approximations.

Comment author: Wei_Dai 01 October 2009 10:27:39PM 1 point [-]

If there is no computable regularity, then no prior can do well. In fact, the Solomonoff prior does just as well asymptotically as any computable prior.

This seems to be a common belief. But see this discussion I had with Eliezer where I offered some good arguments and counterexamples against it.

The link goes to the middle, most relevant part of the discussion. But if you look at the top of it, I'm not arguing against the Solomonoff approach, but instead trying to find a generalization of it that makes more sense.

I've linked to that discussion several times in my comments here, but I guess many people still haven't seen it. Maybe I should make a top-level post about it?