Today's post, Optimization, was originally published on 13 September 2008. A summary (taken from the LW wiki):
A discussion of the concept of optimization.
Discuss the post here (rather than in the comments to the original post).
This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was Psychic Powers, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.
Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it here, posting the next day's sequence reruns post, or summarizing forthcoming articles on the wiki. Go here for more details, or to have meta discussions about the Rerunning the Sequences series.
"This process prefers to exactly follow the laws of physics, therefore future events and observations will turn out exactly as a natural physical system would evolve" seems to be a minimal message length description for predicting the behavior of any process, unless that process necessarily restricts its final output to a very tiny domain that's shorter to describe than the initial state. For any description of the current measurable state of a process and a predicted future description of state likelihoods it seems that it will always be simpler to describe the current state and then predict it will exactly follow the natural laws.
Maybe it works if the likelihood of a given prediction is compared with its length? It's easy to be trivially correct, but not so easy to make complex predictions that are also right. Take the probability of any message of that length being correct and compare it with P(E|M), the probability that the event E described by message M occurs. If P(E|M) is higher than the probability of a random message of M's length being correct, then M is a good description; but I am not convinced it's a good description solely because of the properties of E. It still seems like M = "laws of physics" will be better than other descriptions of optimizing processes. I am probably missing something.
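A minimal sketch of that comparison, under the simplifying assumption (not stated above) that exactly one of the 2^L possible messages of length L bits is correct, so a random message of that length is right with probability 2^-L; the function names and numbers here are hypothetical:

```python
def random_message_correct_prob(length_bits: int) -> float:
    """Chance that a uniformly random message of this bit length is correct,
    assuming exactly one correct message of that length (a simplification)."""
    return 2.0 ** (-length_bits)

def is_good_description(length_bits: int, p_event_given_message: float) -> bool:
    """The comparison sketched above: count M as a good description if
    P(E|M) beats the chance of a random message of the same length."""
    return p_event_given_message > random_message_correct_prob(length_bits)

# Hypothetical numbers for illustration only.
print(is_good_description(8, 0.99))    # True: a short, vague message that is almost always right
print(is_good_description(40, 1e-6))   # True: 1e-6 > 2**-40 (about 9e-13)
print(is_good_description(40, 1e-14))  # False: a long message that is right too rarely
```

Note that the short, nearly vacuous message passes trivially, which seems to be exactly the worry above: the test rewards M's length-versus-probability trade-off without saying anything about E being a narrow target.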
It is frequently difficult to describe the current state, including the entire optimizer, in complete detail. Often the behavior of systems that include optimizers will be chaotic, in that arbitrarily small changes (especially to the optimizer itself) result in large changes in the output. In such cases a less precise description of the process is more useful, and an optimization-based description may be able to constrain the output state much more accurately than "laws of physics" applied over a broad range of input states and optimizer states.
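As a toy illustration of that sensitivity (the logistic map is my example, not the comment's): two initial states that agree to twelve decimal places are completely decorrelated after a few dozen steps, so an exact state-plus-physics forecast is only as good as an impossibly precise measurement, while a coarser description of where the process is driving things can still constrain the outcome.

```python
def logistic(x: float, r: float = 4.0) -> float:
    """One step of the logistic map, a standard toy example of chaotic dynamics."""
    return r * x * (1.0 - x)

# Two initial conditions differing by one part in a trillion.
a, b = 0.400000000000, 0.400000000001
for _ in range(50):
    a, b = logistic(a), logistic(b)

# The trajectories have diverged to roughly the full range of the map,
# so the exact initial description has lost its predictive value.
print(abs(a - b))  # comparable to 1, not to 1e-12
```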
It takes a very, very long message to describe the state of every quark in the human brain.
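A rough order-of-magnitude sketch of that message length (the particle count and the one-bit-per-atom figure are back-of-the-envelope assumptions, and a per-quark description would be far longer still):

```python
# Back-of-the-envelope assumptions, not figures from the comment above.
atoms_in_brain = 1e26     # order of magnitude for a roughly 1.4 kg brain
bits_per_atom = 1         # an absurdly generous lower bound on the detail needed
message_length_bits = atoms_in_brain * bits_per_atom
print(f"~{message_length_bits:.0e} bits")  # ~1e+26 bits, before any per-quark detail
```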