John_Maxwell_IV comments on Explanations for Less Wrong articles that you didn't understand - Less Wrong Discussion
If you want to build an AI that maximizes utility, and that AI can create copies of itself, and each copy's existence and state of knowledge can also depend on events happening in the world, then you need a general theory of how to make decisions in such situations. In the limiting case when there's no copying at all, the solution is standard Bayesian rationality and expected utility maximization, but that falls apart when you introduce copying. Basically we need a theory that looks as nice as Bayesian rationality, is reflectively consistent (i.e. the AI won't immediately self-modify away from it), and leads to reasonable decisions in the presence of copying. Coming up with such a theory turns out to be surprisingly hard. Many of us feel that UDT is the right approach, but many gaps still have to be filled in.
Note that many problems that involve copying can be converted to problems that create identical mind states by erasing memories. My favorite motivating example is the Absent-Minded Driver problem. The Sleeping Beauty problem is similar to that, but formulated in terms of probabilities instead of decisions, so people get confused.
An even simpler way to emulate copying is by putting multiple people in the same situation. That leads to various "anthropic problems", which are well covered in Bostrom's book. My favorite example of these is Psy-Kosh's problem.
Another idea that's equivalent to copying is having powerful agents that can predict your actions, like in Newcomb's problem, Counterfactual Mugging and some more complicated scenarios that we came up with.
Can you formalize the idea of "copying" and show why expected utility maximization fails once I have "copied" myself? I think I understand why Newcomb's problem is interesting and significant, but in terms of an AI rewriting its source code... well, my brain is changing all the time and I don't think I have any problems with expected utility maximization.
We can formalize "copying" by using information sets that include more than one node, as I tried to do in this post. Expected utility maximization fails on such problems because your subjective probability of being at a certain node might depend on the action you're about to take, as mentioned in this thread.
The Absent-Minded Driver problem is an example of such dependence, because your subjective probability of being at the second intersection depends on your choosing to go straight at the first intersection, and the two intersections are indistinguishable to you.
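To make that dependence concrete, here is a minimal sketch of the Absent-Minded Driver. The thread doesn't give payoffs, so the numbers below are an assumption (the ones usually quoted for this problem: 0 for exiting at the first intersection, 4 for exiting at the second, 1 for driving past both). The function names are made up for illustration, and p/(1+p) is one common way of assigning the "which intersection am I at" probability, not the only one people have proposed.

```python
from fractions import Fraction

# Assumed payoffs (not stated in the thread; the numbers usually quoted).
EXIT_FIRST = 0   # leave the road at the first intersection
EXIT_SECOND = 4  # leave the road at the second intersection
PASS_BOTH = 1    # drive straight through both intersections


def planning_payoff(p):
    """Expected payoff if the driver goes straight with probability p
    at every intersection (he can't tell the two apart, so p is shared)."""
    return (1 - p) * EXIT_FIRST + p * (1 - p) * EXIT_SECOND + p * p * PASS_BOTH


def prob_at_second(p):
    """Probability of being at the second intersection, given that you are
    at some intersection. The first is always reached (weight 1), the
    second only if you went straight at the first (weight p)."""
    return p / (1 + p)


def best_policy(steps=300):
    """Search a grid of exact fractions for the p with the highest payoff."""
    grid = [Fraction(i, steps) for i in range(steps + 1)]
    return max(grid, key=planning_payoff)


if __name__ == "__main__":
    p_star = best_policy()
    print("best p:", p_star, "payoff:", planning_payoff(p_star))
    for p in (Fraction(0), Fraction(1, 3), p_star, Fraction(1)):
        print("p =", p, "-> P(at second intersection) =", prob_at_second(p))
```

The thing to notice is that `prob_at_second` takes the policy `p` as its input: the probability you would like to plug into an expected utility calculation is itself a function of the decision you are trying to make.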