Post author: Eliezer_Yudkowsky 21 January 2009 11:04AM

Comment author: bogdanb 21 January 2009 09:55:34PM 1 point

Will Pearson: I'm going to skip quickly over the obvious problem that an AI, even one much smarter than me, might do what (it thinks) you said rather than what you mean. Let's assume that the AI somehow has an interface that allows you to tell it exactly what you mean:

"that the AI would keep inside it a predicate Will_Pearson_would_regret_wish (based on what I would regret), and apply that to the universes it envisages while planning"

This is a bit analogous to Eliezer's "regret button" on the directed probability box, except that you always get to press the button. The first problem I see is that you need to define "regret" extremely well (i.e., understand human psychology better than I think is "easy", or even possible, right now), to avoid the possibility that there _aren't_ any futures where you wouldn't regret the wish. (I don't say that's the case, I just say that you need to prove that it's not the case before reasonably making the wish.) This gets even harder with CNR.

If you're not able to do that, you risk the AI "freezing" the world and then spending the life of the Universe trying to find a plan that satisfies the predicate before continuing. (Note that this just requires that finding such a plan be hard enough that the biggest AI physically possible can't find it before it decays; it doesn't have to be impossible or take forever.)
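
As a toy illustration of that parenthetical (a sketch only; `find_acceptable_plan`, `toy_would_regret`, and the integer-numbered "plans" are all hypothetical stand-ins I made up, not anything from the original wish): a search with a finite budget can fail even though an acceptable plan exists, simply because the plan lies beyond the budget.

```python
from itertools import count
from typing import Callable, Iterable, Optional


def find_acceptable_plan(
    candidate_plans: Iterable[int],
    would_regret: Callable[[int], bool],
    budget: int,
) -> Optional[int]:
    """Return the first plan that is not regretted, or None if the budget runs out."""
    for steps_used, plan in enumerate(candidate_plans):
        if steps_used >= budget:
            # Budget exhausted. This says nothing about whether a plan exists.
            return None
        if not would_regret(plan):
            return plan
    return None  # ran out of candidates


# Hypothetical toy predicate: only plan number 10_000 is regret-free.
def toy_would_regret(plan: int) -> bool:
    return plan != 10_000


# Same predicate, same (unbounded) space of plans, different budgets.
small = find_acceptable_plan(count(), toy_would_regret, budget=100)
large = find_acceptable_plan(count(), toy_would_regret, budget=20_000)
```

Here `small` comes back empty and `large` finds the plan, although the predicate and the search space are identical; only the budget differs. A failed search therefore tells you nothing about whether a regret-free future exists, which is the worrying case when the "budget" is the lifetime of the Universe.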

We can't even assume that the AI will be "smart enough" to detect this kind of problem: it might simply be mathematically impossible to anticipate if a solution is possible, and the wish too "imperative" to allow the AI to stop the search.

* * *

In short, I don't really see how a machine inside the universe could simulate even one entire future light-cone of just one observer in the same universe, let alone find one where the observer doesn't regret the act. Depending on what the AI understands by "regret", even not doing anything may be impossible (perhaps it foresees you'll regret making a silly wish, or something like that).

This doesn't mean that the wish _is_ bad, just that I don't understand its possible consequences well enough to actually make it.