shminux comments on Domesticating reduced impact AIs - Less Wrong

9 Post author: Stuart_Armstrong 14 February 2013 04:59PM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (104)

You are viewing a single comment's thread. Show more comments above.

Comment author: Eliezer_Yudkowsky 17 February 2013 04:56:44AM 0 points [-]

Even invalidating a proof doesn't automatically mean the outcome is the opposite of the proof. The key question is whether there's a cognitive search process actively looking for a way to exploit the flaws in a cage. An FAI isn't looking for ways to stop being Friendly, quite the opposite. More to the point, it's not actively looking for a way to make its servers or any other accessed machinery disobey the previously modeled laws of physics in a way that modifies its preferences despite the proof system. Any time you have a system which sets that up as an instrumental goal you must've done the Wrong Thing from an FAI perspective. In other words, there's no super-clever being doing a cognitive search for a way to force an invalidating behavior - that's the key difference.

Comment author: shminux 17 February 2013 05:22:18AM *  0 points [-]

Hmm, I did not mean "actively looking". I imagined something along the lines of being unable to tell whether something that is a good thing (say, in a CEV sense) in a model universe is good or bad in the actual universe. Presumably if you weren't expecting this to be an issue, you would not be spending your time on non-standard numbers and other esoteric mathematical models not usually observed in the wild. Again, I must be missing something in my presumptions.

Comment author: Eliezer_Yudkowsky 17 February 2013 06:19:58AM 1 point [-]

The model theory is just for understanding logic in general and things like Lob's theorem, and possibly being able to reason about universes using second-order logic. What you're talking about is the ontological shift problem which is a separate set of issues.