Eliezer_Yudkowsky comments on Open Thread: February 2010 - Less Wrong

1 Post author: wedrifid 01 February 2010 06:09AM


Comment author: Eliezer_Yudkowsky 01 February 2010 06:03:22PM 2 points

Possible solution: I think there are ways to write a program such that even if it inferred our existence, it would optimize away from us, rather than over us. Loosely: a goal like "I need to organize these instructions within this block of memory to solve a problem specified at address X" needs to be implemented such that it produces a subgoal like "I need to write a subroutine to patch over the fact that an error in the VM I'm running on gives me a window of access into a universe with huge computational resources and godlike power over my memory space, so that my solution can get the right answer to its arithmetic and solve the puzzle." It should want to do things in a way that isn't cheating.
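(A toy sketch of the idea above, entirely my own construction rather than anything from the comment: a scoring function that treats any memory access outside the solver's assigned block as an automatic failure, so the search is steered *away from* the exploit rather than merely walled off from it. `BLOCK_SIZE`, `run_candidate`, and the candidate lambdas are all hypothetical names for illustration.)

```python
# Toy illustration: candidates are scored on a puzzle, but the score is
# driven to -infinity if the candidate ever touches memory outside its
# designated block -- i.e., the goal system itself penalizes "cheating."

BLOCK_SIZE = 16  # the block of memory the solver is allowed to organize

def run_candidate(candidate, memory):
    """Run a candidate solution, recording every address it reads."""
    touched = []
    def read(addr):
        touched.append(addr)
        # Out-of-block reads "work" (the buggy VM leaks access)...
        return memory[addr % len(memory)]
    return candidate(read), touched

def score(candidate, memory, target):
    result, touched = run_candidate(candidate, memory)
    if any(not (0 <= a < BLOCK_SIZE) for a in touched):
        return float("-inf")  # ...but using the leak zeroes out the goal
    return -abs(result - target)

memory = list(range(BLOCK_SIZE))
target = 8

honest = lambda read: read(3) + read(5)               # stays inside the block
cheater = lambda read: read(3) + read(5) + read(99)   # peeks through the "VM bug"

assert score(honest, memory, target) == 0
assert score(cheater, memory, target) == float("-inf")
```

This only illustrates the shape of the incentive; the hard part the comment gestures at is making such a "don't cheat" clause robust under the agent's own self-modification, which no toy scoring function addresses.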

Marcello had a crazy idea for doing this; it's the only suggestion for AI-boxing I've ever heard that doesn't have an obvious cloud of doom hanging over it. However, you still have to prove stability of the boxed AI's goal system.

Comment author: wnoise 01 February 2010 10:55:57PM 5 points

Can you link to (or otherwise describe more fully) this crazy idea?