orthonormal comments on Superintelligent AGI in a box - a question. - Less Wrong

14 Post author: Dmytry 23 February 2012 06:48PM




Comment author: orthonormal 25 February 2012 03:20:50PM 2 points

A couple of things:

  • To be precise, you're offering an approach to safe Oracle AI rather than Friendly AI.

  • In a nutshell, what I like about the idea is that you're explicitly handicapping your AI with a utility function that cares only about its immediate successor rather than its eventual descendants. It's rather like the example I posed where a UDT agent with an analogously myopic utility function allowed itself to be exploited by a pretty dumb program. Controlling such an agent seems a lot more feasible than trying to control one that can think strategically about its future iterations.

  • To expand on my questions, note that in human beings, the sort of creativity that helps us write more efficient algorithms for a given problem is strongly correlated with the sort of creativity that lets people figure out why they're being asked the specific questions they are. If a bit of meta-gaming comes in handy at any stage, that is, if modeling the world that originated these questions wins (over the alternatives it enumerated at that stage) on criterion 3 even once, then we might be in trouble.