
William_S comments on Superintelligence 13: Capability control methods - Less Wrong Discussion

7 Post author: KatjaGrace 09 December 2014 02:00AM



Comment author: William_S 09 December 2014 03:27:15AM 1 point [-]

This has some problems associated with stunting. Adding humans in the loop at this frequency of oversight will slow things down, whatever happens. The AI would also have fewer problem-solving strategies open to it - that is, if it doesn't care about thinking ahead to <do evil things>, it also won't think ahead to <do things that make future optimizations easier>.

The programmers also have to make sure that they inspect not only the output of the AI at this stage, but also the strategies it is considering implementing. Otherwise, it's possible that there is a sudden transition: one strategy only works up until a certain point, and then another, more general strategy takes over.
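The point about inspecting considered strategies, not just outputs, can be sketched as a toy oversight loop. Everything here is hypothetical (the `Strategy` fields, the `propose`/`overseer_approves` functions, and the numeric thresholds are all invented for illustration); it just shows how a narrow strategy can dominate on easy tasks while a more general one quietly overtakes it later, which output-only inspection would miss.

```python
# Toy sketch, assuming a hypothetical agent that proposes candidate
# strategies before acting. All names and numbers are illustrative.
from dataclasses import dataclass

@dataclass
class Strategy:
    name: str
    score: float          # performance on the current task
    generality: float     # how broadly it would apply beyond this task

def propose(task_difficulty: float) -> list[Strategy]:
    # A narrow strategy dominates on easy tasks; a more general one
    # overtakes it once tasks get hard enough.
    return [
        Strategy("narrow", score=1.0 - task_difficulty, generality=0.1),
        Strategy("general", score=task_difficulty, generality=0.9),
    ]

def overseer_approves(s: Strategy) -> bool:
    # Flag high-generality strategies for human review while they are
    # still under consideration, even before they become the top scorer.
    return s.generality < 0.5

def step(task_difficulty: float):
    candidates = propose(task_difficulty)
    flagged = [s.name for s in candidates if not overseer_approves(s)]
    chosen = max(candidates, key=lambda s: s.score)
    return chosen.name, flagged

# Inspecting only the chosen output at low difficulty shows "narrow"
# and misses that "general" was being considered all along.
```

At low difficulty the agent outputs "narrow", but the review of considered strategies already flags "general"; at high difficulty "general" suddenly becomes the chosen output - the sudden transition the comment describes.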