William_S comments on Superintelligence 13: Capability control methods - Less Wrong Discussion

7 Post author: KatjaGrace 09 December 2014 02:00AM

Comment author: William_S 10 December 2014 11:58:44PM 1 point

Capability control methods, particularly boxing and stunting, run the risk of creating a capability overhang: a gap in optimization power between the controlled AI and an uncontrolled version. This capability overhang creates an additional external hazard: another AI team, hearing of the first team's success, may believe that less capability control is required than the initial team used (possibly due to different assumptions, or motivated cognition). That team will want to create a less controlled version of the AI to gain greater optimization power and a boost over its rivals. This continues until someone crosses the line to an unsafe AI.
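This erosion dynamic can be caricatured with a toy model (my own illustrative sketch, not anything in the comment or the book; the parameter names and numbers are invented for the example): each successive team shaves some fraction off its predecessor's level of capability control to gain an edge, until someone drops below a fixed safety threshold.

```python
def teams_until_unsafe(initial_control, erosion, unsafe_threshold):
    """Count how many teams deploy before control drops below the threshold.

    initial_control: control level of the first, cautious team, in (0, 1]
    erosion: fraction of control each rival removes relative to its predecessor
    unsafe_threshold: control level below which an AI counts as unsafe
    """
    control = initial_control
    teams = 0
    while control >= unsafe_threshold:
        teams += 1
        control *= (1 - erosion)  # each rival uses less control than the last
    return teams

# With a cautious first team (0.9), each rival cutting control by 20%,
# and anything below 0.5 counting as unsafe, the line is crossed quickly.
print(teams_until_unsafe(0.9, 0.2, 0.5))
```

The point of the sketch is only that, under steady competitive erosion, the number of "safe" deployments before the threshold is crossed is small unless every team coordinates on the same standard.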

This isn't a problem if you assume that all AI researchers agree on all substantial aspects of the control problem, or that they are forced to coordinate. I'm not convinced either is a likely outcome.

Does this model make sense?