handoflixue comments on AI box: AI has one shot at avoiding destruction - what might it say? - Less Wrong
I expect any AI in the box to have figured out how to implement numerous huge boons to society. Telling me that you've figured that out simply confirms my existing expectations, and isn't ANY evidence of friendliness. Since I've precommitted to destroying at least SOME AIs, I might as well destroy all of the ones that don't establish evidence of either Plausible Urgency or Friendliness.
I sure as hell wouldn't try to get world governments changed until after I was convinced the AI was friendly, and at that point I could just let it out of the box and let it implement the change itself.
I'm also aware that I wouldn't trust a politician with any sort of authority over the AI, so I have an incentive to avoid exactly this strategy.
(AI DESTROYED)