selylindi comments on Superintelligence 11: The treacherous turn - Less Wrong
Regarding the "treacherous turn" and the "conception of deception", I've previously proposed the following empirical method of testing friendliness:
In the second case, each AI cares about maximizing the achievement of its own utility function, not about doing the achieving itself. This set-up should therefore encourage each AI to design the friendliest version of itself that it can. It's a competition that optimizes for friendliness! (Or at least for difficulty of being proved unfriendly.) The early rounds should be conducted under tight limits on computing resources, and each subsequent round of (presumably) safer AIs can be given more computing resources.
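The round structure described above can be sketched as a toy tournament loop. This is purely illustrative: every name here (`Agent`, `design_successor`, `prove_unfriendly`, the numeric thresholds, and the doubling compute budget) is a hypothetical stand-in I've invented for the sketch, not anything specified in the proposal.

```python
# Hypothetical sketch of the proposed friendliness competition.
# Each round: every agent designs a successor under a compute budget;
# any successor that a rival can "prove" unfriendly is eliminated;
# survivors advance to the next round with a larger budget.

class Agent:
    def __init__(self, friendliness):
        self.friendliness = friendliness

    def design_successor(self, budget):
        # Toy rule: more compute lets an agent refine a friendlier successor.
        return Agent(min(1.0, self.friendliness + 0.1 * budget))

def prove_unfriendly(agent, rival):
    # Toy adversarial check standing in for a real proof of unfriendliness:
    # a rival succeeds if the agent's friendliness is below a threshold.
    return agent.friendliness < 0.5

def run_rounds(agents, rounds=3, budget=1):
    for _ in range(rounds):
        successors = [a.design_successor(budget) for a in agents]
        # Keep only successors that no rival could prove unfriendly.
        agents = [s for s in successors
                  if not any(prove_unfriendly(s, rival)
                             for rival in successors if rival is not s)]
        budget *= 2  # presumed-safer survivors get more compute
    return agents

survivors = run_rounds([Agent(0.3), Agent(0.6)])
```

With these toy numbers, the 0.3-friendliness lineage is eliminated in the first round, while the other lineage survives and is refined under growing budgets in later rounds.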