Stuart_Armstrong comments on Detecting agents and subagents - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
This is indeed a very preliminary concept.
Friendly AIs need not be nice in a game-theoretic sense. They can (and likely would) be ruthless and calculating in pursuit of their goals - it's just that their goals are good/safe/positive. This puts some constraints on means (e.g. the AI will likely not kill everyone just to achieve its goals), but "play nicer than you have to with other AIs" is unlikely to be among those constraints.