djm comments on Detecting agents and subagents - Less Wrong
Comments (6)
This is an interesting article, though necessarily abstract. How can we take this further and implement an actual AI detector?
Such a detector could combine research from antivirus software, network intrusion detection, and stock market agents with some sort of intelligence honeypot (i.e. some contrived event or piece of data that exists for a split second and that only an AI could detect and act on).
If this were a friendly AI, shouldn't the gathering agent's utility function take sharing (i.e. not being greedy) into account? Then again, as AI development advances, we can't be sure that every government, corporation, or random researcher will put in the effort needed to make sure their intelligent agent is friendly.
This is indeed a very preliminary concept.
Friendly AIs need not be nice in a game-theoretic sense. They can (and likely would) be ruthless and calculating in achieving their goals; it's just that their goals are good/safe/positive. This puts some constraints on means (e.g. the AI will likely not kill everyone just to reach its goals), but "play nicer than you have to with other AIs" is unlikely to be one of those constraints.