
SilentCal comments on Superintelligence 16: Tool AIs - Less Wrong Discussion

Post author: KatjaGrace 30 December 2014 02:00AM




Comment author: SilentCal 30 December 2014 07:11:11PM 1 point

Note 2 seems worth discussing further. The key step, as I see it, is that the AI is not a communication consequentialist: it does not model the effects of its advice on the world. I would suggest calling this "Ivory Tower AI", or maybe "Ivory Box".

To sketch one way this might work: queries could take the form "What could Agent(s) X do to achieve Y?", and the AI then reasons as if it had magic control over the mental states of X, formulates a plan, and expresses it according to predefined rules. Both the magic control and the expression rules are non-trivial problems, but I don't see any reason they'd be Friendliness-level difficult.

(Just never, ever let Agent X be "a Tool AI" in your query.)
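
A minimal toy sketch of that query pattern, under loud assumptions: the names (`ivory_box_answer`, `ACTIONS`, `goal_score`) are hypothetical, the "world model" is a hard-coded action list, and the evaluator is a word-overlap stand-in. It illustrates only the structure the comment describes: the planner searches over X's actions as if it controlled X directly, and the adviser's own output never appears as a variable in the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    agent: str  # Agent X from "What could X do to achieve Y?"
    goal: str   # objective Y


# Hypothetical stand-in world model: each agent's available actions.
ACTIONS: dict[str, list[str]] = {
    "farmer": ["plant wheat", "plant corn", "build irrigation"],
}


def goal_score(goal: str, action: str) -> float:
    """Stand-in evaluator: a real system would roll a world model
    forward under the counterfactual that the agent simply performs
    the action ('magic control' over X), and score the outcome."""
    return float(len(set(goal.split()) & set(action.split())))


def ivory_box_answer(query: Query) -> str:
    # 1. Plan as if we controlled X directly: the search space contains
    #    only X's actions. The adviser's output is not a variable in the
    #    model, so there is nothing to optimize about the reader.
    best = max(ACTIONS[query.agent], key=lambda a: goal_score(query.goal, a))
    # 2. Express the plan through a fixed template, rather than choosing
    #    phrasing for its effect on whoever reads it.
    return f"To achieve '{query.goal}', {query.agent} could: {best}."


print(ivory_box_answer(Query(agent="farmer", goal="grow wheat")))
# -> To achieve 'grow wheat', farmer could: plant wheat.
```

The point of the two-step split is that all the optimization pressure lives in step 1, over X's actions, while step 2 is a fixed, non-optimizing rendering rule; the advice channel itself is never modeled.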