JoshuaZ comments on [LINK] Wait But Why - The AI Revolution Part 2 - Less Wrong Discussion

17 Post author: adamzerner 04 February 2015 04:02PM

Comment author: JoshuaZ 05 February 2015 06:25:29PM 2 points

You have a complicated goal system that can distinguish between short-term rewards and other goals. In the situations in question, the AI has no goal other than the goal in question. To some extent, your stability arises precisely because you are an evolved hodgepodge of different goals in tension; if you weren't, you wouldn't survive. But note that similar, essentially involuntary self-modification does occasionally happen in some humans: severe drug addiction is the most obvious example.

Comment author: pinyaka 05 February 2015 07:28:19PM 0 points

But the goal in question is "get the reward," and it's only by controlling the circumstances under which the reward is given that we can shape the AI's behavior. Once the AI is capable of taking control of the trigger, why would it leave it the way we've set it? Whatever we've got it set to is almost certainly not optimized for triggering the reward.

Comment author: JoshuaZ 05 February 2015 08:00:19PM * 3 points

If that happens, you will then have the problem of an AI which tries to wirehead itself while simultaneously trying to control its future light-cone to make sure that nothing stops it from continuing to wirehead.

Comment author: pinyaka 05 February 2015 08:32:16PM 1 point

That sounds bad. It doesn't seem obvious to me that reward seeking and reward optimizing are the same thing, but maybe they are. I don't know and will think about it more. Thank you for talking through this with me this far.