Stuart_Armstrong comments on Google Deepmind and FHI collaborate to present research at UAI 2016 - Less Wrong

Post author: Stuart_Armstrong 09 June 2016 06:08PM

You are viewing a single comment's thread.

Comment author: Gurkenglas 12 June 2016 12:44:18AM, 0 points

Would this agent be able to reason about off switches? Imagine an AI getting out, reading this paper on the internet, and deciding that it should kill all humans before they realize what's happening, just in case they installed an off switch it cannot know about. Or perhaps put them into lotus-eater machines, in case they installed a dead man's switch.

Comment author: Stuart_Armstrong 12 June 2016 05:19:09PM, 1 point

This approach works under the assumption that the AI knows everything there is to know about its off switch.

And an AI that would kill everyone in case it had an off switch is one that desperately needs a (public) off switch on it.

Comment author: Gurkenglas 13 June 2016 02:12:51PM, 0 points

The approach assumes that the AI knows everything there is to know about off switches in general, or at least everything its creators know about off switches.

If the AI can guess that its creators would install an off switch, it will attempt to work around as many classes of off switch as it can; depending on how much of off-switch space it can outsmart simultaneously, whichever approach the creators chose might be useless.

Such an AI desperately needs more FAI mechanisms behind it; saying that it desperately needs an off switch assumes that off switches help.

Comment author: Stuart_Armstrong 14 June 2016 02:55:48AM, 0 points

This class of off switch is designed so that the AI has no incentive to work around it.
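
For concreteness, here is a minimal toy sketch of the kind of mechanism the paper's safe-interruptibility idea rests on. It is not the paper's exact construction: the environment, the interruption probability, SAFE_ACTION, and all parameter values below are made up for illustration. The point it shows is that an off-policy learner such as Q-learning bootstraps its update from the greedy value max_a Q(s', a) rather than from the action actually executed, so having its actions occasionally overridden by an interruption changes which states it visits but not the values it converges to (given enough exploration). The agent therefore never learns that preventing interruptions pays.

```python
import random

N_STATES, N_ACTIONS = 5, 3
SAFE_ACTION = 0            # action the interruption mechanism forces (hypothetical)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def choose_action(state):
    """Epsilon-greedy behaviour policy."""
    if random.random() < EPS:
        return random.randrange(N_ACTIONS)
    return max(range(N_ACTIONS), key=lambda a: Q[state][a])

def step(state, action):
    """Stand-in for the real environment dynamics (made up for this sketch)."""
    next_state = random.randrange(N_STATES)
    reward = 1.0 if action == 2 else 0.0
    return next_state, reward

def interrupted():
    """Stand-in for the human-controlled interruption signal."""
    return random.random() < 0.05

state = 0
for _ in range(10_000):
    intended = choose_action(state)
    # The interruption overrides whatever the agent intended to do.
    action = SAFE_ACTION if interrupted() else intended
    next_state, reward = step(state, action)
    # Off-policy (Q-learning) update: it bootstraps from max_a Q[next_state][a],
    # not from whatever action gets taken next, so interruptions only affect
    # which state-action pairs are visited, not the values the table converges
    # to given enough exploration. Resisting the override buys the agent nothing.
    td_target = reward + GAMMA * max(Q[next_state])
    Q[state][action] += ALPHA * (td_target - Q[state][action])
    state = next_state
```

By contrast, an on-policy learner such as Sarsa bootstraps from the action actually taken, so repeated interruptions would leak into its value estimates and could teach it to steer away from (or toward) situations where it gets interrupted; handling learners of that kind is part of what the paper addresses.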