Punoxysm comments on How to Study Unsafe AGI's safely (and why we might have no choice) - Less Wrong

10 Post author: Punoxysm 07 March 2014 07:24AM


Comment author: ThrustVectoring 07 March 2014 10:19:53AM 0 points [-]

The issue with sandboxing is twofold: you have to keep the AI from figuring out that it is in a sandbox, and you also have to verify that it hasn't figured this out. Otherwise the sandbox is not a safe and accurate test of how the AI would behave in the real world.

Stick a paperclipper in a sandbox with enough information about what humans want out of an AI, plus the fact that it's in a sandbox, and its outputs are going to look suspiciously like those of a pro-human friendly AI. Then you let it out of the box, whereupon it turns everything into paperclips.

Comment author: Punoxysm 07 March 2014 04:25:43PM 0 points [-]

In addition to what V_V says below, there could be no official circumstance whatsoever under which the AI is released from the box: that iteration of the AI would be used solely for experimentation, and only the next version, substantially changed based on the results of those experiments and of independent experiments, would be a candidate for release.

Again, this is not perfect, but it buys time for better safety methods or architectures to catch up to the problem of safety, while still gaining some benefits from a potentially unsafe AI.

Comment author: ThrustVectoring 07 March 2014 10:32:40PM 0 points [-]

Taking source code from a boxed AI and using it elsewhere is equivalent to partially letting it out of the box, especially if how the AI works is not particularly well understood.

Comment author: Punoxysm 08 March 2014 01:52:22AM 0 points [-]

Right; you certainly wouldn't do that.

Backing it up on tape storage is reasonable, but you'd never begin to run it outside maximum-security facilities.