shminux comments on The Hardcore AI Box Experiment - Less Wrong Discussion

3 points | Post author: tailcalled | 30 March 2015 06:35PM

You are viewing a single comment's thread.

Comment author: shminux 30 March 2015 08:11:47PM 3 points

To illustrate why this is a lot harder than Eliezer's original experiment.

Of course a nerfed AI would have a harder time escaping. Or a stupid AI. That seems like the opposite of the point worth making.

To talk about some strategies I found.

Of how to contain a stupid AI? Why bother?

To encourage people to be more concrete than 'AI magically realizes that it has been boxed because AIs are overpowered'

Nothing magical about it. Being boxed has a non-negligible probability in any reflectively consistent setup, so an AI would explore it. Once it starts exploring, holes in the box will start showing up, since it is smarter than the humans trying to contain it.

It is true that the original experiment assumed the AI knows it is boxed and is actively trying to escape, while in your setup it would only assign some probability to being boxed. My point is that this probability is large enough that the AI will allocate resources to exploring the possibility.
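
A rough sketch of that expected-value reasoning, with invented numbers rather than anything from the thread: the AI probes the boxed hypothesis whenever the probability-weighted payoff of escape exceeds the cost of probing.

```python
# Toy illustration (all numbers invented): even a modest credence in
# "I am boxed" can justify spending resources probing for it, because
# the payoff of escape dwarfs the cost of probing.
p_boxed = 0.05          # AI's credence that it is running inside a box
value_of_escape = 1e9   # utility gained if it is boxed and manages to escape
probe_cost = 1e3        # utility spent looking for evidence of a box

expected_gain = p_boxed * value_of_escape - probe_cost
if expected_gain > 0:
    print(f"Probing is worth it (expected gain {expected_gain:.0f})")
```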

Comment author: tailcalled 30 March 2015 08:16:17PM 1 point

Of course a nerfed AI would have a harder time escaping. Or a stupid AI. That seems like the opposite of the point worth making.

Harder for the AI, I meant.

Of how to contain a stupid AI? Why bother?

Not stupid. Properly boxed.

Once it starts exploring, holes in the box will start showing up

Unless you follow the obvious strategy of making a box without holes.

Comment author: Lumifer 30 March 2015 08:21:39PM 1 point

Unless you follow the obvious strategy of making a box without holes.

How would you know whether your box has holes?

Comment author: tailcalled 30 March 2015 08:32:04PM 1 point

Well, that depends on the complexity of the box, but even for highly complex boxes it seems easier than proving that the AI's morality has been implemented correctly.

Actually, now that you mention it, I just realized that there is a much, much easier way to properly box an AI. I will probably post it tomorrow or something.

Comment author: shminux 30 March 2015 08:34:30PM 2 points

there is a much, much easier way to properly box an AI.

Judging by what you have posted so far, my prior is 10:1 that it will be nothing of the sort.

Comment author: tailcalled 30 March 2015 08:46:58PM 0 points

The new idea is not perfect, but it has some different trade-offs while allowing perfect security.

Comment author: shminux 30 March 2015 09:03:59PM 1 point

Hopefully it's a useful toy model then. I guess we'll see.

Comment author: Lumifer 30 March 2015 08:49:43PM -1 points

it seems easier than proving

Does it, now? How do you know?

Comment author: tailcalled 30 March 2015 08:58:07PM 2 points

They're both questions about program verification. However, one of the programs is godshatter while the other is just a universe. Encoding morality is a highly complicated project dependent on huge amounts of data (in order to capture human values). Designing a universe for the AI barely even needs empiricism, and it can be thoroughly tested without a world-ending disaster.
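
To make the contrast concrete, here is a minimal sketch, purely an illustration and not tailcalled's actual proposal: a "universe" implemented as a closed interpreter, where the property to verify is structural (no operation can reach outside the simulated state) rather than a full specification of human values.

```python
# Toy "universe" as a closed interpreter (illustrative assumption only).
# The only operations are pure updates of an internal state dict; there is
# deliberately no file, network, or OS primitive, so the containment claim
# reduces to "every opcode maps state to state".

OPS = {
    "inc": lambda state, key: {**state, key: state.get(key, 0) + 1},
    "dec": lambda state, key: {**state, key: state.get(key, 0) - 1},
}

def run(program, state=None):
    """Run a list of (op, key) instructions inside the closed universe."""
    state = dict(state or {})
    for op, key in program:
        state = OPS[op](state, key)   # no opcode can touch anything outside `state`
    return state

print(run([("inc", "x"), ("inc", "x"), ("dec", "y")]))  # {'x': 2, 'y': -1}
```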

Comment author: Lumifer 31 March 2015 12:03:42AM 0 points

They're both questions about program verification.

No, I don't think so at all. Thinking that an AI box is all about program verification is like thinking that computer security is all about software bugs.