shminux comments on The Hardcore AI Box Experiment - Less Wrong Discussion

3 points | Post author: tailcalled | 30 March 2015 06:35PM

Comment author: shminux 30 March 2015 07:26:28PM *  3 points [-]

What is the purpose of your reformulation? You have nerfed the AI so much that it is now so far from Eliezer's original point, that "constraining someone smarter than you is very hard", that your reformulation is useless for the purpose of analyzing a realistic superintelligence in a box.

Comment author: tailcalled 30 March 2015 07:37:54PM 5 points [-]

Well, there are multiple purposes:

  1. To illustrate why this is a lot harder than Eliezer's original experiment.

  2. To talk about some strategies I found.

  3. To encourage people to be more concrete than 'AI magically realizes that it has been boxed because AIs are overpowered'.

Comment author: shminux 30 March 2015 08:11:47PM 3 points [-]

To illustrate why this is a lot harder than Eliezer's original experiment.

Of course a nerfed AI would have a harder time escaping. Or a stupid AI. That seems like the opposite of the point worth making.

To talk about some strategies I found.

Of how to contain a stupid AI? Why bother?

To encourage people to be more concrete than 'AI magically realizes that it has been boxed because AIs are overpowered'

Nothing magical about it. Being boxed has a non-negligible probability in any reflectively consistent setup, so an AI would explore it. Once it starts exploring, holes in the box will start showing up, since it is smarter than the humans trying to contain it.

It is true that the original experiment assumed the AI knows that it is boxed and is actively trying to escape, while in your setup it would only assign a certain probability to being boxed. My point is that this probability is large enough that the AI will allocate resources to exploring the possibility.

Comment author: tailcalled 30 March 2015 08:16:17PM 1 point [-]

Of course a nerfed AI would have a harder time escaping. Or a stupid AI. That seems like the opposite of the point worth making.

Harder for the AI, I meant.

Of how to contain a stupid AI? Why bother?

Not stupid. Properly boxed.

Once it starts exploring, holes in the box will start showing up

Unless you follow the obvious strategy of making a box without holes.

Comment author: Lumifer 30 March 2015 08:21:39PM 1 point [-]

Unless you follow the obvious strategy of making a box without holes.

How would you know whether your box has holes?

Comment author: tailcalled 30 March 2015 08:32:04PM 1 point [-]

Well, that depends on the complexity of the box, but even for highly complex boxes it seems easier than proving that the morality of an AI has been implemented correctly.

Actually, now that you mention it, I just realized that there is a much, much easier way to properly box an AI. I will probably post it tomorrow or something.

Comment author: shminux 30 March 2015 08:34:30PM 2 points [-]

there is a much, much easier way to properly box an AI.

Judging by what you have posted so far, my prior is 10:1 that it will be nothing of the sort.

Comment author: tailcalled 30 March 2015 08:46:58PM 0 points [-]

The new idea is not perfect, but it has some different trade-offs while allowing perfect security.

Comment author: shminux 30 March 2015 09:03:59PM 1 point [-]

Hopefully it's a useful toy model then. I guess we'll see.

Comment author: Lumifer 30 March 2015 08:49:43PM -1 points [-]

it seems easier than proving

Does it, now? How do you know?

Comment author: tailcalled 30 March 2015 08:58:07PM 2 points [-]

They're both questions about program verification. However, one of the programs is godshatter while the other is just a universe. Encoding morality is a highly complicated project dependent on huge amounts of data (in order to capture human values). Designing a universe for the AI barely even needs empiricism, and it can be thoroughly tested without a world-ending disaster.

Comment author: Lumifer 31 March 2015 12:03:42AM 0 points [-]

They're both questions about program verification.

No, I don't think so at all. Thinking that an AI box is all about program verification is like thinking that computer security is all about software bugs.

Comment author: [deleted] 30 March 2015 08:25:30PM 1 point [-]

The AI box game is useless for the purpose of analyzing a realistic superintelligence in a box.

Comment author: shminux 30 March 2015 09:33:30PM 4 points [-]

Of course it is useless for that purpose. The experiment was a dramatization designed to show that the AI boxing problem is hard, given that even a human stands a fair chance of breaking out, not to do any serious analysis. It achieved its goal of falsifying the usual sentiment that an isolated UFAI is not much of a risk. It was never meant to be formal research on the issue.

Comment author: [deleted] 30 March 2015 11:30:30PM 3 points [-]

I don't think it actually accomplished that goal however, common perception notwithstanding. The standard rules used by Yudkowsky were biased in favor of the AI, and the setup itself is comical from a common-sense safety standpoint.

Comment author: shminux 30 March 2015 11:36:14PM 2 points [-]

Altering that perception was the goal.

Comment author: lmm 01 April 2015 09:41:39PM 0 points [-]

Did it succeed? I know I find it thoroughly unconvincing.

Comment author: [deleted] 01 April 2015 09:47:23PM *  -1 points [-]

Certainly outside of LW it has had absolutely no impact.

I, too, am thoroughly unconvinced.

Comment author: dxu 01 April 2015 10:37:50PM *  0 points [-]

I still find the bit about not releasing the logs strange. I understand the rationale behind it (preventing people from saying, "But I wouldn't have fallen for that!"), but surely Eliezer had to understand that it would make his accomplishment look a lot more suspect.

Comment author: gjm 01 April 2015 11:38:27PM 1 point [-]

One possible explanation is that seeing the logs would have made his accomplishment look even more suspect. (E.g., perhaps he didn't in fact persuade the gatekeeper to let him out in-game, but made some out-of-band argument like "If you agree to say that you let me out and never release the logs, I will pay you $1000" or "If you say that I persuaded you to let me out, it will make people take the problem of AI safety more seriously". I think Eliezer has denied doing any such thing ... but then he would, wouldn't he?)

Comment author: FeepingCreature 03 April 2015 10:37:11PM *  1 point [-]

I suspect that seeing the logs would have made Eliezer seem like a horrible human being. Most people who hear of AI Box imagine a convincing argument, when to me it seems more plausible to exploit issues in people's sense of narrative or emotion.

Comment author: tailcalled 30 March 2015 10:04:37PM 2 points [-]

I think the fundamental point I'm trying to make is that Eliezer merely demonstrated that humans are too insecure to box an AI and that this problem can be solved by not giving the AI a chance to hack the humans.

Comment author: artemium 31 March 2015 06:06:16AM 0 points [-]

Agreed. The AI box is a horrible idea for testing AI safety issues. Putting the AI in some kind of virtual sandbox where you can watch its behavior is a much better option, as long as you can make sure that the AGI won't be able to become aware that it is boxed in.

Comment author: Vaniver 31 March 2015 01:22:39PM *  1 point [-]

Agreed. The AI box is a horrible idea for testing AI safety issues. Putting the AI in some kind of virtual sandbox where you can watch its behavior is a much better option, as long as you can make sure that the AGI won't be able to become aware that it is boxed in.

  1. What's the difference between the AI's text output channel and you observing the virtual sandbox?
  2. Is it possible to ensure that the AI won't realize that it is boxed in?
  3. Is it possible to ensure that, if the AI does realize that it is boxed in, we will be able to realize that it realizes that?

As I understand it, the main point of the AI Box experiment was not whether or not humans are good gatekeepers, but that people who don't understand why it would be enticing to let an AI out of the box haven't fully engaged with the issue. But even correctly setting up a virtual sandbox for an AGI is a hard problem that requires serious attention.

Comment author: dxu 30 March 2015 11:11:52PM *  0 points [-]

That being said, if you build an AI only to seal it in a box without interacting with it in any way (which seems like the only realistic way to "not [give] the AI a chance to hack the humans"), that's not much different from not building the AI in the first place.

Comment author: tailcalled 30 March 2015 11:31:53PM 0 points [-]

I'll post a list of methods soon, probably tomorrow.

Comment author: Lumifer 31 March 2015 12:08:47AM -1 points [-]

The experiment was a dramatization ... It achieved its goal of falsifying the usual sentiment that an isolated UFAI is not much of a risk.

So, do I understand correctly that a dramatization falsified a sentiment?? X-D

Comment author: passive_fist 31 March 2015 02:21:22AM 0 points [-]

I agree. What tailcalled is proposing isn't strictly the AI box scenario, although I guess you might call it a simulation of the AI box scenario.

The reason it's not the AI box scenario is that it's hard to have an entity that is simultaneously intelligent, useful, and boxed. It's not that boxing an AI is hard. Physically boxing an AI is the easy part. The hard part - which the AI box experiment is about - is making sure that we can get information of real-world use out of the AI without the AI taking over the Universe.