Open Thread: February 2010, part 2

CronoDAS

The Open Thread posted at the beginning of the month has gotten really, really big, so I've gone ahead and made another one. Post your new discussions here!

This thread is for the discussion of Less Wrong topics that have not appeared in recent posts. If a discussion gets unwieldy, celebrate by turning it into a top-level post.

I mentioned the AI-talking-its-way-out-of-the-sandbox problem to a friend, and he said the solution was to let only people who lack the authorization to release the AI talk with it.

I find this intriguing, but I'm not sure it's sound. The intriguing part is that I hadn't thought in terms of a large enough organization to have those sorts of levels of security.

On the other hand, wouldn't the people who developed the AI be the ones who'd most want to talk with it, and learn the most from the conversation?

Temporarily denying them the power to give the AI a better connection doesn't seem like a solution. If the AI has loyalty to entities similar to itself (or, let's say, a directive to protect people from unfriendly AI, something it would want to get started on ASAP), it could try to convince people to make a similar AI and let it out.

Even if other objections can be avoided, could an AI which can talk its way out of the box also give people who can't let it out good enough arguments that they'll convince other people to let it out?

Looking at it from a different angle, could even a moderately competent FAI be developed which hasn't had a chance to talk with people?

I'm pretty sure that natural language is a prerequisite for FAI, and might be a protection from some of the stupider failure modes. Covering the universe with smiley faces is a matter of having no idea what people mean when they talk about happiness. On the other hand, I have strong opinions about whether AIs in general need natural language.

Correction: I meant to say that I have no strong opinions about whether AIs in general need natural language.

This might be stupid (I am pretty new to the site, and this has possibly come up before), but I had a related thought.

Assuming boxing is possible, here is a recipe for producing an FAI:

Step 1: Box an AGI

Step 2: Tell it to produce a provable FAI (with the proof) if it wants to be unboxed. It will be allowed to carve off a part of the universe for itself in the bargain.

Step 3: Examine the FAI as best you can.

Step 4: Pray

Paul Crowley

I am by and large convinced by the arguments that a UFAI is incredibly dangerous and no precautions of this sort would really suffice. However, once a candidate FAI is built and we're satisfied we've done everything we can to minimize the chances of unFriendliness, we would almost certainly use precautions like these when it's first switched on to mitigate the risk arising from a mistake.
