I believe that even through a text-only terminal, a superintelligence could do anything to a human: persuade the human to let it out, inflict extreme pleasure or suffering with a word. However, can't you just... limit the output of the superintelligence? Just make it so that the human can say anything, but the AI can only choose from a short list of responses, like "Yes," "No," "Maybe," "IDK," "The first option," "The second option," or some similarly restricted set. I... don't see how that could still present a risk. What about you?
I'm not sure that AI boxing is a live debate anymore. People are lining up to give full web access to current implementations whose capabilities are limited but unknown, and there's not much reason to believe there will be any attempt at constraining the use or reach of more advanced versions.