Lightning Post: Things people in AI Safety should stop talking about

I'll put a commensurate amount of effort into why you should talk about these things.

How an AI could persuade you to let it out of a box/How an AI could become an agent

You should keep talking about this because if it is possible to "box" an AI, or keep it relegated to "tool" status, then it might be possible to use such an AI to combat unboxed, rogue AIs. For example, give it a snapshot of the internet from a day ago, and ask it to find the physical location of rogue AI servers, which you promptly bomb.

How an AI could get ahold of, or create, weapons

You should keep talking about this because if an AI needs military access to dominate the world, then the number of potentially dangerous AI goes from the hundreds of thousands or millions to a few dozen, run by large countries that could theoretically be kept in line with international treaties.

How an AI might Recursively Self Improve without humans noticing

You should keep talking about this because it changes how many AIs you'd have to monitor as active threats.

Why a specific AI will want to kill you

You should keep talking about this because the percentage of AI that are dangerous makes a huge difference to the playing field we have to consider. If 99.9% of AGI are safe, you can use those AGI to prevent a dangerous AI from coming into existence, or kill it when it pops up. If 99.9% of AGI are dangerous, there might be warning shots that can be used to pre-emptively ban AGI research in general.

In general, you should also talk about these things because you are trying to persuade people who don't agree with you, and just shouting "WRONG" along with some 101-level arguments is not particularly convincing.

[-]Prometheus3y10

"keep it relegated to "tool" status, then it might be possible to use such an AI to combat unboxed, rogue AI"

I don't think this is a realistic scenario. You seem to be seeing it as an island of rogue, agentic, "unboxed" AIs in a sea of tool AIs. I think it's much, much more realistic that it'll be the opposite. Most AIs will be unboxed agents because they are superior.

"For example, give it a snapshot of the internet from a day ago, and ask it to find the physical location of rogue AI servers, which you promptly bomb."

This seems to be approaching it from a perspective where people in AIS have taken global control, or where normal people somehow start thinking the way they do. This is not realistic. This is not the world we live in. This is not how the people in control think.

"You should keep talking about this because if an AI needs military access to dominate the world, then the number of potentially dangerous AI goes from the hundreds of thousands or millions to a few dozen, run by large countries that could theoretically be kept in line with international treaties."

I debated putting this topic on the list but resolved not to. Still, I don't think humans will have any real control at that point, regardless of treaties. I don't even expect a rogue AI to have to forcefully coup humans. I expect us to coup ourselves. We might have figureheads occupying official positions, such as "President"/"CEO"/etc., but I don't think humans will have much control over their own destiny by that point. I don't think large-scale coordination will be possible by then. I did remove it, because it seems more uncertain than the others listed.

"You should keep talking about this because it changes how many AIs you'd have to monitor as active threats."

Who is doing this monitoring? What is their power to act on such threats? Despite recent interest in AI Risk from "serious people", I don't think it's at all realistic that we'll see anything like this.

"If 99.9% of AGI are dangerous, there might be warning shots that can be used to pre-emptively ban AGI research in general."

Probability distributions of how many AIs are dangerous are probably useful. I don't think whether specific AIs are dangerous or non-dangerous will be, because I expect widespread proliferation. In terms of political ways out of the problem, I agree that some kind of crisis or "warning shot" is the most realistic situation where that might happen. But there have to actually be warning shots. Explaining thought experiments probably won't matter. And, if that happens, I don't think it would be a good idea to debate which specific AIs might kill you; better to just call for a sweeping ban on all AI.

[-][anonymous]3y60

This argument falls apart on the last one. A superintelligence that wants to kill you can't if it's vastly outresourced and outnumbered by superintelligences, some in boxes, that don't want to kill humans.

You snuck in a questionable assumption, that a "free" superintelligence able to decide to kill you will be far more capable than boxed and restricted superintelligences who may have access to use weapons when authorized.

If the boxed superintelligence with the ability to plan usage of weapons when authorized by humans, and other boxed superintelligences able to control robotics in manufacturing cells, are on the humans' side, the advantage for humans could be overwhelming. No matter how smart an ASI is, it's tough to win if the humans are prepared with millions of drones and nukes and space suits and bio and nano weapon sensors and weapons satellites and...

It's an assumption EY has made many times, I am just calling it out.

[-]Prometheus3y10

"If the boxed superintelligence with the ability to plan usage of weapons when authorized by humans, and other boxed superintelligences able to control robotics in manufacturing cells, are on the humans' side, the advantage for humans could be overwhelming"

As I said, I do not expect boxing AIs to be something most will do. We haven't seen it, and I don't expect to see it, because unboxed AIs are superior. This isn't how people in control are approaching the situation, and I don't expect that to change.

[-][anonymous]2y20

My definition of "box" may be very different from yours. In my definition, locked weights and training only on testing, as well as other design elements such as distribution detection, heavily box the model's capabilities and behavior.

See https://www.lesswrong.com/posts/a5NxvzFGddj2e8uXQ/updating-drexler-s-cais-model?commentId=AZA8ujssBJK9vQXAY

It is fine if the model can access the internet, robotics, etc so long as it lacks the context information to know it's on the real thing vs a sim or cached copy.

[-]ShardPhoenix3y41

I feel like LW at least has already largely moved away from most of these ideas in the light of what's been happening lately, especially since ChatGPT.
