For 2A, it's the efficiency of the task it has been given, whatever that may be. I discuss how one such task could lead to removing humanity in an upcoming essay on an AGI vs AGI war. I'd be keen to hear your thoughts on it.
Yes, 7C is possible, if it is more efficient and humanity poses no threat to it. It's not a great situation for us either way. Think of the resources an AGI would save by not having to act as humanity's caretaker over a 100-year period, or over 10,000 years. The chances it would determine that keeping us around is the more efficient choice become vanishingly small.
Yes, the title was meant to attract eyes, and it was already long enough, to be honest. If I had put the full argument in there it would be like reading the essay twice (and the essay is already long enough).
On ChatGPT's evaluation:
On Assumption of Unfettered Competition: The essay is not assuming that no collective action will be possible; it's assuming that complete collective action won't be. Whatever agreements are made to make AGI safe, one or more bad actors will simply ignore them to gain an advantage. This follows historical precedent: companies and governments routinely ignore whatever restrictions are in place when there is profit or advantage to be had. It doesn't take a global effort to bring about a hostile AGI; a single actor would do it, and it seems impossible to ensure there isn't at least one (there will likely be several).
On Assumption about AGI’s Goal Structure: It's the same thing. It's not that every AGI will do as I've described; it's that only one needs to.
On Simplified “Benevolent AI” Game Theory: same issue again. A superintelligent AGI would not believe that all humans perceive it as a threat (that would be unreasonable), but it would correctly believe that some do. As soon as an AGI exists that could potentially threaten humanity, just by virtue of its sheer capability, at least some humans somewhere would immediately develop a plan for how to turn it off (or have one ready already). This is enough. It's the prisoner's dilemma played out on a global scale.
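To make the prisoner's-dilemma framing concrete, here is a minimal sketch with hypothetical payoff numbers (the figures are illustrative, not from the essay): whatever the other actor does, each actor's best response is to race ahead on AGI, even though mutual restraint would leave both safer.

```python
# Illustrative payoff matrix for two actors deciding whether to "restrain"
# AGI development or "race". The numbers are assumptions chosen only to show
# the dilemma's structure: racing gives a competitive edge, so defection
# dominates even though (restrain, restrain) is the safest joint outcome.
payoffs = {
    # (actor_a_choice, actor_b_choice): (payoff_a, payoff_b)
    ("restrain", "restrain"): (3, 3),   # safest joint outcome
    ("restrain", "race"):     (0, 5),   # the restrained actor falls behind
    ("race",     "restrain"): (5, 0),
    ("race",     "race"):     (1, 1),   # mutual racing: risky for everyone
}

def best_response(opponent_choice: str) -> str:
    """Return the choice that maximises actor A's payoff against a fixed opponent."""
    return max(["restrain", "race"],
               key=lambda mine: payoffs[(mine, opponent_choice)][0])

for opponent in ("restrain", "race"):
    print(f"If the other actor chooses {opponent!r}, "
          f"the best response is {best_response(opponent)!r}")
# Both lines print 'race': defection dominates, which is the structure I mean
# when I say the prisoner's dilemma is being played out on a global scale.
```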
On Determinism and Extrapolation: the problem with claiming there could be an alternative to systemic forces producing predictable, likely, almost certain results is that you need to actually suggest one. Right now, I've heard no likely alternatives to the results I predict. Finding an alternative route would require global cooperation on a scale we've simply never been able to achieve. Cooperation on the ozone layer and nuclear non-proliferation just isn't the same as asking companies and governments not to pursue a better AGI. There was an alternative to CFCs, but nothing else does what AGI does to optimise. No one wants a nuclear war, but everyone wants an advantage.
On ChatGPT's counterarguments:
Potential for Coordination and Regulation: Seems unlikely, as already described.
Successful Alignment: If alignment is a restriction on the task it has been given, and if it is superintelligent, then there's no reason to believe it won't find a way around any blocks we put in place that interfere with that task. You're basically saying, "we'll just outsmart the superintelligence," which seems naive, to say the least.
AGI Might Not Seek Power if Designed Differently: It will if seeking power allows it to complete its task more efficiently, and, remember, we only need that to be true for one AGI. Humanity doesn't need to get wiped out more than once; once is already pretty bad.
Timeline and Gradual Integration (and the concept of AI guardians): I'm actually writing an essay about AI guardians that will be finished and ready to share soon. I think the issue is that we're not really in control of the progress of AI any more - AI is. As soon as we tell an AI to optimise a task and it begins determining its own behaviour, we have lost much of our ability to control the leap forward it will eventually make. Given enough resources it could increase in capability exponentially, in a way that we can barely even monitor, let alone control.
Humans May Not React with Hostility to Friendly AI: Covered above.
Role of Capitalism – Is it the core issue? Not really; it's just a catchy title. The essay is really about systemic forces driven by competition. We could stop this if literally every single agent capable of bringing it about agreed to cooperate to make sure it doesn't happen. That seems unlikely, given everything we know about global cooperation.
Unpredictability of Technological Outcomes: I think the issue with the examples it gives is that they all rely on human ingenuity solving a problem that is not a direct adversary.
While these counterarguments are worth noting, the only thing I would say is that the claim that I'm "assuming the worst at every juncture" is false. I'm following the most likely logical conclusion from undeniable premises. If there were some other conclusion, I would have landed on it. I didn't write the essay as an argument for how humanity will end; I wrote it as an argument for what will likely happen under current conditions. I landed on humanity's extinction because the logic led me there, not because I was trying to get there.
It is notable that the most rigorous scrutiny my essay has undergone was from an AI. I have used ChatGPT myself as a writing partner on this essay, because when I put ideas down they're just a stream of consciousness, and ChatGPT turns them into something actually readable. I have previously instructed, and subsequently reinforced, that my ChatGPT not be a cheerleader for my ideas but a sparring partner: to question everything I assert with all the logical rigor available to it. Despite that, I find myself still questioning whether it agrees with me just because its programming says that it should (for the most part) or because my ideas actually have strong validity. It is very comforting to know that even when the essay is run through other people's ChatGPT in an attempt to find flaws in my argument, few are found, and those that are I can deal with (without ChatGPT's assistance).
So I thank you for your engagement, and I'll close my response with a quote from your ChatGPT's evaluation of my essay:
"As the post hauntingly implies, if we don’t get this right, we risk writing the final chapter of human philosophy – because there may be no humans left to ask these questions."
Thank you for your considered response.