Reading this, I immediately thought of a critical moment in John C. Wright's Golden Age trilogy. For anyone unfamiliar with it: a transhuman AI that both protagonists know to be overtly hostile attempts to convince them to surrender, even though doing so is clearly not in their (pre-persuasion) interests. (That's a rough sketch, at least.) In the end, much like the results of your tests, the AI manages to convince each of them to surrender in isolation. But when they confront the two (individually) convincing arguments together, they ultimately reject the AI, because the argument used on each was abhorrent to the other.

What I wonder, then, is whether such a situation would present a realistic constraint on even a transhuman AI's power of persuasion. Could it even be tested? Perhaps an AI 'player' who had convinced two separate Gatekeepers individually could then attempt to convince them both simultaneously. The one complication I can see with that method is that it would be necessary to ensure the two Gatekeepers had been persuaded by reasonably distinct arguments.