TheOtherDave comments on against "AI risk" - Less Wrong

24 Post author: Wei_Dai 11 April 2012 10:46PM

Comment author: TheOtherDave 12 April 2012 03:28:15PM 0 points

why would I be more convinced by the agent mode output?

You wouldn't, necessarily. Nor did I suggest that you would.

I also agree that if (AI in "agent mode") does not have any advantages over ("tool mode" plus human agent), then there's no reason to expect its output to be superior, though that's completely tangential to the comment you replied to.

That said, it's not clear to me that (AI in "agent mode") necessarily lacks advantages over ("tool mode" plus human agent).

Comment author: XiXiDu 12 April 2012 04:30:13PM 0 points

...though that's completely tangential to the comment you replied to.

I don't think that anyone with even the slightest awareness that an AI in agent mode could have malicious intentions, and could therefore give biased answers, would fail to be swayed just as easily by counter-arguments made by a similarly capable algorithm.

I mean, we shouldn't assume an idiot gatekeeper who has never heard of anything we're talking about here. So the idea that an AI in agent mode could brainwash someone so thoroughly that it would afterwards take even stronger arguments to undo the brainwashing seems rather far-fetched. (ETA: What is the agent supposed to say? That the tool uses the same algorithms as itself but is somehow wrong in claiming that the AI in agent mode is trying to brainwash the gatekeeper?)

The idea is that, given a sufficiently strong AI in tool mode, it might be possible to counter any attempt to trick a gatekeeper. And if the tool mode agrees, then it probably is a good idea to let the AI out of the box. Although anyone familiar with the scenario would probably instead assume a systematic error elsewhere, e.g. a misinterpretation of one's questions by the AI in tool mode.

Comment author: TheOtherDave 12 April 2012 04:47:51PM 0 points

I don't think that anyone with even the slightest awareness that an AI in agent mode could have malicious intentions, and could therefore give biased answers, would fail to be swayed just as easily by counter-arguments made by a similarly capable algorithm.

Ah, I see. Sure, OK, that's apposite. Thanks for clarifying that.

I disagree with your prediction.