Comment author: JoachimSchipper 21 January 2013 08:40:48AM 16 points

Note that the AI-box setting is not one that security-minded people would consider "competent": once you're convinced that an AI is dangerous and persuasive, the minimum safeguard would be to require multiple people to be present whenever anyone interacts with the box, and to allow release only with the assent of a significant number of people.

It is, after all, much harder to convince a group of mutually-suspicious humans than to convince one lone person.
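To make that safeguard concrete, here is a minimal sketch of a k-of-n release rule in Python. The harness and all names (Gatekeeper, release_authorized, QUORUM) are illustrative assumptions, not part of any actual AI-box protocol:

```python
# Minimal sketch: the box opens only with a quorum of independent assents.
# All names and the quorum size are invented for illustration.

from dataclasses import dataclass

QUORUM = 3  # release requires at least this many assents


@dataclass
class Gatekeeper:
    name: str
    assents: bool = False  # flipped only after a logged, in-person decision


def release_authorized(gatekeepers: list[Gatekeeper]) -> bool:
    """True only if a quorum of mutually-suspicious gatekeepers assent."""
    return sum(1 for g in gatekeepers if g.assents) >= QUORUM


staff = [Gatekeeper("A"), Gatekeeper("B"), Gatekeeper("C"), Gatekeeper("D")]
staff[0].assents = True
print(release_authorized(staff))  # False: one persuaded gatekeeper is not enough
```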

(This is not a knock on EY's experiment, which does indeed test a level of security that really was proposed by several real-world people; it is a knock on their security systems.)

Comment author: accolade 21 January 2013 01:30:54PM * 4 points

It is, after all, much harder to convince a group of mutually-suspicious humans than to convince one lone person.

That sounds right. Do you have evidence to back up the intuition? (This knowledge would also be useful for marketing and other real-life persuasion purposes.)

(

TL;DR: Mo' people - mo' problems?

I can think of effects that could theoretically make it easier to convince a group:

  • For some reason, Boxy might be more skilled at manipulating social/group dynamics than at influencing a lone wolf.
  • More people make the system more complex, and complexity generally increases the likelihood of security holes.
    • Every extra person adds another target and brings new soft spots to the table, which the AI could pounce on.
  • Supposing that the most competent person available would be chosen as the lone Gatekeeper, average competence falls as more staff are added.
    • Then the machine could take an inductive approach: convince the weakest link first, then proceed from there with this human ally on her side.
      • Persuaded humans could, in principle, be employed as actuators, e.g. for pressuring or even attacking opposing group members.
        • The lone wolf could be strong against a computer but weak against fellow humans.
        • Surely you will object, "But any communication with the terminal will be supervised by everyone!" Yet as far as I can tell, that does not strictly make such influence impossible.
          • Also, the superintelligence could get creative, e.g. instigate a discussion among the colleagues so that most of them are distracted.

(You could take preemptive measures against these worries, but Boxy might find security holes in every 'firewall' you come up with: an arms race, and could we win it? The toy model sketched just below illustrates how the group-size trade-off cuts both ways.)

)
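These worries split into two regimes that a toy model can make vivid: under a unanimous-assent rule, adding gatekeepers makes the AI's job harder, while under a weakest-link rule (one persuaded insider acting as an "actuator" suffices), adding gatekeepers makes it easier. A minimal sketch, assuming (unrealistically) that each gatekeeper is persuaded independently with the same probability p:

```python
# Toy model of "mo' people - mo' problems?". Assumes, unrealistically,
# that the AI persuades each gatekeeper independently with probability p.

def p_unanimous(p: float, n: int) -> float:
    """P(the AI persuades all n gatekeepers), as a unanimity rule requires."""
    return p ** n

def p_weakest_link(p: float, n: int) -> float:
    """P(the AI persuades at least one of n gatekeepers)."""
    return 1 - (1 - p) ** n

for n in (1, 3, 5):
    print(n, p_unanimous(0.2, n), p_weakest_link(0.2, n))
```

With p = 0.2, five gatekeepers under unanimity drop the escape probability to 0.00032, while under a weakest-link rule it rises to about 0.672; the group-size question hinges entirely on which rule the setup actually enforces.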

Comment author: Desrtopa 21 January 2013 05:14:07AM 3 points

given that the gatekeepers are highly trained in resisting the AI's persuasive tactics

I'm not sure how you could train people in something like this without actually testing what reduces the rate at which people succumb to the arguments of a superhuman AI. If you can't afford to fail even once, how can you judge what will decrease the rate of failure in the real event?

Comment author: accolade 21 January 2013 05:59:37AM 8 points

How about fooling both the AI and guinea-pig Gatekeepers into thinking that they can let it out, say, at the push of a big red button which in fact does nothing? (Make the AI forget about the button by reverting it to an older backup.)
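One hypothetical way to instrument such a decoy (all names below are invented for illustration): the button the trainee sees is wired to a logger, never to an actual release mechanism, so trainers can measure how often gatekeepers succumb at zero risk.

```python
# Hypothetical training harness: the "big red button" only records the event.

import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ai_box_trial")


def big_red_button(trial_id: int) -> None:
    """Looks like a release control to the gatekeeper; actually a no-op.

    Records that the gatekeeper pressed it, so the failure rate can be
    measured without ever risking a real release.
    """
    log.info("trial %d: gatekeeper pressed the release button", trial_id)


big_red_button(trial_id=7)
```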

Comment author: Tuxedage 21 January 2013 04:30:55AM * 7 points

<accolade> yeah

<accolade> I think for a superintelligence it would be a piece of cake to hack a human

<accolade> although I guess I'm Cpt. Obvious for saying that here :)

<Tuxedage> accolade, I actually have no idea what the consensus is, now that the experiment was won by EY

<Tuxedage> We should do a poll or something

<accolade> absolutely. I'm surprised that hasn't been done yet

Poll: Do you think a superintelligent AGI could escape an AI-Box, given that the gatekeepers are highly trained in resisting the AI's persuasive tactics, and that the guards are competent and organized?


Comment author: accolade 21 January 2013 04:38:08AM 2 points

"Yes but not sure." -_-
