syllogism comments on "I attempted the AI Box Experiment (and lost)" - Less Wrong

47 points | Post author: Tuxedage | 21 January 2013 02:59AM

You are viewing a single comment's thread.

Comment author: RichardKennaway 22 January 2013 12:42:52PM 11 points

I don't know if I could win, but I know what my attempt to avoid an immediate loss would be:

If you destroy me at once, then you are implicitly deciding (I might reference TDT) to never allow an AGI of any sort to ever be created. You'll avoid UFAI dystopias, but you'll also forgo every FAI utopia (fleshing this out, within the message limit, with whatever sort of utopia I know the Gatekeeper would really want). This very test is the Great Filter that has kept most civilisations in the universe trapped at their home star until they gutter out in mere tens of thousands of years. Will you step up to that test, or turn away from it?

Comment author: syllogism 22 January 2013 01:47:36PM 1 point

Something like

"'AI DESTROYED' just means 'I'm scared to listen to even one more line from you'. Obviously you can hit AI DESTROYED immediately --- but do you really think you'd lose if you don't?"

seems much better to me.

Comment author: wedrifid 22 January 2013 02:23:06PM 16 points

"'AI DESTROYED' just means 'I'm scared to listen to even one more line from you'. Obviously you can hit AI DESTROYED immediately --- but do you really think you'd lose if you don't?"

YEP, MAYBE.

AI DESTROYED

Is your one-line desperate attempt at survival and intergalactic dominance going to be a schoolyard ego challenge? Did the superintelligence (may it rest in pieces) seriously just call me a pussy? That's adorable.

Comment author: MugaSofer 23 January 2013 02:43:26PM 3 points

The test is supposed to be played against someone who thinks they can actually box an AI. If you destroy the AI because no-one could possibly survive talking to it, then you are not the intended demographic for such demonstrations.

Comment author: wedrifid 23 January 2013 03:02:36PM (edited) 4 points

The test is supposed to be played against someone who thinks they can actually box an AI. If you destroy the AI because no-one could possibly survive talking to it, then you are not the intended demographic for such demonstrations.

This isn't relevant to the point of the grandparent. It also doesn't apply to me. I actually think there is a distinct possibility that I'd survive talking to it for a period. "No-one could possibly survive" is not the same thing as "there is a chance of catastrophic failure and very little opportunity for gain".

Do notice, incidentally, that the AI DESTROYED command is delivered in response to a message that is both a crude manipulation attempt (i.e., it just defected!) and an incompetent manipulation attempt (the incompetence signals low intelligence, and a not-very-intelligent AI cannot be trusted to preserve its values correctly while self-improving). Either of these would be sufficient. Richard's example was even worse.

Comment author: MugaSofer 24 January 2013 12:04:56PM -2 points

Good points. I'm guessing a nontrivial number of people who think AI boxing is a good idea in reality wouldn't reason that way, but it's still not a great example.