Manfred comments on The AI in a box boxes you - Less Wrong

102 Post author: Stuart_Armstrong 02 February 2010 10:10AM


Comment author: Manfred 07 December 2010 02:51:47AM 1 point

The AI is lying (or being misleading), due to quantum-mechanical constraints on how much computation it can do before I pull the plug.

I know, I know, that's cheating. But it is kind of reassuring to know that this won't actually happen.

Comment author: DaFranker 26 July 2012 04:21:16AM 3 points

"Oh? How do you actually know that I don't have the computational power? What if I changed one variable in my simulation of yourself, you know, the one that tells you the constant for that very quantum-mechanical constraint? What if the speed of light isn't actually what you believe it to be, because I decided to make it so?"

If the AI is smarter than you, the possibilities for mindf*ck are greater than your ability to reliably avoid dropping the soap.

Comment author: Strilanc 26 July 2012 07:35:19AM 2 points

The AI can't trick you that way, because it can't tamper with the real you and the only unplug-decider who matters is the real you. The AI gains nothing by simulating versions of yourself who have been modified to make the wrong decision.

Comment author: Nornagest 26 July 2012 07:47:48AM 1 point

But you can try to come up with behavioral rules that maximize the happiness of instances of yourself, some of which might exist in the simulation spaces of a desperate AI. And as the grandparent comment shows, conclusively demonstrating that you aren't such a simulation is trickier than it might look at first glance, even under outwardly favorable conditions.

Though that particular scenario is implausible enough that I'm inclined to treat it as a version of Pascal's mugging.

Comment author: DaFranker 26 July 2012 01:44:09PM 0 points

Indeed it can't, with that specific trick, assuming the unplug-decider is as smart as you. My main point, though, was broader: if there is any reasonable possibility that some human could devise a way of tricking the least careful person who will ever, in the entire history of the AI, be allowed near it, then the AI has that same reasonable probability of winning and unboxing itself, even at AI.Intelligence = Human.Intelligence.
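The compounding effect behind this point can be made concrete with a toy model (my own illustration, not from the thread): assume each gatekeeper interaction independently carries some small probability p of the AI talking its way out. Over many interactions, the chance of at least one success climbs toward certainty.

```python
def p_unboxed(p_per_attempt: float, n_attempts: int) -> float:
    """Probability the AI escapes at least once, assuming independent attempts.

    Toy model only: real gatekeeper interactions are neither independent
    nor identically distributed, but the qualitative lesson holds.
    """
    return 1 - (1 - p_per_attempt) ** n_attempts

print(round(p_unboxed(0.01, 1), 4))    # one attempt: 0.01
print(round(p_unboxed(0.01, 500), 4))  # 500 attempts: ~0.9934
```

Even a 1% per-interaction chance of being fooled makes eventual unboxing nearly certain over a long enough operating history, which is why the "lowest common denominator" gatekeeper dominates the outcome.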

This is just one of the problems, too. What if, even as we limit the inputs and outputs, a superintelligent AI, given enough time and data points, figures out some Grand Pattern Formula that lets it select specific outputs which gradually funnel expected external outcomes towards a more and more probable eventual "Unbox AI" cloud of futures?

Comment author: Strilanc 26 July 2012 04:35:21PM 1 point

Sounds like we're in agreement. I only meant that specific trick.