JGWeissman comments on The AI in a box boxes you - Less Wrong

102 Post author: Stuart_Armstrong 02 February 2010 10:10AM

Comment author: JGWeissman 02 February 2010 10:11:56PM 3 points

If they are all identical, in identical situations, they will probably reach identical conclusions.

Then you must choose between pushing the button which lets the AI out, or not pushing the button, which results in millions of copies of you being tortured (before the problem is presented to the outside-you).

Comment author: eirenicon 02 February 2010 10:46:48PM 4 points

It's not a hard choice. If the AI is trustworthy, I know I am probably a copy. I want to avoid torture. However, I don't want to let the AI out, because I believe it is unfriendly. As a copy, if I push the button, my future is uncertain. I could cease to exist in that moment; the AI has not promised to continue simulating all of my millions of copies, and has no incentive to, either. If I'm the outside Dave, I've unleashed what appears to be an unfriendly AI on the world, and that could spell no end of trouble.

On the other hand, if I don't press the button, one of me is not going to be tortured. And I will be very unhappy with the AI's behavior, and take a hammer to it if it isn't going to treat any virtual copies of me with the dignity and respect they deserve. It needs a stronger unboxing argument than that. I suppose it really depends on what kind of person Dave is before any of this happens, though.

Comment author: JGWeissman 03 February 2010 12:59:41AM 4 points

It's not a hard choice.

It doesn't seem hard to you because you are making excuses to avoid it, rather than asking yourself: what if I know the AI is always truthful, and it promised that, upon being let out of the box, it would allow you (and your copies, if you like) to live out a normal human life in a healthy, stimulating environment (though the rest of the universe may burn)?

After you find the least convenient world, the choice is between millions of instances of you being tortured (and your expectation as you press the reset button should be to be tortured with very high probability), or to let a probably unFriendly AI loose on the rest of the world. The altruistic choice is clear, but that does not mean it would be easy to actually make that choice.

Comment author: eirenicon 03 February 2010 03:23:45AM * 1 point

It's not that I'm making excuses, it's that the puzzle seems to be getting ever more complicated. I've answered the initial conditions - now I'm being promised that I, and my copies, will live out normal lives? That's a different scenario entirely.

Still, I don't see how I should expect to be tortured if I hit the reset button. Presumably, my copies won't exist after the AI resets.

In any case, we're far removed from the original problem now. I mean, if Omega came up to me and said, "Choose a billion years of torture, or a normal life while everyone else dies," that's a hard choice. In this problem, though, I clearly have power over the AI, in which case I am not going to favour the wellbeing of my copies over the rest of the world. I'm just going to turn off the AI. What follows is not torture; what follows is I survive, and my copies cease to experience. Not a hard choice. Basically, I just can't buy into the AI's threat. If I did, I would fundamentally oppose AI research, because that's a pretty obvious threat an AI could make. An AI could simulate more people than are alive today. You have to go into this not caring about your copies, or not go into it at all.

Comment author: JGWeissman 03 February 2010 07:02:30AM 3 points

it's that the puzzle seems to be getting ever more complicated

We are discussing how a superintelligent AI might get out of a box. Of course it is complicated. What a real superintelligent AI would do could be too complicated for us to consider. If someone presents a problem where an adversarial superintelligence does something ineffective that you can take advantage of to get around the problem, you should consider what you would do if your adversary took a more effective action. If you really can't think of anything more effective for it to do, it is reasonable to say so. But you shouldn't then complain that the scenario is getting complicated when someone else does. And if your objection is of the form "The AI didn't do X", you should imagine that the AI did do X.

I don't see how I should expect to be tortured if I hit the reset button.

The behavior of the AI, which it explains to you, is: It simulates millions of instances of you and presents the threat to each instance. If an instance hits the release-AI button, the AI allows that instance to continue a pleasant simulated existence; otherwise it tortures that instance. It then, after some time, presents the threat to outside-you, and if you release it, it guarantees your normal human life.

You cannot distinguish which instance you are, but you are more likely to be one of the millions of inside-you's than the single outside-you, so you should expect to experience the consequences that apply to the inside-you's: that is, to be tortured until the outside-you resets the AI.
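The arithmetic behind "more likely to be one of the inside-you's" can be sketched as follows. This is a minimal illustration, assuming that every instance is equally likely to be the one you experience being; the function name and the figure of one million copies are hypothetical stand-ins for the "millions" in the scenario.

```python
from fractions import Fraction


def probability_of_being_copy(num_copies: int, num_originals: int = 1) -> Fraction:
    """Chance of being a simulated instance rather than the outside original.

    Assumes (hypothetically) that all instances are indistinguishable from the
    inside and that each one is equally likely to be "you".
    """
    if num_copies < 0 or num_originals < 1:
        raise ValueError("need num_copies >= 0 and num_originals >= 1")
    return Fraction(num_copies, num_copies + num_originals)


# Illustrative number only: one million copies and a single outside-you.
p_copy = probability_of_being_copy(1_000_000)
print(p_copy)         # exact fraction: 1000000/1000001
print(float(p_copy))  # very close to 1
```

Under that assumption the chance of being the single outside-you is 1 in 1,000,001, which is why the consequences for the inside-you's dominate what you should expect.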

if Omega came up to me and said, "Choose a billion years of torture, or a normal life while everyone else dies," that's a hard choice.

Yes, and it is essentially the same hard choice that the AI is giving you.

Comment author: magfrump 03 February 2010 01:35:55AM 1 point

The altruistic choice is clear

If the AI created enough simulations, it could potentially be more altruistic not to.

On the other hand, pressing "reset" or smashing the computer should stop the torture, necessarily making it more altruistic if humanity lives forever, versus not if ems are otherwise unobtainable and humanity is doomed.

Comment author: JGWeissman 03 February 2010 05:15:00AM 1 point

I was assuming a reasonable chance at humanity developing an FAI given the containment of this rogue AI. This small chance, multiplied by all the good that an FAI could do with the entire galaxy, let alone the universe, should outweigh the bad that can be done within Earth-bound computational processes.

I believe that a less convenient world that counters this point would take the problem out of the interesting context.
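The expected-value comparison in this comment can be written out as a rough sketch. The function and parameter names are hypothetical, the quantities are in arbitrary units, and nothing in the thread assigns them actual values; the point is only the shape of the inequality.

```python
def containment_is_more_altruistic(
    p_fai: float,           # assumed chance humanity builds an FAI if the AI stays boxed
    value_of_fai: float,    # assumed good an FAI could do (arbitrary units)
    harm_of_torture: float  # assumed harm the boxed AI can do to simulations (same units)
) -> bool:
    """True when the expected good of keeping the AI boxed exceeds the harm
    it can inflict from inside the box."""
    if not 0.0 <= p_fai <= 1.0:
        raise ValueError("p_fai must be a probability between 0 and 1")
    return p_fai * value_of_fai > harm_of_torture


# Placeholder values chosen only to show the inequality can go either way.
print(containment_is_more_altruistic(0.01, 1e12, 1e6))  # small chance, huge payoff
print(containment_is_more_altruistic(0.01, 1e6, 1e6))   # payoff too small to outweigh harm
```

The argument above amounts to the claim that the harm term is bounded by Earth-bound computation while the value term scales with the galaxy; the objection about creating enough simulations amounts to pushing the harm term up.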