Alicorn comments on Cryptographic Boxes for Unfriendly AI - Less Wrong

Post author: paulfchristiano 18 December 2010 08:28AM 24 points

Comment author: DuncanS 19 December 2010 02:44:36PM 10 points

Let's consider a somewhat similar case.

You are an inventor. An evil dictator captures you, and takes you off to a faraway dungeon, where he tells you that he wants you to build him a superweapon. If you refuse to build the weapon, well, he has means of persuading you. If you still refuse, he will kill you.

Of course, when you finish building the doomsday device, your usefulness will be over, and he will probably kill you anyway.

Being a genius, you soon realise that you could build the dictator a doomsday device in about a week, with the materials lying around in your well-equipped dungeon workshop. However, this doesn't seem terribly optimal.

What do you do? You agree to build the doomsday device. But you claim it is very difficult. You spend a long time producing impressive prototypes that seem to suggest you're making progress. You claim that you need more resources. You carry on, making just enough apparent progress to stop him from killing you on the spot, but not enough to actually succeed. You build ever more expensive and complex non-working prototypes, hoping that in the course of this you can build yourself something suitable for either breaking out of the dungeon or killing the evil dictator. Or hoping that perhaps someone will rescue you. At the very least, you will have wasted the evil dictator's resources on something pointless; you will not have died for nothing.

I suspect your imprisoned AI may choose to follow a similar course.

Comment author: Alicorn 19 December 2010 03:49:31PM 9 points

Or you build yourself a superweapon that you use to escape, and then go on to shut down your company's weapons division and spend your spare time being a superhero and romancing your assistant and fighting a pitched battle with a disloyal employee.

Comment author: shokwave 19 December 2010 04:26:01PM 9 points

This reply and its parent comment constitute the "Iron Man Argument" against any kind of "put the AI in a box" approach to AGI and friendliness concerns. I predict it will be extremely effective.

Comment author: DanielLC 29 January 2013 01:04:30AM 0 points

That doesn't really work in this situation, since the AI building another AI won't get it out of the box.

It's more like a situation where you can only design things, and the evil dictator can tell if a design isn't really a doomsday device and won't build it.

Also, he's not going to be checking on your progress, so faking progress is useless; but for the same reason, it's also unnecessary.