
timtyler comments on Hedging our Bets: The Case for Pursuing Whole Brain Emulation to Safeguard Humanity's Future - Less Wrong

Post author: inklesspen 01 March 2010 02:32AM




Comment author: timtyler 05 March 2010 08:57:47AM *  0 points

I did present some proposals relating to that issue:

"One thing that might help is to put the agent into a quiescent state before being switched off. In the quiescent state, utility depends on not taking any of its previous utility-producing actions. This helps to motivate the machine to ensure subcontractors and minions can be told to cease and desist. If the agent is doing nothing when it is switched off, hopefully, it will continue to do nothing.

Problems with the agent's sense of identity can be partly addressed by making sure that it has a good sense of identity. If it makes minions, it should count them as somatic tissue, and ensure they are switched off as well. Subcontractors should not be "switched off" - but should be tracked and told to desist - and so on."
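The proposal quoted above can be made concrete with a toy sketch. This is a hypothetical illustration, not code from the thread: an agent tracks everything it spawns, counts self-created minions as "somatic tissue" to be switched off along with it, and merely tells external subcontractors to desist. All class and method names here are invented for the example.

```python
# Toy sketch (hypothetical, not from the thread): shutdown that propagates to
# "somatic tissue". Minions the agent created are switched off recursively;
# external subcontractors are tracked and told to desist, not switched off.

class Subcontractor:
    def __init__(self):
        self.working = True

    def desist(self):
        # An external party is asked to stop; it is not part of the agent.
        self.working = False

class Agent:
    def __init__(self):
        self.minions = []         # sub-agents this agent created itself
        self.subcontractors = []  # external parties it delegated work to
        self.active = True

    def spawn_minion(self):
        m = Agent()
        self.minions.append(m)    # count the minion as part of "self"
        return m

    def shut_down(self):
        for m in self.minions:
            m.shut_down()         # recursive: minions go down with the agent
        for s in self.subcontractors:
            s.desist()            # subcontractors are told to cease and desist
        self.active = False

# A parent that shuts down takes its whole "body" of minions with it:
a = Agent()
child = a.spawn_minion()
grandchild = child.spawn_minion()
a.subcontractors.append(Subcontractor())
a.shut_down()
```

The design choice being illustrated is the identity boundary: anything the agent counts as itself is switched off with it, while anything outside that boundary is only signalled to stop.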

Comment author: Peter_de_Blanc 05 March 2010 02:44:16PM 1 point

This sounds very complicated. What is the new utility function? The negative of the old one? That would obviously be just as dangerous in most cases. How does the sense of identity actually work? Is every piece of code it writes considered a minion? What about the memes it implants in the minds of people it talks to - does it need to erase those? If the AI knows it will undergo this transformation in the future, it would erase its own knowledge of the minions it has created, and do other things to ensure that it will be powerless when its utility function changes.

Comment author: timtyler 05 March 2010 10:48:34PM *  0 points

I don't pretend that stopping is simple. However, it is one of the simplest things that a machine can do - I figure if we can make machines do anything, we can make them do that.

Re: "If the AI knows it will undergo this transformation in the future, it would erase its own knowledge of the minions it has created, and do other things to ensure that it will be powerless when its utility function changes."

No, not if it wants to stop, it won't. That would mean that it did not, in fact, properly stop - and that is an outcome it would rate very negatively.

A machine will not value being turned on if its utility function says that being turned off at that point has higher utility.

Re: "What is the new utility function?"

There is no new utility function. The utility function is the same as it always was - it is just a utility function that values being gradually shut down at some point in the future.
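The idea of a single, unchanging utility function that values an eventual shutdown can be sketched in a few lines. This is a hypothetical illustration of the point, not anything from the thread: the shutdown time and the reward numbers are invented, and a trajectory is modelled as a list of timestep/action pairs, with `None` meaning the agent is quiescent.

```python
# Toy sketch (hypothetical): one fixed utility function that rewards working
# before an assumed shutdown time T and rewards quiescence after T. Nothing is
# swapped out at shutdown; valuing being off is built in from the start.

SHUTDOWN_TIME = 10  # assumed deadline, in timesteps

def utility(trajectory):
    """trajectory: list of (timestep, action) pairs; action None = quiescent."""
    total = 0.0
    for t, action in trajectory:
        if t < SHUTDOWN_TIME:
            total += 1.0 if action is not None else 0.0   # reward working
        else:
            total += 1.0 if action is None else -100.0    # penalize acting after T
    return total

# An agent maximizing this utility prefers to wind down at T:
busy_then_quiet = [(t, "work") for t in range(10)] + [(t, None) for t in range(10, 15)]
never_stops = [(t, "work") for t in range(15)]
```

Under this (deliberately crude) scoring, the trajectory that goes quiescent at the deadline scores higher than the one that keeps acting, which is the sense in which the same utility function "values being gradually shut down at some point in the future". Peter_de_Blanc's objections above - erasing knowledge of minions, leaving memes behind - are about whether such a function can be specified well enough in practice, which this sketch does not address.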