DanielLC comments on Work on Security Instead of Friendliness? - Less Wrong

29 Post author: Wei_Dai 21 July 2012 06:28PM


Comments (103)


Comment author: DanielLC 25 July 2012 12:44:02AM 0 points

Nothing a human can do, a human can do in the most extreme possible manner. An AI, by contrast, could be made to wirehead more easily or with more difficulty. It could think faster or slower. It could be more or less creative. It could be nicer or meaner.

I wouldn't begin to know how to build an AI that's improved in all the right ways. It might not even be humanly possible. But if it's not humanly possible to build a good AI, it's likely also impossible to build an AI capable of improving on itself, so there's still a good chance that things would work out.

Comment author: timtyler 25 July 2012 09:52:28AM 0 points

An AI could be made to wirehead easier or harder.

Probably true - and few want wireheading machines - but the issues are the scale of the technical challenges and, if these are non-trivial, how much people will be prepared to pay for the feature. In a society of machines, maybe the occasional one that turns Buddhist - and needs to go back to the factory for psychological repairs - is within tolerable limits.

Many apparently think that making machines value "external reality" fixes the wirehead problem - see, for example, "Model-based Utility Functions" - but this leads directly to the problems of specifying what "external reality" means and of telling a machine that that is what it is supposed to be valuing. It doesn't look much like a solution to me.
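The distinction at issue can be sketched in a few lines of code. This is a minimal toy illustration (all names and values are hypothetical, not taken from "Model-based Utility Functions"): a reward-signal maximizer prefers to tamper with its own reward register, while an agent whose utility function is defined over its model of the external world does not - though, as noted above, this only pushes the problem into how "external reality" gets specified and modeled.

```python
# Toy sketch (hypothetical names): each action maps to a pair of
# (predicted external world state, resulting internal reward signal).
ACTIONS = {
    "clean_room":    ({"room_clean": True},  1.0),
    "tamper_reward": ({"room_clean": False}, 99.0),  # wirehead: fake the signal
}

def utility(world_state):
    """Utility defined over the (modeled) external world, not the raw signal."""
    return 1.0 if world_state.get("room_clean") else 0.0

def pick_by_signal():
    # Wireheadable agent: maximizes its own internal reward register.
    return max(ACTIONS, key=lambda a: ACTIONS[a][1])

def pick_by_model():
    # Model-based agent: maximizes utility of the predicted external state.
    return max(ACTIONS, key=lambda a: utility(ACTIONS[a][0]))

print(pick_by_signal())  # -> tamper_reward
print(pick_by_model())   # -> clean_room
```

Note that the model-based agent's immunity here rests entirely on the `ACTIONS` table - its world model - being accurate; an agent that could corrupt its own model would face the wirehead problem all over again, one level up.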