Alexei comments on A definition of wireheading - Less Wrong

35 Post author: Anja 27 November 2012 07:31PM

Comments (80)

Comment author: devas 28 November 2012 11:32:47AM 2 points

I agree with Alexei; this has just now helped me a lot.

I now have to ask a stupid question, though; please have pity on me, I'm new to the site and have little knowledge to work from.

What would happen if we put an algorithm inside the AGI that assigns negative infinite utility to any action that modifies its own utility function or the algorithm itself?

This would be within reasonable parameters; ideally, it could change its utility function, but only along certain pre-approved paths, so that it could actually move around.

"Reasonable" here is a magic word, in the sense that it's a black box which I don't know how to map out.
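To make the question concrete, here is a minimal toy sketch of the idea in Python. Everything in it is hypothetical and only for illustration: the names `Action`, `guarded_utility`, `choose`, and the `approved` set are made up, and the sketch assumes every action comes honestly labelled with whether it touches the utility function or the guard. That labelling is exactly the black box mentioned above, so this shows the shape of the proposal, not a solution.

```python
import math
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable


@dataclass(frozen=True)
class Action:
    """A hypothetical action; the two flags are assumed to be given and truthful."""
    name: str
    modifies_utility: bool = False  # would this action rewrite the utility function?
    modifies_guard: bool = False    # would this action rewrite the guard itself?


def guarded_utility(
    base_utility: Callable[[Action], float],
    approved: FrozenSet[str],
) -> Callable[[Action], float]:
    """Wrap a base utility so that forbidden self-modifications score -infinity."""

    def utility(action: Action) -> float:
        # Tampering with the guard is never allowed.
        if action.modifies_guard:
            return -math.inf
        # Changing the utility function is allowed only along pre-approved paths.
        if action.modifies_utility and action.name not in approved:
            return -math.inf
        return base_utility(action)

    return utility


def choose(actions: Iterable[Action], utility: Callable[[Action], float]) -> Action:
    """Pick the action with the highest utility."""
    return max(actions, key=utility)


if __name__ == "__main__":
    actions = [
        Action("make_tea"),
        Action("refine_goal", modifies_utility=True),
        Action("wirehead", modifies_utility=True),
        Action("disable_guard", modifies_guard=True),
    ]
    # Made-up payoffs: the unguarded agent finds self-modification very attractive.
    payoffs = {"make_tea": 1.0, "refine_goal": 5.0, "wirehead": 1000.0, "disable_guard": 1000.0}
    base = lambda a: payoffs[a.name]

    guarded = guarded_utility(base, approved=frozenset({"refine_goal"}))
    print(choose(actions, base).name)     # unguarded choice
    print(choose(actions, guarded).name)  # guarded choice
```

In this toy world the guarded agent skips the high-payoff self-modifications and takes the one approved path. The sketch says nothing about how to recognize a self-modifying action in the first place, or which paths deserve to be approved.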

Comment author: Alexei 29 November 2012 04:27:14AM 2 points

I think your intuition is basically right. An AGI will have to change its utility function; the question is basically how and why. For FAI, we want to make sure that all future modifications will preserve the "friendly" aspect, which is very difficult to ensure (we don't have the necessary math for that right now).