Alicorn comments on Open Thread: January 2010 - Less Wrong
Suppose we want to program an AI to represent the interests of a group. The standard utilitarian solution is to give the AI a utility function that is an average of the utility functions of the individuals in the group, but that runs into the problem of interpersonal comparison of utility. (Was there ever a post about this? Does Eliezer have a preferred approach?)
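To make the comparison problem concrete, here is a minimal sketch. It assumes utilities are only meaningful up to a positive rescaling, and every name and number in it is made up for illustration. Multiplying one person's utility function by a constant leaves that person's own preferences unchanged, yet it flips which outcome the average picks.

```python
# Hypothetical utilities for two people over two outcomes (made-up numbers).
alice = {"A": 1.0, "B": 0.0}
bob = {"A": 0.0, "B": 0.6}


def rescale(utility, factor):
    """Multiply a utility function by a positive constant.

    This leaves the person's own ranking of outcomes unchanged.
    """
    return {outcome: factor * value for outcome, value in utility.items()}


def best_by_average(utilities):
    """Return the outcome with the highest average utility across people."""
    outcomes = sorted(utilities[0])
    return max(outcomes, key=lambda o: sum(u[o] for u in utilities) / len(utilities))


# Same people, same preferences; only the scale of Bob's numbers differs.
choice_original = best_by_average([alice, bob])
choice_rescaled = best_by_average([alice, rescale(bob, 10)])
print(choice_original, choice_rescaled)
```

Nothing in either person's preferences says which scale is the right one, so the averaged AI's choice depends on an arbitrary normalization.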
Here's my idea for how to solve this. Create N AIs, one for each individual in the group, and program each with the utility function of its individual. Then set a time in the future when one of those AIs will be randomly selected and allowed to take over the universe. In the meantime, the N AIs are to negotiate among themselves and, if necessary, be given help enforcing their agreements.
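A rough sketch of how the lottery shapes the negotiation, under strong simplifying assumptions that are mine and not part of the proposal: there are only a few discrete outcomes, each AI would simply impose its favorite outcome if selected, and an AI accepts a deal only if it does at least as well as its expected utility from the lottery. The outcome names and numbers are hypothetical.

```python
def favorite(utility):
    """Outcome this AI would impose if it won the lottery."""
    return max(sorted(utility), key=utility.get)


def lottery_values(utilities):
    """Expected utility of each AI if a uniformly random AI takes over."""
    favorites = [favorite(u) for u in utilities]
    return [sum(u[f] for f in favorites) / len(favorites) for u in utilities]


def acceptable_deals(utilities):
    """Outcomes every AI weakly prefers to taking its chances in the lottery."""
    fallback = lottery_values(utilities)
    return [
        outcome
        for outcome in sorted(utilities[0])
        if all(u[outcome] >= fb for u, fb in zip(utilities, fallback))
    ]


# Two hypothetical AIs: each most wants the whole universe for its own goal,
# but both rate an even split well above a coin flip between the extremes.
ai_1 = {"all_X": 1.0, "all_Y": 0.0, "split": 0.7}
ai_2 = {"all_X": 0.0, "all_Y": 1.0, "split": 0.7}

fallback = lottery_values([ai_1, ai_2])
deals = acceptable_deals([ai_1, ai_2])
print(fallback, deals)
```

Here the lottery acts as the disagreement point: no interpersonal comparison is needed, because each AI only compares a proposed deal against its own fallback.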
The advantages of this approach are:
Comments?
ETA: I found a very similar idea mentioned before by Eliezer.
Unless you can directly extract a sincere and accurate utility function from the participants' brains, this is vulnerable to exaggeration in the AI programming. Say my optimal amount of X is 6. I could program my AI to want 12 of X, but be willing to back off to 6 in exchange for concessions regarding Y from other AIs that don't want much X.
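A toy version of the exaggeration, simplified to the single quantity X (the concessions on Y are left out) and assuming a hypothetical "split the difference" bargaining rule that nobody in this thread proposed. The other AI's ideal of 0 is also an assumption.

```python
TRUE_IDEAL_X = 6   # what I actually want
OTHER_IDEAL_X = 0  # assumed ideal of an AI that doesn't want much X


def split_the_difference(report_a, report_b):
    """Hypothetical bargaining rule: settle halfway between reported ideals."""
    return (report_a + report_b) / 2


def true_loss(x):
    """How far a settlement is from what I actually want."""
    return abs(x - TRUE_IDEAL_X)


honest_outcome = split_the_difference(TRUE_IDEAL_X, OTHER_IDEAL_X)
exaggerated_outcome = split_the_difference(12, OTHER_IDEAL_X)
print(honest_outcome, true_loss(honest_outcome))
print(exaggerated_outcome, true_loss(exaggerated_outcome))
```

Under this rule, programming the AI to want 12 lands the settlement exactly on my real optimum, while honest programming leaves me 3 short.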
This does not seem to be the case when the AIs are unable to read each other's minds. Your AI can be expected to lie to others with more tactical effectiveness than you can lie indirectly via deceiving it. Even in that case it would be better to let the AI rewrite itself for you.
On a similar note, being able to directly extract a sincere and accurate utility function from the participants' brains leaves the system vulnerable to exploitation. Individuals are able to rewrite their own preferences strategically in much the same way that an AI can. Future-me may not be happy, but present-me gets what he wants, and I don't (necessarily) have to care about future-me.
I had also mentioned this in an earlier comment on another thread. It turns out that this is a standard concern in bargaining theory. See section 11.2 of this review paper.
So, yeah, it's a problem, but it has to be solved anyway in order for AIs to negotiate with each other.