Kaj_Sotala comments on Open Thread Feb 22 - Feb 28, 2016 - Less Wrong Discussion

Post author: Elo 21 February 2016 09:14PM

Comment author: halcyon 28 February 2016 08:43:52PM * 0 points

Although, what if we told each party to submit goals rather than non-goal preferences? If the AI has access to a model specifying which actions lead to which consequences, then it can search for the actions that maximize the number of goals fulfilled, regardless of which party submitted them; or it could take a Rawlsian approach and try to maximize the number of goals fulfilled for whichever party would have the fewest goals fulfilled under that course of action, etc. That seems very imaginable to me. You can then have heuristics that constrain the search space and so on. You can also accept non-goal preferences in addition to goals if the parties have any of those.
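(To make that concrete, here is a minimal sketch in Python of the two aggregation rules I have in mind; the function names and the toy world model are just made-up illustrations, not any real system.)

```python
# Minimal, purely illustrative sketch of the two aggregation rules above.
# All names and the toy world model are made up for illustration.

def goals_fulfilled(action, party_goals, model):
    """Count how many of one party's goals hold among the consequences of an action."""
    consequences = model[action]  # model: action -> set of facts it brings about
    return sum(1 for goal in party_goals if goal in consequences)

def best_action_total(actions, parties, model):
    """Maximize the total number of goals fulfilled, regardless of who submitted them."""
    return max(actions, key=lambda a: sum(goals_fulfilled(a, p, model) for p in parties))

def best_action_rawlsian(actions, parties, model):
    """Maximize the goal count of whichever party ends up with the fewest goals fulfilled."""
    return max(actions, key=lambda a: min(goals_fulfilled(a, p, model) for p in parties))

# Toy example: two parties, three possible courses of action.
model = {
    "big_shop":   {"has_milk", "has_cookies", "has_tea"},
    "split_trip": {"has_chocolate", "has_milk"},
    "stay_home":  {"well_rested"},
}
parties = [{"has_chocolate"}, {"has_milk", "has_cookies", "has_tea"}]
actions = list(model)

print(best_action_total(actions, parties, model))     # "big_shop": 3 goals total, but party 1 gets nothing
print(best_action_rawlsian(actions, parties, model))  # "split_trip": the worst-off party still gets a goal
```

The toy output shows why the two projections can disagree: the total-count rule happily sacrifices one party entirely, while the Rawlsian rule gives up a goal to protect the worst-off party.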

In that light, it seems to me that the problem was inferring goals from a set of preferences that weren't purely non-goal preferences, but were actually stated with some unspecified goals in mind. E.g., one party wanted chocolate, but said "I want to go to the store" instead. If that was the source of the original problem, then we can see why we might need an AI to solve it, since it calls for some lightweight mind reading. Of course, a CEV-implementing AI would have to be a mind reader anyway, since we don't really know what our goals ultimately are, given everything we could know about reality.
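(Continuing the toy sketch, the goal-inference step might look something like this: treat a stated instrumental preference as an action, and read candidate underlying goals off the consequences the world model predicts for it. Again, the names and the model are made-up illustrations.)

```python
# Illustrative sketch of the "lightweight mind reading" step, using a toy world model.

model = {
    "go_to_store": {"has_chocolate", "has_milk"},
    "stay_home":   {"well_rested"},
}

def candidate_goals(stated_preference, model):
    """Consequences of the stated action are the candidate goals behind it."""
    return model.get(stated_preference, set())

print(candidate_goals("go_to_store", model))
# Prints the two consequences; "wants chocolate" is one hypothesis explaining the request.
```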

This still does not guarantee basic morality, but the parties should at least recognize some of their ultimate goals in the end result. They might still grumble that the outcome isn't exactly what they wanted, but then we can at least scold them for lacking a spirit of compromise.

All this presupposes that enough of our actions can be reduced to ultimate goals that can be discovered, and I don't think this process guarantees we will be satisfied with the results. For example, this might erode personal freedom to an unpleasant degree. If we would choose to live in some world X if we were wiser and nicer than we are, then it doesn't necessarily follow that X is a Nice Place to Live as we are now. Changing ourselves to reach that level of niceness and wisdom might require unacceptably extensive modifications to our actual selves.

Comment author: Kaj_Sotala 01 March 2016 07:55:57AM 1 point

My recent paper touches upon preference aggregation a bit in section 8, BTW, though it's mostly focused on the question of figuring out a single individual's values. (Not sure how relevant that is for your comments, but thought maybe a little.)

Comment author: halcyon 05 March 2016 09:58:10PM 0 points

Thanks, I'll look into it.

(And all my ranting still didn't address the fundamental difficulty: there is no rational way to choose among different projections of the values held by multiple agents, projections such as Rawlsianism and utilitarianism.)