
jimrandomh comments on Polymath-style attack on the Parliamentary Model for moral uncertainty - Less Wrong Discussion

22 Post author: danieldewey 26 September 2014 01:51PM



Comment author: jimrandomh 28 September 2014 10:00:33PM 6 points

We discussed this issue at the two MIRIx Boston workshops. A big problem with parliamentary models, which we were unable to solve, is what we've been calling ensemble stability. The issue is this: suppose an AI whose value system is built from a collection of value systems combined in a voting-like way is constructing a more powerful successor AI, and is considering building the successor so that it represents only a subset of the original value systems. Each value system that is represented will be in favor; each value system that is not represented will be against. To keep that from happening, you either need a voting system that somehow reliably never does this (nothing we tried worked), or a special case for constructing successors, along with a working, loophole-free definition of that case (which is Hard).
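A minimal sketch of the failure mode described above, assuming the simplest possible aggregation rule (strict-majority voting among delegates, with all names and the voting rule being illustrative, not anything proposed in the thread):

```python
# Hypothetical sketch: a parliament of value systems votes by simple
# majority on whether to build a successor AI that represents only a
# given subset of them. Every represented delegate votes yes; every
# excluded delegate votes no -- so any bare majority can lock out the rest.

def approves_successor(parliament, subset):
    """True iff a strict majority of delegates is represented in `subset`."""
    votes_for = sum(1 for v in parliament if v in subset)
    return votes_for > len(parliament) / 2

parliament = ["utilitarian", "deontologist", "virtue", "contractualist", "egoist"]

coalition = set(parliament[:3])          # any 3-of-5 coalition
print(approves_successor(parliament, coalition))        # a bare majority passes
print(approves_successor(parliament, set(parliament[:2])))  # a minority fails
```

Under this rule every bare-majority subset approves its own successor proposal, which is exactly the instability the comment describes; a fix would have to change the voting rule itself or special-case successor construction.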

Comment author: varsel 30 September 2014 08:06:35PM 2 points

This seems to be almost equivalent to irreversibly forming a majority voting bloc. The only difference is how the two interact with the (fake) randomization: by creating a subagent, the bloc effectively (and perfectly) correlates all of the future random outputs. (In general, I think this will change the outcomes unless agents' cardinal preferences about different decisions are independent.)
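The correlation point can be made concrete with a small simulation, assuming a per-decision "random dictator" scheme as the randomization being discussed (the scheme and all names here are illustrative assumptions, not something specified in the thread): reusing a single random draw for every future decision leaves each side's expected share of control unchanged, but makes the realized outcome all-or-nothing.

```python
import random
import statistics

# Hedged sketch: compare fresh random-dictator draws per decision against
# one draw reused for all decisions (as when a subagent is built once).
# The coalition's expected share of decisions is the same either way
# (~3/5 here), but the correlated version is all-or-nothing.

random.seed(0)
delegates = ["A", "B", "C", "D", "E"]
coalition = {"A", "B", "C"}
n_decisions, n_runs = 100, 2000

def coalition_share(correlated):
    if correlated:  # one draw fixes every future decision
        return 1.0 if random.choice(delegates) in coalition else 0.0
    # independent draw for each decision
    return sum(random.choice(delegates) in coalition
               for _ in range(n_decisions)) / n_decisions

indep = [coalition_share(False) for _ in range(n_runs)]
corr = [coalition_share(True) for _ in range(n_runs)]

# Similar means (~0.6), very different spreads:
print(round(statistics.mean(indep), 2), round(statistics.pstdev(indep), 2))
print(round(statistics.mean(corr), 2), round(statistics.pstdev(corr), 2))
```

This is why the difference matters only when preferences across decisions are not independent: an agent that cares about combinations of outcomes (or is risk-averse over its total share) is not indifferent between the two distributions, even though their means match.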

The randomization trick still potentially helps here: it would be in each representative's interest to agree not to vote for such proposals before knowing which proposals will come up and in what order they're voted on. However, depending on what fraction of its potential value an agent expects to achieve through negotiation, I think some agents would refuse to sign such an agreement if they know they will have the chance to lock their opponents out before they might be locked out themselves.

Actually, there seems to be a more general issue with ordering and incompatible combinations of choices; I'll split that into a separate comment.