Well, I agree that I chose my words badly, failed to explain the intended meaning, and continued to speak in metaphors (my writing skills are seriously lacking). What I called the "personality" of a delegate is a function that assigns a utility score to any given state of the world (initially determined by the delegate's moral theory). In my first post I treated these utility functions as constants that stay fixed throughout the negotiation process (my impression was that ESRogs's third assumption implicitly says the same thing), with delegates perhaps accepting binding agreements if those help to increase expected utility (such agreements are ad hoc and are not treated as part of the utility function).
On the other hand, what if we drop the assumption that these utility functions stay constant? What if, for example, when two delegates meet, instead of exchanging binding agreements to vote in a specific way, they exchanged agreements to self-modify in a way that corresponds to those agreements? That is, suppose a delegate M_1 strongly prefers option O_1,1 to option O_1,2 on issue B_1 and slightly prefers O_2,1 to O_2,2 on issue B_2, whereas a delegate M_2 strongly prefers O_2,2 to O_2,1 on issue B_2 and slightly prefers O_1,2 to O_1,1 on issue B_1. Now, M_1 could sign a binding agreement to vote (O_1,1; O_2,2) in exchange for M_2's promise to vote the same way. Alternatively, M_1 could agree to self-modify to slightly prefer O_2,2 to O_2,1 in exchange for M_2's promise to self-modify to slightly prefer O_1,1 to O_1,2. (Both want to self-modify as little as possible, though any modification that is not ad hoc would probably affect the utility function at more than one point. Self-modification in this case is restricted — only the utility function is modified — so maybe it wouldn't require heavy machinery, though I am not sure; besides, all the utility functions ultimately belong to the same person.) These self-modifications are not binding agreements: delegates remain free to further modify their "personalities" (i.e., utility functions) in later exchanges.
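The self-modification deal described above can be sketched in a few lines of Python. This is only an illustrative toy under my interpretation: the class, method names, and utility numbers are all hypothetical, chosen so that each delegate's weak preference is reversed by a small nudge while its strong preference is untouched.

```python
class Delegate:
    """A delegate whose 'personality' is a utility function over options."""

    def __init__(self, utilities):
        # utilities: dict mapping (issue, option) -> utility score
        self.utilities = dict(utilities)

    def utility(self, bundle):
        # A bundle picks one option per issue, e.g. {"B1": "O11", "B2": "O22"}.
        return sum(self.utilities[(issue, opt)] for issue, opt in bundle.items())

    def nudge(self, issue, option, delta):
        # Self-modification: shift the utility of one option on one issue.
        self.utilities[(issue, option)] += delta


# M1 strongly prefers O11 on B1, slightly prefers O21 on B2 (numbers are made up).
m1 = Delegate({("B1", "O11"): 10, ("B1", "O12"): 0,
               ("B2", "O21"): 1,  ("B2", "O22"): 0})
# M2 strongly prefers O22 on B2, slightly prefers O12 on B1.
m2 = Delegate({("B1", "O11"): 0, ("B1", "O12"): 1,
               ("B2", "O21"): 0, ("B2", "O22"): 10})

# The exchange: each delegate nudges its own weak preference just past the
# reversal point, rather than signing a binding vote agreement.
m1.nudge("B2", "O22", 2)   # M1 now slightly prefers O22 to O21
m2.nudge("B1", "O11", 2)   # M2 now slightly prefers O11 to O12

# After modification, each delegate independently favors the bundle (O11, O22).
assert m1.utilities[("B2", "O22")] > m1.utilities[("B2", "O21")]
assert m2.utilities[("B1", "O11")] > m2.utilities[("B1", "O12")]
```

Note the contrast with the binding-vote deal: here nothing constrains the delegates' future votes directly; the bundle wins only because both modified utility functions now happen to rank it highest, and a later exchange could nudge them again.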
Now, this idea vaguely reminds me of smoothing over the space of all possible utility functions. Metaphorically, it looks as if delegates were "persuaded" by an "argument" (i.e., an exchange) to change their "personalities", their "opinions about things" (i.e., their utility functions).
I would guess that these self-modifying delegates should be used as dummy variables during a finite negotiation process: after the vote, the delegates would revert to their original utility functions.
Thanks to ESRogs, Stefan_Schubert, and the Effective Altruism summit for the discussion that led to this post!
This post is to test out Polymath-style collaboration on LW. The problem we've chosen to try is formalizing and analyzing Bostrom and Ord's "Parliamentary Model" for dealing with moral uncertainty.
I'll first review the Parliamentary Model, then give some of Polymath's style suggestions, and finally suggest some directions that the conversation could take.
The Parliamentary Model
The Parliamentary Model is an under-specified method of dealing with moral uncertainty, proposed in 2009 by Nick Bostrom and Toby Ord. Reposting Nick's summary from Overcoming Bias:
In a comment, Bostrom continues:
It's an interesting idea, but clearly there are a lot of details to work out. Can we formally specify the kinds of negotiation that delegates can engage in? What about blackmail or prisoners' dilemmas between delegates? In what ways does this proposed method outperform other ways of dealing with moral uncertainty?
I was discussing this with ESRogs and Stefan_Schubert at the Effective Altruism summit, and we thought it might be fun to throw the question open to LessWrong. In particular, we thought it'd be a good test problem for a Polymath-project-style approach.
How to Polymath
The Polymath comment style suggestions are not so different from LW's, but numbers 5 and 6 are particularly important. In essence, they point out that the idea of a Polymath project is to split the work into minimal chunks among participants, and to get most of the thinking to occur in comment threads. This is as opposed to a process in which one community member goes off for a week, meditates deeply on the problem, and produces a complete solution by themselves. Polymath rules 5 and 6 are instructive:
It seems to us as well that an important part of the Polymath style is to have fun together and to use the principle of charity liberally, so as to create a space in which people can safely be wrong, point out flaws, and build up a better picture together.
Our test project
If you're still reading, then I hope you're interested in giving this a try. The overall goal is to clarify and formalize the Parliamentary Model, and to analyze its strengths and weaknesses relative to other ways of dealing with moral uncertainty. Here are the three most promising questions we came up with:
The original OB post had a couple of comments that I thought were worth reproducing here, in case they spark discussion, so I've posted them.
Finally, if you have meta-level comments on the project as a whole instead of Polymath-style comments that aim to clarify or solve the problem, please reply in the meta-comments thread.