The threats problem seems like a specific case of the problems that might arise from putting real intelligence into the agents in the system. Especially if this moral theory were being run on a superintelligent AI, it seems like the agents might be able to come up with all sorts of creative, unexpected stuff. And I'm doubtful that creative, unexpected stuff would make the parliament's decisions more isomorphic to the "right answer".
One way to solve this problem might be to drop any notion of "intelligence" in the delegates and instead specify a deterministic algorithm that each individual delegate follows in deciding which "deals" to accept. Or take the same idea even further and specify a deterministic algorithm for resolving moral uncertainty that is merely inspired by the function of parliaments, in the same sense that the stable marriage problem and algorithms for solving it could have been inspired by the way people decide who to marry.
Eliezer's notion of a "right answer" sounds appealing, but I'm a little skeptical. In computer science, it's possible to prove that a particular algorithm, when run, will always achieve the maximal "score" on a criterion it's attempting to optimize. But in this case, if we could formalize a score we wanted to optimize for, that would be equivalent to solving the problem! That's not to say this is a bad angle of approach, however... it may be useful to take the idea of a parliament and use it to formalize a scoring system that captures our intuitions about how different moral theories trade off, and then maximize this score using whatever method seems to work best. For example (waves hands), perhaps we could score the total regret of our parliamentarians and minimize that.
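To make the hand-waving slightly more concrete, here is a minimal sketch of one possible "total regret" score. Everything in it is an assumption for illustration: the names `total_regret` and `min_regret_option` are hypothetical, the numbers are made up, and it assumes each theory assigns numeric scores to options on a scale that can be compared across theories, which is itself a contested part of the problem. Regret for a theory is taken to be the gap between its favorite option's score and the chosen option's score, weighted by our credence in that theory.

```python
from typing import Dict, List

def total_regret(option: str,
                 credences: Dict[str, float],
                 scores: Dict[str, Dict[str, float]]) -> float:
    """Credence-weighted sum, over theories, of how far `option`
    falls short of each theory's own best option."""
    return sum(
        credences[t] * (max(scores[t].values()) - scores[t][option])
        for t in credences
    )

def min_regret_option(options: List[str],
                      credences: Dict[str, float],
                      scores: Dict[str, Dict[str, float]]) -> str:
    """Return the option with the smallest total regret.
    Ties go to whichever option appears first in `options`."""
    return min(options, key=lambda o: total_regret(o, credences, scores))

# Made-up example: two theories, three options (all values are assumptions).
credences = {"theory_a": 0.6, "theory_b": 0.4}
scores = {
    "theory_a": {"x": 10.0, "y": 6.0, "z": 0.0},
    "theory_b": {"x": 0.0, "y": 8.0, "z": 10.0},
}
options = ["x", "y", "z"]

best = min_regret_option(options, credences, scores)
print(best)  # the compromise option "y" in this example
```

Note that with this particular linear definition, minimizing total regret picks the same option as maximizing the credence-weighted score, because each theory's best score is a constant. A different definition of regret (for example, minimizing the worst-off delegate's regret) would generally give different answers, which is part of why choosing the score is the hard part.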
Another approach might be to formalize a set of criteria that a good solution to the problem of moral uncertainty should achieve and then set out to design an algorithm that achieves all of these criteria. In other words, making a formal problem description that's more like that of the stable marriage problem and less like that of the assignment problem.
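For readers who haven't seen it, here is a short sketch of what a criteria-based problem description looks like in the stable marriage case. The criterion is "no blocking pair": no two participants who would both rather be matched to each other than to their assigned partners. The Gale-Shapley deferred acceptance algorithm is known to always produce a matching that satisfies it. The function names and the toy preference lists below are my own, and the sketch assumes equal-sized groups with complete preference lists. It is only an analogy; nothing here is a proposal for how delegates should actually bargain.

```python
from typing import Dict, List

def gale_shapley(proposer_prefs: Dict[str, List[str]],
                 receiver_prefs: Dict[str, List[str]]) -> Dict[str, str]:
    """Deferred acceptance. Returns a dict mapping proposer -> receiver."""
    # rank[r][p] = position of proposer p in receiver r's list (lower is better)
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next list index to propose to
    engaged_to: Dict[str, str] = {}               # receiver -> proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(r)
        if current is None:
            engaged_to[r] = p                     # r was unmatched, accepts
        elif rank[r][p] < rank[r][current]:
            engaged_to[r] = p                     # r trades up
            free.append(current)
        else:
            free.append(p)                        # r rejects p
    return {p: r for r, p in engaged_to.items()}

def is_stable(matching: Dict[str, str],
              proposer_prefs: Dict[str, List[str]],
              receiver_prefs: Dict[str, List[str]]) -> bool:
    """Check the criterion directly: True if there is no blocking pair."""
    partner_of = {r: p for p, r in matching.items()}
    for p, prefs in proposer_prefs.items():
        for r in prefs:
            if r == matching[p]:
                break  # remaining receivers are ones p likes less
            r_list = receiver_prefs[r]
            if r_list.index(p) < r_list.index(partner_of[r]):
                return False  # p and r both prefer each other
    return True

# Toy preference lists (assumptions for illustration).
proposer_prefs = {"a": ["x", "y", "z"], "b": ["x", "z", "y"], "c": ["y", "x", "z"]}
receiver_prefs = {"x": ["b", "a", "c"], "y": ["a", "c", "b"], "z": ["a", "b", "c"]}

matching = gale_shapley(proposer_prefs, receiver_prefs)
print(matching, is_stable(matching, proposer_prefs, receiver_prefs))
```

The point of the analogy is the split between `is_stable` and `gale_shapley`: the first states what a good answer must satisfy without assigning any score, and the second is an algorithm that can be proven to meet it. A criteria-based treatment of moral uncertainty would need its own counterpart to `is_stable` before any algorithm could be judged against it.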
So one plan of attack on the moral uncertainty problem might be:
Generate a bunch of "problem descriptions" for moral uncertainty that specify a set of criteria to satisfy/optimize.
Figure out which "problem description" best fits our intuitions about how moral uncertainty should be solved.
Find an algorithm that provably solves the problem as specified in its description.
Thanks to ESRogs, Stefan_Schubert, and the Effective Altruism summit for the discussion that led to this post!
This post is to test out Polymath-style collaboration on LW. The problem we've chosen to try is formalizing and analyzing Bostrom and Ord's "Parliamentary Model" for dealing with moral uncertainty.
I'll first review the Parliamentary Model, then give some of Polymath's style suggestions, and finally suggest some directions that the conversation could take.
The Parliamentary Model
The Parliamentary Model is an under-specified method of dealing with moral uncertainty, proposed in 2009 by Nick Bostrom and Toby Ord. Reposting Nick's summary from Overcoming Bias:
In a comment, Bostrom continues:
It's an interesting idea, but clearly there are a lot of details to work out. Can we formally specify the kinds of negotiation that delegates can engage in? What about blackmail or prisoners' dilemmas between delegates? In what ways does this proposed method outperform other ways of dealing with moral uncertainty?
I was discussing this with ESRogs and Stefan_Schubert at the Effective Altruism summit, and we thought it might be fun to throw the question open to LessWrong. In particular, we thought it'd be a good test problem for a Polymath-project-style approach.
How to Polymath
The Polymath comment style suggestions are not so different from LW's, but numbers 5 and 6 are particularly important. In essence, they point out that the idea of a Polymath project is to split up the work into minimal chunks among participants, and to get most of the thinking to occur in comment threads. This is as opposed to a process in which one community member goes off for a week, meditates deeply on the problem, and produces a complete solution by themselves. Polymath rules 5 and 6 are instructive:
It seems to us as well that an important part of the Polymath style is to have fun together and to use the principle of charity liberally, so as to create a space in which people can safely be wrong, point out flaws, and build up a better picture together.
Our test project
If you're still reading, then I hope you're interested in giving this a try. The overall goal is to clarify and formalize the Parliamentary Model, and to analyze its strengths and weaknesses relative to other ways of dealing with moral uncertainty. Here are the three most promising questions we came up with:
The original OB post had a couple of comments that I thought were worth reproducing here, in case they spark discussion, so I've posted them.
Finally, if you have meta-level comments on the project as a whole instead of Polymath-style comments that aim to clarify or solve the problem, please reply in the meta-comments thread.