Complexity of value is the thesis that our preferences, the things we care about, don't compress down to one simple rule, or a few simple rules. To review why it's important (by quoting from the wiki):
- Caricatures of rationalists often have them moved by artificially simplified values - for example, only caring about personal pleasure. This becomes a template for arguing against rationality: X is valuable, but rationality says to only care about Y, in which case we could not value X, therefore do not be rational.
- Underestimating the complexity of value leads to underestimating the difficulty of Friendly AI; and there are notable cognitive biases and fallacies which lead people to underestimate this complexity.
I certainly agree with both of these points. But I worry that we (at Less Wrong) might have swung a bit too far in the other direction. No, I don't think that we overestimate the complexity of our values, but rather there's a tendency to assume that complexity of value must lead to complexity of outcome, that is, agents who faithfully inherit the full complexity of human values will necessarily create a future that reflects that complexity. I will argue that it is possible for complex values to lead to simple futures, and explain the relevance of this possibility to the project of Friendly AI.
The easiest way to make my argument is to start by considering a hypothetical alien with all of the values of a typical human being, but also an extra one. His fondest desire is to fill the universe with orgasmium, which he considers to have orders of magnitude more utility than realizing any of his other goals. As long as his dominant goal remains infeasible, he's largely indistinguishable from a normal human being. But if he happens to pass his values on to a superintelligent AI, the future of the universe will turn out to be rather simple, despite those values being no less complex than any human's.
The above possibility is easy to reason about, but perhaps does not appear very relevant to our actual situation. I think that it may be, and here's why. All of us have many different values that do not reduce to each other, but most of those values do not appear to scale very well with available resources. In other words, among our manifold desires, there may only be a few that are not easily satiated when we have access to the resources of an entire galaxy or universe. If so (and assuming we aren't wiped out by an existential risk and don't fall into a Malthusian scenario), the future of our universe will be shaped largely by those values that do scale. (I should point out that in this case the universe won't necessarily turn out to be mostly simple. Simple values do not necessarily lead to simple outcomes either.)
Now if we were rational agents who had perfect knowledge of our own preferences, then we would already know whether this is the case or not. And if it is, we ought to be able to visualize what the future of the universe would look like if we had the power to shape it according to our desires. But I find myself uncertain on both questions. Still, I think this possibility is worth investigating further. If it were the case that only a few of our values scale, then we could potentially obtain almost all that we desire by creating a superintelligence with just those values. And perhaps this can be done manually, bypassing an automated preference extraction or extrapolation process with its associated difficulties and dangers. (To head off a potential objection, this does assume that our values interact in an additive way. If there are values that don't scale but interact nonlinearly (multiplicatively, for example) with values that do scale, then those would need to be included as well.)
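To make the additive-versus-multiplicative caveat concrete, here is a toy sketch. Everything in it is an assumption made for illustration, not a claim about real human values: one satiating value of the form 1 - exp(-r), one scaling value that is linear in resources with an arbitrary weight of 0.01, and the hypothetical helper names `best_utility` and `fraction_captured`. It compares what an agent pursuing only the scaling value achieves against the best achievable split of resources, judged by the full utility function.

```python
import math

def satiating(r):
    """A value that saturates: it approaches 1 however many resources it gets."""
    return 1.0 - math.exp(-r)

def scaling(r, weight=0.01):
    """A value that keeps growing in proportion to resources (weight is arbitrary)."""
    return weight * r

def additive(r_sat, r_scale):
    """The two values simply add."""
    return satiating(r_sat) + scaling(r_scale)

def multiplicative(r_sat, r_scale):
    """The satiating value multiplies the scaling one."""
    return satiating(r_sat) * scaling(r_scale)

def best_utility(utility, total, iters=200):
    """Ternary search over how much of `total` goes to the satiating value.

    Assumes the utility is concave in the split, which holds for both toy
    utilities defined above.
    """
    lo, hi = 0.0, total
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if utility(m1, total - m1) < utility(m2, total - m2):
            lo = m1
        else:
            hi = m2
    x = (lo + hi) / 2
    return utility(x, total - x)

def fraction_captured(utility, total):
    """Share of the optimum obtained by an agent that pursues only the scaling value."""
    return utility(0.0, total) / best_utility(utility, total)

if __name__ == "__main__":
    for total in (10.0, 1e3, 1e6):
        print(total,
              fraction_captured(additive, total),
              fraction_captured(multiplicative, total))
```

Under these assumptions, the scaling-only agent captures a small share of the additive optimum when resources are scarce and a share close to 1 when they are vast, while in the multiplicative case it captures nothing at any scale, because the neglected value multiplies everything else by zero.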
There are a lot of posts here that presuppose some combination of moral anti-realism and value complexity. These views go together well: if value is not fundamental, but dependent on characteristics of humans, then it can derive complexity from this and not suffer due to Occam's Razor.
There is another pair of views that go together well: moral realism and value simplicity. Many posts here strongly dismiss these views, effectively allocating near-zero probability to them. I want to point out that this is a case of non-experts being very much at odds with expert opinion and being clearly overconfident. In the PhilPapers survey, for example, 56.3% of philosophers accept or lean towards realism, while only 27.7% accept or lean towards anti-realism.
http://philpapers.org/surveys/results.pl
Given this, and given comments from people like me in the intersection of the philosophical and LW communities who can point out that it isn't a case of stupid philosophers supporting realism and all the really smart ones supporting anti-realism, there is no way that the LW community should have anything like the confidence that it does on this point.
Moreover, I should point out that most of the realists lean towards naturalism, which allows a form of realism that is very different to the one that Eliezer critiques. I should also add that within philosophy, the trend is probably not towards anti-realism, but towards realism. The high tide of anti-realism was probably in the middle of the 20th Century, and since then it has lost its shiny newness and people have come up with good arguments against it (which are never discussed here...).
Even for experts in meta-ethics, I can't see how their confidence can get outside the 30%-70% range given the expert disagreement. For non-experts, I really can't see how one could even get to 50% confidence in anti-realism, much less the kind of 98% confidence that is typically expressed here.
The right response to moral realism isn't to dispute its truth but to simply observe that you don't understand the concept.
I mean, imagine someone started going around insisting that some situations were Heret and others were Grovic, but when asked to explain what made a situation Heret or Grovic he simply shrugged and said they were primitive concepts. But you persist, and after observing his behavior for a period of time you work out some principle that perfectly predicts which category he will assign a given situation to, even counterfactually, but when you present...