It's confusing how the term "realism" is used when applied to ethics, which I think obscures a natural position relevant to alignment. Realism about mathematical objects (small-p platonism?) doesn't say that there is One True Mathematical Object to be discovered by all civilizations; instead, there are many objects that exist, governing the truth of propositions about them. When discussing a mathematical question, we first produce references to some objects, in order to locate them, and then build models of situations that involve them, to understand how these objects work and to formulate claims that are true of them. The references depend on the mathematician's choice of topic or problems, while the truth of the models, given the objects, doesn't depend on the references and hence doesn't depend on the mathematician. The dependence involves two steps: first there are references, which reside in the mathematician; then there are mathematical objects singled out using the references; and finally there are models that again reside in the mathematician, determined by the objects. Even though the models depend on the references, the references are screened off by the objects, so given the objects, the models no longer depend on the references.
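To make the screening-off structure concrete, here is a minimal sketch in Python; the references, objects, and toy "model" below are illustrative inventions rather than anything from the comment above. The point is only that the model is computed from the object alone, so two different references that single out the same object determine the same model.

```python
# A minimal, purely illustrative sketch (hypothetical names) of the
# screening-off claim above: a reference picks out an object, and a "model"
# is computed from the object alone, so given the object, the model no
# longer depends on which reference was used.

def object_for(reference: str) -> int:
    """The object singled out by a reference (this step depends on the mathematician)."""
    return {"the smallest even prime": 2, "the ratio 4/2": 2}[reference]

def model_of(obj: int) -> dict:
    """Claims true of the object (this step depends only on the object)."""
    return {"is_even": obj % 2 == 0, "is_positive": obj > 0}

# Two different references that pick out the same object yield the same model:
assert model_of(object_for("the smallest even prime")) == model_of(object_for("the ratio 4/2"))
```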
This straightforwardly applies to ethics, except that unlike in mathematics, the references are typically vague. The resulting position is realist in the sense that it treats moral facts (objects) as real, mind-independent entities governing the truth of propositions about them; but the choice of which moral objects to consider depends on the agent, which is usually called an anti-realist position, making it difficult to fit into a realist/anti-realist narrative. Here, the results of considering the moral objects are not necessarily intuitively transparent: their models can be unlike the references that singled them out for consideration, and the correctness of the models doesn't depend on the attitude of the agent. It's determined by the moral objects themselves; their origin in references within the agent is screened off.
This position is, according to the post's own taxonomy, the only one not discussed in the post! Here, what you should do depends on your current values, yet ideal understanding need not bring your values into harmony with what you should do. That is, a probable outcome is alienation from what you should do, despite what you should do being determined by what you currently value.
(Cross-posted from Hands and Cities)
In a previous post, I tried to gesture at the possibility of a certain kind of wholeheartedness in ethical life. In this post, I want to examine a question about meta-ethics that seems to me important to this wholeheartedness: namely, whether you can be completely alienated from what you should do. By this I mean: is it possible for what you should do to hold no appeal whatsoever, even given arbitrarily deep and full understanding of reality and of the consequences of your actions? I discuss how three different meta-ethical views approach this question, and why the differences matter for wholeheartedness.
(Note: this post plays fast and loose with a number of philosophical issues that it would take a lot of extra work to tease apart and pin down precisely; my aim is mostly just to gesture at a general picture that I think is important, not to set it out rigorously).
(Second note: the types of “shoulds” and “normative facts” I have in mind here are the ones of all-things-considered practical deliberation – deliberation that encompasses both moral and non-moral considerations. Thus, the questions of whether I should eat this cupcake, go back to grad school, clip my toenails, etc., are all normative questions in this sense, but not necessarily moral ones).
I.
On some versions of normative realism, the answer to the question above is yes: it might turn out that what you should do, according to the true normativity written into the fabric of reality, is fully and forever alien from anything you or any other human cares about. It might be, for example, that there is an overriding moral imperative to maximize the amount of helium gas, or to paint your mother a specific shade of pastel blue, or to eat a prime number of bricks every other Tuesday. Future philosophers might discover this fact (though given their current reliance on intuitions closely tied to actual patterns of human concern, their prospects for finding the truth look slim); but their discovery, even fully comprehended, wouldn’t move them to action. Rather, the human species as a whole would be afflicted with a kind of totalized akrasia: “I know I should eat this brick; I’ve seen the proof, it’s airtight; but no matter how much I reflect and imagine, I just can’t get motivated.”
Or perhaps the better analogy is species-wide sociopathy, an irredeemable indifference to what really matters, that summum bonum, helium. “How can you just sit there, letting these precious opportunities to make helium slip away?!” morally-attuned aliens might say, aghast at our wasteful games, our loves and joys. But we would persist, numb to (but not ignorant of) the gravity of our wrongs.
Or maybe we wouldn’t persist. Maybe we (or some of us) would start dutifully making helium, motivated by the bare fact that doing so is right, good, what we should do, even though entirely un-motivated, at least directly, by what this turns out to consist in. Perhaps we would try to find things we like about helium, or to raise our children to like helium, or, failing that, to rewire our brains. It’s a hard scenario to imagine, partly because it’s so unclear what sort of discovery could prove helium the summum bonum, without also at least helping us to see why making helium might be at all appealing (my girlfriend suggests imagining that the discovery came via some sort of convoluted argument about anthropics — “if we endorse the Self-Sampling Assumption, and the Great Filter is behind us, then it must be the case that in the infinite universe of mathematical structures, all of the logically impossible civilizations valued…” — but even the surprises and confusions of anthropics don’t seem up to the task).
Indeed, it’s tempting to imagine the dutiful helium makers saying something like “Well, I can’t for the life of me see it myself, but apparently making helium is the thing. Weird, right? Yeah, I thought so too. But so it goes: let’s get on with it.” That is, it’s tempting to imagine them thinking of themselves as not understanding something about helium — something they have to, as it were, “defer to normative reality” about. A feature of the set-up, though, is supposed to be that this isn’t the case. They understand everything there is to understand; they have seen through to the core of reality, and reality, it turns out, is really into helium. But they, unfortunately, aren’t.
Maybe the set-up is confused in this respect. In particular, it assumes that there’s an available notion of “understanding” such that you can understand fully not just that you should do X, but why you should do X, and still not be at all motivated to do X (though note that in certain cases, on many realist views, there won’t be a further “why” — it will just be a brute fact that X is good, obligatory, reason-providing, and so forth). That is, in this case, motivation is orthogonal to normative understanding. Normative facts are like facts about which minor league batters have the best batting average: whether you know/understand them is one thing; whether you care, entirely another. Call this “externalist normative realism.”
II.
Not all normative realism need take this form. To the contrary: the most intuitive and naive form of normative realism, I think, treats normative facts, understood fully, as necessarily compelling to our motivational capacities. That is, on this view, if helium is ultimately the highest good, then adequate understanding of reality ought to allow us to see what’s good about it in a manner that makes it actually seem good; the morally-attuned aliens ought, with enough patience and non-coercive effort, to be able to bring us around, such that later, in deciding whether to make more helium, or to help an old dog trapped in a ditch, we’ll be able to return, internally, to the rational source of our allegiance to helium — to that perception of helium’s value that we had when we first really saw what the aliens were talking about — and then to choose with emotional clarity. On this conception, that is, it’s not an open question whether Plato’s cave-dwellers, upon emerging to stand in the raw and shining light of the Good, will turn back with an indifferent shrug. Not, at least, if their eyes were really open: love and true-seeing are too intimately linked. Call this “internalist normative realism.”
This is the type of normative realism that really calls to me at an intuitive level; and I expect that it’s the type of thing that many people hope for out of a realist meta-ethic (though it’s not, I think, the thing they are likely to ultimately get, at least not from analytic meta-ethics). It’s also, I think, the one most continuous with a realist’s desire to treat normative facts as similar in metaphysical status to mathematical and scientific facts, while preserving the intimate connection between normative understanding and motivation. For just as we expect creatures capable of understanding the universe to converge on similar mathematical and scientific facts, so too does this form of realism tend (at least in its most naive form) to expect them to converge both in their understanding of the normative facts, and in the patterns of care and concern that flow from this understanding. In the great intergalactic meeting of mature and intelligent civilizations, on this view, we will all be able to share stories about how we started out valuing X, Y, or Z old thing, for whatever random evolutionary reason, but that gradually, as our wisdom and enlightenment expanded, we started to discern the contours of the great Way, the deep compellingness of which became clearer and clearer as the haze of ignorance and folly fell away, until we all arrived, here, in cosmic harmony, all with our helium-making machines, having seen the Truth and followed it.
(I’m planning to talk more about expectations of normative convergence in a future post. In particular, I think realism of this broad type is in tension with Nick Bostrom’s Orthogonality Thesis, at least in the form most relevant to thinking about AI risk — but that the Orthogonality Thesis is probably correct. Note, though, that it’s also possible to expect a great deal of convergence in both normative understanding and patterns of concern even on various anti-realist views).
A general problem with internalist normative realism, though, is that it’s really hard to make sense of. In particular, I think, it’s hard to see what sort of epistemic access we could have to the normative facts, and what sorts of facts they could be, such that suitably comprehensive epistemic contact with them necessarily reshapes our motivations, via a kind of rational causation, to conform with normative reality. Exactly what makes this picture hard to make out is a bit tricky to pin down, but I think the general problem is something to do with Humean intuitions to the effect that facts about the universe — of which normative facts are, on this form of realism, a species — are by themselves motivationally inert; they only motivate if you already have some pre-existing motivation to which they connect. But on internalist realism, these facts can somehow intervene, via your rational understanding, to reconfigure your motivational system entirely, regardless of what concerns you started with. Internalist realists have a variety of replies available to this, but I think something in this vicinity remains an important objection, and I tend to think that the best way to be this type of realist is to basically just admit that we lack even a rudimentary picture of how this kind of rational communion with the normative facts works, and how it fits with our empirical picture of human psychology and cognition.
III.
Enter our third meta-ethical view, which I’ll call “internalist anti-realism” (though note that there are lots of other forms of anti-realism). On this view, facts about what you should do are in some sense dependent on your existing patterns of care and concern, rather than being independent facts about reality itself (exactly how to tease apart realist and anti-realist views in this respect is a bit tricky — but again, I’m playing fast and loose here). The versions of anti-realism that I take most seriously here usually appeal to some kind of epistemic idealization procedure — for example, they appeal to what you would care about, or what you would want yourself to do (the counterfactual here can get complicated), if you had arbitrarily full and vivid understanding of reality, the consequences of your actions, etc. Indeed, this sort of idealization procedure closely resembles the type of idealized understanding that the internalist realist also expects to result in correct normative understanding and motivations. That is, on both views, people with ideal understanding of reality end up with the right patterns of concern, and in this sense, both can capture an intuitive connection between understanding and motivation. On internalist anti-realism, though, this connection arises because the right patterns of concern just are whichever ones you would end up with given ideal understanding; whereas on internalist realism, the correct patterns of concern are fixed independently of the concerns you started out with, and ideal understanding necessarily leads you to the True Way.
A central objection to internalist anti-realism, at least in its naive forms, is that it ends up endorsing claims in the vicinity of “an ideally coherent paperclip maximizing AI system should maximize paperclips, even if it means killing Bob” or “an ideally coherent pain maximizer should maximize pain.” There are various more complicated moves you can make to try to dodge these sorts of implications — for example, you can try for theories of normative language that make it true in our mouths that “the paperclipper shouldn’t kill Bob,” but true in the paperclipper’s mouth that “I should kill Bob” — and of course, you can endorse (at least certain types of) human attempts to restrain/punish/shame/deter the paperclipper (or to try to deceive it about meta-ethics). But in a basic sense — which persists, for me, upon engagement with these dodges — a certain kind of realist hope that Clippy can be revealed as mistaken about its basic reasons for action has been lost. You can fight with Clippy; you can trade with Clippy; but it won’t be the case that in the eyes of the universe (and not just in your own eyes), Clippy cares about the wrong things, and you care about the right things. For this and other reasons, various people resist this type of anti-realism.
IV.
Simplifying somewhat, then, we can think of the three different views I’ve discussed via the following framework (though note that there are many more problems with these views than the ones I’ve discussed):

(1) Externalist normative realism: what you should do is independent of what you or anyone else cares about, and even full understanding of it need not motivate you at all.

(2) Internalist normative realism: what you should do is independent of what you currently care about, but ideal understanding of it necessarily brings your motivations into line with it.

(3) Internalist anti-realism: what you should do is fixed by what you would care about given ideal understanding, so ideal understanding and right motivation coincide by definition.
It’s tempting, in thinking about meta-ethics, to focus primarily on the distinction between realist and anti-realist views — that is, to draw the important line between (1) and (2) on the one hand vs. (3) on the other. But I find that in my own life, it’s actually the line between (1) vs. (2) and (3) that matters most. This is because (2) and (3) both ground a certain kind of wholeheartedness in simultaneously pursuing the projects of trying to see the world truly, trying to do what you should do, and trying to protect and promote what the wisest version of yourself would care about most deeply. That is, on (2) or (3), what you should ultimately do, and what the wisest version of yourself would want you to do, are in harmony. Wise-you and normativity are always on the same side.
On (1), by contrast, this isn’t so. You might wake up, at the end of all your inquiry, having understood the truth as fully as possible, and still find yourself fundamentally alienated from the values that should govern your action. You might do what you should do, but you won’t ever deeply see why it’s worth doing; the values at stake will always seem external, outside, other; you’ll never be able to endorse and act on them with full sincerity.
V.
Here’s a (strange) example that might illustrate the difference. Suppose that after you die, you find yourself in Hell, being held in a cage by demon guards. These guards inform you that you’ve done great wrongs, that it’s God’s Will that you endure this confinement for some unspecified and possibly infinite period of time, and His Will, also, that you don’t resist or try to escape.
You can’t, from your current perspective, really see that this is a just punishment. But on (2) and (3), you know that whatever it is you should do — whether it’s to endure the punishment, or to try to escape — you would endorse it wholeheartedly if you really understood the situation. Perhaps full understanding would reveal that God’s Will is in fact truly just and right, that you should be punished in this way, and that wise-you would want you to cooperate humbly with your captors. Or perhaps wise-you would understand that in fact, you and God are in a fundamentally adversarial relationship, and that what you should do is resist Him, subvert His plans, try to free the other captives, etc. Either way, you can consider your options wholeheartedly, confident that your wisest self is on the side of making the right choice.
What’s ruled out by (2) and (3), but not by (1), is the possibility that (a) you should endure the punishment without trying to resist or escape, and (b) you’d never be able to comprehend why, even in the limits of wisdom and clarity. That God’s Will is right and true, but what makes it right and true will remain fundamentally and forever alien from you, the sinner, regardless of the understanding you achieve.
Of course, your meta-ethical convictions may prove cold comfort as the days in the cell wear on (though I actually think it would make a big difference to me, if I found myself in Hell, if I expected to endorse my being there on reflection). But I actually think that the difference between (1) vs. (2) and (3) is important in much more mundane cases. I’ll leave that, though, for another post.
There are also important questions about the possibility of being alienated from the wisest version of yourself, even if that version and normativity are in harmony. I’m generally optimistic, though, about finding routes to harmony, cooperation, and loyalty between you and wise-you, despite the difference in wisdom — since the two of you are ultimately, I think, on the same side (and indeed, the same person). But I’ll leave that for another post as well.