Years ago, Eliezer1999 was convinced that he knew nothing about morality.
For all he knew, morality could require the extermination of the human species; and if so, he saw no virtue in taking a stand against morality, because he thought that, by definition, if he postulated that moral fact, then human extinction was what "should" be done.
I thought I could figure out what was right, perhaps, given enough reasoning time and enough facts, but that I currently had no information about it. I could not trust evolution, which had built me. What foundation did that leave on which to stand?
Well, indeed Eliezer1999 was massively mistaken about the nature of morality, so far as his explicitly represented philosophy went.
But as Davidson once observed, if you believe that "beavers" live in deserts, are pure white in color, and weigh 300 pounds when adult, then you do not have any beliefs about beavers, true or false. You must get at least some of your beliefs right, before the remaining ones can be wrong about anything.
My belief that I had no information about morality was not internally consistent.
Saying that I knew nothing felt virtuous, for I had once been taught that it was virtuous to confess my ignorance. "The only thing I know is that I know nothing," and all that. But in this case I would have been better off considering the admittedly exaggerated saying, "The greatest fool is the one who is not aware they are wise." (This is nowhere near the greatest kind of foolishness, but it is a kind of foolishness.)
Was it wrong to kill people? Well, I thought so, but I wasn't sure; maybe it was right to kill people, though that seemed less likely.
What kind of procedure would answer whether it was right to kill people? I didn't know that either, but I thought that if you built a generic superintelligence (what I would later label a "ghost of perfect emptiness") then it could, you know, reason about what was likely to be right and wrong; and since it was superintelligent, it was bound to come up with the right answer.
The problem that I somehow managed not to think too hard about was where the superintelligence would get the procedure that discovered the procedure that discovered the procedure that discovered morality—if I couldn't write it into the start state that wrote the successor AI that wrote the successor AI.
As Marcello Herreshoff later put it, "We never bother running a computer program unless we don't know the output and we know an important fact about the output." If I knew nothing about morality, and did not even claim to know the nature of morality, then how could I construct any computer program whatsoever—even a "superintelligent" one or a "self-improving" one—and claim that it would output something called "morality"?
There are no-free-lunch theorems in computer science—in a maxentropy universe, no plan is better on average than any other. If you have no knowledge at all about "morality", there's also no computational procedure that will seem more likely than others to compute "morality", and no meta-procedure that's more likely than others to produce a procedure that computes "morality".
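To see the flavor of that claim in miniature, here is a toy Python sketch of my own (the three-point domain, the scores, and the two query orders are invented purely for illustration, not taken from any formal statement of the theorems): averaged over every possible assignment of scores to the search space, no fixed query order finds good points any faster than any other.

```python
from itertools import product

DOMAIN = [0, 1, 2]      # a tiny "search space"
SCORES = [0, 1]         # possible values of the unknown objective

def best_after_k(f, order, k):
    """Best score seen after evaluating the first k points in `order`."""
    return max(f[x] for x in order[:k])

# Every possible objective function f: DOMAIN -> SCORES.
all_functions = [dict(zip(DOMAIN, values))
                 for values in product(SCORES, repeat=len(DOMAIN))]

order_a = [0, 1, 2]     # one fixed, non-repeating query order
order_b = [2, 0, 1]     # a different fixed, non-repeating query order

for k in range(1, len(DOMAIN) + 1):
    avg_a = sum(best_after_k(f, order_a, k) for f in all_functions) / len(all_functions)
    avg_b = sum(best_after_k(f, order_b, k) for f in all_functions) / len(all_functions)
    print(f"after {k} evaluations: {avg_a:.3f} vs {avg_b:.3f}")  # always equal
```

The same symmetry is what makes "knowing nothing about morality" fatal: if every possible answer is equally compatible with what you know, every candidate procedure for finding the answer looks exactly as good as every other.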
I thought that surely even a ghost of perfect emptiness, finding that it knew nothing of morality, would see a moral imperative to think about morality.
But the difficulty lies in the word think. Thinking is not an activity that a ghost of perfect emptiness is automatically able to carry out. Thinking requires running some specific computation that is the thought. For a reflective AI to decide to think, requires that it know some computation which it believes is more likely to tell it what it wants to know, than consulting an Ouija board; the AI must also have a notion of how to interpret the output.
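Here is a deliberately crude Python sketch of that point (the agent, its procedures, and the numbers are all invented for illustration, not a real architecture): an agent can only "decide to think" if something already baked into it says which computation is more likely to be informative than an Ouija board, and how to read the result.

```python
import random

def ouija_board(question):
    """Baseline: an answer that carries no information about the question."""
    return random.choice(["yes", "no"])

def deliberate(question):
    """Some specific computation the agent could run instead of the board."""
    return "yes" if "joy" in question else "no"

# Prior beliefs, baked in before any thinking happens, about how likely each
# procedure is to answer correctly. Without some such prior, "deciding to
# think" has nothing to decide with.
prior_reliability = {ouija_board: 0.5, deliberate: 0.9}

def decide_to_think(question):
    best = max(prior_reliability, key=prior_reliability.get)
    if prior_reliability[best] <= prior_reliability[ouija_board]:
        return None  # no known computation beats the Ouija board
    return best(question)

print(decide_to_think("Is joy preferable to sorrow, other things being equal?"))
```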
If one knows nothing about morality, what does the word "should" mean, at all? If you don't know whether death is right or wrong—and don't know how you can discover whether death is right or wrong—and don't know whether any given procedure might output the procedure for saying whether death is right or wrong—then what do these words, "right" and "wrong", even mean?
If the words "right" and "wrong" have nothing baked into them—no starting point—if everything about morality is up for grabs, not just the content but the structure and the starting point and the determination procedure—then what is their meaning? What distinguishes, "I don't know what is right" from "I don't know what is wakalixes"?
A scientist may say that everything is up for grabs in science, since any theory may be disproven; but then they have some idea of what would count as evidence that could disprove the theory. Could there be something that would change what a scientist regarded as evidence?
Well, yes, in fact; a scientist who read some Karl Popper and thought they knew what "evidence" meant, could be presented with the coherence and uniqueness proofs underlying Bayesian probability, and that might change their definition of evidence. They might not have had any explicit notion, in advance, that such a proof could exist. But they would have had an implicit notion. It would have been baked into their brains, if not explicitly represented therein, that such-and-such an argument would in fact persuade them that Bayesian probability gave a better definition of "evidence" than the one they had been using.
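For reference, the Bayesian criterion of evidence at stake here is the standard one (the formula is added here, not quoted from the post): E counts as evidence for a hypothesis H exactly when observing E raises the probability of H, which Bayes's theorem ties to whether H made E more likely than it otherwise was.

```latex
% Standard Bayesian criterion of evidence (added for reference).
\[
  P(H \mid E) \;=\; \frac{P(E \mid H)\,P(H)}{P(E)},
  \qquad\text{hence}\qquad
  P(H \mid E) > P(H) \;\Longleftrightarrow\; P(E \mid H) > P(E).
\]
```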
In the same way, you could say, "I don't know what morality is, but I'll know it when I see it," and make sense.
But then you are not rebelling completely against your own evolved nature. You are supposing that whatever has been baked into you to recognize "morality", is, if not absolutely trustworthy, then at least your initial condition with which you start debating. Can you trust your moral intuitions to give you any information about morality at all, when they are the product of mere evolution?
But if you discard every procedure that evolution gave you and all its products, then you discard your whole brain. You discard everything that could potentially recognize morality when it sees it. You discard everything that could potentially respond to moral arguments by updating your morality. You even unwind past the unwinder: you discard the intuitions underlying your conclusion that you can't trust evolution to be moral. It is your existing moral intuitions that tell you that evolution doesn't seem like a very good source of morality. What, then, will the words "right" and "should" and "better" even mean?
Humans do not perfectly recognize truth when they see it, and hunter-gatherers do not have an explicit concept of the Bayesian criterion of evidence. But all our science and all our probability theory were built on top of a chain of appeals to our instinctive notion of "truth". Had this core been flawed, there would have been nothing we could do in principle to arrive at the present notion of science; the notion of science would have just sounded completely unappealing and pointless.
One of the arguments that might have shaken my teenage self out of his mistake, if I could have gone back in time to argue with him, was the question:
Could there be some morality, some given rightness or wrongness, that human beings do not perceive, do not want to perceive, will not see any appealing moral argument for adopting, nor any moral argument for adopting a procedure that adopts it, etcetera? Could there be a morality, and ourselves utterly outside its frame of reference? But then what makes this thing morality—rather than a stone tablet somewhere with the words 'Thou shalt murder' written on it, with absolutely no justification offered?
So all this suggests that you should be willing to accept that you might know a little about morality. Nothing unquestionable, perhaps, but an initial state with which to start questioning yourself. Baked into your brain but not explicitly known to you, perhaps; but still, that which your brain would recognize as right is what you are talking about. You will accept at least enough of the way you respond to moral arguments as a starting point, to identify "morality" as something to think about.
But that's a rather large step.
It implies accepting your own mind as identifying a moral frame of reference, rather than all morality being a great light shining from beyond (that in principle you might not be able to perceive at all). It implies accepting that even if there were a light and your brain decided to recognize it as "morality", it would still be your own brain that recognized it, and you would not have evaded causal responsibility—or evaded moral responsibility either, on my view.
It implies dropping the notion that a ghost of perfect emptiness will necessarily agree with you, because the ghost might occupy a different moral frame of reference, respond to different arguments, be asking a different question when it computes what-to-do-next.
And if you're willing to bake at least a few things into the very meaning of this topic of "morality", this quality of rightness that you are talking about when you talk about "rightness"—if you're willing to accept even that morality is what you argue about when you argue about "morality"—then why not accept other intuitions, other pieces of yourself, into the starting point as well?
Why not accept that, ceteris paribus, joy is preferable to sorrow?
You might later find some ground within yourself or built upon yourself with which to criticize this—but why not accept it for now? Not just as a personal preference, mind you; but as something baked into the question you ask when you ask "What is truly right"?
But then you might find that you know rather a lot about morality! Nothing certain—nothing unquestionable—nothing unarguable—but still, quite a bit of information. Are you willing to relinquish your Socratic ignorance?
I don't argue by definitions, of course. But if you claim to know nothing at all about morality, then you will have problems with the meaning of your words, not just their plausibility.