This is part of a semi-monthly reading group on Eliezer Yudkowsky's ebook, Rationality: From AI to Zombies. For more information about the group, see the announcement post.
Welcome to the Rationality reading group. This fortnight we discuss Part V: Value Theory (pp. 1359-1450). This post summarizes each article of the sequence, linking to the original LessWrong post where available.
V. Value Theory
264. Where Recursive Justification Hits Bottom - Ultimately, when you reflect on how your mind operates, and consider questions like "why does Occam's Razor work?" and "why do I expect the future to be like the past?", you have no other option but to use your own mind. There is no way to jump to an ideal state of pure emptiness and evaluate these claims without using your existing mind.
265. My Kind of Reflection - A few key differences between Eliezer Yudkowsky's ideas on reflection and the ideas of other philosophers.
266. No Universally Compelling Arguments - Because minds are physical processes, it is theoretically possible to specify a mind which draws any conclusion in response to any argument. There is no argument that will convince every possible mind.
267. Created Already in Motion - There is no computer program so persuasive that you can run it on a rock. A mind, in order to be a mind, needs some sort of dynamic rules of inference or action. A mind has to be created already in motion.
268. Sorting Pebbles into Correct Heaps - A parable about an imaginary society that has arbitrary, alien values.
269. 2-Place and 1-Place Words - It is possible to talk about "sexiness" as a property of an observer and a subject. It is equally possible to talk about "sexiness" as a property of a subject alone, as long as each observer can have a different process for determining how sexy someone is. Failing to do either of these will cause you trouble.
270. What Would You Do Without Morality? - If your own theory of morality was disproved, and you were persuaded that there was no morality, that everything was permissible and nothing was forbidden, what would you do? Would you still tip cabdrivers?
271. Changing Your Metaethics - Discusses the various lines of retreat that have been set up in the discussion on metaethics.
272. Could Anything Be Right? - You do know quite a bit about morality. It's not perfect information, surely, or absolutely reliable, but you have someplace to start. If you didn't, you'd have a much harder time thinking about morality than you do.
273. Morality as Fixed Computation - A clarification about Yudkowsky's metaethics.
274. Magical Categories - We underestimate the complexity of our own unnatural categories. Relying on such categories doesn't work when you're trying to build an FAI.
275. The True Prisoner's Dilemma - The standard visualization for the Prisoner's Dilemma doesn't really work on humans. We can't pretend we're completely selfish.
276. Sympathetic Minds - Mirror neurons are neurons that fire both when performing an action oneself and when watching someone else perform the same action - for example, a neuron that fires when you raise your hand or watch someone else raise theirs. We predictively model other minds by putting ourselves in their shoes, which is empathy. But some of our desire to help relatives and friends, or to be concerned with the feelings of allies, is expressed as sympathy: feeling what (we believe) they feel. Like "boredom", the human form of sympathy would not be expected to arise in an arbitrary expected-utility-maximizing AI. Most such agents would regard any other agents in their environment as a special case of complex systems to be modeled or optimized; they would not feel what those agents feel.
277. High Challenge - Life should not always be made easier for the same reason that video games should not always be made easier. Think in terms of eliminating low-quality work to make way for high-quality work, rather than eliminating all challenge. One needs games that are fun to play and not just fun to win. Life's utility function is over 4D trajectories, not just 3D outcomes. Values can legitimately be over the subjective experience, the objective result, and the challenging process by which it is achieved - the traveller, the destination and the journey.
278. Serious Stories - Stories and lives are optimized according to rather different criteria. Advice on how to write fiction will tell you that "stories are about people's pain" and "every scene must end in disaster". I once assumed that it was not possible to write any story about a successful Singularity because the inhabitants would not be in any pain; but something about the final conclusion that the post-Singularity world would contain no stories worth telling seemed alarming. Stories in which nothing ever goes wrong are painful to read; would a life of endless success have the same painful quality? If so, should we simply eliminate that revulsion via neural rewiring? Pleasure probably does retain its meaning in the absence of pain to contrast with it; they are different neural systems. The present world has an imbalance between pain and pleasure; it is much easier to produce severe pain than correspondingly intense pleasure. One path would be to address the imbalance and create a world with more pleasures, and free of the more grindingly destructive and pointless sorts of pain. Another approach would be to eliminate pain entirely. I feel like I prefer the former approach, but I don't know if it can last in the long run.
279. Value is Fragile - An interesting universe, one that would be incomprehensible to us today, is what the future looks like if things go right. There are many things humans value such that, if you got everything else right when building an AI but left out that one thing, the future would wind up looking dull, flat, pointless, or empty. Any future not shaped by a goal system with detailed, reliable inheritance from human morals and metamorals will contain almost nothing of worth.
280. The Gift We Give to Tomorrow - How did love ever come into the universe? How did that happen, and how special was it, really?
This has been a collection of notes on the assigned sequence for this fortnight. The most important part of the reading group, though, is the discussion, which takes place in the comments section. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!
The next reading will cover Part W: Quantified Humanism (pp. 1453-1514) and Interlude: The Twelve Virtues of Rationality (pp. 1516-1521). The discussion will go live on Wednesday, 23 March 2016, right here on the discussion forum of LessWrong.
Yes, but I don't think people saying that simple moral theories are too simple are claiming that no theory about any aspect of ethics should be simple. At any rate, insofar as they are, I think they're boringly wrong, and I would prefer to contemplate a less silly version of their position. The more interesting claim is (I think) something more like "no very simple theory can account for all of human values", and I don't see how CEV offers anything like a counterexample to that.
Ahem. You would appear to be right. Therefore, when I've said "Gram_Stone" in the above, please pretend I said something like "those advocating the position that simple moral theories are too simple". As you say, it doesn't particularly matter who said what, but I regret foisting a position on someone who wasn't actually advocating it.
I regret making you sad. I wasn't suggesting any sort of "emotional againstness", though. And I think we actually are disagreeing. For instance, you are arguing that saying "we should reject very simple moral theories because they cannot rightly describe human values" is making a mistake, and that it's the mistake Eliezer was arguing against when he wrote "Say not 'Complexity'". I think saying that is probably not a mistake, and certainly can't be determined to be a mistake simply by recapitulating Eliezer's arguments there. Isn't that a disagreement?
But I take your point, and I too am in the habit of defending things I "disagree" with. I would say then, though, that I am disagreeing with bad arguments against those things -- there is still disagreement, and that's not the same as looking at an issue from multiple directions.
I have a very strong impression that we disagree only insofar as we interpret each other's words to mean something we can argue with.
Just now, you treated my original remark in this way by changing the quoted phrase, which was (when I wrote my comment) "Simple moral theories are too neat to do any real work in moral philosophy" but became (in your version) "simple moral theories cannot rightly describe human values". Notice the difference?
I'm not defending my original comment; it was pretty st...