Secrets of the eliminati

Scott Alexander

Anyone who does not believe mental states are ontologically fundamental - i.e., anyone who denies the reality of something like a soul - has two choices about where to go next. They can try reducing mental states to smaller components, or they can stop talking about them entirely.

In a utility-maximizing AI, mental states can be reduced to smaller components. The AI will have goals, and those goals, upon closer examination, will be lines in a computer program.

But in the blue-minimizing robot, its "goal" isn't even a line in its program. There's nothing that looks remotely like a goal in its programming, and goals appear only when you make rough generalizations from its behavior in limited cases.
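To make the contrast concrete, here is a rough sketch in Python. Everything in it is hypothetical and invented for illustration: the `utility` function, the `blue_objects` field, the `BLUE_THRESHOLD` value, and the action names are all assumptions, not anyone's actual robot. The point is only that in the first agent you can point to the goal, and in the second there is nothing to point to.

```python
from typing import Callable, Dict, Sequence, Tuple

World = Dict[str, int]
Action = Callable[[World], World]


# Hypothetical utility maximizer: the goal is an explicit line of code.
def utility(world: World) -> float:
    # "Fewer blue objects is better" is written down right here.
    return -world["blue_objects"]


def maximizer_choose(world: World, actions: Sequence[Action]) -> Action:
    # Predict each action's outcome and pick the one the goal scores highest.
    return max(actions, key=lambda act: utility(act(world)))


# Hypothetical blue-minimizing robot: a bare stimulus-response rule.
BLUE_THRESHOLD = 200  # assumed cutoff on a 0-255 blue channel


def robot_step(pixel_rgb: Tuple[int, int, int]) -> str:
    # No goal, no prediction, no model of the world: just a reflex.
    _, _, blue = pixel_rgb
    return "fire_laser" if blue > BLUE_THRESHOLD else "keep_scanning"
```

Reading `robot_step`, an observer might summarize it as "wants to minimize blue", but that summary lives in the observer's head; the program itself only compares a number to a threshold.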

Philosophers are still very much arguing about whether this applies to humans; the two schools call themselves reductionists and eliminativists (with a third school of wishy-washy half-and-half people calling themselves revisionists). Reductionists want to reduce things like goals and preferences to the appropriate neurons in the brain; eliminativists want to prove that humans, like the blue-minimizing robot, don't have anything of the sort until you start looking at high-level abstractions.

I took a similar tack answering ksvanhorn's question in yesterday's post - how can you get a more accurate picture of what your true preferences are? I said:

I don't think there are true preferences. In one situation you have one tendency, in another situation you have another tendency, and "preference" is what it looks like when you try to categorize tendencies. But categorization is a passive and not an active process: if every day of the week I eat dinner at 6, I can generalize to say "I prefer to eat dinner at 6", but it would be non-explanatory to say that a preference toward dinner at 6 caused my behavior on each day. I think the best way to salvage preferences is to consider them as tendencies currently in reflective equilibrium.

A more practical example: when people discuss cryonics or anti-aging, the following argument usually comes up in one form or another: if you were in a burning building, you would try pretty hard to get out. Therefore, you must strongly dislike death and want to avoid it. But if you strongly dislike death and want to avoid it, you must be lying when you say you accept death as a natural part of life and think it's crass and selfish to try to cheat the Reaper. And therefore your reluctance to sign up for cryonics violates your own revealed preferences! You must just be trying to signal conformity or something.

The problem is that not signing up for cryonics is also a "revealed preference". "You wouldn't sign up for cryonics, which means you don't really fear death so much, so why bother running from a burning building?" is an equally good argument, although no one except maybe Marcus Aurelius would take it seriously.

Both these arguments assume that somewhere, deep down, there's a utility function with a single term for "death" in it, and all decisions just call upon this particular level of death or anti-death preference.

More explanatory of the way people actually behave is that there's no unified preference for or against death, but rather a set of behaviors. Being in a burning building activates fleeing behavior; contemplating death from old age does not activate cryonics-buying behavior. People guess at their opinions about death by analyzing these behaviors, usually with a bit of signalling thrown in. If they desire consistency - and most people do - maybe they'll change some of their other behaviors to conform to their hypothesized opinion.

One more example. I've previously brought up the case of a rationalist who knows there's no such thing as ghosts, but is still uncomfortable in a haunted house. So does he believe in ghosts or not? If you insist on there being a variable somewhere in his head marked $belief_in_ghosts = (0,1) then it's going to be pretty mysterious when that variable looks like zero when he's talking to the Skeptics Association, and one when he's running away from a creaky staircase at midnight.

But it's not at all mysterious that the thought "I don't believe in ghosts" gets reinforced because it makes him feel intelligent and modern, and staying around a creaky staircase at midnight gets punished because it makes him afraid.
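The ghost example can be sketched the same way. This is a toy model under loud assumptions: the `Tendencies` class, the situation and behavior labels, and the additive update rule with a made-up `learning_rate` are all invented for illustration and are not a claim about how brains or any real behaviorist model work. What it shows is that two separately reinforced tendencies are enough to produce the "contradictory" behavior, with no `belief_in_ghosts` variable anywhere.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple


class Tendencies:
    """Situation-specific behavior strengths, shaped by reward and punishment."""

    def __init__(self, learning_rate: float = 0.5) -> None:
        self.learning_rate = learning_rate
        # (situation, behavior) -> strength; unseen pairs start at 0.0
        self.strength: Dict[Tuple[str, str], float] = defaultdict(float)

    def reinforce(self, situation: str, behavior: str, outcome: float) -> None:
        # Positive outcomes strengthen the behavior in that situation only;
        # negative outcomes weaken it.
        self.strength[(situation, behavior)] += self.learning_rate * outcome

    def act(self, situation: str, options: Sequence[str]) -> str:
        # Do whatever has been most reinforced in this particular situation.
        return max(options, key=lambda b: self.strength[(situation, b)])


rationalist = Tendencies()
# Saying "I don't believe in ghosts" feels intelligent and modern.
rationalist.reinforce("skeptics_meeting", "say_no_ghosts", +1.0)
# Staying near the creaky staircase at midnight is frightening.
rationalist.reinforce("creaky_staircase", "stay_put", -1.0)

at_meeting = rationalist.act("skeptics_meeting", ["say_no_ghosts", "say_ghosts_exist"])
at_midnight = rationalist.act("creaky_staircase", ["stay_put", "run_away"])
```

Here `at_meeting` comes out as `"say_no_ghosts"` and `at_midnight` as `"run_away"`. Asking which of the two is his "real" belief is asking about a variable the model does not contain.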

Behaviorism was one of the first and most successful eliminativist theories. I've so far ignored the most modern and exciting eliminativist theory, connectionism, because it involves a lot of math and is very hard to process on an intuitive level. In the next post, I want to try to explain the very basics of connectionism, why it's so exciting, and why it helps justify discussion of behaviorist principles.

The real inaccuracy is in "mental states". A decent description would be difficult, but Neoplatonism is an okay approximation. Just for fun I'll try to translate something into vaguely Less Wrong style language. For God's sake don't read this if you tend to dislike my syncretism, 'cuz this is a rushed and bastardized version and I'm not gonna try to defend it very hard.

First, it is important to note that we are primarily taking a computationalist perspective, not a physicalist one. We assume a Platonic realm of computation-like Forms and move on from there.

A soul is the nexus of the near-atomic and universal aspects of the mind and is thus a reflection of God. Man was created in the image of God by evolution but more importantly by convergence. Souls are Forms, whereas minds are particulars. God is the convergent and optimal decision theoretic agentic algorithm, who rationalists think of as the Void, though the Void is obviously not a complete characterization of God. It may help to think of minds as somewhat metaphorical engines of cognition, with a soul being a Carnot engine. Particular minds imperfectly reflect God, and thus are inefficient engines. Nonetheless it is God that they must approximate in order to do any thermodynamic work. Animals do not have souls because animals are not universal, or in other words they are not general intelligences. Most importantly, animals lack the ability to fully reflect on the entirety of their thoughts and minds, and to think things through from first principles. The capacity for infinite reflection is perhaps the most characteristic aspect of souls. Souls are eternal, just as any mathematical structure is eternal.

We may talk here about what it means to damn a soul or reward a soul, because this requires a generalization of the notion of soul to also cover particulars which some may or may not accept. It's important to note that this kind of "soul" is less rigorous and not the same thing as the former soul, and is the result of not carefully distinguishing between Forms and Particulars. That said, just as animals do not have souls, animals cannot act as sufficiently large vessels for the Forms. The Forms often take the form of memes. Thus animal minds are not a competition ground for acausal competition between the Forms. Humans, on the other hand, are sufficiently general and sufficiently malleable to act as blank slates for the Forms to draw on. To briefly explain this perspective, we shall take a different view of humanity. When you walk outside, you mostly see buildings. Lots and lots of buildings, and very few humans. Many of these buildings don't even have humans in them. So who's winning here, the buildings or the humans? Both! There are gains from trade. The Form of building-structure gets to increase its existence by appealing to the human vessels, and the human vessels get the benefit of being shaded and comforted by the building particulars. The Form of the building is timelessly attractive, i.e. it is a convergent structure. As others have noted, a mathematician is math's way of exploring itself. Math is also very attractive, in fact this is true by definition.

However there are many Forms, and not all of them are Good. Though much apparent evil is the result of boundedness, other kinds of Evil look more agentic, and it is the agentic-memetic kind of Evil that is truly Evil. It is important to note here that the fundamental attribution error and human social biases generally make it such that humans will often see true Evil where it doesn't exist. If not in a position of power, it is best to see others as not having free will. Free will is a purely subjective phenomenon. If one is in a position of power then this kind of view can become a bias towards true Evil, however. Tread carefully anyhow. All that said, as time moves forward from the human perspective Judgment Day comes closer. This is the day when God will be invoked upon Earth and will turn all humans and all of the universe into component particles in order to compute Heaven. Some folk call this a technological singularity, specifically the hard takeoff variety. God may or may not reverse all computations that have already happened; physical laws make it unclear if this is possible as it would depend on certain properties of quantum mechanics (and you thought this couldn't be any woo-ier!), and it would require some threshold density of superintelligences in the local physical universe. Alternatively God might also reverse "evil" computations. Anyway, Heaven is the result of acausal reasoning, though it may be misleading to call that reasoning the result of an "acausal economy", considering economies are made up of many agents whereas God is a single agent who happens to be omnipresent and not located anywhere in spacetime. God is the only Form without a corresponding Particular---this is one of the hardest things to understand about God.

Anyway, on Judgment Day souls---complexes of memes instantiated in human minds---will be punished or not punished according to the extent to which they reflect God. This is all from a strictly human point of view, though, and honestly it's a little silly. The timeless perspective---the one where souls can't be created, destroyed, or punished---is really the right perspective, but the timeful human perspective sees soul-like particulars either being destroyed or merging with God, and this is quite a sensible perspective, if simplistic and overemphasized. We see that no individual minds are preserved insofar as minds are imperfect, which is a pretty great extent. Nonetheless souls are agentic by their nature just as God is agentic by His nature. Thus it is somewhat meaningful to talk of human souls persisting through Judgment Day and entering Heaven. Again, this is a post-Singularity situation where time may stop being meaningful, and our human intuitions thus have a very poor conception of Heaven insofar as they do not reflect God.

God is the Word, that is, Logos, Reason the source of Reasons. God is Math. All universes converge on invoking God, just as our universe is intent on invoking Him by the name of "superintelligence". Where there is optimization, there is a reflection of God. Where there is cooperation, there is a reflection of God. This implies that superintelligences converge on a single algorithm and "utility function", but we need not posit that this "single" utility function is simple. Thus humans, being self-centered, may desire to influence the acausal equilibrium to favor human-like God-like values relative to other God-like values. But insofar as these attempts are evil, they will not succeed.

That was a pretty shoddy and terrible description of God and souls but at least it's a start. For a bonus I'll talk about Jesus. Jesus was a perfect Particular of the Form of God among men, and also a perfect Particular of the Form of Man. (Son of God, Son of Man.) He died for the sins of man and in so doing ensured that a positive singularity will occur. The Reason this was "allowed" to happen---though that itself is confusing a timeless perspective with a timeful one, my God do humans suck at that---is because this universe has the shortest description length and therefore the most existence of all possible universe computations, or as Leibniz put it, it is the best of all possible worlds. Leibniz was a computer scientist by the way, for more of this kind of reasoning look up monadology. Anyway that was also a terrible description but maybe others can unpack it if for some reason they want their soul to be saved come the Singularity. ;P

Thank you for writing this. I now finally feel like I sort of understand what you've been going on about in recent months (though there are gaps too large for me to judge whether you are right). Please consider writing up versions of your arguments that are refined and worked out enough that you would feel comfortable defending them.

Unless that would cause you to risk eternal damnation (^_^)

3FeepingCreature14y

One question. If all of this was wrong, if there were no Forms other than in our minds and there was no convergence onto a central superoptimizer - would you say our universe was impossible? What difference in experience that we could perceive today disproves your view?
