Secrets of the eliminati

Scott Alexander

20th Jul 2011

Anyone who does not believe mental states are ontologically fundamental (i.e., anyone who denies the reality of something like a soul) has two choices about where to go next. They can try reducing mental states to smaller components, or they can stop talking about them entirely.

In a utility-maximizing AI, mental states can be reduced to smaller components. The AI will have goals, and those goals, upon closer examination, will be lines in a computer program.

But in the blue-minimizing robot, its "goal" isn't even a line in its program. There's nothing that looks remotely like a goal in its programming, and goals appear only when you make rough generalizations from its behavior in limited cases.
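To make the contrast concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the function names, the RGB threshold, and the idea that the robot works by a single stimulus-response rule are invented here, not taken from any real program.

```python
def utility_maximizer(actions, utility):
    """Explicit goal: the utility function is literally an argument,
    and the program picks whichever action scores highest under it."""
    return max(actions, key=utility)


def blue_robot_step(pixel_rgb, threshold=200):
    """Hypothetical stimulus-response rule: fire when a pixel looks blue.
    Nothing in this function represents a goal like 'minimize blue'."""
    r, g, b = pixel_rgb
    if b > threshold and b > r and b > g:
        return "fire"
    return "wait"
```

In the first function you can point at the goal; in the second, "minimizing blue" exists only as a description an observer might give after watching the output in a limited range of cases.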

Philosophers are still very much arguing about whether this applies to humans; the two schools call themselves reductionists and eliminativists (with a third school of wishy-washy half-and-half people calling themselves revisionists). Reductionists want to reduce things like goals and preferences to the appropriate neurons in the brain; eliminativists want to prove that humans, like the blue-minimizing robot, don't have anything of the sort until you start looking at high-level abstractions.

I took a similar tack in answering ksvanhorn's question in yesterday's post: how can you get a more accurate picture of what your true preferences are? I said:

I don't think there are true preferences. In one situation you have one tendency, in another situation you have another tendency, and "preference" is what it looks like when you try to categorize tendencies. But categorization is a passive and not an active process: if every day of the week I eat dinner at 6, I can generalize to say "I prefer to eat dinner at 6", but it would be non-explanatory to say that a preference toward dinner at 6 caused my behavior on each day. I think the best way to salvage preferences is to consider them as tendencies currently in reflective equilibrium.

A more practical example: when people discuss cryonics or anti-aging, the following argument usually comes up in one form or another: if you were in a burning building, you would try pretty hard to get out. Therefore, you must strongly dislike death and want to avoid it. But if you strongly dislike death and want to avoid it, you must be lying when you say you accept death as a natural part of life and think it's crass and selfish to try to cheat the Reaper. And therefore your reluctance to sign up for cryonics violates your own revealed preferences! You must just be trying to signal conformity or something.

The problem is that not signing up for cryonics is also a "revealed preference". "You wouldn't sign up for cryonics, which means you don't really fear death so much, so why bother running from a burning building?" is an equally good argument, although no one except maybe Marcus Aurelius would take it seriously.

Both these arguments assume that somewhere, deep down, there's a utility function with a single term for "death" in it, and all decisions just call upon this particular level of death or anti-death preference.

More explanatory of the way people actually behave is that there's no unified preference for or against death, but rather a set of behaviors. Being in a burning building activates fleeing behavior; contemplating death from old age does not activate cryonics-buying behavior. People guess at their opinions about death by analyzing these behaviors, usually with a bit of signalling thrown in. If they desire consistency (and most people do), maybe they'll change some of their other behaviors to conform to their hypothesized opinion.
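The two pictures can be sketched side by side. This is a toy model under stated assumptions: the situation names, the 0.5 cutoffs, and the helper names are all hypothetical, chosen only to show how the models differ.

```python
def unified_model(situation, death_aversion):
    """Single-term picture: every decision consults the same number,
    so the situation argument makes no difference to the answer."""
    return "act_to_avoid_death" if death_aversion > 0.5 else "do_nothing"


# Behavior-set picture: each situation triggers its own response,
# with no shared 'death' term behind them (hypothetical entries).
BEHAVIORS = {
    "burning_building": "act_to_avoid_death",  # fleeing
    "old_age": "do_nothing",                   # no cryonics purchase
}


def behavior_model(situation):
    return BEHAVIORS[situation]


def guess_opinion(situations):
    """An 'opinion about death' inferred after the fact from whichever
    behaviors happen to be examined."""
    acts = [behavior_model(s) for s in situations]
    share = acts.count("act_to_avoid_death") / len(acts)
    return "I strongly dislike death" if share > 0.5 else "I accept death"
```

The unified model has to give the same answer in both situations, which is why each "revealed preference" argument seems to refute the other. In the behavior model, the guessed opinion simply depends on which behaviors you choose to look at.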

One more example. I've previously brought up the case of a rationalist who knows there's no such thing as ghosts, but is still uncomfortable in a haunted house. So does he believe in ghosts or not? If you insist on there being a variable somewhere in his head marked $belief_in_ghosts = (0,1) then it's going to be pretty mysterious when that variable looks like zero when he's talking to the Skeptics Association, and one when he's running away from a creaky staircase at midnight.

But it's not at all mysterious that the thought "I don't believe in ghosts" gets reinforced because it makes him feel intelligent and modern, and staying around a creaky staircase at midnight gets punished because it makes him afraid.
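A rough sketch of that reinforcement story, assuming a deliberately crude learning rule: the class name, the starting strength of 0.5, and the step size are all made up for illustration and are not a model from the behaviorist literature.

```python
from collections import defaultdict


class Reinforcer:
    """Tracks a separate tendency for each (context, behavior) pair.
    There is no belief_in_ghosts variable anywhere in this class."""

    def __init__(self, step=0.1):
        self.step = step
        self.strength = defaultdict(lambda: 0.5)  # assumed neutral start

    def update(self, context, behavior, reward):
        """Strengthen a behavior when rewarded (+1), weaken it when punished (-1)."""
        key = (context, behavior)
        new_value = self.strength[key] + self.step * reward
        self.strength[key] = min(1.0, max(0.0, new_value))

    def tendency(self, context, behavior):
        return self.strength[(context, behavior)]


agent = Reinforcer()
for _ in range(3):
    # Saying "I don't believe in ghosts" feels intelligent and modern.
    agent.update("skeptics_meeting", "say_no_ghosts", +1)
    # Staying near the creaky staircase at midnight is frightening.
    agent.update("creaky_staircase", "stay_put", -1)
```

After these updates the agent is more inclined to deny ghosts at the meeting and less inclined to stay on the staircase, and neither tendency needed to consult a shared belief.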

Behaviorism was one of the first and most successful eliminativist theories. I've so far ignored the most modern and exciting eliminativist theory, connectionism, because it involves a lot of math and is very hard to process on an intuitive level. In the next post, I want to try to explain the very basics of connectionism, why it's so exciting, and why it helps justify discussion of behaviorist principles.


256 Comments

Wei Dai

my model of Wei Dai has him as less curious than you are about things that I yammer about

If you mean the nature of superintelligence, I'm extremely curious about that, but I think the way you're going about trying to find out is unlikely to lead to progress. To quote Eric Drexler, "most new ideas are wrong or inadequate." The only way I can see how humans can make progress, when we're running on such faulty hardware and software, is to be very careful, to subject our own ideas to constant self-scrutiny for possible errors, and to be as precise as possible in our communications, and to lay down all the steps of our reasoning, so others can understand what we mean and how exactly we arrived at our conclusions, so they can help find our errors for us.

Now sometimes one could have a flash of inspiration--an idea that might be true or an approach that seems worth pursuing--but not know how to justify that intuition. It's fine to try to communicate such potential insights, but this can't be all that you do. Most of your time still has to be spent trying to figure out whether these seeming inspirations actually amount to anything, whether there are arguments that can back up your intuitions, and whether these arguments stand up to scrutiny. If you are not willing to put a substantial amount of effort into doing this yourself, then you shouldn't be surprised that few others are willing to do it for you (i.e., take you seriously), especially when you do not even make a strong effort to use language that they can easily understand.

There are people who know me in person and thus share background knowledge with me, who are able to understand what I am saying. They are the thinkers I admire most and the people I care most about influencing.

I would be interested to know if any of your intuitive leaps have led any of those people to make any progress beyond "a new idea that's almost certain to be wrong even if we're not sure why" to "something that seems likely to be an improvement over the previous state of the art". (It's possible that you have a comparative advantage in making such leaps of intuition, even though a priori that seems unlikely.)

you [Nesov] err on the side of calling bullshit when I know for certain that something is not bullshit

Do you have any examples? (This is unrelated to my points above. I'm just curious.)

Will_Newsome

(Warning, long comment; it stays mostly on track but is embarrassingly mostly self-centered.) I think I must have been being imprecise when I said you were "less curious" about the things I yammer about, and honestly I don't remember what I was thinking at the time and won't try to rationalize it. (I wasn't on adderall then but am on adderall now; there may be state-dependent memory effects.) I thus unendorse at least that part of the grandparent.

I think that everything you're saying is correct, and note the interesting similarities between my ca...
