Another big gap is explaining how and when U directly updates on C's information. For example, it requires conscious reasoning and language processing to understand that a man on a plane holding a device with a countdown timer and shouting political and religious slogans is a threat, but a person on that plane would experience fear, increased sympathetic activation, and other effects mediated by the unconscious mind.
That's not a gap at all -- in fact, you answered it elsewhere in your article, right here:
U seems to reason over neural inputs – it takes in things like sense perceptions
The key is understanding that those sense perceptions need not be present-tense/actual; they can be remembered or imagined. It's pretty central to what I do. (Heck, most of the model you're describing isn't much different from things I've been writing about since around 2005.)
Anyway, a big part of the work I do with people is helping them learn to identify the remembered or imagined sensory predictions (which drive the feelings and behavior) and inject other ways of looking at things. More precisely, other ways of interpreting the sensory impressions, such that they lead to different prediction...
This is a very interesting article, thanks for writing it! I agree with Tim Tyler's remark that your theory sounds more like a perturbation to a more fundamental theory of consciousness.
You may be generalizing from one example based on personal experience with feelings of tension between a conscious desire to be a utilitarian and unconscious desires that point in mostly other directions, as evidenced by your nice post The Trouble With Good. It must be remembered that very few people consciously subscribe to normative utilitarianism.
Various issues that your post does not appear to address:
• Sometimes people consciously have overtly selfish goals. Sometimes people even explicitly talk about such goals in public. (I can dig up references if you'd like.)
Relatedly, note that apparent pursuit of altruistic goals can result in social expulsion. It's an oversimplification to say that it's evolutionarily advantageous to have a conscious mind with noble motivations. This is quite possibly related to your remark "you might dislike people who would make you feel morally inferior and force you to expend more resources to keep yourself morally satisfied."
• Your theory points to the idea tha...
There's also a sort of akrasia which is physically based -- if I eat too many refined carbs, I can get a day or two of doing very little while thinking "I don't care, I don't care". It looks like a psychological problem, but is really well correlated with the carbs.
This model is missing a plausible evolutionary explanation for how U and C may have evolved. That's a pretty gaping hole because if we don't constrain U and C to being plausible under evolution then they can be given whatever motives, responsibilities, etc. that are convenient to fit the model to existing data (see Psychohistorian's epicycle comment).
This feels like a worse version of epicycles, in that even if it's kind of useful, it seems like it definitely is not what's going on. The idea of lying being difficult seems to (A) presuppose a consciousness, and (B) make no sense - it seems like it would be much cheaper to evolve better lie-hiding mechanisms than to evolve consciousness. "Cognitive dissonance is adaptive with respect to expensive gestures" seems to explain pretty much all of what this theory is trying to address, without being weirdly centered on lying.
This feels like a theory...
Interesting theory.
I tend to agree with Tim Tyler that the "common" interpretation of consciousness is simpler and the signaling thing is not necessary. I realize that you are trying to increase the scope of the theory, but I am not convinced yet that the cure is better than the illness.
While I could see why an ape trying to break into "polite society" might want to gain the facility you describe, the apes created the "polite society" in the first place, so I do not see a plausible solution to the catch-22 (perhaps it's a lac
This theory simply does not resonate with me. I do not feel that I am at all like that and neither has anyone I have known been like that. It is as off the mark as Freudian theories are, in my view. "So you wall off a little area of your mind..." Do you have any evidence for this idea that the consciousness is a walled-off area?
This theory seems to make a testable prediction: you will have less akrasia if your signaling requires you to reach your goal, not just show that you're working towards it. Looking at my life, I'm not sure if that's true.
Your consciousness contains the things you need to be able to reflect on in order to function properly. That seems like a much more basic way of delineating the conscious mind than the proposed signalling theory.
Yes: consciousness sometimes excludes things that it is undesirable to signal - but surely that is more of a footnote to the theory than its main feature. Quite a bit of that work is actually done by selective forgetting - which is a feature with better targeting capabilities than the filters of consciousness.
If you want the answer to involve signalling, then the ego seems like a more suitable thing to examine.
EDIT: This theory does not sufficiently address the heart of the issue, and needs to be reconsidered.
Perhaps this knot can be cut with PCT. Suppose you have the following hypothesis:
"The executive function sequences actions so as to minimize the error signal from mental subsystems".
This seems to explain most of the things you're trying to resolve. For instance:
- Crossing the street to avoid the homeless man minimizes the errors from the "maximize my amount of money", "avoid socially awkward situations", and "maximize my se...
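The hypothesis above can be sketched as a toy minimization. Everything here is an illustrative assumption -- the subsystem names, error numbers, and candidate actions are made up for the beggar example, not taken from PCT itself:

```python
# Toy sketch of "the executive function sequences actions so as to
# minimize the error signal from mental subsystems". Subsystems and
# error values are hypothetical placeholders.

def total_error(action, subsystems):
    """Sum each subsystem's error signal for a candidate action."""
    return sum(err(action) for err in subsystems.values())

def choose_action(actions, subsystems):
    """The hypothesized executive: pick the action with least total error."""
    return min(actions, key=lambda a: total_error(a, subsystems))

# Beggar scenario: giving costs money, refusing is awkward and feels
# immoral, crossing the street scores low on every error signal.
subsystems = {
    "keep_money":        lambda a: {"give": 1.0, "refuse": 0.0, "cross": 0.0}[a],
    "avoid_awkwardness": lambda a: {"give": 0.2, "refuse": 0.8, "cross": 0.1}[a],
    "feel_moral":        lambda a: {"give": 0.0, "refuse": 1.0, "cross": 0.2}[a],
}
print(choose_action(["give", "refuse", "cross"], subsystems))  # cross
```

On these made-up numbers, "cross" wins because it is the only action that keeps every subsystem's error near zero, which is exactly the shape of behavior the comment claims the hypothesis explains.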
The idea of C being a public relations agency resonates for me. I prefer the C/U dichotomy to the superego/ego dichotomy because whereas in both cases it is U or the ego that represents my real self, the first theory has U in agent-control and trying to mollify C, the second theory has the superego in agent-control and embarrassed by the ego. I feel like the first theory more closely fits what I experience, especially during indecision conflicts. Without any guilt, I'll ask, what is the minimum I need to do to feel external-world/socially comfortable here?...
About empathy: what is a good way for someone who experiences less empathy to relate to more normal humans?
About lying: I do not regard it as helpful to consider whether C is lying. Instead, one should ask whether there exists an isomorphism between C's purported beliefs and an accurate model of the person's whole mind, and if so, what that isomorphism is.
As an example from my experience, consider the exchange:
"Oh, hi, [given name]. How are you?"
"Oh, fine, thanks."
The second person, despite not being "fine" by more objective metrics, need not be regarded as lying, so long as "Oh, fine, thanks" is simply taken to mean "I recognize that you have taken effort to express concern to me, and would like to reciprocate by showing friendliness and not bothering you with more details about myself than are appropriate for our relationship."
Instead of asking such a nebulous, abstract question as, "Is the second person lying by claiming to be fine?" I contend that one should focus on the question of how those statements should map to a model of reality, and if there exists a concise description for how it does so.
So, if the person discussing this, and presumably the one choosing to be rational, is C, and it must necessarily fight against a selfish, flighty and almost completely uncaring U except in the cases where it perceives a direct benefit, and furthermore is assumed to have complete or nearly complete control over the person, then why be rational? The model described here makes rationality, rather than mere rationalization, literally impossible. Therefore, why try? Or did U's just decide to force their C's into this too, making such a model deterministic in all but name?
they honestly believe on introspection that they have admirable goals
This seems incorrect - anyone reasonably apt at introspection would not come to the conclusion "I have only admirable goals", but instead to the conclusion "I seem to have many conflicting goals". It's only a profound LACK of introspection that would make someone believe that they have only admirable goals.
A few weeks ago, I saw a beggar on the sidewalk and walked to the other side of the street to avoid him. This isn't sane goal-directed behavior: either I want beggars to have my money, or I don't.
I think that for some people it’s sometimes rational to avoid beggars. Recalling your post Doing your good deed for the day, it seems plausible that for some people, giving money to beggars is likely to lower their motivation to do other good things. Giving money to beggars is probably not a cost-effective charitable activity. So it’s plausible that some peopl...
I think some terminology clarification might be in order here - consciousness performs a variety of functions (attention/monitoring, abstract thought, executive, etc.), and mediating conflicts between conscious and subconscious preferences comprises a somewhat small part of what it does. This may be why the theory seems awkward to some people (including me).
Related to: Alien Parasite Technical Guy, A Master-Slave Model of Human Preferences
In Alien Parasite Technical Guy, Phil Goetz argues that mental conflicts can be explained as a conscious mind (the “alien parasite”) trying to take over from an unsuspecting unconscious.
Last year, Wei Dai presented a model (the master-slave model) with some major points of departure from Phil's: in particular, the conscious mind was a special-purpose subroutine and the unconscious had a pretty good idea what it was doing1. But Wei said at the beginning that his model ignored akrasia.
I want to propose an expansion and slight amendment of Wei's model so it includes akrasia and some other features of human behavior. Starting with the signaling theory implicit in Wei's writing, I'll move on to show why optimizing for signaling ability would produce behaviors like self-signaling and akrasia, speculate on why the same model would also promote some of the cognitive biases discussed here, and finish with even more speculative links between a wide range of conscious-unconscious conflicts.
The Signaling Theory of Consciousness
This model begins with the signaling theory of consciousness. In the signaling theory, the conscious mind is the psychological equivalent of a public relations agency. The mind-at-large (hereafter called U for “unconscious” and similar to Wei's “master”) has the socially unacceptable primate drives you would expect of a fitness-maximizing agent: sex, status, and survival. These are unsuitable for polite society, where only socially admirable values like true love, compassion, and honor are likely to win you friends and supporters. U could lie and claim to support the admirable values, but most people are terrible liars and society would probably notice.
So you wall off a little area of your mind (hereafter called C for “conscious” and similar to Wei's “slave”) and convince it that it has only admirable goals. C is allowed access to the speech centers. Now if anyone asks you what you value, C answers "Only admirable things like compassion and honor, of course!" and no one detects a lie because the part of the mind that's moving your mouth isn't lying.
This is a useful model because it replicates three observed features of the real world: people say they have admirable goals, they honestly believe on introspection that they have admirable goals, but they tend to pursue more selfish goals. But so far, it doesn't explain the most important question: why do people sometimes pursue their admirable goals and sometimes not?
Avoiding Perfect Hypocrites
In the simplest case, U controls all the agent's actions and has the ability to set C's values, and C only controls speech. This raises two problems.
First, you would be a perfect hypocrite: your words would have literally no correlation to your actions. Perfect hypocrites are not hard to notice. In a world where people are often faced with Prisoners' Dilemmas against which the only defense is to swear a pact to mutually cooperate, being known as the sort of person who never keeps your word is dangerous. A recognized perfect hypocrite could make no friends or allies except in the very short-term, and that limitation would prove fatal or at least very inconvenient.
The second problem is: what would C think of all this? Surely after the twentieth time protesting its true eternal love and then leaving the next day without so much as a good-bye, it would start to notice it wasn't pulling the strings. Such a realization would tarnish its status as "the honest one"; it couldn't tell the next lover it would remain forever true without a little note of doubt creeping in. Just as your friends and enemies would soon realize you were a hypocrite, so C itself would realize it was part of a hypocrite and find the situation incompatible with its idealistic principles.
Other-signaling and Self-Signaling
You could solve the first problem by signaling to others. If your admirable principle is to save the rainforest, you can loudly and publicly donate money to the World Wildlife Fund. When you give your word, you can go ahead and keep it, as long as the consequences aren't too burdensome. As long as you are seen to support your principles enough to establish a reputation for doing so, you can impress friends and allies and gain in social status.
The degree to which U gives permission to support your admirable principles depends on the benefit of being known to hold the admirable principle, the degree to which supporting the principle increases others' belief that you genuinely hold the principle, and the cost of the support. For example, let's say a man is madly in love with a certain woman, and thinks she would be impressed by the sort of socially conscious guy who believes in saving the rainforest. Whether or not he should donate $X to the World Wildlife Fund depends on how important winning the love of this woman is to him, how impressed he thinks she'd be to know he strongly believes in saving the rainforests, how easily he could convince her he supports the rainforests with versus without a WWF donation - and, of course, the value of X and how easily he can spare the money. Intuitively, if he's really in love, she would be really impressed, and it's only a few dollars, he would do it; but not if he's not that into her, she doesn't care much, and the WWF won't accept donations under $1000.
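The cost-benefit calculation in the paragraph above can be written out as a toy expected-value comparison. The function, its parameter names, and all the numbers below are hypothetical illustrations of the intuition, not part of the model itself:

```python
# Toy sketch of U's donation decision: donate if the expected signaling
# benefit exceeds the cost. All weights are made-up placeholders.

def should_donate(value_of_goal, impressiveness, belief_gain, cost):
    """
    value_of_goal:  how much winning the woman over is worth to U
    impressiveness: how much she cares about the principle (0..1)
    belief_gain:    how much the donation raises her belief that he
                    genuinely holds the principle (0..1)
    cost:           the donation amount X
    """
    expected_benefit = value_of_goal * impressiveness * belief_gain
    return expected_benefit > cost

# Madly in love, she'd be very impressed, and it's only a few dollars:
print(should_donate(1000, 0.9, 0.5, 20))   # True
# Not that into her, she barely cares, and the WWF minimum is $1000:
print(should_donate(100, 0.1, 0.5, 1000))  # False
```

The point of the sketch is only that the decision is multiplicative: any one factor near zero (indifference to the woman, her indifference to rainforests, or no change in her beliefs) kills the donation regardless of the others.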
Such signaling also solves the second problem, the problem of C noticing it's not in control - but only partly. If you only give money when you're with a love interest and ey's standing right there, and you only give the minimum amount humanly possible so as to not repulse your date, C will notice that also. To really satisfy C, U must support admirable principles on a more consistent basis. If a stranger comes up and gives a pitch for the World Wildlife Fund, and explains that it would really help a lot of rainforests for a very low price, U might realize that C would get a little suspicious if it didn't donate at least a token amount. This kind of signaling is self-signaling: trying to convince part of your own mind.
This model modifies the original to include akrasia2 (U refusing to pursue C's goals) and the limitations on akrasia (U pursues C's goals insofar as it has to convince other people - and C itself - its signaling is genuine).
It also provides a key to explaining some superficially weird behavior. A few weeks ago, I saw a beggar on the sidewalk and walked to the other side of the street to avoid him. This isn't sane goal-directed behavior: either I want beggars to have my money, or I don't. But under this model, once the beggar asks for money, U has to give it or risk C losing some of its belief that it is compassionate and therefore being unable to convince others it is compassionate. But as long as it can avoid being forced to make the decision, it can keep both its money and C's innocence.
Thinking about this afterward, I realized how silly it was, and now I consider myself unlikely to cross the street to avoid beggars in the future. In the language of the model, C focuses on the previously subconscious act of avoiding the beggar and realizes it contradicts its principles, and so U grudgingly has to avoid such acts to keep C's innocence and signaling ability intact.
Notice that this cross-the-street trick only works if U can act without C being fully aware what happened or its implications. As we'll see below, this ability of U's has important implications for self-deception scenarios.
From Rationality to Rationalization
So far, this model has assumed that both U and C are equally rational. But a rational C is a disadvantage for U for exactly the reasons mentioned in the last paragraph; as soon as C reasoned out that avoiding the beggar contradicted its principles, U had to expend more resources giving money to beggars or lose compassion-signaling ability. If C is smart enough to realize that its principle of saving the rainforest means you ought to bike to work instead of taking the SUV, U either has to waste resources biking to work or accept a decrease in C's environmentalism-signaling ability. Far better that C never realizes it ought to bike to work in the first place.
So it's to U's advantage to cripple C. Not completely, or it loses C's language and reasoning skills, but enough that it falls in line with U's planning most of the time.
“How, in detail, does U cripple C?” is a restatement of one of the fundamental questions of Less Wrong and certainly too much to address in one essay, but a few suggestions might be in order:
- The difference between U and C seems to have a lot to do with two different types of reasoning. U seems to reason over neural inputs – it takes in things like sense perceptions and outputs things like actions, feelings, and hunches. This kind of reasoning is very powerful – for example, it can take as an input a person you've just met and immediately output a calculation of their value as a mate in the form of a feeling of lust – but it can also fail in weird ways, like outputting a desire to close a door three dozen times into the head of an obsessive-compulsive, or succumbing to things like priming. C, the linguistic one, seems to reason over propositions – it takes propositions like sentences or equations as inputs, and returns other sentences and equations as outputs. This kind of reasoning is also very powerful, and also produces weird errors like the common logical fallacies.
- When U takes an action, it relays it to C and claims it was C's action all along. C never wonders why its body is acting outside of its control; only why it took an action it originally thought it disapproved of. This relay can be cut in some disruptions of brain function (most convulsions, for example, genuinely seem involuntary), but remains spookily intact in others (if you artificially activate parts of the brain that cause movement via transcranial magnetic stimulation, your subject will invent some plausible sounding reason for why ey made that movement)3.
- C's crippling involves a tendency for propositional reasoning to automatically cede to neural reasoning and to come up with propositional justifications for its outputs, probably by assuming U is right and then doing some kind of pattern-matching to fill in blanks. For example, if you have to choose to buy one of two cars, and after taking a look at them you feel you like the green one more, C will try to come up with a propositional argument supporting the choice to buy the green one. Since both propositional and neural reasoning are a little bit correlated with common sense, C will often hit on exactly the reasoning U used (for example, if the red car has a big dent in it and won't turn on, it's no big secret why U's heuristics rejected it) but in cases where U's justification is unclear, C will end up guessing and may completely fail to understand the real reasons behind U's choice. Training in luminosity can mitigate this problem, but not end it.
- A big gap in this model is explaining why sometimes C openly criticizes U, for example when a person who is scared of airplanes says “I know that flying is a very safe mode of transportation and accidents are vanishingly unlikely, but my stupid brain still freaks out every time I go to an airport”. This might be justifiable along the lines that allowing C to signal that it doesn't completely control mental states is less damaging than making C look like an idiot who doesn't understand statistics – but I don't have a theory that can actually predict when this sort of criticism will or won't happen.
- Another big gap is explaining how and when U directly updates on C's information. For example, it requires conscious reasoning and language processing to understand that a man on a plane holding a device with a countdown timer and shouting political and religious slogans is a threat, but a person on that plane would experience fear, increased sympathetic activation, and other effects mediated by the unconscious mind.
This part of the model is fuzzy, but it seems safe to assume that there is some advantage to U in changing C partially, but not completely, from a rational agent to a rubber-stamp that justifies its own conclusions. C uses its propositional reasoning ability to generate arguments that support U's vague hunches and selfish goals.
How The World Would Look
We can now engage, with a little bit of cheating, in some speculation about how a world of agents following this modified master-slave model would look.
You'd claim to have socially admirable principles, and you'd honestly believe these claims. You'd pursue these principles at the limited level expected by society: for example, if someone comes up to you and asks you to donate money to children in Africa, you might give them a dollar, especially if people are watching. But you would not pursue them beyond the level society expects: for example, even though you might consciously believe saving a single African child (estimated cost: $900) is more important than a plasma TV, you would be unlikely to stop buying plasma TVs so you could give this money to Africa. Most people would never notice this contradiction; if you were too clever to miss it you'd come up with some flawed justification; if you were too rational to accept flawed justifications you would just notice that it happens, get a bit puzzled, call it “akrasia”, and keep doing it.
You would experience borderline cases, where things might or might not be acceptable, as moral conflicts. A moral conflict would feel like a strong desire to do something, fighting against the belief that, if you did it, you would be less of the sort of person you want to be. In cases where you couldn't live with yourself if you defected, you would cooperate; in cases where you could think up any excuse at all that allowed you to defect and still consider yourself moral, you would defect.
You would experience morality not as a consistent policy to maximize utility across both selfish and altruistic goals, but as a situation-dependent attempt to maximize feelings of morality, which could be manipulated in unexpected ways. For example, as mentioned before, going to the opposite side of the street from a beggar might be a higher-utility option than either giving the beggar money or explicitly refusing to do so. In situations where you were confident in your morality, you might decide moral signaling was an inefficient use of resources – and you might dislike people who would make you feel morally inferior and force you to expend more resources to keep yourself morally satisfied.
Your actions would be ruled by “neural reasoning” that outputs expectations different from the ones your conscious reasoning would endorse. Your actions might hinge on fears which you knew to be logically silly, and your predictions might come from a model different from the one you thought you believed. If it was necessary to protect your signaling ability, you might even be able to develop and carry out complicated plots to deceive the conscious mind.
Your choices would be determined by illogical factors that influenced neural switches and levers and you would have to guess at the root causes of your own decisions, often incorrectly – but would defend them anyway. When neural switches and levers became wildly inaccurate due to brain injury, your conscious mind would defend your new, insane beliefs with the same earnestness with which it defended your old ones.
You would be somewhat rational about neutral issues, but when your preferred beliefs were challenged you would switch to defending them, and only give in when it is absolutely impossible to keep supporting them without looking crazy and losing face.
You would look very familiar.
Footnotes
1. Wei Dai's model gets the strongest compliment I can give: after reading it, it seemed so obvious and natural to think that way that I forgot it was anyone's model at all and wrote the first draft of this post without even thinking of it. It has been edited to give him credit, but I've kept some of the terminology changes to signify that this isn't exactly the same. The most important change is that Wei thinks actions are controlled by the conscious mind, but I side with Phil and think they're controlled by the unconscious and relayed to the conscious. The psychological evidence for this change in the model is detailed above; some neurological reasons are mentioned in the Wegner paper below.
2. Or more accurately one type of akrasia. I disagreed with Robin Hanson and Bryan Caplan when they said a model similar to this explains all akrasia, and I stand by that disagreement. I think there are at least two other, separate causes: akrasia from hyperbolic discounting, and the very-hard-to-explain but worthy-of-more-discussion-sometime akrasia from wetware design.
3. See Wegner, "The Mind's Best Trick: How We Experience Conscious Will" for a discussion of this and related scenarios.