I wonder:
if you had an agent that obviously did have goals (let's say, a player in a game, whose goal is to win, and who plays the optimal strategy) could you deduce those goals from behavior alone?
Let's say you're studying the game of Connect Four, but you have no idea what constitutes "winning" or "losing." You watch enough games that you can map out a game tree. In state X of the world, a player chooses option A over other possible options, and so on. From that game tree, can you deduce that the goal of the game was to get four pieces in a row?
I don't know the answer to this question. But it seems important. If it's possible to identify, given a set of behaviors, what goal they're aimed at, then we can test behaviors (human, animal, algorithmic) for hidden goals. If it's not possible, that's very important as well; because that means that even in a simple game, where we know by construction that the players are "rational" goal-maximizing agents, we can't detect what their goals are from their behavior.
That would mean that behaviors that "seem" goal-less, programs that have no line of code representing a goal, may in fact be beh...
From that game tree, can you deduce that the goal of the game was to get four pieces in a row?
One method that would work for this example is to iterate over all possible goals in ascending complexity, and check which one would generate that game tree. How to apply this idea to humans is unclear. See here for a previous discussion.
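The brute-force search could be sketched like this. Everything below is hypothetical — a toy Nim-like game stands in for Connect Four, and the candidate-goal list and function names are invented just to make the idea concrete:

```python
from functools import lru_cache

# Toy game: n stones, players alternately remove 1 or 2.
# Candidate goals in ascending complexity (hypothetical ordering):
CANDIDATES = [
    ("last_stone_wins",  lambda took_last: took_last),
    ("last_stone_loses", lambda took_last: not took_last),
]

def policy(n_start, goal):
    """Observed optimal play: map each stone count to the set of moves an
    optimal player would make, given goal(mover_took_last) -> mover wins?"""

    @lru_cache(maxsize=None)
    def wins(n):  # can the player to move force a win from n stones?
        return any(move_wins(n, m) for m in (1, 2) if m <= n)

    def move_wins(n, m):  # does taking m stones from n win for the mover?
        if m == n:
            return goal(True)       # mover takes the last stone
        return not wins(n - m)      # else mover wins iff opponent loses

    table = {}
    for n in range(1, n_start + 1):
        legal = [m for m in (1, 2) if m <= n]
        winning = {m for m in legal if move_wins(n, m)}
        table[n] = winning if winning else set(legal)  # losing: all moves equal
    return table

# "Watch" games played under the (hidden) true goal, then search for it.
observed = policy(6, CANDIDATES[0][1])
inferred = next(name for name, g in CANDIDATES if policy(6, g) == observed)
print(inferred)  # last_stone_wins
```

Note that if two candidate goals generated identical optimal play, the search couldn't distinguish them — which is exactly the identifiability worry raised above.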
Reductionists want to reduce things like goals and preferences to the appropriate neurons in the brain; eliminativists want to prove that humans, like the blue-minimizing robot, don't have anything of the sort until you start looking at high level abstractions.
Surely you mean that eliminativists take actions which, in their typical contexts, tend to result in proving that humans, like the blue-minimizing robot, don't have anything of the sort until you start looking at high level abstractions.
Surely you mean that there are just a bunch of atoms which, when interpreted as a human category, can be grouped together to form a being classifiable as "an eliminativist".
eliminativists want to prove that humans, like the blue-minimizing robot, don't have anything of the sort until you start looking at high level abstractions.
Just because something only exists at high levels of abstraction doesn't mean it's not real or explanatory. Surely the important question is whether humans genuinely have preferences that explain their behaviour (or at least whether a preference system can occasionally explain their behaviour - even if their behaviour is truly explained by the interaction of numerous systems) rather than how these preferences are encoded.
The information in a JPEG file that indicates a particular pixel should be red cannot be analysed down to a single bit that doesn't do anything else, but that doesn't mean there isn't a sense in which the red pixel genuinely exists. Preferences could exist and be encoded holographically in the brain. Whether you can find a specific neuron or not is completely irrelevant to their reality.
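To make the analogy concrete, here's a minimal sketch of distributed encoding using a 1-D DCT, the transform family JPEG is built on. (Real JPEG uses 8×8 two-dimensional blocks plus quantization; this toy version is only illustrative.)

```python
import math

def dct(x):
    """DCT-II of a sequence (unnormalized)."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi / N * (n + 0.5) * k) for n in range(N))
            for k in range(N)]

def idct(X):
    """Inverse of the above (DCT-III based reconstruction)."""
    N = len(X)
    return [X[0] / N + 2 / N * sum(X[k] * math.cos(math.pi / N * k * (n + 0.5))
                                   for k in range(1, N))
            for n in range(N)]

pixels = [0, 0, 0, 255, 0, 0, 0, 0]     # one bright "red" pixel in a row of eight
coeffs = dct(pixels)

# The bright pixel is perfectly recoverable, yet no single coefficient
# "is" that pixel: every coefficient carries a piece of it.
recovered = [round(v) for v in idct(coeffs)]
print(recovered)  # [0, 0, 0, 255, 0, 0, 0, 0]

damaged = coeffs[:]
damaged[3] = 0.0                        # corrupt any one coefficient...
smeared = [round(v) for v in idct(damaged)]
print(smeared != recovered)             # True: the damage spreads across many pixels
```

The pixel's value exists, but it is spread across the whole coefficient vector — much as a preference might be spread across a brain.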
Interesting post throughout, but don't you overplay your hand a bit here?
There's nothing that looks remotely like a goal in its programming, [...]
An IF-THEN piece of code comparing a measured RGB value to a threshold value for firing the laser would look at least remotely like a goal to my mind.
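For instance, the robot's entire program might be nothing more than this (a hypothetical rendering, with an arbitrary threshold):

```python
BLUE_THRESHOLD = 200  # hypothetical RGB cutoff

def robot_step(rgb):
    """The robot's whole 'mind': fire the laser iff the blue channel
    of the pixel ahead exceeds a threshold; otherwise roll forward."""
    r, g, b = rgb
    if b > BLUE_THRESHOLD:   # the IF-THEN in question
        return "fire_laser"
    return "move_forward"

print(robot_step((10, 10, 250)))  # fire_laser
print(robot_step((10, 10, 30)))   # move_forward
```

Whether that comparison "looks remotely like a goal" is the question at issue: the condition encodes a trigger, not an objective like "minimize blue."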
More explanatory of the way people actually behave is that there's no unified preference for or against death, but rather a set of behaviors. Being in a burning building activates fleeing behavior; contemplating death from old age does not activate cryonics-buying behavior.
YES. This so much.
But if whenever I eat dinner at 6 I sleep better than when I eat dinner at 8, can I not say that I prefer dinner at 6 over dinner at 8? That would be one step beyond saying I prefer sleeping well to not sleeping well.
I think we could get a better view by considering many preferences in action. Taking your cryonics example, maybe I prefer to live (to a certain degree), prefer to conform, and prefer to procrastinate. In the burning-building situation, the living preference is acting more or less alone, while in the cryonics situation the preferences interact somewhat like opposing forces, and motion happens in the winning direction. Maybe this is what makes preferences seem to vary?
Eliminativism is all well and good if all one wants to do is predict. However, it doesn't help answer questions like "What should I do?", or "What utility function should we give the FAI?"
The same might be said of evolutionary psychology. In which case I would respond that evolutionary psychology helped us stop thinking in a certain stupid way.
Once, we thought that men were attracted to pretty women because there was some inherent property called "beauty", or that people helped their neighbors because there was a universal Moral Law to which all minds would have access. Once it was the height of sophistication to argue whether people were truly good but corrupted by civilization, or truly evil but restrained by civilization.
Evolutionary psychology doesn't answer "What utility function should we give the FAI?", but it gives good reasons to avoid the "solution": 'just tell it to look for the Universal Moral Law accessible to all minds, and then do that.' And I think a lot of philosophy progresses by closing off all possible blind alleys until people grudgingly settle on the truth because they have no other alternative.
I am less confident in my understanding of eliminativism than of evo psych, so I am less willing to speculate on it. But since one common FAI proposal is "find out human preferences, and then do those", if it turns...
if you were in a burning building, you would try pretty hard to get out. Therefore, you must strongly dislike death and want to avoid it. But if you strongly dislike death and want to avoid it, you must be lying when you say you accept death as a natural part of life and think it's crass and selfish to try to cheat the Reaper. And therefore your reluctance to sign up for cryonics violates your own revealed preferences! You must just be trying to signal conformity or something.
I don't think this section bolsters your point much. The obvious explanation f...
A more practical example: when people discuss cryonics or anti-aging, the following argument usually comes up in one form or another: if you were in a burning building, you would try pretty hard to get out. Therefore, you must strongly dislike death and want to avoid it. But if you strongly dislike death and want to avoid it, you must be lying when you say you accept death as a natural part of life and think it's crass and selfish to try to cheat the Reaper.
nitpick: Burning to death is painful, and it can happen at any stage of life. "You want to live a long life and die peacefully with dignity" can also be derived, though of course it's more complicated.
So if someone stays in the haunted house despite the creaky stairwell, his preferences are revealed as rationalist?
Personally, I would have run away precisely because I would not think the sound came from a non-existent, and therefore harmless, ghost!
Thanks for this great sequence of posts on behaviourism and related issues.
Anyone who does not believe mental states are ontologically fundamental - ie anyone who denies the reality of something like a soul - has two choices about where to go next. They can try reducing mental states to smaller components, or they can stop talking about them entirely.
Here's what I take it you're committed to:
goals appear only when you make rough generalizations from its behavior in limited cases.
I am surprised no one brought up the usual map / territory distinction. In this case the territory is the set of observed behaviors. Humans look at the territory and with their limited processing power they produce a compressed and lossy map, here called the goal.
The goal is a useful model to talk simply about the set of behaviors, but has no existence outside the head of people discussing it.
if you were in a burning building, you would try pretty hard to get out. Therefore, you must strongly dislike death and want to avoid it. But if you strongly dislike death and want to avoid it, you must be lying when you say you accept death as a natural part of life and think it's crass and selfish to try to cheat the Reaper.
Won't it be the case that someone who tries to escape from a burning building does so just to avoid the pain and suffering it inflicts? It would be such a drag to be burned alive rather than to die a peaceful, painless death by poison.
Interesting that you chose the "burning building" analogy. In the fire sermon the Buddha argued that being incarnated in samsara was like being in a burning building and that the only sensible thing to do was to take steps to ensure the complete ending of the process of reincarnation in samsara ( and dying just doesn't cut it in this regard). The burning building analogy in this case is a terrible one- as we are talking about the difference between a healthy person seeking to avoid pain and disability versus the cryonics argument- which is all ab...
Excellent post!
I hope that somewhere along the way you get to the latest neuroscience suggesting that the human motivational system is composed of both model-based and model-free reinforcement mechanisms.
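For readers unfamiliar with the distinction, here's a toy sketch of the two mechanisms on a two-armed bandit. All names and numbers are illustrative, and this only gestures at the neuroscience:

```python
import random

random.seed(0)
TRUE_REWARD = {"left": 0.2, "right": 0.8}  # hidden payoff probabilities

def pull(arm):
    return 1.0 if random.random() < TRUE_REWARD[arm] else 0.0

# Model-free: cache action values directly from experience; no world model.
q = {"left": 0.0, "right": 0.0}
alpha = 0.1
for _ in range(2000):
    arm = random.choice(list(q))
    q[arm] += alpha * (pull(arm) - q[arm])   # TD-style update

# Model-based: learn reward statistics, then *plan* from the learned model.
counts = {"left": [0, 0], "right": [0, 0]}   # [successes, trials]
for _ in range(2000):
    arm = random.choice(list(counts))
    counts[arm][0] += pull(arm)
    counts[arm][1] += 1
model = {arm: s / n for arm, (s, n) in counts.items()}
plan = max(model, key=model.get)

print(max(q, key=q.get), plan)   # both should settle on "right"
```

Both mechanisms converge on the same choice here, but they diverge when the world changes: the model-based planner can re-plan immediately from its updated model, while the cached values must be slowly relearned — one reason behavior can look goal-directed in some contexts and habit-like in others.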
Keep up the good work.
Without my dealing here with the other alternatives: do you, Yvain, or does any other LW reader, think that it is (logically) possible that mental states COULD be ontologically fundamental?
Further, why is that possibility tied to the word "soul", which carries all sorts of irrelevant baggage?
Full disclosure: I do (subjectively) know that I experience red, and other qualia, and try to build that in to my understanding of consciousness, which I also know I experience (:-) (Note that I purposely used the word "know" and not the word "believe".)
This is an excellent post Yvain. How can I socially pressure you into posting the next one? Guilt? Threats against my own wellbeing?
I like to enforce reductionist consistency in my own brain. I like my ethics universal and contradiction-free, mainly because then other people can't accuse me of being inconsistent.
The rest is akrasia.
Reductionists want to reduce things like goals and preferences to the appropriate neurons in the brain; eliminativists want to prove that humans, like the blue-minimizing robot, don't have anything of the sort until you start looking at high level abstractions.
I don't really see how these two philosophies contradict.
Absolutely fantastic post. Extremely clearly written, and made the blue-minimizing robot thought experiment really click for me. Can't wait for the next one.
Comment reply 2 of 2.
Like,
LW straw man: "OMG! You took advantage of a cheap syncretic symmetry between the perspectives of Thomism and computationalist singularitarianism in order to carve up reality using the words of the hated enemy, instead of sitting by while people who know basically nothing about philosophy assert that people who actually do know something about philosophy use the word 'soul' to designate something that's easy to contemptuously throw aside as transparently ridiculous! Despite your initial strong emphasis that your effort was very hasty and largely an attempt at having fun, I am still very skeptical of your mental health, let alone your rationality!
One-fourths-trolling variation on Will_Newsome: "Aside from the very real importance of not setting a precedent or encouraging a norm of being contemptuous of things you don't understand, which we'll get back to... First of all, I was mostly just having fun, and second of all, more importantly, the sort of thing I did there is necessary for people to do if they want to figure out what people are actually saying instead of systematically misguidedly attributing their own inaccurate maps to some contemptible (non-existent) enemy of Reason. Seriously, you are flinching away from things because they're from the wrong literary genre, even though you've never actually tried to understand that literary genre. (By the way, I've actually looked at the ideas I'm talking about, and I don't have the conceptual allergies that keep you from actually trying to understand them on grounds of "epistemic hygiene", or in other words on grounds of assuming the conclusion of deserved contempt.) If someone took a few minutes to describe the same concepts in a language you had positive affect towards then you probably wouldn't even bother to be skeptical. But if I cipher-substitute the actually quite equivalent ideas thought up by the contemptible enemy then those same ideas become unmotivated insanity, obviously originally dreamed up because of some dozens of cognitive biases. (By the way, "genetic fallacy"; by the way, "try not to criticize people when they're right".) And besides charity and curiosity being fundamental virtue-skills in themselves, they're also necessary if one is to accurately model any complex phenomenon/concept/thing/perspective at all.
LW straw man: "What is this nonsense? You are trying to tell us that, 'it is virtuous to engage in lots of purposeful misinterpretation of lots of different models originally constructed by various people who you for some probably-motivatedly-misguided reason already suspect are generally unreasonable, even at the cost of building a primary maximally precise model, assuming for some probably-motivatedly-misguided reason that those two are necessarily at odds'. Or perhaps you are saying, 'it is generally virtuous to naively pattern match concepts from unfamiliar models to the nearest concept that you can easily imagine from a model you already have'. Or maybe, 'hasty piecemeal misinterpretations of mainstream Christianity and similar popular religions are a good source of useful ideas', or 'all you have to do is lower your epistemic standards and someday you might even become as clever as me', or 'just be stupid'. But that's horrible advice. You are clearly wrong, and thus I am justified in condescendingly admonishing you and guessing that you are yet another sympathizer of the contemptible enemies of Reason. (By the way aren't those hated enemies of Reason so contemptible? Haha! So contemptible! Om nom nom signalling nom contempt nom nom "rationality" nom.)
One-thirds-trolling variation on Will_Newsome: "...So, ignoring the extended mutual epistemic back-patting session... I am seriously warning you: it is important that you become very skillful---fast, thorough, reflective, self-sharpening---at finding or building various decently-motivated-if-imperfect models of the same process/concept/thing so as to form a constellation of useful perspectives on different facets of it, and different ways of carving its joints, and why different facets/carvings might seem differentially important to various people or groups of people in different memetic or psychological contexts, et cetera. Once you have built this and a few other essential skills of sanity, that is when you can be contemptuous of any meme you happen upon that hasn't already been stamped with your subculture's approval. Until then you are simply reveling in your ignorance while sipping poison. Self-satisfied insanity is the default, for you or for any other human who doesn't quite understand that real-life rationality is a set of skills, not just a few tricks or a game or a banner or a type of magic used by Harry James Potter-Evans-Verres. Like any other human, you use your cleverness to systematically ignore the territory rather than try to understand it. Like any other human, you cheer for your side rather than notice confusion. Like any other human, you self-righteously stand on a mountain of cached judgments rather than use curiosity to see anything anew. Have fun with that, humans. But don't say I didn't warn you."
By the way aren't those hated enemies of Reason so contemptible? Haha! So contemptible! Om nom nom signalling nom contempt nom nom "rationality" nom.
I am seriously warning you: it is important that you become very skillful---fast, thorough, reflective, self-sharpening---at finding or building various decently-motivated-if-imperfect models of the same process/concept/thing so as to form a constellation of useful perspectives on different facets of it, and different ways of carving its joints, and why different facets/carvings might seem differentially important to various people or groups of people in different memetic or psychological contexts, et cetera.
Why do you think this is so important? As far...
Anyone who does not believe mental states are ontologically fundamental - ie anyone who denies the reality of something like a soul - has two choices about where to go next. They can try reducing mental states to smaller components, or they can stop talking about them entirely.
In a utility-maximizing AI, mental states can be reduced to smaller components. The AI will have goals, and those goals, upon closer examination, will be lines in a computer program.
But in the blue-minimizing robot, its "goal" isn't even a line in its program. There's nothing that looks remotely like a goal in its programming, and goals appear only when you make rough generalizations from its behavior in limited cases.
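The contrast can be made concrete with a hypothetical side-by-side sketch (the world representation, actions, and names are all invented for illustration):

```python
# Design 1: the goal is literally a line in the program...
def utility(world):                      # <- this line IS the goal
    return -world.count("blue")

def utility_maximizer(world, actions):
    # choose the action whose predicted outcome scores highest
    return max(actions, key=lambda a: utility(a(world)))

# Design 2: the blue-minimizing robot has no such line -- only a reflex.
def reflex_robot(pixel_is_blue):
    return "fire_laser" if pixel_is_blue else "move_forward"

zap = lambda w: [c for c in w if c != "blue"]   # hypothetical action
wait = lambda w: w                               # hypothetical action
world = ["blue", "red", "blue"]
print(utility_maximizer(world, [zap, wait]) is zap)   # True
```

From the outside, both designs can produce blue-destroying behavior; only the first contains anything you could point to and call a goal.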
Philosophers are still very much arguing about whether this applies to humans; the two schools call themselves reductionists and eliminativists (with a third school of wishy-washy half-and-half people calling themselves revisionists). Reductionists want to reduce things like goals and preferences to the appropriate neurons in the brain; eliminativists want to prove that humans, like the blue-minimizing robot, don't have anything of the sort until you start looking at high level abstractions.
I took a similar tack asking ksvanhorn's question in yesterday's post - how can you get a more accurate picture of what your true preferences are? I said:
A more practical example: when people discuss cryonics or anti-aging, the following argument usually comes up in one form or another: if you were in a burning building, you would try pretty hard to get out. Therefore, you must strongly dislike death and want to avoid it. But if you strongly dislike death and want to avoid it, you must be lying when you say you accept death as a natural part of life and think it's crass and selfish to try to cheat the Reaper. And therefore your reluctance to sign up for cryonics violates your own revealed preferences! You must just be trying to signal conformity or something.
The problem is that not signing up for cryonics is also a "revealed preference". "You wouldn't sign up for cryonics, which means you don't really fear death so much, so why bother running from a burning building?" is an equally good argument, although no one except maybe Marcus Aurelius would take it seriously.
Both these arguments assume that somewhere, deep down, there's a utility function with a single term for "death" in it, and all decisions just call upon this particular level of death or anti-death preference.
More explanatory of the way people actually behave is that there's no unified preference for or against death, but rather a set of behaviors. Being in a burning building activates fleeing behavior; contemplating death from old age does not activate cryonics-buying behavior. People guess at their opinions about death by analyzing these behaviors, usually with a bit of signalling thrown in. If they desire consistency - and most people do - maybe they'll change some of their other behaviors to conform to their hypothesized opinion.
One more example. I've previously brought up the case of a rationalist who knows there's no such thing as ghosts, but is still uncomfortable in a haunted house. So does he believe in ghosts or not? If you insist on there being a variable somewhere in his head marked $belief_in_ghosts = (0,1) then it's going to be pretty mysterious when that variable looks like zero when he's talking to the Skeptics Association, and one when he's running away from a creaky staircase at midnight.
But it's not at all mysterious that the thought "I don't believe in ghosts" gets reinforced because it makes him feel intelligent and modern, and staying around a creaky staircase at midnight gets punished because it makes him afraid.
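That reinforcement story is easy to render as a toy model (all names and numbers hypothetical):

```python
# The mysterious version: a single stored variable...
belief_in_ghosts = 0   # ...which must somehow read 0 and 1 at once.

# The non-mysterious version: context-specific response strengths,
# each adjusted by its own history of reward and punishment.
strengths = {"assert_no_ghosts": 0.0, "avoid_creaky_staircase": 0.0}

def reinforce(response, reward, rate=0.5):
    strengths[response] += rate * (reward - strengths[response])

reinforce("assert_no_ghosts", +1.0)        # feels intelligent and modern
reinforce("avoid_creaky_staircase", +1.0)  # relief from fear is rewarding

# Both responses end up strong; no single belief variable is needed,
# and no contradiction arises when they fire in different contexts.
print(all(v > 0 for v in strengths.values()))  # True
```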
Behaviorism was one of the first and most successful eliminativist theories. I've so far ignored the most modern and exciting eliminativist theory, connectionism, because it involves a lot of math and is very hard to process on an intuitive level. In the next post, I want to try to explain the very basics of connectionism, why it's so exciting, and why it helps justify discussion of behaviorist principles.