Fallacies of Compression

Eliezer Yudkowsky

A Human's Guide to Words

17th Feb 2008

"The map is not the territory," as the saying goes. The only life-size, atomically detailed, 100% accurate map of California is California. But California has important regularities, such as the shape of its highways, that can be described using vastly less information—not to mention vastly less physical material—than it would take to describe every atom within the state borders. Hence the other saying: "The map is not the territory, but you can't fold up the territory and put it in your glove compartment."

A paper map of California, at a scale of 10 kilometers to 1 centimeter (a million to one), doesn't have room to show the distinct position of two fallen leaves lying a centimeter apart on the sidewalk. Even if the map tried to show the leaves, the leaves would appear as the same point on the map; or rather the map would need a feature size of 10 nanometers, which is a finer resolution than most book printers handle, not to mention human eyes.
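The arithmetic behind those figures can be checked directly. The sketch below is only an illustration; its variable names are invented here and do not come from the original post. It uses exact fractions so that no floating-point rounding creeps in.

```python
from fractions import Fraction

# Map scale from the text: 10 kilometers of territory per 1 centimeter of map.
CM_PER_KM = 100_000                      # 1 km = 1,000 m = 100,000 cm
territory_cm = 10 * CM_PER_KM            # 10 km expressed in centimeters
map_cm = 1
scale = Fraction(territory_cm, map_cm)   # territory length / map length

# Two leaves lying 1 cm apart on the sidewalk.
leaf_gap_cm = Fraction(1)

# Their separation on the map, converted to nanometers.
NM_PER_CM = 10_000_000                   # 1 cm = 0.01 m = 10^7 nm
gap_on_map_nm = leaf_gap_cm / scale * NM_PER_CM

print(f"scale: {scale} to 1")
print(f"leaf separation on the map: {gap_on_map_nm} nm")
```

The scale comes out to a million to one, and the one-centimeter gap between the leaves shrinks to 10 nanometers on paper, matching the numbers in the paragraph above.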

Reality is very large—just the part we can see is billions of lightyears across. But your map of reality is written on a few pounds of neurons, folded up to fit inside your skull. I don't mean to be insulting, but your skull is tiny, comparatively speaking.

Inevitably, then, certain things that are distinct in reality will be compressed into the same point on your map.

But what this feels like from inside is not that you say, "Oh, look, I'm compressing two things into one point on my map." What it feels like from inside is that there is just one thing, and you are seeing it.

A sufficiently young child, or a sufficiently ancient Greek philosopher, would not know that there were such things as "acoustic vibrations" or "auditory experiences". There would just be a single thing that happened when a tree fell; a single event called "sound".

To realize that there are two distinct events, underlying one point on your map, is an essentially scientific challenge—a big, difficult scientific challenge.

Sometimes fallacies of compression result from confusing two known things under the same label—you know about acoustic vibrations, and you know about auditory processing in brains, but you call them both "sound" and so confuse yourself. But the more dangerous fallacy of compression arises from having no idea whatsoever that two distinct entities even exist. There is just one mental folder in the filing system, labeled "sound", and everything thought about "sound" drops into that one folder. It's not that there are two folders with the same label; there's just a single folder. By default, the map is compressed; why would the brain create two mental buckets where one would serve?

Or think of a mystery novel in which the detective's critical insight is that one of the suspects has an identical twin. In the course of the detective's ordinary work, his job is just to observe that Carol is wearing red, that she has black hair, that her sandals are leather—but all these are facts about Carol. It's easy enough to question an individual fact, like WearsRed(Carol) or BlackHair(Carol). Maybe BlackHair(Carol) is false. Maybe Carol dyes her hair. Maybe BrownHair(Carol). But it takes a subtler detective to wonder if the Carol in WearsRed(Carol) and BlackHair(Carol)—the Carol file into which his observations drop—should be split into two files. Maybe there are two Carols, so that the Carol who wore red is not the same woman as the Carol who had black hair.

Here it is the very act of creating two different buckets that is the stroke of genius insight. 'Tis easier to question one's facts than one's ontology.
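The difference between questioning a fact and questioning the ontology can be sketched in code. What follows is a toy illustration, not anything from the original post: the `Bucket` class and the `split` helper are hypothetical names invented for this example. Questioning a fact swaps one entry inside a single bucket; the detective's insight changes how many buckets there are.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    """One mental file: a label plus the observations dropped into it."""
    label: str
    facts: set[str] = field(default_factory=set)


def split(bucket: Bucket, label_a: str, label_b: str,
          goes_to_b: set[str]) -> tuple[Bucket, Bucket]:
    """Split one bucket into two, moving the facts in goes_to_b to the second."""
    unknown = goes_to_b - bucket.facts
    if unknown:
        raise ValueError(f"not in bucket: {sorted(unknown)}")
    return (
        Bucket(label_a, bucket.facts - goes_to_b),
        Bucket(label_b, set(goes_to_b)),
    )


# By default every observation lands in the single "Carol" file.
carol = Bucket("Carol", {"WearsRed", "BlackHair", "LeatherSandals"})

# Questioning a fact: still one bucket, with one entry swapped.
revised = Bucket("Carol", (carol.facts - {"BlackHair"}) | {"BrownHair"})

# Questioning the ontology: the bucket itself becomes two.
# Which Carol owns the sandals is unknown, so that fact stays in the first file.
carol_1, carol_2 = split(carol, "Carol", "Carol-2", goes_to_b={"BlackHair"})

print(revised.label, sorted(revised.facts))
print(carol_1.label, sorted(carol_1.facts))
print(carol_2.label, sorted(carol_2.facts))
```

Nothing in the data forces the split. The `split` call has to be supplied from outside, which is the sense in which creating the second bucket is the hard step.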

The map of reality contained in a human brain, unlike a paper map of California, can expand dynamically when we write down more detailed descriptions. But what this feels like from inside is not so much zooming in on a map, as fissioning an indivisible atom—taking one thing (it felt like one thing) and splitting it into two or more things.

Often this manifests in the creation of new words, like "acoustic vibrations" and "auditory experiences" instead of just "sound". Something about creating the new name seems to allocate the new bucket. The detective is liable to start calling one of his suspects "Carol-2" or "the Other Carol" almost as soon as he realizes that there are two of them.

But expanding the map isn't always as simple as generating new city names. It is a stroke of scientific insight to realize that such things as acoustic vibrations, or auditory experiences, even exist.

The obvious modern-day illustration would be words like "intelligence" or "consciousness". Every now and then one sees a press release claiming that a research study has "explained consciousness" because a team of neurologists investigated a 40Hz electrical rhythm that might have something to do with cross-modality binding of sensory information, or because they investigated the reticular activating system that keeps humans awake. That's an extreme example, and the usual failures are more subtle, but they are of the same kind. The part of "consciousness" that people find most interesting is reflectivity, self-awareness, realizing that the person I see in the mirror is "me"; that and the hard problem of subjective experience as distinguished by Chalmers. We also label "conscious" the state of being awake, rather than asleep, in our daily cycle. But they are all different concepts going under the same name, and the underlying phenomena are different scientific puzzles. You can explain being awake without explaining reflectivity or subjectivity.

Fallacies of compression also underlie the bait-and-switch technique in philosophy—you argue about "consciousness" under one definition (like the ability to think about thinking) and then apply the conclusions to "consciousness" under a different definition (like subjectivity). Of course it may be that the two are the same thing, but if so, genuinely understanding this fact would require first a conceptual split and then a genius stroke of reunification.

Expanding your map is (I say again) a scientific challenge: part of the art of science, the skill of inquiring into the world. (And of course you cannot solve a scientific challenge by appealing to dictionaries, nor master a complex skill of inquiry by saying "I can define a word any way I like".) Where you see a single confusing thing, with protean and self-contradictory attributes, it is a good guess that your map is cramming too much into one point—you need to pry it apart and allocate some new buckets. This is not like defining the single thing you see, but it does often follow from figuring out how to talk about the thing without using a single mental handle.

So the skill of prying apart the map is linked to the rationalist version of Taboo, and to the wise use of words; because words often represent the points on our map, the labels under which we file our propositions and the buckets into which we drop our information. Avoiding a single word, or allocating new ones, is often part of the skill of expanding the map.

[-]Anonymous2418y240

I love you, Eliezer.

[-]Eliezer Yudkowsky18y400

Thanks, but I already have a girlfriend.

[-]Zane2y30

you find some pretty ironic things when rereading 17-year-old blog posts, but this one takes the cake.

[-]Nick_Tarleton18y260

Was that an intentional example of a definitional mismatch?

[-]Tom_McCabe218y20

"Expanding your map is (I say again) a scientific challenge: part of the art of science, the skill of inquiring into the world."

Unless you're doing original research, you could simply read about <subject you don't understand> until you realize that you have to use separate mental categories for the two distinct objects. Original discovery is far too time-consuming to bother with when you're just reworking old ground.

[-]Kenny13y20

Original seeing is still a worthwhile skill to improve and maintain, even if you're not doing "original research".

[-]Ben_Jones18y50

Cracking post, many thanks.

Surprised you didn't note this as a fundamental skill gained (to some extent) in growing up. I'm reminded of the study in which two young kids were shown a ball being hidden under one of three cups. One of the kids was led out of the room, the ball switched to another cup, and the remaining kid asked where he thought the other kid believed the ball was. I think it's between the ages of three and four that kids gain the ability to create an additional, temporary, subjunctive bucket, which allows them to empathise with another mind.

[-]Yelsgib18y2-1

I started reading this blog a few days ago and am particularly interested in your posts since you seem to be a modeler. This sort of thing appeals to me.

Comments/criticisms:

I agree that it is not a good idea to cram too much into one point/label. However, what are your thoughts regarding the necessity of doing this? This is a point which I have not seen you address.

What I would claim is that our own personal "definitions" for words correspond strongly to the computational structures related to those words (as I expect you would agree) - however it may be, and we should expect that it is, difficult to operate outside of our current computational structures. To bifurcate a definition (e.g. to split "phenomenological sound" into "systematic sound" and "experiential sound") might be extremely mentally taxing; it might bring the conversation to a halt. How easy is it to change the map, in your opinion?

I am also somewhat wary of the recent trends in your thinking. In particular, all of your examples refer to very specific phenomena, very simple phenomena. Can you give an example of how you think that we apply/should apply (is there a should in here somewhere?) decoupling in order to disambiguate in very high-order contexts? E.g. let's say we're talking about a difficult-to-pin-down-but-easy-to-use term like "post-modernism"? Is there any way to talk about such a thing without developing a definition with someone? The dictionary definition would obviously be worthless, but so would pretty much any definition that we can come up with.

What about words that "can't be defined"? (e.g. "art")

I have many more questions for you, but I'll end here.

You seem like you might actually think somewhat clearly about the world, which is rare indeed. I really do appreciate the clarity and thoughtfulness of your posts, I'm merely trying to bring up points pertinent to my current and past interests and (hopefully) open up your eyes to potential gaps in your thinking.

Hope all is well.

[-]GloriaSidorum13y10

What about words that "can't be defined"? (e.g. "art")

If you can't think of any unifying features of a category, but you still want to use it, you could go about listing members. "Art":

Includes (for all known English-speaking humans): intentional paintings from before 1900; statues; stained-glass windows; &c.

Includes for many: abstract art; modern art; cubism; photography; &c.

Includes for a few: man-made objects not usually labelled as art; &c.

Includes for no known English-speaking human: non-man-made objects; the Holocaust; &c.

If the effect of knowing what "art" is (although that one's common-usage definition can be articulated in terms of features) is understanding what English speakers mean when they say it, then a list-based definition is as effective, though not as efficient, as a feature-based one. (You can make up for not knowing what criterion someone uses with a bit of Bayesian updating: the probability that Alice will call a Jackson Pollock piece "art" is greater if she called Léger's "Railway Crossing" "art" than if she did not.)

[-]TheOtherDave13y00

It's worth being a little careful when talking about "list-based" as opposed to "feature-based" definitions, because it's easy to confuse those ideas with the more standard ideas of extensional and intensional definitions.

E.g., an extensional definition of "art" doesn't allow new works of art to be recognized as belonging to the set, and is therefore clearly not what English speakers mean when they say "art", but if I'm understanding what you mean by "list-based" here the same objection doesn't apply. What you seem to be talking about here is an intensional definition where the defining properties are not explicitly articulable, and where knowledge of them is transmitted by analysis of prototypical examples and non-examples.

Yes?

[-]GloriaSidorum13y00

That works a bit better, at least for the art example. A better example of where you'd best "define" a set by memorising all of its members might be the morality of a particular culture. For instance, some African tribes consider it evil to marry someone whose sibling has the same first name as oneself. Not only is it hard to put into words, in English or Ju|'hoan, a definition of "bad" (or |kàù) which would encompass this, but one couldn't look at a bunch of other things that these tribes consider bad and infer that one shouldn't marry someone who has a sibling who shares one's first name. Better to just know that that's one of the things that is said to |kàù in that culture.

[-]TheOtherDave13y10

Sure. Though even in cases like that, humans have a way of generalizing these sorts of things -- that is, of inferring an intensional definition which they extend, rather than treating the set strictly extensionally. It would not surprise me if after a few generations such a community came to consider marrying someone whose parent has the same first name as oneself to be |kàù, for example.

[-]GloriaSidorum13y10

If I recall correctly, they actually do. It falls under their incest taboo. So "bad" in any culture could probably be defined by a list of generalised principles which don't necessarily share any characteristics other than being labelled as "bad".

[-]TheOtherDave13y00

Yup, more or less agreed.

[+]Phirand_Ice18y-50

[+]Doug_S.18y-61

[-]Richard_Hollerith218y90

By that definition a punch in the nose is art.

[-][anonymous]16y00

This can, of course, be the case.

[-]gutzperson18y20

Paul McCarthy did a video/performance in the 1970s where he punches his own nose (face). So it is art, isn't it?

[-]outeast518y10

By that definition a punch in the nose is art

Only in a staged boxing match - since when is a hittee an 'audience'? And is a punch supposed to create a reaction or, say, disable the hittee? I agree that the nature of the reaction needs further definition - or arson in a theatre must be art :)

[-]gutzperson18y00

Going back to the definition of art by Doug S. A reaction by the audience is not necessarily a definition for art. A reaction by an audience can be achieved by many means. See post by Richard Hollerith. I fear that a definition of art that is audience oriented conforms more or less to a definition favoured by public bodies like arts councils, governments and everybody who gives money to the arts. There are numerous studies about art and audiences. Similar to television ratings that make us believe that something is good television because it is watched by zillions, audience reaction (ratings) is used to define if something is (good or bad) art. By the way, many artists have created art because they enjoyed doing so, and many of them (think about painters) did not necessarily think about the reaction of an audience when they created their works.

[-]Eliezer Yudkowsky18y140

I now Taboo the word 'art'. Does anyone still think they have a point to make?

[-]Dojan15y00

Doesn't seem like it :)

[-]Yevuard4mo00

Action and creation is usually deliberate, derived from one or more purposes. Sometimes a purpose is specifically to elicit a desired mental framing (not strictly reaction, rather internal state) in others. Two people can take the same actions for different reasons - a face punch with the purpose to disable someone differs from an identical face punch to create a mental framing such as an emotion of fear - in such cases the purpose often needs to be overtly communicated to clarify the action was done for different purposes than might be normally inferred. We have a term for actions and creations with a strong overt element of eliciting a desired mental framing. Although it may often seem useless, there is nevertheless social value in creating those experiences, demonstrated by the pay that is made for ordinary examples such as entertainment experiences and products which are functional but also elicit mental framings. Many who create such artifacts or performances do it as much for the desired mental state they wish to experience, as for the hope that others enjoy the same and pay them.

ChatGPT was able to infer what I was talking about ("Playing taboo: what is the concept I am referring to here?"), so it's at least comprehensible, though it could likely be much more terse. (With appreciation to the LTUE keynote speaker who set an ordinary water bottle on a table in an ordinary way and declared it an instance of this concept, then made the point that by so declaring it was no longer possible to claim it not such an instance, but only to decide if it was a "good" or "bad" instance of the concept).

The point I wish to make by this is there are no words that can't be defined.

[-]ron_purewal18y-20

Reality is very large - just the part we can see is billions of lightyears across.

bah.

a poster above has already noted the irony here: the term 'reality' is just as susceptible to equivocal use as is 'sound' or 'art'.**

what is reality, after all, if not 'the part we can see'? (by 'see' i am of course including all means of detection, from literal sensory awareness to circumstantial inference.) indeed, it's dicey to posit the existence of any 'reality' independent of our own consciousness. as richard dawkins has said, even the most seemingly incontrovertible truths - like the heat of the desert and the hardness of rocks - are only so because of our own evolutionary adaptations. i.e., we feel rocks as hard only because our brains have created 'hardness' as a way of rationalizing our quantum interactions with rocks.

on a separate note, it's amazing how much bigger northern california (and that means northern california - the cold part with the big trees) looks when one flips the map of california so that south faces up. (i will not commit the fallacy of referring to this orientation as 'upside down'.)

**er, sorry, i meant 'that which is deemed to have value unrelated to practical utility'

[-]Tesseract16y130

Reality is very large - just the part we can see is billions of lightyears across. But your map of reality is written on a few pounds of neurons, folded up to fit inside your skull. I don't mean to be insulting, but PUNY HUMAN, YOU CANNOT CONTAIN REALITY WITHIN YOUR TINY BRAIN

[-]Taurus_Londono13y50

...a team of neurologists investigated a 40Hz electrical rhythm...

For the sake of the blook; neuroscientists, not neurologists. Words can be wrong.

[-]PhilGoetz9y20

Great post! There is also the non-discrete aspect of compression: information loss. English has, according to some dictionaries, over a million words. It's unlikely we store most of our information in English. Probably there is some sort of dimension reduction, like PCA. There is in any case probably lossy compression. This means people with different histories will use different frequency tables for their compression, and will throw out different information when encoding a verbal statement. I think you would almost certainly find that if you measure word use frequency for different people, then cluster the word use distributions, some clusters would correspond to ideologies. The interesting question is which comes first, the ideology, or the word usage frequency (caused by different life experiences).

[-]Yevuard4mo10

This article analyzes the value of splitting. To quote XKCD, "really we're both just categorization pendants" (https://xkcd.com/2518/). I feel both lumping and splitting have tremendous value, as well as scientific relevance (as that is touted in the article). Perhaps lumping is easier because brains pattern match well; but "enhance, enhance, enhance..." on a single pixel is just a hallucination (a single enhance using multiple video frames as samples can be somewhat meaningful, by which point the true signal is exhausted). More information is required, which itself is pattern matched, and splitting is a natural consequence of having more to hang in the tree data structure.

Splitting is good. So is lumping. I'm not seeing a strong case made here other than IBM's "think".

I might instead advance the claim that producing a minimal sufficient knowledge tree (or, a map of the right resolution, except that it has variable scale to detail interesting/relevant parts better, so the extended analogy mis-maps (heh) a bit) for the questions of interest is a better goal. This likely requires having the data available to make those splits (or knowing how and being able to go get it if time is available), but that doesn't change the goal of producing a minimally sufficient map to minimize cognitive burden.

But - perhaps the point is "the map is not the territory" (lump more) is considered well-enough known and this is intended to push back against too much lumping, "the fallacies of compression". Still I think I prefer an integrated view.
