This is an excellent post. It has that rare quality, like much of the Sequences, of the ideas it describes being utterly obvious—in retrospect. (I also appreciate the similarly Sequence-like density of hyperlinks, exploiting the not-nearly-exploited-enough-these-days power of hypertext to increase density of ideas without a concomitant increase in abstruseness.)
… which is why I find it so puzzling to see all these disagreeing comments, which seem to me to contain an unusual level of reflexive contrarianism and pedantry.
Oh yeah, totally. I guess that’s going on now, then? I will try and figure out how one nominates things…
Considering how much time is spent here on this subject, I'm surprised at how little reference to distributional semantics is made. That field is already a half-century-old tradition of analyzing word meanings via statistics and vector spaces. It may be worthwhile to reach into it to bolster and clarify some of these things that come up over and over.
This is a nice crisp summary of something kind of like pragmatism but capable of more robust intersubjective mapmaking:
Everything we identify as a joint is a joint not "because we care about it", but because it helps us think about the things we care about.
To expand this a bit, when deciding on category boundaries, one should assess the effect on the cost-adjusted expressive power of all statements and compound concepts that depend on it, not just the direct expressive power of the category in question. Otherwise you can't get things like Newtonian physics and are stuck with the Ptolemaic or Copernican systems. (We REALLY don't care about Newton's laws of motion for their own sake.)
As someone who seems to care more about terminology than most (and as a result probably gets into more terminological debates on LW than anyone else (see 1 2 3 4)), I don't really understand what you're suggesting here. Do you think this advice is applicable to any of the above examples of naming / drawing boundaries? If so, what are its implications in those cases? If not, can you give a concrete example that might come up on LW or otherwise have some relevance to us?
Hi, Wei—thanks for commenting! (And sorry for the arguably somewhat delayed reply; it's been a really tough week for me.)
can you give a concrete example that might come up on LW or otherwise have some relevance to us?
Is Slate Star Codex close enough? In his "Anti-Reactionary FAQ", Scott Alexander writes—
Why use this made-up word ["demotism"] so often?
Suppose I wanted to argue that mice were larger than grizzly bears. I note that both mice and elephants are "eargreyish", meaning grey animals with large ears. We note that eargreyish animals such as elephants are known to be extremely large. Therefore, eargreyish animals are larger than noneargreyish animals and mice are larger than grizzly bears.
As long as we can group two unlike things together using a made-up word that traps non-essential characteristics of each, we can prove any old thing.
This post is mostly just a longer, more detailed version (with some trivial math) of the point Scott is making in these three paragraphs: mice and elephants form a cluster if you project into the subspace spanned by "color" and "relative ear size", but using a word to point to a cluster in such a "thin", impoverished subspace is a dishonest rhetorical move when your interlocutors are trying to use language to mostly talk about the many other features of animals which don't covary much with color and relative-ear-size.
...Thanks, I think I have a better idea of what you're proposing now, but I'm still not sure I understand it correctly, or if it makes sense.
mice and elephants form a cluster if you project into the subspace spanned by “color” and “relative ear size”, but using a word to point to a cluster in such a “thin”, impoverished subspace is a dishonest rhetorical move when your interlocutors are trying to use language to mostly talk about the many other features of animals which don’t covary much with color and relative-ear-size.
But there are times when it's not a dishonest rhetorical move to do this, right? For example suppose an invasive predator species has moved into some new area, and I have an hypothesis that animals with grey skin and big ears might be the only ones in that area who can escape being hunted to extinction (because I think the predator has trouble seeing grey and big ears are useful for hearing the predator and only this combination of traits offers enough advantage for a prey species to survive). While I'm formulating this hypothesis, discussing how plausible it is, applying for funding, doing field research, etc., it seems useful to create a new term like "eargreyish" so I don't have to keep repeating "grey animals with relatively large ears".
Since it doesn't seem to make sense to never use a word to point to a cluster in a "thin" subspace, what is your advice for when it's ok to do this or accept others doing this?
Sometimes people redraw boundaries for reasons of local expediency. For instance, the category of AGI seems to have been expanded implicitly in some contexts to include what might previously have just been called a really good machine learning library that can do many things humans can do. This allows AGI alignment to be a bigger-tent cause, and raise more money, than it would in the counterfactual where the old definitions were preserved.
This article seems to me to be outlining a principled case that such category redefinitions can be systematically distinguished from purely epistemic category redefinitions, with the implication that there's a legitimate interest in tracking which is which, and sometimes in resisting politicized recategorizations in order to defend the enterprise of shared mapmaking.
My interest in terminological debates is usually not to discover new ideas but to try to prevent confusion (when readers are likely to infer something wrong from a name, e.g., because of different previous usage or because a compound term is defined to mean something that's different from what one would reasonably infer from the combination of individual terms). But sometimes terminological debates can uncover hidden assumptions and lead to substantive debates about them. See here for an example.
CFS and SEID are both cases where certain states correlate with each other. Zack's post doesn't help us at all to reason about whether we should prefer CFS or SEID as a term.
I'm definitely not claiming to have the "correct" answer to all terminological disputes. (As the post says, "Of course, there isn't going to be a unique way to encode the knowledge into natural language.")
Suppose, hypothetically, that it were discovered that there are actually two or more distinct etiologies causing cases that had historically been classified as "chronic fatigue syndrome", and cases with different etiologies responded better to different treatments. In this hypothetical scenario, medical professionals would want to split what they had previously called "chronic fatigue syndrome" into two or more categories to reflect their new knowledge. I think someone who insisted that "chronic fatigue syndrome" was still a good category given the new discovery of separate etiologies would be making a mistake (with respect to the goals doctors have when they talk about diseases), even if the separate etiologies had similar symptoms (which is what motivated the CFS label in the first place).
In terms of the co
...I've alluded to this in other comments, but I think it's worth spelling out more comprehensively here.
I think this post makes a few main points:
I realize the three points cleave together pretty closely in the author's model, and make sense to think about in conjunction. But I think trying to introduce them all at once makes for more confusing reading.
I think the followup post Unnatural Categories Are Optimized For Deception does a pretty good job of spelling out the details of points #2 and #3. I think the current post does a good job at #1, a decent job at #2, but a fairly confused job at #3.
In pa...
When rationalists say that definitions can be wrong, we don't mean that there's a unique category boundary that is the True floating essence of a word, and that all other possible boundaries are wrong. We mean that in order for a proposed category boundary to not be wrong, it needs to capture some statistical structure in reality, even if reality is surprisingly detailed and there can be more than one such structure.
So, I got this part. And it seemed straightforwardly true to me, and seemed like a reasonably short inferential step away from other stuff LW has talked about. Categories are useful as mental compressions. Mental compressions should map to something. There are multiple ways you might want to cluster and map things. So far so straightforward.
And then the rest of the article left me more confused, and the disagreements in the comments got me even more confused.
Is the above claim the core claim of the article? If so, I'm confused what other people are objecting to. If not, I'm apparently still confused about the point of the article.
[edit: fwiw, I am aware of the subtext/discussion that the post is an abstraction of, and even taking that into account still feel fairly confused about some of the responses]
As has been mentioned elsewhere, this is a crushingly well-argued piece of philosophy of language and its relation to reasoning. I will say this post strikes me as somewhat longer than it needs to be, but that's also my opinion on much of the Sequences, so it is at least traditional.
Also, this piece is historically significant because it played a big role in litigating a community social conflict (which is no less important for having been (being?) mostly below the surface), and set the stage for a lot of further discussion. I think it's very important that "write a nigh-irrefutable argument about philosophy of language, in order to strike at the heart of the substantive disagreement which provoked the social conflict" is an effective social move in this community. This is a very unusual feature for a community to have! Also it's an absolutely crucial feature for any community that aspires to the original mission of the Sequences. I don’t think it’s a coincidence that so much of this site’s best philosophy is motivated by efforts to shape social norms via correct philosophical argument. It lends a sharpness and clarity to the writing which is missing from a lot of the more abstract philosophizing.
My earlier comment explains why I think this post is one of last year’s best. (My opinion of its quality remains unchanged, after ~1.5 years.)
Similarly, the primary thing when you take a word in your lips is your intention to reflect the territory, whatever the means
This sentence sounds to me like you want to use Korzybski's metaphor while ignoring the point of his argument. According to him, language is supposed to be used to create semantic reactions in the audience, and the "is a" of identity is to be avoided.
The essay feels like you struggle with "is a" but are neither willing to go Korzybski's way nor willing to provide a good argument for why we should use the "is a" of identity.
Do not ask whether there's a rule of rationality saying that you shouldn't call dolphins fish. Ask whether dolphins are fish.
That feels to me very wrong. Beliefs are supposed to pay rent in anticipated experiences and discussing whether dolphins are fish in the abstract is detached from anticipated experiences.
Context matters a great deal for what words mean. Thomas Kuhn asked both physicists and chemists whether helium is a molecule:
Both answered without hesitation, but their answers were not the same. For the chemist the atom of helium was a molecule because it behaved like one with respect to the kinetic theory of gases....
In standard English the statement "X is a Y" often means that within the relevant classification system X is a member of category Y. Which classification system is relevant often differs by context, but the OP deals with that explicitly:
in order for a proposed category boundary to not be wrong, it needs to capture some statistical structure in reality, even if reality is surprisingly detailed and there can be more than one such structure.
When quoting "the map is not the territory", a slogan that was created to criticize this usage of "is a" within a dense 750-page book, one of whose main messages is that "is a" shouldn't be used, I think that paragraph fails to adequately make a case that this common language usage is desirable and, if so, when it's desirable.
Saying that the primary intention with which language is used isn't to create some effect in the recipient of the language act is a big claim, and Zack simply states it without any reflection.
My first reaction to the text was like Wei Dai's "I don't really understand what you're suggesting here": I'm unsure what implications are supposed to be drawn for practical language use. The second is noting that the text gets basics* wrong, like the primary intention behind why words are used.
*: I mean basic in the sense of fundamental and not as in easy to understand
Is anyone interested in giving this a second nomination?
I argue that this post is significant for filling in a gap in our canon: in "Where to Draw the Boundary?" (note, "boundary", singular), Yudkowsky contemptuously dismisses the idea that dolphins could be considered fish. However, Scott Alexander has argued that it may very well make sense to consider dolphins fish. So ... which is it? Is Yudkowsky right that categories must "carve reality at the joints", or is Alexander right that "[a]n alternative categorization system is not an error, and borders are not objectively true or false"?
In this post, "Where to Draw the Boundaries?" (note, boundaries, plural), I argue that Yudkowsky is right that categories must carve reality at the joints; however, I reconcile this with Alexander's case that dolphins could be fish with a simple linear-algebraic intuition: entities might cluster in a smaller subspace of configuration space, while failing to cluster in a larger subspace. Clusters in particularly "thin" subspaces (like a fake job title that nevertheless makes predictions on the "what's printed on business cards" dimension) may fail to be useful.
(Perhaps also of significance is that th...
Interesting article. I dare not say I understand it fully. But in arguing that some categories are more or less wrong than others, is it fair to say you are arguing against the ugly duckling theorem?
Well, I usually try not to argue against theorems (as contrasted to arguing that a theorem's premises don't apply in a particular situation)—but in spirit, I guess so! Let me try to work out what's going on here—
The boxed example on the Wikipedia page you link, following Watanabe, posits a universe of three ducks—a White duck that comes First, a White duck that is not First, and a nonWhite duck that is not First—and observes that every pair of ducks agrees on half of the possible logical predicates that you can define in terms of Whiteness and Firstness. Generally, there are sixteen possible truth functions on two binary variables (like Whiteness or Firstness), but here only eight of them are distinct. (Although really, only eight of them could be distinct, because that's the number of possible subsets of three ducks (2³ = 8).) In general, we can't measure the "similarity" between objects by counting the number of sets that group them together, because that's the same for any pair of objects. We also get a theorem on binary vectors: if you have some k-dimensional vectors of bits, you can use Hamming distance to find the "most dissimilar" one, but if you extend the vectors into 2^k-
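For anyone who wants to check this concretely, here is a minimal sketch in Python (the duck encoding and helper function are my own, not Watanabe's): it enumerates the sixteen truth functions on two binary variables, confirms that only eight are distinct as classifications of the three ducks, and confirms that every pair of ducks agrees on exactly eight of the sixteen predicates.

```python
from itertools import product

# Watanabe's three ducks, described by two binary predicates: (White, First).
ducks = {
    "white_first": (1, 1),
    "white_notfirst": (1, 0),
    "notwhite_notfirst": (0, 0),
}

# All 16 truth functions on two binary variables, represented as the
# 4-tuple of outputs on inputs (0,0), (0,1), (1,0), (1,1).
truth_functions = list(product([0, 1], repeat=4))

def apply(f, white, first):
    """Evaluate a truth table f on the pair (white, first)."""
    return f[2 * white + first]

# How many of the 16 predicates are distinct as classifications of the
# three ducks?  (At most 2^3 = 8, since a classification is a subset.)
classifications = {tuple(apply(f, *d) for d in ducks.values()) for f in truth_functions}
print(len(classifications))  # 8

# Every pair of ducks agrees on exactly half (8 of 16) of the predicates,
# so counting shared predicates can't serve as a similarity measure.
names = list(ducks)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        a, b = ducks[names[i]], ducks[names[j]]
        agree = sum(apply(f, *a) == apply(f, *b) for f in truth_functions)
        print(names[i], names[j], agree)  # each pair: 8
```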
...treating reality as fixed and self as fixed and the discovery of the proper mapping between self concepts and reality concepts is doomed to failure because both your own intentions are fluid depending on what you are trying to do and your own sense of reality is fluid (including self model). Ontologies are built to be thrown away. They break in the tails. Fully embracing and extending the Wittgensteinian revolution prevents you from wasting effort resisting this.
This seems technically true but not relevant. Important classes of intersubjective coordination require locally stable category boundaries, and some ontologies have more variation we care about concealed in the tails than others.
There are processes that tend towards the creation of ontologies with stable expressive power, and others that make maps worse for navigation. It's not always expedient to cooperate with the making of a map that lets others find you, but it's important to be able to track which way you're pushing if you want there to sometimes be good maps.
I'm saying that this post itself is falling prey to the thing it advises against. Better to point at a cluster that helps navigate, like Hanson's babblers, than to talk about the information-theoretic content of aggregate clusters.
It seems to me like the OP is motivated by a desire to improve decisionmaking processes by making a decisive legal argument against corruption in front of a corrupt court, and that this is an inefficient way of coordinating to move people who are reachable to a better equilibrium.
Does that seem like substantively the same objection to you?
I found parts of the post object-level helpful, like the bit I directly commented on, but overall agree it's giving LW too much credit for coordinating towards "Rationality." But people like Zack will correctly believe that LW's corruption is not common knowledge if people like us aren't willing to state the obvious explicitly.
(Self-review.)
Argument for significance: earlier comment
...So, there is a legitimate complaint here. It's true that sailors in the ancient world had a legitimate reason to want a word in their language whose extension was {salmon, guppies, sharks, dolphins, ...}. (And modern scholars writing a translation for present-day English speakers might even translate that word as fish, because most members of that category are what we would call fish.) It indeed would not necessarily be helping the sailors to tell them that they need to exclude dolphins from the extension of that word, and instead include dolphins in the extension of their word for {monkeys, squirrels, horses, ...}.
But in order for your map to be useful in the service of your values, it needs to reflect the statistical structure of things in the territory—which depends on the territory, not your values.
In order for your map to be useful, it needs to reflect the statistical structure of things to the extent required by the value it is in service to.
That can be zero. There is a meta-category of things that are created by humans without any footprint in pre-existing reality. These include money, marriages, and mortgages.
Since useful categories can have no connection...
I'm saying that epistemics focused on usefulness-to-predicting is broadly useful in a way that epistemics optimized in other ways is not. It is more trustworthy in that the extent to which it's optimized for some people at the expense of other people must be very limited. (Of course it will still be more useful to some people than others, but the Schelling-point-nature means that we tend to take it as the gold standard against which other things are judged as "manipulative".)
Another defense of this Schelling point is that as we depart from it, it becomes increasingly difficult to objectively judge whether we are benefiting or hurting as a result. We get a web of contagious lies spreading through our epistemology.
I'm not saying this is a Schelling fence which has held firm through the ages, by any means; indeed, it is rarely held firm. But, speaking very roughly and broadly, this is a fight between "scientists" and "politicians" (or, as Benquo has put it, between engineers and diplomats).
I do, of course, think that the LessWrong community should be and to an extent is such a place.
Something about this has been bugging me and I maybe finally have a grasp on it.
It's perhaps somewhat entangled with this older Benquo comment elsewhere in this thread. I'm not sure if you endorse this phrasing but your prior paragraph seems similar:
Discourse about how to speak the truth efficiently, on a site literally called "Less Wrong," shouldn't have to explicitly disclaim that it's meant as advice within that context every time, even if it's often helpful to examine what that means and when and how it is useful to prioritize over other desiderata.
Since a couple-years-ago, I've updated "yes, LessWrong should be a fundamentally truthseeking place, optimizing for that at the expense of other things." (this was indeed an update for me, since I came here for the Impact and vague-appreciation-of-truthseeking, and only later updated that yes, Epistemics are one of the most important cause areas)
But, one of the most important things I want to get out of LessWrong is a clear map of how the rest of the world works, and how to interface with it.
So when I read the conclusion here...
...Simila
Obviously they do. There's no obvious upper limit to the structural complexity of a human creation. However, I was talking about pre-existing reality.
I question whether "pre-existing" is important here. Zack is discussing whether words cut reality at the joints, not whether words cut pre-existing reality at the joints. Going back to the example of creating a game -- when you're writing the rulebook for the game, it's obviously important in some sense that you are the one who gets to make up the rules... but I argue that this does not change the whole question of how to use language, what makes a description apt or inept, etc.
For example, if I invented the game of chess, calling rooks a type of pawn and reversing the meaning of king/queen for black/white would be poor map craftsmanship.
Money or marriage or mortgages are all things that need to work in certain ways, but there aren't pre-existing Money or Marriage or Mortgage objects, and their working well isn't a degree of correspondence to something pre-existing -- what realists usually mean by "truth" -- it's more like usefulness.
None of these examples are convincing on their face, though -- there are all sorts of things we can...
I worry that we're spending a LOT of energy on trying to "carve at the joints" of something that has no joints, or is so deep that the joints don't exist in the dimensions we perceive. Categories, like all models, can be better or worse for a given purpose, but they're never actually right.
The key to this is "for a purpose". Models are useful for predictions of something, and sometimes for shorthand communication of some kinds of similarity.
Don't ask whether dolphins are fish. Don't believe or imply that ...
We agree that models are only better or worse for a purpose, but ...
Ask whether this creature needs air. Ask how fast it swims. etc.
If there are systematic correlations between many particular creature-features like whether it needs air, how fast it swims, what it's shaped like, what its genome is, &c., then it's adaptive to have a short code for the conjunction of those many features that such creatures have in common.
Category isn't identity, but the cognitive algorithm that makes people think category is identity actually performs pretty well when things are tightly-clustered in configuration space rather than evenly distributed, which actually seems to be the case for a lot of things! (E.g., while there are (or were) transitional forms between species related by evolutionary descent, it makes sense that we have separate words for cats and dogs rather than talking about individual creature properties of ear-shape, &c., because there aren't any half-cats in our real world.)
There is an important difference between "identifying this pill as not being 'poison' allows me to focus my uncertainty about what I'll observe after administering the pill to a human (even if most possible minds have never seen a 'human' and would never waste cycles imagining administering the pill to one)" and "identifying this pill as not being 'poison', because if I publicly called it 'poison', then the manufacturer of the pill might sue me."
What is that sentence supposed to tell me? ...
It's not clear whether or not that important difference is supposed to imply to the reader that one is better than the other. Given that there seems to be a clear value judgement in the others, maybe it does here?
All three paragraphs starting with "There's an important difference [...]" are trying to illustrate the distinction between choosing a model because it reflects value-relevant parts of reality (which I think is good), and choosing a model because of some non-reality-mapping consequences of the choice of model (which I think is generally bad).
words that are unnecessarily obscure (most people in society won't understand what "wasting cycles" is about)
The primary audience of this post is longtime Less Wrong readers; as an author, I'm not concerned with trying to reach "most people in society" with this post. I expect Less Wrong readers to have trained up generalization instincts motivating the leap to thinking about AIs or minds-in-general even though this would seem weird or incomprehensible to the general public.
...To those people who proofread and apparently didn't find an issue in that sentence, is it really necessary to mix all those different issues into a 6-line sentence?
I do find myself somewhat confused about the hostility in this comment. It's hard to write good things, and there will always be misunderstandings. Many posts on LessWrong are unnecessarily confusing, including many posts by Eliezer, usually just because it takes a lot of effort, time and skill to polish a post to the point where it's completely clear to everyone on the site (and in many technical subjects achieving that bar is often impossible).
Recommendations for how to phrase things in a clear way seem good to me, and I appreciate them on my writing, but doing so in a way that implies some kind of major moral failing seems like it makes people overall less likely to post, and also overall less likely to react positively to feedback.
After rereading the post a few times, I think you are just misunderstanding it?
Like, I can't make sense of your top-level comment in my current interpretation of the post, and as such I interpreted your comment as asking for clarification in a weirdly hostile tone (which was supported by your first sentence being "What is that sentence supposed to tell me?"). I generally think it's a bad idea to start substantive criticisms of a post with a rhetorical question that's hard to distinguish from a genuine question (and probably would advise against rhetorical questions in general, but am less confident of that).
To me the section you quoted seems relatively clear, and makes a pretty straightforwardly true point, and from my current vantage point I fail to understand your criticism of it. I would be happy to try to explain my current interpretation, but would need a bit more help understanding what your current perspective is.
Followup to: Where to Draw the Boundary?
Figuring where to cut reality in order to carve along the joints—figuring which things are similar to each other, which things are clustered together: this is the problem worthy of a rationalist. It is what people should be trying to do, when they set out in search of the floating essence of a word.
Once upon a time it was thought that the word "fish" included dolphins ...
The one comes to you and says:
So, there is a legitimate complaint here. It's true that sailors in the ancient world had a legitimate reason to want a word in their language whose extension was {salmon, guppies, sharks, dolphins, ...}. (And modern scholars writing a translation for present-day English speakers might even translate that word as fish, because most members of that category are what we would call fish.) It indeed would not necessarily be helping the sailors to tell them that they need to exclude dolphins from the extension of that word, and instead include dolphins in the extension of their word for {monkeys, squirrels, horses, ...}. Likewise, most modern biologists have little use for a word that groups dolphins and guppies together.

When rationalists say that definitions can be wrong, we don't mean that there's a unique category boundary that is the True floating essence of a word, and that all other possible boundaries are wrong. We mean that in order for a proposed category boundary to not be wrong, it needs to capture some statistical structure in reality, even if reality is surprisingly detailed and there can be more than one such structure.
The reason that the sailor's concept of water-dwelling animals isn't necessarily wrong (at least within a particular domain of application) is because dolphins and fish actually do have things in common due to convergent evolution, despite their differing ancestries. If we've been told that "dolphins" are water-dwellers, we can correctly predict that they're likely to have fins and a hydrodynamic shape, even if we've never seen a dolphin ourselves. On the other hand, if we predict that dolphins probably lay eggs because 97% of known fish species are oviparous, we'd get the wrong answer.
A standard technique for understanding why some objects belong in the same "category" is to (pretend that we can) visualize objects as existing in a very-high-dimensional configuration space, but this "Thingspace" isn't particularly well-defined: we want to map every property of an object to a dimension in our abstract space, but it's not clear how one would enumerate all possible "properties." But this isn't a major concern: we can form a space with whatever properties or variables we happen to be interested in. Different choices of properties correspond to different cross sections of the grander Thingspace. Excluding properties from a collection would result in a "thinner", lower-dimensional subspace of the space defined by the original collection of properties, which would in turn be a subspace of grander Thingspace, just as a line is a subspace of a plane, and a plane is a subspace of three-dimensional space.
Concerning dolphins: there would be a cluster of water-dwelling animals in the subspace of dimensions that water-dwelling animals are similar on, and a cluster of mammals in the subspace of dimensions that mammals are similar on, and dolphins would belong to both of them, just as the vector [1.1, 2.1, 9.1, 10.2] in the four-dimensional vector space ℝ⁴ is simultaneously close to [1, 2, 2, 1] in the subspace spanned by x₁ and x₂, and close to [8, 9, 9, 10] in the subspace spanned by x₃ and x₄.
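To make the subspace intuition concrete, here is a small sketch (Python with NumPy; the particular points are just the ones from the example above): it computes Euclidean distances restricted to different coordinate subspaces.

```python
import numpy as np

dolphin = np.array([1.1, 2.1, 9.1, 10.2])
a = np.array([1, 2, 2, 1])      # a point in the first cluster
b = np.array([8, 9, 9, 10])     # a point in the second cluster

def dist(x, y, dims):
    """Euclidean distance restricted to the subspace spanned by the given coordinates."""
    return np.linalg.norm(x[dims] - y[dims])

print(dist(dolphin, a, [0, 1]), dist(dolphin, b, [0, 1]))  # ~0.14 vs ~9.76: close to a in the x1, x2 subspace
print(dist(dolphin, a, [2, 3]), dist(dolphin, b, [2, 3]))  # ~11.6 vs ~1.12: close to b in the x3, x4 subspace
```

The same vector is "near" one cluster or the other depending on which dimensions you project onto.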
Humans are already functioning intelligences (well, sort of), so the categories that humans propose of their own accord won't be maximally wrong: no one would try to propose a word for "configurations of matter that match any of these 29,122 five-megabyte descriptions but have no other particular properties in common." (Indeed, because we are not-superexponentially-vast minds that evolved to function in a simple, ordered universe, it actually takes some ingenuity to construct a category that wrong.)
This leaves aspiring instructors of rationality in something of a predicament: in order to teach people how categories can be more or (ahem) less wrong, you need some sort of illustrative example, but since the most natural illustrative examples won't be maximally wrong, some people might fail to appreciate the lesson, leaving one of your students to fill in the gap in your lecture series eleven years later.
The pedagogical function of telling people to "stop playing nitwit games and admit that dolphins don't belong on the fish list" is to point out that, without denying the obvious similarities that motivated the initial categorization {salmon, guppies, sharks, dolphins, trout, ...}, there is more structure in the world: to maximize the (logarithm of the) probability your world-model assigns to your observations of dolphins, you need to take into consideration the many aspects of reality in which the grouping {monkeys, squirrels, dolphins, horses, ...} makes more sense. To the extent that relying on the initial category guess would result in a worse Bayes-score, we might say that that category is "wrong." It might have been "good enough" for the purposes of the sailors of yore, but as humanity has learned more, as our model of Thingspace has expanded with more dimensions and more details, we can see the ways in which the original map failed to carve reality at the joints.

The one replies:
No. Everything we identify as a joint is a joint not "because we care about it", but because it helps us think about the things we care about.
Which dimensions of Thingspace you bother paying attention to might depend on your values, and the clusters returned by your brain's similarity-detection algorithms might "split" or "collapse" according to which subspace you're looking at. But in order for your map to be useful in the service of your values, it needs to reflect the statistical structure of things in the territory—which depends on the territory, not your values.
There is an important difference between "not including mountains on a map because it's a political map that doesn't show any mountains" and "not including Mt. Everest on a geographic map, because my sister died trying to climb Everest and seeing it on the map would make me feel sad."
There is an important difference between "identifying this pill as not being 'poison' allows me to focus my uncertainty about what I'll observe after administering the pill to a human (even if most possible minds have never seen a 'human' and would never waste cycles imagining administering the pill to one)" and "identifying this pill as not being 'poison', because if I publicly called it 'poison', then the manufacturer of the pill might sue me."
There is an important difference between having a utility function defined over a statistical model's performance against specific real-world data (even if another mind with different values would be interested in different data), and having a utility function defined over features of the model itself.
Remember how appealing to the dictionary is irrational when the actual motivation for an argument is about whether to infer a property on the basis of category-membership? But at least the dictionary has the virtue of documenting typical usage of our shared communication signals: you can at least see how "You're defecting from common usage" might feel like a sensible thing to say, even if one's true rejection lies elsewhere. In contrast, this motion of appealing to personal values (!?!) is so deranged that Yudkowsky apparently didn't even realize in 2008 that he might need to warn us against it!
You can't change the categories your mind actually uses and still perform as well on prediction tasks—although you can change your verbally reported categories, much as how one can verbally report "believing" in an invisible, inaudible, flour-permeable dragon in one's garage without having any false anticipations-of-experience about the garage.
This may be easier to see with a simple numerical example.
Suppose we have some entities that exist in the three-dimensional vector space ℝ³. There's one cluster of entities centered at [1, 2, 3], and we call those entities Foos, and there's another cluster of entities centered at [2, 4, 6], which we call Quuxes.
The one comes and says, "Well, I'm going to redefine the meaning of 'Foo' such that it also includes the things near [2, 4, 6] as well as the Foos-with-respect-to-the-old-definition, and you can't say my new definition is wrong, because if I observe [2, _, _] (where the underscores represent yet-unobserved variables), I'm going to categorize that entity as a Foo but still predict that the unobserved variables are 4 and 6, so there."
But if the one were actually using the new concept of Foo internally and not just saying the words "categorize it as a Foo", they wouldn't predict 4 and 6! They'd predict 3 and 4.5, because those are the average values of a generic Foo-with-respect-to-the-new-definition in the 2nd and 3rd coordinates (because (2+4)/2 = 6/2 = 3 and (3+6)/2 = 9/2 = 4.5). (The already-observed 2 in the first coordinate isn't average, but by conditional independence, that only affects our prediction of the other two variables by means of its effect on our "prediction" of category-membership.) The cluster-structure knowledge that "entities for which x₁≈2, also tend to have x₂≈4 and x₃≈6" needs to be represented somewhere in the one's mind in order to get the right answer. And given that that knowledge needs to be represented, it might also be useful to have a word for "the things near [2, 4, 6]" in order to efficiently share that knowledge with others.
Of course, there isn't going to be a unique way to encode the knowledge into natural language: there's no reason the word/symbol "Foo" needs to represent "the stuff near [1, 2, 3]" rather than "both the stuff near [1, 2, 3] and also the stuff near [2, 4, 6]". And you might very well indeed want a short word like "Foo" that encompasses both clusters, for example, if you want to contrast them to another cluster much farther away, or if you're mostly interested in x₁ and the difference between x₁≈1 and x₁≈2 doesn't seem large enough to notice.
But if speakers of a particular language were already using "Foo" to specifically talk about the stuff near [1, 2, 3], then you can't swap in a new definition of "Foo" without changing the truth values of sentences involving the word "Foo." Or rather: sentences involving Foo-with-respect-to-the-old-definition are different propositions from sentences involving Foo-with-respect-to-the-new-definition, even if they get written down using the same symbols in the same order.
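Here is a minimal simulation of the arithmetic above (Python with NumPy; the cluster spread of 0.1 and the sample size are arbitrary choices of mine, not from the post): predicting the unobserved coordinates from the cluster structure given x₁ ≈ 2 yields roughly [4, 6], while predicting them from membership in the merged category alone yields roughly [3, 4.5].

```python
import numpy as np

rng = np.random.default_rng(0)

# Two tight clusters in R^3: "Foos" near [1, 2, 3] and "Quuxes" near [2, 4, 6].
foos = rng.normal([1, 2, 3], 0.1, size=(10_000, 3))
quuxes = rng.normal([2, 4, 6], 0.1, size=(10_000, 3))
merged = np.vstack([foos, quuxes])  # the proposed broader "Foo" category

# Someone who tracks the cluster structure and observes x1 near 2 predicts the
# conditional means of the other coordinates, which are about 4 and 6:
near_two = merged[np.abs(merged[:, 0] - 2) < 0.3]
print(near_two[:, 1:].mean(axis=0))  # approximately [4. , 6. ]

# Someone who really predicts from membership in the merged category alone
# can only offer the category-wide averages, about 3 and 4.5:
print(merged[:, 1:].mean(axis=0))    # approximately [3. , 4.5]
```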
Naturally, all this becomes much more complicated as we move away from the simplest idealized examples.
For example, if the points are more evenly distributed in configuration space rather than belonging to cleanly-distinguishable clusters, then essentialist "X is a Y" cognitive algorithms perform less well, and we get Sorites paradox-like situations, where we know roughly what we mean by a word, but are confronted with real-world (not merely hypothetical) edge cases that we're not sure how to classify.
Or it might not be obvious which dimensions of Thingspace are most relevant.
Or there might be social or psychological forces anchoring word usages on identifiable Schelling points that are easy for different people to agree upon, even at the cost of some statistical "fit."
We could go on listing more such complications, where we seem to be faced with somewhat arbitrary choices about how to describe the world in language. But the fundamental thing is this: the map is not the territory. Arbitrariness in the map (what color should Texas be?) doesn't correspond to arbitrariness in the territory. Where the structure of human natural language doesn't fit the structure in reality—where we're not sure whether to say that a sufficiently small collection of sand "is a heap", because we don't know how to specify the positions of the individual grains of sand, or compute that the collection has a Standard Heap-ness Coefficient of 0.64—that's just a bug in our human power of vibratory telepathy. You can exploit the bug to confuse humans, but that doesn't change reality.
Sometimes we might wish that something belonged to a category that it doesn't (with respect to the category boundaries that we would ordinarily use), so it's tempting to avert our attention from this painful reality with appeal-to-arbitrariness language-lawyering, selectively applying our philosophy-of-language skills to pretend that we can define a word any way we want with no consequences. ("I'm not late!—well, okay, we agree that I arrived half an hour after the scheduled start time, but whether I was late depends on how you choose to draw the category boundaries of 'late', which is subjective.")
For this reason it is said that knowing about philosophy of language can hurt people. Those who know that words don't have intrinsic definitions, but don't know (or have seemingly forgotten) about the three or six dozen optimality criteria governing the use of words, can easily fashion themselves a Fully General Counterargument against any claim of the form "X is a Y"—
Isolated demands for rigor are great for winning arguments against humans who aren't as philosophically sophisticated as you, but the evolved systems of perception and language by which humans process and communicate information about reality predate the Sequences. Every claim that X is a Y is an expression of cognitive work that cannot simply be dismissed just because most claimants don't know how they work. Platonic essences are just the limiting case as the overlap between clusters in Thingspace goes to zero.
You should never say, "The choice of word is arbitrary; therefore I can say whatever I want"—which amounts to, "The choice of category is arbitrary, therefore I can believe whatever I want." If the choice were really arbitrary, you would be satisfied with the choice being made arbitrarily: by flipping a coin, or calling a random number generator. (It doesn't matter which.) Whatever criterion your brain is using to decide which word or belief you want, is your non-arbitrary reason.
If what you want isn't currently true in reality, maybe there's some action you could take to make it become true. To search for that action, you're going to need accurate beliefs about what reality is currently like. To enlist the help of others in your planning, you're going to need precise terminology to communicate accurate beliefs about what reality is currently like. Even when—especially when—the current reality is inconvenient.
Even when it hurts.
(Oh, and if you're actually trying to optimize other people's models of the world, rather than the world itself—you could just lie, rather than playing clever category-gerrymandering mind games. It would be a lot simpler!)
Imagine that you've had a peculiar job in a peculiar factory for a long time. After many mind-numbing years of sorting bleggs and rubes all day and enduring being trolled by Susan the Senior Sorter and her evil sense of humor, you finally work up the courage to ask Bob the Big Boss for a promotion.
"Sure," Bob says. "Starting tomorrow, you're our new Vice President of Sorting!"
"Wow, this is amazing," you say. "I don't know what to ask first! What will my new responsibilities be?"
"Oh, your responsibilities will be the same: sort bleggs and rubes every Monday through Friday from 9 a.m. to 5 p.m."
You frown. "Okay. But Vice Presidents get paid a lot, right? What will my salary be?"
"Still $9.50 hourly wages, just like now."
You grimace. "O–kay. But Vice Presidents get more authority, right? Will I be someone's boss?"
"No, you'll still report to Susan, just like now."
You snort. "A Vice President, reporting to a mere Senior Sorter?"
"Oh, no," says Bob. "Susan is also getting promoted—to Senior Vice President of Sorting!"
You lose it. "Bob, this is bullshit. When you said I was getting promoted to Vice President, that created a bunch of probabilistic expectations in my mind: you made me anticipate getting new challenges, more money, and more authority, and then you reveal that you're just slapping an inflated title on the same old dead-end job. It's like handing me a blegg, and then saying that it's a rube that just happens to be blue, furry, and egg-shaped ... or telling me you have a dragon in your garage, except that it's an invisible, silent dragon that doesn't breathe. You may think you're being kind to me asking me to believe in an unfalsifiable promotion, but when you replace the symbol with the substance, it's actually just cruel. Stop fucking with my head! ... sir."
Bob looks offended. "This promotion isn't unfalsifiable," he says. "It says, 'Vice President of Sorting' right here on the employee roster. That's a sensory experience that you can make falsifiable predictions about. I'll even get you business cards that say, 'Vice President of Sorting.' That's another falsifiable prediction. Using language in a way you dislike is not lying. The propositions you claim false—about new job tasks, increased pay and authority—are not what the title is meant to convey, and this is known to everyone involved; it is not a secret."
Bob kind of has a point. It's tempting to argue that things like titles and names are part of the map, not the territory. Unless the name is written down. Or spoken aloud (instantiated in sound waves). Or thought about (instantiated in neurons). The map is part of the territory: insisting that the title isn't part of the "job" and therefore violates the maxim that meaningful beliefs must have testable consequences, doesn't quite work. Observing the title on the employee roster indeed tightly constrains your anticipated experience of the title on the business card. So, that's a non-gerrymandered, predictively useful category ... right? What is there for a rationalist to complain about?
To see the problem, we must turn to information theory.
Let's imagine that an abstract Job has four binary properties that can either be high or low—task complexity, pay, authority, and prestige of title—forming a four-dimensional Jobspace. Suppose that two-thirds of Jobs have {complexity: low, pay: low, authority: low, title: low} (which we'll write more briefly as [low, low, low, low]) and the remaining one-third have {complexity: high, pay: high, authority: high, title: high} (which we'll write as [high, high, high, high]).

Task complexity and authority are hard to perceive outside of the company, and pay is only negotiated after an offer is made, so people deciding to seek a Job can only make decisions based on the Job's title: but that's fine, because in the scenario described, you can infer any of the other properties from the title with certainty. Because the properties are either all low or all high, the joint entropy of title and any other property is going to have the same value as either of the individual property entropies, namely ⅔ log₂ 3/2 + ⅓ log₂ 3 ≈ 0.918 bits.
But since H(pay) = H(title) = H(pay, title), then the mutual information I(pay; title) has the same value, because I(pay; title) = H(pay) + H(title) − H(pay, title) by definition.
Then suppose a lot of companies get Bob's bright idea: half of the Jobs that used to occupy the point [low, low, low, low] in Jobspace, get their title coordinate changed to high. So now one-third of the Jobs are at [low, low, low, low], another third are at [low, low, low, high], and the remaining third are at [high, high, high, high]. What happens to the mutual information I(pay; title)?
I(pay; title) = H(pay) + H(title) − H(pay, title)
= (⅔ log 3/2 + ⅓ log 3) + (⅔ log 3/2 + ⅓ log 3) − 3(⅓ log 3)
= 4/3 log 3/2 + 2/3 log 3 − log 3 ≈ 0.2516 bits.
It went down! Bob and his analogues, having observed that employees and Job-seekers prefer Jobs with high-prestige titles, thought they were being benevolent by making more Jobs have the desired titles. And perhaps they have helped savvy employees who can arbitrage the gap between the new and old worlds by being able to put "Vice President" on their resumés when searching for a new Job.
But from the perspective of people who wanted to use titles as an easily-communicable correlate of the other features of a Job, all that's actually been accomplished is making language less useful.
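For readers who want to verify the numbers, here is a sketch of the calculation (Python with NumPy; the helper functions are mine): it computes I(pay; title) = H(pay) + H(title) − H(pay, title) from the joint distribution of pay and title before and after the title inflation.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability vector (zero entries ignored)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def mutual_information(joint):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for a joint distribution given as a 2-D array."""
    joint = np.asarray(joint, dtype=float)
    return entropy(joint.sum(axis=1)) + entropy(joint.sum(axis=0)) - entropy(joint.ravel())

# Rows: pay (low, high); columns: title (low, high).
before = [[2/3, 0], [0, 1/3]]    # titles track pay exactly
after = [[1/3, 1/3], [0, 1/3]]   # half the low Jobs get a high-prestige title
print(mutual_information(before))  # ~0.918 bits
print(mutual_information(after))   # ~0.2516 bits
```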
In view of the preceding discussion, to "37 Ways That Words Can Be Wrong", we might wish to append, "38. Your definition draws a boundary around a cluster in an inappropriately 'thin' subspace of Thingspace that excludes relevant variables, resulting in fallacies of compression."
Miyamoto Musashi is quoted:
Similarly, the primary thing when you take a word in your lips is your intention to reflect the territory, whatever the means. Whenever you categorize, label, name, define, or draw boundaries, you must cut through to the correct answer in the same movement. If you think only of categorizing, labeling, naming, defining, or drawing boundaries, you will not be able actually to reflect the territory.
Do not ask whether there's a rule of rationality saying that you shouldn't call dolphins fish. Ask whether dolphins are fish.
And if you speak overmuch of the Way you will not attain it.
(Thanks to Alicorn, Sarah Constantin, Ben Hoffman, Zvi Mowshowitz, Jessica Taylor, and Michael Vassar for feedback.)