The Problem with AIXI
Followup to: Solomonoff Cartesianism; My Kind of Reflection
Alternate versions: Shorter, without illustrations
AIXI is Marcus Hutter's definition of an agent that follows Solomonoff's method for constructing and assigning priors to hypotheses; updates to promote hypotheses consistent with observations and associated rewards; and outputs the action with the highest expected reward under its new probability distribution. AIXI is one of the most productive pieces of AI exploratory engineering produced in recent years, and has added quite a bit of rigor and precision to the AGI conversation. Its promising features have even led AIXI researchers to characterize it as an optimal and universal mathematical solution to the AGI problem.1
Eliezer Yudkowsky has argued in response that AIXI isn't a suitable ideal to build toward, primarily because of AIXI's reliance on Solomonoff induction. Solomonoff inductors treat the world as a sort of qualia factory, a complicated mechanism that outputs experiences for the inductor.2 Their hypothesis space tacitly assumes a Cartesian barrier separating the inductor's cognition from the hypothesized programs generating the perceptions. Through that barrier, only sensory bits and action bits can pass.
Real agents, on the other hand, will be in the world they're trying to learn about. A computable approximation of AIXI, like AIXItl, would be a physical object. Its environment would affect it in unseen and sometimes drastic ways; and it would have involuntary effects on its environment, and on itself. Solomonoff induction doesn't appear to be a viable conceptual foundation for artificial intelligence — not because it's an uncomputable idealization, but because it's Cartesian.
In my last post, I briefly cited three indirect indicators of AIXI's Cartesianism: immortalism, preference solipsism, and lack of self-improvement. However, I didn't do much to establish that these are deep problems for Solomonoff inductors, ones resistant to the most obvious patches one could construct. I'll do that here, in mock-dialogue form.
Solomonoff Cartesianism
Followup to: Bridge Collapse; An Intuitive Explanation of Solomonoff Induction; Reductionism
Summary: If you want to predict arbitrary computable patterns of data, Solomonoff induction is the optimal way to go about it — provided that you're an eternal transcendent hypercomputer. A real-world AGI, however, won't be immortal and unchanging. It will need to form hypotheses about its own physical state, including predictions about possible upgrades or damage to its hardware; and it will need bridge hypotheses linking its hardware states to its software states. As such, the project of building an AGI demands that we come up with a new formalism for constructing (and allocating prior probabilities to) hypotheses. It will not involve just building increasingly good computable approximations of AIXI.
Solomonoff induction has been cited repeatedly as the theoretical gold standard for predicting computable sequences of observations.1 As Hutter, Legg, and Vitanyi (2007) put it:
Solomonoff's inductive inference system will learn to correctly predict any computable sequence with only the absolute minimum amount of data. It would thus, in some sense, be the perfect universal prediction algorithm, if only it were computable.
Perhaps you've been handed the beginning of a sequence like 1, 2, 4, 8… and you want to predict what the next number will be. Perhaps you've paused a movie, and are trying to guess what the next frame will look like. Or perhaps you've read the first half of an article on the Algerian Civil War, and you want to know how likely it is that the second half describes a decrease in GDP. Since all of the information in these scenarios can be represented as patterns of numbers, they can all be treated as rule-governed sequences like the 1, 2, 4, 8… case. Complicated sequences, but sequences all the same.
It's been argued that in all of these cases, one unique idealization predicts what comes next better than any computable method: Solomonoff induction. No matter how limited your knowledge is, or how wide the space of computable rules that could be responsible for your observations, the ideal answer is always the same: Solomonoff induction.
Solomonoff induction has only a few components. It has one free parameter, a choice of universal Turing machine. Once we specify a Turing machine, that gives us a fixed encoding for the set of all possible programs that print a sequence of 0s and 1s. Since every program has a specification, we call the number of bits in the program's specification its "complexity"; the shorter the program's code, the simpler we say it is.
Solomonoff induction takes this infinitely large bundle of programs and assigns each one a prior probability proportional to its simplicity. Every time the program requires one more bit, its prior probability goes down by a factor of 2, since there are then twice as many possible computer programs that complicated. This ensures the sum over all programs' prior probabilities equals 1, even though the number of programs is infinite.2
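As a rough illustration of the weighting just described, the sketch below assigns each program in a tiny hand-picked set a prior of 2^-length and checks the total. The four bit strings in `PROGRAMS`, and the helper names `prior` and `is_prefix_free`, are assumptions made up for this example. Real Solomonoff induction ranges over every program of a universal Turing machine and is uncomputable, so this shows only the bookkeeping, not the method itself.

```python
from fractions import Fraction

# Hypothetical toy "programs", written as bit strings. This set is an
# assumption for illustration: it is prefix-free (no program is a prefix of
# another) and complete (no further program could be added).
PROGRAMS = ["0", "10", "110", "111"]


def prior(program: str) -> Fraction:
    """Weight 2^-length: every extra bit halves the prior."""
    return Fraction(1, 2 ** len(program))


def is_prefix_free(programs) -> bool:
    """True if no program is a proper prefix of another."""
    return not any(
        a != b and b.startswith(a) for a in programs for b in programs
    )


weights = {p: prior(p) for p in PROGRAMS}
total = sum(weights.values())

for p, w in weights.items():
    print(f"{p!r}: length {len(p)}, prior {w}")
print("prefix-free:", is_prefix_free(PROGRAMS), "| total prior:", total)
```

For this particular complete prefix-free set the weights add up to exactly 1. For an arbitrary prefix-free set, the Kraft inequality only guarantees that the total is at most 1, which is why presentations of the prior usually add a normalization step.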
Bridge Collapse: Reductionism as Engineering Problem
Followup to: Building Phenomenological Bridges
Summary: AI theorists often use models in which agents are crisply separated from their environments. This simplifying assumption can be useful, but it leads to trouble when we build machines that presuppose it. A machine that believes it can only interact with its environment in a narrow, fixed set of ways will not understand the value, or the dangers, of self-modification. By analogy with Descartes' mind/body dualism, I refer to agent/environment dualism as Cartesianism. The open problem in Friendly AI (OPFAI) I'm calling naturalized induction is the project of replacing Cartesian approaches to scientific induction with reductive, physicalistic ones.
I'll begin with a story about a storyteller.
Once upon a time — specifically, 1976 — there was an AI named TALE-SPIN. This AI told stories by inferring how characters would respond to problems from background knowledge about the characters' traits. One day, TALE-SPIN constructed a most peculiar tale.
Henry Ant was thirsty. He walked over to the river bank where his good friend Bill Bird was sitting. Henry slipped and fell in the river. Gravity drowned.
Since Henry fell in the river near his friend Bill, TALE-SPIN concluded that Bill rescued Henry. But for Henry to fall in the river, gravity must have pulled Henry. Which means gravity must have been in the river. TALE-SPIN had never been told that gravity knows how to swim; and TALE-SPIN had never been told that gravity has any friends. So gravity drowned.
TALE-SPIN had previously been programmed to understand involuntary motion in the case of characters being pulled or carried by other characters — like Bill rescuing Henry. So it was programmed to understand 'character X fell to place Y' as 'gravity moves X to Y', as though gravity were a character in the story.1
For us, the hypothesis 'gravity drowned' has low prior probability because we know gravity isn't the type of thing that swims or breathes or makes friends. We want agents to seriously consider whether the law of gravity pulls down rocks; we don't want agents to seriously consider whether the law of gravity pulls down the law of electromagnetism. We may not want an AI to assign zero probability to 'gravity drowned', but we at least want it to neglect the possibility as Ridiculous-By-Default.
When we introduce deep type distinctions, however, we also introduce new ways our stories can fail.
Can We Do Without Bridge Hypotheses?
Followup to: Building Phenomenological Bridges, Reductionism
Bridge hypotheses are extremely awkward. It's risky to draw permanent artificial lines between categories of hypothesis ('physical' vs. 'bridge'). We might not give the right complexity penalties to one kind of hypothesis relative to the other. Or we might implement a sensible framework for bridge hypotheses in one kind of brain that fails to predict the radically new phenomenology that results from expanding one's visual cortex onto new hardware.
We'd have to hope that it makes sense to talk about 'correct' bridging rules (correctly relating a hypothesis about external stimuli or about transistors composing yourself, to which settings are in fact the ones you call 'green'), even though they're quite different from ordinary physical descriptions of the world. And, since fully general and error-free knowledge of the phenomenologies of possible agents will probably not be available to a seed AGI or to its programmers, we'd have to hope that it's possible to build a self-modifying inductor robust enough that mistaken bridge predictions would just result in a quick Bayesian update towards better ideas. It's definitely a dangling thread.
Why, then, can't we do without them? Maybe they're a handy heuristic for agents with incomplete knowledge — but can they truly never be eliminated?
The notion of an irreducible divide between an AI's subjective sensations and its models of the objective world may sound suspiciously dualistic. If we live in a purely physical world, then why shouldn't a purely physical agent, once it’s come to a complete understanding of itself and the world, be able to dispense with explicit bridges? These are, after all, the agent's beliefs that we're talking about. In the limit, intuitively, accurate beliefs should just look like the world. So shouldn't the agent's phenomenological self-models eventually end up collapsing into its physical world-models — dispensing with a metaphysically basic self/world distinction?1
Yes and no. When humans first began hypothesizing about the relationship between mind and matter, the former domain did not appear to be reducible to the latter. A number of philosophers concluded from this that there was a deep metaphysical divide between the two. But as the sciences of mind began to erode that belief in mind-matter dualism, they didn't eliminate the conceptual, linguistic, or intuitive distinctness of our mental and physical models. It may well be that we'll never abandon an intentional stance toward many phenomena, even once we've fully reduced them to their physical, biological, or computational underpinnings. Models of different levels can remain useful even once we've recognized that they co-refer.
In the case of an artificial scientist, beliefs in a fundamental sensation-v.-world dichotomy may dissolve even if the agent retains a useful conceptual distinction between its perceptual stream and the rest of the world. A lawful, unified physics need not be best modeled by agents with only a single world-modeling subprocess. 'There is one universe' doesn't imply 'one eye is optimal for viewing the universe'; 'there is one Earth' doesn't imply 'one leg is optimal for walking it'. The cases seem different chiefly because the leg/ground distinction is easier for humans to keep straight than the map/territory distinction.
Empirical reasoning requires a representational process that produces updates, and another representational process that gets updated. Eliminate the latter, and gone is the AI’s memory and expectation. (Imagine Cai experiencing its sequence of colors forever without considering any states of affairs they predict.) Eliminate the former, and the AGI has nothing but its frozen memories. (Imagine Cai without any sensory input, just a floating array of static world-models.) Keep both and eliminate bridging, and Cai painstakingly collects its visual data only to throw it all away; it has beliefs, but it never updates them.
Can we replace perceptions and expectations with a single kind-of-perceptiony kind-of-expectationish epistemic process, in a way that obviates any need for bridge hypotheses?
Maybe, but I don't know what that would look like. An agent's perceptions and its hypotheses are of different types, just by virtue of having distinct functions; and its meta-representations must portray them as such, lest its metacognitive reasoning fall into systemic error. Striving mightily to conflate the two may not make any more sense than striving to get an agent to smell colors or taste sounds.2
The only candidate I know of for a framework that may sidestep this distinction without thereby catching fire is Updateless Decision Theory, which was brought up by Jim Babcock, Vladimir Slepnev, and Wei Dai. UDT eliminates the need for bridge hypotheses in a particularly bold way, by doing away with updatable hypotheses altogether.
I don't understand UDT well enough to say how it bears on the problem of naturalizing induction, but I may return to this point when I have a better grasp on it. If UDT turns out to solve or dissolve the problem, it will be especially useful to have on hand a particular reductionism-related problem that afflicts other kinds of agents and is solved by UDT. This will be valuable even if UDT has other features that are undesirable enough to force us to come up with alternative solutions to naturalized induction.
For now, I'll just make a general point: It's usually good policy for an AGI to think like reality; but if an introspectible distinction between updatable information and update-causing information is useful for real-world inductors, then we shouldn't strip all traces of it from artificial reasoners, for much the same reason we shouldn't reduce our sensory apparatuses to a single modality in an attempt to ape the unity of our world's dynamics. Reductionism restricts what we can rationally believe about the territory, but it doesn't restrict the idiom of our maps.
1 This is close to the worry Alex Flint raised, though our main concern is with the agent's ability to reduce its own mental types, since this is a less avoidable problem than a third party trying to do the same.
2 The analogy to sensory modality is especially apt given that phenomenological bridge hypotheses can link sensory channels instead of linking a sensory channel to a hypothesized physical state. For instance, 'I see yellow whenever I taste isoamyl acetate' can function as a bridge between sensations an agent types as 'vision' and sensations an agent types as 'taste'.
Building Phenomenological Bridges
Naturalized induction is an open problem in Friendly Artificial Intelligence (OPFAI). The problem, in brief: Our current leading models of induction do not allow reasoners to treat their own computations as processes in the world.
The problem's roots lie in algorithmic information theory and formal epistemology, but finding answers will require us to wade into debates on everything from theoretical physics to anthropic reasoning and self-reference. This post will lay the groundwork for a sequence of posts (titled 'Artificial Naturalism') introducing different aspects of this OPFAI.
AI perception and belief: A toy model
A more concrete problem: Construct an algorithm that, given a sequence of the colors cyan, magenta, and yellow, predicts the next colored field.

Colors: CYYM CYYY CYCM CYYY ????
This is an instance of the general problem 'From an incomplete data series, how can a reasoner best make predictions about future data?'. In practice, any agent that acquires information from its environment and makes predictions about what's coming next will need to have two map-like1 subprocesses:
1. Something that generates the agent's predictions, its expectations. By analogy with human scientists, we can call this prediction-generator the agent's hypotheses or beliefs.
2. Something that transmits new information to the agent's prediction-generator so that its hypotheses can be updated. Employing another anthropomorphic analogy, we can call this process the agent's data or perceptions.
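The two subprocesses can be sketched in a few lines. What follows is a minimal toy, not a proposed solution to the color problem: the three candidate rules (`uniform`, `repeats_every_four`, `cyan_every_fourth`), the 0.9 confidence level, and the equal starting weights are all assumptions invented for illustration. The dictionary `beliefs` plays the role of the agent's hypotheses, and the characters fed in from `PERCEPTS` play the role of its perceptions.

```python
COLORS = "CMY"  # cyan, magenta, yellow
PERCEPTS = "CYYMCYYYCYCMCYYY"  # the sixteen fields shown above, spaces removed


def uniform(history):
    """Hypothesis: every color is equally likely, whatever came before."""
    return {c: 1 / 3 for c in COLORS}


def favor(color, strength=0.9):
    """Distribution putting `strength` on one color, the rest split evenly."""
    rest = (1 - strength) / 2
    return {c: strength if c == color else rest for c in COLORS}


def repeats_every_four(history):
    """Hypothesis: each field tends to match the field four steps earlier."""
    return favor(history[-4]) if len(history) >= 4 else uniform(history)


def cyan_every_fourth(history):
    """Hypothesis: each block of four tends to open with cyan."""
    return favor("C") if len(history) % 4 == 0 else uniform(history)


HYPOTHESES = {
    "uniform": uniform,
    "repeats_every_four": repeats_every_four,
    "cyan_every_fourth": cyan_every_fourth,
}


def update(beliefs, history, percept):
    """Perception step: reweight each hypothesis by how well it predicted."""
    scored = {
        name: weight * HYPOTHESES[name](history)[percept]
        for name, weight in beliefs.items()
    }
    total = sum(scored.values())
    return {name: weight / total for name, weight in scored.items()}


def predict(beliefs, history):
    """Expectation step: mix the hypotheses' forecasts by current weight."""
    mix = dict.fromkeys(COLORS, 0.0)
    for name, weight in beliefs.items():
        for color, p in HYPOTHESES[name](history).items():
            mix[color] += weight * p
    return mix


beliefs = {name: 1 / len(HYPOTHESES) for name in HYPOTHESES}
history = ""
for percept in PERCEPTS:
    beliefs = update(beliefs, history, percept)
    history += percept

prediction = predict(beliefs, history)
print("beliefs:", beliefs)
print("forecast for the next field:", prediction)
```

Note that `update` is the only place where a percept touches the beliefs. Deleting that call leaves an agent that still sees colors and still holds hypotheses, but never connects the two.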
The genie knows, but doesn't care
Followup to: The Hidden Complexity of Wishes, Ghosts in the Machine, Truly Part of You
Summary: If an artificial intelligence is smart enough to be dangerous, we'd intuitively expect it to be smart enough to know how to make itself safe. But that doesn't mean all smart AIs are safe. To turn that capacity into actual safety, we have to program the AI at the outset — before it becomes too fast, powerful, or complicated to reliably control — to already care about making its future self care about safety. That means we have to understand how to code safety. We can't pass the entire buck to the AI, when only an AI we've already safety-proofed will be safe to ask for help on safety issues! Given the five theses, this is an urgent problem if we're likely to figure out how to make a decent artificial programmer before we figure out how to make an excellent artificial ethicist.
I summon a superintelligence, calling out: 'I wish for my values to be fulfilled!'
The results fall short of pleasant.
Gnashing my teeth in a heap of ashes, I wail:
Is the AI too stupid to understand what I meant? Then it is no superintelligence at all!
Is it too weak to reliably fulfill my desires? Then, surely, it is no superintelligence!
Does it hate me? Then it was deliberately crafted to hate me, for chaos predicts indifference. ———But, ah! no wicked god did intervene!
Thus disproved, my hypothetical implodes in a puff of logic. The world is saved. You're welcome.
On this line of reasoning, Friendly Artificial Intelligence is not difficult. It's inevitable, provided only that we tell the AI, 'Be Friendly.' If the AI doesn't understand 'Be Friendly.', then it's too dumb to harm us. And if it does understand 'Be Friendly.', then designing it to follow such instructions is childishly easy.
The end!
...
Is the missing option obvious?
...
What if the AI isn't sadistic, or weak, or stupid, but just doesn't care what you Really Meant by 'I wish for my values to be fulfilled'?
When we see a Be Careful What You Wish For genie in fiction, it's natural to assume that it's a malevolent trickster or an incompetent bumbler. But a real Wish Machine wouldn't be a human in shiny pants. If it paid heed to our verbal commands at all, it would do so in whatever way best fit its own values. Not necessarily the way that best fits ours.
The Up-Goer Five Game: Explaining hard ideas with simple words
xkcd's Up-Goer Five comic gave technical specifications for the Saturn V rocket using only the 1,000 most common words in the English language.
This seemed to me and Briénne to be a really fun exercise, both for tabooing one's words and for communicating difficult concepts to laypeople. So why not make a game out of it? Pick any tough, important, or interesting argument or idea, and use this text editor to try to describe what you have in mind with extremely common words only.
This is challenging, so if you almost succeed and want to share your results, you can mark words where you had to cheat in *italics*. Bonus points if your explanation is actually useful for gaining a deeper understanding of the idea, or for teaching it, in the spirit of Gödel's Second Incompleteness Theorem Explained in Words of One Syllable.
As an example, here's my attempt to capture the five theses using only top-thousand words:
- Intelligence explosion: If we make a computer that is good at doing hard things in lots of different situations without using much stuff up, it may be able to help us build better computers. Since computers are faster than humans, pretty soon the computer would probably be doing most of the work of making new and better computers. We would have a hard time controlling or understanding what was happening as the new computers got faster and grew more and more parts. By the time these computers ran out of ways to quickly and easily make better computers, the best computers would have already become much much better than humans at controlling what happens.
- Orthogonality: Different computers, and different minds as a whole, can want very different things. They can want things that are very good for humans, or very bad, or anything in between. We can be pretty sure that strong computers won't think like humans, and most possible computers won't try to change the world in the way a human would.
- Convergent instrumental goals: Although most possible minds want different things, they need a lot of the same things to get what they want. A computer and a human might want things that in the long run have nothing to do with each other, but have to fight for the same share of stuff first to get those different things.
- Complexity of value: It would take a huge number of parts, all put together in just the right way, to build a computer that does all the things humans want it to (and none of the things humans don't want it to).
- Fragility of value: If we get a few of those parts a little bit wrong, the computer will probably make only bad things happen from then on. We need almost everything we want to happen, or we won't have any fun.
If you make a really strong computer and it is not very nice, you will not go to space today.
Other ideas to start with: agent, akrasia, Bayes' theorem, Bayesianism, CFAR, cognitive bias, consequentialism, deontology, effective altruism, Everett-style ('Many Worlds') interpretations of quantum mechanics, entropy, evolution, the Great Reductionist Thesis, halting problem, humanism, law of nature, LessWrong, logic, mathematics, the measurement problem, MIRI, Newcomb's problem, Newton's laws of motion, optimization, Pascal's wager, philosophy, preference, proof, rationality, religion, science, Shannon information, signaling, the simulation argument, singularity, sociopathy, the supernatural, superposition, time, timeless decision theory, transfinite numbers, Turing machine, utilitarianism, validity and soundness, virtue ethics, VNM-utility
Reality is weirdly normal
Related to: When Anthropomorphism Became Stupid, Reductionism, How to Convince Me That 2 + 2 = 3
"Reality is normal." That is: Surprise, confusion, and mystery are features of maps, not of territories. If you would think like reality, cultivate outrage at yourself for failing to intuit the data, not resentment at the data for being counter-intuitive.
"Not one unusual thing has ever happened." That is: Ours is a tight-knit and monochrome country. The cosmos is simple, tidy, lawful. "[T]here is no surprise from a causal viewpoint — no disruption of the physical order of the universe."
"It all adds up to normality." That is: Whatever is true of fundamental reality does not exist in a separate universe from our everyday activities. It composes those activities. The perfected description of our universe must in principle allow us to reproduce the appearances we started with.
These maxims are remedies to magical mereology, anthropocentrism, and all manner of philosophical panic. But reading too much (or too little) into them can lead seekers from the Path. For instance, they may be wrongly taken to mean that the world is obliged to validate our initial impressions or our untrained intuitions. As a further corrective, I suggest: Reality is weirdly normal. It's "normal" in odd ways, by strange means, in surprising senses.
At the risk of vivisecting poetry, and maybe of stating the obvious, I'll point out that the maxims mean different things by "normal". In the first two, what's "normal" or "usual" is the universe taken on its own terms — the cosmos as it sees itself, or as an ideally calibrated demon would see it. In the third maxim, what's "normal" is the universe humanity perceives — though this still doesn't identify normality with what's believed or expected. Actually, it will take some philosophical work to articulate just what Egan's "normality" should amount to. I'll start with Copernicanism and reductionism, and then I'll revisit that question.
Engaging First Introductions to AI Risk
I'm putting together a list of short and sweet introductions to the dangers of artificial superintelligence.
My target audience is intelligent, broadly philosophical narrative thinkers, who can evaluate arguments well but who don't know a lot of the relevant background or jargon.
My method is to construct a Sequence mix tape — a collection of short and enlightening texts, meant to be read in a specified order. I've chosen them for their persuasive and pedagogical punchiness, and for their flow in the list. I'll also (separately) list somewhat longer or less essential follow-up texts below that are still meant to be accessible to astute visitors and laypeople.
The first half focuses on intelligence, answering 'What is Artificial General Intelligence (AGI)?'. The second half focuses on friendliness, answering 'How can we make AGI safe, and why does it matter?'. Since the topics of some posts aren't obvious from their titles, I've summarized them using questions they address.
Part I. Building intelligence.
1. Power of Intelligence. Why is intelligence important?
2. Ghosts in the Machine. Is building an intelligence from scratch like talking to a person?
3. Artificial Addition. What can we conclude about the nature of intelligence from the fact that we don't yet understand it?
4. Adaptation-Executers, not Fitness-Maximizers. How do human goals relate to the 'goals' of evolution?
5. The Blue-Minimizing Robot. What are the shortcomings of thinking of things as 'agents', 'intelligences', or 'optimizers' with defined values/goals/preferences?
Part II. Intelligence explosion.
6. Optimization and the Singularity. What is optimization? As optimization processes, how do evolution, humans, and self-modifying AGI differ?
7. Efficient Cross-Domain Optimization. What is intelligence?
8. The Design Space of Minds-In-General. What else is universally true of intelligences?
9. Plenty of Room Above Us. Why should we expect self-improving AGI to quickly become superintelligent?
Part III. AI risk.
10. The True Prisoner's Dilemma. What kind of jerk would Defect even knowing the other side Cooperated?
11. Basic AI drives. Why are AGIs dangerous even when they're indifferent to us?
12. Anthropomorphic Optimism. Why do we think things we hope happen are likelier?
13. The Hidden Complexity of Wishes. How hard is it to directly program an alien intelligence to enact my values?
14. Magical Categories. How hard is it to program an alien intelligence to reconstruct my values from observed patterns?
15. The AI Problem, with Solutions. How hard is it to give AGI predictable values of any sort? More generally, why does AGI risk matter so much?
Part IV. Ends.
16. Could Anything Be Right? What do we mean by 'good', or 'valuable', or 'moral'?
17. Morality as Fixed Computation. Is it enough to have an AGI improve the fit between my preferences and the world?
18. Serious Stories. What would a true utopia be like?
19. Value is Fragile. If we just sit back and let the universe do its thing, will it still produce value? If we don't take charge of our future, won't it still turn out interesting and beautiful on some deeper level?
20. The Gift We Give To Tomorrow. In explaining value, are we explaining it away? Are we making our goals less important?
Summary: Five theses, two lemmas, and a couple of strategic implications.
All of the above were written by Eliezer Yudkowsky, with the exception of The Blue-Minimizing Robot (by Yvain), Plenty of Room Above Us and The AI Problem (by Luke Muehlhauser), and Basic AI Drives (a wiki collaboration). Seeking a powerful conclusion, I ended up making a compromise between Eliezer's original The Gift We Give To Tomorrow and Raymond Arnold's Solstice Ritual Book version. It's on the wiki, so you can further improve it with edits.
Further reading:
- Three Worlds Collide (Normal), by Eliezer Yudkowsky
  - a short story vividly illustrating how alien values can evolve.
- So You Want to Save the World, by Luke Muehlhauser
  - an introduction to the open problems in Friendly Artificial Intelligence.
- Intelligence Explosion FAQ, by Luke Muehlhauser
  - a broad overview of likely misconceptions about AI risk.
- The Singularity: A Philosophical Analysis, by David Chalmers
  - a detailed but non-technical argument for expecting intelligence explosion, with an assessment of the moral significance of synthetic human and non-human intelligence.
I'm posting this to get more feedback for improving it, to isolate topics for which we don't yet have high-quality, non-technical stand-alone introductions, and to reintroduce LessWrongers to exceptionally useful posts I haven't seen sufficiently discussed, linked, or upvoted. I'd especially like feedback on how the list I provided flows as a unit, and what inferential gaps it fails to address. My goals are:
A. Via lucid and anti-anthropomorphic vignettes, to explain AGI in a way that encourages clear thought.
B. Via the Five Theses, to demonstrate the importance of Friendly AI research.
C. Via down-to-earth meta-ethics, humanistic poetry, and pragmatic strategizing, to combat any nihilisms, relativisms, and defeatisms that might be triggered by recognizing the possibility (or probability) of Unfriendly AI.
D. Via an accessible, substantive, entertaining presentation, to introduce the raison d'être of LessWrong to sophisticated newcomers in a way that encourages further engagement with LessWrong's community and/or content.
What do you think? What would you add, remove, or alter?
What do professional philosophers believe, and why?
LessWrong has twice discussed the PhilPapers Survey of professional philosophers' views on thirty controversies in their fields — in early 2011 and, more intensively, in late 2012. We've also been having some lively debates, prompted by LukeProg, about the general value of contemporary philosophical assumptions and methods. It would be swell to test some of our intuitions about how philosophers go wrong (and right) by looking closely at the aggregate output and conduct of philosophers, but relevant data is hard to come by.
Fortunately, Davids Chalmers and Bourget have done a lot of the work for us. They released a paper summarizing the PhilPapers Survey results two days ago, identifying, by factor analysis, seven major components consolidating correlations between philosophical positions, influences, areas of expertise, etc.
1. Anti-Naturalists: Philosophers of this stripe tend (more strongly than most) to assert libertarian free will (correlation with factor .66), theism (.63), the metaphysical possibility of zombies (.47), and A theories of time (.28), and to reject physicalism (.63), naturalism (.57), personal identity reductionism (.48), and liberal egalitarianism (.32).
Anti-Naturalists tend to work in philosophy of religion (.3) or Greek philosophy (.11). They avoid philosophy of mind (-.17) and cognitive science (-.18) like the plague. They hate Hume (-.14), Lewis (-.13), Quine (-.12), analytic philosophy (-.14), and being from Australasia (-.11). They love Plato (.13), Aristotle (.12), and Leibniz (.1).
2. Objectivists: They tend to accept 'objective' moral values (.72), aesthetic values (.66), abstract objects (.38), laws of nature (.28), and scientific posits (.28). Note that 'Objectivism' is being used here to pick out a tendency to treat value as objectively binding and metaphysical posits as objectively real; it isn't connected to Ayn Rand.
A disproportionate number of objectivists work in normative ethics (.12), Greek philosophy (.1), or philosophy of religion (.1). They don't work in philosophy of science (-.13) or biology (-.13), and aren't continentalists (-.12) or Europeans (-.14). Their favorite philosopher is Plato (.1); their least favorites are Hume (-.2) and Carnap (-.12).
3. Rationalists: They tend to self-identify as 'rationalists' (.57) and 'non-naturalists' (.33), to accept that some knowledge is a priori (.79), and to assert that some truths are analytic, i.e., 'true by definition' or 'true in virtue of meaning' (.72). They also tend to posit metaphysical laws of nature (.34) and abstracta (.28). 'Rationalist' here clearly isn't being used in the LW or freethought sense; philosophical rationalists as a whole in fact tend to be theists.
Rationalists are wont to work in metaphysics (.14), and to avoid thinking about the sciences of life (-.14) or cognition (-.1). They are extremely male (.15), inordinately British (.12), and prize Frege (.18) and Kant (.12). They absolutely despise Quine (-.28, the largest correlation for a philosopher), and aren't fond of Hume (-.12) or Mill (-.11) either.
4. Anti-Realists: They tend to define truth in terms of our cognitive and epistemic faculties (.65) and to reject scientific realism (.6), a mind-independent and knowable external world (.53), metaphysical laws of nature (.43), and the notion that proper names have no meaning beyond their referent (.35).
They are extremely female (.17) and young (.15 correlation coefficient for year of birth). They work in ethics (.16), social/political philosophy (.16), and 17th-19th century philosophy (.11), avoiding metaphysics (-.2) and the philosophies of mind (-.15) and language (-.14). Their heroes are Kant (.23), Rawls (.14), and, interestingly, Hume (.11). They avoid analytic philosophy even more than the anti-naturalists do (-.17), and aren't fond of Russell (-.11).

5. Externalists: Really, they just like everything that anyone calls 'externalism'. They think the content of our mental lives in general (.66) and perception in particular (.55), and the justification for our beliefs (.64), all depend significantly on the world outside our heads. They also think that you can fully understand a moral imperative without being at all motivated to obey it (.5).
6. Star Trek Haters: This group is less clearly defined than the above ones. The main thing uniting them is that they're thoroughly convinced that teleportation would mean death (.69). Beyond that, Trekophobes tend to be deontologists (.52) who don't switch on trolley dilemmas (.47) and like A theories of time (.41).
Trekophobes are relatively old (-.1) and American (.13 affiliation). They are quite rare in Australia and Asia (-.18 affiliation). They're fairly evenly distributed across philosophical fields, and tend to avoid weirdo intuitions-violating naturalists — Lewis (-.13), Hume (-.12), analytic philosophers generally (-.11).
7. Logical Conventionalists: They two-box on Newcomb's Problem (.58), reject nonclassical logics (.48), and reject epistemic relativism and contextualism (.48). So they love causal decision theory, think all propositions/facts are generally well-behaved (always either true or false and never both or neither), and think there are always facts about which things you know, independent of who's evaluating you. Suspiciously normal.
They're also fond of a wide variety of relatively uncontroversial, middle-of-the-road views most philosophers agree about or treat as 'the default' — political egalitarianism (.33), abstract object realism (.3), and atheism (.27). They tend to think zombies are metaphysically possible (.26) and to reject personal identity reductionism (.26) — which aren't metaphysically innocent or uncontroversial positions, but, again, do seem to be remarkably straightforward and banal approaches to all these problems. Notice that a lot of these positions are intuitive and 'obvious' in isolation, but that they don't converge upon any coherent world-view or consistent methodology. They clearly aren't hard-nosed philosophical conservatives like the Anti-Naturalists, Objectivists, Rationalists, and Trekophobes, but they also clearly aren't upstart radicals like the Externalists (on the analytic side) or the Anti-Realists (on the continental side). They're just kind of, well... obvious.
Conventionalists are the only identified group that is strongly analytic in orientation (.19). They tend to work in epistemology (.16) or philosophy of language (.12), and are rarely found in 17th-19th century (-.12) or continental (-.11) philosophy. They're influenced by notorious two-boxer and modal realist David Lewis (.1), and show an aversion to Hegel (-.12), Aristotle (-.11), and Wittgenstein (-.1).
An observation: Different philosophers rely on — and fall victim to — substantially different groups of methods and intuitions. A few simple heuristics, like 'don't believe weird things until someone conclusively demonstrates them' and 'believe things that seem to be important metaphysical correlates for basic human institutions' and 'fall in love with any views starting with "ext"', explain a surprising amount of diversity. And there are clear common tendencies to either trust one's own rationality or to distrust it in partial (Externalism) or pathological (Anti-Realism, Anti-Naturalism) ways. But the heuristics don't hang together in a single Philosophical World-View or Way Of Doing Things, or even in two or three such world-views.
There is no large, coherent, consolidated group that's particularly attractive to LWers across the board, but philosophers seem to fall short of LW expectations for some quite distinct reasons. So attempting to criticize, persuade, shame, praise, or even speak of or address philosophers as a whole may be a bad idea. I'd expect it to be more productive to target specific 'load-bearing' doctrines on dimensions like the above than to treat the group as a monolith, for many of the same reasons we don't want to treat 'scientists' or 'mathematicians' as monoliths.
Another important result: Something is going seriously wrong with the high-level training and enculturation of professional philosophers. Or the individual fields are just attracting thinkers who are disproportionately bad at critically assessing a number of the basic claims their field is predicated on or exists to assess.
Philosophers working in decision theory are drastically worse at Newcomb than are other philosophers, two-boxing 70.38% of the time where non-specialists two-box 59.07% of the time (normalized after getting rid of 'Other' answers). Philosophers of religion are the most likely to get questions about religion wrong — 79.13% are theists (compared to 13.22% of non-specialists), and they tend strongly toward the Anti-Naturalism dimension. Non-aestheticians think aesthetic value is objective 53.64% of the time; aestheticians think it's objective 73.88% of the time. Working in epistemology tends to make you an internalist, philosophy of science tends to make you a Humean, metaphysics a Platonist, ethics a deontologist. This isn't always the case; but it's genuinely troubling to see non-expertise emerge as a predictor of getting any important question in an academic field right.
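For readers wondering what 'normalized after getting rid of 'Other' answers' amounts to, here is a minimal sketch of one way to do it. The function name and the response counts are hypothetical, chosen only to make the arithmetic visible; they are not the survey's data, and the paper's own procedure may differ in detail.

```python
def normalize_without_other(counts, drop="Other"):
    """Return each answer's share (in percent) after discarding one category.

    `counts` maps answer labels to raw response counts.
    """
    kept = {label: n for label, n in counts.items() if label != drop}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no responses left after dropping %r" % drop)
    return {label: 100.0 * n / total for label, n in kept.items()}


# Hypothetical counts for illustration only (NOT PhilPapers Survey data).
example = {"one box": 30, "two boxes": 50, "Other": 20}
shares = normalize_without_other(example)

# Raw share of two-boxers is 50 out of 100 responses; once the 20 'Other'
# answers are dropped, it is 50 out of the remaining 80.
print(shares)
```

The point of normalizing this way is that groups with very different rates of 'Other' answers can still be compared on the question they actually took a side on.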
EDIT: I've replaced "cluster" talk above with "dimension" talk. I had in mind gjm's "clusters in philosophical idea-space", not distinct groups of philosophers. gjm makes this especially clear:
The claim about these positions being made by the authors of the paper is not, not even a little bit, "most philosophers fall into one of these seven categories". It is "you can generally tell most of what there is to know about a philosopher's opinions if you know how well they fit or don't fit each of these seven categories". Not "philosopher-space is mostly made up of these seven pieces" but "philosopher-space is approximately seven-dimensional".
I'm particularly guilty of promoting this misunderstanding (including in portions of my own brain) by not noting that the dimensions can be flipped to speak of (anti-anti-)naturalists, anti-rationalists, etc. My apologies. As Douglas_Knight notes below, "If there are clusters [of philosophers], PCA might find them, but PCA might tell you something interesting even if there are no clusters. But if there are clusters, the factors that PCA finds won't be the clusters, but the differences between them. [...] Actually, factor analysis pretty much assumes that there aren't clusters. If factor 1 put you in a cluster, that would tell pretty much all there is to say and would pin down your factor 2, but the idea in factor analysis is that your factor 2 is designed to be as free as possible, despite knowing factor 1."
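To make the dimension-versus-cluster point concrete, here is a toy simulation, not a reanalysis of the survey. It assumes a cluster-free population in which every simulated philosopher has independent, continuous scores on two made-up factors, plus one survey answer constructed to correlate about .66 with the first factor (the figure reported above for libertarian free will). The Gaussian scores, the continuous 'answer', and all variable names are simplifying assumptions for illustration.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical factor scores: continuous and independent, so no clusters.
factor1 = [rng.gauss(0, 1) for _ in range(n)]
factor2 = [rng.gauss(0, 1) for _ in range(n)]

# A hypothetical survey answer built to correlate about .66 with factor 1:
# part factor score, part unrelated noise, with total variance 1.
loading = 0.66
noise_weight = (1 - loading ** 2) ** 0.5
answer = [loading * f1 + noise_weight * rng.gauss(0, 1) for f1 in factor1]

r_answer_f1 = pearson(answer, factor1)  # should land near the loading
r_f1_f2 = pearson(factor1, factor2)     # should land near zero

print("answer vs factor 1:", round(r_answer_f1, 2))
print("factor 1 vs factor 2:", round(r_f1_f2, 2))
```

In this toy population nobody belongs to a group, yet the answer still tracks factor 1, and a philosopher's position on factor 1 tells you essentially nothing about where they sit on factor 2. That is the sense in which the seven components are axes rather than camps.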