In my previous post, I advanced the view that discovering and using many weak arguments generally produces better predictive models for answering questions about the human world than discovering and using a single relatively strong argument does.

My impression is that most high-functioning people use the “many weak arguments” epistemic framework, and that this contrasts with people like my (past) self. I believe that people like me have misunderstood parts of the reasoning of most high-functioning people due to the typical mind fallacy, and that by extension, we have misunderstood parts of how society works.

I flesh out my thinking on this point below.

Simplistic generalizations and the typical mind

When we recognize that a member of a reference class has a given feature, we tend to generalize that feature to all members of the reference class. For example, when we encounter an immigrant from a given country, we reflexively assume that other people from that country share this immigrant’s personality.

Because most people don’t have the resources or inclination to focus on improving their epistemic rationality, their beliefs are in part derived from uncritical simplistic generalizations of this type.

The reason that this sort of works in practice is that information about one member of a reference class is in fact evidence about other members of the reference class: Bayes’ theorem implies that the presence of a feature in one member is evidence for its presence in the others, insofar as the members can be modeled as draws from a shared distribution. But such simplistic generalization often yields bad epistemology, and the functionality of most people’s epistemology is often highly contingent on their adopting the beliefs of those around them, which are functional by virtue of having survived natural selection.
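To make the Bayesian point concrete, here is a minimal sketch in Python, assuming a Beta-Bernoulli model in which the members of a reference class are draws from a shared unknown feature rate. The model and the numbers are illustrative assumptions, not something from the post:

```python
# Minimal sketch: why one member of a reference class is evidence about another.
# Assumes all members share an unknown feature rate theta with a Beta prior;
# this shared-distribution assumption is what makes the inference work, and
# what fails when the reference class is poorly chosen.

def predictive_probability(alpha: float, beta: float) -> float:
    """P(a member has the feature) under a Beta(alpha, beta) belief about theta."""
    return alpha / (alpha + beta)

# Start with a uniform prior over the feature rate: Beta(1, 1).
print(predictive_probability(1.0, 1.0))  # 0.5: no information yet

# Observe that one member has the feature; the posterior becomes Beta(2, 1).
print(predictive_probability(2.0, 1.0))  # ~0.667: another member is now likelier to have it
```

The shared-distribution assumption is doing all the work here, which is why a poorly chosen reference class can make the same inference misleading.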

The dangers of uncritical generalization 

People responded to the Islamic terrorist attacks of September 11, 2001 by developing xenophobia toward Muslims in general, even though the terrorists represented a tiny fraction of Muslims. This xenophobia did a great deal of harm, and if people had not been thinking in such sweeping terms, much of the harm might have been averted.

With such examples in mind, much of my past effort to improve my epistemic rationality has focused on refining the reflexive simplistic generalizations that I make. I’ve put a great deal of effort into appreciating and understanding the nuances present in a given reference class, and into figuring out the appropriate subcategory of a reference class to which a feature of a given member can be extrapolated.

This style is characteristic of many of my friends. 

Adroit use of simplistic generalizations 

My previous efforts to refine my reflexive simplistic generalizations have largely consisted of working to discover a single relatively strong argument for or against a proposition. As I discussed in my previous post, I now believe using many weak arguments generally yields better predictions about the human world.

Weak arguments often arise from simplistic generalizations. This is exemplified by the arguments that I gave for majoring in a quantitative subject increasing earnings. The statements “the people who are wealthier majored in quantitative subjects,” “high-paying jobs use quantitative skills,” “if you major in a quantitative subject, that shows that you’re smart,” “math teaches you to think,” “people say that majoring in a quantitative subject increases earnings” and “my friends think that majoring in a quantitative subject increases earnings” are each examples of placing something in a reference class that has a substantial probability of being inappropriate.

In my previous post, I used reference classes that have a substantial probability of being inappropriate in order to derive a confident conclusion. I didn’t do substantive object-level investigation of whether or not the reference classes are appropriate. You don’t need such an investigation to come to a fairly confident conclusion; all that you need is a sufficiently large number of unrelated reference classes.
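To see roughly why a large number of unrelated reference classes can suffice, one can model each weak argument as an independent likelihood ratio and add log-odds. This is a naive sketch: the independence assumption is an idealization that real arguments only approximate, and the 2:1 ratio per argument is an illustrative figure, not a measured one.

```python
import math

# Naive sketch: combining many weak, independent arguments in log-odds space.
# Assumes each argument independently favors the conclusion by a modest
# likelihood ratio; correlated arguments would count for less than this.

def combined_probability(prior: float, likelihood_ratios: list[float]) -> float:
    """Posterior probability after multiplying the prior odds by each ratio."""
    log_odds = math.log(prior / (1 - prior))
    for lr in likelihood_ratios:
        log_odds += math.log(lr)
    return 1 / (1 + math.exp(-log_odds))

# Six weak arguments, each worth only 2:1 evidence, starting from a 50/50 prior.
print(combined_probability(0.5, [2.0] * 6))  # ~0.985: weak evidence compounds
```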

On pages 27-28 of Intelligence Explosion Microeconomics, Eliezer wrote:

Aside from the Lucas critique, the other major problem I have with the “outside view” is that everyone who uses it seems to come up with a different reference class and a different answer […] I don’t know what to do after two people take different reference classes and come up with different outside views, both of which we ought to just accept. My experience is that people end up doing the equivalent of saying, “I’m taking my reference class and going home.” 

I’m sympathetic to Eliezer’s concerns with the use of the outside view: one can always find a reference class that supports one’s conclusion, and it’s unclear what the “correct” reference class is. Eliezer has argued that the weak inside view is a better alternative. 

There’s not a dichotomy between “the outside view” and “the weak inside view” — one can instead use many independent outside views. This is the “many weak arguments” approach. I believe that this alternative is largely free of the problems with “reference class tennis” that Eliezer highlights, and is generally superior to the use of the weak inside view. 

One can mitigate the problems with individual simplistic generalizations by combining simplistic generalizations from many different reference classes and considering the composite picture.

Comparing the two approaches

A basic epistemic task is this: one observes a member of a given reference class, and wants to determine whether another member of the class shares a given feature of the first. We start our lives by naively extrapolating from the features of the first example to the features of the second. There are two basic ways of improving this aspect of one’s epistemology.

One way is to scrutinize the reference class that the first example falls into, alter it until one finds the largest reference class whose members all share the feature of the sample member, and then see whether the second member of interest falls into this class. This is the “one relatively strong argument” approach, and was previously my dominant mode of operation.

The other way is to attempt to consider many unrelated reference classes that both examples may or may not fit into, keep track of how many reference classes both examples fit into, and use the principle of consilience to assess whether the feature of the first example can reliably be extrapolated to the second example. This is the “many weak arguments” approach. I’ve been striving to make this my dominant mode of operation.
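As a toy illustration of the difference between the two procedures: the sketch below models reference classes as plain sets, with the membership test standing in for substantive judgment. It is an assumed formalization for illustration, not a recipe given in the post.

```python
# Toy contrast between the two approaches, with reference classes as sets of
# members and a dictionary recording which members are known to have the feature.

def one_strong_argument(b, refined_classes, feature_holds):
    """Approach 1: look for a single refined class whose known members all
    share the feature, then check whether the member of interest falls into it."""
    for cls in refined_classes:
        known = [m for m in cls if m in feature_holds]
        if known and all(feature_holds[m] for m in known) and b in cls:
            return True
    return False

def many_weak_arguments(a, b, unrelated_classes, threshold=0.7):
    """Approach 2 (consilience): count how many unrelated classes contain both
    members, and extrapolate when most of them agree."""
    shared = sum(1 for cls in unrelated_classes if a in cls and b in cls)
    return shared >= threshold * len(unrelated_classes)
```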

The use of many weak arguments as the default mode of operation for most high-functioning people

I believe that most high-functioning people rely primarily on many weak arguments and the principle of consilience. Some reasons that I believe this are:

  • Given that simplistic generalizations are a large input into people’s initial epistemology, it seems more likely that people would improve their epistemology by using such generalizations in conjunction with one another than by adopting some other epistemological principle.
  • With the exception of people on Less Wrong and people in the mathematical community, I’ve almost never seen high-functioning people use the “one relatively strong argument” approach.
  • I’ve been told that successful venture capitalists use the “many weak arguments” approach, and have been referred to notes on Peter Thiel’s class about startups as evidence for this claim. I haven’t investigated further, but intend to do so.
  • I’ve been very impressed by the epistemic standards at GiveWell, where I worked for a year. My assessment is based on the content of GiveWell’s website, and on my experience working there.

    In Maximizing Cost-Effectiveness via Critical Inquiry, Co-Executive Director Holden Karnofsky expressed the position that rather than relying on explicit cost-effectiveness estimates (a particular example of the use of one relatively strong argument), one should give such estimates only a small amount of weight and examine a philanthropic opportunity from many different angles.

How could we have missed this? 

If I’m right about all of this, a question arises: why have people like me been oblivious to other people’s epistemological framework, and to its strengths relative to our own? The claim that I’m making is a strong one, given that people like me were previously unaware of evidence for it, so I think it’s important to address this question. Here are some hypotheses:

  • More communication of single relatively strong arguments — A single relatively strong argument generally involves many steps, so working it out often requires writing it down. By way of contrast, the “many weak arguments” approach doesn’t involve such long inferential chains, so there’s less need to verbalize the analysis.

    A single relatively strong argument also lends itself to “story-telling” to a greater extent than a collection of many weak arguments does. As cognitive scientist Daniel Willingham discussed in The Privileged Status of Story, stories carry special significance for humans, and are more enjoyable to read and to write than lists of individual facts.

    So the “many weak arguments” approach is less salient in verbal discourse than the “one relatively strong argument” approach. Carl Shulman is one of the only people I’ve encountered who explicitly lists many weak arguments for or against a position. (For more on Carl’s style, see Luke’s post Just the facts, ma'am!)

  • Compensatory behavior — When somebody is very strong in one area and weak in another, he or she can often use the strength in the first area to compensate for the weakness in the second, and this can mask the weakness, making it hard to recognize. Some examples of this are as follows:

    (i) My verbal comprehension score is 1.5 standard deviations above my working memory score, which is quite unusual. I only fully realized this very recently, because I had been able to use my relatively high verbal comprehension to partially compensate for my relatively low working memory.

    (ii) I have a friend who’s extraordinarily productive and who’s involved in many different activities in a substantive way. It’s only recently that he realized how high his productivity is, and that he hadn’t been triaging. My impression is that because his productivity was so high, he never needed to learn to triage, in contrast with most people, who are forced to learn by necessity.

    (iii) Isaac Newton missed the easy proof of the power series expansion of the sine function (using Taylor’s theorem) because he was so powerful a mathematician that he was able to derive the expansion without trouble in a very roundabout and difficult way.

    The “one relatively strong argument” epistemological framework is genuinely better than most people’s, which tends to be of the “few weak arguments” type. So being unaware of the virtues of using multiple weak arguments doesn’t cripple the epistemology of those who use “one relatively strong argument” so badly as to pull it below average. Moreover, people who use the “one relatively strong argument” approach are generally better at recognizing selection effects than people who use the “many weak arguments” approach, and on that front they have the edge.

  • Typical mind fallacy — Humans naturally model other people’s minds on their own. This conspires with the above factors so that people who use the “one relatively strong argument” approach are apt to misread the “many weak arguments” approach as the “few weak arguments” approach, which can be viewed as an inferior version of their own. This misreading is understandable: the fact that people are using weak arguments is more salient than the number and independence of the weak arguments that they make.

Implications

The phenomena discussed above have important implications:

  • Developing facility with the use of many weak arguments is low-hanging fruit — If people like me have been neglecting the virtues of using many weak arguments, that suggests that we can improve our epistemology a great deal by working on this area, because diminishing marginal returns haven’t yet set in, in contrast with the areas of epistemology that we’ve been working on for a long time.
  • People like me should update in the direction of other people being more rational — Because we’ve misread the “many weak arguments” style as the “few weak arguments” style, we’ve probably underestimated the rationality of most high-functioning people. With this in mind, we should update in the direction of such people being more rational than we had thought.
  • We should pay more attention to people’s bottom line than to their stated reasons — If most high-functioning people aren’t relying heavily on any one of the arguments that they give, then when a typical high-functioning person responds to a query of the type “Why do you think X?” by saying “I believe X because of argument Y,” we shouldn’t conclude that the person assigns high confidence to argument Y. Rather, we should assume that argument Y is one of many arguments that they hold with low confidence, most of which they’re not expressing, and we should focus on their belief in X rather than on argument Y.

Acknowledgements: Thanks to Vipul Naik, Luke Muehlhauser and Nick Beckstead for very helpful comments on an earlier draft of this post. 

Note: I formerly worked as a research analyst at GiveWell. All views expressed here are my own.

 

Comments

The reason that this sort of works in practice is that information about one member of a reference class is in fact evidence about other members of the reference class. This is just Bayes’ theorem.

Is it? It's Bayes' theorem together with the assumption that when you see two things that you've put in the same reference class then they're being drawn from the same distribution. Depending on how the reference class is constructed, this may or may not be a reasonable assumption (if it's constructed poorly the distribution may have more salient and unknown parameters than you can reasonably learn). At worst, the reference class might be "everything in the universe," in which case I suppose it's strictly speaking true that information about one thing in the universe is evidence about other things in the universe, but...

Yes, I wasn't claiming that it's good use of Bayes' theorem. The "sort of" qualification is significant, although I don't think that the use of the "worst case reference class" prevails in practice :-).

Yes, I wasn't claiming that it's good use of Bayes' theorem.

I think most readers will read the phrase "This is just Bayes' theorem" as "This is correct use of Bayes' theorem."

The claim that I intended to make is that "Bayes' theorem implies that the presence of a feature of a member of a given reference class is evidence for the presence of the feature in other members of the reference class." This is technically correct. It's not good epistemology in full generality, for the reason that Qiaochu gives. I'll modify my post to make what I was trying to say more clear.

With the exception of people on Less Wrong and people in the mathematical community, I’ve almost never seen high-functioning people use the “one relatively strong argument” approach.

I think it's more general than that (depending on your definition of the 'mathematical community'). For example, I rarely see physicists attempt to argue something based on many weak arguments, and I think you would find the same to be true of engineers. More generally, I think that anyone who's used to formalism is used to being presented with extremely strong arguments, and to ending the search for arguments there. Consider a Bayesian actor who happens to be in a quantitative field of study:

I decide proposition A is true, and sketch out a proof on some scratch paper. The probability that I made a mistake is significantly smaller than the probability that I didn't. I go home and write the proof out formally and carefully, and the probability of me being wrong drops further. I ask a peer to look over it, and the probability that I made a mistake is vanishingly small. If prop A is important, then I may publish it, and after peer review, I can say that I have a strong argument for A: I have a proof P, and if P is correct, then so is A, with probability 1. The probability that P is incorrect is small, thanks to the formalism and many levels of peer review.

Since most of the arguments we believe are thus strong arguments, this trains our intuition with a heuristic to not bother looking for arguments that aren't extremely strong. This effect would probably scale with the rigor of the field (e.g., it would be much stronger in mathematicians, where proofs are essentially the only form of argument written down).

The best physicists use the "many weak arguments" approach at least sometimes. See my post on Euler and the Basel Problem for an example of this sort of thing. (Nowadays, physicists fall into the Eulerian tradition more than mathematicians do.)

A close friend who's a general relativity theorist has told me that the best physicists rely primarily on many weak arguments.

Hmm, I think I may be misunderstanding what you mean by "many weak arguments." As in, I don't think it's uncommon for physicists to make multiple arguments in support of a proposition, but each of those arguments, IME, is strong enough to bet at least a year of one's career on (e.g. the old arguments for renormalization), by contrast with, say, continental drift, where you probably wouldn't be taken seriously if you'd produced merely one or two lines of evidence. What this shares with the "one strong argument" position is that we're initially looking for a sufficiently convincing argument, discarding lines of thought that would lead to insufficiently strong arguments. It's different mostly in that we go back and find more arguments "to be extra sure," but you're still screening your arguments for sufficient strength as you make them.

Though admittedly, as a student, I may be biased towards finding my professors' arguments more convincing than they ought to be.

Relying on a small number of strong arguments (or even one) has a clear drawback: change. A new discovery can invalidate a single argument that seemed very strong in the past. Many weaker arguments have more stability.

Yes, I discuss this in the "Major weaknesses of the 'single relatively strong argument' approach" section of an earlier post.