I think something else is going on with the 2 4 6 experiment, as described. Many of the students are making an assumption about the set of potential rules. Specifically, the assumption is that most pairs of rules in this set have the following mutual relationship: most of the instances allowed by one rule are disallowed by the other rule. If that is the case, then the quickest way to test any hypothetical rule is to produce a variety of instances which conform with that rule, and see whether they conform with the hidden rule.
I'll give you an example. Suppose that we are considering a family of rules, "the third number is an integer polynomial of the first two numbers". The quickest way to disconfirm a hypothetical rule is to produce instances in accordance with it and test them. If the rule is wrong, then the chances are good that an instance will quickly be discovered that does not match the hidden rule. It is much less efficient to proceed by producing instances not in accordance with it.
I'll give a specific example. Suppose the hidden rule is c = a + b, and the hypothesized rule being tested is c = a - b. Now pick just one random instance in accordance with the hypothesized rule. I will suppose a = 4, b = 6, so c = -2. So the instance is 4 6 -2. That instance does not match the hidden rule, so the hypothesized rule is immediately disconfirmed. Now try the following: instead of picking a random instance in accordance with the hypothesized rule, pick one not in accordance with it. I'll pick 4 6 8. This also fails to match the hidden rule, so it fails to tell us whether our hypothesized rule is correct. We see that it was quicker to test an instance that agrees with the hypothetical rule.
Thus we can see that in a certain class of situations, the most efficient way to test a hypothesis is to come up with instances that conform with the hypothesis.
Now you can fault people for having made this assumption. But if you do, then it is still a different error from the one described. If the assumption about the kind of problem faced had been correct, then the approach (testing instances that agree with the hypothesis) would have been a good one. The error, if any, lies not in the approach per se but in the assumption.
Finally, I do not think one can rightly fault people for making that assumption. For, it is inevitable that very large and completely untested assumptions must be made in order to come to a conclusion at all. For, infinitely many rules are consistent with the evidence no matter how many instances you test. The only way ever to whittle this infinity of rules consistent with all the evidence down to one concluded rule is to make very large assumptions. The assumption that I have described may simply be the assumption which they made (and they had to make some assumption).
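One way to see the point about infinitely many consistent rules is a small sketch. The evidence list below is made up for illustration, and the "ascending, except one triplet" family is just one hypothetical way to build rival rules; any untested triplet would serve as the exception.

```python
# Illustrative sketch: a finite record of tests never singles out one rule.
# The evidence and the family of rival rules are assumptions for this example.

def ascending(t):
    a, b, c = t
    return a < b < c

def make_exception_rule(excluded):
    """Build the rule 'ascending, except that the triplet `excluded` is disallowed'."""
    def rule(t):
        return ascending(t) and t != excluded
    return rule

def consistent(rule, evidence):
    """True if the rule gives the recorded answer for every tested triplet."""
    return all(rule(t) == answer for t, answer in evidence)

# Hypothetical record of tests: (triplet, experimenter's answer).
evidence = [((2, 4, 6), True), ((8, 10, 12), True), ((4, 6, 2), False)]

# Each untested ascending triplet yields a different rule that fits the same evidence.
rivals = [make_exception_rule((n, n + 1, n + 2)) for n in range(100, 110)]

print(consistent(ascending, evidence))                # True
print(all(consistent(r, evidence) for r in rivals))   # True
```

The range of exceptions could be extended without limit, so only an assumption about which rules are plausible (simplicity, for instance) narrows the field.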
Furthermore, no matter what assumptions people make (and they must make some, because of the nature of the problem), a clever scientist can learn what assumptions people tend to make and then violate those assumptions. So no matter what people do, someone can come along, construct an experiment in which those assumptions are violated, and then say "gotcha" when the majority of his test subjects come to the wrong conclusions (because the assumptions they were making were violated by the experiment).
Another serious problem is that the students must make the necessary assumption that the rule be simple. In the context of school, simple is generally "most trivial to figure out".
This is a necessary assumption because there could be rules that would not be possible to determine by guessing. For example, you'd have to spend the lifetime of the universe guessing triplets to correctly identify that the rule is "Ascending integers except sequences containing the 22nd Busy Beaver number", and then you still wouldn't know if there's some o...
I am teaching a class, and I write upon the blackboard three numbers: 2-4-6. “I am thinking of a rule,” I say, “which governs sequences of three numbers. The sequence 2-4-6, as it so happens, obeys this rule. Each of you will find, on your desk, a pile of index cards. Write down a sequence of three numbers on a card, and I’ll mark it ‘Yes’ for fits the rule, or ‘No’ for not fitting the rule. Then you can write down another set of three numbers and ask whether it fits again, and so on. When you’re confident that you know the rule, write down the rule on a card. You can test as many triplets as you like.”
Here’s the record of one student’s guesses:
At this point the student wrote down their guess at the rule. What do you think the rule is? Would you have wanted to test another triplet, and if so, what would it be? Take a moment to think before continuing.
The challenge above is based on a classic experiment due to Peter Wason, the 2-4-6 task. Although subjects given this task typically expressed high confidence in their guesses, only 21% of the subjects successfully guessed the experimenter’s real rule, and replications since then have continued to show success rates of around 20%.
The study was called “On the failure to eliminate hypotheses in a conceptual task.” Subjects who attempt the 2-4-6 task usually try to generate positive examples, rather than negative examples—they apply the hypothetical rule to generate a representative instance, and see if it is labeled “Yes.”
Thus, someone who forms the hypothesis “numbers increasing by two” will test the triplet 8-10-12, hear that it fits, and confidently announce the rule. Someone who forms the hypothesis X-2X-3X will test the triplet 3-6-9, discover that it fits, and then announce that rule.
In every case the actual rule is the same: the three numbers must be in ascending order.
But to discover this, you would have to generate triplets that shouldn’t fit, such as 20-23-26, and see if they are labeled “No.” Which people tend not to do, in this experiment. In some cases, subjects devise, “test,” and announce rules far more complicated than the actual answer.
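A minimal sketch of why positive tests fail here, assuming the hypothesis "numbers increasing by two" and the actual rule stated above. The function names and the extra test triplets are hypothetical and chosen only for illustration.

```python
# Illustrative sketch: when the hypothesis is a special case of the actual
# rule, triplets generated from the hypothesis can never expose the difference.

def actual_rule(t):
    """The experimenter's rule: the three numbers are in ascending order."""
    a, b, c = t
    return a < b < c

def increasing_by_two(t):
    """The subject's hypothesis: each number is two more than the previous one."""
    a, b, c = t
    return b - a == 2 and c - b == 2

def disagreements(hypothesis, triplets):
    """Triplets on which the hypothesis and the actual rule give different answers."""
    return [t for t in triplets if hypothesis(t) != actual_rule(t)]

# Positive tests: triplets that fit the hypothesis (hypothetical choices).
positive_tests = [(8, 10, 12), (1, 3, 5), (20, 22, 24)]

# Negative test: a triplet the hypothesis says should not fit.
negative_tests = [(20, 23, 26)]

print(disagreements(increasing_by_two, positive_tests))  # []
print(disagreements(increasing_by_two, negative_tests))  # [(20, 23, 26)]
```

Every triplet that increases by two is also ascending, so the positive tests all come back "Yes" and the hypothesis survives. Only the triplet the hypothesis rejects reveals the mismatch.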
This cognitive phenomenon is usually lumped in with “confirmation bias.” However, it seems to me that the phenomenon of trying to test positive rather than negative examples, ought to be distinguished from the phenomenon of trying to preserve the belief you started with. “Positive bias” is sometimes used as a synonym for “confirmation bias,” and fits this particular flaw much better.
It once seemed that phlogiston theory could explain a flame going out in an enclosed box (the air became saturated with phlogiston and no more could be released). But phlogiston theory could just as well have explained the flame not going out. To notice this, you have to search for negative examples instead of positive examples, look into zero instead of one; which goes against the grain of what experiment has shown to be human instinct.
For by instinct, we human beings only live in half the world.
One may be lectured on positive bias for days, and yet overlook it in-the-moment. Positive bias is not something we do as a matter of logic, or even as a matter of emotional attachment. The 2-4-6 task is “cold,” logical, not affectively “hot.” And yet the mistake is sub-verbal, on the level of imagery, of instinctive reactions. Because the problem doesn’t arise from following a deliberate rule that says “Only think about positive examples,” it can’t be solved just by knowing verbally that “We ought to think about both positive and negative examples.” Which example automatically pops into your head? You have to learn, wordlessly, to zag instead of zig. You have to learn to flinch toward the zero, instead of away from it.
I have been writing for quite some time now on the notion that the strength of a hypothesis is what it can’t explain, not what it can—if you are equally good at explaining any outcome, you have zero knowledge. So to spot an explanation that isn’t helpful, it’s not enough to think of what it does explain very well—you also have to search for results it couldn’t explain, and this is the true strength of the theory.
So I said all this, and then I challenged the usefulness of “emergence” as a concept. One commenter cited superconductivity and ferromagnetism as examples of emergence. I replied that non-superconductivity and non-ferromagnetism were also examples of emergence, which was the problem. But far be it from me to criticize the commenter! Despite having read extensively on “confirmation bias,” I didn’t spot the “gotcha” in the 2-4-6 task the first time I read about it. It’s a subverbal blink-reaction that has to be retrained. I’m still working on it myself.
So much of a rationalist’s skill is below the level of words. It makes for challenging work in trying to convey the Art through words. People will agree with you, but then, in the next sentence, do something subdeliberative that goes in the opposite direction. Not that I’m complaining! A major reason I’m writing this is to observe what my words haven’t conveyed.
Are you searching for positive examples of positive bias right now, or sparing a fraction of your search on what positive bias should lead you to not see? Did you look toward light or darkness?