I suspect respondents are answering different questions from the ones asked. Where the question does not include probability values for the options, the respondents make up their own. The analysis also does not account for respondents arbitrarily ordering what they perceive as equal probabilities. And finally, they may be changing the component probabilities, so that they use different probability values from one option to the next.
If the respondents are actually reading the probabilities as independent, and assigning probabilities such as A: P(Accountant) = 0.1, C: P(Jazz) = 0.01, E: P(Accountant^Jazz) = P(Accountant) x P(Jazz) = 0.001, then you would expect the correct ranking, A > C > E.
But if they are perceiving E as conditional, then P(Accountant|Jazz) = P(Accountant^Jazz)/P(Jazz) = 0.001/0.01 = 0.1, and, leaving the tied A and E in the order A, E, they end up with A >= E > C. It's also possible they are using an intuitive conditional probability and ranking coarsely and approximately, without calculation.
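The two readings above can be checked with a few lines of arithmetic. This is only a sketch: the values 0.1 and 0.01 are the illustrative guesses used in this comment, not data from the study, and the helper name `ranking` is made up for the example. Exact fractions are used so that the tie between A and E is a true tie rather than a floating-point accident.

```python
from fractions import Fraction

# Illustrative guesses from the comment above, not measured values.
p_accountant = Fraction(1, 10)   # A
p_jazz = Fraction(1, 100)        # C

# Reading 1: E is the conjunction of two independent events.
p_both = p_accountant * p_jazz               # 1/1000

# Reading 2: E is misread as the conditional "accountant, given jazz".
p_accountant_given_jazz = p_both / p_jazz    # 1/10, the same as A


def ranking(scores):
    """Labels from most to least probable; ties keep their input order."""
    return [label for label, _ in sorted(scores.items(), key=lambda kv: -kv[1])]


conjunction_ranking = ranking({"A": p_accountant, "C": p_jazz, "E": p_both})
conditional_ranking = ranking(
    {"A": p_accountant, "E": p_accountant_given_jazz, "C": p_jazz}
)

print(conjunction_ranking)   # ranking under the independent-conjunction reading
print(conditional_ranking)   # ranking under the conditional misreading
```

Under the first reading E comes last; under the second, E ties with A and lands ahead of C only because of how the tie is broken.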
They may also be doing an intuitive version of the following, by reading the questions in order:
A: Yeah, sounds about right for Bill. Let's say 0.1
C: Nah, no way does Bill play Jazz. Let's say zero!
E: Well, I really don't think he plays jazz, and I really thought he'd be an accountant. But I guess he could be both. In this case I'm going for 0.05 accountant, but 0.02 Jazz. 0.05 x 0.02 = 0.001
So, A > E > C
In this last case, the possibility that he is both an accountant and a jazz player (E) seems more plausible than that he plays jazz and is not an accountant (reading C as excluding being an accountant). Of course C does not rule out him also being an accountant, but that's not what appears to be the intuitive implication of C. It's as if the respondent is thinking, why would they include E if C already includes the possibility of being an accountant? And though the options are expressed as a set, the respondent is not connecting them, and so adapts the independent probabilities in each option. As I said, this might be quite intuitive, so that the respondents do not perform the calculations and so do not see the mistake. That the question says "not mutually exclusive or exhaustive" may not register.
The diplomatic response might be explained by the following. Without any good reason to think otherwise, respondents to (1) think suspension unlikely. Because they are not shown (2), they are asked to rate this independently of anything else, whether that be invasion of Poland, assassination of the US President, or anything else not mentioned in (1). Since they are not given any reason for suspension, they think it very unlikely. So, your point that "there is no possibility that the first group interpreted (1) to mean 'suspension but no invasion' " does not hold. They can interpret it to mean 'suspension but nothing else'.
But in (2) the respondents are given a good reason to think that if invasion is likely then suspension will follow hot on its heels. Also, some respondents might be answering a question such as "If invasion then suspension?", even though that is not what they are being asked.
So I think there are explanations as to why respondents don't get it that go beyond simply not knowing or remembering the conjunction condition, let alone knowing it as a 'fallacy' to avoid.
Is probability a cognitive version of an optical illusion? Two lines may not look the same length, but when you measure them they are. When two probability statements appear one way, they may actually turn out to be another way if you perform the calculation. The difference in both cases lies in relying on intuition rather than measurement or calculation. Looked at from this point of view, probability 'illusions' are no more embarrassing than optical ones, which we still fall for even when we know the falsity of what we perceive.
The following experiment has been slightly modified for ease of blogging. You are given the following written description, which is assumed true:
No complaints about the description, please, this experiment was done in 1974. Anyway, we are interested in the probability of the following propositions, which may or may not be true, and are not mutually exclusive or exhaustive:
Take a moment before continuing to rank these six propositions by probability, starting with the most probable propositions and ending with the least probable propositions. Again, the starting description of Bill is assumed true, but the six propositions may be true or untrue (they are not additional evidence) and they are not assumed mutually exclusive or exhaustive.
In a very similar experiment conducted by Tversky and Kahneman (1982), 92% of 94 undergraduates at the University of British Columbia gave an ordering with A > E > C. That is, the vast majority of subjects indicated that Bill was more likely to be an accountant than an accountant who played jazz, and more likely to be an accountant who played jazz than a jazz player. The ranking E > C was also displayed by 83% of 32 grad students in the decision science program of Stanford Business School, all of whom had taken advanced courses in probability and statistics.
There is a certain logical problem with saying that Bill is more likely to be an accountant who plays jazz than he is to play jazz. The conjunction rule of probability theory states that, for all X and Y, P(X&Y) <= P(Y). That is, the probability that X and Y are simultaneously true is always less than or equal to the probability that Y is true. Violating this rule is called a conjunction fallacy.
Imagine a group of 100,000 people, all of whom fit Bill's description (except for the name, perhaps). If you take the subset of all these persons who play jazz, and the subset of all these persons who play jazz and are accountants, the second subset can never be larger than the first, because it is contained within the first subset.
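The subset argument can be played out directly. The sketch below invents a population of 100,000 people with made-up rates for accountancy and jazz playing (both rates are hypothetical and chosen only for illustration; nothing here comes from the study) and then compares the two subsets.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

N = 100_000
P_ACCOUNTANT = 0.30  # hypothetical rate, not from the study
P_JAZZ = 0.05        # hypothetical rate, not from the study

# Each person is a pair (is_accountant, plays_jazz). The traits are drawn
# independently only for simplicity; the subset argument does not need that.
people = [
    (random.random() < P_ACCOUNTANT, random.random() < P_JAZZ)
    for _ in range(N)
]

jazz = {i for i, (acct, jz) in enumerate(people) if jz}
jazz_accountants = {i for i, (acct, jz) in enumerate(people) if jz and acct}

# Every jazz-playing accountant is, by construction, a jazz player.
print(len(jazz), len(jazz_accountants), jazz_accountants <= jazz)
```

Whatever rates are plugged in, the jazz-playing accountants are drawn from among the jazz players, so their count cannot exceed the jazz players' count.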
Could the conjunction fallacy rest on students interpreting the experimental instructions in an unexpected way - misunderstanding, perhaps, what is meant by "probable"? Here's another experiment, Tversky and Kahneman (1983), played by 125 undergraduates at UBC and Stanford for real money:
65% of the subjects chose sequence 2, which is most representative of the die, since the die is mostly green and sequence 2 contains the greatest proportion of green rolls. However, sequence 1 dominates sequence 2, because sequence 1 is strictly included in 2. 2 is 1 preceded by a G; that is, 2 is the conjunction of an initial G with 1. This clears up possible misunderstandings of "probability", since the goal was simply to get the $25.
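The dominance argument is easy to verify numerically. The following is a minimal sketch under stated assumptions: the chance of green on each roll is set to 2/3 (the text here only says the die is "mostly green"), the non-green outcome is labelled "R", and the function name `sequence_probability` is invented for the example. It computes the probability of one specific run of rolls and checks that putting a G in front of any five-roll sequence lowers that probability.

```python
from fractions import Fraction
from itertools import product

P_GREEN = Fraction(2, 3)  # assumed value; the text only says "mostly green"


def sequence_probability(seq, p_green=P_GREEN):
    """Probability of rolling exactly `seq`, a string of 'G' and 'R' outcomes."""
    prob = Fraction(1)
    for roll in seq:
        prob *= p_green if roll == "G" else 1 - p_green
    return prob


# For every five-roll sequence, the same sequence preceded by a G is less likely.
all_dominated = all(
    sequence_probability("G" + "".join(s)) < sequence_probability("".join(s))
    for s in product("GR", repeat=5)
)
print(all_dominated)
```

The extra G multiplies the probability by a factor below 1, so the longer sequence loses no matter how green the die is, short of being entirely green.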
Another experiment from Tversky and Kahneman (1983) was conducted at the Second International Congress on Forecasting in July of 1982. The experimental subjects were 115 professional analysts, employed by industry, universities, or research institutes. Two different experimental groups were respectively asked to rate the probability of two different statements, each group seeing only one statement:
Estimates of probability were low for both statements, but significantly lower for the first group than the second (p < .01 by Mann-Whitney). Since each experimental group only saw one statement, there is no possibility that the first group interpreted (1) to mean "suspension but no invasion".
The moral? Adding more detail or extra assumptions can make an event seem more plausible, even though the event necessarily becomes less probable.
Do you have a favorite futurist? How many details do they tack onto their amazing, futuristic predictions?
Tversky, A. and Kahneman, D. 1982. Judgments of and by representativeness. Pp 84-98 in Kahneman, D., Slovic, P., and Tversky, A., eds. Judgment under uncertainty: Heuristics and biases. New York: Cambridge University Press.
Tversky, A. and Kahneman, D. 1983. Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review, 90: 293-315.