Without losing the generality of the theorems of probability, let me address your particular narrative: If you believe that, if a fifth column exists, it is of the type that will assuredly refrain from sabotage now in order to prepare a more devastating strike later;
This is a fancy way of saying that you assume the fifth column's intent is totally independent of the observance of sabotage, i.e., P(A | B) = P(A). That is, no evidence can update your position along the lines of Bayes' theorem.
This is not what I am saying. I am saying that P(A |B) and P(A | ~B) can both be nonzero, and in the Bayesian sense this is what is meant by evidence. Either observing sabotage or failing to observe sabotage can, strictly speaking, corroborate the belief that there is a secret Fifth Column. If you make the further assumption that the actions of the Fifth Column are independent from your observations about sabotage, then yes, everything you said is correct.
My only point is that, in general, you cannot say that it is a rule of probability that A and ~A cannot both be evidence for B. You must be talking about specific assumptions involving independence for that to hold.
It also makes sense to think orthogonally about A and ~A in the following sense: if these are my only two hypotheses, then if there is any best decision, it is because under some decision rule, either A or ~A maximizes the a posteriori probability, but not both. If the posterior were equiprobable (50/50) for the hypotheses, then observing or not observing sabotage would change nothing. This could happen if you make the independence assumption above, but even if you don't, it could still happen that the priors and conditional probabilities just work out to that particular case, and there would be no optimal belief in the Bayesian sense.
For a concrete example, suppose I flip a coin and if it is Heads, I will eat a tuna sandwich with probability 3/4 and a chicken sandwich with probability 1/4, and if it is Tails I will eat a turkey sandwich with probability 3/4 and a chicken sandwich with probability 1/4. Now suppose you only get to see what sandwich I select and then must make your best guess about what the coin showed. If I select a chicken sandwich, then you would believe that either Heads or Tails could serve as evidence for this decision. Neither result would be surprising to you (i.e., neither result would change your model) if you learned of it after I selected a chicken sandwich.
In this case, both A and ~A can serve as evidence for chicken, to the tune of 1/4 in each case. A is much stronger evidence for tuna, ~A is much stronger evidence for turkey, but both, to some extent, are evidence of chicken.
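To put numbers on the sandwich example, here is a minimal Python sketch. It assumes the coin is fair, which the example does not state explicitly, and the names `prior`, `likelihood`, and `posterior` are made up for illustration.

```python
from fractions import Fraction

# Assumption: the coin is fair. The sandwich odds are the ones stated in the example.
prior = {"Heads": Fraction(1, 2), "Tails": Fraction(1, 2)}
likelihood = {
    "Heads": {"tuna": Fraction(3, 4), "chicken": Fraction(1, 4)},
    "Tails": {"turkey": Fraction(3, 4), "chicken": Fraction(1, 4)},
}

def posterior(sandwich):
    """Return P(coin face | sandwich) for each face, using Bayes' theorem."""
    joint = {c: prior[c] * likelihood[c].get(sandwich, Fraction(0)) for c in prior}
    total = sum(joint.values())
    return {c: joint[c] / total for c in joint}

for s in ("tuna", "chicken", "turkey"):
    print(s, posterior(s))
```

Under these assumptions, tuna puts all the posterior mass on Heads and turkey puts it all on Tails, while chicken leaves Heads and Tails at 1/2 each, the same as the prior.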
I'm not disagreeing with your claim about probability theory at all. I'm just saying that we don't know that Warren made the assumption that his observations about sabotage were independent from the existence of a Fifth Column. For all we know, it was just that he had such a strong prior belief (which may or may not have been rational in itself) that there was a Fifth Column, that even after observing no sabotage, his decision rule was still in favor of belief in the Fifth Column.
It's not that he mistakenly thought that the Fifth Column would definitely act in one way or the other. It's just that both no sabotage and sabotage were, to some degree, compatible with his strong prior that there was a Fifth Column... enough so that after converting it to a posterior it didn't cause him to change his position.
Uh..
A is evidence for B if P(B|A) > P(B). That is to say, learning A increases your belief in B. It is a fact from probability theory that P(B) = P(B|A)P(A) + P(B|¬A)P(¬A). If P(B|A) > P(B) and P(B|¬A) > P(B) then that says that:
P(B) > P(B)P(A) + P(B)P(¬A)
P(B) > P(B)(P(A) + P(¬A))
P(B) > P(B)
Since A and ¬A are exhaustive and exclusive (so P(A) + P(¬A) = 1) this is a contradiction.
On the other hand, P(B|A) and P(B|¬A) being nonzero just means both A and ¬A are consistent with B -- that is, A and ¬A are not disproofs of B.
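As a sanity check on the argument above, the following Python sketch searches a small grid of joint distributions over (A, B) for any case where both A and ¬A raise the probability of B. The function names and the grid resolution are assumptions made for illustration; the search is a spot check, not a proof.

```python
from fractions import Fraction
from itertools import product

def conditionals(p_ab, p_a_notb, p_nota_b, p_nota_notb):
    """Given the four cells of a joint distribution, return P(B), P(B|A), P(B|not A)."""
    p_a = p_ab + p_a_notb
    p_nota = p_nota_b + p_nota_notb
    p_b = p_ab + p_nota_b
    return p_b, p_ab / p_a, p_nota_b / p_nota

def both_raise_b(steps=10):
    """Collect every grid distribution where A and not-A both raise P(B)."""
    found = []
    for i, j, k in product(range(1, steps), repeat=3):
        m = steps - i - j - k
        if m < 1:
            continue  # keep all four cells strictly positive
        cells = [Fraction(n, steps) for n in (i, j, k, m)]
        p_b, p_b_a, p_b_nota = conditionals(*cells)
        if p_b_a > p_b and p_b_nota > p_b:
            found.append(cells)
    return found

print(both_raise_b())  # prints []
```

Every distribution in the grid has P(B|A) and P(B|¬A) both nonzero, yet none has both above P(B), which is the distinction between "consistent with B" and "evidence for B".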
From Robyn Dawes’s Rational Choice in an Uncertain World:
Consider Warren’s argument from a Bayesian perspective. When we see evidence, hypotheses that assigned a higher likelihood to that evidence gain probability, at the expense of hypotheses that assigned a lower likelihood to the evidence. This is a phenomenon of relative likelihoods and relative probabilities. You can assign a high likelihood to the evidence and still lose probability mass to some other hypothesis, if that other hypothesis assigns a likelihood that is even higher.
Warren seems to be arguing that, given that we see no sabotage, this confirms that a Fifth Column exists. You could argue that a Fifth Column might delay its sabotage. But an absence of sabotage is still more likely if no Fifth Column exists than if one does.
Let E stand for the observation of sabotage, and ¬E for the observation of no sabotage. The symbol H1 stands for the hypothesis of a Japanese-American Fifth Column, and H2 for the hypothesis that no Fifth Column exists. The conditional probability P(E | H), or “E given H,” is how confidently we’d expect to see the evidence E if we assumed the hypothesis H were true.
Whatever the likelihood that a Fifth Column would do no sabotage, the probability P(¬E | H1), it won’t be as large as the likelihood that there’s no sabotage given that there’s no Fifth Column, the probability P(¬E | H2). So observing a lack of sabotage increases the probability that no Fifth Column exists.
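The update can be sketched in Python. All three numbers below are hypothetical and chosen only for illustration; nothing in the historical record supplies them. The sketch assumes H1 and H2 are the only two hypotheses.

```python
def update(prior_h1, p_obs_h1, p_obs_h2):
    """Posterior P(H1 | observation) when H1 and H2 are the only hypotheses."""
    joint_h1 = prior_h1 * p_obs_h1
    joint_h2 = (1 - prior_h1) * p_obs_h2
    return joint_h1 / (joint_h1 + joint_h2)

# Hypothetical numbers, for illustration only:
prior_h1 = 0.9       # strong prior belief that a Fifth Column exists
p_no_sab_h1 = 0.8    # P(not-E | H1): a Fifth Column might well hold back
p_no_sab_h2 = 0.99   # P(not-E | H2): no Fifth Column, so almost surely no sabotage

post = update(prior_h1, p_no_sab_h1, p_no_sab_h2)
print(round(post, 3))  # prints 0.879
```

With these made-up inputs, observing no sabotage lowers the probability of a Fifth Column from 0.9 to about 0.879. A strong prior can survive the observation, but the observation still counts against H1 rather than for it.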
A lack of sabotage doesn’t prove that no Fifth Column exists. Absence of proof is not proof of absence. In logic, (A ⇒ B), read “A implies B,” is not equivalent to (¬A ⇒ ¬B), read “not-A implies not-B.”
But in probability theory, absence of evidence is always evidence of absence. If E is a binary event and P(H | E) > P(H), i.e., seeing E increases the probability of H, then P(H | ¬E) < P(H), i.e., failure to observe E decreases the probability of H. The probability P(H) is a weighted mix of P(H | E) and P(H | ¬E), and necessarily lies between the two.1
Under the vast majority of real-life circumstances, a cause may not reliably produce signs of itself, but the absence of the cause is even less likely to produce the signs. The absence of an observation may be strong evidence of absence or very weak evidence of absence, depending on how likely the cause is to produce the observation. The absence of an observation that is only weakly permitted (even if the alternative hypothesis does not allow it at all) is very weak evidence of absence (though it is evidence nonetheless). This is the fallacy of “gaps in the fossil record”—fossils form only rarely; it is futile to trumpet the absence of a weakly permitted observation when many strong positive observations have already been recorded. But if there are no positive observations at all, it is time to worry; hence the Fermi Paradox.
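How strong the evidence of absence is can be illustrated with a short Python sketch. The prior of 0.5 and the three values of P(E | H) are hypothetical, and the alternative hypothesis is assumed never to produce the observation at all.

```python
def posterior_given_absence(prior_h, p_e_given_h, p_e_given_not_h):
    """P(H | not-E) by Bayes' theorem for a binary observation E."""
    num = prior_h * (1 - p_e_given_h)
    den = num + (1 - prior_h) * (1 - p_e_given_not_h)
    return num / den

prior_h = 0.5            # hypothetical prior
p_e_given_not_h = 0.0    # assumption: the alternative never produces E

results = []
for p_e_given_h in (0.01, 0.5, 0.99):
    post = posterior_given_absence(prior_h, p_e_given_h, p_e_given_not_h)
    results.append(post)
    print(f"P(E|H)={p_e_given_h:.2f}  P(H|not-E)={post:.4f}")
```

When the cause only rarely produces the observation, its absence moves the posterior only slightly below the prior; when the cause almost always produces it, its absence moves the posterior close to zero. In every case the movement is downward.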
Your strength as a rationalist is your ability to be more confused by fiction than by reality; if you are equally good at explaining any outcome you have zero knowledge. The strength of a model is not what it can explain, but what it can’t, for only prohibitions constrain anticipation. If you don’t notice when your model makes the evidence unlikely, you might as well have no model, and also you might as well have no evidence; no brain and no eyes.
1 If any of this sounds at all confusing, see my discussion of Bayesian updating toward the end of The Machine in the Ghost, the third volume of Rationality: From AI to Zombies.