The raven paradox is a dilemma in inductive logic posed by Carl Gustav Hempel. It starts by noting that the statement "All ravens are black" can be expressed in the form of an implication: "If something is a raven, then it is black." This statement is logically equivalent to its contrapositive: "If something is not black, then it is not a raven."

We then consider that observing a black raven would typically be considered evidence for the statement "All ravens are black." The paradox comes from asking whether the same holds for the contrapositive; to put it another way, if we observe a non-black non-raven, such as a green apple, does that constitute evidence that all ravens are black?

The standard Bayesian solution, given by I. J. Good, goes as follows. Suppose there are N objects in the universe, of which r are ravens and b are black, and that each object has probability 1/N of being the one we observe. Let H_i be the hypothesis that there are i non-black ravens, and assume we have some sensible prior over these hypotheses. Under H_i there are N - b - i non-black non-ravens, so the chance of observing one is (N - b - i)/N, which is largest when i = 0. Hence, upon observing a non-black non-raven, our probability of H_0 increases, albeit only very slightly when N is large.
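To make the size of the update concrete, here is a minimal sketch of the calculation. The uniform prior over H_0, ..., H_r and the particular values of N, r, and b are assumptions chosen for illustration, and the function name is hypothetical. The only modelling fact used is that, with N, r, and b fixed, H_i leaves N - b - i non-black non-ravens.

```python
from fractions import Fraction

def update_on_nonblack_nonraven(N, b, prior):
    """Posterior over H_0..H_r after seeing one non-black non-raven.

    prior[i] is P(H_i). Under H_i there are N - b - i non-black
    non-ravens, each observed with probability 1/N.
    """
    likelihood = [Fraction(N - b - i, N) for i in range(len(prior))]
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Illustrative numbers (assumptions, not from Good's paper).
N, r, b = 1_000_000, 100, 200_000
prior = [Fraction(1, r + 1)] * (r + 1)   # uniform prior, an assumption
posterior = update_on_nonblack_nonraven(N, b, prior)

print(float(prior[0]), float(posterior[0]))
```

With these numbers the posterior on H_0 comes out only slightly larger than the prior, which matches the "albeit only very slightly" caveat.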

I. J. Good's solution reveals a distinct way of thinking about evidence among Bayesians. Most logicians view logical fallacies as examples of poor reasoning, but the Bayesian interpretation is slightly different. Many fallacies, when stated in their appropriate inductive form, are actually valid in a Bayesian sense, though the evidence they provide is usually weak.

Which is to say that if you criticize a Bayesian for using a "red herring", they may not see the issue. Observing a red herring is indeed Bayesian evidence for the statement "All ravens are black."

A less formal way of putting the same thought process, in case it helps it click into place for anyone, might be: "Every observation I make has some possibility of being contradictory evidence. That this observation, of a green apple, isn't contradictory means that my [All ravens are black] theory has faced another opportunity to be disproved and survived unscathed."

Yup, the "update based on the magnitude of your surprise" heuristic matches this pretty well.

Does observing a black herring or a black apple weaken the claim that all ravens are black, or is it somehow a less forceful failure to disprove the claim? What about a blackberry?

I think by default, anything that isn't a non-black raven would carry the same weight of "I made an observation and it wasn't a contradictory example."

I guess unless maybe you had a somewhat contrived prior, saying that there must be a certain number of ravens, and also that there can only be a certain number of black things in total. Then seeing a black non-raven would deplete the number of remaining black things and raise the odds that some of the number of ravens would spill outside that category.
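This depletion effect can be checked directly in the fixed-count model from the post, where N, r, and b are all known. The sketch below is only an illustration: the uniform prior, the small numbers, and the function name are assumptions. Under H_i there are r - i black ravens, so b - r + i of the black things are non-ravens, and a black non-raven is more likely the more non-black ravens there are.

```python
from fractions import Fraction

def update_on_black_nonraven(N, r, b, prior):
    """Posterior over H_0..H_r after seeing one black non-raven.

    Under H_i, r - i ravens are black, leaving b - (r - i) black
    non-ravens among the N equally likely objects.
    """
    likelihood = [Fraction(b - r + i, N) for i in range(len(prior))]
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Small illustrative universe (assumed numbers).
N, r, b = 1000, 10, 50
flat_prior = [Fraction(1, r + 1)] * (r + 1)   # uniform prior, an assumption
post = update_on_black_nonraven(N, r, b, flat_prior)

print(flat_prior[0], post[0])
```

Here the probability of H_0 goes down after the observation, so with fixed counts a black herring counts as weak evidence against "All ravens are black."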

There are choices of hypotheses and assumptions about probability distributions.

Good's choice was the hypothesis family "there are i non-black ravens in the universe", uniform prior over these, and an assumption that there are N objects in the universe and observations were drawn uniformly at random from these.

For these assumptions, anything that isn't a non-black raven does carry the same weight for updating the posterior distribution. But the assumptions are obviously false and the hypothesis family doesn't seem very efficient. I wouldn't use these by default.

I don't think what you're saying makes sense in general. For one thing, you're assuming (as does the model in the post) that non-ravens being black is anti-correlated with all ravens being black. Maybe more interestingly, you're assuming something about the process that presented you with a green apple. If the process is "look for a non-black object", then yeah, we've probably gotten evidence in favor of all ravens are black--this corresponds to "trying and failing to disprove the hypothesis". If the process is "look for a non-raven object" or just "look for an object", then we've probably gotten very little evidence about ravens, and the direction of the evidence depends on our prior.

As the ancient proverb goes: garbage prior, garbage posterior.

One way to make this more intuitive is to frame it in terms of exploring for ravens. If I search through forest after forest for a non-black raven, I will be inspecting enormous numbers of non-black non-ravens: trees, rocks, rivers, open spaces, elk. And yes, I think that most people would agree that if you do this enough, and find no non-white ravens (but do find black ravens), then that's some evidence that all ravens are black.

One way to make it less intuitive is to consider a universe consisting of nine green apples, and nothing else. If I inspect all nine apples (knowing in advance that the universe only contains 9 objects), that's certainly evidence that there are no non-black ravens. Is it evidence that all ravens are black, in this universe that contains no ravens?

It is indeed proof that all ravens are black, as well as that all ravens are white, and also that no ravens exist. These are not mutually contradictory propositions.

Of course, it is also proof that you are a green apple. (User name checks out?)

nit:

and find no non-white ravens (but do find black ravens)

I think you meant "no non-black ravens" here.

Note that if you are a Solomonoff inductor, seeing a black raven doesn't always increase your credence that all ravens are black: see this paper.

Nice! I. J. Good himself pointed to another example of how this rule might break in his paper appropriately titled (for my post anyway), "The White Shoe is a Red Herring",

Suppose that we know we are in one or other of two worlds, and the hypothesis, H, under consideration is that all the ravens in our world are black. We know in advance that in one world there are a hundred black ravens, no non-black ravens, and a million other birds; and that in the other world there are a thousand black ravens, one white raven, and a million other birds. A bird is selected equiprobably at random from all the birds in our world. It turns out to be a black raven. This is strong evidence (a Bayes-Jeffreys-Turing factor of about 10) that we are in the second world, wherein not all ravens are black.

From Wikipedia,

Hempel rejected this as a solution to the paradox, insisting that the proposition 'c is a raven and is black' must be considered "by itself and without reference to any other information", and pointing out that it "... was emphasized in section 5.2(b) of my article in Mind ... that the very appearance of paradoxicality in cases like that of the white shoe results in part from a failure to observe this maxim."

ETA: I now see that the paper you linked cites this example. Cool.
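For anyone who wants to check the "factor of about 10" in Good's two-world example, here is a small sketch of the arithmetic. The bird counts come from the quoted passage; the function and variable names are made up for illustration.

```python
from fractions import Fraction

def p_black_raven(black_ravens, other_ravens, other_birds):
    """Chance that a bird drawn uniformly at random is a black raven."""
    total_birds = black_ravens + other_ravens + other_birds
    return Fraction(black_ravens, total_birds)

# World 1: all ravens are black. World 2: one white raven exists.
p_world1 = p_black_raven(100, 0, 1_000_000)
p_world2 = p_black_raven(1000, 1, 1_000_000)

# Likelihood ratio in favour of world 2 after drawing a black raven.
bayes_factor = p_world2 / p_world1
print(float(bayes_factor))
```

The ratio comes out at roughly 9.99, so a single black raven really does favour the world in which not all ravens are black by about 10 to 1.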

I don't like Good's solution. It's mathematically correct, but, why would our prior look like this? Why would we know there's b black objects, and more generally, why would we expect there to be (anti)correlation between the color of ravens and the color of non-ravens? (In fact, my prior would be that if all ravens are black, this very slightly increases the chances of non-raven objects being black; objects might mimic ravens (e.g., artwork depicting ravens), and objects might be produced by similar processes as ravens (evolution in some niche) and hence have a similar phenotype.)

More than the prior, I find the sampling process to be nonsensical. Why would we expect to run across the objects in the world uniformly at random? Seeing ravens is highly correlated with being outside, looking at the sky, etc., so why wouldn't seeing white ravens have similar but much narrower requirements, like being on a specific island? Adding something along these lines to our model should make the Bayesian evidence gained by observing either black ravens or non-black non-ravens quickly go to 0.

Many years ago, I was repeatedly amused when reading texts that talked about rare "black swan" events. Where I lived there were hundreds of swans living on the river, and every single one was black.

Getting back to the original point: a uniform distribution is superficially the simplest model, so it should be associated with the lowest complexity penalty. However, after gathering broader evidence, you see that this doesn't hold for anything else, so why should you expect it to hold for ravens?

Especially once you get down into gears-level models like inheritance, one should already expect that if there are both black ravens and white ravens, they are more likely to be geographically separated than intermingled.

To address the paradox: If I show you a box and (trustworthily) tell you that there's one object in the box, and then you look through a tiny hole in the box so all you can see is the color, and you see not-black, that is probably evidence that not all ravens are black (unless we think there's way more non-black objects in worlds where all ravens are black, enough to overcome the possibility that we're looking at a non-black raven). Then, if we open the box and see a non-raven, that observation is evidence in favor of "all ravens are black", because we've (mostly) screened off the evidence against: we definitely weren't looking at a non-black raven.

Unrelated to the paradox, there is a distinction between a definition/category statement, and a prediction about the state of the universe. There are different uncertainties involved, and different evidence that would move those probabilities for you.

If "all ravens are black" means "by definition, any bird of this form which isn't black isn't a raven, it's something else BECAUSE it's non-black", then the uncertainty is about whether whoever you're talking with is using the same definition.

If it means "we never see this biological cluster of features without also having black pigment", that's much more empirical, and subject to observational evidence.