This afternoon I heard a news story about a Middle Eastern country where one person said of the defenses for a stockpile of nuclear weapons, "even if there is only a 1% probability of the defenses failing, we should do more to strengthen them given the consequences of their failure". I have nothing against this person's reasoning, but I do have an issue with where that 1% figure came from.
The statement above and others like it share a common problem: they are phrased such that it's unclear over what probability space the measure was taken. In fact, many journalists and other people don't seem especially concerned by this. Even some commenters on Less Wrong give little indication of the probability space over which they give a probability measure of an event, and nobody calls them on it. So what is this probability space they are giving probability measurements over?
If I'm in a generous mood, I might give the person presenting such a statement the benefit of the doubt and suppose they were unintentionally ambiguous. On the defenses of the nuclear weapon stockpile, the person might have meant to say "there is only a 1% probability of the defenses failing over all attacks", as in "in 1 attack out of every 100 we should expect the defenses to fail". But given both my experiences with how people treat probability and my knowledge of naive reasoning about probability, I am dubious of my own generosity. Rather, I suspect that many people act as though there were a universal probability space over which they may measure the probability of any event.
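To make the ambiguity concrete, here is a minimal sketch of how the same "1%" yields very different numbers depending on the space it is measured over. It assumes, purely for illustration, that attacks are independent and that each one defeats the defenses with the same probability; the function name and the attack counts are hypothetical and do not come from the news story.

```python
def prob_at_least_one_failure(p_per_attack: float, n_attacks: int) -> float:
    """Probability that the defenses fail at least once in n_attacks.

    ASSUMPTION (illustrative only): attacks are independent and each
    succeeds against the defenses with the same probability p_per_attack.
    """
    return 1.0 - (1.0 - p_per_attack) ** n_attacks


if __name__ == "__main__":
    # "1%" read as a per-attack figure, then re-measured over longer runs.
    for n in (1, 10, 100):
        print(n, round(prob_at_least_one_failure(0.01, n), 4))
```

Under these assumptions a 1% chance per attack becomes roughly a 63% chance of at least one failure across 100 attacks, so the bare figure says little until the space is named.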
To illustrate the issue, consider the probability that a fair coin comes up heads. We typically say that there is a 1/2 chance of heads, but what we are implicitly saying is that given a probability measure P on the measurable space ({heads, tails}, {{}, {heads}, {tails}, {heads, tails}}), P({heads}) = P({tails}) = 1/2 and P({}) = 0 and P({heads, tails}) = 1. But if we look at the issue of a coin coming up heads from a wider angle, we could interpret it as "what is the probability of some particular coin sitting heads-up over the span of all time", which is another question altogether. What this is asking is "what is the probability of the event that this coin sits heads-up over the universal probability space", i.e. the probability space of all events that could occur at some time during the existence of the universe, and we have no clear way to calculate the probability of such an event other than to say that the universal probability space must contain infinitely many (how infinitely is still up for debate) events of measure zero. So there is a universal probability space; it's just not very useful to us, hence the title of the article, since it practically doesn't exist for us.
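The coin's probability space is small enough to write out in full. The following is a sketch, not a general-purpose library: the helper names (power_set, make_measure, is_probability_space) are made up for this example, and the check only covers finite spaces, where closure under pairwise unions and finite additivity are enough.

```python
from fractions import Fraction
from itertools import combinations


def power_set(outcomes):
    """All subsets of a finite outcome set, as frozensets."""
    items = list(outcomes)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def make_measure(weights):
    """Build P by summing per-outcome weights over an event."""
    def P(event):
        return sum((weights[o] for o in event), Fraction(0))
    return P


def is_probability_space(omega, events, P):
    """Check the axioms for a FINITE space (hypothetical helper)."""
    event_set = set(events)
    if frozenset() not in event_set or omega not in event_set:
        return False
    for a in event_set:
        if P(a) < 0 or (omega - a) not in event_set:
            return False  # negative measure, or not closed under complement
        for b in event_set:
            if (a | b) not in event_set:
                return False  # not closed under union
            if not (a & b) and P(a | b) != P(a) + P(b):
                return False  # additivity fails on disjoint events
    return P(omega) == 1


omega = frozenset({"heads", "tails"})
events = power_set(omega)  # {}, {heads}, {tails}, {heads, tails}
P = make_measure({"heads": Fraction(1, 2), "tails": Fraction(1, 2)})

if __name__ == "__main__":
    print(P(frozenset({"heads"})))                # 1/2
    print(is_probability_space(omega, events, P))  # True
```

Writing the space out this way shows what "1/2" silently depends on: a chosen outcome set, a chosen collection of events, and a chosen measure. Change any of the three and the number answers a different question.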
None of this is to say, though, that the people committing these crimes against probability are aware of what probability space they are taking a measure over. Many people act as if there is some number they can assign to any event which tells them how likely it is to occur and questions of "probability spaces" never enter their minds. What does it mean that something happens 1% of the time? I don't know; maybe that it doesn't happen 99% of the time? How is 1% of the time measured? I don't know; maybe one out of every 100 seconds? Their crime is not one of mathematical abuse but of mathematical ignorance.
As aspiring rationalists, if we measure a probability, we ought to know over what probability space we're measuring. Otherwise a probability isn't well defined and is just another number that, at best, is meaningless and, at worst, can be used to help us defeat ourselves. Even if it's not always a good stylistic choice to make the probability space explicit in our speech and writing, we must always know over what probability space we are measuring a probability. Otherwise we are just making up numbers to feel rational.
O.K.
One wants a universal probability space where one can find the probability of any event. This is possible:
One way of making such a space is to take all recursive functions of some universal computer, run them, and store the output. The result is a universal probability space, because every possible set of events will be there as the results of infinitely many recursive functions, or programs as they are called. The probabilities correspond to the density of these outputs, these events.
A counterargument is that this depends too much on the actual universal computer chosen. However, theorems in algorithmic information theory show that this dependence is bounded and matters less and less as information increases, because the densities of the outputs from different universal computers can differ by at most a factor of 2 to the power of the length of the shortest program simulating one universal computer on the other.
Kim Øyhus
OK....
What!? You haven't yet described a probability space. The aforementioned set is infinite, so the uniform distribution is unavailable. What probability distribution will you have on this set of recursive-function runs? And in what way is the resulting probability space universal?