The usual explanation of probability theory goes like this:
There is this thing called Probability Space, which consists of three other things:
- Sample Space - some non-empty set
- Event Space - a set of subsets of the Sample Space
- Probability Function - a measure function over the elements of the Event Space.
Then several examples are given of how this mathematical model can be mapped onto real-world situations.
For instance, for a dice roll the appropriate sample space would be {1; 2; 3; 4; 5; 6}. For the Event Space we can use the power set of the Sample Space, and the probability function has to give every elementary event equal value: P({1}) = P({2}) = ... = P({6}) = 1/6.
The point of such examples is to give students an intuitive understanding of how to apply the math of set theory to reasoning under uncertainty, and I think they generally work fine for that purpose. But they also obscure an interesting and very important question: How do we select a sample space for a given problem?
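As a minimal sketch, the three parts of this probability space can be written out explicitly. The names `power_set`, `event_space` and `probability` are chosen here for illustration, and the event space is assumed to be the set of all subsets of the sample space.

```python
from fractions import Fraction
from itertools import chain, combinations

# Sample space for the die roll, as given in the text.
sample_space = frozenset({1, 2, 3, 4, 5, 6})


def power_set(s):
    """Return every subset of s as a frozenset (the event space assumed here)."""
    items = sorted(s)
    subsets = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return {frozenset(subset) for subset in subsets}


event_space = power_set(sample_space)


def probability(event):
    """Uniform measure: every elementary outcome gets the same weight."""
    if event not in event_space:
        raise ValueError("not an event in this probability space")
    return Fraction(len(event), len(sample_space))


print(len(event_space))                   # number of events
print(probability(frozenset({6})))        # one elementary event
print(probability(frozenset({2, 4, 6})))  # "the outcome is even"
```

Nothing in this sketch says why `sample_space` should have six elements rather than five. The code runs equally well either way, which is exactly the question the rest of the post takes up.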
Intuition aside, how can we know that the correct sample space is {1; 2; 3; 4; 5; 6} and not, say {1; 2; 3; 4; 5}? Both of these sets fit the formal definition of the sample space - they are non-empty sets. How can we know which one of them is the sample space for the exact problem we are talking about? How do we logically pinpoint the notion of sample space for a very specific problem, instead of sample space for at least some problem?
By Definition?
Surely, we can simply declare that the semantic statement "Sample space for a dice roll" means exactly {1; 2; 3; 4; 5; 6}. And so the answer to our question, "How can we know which is the sample space for the problem we are talking about?", is simple: by definition.
Be careful with this kind of reasoning. As a wise man once said:
In cases like these, it is futile to try to settle the problem by coming up with some new definition of the word “rational” and saying, “Therefore my preferred answer, by definition, is what is meant by the word ‘rational.’ ” This simply raises the question of why anyone should pay attention to your definition. I’m not interested in probability theory because it is the holy word handed down from Laplace. I’m interested in Bayesian-style belief-updating (with Occam priors) because I expect that this style of thinking gets us systematically closer to, you know, accuracy, the map that reflects the territory.
If we can just define "Sample space for a dice roll" as {1; 2; 3; 4; 5; 6}, we can likewise define it as {1; 2; 3; 4; 5} or anything else and none of these definitions will be superior to any other, so we are essentially back to square one.
The only difference is that now we've replaced the initial question with "How can we know which definition is the right one?"
The second issue is that now we have to do the exact same thing for every conceivable probability theory problem. We managed to pinpoint a sample space for a dice roll, and now we can use probability theory to answer questions about dice rolls. But in order to solve problems involving coin tosses we need to axiomatically define a sample space for a coin toss, and in order to solve problems involving picking a marble from a bag with n marbles we need to axiomatically define such a sample space for every n. And so on.
In other words, this approach treats the knowledge about the sample space for a problem as completely non-generalizable. And this doesn't seem to be the case in practice. It's enough to show a couple examples of sample spaces to human students so that they grasp the idea, and then can apply probability theory to different problems. There is some rational principle that allows our intuition to work this way. What is it? Even if defining "Sample space for a dice roll" as {1; 2; 3; 4; 5; 6} is the right thing to do, what makes it so?
Betting Argument?
The other approach that quickly comes to mind is trying to construct a betting argument.
The problem here is that this "betting" stuff is simply too advanced for us at this point. We need to cover a lot of ground before we can talk about it coherently at all, let alone before it can help us in any way. Let's look at an example.
Albert: So you believe that outcome 6 can't possibly happen?
Barry: Exactly.
Albert: That's crazy!
Barry: On the contrary, this is completely obvious.
Albert: Are you putting your money where your words are?
Barry: Sure. Bet you a dollar at 1:1000 odds that next time I roll the die the outcome isn't 6.
Albert: Deal.
[The die is rolled. Its top side is showing four dots]
Barry: Four. Have you learned your lesson?
Albert: You just got lucky this time. Your betting odds are completely ridiculous.
Barry: We made a bet. I won it. You should be updating in favor of my position as a proper rationalist.
Albert: No! I'm winning in expectation. Here, let me calculate expected utility for you... See? I may be losing a dollar five times out of six, but when I win, I win big, which compensates for all the losses and much more.
Barry: You are begging the question, assuming that your winning outcome could've happened, which is exactly the crux of our disagreement. Besides, you can come up with whatever rationalizations you want, but the fact of the matter is that you are one dollar poorer. Isn't that the whole point of betting arguments?
Albert: Whatever. If you are so confident, then let's bet one more time.
Barry: Okay... give me another dollar then.
Albert: What? You haven't even rolled the die yet!
Barry: I have. It's four. We've been discussing it for the last couple of minutes. Have you forgotten? I suppose memory problems would explain your crazy beliefs...
Albert: Are you mocking me? That was a previous dice roll. I'm talking about a new one!
Barry: That was the dice roll that we agreed to bet on. Now you wanted to repeat the bet...
Albert: Argh! Of course I meant the next dice roll! Why would I want to bet on a roll that has already happened and I know that the outcome is not 6?
Barry: Frankly? Because you are insane. So you want to make a new bet on a new dice roll?
Albert: Yes!
Barry: I don't think I'm interested. You didn't update at all based on our previous bet. Seems that you are not arguing in good faith here.
At the very least, we need the concept of a procedure that includes multiple dice rolls with multiple outcomes, and an understanding that we are talking not about one particular dice roll but about this procedure as a whole. But this is not all.
To formally construct a betting argument you need to have well-defined events over which you could define probability and utility functions. Which means that you already need to have an agreement on the sample space. To define a monetary reward for an outcome we first need to understand what the outcomes are, which is the whole question that is being investigated.
This isn't just a technicality. To see this, let's suppose Albert managed to persuade Barry to keep rolling the die and making the bets.
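To make the dependence concrete, here is a small sketch that computes the expected value of Albert's side of the bet under each candidate sample space. The stakes come from the dialogue (Albert loses $1 if the outcome isn't 6 and wins $1000 if it is). The uniform distribution over each sample space and the function names are assumptions made for illustration.

```python
from fractions import Fraction


def expected_value(sample_space, payoff):
    """Expected payoff, assuming a uniform distribution over sample_space."""
    weight = Fraction(1, len(sample_space))
    return sum(weight * payoff(outcome) for outcome in sample_space)


def albert_payoff(outcome):
    """Albert's side of the bet: win $1000 on a six, lose $1 otherwise."""
    return 1000 if outcome == 6 else -1


# Same bet, two different sample spaces.
albert_view = expected_value({1, 2, 3, 4, 5, 6}, albert_payoff)
barry_view = expected_value({1, 2, 3, 4, 5}, albert_payoff)

print(albert_view)  # 995/6, roughly +165.83 dollars per roll
print(barry_view)   # -1, a guaranteed loss of one dollar per roll
```

The same payoff function gives a very favorable bet under one sample space and a sure loss under the other, so the expected-utility calculation cannot be what settles which sample space is correct.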
[The die is rolled. It shows one dot]
Barry: One. [Collects another dollar from Albert] Are you sure you still want to keep going?
Albert: Absolutely. Roll again.
[The die is rolled, but bumps into a teacup and ends on a corner in an undefined state. Barry re-rolls. Four dots again.]
Albert: [Gives another dollar] Again.
[Another roll. The top side shows three dots]
Albert: Again.
[The die falls from the table and rolls under the couch. Barry reaches under the couch to get the die back and re-rolls it. This time it clearly shows six dots.]
Albert: Aha! Six! Now... wait, what are you doing?
[Barry quickly re-rolls the die again. Three dots are on the top side]
Barry: No idea what you are talking about. It's three - pay up.
Albert: What the Hell?! It was six! You just re-rolled it. You owe me a grand!
Barry: Just like I re-rolled it when it landed on a corner or fell under the couch. Those obviously do not count.
Albert: Of course those didn't count! This one does!
Barry: I don't remember agreeing to it. As far as I'm concerned only when the top side of the die clearly shows 1, 2, 3, 4 or 5 dots it counts as an outcome of the toss.
Albert: You swindler! That's not what people usually mean! You should have specified!
Barry: I explicitly said that outcome 6 never happens. It's you who jumped to conclusions, before figuring out what's going on. By the way, you still owe me another dollar.
Albert: It's you who owe me a thousand! Pay up!
Barry: Quite a sore loser you are. I knew there was no point in continuing arguing with you after our first bet.
Map and Territory
There are multiple lessons that could be learned here. But what is specifically relevant for our case is this.
When Albert says "Dice roll" he means this procedure:
- Roll a die, wait until it stops.
- If it landed on a corner or in a place with low visibility, go back to step 1.
- Add together the number of dots on the top side of the die. The sum corresponds to the outcome that has just happened.
While Barry has a different procedure in mind:
- Roll a die, wait until it stops.
- If it landed on a corner or in a place with low visibility or there are 6 dots on the top side, go back to step 1.
- Add together the number of dots on the top side of the die. The sum corresponds to the outcome that has just happened.
Both procedures include some amount of arbitrariness. There is no particular reason, beyond general human agreement, why we count the number of dots on the top side and not, say, on the bottom side. Or why the mapping between the sum of the dots and the outcome of the sample space has to be direct. Or why we have to re-roll when the die lands on an edge, instead of counting it as some other outcome.
But also both procedures are connected to reality. They are entangled with a physical object - a die and an action of throwing it.
The disagreement between Albert and Barry is purely semantic. As soon as we've replaced the mental paintbrush handle "Dice roll" with a full description of the procedures, there is nothing to argue about.
For the first procedure the sample space is {1; 2; 3; 4; 5; 6}.
For the second, it's {1; 2; 3; 4; 5}.
How do we know that? The same way we can have any map reflecting a territory. We go outside and look. We conduct an experiment. We follow the described procedure and see what outcomes we get according to it. This is the helpful part of the betting argument: actually rolling the die and observing what happens, regardless of how money changes hands.
To validate any map we need to compare it to the territory. But to do that we need to be able to talk about the territory at all. We need to take a step back and conceptualize the iterated procedure of dice rolling instead of a singular roll.
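A real experiment requires a real die, but the logic of "follow the procedure and see what comes out" can be sketched as a simulation. Everything about `physical_roll` here is a hypothetical stand-in for the physical world: the 5% chance of an invalid landing and the equal chance for each face are illustrative assumptions, not claims about actual dice.

```python
import random

FACES = [1, 2, 3, 4, 5, 6]
INVALID = ["corner", "hidden"]  # landed on a corner / rolled out of sight


def physical_roll(rng):
    """Hypothetical model of one throw: usually a face, sometimes an invalid landing."""
    if rng.random() < 0.05:
        return rng.choice(INVALID)
    return rng.choice(FACES)


def albert_procedure(rng):
    """Re-roll only on invalid landings; report the number of dots on top."""
    while True:
        result = physical_roll(rng)
        if result not in INVALID:
            return result


def barry_procedure(rng):
    """Re-roll on invalid landings and also whenever six dots are on top."""
    while True:
        result = physical_roll(rng)
        if result not in INVALID and result != 6:
            return result


def observed_outcomes(procedure, trials=10_000, seed=0):
    """Run the procedure many times and collect the set of outcomes seen."""
    rng = random.Random(seed)
    return {procedure(rng) for _ in range(trials)}


print(sorted(observed_outcomes(albert_procedure)))
print(sorted(observed_outcomes(barry_procedure)))
```

Both procedures are run against the same simulated "territory"; only the rule for what counts as an outcome differs, and that rule alone determines which set of outcomes is observed.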
If probability is in the map, what is the territory?
A less arbitrary way to define a sample space is to take the set of all possible worlds. Each event, e.g. a die roll, corresponds to the disjunction of possible worlds where that event happens. The possible worlds can differ in a lot of tiny details, e.g. the exact position of a die on the table. Even just an atom being different at the other end of the galaxy would constitute a different possible world. A possible world is a maximally specific way the world could be. So two possible worlds are always mutually exclusive. And the set of all possible worlds includes every possible way reality could be. There are no excluded possibilities like a die falling on the floor.
But for subjective probability theory a "sample space" isn't even needed at all. A probability function can simply be defined over a Boolean algebra of propositions. Propositions ("events") are taken to be primary instead of being defined via primary outcomes of a sample space. We just have beliefs in some propositions, and there is nothing psychological corresponding to outcomes of a sample space. We only need outcomes if probabilities are defined to be ratios of frequencies of outcomes. Likewise, "random variables" or "partitions" don't make sense for subjective probability theory: there are just propositions.
I read it a while ago, but he overstates the importance of axiom systems. E.g. he wrote: