More marbles and Sleeping Beauty
I
Previously I talked about an entirely uncontroversial marble game: I flip a coin; if it lands Tails I give you a black marble, and if it lands Heads I flip another coin to give you either a white or a black marble.
The probabilities of seeing the two marble colors are 3/4 and 1/4, and the probabilities of Heads and Tails are 1/2 each.
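These numbers are easy to check by simulation. Here's a minimal sketch in Python (the function name and trial count are my own choices, not part of the original game description):

```python
import random

def marble_game():
    # First coin: Tails always yields a black marble.
    coin = random.choice(["Heads", "Tails"])
    if coin == "Tails":
        marble = "black"
    else:
        # On Heads, a second flip chooses black or white.
        marble = random.choice(["black", "white"])
    return coin, marble

trials = [marble_game() for _ in range(100_000)]
p_black = sum(1 for _, m in trials if m == "black") / len(trials)
p_heads = sum(1 for c, _ in trials if c == "Heads") / len(trials)
print(f"P(black) ~ {p_black:.3f}")  # close to 3/4
print(f"P(Heads) ~ {p_heads:.3f}")  # close to 1/2
```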
The marble game is analogous to how a 'halfer' would think of the Sleeping Beauty problem: the claim that Sleeping Beauty should assign probability 1/2 to Heads relies on the claim that your information in the Sleeping Beauty problem is the same as your information in the marble game - same possible events, same causal information, same mutual exclusivity and exhaustiveness relations.
So what's analogous to the 'thirder' position, after we take into account that we have this causal information? Is it some difference in causal structure, or some non-causal anthropic modification, or something even stranger?
As it turns out, nope, it's the same exact game, just re-labeled.
In the re-labeled marble game you still have two unknown variables (represented by flipping coins), and you still have a 1/2 chance of black and Tails, a 1/4 chance of black and Heads, and a 1/4 chance of white and Heads.
And then to get the thirds, you ask the question "If I get a black marble, what is the probability of the faces of the first coin?" Now you update to P(Heads|black)=1/3 and P(Tails|black)=2/3.
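The re-labeled question is just a conditional probability, and the same simulation, conditioned on the marble being black, recovers the thirds (again a sketch; names and trial counts are mine):

```python
import random

def relabeled_game():
    # Same game as before: Tails forces black; Heads flips again.
    coin = random.choice(["Heads", "Tails"])
    marble = "black" if coin == "Tails" else random.choice(["black", "white"])
    return coin, marble

# Condition on seeing a black marble, then ask about the first coin.
black_coins = [c for c, m in (relabeled_game() for _ in range(300_000))
               if m == "black"]
p_heads_given_black = black_coins.count("Heads") / len(black_coins)
print(f"P(Heads|black) ~ {p_heads_given_black:.3f}")  # close to 1/3
```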
II
Okay, enough analogies. What's going on with these two positions in the Sleeping Beauty problem?
[Diagram 1: image not preserved]
[Diagram 2: image not preserved]
Here are two different diagrams, which are really re-labelings of the same diagram. The first labeling is the problem where P(Heads|Wake) = 1/2. The second labeling is the problem where P(Heads|Wake) = 1/3. The question at hand is really - which of these two math problems corresponds to the word problem / real world situation?
As a refresher, here's the text of the Sleeping Beauty problem that I'll use: Sleeping Beauty goes to sleep in a special room on Sunday, having signed up for an experiment. A coin is flipped - if the coin lands Heads, she will only be woken up on Monday. If the coin lands Tails, she will be woken up on both Monday and Tuesday, but with memories erased in between. Upon waking up, she then assigns some probability to the coin landing Heads, P(Heads|Wake).
Diagram 1: First a coin is flipped to get Heads or Tails. There are two possible things that could be happening to her, Wake on Monday or Wake on Tuesday. If the coin landed Heads, then she gets Wake on Monday. If the coin landed Tails, then she could either get Wake on Monday or Wake on Tuesday (in the marble game, this was mediated by flipping a second coin, but in this case it's some unspecified process, so I've labeled it [???]). Because all the events already assume she Wakes, P(Heads|Wake) evaluates to P(Heads), which just as in the marble game is 1/2.
This [???] node is odd; can we identify it as something natural? Well, it's not Monday/Tuesday, as in diagram 2 - there's no option that even corresponds to Heads & Tuesday. I lean towards the opinion that this node is somewhat magical / acausal, hanging around only because of the analogy to the marble game. So I think we can take it out. A better causal diagram with the halfer answer, then, might merely be Coin -> (Wake on Monday / Wake on Tuesday), where Monday versus Tuesday is not determined by a causal node at all, merely constrained probabilistically to be mutually exclusive and exhaustive.
Diagram 2: A coin is flipped, Heads or Tails, and also it could be either Monday or Tuesday. Together, these have a causal effect on her waking or not waking - if Heads and Monday, she Wakes, but if Heads and Tuesday, she Doesn't wake. If Tails, she Wakes. Her pre-Waking prior for Heads is 1/2, but upon waking, the event Heads, Tuesday, Don't Wake gets eliminated, and after updating P(Heads|Wake)=1/3.
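Diagram 2's update can be computed exactly by enumerating the joint prior and eliminating the excluded event. A sketch with exact fractions, under Diagram 2's modeling assumption that the day is a second independent fair "coin":

```python
from fractions import Fraction

half = Fraction(1, 2)
# Joint prior over (coin, day): independent fair coin and uniform day.
joint = {(coin, day): half * half
         for coin in ("Heads", "Tails") for day in ("Monday", "Tuesday")}

# Waking rule from Diagram 2: Heads & Tuesday -> Don't Wake; otherwise Wake.
wake = {k: k != ("Heads", "Tuesday") for k in joint}

# Condition on Wake: renormalize over the surviving events.
total = sum(p for k, p in joint.items() if wake[k])
p_heads_given_wake = sum(p for k, p in joint.items()
                         if wake[k] and k[0] == "Heads") / total
print(p_heads_given_wake)  # 1/3
```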
There's a neat asymmetry here. In diagram 1, when the coin was Heads she got the same outcome no matter the value of [???], and only when the coin was Tails were there really two options. In Diagram 2, when the coin is Heads, two different things happen for different values of the day, while if the coin is Tails the same thing happens no matter the day.
Do these seem like accurate depictions of what's going on in these two different math problems? If so, I'll probably move on to looking closer at what makes the math problem correspond to the word problem.
Meetup : Urbana-Champaign: Experimentation
Discussion article for the meetup : Urbana-Champaign: Experimentation
What sorts of things would you try out if you could? What meta-level strategies can help turn that list into actually trying things? I also kind of want to talk about some ideas related to anthropics.
Deriving probabilities from causal diagrams
What this is: an attempt to examine how causal knowledge gets turned into probabilistic predictions.
I'm not really a fan of any view of probability that involves black boxes. I want my probabilities (or more practically, the probabilities of toy agents in toy problems I consider) to be derivable from what I know in a nice clear way, following some desideratum of probability theory at every step.
Causal knowledge sometimes looks like a black box, when it comes to assigning probabilities, and I would like to crack open that box and distribute the candy inside to smiling children.
What this is not: an attempt to get causal diagrams from constraints on probabilities.
That would be silly - see Pearl's article that was recently up here. Our reasonable desire is the reverse: getting the constraints on probabilities from the causal diagrams.
The Marble Game
Consider marbles. First, I use some coin-related process to get either Heads or Tails. If Tails, I give you a black marble. If Heads, I use some other process to choose between giving you a black marble or a white marble.
Causality is an important part of the marble game. If I manually interfere with the process that gives Heads or Tails, this can change the probability you should assign of getting a black marble. But if I manually interfere with the process that gives you white or black marbles, this won't change your probability of seeing Heads or Tails.
What I'd like versus what is
The fundamental principle of putting numbers to beliefs, that always applies, is to not make up information. If I don't know of any functional differences between two events, I shouldn't give them different probabilities. But going even further - if I learn a little information, it should only change my probabilities a little.
The general formulation of this is to make your probability distribution consistent with what you know, in the way that contains the very least information possible (or conversely, the maximum entropy). This is how to not make up information.
I like this procedure; if we write down pieces of knowledge as mathematical constraints, we can find the correct distribution by solving a single optimization problem. Very elegant. Which is why it's a shame that this isn't at all what we do for causal problems.
Take the marble game. To get our probabilities, we start with the first causal node, figure out the probability of Heads without thinking about marbles at all (that's easy, it's 1/2), and then move on to the marbles while taking the coin as given (3/4 for black and 1/4 for white).
One cannot do this problem without using causal information. If we neglect the causal diagram, our information is the following: A: We know that Heads and Tails are mutually exclusive and exhaustive (MEE), B: we know that getting a black marble and getting a white marble are MEE, and C: we know that if the coin is Tails, you'll get a black marble.
This leaves three MEE options: Tails and Black (TB), HB, and HW. Maximizing entropy, they all get probability 1/3.
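To make the "maximizing entropy" step concrete, here's a sketch that grid-searches the simplex over the three options and confirms the maximum-entropy distribution is uniform (the grid resolution is an arbitrary choice of mine):

```python
import math

# Three mutually exclusive, exhaustive options: TB, HB, HW.
# Without the causal diagram, the only constraint is that they sum to 1.
def entropy(ps):
    return -sum(p * math.log(p) for p in ps if p > 0)

best, best_h = None, -1.0
steps = 200
for i in range(steps + 1):
    for j in range(steps + 1 - i):
        p = (i / steps, j / steps, (steps - i - j) / steps)
        h = entropy(p)
        if h > best_h:
            best, best_h = p, h

print(best)  # each component close to 1/3
```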
One could alternatively think of it like this: if we don't have the causal part of the problem statement (the causal diagram D), we don't know whether the coin causes the marble choice, or the marble causes the coin choice - why not pick a marble first, and if it's W we give you an H coin, but if it's B we flip the coin? Heck, why have one cause the other at all? Indeed, you should recover the 1/3 result if you average over all the consistent causal diagrams.
So my question is - what causal constraints is our distribution subject to, and what is it optimizing? Not piece by piece, but all at once?
Rephrasing the usual process
One method is to just do the same steps as usual, but to think of the rationale in terms of knowledge / constraints and maximum entropy.
We start with the coin, and we say "because the coin's result isn't caused by the marbles, no information pertaining to marbles matters here. Therefore, P(H|ABCD) is just P(H|A) = 1/2" (First application of maximum entropy). Then we move on to the marbles, and applying information B and C, plus maximum entropy a second time, we learn that P(B|ABCD) = 3/4. All that our causal knowledge really meant for our probabilities was the equation P(H|ABCD)=P(H|A).
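The two-step computation described above can be written out with exact fractions (a sketch; the variable names are mine):

```python
from fractions import Fraction

# Step 1: maximum entropy on the root node alone gives P(H|A) = 1/2.
p_heads = Fraction(1, 2)

# Step 2: given Heads, maxent over {black, white} gives P(black|H) = 1/2;
# given Tails, constraint C forces P(black|T) = 1.
p_black = p_heads * Fraction(1, 2) + (1 - p_heads) * 1
print(p_black)  # 3/4
```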
Alternatively, what if we only wanted to maximize something once, but let causal knowledge change the thing we were maximizing? We can say something like "we want to minimize the amount of information about the state of the coin, since that's the first causal node, and then minimize the amount of information about its descendant node, the marble." Although this could be represented as one equation using Lagrange multipliers, it's clearly the same process, just with different labels.
Is it even possible to be more elegant?
Both of these approaches are... functional. I like the first one a lot better, because I don't want to even come close to messing with the principle of maximum entropy / minimal information. But I don't like that we never get to apply this principle all at once. Can we break our knowledge down further so that everything happens nicely and elegantly?
The way we stated our knowledge above was as P(H|ABCD) = P(H|A). But this is equivalent to the statement that there's a symmetry between the left and right branches coming out of the causal node. We can express this symmetry using the equivalence principle as P(H)=P(T), or as P(HB)+P(HW)=P(TB).
But note that this is just hiding what's going on, because the equivalence principle is just a special case of the maximum entropy principle - we might as well just require that P(H)=1/2 but still say that at the end we're "maximizing entropy subject to this constraint."
Answer: Probably not
The general algorithm followed above is, for each causal node, to insert the condition that the probabilities of outputs of that node, given the starting information including the causal diagram, are equal to the probabilities given only the starting information related to that node or its parents - information about the descendants does not help determine probabilities of the parents.
Meetup : Urbana-Champaign: TRVTH
Discussion article for the meetup : Urbana-Champaign: TRVTH
How good would knowing the truth be, if we were perfect enough to use it?
How irrational do we have to be before lying to ourselves is a good idea? And how irrational do we have to be before it really is a bad idea after all?
Possibly also featuring our old nemesis, the remote associates test.
Meetup : Urbana-Champaign: Writing Prompts
Discussion article for the meetup : Urbana-Champaign: Writing Prompts
In honor of NaNoWriMo, let's do a few writing prompts - 15 minutes of just sitting down and writing the first page of something awesome.
Want advice? Check out an episode of Writing Excuses.
Meetup : Urbana-Champaign: Fun and Games
Discussion article for the meetup : Urbana-Champaign: Fun and Games
Come for the fun and games, stay for practicing meditation. Also: Halloween-candy-based elocution exercises.
Meetup : Urbana-Champaign: Meta-systems and getting things done
Discussion article for the meetup : Urbana-Champaign: Meta-systems and getting things done
Seed topic:
When I was a little kid doing chores, what worked was for my mom to tell me to do one thing. Then I'd do the thing, and come back and ask her "what next?" And then she'd tell me another thing, and I'd do that, and so on.
A few weeks ago Brienne posted a technique that can be used to do this with a mental model of someone smart. You build a model of what an effective person would do, and you ask your model what thing to do next, and the model tells you, and then you do that. This is one of many ways of reminding yourself to follow the pattern "find the best strategy, and then do the next step in that strategy."
If you desire homework: try doing this for a few hours at some point during the week.
By contrast, consider the Getting Things Done family of productivity techniques, where you write down what you want to do in advance and then follow your written plan. How well has this worked for you in the past? (As one might predict, results vary).
Meetup : Urbana-Champaign: Stoicism, anthropics
Discussion article for the meetup : Urbana-Champaign: Stoicism, anthropics
If you want to read up on Stoicism, check out the encyclopedia of philosophy entry.
For anthropics, see a recent discussion here.
Meetup : Urbana-Champaign: Noticing continued, Creativity
Discussion article for the meetup : Urbana-Champaign: Noticing continued, Creativity
Let's reconvene on noticing. We may have to go a little meta, but ah well.
I'd also like to try some group creativity exercises, for great justice.
Meetup : Urbana-Champaign: Noticing.
Discussion article for the meetup : Urbana-Champaign: Noticing.
Let's work on noticing things. Homework before this meetup: work on noticing when you're rationalizing, and do some physical act like standing up, or snapping your fingers, or writing down what you were rationalizing about.
This may be a bit ambitious, but that's what we can work on at the meetup.