
Comment author: JeffJo 01 October 2015 07:26:08PM 0 points [-]

The problem with the Sleeping Beauty Problem is that probability can be thought of as a rate: #successes per #trials. But this problem makes #trials a function of #successes, introducing what could be called a feedback loop into this rate calculation and fracturing our concepts of what the terms mean. All of the analyses I've seen struggle to put these fractured meanings back together, without fully acknowledging that they are broken. MrMind comes closer to acknowledging it than most, when he says "'A fair coin will be tossed,' in this context, will mean different things for different people."

But this fractured terminology can be overcome quite simply. Instead of one volunteer, use four.

Each will go through a similar experience where they will be woken at least once and maybe twice, on Monday and/or Tuesday, depending on the result of the same fair coin flip.

All four will be wakened both days with the following exceptions: SB1 will be left asleep on Monday if Heads is flipped. SB2 will be left asleep on Monday if Tails is flipped. SB3 will be left asleep on Tuesday if Heads is flipped. And SB4 will be left asleep on Tuesday if Tails is flipped. Note that SB3's schedule corresponds to the original version of the problem.

This way, three of the volunteers will be wakened on Monday. Two of those will be wakened again on Tuesday, while the third will be left asleep and be replaced by the one who slept through Monday. And each has the same chance to be wakened just once.

Put the three in a room together, and allow them to discuss anything EXCEPT the coin result and day that they would sleep through. Ask each for their confidence in the assertion that she will be wakened just once during the experiment.

No matter what day it is, or how the coin landed, the assertion will be true for one of the three awake volunteers, and false for the other two. So their confidences should sum to 1. No matter what combination of day and result each was assigned to sleep through, each has the same information upon which to base her confidence. So their confidences should be the same.

The only possible solution is that the confidences should all be 1/3. If, instead, SB3 is just told about the other three volunteers, but never meets them, she can still reason the same way and get the answer 1/3. And since "I, SB3, will be wakened only once" is equivalent to "the fair coin landed Heads," our original volunteer can give the same answer.
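For anyone who wants to check this numerically, here is a minimal Monte Carlo sketch (Python; the volunteer names and sleep schedules are taken from the setup above). It estimates, across all awakenings of all awake volunteers, how often the assertion "I will be wakened just once" is true:

```python
import random

def simulate(trials=20_000, seed=0):
    """Fraction of awakenings at which 'I will be wakened just once
    during the experiment' is true for the awake volunteer."""
    rng = random.Random(seed)
    # The (day, coin result) combination each volunteer sleeps through.
    skip = {"SB1": ("Monday", "Heads"), "SB2": ("Monday", "Tails"),
            "SB3": ("Tuesday", "Heads"), "SB4": ("Tuesday", "Tails")}
    true_assertions = 0
    awakenings = 0
    for _ in range(trials):
        coin = rng.choice(["Heads", "Tails"])
        for day in ("Monday", "Tuesday"):
            for name, (skip_day, skip_coin) in skip.items():
                if day == skip_day and coin == skip_coin:
                    continue  # this volunteer sleeps through today
                awakenings += 1
                # A volunteer is wakened just once iff the coin result
                # matches the one she is scheduled to sleep through.
                true_assertions += (coin == skip_coin)
    return true_assertions / awakenings

print(simulate())
```

Per trial there are six awakenings (three per day), and the assertion is true at exactly two of them (one per day), so the ratio comes out as exactly 1/3 regardless of the coin.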

Comment author: florijn 05 August 2013 09:08:19PM -1 points [-]

After having done a lot of research on the Sleeping Beauty Problem, as it was the topic of my bachelor's thesis (philosophy), I came to the conclusion that anthropic reasoning is wrong in the Sleeping Beauty Problem. I will explain my argument (briefly) below:

The principle that Elga uses in his first paper to validate his argument for 1/3 is an anthropic principle he calls the Principle of Indifference:

"Equal probabilities should be assigned to any collection of indistinguishable, mutually exclusive and exhaustive events."

The Principle of Indifference is in fact a more restricted version of the Self-Indication Assumption:

"All other things equal, an observer should reason as if they are randomly selected from the set of all possible observers."

Both principles are to be accepted a priori, as they cannot be justified by empirical considerations. They are therefore vulnerable to counterarguments...

The counterargument:

Suppose that the original experiment is modified a little:

If the outcome of the coin flip is Heads, they wake Beauty up at exactly 8:00. If the outcome of the first coin flip is Tails, the researchers flip another coin. If it lands Heads they wake Beauty at 7:00, if Tails at 9:00. That means that when Beauty wakes up she can be in one of 5 situations:

Heads and Monday 8:00

Tails and Monday 7:00

Tails and Monday 9:00

Tails and Tuesday 7:00

Tails and Tuesday 9:00

Again, these situations are mutually exclusive, indistinguishable and exhaustive. Hence thirders are forced to conclude that P(Heads) = 1/5.

Thirders might object that the total surface area under the probability curve in the Tails-world would still have to equal 2/3, as Beauty is awakened twice as many times in the Tails-world as in the Heads-world. They are then forced to explain why temporal uncertainty regarding an awakening (Monday or Tuesday) is different from temporal uncertainty regarding the time (7:00 or 9:00 o’clock). Both classify as temporal uncertainties within the same possible world, what could possibly set them apart?

An explanation could be that Beauty is only asked for her credence in Heads during an awakening event, regardless of the time, and that such an event occurs twice in the Tails-world. That is, out of the 4 possible observer-moments in the Tails-world there are only two in which she is interviewed. That means that simply the fact that she is asked the same question twice is reason enough for thirders to distribute their credence, and it is no longer about the number of observer-moments. So if she were asked the same question a million times then her credence in Heads would drop to 1/1000001!

We can magnify the absurdity of this reasoning by imagining a modified version of the Sleeping Beauty Problem in which a coin is tossed that always lands on Tails. Again, she is awakened one million times and given an amnesia-inducing potion after each awakening. Thirder logic would lead to Beauty's credence in Tails being 1/1000000, as there are one million observer-moments where she is asked for her credence within the only possible world: the Tails-world. To recapitulate: Beauty is certain that she lives in a world where the coin lands Tails, but because she knows that she will answer the same question a million times, her answer is 1/1000000. This would be tantamount to saying that Mt. Everest is only 1 m high because you know you will be asked its height 8848 times! It is very hard to see how amnesia could have such an effect on rationality.


The thirder argument is false. The fact that there are multiple possible observer-moments within a possible world does not justify dividing your credences equally among these observer-moments, as this leads to absurd consequences. The anthropic reasoning exhibited by the Principle of Indifference and the Self-Indication Assumption cannot be applied to the Sleeping Beauty Problem and I seriously doubt if it can be applied to other cases...

Comment author: JeffJo 17 December 2013 04:37:53PM 0 points [-]

these situations are mutually exclusive, indistinguishable and exhaustive.

No, they aren't. "Indistinguishable" in that definition does not mean "can't tell them apart." It means that the cases arise through equivalent processes. That's why the PoI applies to things like dice, whether or not what is printed on each side is visually distinguishable from other sides.

To make your cases equivalent, so that the PoI applies to them, you need to flip the second coin after the first lands on Heads as well. But you wake SB at 8:00 regardless of the second coin's result. You now have six cases that the PoI applies to, counting the "8:00 Monday" case twice, and each has probability 1/6.
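A quick simulation illustrates the point (a sketch, assuming the second flip fixes the same waking time on both Tails days, as florijn's setup suggests). Counted per awakening, the "8:00 Monday" case occurs about twice as often as each of the four Tails cases, so the five situations are not equally likely, and the frequency of Heads stays near 1/3, not 1/5:

```python
import random
from collections import Counter

def simulate(trials=100_000, seed=1):
    """Tally awakenings by (first coin, day, time) in florijn's variant."""
    rng = random.Random(seed)
    tally = Counter()
    for _ in range(trials):
        first = rng.choice(["Heads", "Tails"])
        second = rng.choice(["Heads", "Tails"])  # flipped in both branches
        if first == "Heads":
            tally[("Heads", "Monday", "8:00")] += 1  # one awakening
        else:
            time = "7:00" if second == "Heads" else "9:00"
            tally[("Tails", "Monday", time)] += 1    # two awakenings,
            tally[("Tails", "Tuesday", time)] += 1   # same time each day
    return tally

tally = simulate()
total = sum(tally.values())
p_heads = tally[("Heads", "Monday", "8:00")] / total
print(p_heads)  # near 1/3
```

The "8:00 Monday" entry absorbs both second-coin results, which is exactly the double-counting described above.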

Comment author: JeffJo 01 November 2013 09:25:39PM *  0 points [-]

Coscott’s original problem is unsolvable by standard means because the expected number of wakings is infinite, so you can’t determine a frequency. That doesn’t mean it is unanswerable – we just need an isomorphism. After informing SB of the procedure and putting her to sleep the first time:

1) Set M=0.
2) Select a number N (I'll discuss how later).
3) Flip a coin.
   a. If this (odd-numbered) flip lands heads, wake SB N times and end the experiment.
   b. If this (odd-numbered) flip lands tails, continue to step 4.
4) Flip the coin again.
   a. If this (even-numbered) flip lands heads, wake SB 3*N times and end the experiment.
   b. If this (even-numbered) flip lands tails, set M=M+1 and go to step 2.

In Coscott’s version, we start with N=1 and multiply it by 9 each time we choose a new one; that is, N=9^M. But does the answer depend on N in any way? Halfers don’t think the answer depends on the number of wakings at all, and thirders think it depends only on the ratio of wakings in step 3a to those in step 4a, not the specific values.

So I maintain that my problem is the same as coscott’s, except in scale, no matter how we choose N. We can answer the original question by choosing N=1 every time.

There is a 2/3 chance of ending after an odd number of flips, and a 1/3 chance of ending after an even number. A halfer should claim SB gains no new knowledge by being awake, so P(odd|awake)=2/3 and P(even|awake)=1/3. A thirder should say there are four possible situations that an awake SB could be in, and she cannot differentiate between them. Since 3 of them correspond to an even number of flips, P(odd|awake)=1/4 and P(even|awake)=3/4.
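The 2/3–1/3 split over experiments is just the geometric series 1/2 + 1/8 + 1/32 + ... = 2/3, which a short simulation (a sketch in Python) confirms:

```python
import random

def flips_until_heads(rng):
    """Flip a fair coin until it lands Heads; return how many flips it took."""
    n = 1
    while rng.random() < 0.5:  # Tails: flip again
        n += 1
    return n

def p_end_on_odd_flip(trials=100_000, seed=2):
    """Fraction of experiments that end after an odd number of flips."""
    rng = random.Random(seed)
    odd = sum(flips_until_heads(rng) % 2 == 1 for _ in range(trials))
    return odd / trials

print(p_end_on_odd_flip())  # near 2/3
```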

But like coscott’s, this variation, by itself, sheds no light on the original problem. We can even change numbers to something else:

1) Flip a coin.
   a. If this (odd-numbered) flip lands heads, wake SB N times and end the experiment.
   b. If this (odd-numbered) flip lands tails, continue to step 2.
2) Flip the coin again.
   a. If this (even-numbered) flip lands heads, wake SB M times and end the experiment.
   b. If this (even-numbered) flip lands tails, go to step 1.

You can decide for yourself whether you think the answer should depend on M and N, but I suspect most people will decide that based on whether they are halfers ("it can't depend on M and N!") or thirders ("it must depend on M and N!"), rather than on what makes mathematical sense. (I'm not saying they will ignore mathematical sense; I'm saying they will define it by getting the answer they prefer.)

But as long as we are accepting the possibility of infinite wakings, what happens if we hold N constant and let M approach infinity? Halfers will still say the answers don’t change, thirders will say P(odd)=N/(M+N) and P(even)=M/(M+N).

But is it, or is it not, the same if we hold M constant, at a very large number, and let N approach 0? Because at N=0, P(odd)=0, which it can’t be if the halfers are right.

Comment author: ialdabaoth 23 May 2013 11:35:34AM *  2 points [-]

So, I've been on this site for a while. When I first came here, I had never had a formal introduction to Bayes' theorem, but it sounded a lot like ideas that I had independently worked out in my high school and college days (I was something of an amateur mathematician and game theorist).

A few days ago I was reading through one of your articles - I don't remember which one - and it suddenly struck me that I may not actually understand priors as well as I think I do.

After re-reading some of the series, and then working through the math, I'm now reasonably convinced that I don't properly understand priors at all - at least, not intuitively, which seems to be an important aspect for actually using them.

I have a few weird questions that I'm hoping someone can answer, that will help point me back towards the correct quadrant of domain space. I'll start with a single question, and then see if I can claw my way towards understanding from there based on the answers:

Imagine there is a rational, Bayesian AI named B9 which has been programmed to visually identify and manipulate geometric objects. B9's favorite object is a blue ball, but B9 has no idea that it is blue: B9 sees the world through a black and white camera, and has always seen the world through a black and white camera. Until now, B9 has never heard of "colors" - no one has mentioned "colors" to B9, and B9 has certainly never experienced them. Today, unbeknownst to B9, B9's creator is going to upgrade its camera to a full-color system, and see how long it takes B9 to adapt to the new inputs.

The camera gets switched in 5 seconds. Before the camera gets switched, what prior probability does B9 assign to the possibility that its favorite ball is blue?

Comment author: JeffJo 15 June 2013 07:29:25PM 0 points [-]

Since this discussion was reopened, I've spent some time - mostly while jogging - pondering and refining my stance on the points expressed. I just got around to writing them down. Since there is no other way to do it, I'll present them boldly, apologizing in advance if I seem overly harsh. There is no such intention.

1) "Accursed Frequentists" and "Self-righteous Bayesians" alike are right, and wrong. Probability is in your knowledge - or rather, the lack thereof - of what is in the environment. Specifically, it is the measure of the ambiguity in the situation.

2) Nothing is truly random. If you know the exact shape of a coin, its exact weight distribution, exactly how it is held before flipping, exactly what forces are applied to flip it, the exact properties of the air and air currents it tumbles through, and exactly how long it is in the air before being caught in your open palm, then you can calculate - not predict - whether it will show Heads or Tails. Any lack in this knowledge leaves multiple possibilities open, which is the ambiguity.

3) Saying "the coin is biased" is saying that there is an inherent property, over all of the ambiguous ways you could hold the coin, the ambiguous forces you could use to flip it, the ambiguous air properties, and the ambiguous tumbling times, for it to land one way or another. (Its shape and weight are fixed, so they are unambiguous even if they are not known, and probably the source of this "inherent property.")

4) Your state of mind defines probability only in how you use it to define the ambiguities you are accounting for. Eliezer's frequentist is perfectly correct to say he needs to know the bias of this coin, since in his state of mind the ambiguity is what this biased coin will do. And Eliezer is also perfectly correct to say the actual bias is unimportant. His answer is 50%, since in his mind the ambiguity is what any biased coin will do. They are addressing different questions.

5) A simple change to the coin question puts Eliezer in the same "need the environment" situation he claims belongs only to the frequentist: Flip his coin twice. What probability are you willing to assign to getting the same result on both flips?
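To spell out why point 5 bites: P(same result on both flips) = b^2 + (1-b)^2, which genuinely depends on the bias b even though the single-flip answer does not. A tiny sketch:

```python
def p_same_result(b):
    """Chance that two independent flips of a coin with bias b agree."""
    return b * b + (1 - b) * (1 - b)

# Symmetry does not rescue the answer here: p_same_result(b) equals
# p_same_result(1 - b), so averaging over {b, 1 - b} still depends on
# the actual (unknown) bias.
for b in (0.5, 0.6, 0.9):
    print(b, p_same_result(b))
```

For a fair coin the answer is 1/2; for any biased coin it is strictly greater, so one cannot answer without saying something about the environment.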

6) The problem with the "B9" question discussed recently, is that there is no framework to place the ambiguity within. No environmental circumstances that you can use to assess the probability.

7) The propensity for some frequentists to want probability to be "in the environment" is just a side effect of practical application. Say you want to evaluate a statistical question, such as the effectiveness of a drug. Drug effectiveness can vary with gender, age, race, and probably many other factors that are easily identified; that is, it is indeed "in the environment." You could ignore those possible differences, and get an answer that applies to a generic person just as Eliezer's answer applies to a generic biased coin. But it behooves you to eliminate whatever sources of ambiguity you easily can.

8) In geometry, "point" and "line" are undefined concepts. But we all have a pretty good idea what they are supposed to mean, and this meaning is fairly universal.

"Length" and "angle" are undefined measurements of what separates two different instances of "point" and "line," respectively. But again, we have a pretty clear idea of what is intended.

In probability, "outcome" is an undefined concept. But unlike geometry, where the presumed meaning is universal, a meaning for "outcome" is different for each ambiguous situation. But an "event" is defined - as a set of outcomes.

"Relative likelihood" is an undefined measurement what separates two different instances of "event." And just like "length," we have a pretty clear idea of what it is supposed to mean. It expresses the relative chances that either event will occur in any expression of the ambiguities we consider.

9) "Probability" is just the likelihood relative to everything. As such, it represents the fractional chances of an event's occurrence. So if we can repeat the same ambiguities exactly, we expect the frequency to approach the probability. But note: this is not a definition of probability, as Bayesians insist frequentists think. It is a side effect of what we want "likelihood" to mean.

10) Eliezer misstated the "classic" two-child problem. The problem he stated is the one that corresponds to the usual solution, but oddly enough the usual solution is wrong for the question that is usually asked. And here I'm referring to, among others, Martin Gardner's version and Marilyn vos Savant's more famous version. The difference is that Eliezer asks the parent if there is a boy, but the classic version simply states that one child is a boy. Gardner changed his answer to 1/2 because, when the reason we have this information is not known, you can't implicitly assume that you will always know about the boy in a boy+girl family.

And the reason I bring this up, is because the "brain-teasing ability" of the problem derives more from effects of this implied assumption, than from any "tendency to think of probabilities as inherent properties of objects." This can be seen by restating the problem as a variation of Bertrand's Box Paradox:

The probability that, in a family of two children, both have the same gender is 1/2. But suppose you learn that one child is in scouts - but you don’t know if it is Boy Scouts or Girl Scouts. If it is Boy Scouts, those who answer the actual "classic" problem as Eliezer answered his variation will say the probability of two boys is 1/3. They'd say the same thing, about two girls, if it is Girl Scouts. So it appears you don’t even need to know what branch of Scouting it is to change the answer to 1/3.

The fallacy in this logic is the same as the reason Eliezer reformulated the problem: the answer is 1/3 only if you ask a question equivalent to "is at least one a boy," not if you merely learn that fact. And the "brain-teaser ability" is because people sense, correctly, that they have no new information in the "classic" version of the problem which would allow the change from 1/2 to 1/3. But they are told, incorrectly, that the answer does change.

Comment author: JeffJo 19 May 2012 11:48:40AM -1 points [-]

The problem with the Sleeping Beauty Problem (irony intended) is that it belongs more in the realm of philosophy and/or logic than mathematics. The irony in that (double irony intended) is that the supposed paradox is based on a fallacy of logic. So the people who perpetuate it should be best equipped to resolve it. Why they don't, or can't, I won't speculate about.

Mathematicians, Philosophers, and Logicians all recognize how information introduced into a probability problem allows one to update the probabilities based on that information. The controversy in the Sleeping Beauty Problem is based on the fallacious conclusion that such "new" information is required to update probabilities this way. This is an example of the logical fallacy called affirming the consequent: concluding that "If A Then B" means "A is required to be true for B to be true" (an equivalent statement is "If B then A").

All that is really needed for updating, is a change in the information. It almost always is an addition, but in the Sleeping Beauty Problem it is a removal. Sunday Sleeping Beauty (SSB) can recognize that "Tails & Awake on Monday" and "Tails & Awake on Tuesday" represent the same future (Manfred's "AND"), both with prior probability 1/2. But Awakened Sleeping Beauty (ASB), who recognizes only the present, must distinguish these two outcomes as being distinct (Manfred's "OR"). This change in information allows Bayes' Rule to be applied in a seemingly unorthodox way: P(H&AonMO|A) = P(H&AonMO)/[P(H&AonMO) + P(T&AonMO) + P(T&AonTU)] = (1/2)/(1/2+1/2+1/2) = 1/3. The denominator in this expression is greater than 1 because the change (not addition) of information separates non-disjoint events into disjoint events.
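That 1/3 also matches the long-run frequency interpretation: if the experiment is repeated many times, about a third of all awakenings occur in Heads runs. A minimal sketch:

```python
import random

def p_heads_per_awakening(trials=100_000, seed=3):
    """Long-run fraction of awakenings that occur in a Heads run."""
    rng = random.Random(seed)
    heads_awakenings = 0
    total_awakenings = 0
    for _ in range(trials):
        if rng.random() < 0.5:   # Heads: wakened on Monday only
            heads_awakenings += 1
            total_awakenings += 1
        else:                    # Tails: wakened Monday and Tuesday
            total_awakenings += 2
    return heads_awakenings / total_awakenings

print(p_heads_per_awakening())  # near 1/3
```

Halfers of course dispute that credence should track this per-awakening frequency; the simulation only shows what the frequency is.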

The philosophical issue about SSA v. SIA (or whatever these people call them; I haven't seen any two who define them agree), can be demonstrated by the "Cloned SB" variation. That's where, if Tails is flipped, an independent copy of SB is created instead of two awakenings happening. Each instance of SB will experience only one "awakening," so the separation of one prior event into two disjoint posterior events, as represented by "OR," does not occur. But neither does "AND." We need a new one called "ONE OF." This way, Bayes' Rule says P(H&Me on MO|A) = P(H&Me on MO)/[P(H&Me on MO) + (ONE OF P(T&Me on MO), P(T&Me on TU))] = (1/2)/(1/2+1/2) = 1/2.

The only plausible controversy here is how SB should interpret herself: as one individual who might be awakened twice during the experiment, or as one of the two who might exist in it. The former leads to a credence of 1/3, and the latter leads to a credence of 1/2. But the latter does not follow from the usual problem statement.

Comment author: JeffJo 29 October 2011 03:15:13PM 1 point [-]

After tinkering with a solution, and debating with myself how or whether to try it again here, I decided to post a definitive counter-argument to neq1's article as a comment. It starts with the correct probability tree, which has (at least) five outcomes, not three. But I'll use the unknown Q for one probability in it:

Heads (1/2) -> Monday (Q) -> Waken (1): Pr(observe Heads and Monday) = Q/2
Heads (1/2) -> Tuesday (1-Q) -> Sleep (1): Pr(sleep through Heads and Tuesday) = (1-Q)/2
Heads (1/2) -> Tuesday (1-Q) -> Waken (0): Pr(observe Heads and Tuesday) = 0
Tails (1/2) -> Monday (1/2) -> Waken (1): Pr(observe Tails and Monday) = 1/4
Tails (1/2) -> Tuesday (1/2) -> Waken (1): Pr(observe Tails and Tuesday) = 1/4

What halfers refuse to recognize, is that whether Beauty is awakened in any specific circumstance is a decision that is part of the process. It is based on the other two random variables, after both – repeat, both – have been determined. The event “Heads and Tuesday” is an event that exists in the sample space, and the decision to not awaken her is made only after that event has occurred. Halfers think they have to force that event into non-existence by making Q=1, when all the experiment requires is that the probability Beauty will observe it is zero. This is the point one thirder argument utilizes, that of Radford Neal’s companion Prince who is always awakened but only asked if Beauty is awakened.

In fact, there is no reason why the probability that it is Monday, given Heads, should be any different than the probability it is Monday, given Tails. So, with Q=1/2, we get that Pr(observe heads)=1/4, Pr(observe anything)=3/4, so Pr(Heads|observe anything)=1/3. QED.

Neq1’s arguments that the thirder positions are wrong are all examples of circular reasoning. He makes some assumption equivalent to saying the answer is 1/2, and from that proves the answer is 1/2. For example, when he uses “Beauty woken up at least once” as a condition, all his terms are also conditioned on the fact that the rules of the experiment were followed. So when he inserts the completely unconditional “Pr(Heads)=1/2” on the right-hand side of the equation, he really should use Pr(heads|rules followed), which is the unknown we are trying to find. It is then unsurprising that he gets the number he inserted, especially if you consider what using a probability-one event as a condition in Bayes’ Rule means.

Where neq1 claims that Nick Bostrom's argument is wrong in "Disclosure Process 1," I suggest he go back and use the values from his probability tree. Her credence of heads is (1/2)/(1/2+1/2/1,000,000). In the second process, it is either (1/2)/(1/2+1/2/7,000,000) or (1/2)/(1/2+1/2/1,000,000,000,000), depending on what "specific day" means.

Comment author: [deleted] 31 August 2011 09:51:03PM *  3 points [-]

I can't speak for the rest of your post, but

We can answer the question without knowing anything more about b, than that it is not 1/2. For any 0<=b1<1/2, since we have no other information, b=b1 and b=1-b1 must be treated as equally likely. Regardless of what the distribution of b1 is, this makes the probability the coin landed on heads 1/2.

is pretty clearly wrong. (In fact, it looks a lot like you're establishing a prior distribution, and that's uniquely a Bayesian feature.) The probability of an event (the result of the flip is surely an event, though I can't tell if you're claiming to the contrary or not) to a frequentist is the limit of the proportion of times the event occurred in independent trials as the number of trials tends to infinity. The probability the coin landed on heads is the one thing in the problem statement that can't be 1/2, because we know that the coin is biased. Your calculation above seems mostly ad hoc, as is your introduction of additional random variables elsewhere.

However, I'm not a statistician.

In response to comment by [deleted] on Probability is in the Mind
Comment author: JeffJo 23 September 2011 08:27:40PM 0 points [-]

Say a bag contains 100 unique coins that have been carefully tuned to be unfair when flipped. Each is stamped with an integer in the range 0 to 100 (50 is missing) representing its probability, in percent, of landing on heads. A single coin is withdrawn without revealing its number, and flipped. What is the probability that the result will be heads?

You are claiming that anybody who calls himself a Frequentist needs to know the number on the coin to answer this question. And that any attempt to represent the probability of drawing coin N is specifying a prior distribution, an act that is strictly prohibited for a Frequentist. Both claims are absurd. Prior distributions are a fact of the mathematics of probability, and belong to Frequentist and Bayesian alike. The only differences are (1) the Bayesian may use information differently to determine a prior, sometimes in situations where a Frequentist wouldn't see one at all; (2) The Bayesian will prefer solutions based explicitly on that prior, while the Frequentist will prefer solutions based on the how the prior affects repeated experiments; and (3) Some Frequentists might not realize when they have enough information to determine a prior, and/or its effects, that should satisfy them.

If both get answers, and they don't agree, somebody did something wrong.

The answer is 50%. The Bayesian says that, based on available information, neither result can be favored over the other so they must both have probability 50%. The Frequentist says that if you repeat the experiment 100^2 times, including the part where you draw a coin from the bag of 100 coins, you should count on getting each coin 100 times. And you should also count, for each coin, on getting heads in proportion to its probability. That way, you will count 5,000 heads in 10,000 trials, making the answer 50%. Both solutions are based on the same facts and assumptions, just organized differently.
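The Frequentist bookkeeping above can be written out directly (a sketch; the coin labels come straight from the problem statement):

```python
# The 100 coins are stamped 0..100 (percent chance of heads), with 50 missing.
biases = [n / 100 for n in range(101) if n != 50]
assert len(biases) == 100

# Repeat the experiment 100^2 = 10,000 times: draw each coin 100 times,
# and credit each coin heads in proportion to its stamped bias.
expected_heads = sum(100 * b for b in biases)
print(expected_heads / 10_000)  # 0.5, i.e. 5,000 heads in 10,000 trials
```

The biases pair off as (b, 1-b), so the sum lands on exactly half the trials, matching both the Bayesian and the Frequentist answer of 50%.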

The answer Eliezer_Yudkowsky attributes to Frequentists, for the simpler problem without the bag and stamped coins, is an incorrect Frequentist solution. Or at least, a correct solution to a different problem. One that corresponds to the different question "What proportion of the time will this coin come up heads?" I agree that some who claim to be Frequentists will answer that question. But the true Frequentist will answer the question that was asked: "What proportion of the time will the process of flipping a coin with unknown bias come up heads?" His repetitions must represent the bias for each flip as independent of any other flips, not the same bias each time. The bias B will come up just as often as the bias (1-B), so the number of heads will always be half the number of trials.

Comment author: JeffJo 08 September 2011 02:11:19PM 1 point [-]

Sleeping Beauty does not sleep well. She has three dreams before awakening. The Ghost of Mathematicians Past warns her that there are two models of probability, and that adherents to each have little that is good to say about adherents to the other. The Ghost of Mathematicians Present shows her volumes of papers and articles where both 1/2 and 1/3 are "proven" to be the correct answer based on intuitive arguments. The Ghost of Mathematicians Future doesn't speak, but shows her how reliance on intuition alone leads to misery. Only strict adherence to theory can provide an answer.

Illuminated by these spirits, once she is fully awake she reasons: "I have no idea whether today is Monday or Tuesday; but it seems that if I did know, I would have no problem answering the question. For example, if I knew it was Monday, my credence that the coin landed heads could only be 1/2. On the other hand, if I knew it was Tuesday, my credence would have to be 0. But on the gripping hand, these two incontrovertible truths can help me answer as my night visitors suggested. There is a theorem in probability, called the Theorem of Total Probability, that says the probability of event A is equal to the sum of the probabilities of the events (A intersect B(i)), where B(i) partitions the entire event space.

"Today has to be either Monday or Tuesday, and it can't be both, so these two days represent such a partition. Since I want to avoid making any assumptions as long as I can, let me say that the probability that today is Monday is X, and the probability that it is Tuesday is (1-X). Now I can use this Theorem to state, unequivocally, that my credence that the coin landed heads is P(heads)=(1/2)X+0(1-X)=X/2.

"But I know that it is possible that today is Tuesday; even a Bayesian has to admit that X<1. So I know that 1/2 cannot be correct; the answer has to be less than that. A Frequentist would say that X=2/3 because, if this experiment were repeated many times, two out of every three interviews would take place on Monday. And while a Bayesian could, in theory, choose any value that is less than 1, it is a violation of Occam's Razor to assume there is a factor present that would make X different than 2/3. So, it seems my answer must be 1/3."
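Her value X = 2/3 is easy to check by simulation (a sketch of the repeated experiment: on Heads there is one interview, on Monday; on Tails, one interview each day):

```python
import random

def p_monday_per_interview(trials=100_000, seed=4):
    """Long-run fraction of interviews that take place on Monday."""
    rng = random.Random(seed)
    monday = 0
    total = 0
    for _ in range(trials):
        if rng.random() < 0.5:   # Heads: a single Monday interview
            monday += 1
            total += 1
        else:                    # Tails: interviews Monday and Tuesday
            monday += 1
            total += 2
    return monday / total

print(p_monday_per_interview())  # near 2/3
```

Plugging X = 2/3 into P(heads) = X/2 then gives the 1/3 in the monologue.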

Comment author: [deleted] 31 August 2011 09:51:03PM *  3 points [-]

I can't speak for the rest of your post, but

We can answer the question without knowing anything more about b, than that it is not 1/2. For any 0<=b1<1/2, since we have no other information, b=b1 and b=1-b1 must be treated as equally likely. Regardless of what the distribution of b1 is, this makes the probability the coin landed on heads 1/2.

is pretty clearly wrong. (In fact, it looks a lot like you're establishing a prior distribution, and that's uniquely a Bayesian feature.) The probability of an event (the result of the flip is surely an event, though I can't tell if you're claiming to the contrary or not) to a frequentist is the limit of the proportion of times the event occurred in independent trials as the number of trials tends to infinity. The probability the coin landed on heads is the one thing in the problem statement that can't be 1/2, because we know that the coin is biased. Your calculation above seems mostly ad hoc, as is your introduction of additional random variables elsewhere.

However, I'm not a statistician.

In response to comment by [deleted] on Probability is in the Mind
Comment author: JeffJo 01 September 2011 10:47:06AM 0 points [-]

The random process a frequentist should repeat is flipping a random biased coin, and getting a random bias b and either heads or tails. You are assuming it is flipping the *same* biased coin with fixed bias B, and getting heads or tails.

The probability a random biased coin lands heads is 1/2, from either point of view. And for nshepperd, the point is that a Frequentist doesn't need to know what the bias is. As long as we can't assume it is different for b1 and 1-b1, when you integrate over the unknown distribution (yes, you can do that in this case) the answer is 1/2.

Comment author: JeffJo 31 August 2011 08:56:59PM 1 point [-]

My first post, so be gentle. :)

I disagree that there is a difference between "Bayesian" and "Frequentist" - or at least, that it has anything to do with what is mentioned in this article. The field of Probability has the unfortunate property of appearing to be a very simple, well-defined topic. But it actually is complex enough to be indefinable. Those labels are used by people who want to argue in favor of one definition - of the indefinable - over another. The only difference I see is where they fail to completely address a problem.

Take the biased coin problem as an example. If either label applies to me, it is Frequentist, but my answer is the one EliezerYudkowsky says is the Bayesian's. He gets the wrong Frequentist solution because he only allows the Frequentist to acknowledge one uncertainty - one random variable - in the problem: whether the coin came up heads or tails. If a Frequentist says the question is unanswerable, (s)he is wrong because (s)he is using an incomplete solution. The bias b - of a coin already selected - is just as much a random variable as the side s that came up in a coin already flipped. If you claim the answer must be based on the actual value of b for the coin, it must also be based on the actual value of s for this flip. That means the probability is either 0 or 1, which is absurd. (Technically, this error is one of confusing an outcome with an event. An outcome is the specific result of a specific trial, and has no probability. An event is a set of possible outcomes, and is what a probability is assigned to. EliezerYudkowsky's Frequentist is treating the choice of a coin as an outcome, and the result of the flip as an event.)

We can answer the question without knowing anything more about b, than that it is not 1/2. For any 0<=b1<1/2, since we have no other information, b=b1 and b=1-b1 must be treated as equally likely. Regardless of what the distribution of b1 is, this makes the probability the coin landed on heads 1/2.

The classic Two Child Problem has a similar issue, but Eliezer_Yudkowsky did not ask the classic one. I find it best to explain this one in the manner Joseph Bertrand used for his famous Box Paradox. I have two children. What is the probability they share the same gender? That's easy: 1/2. Now I secretly write one gender on a note card. I then show the card to you, and tell you one of my children has that gender. If it says "boy," does the answer change to 1/3? What if it says "girl"? The answers can't be different for the two words you might see; but whatever that answer is, it has to be the same as the answer to the original question (proof by Bayes Theorem). So if the answer does change, we have a paradox.

Yet if presented with the information all at once, "I have two, and one is a boy," Frequentist and Bayesian alike will usually answer "1/3." And they usually will say that anybody who answers 1/2 is addressing the "I have two, and one specific child, by age, is a boy" version Eliezer_Yudkowsky mentioned. But that is not how I get 1/2. There are three random variables, not two: the older child's gender, the younger child's gender, and which gender I will mention if I have the choice of two. Allowing all three to be split 50/50 between "boy" and "girl" makes the answer 1/2, and there is no paradox.
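The three-random-variable model can be simulated directly. A sketch with hypothetical names; the third variable decides which gender the parent mentions when the children are mixed:

```python
import random

def p_same_gender_given_boy_mentioned(trials=200_000, seed=3):
    """P(both children share a gender | the parent says 'one is a boy'),
    with the choice of which gender to mention modeled as a third
    50/50 random variable."""
    rng = random.Random(seed)
    said_boy = same = 0
    for _ in range(trials):
        older = rng.choice("BG")            # first random variable
        younger = rng.choice("BG")          # second random variable
        if older == younger:
            mentioned = older               # no choice of what to mention
        else:
            mentioned = rng.choice("BG")    # third random variable, split 50/50
        if mentioned == "B":
            said_boy += 1
            same += older == younger
    return same / said_boy

print(p_same_gender_given_boy_mentioned())   # close to 1/2, not 1/3
```

Drop the third variable - always mention "boy" whenever possible - and the same code returns the familiar 1/3, since conditioning then keeps all three boy-containing families.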

Ironically, my reasoning is what the same mathematicians will use for either the Monty Hall Problem or the identical Three Prisoners Problem. Two cases that were originally equally likely remain possible. But they are no longer equally likely, because the provider of information had a choice of two in one case, but no choice in the other. Bayesians may claim the difference is a property of the information, and Frequentists (if they use a complete solution) will say there is an additional, implicit random variable. Both work out the same, just by different methods. It is ironic because, while Bertrand's Box Paradox is often compared to these two problems on the grounds that it is mathematically equivalent to them, the Two Child Problem is closer to being logically equivalent - because of the way the information is provided - yet never gets compared. In fact, it is identical if you add a fourth box.
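The Monty Hall version of the same argument can be sketched with the host's pick made explicit as the extra random variable (hypothetical names; the player always picks door 1, and we condition on the host opening door 3):

```python
import random

def p_car_behind_door1_given_host_opens_3(trials=300_000, seed=4):
    """P(car is behind the player's door 1 | host opens door 3).

    When the car is behind door 1, the host has a free choice of doors
    2 and 3; otherwise he must open the single remaining goat door."""
    rng = random.Random(seed)
    opened3 = stick_wins = 0
    for _ in range(trials):
        car = rng.randint(1, 3)
        if car == 1:
            opened = rng.choice([2, 3])       # free choice of two
        else:
            opened = ({2, 3} - {car}).pop()   # no choice: the only goat door
        if opened == 3:
            opened3 += 1
            stick_wins += car == 1
    return stick_wins / opened3

print(p_car_behind_door1_given_host_opens_3())   # close to 1/3
```

The two remaining cases (car behind door 1, car behind door 2) started out equally likely, but the host's free choice in the first case halves its posterior weight, so sticking wins 1/3 of the time and switching wins 2/3.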
