
Metus comments on Jokes Thread - Less Wrong Discussion

25 Post author: JosephY 24 July 2014 12:31AM


Comment author: Metus 24 July 2014 01:48:34AM 1 point [-]

A Bayesian apparently is someone who after a single throw of a coin will believe that it is biased. Based on either outcome.

Also, why do 'Bayes', 'base' and 'bias' sound similar?

Comment author: Viliam_Bur 24 July 2014 11:27:23AM *  22 points [-]

Heck, I had to stop and take a pen and paper to figure that out. Turns out, you were wrong. (I expected that, but I wasn't sure how specifically.)

As a simple example, imagine that my prior belief is that 0.1 of coins always provide head, 0.1 of coins always provide tails, and 0.8 of coins are fair. So, my prior belief is that 0.2 of coins are biased.

I throw a coin and it's... let's say... head. What are the posterior probabilities? Multiplying the prior probabilities with the likelihood of this outcome, we get 0.1 × 1, 0.8 × 0.5, and 0.1 × 0. Multiplied and normalized, it is 0.2 for the heads-only coin, and 0.8 for the fair coin. -- My posterior belief remains 0.2 for biased coin, only in this case I know how specifically it is biased.
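Viliam_Bur's arithmetic here is easy to verify mechanically; a minimal sketch of the multiply-and-normalize update (the labels are mine, the numbers are from the comment):

```python
# Prior over three coin types: always-heads, fair, always-tails.
prior = {"heads-only": 0.1, "fair": 0.8, "tails-only": 0.1}
# Likelihood of observing heads, for each coin type.
likelihood = {"heads-only": 1.0, "fair": 0.5, "tails-only": 0.0}

# Bayes: posterior is proportional to prior times likelihood, then normalize.
unnorm = {c: prior[c] * likelihood[c] for c in prior}
total = sum(unnorm.values())
posterior = {c: u / total for c, u in unnorm.items()}

print(posterior)  # heads-only: 0.2, fair: 0.8, tails-only: 0.0
# P(biased) is 0.2 before the throw and 0.2 after it.
```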

The same will be true for any symmetrical prior belief. For example, if I believe that 0.000001 of coins always provide head, 0.000001 of coins always provide tails, 0.0001 of coins provide head in 80% of cases, 0.0001 of coins provide tails in 80% of cases, and the rest are fair coins... again, after one throw my posterior probability of "a biased coin" will remain exactly the same, only the proportions of specific biases will change.
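The five-coin version works out the same way; a quick sketch, indexing each coin type by its heads-probability q:

```python
# Symmetric prior: each coin with heads-probability q has an equally
# likely mirror coin with heads-probability 1 - q.
prior = {1.0: 1e-6, 0.0: 1e-6, 0.8: 1e-4, 0.2: 1e-4, 0.5: 1 - 2e-6 - 2e-4}

# Observe one heads; the likelihood of heads for a q-coin is q.
unnorm = {q: p * q for q, p in prior.items()}
total = sum(unnorm.values())
posterior = {q: u / total for q, u in unnorm.items()}

biased_before = sum(p for q, p in prior.items() if q != 0.5)
biased_after = sum(p for q, p in posterior.items() if q != 0.5)
print(biased_before, biased_after)  # equal: total "biased" mass is unchanged
```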

On the other hand, if my prior belief is asymmetrical... let's say I believe that 0.1 of coins always provide head, and 0.9 of coins are fair (and there are no always-tails coins)... then yes, a single throw that comes up head will increase my belief that the coin was biased. (Because the outcome of tails would have decreased it.)
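The asymmetric case, sketched the same way (labels are mine):

```python
# Asymmetric prior from the comment: no tails-only coins exist.
prior = {"heads-only": 0.1, "fair": 0.9}
lik_heads = {"heads-only": 1.0, "fair": 0.5}
lik_tails = {"heads-only": 0.0, "fair": 0.5}

def posterior_biased(lik):
    # P(coin is biased | observation), by multiply-and-normalize
    unnorm = {c: prior[c] * lik[c] for c in prior}
    return unnorm["heads-only"] / sum(unnorm.values())

print(posterior_biased(lik_heads))  # ~0.182: belief in bias went up from 0.1
print(posterior_biased(lik_tails))  # 0.0: belief in bias went down
```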

(Technically, a Bayesian superintelligence would probably believe that all coins are asymmetrical. I mean, they have different pictures on their sides, which can influence the probabilities of the outcomes a little bit. But such a superintelligence would have believed that the coin was biased even before the first throw.)

Comment author: Vulture 24 July 2014 11:32:03PM 10 points [-]

0.1 of coins always provide head

Now there's a way to get people interested in learning probability.

Comment author: Lumifer 24 July 2014 02:33:21PM 3 points [-]

Turns out, you were wrong.

Not so fast.

imagine that my prior belief is that 0.1 of coins always provide head, 0.1 of coins always provide tails, and 0.8 of coins are fair. So, my prior belief is that 0.2 of coins are biased.

Not quite. In your example 0.2 of coins are not biased, they are predetermined in that they always provide the same outcome no matter what.

Let's try a bit different example: the prior is that 10% of coins are biased towards heads (their probabilities are 60% heads, 40% tails), 10% are biased towards tails (60% tails, 40% heads), and 80% are fair.

After one throw (let's say it turned out to be heads) your posterior for the fair coin did not change, but your posterior for the heads-biased coin went up and for the tails-biased coin went down. Your expectation for the next throw is now skewed towards heads.
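This can be checked with the same multiply-and-normalize step, indexing each coin type by its heads-probability (a sketch, not part of the original comment):

```python
# Lumifer's prior: 10% biased to heads (q=0.6), 10% to tails (q=0.4), 80% fair.
prior = {0.6: 0.1, 0.4: 0.1, 0.5: 0.8}

# Observe one heads; the likelihood of heads for a q-coin is q.
unnorm = {q: p * q for q, p in prior.items()}
total = sum(unnorm.values())
posterior = {q: u / total for q, u in unnorm.items()}
print(posterior)  # q=0.6 rises to 0.12, q=0.4 falls to 0.08, fair stays 0.8

# Predictive probability that the next throw is heads: now slightly above 0.5.
p_next_heads = sum(q * p for q, p in posterior.items())
print(p_next_heads)  # ~0.504
```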

Comment author: Viliam_Bur 24 July 2014 03:12:11PM *  10 points [-]

My expectation of "this coin is biased" did not change, but "my expectation of the next result of this coin" changed.

In other words, I changed my expectation that the next flip will be heads, but I didn't change my expectation that of the next 1000 flips approximately 500 will be heads.

Connotationally: If I believe that biased coins are very rare, then my expectation that the next flip will be heads increases only a little. More precisely, if the ratio of biased coins is p, my expectation for the next flip increases at most by approximately p. The update based on one coin flip does not contradict common sense: it is small when biased coins are rare, and large when they are frequent.

Comment author: Lumifer 24 July 2014 04:25:57PM 0 points [-]

My expectation of "this coin is biased" did not change

In this particular example, no, it did not. However if you switch to continuous probabilities (and think not in terms of binary is-biased/is-not-biased but rather in terms of the probability of the true mean not being 0.5 plus-minus epsilon) your estimate of the character of the coin will change.

Also

"my expectation of the next result of this coin" changed

and

but I didn't change my expectation that from the next 1000 flips approximately 500 will be heads.

-- these two statements contradict each other.

Comment author: Viliam_Bur 24 July 2014 05:54:29PM 4 points [-]

"my expectation of the next result of this coin" changed

and

but I didn't change my expectation that from the next 1000 flips approximately 500 will be heads.

-- these two statements contradict each other.

Using my simplest example, because it's simplest to calculate:

Prior:

0.8 fair coin, 0.1 heads-only coin, 0.1 tails-only coin

probability "next is head" = 0.5

probability "next 1000 flips are approximately 500:500" ~ 0.8

Posterior:

0.8 fair coin, 0.2 heads-only coin

probability "next is head" = 0.6 (increased)

probability "next 1000 flips are approximately 500:500" ~ 0.8 (didn't change)
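These numbers can be reproduced exactly, if "approximately 500:500" is read as, say, 450 to 550 heads (that tolerance is my choice, not the comment's):

```python
from math import comb

def p_heads_in_range(n, q, lo, hi):
    # P(lo <= number of heads <= hi) in n flips of a coin with heads-prob q
    return sum(comb(n, k) * q**k * (1 - q)**(n - k) for k in range(lo, hi + 1))

mix_prior = {1.0: 0.1, 0.5: 0.8, 0.0: 0.1}   # before the flip
mix_post = {1.0: 0.2, 0.5: 0.8}              # after seeing heads

p_prior = sum(w * p_heads_in_range(1000, q, 450, 550) for q, w in mix_prior.items())
p_post = sum(w * p_heads_in_range(1000, q, 450, 550) for q, w in mix_post.items())
print(round(p_prior, 3), round(p_post, 3))   # both ~0.8, unchanged by the flip
```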

Comment author: Lumifer 24 July 2014 06:16:22PM 0 points [-]

Um.

Probability of a head = 0.5 necessarily means that the expected number of heads in 1000 tosses is 500.

Probability of a head = 0.6 necessarily means that the expected number of heads in 1000 tosses is 600.

Comment author: Viliam_Bur 24 July 2014 07:39:47PM *  5 points [-]

Are you playing with two different meanings of the word "expected" here?

If I roll a 6-sided die, the expected value is 3½.

But I don't really expect to see 3½ as an outcome of the roll. I expect to see either 1, or 2, or 3, or 4, or 5, or 6. But certainly not 3½.

If my model says that 0.2 of coins are heads-only and 0.8 of coins are fair, in 1000 flips I expect to see either 1000 heads (probability 0.2) or approximately 500 heads (probability 0.8). But I don't expect to see approximately 600 heads. Yet, the expected value of the number of heads in 1000 flips is 600.
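The same distinction in code: under this model the expected value is 600, yet an outcome anywhere near 600 is essentially impossible (a sketch using the 0.2/0.8 posterior from the thread; the 580–620 window is my choice):

```python
from math import comb

def p_heads_in_range(n, q, lo, hi):
    # P(lo <= number of heads <= hi) in n flips of a coin with heads-prob q
    return sum(comb(n, k) * q**k * (1 - q)**(n - k) for k in range(lo, hi + 1))

mix = {1.0: 0.2, 0.5: 0.8}  # posterior: heads-only coin or fair coin
expected_heads = sum(w * q * 1000 for q, w in mix.items())
# Probability the actual count lands anywhere near the "expected" 600:
p_near_600 = sum(w * p_heads_in_range(1000, q, 580, 620) for q, w in mix.items())

print(expected_heads)  # 600.0
print(p_near_600)      # vanishingly small: the outcome is ~500 or exactly 1000
```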

Comment author: Lumifer 24 July 2014 08:18:33PM 1 point [-]

Are you playing with two different meanings of the word "expected" here?

No, I'm just using the word in the statistical-standard sense of "expected value".

Comment author: evand 24 July 2014 07:48:04PM 2 points [-]

You can only multiply out P(next result is heads) * ( number of tosses) to get the expected number of heads if you believe those tosses are independent trials. The case of a biased coin toss explicitly violates this assumption.

Comment author: Lumifer 24 July 2014 08:21:20PM 0 points [-]

But the tosses are independent trials, even for the biased coin. I think you mean the P(heads) is not 0.6, it's either 0.5 or 1, you just don't know which one it is.

Comment author: evand 24 July 2014 08:47:50PM 1 point [-]

Which means that P(heads on toss after next|heads on next toss) != P(heads on toss after next|tails on next toss). Independence of A and B means that P(A|B) = P(A).
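Concretely, with the heads-only/fair/tails-only prior from earlier in the thread, the tosses fail exactly this test:

```python
prior = {1.0: 0.1, 0.5: 0.8, 0.0: 0.1}  # heads-only / fair / tails-only

p_h1 = sum(w * q for q, w in prior.items())        # P(first toss heads)
p_h2 = p_h1                                        # same marginal by symmetry
p_h1h2 = sum(w * q * q for q, w in prior.items())  # P(first two both heads)
p_h2_given_h1 = p_h1h2 / p_h1

print(p_h2, p_h2_given_h1)  # 0.5 vs ~0.6: P(A|B) != P(A), so not independent
```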

Comment author: James_Ernest 20 August 2014 12:04:42AM *  0 points [-]

I don't think so. None of the available potential coin-states would generate an expected value of 600 heads.

p = 0.6 -> 600 expected heads is the many-trials expected value (where each trial is 1000 flips), given the prior and the result of the first flip. But this is different from the expectation for this trial, whose distribution is bimodal: 1000 heads with probability 0.2, and a central-limit bump around 500 heads with probability 0.8.

Comment author: AlexMennen 24 July 2014 05:20:11PM 2 points [-]

However if you switch to continuous probabilities your estimate of the character of the coin will change.

No. If the distribution is symmetrical, then the probability density at .5 will be unchanged after a single coin toss.

these two statements contradict each other.

No they don't. He was saying that his estimate of the probability that the coin is unbiased (or approximately unbiased) does not change, but that the probability that the coin is weighted towards heads increased at the expense of the probability that the coin is weighted towards tails (or vice-versa, depending on the outcome of the first toss), which is correct.

Comment author: Lumifer 24 July 2014 05:29:52PM 0 points [-]

If the distribution is symmetrical, then the probability density at .5 will be unchanged after a single coin toss.

In the continuous-distribution world the probability density at exactly 0.5 is infinitesimally small. And the probability density at 0.5 plus-minus epsilon will change.

No they don't.

Yes, they do. We're talking about expected values of coin tosses now, not about the probabilities of the coin being biased.

Comment author: AlexMennen 24 July 2014 09:11:03PM 2 points [-]

the probability mass at 0.5 plus-minus epsilon will change.

(army1987 already addressed density vs mass.) No, for any x, the probability density at 0.5+x goes up by the same amount that the probability density at 0.5-x goes down (assuming a symmetrical prior), so for any x, the probability mass in [0.5-x, 0.5+x] will remain exactly the same.
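For a concrete continuous case (my choice of prior, not from the thread): a symmetric Beta(a, a) prior updated on one observed heads becomes Beta(a+1, a), and its density at 0.5 does not move:

```python
from math import gamma

def beta_pdf(x, a, b):
    # density of the Beta(a, b) distribution at x
    norm = gamma(a + b) / (gamma(a) * gamma(b))
    return norm * x**(a - 1) * (1 - x)**(b - 1)

a = 3.0  # any symmetric Beta(a, a) prior over the coin's heads-probability
before = beta_pdf(0.5, a, a)
after = beta_pdf(0.5, a + 1, a)  # one heads: Beta(a, a) -> Beta(a+1, a)
print(before, after)  # equal: the density at 0.5 is unchanged by one flip
```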

We're talking about expected values of coin tosses now, not about the probabilities of the coin being biased.

Ok, instead of 1000 flips, think about the next 2 flips. The probability that exactly 1 of them lands heads does not change. This does not contradict the claim that the probability of the next flip being heads increases, because the probability of the next two flips both being heads increases while the probability of the next two flips both being tails decreases by the same amount (assuming you just saw the coin land heads).
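The two-flip claim checks out numerically with the heads-only/fair/tails-only mixture used earlier in the thread:

```python
prior = {1.0: 0.1, 0.5: 0.8, 0.0: 0.1}
posterior = {1.0: 0.2, 0.5: 0.8, 0.0: 0.0}  # after seeing one heads

def p_exactly_one_head(mix):
    # exactly one head in two flips of a q-coin has probability 2*q*(1-q)
    return sum(w * 2 * q * (1 - q) for q, w in mix.items())

print(p_exactly_one_head(prior), p_exactly_one_head(posterior))  # both 0.4
```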

You don't even need to explicitly use Bayes's theorem and do the math to see this (though you can). It all follows from symmetry and conservation of expected evidence. By symmetry, the change in probability of some event which is symmetric with respect to heads/tails must change by the same amount whether the result of the first flip is heads or tails, and by conservation of expected evidence, those changes must add to 0. Therefore those changes are 0.

Comment author: Lumifer 25 July 2014 04:05:35AM -2 points [-]

for any x, the probability density at 0.5+x goes up by the same amount that the probability density at 0.5-x goes down (assuming a symmetrical prior)

I don't think that is true. Imagine that your probability density is a normal distribution. You update in such a way that the mean changes, 0.5 is no longer the peak. This means that your probability density is no longer symmetrical around 0.5 (even if you started with a symmetrical prior) and the probability density line is not a 45 degree straight line -- with the result that the density at 0.5+x changes by a different amount than at 0.5-x.

Comment author: AlexMennen 25 July 2014 04:42:37AM 1 point [-]

You update in such a way that the mean changes, 0.5 is no longer the peak. This means that your probability density is no longer symmetrical around 0.5 (even if you started with a symmetrical prior)

That is correct. Your probability distribution is no longer symmetrical after the first flip, which means that on the second flip, the symmetry argument I made above no longer holds, and you get information about whether the coin is biased or approximately fair. That doesn't matter for the first flip though. Did you read the last paragraph in my previous comment? If so, was any part of it unclear?

with the result that the density at 0.5+x changes by a different amount than at 0.5-x.

That does not follow from anything you wrote before it (the 45 degree straight line part is particularly irrelevant).

Comment author: [deleted] 24 July 2014 05:53:29PM 2 points [-]

In the continuous-distribution world the probability density at exactly 0.5 is infinitesimally small.

That's not what a probability density is. You're thinking of a probability mass.

Comment author: Lumifer 24 July 2014 06:14:28PM 2 points [-]

Yes, you are right.

Comment author: DanielLC 24 July 2014 09:49:36PM 3 points [-]

I didn't realize you were serious, given that this is a joke thread.

Here's the easy way to solve this:

By conservation of expected evidence, if one outcome is evidence for the coin being biased, then the other outcome is evidence against it.

They might believe that it's biased either way if they have a low prior probability of the coin being fair. For example, if they use a beta distribution for the prior, they only assign an infinitesimal probability to a fair coin. But since they're not finding evidence that it's biased, you can't say the belief is based on the outcome of the toss.

I suppose there is a sense it which your statement is true. If I'm given a coin which is badly made, but in a way that I don't understand, then the first toss is fair. I have no idea if it will land on heads or tails. Once I toss it, I have some idea of in which way it's unfair, so the next toss is not fair.

That's not usually what people mean when they talk about a fair coin, though.