
gjm comments on What's wrong with this picture? - Less Wrong Discussion

15 Post author: CronoDAS 28 January 2016 01:30PM


Comments (80)


Comment author: gjm 28 January 2016 09:02:28PM 3 points

by naive analysis, it's -always- more likely that Alice cheated than that this particular sequence came up by chance.

Why? I agree that Pr(Alice cheated) is likely higher than Pr(these coin-flip results on this occasion), but that's the wrong comparison. Pr(Alice cheated to produce these coin-flip results) is typically about 2^-n times Pr(Alice cheated), and in particular is generally smaller than Pr(Alice got these coin-flip results fairly).
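A minimal sketch of the comparison described above, under an assumption that is not in the thread: a cheater who wants to look random is modelled as picking uniformly among all 2^n sequences. The prior of 1/100 and the name `joint_probabilities` are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def joint_probabilities(n, prior_cheat):
    """Return (Pr(cheated and got X), Pr(honest and got X)) for one specific
    random-looking sequence X of length n.

    Assumption: a cheater who tries to look random picks uniformly among all
    2**n sequences, exactly as a fair coin would.
    """
    per_sequence = Fraction(1, 2 ** n)
    cheated_and_x = prior_cheat * per_sequence
    honest_and_x = (1 - prior_cheat) * per_sequence
    return cheated_and_x, honest_and_x

prior_cheat = Fraction(1, 100)  # hypothetical prior that Alice cheats
n = 100
cheated_and_x, honest_and_x = joint_probabilities(n, prior_cheat)

# Naive comparison: Pr(cheated) against Pr(X | honest). True, but not the relevant ratio.
print(prior_cheat > Fraction(1, 2 ** n))
# Relevant comparison: both joint probabilities carry the same 2**-n factor.
print(cheated_and_x < honest_and_x)
# The factor cancels, leaving the prior odds of cheating.
print(cheated_and_x / honest_and_x)
```

Under this model the 2^-n factor cancels, so a random-looking X leaves the odds of cheating where the prior put them.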

Comment author: OrphanWilde 29 January 2016 02:42:26PM 0 points

Wait. Are you arguing that, given two possibilities: Alice cheated to produce (random sequence), and Alice produced (random sequence) randomly, given that it requires the same amount of information to produce the sequence in both cases (n bits), Alice cheating to produce a given sequence is just as unlikely, for a sufficiently random sequence, as arriving at the random sequence randomly?

Comment author: gjm 29 January 2016 06:02:05PM 3 points

Pretty much, yes. (Not necessarily exactly equally unlikely -- human cheaters and perfect unbiased uncorrelated coin flippers don't produce the same output. But if you've got some arbitrary 100-long sequence of coin flips, you don't get to say "Alice must have cheated because that sequence is unlikely by chance"; it's unlikely by cheating too for the exact same reason.)

Comment author: OrphanWilde 29 January 2016 06:21:27PM *  0 points

Ok. I think part of the issue [ETA: with our mutual understanding of each other, not with you] is that you're focused on the "You're lying" part of the conversation.

I'm considering it in the context of this: "My observations are always fallible, and if you make an event improbable enough, why shouldn't I be skeptical even if I think I observed it?"

Granted, his observations have N bits of information (at least), the same as the situation with cheating, and it's at least as improbable that he'd observe a given sequence of length N when something else entirely happened as that the given sequence of length N itself happened. So in practice, it's still -certainly- more likely that he actually observed what he thinks he observed.

The paradox isn't there. The paradox is that we would, in fact, find some sequences unbelievable, even though they're exactly as likely as every other sequence. If the sequence was all heads 100 times in a row, for instance, that would be unbelievable, even though a sequence of pure heads is exactly as likely as any other sequence.

The paradox is in the fact that the sequence is undefined, and for some sequences, we'd be inclined to side with Alice, and for other sequences, we'd be inclined to side with Bob, even though all possible sequences of the same length are equally likely.

ETA:

This is what I was getting at with the difference between the reference classes of "distinguished" and "undistinguished".

Comment author: gjm 29 January 2016 09:03:29PM 2 points

if you make an event improbable enough, why shouldn't I be skeptical even if I think I observed it?

You should. You should be aware that you might, for instance, have made a mistake and slightly misremembered (or miscopied, etc.) the results of the coin flips.

we would, in fact, find some sequences unbelievable

We might say that. We might even think it. But what we ought to mean is that we find other explanations more plausible than chance in those cases. If you flip a coin 100 times and get random-looking results: sure, those particular results are very improbable, but very improbable things happen all the time (as in fact you can demonstrate by flipping a coin 100 times). What you should generally be looking at is not probabilities but odds. That random-looking sequence is neither much more nor much less likely than any other random-looking sequence of 100 coin-flips, so the fact that it's improbable doesn't give you reason to disbelieve it -- you don't have a better rival hypothesis. But if you flip all heads, suddenly there are higher-probability alternatives. Not because all-heads is especially unlikely by chance, but because it's especially likely by not-chance. Maybe the coin is double-headed. Maybe it's weighted in some clever way[1]. Maybe you're hallucinating or dreaming. Maybe some god is having a laugh. All these things are (so at least it seems) much more likely to produce all-heads than a random-looking sequence.

[1] I think I recall seeing an analysis somewhere that found that actually weighting a coin can't bias its results much.
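The odds argument in this comment can be sketched numerically. The rival hypothesis here is a double-headed coin, and the prior odds of one in a million are a hypothetical figure; `likelihood` and `posterior_odds` are names introduced for illustration only.

```python
from fractions import Fraction

def likelihood(sequence, p_heads):
    """Probability of an exact H/T sequence from independent flips with Pr(H) = p_heads."""
    prob = Fraction(1)
    for flip in sequence:
        prob *= p_heads if flip == "H" else 1 - p_heads
    return prob

def posterior_odds(sequence, prior_odds, p_heads_rival=Fraction(1)):
    """Odds(rival coin : fair coin | sequence) = prior odds * likelihood ratio."""
    likelihood_ratio = likelihood(sequence, p_heads_rival) / likelihood(sequence, Fraction(1, 2))
    return prior_odds * likelihood_ratio

prior_odds = Fraction(1, 10 ** 6)  # hypothetical prior odds of a double-headed coin
all_heads = "H" * 100
mixed = "HTTTHHHTHTHTTHHTTHTHTTHTHHHTHHTTTHTH"  # the 36-flip sequence quoted elsewhere in the thread

# All heads: the rival predicts it with certainty, so the odds swing heavily toward it.
print(posterior_odds(all_heads, prior_odds) > 1)
# Any tail at all is impossible for a double-headed coin, so the odds drop to zero.
print(posterior_odds(mixed, prior_odds))
```

The fair-coin likelihood is tiny in both cases; what differs is how well the rival explains each sequence.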

Comment author: OrphanWilde 29 January 2016 09:16:16PM 0 points

But if you flip all heads, suddenly there are higher-probability alternatives. Not because all-heads is especially unlikely by chance, but because it's especially likely by not-chance. Maybe the coin is double-headed. Maybe it's weighted in some clever way[1]. Maybe you're hallucinating or dreaming. Maybe some god is having a laugh. All these things are (so at least it seems) much more likely to produce all-heads than a random-looking sequence.

Which is, I think, what is interesting about this: All-heads is no more improbable than any other random sequence, but in the case of an all-heads sequence, suddenly we start looking for laughing gods, hallucinations, or dreams as an explanation.

Which is to say, the interesting thing here is that we'd start looking for explanations of an all-heads sequence, even though it's no more improbable than any other sequence.

Comment author: gjm 29 January 2016 10:28:46PM 2 points

No -- not "suddenly we start looking for". Suddenly those are better explanations than if the sequence of coin flips had been random-looking.

Comment author: OrphanWilde 01 February 2016 02:45:46PM 0 points

Like gods having a laugh?

You didn't, and wouldn't, leap into the better explanations. You leapt fully into any explanation except chance, without regard for whether or not it was a better explanation.

Gods having a laugh aren't something you even think of if you aren't looking for an explanation.

Comment author: gjm 01 February 2016 03:02:35PM 0 points

Gods having a laugh are a pretty terrible explanation for anything, and their inclusion here was mostly gjm having a laugh.

The borderline between "suddenly we start looking for a better explanation" and "suddenly better explanations start occurring to us" is an extremely fuzzy one. My reason for preferring the latter framing is that what's changed isn't that randomness has become worse at explaining our observations, but that some non-random explanation has got better.

Comment author: OrphanWilde 01 February 2016 03:09:57PM 0 points

One is a very good mathematical explanation.

The other is why "Gods having a laugh" would actually cross your mind. You include that as a joke because it rings true.

Comment author: buybuydandavis 30 January 2016 12:06:02PM 1 point

the interesting thing here is that we'd start looking for explanations of an all-heads sequence, even though it's no more improbable than any other sequence.

It's no more probable under the null hypothesis, but much more probable under more probable than average alternative hypotheses.

Comment author: OrphanWilde 01 February 2016 02:11:01PM 0 points

It's no more probable under the null hypothesis, but much more probable under more probable than average alternative hypotheses.

Such as gods interfering with our lives?

Imagine, for a moment, you've ruled out all of the probable explanations. Are you still going to be looking for an alternative explanation, or will you accept that it's chance?

Comment author: buybuydandavis 03 February 2016 08:08:29AM 1 point

Or the coin being a cheat, or some cheating or "non-random" effect in the situation. Or delusional recollection of events.

How did I "rule out" the alternatives? When I imagine me doing that, I imagine me reasoning poorly. I go by Jaynes' policy of having a catch all "something I don't understand" hypothesis for multiple hypothesis testing. In this case, it would be "some agent action I can't detect or don't understand the mechanism of". How did I rule that out?

Suppose it's 1,000,000 coin flips, all heads. The probability of that is pretty damn low, and much much lower than my estimates for the alternatives, including the "something else" hypothesis. You can make some of that up with a sampling argument about all the "coin flip alternatives" one sees in a day, but that only takes you so far.

I don't see how I would ever be confident that 1,000,000 came up all heads with "fair" coin flipping.
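A rough sketch of the scale involved, worked in log space because 2^-1,000,000 is far too small for a floating-point number. The prior of 10^-20 for the catch-all hypothesis is a hypothetical figure, and the catch-all is assumed (generously) to predict all heads with certainty; the prior for the fair coin is treated as approximately 1.

```python
import math

def log10_prob_specific_sequence(n):
    """log10 of 2**-n, the fair-coin probability of any one exact sequence of n flips."""
    return -n * math.log10(2)

n = 1_000_000
log10_fair = log10_prob_specific_sequence(n)

# Hypothetical: log10 of the prior for the catch-all "something else" hypothesis,
# assumed here to produce all heads with probability 1.
log10_catch_all = -20.0

# log10 of Odds(catch-all : fair | all heads), taking the fair-coin prior as roughly 1.
log10_odds = log10_catch_all - log10_fair

print(round(log10_fair))   # order of magnitude of the fair-coin probability
print(round(log10_odds))   # order of magnitude of the odds favouring the catch-all
```

Even a very small prior for "something else" is larger than the fair-coin likelihood by hundreds of thousands of orders of magnitude, which matches the intuition in the comment.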

Comment author: Romashka 03 February 2016 10:26:58AM 3 points

It's a fair coin. It just has two heads on it.

Comment author: gjm 03 February 2016 12:03:37PM 0 points

The probability of that is pretty damn low

The probability of any specific sequence of 1M coin flips is "pretty damn low" in the same sense. The relevant thing here is not that that probability is low when they're all heads, but that the probability of some varieties of "something else" is very large, relative to that low probability. Or, more precisely, what sets us thinking of "something else" hypotheses is some (unknown) heuristic that tells us that it looks like the probability of "something else" should be much bigger than the probability of chance.

(I guess the heuristic looks for excessive predictability. As a special case it will tend to notice things like regular repetition and copies of other sequences you're familiar with.)
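One crude stand-in for a "looks too predictable" heuristic is compressibility. This is only an illustrative proxy, not something proposed in the thread: the sketch uses Python's standard `zlib` module, and the helper name `compressed_size` and the seeded generator are assumptions made for the example.

```python
import random
import zlib

def compressed_size(sequence):
    """Size in bytes of the zlib-compressed H/T string; smaller means more predictable."""
    return len(zlib.compress(sequence.encode("ascii"), 9))

rng = random.Random(0)  # fixed seed so the example is repeatable
random_looking = "".join(rng.choice("HT") for _ in range(1000))
all_heads = "H" * 1000
alternating = "HT" * 500

for name, seq in [("random-looking", random_looking),
                  ("all heads", all_heads),
                  ("alternating", alternating)]:
    print(name, compressed_size(seq))
```

Sequences with regular repetition compress to a few bytes, while a random-looking one cannot be squeezed much below its roughly 1000 bits of information.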

Comment author: entirelyuseless 29 January 2016 10:03:18PM *  1 point

It is not true that overall all sequences are equally likely. The probability of a certain sequence is the probability that it would happen by chance added to the probability that it would happen by not-chance. As gjm said in his comment, the chance part is equal, but the non-chance part is not. So there is no reason why the total probability of all sequences would be equal. The total probability of a sequence of 100 heads is higher than most other sequences. For example, there is the non-chance method of just talking about a sequence without actually getting it. We're doing that now, and note that we're talking about the sequence of all heads. That was far more likely, given this method of choosing a sequence, than an individual random-looking sequence.

(But you are right that it is no more improbable than other sequences. It is less improbable overall, and that is precisely why we start looking for another explanation.)
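The "chance part plus not-chance part" decomposition can be sketched as a two-component mixture. All numbers are hypothetical: a 1-in-1000 prior that some non-chance process is at work, a 1-in-10 chance that such a process yields all heads, and the assumption that it is no better than a fair coin at hitting one particular random-looking sequence.

```python
from fractions import Fraction

def total_probability(p_not_chance, p_seq_given_not_chance, n):
    """Pr(sequence) = Pr(chance) * 2**-n + Pr(not-chance) * Pr(sequence | not-chance)."""
    chance_part = (1 - p_not_chance) * Fraction(1, 2 ** n)
    not_chance_part = p_not_chance * p_seq_given_not_chance
    return chance_part + not_chance_part

n = 100
p_not_chance = Fraction(1, 1000)  # hypothetical prior for any non-chance process

# Hypothetical: non-chance processes produce all heads one time in ten.
all_heads_total = total_probability(p_not_chance, Fraction(1, 10), n)
# Hypothetical: for one specific random-looking sequence they do no better than chance.
random_looking_total = total_probability(p_not_chance, Fraction(1, 2 ** n), n)

print(all_heads_total > random_looking_total)
print(random_looking_total == Fraction(1, 2 ** n))
```

The chance term is identical for both sequences; the whole difference in total probability comes from the not-chance term.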

Comment author: OrphanWilde 01 February 2016 04:29:40PM 0 points

No, that's a very good reason to start looking for another explanation, but somebody with no understanding of Bayes' Rule at all would do exactly the same thing. If somebody else would engage in exactly the same behavior with a radically different explanation for that behavior, given a particular stimulus - consider the possibility that your explanation for your behavior is not the real reason for your behavior.

Comment author: OrphanWilde 28 January 2016 09:11:43PM -1 points

I agree that Pr(Alice cheated) is likely higher than Pr(these coin-flip results on this occasion), but that's the wrong comparison.

Why?

Pr(Alice cheated to produce these coin-flip results) is typically about 2^-n times Pr(Alice cheated), and in particular is generally smaller than Pr(Alice got these coin-flip results fairly).

"Typically" and "Generally" are doing all the heavy lifting there. Imagine writing an AI to guess the probability that Alice cheated, given a sequence. What rules would you apply?

Comment author: gjm 28 January 2016 09:57:29PM 1 point

Why?

Because if you write down the Bayes' Rule calculation, that's not the ratio that appears in it.

"Typically" and "Generally" are doing all the heavy lifting there.

Nope. They both mean: for large n, for a fraction of sequences that tends to 1 as n -> infinity, that's what happens.

Comment author: OrphanWilde 29 January 2016 02:39:19AM -1 points

Because if you write down the Bayes' Rule calculation, that's not the ratio that appears in it.

HTTTHHHTHTHTTHHTTHTHTTHTHHHTHHTTTHTH. Using Bayes' Rule, what are the odds I actually got that sequence, as opposed to randomly typing letters? (If you miss my point: You're misusing Bayes' Rule in this argument.)

Nope. They both mean: for large n, for a fraction of sequences that tends to 1 as n -> infinity, that's what happens.

If Alice cheats 100% of the time, your formula produces probabilities greater than 1 for any n less than infinity, which I'm reasonably certain doesn't happen.

Comment author: gjm 29 January 2016 02:06:35PM 1 point

Using Bayes' Rule, what are the odds I actually got that sequence, as opposed to randomly typing letters?

Pretending for the sake of argument that I don't see any regularities in your sequence that I wouldn't expect from genuinely random coin flips (it actually looks to me more human-generated, but with only n=36 I'm not very confident of that): the odds are pretty much the same as the prior odds that you'd actually flip a coin 36 times rather than just writing down random-looking Hs and Ts.

You're misusing Bayes' Rule in this argument.

I think you may be misunderstanding my argument.

If Alice cheats 100% of the time, your formula produces probabilities greater than 1

The only formula I wrote down was "2^-n times Pr(Alice cheated)" and those probabilities are definitely not greater than 1. Would you care to be more explicit?

Comment author: OrphanWilde 29 January 2016 02:32:12PM *  0 points

Pretending for the sake of argument that I don't see any regularities in your sequence that I wouldn't expect from genuinely random coin flips (it actually looks to me more human-generated, but with only n=36 I'm not very confident of that): the odds are pretty much the same as the prior odds that you'd actually flip a coin 36 times rather than just writing down random-looking Hs and Ts.

You said something interesting there, and then skipped right past it. That's the substance of the question. You don't get to ignore those regularities; they do, in fact, affect the probabilities. Saying that they don't appear in the ratio of Bayes' Rule is... well, misusing Bayes' Rule to discard meaningful evidence.

The only formula I wrote down was "2^-n times Pr(Alice cheated)" and those probabilities are definitely not greater than 1. Would you care to be more explicit?

2^(-n) approaches 1 as n approaches infinity, but for any finite n, is greater than 1. Multiply that by a probability of 1, and you get a probability greater than 1. [ETA: Gyah. It's been too long since I've done exponents (literally, ten years since I've done anything interesting). You're right, I'm confusing negative exponents with division in exponents.]
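For reference, a quick check of the quantity in question, "2^-n times Pr(Alice cheated)", taking the extreme case Pr(Alice cheated) = 1. The function name is introduced here for illustration.

```python
from fractions import Fraction

def cheat_and_sequence_probability(n, pr_cheated=Fraction(1)):
    """2**-n * Pr(Alice cheated), the expression quoted from the thread."""
    return Fraction(1, 2 ** n) * pr_cheated

for n in (1, 2, 10, 36):
    value = cheat_and_sequence_probability(n)
    # Exact fraction alongside its floating-point approximation.
    print(n, value, float(value))
```

The values halve with each extra flip and never exceed 1/2 for n >= 1, consistent with the correction in the ETA.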

Comment author: gjm 29 January 2016 06:08:30PM 1 point

Saying that they don't occur in the ratio of Bayes' Rule [...]

But I didn't say that. I didn't say anything even slightly like that.

This is at least partly my fault because I was too lazy to write everything out explicitly. Let me do so now; perhaps it will clarify. Suppose X is some long random-looking sequence of n heads and tails.

Odds(Alice cheated : Alice flipped honestly | result was X) = Odds(Alice cheated : Alice flipped honestly) . Odds(result was X | Alice cheated : Alice flipped honestly).

The second factor on the RHS is, switching from my eccentric but hopefully clear notation to actual probability ratios, Pr(result was X | Alice cheated) / Pr(result was X | Alice flipped honestly).

So those two probabilities are the ones you have to look at, not Pr(Alice cheated) and Pr(result was X | Alice flipped honestly). But the latter is what you were comparing when you wrote

it's -always- more likely that Alice cheated than that this particular sequence came up by chance.

which is why I said that was the wrong comparison.
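The odds calculation written out in this comment can be restated in standard notation. The labels are introduced here for brevity and are not from the thread: C for "Alice cheated", F for "Alice flipped honestly", and X for the observed sequence.

```latex
\frac{\Pr(C \mid X)}{\Pr(F \mid X)}
  = \frac{\Pr(C)}{\Pr(F)} \cdot \frac{\Pr(X \mid C)}{\Pr(X \mid F)}
```

The second factor on the right is the likelihood ratio, so the quantities to compare are Pr(X | C) and Pr(X | F), not Pr(C) and Pr(X | F).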