Cyan comments on Outside the Laboratory - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (336)
I'm sorry, that seems just wrong. The statistics work if there's an unbiased process that determines which events you observe. If Alice conducts trials until 3 successes are achieved, that's a biased process, one that guarantees the data ends with at least one success.
Surely you accept that if Alice conducts 100 trials and only gives you the successes, you'll get the wrong result no matter the statistical procedure used, so you can't say that biased data collection is irrelevant. You have to either claim that continuing until 3 successes are achieved is an unbiased process, or retreat from the claim that the procedure for collecting the data does not influence the correct interpretation of the results.
If Alice decides to conduct 12 trials, then the sampling distribution of the data is the binomial distribution. If Alice decides to sample until 3 successes are achieved, then the sampling distribution of the data is the negative binomial distribution. These two distributions are proportional when considered as functions of the parameter p (i.e., as likelihood functions). So in this specific case, from a Bayesian point of view the sampling mechanism does not influence the conclusions. (This is in contradistinction to inference based on p-values.)
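To make the proportionality concrete, here is a minimal sketch in Python. It assumes the data are 12 trials containing 3 successes, with the last trial being the third success, so that the same data could have come from either design. The function names, the grid of p values, and the uniform prior are my own choices for illustration, not anything from the thread.

```python
from math import comb

def binomial_likelihood(p, n=12, k=3):
    # Fixed-n design: probability of k successes in n trials.
    return comb(n, k) * p**k * (1 - p)**(n - k)

def negative_binomial_likelihood(p, n=12, k=3):
    # Stop-at-k design: probability that the k-th success lands on trial n.
    return comb(n - 1, k - 1) * p**k * (1 - p)**(n - k)

def grid_posterior(likelihood, grid):
    # Posterior on a grid under a uniform prior (an assumption of this sketch).
    weights = [likelihood(p) for p in grid]
    total = sum(weights)
    return [w / total for w in weights]

grid = [i / 100 for i in range(1, 100)]

# The ratio of the two likelihoods does not depend on p.
ratios = [binomial_likelihood(p) / negative_binomial_likelihood(p) for p in grid]

# After normalization the constant factor cancels, so the posteriors agree.
post_binom = grid_posterior(binomial_likelihood, grid)
post_negbin = grid_posterior(negative_binomial_likelihood, grid)
max_gap = max(abs(a - b) for a, b in zip(post_binom, post_negbin))

print("min ratio:", min(ratios), "max ratio:", max(ratios))
print("largest posterior difference:", max_gap)
```

The ratio is comb(12, 3) / comb(11, 2) = 220 / 55 = 4 at every p, up to floating-point error. The two likelihoods differ only by that constant, which drops out when the posterior is normalized, so any prior would give the same posterior under both designs.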
In general, you are correct to say that biased data collection is not irrelevant; this idea is given a complete treatment in Chapter 6 (or 7, I forget which) of Gelman et al.'s Bayesian Data Analysis, 2nd ed.