The title's a lot funnier if you s/as/of/.

## Fifty Shades of Self-Fulfilling Prophecy

The official story: "Fifty Shades of Grey" was a Twilight fan-fiction that had over two million downloads online. The publishing giant Vintage Press saw that number and realized there was a huge, previously unrealized demand for stories like this. They filed off the Twilight serial numbers, put it in print, marketed it like hell, and now it's sold 60 million copies.

The reality is quite different.

Oops; you're right. Careless of me; fixed.

I don't think "95% confidence" works that way. It's a lower bound: you never try to publish anything with lower than 95% confidence (and if you do, your publication is likely to be rejected), but you don't always need exactly 95% (2 sigma).

Hell, I play enough RPGs to know that rolling a 1 or a 20 on a d20 is frequent enough ;) 95% is quite low confidence; it's really a minimum at which you can start working, not something optimal.

I'm not sure exactly how it works in medicine, but in physics it's common to have studies at 3 sigma (99.7%) or higher. The detection of the Higgs boson at the LHC, for example, was done at 5 sigma (roughly one chance in 3.5 million of being wrong).
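For reference, the sigma-to-probability conversion the comment relies on can be sketched with scipy (a minimal check under a normal model; the specific thresholds chosen are my own):

```python
from scipy.stats import norm

# Tail probability for k-sigma thresholds under a normal model.
# Particle physics usually quotes the one-sided tail: 5 sigma is
# about 2.9e-7, i.e. roughly one chance in 3.5 million.
probs = {k: norm.sf(k) for k in [2, 3, 5]}
for k, p in probs.items():
    print(k, p, 2 * p)  # one-sided and two-sided tails
```

Note that 2 sigma (two-sided) is the familiar ~5% level, while 3 sigma is already down near 0.3%.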

Especially in a field with a high risk of the data being abused by ill-intentioned people, such as the alleged "vaccines and autism" link, it would really surprise me if everyone just happily kept the 95% confidence level and didn't aim for much higher confidence.

> Especially in a field with high risk of data being abused by ill-intentioned people such as "vaccine and autism" link, it would really surprise me that everyone just kept happily the 95% confidence, and didn't aim for much higher confidence.

Okay. Be surprised. It appears that I've read hundreds of medical journal articles and you haven't.

Medicine isn't like physics. The data is incredibly messy. High sigma results are often unattainable even for things you know are true.

First: if you put something in your body, it has some effect, even if that effect is small. "No effect" results just rule out effects above various effect sizes (both positive and negative) with high probability, so there's no point talking about "a link" as if it were some discrete thing (you sort of jump back and forth between getting this one right and wrong).

Second, different studies will rule out different effect sizes with 95% confidence; or to put it another way, at a given effect size, different studies will have different p-values. So your probability exercise was pretty pointless, because you didn't compare the studies' opinions about any particular effect size, just "whatever was at 95%."

Third, I'd bet a nickel the effect sizes ruled out at 95% in all of these studies are well below the point where it would become concerning (like, say, the effect of the parents being a year older). That is, these studies all likely rule out a concerning effect size with probability much better than 95%.

> so your probability exercise was pretty pointless because you didn't compare the studies' opinions about any particular effect size

My probability exercise was not about effect size. It was about the probability of all studies agreeing by chance if there is in fact no link, and so the 95% confidence is what is relevant.

> Third, I'd bet a nickel the effect sizes ruled out at 95% in all of these studies are well below the point where it would become concerning (like, say, the effect of the parents being a year older). That is, these studies all likely rule out a concerning effect size with probability much better than 95%.

Again, not relevant to the point I'm making here.

In your "critiquing bias" section you allege that 3/43 studies supporting a link is "still surprisingly low". This is wrong; it is actually surprisingly high. If B ~ Binom(43, 0.05), then P(B > 2) ≈ 0.36.*

*As calculated by the following Python code:

```python
from scipy.stats import binom

# Probability of seeing 3 or more "positive" studies out of 43,
# if each study independently has a 5% false-positive rate
b = binom(43, 0.05)
p_less_than_3 = sum(b.pmf(i) for i in [0, 1, 2])
print(1 - p_less_than_3)  # ~0.36
```

I said "surprisingly low" because of publication & error bias.

I'm confused about how this works.

Suppose the standard were to use 80% confidence. Would it still be surprising to see 60 of 60 studies agree that A and B were not linked? Suppose the standard were to use 99% confidence. Would it still be surprising to see 60 of 60 studies agree that A and B were not linked?

Also, doesn't the prior plausibility of the connection being tested matter for attempts to detect experimenter bias this way? E.g., for any given convention about confidence intervals, shouldn't we be quicker to infer experimenter bias when a set of studies conclude (1) that there is no link between eating lithium batteries and suffering brain damage vs. when a set of studies conclude (2) that there is no link between eating carrots and suffering brain damage?

"95% confidence" means "I am testing whether X is linked to Y. I know that the data might randomly conspire against me to make it look as if X is linked to Y. I'm going to look for an effect so large that, if there is no link between X and Y, the data will conspire against me only 5% of the time to look as if there is. If I don't see an effect at least that large, I'll say that I failed to show a link between X and Y."

If you went for 80% confidence instead, you'd be looking for an effect that wasn't quite as big. You'd be able to detect smaller clinical effects--for instance, a drug that has a small but reliable effect--but if there were no effect, you'd be fooled by the data 20% of the time into thinking that there was.
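The trade-off described above can be checked by simulation (a hypothetical sketch; the study count and sample sizes are my own choices): run many "studies" in which the treatment truly does nothing, and count how often each confidence threshold is fooled by chance.

```python
import numpy as np
from scipy import stats

# Many null "studies": two arms drawn from the same distribution,
# so any significant difference is a false positive.
rng = np.random.default_rng(0)
n_studies, n_per_arm = 10_000, 50
pvals = np.array([
    stats.ttest_ind(rng.normal(size=n_per_arm),
                    rng.normal(size=n_per_arm)).pvalue
    for _ in range(n_studies)
])
rate_95 = np.mean(pvals < 0.05)  # fooled ~5% of the time
rate_80 = np.mean(pvals < 0.20)  # fooled ~20% of the time
print(rate_95, rate_80)
```

Loosening the threshold from 95% to 80% buys sensitivity to smaller effects at the price of quadrupling the false-positive rate.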

Also, doesn't the prior plausibility of the connection being tested matter for attempts to detect experimenter bias this way?

It would if the papers claimed to find a connection. When they claim not to find a connection, I think not. Suppose people decided to test the hypothesis that stock market crashes are caused by the Earth's distance from Mars. They would gather data on Earth's distance from Mars, and on movements in the stock market, and look for a correlation.

If there is no relationship, there should be zero correlation, on average. That (approximately) means that half of all studies will show a negative correlation, and half will have positive correlation.

They need to pick a number, and say that if they find a positive correlation above that number, they've proven that Mars causes stock market crashes. And they pick that number by finding the correlation just exactly large enough that, if there is no relationship, it happens 5% of the time by chance.
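The Mars/stocks procedure above can be sketched in simulation (a hypothetical illustration; the sample sizes are my own): correlate many pairs of unrelated series, and find the correlation that only 5% of null studies exceed by chance.

```python
import numpy as np

# Many null "studies": two independent random series, so any
# observed correlation is pure chance.
rng = np.random.default_rng(1)
n_studies, n_obs = 20_000, 30
corrs = np.array([
    np.corrcoef(rng.normal(size=n_obs), rng.normal(size=n_obs))[0, 1]
    for _ in range(n_studies)
])
threshold = np.quantile(corrs, 0.95)  # "just exactly large enough"
frac_positive = np.mean(corrs > 0)    # ~0.5: half positive, half negative
print(round(threshold, 2), frac_positive)
```

By construction, a null study lands above `threshold` about 5% of the time, and the sign of the correlation splits roughly 50/50, as described above.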

If the proposition is very very unlikely, somebody might insist on a 99% confidence interval instead of a 95% confidence interval. That's how prior plausibility would affect it. Adopting a standard of 95% confidence is really a way of saying we agree not to haggle over priors.

Also, different studies have different statistical power, so it may not be OK to simply add up their evidence with equal weights.

No; it's standard to set the threshold for your statistical test at 95% confidence. Studies with larger samples can detect smaller differences between groups at that same threshold.
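One common back-of-envelope version of this (my own illustration, using a normal approximation with standard deviation 1 in each arm and a two-sided test): the smallest between-group difference that crosses the 95% threshold shrinks like 1/sqrt(n).

```python
import math

z = 1.96  # two-sided 95% critical value
for n in [25, 100, 400]:
    # standard error of a difference of two means, sd = 1 in each arm
    detectable = z * math.sqrt(2 / n)
    print(n, round(detectable, 3))
```

Quadrupling the sample size halves the smallest difference the test can flag, which is why studies of different sizes rule out different effect sizes at the same confidence level.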

## Too good to be true

A friend recently posted a link on his Facebook page to an informational graphic about the alleged link between the MMR vaccine and autism. It said, if I recall correctly, that out of 60 studies on the matter, not one had indicated a link.

Presumably, with 95% confidence.

This bothered me. What are the odds, supposing there is no link between X and Y, of conducting 60 studies of the matter, and of all 60 concluding, with 95% confidence, that there is no link between X and Y?

Answer: 0.95^60 ≈ 0.046. (Use the first term of the binomial distribution.)
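A quick check of that arithmetic:

```python
# Probability that all 60 independent studies report "no link",
# when each one individually does so 95% of the time under the null
p_all_sixty = 0.95 ** 60
print(round(p_all_sixty, 3))  # 0.046
```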

So if it were in fact true that 60 out of 60 studies failed to find a link between vaccines and autism at 95% confidence, this would prove, with 95% confidence, that studies in the literature are biased against finding a link between vaccines and autism.


This belongs in Discussion, not Main. It's barely connected to rationality at all. Is there some lesson we're supposed to take from this, besides booing or yaying various groups for their smartness or non-smartness?

Downvoted for being trivia on Main.

This is about the rationality of society. It is about how opinions are formed. The idea that the market works by editors identifying books people want, and then being rewarded for their good judgement, was false in this high-profile case.