PhilGoetz comments on The Universal Medical Journal Article Error - Less Wrong Discussion

6 Post author: PhilGoetz 29 April 2014 05:57PM

Comments (189)

Comment author: Matt_Simpson 07 April 2013 05:13:01AM * 11 points

Both the t-test and the F-test work by assuming that every subject has the same response function to the intervention:

response = effect + normally distributed error

where the effect is the same for every subject.

The F test / t test doesn't quite say that. It makes statements about population averages. More specifically, if you're comparing the means of two groups, the t or F test asks whether the average response of one group is the same as that of the other group. Heterogeneity just gets captured by the error term. In fact, econometricians define the error term as the difference between the true response and what their model says the mean response is (usually conditional on covariates).
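
To spell out how heterogeneity ends up in the error term, here is one way to write it down. The notation is an assumption for illustration and is not from the paper or the post: y_i is subject i's response, D_i is 1 for treated subjects and 0 for controls, tau_i is that subject's own effect, and tau-bar is the population average of the tau_i. It also assumes treatment was assigned at random.

```latex
\begin{align*}
y_i &= \mu + \tau_i D_i + \varepsilon_i \\
    &= \mu + \bar{\tau} D_i + \underbrace{(\tau_i - \bar{\tau}) D_i + \varepsilon_i}_{u_i},
\qquad \bar{\tau} = \mathbb{E}[\tau_i], \\
\mathbb{E}[u_i \mid D_i] &= 0
\quad \text{(if $\tau_i$ is independent of $D_i$ and $\mathbb{E}[\varepsilon_i \mid D_i] = 0$)}, \\
H_0 &: \bar{\tau} = 0 .
\end{align*}
```

So the comparison of means is a test about tau-bar, not a claim that every tau_i is the same. The individual deviations from tau-bar are folded into u_i, which also means the treated group can have a larger error variance than the control group.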

The fact that the authors ignored potential heterogeneity in responses IS a problem for their analysis, but their result is still evidence against heterogeneous responses. If there really are heterogeneous responses, we should see them show up in the population average unless:

  • The positive and negative effects cancel each other out exactly once you average across the population. (This seems very unlikely.)
  • The population average effect size is nonzero but very small, possibly because the effect only occurs in a small subset of the population (even if it's large when it does occur) or something similar but more complicated. In this case, a large enough sample size would still detect the effect.
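
The two exceptions above can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the effect sizes, the 10% responder rate, the sample sizes, and the helper names (`welch_t`, `simulate`, `cancelling_effect`, `rare_effect`) are all hypothetical and have nothing to do with the study's actual data. It uses Welch's t statistic computed by hand so that it needs only the Python standard library.

```python
import random
import statistics
from math import sqrt


def welch_t(a, b):
    """Welch's t statistic for the difference in means of samples a and b."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / sqrt(va / len(a) + vb / len(b))


def simulate(n, effect_fn, rng):
    """Draw n controls and n treated subjects; each treated subject gets its own effect."""
    control = [rng.gauss(0, 1) for _ in range(n)]
    treated = [rng.gauss(0, 1) + effect_fn(rng) for _ in range(n)]
    return welch_t(treated, control)


def cancelling_effect(rng):
    # Hypothetical case 1: +1 and -1 are equally likely, so the average effect is 0.
    return rng.choice([1.0, -1.0])


def rare_effect(rng):
    # Hypothetical case 2: 10% of subjects respond strongly (+2), so the average effect is 0.2.
    return 2.0 if rng.random() < 0.1 else 0.0


rng = random.Random(0)  # fixed seed so the run is repeatable

t_cancel = simulate(20000, cancelling_effect, rng)
t_rare_small = simulate(30, rare_effect, rng)
t_rare_large = simulate(20000, rare_effect, rng)

print("cancelling effects, n=20000:", round(t_cancel, 2))
print("rare large effect,  n=30:   ", round(t_rare_small, 2))
print("rare large effect,  n=20000:", round(t_rare_large, 2))
```

With exactly cancelling effects the t statistic stays near zero even at a large sample size, since a comparison of means has nothing to detect (the treated group's variance is larger, but that is not what this statistic looks at). With a rare but large effect, the statistic is typically unimpressive at n=30 and large at n=20000.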

Now it might not be very strong evidence - this depends on sample size and the likely nature of the heterogeneity (or confounders, as Cyan mentions). And in general there is merit in your criticism of their conclusions. But I think you've unfairly characterized the methods they used.

Comment author: PhilGoetz 07 April 2013 03:45:31PM * 1 point

The fact that the authors ignored potential heterogeneity in responses IS a problem for their analysis, but their result is still evidence against heterogeneous responses.

Why do you say that? Did you look at the data?

They found F values of 0.77, 2.161, and 1.103. That means they found different behavior in the two groups. But those F-values were lower than the thresholds they had computed assuming homogeneity. They therefore said "We have rejected the hypothesis", and claimed that the evidence, which, interpreted in a Bayesian framework, might support that hypothesis, instead refuted it.

Comment author: Matt_Simpson 07 April 2013 07:30:22PM 2 points

I didn't look at the data. I was commenting on your assessment of what they did, which showed that you didn't know how the F test works. Your post made it seem as if all they did was run an F test that compared the average response of the control and treatment groups and found no difference.