GreenRoot comments on Aspergers Poll Results: LW is nerdier than the Math Olympiad? - Less Wrong
Comments (33)
It's a convenience sample!
This is grossly overstated, because the survey respondents are in no way known to be representative of LessWrong. In fact, I would suspect a very strong selection bias. Before you can say anything about the LessWrong community, you need some response statistics. How many people were offered the survey? (You could get this from the server logs. Your survey wasn't promoted and was only up for a day or so, an exposure that already narrows its audience to a subset of "LessWrong".) How many of the people who saw the survey decided to fill it out? How do the people who filled out the survey differ from the people who did not? I doubt the answers to any of these questions would be satisfying unless you started with a random sample, then individually and privately invited each of the sampled people to participate. Even then, the response bias might be bad enough to preclude any conclusions about LessWrong.
Based mostly on Unnamed's comment regarding a poll on homosexuality, I expect it would be lower than what you found for self-selected interested respondents, but I don't have much confidence in this, and I'm not familiar enough with AQ variance to say how much lower.
I have some experience with trying to do survey research right (i.e., in a manner that results in true conclusions about the population), and I have seen so many ways that convenience samples can make me wrong (non-response bias is just a start) that I generally don't bother much with trying to correct for one particular bias. Instead, I stick to random samples or use estimators that can handle convenience samples. But I am genuinely intrigued by your idea (from the very end of your post) to use a prior on non-response bias and a prior on AQ distribution to estimate the true AQ distribution of LW users. This wouldn't handle, e.g., problems with question wording or lies in responses, but I would expect those problems to be small in comparison to the non-response bias. Could you elaborate, or point me to any survey methods papers that explain this approach in detail? Where would you get the priors?
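To make the question concrete, here is a rough sketch of what a prior on non-response bias might look like in practice. It is a simplified sensitivity analysis, not necessarily what the post had in mind: it puts a normal prior on the gap between the non-respondents' mean AQ and the respondents' mean AQ, and propagates that to the population mean. The function name, the 0 to 50 AQ range, and every number below are assumptions chosen for illustration, not survey results.

```python
import random
import statistics


def simulate_population_mean(resp_mean, response_rate, shift_mu, shift_sd,
                             n_draws=10000, scale_min=0.0, scale_max=50.0,
                             seed=0):
    """Draw plausible population means under a prior on non-response bias.

    shift_mu and shift_sd describe a normal prior on
    (non-respondent mean - respondent mean). All inputs are hypothetical.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        shift = rng.gauss(shift_mu, shift_sd)
        # A mean score cannot leave the range of the scale, so clamp it.
        nonresp_mean = min(scale_max, max(scale_min, resp_mean + shift))
        # The population mean is a weighted average of the two groups.
        draws.append(response_rate * resp_mean
                     + (1.0 - response_rate) * nonresp_mean)
    return draws


# Made-up inputs: respondents average 28, a quarter of the audience
# responded, and we guess non-respondents score about 5 points lower.
draws = simulate_population_mean(28.0, 0.25, shift_mu=-5.0, shift_sd=3.0)
print("mean of simulated population means:", statistics.mean(draws))
```

The output is only as good as the prior on the gap, which is exactly the part I don't know how to justify.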
Wouldn't the maximum response bias be if all the non-responders scored 0 on the scale?
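That worst case can be written down directly. A minimal sketch, assuming the AQ is scored from 0 to 50 and that the response rate is known: the population mean is a weighted average of the respondents' mean and the unobserved non-respondents' mean, so setting the latter to the bottom and top of the scale gives the tightest bounds available without further assumptions. The function name and the numbers are hypothetical, not taken from the poll.

```python
def mean_bounds(resp_mean, response_rate, scale_min=0.0, scale_max=50.0):
    """Worst-case bounds on the population mean under unknown non-response.

    Lower bound: every non-respondent scores scale_min.
    Upper bound: every non-respondent scores scale_max.
    """
    if not 0.0 < response_rate <= 1.0:
        raise ValueError("response_rate must be in (0, 1]")
    if not scale_min <= resp_mean <= scale_max:
        raise ValueError("resp_mean must lie within the scale")
    observed_part = response_rate * resp_mean
    missing_weight = 1.0 - response_rate
    lower = observed_part + missing_weight * scale_min
    upper = observed_part + missing_weight * scale_max
    return lower, upper


# Made-up example: respondents average 28 and 25% of the audience responded.
print(mean_bounds(28.0, 0.25))  # (7.0, 44.5)
```

With a low response rate the interval covers most of the scale, which is why the survey's mean alone says so little about LessWrong as a whole.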