gjm comments on 2014 Survey Results - Less Wrong
Comments (279)
For a statistician, this is insane. In this case, this would mean that a sizable chunk of respondents actually receive money from charity.
You seem to assume that every dataset has an inherent mean and standard deviation. But means and standard deviations are the results of modeling a Gaussian distribution, and if the model fit is too poor, these metrics simply don't apply to this dataset.
The Lilliefors test was created for exactly this purpose: it tests whether a dataset is consistent with a normal distribution whose mean and variance are estimated from the data. Please use it, or leave out means and standard deviations altogether. The percentiles are (in my - very biased - opinion) much more helpful anyways.
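For readers who want to try this, here is a minimal sketch of a Lilliefors-style check using only the Python standard library. It is not the survey author's analysis, and the function names (`lilliefors_statistic`, `lilliefors_pvalue`) are made up for illustration. The p-value is approximated by Monte Carlo simulation rather than by Lilliefors' published tables. Note that the p-value is the probability of seeing a statistic at least this large if the data were normal, not the probability that the data are non-normal.

```python
import math
import random
import statistics


def normal_cdf(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lilliefors_statistic(data):
    """Kolmogorov-Smirnov distance between the data and a normal
    distribution whose mean and SD are estimated from the data itself."""
    n = len(data)
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    zs = sorted((x - mean) / sd for x in data)
    d = 0.0
    for i, z in enumerate(zs):
        f = normal_cdf(z)
        # Compare the fitted CDF with the empirical CDF just after
        # and just before each sorted data point.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def lilliefors_pvalue(data, simulations=2000, seed=0):
    """Monte Carlo p-value (an assumption of this sketch, in place of
    the published tables). The statistic does not depend on location or
    scale, so simulating from a standard normal gives the null distribution."""
    rng = random.Random(seed)
    n = len(data)
    observed = lilliefors_statistic(data)
    exceed = sum(
        lilliefors_statistic([rng.gauss(0.0, 1.0) for _ in range(n)]) >= observed
        for _ in range(simulations)
    )
    return observed, (exceed + 1) / (simulations + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical heavily skewed "donation-like" data, not the survey data.
    skewed = [rng.lognormvariate(0.0, 1.5) for _ in range(200)]
    print(lilliefors_pvalue(skewed, simulations=500))
```

A small p-value for the raw values together with a large one for their logarithms would suggest that a log-normal model is the more reasonable description.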
?
Means and standard deviations are general properties one can compute for any statistical distribution which doesn't have pathologically fat tails. (Granted, it would've been conceptually cleaner for Yvain to present the mean & SD of log donations, but there's nothing stopping us from using his mean & SD to estimate the parameters of e.g. a log-normal distribution instead of a normal distribution.)
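As a rough illustration of that last point, the sketch below converts a reported mean and SD into log-normal parameters by the method of moments. It relies on the standard identities mean = exp(mu + sigma^2/2) and variance = (exp(sigma^2) - 1) * mean^2. The function name is hypothetical, and the numbers in the example are invented, not taken from the survey.

```python
import math


def lognormal_params_from_moments(mean, sd):
    """Return (mu, sigma) of the log-normal distribution whose mean and
    standard deviation equal the given values (method of moments)."""
    if mean <= 0 or sd <= 0:
        raise ValueError("mean and sd must both be positive")
    sigma_sq = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2.0
    return mu, math.sqrt(sigma_sq)


if __name__ == "__main__":
    # Invented figures: a mean of 500 with an SD of 2000.
    mu, sigma = lognormal_params_from_moments(500.0, 2000.0)
    # The median of a log-normal distribution is exp(mu).
    print(mu, sigma, math.exp(mu))
```

Under a log-normal model, an SD larger than the mean no longer implies negative values; it just means sigma is large and the median sits far below the mean.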
You can indeed compute means and standard deviations for any distribution with small enough tails, but if the distribution is far from normal then they may not be very useful statistics. E.g., an important reason why the mean of a bunch of samples is an interesting statistic is that if the underlying distribution is normal then the sample mean is the maximum-likelihood estimator of the distribution's mean. But, e.g., if the underlying distribution is a double exponential (Laplace) then the max-likelihood estimator for its position is the median rather than the mean. Or if the distribution is Cauchy then the sample mean is just as noisy as a single sample.
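These claims are easy to check by simulation. The sketch below (standard library only; all function names are hypothetical) draws many samples of size 101 and measures how widely each estimator scatters across repetitions. It uses the interquartile range of the estimates as the spread measure, because the variance of a Cauchy sample mean does not exist.

```python
import math
import random
import statistics


def cauchy_sample(rng):
    """Standard Cauchy draw via the inverse CDF."""
    return math.tan(math.pi * (rng.random() - 0.5))


def laplace_sample(rng):
    """Standard double-exponential (Laplace) draw: the difference of
    two independent Exp(1) variables."""
    return rng.expovariate(1.0) - rng.expovariate(1.0)


def spread_of_estimator(draw, estimator, n, trials, rng):
    """Interquartile range of an estimator over repeated samples of size n."""
    estimates = sorted(
        estimator([draw(rng) for _ in range(n)]) for _ in range(trials)
    )
    return estimates[(3 * trials) // 4] - estimates[trials // 4]


def first_value(xs):
    """'Estimator' that just uses a single observation."""
    return xs[0]


if __name__ == "__main__":
    rng = random.Random(0)
    n, trials = 101, 2000
    for name, draw in [("Cauchy", cauchy_sample), ("Laplace", laplace_sample)]:
        for label, est in [
            ("single sample", first_value),
            ("mean", statistics.fmean),
            ("median", statistics.median),
        ]:
            print(name, label, spread_of_estimator(draw, est, n, trials, rng))
```

For the Cauchy case the mean's spread should come out close to that of a single draw, while the median's is far smaller. For the Laplace case both estimators improve on a single draw, but the median should scatter less than the mean.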