Qiaochu_Yuan comments on Open thread, January 25- February 1 - Less Wrong Discussion
Repeating my post from the last open thread, for better visibility:
I want to study probability and statistics in more depth than the Probability and Statistics course I had to take at university covered. The problem is, my mathematical education isn't very good (on the level of Calculus 101). I'm not afraid of math, but so far all the books I could find are either about pure application, with barely any explanations, or they assume a lot of background knowledge and introduce reams of unfamiliar notation.
I want a deeper understanding of the basic concepts. For example, the mean is an indicator of the central tendency of a sample. Intuitively, it makes sense. But why this particular formula of sum/n? You could apply all kinds of other mathematical operations to the sample. And it's even worse with variance...
Any ideas how to proceed?
I don't think that's really what means are. That intuition might fit the median better. One reason means are nice is that they have really nice properties, e.g. they're linear: the mean of a sum of random variables is the sum of their means. That makes them particularly easy to compute with and/or prove theorems about. Another reason means are nice is related to betting and the interpretation of a mean as an expected value; the theorem justifying this interpretation is the law of large numbers.
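To make those two properties concrete, here is a minimal Python sketch. The sample values, the fair six-sided die, and the helper name `sample_mean_of_die` are all made up for illustration; nothing here comes from the original comment beyond the two properties themselves.

```python
import math
import random
import statistics

# Linearity: the mean of a sum equals the sum of the means.
# xs and ys are arbitrary illustrative samples of equal length.
xs = [1, 2, 3, 4]
ys = [10, 0, 5, 5]
sums = [x + y for x, y in zip(xs, ys)]

mean_of_sum = statistics.fmean(sums)
sum_of_means = statistics.fmean(xs) + statistics.fmean(ys)
print(mean_of_sum, sum_of_means)


def sample_mean_of_die(n, seed=0):
    """Average of n simulated rolls of a fair six-sided die (hypothetical helper)."""
    rng = random.Random(seed)
    rolls = [rng.randint(1, 6) for _ in range(n)]
    return statistics.fmean(rolls)


# Law of large numbers: as n grows, the sample mean settles near the
# expected value of one roll, (1 + 2 + ... + 6) / 6 = 3.5.
for n in (10, 1_000, 100_000):
    print(n, sample_mean_of_die(n))
```

The second part is the betting interpretation in miniature: if you were paid the face value of each roll, your average payout per roll over many rolls would be close to 3.5.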
Nevertheless, in many situations the mean of a random variable is a very bad description of it (e.g. mean income is a terrible description of the income distribution, since a few very large incomes pull it up, and the median would be much more appropriate).
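A quick illustration of the income point, using invented numbers (five modest incomes and one very large one; these are not real data):

```python
import statistics

# Hypothetical incomes: five ordinary earners and one outlier.
incomes = [30_000, 35_000, 40_000, 45_000, 50_000, 10_000_000]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print("mean:  ", mean_income)
print("median:", median_income)
```

In this made-up sample the mean is larger than what five of the six people earn by a wide margin, while the median sits in the middle of the ordinary incomes.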
Edit: Here's one very undesirable property of means: they're not "covariant under increasing changes of coordinates," whereas medians are. What I mean is the following: suppose you decide to compute the mean population of all cities in the US, but later decide this is a bad idea because there are some really big cities. If you suspect that city populations grow multiplicatively rather than additively (e.g. the presence of good thing X causes a city to be 1.2x bigger than it otherwise would be, as opposed to 200 people bigger), you might decide that instead of looking at population you should look at log population. But the mean of log population is not the log of mean population!
By contrast, because log is an increasing function, the median of log population is still the log of median population. So taking medians is in some sense insensitive to these sorts of decisions, which is nice.
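Here is a small Python sketch of the log example, with invented city populations. The sample has an odd number of entries on purpose: with an even count, the common convention of averaging the two middle values makes the median identity hold only approximately.

```python
import math
import statistics

# Hypothetical city populations (odd count, so the median is one actual value).
populations = [1_000, 5_000, 20_000, 100_000, 8_000_000]
log_populations = [math.log(p) for p in populations]

# Means do not commute with log.
mean_of_log = statistics.fmean(log_populations)
log_of_mean = math.log(statistics.fmean(populations))

# Medians do, because log preserves the ordering of the values.
median_of_log = statistics.median(log_populations)
log_of_median = math.log(statistics.median(populations))

print("mean of log:  ", mean_of_log)
print("log of mean:  ", log_of_mean)
print("median of log:", median_of_log)
print("log of median:", log_of_median)
```

The two median lines agree, while the two mean lines differ, with the mean of the logs coming out smaller because log is concave.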