Covariance is one keyword. If the data is linear but not of maximal dimension (that is, the predictors are linearly dependent), then you get covariance. This is to be expected in situations like this, where you convert a scale into a bunch of booleans. ETA: and even if one did not expect adjacent values to be correlated, the fact that the total number of ratings is about the same is itself a reduction of dimension.
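To make the dimension argument concrete, here is a minimal sketch in Python. The vote counts are made up (not the OkCupid data), the helper `matrix_rank` is hypothetical, and it assumes every profile has exactly the same total number of ratings. Under that assumption the intercept column is a multiple of the sum of the five count columns, so a regression with an intercept and all five counts has at most 5 independent predictors, not 6.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from all rows below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank

# Hypothetical profiles: counts of ratings 1..5, each row summing to 20.
votes = [
    [8, 1, 1, 2, 8],
    [2, 4, 8, 4, 2],
    [0, 2, 6, 8, 4],
    [5, 5, 5, 3, 2],
    [1, 1, 2, 6, 10],
    [3, 6, 4, 4, 3],
    [7, 0, 3, 1, 9],
]
# Design matrix for a regression with an intercept: [1, n1, n2, n3, n4, n5].
design = [[1] + row for row in votes]

rank_counts = matrix_rank(votes)
rank_design = matrix_rank(design)
print(rank_counts, rank_design)
```

Adding the intercept column does not raise the rank, which is the "reduction of dimension" in question: one of the six coefficients cannot be pinned down by the data.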
But if the data is not linear, many more things can go wrong. I don't know names for them.
Matt Simpson: I suppose that could solve the problem of covariance, but that's not what I'm talking about.
It would be interesting to see higher-dimensional plots. For example, the scatter plot of average score vs. number of messages could be colored according to the number of ratings of 1, with similar charts for the other ratings.
Thanks for the pointer, I think I get the idea. To check: the question is whether many votes of 1 lead to more messages on their own, or whether they only lead to more messages if at the same time there are many votes of 5. Since the dataset contained many women who got many 1s and many 5s at the same time, and also many messages, the linear regression produced absurd values that happen to match the dataset but do not model the (non-linear) reality. To model that, one would have to consider another dimension, like "disagreement", or whatever. And of course, all this would be much clearer to me if I sat down and just read a damn ultra-basic statistics book and learned that stuff. Gah.
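A toy version of that check, as a sketch under loudly stated assumptions: the data below is invented so that messages rise only when a profile has many 1s and many 5s together (messages = n1 * n5), and the helpers `solve`, `least_squares`, and `sum_squared_error` are hypothetical names written for this illustration. It compares a purely additive linear fit with one that includes an n1 * n5 interaction term, which is one possible stand-in for "disagreement".

```python
from fractions import Fraction

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[n] for row in m]

def least_squares(x_rows, y):
    """Ordinary least squares via the normal equations (X^T X) beta = X^T y."""
    k = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x_rows, y)) for i in range(k)]
    return solve(xtx, xty)

def sum_squared_error(x_rows, y, beta):
    """Sum of squared residuals of the fitted model."""
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, t in zip(x_rows, y)
    )

# Made-up data: messages only rise when there are many 1s AND many 5s.
grid = [(n1, n5) for n1 in (0, 5, 10) for n5 in (0, 5, 10)]
messages = [n1 * n5 for n1, n5 in grid]

additive = [[1, n1, n5] for n1, n5 in grid]                      # a + b*n1 + c*n5
with_interaction = [[1, n1, n5, n1 * n5] for n1, n5 in grid]     # ... + d*n1*n5

beta_add = least_squares(additive, messages)
beta_int = least_squares(with_interaction, messages)
sse_add = sum_squared_error(additive, messages, beta_add)
sse_int = sum_squared_error(with_interaction, messages, beta_int)
print(beta_add, sse_add)
print(beta_int, sse_int)
```

On this toy data the additive fit comes out as -25 + 5*n1 + 5*n5, so it credits votes of 1 with messages on their own and predicts -25 messages for a profile with no 1s and no 5s. The fit with the interaction term recovers the invented rule exactly, with zero error.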
http://blog.okcupid.com/index.php/the-mathematics-of-beauty/