Psychohistorian comments on What's In A Name? - Less Wrong

41 Post author: Yvain 29 June 2009 12:54PM

Comments (132)

Comment author: RichardKennaway 29 June 2009 02:46:28PM * 4 points

That is all quite fascinating, in a "fancy that!" fashion, but whenever I see correlational data reported I wonder about the magnitude of the effect, and a measure of that magnitude in terms of bits of information. The first result they report is that if there were no influence between name and state of residence, the proportion of coincidences would be 0.1664, while the observed level is 0.1986. How large an influence does this represent?

I am not quite sure what the correct calculation to make is -- perhaps someone more versed in these matters can say -- but when I calculate the Kullback-Leibler divergence between two binary distributions, one with p=0.1664 and the other with p=0.1986, I get about 0.005 bits. When I estimate the mutual information between name and state, making various assumptions about the data I'd need for a precise calculation, I get a similar figure.
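For anyone who wants to check the figure, here is a minimal sketch of the Kullback-Leibler calculation for two binary distributions. The function name `bernoulli_kl_bits` is made up for illustration, and since the comment does not say which direction the divergence was taken in, the sketch computes both.

```python
import math

def bernoulli_kl_bits(p, q):
    """KL divergence D(Bernoulli(p) || Bernoulli(q)), measured in bits."""
    return p * math.log2(p / q) + (1 - p) * math.log2((1 - p) / (1 - q))

chance = 0.1664    # proportion of coincidences expected with no influence
observed = 0.1986  # proportion of coincidences actually observed

# Direction is an assumption: the comment does not specify it, so show both.
kl_observed_vs_chance = bernoulli_kl_bits(observed, chance)
kl_chance_vs_observed = bernoulli_kl_bits(chance, observed)

print(f"D(observed || chance) = {kl_observed_vs_chance:.4f} bits")
print(f"D(chance || observed) = {kl_chance_vs_observed:.4f} bits")
```

Either direction comes out at roughly 0.005 bits, so the choice does not change the conclusion.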

In short, if you want to predict someone's name from their state, or vice versa, the result is completely useless. Of course, making such a prediction was not the authors' purpose. But then, what was? What can you do with less than a hundredth of a bit?

How justifiable is it to report the finding in these words (quotes from the paper):

"people are attracted to places that resemble their own names."

and

"these findings challenge traditional assumptions about how people make major life decisions"

I have just found where Andrew Gelman has blogged about this (search his blog for "Pelham"). I don't have time to read what he says at the moment, but his headlines indicate he doesn't rate it.

Comment author: Psychohistorian 29 June 2009 02:53:13PM 9 points

More specifically, such a small effect does not require a widespread bias; if just a tiny number of people have a stronger (even conscious) bias, it could explain the data.
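A rough sketch of how small that number of people could be, using the two proportions quoted in the parent comment. The model is a deliberately extreme assumption, not anything from the paper: a fraction `strong_fraction` of people always end up in a matching state, and everyone else matches at the chance rate.

```python
chance = 0.1664    # match rate with no influence (from the parent comment)
observed = 0.1986  # observed match rate (from the parent comment)

# Hypothetical two-group model (an assumption for illustration):
#   observed = strong_fraction * 1.0 + (1 - strong_fraction) * chance
# Solving for strong_fraction:
strong_fraction = (observed - chance) / (1 - chance)

print(f"Fraction with a decisive bias: {strong_fraction:.2%}")
```

Under that assumption, a bit under 4% of people with a decisive preference would account for the whole effect, with the remaining 96% entirely unaffected.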