2012 Survey Results

80 Post author: Yvain 07 December 2012 09:04PM

Thank you to everyone who took the 2012 Less Wrong Survey (the survey is now closed. Do not try to take it.) Below the cut, this post contains the basic survey results, a few more complicated analyses, and the data available for download so you can explore it further on your own. You may want to compare these to the results of the 2011 Less Wrong Survey.

Part 1: Population

How many of us are there?

The short answer is that I don't know.

The 2011 survey ran 33 days and collected 1090 responses. This year's survey ran 23 days and collected 1195 responses. The average number of new responses during the last week was about five per day, so even if I had kept this survey open as long as the last one I probably wouldn't have gotten more than about 1250 responses. That means at most a 15% year-on-year growth rate, which is pretty abysmal compared to the 650% growth rate in two years we saw last time.
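
The back-of-envelope projection above can be reproduced in a few lines. This is only a sketch: the linear extrapolation and the helper names (`projected_total`, `growth_rate`) are assumptions for illustration, not the method actually used for the post.

```python
# Figures quoted in the paragraph above.
RESPONSES_2011 = 1090
RESPONSES_2012 = 1195
DAYS_2011 = 33
DAYS_2012 = 23
LATE_RATE_PER_DAY = 5  # rough daily responses during the final week


def projected_total(current, days_run, days_target, rate_per_day):
    """ASSUMPTION: responses keep arriving at a constant late-survey rate."""
    extra_days = max(days_target - days_run, 0)
    return current + extra_days * rate_per_day


def growth_rate(old, new):
    """Fractional growth from old to new."""
    return (new - old) / old


projected = projected_total(RESPONSES_2012, DAYS_2012, DAYS_2011, LATE_RATE_PER_DAY)
print(projected)  # 1245, i.e. "about 1250"
print(round(100 * growth_rate(RESPONSES_2011, projected), 1))  # 14.2
```

Under these assumptions the projected growth comes out a little under the 15% ceiling quoted above.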

About half of these responses were from lurkers; over half of the non-lurker remainder had commented but never posted to Main or Discussion. That means there were only about 600 non-lurkers.

But I am skeptical of these numbers. I hang out with some people who are very closely associated with the greater Less Wrong community, and a lot of them didn't know about the survey until I mentioned it to them in person. I know some people who could plausibly be described as focusing their lives around the community who just never took the survey for one reason or another. One lesson of this survey may be that the community is no longer limited to people who check Less Wrong very often, if at all. One friend didn't see the survey because she hangs out on the #lesswrong channel more than the main site. Another mostly just goes to meetups. So I think this represents only a small sample of people who could justly be considered Less Wrongers.

The question of "how quickly is LW growing" is also complicated by the high turnover. Over half the people who took this survey said they hadn't participated in the survey last year. I tried to break this down by combining a few sources of information, and I think our 1200 respondents include 500 people who took last year's survey, 400 people who were around last year but didn't take the survey for some reason, and 300 new people.

As expected, there's lower turnover among regulars than among lurkers. Of people who have posted in Main, about 75% took the survey last year; of people who only lurked, about 75% hadn't.

This view of a very high-turnover community and lots of people not taking the survey is consistent with Vladimir Nesov's data (http://lesswrong.com/lw/e4j/number_of_members_on_lesswrong/77xz) showing 1390 people who have written at least ten comments. But the survey includes only about 600 people who have at least commented; 800ish of Vladimir's accounts are either gone or didn't take the census.

Part 2: Categorical Data

SEX:
Man: 1057, 89.2%
Woman: 120, 10.1%
Other: 2, 0.2%
No answer: 6, 0.5%

GENDER:
M (cis): 1021, 86.2%
F (cis): 105, 8.9%
M (trans f->m): 3, 0.3%
F (trans m->f): 16, 1.3%
Other: 29, 2.4%
No answer: 11, 0.9%

ORIENTATION:
Heterosexual: 964, 80.7%
Bisexual: 135, 11.4%
Homosexual: 28, 2.4%
Asexual: 24, 2%
Other: 28, 2.4%
No answer: 14, 1.2%

RELATIONSHIP STYLE:

Prefer monogamous: 639, 53.9%
Prefer polyamorous: 155, 13.1%
Uncertain/no preference: 358, 30.2%
Other: 21, 1.8%
No answer: 12, 1%

NUMBER OF CURRENT PARTNERS:
0: 591, 49.8%
1: 519, 43.8%
2: 34, 2.9%
3: 12, 1%
4: 5, 0.4%
6: 1, 0.1%
7: 1, 0.1% (and this person added "really, not trolling")
Confusing or no answer: 20, 1.8%

RELATIONSHIP STATUS:
Single: 628, 53%
Relationship: 323, 27.3%
Married: 220, 18.6%
No answer: 14, 1.2%

RELATIONSHIP GOALS:
Not looking for more partners: 707, 59.7%
Looking for more partners: 458, 38.6%
No answer: 20, 1.7%

COUNTRY:
USA: 651, 54.9%
UK: 103, 8.7%
Canada: 74, 6.2%
Australia: 59, 5%
Germany: 54, 4.6%
Israel: 15, 1.3%
Finland: 15, 1.3%
Russia: 13, 1.1%
Poland: 12, 1%

These are all the countries with greater than 1% of Less Wrongers, but other, more exotic locales included Kenya, Pakistan, and Iceland, with one user each. You can see the full table here.

This data also allows us to calculate Less Wrongers per capita:


Finland: 1/366,666
Australia: 1/389,830
Canada: 1/472,972
USA: 1/483,870
Israel: 1/533,333
UK: 1/603,883
Germany: 1/1,518,518
Poland: 1/3,166,666
Russia: 1/11,538,462
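
For anyone who wants to redo the per-capita table, here is a minimal sketch. The respondent counts come from the COUNTRY table above; the population figures are NOT survey data. They are rounded values back-derived from the ratios in this post, so treat them as assumptions. The names `respondents`, `population`, and `people_per_respondent` are hypothetical.

```python
# Respondent counts from the COUNTRY table.
respondents = {
    "USA": 651, "UK": 103, "Canada": 74, "Australia": 59, "Germany": 54,
    "Israel": 15, "Finland": 15, "Russia": 13, "Poland": 12,
}

# ASSUMPTION: rounded populations inferred from the post's own ratios.
population = {
    "USA": 315_000_000, "UK": 62_200_000, "Canada": 35_000_000,
    "Australia": 23_000_000, "Germany": 82_000_000, "Israel": 8_000_000,
    "Finland": 5_500_000, "Russia": 150_000_000, "Poland": 38_000_000,
}


def people_per_respondent(country):
    """How many inhabitants there are per survey respondent (rounded down)."""
    return population[country] // respondents[country]


# Densest Less Wrong presence first.
ranking = sorted(respondents, key=people_per_respondent)
for country in ranking:
    print(f"{country}: 1/{people_per_respondent(country):,}")
```

Because this sketch rounds down, the last digit can differ from the table where the post rounded to nearest.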

RACE:
White, non-Hispanic: 1003, 84.6%
East Asian: 50, 4.2%
Hispanic: 47, 4.0%
Indian Subcontinental: 28, 2.4%
Black: 8, 0.7%
Middle Eastern: 4, 0.3%
Other: 33, 2.8%
No answer: 12, 1%

WORK STATUS:
Student: 476, 40.2%
For-profit work: 364, 30.7%
Self-employed: 95, 8%
Unemployed: 81, 6.8%
Academics (teaching): 54, 4.6%
Government: 46, 3.9%
Non-profit: 44, 3.7%
Independently wealthy: 12, 1%
No answer: 13, 1.1%

PROFESSION:
Computers (practical): 344, 29%
Math: 109, 9.2%
Engineering: 98, 8.3%
Computers (academic): 72, 6.1%
Physics: 66, 5.6%
Finance/Econ: 65, 5.5%
Computers (AI): 39, 3.3%
Philosophy: 36, 3%
Psychology: 25, 2.1%
Business: 23, 1.9%
Art: 22, 1.9%
Law: 21, 1.8%
Neuroscience: 19, 1.6%
Medicine: 15, 1.3%
Other social science: 24, 2%
Other hard science: 20, 1.7%
Other: 123, 10.4%
No answer: 27, 2.3%

DEGREE:
Bachelor's: 438, 37%
High school: 333, 28.1%
Master's: 192, 16.2%
Ph.D: 71, 6%
2-year: 43, 3.6%
MD/JD/professional: 24, 2%
None: 55, 4.6%
Other: 15, 1.3%
No answer: 14, 1.2%

POLITICS:
Liberal: 427, 36%
Libertarian: 359, 30.3%
Socialist: 326, 27.5%
Conservative: 35, 3%
Communist: 8, 0.7%
No answer: 30, 2.5%

You can see the exact definitions given for each of these terms on the survey.

RELIGIOUS VIEWS:
Atheist, not spiritual: 880, 74.3%
Atheist, spiritual: 107, 9.0%
Agnostic: 94, 7.9%
Committed theist: 37, 3.1%
Lukewarm theist: 27, 2.3%
Deist/Pantheist/etc: 23, 1.9%
No answer: 17, 1.4%

FAMILY RELIGIOUS VIEWS:
Lukewarm theist: 392, 33.1%
Committed theist: 307, 25.9%
Atheist, not spiritual: 161, 13.6%
Agnostic: 149, 12.6%
Atheist, spiritual: 46, 3.9%
Deist/Pantheist/Etc: 32, 2.7%
Other: 84, 7.1%

RELIGIOUS BACKGROUND:
Other Christian: 517, 43.6%
Catholic: 295, 24.9%
Jewish: 100, 8.4%
Hindu: 21, 1.8%
Traditional Chinese: 17, 1.4%
Mormon: 15, 1.3%
Muslim: 12, 1%

Raw data is available here.

MORAL VIEWS:

Consequentialism: 735, 62%
Virtue Ethics: 166, 14%
Deontology: 50, 4.2%
Other: 214, 18.1%
No answer: 20, 1.7%

NUMBER OF CHILDREN
0: 1044, 88.1%
1: 51, 4.3%
2: 48, 4.1%
3: 19, 1.6%
4: 3, 0.3%
5: 2, 0.2%
6: 1, 0.1%
No answer: 17, 1.4%

WANT MORE CHILDREN?

No: 438, 37%
Maybe: 363, 30.7%
Yes: 366, 30.9%
No answer: 16, 1.4%

LESS WRONG USE:
Lurkers (no account): 407, 34.4%
Lurkers (with account): 138, 11.7%
Posters (comments only): 356, 30.1%
Posters (comments + Discussion only): 164, 13.9%
Posters (including Main): 102, 8.6%

SEQUENCES:
Never knew they existed until this moment: 99, 8.4%
Knew they existed; never looked at them: 23, 1.9%
Read < 25%: 227, 19.2%
Read ~ 25%: 145, 12.3%
Read ~ 50%: 164, 13.9%
Read ~ 75%: 203, 17.2%
Read ~ all: 306, 24.9%
No answer: 16, 1.4%

Dear 8.4% of people: there is this collection of old blog posts called the Sequences. It is by Eliezer, the same guy who wrote Harry Potter and the Methods of Rationality. It is really good! If you read it, you will understand what we're talking about much better!

REFERRALS:
Been here since Overcoming Bias: 265, 22.4%
Referred by a link on another blog: 23.5%
Referred by a friend: 147, 12.4%
Referred by HPMOR: 262, 22.1%
No answer: 35, 3%

BLOG REFERRALS:

Common Sense Atheism: 20 people
Hacker News: 20 people
Reddit: 15 people
Unequally Yoked: 7 people
TV Tropes: 7 people
Marginal Revolution: 6 people
gwern.net: 5 people
RationalWiki: 4 people
Shtetl-Optimized: 4 people
XKCD fora: 3 people
Accelerating Future: 3 people

These are all the sites that referred at least three people in a way that was obvious to disentangle from the raw data. You can see a more complete list, including the long tail, here.

MEETUPS:
Never been to one: 834, 70.5%
Have been to one: 320, 27%
No answer: 29, 2.5%

CATASTROPHE:
Pandemic (bioengineered): 272, 23%
Environmental collapse: 171, 14.5%
Unfriendly AI: 160, 13.5%
Nuclear war: 155, 13.1%
Economic/Political collapse: 137, 11.6%
Pandemic (natural): 99, 8.4%
Nanotech: 49, 4.1%
Asteroid: 43, 3.6%

The wording of this question was "which disaster do you think is most likely to wipe out greater than 90% of humanity before the year 2100?"

CRYONICS STATUS:
No, don't want to: 275, 23.2%
No, still thinking: 472, 39.9%
No, procrastinating: 178, 15%
No, unavailable: 120, 10.1%
Yes, signed up: 44, 3.7%
Never thought about it: 46, 3.9%
No answer: 48, 4.1%

VEGETARIAN:
No: 906, 76.6%
Yes: 147, 12.4%
No answer: 130, 11%

For comparison, 3.2% of US adults are vegetarian.


SPACED REPETITION SYSTEMS
Don't use them: 511, 43.2%
Do use them: 235, 19.9%
Never heard of them: 302, 25.5%

Dear 25.5% of people: spaced repetition systems are nifty, mostly free computer programs that allow you to study and memorize facts more efficiently. See for example http://ankisrs.net/

HPMOR:
Never read it: 219, 18.5%
Started, haven't finished: 190, 16.1%
Read all of it so far: 659, 55.7%

Dear 18.5% of people: Harry Potter and the Methods of Rationality is a Harry Potter fanfic about rational thinking written by Eliezer Yudkowsky (the guy who started this site). It's really good. You can find it at http://www.hpmor.com/.


ALTERNATIVE POLITICS QUESTION:

Progressive: 429, 36.3%
Libertarian: 278, 23.5%
Reactionary: 30, 2.5%
Conservative: 24, 2%
Communist: 22, 1.9%
Other: 156, 13.2%

ALTERNATIVE ALTERNATIVE POLITICS QUESTION:
Left-Libertarian: 102, 8.6%
Progressive: 98, 8.3%
Libertarian: 91, 7.7%
Pragmatist: 85, 7.2%
Social Democrat: 80, 6.8%
Socialist: 66, 5.6%
Anarchist: 50, 4.1%
Futarchist: 29, 2.5%
Moderate: 18, 1.5%
Moldbuggian: 19, 1.6%
Objectivist: 11, 0.9%

These are the only ones that had more than ten people. Other responses notable for their unusualness were Monarchist (5 people), fascist (3 people, plus one who was up for fascism but only if he could be the leader), conservative (9 people), and a bunch of people telling me politics was stupid and I should feel bad for asking the question. You can see the full table here.

CAFFEINE:
Never: 162, 13.7%
Rarely: 237, 20%
At least 1x/week: 207, 17.5%
Daily: 448, 37.9%
No answer: 129, 10.9%

SMOKING:
Never: 896, 75.7%
Used to: 105, 8.9%
Still do: 51, 4.3%
No answer: 131, 11.1%

For comparison, about 28.4% of the US adult population smokes.

NICOTINE (OTHER THAN SMOKING):
Never used: 916, 77.4%
Rarely use: 82, 6.9%
>1x/month: 32, 2.7%
Every day: 14, 1.2%
No answer: 139, 11.7%

MODAFINIL:
Never: 76.5%
Rarely: 78, 6.6%
>1x/month: 48, 4.1%
Every day: 9, 0.8%
No answer: 143, 12.1%

TRUE PRISONERS' DILEMMA:
Defect: 341, 28.8%
Cooperate: 316, 26.7%
Not sure: 297, 25.1%
No answer: 229, 19.4%

FREE WILL:
Not confused: 655, 55.4%
Somewhat confused: 296, 25%
Confused: 81, 6.8%
No answer: 151, 12.8%

TORTURE VS. DUST SPECKS
Choose dust specks: 435, 36.8%
Choose torture: 261, 22.1%
Not sure: 225, 19%
Don't understand: 22, 1.9%
No answer: 240, 20.3%

SCHRODINGER EQUATION:
Can't calculate it: 855, 72.3%
Can calculate it: 175, 14.8%
No answer: 153, 12.9%

PRIMARY LANGUAGE:
English: 797, 67.3%
German: 54, 4.5%
French: 13, 1.1%
Finnish: 11, 0.9%
Dutch: 10, 0.9%
Russian: 15, 1.3%
Portuguese: 10, 0.9%

These are all the languages with ten or more speakers, but we also have everything from Marathi to Tibetan. You can see the full table here.

NEWCOMB'S PROBLEM
One-box: 726, 61.4%
Two-box: 78, 6.6%
Not sure: 53, 4.5%
Don't understand: 86, 7.3%
No answer: 240, 20.3%

ENTREPRENEUR:
Don't want to start business: 447, 37.8%
Considering starting business: 334, 28.2%
Planning to start business: 96, 8.1%
Already started business: 112, 9.5%
No answer: 194, 16.4%

ANONYMITY:
Post using real name: 213, 18%
Easy to find real name: 256, 21.6%
Hard to find name, but wouldn't bother me if someone did: 310, 26.2%
Anonymity is very important: 170, 14.4%
No answer: 234, 19.8%

HAVE YOU TAKEN A PREVIOUS LW SURVEY?
No: 559, 47.3%
Yes: 458, 38.7%
No answer: 166, 14%

TROLL TOLL POLICY:
Disapprove: 194, 16.4%
Approve: 178, 15%
Haven't heard of this: 375, 31.7%
No opinion: 249, 21%
No answer: 187, 15.8%

MYERS-BRIGGS
INTJ: 163, 13.8%
INTP: 143, 12.1%
ENTJ: 35, 3%
ENTP: 30, 2.5%
INFP: 26, 2.2%
INFJ: 25, 2.1%
ISTJ: 14, 1.2%
No answer: 715, 60%

This includes all types with greater than 10 people. You can see the full table here.

Part 3: Numerical Data

Except where indicated otherwise, all the numbers below are given in the format:

mean+standard_deviation (25% level, 50% level/median, 75% level) [n = number of data points]
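
If you want to produce the same kind of summary line from the released data, something like the following should work. It is a sketch under assumptions: it uses the sample standard deviation and Python's "inclusive" quartile method, and the post does not say which conventions were actually used. The function name `summarize` is hypothetical.

```python
import statistics


def summarize(values):
    """Format a column as: mean + sd (25%, median, 75%) [n = N]."""
    mean = statistics.mean(values)
    # ASSUMPTION: sample (n - 1) standard deviation.
    sd = statistics.stdev(values)
    # ASSUMPTION: "inclusive" quartiles; other methods give slightly
    # different cut points on small samples.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return f"{mean:.1f} + {sd:.1f} ({q1:g}, {median:g}, {q3:g}) [n = {len(values)}]"


# Toy data, not survey responses.
print(summarize([1, 2, 3, 4, 5]))  # 3.0 + 1.6 (2, 3, 4) [n = 5]
```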

INTELLIGENCE:

IQ (self-reported): 138.7 + 12.7 (130, 138, 145) [n = 382]
SAT (out of 1600): 1485.8 + 105.9 (1439, 1510, 1570) [n = 321]
SAT (out of 2400): 2319.5 + 1433.7 (2155, 2240, 2320)
ACT: 32.7 + 2.3 (31, 33, 34) [n = 207]
IQ (on iqtest.dk): 125.63 + 13.4 (118, 130, 133) [n = 378]

I am going to harp on these numbers because in the past some people have been pretty quick to ridicule this survey's intelligence numbers as completely useless and impossible and so on.

According to IQ Comparison Site, an SAT score of 1485/1600 corresponds to an IQ of about 144. According to Ivy West, an ACT of 33 corresponds to an SAT of 1470 (and thence to IQ of 143).

So if we consider self-report, SAT, ACT, and iqtest.dk as four measures of IQ, these come out to 139, 144, 143, and 126, respectively.

All of these are pretty close except iqtest.dk. I ran a correlation between all of them and found that self-reported IQ is correlated with SAT scores at the 1% level and iqtest.dk at the 5% level, but SAT scores and iqtest.dk are not correlated with each other.

Of all these, I am least likely to trust iqtest.dk. First, it's a random Internet IQ test. Second, it correlates poorly with the other measures. Third, a lot of people have complained in the comments to the survey post that it exhibits some weird behavior.

But iqtest.dk gave us the lowest number! And even it said the average was 125 to 130! So I suggest that we now have pretty good, pretty believable evidence that the average IQ for this site really is somewhere in the 130s, and that self-reported IQ isn't as terrible a measure as one might think.

AGE:
27.8 + 9.2 (22, 26, 31) [n = 1185]

LESS WRONG USE:
Karma: 1078 + 2939.5 (0, 4.5, 136) [n = 1078]
Months on LW: 26.7 + 20.1 (12, 24, 40) [n = 1070]
Minutes/day on LW: 19.05 + 24.1 (5, 10, 20) [n = 1105]
Wiki views/month: 3.6 + 6.3 (0, 1, 5) [n = 984]
Wiki edits/month: 0.1 + 0.8 (0, 0, 0) [n = 984]

PROBABILITIES:
Many Worlds: 51.6 + 31.2 (25, 55, 80) [n = 1005]
Aliens (universe): 74.2 + 32.6 (50, 90, 99) [n = 1090]
Aliens (galaxy): 42.1 + 38 (5, 33, 80) [n = 1081]
Supernatural: 5.9 + 18.6 (0, 0, 1) [n = 1095]
God: 6 + 18.7 (0, 0, 1) [n = 1098]
Religion: 3.8 + 15.5 (0, 0, 0.8) [n = 1113]
Cryonics: 18.5 + 24.8 (2, 8, 25) [n = 1100]
Antiagathics: 25.1 + 28.6 (1, 10, 35) [n = 1094]
Simulation: 25.1 + 29.7 (1, 10, 50) [n = 1039]
Global warming: 79.1 + 25 (75, 90, 97) [n = 1112]
No catastrophic risk: 71.1 + 25.5 (55, 80, 90) [n = 1095]
Space: 20.1 + 27.5 (1, 5, 30) [n = 953]

CALIBRATION:
Year of Bayes' birth: 1767.5 + 109.1 (1710, 1780, 1830) [n = 1105]
Confidence: 33.6 + 23.6 (20, 30, 50) [n = 1082]

MONEY:
Income/year: 50,913 + 60644.6 (12000, 35000, 74750) [n = 644]
Charity/year: 444.1 + 1152.4 (0, 30, 250) [n = 950]
SIAI/CFAR charity/year: 309.3 + 3921 (0, 0, 0) [n = 961]
Aging charity/year: 13 + 184.9 (0, 0, 0) [n = 953]

TIME USE:
Hours online/week: 42.4 + 30 (21, 40, 59) [n = 944]
Hours reading/week: 30.8 + 19.6 (18, 28, 40) [n = 957]
Hours writing/week: 7.9 + 9.8 (2, 5, 10) [n = 951]

POLITICAL COMPASS:
Left/Right: -2.4 + 4 (-5.5, -3.4, -0.3) [n = 476]
Libertarian/Authoritarian: -5 + 2 (-6.2, -5.2, -4)

BIG 5 PERSONALITY TEST:
Big 5 (O): 60.6 + 25.7 (41, 65, 84) [n = 453]
Big 5 (C): 35.2 + 27.5 (10, 30, 58) [n = 453]
Big 5 (E): 30.3 + 26.7 (7, 22, 48) [n = 454]
Big 5 (A): 41 + 28.3 (17, 38, 63) [n = 453]
Big 5 (N): 36.6 + 29 (11, 27, 60) [n = 449]

These scores are in percentiles, so LWers are more Open, but less Conscientious, Agreeable, Extraverted, and Neurotic than average test-takers. Note that people who take online psychometric tests are probably a pretty skewed category already so this tells us nothing. Also, several people got confusing results on this test or found it different than other tests that they took, and I am pretty unsatisfied with it and don't trust the results.

AUTISM QUOTIENT
AQ: 24.1 + 12.2 (17, 24, 30) [n = 367]

This test says the average control subject got 16.4 and 80% of those diagnosed with autism spectrum disorders get 32+ (which of course doesn't tell us what percent of people above 32 have autism...). If we trust them, most LWers are more autistic than average.

CALIBRATION:

Reverend Thomas Bayes was born in 1701. Survey takers were asked to guess this date within 20 years, so anyone who guessed between 1681 and 1721 was recorded as getting a correct answer. The percent of people who answered correctly is recorded below, stratified by the confidence they gave of having guessed correctly and with the number of people at that confidence level.

0-5: 10% [n = 30]
5-15: 14.8% [n = 183]
15-25: 10.3% [n = 242]
25-35: 10.7% [n = 225]
35-45: 11.2% [n = 98]
45-55: 17% [n = 118]
55-65: 20.1% [n = 62]
65-75: 26.4% [n = 34]
75-85: 36.4% [n = 33]
85-95: 60.2% [n = 20]
95-100: 85.7% [n = 23]
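
Since I'd like someone to double-check this, here is a minimal sketch of how a table like the one above can be built. Several things are assumptions: each response is a (guessed year, confidence in percent) pair, a bucket includes its lower edge but not its upper edge (except that 100 falls in the last bucket), and the names `bucket_label` and `calibration_table` are made up for illustration.

```python
from collections import defaultdict

BAYES_BIRTH_YEAR = 1701
TOLERANCE_YEARS = 20  # a guess within 20 years counts as correct
# Bucket edges as listed in the table above.
EDGES = [0, 5, 15, 25, 35, 45, 55, 65, 75, 85, 95, 100]


def bucket_label(confidence):
    """ASSUMPTION: lower edge inclusive, upper edge exclusive, 100 in last bucket."""
    for lo, hi in zip(EDGES, EDGES[1:]):
        if lo <= confidence < hi or (hi == EDGES[-1] and confidence == hi):
            return f"{lo}-{hi}"
    raise ValueError(f"confidence out of range: {confidence}")


def calibration_table(responses):
    """Map each confidence bucket to (percent correct, number of responses)."""
    tally = defaultdict(lambda: [0, 0])  # label -> [correct, total]
    for guess, confidence in responses:
        label = bucket_label(confidence)
        tally[label][1] += 1
        if abs(guess - BAYES_BIRTH_YEAR) <= TOLERANCE_YEARS:
            tally[label][0] += 1
    return {label: (100 * c / n, n) for label, (c, n) in tally.items()}


# Toy responses, not survey data.
toy = [(1700, 50), (1800, 50), (1750, 50), (1710, 90), (1650, 10)]
table = calibration_table(toy)
```

A perfectly calibrated group would show a percent correct close to the middle of each bucket.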

Here's a classic calibration chart. The blue line is perfect calibration. The orange line is you guys. And the yellow line is average calibration from an experiment I did with untrained subjects a few years ago (which of course was based on different questions and so not directly comparable).

The results are atrocious; when Less Wrongers are 50% certain, they only have about a 17% chance of being correct. On this problem, at least, they are as bad as or worse than the general population at avoiding overconfidence bias.

My hope was that this was the result of a lot of lurkers who don't know what they're doing stumbling upon the survey and making everyone else look bad, so I ran a second analysis. This one used only the numbers of people who had been in the community at least 2 years and accumulated at least 100 karma; this limited my sample size to about 210 people.

I'm not going to post exact results, because I made some minor mistakes which means they're off by a percentage point or two, but the general trend was that they looked exactly like the results above: atrocious. If there is some core of elites who are less biased than the general population, they are well past the 100 karma point and probably too rare to feel confident even detecting at this kind of a sample size.

I really have no idea what went so wrong. Last year's results were pretty good - encouraging, even. I wonder if it's just an especially bad question. Bayesian statistics is pretty new; one would expect Bayes to have been born in rather more modern times. It's also possible that I've handled the statistics wrong on this one; I wouldn't mind someone double-checking my work.

Or we could just be really horrible. If we haven't even learned to avoid the one bias that we can measure super well and which is most susceptible to training, what are we even doing here? Some remedial time at PredictionBook might be in order.

HYPOTHESIS TESTING:

I tested a very few of the possible hypotheses that were proposed in the survey design threads.

Are people who understand quantum mechanics more likely to believe in Many Worlds? We perform a t-test, checking whether one's probability of the MWI being true depends on whether or not one can solve the Schrodinger Equation. People who could solve the equation had on average a 54.3% probability of MWI, compared to 51.3% in those who could not. The p-value is 0.26; if there were no real difference between the groups, a gap at least this large would turn up by chance about 26% of the time. Therefore, we fail to establish that people's probability of MWI varies with understanding of quantum mechanics.
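
For readers who want to rerun this kind of comparison on the released data, here is a minimal sketch of a two-sample t statistic. It assumes Welch's unequal-variance version; the post does not say which variant was used, and the function name `welch_t` is hypothetical. It uses only the standard library, so it returns the statistic and degrees of freedom but no p-value (that needs the t distribution, e.g. `scipy.stats.ttest_ind(a, b, equal_var=False)`).

```python
import math
import statistics


def welch_t(a, b):
    """Return (t, df) for Welch's two-sample t-test on samples a and b."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Sample variances divided by sample sizes.
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Toy samples, not survey data.
t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(round(t, 3), round(df, 3))  # -1.897 5.882
```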

Are there any interesting biological correlates of IQ? We run a correlation between self-reported IQ, height, maternal age, and paternal age. The correlations are in the expected direction but not significant.

Are there differences in the ways men and women interact with the community? I had sort of vaguely gotten the impression that women were proportionally younger, newer to the community, and more likely to be referred via HPMOR. The average age of women on LW is 27.6 compared to 27.7 for men; obviously this difference is not significant. 14% of the people referred via HPMOR were women compared to about 10% of the community at large, but this difference is pretty minor. Women were on average newer to the community - 21 months vs. 39 for men - but to my surprise a t-test was unable to declare this significant. Maybe I'm doing it wrong?

Does the amount of time spent in the community affect one's beliefs in the same way as in previous surveys? I ran some correlations and found that it does. People who have been around longer continue to be more likely to believe in MWI, less likely to believe in aliens in the universe (though not in our galaxy), and less likely to believe in God (though not religion). There was no effect on cryonics this time.

In addition, the classic correlations between different beliefs continue to hold true. There is an obvious cluster of God, religion, and the supernatural. There's also a scifi cluster of cryonics, antiagathics, MWI, aliens, the Simulation Hypothesis, and catastrophic risk (this also seems to include global warming, for some reason).

Are there any differences between men and women with regard to their belief in these clusters? We run a t-test between men and women. Men and women have about the same probability of God (men: 5.9, women: 6.2, p = .86) and similar results for the rest of the religion cluster, but men have much higher beliefs in, for example, antiagathics (men: 24.3, women: 10.5, p < .001) and the rest of the scifi cluster.

DESCRIPTIONS OF LESS WRONG

Survey users were asked to submit a description of Less Wrong in 140 characters or less. I'm not going to post all of them, but here is a representative sample:

- "Probably the most sensible philosophical resource avaialble."
- "Contains the great Sequences, some of Luke's posts, and very little else."
- "The currently most interesting site I found ont the net."
- "EY cult"
- "How to think correctly, precisely, and efficiently."
- "HN for even bigger nerds."
- "Social skills philosophy and AI theorists on the same site, not noticing each other."
- "Cool place. Any others like it?"
- "How to avoid predictable pitfalls in human psychology, and understand hard things well: The Website."
- "A bunch of people trying to make sense of the wold through their own lens, which happens to be one of calculation and rigor"
- "Nice."
- "A font of brilliant and unconventional wisdom."
- "One of the few sane places on Earth."
- "Robot god apocalypse cult spinoff from Harry Potter."
- "A place to converse with intelligent, reasonably open-minded people."
- "Callahan's Crosstime Saloon"
- "Amazing rational transhumanist calming addicting Super Reddit"
- "Still wrong"
- "A forum for helping to train people to be more rational"
- "A very bright community interested in amateur ethical philosophy, mathematics, and decision theory."
- "Dying. Social games and bullshit now >50% of LW content."
- "The good kind of strange, addictive, so much to read!"
- "Part genuinely useful, part mental masturbation."
- "Mostly very bright and starry-eyed adults who never quite grew out of their science-fiction addiction as adolescents."
- "Less Wrong: Saving the world with MIND POWERS!"
- "Perfectly patternmatches the 'young-people-with-all-the-answers' cliche"
- "Rationalist community dedicated to self-improvement."
- "Sperglord hipsters pretending that being a sperglord hipster is cool." (this person's Autism Quotient was two points higher than LW average, by the way)
- "An interesting perspective and valuable database of mental techniques."
- "A website with kernels of information hidden among aspy nonsense."
- "Exclusive, elitist, interesting, potentially useful, personal depression trigger."
- "A group blog about rationality and related topics. Tends to be overzealous about cryogenics and other pet ideas of Eliezer Yudkowsky."
- "Things to read to make you think better."
- "Excellent rationality. New-age self-help. Worrying groupthink."
- "Not a cult at all."
- "A cult."
- "The new thing for people who would have been Randian Objectivists 30 years ago."
- "Fascinating, well-started, risking bloat and failure modes, best as archive."
- "A fun, insightful discussion of probability theory and cognition."
- "More interesting than useful."
- "The most productive and accessible mind-fuckery on the Internet."
- "A blog for rationality, cognitive bias, futurism, and the Singularity."
- "Robo-Protestants attempting natural theology."
- "Orderly quagmire of tantalizing ideas drawn from disagreeable priors."
- "Analyze everything. And I do mean everything. Including analysis. Especially analysis. And analysis of analysis."
- "Very interesting and sometimes useful."
- "Where people discuss and try to implement ways that humans can make their values, actions, and beliefs more internally consistent."
- "Eliezer Yudkowsky personality cult."
- "It's like the Mormons would be if everyone were an atheist and good at math and didn't abstain from substances."
- "Seems wacky at first, but gradually begins to seem normal."
- "A varied group of people interested in philosophy with high Openness and a methodical yet amateur approach."
- "Less Wrong is where human algorithms go to debug themselves."
- "They're kind of like a cult, but that doesn't make them wrong."
- "A community blog devoted to nerds who think they're smarter than everyone else."
- "90% sane! A new record!"
- "The Sequences are great. LW now slowly degenerating to just another science forum."
- "The meetup groups are where it's at, it seems to me. I reserve judgment till I attend one."
- "All I really know about it is this long survey I took."
- "The royal road of rationality."
- "Technically correct: The best kind of correct!"
- "Full of angry privilege."
- "A sinister instrument of billionaire Peter Thiel."
- "Dangerous apocalypse cult bent on the systematic erasure of traditional values and culture by any means necessary."
- "Often interesting, but I never feel at home."
- "One of the few places I truly feel at home, knowing that there are more people like me."
- "Currently the best internet source of information-dense material regarding cog sci, debiasing, and existential risk."
- "Prolific and erudite writing on practical techniques to enhance the effectiveness of our reason."
- "An embarrassing Internet community formed around some genuinely great blog writings."
- "I bookmarked it a while ago and completely forgot what it is about. I am taking the survey to while away my insomnia."
- "A somewhat intimidating but really interesting website that helps refine rational thinking."
- "A great collection of ways to avoid systematic bias and come to true and useful conclusions."
- "Obnoxious self-serving, foolish trolling dehumanizing pseudointellectualism, aesthetically bankrupt."
- "The cutting edge of human rationality."
- "A purveyor of exceedingly long surveys."

PUBLIC RELEASE

That last commenter was right. This survey had vastly more data than any previous incarnation; although there are many more analyses I would like to run I am pretty exhausted and I know people are anxious for the results. I'm going to let CFAR analyze and report on their questions, but the rest should be a community effort. So I'm releasing the survey to everyone in the hopes of getting more information out of it. If you find something interesting you can either post it in the comments or start a new thread somewhere.

The data I'm providing is the raw data EXCEPT:

- I deleted a few categories that I removed halfway through the survey for various reasons
- I deleted 9 entries that were duplicates of other entries, i.e. someone pressed 'submit' twice.
- I deleted the timestamp, which would have made people extra-identifiable, and sorted people by their CFAR random number to remove time order information.
- I removed one person whose information all came out as weird symbols.
- I numeralized some of the non-numeric data, especially on the number of months in community question. This is not the version I cleaned up fully, so you will get to experience some of the same pleasure I did working with the rest.
- I deleted 117 people who either didn't answer the privacy question or who asked me to keep them anonymous, leaving 1067 people.

Here it is: Data in .csv format, Data in Excel format

Comments (640)

Comment author: Cthulhoo 29 November 2012 08:21:20AM *  16 points [-]

Before even reading the full details, I want to congratulate you on the impressive amount of work. The survey period is possibly my favorite time of the year on lesswrong!

EDIT: The links for the raw csv/xls data at the bottom don't seem to work for me.

Comment author: Yvain 29 November 2012 09:00:26AM 5 points [-]

Thank you. That should be fixed now.

Comment author: Cthulhoo 29 November 2012 10:01:16AM 2 points [-]

It's indeed working, thank you!

Comment author: Qiaochu_Yuan 29 November 2012 08:24:25AM *  9 points [-]

When you discuss the calibration results, could you mention that the survey takers were told what constituted a correct answer? I didn't take the survey and it isn't obvious from reading this post. Also, could you include a plug for PredictionBook around there? You've included lots of other helpful plugs.

Comment author: Yvain 29 November 2012 08:30:04AM 5 points [-]

Done.

Comment author: Bugmaster 29 November 2012 08:49:30AM 19 points [-]
  • "Robot god apocalypse cult spinoff from Harry Potter."

That should be on a T-shirt.

Comment author: Nornagest 29 November 2012 08:56:42AM 3 points [-]

I think that's my favorite description on that list.

Comment author: Tenoke 29 November 2012 09:09:35AM 2 points [-]

I'd buy that shirt. This is instant classic.

Comment author: Tripitaka 30 November 2012 03:24:39PM 0 points [-]

http://www.spreadshirt.com/design-your-own-t-shirt-C59/product/103759664/view/1/sb/l I think it's a nice robot, but maybe some of our art-inclined people would like to design a robot god that's got a Harry-Potterish feel about it?

Comment author: thomblake 30 November 2012 04:01:58PM 1 point [-]

Spinoff is misspelled.

Comment author: Tripitaka 30 November 2012 04:17:15PM -1 points [-]
Comment author: Bugmaster 30 November 2012 04:39:04PM 1 point [-]

This link takes me to a blank T-shirt design UI...

Comment author: Bugmaster 30 November 2012 04:42:11PM 4 points [-]

I'm envisioning a robot in the classic Sistine Chapel God pose, only with menacingly glowing red eyes. Instead of pointing with its finger, it's holding a wand. There's a wizard hat on its head.

The image could be done in silhouette, for that extra-stylized look.

If I had any artistic skill, I'd draw it myself :-/

Comment author: Khoth 29 November 2012 08:59:14AM 19 points [-]

So I suggest that we now have pretty good, pretty believable evidence that the average IQ for this site really is somewhere in the 130s, and that self-reported IQ isn't as terrible a measure as one might think.

This still suffers from selection bias - I'd imagine that people with lower IQ are more likely to leave the field blank than people with higher IQ.

Comment author: Kindly 29 November 2012 08:17:36PM 2 points [-]

Indeed, more than 2/3 of responders left the field blank, so the real IQ could be pretty much anything.

Comment author: gwern 30 November 2012 02:13:41AM *  9 points [-]

This still suffers from selection bias - I'd imagine that people with lower IQ are more likely to leave the field blank than people with higher IQ.

I think this is only true if we're going to also assume that the selection bias is operating on ACT and SAT scores. But we know they correlate with IQ, and quite a few respondents included ACT/SAT1600/SAT2400 data while they didn't include the IQ; so all we have to do is take for each standardized test the subset of people with IQ scores and people without, and see if the latter have lower scores indicating lower IQs. The results seem to indicate that while there may be a small difference in means between the groups on the 3 scores, it's neither of large effect size nor statistical significance.

ACT:

R> lwa <- subset(lw, !is.na(as.integer(ACTscoreoutof36)))
R> lwiq <- subset(lwa, !is.na(as.integer(IQ)))
R> lwiqnot <- subset(lwa, is.na(as.integer(IQ)))
R> t.test(lwiq$ACTscoreoutof36, lwiqnot$ACTscoreoutof36, alternative="less")
        Welch Two Sample t-test
data:  lwiq$ACTscoreoutof36 and lwiqnot$ACTscoreoutof36
t = 0.5088, df = 141.9, p-value = 0.6942
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
   -Inf 0.7507
sample estimates:
mean of x mean of y
    32.68     32.50

Original SAT:

R> lwa <- subset(lw, !is.na(as.integer(SATscoresoutof1600)))
R> lwiq <- subset(lwa, !is.na(as.integer(IQ)))
R> lwiqnot <- subset(lwa, is.na(as.integer(IQ)))
R> t.test(lwiq$SATscoresoutof1600, lwiqnot$SATscoresoutof1600, alternative="less")
        Welch Two Sample t-test
data:  lwiq$SATscoresoutof1600 and lwiqnot$SATscoresoutof1600
t = -1.137, df = 237.4, p-value = 0.1284
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
  -Inf 6.607
sample estimates:
mean of x mean of y
     1476      1490

New SAT:

R> lwa <- subset(lw, !is.na(as.integer(SATscoresoutof2400)))
R> lwiq <- subset(lwa, !is.na(as.integer(IQ)))
R> lwiqnot <- subset(lwa, is.na(as.integer(IQ)))
R> t.test(lwiq$SATscoresoutof2400, lwiqnot$SATscoresoutof2400, alternative="less")
        Welch Two Sample t-test
data:  lwiq$SATscoresoutof2400 and lwiqnot$SATscoresoutof2400
t = -0.9645, df = 129.9, p-value = 0.1683
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
  -Inf 109.3
sample estimates:
mean of x mean of y
     2221      2374

The lack of variation is unsurprising since the (original) SAT and ACT are correlated, after all:

R> lwa <- subset(lw, !is.na(as.integer(ACTscoreoutof36)))
R> lwsat <- subset(lwa, !is.na(as.integer(SATscoresoutof1600)))
R> cor.test(lwsat$SATscoresoutof1600, lwsat$ACTscoreoutof36)
        Pearson's product-moment correlation
data:  lwsat$SATscoresoutof1600 and lwsat$ACTscoreoutof36
t = 8.839, df = 66, p-value = 8.415e-13
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.6038 0.8291
sample estimates:
   cor
0.7362
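
For readers who want to see what the Welch tests above compute, here is a minimal Python sketch using only the standard library. The two sample lists are made-up placeholders, not survey data, and `welch_t` is a hypothetical helper name. It returns the t statistic and the Welch-Satterthwaite degrees of freedom but no p-value, since that would need a t-distribution CDF.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    nx, ny = len(x), len(y)
    # Squared standard error contributed by each sample mean.
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df


# Placeholder scores (NOT survey data): one group reported an IQ, one did not.
with_iq = [1, 2, 3, 4, 5]
without_iq = [2, 4, 6, 8, 10]
t, df = welch_t(with_iq, without_iq)
print(f"t = {t:.4f}, df = {df:.2f}")
```

A negative t here means the first group's mean is lower than the second's, which is the direction `alternative="less"` looks for in the R calls.
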
Comment author: magfrump 01 December 2012 04:15:47AM 2 points [-]

I'm interested in this analysis but I don't think the results are presented nicely, and I am not THAT interested. If someone else wants to summarize the parent I promise to upvote you.

Comment author: gwern 01 December 2012 04:42:33AM 5 points [-]

I... thought I did summarize it nicely:

But we know they correlate with IQ, and quite a few respondents included ACT/SAT1600/SAT2400 data while they didn't include the IQ; so all we have to do is take for each standardized test the subset of people with IQ scores and people without, and see if the latter have lower scores indicating lower IQs. The results seem to indicate that while there may be a small difference in means between the groups on the 3 scores, it's neither of large effect size nor statistical significance.

Comment author: magfrump 01 December 2012 05:03:46AM 4 points [-]

That is actually better than I remembered immediately after reading it; with the data coming after the discussion my brain pattern-completed to expect a conclusion after the data. Also the paragraph is a little bit dense; a paragraph break before the last sentence might make it a little more readable in my mind.

I had already upvoted your post, regardless :)

Comment author: Tenoke 29 November 2012 09:07:52AM 15 points [-]

Some of the 'descriptions of LessWrong' can make for a great quote on the back of Yudkowsky's book.

Comment author: Pablo_Stafforini 29 November 2012 03:57:54PM 16 points [-]

Obnoxious self-serving, foolish trolling dehumanizing pseudointellectualism, aesthetically bankrupt.

;-)

Comment author: FiftyTwo 30 November 2012 12:25:42AM 10 points [-]

Pratchett always includes a quote that calls him a "complete amateur," so there is some precedent for ostentatiously including negative reviews.

Comment author: JenniferRM 29 November 2012 09:16:03AM 9 points [-]

Thank you for this public service. It seems definitely helpful for the community, and possibly helpful for historians :-)

Comment author: Kaj_Sotala 29 November 2012 12:11:43PM 13 points [-]

and possibly helpful for historians :-)

I now have this mental image of future sociology grad students working on their theses by reading through every article and comment ever posted on Less Wrong, and then analyzing us.

Comment author: dspeyer 29 November 2012 05:28:05PM 10 points [-]

I now have an image of those sociologists giving up on reading everything and writing scripts to do some sort of ngram or inverse-markov analysis, then mis-applying statistics to draw wrong conclusions from it. Am I cynical yet?

Comment author: magfrump 01 December 2012 04:17:06AM 3 points [-]

I now have an image of farther future sociologists writing scathing commentaries on the irony of poorly-used statistical measures of this community.

Comment author: Armok_GoB 29 November 2012 11:12:54PM 5 points [-]

I'm imagining them being vast posthumans with specialized modalities for it that can't really be called "reading".

Comment author: TsviBT 29 November 2012 09:47:07AM 4 points [-]

It could be that many people self-reported IQ based off of their SAT or ACT scores, which would explain away the correlation. How many people reported both SAT and ACT scores?

Comment author: gwern 29 November 2012 11:37:40PM 3 points [-]

You mean either of the SATs?

R> length(lw[(!is.na(lw$SATscoresoutof2400) | !is.na(lw$SATscoresoutof2400)) & !is.na(lw$ACTscoreoutof36),])
[1] 106
Comment author: FeepingCreature 29 November 2012 12:05:17PM *  12 points [-]

Or we could just be really horrible. If we haven't even learned to avoid the one bias that we can measure super well and which is most susceptible to training, what are we even doing here?

You're fun to read. Posts explaining things and introducing terms that connect subjects and form patterns trigger reward mechanisms in the brain. This is uncorrelated to actually applying any lessons in daily life.

Two questions you might want to ask next year are "do you think it is practical and advantageous to reduce people's biases via standardized exercises?" and "Has reading LW inspired you to try and reduce your own biases?"

Comment author: John_Maxwell_IV 29 November 2012 12:50:36PM 17 points [-]

But I am skeptical of these numbers. I hang out with some people who are very closely associated with the greater Less Wrong community, and a lot of them didn't know about the survey until I mentioned it to them in person. I know some people who could plausibly be described as focusing their lives around the community who just never took the survey for one reason or another. One lesson of this survey may be that the community is no longer limited to people who check Less Wrong very often, if at all. One friend didn't see the survey because she hangs out on the #lesswrong channel more than the main site. Another mostly just goes to meetups. So I think this represents only a small sample of people who could justly be considered Less Wrongers.

Yeah, this also fits my observations--I suspect that reading LW and hanging out with LW types in real life are substitute goods.

Comment author: [deleted] 29 November 2012 01:22:13PM 23 points [-]

I really have no idea what went so wrong [with the question about Bayes' birth year]

Note also that in the last two surveys the mean and median answers were approximately correct, whereas this time even the first quartile answer was too late by almost a decade. So it's not just a matter of overconfidence -- there also was a systematic error. Note that Essay Towards Solving a Problem in the Doctrine of Chances was published posthumously when Bayes would have been 62; if people estimated the year it was published and assumed that he had been approximately in his thirties (as I did), that would explain half of the systematic bias.

Comment author: Cakoluchiam 29 November 2012 09:38:23PM 1 point [-]

This question was biased against people who don't believe in history.

For my answer, which was wildly wrong, I guesstimated by interpolating backward using the rate of technological and cultural advance in various cultures throughout my lifetime, the dependency of such advances on Bayesian and related logics, with an adjustment for known wars and persecution of scientists and an assumption that Bayes lived in the western world. I should have realized that my confidence on estimates of each of these (except the last) was not very good and that I really shouldn't have had any more than marginal confidence in my answer, but I was hoping that the sheer number of assumptions I made would approach statistical mean of my confidences and that the overestimates would counterbalance the underestimates.

The real lesson I learned from this exercise is that I shouldn't have such high confidence in my ability to produce and compound a statistically significant number of assumptions with associated confidence levels.

Comment author: Manfred 30 November 2012 12:14:05AM 1 point [-]

Have you read Malcolm Gladwell - Blink? It's a fun book that doesn't take too long, which hella makes up for the occasional failure of rigor. Anyhow, the conclusion is that even on hard problems, expert-trusted models will still have very few parameters. And those parameters don't have to be the same things you'd use if you were a perfect reasoner - what's important is that you can use it as an indicator.

Comment author: Alejandro1 29 November 2012 10:31:24PM 2 points [-]

I had a vaguely right idea for the year of publication, and didn't know it was posthumous, but assumed that it was published in his middle-to-old age and so got the question right.

Comment author: magfrump 01 December 2012 04:21:40AM 0 points [-]

I personally had error bars of 75 years on my confidence and was 74 years off. I'm not sure if I translated that correctly into percent certainty of being within 20 years of correct, but I felt okay about the result. This might be another source of systematic error?

Comment author: CCC 29 November 2012 02:41:35PM 6 points [-]

This survey looks like it was a massive amount of work to analyse. Three cheers for Yvain!

Comment author: [deleted] 29 November 2012 03:00:50PM 5 points [-]

Note that people who take online psychometric tests are probably a pretty skewed category already so this tells us nothing. 

What? They calibrated the test using the people who took it online?

Comment author: gwern 29 November 2012 11:39:57PM 2 points [-]

I'm fairly sure the Big Five wasn't calibrated on an online sample, but I have no idea about iqtest.dk.

Comment author: Luke_A_Somers 29 November 2012 03:06:32PM 1 point [-]

Sweet. I was in the one correctly calibrated cohort - I knew just how slim my chances of being right were!

Comment author: [deleted] 29 November 2012 03:25:33PM 6 points [-]

These are all the countries with greater than 1% of Less Wrongers,

And they are exactly the non-write-in ones in the survey, except for New Zealand that was there and Poland that wasn't.

Comment author: somervta 30 November 2012 03:06:46AM 0 points [-]

New Zealand was 0.8, which is close enough to support your point IMO.

Comment author: [deleted] 29 November 2012 03:26:04PM *  7 points [-]

Yvain, I rechecked the calibration survey results, and encourage someone to recheck my recheck further:

First, these strata overlap... is 5 in 0-5 or 5-15? The N I get when I recheck doesn't actually match either one.

Secondly, I am not sure what program you used to calculate the statistics, but when I checked in excel, some people used percentages that got pulled as numbers less than one. I tried to clean that for these. (also removed someone who answered 150.)

Thirdly, there are 20 people in this N. You can be either 60% correct (12 correct), or 65% correct (13 correct), but 60.2% correct in this line seems weird. 85-95: 60.2% [n = 20]

Here was my attempt at recalculating those figures: N after data cleaning was 998.

0-<5: 9.1% [n = 2/22]

5-<15: 13.7% [n = 25/183]

15-<25: 9.3% [n = 21/226]

25-<35: 10% [n = 20/200]

35-<45: 11.1% [n = 10/90]

45-<55: 17.3% [n = 19/110]

55-<65: 20.8% [n = 11/53]

65-<75: 22.6% [n = 7/31]

75-<85: 36.7% [n = 11/30]

85-<95: 63.2% [n = 12/19]

95-100: 88.2% [n = 30/34]

I express low confidence in these remarks because I haven't rechecked this or gone into detail about data cleaning, but my brief take is:

1: Yes, there were some errors that made it look a bit worse than it was.

2: It still shows overconfidence. (Edit: see possible caveat below)

Question: Do we have enough data to determine if that hump at near 10% confidence that you are right is significant?

Edit: I'm not a statistician, but I do notice that substantially more respondents answered in the lower confidence ranges. I mean, yes, on average, the people who answered in the 55-<85 ranges were quite far off, but more people answered in the 15-<25 range than in all three of those groups put together.
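
A minimal sketch of the recount described above, assuming half-open bins so that an answer of exactly 5 falls in 5-<15 and nowhere else. The `(confidence, correct)` pairs are invented for illustration and `calibration_table` is a hypothetical name. Answers outside 0-100 are dropped, mirroring the removal of the 150 answer.

```python
def calibration_table(answers, edges=(0, 5, 15, 25, 35, 45, 55, 65, 75, 85, 95)):
    """Bin (confidence_percent, was_correct) pairs into half-open strata.

    Each bin is [lo, hi), except the last, which is closed: [95, 100].
    Returns {label: (n_correct, n_total)} for the non-empty bins.
    """
    table = {}
    for conf, correct in answers:
        if not 0 <= conf <= 100:
            continue  # e.g. the respondent who answered 150
        # Index of the last bin edge that is <= conf.
        i = max(k for k, lo in enumerate(edges) if lo <= conf)
        hi = edges[i + 1] if i + 1 < len(edges) else 100
        label = f"{edges[i]}-<{hi}" if hi != 100 else f"{edges[i]}-100"
        hits, total = table.get(label, (0, 0))
        table[label] = (hits + bool(correct), total + 1)
    return table


# Invented answers (NOT survey data).
sample = [(5, False), (4.9, False), (50, True), (95, True), (100, True), (150, True)]
for label, (hits, total) in calibration_table(sample).items():
    print(f"{label}: {100 * hits / total:.1f}% [n = {hits}/{total}]")
```

Reporting the raw `hits/total` next to each percentage makes impossible figures easy to spot: with n = 20 the only achievable percentages are multiples of 5.
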

Comment author: gwern 30 November 2012 01:36:57AM 5 points [-]

I think the calibration data needs additional cleaning. Eyeballing, I see % signs, decimals, and English comments.

Comment author: [deleted] 29 November 2012 03:31:24PM 6 points [-]

In the fair coin questions, there were two people answering 49.9, one 49.9999, one 49.999999, and one 51. :-/

Comment author: TrE 29 November 2012 07:40:46PM 0 points [-]

Were they excluded from the probabilities questions?

Comment author: Cakoluchiam 29 November 2012 10:03:09PM *  1 point [-]

It was stated that they should give the obvious answer and that surveys that didn't follow the rules would be thrown out... but maybe 50% isn't as obvious as 99.99% of the population thinks it is.

Is there any reason the prompt for the question shouldn't have explicitly stated "(The obvious answer is the correctly formatted value equivalent to p=0.5 or 50%)"?

Comment author: Eugine_Nier 01 December 2012 04:39:26AM 1 point [-]

My working theory is that they were trolling.

Comment author: Tripitaka 30 November 2012 10:54:11AM *  1 point [-]

Here is a paper which shows that natural coin tosses are not fair, with a 51:49 bias toward the side that's "up" at the beginning. Maybe ask for the probability on an idealized coin toss next year? edit: fixed the markup

Comment author: [deleted] 30 November 2012 11:17:29AM 5 points [-]

Certain tossing techniques can bias the results much more than that, as described in Probability Theory by Jaynes. But the survey did ask about a “fair coin” (emphasis added).

Comment author: dbaupp 30 November 2012 12:06:23PM *  3 points [-]

(For the [text](url) link syntax to work, you need the full URL, i.e. including the http:// bit at the start: http://comptop.stanford.edu/preprints/heads.pdf)

Comment author: jimrandomh 29 November 2012 03:48:00PM 26 points [-]

The calibration question is an n=1 sample on one of the two important axes (those axes being who's answering, and what question they're answering). Give a question that's harder than it looks, and people will come out overconfident on average; give a question that's easier than it looks, and they'll come out underconfident on average. Getting rid of this effect requires a pool of questions, so that it'll average out.

Comment author: Morendil 29 November 2012 06:26:32PM 8 points [-]

Yep. (Or as Yvain suggests, give a question which is likely to be answered with a bias in a particular direction.)

It's not clear what you can conclude from the fact that 17% of all people who answered a single question at 50% confidence got it right, but you can't conclude from it that if you asked one of these people a hundred binary questions and they answered "yes" at 50% confidence, that person would only get 17% right. The latter is what would deserve to be called "atrocious"; I don't believe the adjective applies to the results observed in the survey.

I'm not even sure that you can draw the conclusion "not everyone in the sample is perfectly calibrated" from these results. Well, the people who were 100% sure they were wrong, and happened to be correct, are definitely not perfectly calibrated; but I'm not sure what we can say of the rest.

Comment author: steven0461 29 November 2012 11:28:36PM *  1 point [-]

I would agree that this explains the apparent atrocious calibration. It's worth an edit to the main post. No reason to beat ourselves up needlessly.

People were answering different questions in the sense that they each had an interval of their own choosing to assign a probability to, but obviously different people's performance here was going to be strongly correlated. Bayes just happens to be the kind of guy who was born surprisingly early. If everyone had literally been asked to assign a probability to the exact same proposition, like "Bayes was born before 1750" or "this coin will come up heads", that would have been a more extreme case. We'd have found that events that people predicted with probability x% actually happened either 0% or 100% of the time, and it wouldn't mean people were infinitely badly calibrated.
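
A toy simulation of this point, under assumptions that are mine rather than the survey's: every respondent is perfectly calibrated at 70%, and in the pooled case each answer concerns an independent proposition. The function names are hypothetical.

```python
import random


def shared_question_hit_rate(n_people, p, rng):
    """Everyone assigns probability p to the SAME proposition."""
    outcome = rng.random() < p  # a single draw decides it for everybody
    return sum(outcome for _ in range(n_people)) / n_people


def pooled_hit_rate(n_answers, p, rng):
    """Each answer concerns a different, independent proposition."""
    return sum(rng.random() < p for _ in range(n_answers)) / n_answers


rng = random.Random(0)  # fixed seed so reruns give the same numbers
print("shared question:", shared_question_hit_rate(1000, 0.7, rng))
print("independent pool:", pooled_hit_rate(20000, 0.7, rng))
```

The shared-question hit rate can only be 0.0 or 1.0 however well calibrated the respondents are, while the pooled rate settles near 0.7.
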

Comment author: [deleted] 30 November 2012 12:12:00AM -1 points [-]

All of that also applies to the year calibration questions in previous surveys and yet people did much better in those.

Comment author: steven0461 30 November 2012 12:48:15AM 4 points [-]

Because they weren't about events that occurred surprisingly early.

Comment author: gwern 29 November 2012 05:53:17PM 10 points [-]

The 2011 survey ran 33 days and collected 1090 responses. This year's survey ran 23 days and collected 1195 responses.

Why did you close it early? That seems entirely unnecessary.

One friend didn't see the survey because she hangs out on the #lesswrong channel more than the main site.

I put a link and exhortation prominently in the #lesswrong topic from the day the survey opened to the day it closed.

M (trans f->m): 3, 0.3% / F (trans m->f): 16, 1.3%

3 vs 16 seems like quite a difference, even allowing for the small sample size. Is this consistent with the larger population?

Prefer polyamorous: 155, 13.1%...NUMBER OF CURRENT PARTNERS:... [>1 partners = 4.5%]

So ~3x more people prefer polyamory than are actually engaged in it...

Referred by HPMOR: 262, 22.1%

Impressive.

gwern.net: 5 people

Woot! And I'm not even trying or linking LW especially often.

(I am also pleased by the nicotine and modafinil results, although you dropped a number in 'Never: 76.5%')

TROLL TOLL POLICY: Disapprove: 194, 16.4% Approve: 178, 15%

So more people are against than for. Not exactly a mandate for its use.

Are people who understand quantum mechanics more likely to believe in Many Worlds? We perform a t-test, checking whether one's probability of the MWI being true depends on whether or not one can solve the Schrodinger Equation. People who could solve the equation had on average a 54.3% probability of MWI, compared to 51.3% in those who could not. The p-value is 0.26; there is a 26% probability this occurs by chance. Therefore, we fail to establish that people's probability of MWI varies with understanding of quantum mechanics.

Sounds like you did a two-tailed test. shminux's hypothesis, which he has stated several times IIRC, is that people who can solve it will not be taken in by Eliezer's MWI flim-flam, as it were, and would be less likely to accept MWI. So you should've been running a one-tailed t-test to reject the hypothesis that the can-solvers are less MWI'd. The p-value would then be something like 0.13 by symmetry.
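
The symmetry argument can be sketched in a few lines of Python. The helper name is hypothetical, and it assumes a symmetric test statistic such as t. The 0.26 is the two-tailed p-value quoted above.

```python
def one_tailed_p(two_tailed_p, effect_in_predicted_direction):
    """Convert a two-tailed p-value from a symmetric test to a one-tailed one.

    If the observed difference points the way the alternative hypothesis
    predicts, the one-tailed p is half the two-tailed p; otherwise it is
    one minus that half.
    """
    half = two_tailed_p / 2
    return half if effect_in_predicted_direction else 1 - half


# Can-solvers averaged 54.3% vs 51.3%, so the observed difference is positive.
print(one_tailed_p(0.26, True))   # alternative: can-solvers assign MORE to MWI
print(one_tailed_p(0.26, False))  # alternative: can-solvers assign LESS to MWI
```

Which of the two values applies depends entirely on which direction is written down as the alternative before looking at the data.
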

Comment author: gwillen 29 November 2012 07:54:25PM 10 points [-]

So ~3x more people prefer polyamory than are actually engaged in it...

I would not describe this as an accurate conclusion. For one thing, I currently have one partner who has other partners, so I think I am unambiguously "currently engaged in polyamory" even though I would have put 1 on the survey.

For another, I think it is reasonable to say that someone who is in a relationship with exactly one other person, but is not monogamous with that person (i.e. is available to enter further relationships) is engaged in polyamory.

Comment author: gwern 29 November 2012 08:54:14PM 6 points [-]

Do you think your situation explains 2/3s of those who prefer polyamory?

Comment author: gwillen 29 November 2012 09:13:28PM 0 points [-]

Well, I think you can probably break it down as follows, given just the data we have:

  • 0 partners
  • 1 partner, looking
  • 1 partner, not looking
  • 2 partners+

Of those, I would say the second and fourth are unambiguously practicing poly, the third could go either way but you could say is presumptively mono, and the first probably doesn't count (since they are actively practicing neither mono nor poly.)

If someone wants to run those numbers, I'd be curious how they come out.

Comment author: gwern 29 November 2012 09:28:50PM 2 points [-]

The second could be people looking for replacements for their current partner, no? I wouldn't call that unambiguous.

Comment author: JoeW 29 November 2012 09:46:12PM 1 point [-]

TL;DR - I think it's not that simple.

Opinion is divided as to whether poly is an orientation or a lifestyle (something one is vs. something one does).

i.e. saying someone with no partners is practising neither mono nor poly is like saying someone with no partners is not currently engaged in homo-/bi-/hetero-sexuality. (However I would accept a claim that they were engaged in asexuality.)

Comment author: thomblake 29 November 2012 09:53:33PM 3 points [-]

This is a good point.

I wonder if it's worth even making the distinction between "lifestyle" and "act". Thus, poly could be an orientation ("I'm not poly because I don't want multiple partners"), lifestyle ("I'm not poly because I don't have and I'm not actively seeking multiple partners"), and act ("I'm not poly because I don't currently have multiple partners").

I used to always use the "act" definition when discussing sexual orientation ("I don't have one - I haven't had sex with anyone lately") to the confusion of all interlocutors.

Comment author: JoeW 29 November 2012 10:18:56PM 4 points [-]

Heh, in fact I started but then deleted as a derail some discussion of problems in activist and academic discussions of sexual orientation - what are we to make of someone whose claimed orientation (identification) does not match their current and past behaviour, which might in turn be different again to their stated actual preferences.

I'm not current in my academic reading of sexuality, but when I was, anyone researching from a public health perspective went with behaviour, while psychologists and sociologists were split between identification and preference.

Queer activism seems to have generally gone with identification as primary, although I'm not as current there as I used to be. The trumping argument there was actually precisely your situation, where to accept behaviour as primary meant that no virgins had any orientation, and that does not agree with our intuitions or most people's personal experiences.

There's also a bi-activism point which says that position means the only "true" bisexuals are people engaged in mixed-gender group sex. (This is intended as reductio ad absurdum but I've heard people use it seriously.)

Poly seems to be more complicated still, q.v. distinctions between swinging, "monogamish", open relationships, polyfidelity and polyamory. I know multiple examples of dyadic couples who regularly have sex with other people but identify as monogamous, and of couples who aren't currently involved with anyone else, aren't looking, but are firm in their poly identification.

I guess my TL;DR is that I'm entirely untroubled by an apparent difference between preference and practice, and if the survey had asked similar questions about sexual orientation preference & practice, we would have seen "discrepancies" there too.

Comment author: Cakoluchiam 29 November 2012 10:12:21PM *  1 point [-]

I don't agree that the first doesn't count. The Relationship Style question was about preferred style, not current active situation. It could be that 2/3 of the polyamorous people just can't get a date (lord knows I've been there). (ETA:) Or, in the case of not looking, don't want a date right now (somewhere I've also been).

Comment author: DaFranker 29 November 2012 10:15:21PM *  1 point [-]

It could just be that 2/3 of the polyamorous people just can't get a date (lord knows I've been there).

I'm in the "no preference" camp, not the poly specifically, but I'm certainly there. LessWrong does seem to indirectly filter for people who are there, simply because people who aren't are less likely to take an interest in things that would lead them to LW, IME.

Comment author: DaFranker 29 November 2012 09:03:06PM *  1 point [-]

3 vs 16 seems like quite a difference, even allowing for the small sample size. Is this consistent with the larger population?

Might be close enough to assume it's due to the small sample:

Recent statistics from the Netherlands indicate that about 1 in 12,000 natal males undergo sex-reassignment and about 1 in 34,000 natal females. Source: Transgender Issues: A Fact Sheet

No idea how reliable those numbers are, nor how they compare with elsewhere in the world. The main website that hosts that PDF should have more complete data that could be cross-referenced, if someone wants to take the time to do that.

Comment author: thomblake 29 November 2012 10:15:32PM 1 point [-]

Interesting. Going to the source of some of those numbers, it doesn't look like there was clear specification of what they meant by "sexual orientation", so that line of the chart is actually entirely meaningless to me. Anyone have a good guess as to how people would have answered?

Comment author: DaFranker 29 November 2012 10:26:46PM *  1 point [-]

AFAICT It seems to be answered in terms of the sex of their partners post-transition, i.e. a hetero MTF would prefer sexually-male partners.

The fact that the 59% stat for history of rape is symmetrical for MTF and FTM really bugs me, though. It seems to imply weird causal arrows pointing in completely opposite directions depending on whether you were originally male or female, based on my prior knowledge.

Which seems very scary, because it could also imply that MTFs are a dozen decibels more likely to be targets of rape than average females. Now I wonder if that has been taken into account when looking at the mental health stats.

Comment author: thomblake 30 November 2012 02:41:23PM 1 point [-]

Yeah, somewhere in there are some pretty disturbing violent crime stats. A notable proportion of violent crime in one country was towards trans people.

Comment author: NancyLebovitz 30 November 2012 06:32:07PM 1 point [-]
Comment author: thomblake 29 November 2012 09:07:21PM 2 points [-]

3 vs 16 seems like quite a difference, even allowing for the small sample size. Is this consistent with the larger population?

As I understand it, there isn't good data. Stereotypically, there are more MtF than FtM. But according to Wikipedia, a Swedish study found a ratio of 1.4:1 in favor of MtF for those requesting sexual reassignment surgery, and 1:1 for those going through with it. Of course, this is the sort of Internet community where I'd expect some folks to identify as trans without wanting to go through surgery at all.

Comment author: gwern 29 November 2012 09:23:18PM *  10 points [-]

After I posted my comment, I realized that 3 vs 16 might just reflect the overall gender ratio of LW: if there's no connection between that stuff and finding LW interesting (a claim which may or may not be surprising depending on your background theories and beliefs), then 3 vs 16 might be a smaller version of the larger gender sample of 120 vs 1057. The respective decimals are 0.1875 and 0.1135, which is not dramatic-looking. The statistics for whether membership differs between the two pairs:

R> M <- as.table(rbind(c(120, 1057), c(3,16)))
R> dimnames(M) <- list(status=c("c","t"), gender=c("M","F"))
R> M
      gender
status   M    F
     c 120 1057
     t   3   16
R> chisq.test(M, simulate.p.value = TRUE, B = 20000000)
        Pearson's Chi-squared test with simulated p-value (based on 2e+07 replicates)
data:  M
X-squared = 0.6342, df = NA, p-value = 0.4346

(So it's not even close to the usual significance level. As intuitively makes sense: remove or add one person in the right category, and the ratio changes a fair bit.)
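
For anyone without R handy, the same Pearson statistic can be reproduced with the Python standard library. This is a sketch: `chi_square_2x2` is a hypothetical helper, and the p-value it returns is the asymptotic one for 1 degree of freedom rather than the simulated p-value R reports above, so the two need not agree exactly.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_asymptotic = erfc(sqrt(stat / 2))
    return stat, p_asymptotic


# Counts from the table above: rows are c/t, columns as labelled there.
stat, p = chi_square_2x2(120, 1057, 3, 16)
print(f"X-squared = {stat:.4f}, asymptotic p = {p:.3f}")
```

With an expected count below 2 in one cell, the asymptotic approximation is shaky, which is a reasonable motive for simulating the p-value as the R call does.
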

Comment author: thomblake 29 November 2012 09:35:39PM 8 points [-]

After I posted my comment, I realized that 3 vs 16 might just reflect the overall gender ratio of LW

Now I feel dumb for not even noticing that. "In a group where most people were born males, why is it the case that most trans people were born males?" doesn't even seem like a question.

Comment author: DaFranker 29 November 2012 09:51:34PM 0 points [-]

Haha, that's a great way to look at it. Had skipped over this myself too!

Now it makes me wonder which would be more significant between this and the apparent prominence of M->F over F->M that I just read some stats about (if the stats are true/reliable, 0.7 conf there).

Comment author: thomblake 29 November 2012 09:58:02PM 0 points [-]

I just read some stats about

link?

Comment author: DaFranker 29 November 2012 10:09:15PM 1 point [-]

Oh, heh, sorry.

I mentioned them in a different subthread around here. The linked PDF has a few fun numbers, but didn't notice any obvious dates or timelines. The main website hosting it has a bit more data and references from what little I looked into.

Comment author: TorqueDrifter 29 November 2012 10:14:27PM *  10 points [-]

Under this theory, it seems (with low statistical confidence of course) that LW-interest is perhaps correlated with biological sex rather than gender identity, or perhaps with assigned-gender-during-childhood. Which is kind of interesting.

Comment author: Emile 30 November 2012 12:59:14PM 6 points [-]

Does anybody know if this holds for other preferences that tend to vary heavily by gender? Are MtoF transsexuals heavily into say programming, or science fiction? (I know of several transsexual game developers/designers, all MtoF).

Comment author: TorqueDrifter 30 November 2012 08:43:38PM *  3 points [-]

I don't know of any such data. I'd imagine that there's less of a psychological barrier to engaging in traditionally "gendered" interests for most transgendered people (that is, if you think a lot about gender being a social construct, you're probably going to care less about a cultural distinction between "tv shows for boys" and "tv shows for girls"). Beyond that I can't really speculate.

Edit: here's me continuing to speculate anyway. A transgendered person is more likely than a cisgendered person to have significant periods of their life in which they are perceived as having different genders, and therefore is likely to be more fully exposed to cultural expectations for each.

Comment author: thomblake 30 November 2012 09:22:26PM 4 points [-]

FWIW, I have the opposite intuition. Transgendered people (practically by definition) care about gender a lot, so presumably would care more about those cultural distinctions.

Contrast the gender skeptic: "What do you mean, you were assigned male but are really female? There's no 'really' about it - gender is just a social construct, so do whatever you want."

Comment author: [deleted] 01 December 2012 12:21:57AM 8 points [-]

It's more complicated than that. Gender nonconformity in childhood is frequently punished, so a great many trans people have some very powerful incentives to suppress or constrain our interests early in life, or restrict our participation in activities for which an expressed interest earns censure or worse.

Pragmatically, gender is also performed, and there are a lot of subtle little things about it that cisgender people don't necessarily have innately either, but which are learned and transmitted culturally, many of which are the practical aspects of larger stuff (putting on makeup and making it look good is a skill, and it consists of lots of tiny subskills). Due to the aforementioned process, trans people very frequently don't get a chance to acquire those skills during the phase when their cis counterparts are learning them, or face more risks for doing so.

Finally, at least in the West: Trans medical and social access were originally predicated on jumping through an awful lot of very heteronormative hoops, and that framework still heavily influences many trans communities, particularly for older folks. This aspect is changing much faster thanks to the internet, but you still only need to go to the right forum or support group to see this dynamic in action. There's a lot of gender policing, and some subsets of the community who basically insist on an extreme version of this framing as a prerequisite for "authentic" trans identity.

So...when a trans person transitions, very often they are coping with some or all of this, often for the first time, simultaneously, and within a short time frame. We're also under a great deal of pressure about all of it.

"What do you mean, you were assigned male but are really female? There's no 'really' about it - gender is just a social construct, so do whatever you want."

Relevant: http://xkcd.com/592/

Comment author: [deleted] 01 December 2012 12:10:01AM 2 points [-]

It's a common inside joke amongst SF-loving, programmer trans women that there are a lot of SF-loving, programmer trans women, or that trans women are especially and unusually common in those fields. But they usually don't socialize with large swathes of other trans women who come unsorted by any other criterion save "trans and women"; I think this is an availability bias coupled with a bit of "I've found my tribe!" thinking.

Comment author: DaFranker 29 November 2012 10:03:35PM 2 points [-]

Hmm. Thanks for the link to that wikipedia page. Interesting...

...the definitions given on that wikipedia page seem to imply that I'm strongly queer and/or andro*, at least in terms of my experiences and gender-identity. Had never noticed nor cared (which, apparently, is a component of some variants of andro-somethings). I'm (very visibly) biologically male and "identify" (socially) as male for obvious reasons (AKA don't care if miscategorized, as long as the stereotyping isn't too harmful), and I'm attracted mostly to females because of instinct (I guess?) and practical issues (e.g. disdain of anal sex).

Oh well, one more thing to consider when trying to figure out why people get confused by my behaviors. I've always (in recent years anyway) thought of myself as "human with penis".

Comment author: [deleted] 01 December 2012 12:06:35AM 11 points [-]

I'm attracted mostly to females because of instinct (I guess?) and practical issues (e.g. disdain of anal sex).

If you can't think of practical ways for two people with penises to have sex that don't involve anal, you might just need better porn.

Comment author: thomblake 29 November 2012 09:10:32PM 4 points [-]

So ~3x more people prefer polyamory than are actually engaged in it...

I wonder, if you split out poly/mono preference and number of partners, whether the number who prefer poly but have <2 partners would be significantly different from the number who prefer mono but have <1 partner.

Now that I've wondered this out loud, I feel like I should have just asked a computer.

Comment author: DaFranker 29 November 2012 09:17:22PM 5 points [-]

I was about to reply the same thing. The quoted statement doesn't sound particularly more surprising than "Most people prefer to be in a relationship, but only a fraction of those are actually engaged in one".

Comment author: Kindly 29 November 2012 09:30:07PM 4 points [-]

Would it be more surprising to find people that prefer poly relationships, but only have one partner and aren't looking for more, than to find people that prefer mono relationships, but have no partners and aren't looking for any?

Among those with firm mono/poly preferences, there are 15% of the former (24% if we also include people that prefer poly, have no partners, and aren't looking for more) and 14% of the latter.

Comment author: Kindly 29 November 2012 09:33:30PM 3 points [-]

Also, roughly 2/7 of people that prefer poly are single, while roughly 3/7 of people that prefer mono are.

Comment author: thomblake 29 November 2012 09:37:00PM 2 points [-]

Thanks, computer!

Comment author: Kindly 29 November 2012 09:42:31PM *  1 point [-]

Oh, I forgot to answer your actual question. Slightly over 2/3 of people that prefer poly have 0 or 1 partners.

Edit: Although I guess this much was evident from the data if we assume that people that prefer mono won't have 2 or more partners. I guess the group that doesn't have a firm mono/poly preference (which I ignored entirely) could confuse things a bit.
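For anyone who wants to redo this split on the downloadable data, here is a minimal sketch of the cross-tabulation in R. The column names `pref` and `partners` and the seven rows are invented for illustration; they are not the survey's actual field names or responses.

```r
# Toy data only: 'pref' and 'partners' are hypothetical column names,
# and these rows are made up, not taken from the survey.
toy <- data.frame(
  pref     = c("Prefer poly", "Prefer poly", "Prefer poly",
               "Prefer mono", "Prefer mono", "Prefer mono", "Prefer mono"),
  partners = c(0, 1, 3, 0, 1, 1, 0)
)

# Counts of partner number within each stated preference
tab <- table(toy$pref, toy$partners)

# Row-wise shares: what fraction of each preference group has 0, 1, ... partners
share <- prop.table(tab, margin = 1)

# Fraction of poly-preferring respondents with fewer than 2 partners
poly_under2 <- mean(toy$partners[toy$pref == "Prefer poly"] < 2)

# Fraction of mono-preferring respondents with no partner
mono_single <- mean(toy$partners[toy$pref == "Prefer mono"] == 0)

print(round(share, 2))
```

Respondents without a firm mono/poly preference would need to be filtered out first, as noted above.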

Comment author: thomblake 29 November 2012 09:57:01PM 0 points [-]

Also, roughly 2/7 of people that prefer poly are single, while roughly 3/7 of people that prefer mono are.

Slightly over 2/3 of people that prefer poly have 0 or 1 partners.

So, people that prefer mono are more likely to have their preferred number of partners, but people who prefer poly have more partners.

Comment author: DaFranker 29 November 2012 09:43:59PM 1 point [-]

Would it be more surprising (...)?

Not by that much, but yes, I suppose a tad more.

Thanks for clearing this up.

Comment author: Cakoluchiam 29 November 2012 10:23:55PM *  4 points [-]

TROLL TOLL POLICY: Disapprove: 194, 16.4% Approve: 178, 15%

So more people are against than for. Not exactly a mandate for its use.

Hypothesis: those directly affected by the troll policy (trolls) are more likely to have strong disapproval than those unaffected by the troll policy are to have strong approval.

In my opinion, a strong moderation policy should require a plurality vote in the negative (over approval and abstention) to fail a motion to increase security, rather than a direct comparison to the approval. (withdrawn as it applies to LW, whose trolls are apparently less trolly than other sites I'm used to)

Comment author: gwern 29 November 2012 11:15:29PM *  16 points [-]

Hypothesis: those directly affected by the troll policy (trolls) are more likely to have strong disapproval than those unaffected by the troll policy are to have strong approval.

Hypothesis rejected when we operationalize 'trolls' as 'low karma':

R> lwtroll <- lw[!is.na(lw$KarmaScore),]
R> lwtroll <- lwtroll[lwtroll$TrollToll=="Agree with toll" | lwtroll$TrollToll=="Disagree with toll",]
R> # disagree=3, agree=2; so:
R> # if positive correlation, higher karma associates with disagreement
R> # if negative correlation, higher karma associates with agreement
R> # we are testing hypothesis higher karma = lower score/higher agreement
R> cor.test(as.integer(lwtroll$TrollToll), lwtroll$KarmaScore, alternative="less")
        Pearson's product-moment correlation
data:  as.integer(lwtroll$TrollToll) and lwtroll$KarmaScore
t = 1.362, df = 315, p-value = 0.9129
alternative hypothesis: true correlation is less than 0
95 percent confidence interval:
 -1.0000  0.1679
sample estimates:
    cor
0.07653
R> # a log-transform of the karma scores doesn't help:
R> cor.test(as.integer(lwtroll$TrollToll), log1p(lwtroll$KarmaScore), alternative="less")
        Pearson's product-moment correlation
data:  as.integer(lwtroll$TrollToll) and log1p(lwtroll$KarmaScore)
t = 2.559, df = 315, p-value = 0.9945
alternative hypothesis: true correlation is less than 0
95 percent confidence interval:
 -1.0000  0.2322
sample estimates:
   cor
0.1427
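A small self-contained sketch of the same kind of test, for readers without the survey file. The six rows below are toy values, not survey data. It shows two things the session above relies on: `as.integer()` on a factor returns level codes (so the sign of the correlation depends on level order), and a one-sided `cor.test` in the "wrong" direction gives a p-value above 0.5.

```r
# Toy data only; these values are invented for illustration.
toy <- data.frame(
  TrollToll  = factor(c("Agree with toll", "Disagree with toll",
                        "Agree with toll", "Disagree with toll",
                        "Disagree with toll", "Agree with toll")),
  KarmaScore = c(10, 250, 40, 900, 120, 5)
)

# Level codes follow the order of levels(): here Agree = 1, Disagree = 2.
# (In the survey data the codes were 2 and 3 because other levels existed;
# shifting the codes by a constant does not change the correlation.)
codes <- as.integer(toy$TrollToll)

# H1: higher karma goes with agreement, i.e. a negative correlation
one_sided <- cor.test(codes, toy$KarmaScore, alternative = "less")
two_sided <- cor.test(codes, toy$KarmaScore)

print(one_sided$estimate)  # positive in this toy example
print(one_sided$p.value)   # above 0.5, since the data point the other way
```

When the observed correlation is positive, the two-sided p-value is twice the complement of this one-sided p-value.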

Plots of the scores, regular and log-transformed:

<code>plot(lwtroll$TrollToll, lwtroll$KarmaScore)</code>

<code>plot(lwtroll$TrollToll, log1p(lwtroll$KarmaScore))</code>

Comment author: Cakoluchiam 29 November 2012 11:25:52PM 15 points [-]

If this were anywhere but a site dedicated to rationality, I would expect trolls to self-report their karma scores much higher on a survey than they actually are, but that data is pretty staggering. I accept the rejection of the hypothesis, and withdraw my opinion insofar as it applies to this site.

Comment author: Yvain 30 November 2012 04:29:26AM 1 point [-]

Sounds like you did a two-tailed test. shminux's hypothesis, which he has stated several times IIRC, is that people who can solve it will not be taken in by Eliezer's MWI flim-flam, as it were, and would be less likely to accept MWI. So you should've been running a one-tailed t-test to reject the hypothesis that the can-solvers are less MWI'd. The p-value would then be something like 0.13 by symmetry.

Yes, but I imagined someone like Eliezer might have the hypothesis that the math naturally leads to MWI and rationalists who understood the math would realize that.
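The "0.13 by symmetry" point can be checked with a quick sketch in R. The two vectors are made-up probabilities, not survey responses; the only claim illustrated is that a one-tailed t-test in the direction of the observed difference reports half the two-tailed p-value.

```r
# Toy MWI probabilities (percent); invented values, not survey data.
can_solve    <- c(60, 55, 70, 40, 65)
cannot_solve <- c(50, 45, 60, 55, 35)

# Two-tailed Welch t-test: is there any difference in means?
p_two <- t.test(can_solve, cannot_solve)$p.value

# One-tailed test in the direction actually observed (can_solve higher)
p_one <- t.test(can_solve, cannot_solve, alternative = "greater")$p.value

print(c(two_tailed = p_two, one_tailed = p_one))
```

Applied to the figure in the post, a two-tailed p of 0.26 corresponds to a one-tailed p of about 0.13.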

Comment author: [deleted] 30 November 2012 11:29:26AM 0 points [-]

3 vs 16 seems like quite a difference, even allowing for the small sample size. Is this consistent with the larger population?

Well, there also are nine times as many male-born males as female-born females, for that matter.

Comment author: gwern 30 November 2012 04:12:06PM 0 points [-]
Comment author: NancyLebovitz 29 November 2012 07:14:05PM 11 points [-]

If we haven't even learned to avoid the one bias that we can measure super well and which is most susceptible to training, what are we even doing here?

This sounds like a job for cognitive psychology!

"Well-calibrated" should probably be improved to "well-calibrated about X"-- it's plausible that people have better and worse calibration about different subjects, and the samples in the survey only explored a tiny part of calibration space.

Comment author: Aharon 29 November 2012 07:18:46PM 34 points [-]

Hi Yvain,

please state a definite end date next year. Filling out the survey didn't have a really high priority for me, but knowing that I had "about a month" made me put it off. Had I known that the last possible day was the 26th of November, I probably would have fit it in sometime in between other stuff.

Comment author: John_Maxwell_IV 30 November 2012 09:08:56AM 5 points [-]

Hm, could it be that the longer survey format this time around cut down on the number of responses as well?

Comment author: [deleted] 30 November 2012 11:56:51AM 0 points [-]

The 2011 survey ran 33 days and collected 1090 responses. This year's survey ran 23 days and collected 1195 responses.

So, what "cut down on the number of responses"?

Comment author: siodine 29 November 2012 07:33:39PM 3 points [-]

How well calibrated were the prediction book users?

Comment author: ChristianKl 29 November 2012 08:00:40PM 2 points [-]

Unfortunately we lacked a question to track prediction book users.

Comment author: siodine 29 November 2012 08:09:01PM *  0 points [-]

Hopefully then someone will do a supplementary calibration test for prediction book users in the comments here or in a new post on the discussion board. (Apologies for not doing it myself)

Comment author: gwern 29 November 2012 08:42:01PM 3 points [-]

http://predictionbook.com/predictions displays an overall graph.

Comment author: gwern 29 November 2012 07:37:32PM 1 point [-]

By the way, has anyone figured out how to load the CSV in R? read.table chokes with problems like "Error in read.table("for_public.csv", header = TRUE, quote = "", sep = ",") : more columns than column names"; dos2unix doesn't help, and loading it up in OpenOffice, it looks fine but re-exporting as CSV leads to the same darn errors. Mucking about, I can get it to load if I delete everything but the first entry and then the last 4 commas but this solution doesn't work for any additional entries!

Comment author: EvelynM 29 November 2012 07:43:20PM 5 points [-]

This worked "dat <- read.csv('http://raikoth.net/Stuff/LessWrong/for_public.csv')"

Comment author: gwern 29 November 2012 08:02:11PM 0 points [-]

That works, thanks.

Comment author: ChristianKl 29 November 2012 09:03:47PM 2 points [-]

Another R issue. How do I convert the scores from IQTest into real numbers? as.numeric seems to do strange things.

Comment author: EvelynM 29 November 2012 09:23:21PM 0 points [-]

"iqt <- as.numeric(dat$IQTest)" The already numeric IQ is in dat$IQ, iqt is only the suspect, online IQtest.

Comment author: Matt_Simpson 29 November 2012 09:43:21PM *  6 points [-]

use "as.numeric(as.character(dat$IQTest))"

The IQTest data is stored as a factor. A factor variable has a set of levels, numbered 1, 2, 3, ..., which are the values the variable can possibly take on, and labels for those levels. as.numeric(X) returns the level numbers of X. as.character(X) returns the labels of X. In the case that the labels are actually numbers (usually integers that R is interpreting as character labels for some reason), as.numeric(as.character(X)) will return the numeric values that R is interpreting as labels.

EDIT:

In this case, when no value for IQtest was reported, it was stored as " " instead of "", which made R think the variable contained character data which R defaults to treating as factors. The " "'s should all be NA's once it's converted properly.
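A minimal sketch of the difference, using a made-up four-element factor rather than the real `dat$IQTest` column:

```r
# Toy stand-in for a column read in as a factor because of a " " entry.
# These values are invented for illustration.
iq <- factor(c("130", " ", "115", "140"))

# Level codes: positions in levels(iq), not the scores themselves
codes <- as.numeric(iq)

# Labels converted to numbers; the blank entry becomes NA
scores <- suppressWarnings(as.numeric(as.character(iq)))

print(levels(iq))
print(codes)
print(scores)
```

Passing `stringsAsFactors = FALSE` to `read.csv` avoids the factor step in the first place, though the column would then still need `as.numeric`.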

Comment author: Douglas_Knight 29 November 2012 09:59:57PM *  1 point [-]

If you really did quote="", then you don't have any quote character and it won't work. But that's probably some kind of markdown error. The default in read.table is to allow both double and single quotes, while the default in read.csv is to allow only double quotes; I find that if I change your argument to quote="\"" to allow only double quotes, then it reads it with no errors. Another difference between read.table and read.csv is that read.table defaults to allowing # comments, which mangles one of the lines. This can be fixed with comment.char="", at which point read.csv and read.table produce the same result.
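A small sketch of the two calls side by side, run against a throwaway three-line file rather than for_public.csv. The file contents are invented; they just include an apostrophe and a # inside quoted fields, the two characters that matter for the `quote` and `comment.char` defaults.

```r
# Write a tiny CSV to a temporary file (made-up contents for illustration)
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  'id,comment',
  '1,"don\'t know"',
  '2,"issue #3, maybe"'
), tmp)

# read.csv: double quotes only, no comment character, by default
a <- read.csv(tmp, stringsAsFactors = FALSE)

# read.table made to behave the same way via explicit arguments
b <- read.table(tmp, header = TRUE, sep = ",",
                quote = "\"", comment.char = "",
                stringsAsFactors = FALSE)

print(a)
print(b)
unlink(tmp)
```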

Comment author: ChristianKl 29 November 2012 07:38:08PM 0 points [-]

God: 6 + 18.7 (0, 0, 1) [n = 1098] Simulation: 25.1 + 29.7 (1, 10, 50) [n = 1039]

Among those who believe that we are in a simulation with over 70% confidence, there's only one person who gives a higher chance to God existing than to our living in a simulation. Given that a god is defined here as someone with world-making powers, how do you get a simulation without a god?

Comment author: BerryPick6 29 November 2012 07:45:58PM 0 points [-]

That the simulation controllers/creators aren't necessarily omnibenevolent is one possible explanation for us being in a simulation and there not existing what most people call 'god.'

Comment author: ChristianKl 29 November 2012 07:55:33PM *  4 points [-]

Omnibenevolence was not in the criteria for this question. If people used it as a criterion, it suggests that they fell victim to some bias that lets them underrate the possibility that a god exists.

Maybe it's the cognitive dissonance, because a good rationalist shouldn't believe in a god?

What is the probability that there is a god, defined as a supernatural (see above) intelligent entity who created the universe?

Comment author: BerryPick6 29 November 2012 08:01:11PM 1 point [-]

Right, just saw that, my bad.

Comment author: CCC 30 November 2012 07:31:16AM 0 points [-]

Maybe it's the cognitive dissonance, because a good rationalist shouldn't believe in a god?

I still don't see how that follows. Rationality can show that certain potential gods very probably don't exist (e.g. Thor), but I think that's as far as it goes.

Comment author: fubarobfusco 29 November 2012 07:51:27PM *  7 points [-]

Look again at the survey questions:

P(Supernatural)
What is the probability that supernatural events, defined as those involving ontologically basic mental entities, have occurred since the beginning of the universe?

P(God)
What is the probability that there is a god, defined as a supernatural (see above) intelligent entity who created the universe?

A simulator is not a god because gods are ontologically basic, while simulators are not.

Comment author: ChristianKl 29 November 2012 08:44:45PM 1 point [-]

John lives in a simulation. He thinks about the properties of the simulator. The simulator can't be reduced to the kind of physical objects that John can observe in his reality. The simulator is made of different stuff.

If it's just about reducing the entity into multiple parts, then that's possible for the Christian God, who's made up of three parts.

Otherwise can you point me to a good definition of ontologically basic?

Comment author: fubarobfusco 29 November 2012 11:06:52PM *  3 points [-]

If [naturalism] is true, then all minds, and all the contents and powers and effects of minds, are entirely caused by natural [i.e. fundamentally nonmental] phenomena. But if naturalism is false, then some minds, or some of the contents or powers or effects of minds, are causally independent of nature. In other words, such things would then be partly or wholly caused by themselves, or exist or operate directly or fundamentally on their own.

Richard Carrier

It's not a matter of whether they are "made of different stuff" but if they are made of stuff at all. A simulator is no more supernatural to us than we are to a boxed AI; we're both running inside the same material universe, just in different ways.

Comment author: ChristianKl 30 November 2012 12:06:56AM -1 points [-]

The boxed AI runs inside a universe that follows the laws of Turing computing.

Nature is the stuff around us. A simulation simulates nature. The one who runs the simulation isn't part of that nature. The simulator can exist without needing anything from the nature in which John lives.

If I'm reading Harry Potter, Harry Potter lives in a world where magic happens. I don't. Those two worlds are fundamentally different. The magic that happens in Harry Potter's world is causally independent from myself.

Comment author: Cakoluchiam 29 November 2012 10:47:40PM 0 points [-]

Would someone who created a computer that created the universe count as a god? I can easily write computer games with more complex behavior than I feel capable of fully comprehending, but I would not consider that computer program an intelligent entity. I can imagine that someone more educated and with a higher mental capacity than I could similarly write a computer program that is capable of creating and maintaining in simulation a universe with the global constants and initial conditions necessary to produce intelligent life without the program actually qualifying as intelligent itself.

My personal belief is that if there is a "god", he is quite probably much like a video game programmer, who can set up a universe like an MMO and let it run "infinitely" in "real-time", but, being constrained to a similar time-scale as the "players", is unable to make a large number of fine-grained adjustments to local variables at the immediate behest of said players (i.e. "answering prayers"). Someday we may get a version 2.0 release which allows third-party plugins so players can hack the universe to answer their own prayers, but I don't place a high conditional probability on that happening within my projected lifetime.

Comment author: Cakoluchiam 29 November 2012 10:55:57PM *  1 point [-]

Hell, if the mathematical universe hypothesis is correct, then somewhere out there in the universe there is, with no intelligent priors, a collection of particles in the form of a computer, simulating a universe containing intelligent entities.

Comment author: Decius 30 November 2012 02:28:39AM *  0 points [-]

And the universe it is simulating itself contains an entity which created a computer which simulates a universe...

Given that the mathematical universe hypothesis is correct, what are the odds that the universe we experience
A- is the mathematical universe
B- is a computer analogue simulation which was generated spontaneously
C- is a computer analogue simulation which was created by an intelligence but is currently unattended
D- is a computer analogue simulation which is attended by someone who wishes to suppress the thought that the universe is atten---

Comment author: CCC 30 November 2012 07:25:53AM 2 points [-]

My personal belief is that if there is a "god", he is quite probably much like a video game programmer, who can set up a universe like an MMO and let it run "infinitely" in "real-time", but, being constrained to a similar time-scale as the "players", is unable to make a large number of fine-grained adjustments to local variables at the immediate behest of said players (i.e. "answering prayers").

Given that all the 'players' are running in the universe in question, being able to make a large number of fine-grained adjustments to local variables in an instant (in-universe time) is simple; simply pause the simulation.

...unless some of you out there are actually players from outside the universe, in which case the rest of us would appreciate a hint.

Comment author: DaFranker 30 November 2012 03:38:47PM *  1 point [-]

My personal belief is that if there is a "god", he is quite probably much like a video game programmer, who can set up a universe like an MMO and let it run "infinitely" in "real-time", but, being constrained to a similar time-scale as the "players", is unable to make a large number of fine-grained adjustments to local variables at the immediate behest of said players (i.e. "answering prayers").

This seems to unreasonably deviate from the way almost every type of simulation I've ever heard of works. You can pause/resume, you can increase/decrease the "timesteps" to make the world go "faster" (with larger quanta levels though), or you could just arbitrarily increase the raw processing speed of the machine running the simulation to make the ratio of simulated vs external time proportionally higher.

Of course, if what you're proposing instead is that our "minds" are actually outside the simulation and sending input into it, rather than being fully contained within the simulation, then yes, the real-time constraint does apply.

ETA: In the latter case, I would argue that the term "Virtual Reality" is more appropriate and the use of "simulation" here is misleading and prone to conflating or confusing the two scenarios.

Comment author: Jayson_Virissimo 30 November 2012 08:02:22AM *  0 points [-]

According to Yvain's definition, you are correct. On the other hand, I can't think of something more godlike than creating or designing the universe, can you? It just seems like a very idiosyncratic definition, which is why I complained to Yvain about it last year.

Comment author: fubarobfusco 30 November 2012 09:15:46PM *  3 points [-]

I can't think of something more godlike than creating or designing the universe, can you?

Sure. For instance, some theists claim that God created and maintains logic itself.

Moreover, a simulator who is not ontologically basic (i.e. is made of matter; arose through material processes in their own universe) does not meet four of Aquinas's "five ways" — unmoved mover, first cause, necessary being, or maximum degree of goodness.

Comment author: shminux 29 November 2012 07:50:13PM *  2 points [-]

Are people who understand quantum mechanics more likely to believe in Many Worlds? We perform a t-test, checking whether one's probability of the MWI being true depends on whether or not one can solve the Schrodinger Equation. People who could solve the equation had on average a 54.3% probability of MWI, compared to 51.3% in those who could not. The p-value is 0.26; there is a 26% probability this occurs by chance. Therefore, we fail to establish that people's probability of MWI varies with understanding of quantum mechanics.

Just wanted to point out a few fallacies in the above:

  • "can solve the Schrodinger Equation" means nothing or less without specifying the problem you are solving. The two simplest problems taught in a modern physics course, the free particle and a one-dimensional infinite square well are hardly comparable with, say, calculating the MRI parameters.

  • self-reporting "can solve the Schrodinger Equation" does not mean one actually can.

  • even then, "can solve the Schrodinger Equation" does not mean "understand quantum mechanics", as it does not require one to understand measurement and decoherence, which is what motivates MWI in the first place.

  • there are many versions of MWI, from literal ("the Universe split into two or more every time something happens") to Platonic ("Mathematical Universe").

Basically, I hope that you realize that this is a prime example of "garbage in, garbage out". I suppose it's a good thing that there was no correlation, otherwise one might draw some unwarranted conclusions from this.

Comment author: gwern 29 November 2012 11:55:06PM 7 points [-]

If the correlation had come out the other way, you'd be jumping on it as proof of your thesis that LWers favor MWI because they are sheepishly following Eliezer. In what universe where they are indeed sheepishly and ignorantly following him does a question like that show nothing whatsoever?

Comment author: shminux 30 November 2012 12:05:44AM 2 points [-]

If the correlation had come out the other way, you'd be jumping on it as proof of your thesis that LWers favor MWI because they are sheepishly following Eliezer.

Probably (though not a proof, just one piece of evidence). I suspect that "garbage in" is the reason why we don't see it, but I do not have a convincing argument either way, short of asking Eliezer to post an insincere message "I no longer believe in MWI", take the survey soon after, then have him retract the retraction. This would, however, be rather damaging to his credibility in general.

Comment author: orthonormal 30 November 2012 12:07:19AM 7 points [-]

I'm assuming that the question was meant as a simple and polite proxy for "Does your knowledge of quantum mechanics include some actual mathematical content, or is it just taken from popular science books and articles?"

Comment author: shminux 30 November 2012 12:14:18AM 3 points [-]

Probably. The reason he mentioned the Schrodinger equation was likely an attempt to quantify it. I am arguing that the threshold is set too low to be useful.

Comment author: [deleted] 30 November 2012 12:37:57AM 4 points [-]

The question was specifically about the SE for a hydrogen atom. But I agree that having good PDE-fu isn't necessarily a good proxy for anything else.

Comment author: NancyLebovitz 30 November 2012 12:58:04AM 1 point [-]

I gave you a tentative upvote because this comment sounds very plausible, but since I don't know how to solve any version of the Schrodinger Equation, I'm going by more general priors.

Comment author: shminux 30 November 2012 01:02:49AM 1 point [-]

since I don't know how to solve any version of the Schrodinger Equation, I'm going by more general priors

Sure. It is very reasonable to put some trust (but probably not too much) in what EY says about MWI if your experience shows that he is not out to lunch in the areas of your expertise. Assuming that is what you mean by "more general priors".

Comment author: NancyLebovitz 30 November 2012 02:51:02AM 2 points [-]

That wasn't at all what I had in mind, though Eliezer's generally high level of intelligence and meticulousness makes MWI seem a little more likely to me.

No, my strongest general priors in play are that it's likely that there are different degrees of understanding the Schrodinger Equation, that people might kid themselves about how well they understand it, and that there's more than one take on MWI. My prior that there's more to understanding MWI than the Schrodinger equation is a little weaker, but not much.

Comment author: Yvain 30 November 2012 04:25:54AM 3 points [-]

The actual survey specified "can solve the Schrodinger equation for a hydrogen atom". Although it is not exactly synonymous with "understands quantum mechanics", you would expect them to be highly correlated.

Comment author: shminux 30 November 2012 04:49:31PM 2 points [-]

The actual survey specified "can solve the Schrodinger equation for a hydrogen atom".

Right, sorry, I forgot that qualifier since the time I took the survey. It does imply more familiarity with the underlying math than the simplest possible cases. Still, I recall that when I was at that level, I was untroubled by the foundational issues, just being happy to have mastered the math.

Although it is not exactly synonymous with "understands quantum mechanics", you would expect them to be highly correlated

I wonder if there is a way to test this assertion. One would presumably start by defining what "understands quantum mechanics" means.

Comment author: Eugine_Nier 01 December 2012 05:10:18AM *  2 points [-]

I suspect asking about density matrices might be a better test.

Comment author: Quirinus_Quirrell 29 November 2012 08:00:18PM 15 points [-]

"Eliezer Yudkowsky personality cult."
"The new thing for people who would have been Randian Objectivists 30 years ago."
"A sinister instrument of billionaire Peter Thiel."

Nope, no one guessed whose sinister instrument this site is. Muaha.

Comment author: hankx7787 29 November 2012 08:54:52PM *  3 points [-]

Well-educated atheist American white men in their mid 20s with no children who work with computers.

"The new thing for people who would have been Randian Objectivists 30 years ago."

The demographics are essentially the same except LW is probably more than 2:1 politically left vs. right. Objectivists are probably more than 2:1 in the other direction.

Since when did people like us decide it is OK to be liberal/socialist?

Comment author: Alejandro1 29 November 2012 10:56:54PM 2 points [-]

I think there is a significant correlation between Objectivism/hardcore libertarianism and the described demographics, but it does not mean that all or even most people of that demographic have that ideology; it just means that this demographic is much more likely to have this ideology than a random person is.

Also, while it is true that there are more LWers that are atheist than theist, male than female, white than other races, etc, it is at the same time very unlikely that most LWers have all those characteristics. (Being typical in all respects is very atypical). And just having one of those characteristics different might make the correlation with Objectivism/hardcore libertarianism reduce a lot.

Comment author: magfrump 01 December 2012 04:41:39AM 4 points [-]

Also, while it is true that there are more LWers that are atheist than theist, male than female, white than other races, etc, it is at the same time very unlikely that most LWers have all those characteristics.

Given that we are 86.2% male cisgender, 84.3% Caucasian (non-Hispanic), and 83.3% atheist (spiritual or not), that means a minimum of 53% of LWers are all three; probably the actual number is over 60%.
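The arithmetic behind those two figures can be sketched in R. The lower bound is the worst case where the three minorities overlap as little as possible; the second number assumes the three traits are statistically independent, which is an assumption made here for illustration and not something the survey establishes.

```r
# Shares quoted above
p <- c(male_cis = 0.862, caucasian = 0.843, atheist = 0.833)

# Worst-case overlap: sum of shares minus (k - 1), floored at zero
lower_bound <- max(0, sum(p) - (length(p) - 1))

# Share with all three traits if the traits were independent (an assumption)
independent <- prod(p)

print(round(c(lower_bound = lower_bound, independent = independent), 3))
```

The worst case comes to about 53.8% and the independence estimate to about 60.5%; positive correlation between the traits would push the true figure higher still.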

In answer to the parent, atheism in America may have started becoming a more liberal pursuit somewhere around 30 years ago when the Republican party started being substantially more religious and dismissive of atheism and science.

Comment author: [deleted] 01 December 2012 04:46:42AM *  0 points [-]

In answer to the parent, atheism in America may have started becoming a more liberal pursuit somewhere around 30 years ago when the Republican party started being substantially more religious and dismissive of atheism and science.

Nice job breaking it, Nixon.

Comment author: magfrump 01 December 2012 05:05:21AM *  3 points [-]

I would actually have said that Nixon was the last Republican president that wasn't actively hostile to science and atheism. Compared with Reagan and Bush, he certainly has a very different reputation.

EDIT: I would have said that without having looked at the link or having been alive at the time.

Comment author: MTGandP 29 November 2012 09:08:06PM 5 points [-]

I was surprised to see that LW has almost as many socialists as libertarians. I had thought due to anecdotal evidence that the site was libertarian-dominated.

I was also surprised that a plurality of people preferred dust specks to torture, given that it appears to be just a classic problem of scope insensitivity, which this site talks about repeatedly.

I was happy to see that we have more vegetarians and fewer smokers than the general population.

Comment author: Khoth 29 November 2012 09:18:05PM 7 points [-]

It's not exactly libertarian-dominated. More that there are far more libertarians here than in real life (and more socialists, too, likely as not. It's the "normal" political positions that are underrepresented)

Comment author: Emile 30 November 2012 08:49:58PM 7 points [-]

and more socialists, too, likely as not.

If you break down political orientation by country, you get around 50% socialists among Europeans (which may be a bit higher than in the general population), and around 20% socialists among Americans.

Comment author: ArisKatsaris 30 November 2012 02:51:21PM 16 points [-]

Generally, half the time we get visiting leftwingers accusing us of being rightwing reactionaries, and the other half of the time we get visiting rightwingers accusing us of being leftwing sheep.

So if you thought that the site was libertarian-dominated, I'm hereby making a prediction with 75% certainty that you consider yourself a left winger. Am I right?

Comment author: Vaniver 30 November 2012 09:01:36PM *  15 points [-]

There are a number of old posts from the Overcoming Bias days in which EY comments that the audience is primarily libertarian, which makes sense for the blog of a GMU economist. A partial explanation might be people reading that and assuming he's talking about the modern population distribution of LW.

Comment author: MTGandP 30 November 2012 09:55:16PM 4 points [-]

Generally, half the time we get visiting leftwingers accusing us of being rightwing reactionaries, and the other half of the time we get visiting rightwingers accusing us of being leftwing sheep.

The first one surprises me because hardly anyone on LW seems conservative (and the polls confirm this).

I'm hereby making a prediction with 75% certainty that you consider yourself a left winger. Am I right?

I'm definitely a non-libertarian, so that may be it.

Comment author: Nornagest 30 November 2012 10:00:47PM *  6 points [-]

The first one surprises me because hardly anyone on LW seems conservative (and the polls confirm this).

Libertarians count as right-wing by most left-wing standards, even far right. And then we've got a small but vocal faction of neoreactionary/Moldbugger types, who don't fit cleanly into any modern political typologies but who tend to look extra-super right-wing++ through leftist eyes.

Comment author: NancyLebovitz 30 November 2012 10:05:00PM 9 points [-]

The first one surprises me because hardly anyone on LW seems conservative (and the polls confirm this).

However, there are a few fairly common (or at least it seems so to me) opinions on LW which are distinctively un-Left: democracy is bad, there are racial differences in important traits, and women complain way too much about how men treat them. We'll see how that last one plays out.

Comment author: TimS 30 November 2012 03:26:27PM 11 points [-]

I was also surprised that a plurality of people preferred dust specks to torture, given that it appears to be just a classic problem of scope insensitivity, which this site talks about repeatedly.

I was surprised as well, but I disagree that it is necessarily scope insensitivity - believing utility is continuously additive requires choosing torture. But some people take that as evidence that utility is not additive - more technically, evidence that utility is not the appropriate analysis of morality (aka picking deontology or virtue ethics or somesuch).

More specific analysis here and more generally here.

Comment author: thomblake 30 November 2012 04:09:53PM 6 points [-]

In support of this, 435 people chose specks, and 430 chose virtue ethics, deontology, or other.

Comment author: BerryPick6 30 November 2012 04:12:07PM *  2 points [-]

I̶ ̶t̶h̶i̶n̶k̶ ̶w̶e̶'̶v̶e̶ ̶f̶o̶u̶n̶d̶ ̶o̶u̶r̶ ̶a̶n̶s̶w̶e̶r̶,̶ ̶t̶h̶e̶n̶.̶

ETA: Really nice work from satt to prove I was jumping to conclusions here.

Comment author: Eugine_Nier 01 December 2012 05:16:50AM 3 points [-]

I was surprised to see that LW has almost as many socialists as libertarians. I had thought due to anecdotal evidence that the site was libertarian-dominated.

I suspect you'd see a higher percentage of libertarians if you restricted to non-lurkers, and even higher if you restricted by karma, or how often they post.

Comment author: Cakoluchiam 29 November 2012 09:57:56PM 3 points [-]

Any results for the calibration IQ?

Comment author: gwern 30 November 2012 12:15:47AM *  2 points [-]

The original question:

What do you think is the probability that the IQ you gave earlier in the survey is greater than the IQ of over 50% of survey respondents?

Well, the predictions spread the usual range and look OK to me:

R> lwci <- as.numeric(as.character(lw$CalibrationIQ))
R> lwci <- lwci[!is.na(lwci)]
R> # convert tiny decimals to percentages & put a ceiling of 100 (thanks to Mr. 1700...)
R> lwci <- sapply(lwci, function(x) if (x<=1.00) { x*100 } else { if(x>100) { 100 } else { x }})
R> summary(lwci)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    0.0    20.0    50.0    44.8    70.0   100.0
Comment author: dbaupp 29 November 2012 10:09:12PM *  3 points [-]

I think you missed some duplicates in for_public.csv: Rows 26, 30, 761 and 847 are identical to their preceding one.

Comment author: Kindly 29 November 2012 10:14:33PM 3 points [-]

Fishing for correlations is a statistically dubious practice, but also fun. Some interesting ones (none were very high, except e.g. Father Age and Mother Age):

  • IQ and Hours Writing have correlation 0.26 (75 degrees), which is the only interesting IQ correlation.
  • Siblings and Older siblings have correlation 0.48 (61 degrees), which isn't too surprising, but makes me wonder: do we expect this correlation to be 0.5 in general?
  • Most of the Big Five answers are slightly correlated (around +/-0.25, or 90+/-15 degrees) with each other, but not with anything else except the Autism Score. Shouldn't well-designed personality traits be orthogonal, ideally?
  • CFAR question 7 (guess of height of redwood) was negatively correlated with Height (-0.23, or 103 degrees). No notable correlation with the random number, though.
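The degree figures in the list above appear to treat a correlation coefficient as the cosine of the angle between two centered data vectors, so the angle is the arccosine of the correlation. A small Python sketch of that conversion follows; the function name is hypothetical, and this is a reading of the numbers in the list rather than a description of how they were actually computed.

```python
import math


def correlation_to_degrees(r):
    """Angle (in degrees) between two centered vectors whose Pearson correlation is r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie in [-1, 1]")
    return math.degrees(math.acos(r))


# Correlations quoted in the list above.
for r in (0.26, 0.48, -0.23, 0.25, -0.25):
    print(f"r = {r:+.2f} -> {correlation_to_degrees(r):.1f} degrees")
```

Under this reading, 90 degrees means uncorrelated, so correlations of +/-0.25 land roughly 15 degrees on either side of orthogonal.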
Comment author: thomblake 29 November 2012 10:18:54PM 5 points [-]

Most of the Big Five answers are slightly correlated (around +/-0.25, or 90+/-15 degrees) with each other, but not with anything else except the Autism Score. Shouldn't well-designed personality traits be orthogonal, ideally?

It might just pick out the cluster of "Less Wrong personality type".

CFAR question 7 (guess of height of redwood) was negatively correlated with Height

Obviously it's a matter of perspective. Tall people just tower over those redwoods.

Comment author: Kindly 29 November 2012 10:26:43PM 3 points [-]

It might just pick out the cluster of "Less Wrong personality type".

In that case, it says something about the cluster as well. For example, Openness and Extraversion wouldn't be positively correlated just because most LWers are both open and extraverted (or because most LWers are closed and introverted). We'd have to have something that specifically makes "open and extraverted" more likely to happen together than individually.

Comment author: [deleted] 30 November 2012 12:26:04AM 3 points [-]

Something like Berkson's paradox (people who are neither open nor introverted are unlikely to read LW)?
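A toy simulation can show how this kind of selection creates a correlation out of nothing. The Python sketch below assumes two independent standard-normal traits and a made-up admission rule (a person reads LW if they are open or introverted); neither the rule nor the numbers come from the survey.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# (openness, extraversion) pairs, independent by construction.
population = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20000)]

# Hypothetical selection rule: keep anyone who is open OR introverted,
# i.e. drop only people who are both closed and extraverted.
readers = [(o, e) for o, e in population if o > 0 or e < 0]

r_population = pearson(*zip(*population))
r_readers = pearson(*zip(*readers))
print(f"whole population: r = {r_population:+.3f}")
print(f"selected readers: r = {r_readers:+.3f}")
```

Removing the "closed and extraverted" corner leaves a sample in which openness and extraversion are positively correlated, even though they are unrelated in the full population.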

Comment author: Kindly 30 November 2012 01:21:41AM 2 points [-]

Good point. Objection retracted (in the conversational sense).

Comment author: [deleted] 29 November 2012 10:54:40PM *  12 points [-]

Top 100 Users' Data, aka Karma 1000+

I was thinking about the fact that there is probably a difference between active LWers and lurkers or newbies. So I looked at the data for the Top 100 users (actually Top 107, because there was a tie). This happily coincided with the nice Schelling point of 1000 karma. (That makes sense, because people are likely to round to that number.) To me, this reads as "has been actively posting for at least a year".

So, some data on 1000+ karma people:

Slightly more likely to be male:
92.5% Male, 7.4% Female

NOTE: First percentage is for 1000+ users, second number is for all survey respondents

Much more likely to be polyamorous:

Prefer mono 36% v. 54%
Prefer poly 24% v. 13%
Uncertain 33% v. 30%
Other 4% v. 2%

About the same Age:
average 28.6 v. 27.8

About as likely to be single
51% v. 53%

Equally likely to be vegetarian
12%

Much more likely to use modafinil at least once per month:
15% v. 4%

About equal on intelligence tests

SAT out of 1600: 1509 v. 1486
SAT out of 2400: 2260 v. 2319
Self reported IQ: 138.5 v. 138.7
online IQ test: 127 v. 126
ACT score: 33.3 v. 32.7

Similar income
50k

Slightly lower Autism quotient:
average 22 v. 24

More likely to choose torture
Torture: 42% v. 22%
Dust Specks: 29% v. 37%

More likely to cooperate in a Prisoner's Dilemma:
Cooperate: 36% v. 27%
Defect: 20% v. 29%

Some notes: Yes, I realize my data analysis methods are not the best. Namely, that instead of comparing the people with >1000 karma to the people with <100 karma, which would have been more accurate, I just compared them to the overall results (which includes their answers). I did this because it takes much less time.

Also, a hint for other people playing with the data in Excel format: A lot of the numbers are in text format, and a pain to convert to numeric format in a way that allows you to manipulate them. The easiest work around (so long as you don't want to do anything complicated) is to just paste the needed columns either into google spreadsheet, or into another Excel sheet that's been formatted numerically. If you want to do something complicated you probably need to find the "right" way to fix it.

Comment author: William_Quixote 30 November 2012 12:57:23PM 2 points [-]

Also, a hint for other people playing with the data in Excel format: A lot of the numbers are in text format, and a pain to convert to numeric format in a way that allows you to manipulate them. The easiest work around (so long as you don't want to do anything complicated) is to just paste the needed columns either into google spreadsheet, or into another Excel sheet that's been formatted numerically. If you want to do something complicated you probably need to find the "right" way to fix it.

Multiplying text by 1 or adding zero can often force automatic conversion in Excel. You can do this with Paste Special, using paste as values with the multiply operation. The shortcut keys are: copy a cell containing 1, highlight the data, then press ALT+E, S, then V, M, Enter.
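For anyone working with the CSV outside Excel, the same coercion can be done with a small helper. This is a hedged Python sketch: the function name is made up, and treating commas as thousands separators is an assumption that would be wrong for data using decimal commas.

```python
import math


def to_number(cell):
    """Return a float for numeric-looking text, or None if it cannot be parsed."""
    # Assumption: commas are thousands separators, as in "1,600".
    text = str(cell).strip().replace(",", "")
    if not text:
        return None
    try:
        value = float(text)
    except ValueError:
        return None
    # float() also accepts "nan" and "inf"; treat those as missing.
    return value if math.isfinite(value) else None


raw_cells = [" 138 ", "1,600", "", "N/A", "27.5"]
numbers = [to_number(cell) for cell in raw_cells]
print(numbers)
```

Cells that fail to parse come back as None, so they can be filtered out before averaging rather than silently counted as zero.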

Comment author: Manfred 30 November 2012 12:18:49AM *  3 points [-]

Women were on average newer to the community - 21 months vs. 39 for men - but to my surprise a t-test was unable to declare this significant. Maybe I'm doing it wrong?

Well, possibly. The t-distribution is used for "estimating the mean of a normally distributed population," (yay wikipedia) and you're trying to estimate the mean of a slanted-uniformly-distributed-with-a-spike-at-the-beginning population.
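One way to sidestep the normality assumption is a permutation test, which only asks how often a random relabeling of the two groups produces a difference in means at least as large as the observed one. This is a different technique from the t-test used in the post, sketched here in Python with made-up numbers; the month values below are illustrative only and are not taken from the survey data.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabeling of group membership
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Made-up "months in community" values, purely for illustration.
men_months = [1, 2, 6, 12, 24, 36, 48, 60, 60, 72]
women_months = [1, 3, 12, 30]
p_toy = permutation_p_value(men_months, women_months)
print(f"permutation p-value on toy data: {p_toy:.3f}")
```

With a small second group, even a sizeable gap in means can fail to reach significance, because many random relabelings produce gaps just as large.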

But there is another important consideration, which is that applying more scrutiny to unexpected results gives you systematic error (confirmation bias), and that's bad. To avoid this big problem, any increase in test quality should probably be part of a wholesale reanalysis, i.e. prolly not gonna happen. But there is another route, which is just accepting that your results are imperfect and widening your mental error bars. After all, where does this systematic error come from when you re-analyze unexpected results? It comes from you making mistakes on other things too, but not re-analyzing them! So once you know about the systematic error, you also know about all these other mistakes you have on average made :P

Comment author: gwern 30 November 2012 01:30:52AM *  3 points [-]

Well, possibly. The t-distribution is used for "estimating the mean of a normally distributed population," (yay wikipedia) and you're trying to estimate the mean of a slanted-uniformly-distributed-with-a-spike-at-the-beginning population.

Yeah, it'd have to be some combination of a uniform Poisson (since we don't seem to be growing a lot, per Yvain) and an exponential distribution (constant mortality of users). If we graph histograms, either blunt or finegrained, it looks like that but also with weird huge spikes besides the original OB->LW spike:

R> hist(as.numeric(as.character(lw$TimeinCommunity)))

R> hist(as.numeric(as.character(lw$TimeinCommunity)), breaks=50)

But on the plus side, if we look at the genders as a box plot, we discover why the mean is lower for women but there's no significance:

R> lwm <- subset(lw, as.character(Gender)=="M (cisgender)")
R> lwf <- subset(lw, as.character(Gender)=="F (cisgender)")
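R> # go through as.character first: as.numeric() on a factor returns level codes, not the values
R> boxplot(as.numeric(as.character(lwm$TimeinCommunity)), as.numeric(as.character(lwf$TimeinCommunity)))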

There are, after all, many fewer women.

Comment author: JonatasMueller 30 November 2012 12:58:55AM 2 points [-]

As for the IQ question and especially the self-reported IQ, it did not take into account that IQ should come at least with standard deviation. Otherwise it's like asking for a height number without saying if it is in centimeters, meters, or feet. It's understandable that people who didn't study psychometrics with some depth don't know this, though.

IQ can be a ratio IQ or a deviation IQ. In the first case it is mental age divided by actual age, with the normalcy as 100. This is still used mostly for children, but it's still possible to see such scores. Deviation IQ is more common and it's supposed to measure one's intelligence according to rarity in a population.

Sometimes these tests are standardized for certain countries, in which case an IQ score only has relevance in relation to that country's population, but generally the standard is the population of England or the USA, with its average being 100. Other countries have averages ranging from about 67 to 107 (s.d. 15), compared to it. The average IQ score of the world is estimated at about 90, but there are also differences in standard deviation among different populations, some have bigger variation than others, and also between the sexes (men have a slightly higher standard deviation).

Standard deviations used are 15, 16, and 24. For instance, an IQ score one standard deviation above 100 could be 115, 116, or 124. An IQ of 163 in s.d. 15 corresponds to an IQ of 167 in s.d. 16, or an IQ of 200 in s.d. 24, which, on average, correspond to a ratio IQ of 185. When estimating the true world rarity of IQ scores, though, very lengthy and complex estimations would need to be made, otherwise the scores only reflect the rarity in England or in the USA, and not in the world. When it comes to scores higher than two or three standard deviations above the average, most IQ tests are inadequate and insufficiently standardized to measure them and their rarity well.
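The conversion between deviation scales is a plain z-score rescaling: find how many standard deviations the score sits above the mean on one scale, then re-express that distance on the other. A minimal Python sketch, assuming a mean of 100 on every scale (the function name is hypothetical):

```python
def convert_iq(score, from_sd, to_sd, mean=100.0):
    """Re-express a deviation IQ from one standard-deviation scale on another."""
    z = (score - mean) / from_sd  # distance from the mean in standard deviations
    return mean + z * to_sd


# The example from the comment: 163 on the s.d. 15 scale.
for target_sd in (15, 16, 24):
    print(f"s.d. {target_sd}: {convert_iq(163, 15, target_sd):.1f}")
```

This covers only deviation scores; the ratio IQ figure mentioned above comes from a different definition and cannot be derived by this rescaling.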

This information is for your curiosity. The relevant point is that the self-reported IQ scores quite possibly were stated in differing standard deviations.

Comment author: Epiphany 30 November 2012 02:07:17AM *  2 points [-]

Problem:

The line: "This includes all types with greater than 10 people. You can see the full table here." links to a gif that is inaccurate, has no key to explain oddities, and is of such poor graphical quality that parts of it are actually unreadable.

It may be that the reason that invalid personality types like "INNJ" are listed is due to typos on the part of the survey participants. If so, then great! But it may also be that the person who constructed this graphic put typos in (I consider this fairly likely due to the fact that the graphical quality is so low that some of it's not readable. For instance, the number of INTPs is so unclear I can't even tell what it says - it looks like 113 but your results in the post claim 143). It isn't obvious why the invalid types are there, so a key or note would be nice.

Also, some of the participants had a good idea: if one of your personality dimension letters changes when taking the test multiple times, you can fill it out with an X. Can we add an instruction for them to do this on the next survey?

Comment author: Yvain 30 November 2012 04:21:12AM 5 points [-]

The graphic was automatically generated by a computer program, so there's no chance that typos were introduced. There's no key to explain oddities because I have no way of knowing the explanation any better than you. When in doubt, blame survey takers being trolls.

But I do apologize for the poor graphic quality.

Comment author: DaFranker 30 November 2012 03:11:17PM *  2 points [-]

Also, some of the participants had a good idea: if one of your personality dimension letters changes when taking the test multiple times, you can fill it out with an X. Can we add an instruction for them to do this on the next survey?

I don't take this test all too often (in fact, didn't take the one in the survey IIRC), but if we can do this, here's my personality type: IXXX. Oh wait.

(Yes, seriously, if I take an online MBTI test several times at evenly spaced time intervals within the same month, the first varies between .6 and .95 towards I, and the others just jump around in a manner I can't predict (yet, anyway, probably could eventually if I did more timewasting internet-test-taking))

I predict similar (perhaps less pronounced?) variation would be present in around 30% of LWers (not too confident in this number), and that we could reduce the variation dramatically by eliminating confused questions and tabooing ambiguous or vague words / phrases, replacing them with multiple questions containing various common meanings, and an even greater (bitwise) reduction by giving more contextual information from which the respondent can infer or judge values and weight variables on "It depends, but I suppose most of the time I would..." -type answers. (much more confident in these last two predictions than the first)

Comment author: Epiphany 30 November 2012 02:19:06AM *  22 points [-]

On IQ Accuracy:

As Yvain says, "people have been pretty quick to ridicule this survey's intelligence numbers as completely useless and impossible and so on" because if they're true, it means that the average LessWronger is gifted. Yvain added a few questions to the 2012 survey, including the ACT and SAT questions and the Myers-Briggs personality type question that I requested (I'll explain why this is interesting), and that give us a few other things to check against, which has made the figures more believable. The ridicule may be an example of the "virtuous doubt" that Luke warns about in Overconfident Pessimism, so it makes sense to "consider the opposite":

The distribution of Myers-Briggs personality types on LessWrong replicates the Mensa pattern. This is remarkable since the patterns of personality types here are, in many significant ways, the exact opposite of what you'd find in the regular population. For instance, the introverted rationalists and idealists are each about 1% of the population. Here, they are the majority and it's the artisans and guardians who are relegated to 1% or less of our population.

Mensa's personality test results were published in the December 1993 Mensa Bulletin. Their numbers.

So, if you believe that most of the people who took the survey lied about their IQ, you also need to believe all of the following:

  • That most of these people also realized they needed to do IQ correlation research and fudge their SAT and ACT scores in order for their IQ lie to be believable.

  • Some explanation as to why the average of lurkers' IQ scores would come out so close to the average of posters' IQ scores. The lurkers don't have karma to show off, and there's no known incentive good enough to get so many lurkers to lie about their IQ score. Vaniver's figures.

  • Some explanation for why the personality type pattern at LessWrong is radically different from the norm and yet very similar to the personality type pattern Mensa published and also matched my predictions. Even if they had knowledge of the Mensa personality test results and decided to fudge their personality type responses, too, they somehow managed to fudge them in such a way that their personality types accidentally matched my predictions.

  • That they decided not to cheat when answering the Bayes birthday question even though they were dishonest enough to lie on the IQ question, motivated to look intelligent, and it takes a lot less effort to fudge the Bayes question than the intelligence and personality questions. (This was suggested by ArisKatsaris).

  • That both posters and lurkers had some motive strong enough to justify spending 20+ minutes doing the IQ correlation research and fudging personality test questions while probably bored of ticking options after filling out most of a very long survey.

It's easier just to put the real number in the IQ box than do all that work to make it believable, and it's not like the liars are likely to get anything out of boasting anonymously, so the cost-benefit ratio is just not working in favor of the liar explanation.

If you think about it in terms of Occam's razor, what is the better explanation? That most people lied about their IQ, and fudged their SAT, ACT and personality type data to match, or that they're telling the truth?


Summary of criticism:

Possible Motive to Lie: The desire to be associated with a "gifted" group:

In response to this post, it was argued by NonComposMentis that a potential motive to lie is that if the outside world perceives LessWrong as gifted, then anyone having an account on LessWrong will look high-status. In rebuttal:

  • I figure that lurkers would not be motivated to fudge their results because they don't have a bunch of karma on their account to show off and anybody can claim to read LessWrong, so fudging your IQ just to claim that the site you read is full of gifted people isn't likely to be motivating. I suggested that we compare the average IQs of lurkers and others. Vaniver did the math and they are very, very close.

  • I argued, among other things, that it would be falling for a Pascal's mugging to believe that investing the extra time (probably at least $5 worth of time for most of us) into fudging the various different survey questions is likely to contribute to a secret conspiracy to inflate LessWrong's average IQ.

Did the majority avoid filling out intelligence related questions, letting the gifted skew the results?

Short answer: 74% of people answered at least one intelligence related question and since most people filled out only one or two, the fact that the self-report, ACT and SAT score averages are so similar is remarkable.

I realized, while reading Vaniver's post that if only 1/3 of the survey participants filled out the IQ score, this may have been due to something which could have skewed the results toward the gifted range, for instance, if more gifted people had been given IQ tests for schooling placement (and the others didn't post their IQ score because they did not know it) or if the amount of pride one has in their IQ score has a significant influence on whether one reported it.

So I went through the data and realized that most of the people who filled out the IQ test question did not fill out all the others. That means that 804 people (74% not 33%) answered at least one intelligence related question. As we have seen, the IQ correlations for the IQ, SAT and ACT questions were very close to each other (unsurprisingly, it looks like something's up with the internet test... removing those, it's 63% of survey participants that answered an intelligence related question). It's remarkable in and of itself that each category of test scores generated an average IQ so similar to the others considering that different people filled them out. I mean if 1/3 of the population filled out all of the questions, and the other 2/3 filled out none, we could say "maybe the 1/3 did IQ correlation research and fudged these" but if most of the population fills out one or two, and the averages for each category come out close to the averages for the other categories, why is that? How would that happen if they were fudging?

It does look to me like people gave whatever test scores they had and that not all the people had test scores to give but it does not look to me like a greater proportion of the gifted people provided an intelligence related survey answer. Instead it looks like most people provided an intelligence related survey answer and the average LessWronger is gifted.

Exploration of personality test fudging:

Erratio and I explored how likely it is that people could successfully fudge their personality tests and why they might do that.

  • There are a lot of questions on the personality test that have an obvious intelligence component, so it's possible that people chose the answer they thought was most intelligent.

  • There are also intelligence related questions where it's not clear which answer is most intelligent. I listed those.

  • The intelligence questions would mostly influence the sensing/intuition dichotomy and the thinking/feeling dichotomy. This does not explain why the extraversion/introversion and perceiving/judging results were similar to Mensa's.

Comment author: gwern 30 November 2012 02:22:27AM 7 points [-]

(I believe Mensa's personality test results were published in the December 2006 Mensa newsletter which is, unfortunately, behind a login on the Mensa website, so I can't link to it here.)

Make a copy and post it. Most browsers have the ability to print/save pages as PDFs or various forms of HTML.

Comment author: Epiphany 30 November 2012 02:24:37AM *  21 points [-]

Ok I managed to dig it up!

 E/I  |  S/N  |  T/F  |  J/P  | (Category)
-----------------------------------------------------
75/25 | 75/25 | 55/45 | 50/50 | (Overall population)
27/73 | 10/90 | 75/25 | 65/35 | (Mensans)
15/85 | 03/97 | 88/12 | 54/46 | (LessWrongers) *

From the December 1993 Mensa Bulletin.

* The LessWrongers were added by me, using the same calculation method as in the comment where I test my personality type predictions and are based on the 2012 survey results.

Comment author: DaFranker 30 November 2012 03:00:03PM *  3 points [-]

Thanks for the analysis. I agree with your conclusion.

On a less relevant note, it does feel good to see more evidence that the community we hang out with is smart and awesome.

Comment author: Epiphany 30 November 2012 09:01:29PM *  16 points [-]

This also explains a lot of things. People regard IQ as if it is meaningless, just a number, and they often get defensive when intellectual differences are acknowledged. I spent a lot of time doing research on adult giftedness (though I'm most interested in highly gifted+ adults) and, assuming the studies were done in a way that is useful (I've heard there are problems with this), and my personal experiences talking to gifted adults are halfway decent as representations of the gifted adult population, there are a plethora of differences that gifted adults have. For instance, in "You're Calling Who A Cult Leader?" Eliezer is annoyed with the fact that people assume that high praise is automatic evidence that a person has joined a cult. What he doesn't touch on is that there are very significant neurological differences between people in just about every way you could think of, including emotional excitability. People assume that others are like themselves, and this causes all manner of confusion. Eliezer is clearly gifted and intense and he probably experiences admiration with a higher level of emotional intensity than most. If the readers of LessWrong and Hacker News are gifted, same goes for many of them. To those who feel so strongly, excited praise may seem fairly normal. To all those who do not, it probably looks crazy. I explained more about excitability in the comments.

I also want to say (without getting into the insane amount of detail it would take to justify this to the LW crowd - maybe I will do that later, but one bit at a time) that in my opinion, as a person who has done lots of reading about giftedness and has a lot of experience interacting with gifted people and detecting giftedness, the idea that most survey respondents are giving real answers on the IQ portion of the survey seems very likely to me. I feel 99% sure that LessWrong's average IQ really is in the gifted range, and I'd even say I'm 90%+ sure that the ballpark hit on by the surveys is right. (In other words, they don't seem like a group of predominantly exceptionally or profoundly gifted Einsteins or Stephen Hawkings, or just talented people at the upper ends of the normal range with IQs near 115, but that an average IQ in the 130's / 140's range does seem appropriate.)

This says nothing about the future though... The average IQ has been decreasing on each survey by an average of about two points per year. If the trend continues, then in as many years as LessWrong has been around, LessWrong may trend so far toward the mean that LessWrong will not be gifted anymore (by all IQ standards that is, it would still be gifted by some definitions and IQ standards but not others). I will be writing a post about the future of LessWrong very soon.

Comment author: DaFranker 30 November 2012 09:37:08PM *  3 points [-]

Looks like Aumann at work. My own readings, though more specifically on teenage giftedness in the 145+ range, along with stuff on ASD and Asperger's, heavily corroborate this.

When I was 17, my (direct) family and I had strong suspicions that I was in this range of giftedness - suspicions which were never reliably tested, and thus neither confirmed nor refuted. It's still up in the air and I still don't know whether I fit into some category of gifted or special individuals, but at some point I realized that it wasn't all that important and that I just didn't care.

I might have to explore the question a bit more in depth if I decide to return into the official educational system at some point (I mean, having a paper certifying that you're a genius would presumably kind of help when making a pitch at university to let you in without the prerequisite college credit because you already know the material). Just mentioning all of the above to explain a bit where my data comes from. Both of my parents and myself were all reading tons of books, references, papers and other information along with several interviews with various psychology professionals for around three months.

Also, and this may be another relevant point, the only recognized, official IQ test I ever took was during that time, and I had a score of "above 130"² (verbal statement) and reportedly placed in the 98th and 99th percentiles on the two sections of a modified WAIS test. The actual normalized score was not included in the report (that psychologist(?¹) sucked, and also probably couldn't do the statistics involved correctly in the first place).

However, I was warned that the test lost statistical significance / representativeness / whatever above 125, and so that even if I had an IQ of 170+ that test wouldn't have been able to tell - it had been calibrated for mentally deficient teenagers and very low IQ scores (and was only a one-hour test, and only ten of the questions were written, the rest dynamic or verbal with the psychologist). Later looking-up-stats-online also revealed that the test result distributions were slightly skewed, and that a resulting converted "IQ" of "130" on this particular test was probably more rare in the general population than an IQ of 130 normally represents, because of some statistical effects I didn't understand at the time and thus don't remember at all.

Where I'm going with this is that this doesn't seem like an isolated effect at all. In fact, it seems like most of North America in general pays way more attention to mentally deficient people and low IQs than to high-IQs and gifted individuals. Based on this, I have a pretty high current prior that many on LW will have received scores suffering from similar effects if they didn't specifically seek the sorts of tests recommended by Mensa or the likes, and perhaps even then.

Based on this, I would expect such effects to compensate or even overcompensate for any upward nudging in the self-reporting.

=====

  1. I don't know if it was actually a consulting psychologist. I don't remember the title she had (and it was all done in French). She was "officially" recognized as legally qualified to administer IQ tests in Canada, though, so whatever title is normally in charge of that is probably the right one.

  2. Based on this, the other hints I mention in the text, and internet-based IQ tests consistently giving me 150-ish numbers when at peak performance and 135-ish when tired (I took those a bit later on, perhaps six months after the official one), 135 is the IQ I generally report (including in the LW survey) when answering forms that ask for it and seems like a fairly accurate guess in terms of how I usually interact with people of various IQ levels.

Comment author: gwern 30 November 2012 03:27:25AM *  26 points [-]

I previously mentioned that item non-response might be a good measure of Conscientiousness. Before doing anything fancy with non-response, I first checked that there was a correlation with the questionnaire reports. The correlation is zero:

R> lwc <- subset(lw, !is.na(as.integer(as.character(BigFiveC))))
R> missing_answers <- apply(lwc, 1, function(x) sum(sapply(x, function(y) is.na(y) || as.character(y)==" ")))
R> cor.test(as.integer(as.character(lwc$BigFiveC)), missing_answers)
Pearson's product-moment correlation
data: as.integer(as.character(lwc$BigFiveC)) and missing_answers
t = -0.0061, df = 421, p-value = 0.9952
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.09564 0.09505
sample estimates:
cor
-0.0002954
# visualize to see if we made some mistake somewhere
R> plot(as.integer(as.character(lwc$BigFiveC)), missing_answers)

I am completely surprised. The results in the economics paper looked great and the rationale is very plausible. Yet... The 2 sets of data here have the right ranges, there's plenty of variation in both dimensions, I'm sure I'm catching most of the item non-responses or NAs given that there are non-responses as high as 34, there's a lot of datapoints, and it's not that the correlation is the opposite direction which might indicate a coding error but that there's none at all. Yvain questions the Big Five results, but otherwise they look exactly as I would've predicted before seeing the results: low C and E and A, high O, medium N.
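The per-respondent non-response count used above can also be computed without the nested apply/sapply calls. A minimal sketch on a made-up three-row data frame: the toy data and the name count_missing are illustrative assumptions, not part of the survey export, and it assumes every column is stored as text, with NA or a single space meaning "no answer" as in the code above.

```r
# Toy stand-in for the survey data frame; all values are invented for illustration.
toy <- data.frame(
  Age    = c("25", " ", NA),
  IQ     = c("130", "120", " "),
  Income = c(" ", "50000", NA),
  stringsAsFactors = FALSE
)

# Count, per row, the cells that are NA or a single space.
count_missing <- function(df) {
  chars <- as.matrix(df)                 # character matrix, one cell per answer
  blank <- is.na(chars) | chars == " "   # TRUE | NA is TRUE, so NA cells count as blank
  rowSums(blank)
}

missing_answers <- count_missing(toy)
```

The resulting vector has one count per respondent and can be passed to cor.test in the same way as missing_answers above.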

There may be something very odd about LWers and Conscientiousness; when I try C vs Income, there's an almost-zero correlation again:

R> cor.test(as.integer(as.character(lwc$BigFiveC)), log1p(as.integer(lwc$Income)))
Pearson's product-moment correlation
data: as.integer(as.character(lwc$BigFiveC)) and log1p(as.integer(lwc$Income))
t = 0.2178, df = 421, p-value = 0.8277
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.08482 0.10585
sample estimates:
cor
0.01061

I guess the next step is a linear model on income vs age, Conscientiousness, and IQ:

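# note: each subset() call starts from lw again, so only the Income filter
# survives; lm() drops the rows with remaining NAs itself (see output below)
lwc <- subset(lw, !is.na(as.integer(as.character(BigFiveC))))
lwc <- subset(lw, !is.na(as.integer(as.character(Age))))
lwc <- subset(lw, !is.na(as.integer(as.character(IQ))))
lwc <- subset(lw, !is.na(as.integer(as.character(Income))))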
c <- as.integer(as.character(lwc$BigFiveC))
age <- as.integer(as.character(lwc$Age))
iq <- as.integer(as.character(lwc$IQ))
income <- log1p(as.integer(as.character(lwc$Income)))
summary(lm(income ~ (age + iq + c)))
Call:
lm(formula = income ~ (age + iq + c))
Residuals:
   Min     1Q Median     3Q    Max
-8.762 -0.849  1.191  2.319  3.644
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.5531     3.5479   -0.16     0.88
age           0.1311     0.0323    4.06  9.5e-05
iq            0.0339     0.0267    1.27     0.21
c             0.0174     0.0121    1.44     0.15
Residual standard error: 3.35 on 106 degrees of freedom
(489 observations deleted due to missingness)
Multiple R-squared: 0.196, Adjusted R-squared: 0.173
F-statistic: 8.59 on 3 and 106 DF, p-value: 3.73e-05

So all of them combined don't explain much and most of the work is being done by the age variable... There are many high-income LWers, supposedly (in this subset of respondents reporting age, income, IQ, and Conscientiousness, the max is 700,000), so I'd expect a cumulative r^2 of more than 0.173 for all 3 variables; if those aren't governing income, what is? Maybe everyone working with computers is rich and the others poor? Let's look at everyone who submitted salary and profession and see whether the practical computer people are making bank:

lwi <- subset(lw, !is.na(as.integer(as.character(Income))))
lwi <- subset(lwi, !is.na(as.character(Profession)))
cs <- as.integer(as.character(lwi[as.character(lwi$Profession)=="Computers (practical: IT, programming, etc.)",]$Income))
others <- as.integer(as.character(lwi[as.character(lwi$Profession)!="Computers (practical: IT, programming, etc.)",]$Income))
# ordinary t-test, but we'll exclude anyone with zero income (unemployed?)
t.test(cs[cs!=0],others[others!=0])
Welch Two Sample t-test
data: cs[cs != 0] and others[others != 0]
t = 5.905, df = 309.3, p-value = 9.255e-09
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
22344 44673
sample estimates:
mean of x mean of y
76458 42950

Wow. Just wow. 76k vs 43k. I mean, maybe this would go away with enough fiddling (eg. cost-of-living) but it's still dramatic. This suggests a new theory to me: maybe Conscientiousness does correlate with income at its usual high rate for everyone but computer people, who are simply in such high demand that lack of Conscientiousness doesn't matter:

R> lwi <- subset(lw, !is.na(as.integer(as.character(Income))))
R> lwi <- subset(lwi, !is.na(as.character(Profession)))
R> lwi <- subset(lwi, !is.na(as.integer(as.character(BigFiveC))))
R> cs <- lwi[as.character(lwi$Profession)=="Computers (practical: IT, programming, etc.)",]
R> others <- lwi[as.character(lwi$Profession)!="Computers (practical: IT, programming, etc.)",]
R> cor.test(as.integer(as.character(cs$BigFiveC)), as.integer(as.character(cs$Income)))
Pearson's product-moment correlation
data: as.integer(as.character(cs$BigFiveC)) and as.integer(as.character(cs$Income))
t = 0.5361, df = 87, p-value = 0.5933
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.1527 0.2625
sample estimates:
cor
0.05738
R> cor.test(as.integer(as.character(others$BigFiveC)), as.integer(as.character(others$Income)))
Pearson's product-moment correlation
data: as.integer(as.character(others$BigFiveC)) and as.integer(as.character(others$Income))
t = 1.997, df = 200, p-value = 0.04721
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.001785 0.272592
sample estimates:
cor
0.1398

So for the CS people the correlation is small and non-statistically-significant, for non-CS people the correlation is almost 3x larger and statistically-significant.
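One way to compare the two correlations directly, rather than looking at each p-value separately, is Fisher's z-transformation. The following is only a sketch and was not part of the analysis above; the function name compare_correlations is made up for illustration, and the sample sizes are taken from the two cor.test outputs (n = df + 2). It assumes the two groups are independent samples.

```r
# Fisher z-test for the difference between two independent Pearson correlations.
compare_correlations <- function(r1, n1, r2, n2) {
  z1 <- atanh(r1)                            # Fisher z-transform of each r
  z2 <- atanh(r2)
  se <- sqrt(1 / (n1 - 3) + 1 / (n2 - 3))    # standard error of z2 - z1
  z  <- (z2 - z1) / se
  p  <- 2 * pnorm(-abs(z))                   # two-sided p-value
  list(z = z, p.value = p)
}

# r and n from the outputs above: CS (df = 87) and non-CS (df = 200).
result <- compare_correlations(r1 = 0.05738, n1 = 89, r2 = 0.1398, n2 = 202)
```

A large result$p.value here would mean the gap between the two correlations could plausibly be sampling noise.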

Comment author: Kindly 30 November 2012 04:07:27AM 14 points [-]

There is a correlation of 0.13 between non-responses and N.

Of course, there's also a correlation of -0.13 between C and the random number generator.

Comment author: [deleted] 30 November 2012 10:48:20AM 10 points [-]

People who had seen the RNG give a large number were primed to feel unusually reckless when taking the Big 5 test. Duh. (Just kidding.)

Comment author: NancyLebovitz 30 November 2012 05:13:55AM *  4 points [-]

Were you expecting that people with high C would or wouldn't skip questions? I can see arguments either way. Conscientious people might skip questions they don't have answers to or that they aren't willing to put the time into to give a good answer, or they might put in the work to have answers they consider good to as many questions as possible.

Is it feasible to compare wrong sort of answer with C?

Is it possible that the test for C wasn't very good?

Comment author: gwern 30 November 2012 05:19:50AM 7 points [-]

Were you expecting that people with high C would or wouldn't skip questions?

Wouldn't; that was the claim of the linked paper.

Is it feasible to compare wrong sort of answer with C?

Not really, if it wasn't caught by the no-answer check or the NA check.

Is it possible that the test for C wasn't very good?

As I said, it came out as expected for LW as a whole, and it did correlate with income once the CS salaries were removed... Hard to know what ground-truth there could be to check the scores against.

Comment author: Vaniver 30 November 2012 05:47:35AM 5 points [-]

I am also surprised by this. I wonder about the effect of "I'm taking this survey so I don't have to go to bed / do work / etc.," but I wouldn't have expected that to be as large as the diligence effect.

Also, perhaps look at nonresponse by section? I seem to recall the C part being after the personality test, which might be having some selection effects.

Comment author: gwern 30 November 2012 04:13:11PM 1 point [-]

Also, perhaps look at nonresponse by section? I seem to recall the C part being after the personality test, which might be having some selection effects.

What do you mean? I can't compare non-response with anyone who didn't supply a C score, and there were plenty of questions to non-response on after the personality test section.

Comment author: Vaniver 30 November 2012 05:28:39PM 2 points [-]

It seems to me that other survey non-response may be uncorrelated with C once you condition on taking a long personality survey, especially if the personality survey doesn't allow nonresponse. (I seem to recall taking all of the optional surveys and considering the personality one the most boring. I don't know how much that generalizes to other people.) The first way that comes to mind to gather information for this is to compare the nonresponse of people who supplied personality scores and people who didn't, but that isn't a full test unless you can come up with another way to link the nonresponse to C.

I was thinking it might help to break down the responses by section, and seeing if nonresponse to particular sections was correlated with C, but the result could only be that some sections are anticorrelated if a few are correlated. So that probably won't get you anything.

Comment author: gwern 01 December 2012 12:21:20AM 2 points [-]

It seems to me that other survey non-response may be uncorrelated with C once you condition on taking a long personality survey, especially if the personality survey doesn't allow nonresponse. (I seem to recall taking all of the optional surveys and considering the personality one the most boring. I don't know how much that generalizes to other people.)

Why would the strong correlation go away after adding a floor? That would simply restrict the range... if that were true, we'd expect to see a cutoff for all C scores but in fact we see plenty of very low C scores being reported.

The first way that comes to mind to gather information for this is to compare the nonresponse of people who supplied personality scores and people who didn't, but that isn't a full test unless you can come up with another way to link the nonresponse to C.

Yes. You'd expect, by definition, that people who answered the personality questions would have fewer non-responses than the people who didn't... That's pretty obvious and true:

R> lwc <- subset(lw, !is.na(as.integer(as.character(BigFiveC))))
R> missing_answers1 <- apply(lwc, 1, function(x) sum(sapply(x, function(y) is.na(y) || as.character(y)==" ")))
R> lwnc <- subset(lw, is.na(as.integer(as.character(BigFiveC))))
R> missing_answers2 <- apply(lwnc, 1, function(x) sum(sapply(x, function(y) is.na(y) || as.character(y)==" ")))
R> t.test(missing_answers1, missing_answers2)
Welch Two Sample t-test
data: missing_answers1 and missing_answers2
t = -25.19, df = 806.8, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-18.77 -16.05
sample estimates:
mean of x mean of y
9.719 27.129

Comment author: Vaniver 30 November 2012 05:24:48AM 12 points [-]

According to IQ Comparison Site, an SAT score of 1485/1600 corresponds to an IQ of about 144. According to Ivy West, an ACT of 33 corresponds to an SAT of 1470 (and thence to IQ of 143).

Only if you took the SAT before 1994. Here's the percentiles for SATs taken in 2012; someone who was 97th percentile would get ~760 on math and ~730 on critical reading, adding up to 1490 (leaving alone the writing section to keep it within 1600), and 97th percentile corresponds to an IQ of 128.
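The percentile-to-IQ step can be sketched as follows, assuming IQ is normally distributed with mean 100 and standard deviation 15 (an assumption about the scale, and the function names are made up for illustration):

```r
# Convert between percentiles and IQ, assuming IQ ~ Normal(mean = 100, sd = 15).
percentile_to_iq <- function(p, mean = 100, sd = 15) {
  qnorm(p, mean = mean, sd = sd)
}

iq_to_percentile <- function(iq, mean = 100, sd = 15) {
  pnorm(iq, mean = mean, sd = sd)
}

iq_97  <- percentile_to_iq(0.97)   # about 128
pct_144 <- iq_to_percentile(144)   # share of the population below IQ 144
```

On a scale with a different standard deviation (some tests use 16 or 24), the same percentile maps to a different number, so the sd argument matters when comparing scores across tests.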

Here's a classic calibration chart.

An important part of the calibration chart (for people) is the frequency of times that they provide various calibrations. Looking at your table, I would focus on the large frequency between 10% and 30%.

I'll also point out that fixed windows are a pretty bad way to do elicitation. I tend to come at the calibration question from the practical side: how do we get useful probabilities out of subject-matter experts without those people being experts at calibration? Adopting those strategies seems more useful than making people experts at calibration.

Comment author: Epiphany 30 November 2012 07:09:06AM *  2 points [-]

Alternate Explanations for LW's Calibration Atrociousness:

Maybe a lot of the untrained people simply looked up the answer to the question. If you did not rule that out with your study methods, then consider seeing whether a suspiciously large number of them entered the exact right year?

Maybe LWers were suffering from something slightly different from the overconfidence bias you're hoping to detect: difficulty admitting that they have no idea when Thomas Bayes was born because they feel they should really know that.

Comment author: Kindly 30 November 2012 10:35:25PM *  7 points [-]

The mean was 1768, the median 1780, and the mode 1800. Only 169 of 1006 people who answered the question got an answer within 20 years of 1701. Moreover, the three people that admitted to looking it up (and therefore didn't give a calibration) all gave incorrect answers: 1750, 1759, and 1850. So it seems like your first explanation can't be right.

After trying a bunch of modifications to the data, it seems like the best explanation is that the poor calibration happened because people didn't think about the error margin carefully enough. If we change the error margin to 80 years instead of 20, then the responses seem to look roughly like the untrained example from the graph in Yvain's analysis.

Another observation is that after we drop the 45 people who gave confidence levels >85% (and in fact, 89% of them were right), the remaining data is absolutely abysmal: the remaining answers are essentially uncorrelated with the confidence levels.

This suggests that there were a few pretty knowledgeable people who got the answer right and that was that. Everyone else just guessed and didn't know how to calibrate; this may correspond to your second explanation.
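The kind of check described above (what fraction of answers in each confidence bucket land within the error margin, and how that changes when the margin is widened from 20 to 80 years) can be sketched like this. The responses below are invented for illustration and are not survey data; calibration_table and the bucket boundaries are likewise assumptions.

```r
# Hypothetical responses: guessed birth year and stated confidence (0-100)
# that the guess is within the margin of the true year.
guesses    <- c(1700, 1710, 1750, 1800, 1650, 1780, 1695, 1850)
confidence <- c(  90,   60,   50,   30,   20,   40,   95,   10)

# Fraction of correct answers in each confidence bucket.
calibration_table <- function(guesses, confidence, truth = 1701, margin = 20,
                              breaks = c(0, 25, 50, 75, 100)) {
  correct <- abs(guesses - truth) <= margin
  bucket  <- cut(confidence, breaks = breaks, include.lowest = TRUE)
  data.frame(
    bucket   = levels(bucket),
    n        = as.vector(table(bucket)),
    accuracy = as.vector(tapply(correct, bucket, mean))
  )
}

strict  <- calibration_table(guesses, confidence, margin = 20)
relaxed <- calibration_table(guesses, confidence, margin = 80)
```

For a well-calibrated group, the accuracy column would sit close to the confidence range of each bucket; comparing strict and relaxed shows how much of the miscalibration depends on the margin.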

Comment author: Epiphany 01 December 2012 03:05:08AM *  2 points [-]

Good points, Kindly, thank you. New alternate explanation idea:

When these people encounter this question, they're slogging through this huge survey. They're not doing an IQ test. This is more casual. They're being asked stuff like "How many partners do you have?" By the time they get down to that question, they're probably in a casual answering mode, and they're probably a little tired and looking for an efficient way to finish. When they see the Bayes question, they're probably not thinking "This question is so important! They're going to be gauging LessWrong's rationality progress with it! I had better really think about this!" They're probably like "Output answer, next question."

If we really want to test them, we need to make it clear that we're testing them. And if we want them to be serious about it, we have to make it clear that it's important. I hypothesize that if we were to do a test (not a survey) and explain that it's serious because we're gauging LessWrong's progress, and also make it short so that the person can focus a lot of attention onto each question, we'd see less atrocious results.

In hindsight, I wonder why I didn't think about the effects of context before. Yvain didn't seem to either; he thought something might be wrong with the question. This seems like one of those things that is right in front of our faces but is hard to see.

I think that people may be rationing their mental stamina, and may not be going through all the steps it takes to answer this type of question.

Comment author: Kindly 01 December 2012 03:56:31AM 2 points [-]

If you'll excuse the expression, I'm suspicious of your sudden epiphany. That is, I accept your suggestion as a possible explanation (although I'm not convinced, mainly because this doesn't describe the way I answered the question; I don't know about anyone else). But I think saying "Oh gosh! The true answer has been staring us in the face all along!" is premature.

Comment author: katydee 01 December 2012 03:59:28AM 7 points [-]

If we really want to test them, we need to make it clear that we're testing them. And if we want them to be serious about it, we have to make it clear that it's important.

Uh, what? The point of LessWrong is to make people better all the time, not just better when they think "ah, now it's time to turn on my rationality skills." If people aren't applying those skills when they don't know they're being tested, that's a very serious problem, because it means the skills aren't actually ingrained on the deep and fundamental level that we want.