2013 Survey Results
Thanks to everyone who took the 2013 Less Wrong Census/Survey. Extra thanks to Ozy, who helped me out with the data processing and statistics work, and to everyone who suggested questions.
This year's results are below. Some of them may make more sense in the context of the original survey questions, which can be seen here. Please do not try to take the survey as it is over and your results will not be counted.
I. Population
1636 people answered the survey.
Compare this to 1195 people last year, and 1090 people the year before that. It would seem the site is growing, but we do have to consider that each survey lasted a different amount of time; for example, last survey lasted 23 days, but this survey lasted 40.
However, almost everyone who takes the survey takes it in the first few weeks it is available. 1506 of the respondents answered within the first 23 days, so even if the survey had run the same length as last year's, there would still have been growth.
As we will see lower down, growth is smooth across all categories of users (lurkers, commenters, posters) EXCEPT people who have posted to Main, the number of which remains nearly the same from year to year.
We continue to have very high turnover - only 40% of respondents this year say they also took the survey last year.
II. Categorical Data
SEX:
Female: 161, 9.8%
Male: 1453, 88.8%
Other: 1, 0.1%
Did not answer: 21, 1.3%
[[Ozy is disappointed that we've lost 50% of our intersex readers.]]
GENDER:
F (cisgender): 140, 8.6%
F (transgender MtF): 20, 1.2%
M (cisgender): 1401, 85.6%
M (transgender FtM): 5, 0.3%
Other: 49, 3%
Did not answer: 21, 1.3%
SEXUAL ORIENTATION:
Asexual: 47, 2.9%
Bisexual: 188, 12.2%
Heterosexual: 1287, 78.7%
Homosexual: 45, 2.8%
Other: 39, 2.4%
Did not answer: 19, 1.2%
RELATIONSHIP STYLE:
Prefer monogamous: 829, 50.7%
Prefer polyamorous: 234, 14.3%
Other: 32, 2.0%
Uncertain/no preference: 520, 31.8%
Did not answer: 21, 1.3%
NUMBER OF CURRENT PARTNERS:
0: 797, 48.7%
1: 728, 44.5%
2: 66, 4.0%
3: 21, 1.3%
4: 1, .1%
6: 3, .2%
Did not answer: 20, 1.2%
RELATIONSHIP STATUS:
Married: 304, 18.6%
Relationship: 473, 28.9%
Single: 840, 51.3%
RELATIONSHIP GOALS:
Looking for more relationship partners: 617, 37.7%
Not looking for more relationship partners: 993, 60.7%
Did not answer: 26, 1.6%
HAVE YOU DATED SOMEONE YOU MET THROUGH THE LESS WRONG COMMUNITY?
Yes: 53, 3.3%
I didn't meet them through the community but they're part of the community now: 66, 4.0%
No: 1482, 90.5%
Did not answer: 35, 2.1%
COUNTRY:
United States: 895, 54.7%
United Kingdom: 144, 8.8%
Canada: 107, 6.5%
Australia: 69, 4.2%
Germany: 68, 4.2%
Finland: 35, 2.1%
Russia: 22, 1.3%
New Zealand: 20, 1.2%
Israel: 17, 1.0%
France: 16, 1.0%
Poland: 16, 1.0%
LESS WRONGERS PER CAPITA:
Finland: 1/154,685.
New Zealand: 1/221,650.
Canada: 1/325,981.
Australia: 1/328,659.
United States: 1/350,726
United Kingdom: 1/439,097
Israel: 1/465,176.
Germany: 1/1,204,264.
Poland: 1/2,408,750.
France: 1/4,106,250.
Russia: 1/6,522,727
RACE:
Asian (East Asian): 60, 3.7%
Asian (Indian subcontinent): 37, 2.3%
Black: 11, .7%
Middle Eastern: 9, .6%
White (Hispanic): 73, 4.5%
White (non-Hispanic): 1373, 83.9%
Other: 51, 3.1%
Did not answer: 22, 1.3%
WORK STATUS:
Academics (teaching): 77, 4.7%
For-profit work: 552, 33.7%
Government work: 55, 3.4%
Independently wealthy: 14, .9%
Non-profit work: 46, 2.8%
Self-employed: 103, 6.3%
Student: 661, 40.4%
Unemployed: 105, 6.4%
Did not answer: 23, 1.4%
PROFESSION:
Art: 27, 1.7%
Biology: 26, 1.6%
Business: 44, 2.7%
Computers (AI): 47, 2.9%
Computers (other academic computer science): 107, 6.5%
Computers (practical): 505, 30.9%
Engineering: 128, 7.8%
Finance/economics: 92, 5.6%
Law: 36, 2.2%
Mathematics: 139, 8.5%
Medicine: 31, 1.9%
Neuroscience: 13, .8%
Philosophy: 41, 2.5%
Physics: 92, 5.6%
Psychology: 34, 2.1%
Statistics: 23, 1.4%
Other hard science: 31, 1.9%
Other social science: 43, 2.6%
Other: 139, 8.5%
Did not answer: 38, 2.3%
DEGREE:
None: 84, 5.1%
High school: 444, 27.1%
2 year degree: 68, 4.2%
Bachelor's: 554, 33.9%
Master's: 323, 19.7%
MD/JD/other professional degree: 31, 2.0%
PhD.: 90, 5.5%
Other: 22, 1.3%
Did not answer: 19, 1.2%
POLITICAL:
Communist: 11, .7%
Conservative: 64, 3.9%
Liberal: 580, 35.5%
Libertarian: 437, 26.7%
Socialist: 502, 30.7%
Did not answer: 42, 2.6%
COMPLEX POLITICAL WITH WRITE-IN:
Anarchist: 52, 3.2%
Conservative: 16, 1.0%
Futarchist: 42, 2.6%
Left-libertarian: 142, 8.7%
Liberal: 5
Moderate: 53, 3.2%
Pragmatist: 110, 6.7%
Progressive: 206, 12.6%
Reactionary: 40, 2.4%
Social democrat: 154, 9.5%
Socialist: 135, 8.2%
Did not answer: 26.2%
[[All answers with more than 1% of the Less Wrong population included. Other answers which made Ozy giggle included "are any of you kings?! why do you CARE?!", "Exclusionary: you are entitled to an opinion on nuclear power when you know how much of your power is nuclear", "having-well-founded-opinions-is-really-hard-ist", "kleptocrat", "pirate", and "SPECIAL FUCKING SNOWFLAKE."]]
AMERICAN PARTY AFFILIATION:
Democratic Party: 226, 13.8%
Libertarian Party: 31, 1.9%
Republican Party: 58, 3.5%
Other third party: 19, 1.2%
Not registered: 447, 27.3%
Did not answer or non-American: 856, 52.3%
VOTING:
Yes: 936, 57.2%
No: 450, 27.5%
My country doesn't hold elections: 2, 0.1%
Did not answer: 249, 15.2%
RELIGIOUS VIEWS:
Agnostic: 165, 10.1%
Atheist and not spiritual: 1163, 71.1%
Atheist but spiritual: 132, 8.1%
Deist/pantheist/etc.: 36, 2.2%
Lukewarm theist: 53, 3.2%
Committed theist: 64, 3.9%
RELIGIOUS DENOMINATION (IF THEIST):
Buddhist: 22, 1.3%
Christian (Catholic): 44, 2.7%
Christian (Protestant): 56, 3.4%
Jewish: 31, 1.9%
Mixed/Other: 21, 1.3%
Unitarian Universalist or similar: 25, 1.5%
[[This includes all religions with more than 1% of Less Wrongers. Minority religions include Dzogchen, Daoism, various sorts of Paganism, Simulationist, a very confused secular humanist, Kopmist, Discordian, and a Cultus Deorum Romanum practitioner whom Ozy wants to be friends with.]]
FAMILY RELIGION:
Agnostic: 129, 11.6%
Atheist and not spiritual: 225, 13.8%
Atheist but spiritual: 73, 4.5%
Committed theist: 423, 25.9%
Deist/pantheist, etc.: 42, 2.6%
Lukewarm theist: 563, 34.4%
Mixed/other: 97, 5.9%
Did not answer: 24, 1.5%
RELIGIOUS BACKGROUND:
Bahai: 3, 0.2%
Buddhist: 13, .8%
Christian (Catholic): 418, 25.6%
Christian (Mormon): 38, 2.3%
Christian (Protestant): 631, 38.4%
Christian (Quaker): 7, 0.4%
Christian (Unitarian Universalist or similar): 32, 2.0%
Christian (other non-Protestant): 99, 6.1%
Christian (unknown): 3, 0.2%
Eckankar: 1, 0.1%
Hindu: 29, 1.8%
Jewish: 136, 8.3%
Muslim: 12, 0.7%
Native American Spiritualist: 1, 0.1%
Mixed/Other: 85, 5.3%
Sikhism: 1, 0.1%
Traditional Chinese: 11, .7%
Wiccan: 1, 0.1%
None: 8, 0.4%
Did not answer: 107, 6.7%
MORAL VIEWS:
Accept/lean towards consequentialism: 1049, 64.1%
Accept/lean towards deontology: 77, 4.7%
Accept/lean towards virtue ethics: 197, 12.0%
Other/no answer: 276, 16.9%
Did not answer: 37, 2.3%
CHILDREN:
0: 1414, 86.4%
1: 77, 4.7%
2: 90, 5.5%
3: 25, 1.5%
4: 7, 0.4%
5: 1, 0.1%
6: 2, 0.1%
Did not answer: 20, 1.2%
MORE CHILDREN:
Have no children, don't want any: 506, 31.3%
Have no children, uncertain if want them: 472, 29.2%
Have no children, want children: 431, 26.7%
Have no children, didn't answer: 5, 0.3%
Have children, don't want more: 124, 7.6%
Have children, uncertain if want more: 25, 1.5%
Have children, want more: 53, 3.2%
HANDEDNESS:
Right: 1256, 76.6%
Left: 145, 9.5%
Ambidextrous: 36, 2.2%
Not sure: 7, 0.4%
Did not answer: 182, 11.1%
LESS WRONG USE:
Lurker (no account): 584, 35.7%
Lurker (account): 221, 13.5%
Poster (comment, no post): 495, 30.3%
Poster (Discussion, not Main): 221, 12.9%
Poster (Main): 103, 6.3%
SEQUENCES:
Never knew they existed: 119, 7.3%
Knew they existed, didn't look at them: 48, 2.9%
~25% of the Sequences: 200, 12.2%
~50% of the Sequences: 271, 16.6%
~75% of the Sequences: 225, 13.8%
All the Sequences: 419, 25.6%
Did not answer: 24, 1.5%
MEETUPS:
No: 1134, 69.3%
Yes, once or a few times: 307, 18.8%
Yes, regularly: 159, 9.7%
HPMOR:
No: 272, 16.6%
Started it, haven't finished: 255, 15.6%
Yes, all of it: 912, 55.7%
CFAR WORKSHOP ATTENDANCE:
Yes, a full workshop: 105, 6.4%
A class but not a full-day workshop: 40, 2.4%
No: 1446, 88.3%
Did not answer: 46, 2.8%
PHYSICAL INTERACTION WITH LW COMMUNITY:
Yes, all the time: 94, 5.7%
Yes, sometimes: 179, 10.9%
No: 1316, 80.4%
Did not answer: 48, 2.9%
VEGETARIAN:
No: 1201, 73.4%
Yes: 213, 13.0%
Did not answer: 223, 13.6%
SPACED REPETITION:
Never heard of them: 363, 22.2%
No, but I've heard of them: 495, 30.2%
Yes, in the past: 328, 20%
Yes, currently: 219, 13.4%
Did not answer: 232, 14.2%
HAVE YOU TAKEN PREVIOUS INCARNATIONS OF THE LESS WRONG SURVEY?
Yes: 638, 39.0%
No: 784, 47.9%
Did not answer: 215, 13.1%
PRIMARY LANGUAGE:
English: 1009, 67.8%
German: 58, 3.6%
Finnish: 29, 1.8%
Russian: 25, 1.6%
French: 17, 1.0%
Dutch: 16, 1.0%
Did not answer: 15.2%
[[This includes all answers that more than 1% of respondents chose. Other languages include Urdu, both Czech and Slovakian, Latvian, and Love.]]
ENTREPRENEUR:
I don't want to start my own business: 617, 37.7%
I am considering starting my own business: 474, 29.0%
I plan to start my own business: 113, 6.9%
I've already started my own business: 156, 9.5%
Did not answer: 277, 16.9%
EFFECTIVE ALTRUIST:
Yes: 468, 28.6%
No: 883, 53.9%
Did not answer: 286, 17.5%
WHO ARE YOU LIVING WITH?
Alone: 348, 21.3%
With family: 420, 25.7%
With partner/spouse: 400, 24.4%
With roommates: 450, 27.5%
Did not answer: 19, 1.3%
DO YOU GIVE BLOOD?
No: 646, 39.5%
No, only because I'm not allowed: 157, 9.6%
Yes: 609, 37.2%
Did not answer: 225, 13.7%
GLOBAL CATASTROPHIC RISK:
Pandemic (bioengineered): 374, 22.8%
Environmental collapse including global warming: 251, 15.3%
Unfriendly AI: 233, 14.2%
Nuclear war: 210, 12.8%
Pandemic (natural): 145, 8.8%
Economic/political collapse: 175, 10.7%
Asteroid strike: 65, 3.9%
Nanotech/grey goo: 57, 3.5%
Didn't answer: 99, 6.0%
CRYONICS STATUS:
Never thought about it / don't understand it: 69, 4.2%
No, and don't want to: 414, 25.3%
No, still considering: 636, 38.9%
No, would like to: 265, 16.2%
No, would like to, but it's unavailable: 119, 7.3%
Yes: 66, 4.0%
Didn't answer: 68, 4.2%
NEWCOMB'S PROBLEM:
Don't understand/prefer not to answer: 92, 5.6%
Not sure: 103, 6.3%
One box: 1036, 63.3%
Two box: 119, 7.3%
Did not answer: 287, 17.5%
GENOMICS:
Yes: 177, 10.8%
No: 1219, 74.5%
Did not answer: 241, 14.7%
REFERRAL TYPE:
Been here since it started in the Overcoming Bias days: 285, 17.4%
Referred by a friend: 241, 14.7%
Referred by a search engine: 148, 9.0%
Referred by HPMOR: 400, 24.4%
Referred by a link on another blog: 373, 22.8%
Referred by a school course: 1, .1%
Other: 160, 9.8%
Did not answer: 29, 1.9%
REFERRAL SOURCE:
Common Sense Atheism: 33
Slate Star Codex: 20
Hacker News: 18
Reddit: 18
TVTropes: 13
Y Combinator: 11
Gwern: 9
RationalWiki: 8
Marginal Revolution: 7
Unequally Yoked: 6
Armed and Dangerous: 5
Shtetl Optimized: 5
Econlog: 4
StumbleUpon: 4
Yudkowsky.net: 4
Accelerating Future: 3
Stares at the World: 3
xkcd: 3
David Brin: 2
Freethoughtblogs: 2
Felicifia: 2
Givewell: 2
hatrack.com: 2
HPMOR: 2
Patri Friedman: 2
Popehat: 2
Overcoming Bias: 2
Scientiststhesis: 2
Scott Young: 2
Stardestroyer.net: 2
TalkOrigins: 2
Tumblr: 2
[[This includes all sources with more than one referral; needless to say there was a long tail]]
III. Numeric Data
(in the form mean + stdev (1st quartile, 2nd quartile, 3rd quartile) [n = number responding])
Age: 27.4 + 8.5 (22, 25, 31) [n = 1558]
Height: 176.6 cm + 16.6 (173, 178, 183) [n = 1267]
Karma Score: 504 + 2085 (0, 0, 100) [n = 1438]
Time in community: 2.62 years + 1.84 (1, 2, 4) [n = 1443]
Time on LW: 13.25 minutes/day + 20.97 (2, 10, 15) [n = 1457]
IQ: 138.2 + 13.6 (130, 138, 145) [n = 506]
SAT out of 1600: 1474 + 114 (1410, 1490, 1560) [n = 411]
SAT out of 2400: 2207 + 161 (2130, 2240, 2330) [n = 333]
ACT out of 36: 32.8 + 2.5 (32, 33, 35) [n = 265]
P(Aliens in observable universe): 74.3 + 32.7 (60, 90, 99) [n = 1496]
P(Aliens in Milky Way): 44.9 + 38.2 (5, 40, 85) [n = 1482]
P(Supernatural): 7.7 + 22 (0E-9, .000055, 1) [n = 1484]
P(God): 9.1 + 22.9 (0E-11, .01, 3) [n = 1490]
P(Religion): 5.6 + 19.6 (0E-11, 0E-11, .5) [n = 1497]
P(Cryonics): 22.8 + 28 (2, 10, 33) [n = 1500]
P(AntiAgathics): 27.6 + 31.2 (2, 10, 50) [n = 1493]
P(Simulation): 24.1 + 28.9 (1, 10, 50) [n = 1400]
P(ManyWorlds): 50 + 29.8 (25, 50, 75) [n = 1373]
P(Warming): 80.7 + 25.2 (75, 90, 98) [n = 1509]
P(Global catastrophic risk): 72.9 + 25.41 (60, 80, 95) [n = 1502]
Singularity year: 1.67E+11 + 4.089E+12 (2060, 2090, 2150) [n = 1195]
[[Of course, this question was hopelessly screwed up by people who insisted on filling the whole answer field with 9s, or other such nonsense. I went back and eliminated all outliers - answers with more than 4 digits or answers in the past - which changed the results to: 2150 + 226 (2060, 2089, 2150)]]
Yearly Income: $73,226 + 423,310 (10,000, 37,000, 80,000) [n = 910]
Yearly Charity: $1181.16 + 6037.77 (0, 50, 400) [n = 1231]
Yearly Charity to MIRI/CFAR: $307.18 + 4205.37 (0, 0, 0) [n = 1191]
Yearly Charity to X-risk (excluding MIRI or CFAR): $6.34 + 55.89 (0, 0, 0) [n = 1150]
Number of Languages: 1.49 + .8 (1, 1, 2) [n = 1345]
Older Siblings: 0.5 + 0.9 (0, 0, 1) [n = 1366]
Time Online/Week: 42.7 hours + 24.8 (25, 40, 60) [n = 1292]
Time Watching TV/Week: 4.2 hours + 5.7 (0, 2, 5) [n = 1316]
[[The next nine questions ask respondents to rate how favorable they are to the political idea or movement above on a scale of 1 to 5, with 1 being "not at all favorable" and 5 being "very favorable". You can see the exact wordings of the questions on the survey.]]
Abortion: 4.4 + 1 (4, 5, 5) [n = 1350]
Immigration: 4.1 + 1 (3, 4, 5) [n = 1322]
Basic Income: 3.8 + 1.2 (3, 4, 5) [n = 1289]
Taxes: 3.1 + 1.3 (2, 3, 4) [n = 1296]
Feminism: 3.8 + 1.2 (3, 4, 5) [n = 1329]
Social Justice: 3.6 + 1.3 (3, 4, 5) [n = 1263]
Minimum Wage: 3.2 + 1.4 (2, 3, 4) [n = 1290]
Great Stagnation: 2.3 + 1 (2, 2, 3) [n = 1273]
Human Biodiversity: 2.7 + 1.2 (2, 3, 4) [n = 1305]
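The summary format used in this section, and the cleanup applied to the Singularity year answers, can be sketched roughly as follows. This is only an illustration, not the code used for the survey: the function names are made up, the choice of sample standard deviation and of Python's default quartile method are assumptions, and so is the decision to treat the survey year itself as "not in the past".

```python
import statistics

def summarize(values):
    """Format numbers as 'mean + stdev (Q1, Q2, Q3) [n = N]'."""
    # Quartiles via the standard library's default ("exclusive") method.
    q1, q2, q3 = statistics.quantiles(values, n=4)
    mean = statistics.mean(values)
    # Sample standard deviation; the post does not say which kind was used.
    stdev = statistics.stdev(values)
    return f"{mean:.1f} + {stdev:.1f} ({q1:g}, {q2:g}, {q3:g}) [n = {len(values)}]"

def clean_singularity_years(answers, survey_year=2013):
    """Drop answers with more than 4 digits or answers in the past."""
    kept = []
    for year in answers:
        if year < survey_year:        # answer in the past
            continue
        if len(str(int(year))) > 4:   # e.g. a field filled with 9s
            continue
        kept.append(year)
    return kept
```

Running summarize on the cleaned list rather than the raw one is what keeps a handful of absurd answers from dominating the mean and standard deviation.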
IV. Bivariate Correlations
Ozy ran bivariate correlations between all the numerical data and recorded all correlations that were significant at the .001 level in order to maximize the chance that these are genuine results. The format is variable/variable: Pearson correlation (n). Yvain is not hugely on board with the idea of running correlations between everything and seeing what sticks, but will grudgingly publish the results because of the very high bar for significance (p < .001 on ~800 correlations suggests < 1 spurious result) and because he doesn't want to have to do it himself.
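As a rough illustration of the two ideas in that paragraph (a Pearson correlation, and the reasoning behind "< 1 spurious result"), here is a minimal sketch. The function names are hypothetical, and the false-positive estimate assumes every one of the ~800 tested pairs is truly uncorrelated, which is the worst case.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def expected_false_positives(num_tests, alpha):
    """Expected number of 'significant' results if no real effects exist."""
    return num_tests * alpha

# About 800 correlations at p < .001 gives roughly 0.8 expected flukes.
spurious = expected_false_positives(800, 0.001)
```

Only respondents who answered both questions can contribute to a given pair, which is presumably why the n in parentheses differs from line to line.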
Less Political:
SAT score (1600)/SAT score (2400): .835 (56)
Charity/MIRI and CFAR donations: .730 (1193)
SAT score out of 2400/ACT score: .673 (111)
SAT score out of 1600/ACT score: .544 (102)
Number of children/age: .507 (1607)
P(Cryonics)/P(AntiAgathics): .489 (1515)
SAT score out of 1600/IQ: .369 (173)
MIRI and CFAR donations/XRisk donations: .284 (1178)
Number of children/ACT score: -.279 (269)
Income/charity: .269 (884)
Charity/Xrisk charity: .262 (1161)
P(Cryonics)/P(Simulation): .256 (1419)
P(AntiAgathics)/P(Simulation): .253 (1418)
Number of current partners/age: .238 (1607)
Number of children/SAT score (2400): -.223 (345)
Number of current partners/number of children: .205 (1612)
SAT score out of 1600/age: -.194 (422)
Charity/age: .175 (1259)
Time on Less Wrong/IQ: -.164 (492)
P(Warming)/P(GlobalCatastrophicRisk): .156 (1522)
Number of current partners/IQ: .155 (521)
P(Simulation)/age: -.153 (1420)
Immigration/P(ManyWorlds): .150 (1195)
Income/age: .150 (930)
P(Cryonics)/age: -.148 (1521)
Income/children: .145 (931)
P(God)/P(Simulation): .142 (1409)
Number of children/P(Aliens): .140 (1523)
P(AntiAgathics)/Hours Online: .138 (1277)
Number of current partners/karma score: .137 (1470)
Abortion/P(ManyWorlds): .122 (1215)
Feminism/Xrisk charity donations: -.122 (1104)
P(AntiAgathics)/P(ManyWorlds): .118 (1381)
P(Cryonics)/P(ManyWorlds): .117 (1387)
Karma score/Great Stagnation: .114 (1202)
Hours online/P(simulation): .114 (1199)
P(Cryonics)/Hours Online: .113 (1279)
P(AntiAgathics)/Great Stagnation: -.111 (1259)
Basic income/hours online: .111 (1200)
P(GlobalCatastrophicRisk)/Great Stagnation: -.110 (1270)
Age/X risk charity donations: .109 (1176)
P(AntiAgathics)/P(GlobalCatastrophicRisk): -.109 (1513)
Time on Less Wrong/age: -.108 (1491)
P(AntiAgathics)/Human Biodiversity: .104 (1286)
Immigration/Hours Online: .104 (1226)
P(Simulation)/P(GlobalCatastrophicRisk): -.103 (1421)
P(Supernatural)/height: -.101 (1232)
P(GlobalCatastrophicRisk)/height: .101 (1249)
Number of children/hours online: -.099 (1321)
P(AntiAgathics)/age: -.097 (1514)
Karma score/time on LW: .096 (1404)
This year for the first time P(Aliens) and P(Aliens2) are entirely uncorrelated with each other. Time in Community, Time on LW, and IQ are not correlated with anything particularly interesting, suggesting all three fail to change people's views.
Results we find amusing: high-IQ and high-karma people have more romantic partners, suggesting that those are attractive traits. There is definitely a Cryonics/Antiagathics/Simulation/Many Worlds cluster of weird beliefs, which younger people and people who spend more time online are slightly more likely to have - weirdly, that cluster seems slightly less likely to believe in global catastrophic risk. Older people and people with more children have more romantic partners (it'd be interesting to see if that holds true for the polyamorous). People who believe in anti-agathics and global catastrophic risk are less likely to believe in a great stagnation (presumably because both of the above rely on inventions). People who spend more time on Less Wrong have lower IQs. Height is, bizarrely, correlated with belief in the supernatural and global catastrophic risk.
All political viewpoints are correlated with each other in pretty much exactly the way one would expect. They are also correlated with one's level of belief in God, the supernatural, and religion. There are minor correlations with some of the beliefs and number of partners (presumably because polyamory), number of children, and number of languages spoken. We are doing terribly at avoiding Blue/Green politics, people.
More Political:
P(Supernatural)/P(God): .736 (1496)
P(Supernatural)/P(Religion): .667 (1492)
Minimum wage/taxes: .649 (1299)
P(God)/P(Religion): .631 (1496)
Feminism/social justice: .619 (1293)
Social justice/minimum wage: .508 (1262)
P(Supernatural)/abortion: -.469 (1309)
Taxes/basic income: .463 (1285)
P(God)/abortion: -.461 (1310)
Social justice/taxes: .456 (1267)
P(Religion)/abortion: -.413
Basic income/minimum wage: .392 (1283)
Feminism/taxes: .391 (1318)
Feminism/minimum wage: .391 (1312)
Feminism/human biodiversity: -.365 (1331)
Immigration/feminism: .355 (1336)
P(Warming)/taxes: .340 (1292)
Basic income/social justice: .311 (1270)
Immigration/social justice: .307 (1275)
P(Warming)/feminism: .294 (1323)
Immigration/human biodiversity: -.292 (1313)
P(Warming)/basic income: .290 (1287)
Social justice/human biodiversity: -.289 (1281)
Basic income/feminism: .284 (1313)
Human biodiversity/minimum wage: -.273 (1293)
P(Warming)/social justice: .271 (1261)
P(Warming)/minimum wage: .262 (1284)
Human biodiversity/taxes: -.251 (1270)
Abortion/feminism: .239 (1356)
Abortion/social justice: .220 (1292)
P(Warming)/immigration: .215 (1315)
Abortion/immigration: .211 (1353)
P(Warming)/abortion: .192 (1340)
Immigration/taxes: .186 (1322)
Basic income/taxes: .174 (1249)
Abortion/taxes: .170 (1328)
Abortion/minimum wage: .169 (1317)
P(warming)/human biodiversity: -.168 (1301)
Abortion/basic income: .168 (1314)
Immigration/Great Stagnation: -.163 (1281)
P(God)/feminism: -.159 (1294)
P(Supernatural)/feminism: -.158 (1292)
Human biodiversity/Great Stagnation: .152 (1287)
Social justice/Great Stagnation: -.135 (1242)
Number of languages/taxes: -.133 (1242)
P(God)/P(Warming): -.132 (1491)
P(Supernatural)/immigration: -.131 (1284)
P(Religion)/immigration: -.129 (1296)
P(God)/immigration: -.127 (1286)
P(Supernatural)/P(Warming): -.125 (1487)
P(Supernatural)/social justice: -.125 (1227)
P(God)/taxes: -.145
Minimum wage/Great Stagnation: -.124 (1269)
Immigration/minimum wage: .122 (1308)
Great Stagnation/taxes: -.121 (1270)
P(Religion)/P(Warming): -.113 (1505)
P(Supernatural)/taxes: -.113 (1265)
Feminism/Great Stagnation: -.112 (1295)
Number of children/abortion: -.112 (1386)
P(Religion)/basic income: -.108 (1296)
Number of current partners/feminism: .108 (1364)
Basic income/human biodiversity: -.106 (1301)
P(God)/Basic Income: -.105 (1255)
Number of current partners/basic income: .105 (1320)
Human biodiversity/number of languages: .103 (1253)
Number of children/basic income: -.099 (1322)
Number of children/P(Warming): -.091 (1535)
V. Hypothesis Testing
A. Do people in the effective altruism movement donate more money to charity? Do they donate a higher percent of their income to charity? Are they just generally more altruistic people?
1265 people told us how much they give to charity; of those, 450 gave nothing. On average, effective altruists (n = 412) donated $2503 to charity, and other people (n = 853) donated $523 - obviously a significant result. Effective altruists gave on average $800 to MIRI or CFAR, whereas others gave $53. Effective altruists gave on average $16 to other x-risk related charities; others gave only $2.
In order to calculate percent donated I divided charity donations by income in the 947 people helpful enough to give me both numbers. Of those 947, 602 donated nothing to charity, and so had a percent donated of 0. At the other extreme, three people donated 50% of their (substantial) incomes to charity, and 55 people donated at least 10%. I don't want to draw any conclusions about the community from this because the people who provided both their income numbers and their charity numbers are a highly self-selected sample.
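The percent-donated calculation described above amounts to a simple ratio, restricted to respondents who supplied both numbers. A minimal sketch follows; the function name is made up, and the handling of missing or non-positive incomes is an assumption, since the post does not say how those were treated.

```python
def percent_donated(charity, income):
    """Charity as a percentage of income, or None if it cannot be computed."""
    # Respondents missing either number are excluded (returned as None).
    if charity is None or income is None:
        return None
    # Assumption: a zero or negative income is treated as unusable.
    if income <= 0:
        return None
    return 100.0 * charity / income
```

Someone who reports an income but gives nothing gets 0, matching the 602 respondents described as having "a percent donated of 0".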
303 effective altruists donated, on average, 3.5% of their income to charity, compared to 645 others who donated, on average, 1% of their income to charity. A small but significant (p < .001) victory for the effective altruism movement.
But are they more compassionate people in general? After throwing out the people who said they wanted to give blood but couldn't for one or another reason, I got 1255 survey respondents giving me an unambiguous answer (yes or no) about whether they'd ever given blood. I found that 51% of effective altruists had given blood compared to 47% of others - a difference which did not reach statistical significance.
Finally, at the end of the survey I had a question offering respondents a chance to cooperate (raising the value of a potential monetary prize to be given out by raffle to a random respondent) or defect (decreasing the value of the prize, but increasing their own chance of winning the raffle). 73% of effective altruists cooperated compared to 70% of others - an insignificant difference.
Conclusion: effective altruists give more money to charity, both absolutely and as a percent of income, but are no more likely (or perhaps only slightly more likely) to be compassionate in other ways.
B. Can we finally resolve this IQ controversy that comes up every year?
The story so far - our first survey in 2009 found an average IQ of 146. Everyone said this was stupid, no community could possibly have that high an average IQ, it was just people lying and/or reporting results from horrible Internet IQ tests.
Although IQ fell somewhat the next few years - to 140 in 2011 and 139 in 2012 - people continued to complain. So in 2012 we started asking for SAT and ACT scores, which are known to correlate well with IQ and are much harder to get wrong. These scores confirmed the 139 IQ result on the 2012 test. But people still objected that something must be up.
This year our IQ has fallen further to 138 (no Flynn Effect for us!) but for the first time we asked people to describe the IQ test they used to get the number. So I took a subset of the people with the most unimpeachable IQ tests - ones taken after the age of 15 (when IQ is more stable), and from a seemingly reputable source. I counted a source as reputable either if it name-dropped a specific scientifically validated IQ test (like WAIS or Raven's Progressive Matrices), if it was performed by a reputable institution (a school, a hospital, or a psychologist), or if it was a Mensa exam proctored by a Mensa official.
This subgroup of 101 people with very reputable IQ tests had an average IQ of 139 - exactly the same as the average among survey respondents as a whole.
I don't know for sure that Mensa is on the level, so I tried again deleting everyone who took a Mensa test - leaving just the people who could name-drop a well-known test or who knew it was administered by a psychologist in an official setting. This caused a precipitous drop all the way down to 138.
The IQ numbers have time and time again answered every challenge raised against them and should be presumed accurate.
C. Can we predict who does or doesn't cooperate on prisoner's dilemmas?
As mentioned above, I included a prisoner's dilemma type question in the survey, offering people the chance to make a little money by screwing all the other survey respondents over.
Tendency to cooperate on the prisoner's dilemma was most highly correlated with items in the general leftist political cluster identified by Ozy above. It was most notable for support for feminism, with which it had a correlation of .15, significant at the p < .01 level, and minimum wage, with which it had a correlation of .09, also significant at p < .01. It was also significantly correlated with belief that other people would cooperate on the same question.
I compared two possible explanations for this result. First, leftists are starry-eyed idealists who believe everyone can just get along - therefore, they expected other people to cooperate more, which made them want to cooperate more. Or, second, most Less Wrongers are white, male, and upper class, meaning that support for leftist values - which often favor nonwhites, women, and the lower class - is itself a symbol of self-sacrifice and altruism which one would expect to correlate with a question testing self-sacrifice and altruism.
I tested the "starry-eyed idealist" hypothesis by checking whether leftists were more likely to believe other people would cooperate. They were not - the correlation was not significant at any level.
I tested the "self-sacrifice" hypothesis by testing whether the feminism correlation went away in women. For women, supporting feminism is presumably not a sign of willingness to self-sacrifice to help an out-group, so we would expect the correlation to disappear.
In the all-female sample, the correlation between feminism and PD cooperation shrank from .15 to a puny .04, whereas the correlation between the minimum wage and PD was previously .09 and stayed exactly the same at .09. This provides some small level of support for the hypothesis that the leftist correlation with PD cooperation represents a willingness to self-sacrifice in a population who are not themselves helped by leftist values.
(On the other hand, neither leftists nor cooperators were more likely to give money to charity, so if this is true it's a very selective form of self-sacrifice.)
VI. Monetary Prize
1389 people answered the prize question at the bottom. 71.6% of these [n = 995] cooperated; 28.4% [n = 394] defected.
The prize goes to a person whose two word phrase begins with "eponymous". If this person posts below (or PMs or emails me) the second word in their phrase, I will give them $60 * 71.6%, or about $43. I can pay to a PayPal account, a charity of their choice that takes online donations, or a snail-mail address via check.
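For anyone checking the arithmetic, the prize value follows directly from the cooperation counts in this section. The sketch below uses only numbers stated in the post; the variable names are illustrative.

```python
cooperators = 995
defectors = 394
respondents = cooperators + defectors              # 1389 people answered

# Share of respondents who cooperated, as a percentage rounded to one decimal.
cooperation_rate = round(100 * cooperators / respondents, 1)

# The prize is $60 scaled by the (rounded) cooperation rate.
prize = round(60 * cooperation_rate / 100, 2)
```

Here cooperation_rate comes out at 71.6 and prize at 42.96, i.e. "about $43".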
VII. Calibration Questions
The population of Europe, according to designated arbiter Wikipedia, is 739 million people.
People were really really bad at giving their answers in millions. I got numbers anywhere from 3 (really? three million people in Europe?) to 3 billion (3 million billion people = 3 quadrillion). I assume some people thought they were answering in billions, others in thousands, and other people thought they were giving a straight answer in number of individuals.
My original plan was to just adjust these to make them fit, but this quickly encountered some pitfalls. Suppose someone wrote 1 million (as one person did). Could I fairly guess they meant 100 million, even though there's really no way to guess that from the text itself? 1 billion? Maybe they just thought there were really one million people in Europe?
If I was too aggressive correcting these, everyone would get close to the right answer not because they were smart, but because I had corrected their answers. If I wasn't aggressive enough, I would end up with some guy who answered 3 quadrillion Europeans totally distorting the mean.
I ended up deleting 40 answers that suggested there were fewer than ten million or more than eight billion Europeans, on the grounds that people probably weren't really that far off so it was probably some kind of data entry error, and correcting everyone who entered a reasonable answer in individuals to answer in millions as the question asked.
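A cleanup rule along these lines might look like the sketch below. The post does not spell out the exact rule, so everything here is an assumption: in particular the threshold for deciding that a raw number was meant as a head count rather than as millions is hypothetical, as are all the names.

```python
MIN_MILLIONS = 10        # fewer than ten million Europeans: delete
MAX_MILLIONS = 8000      # more than eight billion Europeans: delete
# Hypothetical threshold: raw values this large are read as individual people.
INDIVIDUALS_THRESHOLD = 10_000_000

def normalize_europe_answer(raw):
    """Return the answer in millions, or None if it should be deleted."""
    if raw >= INDIVIDUALS_THRESHOLD:
        millions = raw / 1_000_000   # answer given as a count of individuals
    else:
        millions = float(raw)        # answer already in millions
    if millions < MIN_MILLIONS or millions > MAX_MILLIONS:
        return None
    return millions
```

Under these assumptions an answer of 500,000,000 is corrected to 500, while an answer of 3 is deleted rather than guessed at, which is the trade-off between over- and under-correcting described above.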
The remaining 1457 people who can either follow simple directions or at least fail to follow them in a predictable way estimated an average European population in millions of 601 + 35.6 (380, 500, 750).
Respondents were told to aim for within 10% of the real value, which means they wanted between 665 million and 812 million. 18.7% of people [n = 272] got within that window.
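The 10% window and the hit rate can be checked with a few lines. This is a sketch using the figures given above; the function name and its default arguments are illustrative.

```python
TRUE_POPULATION_MILLIONS = 739

def within_tolerance(estimate, truth=TRUE_POPULATION_MILLIONS, tolerance=0.10):
    """True if the estimate is within the given fraction of the true value."""
    return abs(estimate - truth) <= tolerance * truth

# Window edges: 739 * 0.9 and 739 * 1.1, i.e. about 665 to 812 million.
lower_edge = TRUE_POPULATION_MILLIONS * 0.9
upper_edge = TRUE_POPULATION_MILLIONS * 1.1

# 272 of the 1457 remaining respondents landed inside the window.
hit_rate = round(100 * 272 / 1457, 1)
```

Note that the mean estimate of 601 million itself falls outside the window.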
I divided people up into calibration brackets of [0,5], [6,15], [16, 25] and so on. The following are what percent of people in each bracket were right.
[0,5]: 7.7%
[6,15]: 12.4%
[16,25]: 15.1%
[26,35]: 18.4%
[36,45]: 20.6%
[46,55]: 15.4%
[56,65]: 16.5%
[66,75]: 21.2%
[76,85]: 36.4%
[86,95]: 48.6%
[96,100]: 100%
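The bracketing above can be reproduced with a small helper. This is a sketch only: the names are made up, and rounding non-integer confidences to the nearest whole number before bracketing is an assumption.

```python
from collections import defaultdict

def bracket(confidence):
    """Map a stated confidence (0-100) to its calibration bracket."""
    c = round(confidence)            # assumption: round fractional answers
    if c <= 5:
        return (0, 5)
    if c >= 96:
        return (96, 100)
    lower = ((c - 6) // 10) * 10 + 6  # 6, 16, 26, ..., 86
    return (lower, lower + 9)

def hit_rates(responses):
    """Percent correct per bracket, from (confidence, was_right) pairs."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for confidence, was_right in responses:
        key = bracket(confidence)
        totals[key] += 1
        if was_right:
            hits[key] += 1
    return {key: 100.0 * hits[key] / totals[key] for key in totals}
```

A perfectly calibrated group would show a hit rate inside each bracket's own range, for example somewhere between 86% and 95% for the [86,95] bracket.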
Among people who should know better (those who have read all or most of the Sequences and have > 500 karma, a group of 162 people)
[0,5]: 0
[6,15]: 17.4%
[16,25]: 25.6%
[26,35]: 16.7%
[36,45]: 26.7%
[46,55]: 25%
[56,65]: 0%
[66,75]: 8.3%
[76,85]: 40%
[86,95]: 66.6%
[96,100]: 66.6%
Clearly, the people who should know better don't.

This graph represents your performance relative to ideal performance. Dipping below the blue ideal line represents overconfidence; rising above it represents underconfidence. With few exceptions you were very overconfident. Note that there were so few "elite" LWers at certain levels that the graph becomes very noisy and probably isn't representing much; that huge drop at 60 represents like two or three people. The orange "typical LWer" line is much more robust.
There is one other question that gets at the same idea of overconfidence. 651 people were willing to give a valid 90% confidence interval on what percent of people would cooperate (this is my fault; I only added this question about halfway through the survey once I realized it would be interesting to investigate). I deleted four for giving extremely high outliers like 9999% which threw off the results, leaving 647 valid answers. The average confidence interval was [28.3, 72.0], which just BARELY contains the correct answer of 71.6%. Of the 647 of you, only 346 (53.5%) gave 90% confidence intervals that included the correct answer!
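The coverage check behind that 53.5% figure is straightforward: count how many stated intervals contain the true value. A minimal sketch, with illustrative names and made-up example intervals (only the 346, 647, and 71.6 figures come from the survey):

```python
TRUE_COOPERATION_RATE = 71.6

def coverage(intervals, truth=TRUE_COOPERATION_RATE):
    """Fraction of (low, high) intervals that contain the true value."""
    hits = sum(1 for low, high in intervals if low <= truth <= high)
    return hits / len(intervals)

# Hypothetical example: two of these three intervals contain 71.6.
example = coverage([(28.3, 72.0), (10, 50), (60, 80)])

# Survey figure: 346 of 647 intervals contained the answer.
observed_coverage = round(100 * 346 / 647, 1)
```

Well-calibrated 90% intervals should contain the true value about 90% of the time, so an observed coverage of 53.5% indicates intervals that were far too narrow.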
Last year I complained about horrible performance on calibration questions, but we all decided it was probably just a fluke caused by a particularly weird question. This year's results suggest that was no fluke and that we haven't even learned to overcome the one bias that we can measure super-well and which is most easily trained away. Disappointment!
VIII. Public Data
There's still a lot more to be done with this survey. User:Unnamed has promised to analyze the "Extra Credit: CFAR Questions" section (not included in this post), but so far no one has looked at the "Extra Credit: Questions From Sarah" section, which I didn't really know what to do with. And of course this is the most complete survey yet for seeking classic findings like "People who disagree with me about politics are stupid and evil".
1480 people - over 90% of the total - kindly allowed me to make their survey data public. I have included all their information except the timestamp (which would make tracking pretty easy) including their secret passphrases (by far the most interesting part of this exercise was seeing what unusual two word phrases people could come up with on short notice).
Comments (558)
Some things that took me by surprise:
People here are more favorable toward abortion than feminism. I always thought of the former as secondary to the latter, though I suppose the "favorable" phrasing makes the survey sensitive to opinion of the term itself.
Mean SAT (out of 1600) is 1474? Really, people? 1410 is 96th percentile, and it's the bottom 4th quartile. I guess the only people who remembered their scores were those who were proud of them. (And I know this is right along with the IQ discussion)
This would imply that LW is about as selective as a top university (like Harvey Mudd). That doesn't seem that implausible to me- but I definitely agree that we should expect the true mean to be lower than the self-reported mean (both because of inflated memories and selective memories).
It looks like you created the 2014 survey before I got around to posting my comment for this one. Oh well. Hopefully you will still find my comment useful. :)
Some answer choices from the survey weren't included in the results, without any explanation as to why. Does that mean no one selected them? If so, I suggest editing the post to make that clear.
I noticed that 13.6% of respondents chose not to answer the "vegetarian" question. I think it would have helped if you provided additional choices for "vegan" and "pescatarian".
I have some doubts as to how good of a gauge this question is for altruism. People may choose to defect if they have immediate pressing needs for money, if they think their charity is superior to what most other people would have chosen, or if they don't see a net altruistic benefit in taking more money away from the prize-giver just to give it to a randomly selected survey-taker. I suppose if they bothered to think through it carefully they might have reasoned that all else being equal you'd prefer them to cooperate, which is why you're willing to give them more money for it. However, it could have also been that you saw the promise of extra money as a necessary sacrifice in order to set up the dilemma properly, but secretly wished for most people to defect. (Which one was it, by the way, if you don't mind me asking? :P)
I think I know why removing the Mensa tests from the IQ results brought down the average. It's not because the Mensa test is unreliable, but because the people who bothered to take it are likely to have relatively higher IQs, in which case it would make sense to remove them from the sample to remove the bias.
My guess is that lower IQ people may spend more time on LW because they derive more benefit from reading posts about rationality. Perhaps higher-IQ people are more likely to efficiently limit their time on LW to reading only the top-rated interesting-looking posts and the top-rated comments.
Your data actually showed that height is anti-correlated with belief in the supernatural, unless that minus sign wasn't supposed to be there.
Thanks for posting these surveys and survey results, by the way. They are very fascinating. :)
Why not? If we're such smarty pants, maybe we should learn how to shut up and multiply. There are lots of people. Let's go with the 146 value. That's roughly 1 in a 1000 people have IQ >= 146. That high IQ people congregate at a rationality site shouldn't shock anyone. The site is easily accessible to all of the Anglosphere, which not so coincidentally, is 3/4 of the members.
One in a thousand just isn't that special of a snowflake for a special interest site.
Keep in mind that to get an average of 146 you need an implausibly huge number of >146 IQ people to balance the <146 people.
This is just ridiculous. It is well known and well documented that values such as IQ (or penis size) are incorrectly self-reported. Furthermore - I do not have a link right now - the extent of exaggeration is greater when reporting old values than when reporting recently obtained values (and people here did take iqtest.dk, getting a lower number).
No, because there aren't an implausibly large number of people on the list. The world is a big place. The main issue in maintaining a high average isn't in getting the numbers of high IQ people, but in repelling the lower IQ people. But apparently, Mission Accomplished.
Note further that I was taking the 146 number as the highest reported estimate, to get the most "implausible" number, which was a mere 1/1000, and not really that rare. The 2013 survey had 138, which is 1/177, which is thoroughly unexciting as implausibly rare snowflakes.
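The rarity figures here are easy to check. A minimal sketch, assuming IQ is normally distributed with mean 100 and standard deviation 15 (the convention stated in the survey question); the helper name `one_in` is just for illustration:

```python
from statistics import NormalDist

# Assumption: IQ ~ Normal(100, 15), as in the survey's IQ question.
iq = NormalDist(mu=100, sigma=15)

def one_in(score):
    """Return N such that roughly 1 in N people score at or above `score`."""
    return 1 / (1 - iq.cdf(score))

for score in (138, 146):
    print(f"IQ >= {score}: about 1 in {one_in(score):.0f}")
```

Under that assumption the two averages come out near 1 in 177 and a bit more common than 1 in 1000, in line with the numbers quoted.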
Is that documented for the 146+ crowd?
I'm going by numbers I had in highschool, on two IQ tests in consecutive years which gave the same result, along with an SAT result which mapped even higher (I reported the IQ score).
It's not too hard to remember a number, and people interested, and indeed, proud of their results, are likely paying more attention.
The one real issue I see is sampling bias - only around a third of respondents gave an IQ or SAT score, and I would expect those giving scores to skew higher.
Then again, there are probably biases associated with posting and being active as well, with the higher IQ being more confident and willing to post.
The time on LW correlated negatively with IQ... (and getting the high IQ people to come is difficult). You don't get to invite the whole world.
It is still rarer than many other things, e.g. extremely overinflated self assessment is not very rare.
Well, yeah.
One can always special plead their ways out of any data. There's two types of IQ score, one of them is about mental age, by the way.
I thought we discovered this was driven by outliers in people who spent very little time on LW. (I'm on my phone, or I would check.)
Do you have any recollections on the source for that discovery?
Is the full survey data available, so that we could look at the distribution?
Yes; the OP has a link to the 2013 survey data in the last line. Also note survey results for 2012, 2011, and 2009. Here's my comment on this year's describing what happened last year, and while this is relevant I have a memory of looking at the data, making a graph, and calling it 'trapezoidal,' but I don't know where that is, and I don't see the image uploaded where I probably would have uploaded it, so I guess I never published that analysis. Anyway, I recommend you take a look at it yourself.
Dunno, maybe. In any case 'repelling lower IQ people' hypothesis seems like it ought to yield a corresponding correlation between IQ and participation, but the opposite or no correlation is observed. (albeit the writing clarity here is quite seriously low - using private terminology instead of existing words, etc. which many may find annoying and perhaps inaccessible)
Some unique passphrases that weren't so unique (I removed the duplicates from people who took the survey twice). You won't want to reuse your passphrase for next year's survey!
I guess this is not a problem though: when the first word is announced two people will reply, but only one of them has the right answer. So the prize still goes to the right person.
It was interesting to see how very average I am (as a member of Less Wrong). My feelings of being an outsider (here at least) have diminished.
I've also resolved to do two things this year, thanks in part to this survey: 1) sign the hell up for cryonics already and 2) take a professional IQ test.
For cryonics, the number of yeses compared to the number who want to or are still considering is a bit of a wake-up call for me.
Were there enough CFAR workshoppers to check CFAR attendance against calibration?
Full version with labels. Also, data and methodology notes.
There are (very probably around) 1.7x10^11 galaxies in the observable universe. So I don't understand how P(Aliens in Milky Way) can be so close to P(Aliens in observable universe). If P(Aliens in an average galaxy) = 0.0000000001, P(Aliens in observable universe) should be around 1-(1-0.0000000001)^(1.7x10^11)=0.9999999586. I know there are other factors that influence these numbers, but still, even if P(Aliens in Milky Way) is only very slight, then P(Aliens in observable universe) should be almost certain. There are possible rational justifications for the results of this survey, but I think (0.95) most people were victims of a cognitive bias. Scope insensitivity maybe, because 1.7*10^11 galaxies is too big to imagine? What do you think?
I wonder how many people cooperated only (or in part) because they knew the results would be correlated with their (political) views, and they wanted their "tribe"/community/group/etc. to look good. Maybe next year we could say that this result won't be compared to the other? Then if fewer people cooperate, it will indicate that maybe some people cooperate for their 'group' to look good. But if these people know that I/we want to compare the results with this year's in order to verify this hypothesis, they will continue to cooperate. To avoid most of this, we should compare only the people who fill out the survey for the first time next year. What do you think?
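The arithmetic for the galaxy estimate can be reproduced as follows. This is a sketch that assumes every galaxy is an independent trial with the same probability; the function name is only illustrative. It uses `log1p` and `expm1` because computing `(1 - 1e-10) ** 1.7e11` directly loses precision with such a small per-galaxy probability.

```python
import math

def p_at_least_one(p_single, n):
    """P(at least one success) in n independent trials of probability p_single."""
    # 1 - (1 - p)^n, computed as -expm1(n * log1p(-p)) for numerical stability.
    return -math.expm1(n * math.log1p(-p_single))

p_universe = p_at_least_one(1e-10, 1.7e11)
print(f"{p_universe:.10f}")
```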
I think you shouldn't have corrected anything. When I assign a probability to the correctness of my answer, I included a percentage for having misread the question or made a data entry error.
Would some people be interested in answering 10 such questions and give their confidence about their answer every month? That would provide better statistics and a way to see if we're improving.
I don't think the responses of people here would be so much affected by directly wanting to present their own social group as good. However (false) correlation between those two could happen just because of framing by other questions.
E.g. the answer to prisoner's dilemma question might be affected by whether you've just answered "I'm associated with the political left" or whether you've just answered "I consider rational calculations to be the best way to solve issues".
If that is the effect causing a false correlation, then adding the statement "these won't be correlated" wouldn't do any good - in fact, it would only serve as a further activation for the person to enter the political-association frame.
This is a common problem with surveys that isn't very easy to mitigate. Individually randomizing question order and analyzing differences in correlations based on presented question order helps a bit, but the problem still remains, and the sample size for any such difference-in-correlation analysis becomes increasingly small.
Only if our uncertainties about the different galaxies are independent, and don't depend on a common uncertainty about the laws of nature or something. It's true that P2>P1, but they can be made arbitrarily close, I think.
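To make the dependence point concrete, here is a toy model (my own illustrative assumption, not anything from the survey): with probability q the laws of nature allow aliens at all, and only in that case does each galaxy independently host them with probability p.

```python
import math

def p_at_least_one(p_single, n):
    """1 - (1 - p_single)^n, computed stably for large n."""
    return -math.expm1(n * math.log1p(-p_single))

def p_milky_way(q, p):
    """P1: aliens are possible at all (q) AND our galaxy has them (p)."""
    return q * p

def p_universe(q, p, n):
    """P2: aliens are possible at all (q) AND at least one of n galaxies has them."""
    return q * p_at_least_one(p, n)

# Hypothetical numbers: 50% that aliens are possible, 90% per galaxy if so.
q, p, n = 0.5, 0.9, 1.7e11
p1 = p_milky_way(q, p)
p2 = p_universe(q, p, n)
print(p1, p2)
```

With these made-up inputs P2 is capped at q no matter how many galaxies there are, so P1 and P2 differ only by the factor p, and pushing p toward 1 makes them arbitrarily close.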
There's both PredictionBook and the Good Judgment Project as venues for this sort of thing.
Thank you.
EDIT: I just made my first (meta)prediction which is that I'm 50% sure that "I will make good predictions in 2014. (ie. 40 to 60% of my predictions with an estimate between 40 and 60% will be true.)"
Perhaps this is explainable with reference to why the Great Silence / Fermi paradox is so compelling? That even with very low rates of expansion, the universe should be colonized by now if an advanced alien civilization had arisen at any point in the past billion years or so. Hence, if there's aliens anywhere, then they should well have a presence here too.
Intergalactic travel is much harder than intragalactic. It's conceivable that even civilizations that colonize their galaxy might not make it further.
I remember my thought process going something like this:
P (Aliens in Milky way) ~0.75
P (Aliens) ~100%
P (Answer pulled from anus on basis of half remembered internet facts is remotely correct) ~0.8
So:
P (Aliens) * P (Anus) ~0.8
P (Milky aliens) * P (Anus) ~0.6
<nitpick>It should have been P (Milky aliens) * P (Anus) + P (!Milky aliens) * P (!Anus) = 0.6 + 0.05.</nitpick>
I don't understand how P(Simulation) can be so much higher than P(God) and P(Supernatural). Seems to me that "the stuff going on outside the simulation" would have to be supernatural by definition. The beings that created the simulation would be supernatural intelligent entities who created the universe, aka gods. How do people justify giving lower probabilities for supernatural than for simulation?
At least part of it is that a commonly endorsed local definition of "supernatural" would not necessarily include the beings who created a simulation. Similarly, the definition of "god" around here is frequently tied to that definition of supernatural.
I am not defending those usages here, just observing that they exist.
I find it odd that 66.2% of LWers are "liberal" or "socialist" but only 13.8% of LWers consider themselves affiliated with the Democrat party. Can anybody explain this?
I was wondering about this word "liberal" -- when Will Wilkinson says he's a liberal, that means something entirely different from what you're describing. So, is it possible we have many right liberals here?
I'd interpret “affiliated” as ‘card-carrying’. If anything, it surprises me as high, but ISTR that in the US you need to be a registered member of a party to vote for their primaries, which would explain that.
It's probably meant to be interpreted as "registered". In the US, registering for a political party has significance beyond signaling affiliation, so it's fairly common: it allows you, in most states, to vote in your party's primary election (which determines the candidates sent by that party to the general election, which everyone can vote in). A few states choose their candidates with party caucuses, though, and California at one point allowed open primaries, though there were some questions about the constitutionality of that move and I don't remember how they were resolved.
Roughly two-thirds of Americans are registered with one of the two major parties.
Do you have a source for that, or is this the same statistic you quoted from wikipedia about "identification"?
I think only half of eligible voters are even registered to vote, but I'd expect almost all registered voters to register in a party. Young people, like LW users, are less likely to be registered.
I honestly don't remember, but I was probably trying to point toward the Wikipedia stats, in which case I shouldn't have used "registered". A quick search for registration percentages turns up this, which cites slightly under 60% registration in the most recent election (it's been going slowly down over time; was apparently just over 70% in the late Sixties). I haven't been able to turn up party-specific registration figures; I suspect but cannot prove that you're underestimating the number of Americans registered as independent.
First reason: by European standards, I imagine the Democrat party is still quite conservative. Median voter theorem and all that. Second reason: "affiliated" probably implies more endorsement than "it's not quite as bad as the other party". It could also be both of these together.
The democrat party is only socialist in the republican party's eyes.
As somebody who most definitely identified as liberal, but did not affiliate with the Democrats:
Your question reveals a hidden assumption:
There is no "Democrat party" in (almost) every other country in the world apart from yours* ;)
*(I am assuming you come from the USA based on this underlying assumption)
This is easily tested by comparing against the country of origin question. As it turns out, a bit over half of LW comes from the US. Wikipedia claims that about 33% of Americans identify as Democrats (vs. 28% Republican and 38% other or independent), so we'd expect about 17.5% of LW to identify as Democratic if the base rate applied, up to 35% if every American LWer identifying as liberal or socialist also identified as Democratic.
Bearing this in mind, it seems that party members identified as such really are underrepresented here.
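Spelled out, the comparison looks like this. The 0.53 US share is my stand-in for "a bit over half" and is an assumption; the other figures are the ones quoted in this thread.

```python
us_share = 0.53              # assumption: "a bit over half" of respondents are from the US
dem_base_rate = 0.33         # Americans identifying as Democrats (Wikipedia figure quoted above)
liberal_or_socialist = 0.662 # LWers answering "liberal" or "socialist"
observed_dem = 0.138         # LWers reporting Democratic affiliation

# If American LWers matched the national base rate:
expected_at_base_rate = us_share * dem_base_rate
# If every American liberal/socialist LWer were also a Democrat:
upper_bound = us_share * liberal_or_socialist

print(f"expected at base rate: {expected_at_base_rate:.3f}")
print(f"upper bound:           {upper_bound:.3f}")
print(f"observed:              {observed_dem:.3f}")
```

The upper bound also assumes liberals and socialists are spread evenly across countries, which the survey data could confirm or refute.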
Cool stuff. Thanks for going and checking against the numbers :)
I've just noticed there was no Myers-Briggs question this year. Why?
I expected that the second word in my passphrase would stay secret no matter what and the first word would only be revealed if I won the game.
Well, thank goodness I didn't pick anything too embarrassing.
Things that stuck out to me:
HPMOR: - Yes, all of it: 912, 55.7% REFERRAL TYPE: Referred by HPMOR: 400, 24.4%
EY's Harry Potter fanfic is more popular around here than I'd thought.
PHYSICAL INTERACTION WITH LW COMMUNITY: Yes, all the time: 94, 5.7% Yes, sometimes: 179, 10.9%
CFAR WORKSHOP ATTENDANCE: Yes, a full workshop: 105, 6.4% A class but not a full-day workshop: 40, 2.4%
LESS WRONG USE: Poster (Discussion, not Main): 221, 12.9% Poster (Main): 103, 6.3%
~6% at the maximum "buy-in" levels on these 3 items. My guess is they are all made up of a similar group of people?
I'd be curious to know, of the 6.3% who have published articles in Main (and, to a lesser extent, of the 12.9% who have published in Discussion), how many unique users are there?
Haven't you seen all those sprawling HPMOR discussion threads with >500 comments usually?
I hadn't paid attention, no.
It was the ~25% referral rate that was pretty shocking to me. And 55% of LWers have read all of it?! Wow.
I use it as a tool to encourage others to join. It's very good for that.
I tell people that if they get to the end of HP:MOR and want more MOR, then they should come try out LW.
It looks like lots of people put themselves as atheist, but still answered the religion question as Unitarian Universalist, in spite of the fact that the question said to answer your religion only if you are theist.
I was looking forward to data on how many LW people are UU, but I have no way of predicting how many people followed the rules as written for the question, and how many people followed the rules as (I think they were) intended.
We should make sure to word that question differently next year, so that people who identify as atheist and religious know to answer the question.
It looks like Judaism and Buddhism might have had a similar problem.
The "did not answer" option seems to be distorting the perception of the results. Perhaps structuring the presentation of the data with those percentages removed would be more straightforward to visualise.
Percentages that include the non-respondents are misleading. At first glance you could be mistaken for thinking there is a significant population of non-English speakers, as less than 70% of people who completed the survey answered English.
Non-respondents removed:
English: 1009, 87%
German: 58, 5%
Finnish: 29, 3%
Russian: 25, 2%
French: 17, 2%
Dutch: 16, 1%
15.2% of the sample did not answer
This seems like it would be a better representation of the data which could be applied to the other questions.
N.B.: Average IQ drops to 135 when only considering tests administered at an adult age -- those "IQ 172 at age 7" entries shouldn't be taken as authoritative for adult IQ.
Formatting: I find the reports a bit difficult to scan, because each line contains two numbers (absolute numbers, relative percents), which are not vertically aligned. An absolute value of one line may be just below the value of another line, and the numbers may be similar, which makes it difficult to e.g. quickly find the highest value in the set.
I think this could be significantly improved with a trivial change: write the numbers at the beginning of the line, that will make them better aligned. For even better legibility, insert a separator (wider than just a comma) between absolute and relative numbers.
Now:
Proposed:
For example in the original version it is easy to see something like "94.5, 179, 80.4, 48.2" when reading carelessly.
Two more possibilities with things really lined up. I think the first is somewhat better. The dots are added so Markdown doesn't destroy the spacing.
Yes, all the time........94......5.7%
Yes, sometimes......179......10.9%
No.........................1316.....80.4%
Did not answer...........48......2.9%
...94 = 5.7%.......Yes, all the time
.179 = 10.9%......Yes, sometimes
1316 = 80.4%......No
....48 = 2.9%......Did not answer
Yvain - Next year, please include a question asking if the person taking the survey uses PredictionBook. I'd be curious to see if these people are better calibrated.
Maybe ask them how many predictions they have made so we can see if using it more makes you better.
What if the people who have taken IQ tests are on average smarter than the people who haven't? My impression is that people mostly take IQ tests when they're somewhat extreme: either low and trying to qualify for assistive services or high and trying to get "gifted" treatment. If we figure lesswrong draws mostly from the high end, then we should expect the IQ among test-takers to be higher than what we would get if we tested random people who had not previously been tested.
The IQ Question read: "Please give the score you got on your most recent PROFESSIONAL, SCIENTIFIC IQ test - no Internet tests, please! All tests should have the standard average of 100 and stdev of 15."
Among the subset of people making their data public (n=1480), 32% (472) put an answer here. Those 472 reports average 138, in line with past numbers. But 32% is low enough that we're pretty vulnerable to selection bias.
(I've never taken an IQ test, and left this question blank.)
This sounds plausible, but from looking at the data, I don't think this is happening in our sample. In particular, if this were the case, then we would expect the SAT scores of those who did not submit IQ data to be different from those who did submit IQ data. I ran an Anderson–Darling test on each of the following pairs of distributions:
The p-values came out as 0.477 and 0.436 respectively, which means that the Anderson–Darling test was unable to distinguish between the two distributions in each pair at any conventional significance level.
As I did for my last plot, I've once again computed for each distribution a kernel density estimate with bootstrapped confidence bands from 999 resamples. From visual inspection, I tend to agree that there is no clear difference between the distributions. The plots should be self-explanatory:
(More details about these plots are available in my previous comment.)
Edit: Updated plots. The kernel density estimates are now fixed-bandwidth using the Sheather–Jones method for bandwidth selection. The density near the right edge is bias-corrected using an ad hoc fix described by whuber on stats.SE.
Thanks for digging into this! Looks like the selection bias isn't significant.
The large majority of LessWrongers in the USA have however also provided their SAT scores, and those are also very high values (from what little I know of SATs)...
The reported SAT numbers are very high, but the reported IQ scores are extremely high. The mean reported SAT score, if received on the modern 1600 test, corresponds to an IQ in the upper 120s, not the upper 130s. The mean reported SAT2400 score was 2207, which corresponds to 99th but not 99.5th percentile. 99th percentile is an IQ of 135, which suggests that the self-reports may not be that off compared to the SAT self-reports.
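The percentile-to-IQ step is a one-liner if you assume IQ is normal with mean 100 and SD 15, as in the survey question. A quick sketch (variable names are mine):

```python
from statistics import NormalDist

# Assumption: IQ ~ Normal(100, 15).
iq = NormalDist(mu=100, sigma=15)

iq_at_99 = iq.inv_cdf(0.99)    # IQ at the 99th percentile
iq_at_995 = iq.inv_cdf(0.995)  # IQ at the 99.5th percentile

print(f"99th percentile:   IQ {iq_at_99:.1f}")
print(f"99.5th percentile: IQ {iq_at_995:.1f}")
```

This only converts percentiles of the general population; it says nothing about how SAT test-taker percentiles map onto the population, which is the harder part.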
Some of us took the SAT before 1995, so it's hard to disentangle those scores. A pre-1995 1474 would be at 99.9x percentile, in line with an IQ score around 150-155. If you really want to compare, you should probably assume anyone age 38 or older took the old test and use the recentering adjustment for them.
I'm also not sure how well the SAT distinguishes at the high end. It's apparently good enough for some high IQ societies, who are willing to use the tests for certification. I was shown my results and I had about 25 points off perfect per question marked wrong. So the distinction between 1475 and 1600 on my test would probably be about 5 total questions. I don't remember any questions that required reasoning I considered difficult at the time. The difference between my score and one 100 points above or below might say as much about diligence or proofreading as intelligence.
Admittedly, the variance due to non-g factors should mostly cancel in a population the size of this survey, and is likely to be a feature of almost any IQ test.
That said, the 1995 score adjustment would have to be taken into account before using it as a proxy for IQ.
Conversion is a very tricky matter, because the correlation is much less than 1 (0.369 in the survey, apparently).
With correlation less than 1, regression towards the mean comes into play, so the predicted IQ from perfect SAT is actually not that high (someone posted coefficients in a parallel discussion), and predicted SAT from very high IQ is likewise not that awesome.
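To put a number on the regression effect: in a simple linear regression on standardized scores, the predicted z-score of one variable is the correlation times the z-score of the other. A sketch using the 0.369 figure, with a hypothetical SAT score two standard deviations above the sample mean:

```python
def predicted_z(r, z_observed):
    """Best linear prediction of one standardized variable from the other."""
    return r * z_observed

r = 0.369    # SAT/IQ correlation reported for the survey sample
z_sat = 2.0  # hypothetical: SAT score 2 SDs above the sample mean

z_iq = predicted_z(r, z_sat)
print(f"predicted IQ z-score: {z_iq:.3f}")
```

The z-scores here are relative to the sample's own mean and spread, so turning the result back into IQ points needs the sample mean and SD rather than 100 and 15.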
The reason the figures seem rather strange is that they imply some kind of extreme filtering by IQ here. The negative correlation between time here and IQ suggests that the content is not acting as much of a filter, or is acting as a filter in the opposite direction.
The Wikipedia article states that those are percentiles of test-takers, not the population as a whole. What percentage of seniors take the SAT? I tried googling, but I could not find the figure.
My first thought is that most people who don't take the SAT don't intend to go to college and are likely to be below the mean reported SAT score, but then I realized that a non-negligible subset of those people must have taken only the ACT as their admission exam.
I don't have solid numbers myself, but percentile of test-takers should underestimate percentile of population. However, there is regression to the mean to take into account, as well as that many people take the SAT multiple times and report the most favorable score, both of which suggest that score on test should overestimate IQ, and I'm fudging it by treating those two as if they cancel out.
There could be some measurement bias here. I was on the fence about whether I should identify myself as an effective altruist, but I had just been reminded of the fact that I hadn't donated any money to charity in the last year, and decided that I probably shouldn't be identifying as an effective altruist myself despite having philosophical agreements with the movement.
This is blasphemy against Saint Boole.
Hypothesis: the predictions on the population of Europe are bimodal, split between people thinking of geographical Europe (739M) vs people thinking of the EU (508M). I'm going to go check the data and report back.
I've cleaned up the data and put it here.
Here's a "sideways cumulative density function", showing all guesses from lowest to highest:
There were a lot of guesses of "500" but that might just be because 500 is a nice round number. There were more people guessing within 50 of 508M (165) than in the 100-wide regions immediately above or below (126 within 50 of 408, 88 within 50 of 608) and more people guessing within 50 of 739 (107) than in the 100-wide regions immediately above or below (91 within 50 of 639, 85 within 50 of 839).
Here's a histogram that shows this, but in order to actually see a dip between the 508ish numbers and 739ish numbers the bucketing needs to group those into separate categories with another category in between, so I don't trust this very much:
If someone knows how to make an actual probability density function chart that would be better, because it wouldn't be sensitive to these arbitrary divisions on where to place the histogram boundaries.
Here is a kernel density estimate of the "true" distribution, with bootstrapped pointwise 95% confidence bands from 999 resamples:
It looks plausibly bimodal, though one might want to construct a suitable hypothesis test for unimodality versus multimodality. Unfortunately, as you noted, we cannot distinguish between the hypothesis that the bimodality is due to rounding (at 500 M) versus the hypothesis that the bimodality is due to ambiguity between Europe and the EU. This holds even if a hypothesis test rejects a unimodal model, but if anyone is still interested in testing for unimodality, I suggest considering Efron and Tibshirani's approach using the bootstrap.
Edit: Updated the plot. I switched from adaptive bandwidth to fixed bandwidth (because it seems to achieve higher efficiency), so parts of what I wrote below are no longer relevant—I've put these parts in square brackets.
Plot notes: [The adaptive bandwidth was achieved with Mathematica's built-in "Adaptive" option for SmoothKernelDistribution, which is horribly documented; I think it uses the same algorithm as 'akj' in R's quantreg package.] A Gaussian kernel was used with the bandwidth set according to Silverman's rule-of-thumb [and the sensitivity ('alpha' in akj's documentation) set to 0.5]. The bootstrap confidence intervals are "biased and unaccelerated" because I don't (yet) understand how bias-corrected and accelerated bootstrap confidence intervals work. Tick marks on the x-axis represent the actual data with a slight jitter added to each point.
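For anyone without Mathematica, here is a rough standard-library Python sketch of the same idea: a fixed-bandwidth Gaussian kernel density estimate with Silverman's rule-of-thumb bandwidth and plain percentile bootstrap bands. It is not the code behind the plot. The function names, the synthetic two-cluster data, and the choice to reuse one bandwidth across resamples are all my assumptions.

```python
import math
import random
import statistics

def silverman_bandwidth(data):
    """Silverman's rule of thumb: 0.9 * min(sd, IQR/1.34) * n^(-1/5)."""
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    spread = min(sd, iqr / 1.34) if iqr > 0 else sd
    return 0.9 * spread * len(data) ** (-0.2)

def gaussian_kde(data, grid, h):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    norm = 1.0 / (len(data) * h * math.sqrt(2 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)
            for x in grid]

def bootstrap_bands(data, grid, h, resamples=999, level=0.95, seed=0):
    """Pointwise percentile bands from resampling the data with replacement."""
    rng = random.Random(seed)
    curves = [gaussian_kde(rng.choices(data, k=len(data)), grid, h)
              for _ in range(resamples)]
    lo_idx = int((1 - level) / 2 * resamples)
    hi_idx = int((1 + level) / 2 * resamples) - 1
    lower, upper = [], []
    for j in range(len(grid)):
        column = sorted(curve[j] for curve in curves)
        lower.append(column[lo_idx])
        upper.append(column[hi_idx])
    return lower, upper

# Synthetic guesses (in millions) clustered near 508 and 739; NOT survey data.
rng = random.Random(1)
guesses = ([rng.gauss(508, 60) for _ in range(120)]
           + [rng.gauss(739, 60) for _ in range(80)])

grid = list(range(200, 1101, 25))
h = silverman_bandwidth(guesses)
density = gaussian_kde(guesses, grid, h)
lower, upper = bootstrap_bands(guesses, grid, h, resamples=200)
```

Plotting `density`, `lower`, and `upper` against `grid` gives a picture of the same kind as the one above, though Silverman's rule tends to oversmooth bimodal data, so a real dip may look shallower than it is.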
As one datapoint I went with Europe as EU so it's plausible others did too
Me too, at least sort of - I just had a number stored in my brain that I associated with "Europe." Turned out it was EU only, although I didn't have any confusion about the question - I thought I was answering for all of Europe.
Same here.
The misinterpretation of the survey's meaning of "Europe" as "EU" is itself a failure as significant as wrongly estimating its population... so it's not as if it excuses people who got it wrong and yet neither sought clarification, nor took the possibility of misinterpretation into account when giving their confidence ratios...
It's also not obvious that people who went with the EU interpretation were incorrect. Language is contextual; if we were to parse the Times, Guardian, BBC, etc over the past year and see how the word "Europe" is actually used, it might be the land mass, or it might be the EU. Certainly one usage will have been more common than the other, but it's not obvious to me which one it will have been.
That said, if I had noticed the ambiguity and not auto parsed it as EU, I probably would have expected the typical American to use Europe as land mass and since I think Yvain is American that's what I should have gone with.
On the other other hand, the goal of the question is to gauge numerical calibration, not to gauge language parsing. If someone thought they were answering about the EU, and picked a 90% confidence interval that did in fact include the population of the EU, that gives different information about the quantity we are trying to measure than if someone thinks Europe means the continent including Russia and picks a 90% confidence interval that does not include the population of the landmass. Remember, this is not a quiz in school to see if someone gets "the right answer"; this is a tool that's intended to measure something.
Yvain explicitly said "Wikipedia's Europe page".
Which users could not double-check because they might see the population numbers.
But they should expect the Wikipedia page to refer to the continent.
You might as well ask, "Who is the president of America?" and then follow up with, "Ha ha got you! America is a continent, you meant USA."
I don't think you're making the argument that Yvain deliberately wanted to trick people into giving a wrong answer -- so I really don't see your analogy as illuminating anything.
It was a question. People answered it wrongly whether by making a wrong estimation of the answer, or by making a wrong estimation of the meaning of the question. Both are failures -- and why should we consider the latter failure as any less significant than the former?
EDIT TO ADD: Mind you, reading the excel of the answers it seems I'm among the people who gave an answer in individuals when the question was asking number in millions. So it's not as if I didn't also have a failure in answering -- and yet I do consider that one a less significant failure. Perhaps I'm just being hypocritical in this though.
Thanks for taking the time to conduct and then analyze this survey!
What surprised me:
What disappointed me:
And a comment at the end:
Given that LW explicitly tries to exclude politics from discussion (and for reasons I find compelling), what makes you expect differently?
Incorporating LW debiasing techniques into daily life will necessarily be significantly harder than just reading the Sequences, and even those have only been read by a relatively small proportion of posters...
"Time online per week seems plausible from personal experience, but I didn't expect the average to be so high."
I personally spend an average of 50 hours a week online.
That's because, by profession, I am a web-developer.
The percentage of LessWrong members in IT is clearly higher than that of the average population.
I postulate that the higher number of other IT geeks (who, like me, are also likely spending high numbers of hours online per week) is pushing up the average to a level that seems, to you, to be surprisingly high.
With only 500 people responding to the IQ question, it is entirely possible that this is simply a selection effect. I.e. only people with high IQ test themselves or report their score while lower IQ people keep quiet.
There's nothing necessarily wrong with this. You are assuming that feminism is purely a matter of personal preference, incorrectly I feel. If you reduce feminism to simply asking "should women have the right to vote" then you should in fact find a correlation between that and "is there such a thing as global warming", because the correct answer in each case is yes.
Not saying I am necessarily in favour of modern day feminism, but it does bother me that people simply assume that social issues are independent of fact. This sounds like "everyone is entitled to their opinion" nonsense to me.
What I find more surprising is that there is no correlation between IQ and political beliefs whatsoever. I suspect that this is simply because the significance level is too strict to find anything.
With this, on the other hand, I agree completely.
I've heard GMOs described as the left equivalent for global warming-- maybe there should be a question about GMOs on next survey.
There is a question about it. It's the existential threat that's most feared among Lesswrongers. Bioengineered pandemics are a threat due to genetically manipulated organisms.
If that's not what you want to know, how would you word your question?
I took "bioengineered" to imply 'deliberately' and "pandemic" to imply 'contagious', and in any event fear of > 90% of humans dying by 2100 is far from the only possible reason to oppose GMOs.
I didn't advocate that it's the only reason. That's why I asked for a more precise question.
If the tools that you need to genetically manipulate organisms are widely available, it's much easier to deliberately produce a pandemic.
It's possible to make bacteria immune to antibiotics just by exposing them to antibiotics, without manipulating the genes directly. On the other hand, I think that people fear bioengineered pandemics because they expect stronger capabilities in regards to manipulating organisms in the future.
Is it, though? I did a quick fact check on this, and found this article which seems to say it is more split down the middle (for as much as US politicians are representative, anyway). It also highlights political divides for other topics.
It's a pity that some people here are so anti-politics (not entirely unjustified, but still). I think polling people here on issues which are traditionally right or left wing but which have clear-cut correct answers to them would make for quite a nice test of rationality.
Are you quite sure about that? Any examples outside of young earth / creationists?
While we're here, there may be questions about animal testing, alternative medicine, gun control, euthanasia, and marijuana legalization. (I'm not saying that the left is wrong about all of these.)
I object to GMOs, but I object to GMOs not because of fears that they may be unnoticed health hazards, but rather because they are often used to apply DRM and patents to food, and applying DRM and patents to food has the disadvantages of applying DRM and patents to computer software. Except it's much worse since 1) you can do without World of Warcraft, but you can't do without food, and 2) traditional methods of producing food involve copying and organisms used for food normally copy themselves.
ISTR I've read farmers have preferred to buy seeds from specialized companies rather than planting their own from the previous harvest since decades before the first commercial GMO was introduced.
Yes, but they wouldn't be sued out of existence IF they had to keep their own.
It seems that should make you object to certain aspects of the Western legal system.
Given your reasoning I don't understand why you object to GMOs but don't object on the same grounds to, say, music and videos which gave us DMCA, etc.
I object to DRM and patents on entertainment as well. (You can't actually patent music and videos, but software is subject to software patents and I do object to those.)
If you're asking why I don't object to entertainment as a class, it's because of practical considerations--there is quite a bit of entertainment without DRM, small scale infringers are much harder to catch for entertainment, much entertainment is not patented, and while entertainment is copyrighted, it does not normally copy itself and copying is not a routine part of how one uses it in the same way that producing and saving seeds is of using seeds. Furthermore, pretty much all GMO organisms are produced by large companies who encourage DRM and patents. There are plenty of producers of entertainment who have no interest in such things, even if they do end up using DVDs with CSS.
What do you think of golden rice?
To me it has always sounded right. I'm MENSA-level (at least according to the test the local MENSA association gave me) and LessWrong is the first forum I ever encountered where I've considered myself below-average -- where I've found not just one or two but several people who can think faster and deeper than me.
Below average or simply not exceptional? I'm certainly not exceptional here but I don't think I'm particularly below average. I suppose it depends on how you weight the average.
Same for me.
"The overconfidence data hurts, but as someone pointed out in the comments, it's hard to ask a question which isn't misunderstood."
I interpreted this poor level of calibration more to the fact that it's easier to read about what you should be doing than to actually go and practice the skill and get better at it.
Not sure how much sense it makes to take the arithmetic mean of probabilities when the odds vary over many orders of magnitude. If the average is, say, 30%, then it hardly matters whether someone answers 1% or .000001%. Also, it hardly matters whether someone answers 99% or 99.99999%.
I guess the natural way to deal with this would be to average (i.e., take the arithmetic mean of) the order of magnitude of the odds (i.e., log[p/(1-p)], p someone's answer). Using this method, it would make a difference whether someone is "pretty certain" or "extremely certain" that a certain statement is true or false.
Does anyone know what the standard way for dealing with this issue is?
Use medians and percentiles instead of means and standard deviations.
Yeah, log odds sounds like a good way to do it. Aggregating estimates is hard because peoples' estimates aren't independent, but averaging log odds will at least do better than averaging probabilities.
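To make the difference concrete, here is a minimal sketch comparing the arithmetic mean of probabilities with the mean taken in log-odds space. The answer lists are made up for illustration, not survey data, and the helper names are my own. Note that answers of exactly 0% or 100% have infinite log odds, so real data would need clamping or exclusion first.

```python
import math

def logit(p):
    """Log odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Map log odds back to a probability."""
    return 1 / (1 + math.exp(-x))

def mean_probability(ps):
    """Plain arithmetic mean of the probabilities."""
    return sum(ps) / len(ps)

def mean_log_odds_probability(ps):
    """Average the log odds, then convert the average back to a probability."""
    return inv_logit(sum(logit(p) for p in ps) / len(ps))

# Hypothetical answers: two groups that differ only in how certain
# the third respondent is (1% versus 0.0001%).
answers_a = [0.5, 0.5, 0.01]
answers_b = [0.5, 0.5, 0.000001]

for name, answers in (("a", answers_a), ("b", answers_b)):
    print(name, mean_probability(answers), mean_log_odds_probability(answers))
```

The arithmetic means of the two groups are nearly identical, while the log-odds averages differ by more than an order of magnitude, which is the "pretty certain" versus "extremely certain" distinction described above.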
Next survey, I'd be interested in seeing statistics involving:
Excellent write-up and I look forward to next year's.
I'd also like to see time spent per day meditating, or other form of mental training
How would you word the question?
While I don't remember the precise level, I would note that there are studies suggesting a rather surprisingly low level of correlation between self perceived attractiveness and attractiveness as perceived by others, and if we could induce a sufficient sample of participants to submit images of themselves to be rated by others (possibly in a context where they would not themselves find out the rating they received,) I think the comparison of those two values would be much more interesting than self-perceived attractiveness alone.
I'd like:
Oh, we are really self-serving elitist overconfident pricks, aren't we?
How do you expect anybody to be able to answer that and what does it even mean? First, what community, exactly? Second, average - over what?
I think he means the people who take the survey.
If you ask in the survey for the self-perceived physical attractiveness you can ask in the same survey for the estimated average of all survey takers.
I think Acidmind means we should ask people their self-perceived attractiveness, and then ask them to estimate the average that will be given by all people taking the survey.
I thought quite a bit about this and couldn't decide on many good questions.
The Anki question is sort of a result of this desire.
I thought of asking about pedometer usage such as Fitbit/Nike Plus etc., but I'm not sure if the number of people is enough to warrant the question.
Which specific questions would you want?
By what metric? Total time investment? Few people can give you an accurate answer to that question.
Asking good questions isn't easy.
I personally don't think that term is very meaningful. I do have hotornot pictures that scored a 9, but what does that mean? The last time I used Tinder I clicked through a lot of female images and very few liked me back. But I haven't yet isolated factors, nor do I know the average success rates for guys using Tinder.
They're interested in not gathering data that would cause someone to admit criminal behavior. A person might be findable if you know their stances on a few questions. There's also the issue of possible outsiders being able to say: "30% of LW participants are criminals!"
I agree, that would be a nice question.
Quantified Self examples:
Social media example:
Asking about self-perceived attractiveness tells us little about how attractive a person is, but quite a bit about how they see themselves, and I want to learn how that's correlated with answers to all these other questions.
Maybe the recreational drug use question(s) could be stripped from the public data?
Keeping a calendar that records when you do which actions is a recording of personal data, and for most people it covers timeframes longer than a month.
Anyone who uses Anki gets automated background data recording of how many minutes per day he uses Anki.
I might be willing to call either of those self-quantifying activities. Definitely the first one, if you actually put most activities you do on there rather than just the ones that aren't habit or important enough to definitely not forget. I think the question could be modified to capture the intent. Let's see...
That sounds like a good question. Hopefully we remember when the time comes up.
I'm not culture.
In some social circles I might behave in one way, in others another way. In different situations I act differently depending on how strongly I want to communicate a demand.
Good point. It might not even make sense to ask "Which culture of social interaction do you feel most at home with, Ask or Guess?".
Repeating complaints from last year:
The 2012 estimate from SATs was about 128, since the 1994 renorming destroyed the old relationship between the SAT and IQ. Our average SAT (on 1600) was again about 1470, which again maps to less than 130, but not by much. (And, again, self-reported average probably overestimates actual population average.)
I still think you're asking this question in a way that's particularly hard for people to get right. (The issue isn't the fact you ask about, but what sort of answers you look for.)
You've clearly got an error in your calibration chart; you can't have 2 out of 3 elite LWers be right in the [95,100] category but 100% of typical LWers are right in that category. Or are you not including the elite LWers in typical LWers? Regardless, the person who gave a calibration of 99% and the two people who gave calibrations of 100% aren't elite LWers (karmas of 0, 0, and 4; two 25% of the sequences and one 50%).
The calibration chart doesn't make clear the impact of frequency. If most people are providing probabilities of 20%, and they're about 20% right, then most people are getting it right- and the 2-3 people who provided a probability of 60% don't matter.
There are a handful of ways to depict this. One I haven't seen before, which is probably ugly, is to scale the width of the points by the frequency. Instead, here's a flat graph of the proportion of survey respondents who gave each calibration bracket:
Significant is that if you add together the 10, 20, and 30 brackets (the ones around the correct baseline probability of ~20% of getting it right) you get 50% for typical LWers and 60% for elite LWers; so most people were fairly close to correctly calibrated, and the people who thought they had more skill on the whole dramatically overestimated how much more skill they had.
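A rough sketch of how the frequency information could be tabulated alongside the hit rate. The `(stated_percent, was_correct)` input format, the function name, and the floor-to-bracket rule are my assumptions for illustration; this is not the method used to build the chart in the post.

```python
from collections import defaultdict

def calibration_table(responses, width=10):
    """Group (stated_percent, was_correct) pairs into brackets.

    Returns, per bracket, the number of respondents, their share of all
    respondents, and the fraction of them who were right.
    Assumption: a stated probability is floored to its bracket, and
    100% is folded into the top bracket.
    """
    buckets = defaultdict(lambda: [0, 0])  # bracket -> [n, n_correct]
    for percent, correct in responses:
        bracket = min(int(percent // width) * width, 100 - width)
        buckets[bracket][0] += 1
        buckets[bracket][1] += int(correct)
    total = sum(n for n, _ in buckets.values())
    return {
        bracket: {"n": n, "share": n / total, "hit_rate": k / n}
        for bracket, (n, k) in sorted(buckets.items())
    }

# Hypothetical responses, not taken from the survey data.
example = [(20, True), (20, False), (20, False), (20, False), (20, False),
           (25, True), (60, False), (100, True)]
for bracket, row in calibration_table(example).items():
    print(bracket, row)
```

Reporting `share` next to `hit_rate` makes it visible when a bracket that looks badly calibrated only contains two or three people.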
(I put down 70% probability, but was answering the wrong question; I got the population of the EU almost exactly right, which I knew from GDP and per-capita comparisons to the US. Oops.)
It's very interesting that the same mistake was boldly made again this year... I guess this mistake is sort of self reinforcing due to the uncannily perfect equality between mean IQ and what's incorrectly estimated from the SAT scores.
According to Vaniver's data downthread, SAT taken only from LWers older than 36 (taking the old SAT) predicts 140 IQ.
I can't calculate the IQ of LWers younger than 36 because I can't find a site I trust to predict IQ from new SAT. The only ones I get give absurd results like average SAT 1491 implies average IQ 151.
Actually, I just ran the numbers on the SAT2400 and they're closer; the average percentile predicted from that is 99th, which corresponds to about 135.
2210 was 98th percentile in 2013. But it was 99th in 2007.
I haven't seen an SAT-IQ comparison site I trust. This one listed on gwern's website for example seems wrong.
If I remember correctly, I did SAT->percentile->average, rather than SAT->average->percentile; the first method should lead to a higher estimate if the tail is negative (which I think it is).
[edit]Over here is the work and source for that particular method- turns out I did SAT->average->percentile to get that result, with a slightly different table, and I guess I didn't report the average percentile that I calculated (which you had to rely on interpolation for anyway).
It's only accurate up to 1994.
For non-Americans, what's the difference between SAT 2400 and SAT 1600 ?
Averaging SAT scores is a little iffy because, given a cut-off, they won't have a Gaussian distribution. Also, given imperfect correlation it is unclear how one should convert the scores. If I pick someone with SAT in the top 1% I shouldn't expect IQ in the top 1% because of regression towards the mean. (Granted, I can expect both scores to be closer if I were picking by some third factor influencing both.)
It'd be interesting to compare frequency of advanced degrees with the scores, for people old enough to have advanced degrees.
The SAT used to have only two sections, with a maximum of 800 points each, for a total of 1600 (the worst possible score, IIRC, was 200 on each for 400). At some point after I graduated high school, they added a 3rd 800 point section (I think it might be an essay), so the maximum score went from 1600 to 2400.
Yes, it's a timed essay.
The correlation is the slope of the regression line in coordinates normalised to unit standard deviations. Assuming (for mere convenience) a bivariate normal distribution, let F be the cumulative distribution function of the unit normal distribution, with inverse invF. If someone is at the 1-p level of the SAT distribution (in the example p=0.01) then the level to guess they are at in the IQ distribution (or anything else correlated with SAT) is q = F(c invF(p)). For p=0.01, here are a few illustrative values:
The standard deviation of the IQ value, conditional on the SAT value, is the unconditional standard deviation multiplied by c' = sqrt(1-c^2). The q values for 1 standard deviation above and below are therefore given by qlo = F(-c' + c invF(p)) and qhi = F(c' + c invF(p)).
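The two formulas above can be evaluated with Python's standard library. This is only a sketch under the same bivariate-normal assumption; the function names and the correlation values in the loop are illustrative choices of mine, not estimates of the actual SAT-IQ correlation.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # cdf is F, inv_cdf is invF

def expected_level(p, c):
    """q = F(c * invF(p)): tail fraction to guess on the correlated measure."""
    return STD_NORMAL.cdf(c * STD_NORMAL.inv_cdf(p))

def one_sd_band(p, c):
    """(qlo, qhi) for one conditional standard deviation either side.

    Uses c' = sqrt(1 - c^2), qlo = F(-c' + c invF(p)), qhi = F(c' + c invF(p)).
    """
    c_prime = math.sqrt(1 - c * c)
    centre = c * STD_NORMAL.inv_cdf(p)
    return STD_NORMAL.cdf(centre - c_prime), STD_NORMAL.cdf(centre + c_prime)

p = 0.01
for c in (1.0, 0.9, 0.8, 0.7, 0.5, 0.0):  # illustrative correlations only
    q = expected_level(p, c)
    q_lo, q_hi = one_sd_band(p, c)
    print(f"c={c:.1f}  q={q:.4f}  qlo={q_lo:.4f}  qhi={q_hi:.4f}")
```

With c = 1 the guess stays at p, and with c = 0 it falls back to the median (q = 0.5); anything in between is pulled toward the mean.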
One reason SAT1600 and SAT2400 scores may differ is that some of the SAT1600 scores might in fact have come from before the 1994 renorming. Have you tried doing pre-1994 and post-1994 scores separately (guessing when someone took the SAT based on age?)
SAT1600 scores by age:
Average SAT for LWers 30 and under (217 total): 1491. (27 1600s.)
Average SAT for LWers 31 to 35 (74 total): 1462.7 (9 1600s.)
Average SAT for LWers 36 and older (81 total): 1437. (One 1600, by someone who's 56.)
I'm pretty sure the 36 and above are all the older SAT, suspect the middle group contains both, and pretty confident the younger group is mostly the newer SAT. The strong majority comes from the post 1995 test, and the scores don't seem to have changed by all that much in nominal terms.
Which creates another question, why do the SAT 2400 and SAT 1600 differ so much?
What's the best way to import the data into R without having to run as.numeric(as.character(...)) on all the numeric variables like the probabilities?
Could someone who voted for unfriendly AI explain how nanotech or biotech isn't much more of a risk than unfriendly AI (I'll assume MIRI's definition here)?
I ask this question because it seems to me that even given a technological singularity there should be enough time for "unfriendly humans" to use precursors to fully fledged artificial general intelligence (e.g. advanced tool AI) in order to solve nanotechnology or advanced biotech. Technologies which themselves will enable unfriendly humans to cause a number of catastrophic risks (e.g. pandemics, nanotech wars, perfect global surveillance (an eternal tyranny) etc.).
Unfriendly AI, as imagined by MIRI, seems to be the end product of a developmental process that provides humans ample opportunity to wreak havoc.
I just don't see any good reason to believe that the tools and precursors to artificial general intelligence are not themselves disruptive technologies.
And in case you believe advanced nanotechnology to be infeasible, but unfriendly AI to be an existential risk, what concrete scenarios do you imagine on how such an AI could cause human extinction without nanotech?
If I understand Eliezer's view, it's that we can't be extremely confident of whether artificial superintelligence or perilously advanced nanotechnology will come first, but (a) there aren't many obvious research projects likely to improve our chances against grey goo, whereas (b) there are numerous obvious research projects likely to improve our chances against unFriendly AI, and (c) inventing Friendly AI would solve both the grey goo problem and the uFAI problem.
Considering ... please wait ... tttrrrrrr ... prima facie, Grey Goo scenarios may seem more likely simply because they make better "Great Filter" candidates; whereas a near-arbitrary Foomy would spread out in all directions at relativistic speeds, with self-replicators no overarching agenty will would accelerate them out across space (the insulation layer with the sparse materials).
So if we approached x-risks through the prism of their consequences (extinction, hence no discernible aliens) and then reasoned our way back to our present predicament, we would note that within AI-power-hierarchies (AGI and up) there are few distinct long-term dan-ranks (most such ranks would only be intermediary steps while the AI falls "upwards"), whereas it is much more conceivable that there are self-replicators which can e.g. transform enough carbon into carbon copies (of themselves) to render a planet uninhabitable, but which lack the oomph (and the agency) to do the same to their light cone.
Then I thought that Grey Goo may yet be more of a setback, a restart, not the ultimate planetary tombstone. Once everything got transformed into resident von Neumann machines, evolution amongst those copies would probably occur at some point, until eventually there may be new macroorganisms organized from self-replicating building blocks, which may again show significant agency and turn their gaze towards the stars.
Then again (round and round it goes), Grey Goo would still remain the better transient Great Filter candidate (and thus more likely than uFAI when viewed through the Great Filter spectroscope), simply because of the time scales involved. Assuming the Great Filter is in fact an actual absence of highly evolved civilizations in our neighborhood (as opposed to just hiding or other shenanigans), Grey Goo biosphere-resets may stall the Kardashev climb sufficiently to explain us not having witnessed other civs yet. Also, Grey Goo transformations may burn up all the local negentropy (nanobots don't work for free), precluding future evolution.
Anyways, I agree that FAI would be the most realistic long-term guardian against accidental nanogoo (ironically, also uFAI).
My own suspicion is that the bulk of the Great Filter is behind us. We've awoken into a fairly old universe. (Young in terms of total lifespan, but old in terms of maximally life-sustaining years.) If intelligent agents evolve easily but die out fast, we should expect to see a young universe.
We can also consider the possibility of stronger anthropic effects. Suppose intelligent species always succeed in building AGIs that propagate outward at approximately the speed of light, converting all life-sustaining energy into objects or agents outside our anthropic reference class. Then any particular intelligent species Z will observe a Fermi paradox no matter how common or rare intelligent species are, because if any other high-technology species had arisen first in Z's past light cone it would have prevented the existence of anything Z-like. (However, species in this scenario will observe much younger universes the smaller a Past Filter there is.)
So grey goo creates an actual Future Filter by killing their creators, but hyper-efficient hungry AGI creates an anthropic illusion of a Future Filter by devouring everything in their observable universe except the creator species. (And possibly devouring the creator species too; that's unclear. Evolved alien values are less likely to eat the universe than artificial unFriendly-relative-to-alien-values values are, but perhaps not dramatically less likely; and unFriendly-relative-to-creator AI is almost certainly more common than Friendly-relative-to-creator AI.)
Probably won't happen before the heat death of the universe. The scariest thing about nanodevices is that they don't evolve. A universe ruled by nanodevices is plausibly even worse (relative to human values) than one ruled by uFAI like Clippy, because it's vastly less interesting.
(Not because paperclips are better than nanites, but because there's at least one sophisticated mind to be found.)
Two reasons: uFAI is deadlier than nano/biotech and easier to cause by accident.
If you build an AGI and botch friendliness, the world is in big trouble. If you build a nanite and botch friendliness, you have a worthless nanite. If you botch growth-control, it's still probably not going to eat more than your lab before it runs into micronutrient deficiencies. And if you somehow do build grey goo, people have a chance to call ahead of it and somehow block its spread. What makes uFAI so dangerous is that it can outthink any responders. Grey goo doesn't do that.
This seems like a consistent answer to my original question. Thank you.
You on the one hand believe that grey goo is not going to eat more than your lab before running out of steam and on the other hand believe that AI in conjunction with nanotechnology will not run out of steam, or only after humanity's demise.
You further believe that AI can't be stopped but grey goo can.
Accidental grey goo is unlikely to get out of the lab. If I design a nanite to self-replicate and spread through a living brain to report useful data to me, and I have an integer overflow bug in the "stop reproducing" code so that it never stops, I will probably kill the patient but that's it. Because the nanites are probably using glucose+O2 as their energy source. I never bothered to design them for anything else. Similarly if I sent solar-powered nanites to clean up Chernobyl I probably never gave them copper-refining capability -- plenty of copper wiring to eat there -- but if I botch the growth code they'll still stop when there's no more pre-refined copper to eat. Designing truly dangerous grey goo is hard and would have to be a deliberate effort.
As for stopping grey goo, why not? There'll be something that destroys it. Extreme heat, maybe. And however fast it spreads, radio goes faster. So someone about to get eaten radios a far-off military base saying "help! grey goo!" and the bomber planes full of incendiaries come forth to meet it.
Contrast uFAI, which has thought of this before it surfaces, and has already radioed forged orders to take all the bomber planes apart for maintenance or something.
I think a large part of that may simply be LW'ers being more familiar with UFAI and therefore knowing more details that make it seem like a credible threat / availability heuristic. So for example I would expect e.g. Eliezer's estimate of the gap between the two to be less than the LW average. (Edit: Actually, I don't mean that his estimate of the gap would be lower, but something more like it would seem like less of a non-question to him and he would take nanotech a lot more seriously, even if he did still come down firmly on the side of UFAI being a bigger concern.)
Presumably many people fear a very rapid "hard takeoff" where the time from "interesting slightly-smarter-than-human AI experiment" to "full-blown technological singularity underway" is measured in days (or less) rather than months or years.
The AI risk scenario that Eliezer Yudkowsky relatively often uses is that of the AI solving the protein folding problem.
If you believe a "hard takeoff" to be probable, what reason is there to believe that the distance between a.) an AI capable of cracking that specific problem and b.) an AI triggering an intelligence explosion is too short for humans to do something similarly catastrophic as what the AI would have done with the resulting technological breakthrough?
In other words, does the protein folding problem require AI to reach a level of sophistication that would allow humans, or the AI itself, within days or months, to reach the stages where it undergoes an intelligence explosion? How so?
My assumption is that the protein-folding problem is unimaginably easier than an AI doing recursive self-improvement without breaking itself.
Admittedly, Eliezer is describing something harder than the usual interpretation of the protein-folding problem, but it still seems a lot less general than a program making itself more intelligent.
Is this question equivalent to "Is the protein-folding problem equivalently hard to the build-a-smarter-intelligence-than-I-am problem?" ? It seems like it ought to be, but I'm genuinely unsure, as the wording of your question kind of confuses me.
If so, my answer would be that it depends on how intelligent I am, since I expect the second problem to get more difficult as I get more intelligent. If we're talking about the actual me... yeah, I don't have higher confidence either way.
It is mostly equivalent. Is it easier to design an AI that can solve one specific hard problem than an AI that can solve all hard problems?
Expecting that only a fully-fledged artificial general intelligence is able to solve the protein-folding problem seems to be equivalent to believing the conjunction "a universal problem solver can solve the protein-folding problem" AND "a universal problem solver is easier to solve than the protein-folding problem". Are there good reasons to believe this?
ETA: My perception is that people who believe unfriendly AI to come sooner than nanotechnology believe that it is easier to devise a computer algorithm to devise a computer algorithm to predict protein structures from their sequences rather than to directly devise a computer algorithm to predict protein structures from their sequences. This seems counter-intuitive.
Ah, this helps, thanks.
For my own part, the idea that we might build tools better at algorithm-development than our own brains are doesn't seem counterintuitive at all... we build a lot of tools that are better than our own brains at a lot of things. Neither does it seem implausible that there exist problems that are solvable by algorithm-development, but whose solution requires algorithms that our brains aren't good enough algorithm-developers to develop algorithms to solve.
So it seems reasonable enough that there are problems which we'll solve faster by developing algorithm-developers to solve them for us, than by trying to solve the problem itself.
Whether protein-folding is one of those problems, I have absolutely no idea. But it sounds like your position isn't unique to protein-folding.
Some thoughts on the correlations:
At first I saw that IQ seems to correlate with fewer children (a not uncommon observation):
But then I noticed that number of children obviously correlates with age, and age with IQ (somewhat):
So it may be that older people just have lower IQ (Flynn effect).
Something to think about:
This can be read as smarter people stay shorter on LW. It seems to imply that over time LW will degrade in smarts. But it could also just mean that smarter people just turn over faster (thus also entering faster).
On the other hand most human endeavors tend toward the mean over time.
Older people (like me ahem) either take longer to notice LW or the community is spreading from younger to older people slowly.
This made me laugh:
Guess who does the voting :-)
"Time on Less Wrong/IQ: -.164 (492)
This can be read as smarter people stay shorter on LW. It seems to imply that over time LW will degrade in smarts. But it could also just mean that smarter people just turn over faster (thus also entering faster)."
Alternatively: higher IQ people can get the same amount of impact out of less reading-time on the site, and therefore do not need to spend as much time on the site
The 1600 SAT was renormed in 1994, and scores afterwards are much higher than (and not directly comparable to) scores before. As well, depending on how the 'null' is interpreted, the youngest are unlikely to have an SAT score out of 1600, because it switched to 2400 in 2005. The line between having a score out of 1600 or not is probably at about 22 years old.
Wait, this means that reading less wrong makes you dumber!
Hmmm, there was something about correlation and causation... but I don't remember it well. I must be spending too much time on less wrong.
In the data set older people have a significantly higher IQ than younger people. The effect however disappears if you start to control for whether someone lives in the US.
US LW users are on average more intelligent and older.
For the second year in a row Pandemic is the leading cat risk. If you include natural and designed it has twice the support of the next highest cat risk.
That surprised me slightly, more because I'm not particularly aware of discussion of bioengineered pandemics as an existential risk than because I don't think it's plausible. I suppose this means a lot of people are worried about it but not discussing it?
That's because cats never build research stations.
I gave blood before I was an EA but stopped because I didn't think it was effective. Does being veg*n correlate with calling oneself an EA? That seems like a more effective intervention.
Not necessarily a joke.
The link contains a typo: it links to a non-existent article on the/a "Pirate part" instead of the Pirate Party.
Fixed, thanks.
Thanks for doing this!
Results from previous years: 2009 2011 2012
The standard way to fix this is to run them on half the data only and then test their predictive power on the other half. This eliminates almost all spurious correlations.
Alternatively, Bonferroni correction.
That's roughly what Yvain did, by taking into consideration the number of correlations tested when setting the significance level.
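For anyone unfamiliar with the Bonferroni correction mentioned above, here is a minimal sketch: with m correlations tested, each p-value is compared against alpha / m instead of alpha. The p-values below are made up for illustration and are not taken from the survey; the function name is just a label for this example.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return one boolean per test: True where p < alpha / m.

    m is the number of tests run, so the more correlations you check,
    the stricter the per-test threshold becomes.
    """
    m = len(p_values)
    threshold = alpha / m
    return [p < threshold for p in p_values]


# Hypothetical p-values for four correlations (not survey data).
example_p = [0.001, 0.02, 0.04, 0.30]
flags = bonferroni_significant(example_p)
print(flags)  # [True, False, False, False]
```

With four tests the per-test threshold drops from 0.05 to 0.0125, so only the first hypothetical result survives.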
Does that actually work better than just setting a higher bar for significance? My gut says that data is data and chopping it up cleverly can't work magic.
Cross validation is actually hugely useful for predictive models. For a simple correlation like this, it's less of a big deal. But if you are fitting a local linearly weighted regression line for instance, chopping the data up is absolutely standard operating procedure.
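To make the split-half idea from this thread concrete, here is a rough sketch on synthetic data (nothing below comes from the survey file). It shuffles the rows, splits them in two, and computes the same Pearson correlation in each half; a real effect should show up in both halves, while a fluke found in one half usually will not replicate in the other. The helper names are made up for this example.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def split_half_correlations(xs, ys, seed=0):
    """Shuffle row indices, split in two, return (r_first_half, r_second_half)."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    half = len(idx) // 2
    a, b = idx[:half], idx[half:]
    r_a = pearson([xs[i] for i in a], [ys[i] for i in a])
    r_b = pearson([xs[i] for i in b], [ys[i] for i in b])
    return r_a, r_b


# Synthetic data: y_real depends on x, y_noise does not.
data_rng = random.Random(42)
x = [data_rng.gauss(0, 1) for _ in range(400)]
y_real = [xi + data_rng.gauss(0, 1) for xi in x]
y_noise = [data_rng.gauss(0, 1) for _ in x]

print(split_half_correlations(x, y_real))   # similar, clearly positive in both halves
print(split_half_correlations(x, y_noise))  # both near zero
```

This only illustrates the hold-out check for a single correlation; it is not a claim about how the survey analysis was actually run.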
Hah, my score almost doubled from last year.
There's something strange about the analysis posted.
How is it that 100% of the general population with high (>96%) confidence got the correct answer, but only 66% of a subset of that population? Looking at the provided data, it looks like 3 out of 4 people (none with high Karma scores) who gave the highest confidence were right.
(Predictably, the remaining person with high confidence answered 500 million, which is almost the exact population of the European Union (or, in the popular parlance "Europe"). I almost made the same mistake, before realizing that a) "Europe" might be intended to include Russia, or part of Russia, plus other non-EU states and b) I don't know the population of those countries, and can't cover both bases. So in response, I kept the number and decreased my confidence value. Regrettably, 500 million can signify both tremendous confidence and very little confidence, which makes it hard to do an analysis of this effect.)
The second word in the winning secret phrase is pony (chosen because you can't spell the former without the latter); I'll accept the prize money via PayPal to main att zackmdavis daht net.
(As I recall, I chose to Defect after looking at the output of one call to Python's random.random() and seeing a high number, probably point-eight-something. But I shouldn't get credit for following my proposed procedure (which turned out to be wrong anyway) because I don't remember deciding beforehand that I was definitely using a "result > 0.8 means Defect" convention (when "result < 0.2 means Defect" is just as natural). I think I would have chosen Cooperate if the random number had come up less than 0.8, but I haven't actually observed the nearby possible world where it did, so it's at least possible that I was rationalizing.)
(Also, I'm sorry for being bad at reading; I don't actually think there are seven hundred trillion people in Europe.)
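The rationalization worry above goes away if the convention is fixed before the number is drawn. A minimal sketch, assuming a 20% Defect probability (my reading of the 0.8 threshold, not something stated as the procedure) and a made-up function name:

```python
import random


def choose_move(defect_probability, rng=random):
    """Precommitted convention: Defect if and only if the draw is below defect_probability.

    The convention lives in the code, so it cannot be reinterpreted
    after seeing the number.
    """
    draw = rng.random()  # uniform in [0.0, 1.0)
    return "Defect" if draw < defect_probability else "Cooperate"


# Assumed probability of 0.2, purely for illustration.
print(choose_move(0.2))
```

Either convention ("draw > 0.8" or "draw < 0.2") gives the same odds; what matters is committing to one of them before looking.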
When I heard about Yvain's PD contest, I flipped a coin. I vowed that if it came up heads, I would Paypal the winner $200 (on top of their winnings), and if it came up tails I would ask them for the prize money they won.
It came up tails. YOUR MOVE.
(No, not really. But somebody here SHOULD have made such a commitment.)
Hey, it's not too late: if you should have made such a commitment, then the mere fact that you didn't actually do so shouldn't stop you now. Go ahead, flip a coin; if it comes up heads, you pay me $200; if it comes up tails, I'll ask Yvain to give you the $42.96.
...I don't think this is a very wise offer to make on the Internet unless the "coin" is somewhere you can both see it.
Yes, of course I thought of that when considering my reply, but in this particular context (where we're considering counterfactual dealmaking presumably because the idea of pulling such a stunt in real life is amusing), I thought it was more in the spirit of things to be trusting. As you know, Newcomblike arguments still go through when Omega is merely a very good and very honest predictor rather than a perfect one, and my prior beliefs about reasonably-well-known Less Wrongers make me willing to bet that Simplicio probably isn't going to lie in order to scam me out of forty-three dollars. (If it wasn't already obvious, my offer was extended to Simplicio only and for the specified amounts only.)
Em, I don't actually like those odds all that much, thanks!
Well played :)
True, though they forgot to change the "You may make my anonymous survey data public (recommended)" to "You may make my ultimately highly unanonymous survey data public (not as highly recommended)".
It'd be easy enough to claim the prize anonymously, no?
Results on google docs.
I would like to see how percent of positive karma, rather than total karma, correlates with the other survey responses. I find the former a more informative measure than the latter.
I agree that it would be interesting but I suspect that just as "total karma" is a combination of "comment quality" and "time on LW" (where for most purposes the former is more interesting, but the latter makes a big difference), so "percent positive karma" is a combination of "comment quality" and "what sort of discussions one frequents", where again the former is more interesting but the latter makes a big difference.
The correlations with number of partners seem like they confound two very different questions: "in a relationship or not?" and "poly or not, and if so how poly?". This makes correlations with things like IQ and age less interesting. It seems like it would be more informative to look at the variables "n >= 1" and "value of n, conditional on n >= 1".
(Too lazy to redo those analyses myself right now, and probably ever. Sorry. If someone else does I'll be interested in the results, though.)
I don't know if this is the LW hug or something but I'm having trouble downloading the xls. Also, will update with what the crap my passphrase actually means, because it's in Lojban and mildly entertaining IIRC.
EDIT: Felt like looking at some other entertaining passphrases. Included with comment.
sruta'ulor maftitnab {mine! scarf-fox magic-cakes!(probably that kind)}
Afgani-san Azerbai-chan {there... are no words}
DEFECTORS RULE
do mlatu {a fellow lojbanist!}
lalxu daplu {and another?}
telephone fonxa {and another! please get in contact with me. please.}
xagfu'a rodo {indeed! but where are all you people coming from, and why don't I know you?}
zifre dunda {OH COME ON WHERE ARE YOU PEOPLE}
eponymous hahanicetry_CHEATER {clever.}
fart butt {I am twelve...}
FROGPENIS SPOOBOMB {... and so is a lot of LW.}
goat felching {good heavens}
I don't want the prize! Pick someone else please!
I dont care about the MONETARY REWARD but you shoudl know that
Irefuse myprize
No thanks
not interested
{a lot of refusers!}
I'm gay
john lampkin (note: this is not my name)
lookatme iwonmoney {nice try guy}
mencius suckedmoziwasbetter
mimsy borogoves {repeated!}
TWO WORD {repeated, and try harder next time}
octothorpe interrobang
SOYUZ NERUSHIMIY {ONWARD, COMRADE(note: person is apparently a social democrat.)}
TERRORISTS WIN
thisissuspiciouslylike askingforourpasswordmethodologies {I should think not.}
zoodlybop zimzamzoom {OH MY GODS BILL COSBY IS A LESSWRONGER.}
AND THAT'S ALL, FOLKS.
Actual translation: INDESTRUCTIBLE UNION
(It's from the national anthem of the U.S.S.R.)
The following passphrases were repeated (two occurrences each; the only entry that occurred more than twice was the blank one):
Bagel bites
EFFulgent shackles
Kissing bobbies
mimsy borogoves
SQUEAMISH OSSIFRAGE
If we go case-insensitive, there was also 'No thanks' and 'no thanks'; and 'TWO WORD' and 'Two Word'.
(The first three of those came next to each other, so they were probably just multiple entries.)
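For anyone who wants to repeat the duplicate check, something like the following would do it. This is a sketch on a small hand-typed sample that reuses a few of the phrases quoted in this thread; it does not read the actual xls, and the function name is made up.

```python
from collections import Counter


def repeated_phrases(phrases, case_sensitive=True):
    """Return {phrase: count} for non-blank phrases that occur more than once."""
    keys = [p.strip() if case_sensitive else p.strip().casefold() for p in phrases]
    counts = Counter(k for k in keys if k)  # blanks are skipped
    return {k: c for k, c in counts.items() if c > 1}


# Hand-typed sample, not the real passphrase column.
sample = [
    "mimsy borogoves", "mimsy borogoves",
    "No thanks", "no thanks",
    "TWO WORD", "Two Word",
    "octothorpe interrobang", "",
]

print(repeated_phrases(sample))
# {'mimsy borogoves': 2}
print(repeated_phrases(sample, case_sensitive=False))
# {'mimsy borogoves': 2, 'no thanks': 2, 'two word': 2}
```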
It is a datapoint that only one person apparently took up the offer of SQUEAMISH OSSIFRAGE
So, I was going through the xls, and saw the "passphrase" column. "Wait, what? Won't the winner's passphrase be in here?"
Not sure if this is typos or hitting the wrong entry field, but two talented individuals managed to get 1750 and 2190 out of 1600 on the SAT.
I was curious about the breakdown of romance (whether or not you met your partner through LW) and sexuality. For "men" and "women," I just used sex; any blanks or others are excluded. Numbers are Yes/No/I didn't meet them through community but they're part of the community now:
Gay men: 2/36/3
Lesbian women: 0/2/0
Bi men: 4/111/9
Bi women: 12/32/7
Straight men: 29/1031/26
Straight women: 1/55/10
I'm not quite sure how seriously to take these numbers, though. If 29 straight guys found a partner through the LW community, and a total of 14 straight and bi women found partners through the community, we need men to be about twice as likely to take the survey as women. (Possible, especially if women are more likely to go to meetups and less likely to post, but I don't feel like looking that up for the group as a whole.)
But the results are clear: the yes/no ratio was way higher for bi women than anyone else. Bi women still win the yes+didn't/no ratio with .6, but straight women are next with .2, followed by gay men at .14 and bi men at .12.
So, uh, advertise LW to all the bi women you know?
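The ratios quoted above can be reproduced from the Yes/No/joined-later counts in the list. A short sketch (the counts are copied from the comment; the variable and function names are mine):

```python
# (yes, no, didn't meet through the community but partner is part of it now)
counts = {
    "Gay men": (2, 36, 3),
    "Lesbian women": (0, 2, 0),
    "Bi men": (4, 111, 9),
    "Bi women": (12, 32, 7),
    "Straight men": (29, 1031, 26),
    "Straight women": (1, 55, 10),
}


def ratios(yes, no, joined_later):
    """Return (yes / no, (yes + joined_later) / no)."""
    return yes / no, (yes + joined_later) / no


for group, (yes, no, later) in counts.items():
    yes_no, combined = ratios(yes, no, later)
    print(f"{group}: yes/no = {yes_no:.3f}, (yes+didn't)/no = {combined:.3f}")
```

Bi women come out around 0.375 on yes/no and about 0.59 on the combined ratio, which matches the rounded .6 above.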
That seems fairly plausible to me, actually. My impression of the community is that the physical side of it is less gender-skewed than the online side, although both are mostly male.
There's also polyamory to take into account.
In a manner of speaking: eponymous hahanicetry_CHEATER
I know, that's why I mentioned it- I decided not to quote it to leave it as a surprise for people who decided to then go check. But I had missed that someone else posted it.
You know, it would be interesting if Yvain had put something else there just to see how many people would try to cheat.
Nice work Yvain and Ozy, and well done to Zack for winning the MONETARY REWARD.
I continue to be bad at estimating but well calibrated.
(Also, I'm sure that this doesn't harm the data to any significant degree but I appear to appear twice in the data, both rows 548 and 552 in the xls file, with row 548 being more complete.)
I'm extremely surprised and confused. Is there an explanation for how these probabilities are so high?
Our universe came from somewhere. Can you be 100% sure that no intelligence was involved? If there was an intelligence involved, it would probably qualify as supernatural and god, even if it was something technically mundane (such as the author of the simulation we call reality, or an intelligent race that created our universe or tweaked the result, possibly as an attempt to escape the heat death of their universe). E.g., if you ask our community, "What are the odds that in the next million years humans will be able to create whole world simulations?" I suspect they'll answer "very high".
For extra fun, you can wonder if the total number of simulated humans is expected to outnumber the total number of real humans.
Well, we apparently have 3.9% of "committed theists", 3.2% of "lukewarm theists", and 2.2% of "deists, pantheists, etc.". If these groups put Pr(God) at 90%, 60%, 40% respectively (these numbers are derived from a sophisticated scientific process of rectal extraction) then they contribute 6.3% of the overall Pr(God) requiring an average Pr(God) of about 3.1% from the rest of the LW population. If enough respondents defined "God" broadly enough, that doesn't seem altogether crazy.
If those groups put Pr(religion) at 90%, 30%, 10% then they contribute about 4.7% to the overall Pr(religion) suggesting ~1% for the rest of the population. Again, that doesn't seem crazy.
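The arithmetic in the two paragraphs above is easy to check. In this sketch the group shares and the per-group probabilities are the ones given in the comment (the probabilities being admittedly made up there); the overall means are not assumed but back-solved from the "about 3.1%" and "~1%" figures for everyone else. Function names are mine.

```python
# Share of respondents in each group, in percent (from the comment above).
shares = {"committed": 3.9, "lukewarm": 3.2, "deist_etc": 2.2}

# The comment's guessed probabilities for each group.
p_god = {"committed": 0.90, "lukewarm": 0.60, "deist_etc": 0.40}
p_religion = {"committed": 0.90, "lukewarm": 0.30, "deist_etc": 0.10}


def contribution(shares, probs):
    """Percentage points these groups add to the population-wide mean probability."""
    return sum(shares[g] * probs[g] for g in shares)


def implied_overall(shares, probs, rest_avg_percent):
    """Overall mean (percent) implied if everyone else averages rest_avg_percent."""
    rest_share = 100 - sum(shares.values())  # 90.7% of respondents
    return contribution(shares, probs) + rest_share * rest_avg_percent / 100


print(f"{contribution(shares, p_god):.2f}")            # 6.31
print(f"{contribution(shares, p_religion):.2f}")       # 4.69
print(f"{implied_overall(shares, p_god, 3.1):.2f}")    # 9.12
print(f"{implied_overall(shares, p_religion, 1.0):.2f}")  # 5.60
```

So the comment's numbers hang together if the survey-wide means were roughly 9% for God and a bit under 6% for religion.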
So the real question is more or less equivalent to: How come there are so many committed theists on LW? Which we can frame two ways: (1) How come LW isn't more effective in helping people recognize that their religion is wrong? or (2) How come LW isn't more effective in driving religious people away? To which I would say (1) recognizing that your religion is wrong is really hard and (2) I hope LW is very ineffective in driving religious people away.
(For those who expect meta-level opinions on these topics to be perturbed by object-level opinions and wish to discount or adjust: I am an atheist; I don't remember what probabilities I gave but they would be smaller than any I have mentioned above.)
When it comes to a hypothesis as extreme as 'an irreducible/magical mind like the one described in various religions created our universe', I'd say that if 3% credence isn't crazy, 9% isn't either. I took shokwave to be implying that a reasonable probability would be orders of magnitude smaller, not 2/3 smaller.
The reason why I think ~3% for some kind of God and ~1% for some kind of religion aren't crazy numbers (although, I repeat, my own estimates of the probabilities are much lower) is that there is a credible argument to be made that if something is seriously believed by a large number of very clever and well informed people then you shouldn't assign it a very low probability. I don't think this argument is actually correct, but it's got some plausibility to it and I've seen versions of it taken very seriously by big-name LW participants. Accordingly, I think it would be unsurprising and not-crazy if, say, 10% of LW allowed a 10% probability for God's existence on the basis that maybe something like 10% of (e.g.) first-rate scientists or philosophers believe in God.
The links to the public data given at the end appear to be broken. They give internal links to Less Wrong instead of redirecting to Slate Star Codex. These links should work:
sav xls csv
Fixed.
Not quite. The averages might roughly work, but the correlations appear off. For instance this:
Is about half of what you'd expect.
Maybe this is as expected?
"Finally, at the end of the survey I had a question offering respondents a chance to cooperate (raising the value of a potential monetary prize to be given out by raffle to a random respondent) or defect (decreasing the value of the prize, but increasing their own chance of winning the raffle). 73% of effective altruists cooperated compared to 70% of others - an insignificant difference."
Assuming an EA thinks they will use the money better than the typical other winner, the most altruistic thing to do could be to increase their chances of winning, even at the cost of a lower prize. Or maybe they like the person putting up the prize, in which case they would prefer it to be smaller.
You mention a "very confused secular humanist." What other answers did that person provide that mark him/her/zer as confused?
People were supposed to fill out the religion field if they are theist. If a secular humanist filled out that field, it suggests that he's confused.
That dichotomy leaves no space for non-theistic religions. What if a secular humanist sympathizes with Taoism or Buddhism?
Or non-religious theists, for that matter.
In that case he would have put Taoism or Buddhism into the box instead of secular humanist. But you are right that the question is phrased in a way that discourages non-theistic religions from being reported.