2016 LessWrong Diaspora Survey Analysis: Part Four (Politics, Calibration & Probability, Futurology, Charity & Effective Altruism)
Politics
The LessWrong survey has a very involved section dedicated to politics. In previous analyses the benefits of this weren't fully realized. In the 2016 analysis we can look at not just the political affiliation of a respondent, but the beliefs associated with each affiliation. The charts below summarize most of the results.
Political Opinions By Political Affiliation

Miscellaneous Politics
There were also some other questions in this section which aren't covered by the above charts.
Voting
| Group | Turnout |
|---|---|
| LessWrong | 68.9% |
| Australia | 91% |
| Brazil | 78.90% |
| Britain | 66.4% |
| Canada | 68.3% |
| Finland | 70.1% |
| France | 79.48% |
| Germany | 71.5% |
| India | 66.3% |
| Israel | 72% |
| New Zealand | 77.90% |
| Russia | 65.25% |
| United States | 54.9% |
Calibration And Probability Questions
Calibration Questions
I just couldn't analyze these, sorry guys. I put many hours into trying to get them into a decent format I could even read and that sucked up an incredible amount of time. It's why this part of the survey took so long to get out. Thankfully another LessWrong user, Houshalter, has kindly done their own analysis.
All my calibration questions were meant to satisfy a few essential properties:
- They should be 'self-contained', i.e. something you can reasonably answer, or at least try to answer, with a 5th grade science education and normal life experience.
- They should, at least to a certain extent, be Fermi estimable.
- They should progressively scale in difficulty so you can see whether somebody understands basic probability or not. (E.g., in an 'or' question, do they assign a probability of less than 50% to being right?)
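To illustrate the probability logic behind that last point, here is a minimal sketch (mine, not from the survey) of why an 'or' question can never be less likely than either of its parts:

```python
# Inclusion-exclusion: P(A or B) = P(A) + P(B) - P(A and B) >= max(P(A), P(B)),
# so a calibrated answer to an 'or' question can't fall below either disjunct.
def p_or(p_a: float, p_b: float, p_both: float) -> float:
    """Probability of 'A or B' by inclusion-exclusion."""
    return p_a + p_b - p_both

# Example: two independent claims, each 60% likely on its own.
p = p_or(0.6, 0.6, 0.6 * 0.6)
print(p)  # 0.84 -- answering below 60% here signals a misunderstanding
```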
At least one person requested a workbook, so I might write more in the future. I'll obviously write more for the survey.
Probability Questions
| Question | Mean | Median | Mode | Stdev |
|---|---|---|---|---|
| Please give the obvious answer to this question, so I can automatically throw away all surveys that don't follow the rules: What is the probability of a fair coin coming up heads? | 49.821 | 50.0 | 50.0 | 3.033 |
| What is the probability that the Many Worlds interpretation of quantum mechanics is more or less correct? | 44.599 | 50.0 | 50.0 | 29.193 |
| What is the probability that non-human, non-Earthly intelligent life exists in the observable universe? | 75.727 | 90.0 | 99.0 | 31.893 |
| ...in the Milky Way galaxy? | 45.966 | 50.0 | 10.0 | 38.395 |
| What is the probability that supernatural events (including God, ghosts, magic, etc) have occurred since the beginning of the universe? | 13.575 | 1.0 | 1.0 | 27.576 |
| What is the probability that there is a god, defined as a supernatural intelligent entity who created the universe? | 15.474 | 1.0 | 1.0 | 27.891 |
| What is the probability that any of humankind's revealed religions is more or less correct? | 10.624 | 0.5 | 1.0 | 26.257 |
| What is the probability that an average person cryonically frozen today will be successfully restored to life at some future time, conditional on no global catastrophe destroying civilization before then? | 21.225 | 10.0 | 5.0 | 26.782 |
| What is the probability that at least one person living at this moment will reach an age of one thousand years, conditional on no global catastrophe destroying civilization in that time? | 25.263 | 10.0 | 1.0 | 30.510 |
| What is the probability that our universe is a simulation? | 25.256 | 10.0 | 50.0 | 28.404 |
| What is the probability that significant global warming is occurring or will soon occur, and is primarily caused by human actions? | 83.307 | 90.0 | 90.0 | 23.167 |
| What is the probability that the human race will make it to 2100 without any catastrophe that wipes out more than 90% of humanity? | 76.310 | 80.0 | 80.0 | 22.933 |
The probability questions are probably the area of the survey I put the least effort into. My plan for next year is to overhaul these sections entirely and try including some Tetlock-esque forecasting questions, a link to some advice on how to make good predictions, etc.
Futurology
This section got a bit of a facelift this year, with new questions on cryonics, genetic engineering, and technological unemployment in addition to the previous years' questions.
Cryonics
Interestingly enough, of those who think it will work with enough confidence to say 'yes', only 14 are actually signed up for cryonics.
sqlite> select count(*) from data where CryonicsNow="Yes" and Cryonics="Yes - signed up or just finishing up paperwork";
14
sqlite> select count(*) from data where CryonicsNow="Yes" and (Cryonics="Yes - signed up or just finishing up paperwork" OR Cryonics="No - would like to sign up but unavailable in my area" OR Cryonics="No - would like to sign up but haven't gotten around to it" OR Cryonics="No - would like to sign up but can't afford it");
34
LessWrongers seem to be very bullish on the underlying physics of cryonics even if they're not as enthusiastic about current methods in use.
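For anyone replicating the counts above, here is a self-contained sketch of the same cross-tab using Python's sqlite3 module, with a toy in-memory table standing in for the real survey data (the table and column names follow the queries above; note that every option has to be compared against the Cryonics column explicitly):

```python
import sqlite3

# Toy stand-in for the survey database; only the two relevant columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (CryonicsNow TEXT, Cryonics TEXT)")
conn.executemany("INSERT INTO data VALUES (?, ?)", [
    ("Yes", "Yes - signed up or just finishing up paperwork"),
    ("Yes", "No - would like to sign up but can't afford it"),
    ("No",  "No - still considering it"),
])

would_sign = (
    "Yes - signed up or just finishing up paperwork",
    "No - would like to sign up but unavailable in my area",
    "No - would like to sign up but haven't gotten around to it",
    "No - would like to sign up but can't afford it",
)
# A bare string literal in a WHERE clause is not a comparison; use IN (or
# explicit Cryonics=... clauses) so every option is actually tested.
n = conn.execute(
    "SELECT count(*) FROM data WHERE CryonicsNow='Yes' AND Cryonics IN (?, ?, ?, ?)",
    would_sign,
).fetchone()[0]
print(n)  # 2 in this toy table
```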
The Brain Preservation Foundation also did an analysis of cryonics responses to the LessWrong Survey.
Singularity
SingularityYear
By what year do you think the Singularity will occur? Answer such that you think, conditional on the Singularity occurring, there is an even chance of the Singularity falling before or after this year. If you think a singularity is so unlikely you don't even want to condition on it, leave this question blank.
Mean: 8.110300081581755e+16
Median: 2080.0
Mode: 2100.0
Stdev: 2.847858859055733e+18
I didn't bother to filter out the silly answers for this. Obviously it's a bit hard to see without filtering out the uber-large answers, but the median doesn't seem to have changed much from the 2014 survey.
Genetic Engineering
Well that's fairly overwhelming.
I find it amusing how the strict "No" group shrinks considerably after this question.
This question is too important to just not have an answer for, so I'll do it manually. Unfortunately I can't easily remove the 'excluded' entries so that we're dealing with the exact same distribution, but only 13 or so responses are filtered out anyway.
sqlite> select count(*) from data where GeneticImprovement="Yes";
1100
>>> 1100 + 176 + 262 + 84
1622
>>> 1100 / 1622
0.6781750924784217
67.8% are willing to genetically engineer their children for improvements.
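The same arithmetic as a small Python sketch (the labels for the three non-"Yes" counts are placeholders of mine; only the counts themselves come from the queries above):

```python
# Share of respondents answering "Yes" to genetic improvement.
# Labels other than "Yes" are placeholders -- the source only gives counts.
counts = {"Yes": 1100, "option_b": 176, "option_c": 262, "option_d": 84}
total = sum(counts.values())          # 1622 responses after exclusions
share = counts["Yes"] / total
print(f"{share:.1%}")  # 67.8%
```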
These numbers go about how you would expect, with people being progressively less interested the more 'shallow' a genetic change is seen as.
All three of these seem largely consistent with people's personal preferences about modification. Were I inclined I could do a deeper analysis that actually takes survey respondents row by row and looks at correlation between preference for one's own children and preference for others.
Technological Unemployment
LudditeFallacy
Do you think the Luddite's Fallacy is an actual fallacy?
Yes: 443 (30.936%)
No: 989 (69.064%)
We can use this as an overall measure of worry about technological unemployment, which would seem to be high among the LW demographic.
UnemploymentYear
By what year do you think the majority of people in your country will have trouble finding employment for automation related reasons? If you think this is something that will never happen leave this question blank.
Mean: 2102.9713740458014
Median: 2050.0
Mode: 2050.0
Stdev: 1180.2342850727339
The question is flawed because you can't distinguish answers of "never happen" from people who just didn't see it. Still, it's an interesting question that would be fun to compare against the estimates for the Singularity.
EndOfWork
Do you think the "end of work" would be a good thing?
Yes: 1238 (81.287%)
No: 285 (18.713%)
Fairly overwhelming consensus, but with a significant minority of people who have a dissenting opinion.
EndOfWorkConcerns
If machines end all or almost all employment, what are your biggest worries? Pick two.
| Worry | Count | Percent |
|---|---|---|
| People will just idle about in destructive ways | 513 | 16.71% |
| People need work to be fulfilled and if we eliminate work we'll all feel deep existential angst | 543 | 17.687% |
| The rich are going to take all the resources for themselves and leave the rest of us to starve or live in poverty | 1066 | 34.723% |
| The machines won't need us, and we'll starve to death or be otherwise liquidated | 416 | 13.55% |
The plurality of worries are about elites who refuse to share their wealth.
Existential Risk
XRiskType
Which disaster do you think is most likely to wipe out greater than 90% of humanity before the year 2100?
Nuclear war: +4.800% 326 (20.6%)
Asteroid strike: -0.200% 64 (4.1%)
Unfriendly AI: +1.000% 271 (17.2%)
Nanotech / grey goo: -2.000% 18 (1.1%)
Pandemic (natural): +0.100% 120 (7.6%)
Pandemic (bioengineered): +1.900% 355 (22.5%)
Environmental collapse (including global warming): +1.500% 252 (16.0%)
Economic / political collapse: -1.400% 136 (8.6%)
Other: 35 (2.217%)
Significantly more people worried about Nuclear War than last year. Effect of new respondents, or geopolitical situation? Who knows.
Charity And Effective Altruism
Charitable Giving
Income
What is your approximate annual income in US dollars (non-Americans: convert at www.xe.com)? Obviously you don't need to answer this question if you don't want to. Please don't include commas or dollar signs.
Sum: 66054140.47384
Mean: 64569.052271593355
Median: 40000.0
Mode: 30000.0
Stdev: 107297.53606321265
IncomeCharityPortion
How much money, in number of dollars, have you donated to charity over the past year? (non-Americans: convert to dollars at http://www.xe.com/ ). Please don't include commas or dollar signs in your answer. For example, 4000
Sum: 2389900.6530000004
Mean: 2914.5129914634144
Median: 353.0
Mode: 100.0
Stdev: 9471.962766896671
XriskCharity
How much money have you donated to charities aiming to reduce existential risk (other than MIRI/CFAR) in the past year?
Sum: 169300.89
Mean: 1991.7751764705883
Median: 200.0
Mode: 100.0
Stdev: 9219.941506342007
CharityDonations
How much have you donated in US dollars to the following charities in the past year? (Non-americans: convert to dollars at http://www.xe.com/) Please don't include commas or dollar signs in your answer. Options starting with "any" aren't the name of a charity but a category of charity.
| Charity | Sum | Mean | Median | Mode | Stdev |
|---|---|---|---|---|---|
| Against Malaria Foundation | 483935.027 | 1905.256 | 300.0 | None | 7216.020 |
| Schistosomiasis Control Initiative | 47908.0 | 840.491 | 200.0 | 1000.0 | 1618.785 |
| Deworm the World Initiative | 28820.0 | 565.098 | 150.0 | 500.0 | 1432.712 |
| GiveDirectly | 154410.177 | 1429.723 | 450.0 | 50.0 | 3472.082 |
| Any kind of animal rights charity | 83130.47 | 1093.821 | 154.235 | 500.0 | 2313.493 |
| Any kind of bug rights charity | 1083.0 | 270.75 | 157.5 | None | 353.396 |
| Machine Intelligence Research Institute | 141792.5 | 1417.925 | 100.0 | 100.0 | 5370.485 |
| Any charity combating nuclear existential risk | 491.0 | 81.833 | 75.0 | 100.0 | 68.060 |
| Any charity combating global warming | 13012.0 | 245.509 | 100.0 | 10.0 | 365.542 |
| Center For Applied Rationality | 127101.0 | 3177.525 | 150.0 | 100.0 | 12969.096 |
| Strategies for Engineered Negligible Senescence Research Foundation | 9429.0 | 554.647 | 100.0 | 20.0 | 1156.431 |
| Wikipedia | 12765.5 | 53.189 | 20.0 | 10.0 | 126.444 |
| Internet Archive | 2975.04 | 80.406 | 30.0 | 50.0 | 173.791 |
| Any campaign for political office | 38443.99 | 366.133 | 50.0 | 50.0 | 1374.305 |
| Other | 564890.46 | 1661.442 | 200.0 | 100.0 | 4670.805 |
This table is interesting given the recent debates about how much money certain causes are 'taking up' in Effective Altruism.
Effective Altruism
Vegetarian
Do you follow any dietary restrictions related to animal products?
Yes, I am vegan: 54 (3.4%)
Yes, I am vegetarian: 158 (10.0%)
Yes, I restrict meat some other way (pescetarian, flexitarian, try to only eat ethically sourced meat): 375 (23.7%)
No: 996 (62.9%)
EAKnowledge
Do you know what Effective Altruism is?
Yes: 1562 (89.3%)
No but I've heard of it: 114 (6.5%)
No: 74 (4.2%)
EAIdentity
Do you self-identify as an Effective Altruist?
Yes: 665 (39.233%)
No: 1030 (60.767%)
The distribution given by the 2014 survey results does not sum to one, so it's difficult to determine whether Effective Altruism's membership actually went up. But if we take the numbers at face value, it experienced an 11.13% increase in membership.
EACommunity
Do you participate in the Effective Altruism community?
Yes: 314 (18.427%)
No: 1390 (81.573%)
Same issue as last: taking the numbers at face value, community participation went up by 5.727%.
EADonations
Has Effective Altruism caused you to make donations you otherwise wouldn't?
Yes: 666 (39.269%)
No: 1030 (60.731%)
Wowza!
Effective Altruist Anxiety
EAAnxiety
Have you ever had any kind of moral anxiety over Effective Altruism?
Yes: 501 (29.6%)
Yes but only because I worry about everything: 184 (10.9%)
No: 1008 (59.5%)
There's an ongoing debate in Effective Altruism about what kind of rhetorical strategy is best for getting people on board and whether Effective Altruism is causing people significant moral anxiety.
It certainly appears to be. But is moral anxiety effective? Let's look:
Sample Size: 244
Average amount of money donated by people anxious about EA who aren't EAs: 257.5409836065574
Sample Size: 679
Average amount of money donated by people who aren't anxious about EA who aren't EAs: 479.7501384388807
Sample Size: 249
Average amount of money donated by EAs anxious about EA: 1841.5292369477913
Sample Size: 314
Average amount of money donated by EAs not anxious about EA: 1837.8248407643312
It seems fairly conclusive that anxiety is not a good way to get people to donate more than they already are, but is it a good way to get people to become Effective Altruists?
Sample Size: 1685
P(Effective Altruist): 0.3940652818991098
P(EA Anxiety): 0.29554896142433235
P(Effective Altruist | EA Anxiety): 0.5
Maybe. There is of course an argument to be made that sufficient good done by causing people anxiety outweighs feeding into people's scrupulosity, but it can be discussed after I get through explaining it on the phone to wealthy PR-conscious donors and telling the local all-kill shelter where I want my shipment of dead kittens.
EAOpinion
What's your overall opinion of Effective Altruism?
Positive: 809 (47.6%)
Mostly Positive: 535 (31.5%)
No strong opinion: 258 (15.2%)
Mostly Negative: 75 (4.4%)
Negative: 24 (1.4%)
EA appears to be doing a pretty good job of getting people to like them.
Interesting Tables
| Affiliation | Income | Charity Contributions | % Income Donated To Charity | Total Survey Charity % | Sample Size |
|---|---|---|---|---|---|
| Anarchist | 1677900.0 | 72386.0 | 4.314% | 3.004% | 50 |
| Communist | 298700.0 | 19190.0 | 6.425% | 0.796% | 13 |
| Conservative | 1963000.04 | 62945.04 | 3.207% | 2.612% | 38 |
| Futarchist | 1497494.1099999999 | 166254.0 | 11.102% | 6.899% | 31 |
| Left-Libertarian | 9681635.613839999 | 416084.0 | 4.298% | 17.266% | 245 |
| Libertarian | 11698523.0 | 214101.0 | 1.83% | 8.885% | 190 |
| Moderate | 3225475.0 | 90518.0 | 2.806% | 3.756% | 67 |
| Neoreactionary | 1383976.0 | 30890.0 | 2.232% | 1.282% | 28 |
| Objectivist | 399000.0 | 1310.0 | 0.328% | 0.054% | 10 |
| Other | 3150618.0 | 85272.0 | 2.707% | 3.539% | 132 |
| Pragmatist | 5087007.609999999 | 266836.0 | 5.245% | 11.073% | 131 |
| Progressive | 8455500.440000001 | 368742.78 | 4.361% | 15.302% | 217 |
| Social Democrat | 8000266.54 | 218052.5 | 2.726% | 9.049% | 237 |
| Socialist | 2621693.66 | 78484.0 | 2.994% | 3.257% | 126 |
| Community | Count | % In Community | Sample Size |
|---|---|---|---|
| LessWrong | 136 | 38.418% | 354 |
| LessWrong Meetups | 109 | 50.463% | 216 |
| LessWrong Facebook Group | 83 | 48.256% | 172 |
| LessWrong Slack | 22 | 39.286% | 56 |
| SlateStarCodex | 343 | 40.98% | 837 |
| Rationalist Tumblr | 175 | 49.716% | 352 |
| Rationalist Facebook | 89 | 58.94% | 151 |
| Rationalist Twitter | 24 | 40.0% | 60 |
| Effective Altruism Hub | 86 | 86.869% | 99 |
| Good Judgement(TM) Open | 23 | 74.194% | 31 |
| PredictionBook | 31 | 51.667% | 60 |
| Hacker News | 91 | 35.968% | 253 |
| #lesswrong on freenode | 19 | 24.675% | 77 |
| #slatestarcodex on freenode | 9 | 24.324% | 37 |
| #chapelperilous on freenode | 2 | 18.182% | 11 |
| /r/rational | 117 | 42.545% | 275 |
| /r/HPMOR | 110 | 47.414% | 232 |
| /r/SlateStarCodex | 93 | 37.959% | 245 |
| One or more private 'rationalist' groups | 91 | 47.15% | 193 |
| Affiliation | EA Income | EA Charity | Sample Size |
|---|---|---|---|
| Anarchist | 761000.0 | 57500.0 | 18 |
| Futarchist | 559850.0 | 114830.0 | 15 |
| Left-Libertarian | 5332856.0 | 361975.0 | 112 |
| Libertarian | 2725390.0 | 114732.0 | 53 |
| Moderate | 583247.0 | 56495.0 | 22 |
| Other | 1428978.0 | 69950.0 | 49 |
| Pragmatist | 1442211.0 | 43780.0 | 43 |
| Progressive | 4004097.0 | 304337.78 | 107 |
| Social Democrat | 3423487.45 | 149199.0 | 93 |
| Socialist | 678360.0 | 34751.0 | 41 |
A note about calibration of confidence
Background
In a recent Slate Star Codex Post (http://slatestarcodex.com/2016/01/02/2015-predictions-calibration-results/), Scott Alexander made a number of predictions and presented associated confidence levels, and then at the end of the year, scored his predictions in order to determine how well-calibrated he is. In the comments, however, there arose a controversy over how to deal with 50% confidence predictions. As an example, Scott has these predictions at 50% confidence, among his others:
| | Proposition | Scott's Prior | Result |
|---|---|---|---|
| A | Jeb Bush will be the top-polling Republican candidate | P(A) = 50% | A is False |
| B | Oil will end the year greater than $60 a barrel | P(B) = 50% | B is False |
| C | Scott will not get any new girlfriends | P(C) = 50% | C is False |
| D | At least one SSC post in the second half of 2015 will get > 100,000 hits | P(D) = 70% | D is False |
| E | Ebola will kill fewer people in the second half of 2015 than in the first half | P(E) = 95% | E is True |
Scott goes on to score himself as having made 0/3 correct predictions at the 50% confidence level, which looks like significant overconfidence. He addresses this by noting that 3 data points isn't much to go by, and the result could easily have been different if any of those predictions had turned out otherwise. His resulting calibration curve is this:

However, the commenters had other objections about the anomaly at 50%. After all, P(A) = 50% implies P(~A) = 50%, so the choice of “I will not get any new girlfriends: 50% confidence” is logically equivalent to “I will get at least 1 new girlfriend: 50% confidence”, except that one results as true and the other false. Therefore, the question seems sensitive only to the particular phrasing chosen, independent of the outcome.
One commenter suggests that close to perfect calibration at 50% confidence can be achieved by choosing whether to represent propositions as positive or negative statements by flipping a fair coin. Another suggests replacing 50% confidence with 50.1% or some other number arbitrarily close to 50%, but not equal to it. Others suggest getting rid of the 50% confidence bin altogether.
Scott recognizes that predicting A and predicting ~A are logically equivalent, and choosing to use one or the other is arbitrary. But by choosing to only include A in his data set rather than ~A, he creates a problem that occurs when P(A) = 50%, where the arbitrary choice of making a prediction phrased as ~A would have changed the calibration results despite being the same prediction.
Symmetry
This conundrum illustrates an important point about these calibration exercises. Scott chooses, by convention, to phrase all of his propositions as statements to which he assigns a probability greater than or equal to 50%, recognizing that he doesn't need to also calibrate probabilities below 50%: the upper half of the calibration curve captures all the relevant information about his calibration.
This is because the calibration curve has a property of symmetry about the 50% mark, as implied by the mathematical relation P(X) = 1 - P(~X) and of course P(~X) = 1 - P(X).
We can enforce that symmetry by recognizing that when we make the claim that proposition X has probability P(X), we are also simultaneously making the claim that proposition ~X has probability 1-P(X). So we add those to the list of predictions and do the bookkeeping on them too. Since we are making both claims, why not be clear about it in our bookkeeping?
When we do this, we get the full calibration curve, and the confusion about what to do about 50% probability disappears. Scott’s list of predictions looks like this:
| | Proposition | Scott's Prior | Result |
|---|---|---|---|
| A | Jeb Bush will be the top-polling Republican candidate | P(A) = 50% | A is False |
| ~A | Jeb Bush will not be the top-polling Republican candidate | P(~A) = 50% | ~A is True |
| B | Oil will end the year greater than $60 a barrel | P(B) = 50% | B is False |
| ~B | Oil will not end the year greater than $60 a barrel | P(~B) = 50% | ~B is True |
| C | Scott will not get any new girlfriends | P(C) = 50% | C is False |
| ~C | Scott will get new girlfriend(s) | P(~C) = 50% | ~C is True |
| D | At least one SSC post in the second half of 2015 will get > 100,000 hits | P(D) = 70% | D is False |
| ~D | No SSC post in the second half of 2015 will get > 100,000 hits | P(~D) = 30% | ~D is True |
| E | Ebola will kill fewer people in the second half of 2015 than in the first half | P(E) = 95% | E is True |
| ~E | Ebola will kill as many or more people in the second half of 2015 than in the first half | P(~E) = 5% | ~E is False |
You will by now have noticed that there will always be an even number of predictions, and that half of the predictions are always true and half always false. In most cases, as with E and ~E, that means you get a 95%-likely prediction that is true and a 5%-likely prediction that is false, which is what you would expect. However, a 50%-likely prediction is always accompanied by another 50% prediction, one of which is true and one of which is false. As a result, it is actually not possible to make a binary prediction at 50% confidence that is out of calibration.
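The mirrored bookkeeping described above is easy to mechanize. A minimal sketch, using Scott's five example predictions, showing why the 50% bin can never be miscalibrated:

```python
# Each prediction is (stated_probability, outcome_was_true).
def with_complements(predictions):
    """For every claim about X, also book the implied claim about ~X."""
    full = []
    for p, outcome in predictions:
        full.append((p, outcome))
        full.append((1.0 - p, not outcome))  # the implied claim about ~X
    return full

# Scott's five example predictions (A-E from the table above):
preds = [(0.5, False), (0.5, False), (0.5, False), (0.7, False), (0.95, True)]
full = with_complements(preds)

# Each 50% prediction and its mirror land in the same bin, contributing one
# True and one False, so the 50% bin is always perfectly calibrated:
bin50 = [outcome for p, outcome in full if abs(p - 0.5) < 1e-9]
print(sum(bin50) / len(bin50))  # 0.5
```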
The resulting calibration curve, applied to Scott’s predictions, looks like this:

Sensitivity
By the way, this graph doesn’t tell the whole calibration story; as Scott noted it’s still sensitive to how many predictions were made in each bucket. We can add “error bars” that show what would have resulted if Scott had made one more prediction in each bucket, and whether the result of that prediction had been true or false. The result is the following graph:

Note that the error bars are zero at the 0.5 point. That's because even if one additional prediction had been added to that bucket, it would have had no effect: that point is fixed by the inherent symmetry.
I believe that this kind of graph does a better job of showing someone’s true calibration. But it's not the whole story.
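A sketch of how those "one more prediction" error bars can be computed per bin (the k and n values below are illustrative, not Scott's actual bin counts):

```python
def one_more_band(k: int, n: int, p: float):
    """Observed-frequency range for a bin after one extra prediction at
    confidence p, given k true outcomes out of n so far. With mirrored
    bookkeeping, an extra 50% prediction adds both a True and a False to
    the same bin, so the 0.5 point cannot move."""
    if abs(p - 0.5) < 1e-9:
        f = (k + 1) / (n + 2)
        return (f, f)
    # Otherwise the extra prediction is either False (low end) or True (high end).
    return (k / (n + 1), (k + 1) / (n + 1))

print(one_more_band(3, 6, 0.5))  # (0.5, 0.5) -- pinned by symmetry
print(one_more_band(5, 6, 0.7))  # non-degenerate band for any other bin
```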
Ramifications for scoring calibration (updated)
Clearly, it is not possible to make a binary prediction with 50% confidence that is poorly calibrated. This shouldn’t come as a surprise; a prediction at 50% between two choices represents the correct prior for the case where you have no information that discriminates between X and ~X. However, that doesn’t mean that you can improve your ability to make correct predictions just by giving them all 50% confidence and claiming impeccable calibration! An easy way to "cheat" your way into apparently good calibration is to take a large number of predictions that you are highly (>99%) confident about, negate a fraction of them, and falsely record a lower confidence for those. If we're going to measure calibration, we need a scoring method that will encourage people to write down the true probabilities they believe, rather than faking low confidence and ignoring their data. We want people to only claim 50% confidence when they genuinely have 50% confidence, and we need to make sure our scoring method encourages that.
A first guess would be to look at that graph and do the classic assessment of fit: sum of squared errors. We can sum the squared error of our predictions against the ideal linear calibration curve. If we did this, we would want to make sure we summed all the individual predictions, rather than the averages of the bins, so that the binning process itself doesn’t bias our score.
If we do this, then our overall prediction score can be summarized by one number:

S = (1/N) * Σ_i (X_i - P(X_i))^2

Here P(X_i) is the assigned confidence in the truth of X_i, and X_i is the i-th proposition, with a value of 1 if it is True and 0 if it is False. S is the prediction score, and lower is better. Note that because these are binary predictions, the sum of squared errors gives an optimal score if you assign the probabilities you actually believe (i.e., there is no way to "cheat" your way to a better score by giving false confidence).
In this case, Scott's score is S = 0.139; much of this comes from the 0.4/0.6 bracket. The worst score possible is S = 1, and the best score possible is S = 0. Attempting to fake perfect calibration by claiming 50% confidence for every prediction, regardless of the information you actually have available, yields S = 0.25 and therefore isn't a particularly good strategy (at least, it won't make you look better-calibrated than Scott).
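A minimal sketch of this sum-of-squared-errors score, with predictions stored as (confidence, outcome) pairs; it reproduces the reference points quoted above (0.25 for all-50%, 0 best, 1 worst):

```python
# Sum-of-squared-errors (Brier) score: S = (1/N) * sum((X_i - P(X_i))^2).
# Lower is better.
def brier(predictions):
    return sum((int(outcome) - p) ** 2 for p, outcome in predictions) / len(predictions)

# Claiming 50% on everything scores 0.25 regardless of the outcomes:
print(brier([(0.5, True), (0.5, False)]))  # 0.25
# A perfect forecaster scores 0; being certain and wrong scores 1:
print(brier([(1.0, True)]))                # 0.0
print(brier([(1.0, False)]))               # 1.0
```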
Several of the commenters pointed out that log scoring is another scoring rule that works better in the general case. Before posting this I ran the calculus to confirm that the least-squares error did encourage an optimal strategy of honest reporting of confidence, but I did have a feeling that it was an ad-hoc scoring rule and that there must be better ones out there.
The logarithmic scoring rule looks like this:

S = (1/N) * Σ_i [ X_i * ln(P(X_i)) + (1 - X_i) * ln(1 - P(X_i)) ]

Here again X_i is the i-th proposition, with a value of 1 if it is True and 0 if it is False. The base of the logarithm is arbitrary, so I've chosen base e as it makes it easier to take derivatives. This scoring method gives a negative number, and the closer to zero the better. The log scoring rule has the same honesty-encouraging properties as the sum of squared errors, plus the additional nice property that it penalizes wrong predictions of 100% or 0% confidence with an appropriate score of minus infinity. When you claim 100% confidence and are wrong, you are infinitely wrong. Don't claim 100% confidence!
In this case, Scott's score is calculated to be S=-0.42. For reference, the worst possible score would be minus-infinity, and claiming nothing but 50% confidence for every prediction results in a score of S=-0.69. This just goes to show that you can't win by cheating.
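The same idea as a sketch for the log scoring rule; the all-50% strategy lands at ln(0.5) ≈ -0.69, matching the baseline above:

```python
import math

# Logarithmic score (natural log): closer to zero is better.
def log_score(predictions):
    total = 0.0
    for p, outcome in predictions:
        # ln(P) if the proposition was true, ln(1 - P) if it was false.
        total += math.log(p) if outcome else math.log(1.0 - p)
    return total / len(predictions)

# Claiming 50% everywhere gives ln(0.5), the "can't win by cheating" baseline:
print(round(log_score([(0.5, True), (0.5, False)]), 2))  # -0.69
```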
Example: Pretend underconfidence to fake good calibration
In an attempt to appear like I have better calibration than Scott Alexander, I am going to make the following predictions. For clarity I have included the inverse propositions in the list (as those are also predictions that I am making), but at the end of the list so you can see the point I am getting at a bit better.
| | Proposition | Quoted Prior | Result |
|---|---|---|---|
| A | I will not win the lottery on Monday | P(A) = 50% | A is True |
| B | I will not win the lottery on Tuesday | P(B) = 66% | B is True |
| C | I will not win the lottery on Wednesday | P(C) = 66% | C is True |
| D | I will win the lottery on Thursday | P(D) = 66% | D is False |
| E | I will not win the lottery on Friday | P(E) = 75% | E is True |
| F | I will not win the lottery on Saturday | P(F) = 75% | F is True |
| G | I will not win the lottery on Sunday | P(G) = 75% | G is True |
| H | I will win the lottery next Monday | P(H) = 75% | H is False |
| … | | | |
| ~A | I will win the lottery on Monday | P(~A) = 50% | ~A is False |
| ~B | I will win the lottery on Tuesday | P(~B) = 34% | ~B is False |
| ~C | I will win the lottery on Wednesday | P(~C) = 34% | ~C is False |
| … | | | |
Look carefully at this table. I've thrown in a particular mix of predictions that I will or will not win the lottery on certain days, in order to use my extreme certainty about the result to generate a particular mix of correct and incorrect predictions.
To make things even easier for me, I'm not even planning to buy any lottery tickets. Knowing this, an honest estimate of the odds of me winning the lottery is astronomically small. The odds of winning the lottery are about 1 in 14 million (for the Canadian 6/49 lottery). I'd have to win by accident (one of my relatives buying me a lottery ticket?). Not only that, but since the lottery is only held on Wednesday and Saturday, most of these scenarios are even more implausible, since the lottery corporation would have to hold the draw by mistake.
I am confident I could make at least 1 billion similar statements of this exact nature and get them all right, so my true confidence must be upwards of (100% - 0.0000001%).
If I assemble 50 of these types of strategically-underconfident predictions (and their 50 opposites) and plot them on a graph, here’s what I get:

You can see that the problem with cheating doesn’t occur only at 50%. It can occur anywhere!
But here’s the trick: The log scoring algorithm rates me -0.37. If I had made the same 100 predictions all at my true confidence (99.9999999%), then my score would have been -0.000000001. A much better score! My attempt to cheat in order to make a pretty graph has only sabotaged my score.
By the way, what if I had gotten one of those wrong, and actually won the lottery one of those times without even buying a ticket? In that case my score is -0.41 (the wrong prediction had a probability of 1 in 10^9 which is about 1 in e^21, so it’s worth -21 points, but then that averages down to -0.41 due to the 49 correct predictions that are collectively worth a negligible fraction of a point).* Not terrible! The log scoring rule is pretty gentle about being very badly wrong sometimes, just as long as you aren’t infinitely wrong. However, if I had been a little less confident and said the chance of winning each time was only 1 in a million, rather than 1 in a billion, my score would have improved to -0.28, and if I had expressed only 98% confidence I would have scored -0.098, the best possible score for someone who is wrong one in every fifty times.
This has another important ramification: If you're going to honestly test your calibration, you shouldn't pick the predictions you'll make. It is easy to improve your score by throwing in a couple predictions that you are very certain about, like that you won't win the lottery, and by making few predictions that you are genuinely uncertain about. It is fairer to use a list of propositions that is generated by somebody else, and then pick your probabilities. Scott demonstrates his honesty by making public predictions about a mix of things he was genuinely uncertain about, but if he wanted to cook his way to a better score in the future, he would avoid making any predictions at the 50% category that he wasn't forced to.
Input and comments are welcome! Let me know what you think!
* This result surprises me enough that I would appreciate if someone in the comments can double-check it on their own. What is the proper score for being right 49 times with 1-1 in a billion certainty, but wrong once?
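For anyone who wants to double-check the footnote, a quick sketch recomputing the three scores quoted above (50 predictions: 49 right and 1 wrong, at each stated confidence):

```python
import math

# Average log score for n_right correct and n_wrong incorrect predictions,
# all made at the same stated confidence.
def avg_log_score(confidence, n_right, n_wrong):
    return (n_right * math.log(confidence)
            + n_wrong * math.log(1.0 - confidence)) / (n_right + n_wrong)

# 1-in-a-billion confidence of winning, but the win actually happened once:
print(round(avg_log_score(1 - 1e-9, 49, 1), 2))  # -0.41
# The milder misstatements from the text:
print(round(avg_log_score(1 - 1e-6, 49, 1), 2))  # -0.28
print(round(avg_log_score(0.98, 49, 1), 3))      # -0.098
```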
The mystery of Brahms
I'm interested in how people form valuations of the opinions of others. One domain to study is art. We have a long historic record of how the elite arbiters of taste have decided what artists and what artworks were great.
This is more relevant to 21st century American thought than many of you probably think. The defaults we assume, the stories that are told on television and in our movies, the things taught in our colleges, were partly determined by assertions made by continental philosophers and psychologists of the 18th through 20th centuries, most of which they just made up. [1]
The process by which philosophers eventually get their views accepted into the Western canon looks the same to me as the process by which musicians or painters are accepted into or cast out of the Western canon. Neither has much to do with the quality of the product.
Vegetarianism Ideological Turing Test Results
Back in August I ran a Caplan Test (or more commonly an "Ideological Turing Test") both on Less Wrong and in my local rationality meetup. The topic was diet, specifically: Vegetarian or Omnivore?
If you're not familiar with Caplan Tests, I suggest reading Palladias' post on the subject or reading Wikipedia. The test I ran was pretty standard; thirteen blurbs were presented to the judges, selected by the toss of a coin to either be from a vegetarian or from an omnivore, and also randomly selected to be genuine or an impostor trying to pass themselves off as the alternative. My main contribution, which I haven't seen in previous tests, was using credence/probability instead of a simple "I think they're X".
I originally chose vegetarianism because I felt like it's an issue which splits our community (and particularly my local community) pretty well. A third of test participants were vegetarians, and according to the 2014 census, only 56% of LWers identify as omnivores.
Before you see the results of the test, please take a moment to say aloud how well you think you can do at predicting whether someone participating in the test was genuine or a fake.
.
.
.
.
.
.
.
.
.
.
.
.
.
If you think you can do better than chance you're probably fooling yourself. If you think you can do significantly better than chance you're almost certainly wrong. Here are some statistics to back that claim up.
I got 53 people to judge the test. 43 were from LessWrong, and 10 were from my local group. Averaging across the entire group, 51.1% of judgments were correct. If my chi-squared math is correct, the p-value under the null hypothesis is 57% on this data. (Note that this includes people who judged an entry at 50%. If we exclude those judgments, the success rate drops to 49.4%.)
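The per-judgment counts aren't published, but this kind of null-hypothesis check can be reproduced with an exact two-sided binomial test using only the standard library. The counts below (352 correct out of 689 judgments, i.e. 53 judges times 13 entries at 51.1%) are a hypothetical reconstruction, not the author's actual data:

```python
import math

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial test: the total probability, under
    the null hypothesis, of all outcomes no more likely than the
    observed one."""
    def pmf(i):
        return math.comb(n, i) * p**i * (1 - p)**(n - i)
    observed = pmf(k)
    return sum(pmf(i) for i in range(n + 1) if pmf(i) <= observed + 1e-12)

# Hypothetical: 352 correct judgments out of 689 (about 51.1%)
print(round(binom_two_sided_p(352, 689), 2))
```

With these made-up counts the result lands in the same ballpark as the reported 57%.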
In retrospect, this seemed rather obvious to me. Vegetarians aren't significantly different from omnivores. Unlike a religion or a political party, there aren't many cultural centerpieces to diet. Vegetarian judges did no better than omnivore judges, even when judging vegetarian entries. In other words, in this instance the minority doesn't possess any special powers for detecting other members of the in-group. This test shows null results; the thing that distinguishes vegetarians from omnivores is not familiarity with the other side's arguments or culture, at least not to a degree that we can distinguish at a glance.
More interesting, in my opinion, than the null results were the results I got on the calibration of the judges. Back when I asked you to say aloud how good you'd be, what did you say? Did the last three paragraphs seem obvious? Would it surprise you to learn that not a single one of the 53 judges held their guesses to a confidence band of 40%-60%? In other words, every single judge thought themselves decently able to discern genuine writing from fakery. The numbers suggest that every single judge was wrong.
(The flip-side to this is, of course, that every entrant to the test won! Congratulations rationalists: signs point to you being able to pass as vegetarians/omnivores when you try, even if you're not in that category. The average credibility of an impostor entry was 59%, while the average credibility of a genuine response was 55%. No impostors got an average credibility below 49%.)
Using the logarithmic scoring rule for the calibration game we can measure the error of the community. The average judge got a score of -543. For comparison, a judge that answered 50% ("I don't know") to all questions would've gotten a score of 0. Only eight judges got a positive score, and only one had a score higher than 100 (consistent with random chance). This is actually one area where Less Wrong should feel good. We're not at all calibrated... but for this test at least, the judges from the website were much better calibrated than my local community (who mostly just lurk). If we separate the two groups we see that the average score for my community was -949, while LW had an average of -448. Given that I restricted the choices to multiples of 10, a random selection of credences gives an average score of -921.
In short, the LW community didn't prove to be any better at discerning fact from fiction, but it was significantly less overconfident. More de-biasing needs to be done, however! The next time you think of a probability to reflect your credence, ask yourself "Is this the sort of thing that anyone would know? Is this the sort of thing I would know?" That answer will probably be "no" a lot more than it feels like from the inside.
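The post doesn't spell out the exact scoring formula, but a standard credence-game rule consistent with "answering 50% scores 0" awards 100 * (log2(p) + 1) points for assigning probability p to the answer that turned out to be true. A sketch, assuming that form:

```python
import math

def credence_score(credence_pct):
    """Points for assigning `credence_pct`% to the answer that
    turned out to be true. Assumed form: 100 * (log2(p) + 1),
    which matches "answering 50% scores 0"."""
    p = credence_pct / 100.0
    return 100 * (math.log2(p) + 1)

print(credence_score(50))            # 0.0  ("I don't know")
print(round(credence_score(90), 1))  # 84.8 (confident and right)
print(round(credence_score(10), 1))  # -232.2 (confident and wrong)
```

The asymmetry is the point: a confident wrong answer loses far more than a confident right answer gains, which is why overconfident judges ended up with scores like -543.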
Full data (minus contact info) can be found here.
Those of you who submitted a piece of writing that I used, or who judged the test and left their contact information: I will be sending out personal scores very soon (probably by this weekend). Deep apologies regarding the delay on this post. I had a vacation in late August and it threw off my attention to this project.
EDIT: Here's a histogram of the identification accuracy.

EDIT 2: For reference, here are the entries that were judged.
Predict - "Log your predictions" app
As an exercise in Android programming, I've made an app to log the predictions you make and keep score of your results. It's like PredictionBook, but with more of a personal daily-exercise feel, in line with this post.
The "statistics" right now are only a score I copied from the old Credence calibration game, and a calibration bar chart.
I'm hoping for feature suggestions and criticism of the app design.
Here's the link for the apk (v0.4), and here's the source code repository. You can also download it from the Google Play Store.
Pending/Possible/Requested Features:
- Set check-in dates for predictions
- Tags (and stats by tag)
- Stats by timeframe
- Beeminder integration
- Trivia questions you can answer if you don't have any personal prediction to make
- Ring pie chart to choose probability
Edit:
2015-08-26 - Fixed bug that broke on Android 5.0.2 (thanks Bobertron)
2015-08-28 - Changed layout for landscape mode, and added a better icon
2015-08-31 -
- Daily notifications
- Buttons at the expanded-item-layout (ht dutchie)
- Show points won/lost in the snackbar when a prediction is answered
- Translation to Portuguese
Mental Calibration for Bayesian Updates?
Hey all,
After reading "How to Measure Anything" I've experimented a bit with calibration training using the author's calibration tools, and after being convinced by his data on the usefulness of calibration in real-world forecasting, I've seen a big improvement in my own calibration.
I'm wondering if anybody knows of similar tools and studies on the calibration of Bayesian updating. Broadly, I imagine it would look like:
1. Using the tools and calibration methods I already use, figure out how the feeling of "correctness" of my prior maps to a numerical value.
2. Using similar (but probably not identical) tools, figure out how the feeling of how "convincing" the new data are maps to specific numbers.
3. Combining these two numbers via Bayes' theorem, so that I know approximately how much to update the original feeling to reflect the new information.
4. Using mnemonic or visualization techniques to pair the new feeling with the belief, so that the next time I remember the belief, I feel the slightly different calibration.
Anyway, I'm curious whether anyone has experimented with these processes, whether there's any research on them, or whether they have been explored on LessWrong before. I'd definitely like to lock down a similar procedure for myself.
I should note that many times I already do this naturally... but my guess is that I systematically over- and under-update the feeling based on confirmation bias. I'd like to recalibrate my recalibration :).
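Step 3, combining the two calibrated feelings, is just Bayes' theorem in probability form; a minimal sketch (all numbers illustrative):

```python
def bayes_update(prior, p_data_if_true, p_data_if_false):
    """Posterior probability of a belief after seeing new data,
    via Bayes' theorem."""
    numerator = prior * p_data_if_true
    return numerator / (numerator + (1 - prior) * p_data_if_false)

# A felt prior of 30%, plus evidence that feels three times as
# likely if the belief is true as if it is false:
posterior = bayes_update(0.30, 0.75, 0.25)
print(round(posterior, 2))  # 0.56
```

Note that only the ratio of the two likelihoods matters, which is convenient: the "convincingness" feeling in step 2 only needs to be calibrated to a likelihood ratio, not to two separate probabilities.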
A question and a tail
This is a rambling post, and I will appreciate your criticism to help tighten it or delete it altogether.
It seems that however little a question I research by reviewing [botanical] literature, there is always a much more complex question, rather difficult to pose rigorously, that I have to ask for the first one to be meaningful. The second answer (or tier of answers) doesn't add much to the information I will build upon, but it might - just might! - add uncertainty to the result or allow predictions in advance. How do we use it in advance? We don't apply formal reasoning, usually, and yet somehow we use it!
1.
Consider: a certain invasive plant has a host of adaptations beneficial to its success. (They probably wouldn't be sufficient if there were some actual effort to manage manmade ecosystems, but duh.) A trait many IP share is the ability to increase their ploidy - from 2 to 3, 4, 6, 8 or even 10 sets of homologous chromosomes, etc. (Polyploidization sometimes happens even in single cells in somatic (= non-reproductive) tissues, so it's really a heavily used shortcut.)
Now, suppose I want to see how a different specific property of the species behaves abroad. I will have to check the ploidy level, of course! Quick, what does the literature say, how many chromosomes can it have?
...but wait. Make no mistake, I do have to count them; but what if there is a continent-wide study showing that it generally has 4n in Eastern Europe?.. That would allow me to at least expect 4n, or whatever amount they found, and see if there is any research specifically dealing with this situation within its native range.
...but wait. Of course, those findings will be useful in discussion if I find 4n, but if I don't, they will be just a point in the overall space of possibilities. Still relevant, but not worth putting much explanatory weight on.
Something in my brain evaluated the usefulness of a piece of data other people have found, which I myself have yet to look up, of whose exact composition I have no idea - perhaps there are simply no other reports! - and placed it in context of what I really expect to do.
2.
Okay, if I can think so about other people's writings without even reading them, then maybe I can compile a dummy set of data I expect right now and compare them to those I will find in the literature. And later, to actual data. Here's a simplified problem that doesn't approach labwork on any scale (I don't want to add too many qualifiers).
Let us 'measure' 8 parameters, and check if there have been studies that have found correlations between at least some of them (and maybe with some other ones), and then try to see if our expectations based on knowledge of study area and casual surveys fit our expectations based on published research in any specific way. We are not ready to put forth any causal structure - no real data yet - though we strongly suspect (80%) that all the parameters are in some way linked to each other.
The following table is rough and repetitive, but I think it is useful as an illustration of how things brew in [my own] a not-much-clever student's head. The numbers are 'dimensionless', distributions are normal, the total number of studies measuring each parameter is 7 or less, and all correlations are no less than 0.8.
| Parameter | Total range | Our expected data ±SE | Reported data range* | Our imaginary correlations | Reported correlations |
|---|---|---|---|---|---|
| A | 1-12 | 8±1 | 4-10 | A&F, A&H | A&D, A&F, A doesn't correlate with anything if nothing else correlates with anything |
| B | 1-5 | 2±1 | 1-4 | B&C, B&E, B&G, B&H | B doesn't correlate with E if F&H |
| C | 1-100 | 35±20 | 80±7 (only one other study) | C&B, C&F, C&H | Unknown |
| D | 1-28 | 6±2 | 2-18 | D&F | D&G (and then E&F) |
| E | 1-500 | 200±46 | 150-480 | E&B, E&G | E&F if D&G |
| F | 1-50 | 47±8 | 8-45 | F&A, F&C, F&D, F&H if A&H | F&A, F&H (and then B doesn't correlate with E) |
| G | 1-25 | 18±2 | 11-20 | G&B, G&E | G&D (and then E&F) |
| H | 1-40 | 23±10 | 1-40 | H&A, H&B, H&C, H&F (and then H&A) | H&F (and then B doesn't correlate with E) |
*as in, 'for this species, out of the 1-12 that are altogether possible, only 4-10 have been so far observed. It might mean that 4-10 is the actual range, but the prior for that is about 60%, due to differences in methodologies used by various researchers and to the fact that only a part of the species' habitats have been studied', etc.
Now, I understand that this is hardly the most profitable presentation method, and that statistics has advanced much since Pearson and everything. It is just that I find it difficult to compare graphs with diagrams with clouds along axes as they are published in different papers. I only want to guesstimate whether my data fit a pattern, to discuss them qualitatively. To stratify the parameters in such a way that I will place explanatory weight on some of them, and report the others to give a full picture. I have to do this explicitly, because I know I am doing it implicitly – it's a feeling I get, of my brain working and deciding and not showing me what it has.
I cannot speak about A, only that maybe A, H and F do have something in common – perhaps I haven't measured it. B looks rather suspicious; I will need to reread that other report. C is intriguing, but ultimately belongs to the 'lower value stratum', and maybe those correlations I found are spurious; if only there was a way to reduce the variability... but it won't be cost-efficient. E, F, D and G also might be worth discussing together. F by itself doesn't seem very meaningful, unless there is a causal connection to the others; too bad one can imagine many plausible explanations for that. I will probably start discussion with H, since it probably has been studied for other plants and at least something has already been proposed.
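One crude way to make part of this comparison explicit is to check whether the expected value plus or minus one SE overlaps the range others reported. A sketch using a few rows from the table above (treating C's single-study 80±7 as roughly 73-87):

```python
def overlaps(expected, se, reported_range):
    """Does the expected value +/- one SE overlap the range
    reported in the literature?"""
    lo, hi = reported_range
    return expected - se <= hi and expected + se >= lo

print(overlaps(8, 1, (4, 10)))     # A: True
print(overlaps(35, 20, (73, 87)))  # C: False
print(overlaps(47, 8, (8, 45)))    # F: True (barely: 39 <= 45)
```

This is only the coarsest stratum of the implicit judgment described above, but it gives something explicit to disagree with.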
Now, when I have my own data, I will see where they deviate from my expectations, and that will be some knowledge I can put into words; I will hopefully start calibrating myself on these matters. And on matters of Discussion structuring. :)
PredictIt, a prediction market out of New Zealand, now in beta.
From their website:
PredictIt is an exciting new, real money site that tests your knowledge of political and financial events by letting you make and trade predictions on the future.
Taking part in PredictIt is simple and easy. Pick an event you know something about and see what other traders believe is the likelihood it will happen. Do you think they have it right? Or do you think you have the knowledge to beat the wisdom of the crowd?
The key to success at PredictIt is timing. Make your predictions when most people disagree with you and the price is low. When it turns out that your view may be right, the value of your predictions will rise. You’ll need to choose the best time to sell!
Keep in mind that, although the stakes are limited, PredictIt involves real money so the consequences of being wrong can be painful. Of course, winning can also be extra sweet.
For detailed instructions on participating in PredictIt, How It Works.
PredictIt is an educational purpose project of Victoria University, Wellington of New Zealand, a not-for-profit university, with support provided by Aristotle International, Inc., a U.S. provider of processing and verification services. Prediction markets, like this one, are attracting a lot of academic and practical interest (see our Research section). So, you get to challenge yourself and also help the experts better understand the wisdom of the crowd.
How to calibrate your political beliefs
So you're playing the credence game, and you’re getting a pretty good sense of which level of confidence to assign to your beliefs. Later, when you’re discussing politics, you wonder how you can calibrate your political beliefs as well (beliefs of the form "policy X will result in outcome Y"). Here there's no easy way to assess whether a belief is true or false, in contrast to the trivia questions in the credence game. Moreover, it’s very easy to become mindkilled by politics. What do you do?
In the credence game, you get direct feedback that allows you to learn about your internal proxies for credence, i.e., emotional and heuristic cues about how much to trust yourself. With political beliefs, however, there is no such feedback. One workaround would be to assign high confidence only to beliefs for which you have read n academic papers on the subject. For example, only assign 90% confidence if you've read ten academic papers.
To account for mindkilling, use a second criterion: assign high confidence only to beliefs for which you are ideologically Turing-capable (i.e., able to pass an ideological Turing test). As a proxy for an actual ideological Turing test, you should be able to accurately restate your opponent’s position, or be able to state the strongest counterargument to your position.
In sum, to calibrate your political beliefs, only assign high confidence to beliefs which satisfy extremely demanding epistemic standards.
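These two criteria could be sketched as a toy rule. The only threshold the post actually gives is "ten papers for 90% confidence"; every other number below is illustrative:

```python
def confidence_cap(papers_read, can_pass_itt):
    """Maximum confidence to assign to a political belief, per the
    post's two criteria. Only the ten-papers/90% threshold comes
    from the post; the other numbers are illustrative."""
    if not can_pass_itt:
        # Not ideologically Turing-capable: stay near "I don't know"
        return 0.5
    return 0.9 if papers_read >= 10 else 0.7
```

The point of coding it up at all is that the cap is a hard gate, not a sliding feeling: no amount of reading buys high confidence while the ideological Turing test is failed.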
What information has surprised you most recently?
Information that surprises you is interesting as it exposes where you have been miscalibrated, and allows you to correct for that.
I suspect the users of LessWrong have fairly similar beliefs, so information that has surprised you would probably surprise others here as well; it would therefore be useful if you shared it.
Example: In a discussion with a friend recently I realised I had massively miscalibrated on the percentage of the UK population who shared my beliefs on certain subjects, in general the population was far more conservative than I had expected.
In retrospect I was assuming my own personal experience was more representative than it was, even when attempting to correct for that.
Credence calibration game FAQ
Hey rationality friends, I just made this FAQ for the credence calibration game. So if you have people you'd like to introduce to it --- for example, to get them used to thinking of belief strengths as probabilities --- now is a good time :)
Needed: A large database of statements for true/false exercises
Does anybody know where to find a large database of statements that are each roughly 50% likely to be true? These would be used for confidence calibration / Bayesian updating exercises for CMR/HRP.
One way to make such a database would be to buy a bunch of trivia games with True/False questions, and type each statement and its negation into a computer. A problem with this might be that trivia questions are selected to have surprising/counterintuitive truth values; I'm not sure if that's true. I'd be happy to acquire an already-made database of this form, but ideally I'd like statements that are "more neutral" in terms of how counterintuitive they are.
Any thoughts on where we might find a database like this to use/buy?
Thanks for any help!
Revision: We actually want a database of two-choice answer questions. This way, the player won't get trained on a base rate of 50% of statements in the world being true... they'll just get trained that when there are two possible answers, one is always true. In the end, the database should look something like this (warning: I made up the "correct" answers):
Question: "Which is diagnosed more often in America (2011)?";
Answers: (a) "the cold", (b) "allergies";
Correct Answer: (a);
Tags: {medical}
Question: "Which city has a higher average altitude?";
Answers: (a) "Chicago", (b) "Las Vegas";
Correct Answer: (a)
Tags: {geography}
Question: "Who sold more albums while living?";
Answers: (a) "Michael Jackson", (b) "Elvis Presley";
Correct Answer: (b)
Tags: {history, pop-culture, music}
Question: "Was the price of IBM stock higher or lower at the start of the month after the Berlin Wall fell, compared with the start of the previous month?";
Answers: (a) "higher", (b) "lower";
Correct Answer: (a)
Tags: {history, finance}
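A record of this shape could be represented as, say, a small Python dataclass (field names are illustrative, not a proposed schema):

```python
from dataclasses import dataclass, field

@dataclass
class TwoChoiceQuestion:
    """One record of the proposed two-choice database."""
    question: str
    answers: tuple          # (option_a, option_b)
    correct: str            # "a" or "b"
    tags: list = field(default_factory=list)

q = TwoChoiceQuestion(
    question="Which city has a higher average altitude?",
    answers=("Chicago", "Las Vegas"),
    correct="a",
    tags=["geography"],
)
print(q.correct)  # a
```

For the base-rate concern above, an exercise tool would also want to randomize which option is displayed first, so "a" vs "b" itself carries no information.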
Harry Potter and the Methods of Rationality predictions
The recent spate of updates has reminded me that while each chapter is enjoyable, the approaching end of MoR, as awesome as it no doubt will be, also means the end of our ability to learn from predicting the truth of the MoR-verse and its future.
With that in mind, I have compiled a page of predictions on sundry topics, much like my other page on predictions for Neon Genesis Evangelion; I encourage people to suggest plausible predictions that I've omitted, register their probabilities on PredictionBook.com, and come up with their own predictions. Then we can all look back when MoR finishes and reflect on what we (or Eliezer) did poorly or well.
The page is currently up to >182 predictions.
What does your accuracy tell you about your confidence interval?
Yvain's 2011 Less Wrong Census/Survey is still ongoing throughout November, 2011. If you haven't taken it, please do before reading on, or at least write down your answers to the calibration questions so they won't get skewed by the following discussion.
Naming the Highest Virtue of Epistemic Rationality
Edit: Looking back at this a few years later. It is pretty embarrassing, but I'm going to leave it up.
Why don't we start treating the log2 of the probability — conditional on every available piece of information — that you assign to the great conjunction as the best measure of your epistemic success? Let's call log_2(P(the great conjunction|your available information)) your "Bayesian competence". It is a deductive fact that no proper scoring rule other than the log could possibly give: Score(P(A|B)) + Score(P(B)) = Score(P(A&B)); and obviously, you should get the same score for assigning P(A|B) to A, after observing B, and assigning P(B) to B a priori, as you would get for assigning P(A&B) to A&B a priori. The great conjunction is the conjunction of all true statements expressible in your idiolect. Your available information may be treated as the ordered set of your retained stimuli.
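The additivity property this argument rests on is just the log of a product; a quick numerical check (illustrative probabilities):

```python
import math

# Chain-rule additivity of the log score: assigning P(B) to B and
# then P(A|B) to A should score the same in total as assigning
# P(A&B) to A&B, because log2(x * y) = log2(x) + log2(y).
p_b = 0.4           # illustrative P(B)
p_a_given_b = 0.25  # illustrative P(A|B)
p_ab = p_b * p_a_given_b

lhs = math.log2(p_ab)                          # score for A&B at once
rhs = math.log2(p_a_given_b) + math.log2(p_b)  # two-step score
print(abs(lhs - rhs) < 1e-12)  # True
```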
If this doesn't make sense, or you aren't familiar with these ideas, check out Technical Explanation after checking out Intuitive Explanation.
It is standard LW doctrine that we should not name the highest value of rationality, and it is often defended quite brilliantly:
You may try to name the highest principle with names such as “the map that reflects the territory” or “experience of success and failure” or “Bayesian decision theory”. But perhaps you describe incorrectly the nameless virtue. How will you discover your mistake? Not by comparing your description to itself, but by comparing it to that which you did not name.
and of course also:
How can you improve your conception of rationality? Not by saying to yourself, “It is my duty to be rational.” By this you only enshrine your mistaken conception. Perhaps your conception of rationality is that it is rational to believe the words of the Great Teacher, and the Great Teacher says, “The sky is green,” and you look up at the sky and see blue. If you think: “It may look like the sky is blue, but rationality is to believe the words of the Great Teacher,” you lose a chance to discover your mistake.
These quotes are from the end of Twelve Virtues
Should we really be wondering if there's a virtue higher than bayesian competence? Is there really a probability worth worrying about that the description of bayesian competence above is misunderstood? Is the description not simple enough to be mathematical? What mistake might I discover in my understanding of bayesian competence by comparing it to that which I did not name, after I've already given a proof that bayesian competence is proper, and that the restrictions (that score(P(B)*P(A|B)) = score(P(B)) + score(P(A|B)), and that the score must be a proper scoring rule) uniquely specify log_b?
I really want answers to these questions. I am still undecided about them; and change my mind about them far too often.
Of course, your bayesian competence is ridiculously difficult to compute. But I am not proposing the measure for practical reasons. I am proposing the measure to demonstrate that degree of rationality is an objective quantity that you could compute given the source code to the universe, even though there are likely no variables in the source that ever take on this value. This may be of little to no value to the most obsessively pragmatic practitioners of rationality. But it would be a very interesting result to philosophers of science and rationality.
Updated to better express the view of the author, and to take feedback into account. Apologies to any commenter whose comment may have been nullified.
The comment below:
The general reason Eliezer advocates not naming the highest virtue (as I understand it) is that there may be some type of problem for which bayesian updating (and the scoring rule referred to) yields the wrong answer. This idea sounds rather improbable to me, but there is a non-negligible probability that bayes will yield a wrong answer on some question. Not naming the virtue is supposed to be a reminder that if bayes ever gives the wrong answer, we go with the right answer, not bayes.
has changed my mind about the openness of the questions I asked.
Link: Compare your moral values to the general population
Jonathan Haidt, a professor at UVA, runs an online lab with quizzes that will compare your moral values to the rest of the population. I have found the test results useful for avoiding the typical mind fallacy. When someone disagrees with me on a belief/opinion I feel certain about, it's often difficult to tease apart how much of this disagreement stems from them not "getting it", and how much stems from them having a different fundamental value system. One of the tests alerted me that I am an outlier in certain aspects of how I judge morality (green = me; blue = liberals; red = conservatives):

Another benefit of these quizzes is that they can point out potential blind spots. For example, one quiz asks for opinions about punishment for crimes. If I discover I'm an outlier w.r.t. the population, I should reconsider whether my opinions are based on solid evidence (or did I see one study that found tit-for-tat punishment effective in a certain context, and take that as gospel?).
Extra reading: Haidt wrote a WSJ article last month that applied the learnings of these moral quizzes to better understanding the Tea Party.