Unnamed comments on 2014 Survey Results - Less Wrong

87 Post author: Yvain 05 January 2015 07:36PM

Comment author: Unnamed 06 January 2015 02:04:54AM 10 points [-]

Details on data cleanup:

In the publicly available data set, I restricted my analysis to people who:
* entered a number on each of the 10 calibration probability estimates
* did not enter any estimates larger than 100
* entered at least one estimate larger than 1
* entered something on each of the 10 calibration guesses
* did not enter a number as any of the 10 calibration guesses (i.e., answered the questions with words, not numbers)

Failure to meet any of these criteria generally indicated either a failure to understand the format of the calibration questions or a decision to skip one or more of them. Each of these criteria eliminated at least 1 person, leaving a sample of 1141 people.
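The five criteria above can be sketched as a single filter function. This is a minimal sketch, not the actual analysis code; the argument names and the assumption that responses arrive as lists of strings are mine:

```python
def passes_cleanup(probs, guesses):
    """Apply the five inclusion criteria to one respondent.

    probs   -- the 10 calibration probability estimates, as entered (strings)
    guesses -- the 10 calibration guesses, as entered (strings)
    """
    def as_number(s):
        try:
            return float(s)
        except ValueError:
            return None

    nums = [as_number(p) for p in probs]
    return (
        len(probs) == 10
        and all(n is not None for n in nums)          # a number on each estimate
        and all(n <= 100 for n in nums)               # no estimate larger than 100
        and any(n > 1 for n in nums)                  # at least one estimate above 1
        and len(guesses) == 10
        and all(g.strip() for g in guesses)           # something on each guess
        and all(as_number(g) is None for g in guesses)  # no numeric guesses
    )
```

The `any(n > 1)` criterion catches respondents who entered all their probabilities on a 0–1 scale rather than as percentages.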

I counted as "correct":
* any answer which Scott/Ozy counted as correct
* any answer to question 1 (largest bone) which began with "fem" (e.g., "femer")
* any answer to question 2 (Obama's state) which began with "haw" (e.g., "Hawii")
* any answer to question 4 (Norse god) which began with "od" or "wo" (e.g., "Wotan")
* any answer to question 8 (cell) which began with "mito" (e.g., "Mitochondira")

These seem to cover the most common misspellings (and alternate names; e.g., "Wotan" is the German name for Odin) while counting very few obviously wrong answers as correct, and without having to go through every answer one by one. This leniency gave the average participant another 0.15 correct answers, and I suspect a fully manual pass with lenient standards could add another 0.05 or so. The mitochondria leniency made the largest difference, adding 97 correct answers.
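The lenient grading rules above amount to prefix matching on top of the exact answers. A sketch (the `exact_answers` sets that Scott/Ozy used are not reproduced here, so that parameter is hypothetical):

```python
# Accepted prefixes for the four leniently graded questions,
# keyed by question number as described above.
LENIENT_PREFIXES = {
    1: ("fem",),       # largest bone: femur ("femer", ...)
    2: ("haw",),       # Obama's state: Hawaii ("Hawii", ...)
    4: ("od", "wo"),   # Norse god: Odin, or the German "Wotan"
    8: ("mito",),      # cell: mitochondria ("Mitochondira", ...)
}

def is_correct(question, answer, exact_answers):
    """Return True if `answer` counts as correct for `question`.

    exact_answers -- the set of (lowercased) answers already counted
                     as correct; hypothetical stand-in for Scott/Ozy's grading.
    """
    a = answer.strip().lower()
    if a in exact_answers:
        return True
    # Fall back to the lenient prefix rules, if this question has any.
    return any(a.startswith(p) for p in LENIENT_PREFIXES.get(question, ()))
```

Questions without an entry in `LENIENT_PREFIXES` fall through to exact matching only.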

Without counting these additional correct answers, the average overconfidence score would have been 0.54 among the full sample, 0.40 among sequence readers, 0.32 among CFAR alumni, 0.40 among active-in-person LWers, 0.30 among those with 1000+ karma, and 0.23 among those with high test scores. Counting these additional correct answers helped non-US LWers more than US LWers (by 0.21 questions vs. 0.11); I suspect that part of that is due to spelling difficulties for non-native speakers and part is due to the Odin vs. Wotan thing.
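Since the scores above are quoted in units of questions, one plausible reading of the overconfidence score is expected correct answers (sum of stated probabilities) minus actual correct answers. The exact formula isn't stated in this comment, so treat this as an illustration of the units, not the actual computation:

```python
def overconfidence(probs, num_correct):
    """One plausible overconfidence score, in units of questions:
    expected correct answers (sum of stated probabilities, entered
    as percentages) minus actual correct answers.
    """
    expected = sum(p / 100.0 for p in probs)
    return expected - num_correct
```

For example, a respondent who put 70% on each of the 10 questions but got only 6 right would score 7.0 − 6 = 1.0 questions of overconfidence; a perfectly calibrated respondent scores 0.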