If it's worth saying, but not worth its own post (even in Discussion), then it goes here.
Notes for future OT posters:
1. Please add the 'open_thread' tag.
2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)
3. Open Threads should be posted in Discussion, and not Main.
4. Open Threads should start on Monday, and end on Sunday.
About that survey... Suppose I ask you to guess the result of a biased coin which comes up heads 80% of the time. I ask you to guess 100 times, of which ~80 times the right answer is "heads" (these are the "easy" or "obvious" questions) and ~20 times the right answer is "tails" (these are the "hard" or "surprising" questions). The correct guess, if you aren't told whether a given question is "easy" or "hard", is to guess heads with 80% confidence on every question. But then you look underconfident on the "easy" questions, because you guessed heads with 80% confidence and heads came up 100% of the time. And you look overconfident on the "hard" questions, because you guessed heads with 80% confidence and heads came up 0% of the time.
So you can get apparent under/overconfidence on easy/hard questions respectively, even if you're perfectly calibrated, if you aren't told in advance whether a question is easy or hard. Maybe the effect Yvain is describing does exist, but his post does not demonstrate it.
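The arithmetic in the coin example can be checked with a few lines of Python. This is only a sketch of the thought experiment, assuming exactly 80 heads and 20 tails out of 100 (the comment says "~80" and "~20"); the function name calibration_gap and the variable names are made up for illustration.

```python
from fractions import Fraction

def calibration_gap(outcomes, guess="heads", confidence=Fraction(4, 5)):
    """Return (accuracy, accuracy - confidence) for one fixed guess.

    A positive gap means the guesser looks underconfident on this set,
    a negative gap means they look overconfident, zero means calibrated.
    """
    correct = sum(1 for outcome in outcomes if outcome == guess)
    accuracy = Fraction(correct, len(outcomes))
    return accuracy, accuracy - confidence

# Assumed for illustration: exactly 80 "easy" (heads) and 20 "hard" (tails) questions.
outcomes = ["heads"] * 80 + ["tails"] * 20

# Split the questions after the fact, the way the easy/hard analysis does.
easy = [o for o in outcomes if o == "heads"]
hard = [o for o in outcomes if o == "tails"]

overall_result = calibration_gap(outcomes)
easy_result = calibration_gap(easy)
hard_result = calibration_gap(hard)

for label, (accuracy, gap) in [
    ("overall", overall_result),
    ("easy", easy_result),
    ("hard", hard_result),
]:
    print(f"{label}: accuracy={float(accuracy):.2f}, gap={float(gap):+.2f}")
```

Over all 100 questions the gap is zero, so the guesser is calibrated. The nonzero gaps appear only once the questions are sorted by their answers, which the guesser had no access to when guessing.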
I am probably misunderstanding something here, but doesn't this basically say, "if you have no calibration whatsoever"? If there are distinct categories of questions (easy and hard) and you can't tell which questions belong to which category, then simply guessing according to your overall base rate will make your calibration look terrible, because it is.