lukeprog comments on Calibration Test with database of 150,000+ questions - Less Wrong

37 Post author: Nanashi 14 March 2015 11:22AM




Comment author: RowanE 13 March 2015 05:37:15PM 12 points [-]

I think the problem here is that with many trivia questions you either know the answer or you don't. The dominant factor in my results so far is that I either have no answer in mind, assign 0 probability to my being right, and am correctly calibrated there, or I have an answer; all of my answers at other levels of certainty have turned out right so far, so my calibration curve looks almost rectangular.
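The "rectangular" calibration curve described above can be sketched with a few lines of code. This is a minimal illustration (the data format and function name are assumed, not from any calibration app): bucket answers by stated confidence and compare each stated confidence to the observed fraction correct.

```python
# Hypothetical sketch: estimate a calibration curve from
# (stated_confidence, was_correct) pairs by bucketing answers
# at each confidence level.
from collections import defaultdict

def calibration_curve(answers):
    """answers: iterable of (confidence in [0, 1], bool correct).
    Returns {confidence: observed fraction correct}."""
    buckets = defaultdict(list)
    for confidence, correct in answers:
        buckets[confidence].append(correct)
    return {c: sum(v) / len(v) for c, v in sorted(buckets.items())}

# The pattern described above: 0%-confidence answers all wrong,
# every answer at any other confidence level right.
answers = [(0.0, False)] * 10 + [(0.6, True)] * 5 + [(0.9, True)] * 5
print(calibration_curve(answers))  # {0.0: 0.0, 0.6: 1.0, 0.9: 1.0}
```

A well-calibrated curve would instead show the observed accuracy tracking the stated confidence at every level, not jumping from 0 to 1.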

I might just be getting accurate information that I'm drastically underconfident, but I think this might be one of the worse types of question to calibrate on. Even if the problem is just that I'm drastically underconfident on trivia questions and shouldn't assign less than 50% probability to any answer I actually have in mind, that seems sufficiently unrepresentative of most areas where you need calibration, and of how most people perform on other calibration tests, that this is a pretty bad measure of calibration.

Perhaps it would be better as a multiple-choice test, so that possible answers that may or may not be right are raised to attention, and one can assign probabilities to those?

Comment author: lukeprog 13 March 2015 06:03:23PM 7 points [-]

0% probability is my most common answer as well, but I'm using it less often than I was choosing 50% on the CFAR calibration app (which forces a binary answer choice rather than an open-ended one). The CFAR app has lots of questions like "Which of these two teams won the Super Bowl in 1978?" where I just have no idea. The trivia database Nanashi is using has, for me, a greater proportion of questions on which my credence is something more interesting than an ignorance prior.