Does anybody know where to find a large database of statements that are each roughly 50% likely to be true? These would be used for confidence calibration / Bayesian updating exercises for CMR/HRP.
One way to make such a database would be to buy a bunch of trivia games with True/False questions, and type each statement and its negation into a computer. A problem with this might be that trivia questions are selected to have surprising/counterintuitive truth values; I'm not sure if that's true. I'd be happy to acquire an already-made database of this form, but ideally I'd like statements that are "more neutral" in terms of how counterintuitive they are.
Any thoughts on where we might find a database like this to use/buy?
Thanks for any help!
Revision: We actually want a database of two-choice answer questions. This way, the player won't get trained on a base rate of 50% of statements in the world being true... they'll just get trained that when there are two possible answers, one is always true. In the end, the database should look something like this (warning: I made up the "correct" answers):
Question: "Which is diagnosed more often in America (2011)?";
Answers: (a) "the cold", (b) "allergies";
Correct Answer: (a)
Tags: {medical}
Question: "Which city has a higher average altitude?";
Answers: (a) "Chicago", (b) "Las Vegas";
Correct Answer: (a)
Tags: {geography}
Question: "Who sold more albums while living?";
Answers: (a) "Michael Jackson", (b) "Elvis Presley";
Correct Answer: (b)
Tags: {history, pop-culture, music}
Question: "Was the price of IBM stock higher or lower at the start of the month after the Berlin wall fell, compared with the start of the previous month?";
Answers: (a) "higher", (b) "lower";
Correct Answer: (a)
Tags: {history, finance}
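As a minimal sketch of how one such record might be stored, here is a Python version of the format above. The class name `TwoChoiceQuestion`, its field names, and the `shuffled` helper are all hypothetical, not part of any existing database. The helper randomizes answer order so that a player can't learn a positional base rate such as "(a) is usually right". The example record reuses the made-up "correct" answer from the list above.

```python
import random
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class TwoChoiceQuestion:
    """One record in the hypothetical two-choice question database."""
    question: str
    answers: Tuple[str, str]          # (a) and (b), in display order
    correct: int                      # 0 means (a) is correct, 1 means (b)
    tags: FrozenSet[str] = frozenset()

    def correct_answer(self) -> str:
        """Return the text of the correct answer."""
        return self.answers[self.correct]

    def shuffled(self, rng: random.Random) -> "TwoChoiceQuestion":
        """Return a copy whose answer order is swapped with probability 1/2."""
        if rng.random() < 0.5:
            return self
        a, b = self.answers
        return TwoChoiceQuestion(self.question, (b, a), 1 - self.correct, self.tags)


# Example record; the "correct" answer is made up, as warned in the post.
altitude = TwoChoiceQuestion(
    question="Which city has a higher average altitude?",
    answers=("Chicago", "Las Vegas"),
    correct=0,
    tags=frozenset({"geography"}),
)
```

Shuffling changes only the display order; `correct_answer()` returns the same text before and after.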
I would recommend against "X is the 13th largest Y", because for anyone who hasn't memorized the Top Twenty Ys, getting this right is purely a matter of guesswork. "One of the 10 largest" is better; so is "X is bigger than Y".
Well, if one can come up with the top Ys, one can reason about what probability to assign to that statement. For example, if I can think of 9 countries that I think are bigger than Algeria, and three more about which I am uncertain, I can reasonably assign a probability of, say, 30%. Calibration training could be done this way.
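One way to make that kind of reasoning explicit is sketched below. It rests on assumptions that are not in the comment above: that the statement being judged is "X is one of the n largest Ys", and that each uncertain candidate is independently bigger than X with the same probability `p_each`. The function name and parameters are hypothetical.

```python
from math import comb


def prob_in_top_n(n: int, surely_bigger: int, uncertain: int, p_each: float) -> float:
    """Probability that X is among the n largest items.

    Assumes `surely_bigger` items are definitely bigger than X, and each of
    `uncertain` further items is independently bigger with probability p_each.
    """
    # X stays in the top n only if at most n - 1 items are bigger in total.
    slack = n - 1 - surely_bigger
    if slack < 0:
        return 0.0
    # Binomial probability that at most `slack` of the uncertain items are bigger.
    return sum(
        comb(uncertain, k) * p_each**k * (1 - p_each) ** (uncertain - k)
        for k in range(min(slack, uncertain) + 1)
    )


# Algeria example: 9 countries surely bigger, 3 uncertain. If each uncertain
# country is bigger with probability 1/3 (an assumed value), then
# P(top 10) = (2/3)**3 = 8/27, or roughly 0.30.
p_algeria = prob_in_top_n(n=10, surely_bigger=9, uncertain=3, p_each=1 / 3)
```

Under these assumptions Algeria makes the top 10 only if none of the three uncertain countries is bigger, which is why the sum collapses to a single term.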