Does anybody know where to find a large database of statements that are roughly 50% likely to be true or false? These would be used for confidence calibration / Bayesian updating exercises for CMR/HRP.
One way to make such a database would be to buy a bunch of trivia games with True/False questions, and type each statement and its negation into a computer. A problem with this might be that trivia questions are selected to have surprising/counterintuitive truth values; I'm not sure if that's true. I'd be happy to acquire an already-made database of this form, but ideally I'd like statements that are "more neutral" in terms of how counterintuitive they are.
Any thoughts on where we might find a database like this to use/buy?
Thanks for any help!
Revision: We actually want a database of two-choice questions. This way, the player won't get trained on a base rate of 50% of statements in the world being true... they'll just get trained that when there are two possible answers, exactly one is correct. In the end, the database should look something like this (warning: I made up the "correct" answers):
Question: "Which is diagnosed more often in America (2011)?";
Answers: (a) "the cold", (b) "allergies";
Correct Answer: (a)
Tags: {medical}
Question: "Which city has a higher average altitude?";
Answers: (a) "Chicago", (b) "Las Vegas";
Correct Answer: (a)
Tags: {geography}
Question: "Who sold more albums while living?";
Answers: (a) "Michael Jackson", (b) "Elvis Presley";
Correct Answer: (b)
Tags: {history, pop-culture, music}
Question: "Was the price of IBM stock higher or lower at the start of the month after the Berlin Wall fell, compared with the start of the previous month?";
Answers: (a) "higher", (b) "lower";
Correct Answer: (a)
Tags: {history, finance}
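In case it helps anyone who wants to contribute entries, here is a rough sketch of how records in this format might be stored and filtered by tag. The names `TwoChoiceQuestion`, `QUESTIONS`, and `by_tag` are just placeholders I picked for illustration, and the "correct" answers are still the made-up ones from above:

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class TwoChoiceQuestion:
    """One record: a question, its two answers, the correct one, and tags."""
    question: str
    answers: Tuple[str, str]   # (answer a, answer b)
    correct: int               # 0 means (a), 1 means (b)
    tags: FrozenSet[str] = frozenset()

    def correct_answer(self) -> str:
        return self.answers[self.correct]


# The "correct" values below are the invented ones from the examples above,
# NOT verified facts.
QUESTIONS: List[TwoChoiceQuestion] = [
    TwoChoiceQuestion(
        "Which is diagnosed more often in America (2011)?",
        ("the cold", "allergies"), 0, frozenset({"medical"})),
    TwoChoiceQuestion(
        "Which city has a higher average altitude?",
        ("Chicago", "Las Vegas"), 0, frozenset({"geography"})),
    TwoChoiceQuestion(
        "Who sold more albums while living?",
        ("Michael Jackson", "Elvis Presley"), 1,
        frozenset({"history", "pop-culture", "music"})),
]


def by_tag(questions: List[TwoChoiceQuestion], tag: str) -> List[TwoChoiceQuestion]:
    """Return only the questions carrying the given tag."""
    return [q for q in questions if tag in q.tags]


if __name__ == "__main__":
    for q in by_tag(QUESTIONS, "history"):
        print(q.question, "->", q.correct_answer())
```

Storing the correct answer as an index rather than a letter would make it easy to shuffle the order of the two answers later, if that turns out to matter.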
Hmm... this made me think that perhaps two-choice questions are better than true/false questions, because when all the questions share the same two possible answers, T and F, there is a base rate of how often "T" is correct, which the player should account for. For real-life questions with two possible answers, like "Who is taller, Alex or Bob?", there is not really a well-known base rate.
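With no shared base rate to lean on, what gets checked in a calibration exercise is the confidence the player states for each chosen answer. A minimal sketch of one way that could be tallied, assuming each result is recorded as a (stated confidence, was the chosen answer correct) pair; `calibration_table` and `brier_score` are names I made up, and the Brier score is just one common scoring rule, not necessarily the one the exercises would use:

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Each result: (confidence the player stated in their chosen answer,
#               whether that chosen answer turned out to be correct).
Result = Tuple[float, bool]


def calibration_table(results: List[Result]) -> Dict[float, float]:
    """Group results by stated confidence (rounded to one decimal place)
    and return the fraction actually correct in each group."""
    buckets = defaultdict(list)
    for confidence, was_correct in results:
        buckets[round(confidence, 1)].append(was_correct)
    return {c: sum(hits) / len(hits) for c, hits in sorted(buckets.items())}


def brier_score(results: List[Result]) -> float:
    """Mean squared gap between stated confidence and outcome (lower is better)."""
    return sum((c - float(ok)) ** 2 for c, ok in results) / len(results)


if __name__ == "__main__":
    # Hypothetical answers from one player, for illustration only.
    sample = [(0.6, True), (0.6, False), (0.9, True), (0.9, True)]
    print(calibration_table(sample))
    print(brier_score(sample))
```

A well-calibrated player's table would show each stated confidence roughly matching the fraction correct in that group, which is why a large database matters: each group needs many questions before the fractions mean much.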
Thanks!