Hi all,
I put this calibration test together this morning. It pulls from a trivia API with over 150,000 questions, so you should be able to take it many, many times before you start seeing repeats.
http://www.2pih.com/caltest.php
A few notes:
1. The questions are "Jeopardy" style questions so the wording may be strange, and some of them might be impossible to answer without further context. On these just assign 0% confidence.
2. As the questions are open-ended, there is no answer-checking mechanism. You have to be honest with yourself about whether you got the right answer. After all, what would be the point of cheating at a calibration test?
I can't think of anything else. Please let me know if there are any features you would want to see added, or if there are any bugs, issues, etc.
**EDIT**
As per suggestion I have moved this to the main section. Here are the changes I'll be making soon:
- Label the axes and include an explanation of calibration curves.
- Make it so you can reverse your last selection in the event of a misclick.
Here are changes I'll make eventually:
- Create an account system so you can store your results online.
- Move trivia DB over to my own server to allow for flagging of bad/unanswerable questions.
Here are the changes that are done:
- Changed 0% to 0.1% and 99% to 99.9%.
- Added a second graph which shows the frequency of your confidence selections.
- Color-coded the "right" and "wrong" buttons and moved them farther apart to prevent misclicks.
- Your results are now stored locally so that you can see your calibration over time.
- Blank questions are now detected and skipped.
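For anyone wondering what the two graphs are summarizing, here is a rough sketch in Python. It is not the site's actual code; the function name `calibration_summary` and the `(confidence, was_correct)` input format are my own assumptions for illustration. For each confidence level it counts how often you picked it (the second graph) and what fraction of those answers you marked as right (the calibration curve).

```python
from collections import defaultdict

def calibration_summary(results):
    """Summarize self-graded answers per stated confidence level.

    results: iterable of (stated_confidence, was_correct) pairs,
             e.g. (0.6, True). This input format is hypothetical.
    Returns {confidence: (times_selected, observed_accuracy)},
    ordered by confidence.
    """
    tally = defaultdict(lambda: [0, 0])  # confidence -> [answers, correct]
    for confidence, was_correct in results:
        bucket = tally[confidence]
        bucket[0] += 1
        bucket[1] += int(was_correct)
    return {c: (n, hits / n) for c, (n, hits) in sorted(tally.items())}

if __name__ == "__main__":
    demo = [(0.6, True), (0.6, False), (0.9, True)]
    for level, (count, accuracy) in calibration_summary(demo).items():
        print(f"stated {level:.0%}: picked {count}x, observed {accuracy:.0%}")
```

If you are well calibrated, the observed accuracy for each level should land close to the stated confidence once you have answered enough questions at that level.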
Got it. I'll make them color coded and farther apart.
I'll write some better instructions as well.
What would help most is: "Pick an answer. How confident are you that your answer is correct?"
Then, when the user clicks the 'show answer' button, make sure that neither of the two new buttons appears in the same place.
ALSO, it would be nice if the calibration curve showed the credible interval for each bin, so I can tell at a glance that my getting 1/1 right at 30% and 0/1 right at 60% isn't actually that big a hit to my calibration.
And if the second graph was stacked so that I don't have this giant red bar at 100%, which just looks od...
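To make the credible-interval request concrete, here is one way it could be computed, as a sketch only. It assumes a uniform prior on each bin's true hit rate, so the posterior after `hits` right out of `total` is Beta(hits + 1, total - hits + 1). The names `beta_cdf` and `credible_interval` are made up for this example, and the 95% equal-tailed interval is just one reasonable choice.

```python
from math import comb

def beta_cdf(x, hits, total):
    """CDF at x of Beta(hits + 1, total - hits + 1).

    For integer parameters this equals P(Binomial(total + 1, x) >= hits + 1),
    so it can be computed exactly without any special-function library.
    """
    n = total + 1
    return sum(comb(n, j) * x**j * (1 - x)**(n - j)
               for j in range(hits + 1, n + 1))

def credible_interval(hits, total, mass=0.95):
    """Equal-tailed credible interval for a bin's true hit rate (uniform prior assumed)."""
    def quantile(q):
        lo, hi = 0.0, 1.0
        for _ in range(60):  # bisection on the monotone CDF
            mid = (lo + hi) / 2
            if beta_cdf(mid, hits, total) < q:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    tail = (1 - mass) / 2
    return quantile(tail), quantile(1 - tail)

if __name__ == "__main__":
    for hits, total in [(1, 1), (0, 1)]:
        low, high = credible_interval(hits, total)
        print(f"{hits}/{total} right: {low:.2f} to {high:.2f}")
```

Under these assumptions, 1/1 right gives an interval of roughly 0.16 to 0.99 and 0/1 right gives roughly 0.01 to 0.84, so a single answer in the 30% or 60% bin is consistent with being well calibrated there.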