Hey, thanks for mentioning this. I hadn't heard about it.
I've tried my hand at this app (50 questions or so), and it seems like the correct strategy, for me, is to go 50% for anything I have even a little doubt about, and 99% for anything I'm sure about. Maybe 5% of the questions fall into the 60%-90% range.
I'm still working to understand the tutorial and how to interpret my results.
:-)
It's not particularly hard to "perfect" your calibration in that game -- if you're over/under at a certain confidence level, you can throw questions where you're confident into the levels where you're "poorly calibrated" in order to spoof a good calibration curve.
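To make the spoofing concrete, here is a minimal sketch with made-up numbers (the counts are purely illustrative, not taken from the game): an overconfident 70% bucket gets "repaired" by filing sure-thing questions under 70%.

```python
def hit_rate(correct, total):
    """Observed fraction of correct answers in one confidence bucket."""
    return correct / total

# Hypothetical: 20 honest answers entered at "70%", only 11 of them right.
honest_correct, honest_total = 11, 20
# Hypothetical: 10 questions you are actually sure of, deliberately entered as "70%".
padded_correct, padded_total = 10, 10

before = hit_rate(honest_correct, honest_total)
after = hit_rate(honest_correct + padded_correct, honest_total + padded_total)

print(f"before padding: {before:.2f}")  # 0.55, looks overconfident
print(f"after padding:  {after:.2f}")   # 0.70, looks perfectly calibrated
```

The curve improves, but the stated 70% no longer reflects what you believed about any of those questions.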
The trick to that game, if you actually want to assess your calibration, is to play for points rather than for a good curve. Being well-calibrated means that when you play for points, you have a good curve automatically.
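A rough sketch of why playing for points works, assuming the game scores answers with something like a logarithmic rule (I don't know the app's exact formula, so `expected_log_score` is my own illustration): under such a rule, your expected score is highest when the confidence you enter matches how often you're actually right.

```python
import math

def expected_log_score(true_p, stated_p):
    """Expected log score if you are right with probability true_p
    but enter stated_p as your confidence."""
    return true_p * math.log(stated_p) + (1 - true_p) * math.log(1 - stated_p)

# Confidence levels you could enter: 0.50, 0.51, ..., 0.99.
candidates = [i / 100 for i in range(50, 100)]

def best_report(true_p):
    """The candidate confidence that maximizes the expected score."""
    return max(candidates, key=lambda q: expected_log_score(true_p, q))

print(best_report(0.99))  # 0.99
print(best_report(0.70))  # 0.7

# Filing a 99%-sure answer under 70% to fix the curve costs expected points.
print(expected_log_score(0.99, 0.99) > expected_log_score(0.99, 0.70))  # True
```

So under this assumed rule, shading a sure answer down to patch a bucket lowers your expected score, which is why point-chasing and honest reporting line up.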
(I wish that they'd show you your curve less often, perhaps only when you leave the game. It's hard to resist cheating the curve. Then again, I'm not sure of a better way to provide the necessary feedback.)