Our hosts at Tricycle Developments have created PredictionBook.com, which lets you make predictions and then track your calibration - see whether things you assigned 70% probability actually happen about 7 times out of 10.
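For concreteness, here's a minimal sketch (my own illustration, not PredictionBook's actual code) of what that calibration check amounts to: bucket your predictions into probability bands and compare each band's average stated probability to how often those predictions actually came true:

```python
from collections import defaultdict

def calibration_report(predictions):
    """predictions: a list of (stated_probability, came_true) pairs."""
    # Bucket predictions into 10% probability bands.
    bands = defaultdict(list)
    for prob, came_true in predictions:
        band = min(int(prob * 10), 9)  # 0.95 and 1.0 both land in the 90-100% band
        bands[band].append((prob, came_true))

    # Compare each band's average stated probability to its observed frequency.
    for band in sorted(bands):
        entries = bands[band]
        stated = sum(p for p, _ in entries) / len(entries)
        observed = sum(1 for _, t in entries if t) / len(entries)
        print(f"{band * 10:2d}-{band * 10 + 10}%: stated {stated:.2f}, "
              f"observed {observed:.2f}, n={len(entries)}")

# A perfectly calibrated set of 70% predictions: 7 hits out of 10.
calibration_report([(0.7, True)] * 7 + [(0.7, False)] * 3)
```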
The major challenges with a tool like this are (a) coming up with good short-term predictions to track and (b) maintaining your will to keep tracking yourself even if the results are discouraging, as they probably will be.
I think the main motivation to actually use it would be rationalists challenging each other to put a prediction on the record and track the results - I'm going to try to remember to do this the next time Michael Vassar says "X%" and I assign a different probability. (Vassar would have won quite a few points for his superior predictions of Singularity Summit 2009 attendance - I was pessimistic; Vassar was accurate.)
Calibration may be achievable by a general procedure of making and testing (banded) predictions, but I wouldn't trust anyone's calibration in a particular domain on evidence of calibration in another.
In other words, people will have studied the accuracy of only some of their maps.
Do you have any evidence for this? I don't remember any strongly domain-specific results in Tetlock's study, in the book I read about calibration in business, or in any other studies. Nor does Wikipedia mention anything except domain experts being overconfident (as opposed to people being random outside their domain even when supposedly calibrated, as you imply), which is fixable with calibration training.
And this is what I would expect given that the question is not about accuracy (one would hope experts would win in a particular domain) but about calibration - why c...