Lumifer comments on The Triumph of Humanity Chart - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Is this how you think it works or is this how you think it should work?
In particular, I am curious about the "calibrating historians" part. You're going to calibrate experts against what?
It's how I think it works.
Known historical data (which they don't know).
The problem is that you want to use the best experts you have. If you are going to try to calibrate them in their field, they know it (and might have written the textbook you're calibrating them against), and if you're trying to calibrate them in the field they haven't studied, I'm not sure it's relevant to the quality of their studies.
As to "how it works", I'm pretty sure no one is actually trying to calibrate historians. I suspect the process actually works by looking up published papers and grabbing the estimates from them without any further thought -- at best. At worst you have numbers invented out of thin air, straight extrapolation of available curves, etc. etc.
Resolution and calibration are separate. They may have lower resolution in other fields but they shouldn't have lower calibration.
Edit: I thought more about my previous comment, and it isn't quite true. One thing they discuss in Superforecasting is that people tend to be overconfident within their own fields while better calibrated outside them.
You're thinking about this in terms of forecasting. This is not forecasting, this is historical studies.
Consider the hard sciences equivalent: you take, say, some geneticists and try to figure out whether their estimates of which genes cause what are any good by asking them questions about quantum physics to "check how they are calibrated".
No. Bayesian estimate calibration is most often used in forecasting, but it's effective in any domain in which there's uncertainty, including the hard sciences. In fact, calibration training is often done with either numerical trivia (using 90% credible intervals) or true-or-false questions (using a single percentage estimate). I recommend checking out "How to Measure Anything" for a more in-depth treatment.
Yes, that's essentially how it works, except that you then give them feedback so they can see whether they're over- or under-confident. The questions would have to be relatively easy, though; otherwise all the estimates would cluster around fifty percent and the training wouldn't be very useful for producing high-resolution answers.
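As a rough illustration of what that feedback loop might look like, here is a minimal sketch in Python. The quiz answers, confidences, and bucketing scheme are invented purely for illustration, not taken from any particular training tool.

```python
# Minimal sketch of a calibration-feedback report for true/false questions.
# The answer data below is made up purely for illustration.
from collections import defaultdict

def calibration_report(answers):
    """answers: list of (stated_confidence, was_correct) pairs,
    with stated_confidence between 0.5 and 1.0."""
    buckets = defaultdict(list)
    for confidence, correct in answers:
        buckets[round(confidence, 1)].append(correct)
    for confidence in sorted(buckets):
        results = buckets[confidence]
        accuracy = sum(results) / len(results)
        # Well calibrated: stated confidence roughly matches actual accuracy.
        # Stated confidence much higher than accuracy suggests overconfidence.
        print(f"stated {confidence:.0%}: answered {accuracy:.0%} correctly "
              f"over {len(results)} question(s)")

# Feedback after a short (made-up) quiz:
calibration_report([(0.9, True), (0.9, False), (0.7, True), (0.6, False), (0.8, True)])
```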
Citation needed.
Not all uncertainty is created equal. If the uncertainty comes from e.g. measurement limitations, Bayesian calibration is useless.
Note that science is mostly about creating results that can be replicated by anyone regardless of how well or badly calibrated they are.
That's how you imagine it works, since I don't expect anyone to actually be doing this. But let's see: assume we have successfully run the calibration exercises with our group of geneticists. What do you expect them to change in their studies of which genes do what? We can get even more specific: let's say we're talking about one of the twin studies where the author tracked a set of twins, tested them on some phenotype feature X, and is reporting that the twins correlate at Y% while the otherwise-similar general population correlates at Z%. What results would better calibration affect?
That was an overconfident statement, but for more on how calibration is useful in places other than forecasting, check out "How to Measure Anything", as mentioned in my last comment.
Once calibrated, they can make estimates of how sure they are of certain hypotheses, and of how likely treatments based on those hypotheses are to save lives. This in turn lets them quantify which experiment to run next using value-of-information calculations.
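To make the value-of-information point concrete, here is a toy sketch. All of the numbers, the treatment names, and the assumption of a "perfect" experiment are invented for illustration; the point is only the shape of the calculation.

```python
# Toy value-of-information calculation with invented numbers.

# Calibrated expert estimate: P(hypothesis H is true) = 0.3.
p_h = 0.3

# Expected lives saved by each treatment, depending on whether H is true.
payoff = {
    "treatment_A": {"h_true": 1000, "h_false": 100},  # only pays off if H holds
    "treatment_B": {"h_true": 300,  "h_false": 300},  # insensitive to H
}

def expected_value(action, p):
    return p * payoff[action]["h_true"] + (1 - p) * payoff[action]["h_false"]

# Value of acting now on current (calibrated) beliefs:
value_now = max(expected_value(a, p_h) for a in payoff)

# Value if a perfect experiment first told us whether H is true:
value_with_info = (p_h * max(payoff[a]["h_true"] for a in payoff)
                   + (1 - p_h) * max(payoff[a]["h_false"] for a in payoff))

evpi = value_with_info - value_now  # expected value of perfect information
print(f"act now: {value_now:.0f}, after perfect experiment: {value_with_info:.0f}, "
      f"EVPI: {evpi:.0f}")
```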
Furthermore, by surveying many of these calibrated genetics experts and then extremizing their results, you can get an idea of how likely certain hypotheses are to turn out to be correct.
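For the aggregation step, one common approach is to average the experts' probabilities in log-odds space and then push the result away from 0.5. A minimal sketch follows; the expert numbers and the extremizing factor alpha are illustrative assumptions, not canonical values.

```python
import math

def extremized_pool(probabilities, alpha=2.0):
    """Average expert probabilities in log-odds space, then extremize the
    result by the factor alpha (alpha is a tuning parameter)."""
    log_odds = [math.log(p / (1 - p)) for p in probabilities]
    mean = sum(log_odds) / len(log_odds)
    return 1 / (1 + math.exp(-alpha * mean))

# Five (made-up) calibrated experts on "this hypothesis will hold up":
print(extremized_pool([0.7, 0.65, 0.8, 0.6, 0.75]))  # noticeably above the plain average of 0.70
```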
I don't know if you read scientific papers, but they don't "make estimates of how sure they are of certain hypotheses". They present the data and discuss the conclusions and implications that follow from it. The potential hypotheses are evaluated on the basis of data, not on the basis of how well-calibrated a particular researcher feels.
Calibration is good for guesstimates, it's not particularly valuable for actual research.
That's forecasting. Remember, we're not talking about forecasting.
I'm not really sure how to answer this because I think you misunderstand calibration.
Science moves forward through something called scientific consensus. How does scientific consensus work right now? Well, we just kind of use guesswork. Expert calibration is a more useful way to understand what the scientific consensus actually is.
No, it's a decision model. The decision model uses a forecast ("How many lives can be saved?"), but it also uses calibration against known data ("Based on the data you have, how sure are you that this particular fact is true?").