Lumifer comments on The Triumph of Humanity Chart - Less Wrong

Post author: Dias 26 October 2015 01:41AM




Comment author: [deleted] 27 October 2015 05:29:29PM 1 point

It's how I think it works.

You're going to calibrate experts against what?

Known historical data (which they don't know).

Comment author: Lumifer 27 October 2015 05:54:39PM * 1 point

The problem is that you want to use the best experts you have. If you are going to try to calibrate them in their own field, they know it (and might have written the textbook you're calibrating them against); and if you're trying to calibrate them in a field they haven't studied, I'm not sure that's relevant to the quality of their studies.

As to "how it works", I'm pretty sure no one is actually trying to calibrate historians. I suspect the process actually works by looking up published papers and grabbing the estimates from them without any further thought -- at best. At worst you have numbers invented out of thin air, straight extrapolation of available curves, etc. etc.

Comment author: [deleted] 28 October 2015 03:09:27AM * 0 points

The problem is that you want to use the best experts you have. If you are going to try to calibrate them in their own field, they know it (and might have written the textbook you're calibrating them against); and if you're trying to calibrate them in a field they haven't studied, I'm not sure that's relevant to the quality of their studies.

Resolution and calibration are separate things: calibration is whether your stated probabilities match the frequencies actually observed, while resolution is how well your forecasts separate what happens from what doesn't. Experts may have lower resolution in other fields, but they shouldn't have lower calibration.

Edit: I thought more about the statement above, and it isn't quite true. One thing they talk about in "Superforecasting" is that people tend to be overconfident in their own fields while better calibrated in others.
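A minimal sketch of that calibration/resolution distinction, assuming binary true/false questions scored with the Murphy decomposition of the Brier score; the forecaster's answers below are invented purely for illustration:

```python
# Sketch: separating calibration (reliability) from resolution for binary questions.
# Murphy decomposition: Brier = reliability - resolution + uncertainty.
from collections import defaultdict

def brier_decomposition(probs, outcomes):
    """Return (reliability, resolution, uncertainty) for stated probs vs 0/1 outcomes."""
    n = len(probs)
    base_rate = sum(outcomes) / n
    bins = defaultdict(list)            # group answers by the stated probability
    for p, o in zip(probs, outcomes):
        bins[round(p, 1)].append(o)
    reliability = 0.0                   # miscalibration: low is good
    resolution = 0.0                    # discrimination: high is good
    for p, obs in bins.items():
        freq = sum(obs) / len(obs)
        reliability += len(obs) / n * (p - freq) ** 2
        resolution += len(obs) / n * (freq - base_rate) ** 2
    return reliability, resolution, base_rate * (1 - base_rate)

# Hypothetical answers: the forecaster says 60%/40% but the real hit rates are 80%/20%,
# so they are underconfident (nonzero reliability) while still showing some resolution.
probs    = [0.6, 0.6, 0.6, 0.6, 0.6, 0.4, 0.4, 0.4, 0.4, 0.4]
outcomes = [1,   1,   1,   0,   1,   0,   0,   1,   0,   0]
print(brier_decomposition(probs, outcomes))   # -> (0.04, 0.09, 0.25)
```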

Comment author: Lumifer 28 October 2015 02:47:09PM 1 point

You're thinking about this in terms of forecasting. This is not forecasting; this is historical studies.

Consider the hard sciences equivalent: you take, say, some geneticists and try to figure out whether their estimates of which genes cause what are any good by asking them questions about quantum physics to "check how they are calibrated".

Comment author: [deleted] 28 October 2015 04:29:43PM * 0 points

You're thinking about this in terms of forecasting.

No. Bayesian estimate calibration is most often used in forecasting, but it's effective in any domain in which there's uncertainty, including the hard sciences. In fact, calibration training is often done either with numerical trivia questions, using 90% credible intervals, or with true-or-false questions using a single percentage estimate. I recommend checking out "How to Measure Anything" for a more in-depth treatment.
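A rough sketch of how the 90%-credible-interval variant of that training is typically scored; the questions and approximate answer key below are just illustrative, not taken from any particular exercise:

```python
# Sketch: scoring a 90% credible-interval calibration exercise.
# For each question the expert gives a (low, high) interval they are 90% sure
# contains the true value; a well-calibrated expert should capture the truth
# about 9 times out of 10. The intervals below are a made-up respondent.
questions = [
    ("Length of the Nile (km)",               (5000, 7500), 6650),
    ("Year the printing press was invented",  (1300, 1500), 1440),
    ("Boiling point of ethanol (deg C)",      (70, 90),     78),
    ("Population of Canada, 2015 (millions)", (20, 30),     36),
    ("Height of Mount Everest (m)",           (8000, 9500), 8849),
]

hits = sum(low <= truth <= high for _, (low, high), truth in questions)
hit_rate = hits / len(questions)
print(f"{hits}/{len(questions)} intervals contained the truth ({hit_rate:.0%})")
if hit_rate < 0.9:
    print("Overconfident: intervals are too narrow for a 90% target.")
elif hit_rate > 0.9:
    print("Underconfident: intervals are wider than they need to be.")
```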

Consider the hard sciences equivalent: you take, say, some geneticists and try to figure out whether their estimates of which genes cause what are any good by asking them questions about quantum physics to "check how they are calibrated".

Yes, that's essentially how it works, except that you then give them feedback so they can see whether they're over- or underconfident. The questions would have to be relatively easy, though; otherwise all the estimates would cluster around fifty percent and it wouldn't be very useful training for high-resolution answers.

Comment author: Lumifer 28 October 2015 05:04:41PM * 1 point

it's effective in any domain in which there's uncertainty, including the hard sciences

Citation needed.

Not all uncertainty is created equal. If the uncertainty comes from, e.g., measurement limitations, Bayesian calibration is useless.

Note that science is mostly about creating results that can be replicated by anyone regardless of how well or badly calibrated they are.

Yes, that's essentially how it works

That's how you imagine it works, since I don't expect anyone is actually doing this. But let's see: assume we have successfully run the calibration exercises with our group of geneticists. What do you expect them to change in their studies of which genes do what? To get even more specific, let's say we're talking about one of the twin studies where the author tracked a set of twins, tested them on some phenotypic feature X, and is reporting that the twins correlate Y% while an otherwise similar general population correlates Z%. What results would better calibration affect?

Comment author: [deleted] 28 October 2015 05:45:16PM 0 points

Citation needed.

That was an overconfident statement, but for more on how calibration is useful in places other than forecasting, check out "How to Measure Anything", as mentioned in my last comment.

But let's see: assume we have successfully run the calibration exercises with our group of geneticists. What do you expect them to change in their studies of which genes do what? To get even more specific, let's say we're talking about one of the twin studies where the author tracked a set of twins, tested them on some phenotypic feature X, and is reporting that the twins correlate Y% while an otherwise similar general population correlates Z%. What results would better calibration affect?

Once calibrated, they can make estimates of how sure they are of certain hypotheses, and of how likely it is that treatments based on those hypotheses would lead to lives saved. This in turn allows them to quantify which experiment to run next using value-of-information calculations.
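A minimal sketch of the kind of value-of-information calculation meant here, using expected value of perfect information (EVPI); the probability, payoffs, and units are all made up:

```python
# Sketch: expected value of perfect information (EVPI) for deciding whether an
# experiment that would settle hypothesis H is worth running. Numbers are invented.
p_h = 0.7             # calibrated expert's probability that hypothesis H is true
value_if_true = 100   # payoff (arbitrary units) of acting on H if H is true
value_if_false = -40  # payoff of acting on H if H is false
value_no_action = 0   # payoff of not acting

# Best we can do with current information:
ev_act = p_h * value_if_true + (1 - p_h) * value_if_false
ev_now = max(ev_act, value_no_action)

# With perfect information we would act only when H turns out to be true:
ev_perfect = (p_h * max(value_if_true, value_no_action)
              + (1 - p_h) * max(value_if_false, value_no_action))

evpi = ev_perfect - ev_now
print(f"EV now: {ev_now}, EV with perfect info: {ev_perfect}, EVPI: {evpi}")
# An experiment resolving H is worth running only if it costs less than ~EVPI.
```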

Furthermore, by surveying many of these calibrated genetics experts and then extremizing their results, you can get an idea of how likely certain hypotheses are to turn out to be correct.
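One common way to extremize a pooled probability is to scale its log-odds away from even; the exponent and the expert estimates below are assumptions, purely for illustration:

```python
# Sketch: aggregating several experts' probabilities and extremizing the result.
# Extremizing pushes the average away from 0.5, compensating for each expert
# holding only part of the total evidence. Exponent and inputs are made up.
import math

def extremize(p, a=2.5):
    """Shift probability p away from 0.5 by scaling its log-odds by a."""
    log_odds = math.log(p / (1 - p))
    return 1 / (1 + math.exp(-a * log_odds))

expert_probs = [0.65, 0.7, 0.6, 0.75, 0.7]   # hypothetical calibrated estimates
pooled = sum(expert_probs) / len(expert_probs)
print(f"simple average: {pooled:.2f}, extremized: {extremize(pooled):.2f}")
```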

Comment author: Lumifer 28 October 2015 05:59:10PM * 0 points

Once calibrated, they can make estimates of how sure they are of certain hypotheses

I don't know if you read scientific papers, but they don't "make estimates of how sure they are of certain hypotheses". They present the data and talk about the conclusions and implications that follow from the data presented. The potential hypotheses are evaluated on the basis of data, not on the basis of how well calibrated a particular researcher feels.

Calibration is good for guesstimates; it's not particularly valuable for actual research.

how likely it is that treatments based on those hypotheses would lead to lives saved ...

That's forecasting. Remember, we're not talking about forecasting.

Comment author: [deleted] 28 October 2015 06:12:26PM 0 points

"I don't know if you read scientific papers, but they don't "make estimates on how sure they are of certain hypotheses". They present the data and talk about the conclusions and implications that follow from the data presented. The potential hypotheses are evaluated on the basis of data, not on the basis of how well-calibrated does a particular researcher feel.

I'm not really sure how to answer this because I think you misunderstand calibration.

Science moves forward through something called scientific consensus. How does scientific consensus work right now? Well, we just kind of use guesswork. Expert calibration is a more useful way to understand what the scientific consensus actually is.

That's forecasting. Remember, we're not talking about forecasting.

No, it's a decision model. The decision model uses a forecast "How many lives can be saved", but it also uses calibration of known data "Based on the data you have, how sure are you that this particular fact is true".

Comment author: Lumifer 28 October 2015 06:37:03PM 1 point

Science moves forward through something called scientific consensus.

No. This is absolutely false. Science moves forward by figuring out better and better how reality works. Consensus is really irrelevant to the process. The ultimate arbiter is reality, regardless of what a collection of people with advanced degrees can agree on.

The decision model uses a forecast "How many lives can be saved", but it also uses calibration of known data "Based on the data you have, how sure are you that this particular fact is true".

That has nothing to do with calibration. "How many lives can be saved" is properly called a point forecast, which provides an estimate of the center of the distribution. Point forecasts are very popular but also limited: a much more useful forecast would come with an expected error and, ideally, would specify the shape of the distribution as well.
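A small sketch of that contrast; the figures (1,200 lives, a standard deviation of 300, a normal shape) are assumptions chosen only for illustration:

```python
# Sketch: a point forecast vs. a forecast with an expected error and a stated
# distribution shape. The numbers are invented purely for illustration.
from statistics import NormalDist

point_forecast = 1200                      # "about 1,200 lives saved"
forecast = NormalDist(mu=1200, sigma=300)  # same centre, plus spread and shape

low, high = forecast.inv_cdf(0.05), forecast.inv_cdf(0.95)
print(f"point forecast: {point_forecast}")
print(f"distributional forecast: mean 1200, std 300, 90% interval [{low:.0f}, {high:.0f}]")
print(f"P(more than 1,500 lives saved) = {1 - forecast.cdf(1500):.2f}")
```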

"Based on the data you have, how sure are you that this particular fact is true" is properly a question about the standard error of the estimate and it has nothing to do with subjective beliefs (well-calibrated or not) of the author.

I only care about someone's calibration if I'm asking him to guess. If the answer is "based on the data", it is based on the data and calibration is irrelevant.