MattG comments on The Triumph of Humanity Chart - Less Wrong

23 Post author: Dias 26 October 2015 01:41AM


Comment author: [deleted] 28 October 2015 06:12:26PM 0 points [-]

I don't know if you read scientific papers, but they don't "make estimates on how sure they are of certain hypotheses". They present the data and talk about the conclusions and implications that follow from the data presented. The potential hypotheses are evaluated on the basis of data, not on the basis of how well-calibrated a particular researcher feels.

I'm not really sure how to answer this because I think you misunderstand calibration.

Science moves forward through something called scientific consensus. How does scientific consensus work right now? Well, we just kind of use guesswork. Expert calibration is a more useful way to understand what the scientific consensus actually is.

That's forecasting. Remember, we're not talking about forecasting.

No, it's a decision model. The decision model uses a forecast "How many lives can be saved", but it also uses calibration of known data "Based on the data you have, how sure are you that this particular fact is true".

Comment author: Lumifer 28 October 2015 06:37:03PM 1 point [-]

Science moves forward through something called scientific consensus.

No. This is absolutely false. Science moves forward through being able to figure out better and better how reality works. Consensus is really irrelevant to the process. The ultimate arbiter is reality regardless of what a collection of people with advanced degrees can agree on.

The decision model uses a forecast "How many lives can be saved", but it also uses calibration of known data "Based on the data you have, how sure are you that this particular fact is true".

That has nothing to do with calibration. "How many lives can be saved" is properly called a point forecast which provides an estimate of the center of the distribution. These are very popular but also limited because a much more useful forecast would come with an expected error and, ideally, would specify the shape of the distribution as well.
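Concretely, the difference between a point forecast and a fuller distributional forecast can be sketched in a few lines (the sample numbers are invented for illustration):

```python
import statistics

# Hypothetical outcomes ("lives saved") from comparable past interventions.
samples = [120, 95, 140, 110, 130, 105, 150, 98, 125, 117]

# A point forecast: a single number estimating the center of the distribution.
point_forecast = statistics.mean(samples)

# A richer forecast adds an expected error (here, the standard error of the mean)...
std_error = statistics.stdev(samples) / len(samples) ** 0.5

# ...and, ideally, an interval. Assuming approximate normality,
# a rough 95% interval is mean +/- 1.96 standard errors.
interval = (point_forecast - 1.96 * std_error, point_forecast + 1.96 * std_error)

print(f"point forecast: {point_forecast:.1f}")
print(f"standard error: {std_error:.1f}")
print(f"approx. 95% interval: ({interval[0]:.1f}, {interval[1]:.1f})")
```

The point forecast alone hides how much the estimate could be off; the error and interval make that explicit.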

"Based on the data you have, how sure are you that this particular fact is true" is properly a question about the standard error of the estimate and it has nothing to do with subjective beliefs (well-calibrated or not) of the author.

I only care about someone's calibration if I'm asking him to guess. If the answer is "based on the data", it is based on the data and calibration is irrelevant.

Comment author: passive_fist 28 October 2015 10:26:46PM 0 points [-]

No. This is absolutely false. Science moves forward through being able to figure out better and better how reality works.

While this is completely true, and consensus plays only a minor role in science, it's not true that consensus is irrelevant. Given no other information about a certain hypothesis other than that the majority of scientists believe it to be true, the rational course of action would be to adjust belief in the hypothesis upward. Of course, evidence contradicting the hypothesis would nullify this consensus effect. Even a small amount of evidence trumps a large consensus.

Comment author: [deleted] 28 October 2015 07:37:59PM *  0 points [-]

No. This is absolutely false. Science moves forward through being able to figure out better and better how reality works. Consensus is really irrelevant to the process. The ultimate arbiter is reality regardless of what a collection of people with advanced degrees can agree on.

No, that's the popular conception of science, but unfortunately it's not an oracle that proves reality true or false. What observations and experiments give us are varying levels of evidence that can falsify some hypotheses and point towards the truth of others. We then use human reasoning to put all this evidence together and let humans decide how sure they are of something. If a hypothesis has lots and lots of evidence behind it, it can become a "theory" based on the consensus that there's quite a lot of it and it's really good, and even more evidence that's even better makes that thing a "law". But it's based on a subjective sense of "how good these data are."

"Based on the data you have, how sure are you that this particular fact is true" is properly a question about the standard error of the estimate and it has nothing to do with subjective beliefs (well-calibrated or not) of the author.

Not quite. It also has to do with all the other previous experiments done, your certainty in the model itself, your ideas about how reality works, and a lot of other things.

That has nothing to do with calibration. "How many lives can be saved" is properly called a point forecast which provides an estimate of the center of the distribution. These are very popular but also limited because a much more useful forecast would come with an expected error and, ideally, would specify the shape of the distribution as well.

Yes, ideally this would be a credible interval with an estimated distribution, but even a credible interval assuming a uniform distribution would be very useful for this purpose.

In terms of calibration: the better calibrated someone is, the more sure you can be that if they make 100 estimates, each with a 90% credible interval, around 90 of the true values will fall within the intervals they gave.
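That property can be checked empirically. A minimal sketch, with invented intervals and outcomes:

```python
# Each tuple is a hypothetical 90%-confidence credible interval (low, high)
# plus the true value that was later observed. All numbers are invented.
estimates = [
    (10, 20, 15), (5, 9, 7), (100, 200, 150), (1, 3, 2), (40, 60, 70),
    (8, 12, 11), (30, 50, 45), (2, 6, 4), (90, 110, 95), (14, 18, 25),
]

# Count how often the truth actually fell inside the stated interval.
hits = sum(1 for low, high, truth in estimates if low <= truth <= high)
coverage = hits / len(estimates)

# For a well-calibrated forecaster, empirical coverage converges toward
# the stated confidence level (0.9 here) as the number of estimates grows.
print(f"empirical coverage: {coverage:.0%}")
```

A forecaster whose 90% intervals capture the truth only 60% of the time is overconfident; one who captures it 99% of the time is underconfident.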

I only care about someone's calibration if I'm asking him to guess. If the answer is "based on the data", it is based on the data and calibration is irrelevant.

Well-calibrated people will base their guesses on data; poorly calibrated people will not. Your understanding of calibration isn't in line with research done by Douglas Hubbard, Philip Tetlock, and others who study human judgement.

Comment author: Lumifer 29 October 2015 03:08:53PM *  0 points [-]

that's the popular conception of science

Heh. Do you mean that's a conception of science held by not-too-smart uneducated people? X-)

an oracle that proves reality true or false

Sense make not. Reality is always true.

Speaking generally, you seem to treat science as people asserting certain things and so, to decide on how much to trust them, you need to know how calibrated those people are. That seems very different from my perception of science which is based on people saying "This is so, you can test it yourself if you want".

Under your approach, the goal is achieving consensus. Under my system, the goal is to provide replicability and show that it actually works.

Data does not depend on calibration of particular people.

Comment author: [deleted] 30 October 2015 03:11:16AM *  0 points [-]

This is so, you can test it yourself if you want

Under your approach, the goal is achieving consensus. Under my system, the goal is to provide replicability and show that it actually works.

I think we have to separate two ideas here.

  1. There's the data you get from an experiment

  2. There's the conclusions you can draw from that data.

I would agree that the data does not depend on the calibration of particular people. But the conclusions you get from that data DO need to be calibrated. Furthermore, other scientists may want to do experiments based on those conclusions... their decision to do that will really be based on how likely they think it is that the conclusions are accurate. The process of science is building new conclusions on the basis of those old conclusions - if it's just about gathering the data, you never gain a deeper understanding of reality.

Comment author: Lumifer 30 October 2015 02:47:14PM 0 points [-]

There's the conclusions you can draw from that data.

In the word "conclusions" you conflate two different things which I wish to keep separate.

One of them is subjective opinion/guesstimate/evaluation/conclusion of a person. I agree that the calibration of the person whose opinion we care about is relevant.

The other is objective facts/observations/measurements/conclusions that do not depend on anyone in particular. That's not just "data" from your first point. That's also conclusions that follow from the data in an explicit, non-subjective way. A study can perfectly well come to some conclusions by showing how the data leads to them without depending on anyone's calibration.

The answer to doubts about the first kind of conclusions is "trust me because I know what I'm talking about". The answer to doubts about the second kind of conclusions is "you don't have to trust me, see for yourself".

The process of science is building new conclusions on the basis of those old conclusions

I continue to disagree. In your concept of science the idea of testing against reality is somewhere in the back row. What's important is achieving consensus and being well-calibrated. I don't think this is what science is about.

Comment author: [deleted] 30 October 2015 08:46:05PM *  0 points [-]

In your concept of science the idea of testing against reality is somewhere in the back row. What's important is achieving consensus and being well-calibrated. I don't think this is what science is about.

Let's stop using the word "science" because I don't really care how we define that specific word.

Let's change it instead to "the process of learning things about reality" because that's what I'm talking about. I think it's what you're talking about as well, but traditionally science can also mean "the process of running experiments" - and if we defined it that way, then I'd agree that calibration isn't needed.

The other is objective facts/observations/measurements/conclusions that do not depend on anyone in particular. That's not just "data" from your first point. That's also conclusions that follow from the data in an explicit, non-subjective way.

I can't think of an example where conclusions are proven true from data in a specific, non-subjective way. Science works on falsification - you can prove things false in a specific, non-subjective way (assuming you trust completely in the protocol and the people running it), but you can't prove things true, because there's still ANOTHER experiment someone could run in different conditions that could theoretically falsify your current hypothesis. Furthermore, you may get the correlation right, but may misunderstand the causation.

Don't get too caught up on this example, because it's just a silly illustration of a general point, but say you made a hypothesis that "An object falling due to gravity accelerates at a rate of 9.8 meters/second squared". You could run many experiments with data that fit your hypothesis, but it's always possible that an alternative hypothesis is true: "Objects accelerate at 9.8 meters/second squared - except on Tuesdays when it's a full moon". Unless you had specifically tested that scenario, that hypothesis has some infinitesimal chance of being right - and the thing is, there's no way to test ALL of the potential scenarios.

That's where calibration comes in - you don't have certainty that objects accelerate at that rate due to gravity in every situation, but as you prove it in more and more situations, you (and the scientific community) become more and more certain that it's the correct hypothesis. But even then, someone like Einstein can come along, find some random edge case involving the speed of light where the hypothesis doesn't hold, and present a better one.
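This pattern - confidence growing with each confirmation but never reaching certainty - can be sketched as a Bayesian update. The prior and likelihood ratio below are invented for illustration:

```python
# Each confirming experiment multiplies the odds in favour of the hypothesis
# by a likelihood ratio > 1, but the probability never quite reaches 1.
prior = 0.5           # hypothetical starting credence in "g = 9.8 m/s^2"
likelihood_ratio = 3  # hypothetical: confirming data is 3x likelier if true

odds = prior / (1 - prior)
for _ in range(10):   # ten confirming experiments
    odds *= likelihood_ratio

posterior = odds / (1 + odds)
print(f"credence after 10 confirmations: {posterior:.6f}")  # close to, but below, 1
```

A single strongly disconfirming result (a likelihood ratio far below 1, like Einstein's edge case) would knock the odds back down, which is why even a near-unanimous credence stays revisable.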

Comment author: Lumifer 30 October 2015 08:51:58PM *  0 points [-]

Let's change it instead to "the process of learning things about reality" because that's what I'm talking about.

"The process of learning things about reality" is much MUCH larger and more varied than science.

That ain't where goalposts used to be :-/

Comment author: [deleted] 30 October 2015 09:09:38PM *  0 points [-]

We just had different goal posts. You learned science as "running an experiment" - I learned science as "Doing background research, determining likely outcomes, running experiments, sharing results back with the community". That's why I tabooed the word, to make sure we were on the same page.

Are we in agreement about the basic concept, if we agree that we have two different definitions of science?

Comment author: Lumifer 01 November 2015 10:35:33PM 1 point [-]

I learned science as...

Do tell. Where and how did you "learn science" this way?

Are we in agreement about the basic concept

What is the "basic concept"?

Comment author: gjm 29 October 2015 08:10:30PM 0 points [-]

That seems very different from my perception of science

Aren't both these views of science oversimplifications? I mean, in practice most of the people making use of the work scientists have done aren't really testing the scientists' work for themselves (they're kinda doing it implicitly by making use of that work, but the whole point is that they are confident it's not going to fail).

Reality certainly is the ultimate arbiter, but regrettably we don't get to ask Reality directly whether our theories are correct; all we can do is test them somewhat (in some cases it's not even clear how to begin doing that; I'm looking at you, string theory) and that testing is done by fallible people using fallible equipment, and in many cases it's very difficult to do in a way that actually lets you separate the signal from the noise, and most of us aren't well placed to evaluate how fallibly it's been done in any given case, and in practice usually we have to fall back on something like "scientific consensus" after all.

I think you and MattG are at cross purposes about the role he sees for calibration in science. The process by which actual primary scientific work becomes useful to people who aren't specialists in the field goes something like this:

  • Alice does some work where she exposes laboratory rats to bad journalism and measures the rate at which they get cancer. (So do Alex, Amanda, Aloysius, et Al.)
    • She forms some opinions about this stuff; we could, in LW style, represent these opinions as some kind of probability distribution over relationships between bad journalism and cancer. Both her point estimates and her estimates of the distribution around them are strongly constrained by the work she's done, but of course there are probably things she's failed to think of. If she's sensible, her opinions will include explicit allowance for having (maybe) made mistakes and missed things. Such considerations will probably not appear explicitly in the articles she publishes.
  • Bob talks to Alice (and Alex, Amanda, ...) or reads the articles they publish.
    • As a result, Bob too forms opinions about this stuff, which again we can represent in probabilistic terms. Bob's knowledge of the actual work is less direct than Alice's, and his opinions are going to depend not only on Alice's observed risk ratios and sample sizes and p-values and whatnot but also on how much he trusts Alice (having read her papers) to have done good work. And of course he will be trying to integrate what he learns from Alice with what he learns from Alex, Amanda et Al.
    • Bob may actually also be a primary researcher in the field, but here we're considering him in his role as someone who has looked at the primary researchers' work and drawn some conclusions.
  • Bob and Bill and Beth and Bert and all the other journo-oncologists (some of whom are in fact Alice and Alex etc.) all read more or less the same articles, and talk to one another at conferences, and write articles commenting on other people's work. Over the next few years, journo-oncological opinion converges to a rough consensus that reading the Daily Mail probably does cause cancer, that further work might pin that down further, but that the field has higher research priorities.
  • Carol, a non-specialist who wants to know whether reading the Daily Mail causes cancer, talks to some experts in the field or reads a popular book on the subject or even gets into the journals and finds a review article or two.
    • As a result, Carol also forms opinions about journo-oncology. If she has the necessary skills she may also look cursorily at some of the primary literature and get some idea of how rigorous that work is, how big the sample sizes are, whether the research was funded by Rupert Murdoch, etc., but on the whole she's dependent on what Bob and the other Bs tell her. So her opinions are going to be mostly shaped by what Bob says and what she thinks of Bob's accuracy on this point.
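The kind of trust-weighted aggregation Bob does over Alice, Alex, and Aloysius could be caricatured numerically; the risk ratios and trust weights below are invented:

```python
# Bob aggregates reported risk ratios from several labs, weighting each
# by how much he trusts the lab's work (all numbers invented).
reports = {
    "Alice": (1.3, 0.9),     # (reported risk ratio, Bob's trust weight)
    "Alex": (1.1, 0.4),      # "shoddy operation", weighted down
    "Aloysius": (0.8, 0.2),  # funding conflict, weighted down further
}

weighted_sum = sum(rr * w for rr, w in reports.values())
total_weight = sum(w for _, w in reports.values())
bobs_estimate = weighted_sum / total_weight

print(f"Bob's trust-weighted risk ratio: {bobs_estimate:.2f}")
```

The point of the toy model is only that Bob's number depends on subjective trust weights as much as on the published data - which is exactly where calibration of those subjective judgements would matter.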

Calibration (in the sense we're talking about here) isn't of much relevance to Alice when she's doing the primary research. She will report that the Daily Mail is positively associated with brain cancer in rats (RR=1.3, n=50, CI=[1.1,1.5], p=0.01, etc., etc., etc.) and that's more or less it. (I take it that's the point you've been making.)

But Bob's opinion about the carcinogenicity of the Daily Mail (having read Alice's papers) is an altogether slipperier thing; and the opinion to which he and Beth and the others converge is slipperier still. It'll depend on their assessment of how likely it is that Alice made a mistake, how likely it is that Aloysius's results are fraudulent given that he took a large grant from the DMG Media Propaganda Fund, etc.; and on how strongly Bob is influenced when he hears Bill say "... and of course we all know what a shoddy operation Alex's lab is."

It is in these later stages that better calibration could be valuable, and that I think Matt would like to see more explicit reference to it. He would like Bob and Bill and Beth and the rest to be explicit about what they think and why and how confidently, and he would like the consensus-generating process to involve weighing people's opinions more or less heavily when they are known to be better or worse at the sort of subjective judgement required to decide how completely to mistrust Aloysius because of his funding.

I'm not terribly convinced that that would actually help much, for what it's worth. But I don't think what Matt's saying is invalidated by pointing out that Alice's publications don't talk about (this kind of) calibration.

Comment author: Lumifer 29 October 2015 09:14:51PM *  1 point [-]

I mean, in practice most of the people making use of the work scientists have done aren't really testing the scientists' work for themselves (they're kinda doing it implicitly by making use of that work, but the whole point is that they are confident it's not going to fail).

First, I think the "implicitly" part is very important. That glowing gizmo with melted-sand innards in front of me works. By working it verifies, very directly, a whole lot of science.

And "working in practice" is what leads to confidence, not vice versa. When a sailor took the first GPS unit on a cruise, he didn't say "Oh, science says it's going to work, so that's all going to be fine". He took it as a secondary or, probably, a tertiary navigation device. Now, after years of working in practice sailors take the GPS as a primary device and most often, a second GPS as a secondary.

Note, by the way, that we want useful science and useful science leads to practical technologies that we test and use all the time.

Calibration (in the sense we're talking about here) isn't of much relevance to Alice when she's doing the primary research.

Oh, good, we agree.

But Bob's opinion ... is an altogether slipperier thing; and the opinion to which he and Beth and the others converge is slipperier still.

Sure, that's fine. Bob and Beth are not scientists and are not doing science. Allow me to quote myself: "Calibration is good for guesstimates, it's not particularly valuable for actual research." Bob and Bill and Beth and Bert are not doing actual research. They are trying to use published results to form some opinions, some guesstimates and, as I agree, their calibration matters for the quality of their guesstimates. But, again, that's not science.

Comment author: gjm 30 October 2015 03:54:36AM 0 points [-]

Bob and Beth are not scientists and are not doing science.

Bob and Beth are scientists (didn't I make it clear enough in my gedankenexperiment that they are intended to be journo-oncologists just as much as Alice et al, it's just that we're considering them in a different role here?). And they are forming their opinions in the course of their professional activities. Doing science is not only about doing experiments and working out knotty theoretical problems; when two scientists discuss their work, they are doing science; when a scientist attends a conference presentation given by another, they are doing science; when a scientist sits and thinks about what might be a good problem to attack next, they are doing science.

Doing actual research is a more "central" scientific activity than those other things. But the other things are real, they are things scientists actually do, they are things scientists need to do, and I don't see any reason to deny that doing them is part of how science (the whole collective enterprise) functions.

Comment author: Lumifer 30 October 2015 02:53:52PM *  1 point [-]

when a scientist sits and thinks about what might be a good problem to attack next, they are doing science.

Sure, and you've expanded the definition of "doing science" into uselessness. "Doodling on paper napkins is doing science!" -- well, yeah, if you want it so, what next?

I'm not talking about what large variety of things scientists do in the course of their professional lives. I'm talking about the core concept of science and whether it, as MattG believes, "moves forward through something called scientific consensus".

In particular, I would like to distinguish between "doing science" (discovering how the world works) and "applying science" (changing the world based on your beliefs about how it works).

Comment author: gjm 30 October 2015 09:26:46PM 1 point [-]

the core concept of science

Let's distinguish two things. (1) The core activities of science are, for sure, things like doing carefully designed experiments and applying mathematics to make quantitative predictions based on precisely formulated theories. These activities, indeed, don't proceed by consensus, but no one claimed otherwise; even to ask whether they do is a type error. (2) How scientific knowledge actually advances. This is not only a matter of #1; if we had nothing but #1 then science wouldn't advance at all, because in order for science to advance each scientist's work needs to be based in, or at least aware of, the work of their predecessors. And #2, as it happens, does involve something like consensus, and it's reasonable to wonder whether being more explicitly and carefully rational about #2 would help science to advance more effectively. And that is what (AIUI) MattG is proposing.

Comment author: Lumifer 01 November 2015 10:32:41PM 1 point [-]

but no one claimed otherwise

I do believe MattG claimed otherwise. At least that was the most straightforward reading of what he said.

in order for science to advance each scientist's work needs to be based in, or at least aware of, the work of their predecessors.

That is true, the scientists do trust what's considered "solved", but that trust is conditional. One little ugly fact can blow up a lot of consensus sky-high.

I think one of the core issues here is resistance to cargo cult science. Consensus is dangerous because it enables cargo cults; the sceptical "show me" attitude is invaluable here.

more explicitly and carefully rational about #2 would help science to advance more effectively

What do you mean by "carefully rational"? How is that better than the baseline "show me"?

Comment author: RichardKennaway 31 October 2015 06:58:24AM 1 point [-]

Consensus is the result, not the means.


But this thread has drifted far from reality. It began with Lumifer's comment about estimates of historical poverty:

The charts posted claim to reflect the entire world and they go back to early XIX century. Whole-world data at that point is nothing but a collection of guesstimates.

To which MattG replied:

My understanding is you basically get a bunch of economists in the room to break down the problem into relevant parts, then get a bunch of historians in the room, calibrate them, get them to give credible intervals for the relevant data, and plug it all in to the model.

Lumifer:

Is this how you think it works or is this how you think it should work?

MattG:

It's how I think it works.

And the conversation drifted into the stratosphere with no further discussion of where those numbers actually came from.