MattG comments on The Triumph of Humanity Chart - Less Wrong Discussion
Comments (77)
No, that's the popular conception of science, but unfortunately it's not an oracle that proves reality true or false. What observation and experiments give us are varying levels of evidence that can falsify some hypotheses and point towards the truth of other hypotheses. We then use human reasoning to put all this evidence together and let humans decide how sure they are of something. If they have lots and lots of evidence, that thing can become a "theory" based on the consensus that there's quite a lot of it and it's really good, and even more evidence that's even better makes that thing a "law". But it's based on a subjective sense of "how good these data are."
Not quite. It also has to do with all the other previous experiments done, your certainty in the model itself, your ideas about how reality works, and a lot of other things.
Yes, ideally this would be a credible interval with an estimated distribution, but even a credible interval assuming a uniform distribution would be very useful for this purpose.
In terms of calibration: if someone gives credible intervals with 90% confidence, then the better calibrated they are, the more sure you can be that if they make 100 such estimates, around 90 of the true values will lie within the intervals they gave.
Well calibrated people will base their guesses on data, poorly calibrated people will not. Your understanding of calibration isn't in line with research done by Douglas Hubbard, Philip Tetlock, and others who research human judgement.
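To make the 90% interval claim concrete, here is a minimal sketch of how one might check it. The function name and the sample numbers are made up for illustration; they don't come from Hubbard's or Tetlock's work.

```python
def interval_hit_rate(intervals, truths):
    """Return the fraction of true values that fall inside the stated intervals.

    intervals: list of (low, high) pairs an estimator gave at some confidence level
    truths:    list of the values that turned out to be correct
    """
    if len(intervals) != len(truths):
        raise ValueError("need exactly one true value per interval")
    if not truths:
        raise ValueError("need at least one estimate")
    hits = sum(low <= truth <= high for (low, high), truth in zip(intervals, truths))
    return hits / len(truths)


# Hypothetical data: ten "90% confident" intervals and the ten true values.
stated_intervals = [(0, 10)] * 10
true_values = [5, 3, 7, 1, 9, 2, 8, 4, 6, 11]  # only the last one falls outside

hit_rate = interval_hit_rate(stated_intervals, true_values)
print(f"hit rate: {hit_rate:.0%} (stated confidence: 90%)")
```

A hit rate well below the stated confidence suggests overconfidence, and one well above it suggests the intervals are too wide. With only ten estimates the check is very noisy, so in practice you'd want many more.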
Heh. Do you mean that's a conception of science held by not-too-smart uneducated people? X-)
Sense make not. Reality is always true.
Speaking generally, you seem to treat science as people asserting certain things and so, to decide on how much to trust them, you need to know how calibrated those people are. That seems very different from my perception of science which is based on people saying "This is so, you can test it yourself if you want".
Under your approach, the goal is achieving consensus. Under my system, the goal is to provide replicability and show that it actually works.
Data does not depend on calibration of particular people.
I think we have to separate two ideas here.
There's the data you get from an experiment
There's the conclusions you can draw from that data.
I would agree that the data does not depend on the calibration of particular people. But the conclusions you get from that data DO need to be calibrated. Furthermore, other scientists may want to do experiments based on those conclusions... their decision to do that will really be based on how likely they think the conclusions are accurate. The process of science is building new conclusions on the basis of those old conclusions - if it's just about gathering the data, you never gain a deeper understanding of reality.
In the word "conclusions" you conflate two different things which I wish to keep separate.
One of them is subjective opinion/guesstimate/evaluation/conclusion of a person. I agree that the calibration of the person whose opinion we care about is relevant.
The other is objective facts/observations/measurements/conclusions that do not depend on anyone in particular. That's not just "data" from your first point. That's also conclusions that follow from the data in an explicit, non-subjective way. A study can perfectly well come to some conclusions by showing how the data leads to them without depending on anyone's calibration.
The answer to doubts about the first kind of conclusions is "trust me because I know what I'm talking about". The answer to doubts about the second kind of conclusions is "you don't have to trust me, see for yourself".
I continue to disagree. In your concept of science the idea of testing against reality is somewhere in the back row. What's important is achieving consensus and being well-calibrated. I don't think this is what science is about.
Let's stop using the word "science" because I don't really care how we define that specific word.
Let's change it instead to "the process of learning things about reality" because that's what I'm talking about. I think it's what you're talking about as well, but traditionally science can also mean "the process of running experiments" - and if we defined it that way, then I'd agree that calibration isn't needed.
I can't think of an example where conclusions are proven true from data in a specific, non-subjective way. Science works on falsification - you can prove things false in a specific, non-subjective way (assuming you trust completely in the protocol and the people running it), but you can't prove things true, because there's still ANOTHER experiment someone could run in different conditions that could theoretically falsify your current hypothesis. Furthermore, you may get the correlation right, but may misunderstand the causation.
Don't get too caught up on this example, because it's just a silly illustration of a general point, but say you made a hypothesis that "An object falling due to gravity accelerates at a rate of 9.8 meters/second squared". You could run many experiments with data that fit your hypothesis, but it's always possible that an alternative hypothesis is the right one, such as "Objects accelerate at 9.8 meters/second squared - except on Tuesdays when it's a full moon". Unless you had specifically tested that scenario, that hypothesis has some infinitesimal chance of being right - and the thing is, there's no way to test ALL of the potential scenarios.
That's where calibration comes in - you don't have certainty that objects accelerate at that rate due to gravity in every situation, but as you prove it in more and more situations, you (and the scientific community) become more and more certain that it's the correct hypothesis. But even then, someone like Einstein can come along, find some random edge case involving the speed of light where the hypothesis doesn't hold, and present a better one.
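One way to picture "more and more certain, but never certain" is a toy Bayesian update. This is only a sketch under made-up assumptions: the prior, and the chance that a test would still pass if the hypothesis were false, are illustrative numbers, and the function name is hypothetical.

```python
def posterior_after_passes(prior, pass_prob_if_false, n_passes):
    """Posterior probability of a hypothesis after n independent passed tests.

    Assumes a test always passes if the hypothesis is true, and passes with
    probability pass_prob_if_false (strictly between 0 and 1) if it is false.
    """
    if not 0 < prior < 1:
        raise ValueError("prior must be strictly between 0 and 1")
    if not 0 < pass_prob_if_false < 1:
        raise ValueError("pass_prob_if_false must be strictly between 0 and 1")
    # Bayes' rule: P(H | passes) = P(H) / (P(H) + P(not H) * P(passes | not H))
    evidence_if_false = (1 - prior) * pass_prob_if_false ** n_passes
    return prior / (prior + evidence_if_false)


# Illustrative assumptions: start at 50/50, and each experiment would have
# caught a false hypothesis only 10% of the time.
for n in (0, 1, 10, 50):
    print(n, round(posterior_after_passes(0.5, 0.9, n), 4))
```

The posterior climbs towards 1 as passed tests pile up but never reaches it, which leaves room for an untested edge case to overturn the hypothesis later.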
"The process of learning things about reality" is much MUCH larger and more varied than science.
That ain't where goalposts used to be :-/
We just had different goal posts. You learned science as "running an experiment" - I learned science as "Doing background research, determining likely outcomes, running experiments, sharing results back with the community". That's why I tabooed the word, to make sure we were on the same page.
Are we in agreement about the basic concept, if we agree that we have two different definitions of science?
Do tell. Where and how did you "learn science" this way?
What is the "basic concept"?
Throughout elementary and middle school (early education here in the US) through textbooks with diagrams like this
That experiments can give you mostly non-subjective data about one experiment, but to draw broader conclusions about how the world works you have to combine the data from many experiments into a subjective estimate about how likely a hypothesis is.
That does not strike me as an adequate basis for deciding what science is or is not.
So, are you saying that the outcome of science is a set of subjective estimates that most people agree with?
Aren't both these views of science oversimplifications? I mean, in practice most of the people making use of the work scientists have done aren't really testing the scientists' work for themselves (they're kinda doing it implicitly by making use of that work, but the whole point is that they are confident it's not going to fail).
Reality certainly is the ultimate arbiter, but regrettably we don't get to ask Reality directly whether our theories are correct; all we can do is test them somewhat (in some cases it's not even clear how to begin doing that; I'm looking at you, string theory) and that testing is done by fallible people using fallible equipment, and in many cases it's very difficult to do in a way that actually lets you separate the signal from the noise, and most of us aren't well placed to evaluate how fallibly it's been done in any given case, and in practice usually we have to fall back on something like "scientific consensus" after all.
I think you and MattG are at cross purposes about the role he sees for calibration in science. The process by which actual primary scientific work becomes useful to people who aren't specialists in the field goes something like this:
Calibration (in the sense we're talking about here) isn't of much relevance to Alice when she's doing the primary research. She will report that the Daily Mail is positively associated with brain cancer in rats (RR=1.3, n=50, CI=[1.1,1.5], p=0.01, etc., etc., etc.) and that's more or less it. (I take it that's the point you've been making.)
But Bob's opinion about the carcinogenicity of the Daily Mail (having read Alice's papers) is an altogether slipperier thing; and the opinion to which he and Beth and the others converge is slipperier still. It'll depend on their assessment of how likely it is that Alice made a mistake, how likely it is that Aloysius's results are fraudulent given that he took a large grant from the DMG Media Propaganda Fund, etc.; and on how strongly Bob is influenced when he hears Bill say "... and of course we all know what a shoddy operation Alex's lab is."
It is in these later stages that better calibration could be valuable, and that I think Matt would like to see more explicit reference to it. He would like Bob and Bill and Beth and the rest to be explicit about what they think and why and how confidently, and he would like the consensus-generating process to involve weighing people's opinions more or less heavily when they are known to be better or worse at the sort of subjective judgement required to decide how completely to mistrust Aloysius because of his funding.
I'm not terribly convinced that that would actually help much, for what it's worth. But I don't think what Matt's saying is invalidated by pointing out that Alice's publications don't talk about (this kind of) calibration.
First, I think the "implicitly" part is very important. That glowing gizmo with melted-sand innards in front of me works. By working it verifies, very directly, a whole lot of science.
And "working in practice" is what leads to confidence, not vice versa. When a sailor took the first GPS unit on a cruise, he didn't say "Oh, science says it's going to work, so that's all going to be fine". He took it as a secondary or, probably, a tertiary navigation device. Now, after years of working in practice sailors take the GPS as a primary device and most often, a second GPS as a secondary.
Note, by the way, that we want useful science and useful science leads to practical technologies that we test and use all the time.
Oh, good, we agree.
Sure, that's fine. Bob and Beth are not scientists and are not doing science. Allow me to quote myself: "Calibration is good for guesstimates, it's not particularly valuable for actual research." Bob and Bill and Beth and Bert are not doing actual research. They are trying to use published results to form some opinions, some guesstimates and, as I agree, their calibration matters for the quality of their guesstimates. But, again, that's not science.
Bob and Beth are scientists (didn't I make it clear enough in my gedankenexperiment that they are intended to be journo-oncologists just as much as Alice et al, it's just that we're considering them in a different role here?). And they are forming their opinions in the course of their professional activities. Doing science is not only about doing experiments and working out knotty theoretical problems; when two scientists discuss their work, they are doing science; when a scientist attends a conference presentation given by another, they are doing science; when a scientist sits and thinks about what might be a good problem to attack next, they are doing science.
Doing actual research is a more "central" scientific activity than those other things. But the other things are real, they are things scientists actually do, they are things scientists need to do, and I don't see any reason to deny that doing them is part of how science (the whole collective enterprise) functions.
Sure, and you've expanded the definition of "doing science" into uselessness. "Doodling on paper napkins is doing science!" -- well, yeah, if you want it so, what next?
I'm not talking about the large variety of things scientists do in the course of their professional lives. I'm talking about the core concept of science and whether it, as MattG believes, "moves forward through something called scientific consensus".
In particular, I would like to distinguish between "doing science" (discovering how the world works) and "applying science" (changing the world based on your beliefs about how it works).
Let's distinguish two things. (1) The core activities of science are, for sure, things like doing carefully designed experiments and applying mathematics to make quantitative predictions based on precisely formulated theories. These activities, indeed, don't proceed by consensus, but no one claimed otherwise; even to ask whether they do is a type error. (2) How scientific knowledge actually advances. This is not only a matter of #1; if we had nothing but #1 then science wouldn't advance at all, because in order for science to advance each scientist's work needs to be based in, or at least aware of, the work of their predecessors. And #2, as it happens, does involve something like consensus, and it's reasonable to wonder whether being more explicitly and carefully rational about #2 would help science to advance more effectively. And that is what (AIUI) MattG is proposing.
I do believe MattG claimed otherwise. At least that was the most straightforward reading of what he said.
That is true, the scientists do trust what's considered "solved", but that trust is conditional. One little ugly fact can blow up a lot of consensus sky-high.
I think one of the core issues here is resistance to cargo cult science. Consensus is dangerous because it enables cargo cults, but the sceptical "show me" attitude is invaluable here.
What do you mean by "carefully rational"? How is that better than the baseline "show me"?
I think you can only reach that conclusion by applying your preferred definition of "science" to MattG's statement about science. That's a mistake unless you know he's not using a substantially different definition.
Yes, of course. (Did anyone suggest it's not?)
For the avoidance of doubt, I am not for a minute suggesting blind or unquestioning trust of scientific consensus; at least, not for scientists. (It is possible that below some threshold of scientific competence blind trust is in fact the best available strategy.)
I mean what happens if the Bobs in my thought experiment, rather than arriving at their opinions informally and qualitatively, think explicitly about what they've heard and read and about how much evidence each thing they've heard or read provides, and determine their own opinions by deliberate reflection on that (not necessarily by actual calculation, but with that always available in cases of doubt).
This might well not be an improvement (e.g., because System 1 has hardware support that System 2 doesn't) but it's not obvious that it isn't.
"Carefully rational" isn't a proposed replacement for "show me", it's a proposed replacement for things like "I've read about this in a few papers so I'll assume it's true" (which probably doesn't get said explicitly very often, of course).
"Show me" is always there (usually in the background) as an option. Most scientists, most of the time, don't go banging on other scientists' lab doors demanding further evidence for what's in their papers. Most scientists, most of the time, don't attempt to replicate other scientists' results before (at least provisionally) accepting them.
(One reason is that replication and door-banging take effort. This is also an argument against the more explicit "carefully rational" approach I think MattG is advocating.)
I fail to discern your point. There are a lot of clarifications, adjustments, and edge-nibbling, but what is it that you want to say?
Consensus is the result, not the means.
But this thread has drifted far from reality. It began with Lumifer's comment about estimates of historical poverty:
To which MattG replied:
Lumifer:
MattG:
And the conversation drifted into the stratosphere with no further discussion of where those numbers actually came from.