komponisto comments on The Irrationality Game - Less Wrong

38 Post author: Will_Newsome 03 October 2010 02:43AM


Comment author: Vladimir_M 05 October 2010 10:42:20PM *  3 points [-]

That’s an excellent list of questions! It will help me greatly to systematize my thinking on the topic.

Before replying to the specific items you list, perhaps I should first state the general position I’m coming from, which motivates me to get into discussions of this sort. Namely, it is my firm belief that when we look at the present state of human knowledge, one of the principal sources of confusion, nonsense, and pseudoscience is physics envy, which leads people in all sorts of fields to construct nonsensical edifices of numerology and then pretend, consciously or not, that they’ve reached some sort of exact scientific insight. Therefore, I believe that whenever one encounters numbers of any sort that look even slightly suspicious, those numbers should be considered guilty until proven otherwise -- and this entire business with subjective probability estimates for common-sense beliefs doesn’t come even close to clearing that bar for me.

Now to reply to your list.


(1) Confession of frequentism. The only sensible numerical probabilities are those related to frequencies, i.e. either frequencies of outcomes of repeated experiments, or probabilities derived from them. (Creative drawing of reference-class boundaries may be permitted.) In particular, prior probabilities are meaningless.

(2) Any sensible numbers must be produced using procedures that ultimately don't include any numerical parameters (except perhaps small integers like 2, 3, 4). Any number that isn't the result of such a procedure is labeled arbitrary, and therefore meaningless. (Observation and measurement, of course, count as permitted procedures. Admittedly arbitrary steps, like choosing units of measurement, are also permitted.)

My answer to (1) follows from my opinion about (2).

In my view, a number that gives any information about the real world must ultimately refer, either directly or via some calculation, to something that can be measured or counted (at least in principle, perhaps using a thought-experiment). This doesn’t mean that all sensible numbers have to be derived from concrete empirical measurements; they can also follow from common-sense insight and generalization. For example, reading about Newton’s theory leads to the common-sense insight that it’s a very close approximation of reality under certain assumptions. Now, if we look at the gravity formula F=m1*m2/r^2 (in units set so that G=1), the exponent 2 in the denominator is not a product of any concrete measurement, but a generalization from common sense. Yet what makes it sensible is that it ultimately refers to measurable reality via a well-defined formula: measure the force between two bodies of known masses at distance r, and you’ll get log(m1*m2/F)/log(r) = 2.
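As a toy sketch of this verification procedure (all values invented for illustration), one can simulate a "measurement" of the force and then recover the exponent from the data exactly as described:

```python
import math

# Hypothetical "measurement": simulate the force between two bodies
# using Newton's law with G = 1, standing in for a real experiment.
def measured_force(m1, m2, r):
    return m1 * m2 / r**2

m1, m2, r = 5.0, 3.0, 4.0
F = measured_force(m1, m2, r)

# Recover the exponent from the measured quantities:
exponent = math.log(m1 * m2 / F) / math.log(r)
print(exponent)  # 2.0 (up to floating-point error)
```

The point is only that the number 2 here cashes out into a measurable quantity; nothing in the snippet depends on the particular masses or distance chosen.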

Now, what can we make out of probabilities from this viewpoint? I honestly can’t think of any sensible non-frequentist answer to this question. Subjectivist Bayesian phrases such as “the degree of belief” sound to me entirely ghostlike unless this “degree” is verifiable via some frequentist practical test, at least in principle. In this sense, I do confess frequentism. (Though I don’t wish to subscribe to all the related baggage from various controversies in statistics, much of which is frankly over my head.)

(3) Degrees of confidence shall be expressed without reflexive thinking about them. Trying to establish a fixed scale of confidence levels (like impossible - very unlikely - unlikely - possible - likely - very likely - almost certain - certain), or actively trying to compare degrees of confidence in different beliefs, is cheating, since such scales can then be converted into numbers using a non-numerical procedure.

That depends on the concrete problem under consideration, and on the thinker who is considering it. The thinker’s brain produces an answer alongside a more or less fuzzy feeling of confidence, and human language can express these feelings with about the same level of fuzziness as that signal. It can be sensible to compare intuitive confidence levels, if such a comparison can be put to a practical (i.e. frequentist) test. Eight ordered intuitive levels of certainty might perhaps be too many, but with, say, four levels, I could produce four lists of predictions labeled “almost impossible,” “unlikely,” “likely,” and “almost certain,” such that common sense would tell us that, with near-certainty, those in each subsequent list would turn out to be true in ever greater proportion.
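A minimal sketch of such a frequentist test of ordered confidence labels, using invented outcome data, might look like this:

```python
# Toy illustration (hypothetical data): predictions grouped under four
# ordered confidence labels, tested frequentist-style by checking that
# the proportion that came true rises with each level.
predictions = {
    "almost impossible": [False, False, False, False, False],  # 0% true
    "unlikely":          [False, True, False, True, False],    # 40% true
    "likely":            [True, True, False, True, False],     # 60% true
    "almost certain":    [True, True, True, True, False],      # 80% true
}

rates = [sum(v) / len(v) for v in predictions.values()]
print(rates)                   # [0.0, 0.4, 0.6, 0.8]
print(rates == sorted(rates))  # True: each level is truer than the last
```

Note that the test only checks the ordering of the levels, not any particular numerical probability attached to them, which matches the position argued above.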

If I wish to express these probabilities as numbers, however, this is not a legitimate step unless the resulting numbers can be justified in the sense discussed above under (1) and (2). This requires justification both in the sense of defining what aspect of reality they refer to (where frequentism seems like the only answer), and guaranteeing that they will be accurate under empirical tests. If they can be so justified, then we say that the intuitive estimate is “well-calibrated.” However, calibration is usually not possible in practice, and there are only two major exceptions.

The first possible path towards accurate calibration is when the same person performs essentially the same judgment many times, and from the past performance we extract the frequency with which their brain tends to produce the right answer. If this level of accuracy remains roughly constant in time, then it makes sense to attach it as the probability to that person’s future judgments on the topic. This approach treats the relevant operations in the brain as a black box whose behavior, being roughly constant, can be subjected to such extrapolation.
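This black-box calibration path can be sketched in a few lines (the track record here is invented for illustration):

```python
# Sketch of the first calibration path: treat the judge's brain as a
# black box, and use the observed hit rate on past judgments of the
# same kind as the probability attached to future such judgments.
past_judgments = [True, True, False, True, True,
                  True, False, True, True, True]  # 8 of 10 correct

hit_rate = sum(past_judgments) / len(past_judgments)
print(hit_rate)  # 0.8 -- attach p = 0.8 to this person's next such judgment
```

The extrapolation is only as good as the assumption that the hit rate stays roughly constant, which is exactly the black-box condition stated above.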

The second possible path is reached when someone has a sufficient level of insight about some problem to cross the fuzzy limit between common-sense thinking and an actual scientific model. Increasingly subtle and accurate thinking about a problem can result in the construction of a mathematical model that approximates reality well enough that when applied in a shut-up-and-calculate way, it yields probability estimates that will be subsequently vindicated empirically.

(Still, deciding whether the model is applicable in some particular situation remains a common-sense problem, and the probabilities yielded by the model do not capture this uncertainty. If a well-established physical theory, applied by competent people, says that p=0.9999 for some event, common sense tells me that I should treat this event as near-certain -- and, if repeated many times, that it will come out the unlikely way very close to one in 10,000 times. On the other hand, if p=0.9999 is produced by some suspicious model that looks like it might be a product of data-dredging rather than real insight about reality, common sense tells me that the event is not at all certain. But there is no way to capture this intuitive uncertainty with a sensible number. The probabilities coming from calibration of repeated judgment are subject to analogous unquantifiable uncertainty.)

There is also a third logical possibility, namely that some people in some situations have intuitions of certainty precise enough to quantify accurately, just like some people can guess what time it is with remarkable precision without looking at a clock. But I see little evidence of this occurring in reality, and even if it does, these are very rare special cases.

(4) The question of whether somebody is well calibrated is confused for some reason. Calibrating people makes no sense. Although we may take the "almost certain" statements of a person and look at how often they are true, the resulting frequency is meaningless for some reason.

I disagree with this, as explained above. Calibration can be done successfully in the special cases I mentioned. However, in cases where it cannot be done, which includes the great majority of the actual beliefs and conclusions made by human brains, devising numerical probabilities makes no sense.

(5) Unlike #3, beliefs can be ordered or classified on some scale (possibly imprecisely), but assigning numerical values brings confusing connotations and should be avoided. Put another way, the meaning of subjective probabilities is preserved under monotonic rescaling.

This should be clear from the answer to (3).


[Continued in a separate comment below due to excessive length.]

Comment author: komponisto 06 October 2010 06:45:20AM *  3 points [-]

I should first state the general position I’m coming from, which motivates me to get into discussions of this sort. Namely, it is my firm belief that when we look at the present state of human knowledge, one of the principal sources of confusion, nonsense, and pseudoscience is physics envy, which leads people in all sorts of fields to construct nonsensical edifices of numerology and then pretend, consciously or not, that they’ve reached some sort of exact scientific insight.

I'll point out here that reversed stupidity is not intelligence, and that for every possible error, there is an opposite possible error.

In my view, if someone's numbers are wrong, that should be dealt with on the object level (e.g. "0.001 is too low", with arguments for why), rather than retreating to the meta level of "using numbers caused you to err". The perspective I come from is wanting to avoid the opposite problem, where being vague about one's beliefs allows one to get away without subjecting them to rigorous scrutiny. (This, too, by the way, is a major hallmark of pseudoscience.)

But I'll note that even as we continue to argue under opposing rhetorical banners, our disagreement on the practical issue seems to have mostly evaporated; see here for instance. You also do admit in the end that fear of poor calibration is what is underlying your discomfort with numerical probabilities:

If I wish to express these probabilities as numbers, however, this is not a legitimate step unless the resulting numbers can be justified... If they can be so justified, then we say that the intuitive estimate is “well-calibrated.” However, calibration is usually not possible in practice...

As a theoretical matter, I disagree completely with the notion that probabilities are not legitimate or meaningful unless they're well-calibrated. There is such a thing as a poorly-calibrated Bayesian; it's a perfectly coherent concept. The Bayesian view of probabilities is that they refer specifically to degrees of belief, and not anything else. We would of course like the beliefs so represented to be as accurate as possible; but they may not be in practice.

If my internal "Bayesian calculator" believes P(X) = 0.001, and X turns out to be true, I'm not made less wrong by having concealed the number, saying "I don't think X is true" instead. Less embarrassed, perhaps, but not less wrong.

Comment author: Vladimir_M 06 October 2010 07:33:07AM *  0 points [-]

komponisto:

In my view, if someone's numbers are wrong, that should be dealt with on the object level (e.g. "0.001 is too low", with arguments for why), rather than retreating to the meta level of "using numbers caused you to err".

Trouble is, sometimes numbers can be not even wrong, with their very definition lacking logical consistency or any defensible link with reality. It is that category that I am most concerned with, and I believe that it sadly occurs very often in practice, with entire fields of inquiry sometimes degenerating into meaningless games with such numbers. My honest impression is that in our day and age, such numerological fallacies have been responsible for much greater intellectual sins than the opposite fallacy of avoiding scrutiny by excessive vagueness, although the latter phenomenon is not negligible either.

You also do admit in the end that fear of poor calibration is what is underlying your discomfort with numerical probabilities:

Here we seem to be clashing about terminology. I think that "poor calibration" is too much of a euphemism for the situations I have in mind, namely those where sensible calibration is altogether impossible. I would instead use some stronger expression clarifying that the supposed "calibration" is done without any valid basis, not that the result is poor because some unfortunate circumstance occurred in the course of an otherwise sensible procedure.

There is such a thing as a poorly-calibrated Bayesian; it's a perfectly coherent concept. The Bayesian view of probabilities is that they refer specifically to degrees of belief, and not anything else.

As I explained in the above lengthy comment, I simply don't find numbers that "refer specifically to degrees of belief, and not anything else" a coherent concept. We seem to be working with fundamentally different philosophical premises here.

Can these numerical "degrees of belief" somehow be linked to observable reality according to the criteria I defined in my reply to the points (1)-(2) above? If not, I don't see how admitting such concepts can be of any use.

If my internal "Bayesian calculator" believes P(X) = 0.001, and X turns out to be true, I'm not made less wrong by having concealed the number, saying "I don't think X is true" instead. Less embarrassed, perhaps, but not less wrong.

But if you do this 10,000 times, and the number of times X turns out to be true is small but nowhere close to 10, you are much more wrong than if you had just been saying "X is highly unlikely" all along.
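To make this concrete with invented numbers: suppose all 10,000 predictions are asserted at P(X) = 0.001, so about 10 should come true, but 100 actually do. Comparing the average log loss of the stated number against that of a better-calibrated one (p = 0.01, a hypothetical stand-in for the true frequency) shows how the miscalibration compounds:

```python
import math

# Hypothetical numbers: 10,000 predictions at a claimed P(X) = 0.001,
# so ~10 should come true if the number is right; suppose 100 do.
n, claimed_p, actually_true = 10_000, 0.001, 100

def avg_log_loss(p, true_count, total):
    """Average negative log-likelihood of the outcomes under a stated p."""
    false_count = total - true_count
    return -(true_count * math.log(p) + false_count * math.log(1 - p)) / total

print(avg_log_loss(claimed_p, actually_true, n))  # loss of the stated p = 0.001
print(avg_log_loss(0.01, actually_true, n))       # lower loss at p = 0.01
```

Under this scoring rule, the overconfident 0.001 is penalized more heavily than the calibrated value, which is one way of cashing out "much more wrong" as a frequentist test.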

On the other hand, if we're observing X as a single event in isolation, I don't see how this tests your probability estimate in any way. But I suspect we have some additional philosophical differences here.