Vaniver comments on Welcome to Less Wrong! (5th thread, March 2013) - Less Wrong

27 Post author: orthonormal 01 April 2013 04:19PM

Comment author: Vaniver 03 July 2013 03:55:51AM 4 points [-]

I guess this sounds heretical, but I don't understand why Bayes theorem is placed on such a pedestal here. I understand Bayesian statistics, intuitively and also technically. Bayesian statistics is great for a lot of problems, but I don't see it as always superior to thinking inspired by the traditional scientific method.

I know a few answers to this question, and I'm sure there are others. (As an aside, these foundational questions are, in my opinion, really important to ask and answer.)

  1. What separates scientific thought and mysticism is that scientists are okay with mystery. If you can stand to not know what something is, to be confused, then after careful observation and thought you might have a better idea of what it is and have a bit more clarity. Bayes is the quantitative heart of the qualitative approach of tracking many hypotheses and checking how concordant they are with reality, and thus should feature heavily in a modern epistemic approach. The more precisely and accurately you can deal with uncertainty, the better off you are in an uncertain world.
  2. What separates Bayes and the "traditional scientific method" (using scare quotes to signify that I'm highlighting a negative impression of it) is that the TSM is a method for avoiding bad beliefs but Bayes is a method for finding the best available beliefs. In many uncertain situations you can use Bayes where you can't use the TSM (or where it would be too costly to do so), and the TSM doesn't give any predictions in those cases!
  3. Use of Bayes focuses attention on base rates, alternate hypotheses, and likelihood ratios, which people often ignore (replacing the first with maxent, the second with yes/no thinking, and the third with likelihoods).
  4. I honestly don't think the quantitative aspect of priors and updating is that important, compared to the search for a 'complete' hypothesis set and the search for cheap experiments that have high likelihood ratios (little bets).
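As a rough illustration of points 1 and 3 above, here is a minimal sketch of tracking several hypotheses at once and updating them on one piece of evidence. The hypothesis names and every number are hypothetical, chosen only so the arithmetic comes out cleanly; nothing here is from the original comment.

```python
from fractions import Fraction

def update(prior, likelihood):
    """Return the posterior over hypotheses, given P(evidence | hypothesis)."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: w / total for h, w in unnormalized.items()}

# Hypothetical base rates for three competing hypotheses (they sum to 1).
prior = {"A": Fraction(90, 100), "B": Fraction(9, 100), "C": Fraction(1, 100)}

# Hypothetical probabilities of seeing the evidence under each hypothesis.
likelihood = {"A": Fraction(1, 100), "B": Fraction(30, 100), "C": Fraction(90, 100)}

posterior = update(prior, likelihood)
for name, p in posterior.items():
    print(name, float(p))
```

With these made-up numbers, the likelihood ratio of B to A is 30 to 1 while the prior odds are 1 to 10, so B ends up favored 3 to 1 over A. C has the highest likelihood, but its low base rate leaves it only tied with A, which is the difference between likelihoods and likelihood ratios combined with base rates.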

I think that the qualitative side of Bayes is super important but don't think we've found a good way to communicate it yet. That's an active area of research, though, and in particular I'd love to hear your thoughts on those four answers.

Comment author: Lumifer 18 September 2013 07:31:03PM 1 point [-]

I think that the qualitative side of Bayes is super important

What is the qualitative side of Bayes?

Comment author: Vaniver 18 September 2013 07:43:41PM 1 point [-]

Unfortunately, the end of that sentence is still true:

but [I] don't think we've found a good way to communicate it yet.

I think that What Bayesianism Taught Me is a good discussion on the subject, and my comment there explains some of the components I think are part of qualitative Bayes.

I think that a lot of qualitative Bayes is incorporating the insights of the Bayesian approach into your System 1 thinking (i.e. habits on the 5 second level).

Comment author: Lumifer 18 September 2013 08:20:20PM *  0 points [-]

Well, yes, but most of the things there are just useful ways to think about probabilities and uncertainty, proper habits, things to check, etc. Why Bayes? He's not a saint whose name is needed to bless a collection of good statistical practices.

Comment author: RobbBB 18 September 2013 08:54:03PM *  3 points [-]

It's more or less the same reason people call a variety of essentialist positions 'platonism' or 'aristotelianism'. Those aren't the only thinkers to have had views in this neighborhood, but they predated or helped inspire most of the others, and the concepts have become pretty firmly glued together. Similarly, the phrases 'Bayes' theorem' and 'Bayesian interpretation of probability' (whence, jointly, the idea of Bayesian inference) have firmly cemented the name Bayes to the idea of quantifying psychological uncertainty and correctly updating on the evidence. The Bayesian interpretation is what links these theorems to actual practice.

Bayes himself may not have been a 'Bayesian' in the modern sense, just as Plato wasn't a 'platonist' as most people use the term today. But the names have stuck, and 'Laplacian' or 'Ramseyan' wouldn't have quite the same ring.

Comment author: Vaniver 18 September 2013 09:19:41PM 1 point [-]

But the names have stuck, and 'Laplacian' or 'Ramseyan' wouldn't have quite the same ring.

I like Laplacian as a name better, but it's already a thing.

Comment author: Lumifer 18 September 2013 09:05:37PM 1 point [-]

If I were to pretend that I'm a mainstream frequentist and consider "quantifying psychological uncertainty" to be subjective mumbo-jumbo with no place anywhere near real science :-D I would NOT have serious disagreements with e.g. Vaniver's list. Sure, I would quibble about accents, importances, and priorities, but there's nothing there that would be unacceptable from the mainstream point of view.

Comment author: RobbBB 18 September 2013 10:44:19PM *  7 points [-]

My biggest concern with the label 'Bayesianism' isn't that it's named after the Reverend, nor that it's too mainstream. It's that it's really ambiguous.

For example, when Yvain speaks of philosophical Bayesianism, he means something extremely modest -- the idea that we can successfully model the world without certainty. This view he contrasts, not with frequentism, but with Aristotelianism ('we need certainty to successfully model the world, but luckily we have certainty') and Anton-Wilsonism ('we need certainty to successfully model the world, but we lack certainty'). Frequentism isn't this view's foil, and this philosophical Bayesianism doesn't have any respectable rivals, though it certainly sees plenty of assaults from confused philosophers, anthropologists, and poets.

If frequentism and Bayesianism are just two ways of defining a word, then there's no substantive disagreement between them. Likewise, if they're just two different ways of doing statistics, then it's not clear that any philosophical disagreement is at work; I might not do Bayesian statistics because I lack skill with R, or because I've never heard about it, or because it's not the norm in my department.

There's a substantive disagreement if Bayesianism means 'it would be useful to use more Bayesian statistics in science', and if frequentism means 'no it wouldn't!'. But this methodological Bayesianism is distinct from Yvain's philosophical Bayesianism, and both of those are distinct from what we might call 'Bayesian rationalism', the suite of mantras, heuristics, and exercises rationalists use to improve their probabilistic reasoning. (Or the community that deems such practices useful.) Viewing the latter as an ideology or philosophy is probably a bad idea, since the question of which of these tricks are useful should be relatively easy to answer empirically.

Comment author: Jayson_Virissimo 19 September 2013 02:08:32AM *  3 points [-]

For example, when Yvain speaks of philosophical Bayesianism, he means something extremely modest...

Yes, it is my understanding that epistemologists usually call the set of ideas Yvain is referring to "probabilism" and indeed, it is far more vague and modest than what they call Bayesianism (which is more vague and modest still than the subjectively-objective Bayesianism that is often affirmed around these parts).

If frequentism and Bayesianism are just two ways of defining a word, then there's no substantive disagreement between them. Likewise, if they're just two different ways of doing statistics, then it's not clear that any philosophical disagreement is at work; I might not do Bayesian statistics because I lack skill with R, or because I've never heard about it, or because it's not the norm in my department.

BTW, I think this is precisely what Carnap was on about with his distinction between probability-1 and probability-2, neither of which did he think we should adopt to the exclusion of the other.

Comment author: Randaly 18 September 2013 11:53:54PM *  5 points [-]

Frequentism isn't this view's foil

Err, actually, yes it is. The frequentist interpretation of probability makes the claim that probability theory can only be used in situations involving large numbers of repeatable trials, or selection from a large population. William Feller:

There is no place in our system for speculations concerning the probability that the sun will rise tomorrow. Before speaking of it we should have to agree on an (idealized) model which would presumably run along the lines "out of infinitely many worlds one is selected at random..." Little imagination is required to construct such a model, but it appears both uninteresting and meaningless.

Or to quote from the essay that coined the term frequentist:

The essential distinction between the frequentists and the [Bayesians] is, I think, that the former, in an effort to avoid anything savouring of matters of opinion, seek to define probability in terms of the objective properties of a population, real or hypothetical, whereas the latter do not.

Frequentism is only relevant to epistemological debates in a negative sense: unlike Aristotelianism and Anton-Wilsonism, which both present their own theories of epistemology, frequentism's relevance is almost only in claiming that Bayesianism is wrong. (Frequentism separately presents much more complicated and less obviously wrong claims within statistics and probability; these are not relevant, given that frequentism's sole relevance to epistemology is its claim that no theory of statistics and probability could be a suitable basis for an epistemology, since there are many events they simply don't apply to.)

(I agree that it would be useful to separate out the three versions of Bayesianism, whose claims, while related, do not need to all be true or false at the same time. However, all three are substantively opposed to one or both of the views labelled frequentist.)

Comment author: satt 19 September 2013 02:01:13AM 0 points [-]

Err, actually, yes it is. The frequentist interpretation of probability makes the claim that probability theory can only be used in situations involving large numbers of repeatable trials, or selection from a large population.

Depends which frequentist you ask. From Aris Spanos's "A frequentist interpretation of probability for model-based inductive inference":

It is argued that the proposed frequentist interpretation, not only achieves this objective, but contrary to the conventional wisdom, the charges of ‘circularity’, its inability to assign probabilities to ‘single events’, and its reliance on ‘random samples’ are shown to be unfounded.

and

The error statistical perspective identifies the probability of an event A—viewed in the context of a statistical model M_θ(x), x ∈ R^n_X—with the limit of its relative frequency of occurrence by invoking the SLLN. This frequentist interpretation is defended against the charges of [i] ‘circularity’ and [ii] inability to assign ‘single event’ probabilities, by showing that in model-based induction the defining characteristic of the long-run metaphor is neither its temporal nor its physical dimension, but its repeatability (in principle) which renders it operational in practice.

Comment author: RichardKennaway 19 September 2013 03:48:02PM *  3 points [-]

Depends which frequentist you ask. From Aris Spanos's "A frequentist interpretation of probability for model-based inductive inference":

For those who can't access that through the paywall (I can), his presentation slides for it are here. I would hate to have been in the audience for the presentation, but the upside of that is that they pretty much make sense on their own, being just a compressed version of the paper.

While looking for those, I also found "Frequentists in Exile", which is Deborah Mayo's frequentist statistics blog.

I am not enough of a statistician to make any quick assessment of these, but they look like useful reading for anyone thinking about the foundations of uncertain inference.

Comment author: RobbBB 19 September 2013 12:12:29AM *  0 points [-]

The frequentist interpretation of probability makes the claim that probability theory can only be used in situations involving large numbers of repeatable trials

I don't understand what this "probability theory can only be used..." claim means. Are they saying that if you try to use probability theory to model anything else, your pencil will catch fire? Are they saying that if you model beliefs probabilistically, Math breaks? I need this claim to be unpacked. What do frequentists think is true about non-linguistic reality, that Bayesians deny?

Comment author: Desrtopa 19 September 2013 01:53:07AM 3 points [-]

I don't understand what this "probability theory can only be used..." claim means. Are they saying that if you try to use probability theory to model anything else, your pencil will catch fire? Are they saying that if you model beliefs probabilistically, Math breaks?

I think they would be most likely to describe it as a category error. If you try to use probability theory outside the constraints within which they consider it applicable, they'd attest that you'd produce no meaningful knowledge and accomplish nothing but confusing yourself.

Comment author: nshepperd 19 September 2013 02:18:58AM 2 points [-]

IIRC a common claim was that modeling beliefs at all is "subjective" and therefore unscientific.

Comment author: Lumifer 19 September 2013 04:53:47PM -1 points [-]

The frequentist interpretation of probability makes the claim that probability theory can only be used in situations involving large numbers of repeatable trials, or selection from a large population.

Yes, but frequentists have zero problems with hypothetical trials or populations.

Do note that for most well-specified statistical problems the Bayesians and the frequentists will come to the same conclusions. Differently expressed, likely, but not contradicting each other.
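One narrow case where the two approaches visibly converge is estimating a proportion. The sketch below compares the frequentist maximum-likelihood estimate k/n with the Bayesian posterior mean (k + 1)/(n + 2) under a uniform Beta(1, 1) prior. The data sets are hypothetical, and the choice of a uniform prior is an assumption of this example, not something claimed in the comment.

```python
from fractions import Fraction

def mle(successes, trials):
    """Frequentist maximum-likelihood estimate of a proportion."""
    return Fraction(successes, trials)

def posterior_mean_uniform_prior(successes, trials):
    """Bayesian posterior mean under a uniform Beta(1, 1) prior.

    The posterior is Beta(1 + k, 1 + n - k), whose mean is (k + 1) / (n + 2).
    """
    return Fraction(successes + 1, trials + 2)

# Hypothetical data sets with the same 70% success rate at growing sample sizes.
datasets = [(7, 10), (70, 100), (7000, 10000)]
gaps = [abs(mle(k, n) - posterior_mean_uniform_prior(k, n)) for k, n in datasets]

for (k, n), gap in zip(datasets, gaps):
    print(n, float(mle(k, n)), float(posterior_mean_uniform_prior(k, n)), float(gap))
```

The gap between the two estimates shrinks as the sample grows. This covers only one simple, well-specified problem, so it illustrates the claim rather than establishing it in general.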

Comment author: Vaniver 18 September 2013 09:43:49PM 3 points [-]

I would NOT have serious disagreements with e.g. Vaniver's list.

I think they would have significant practical disagreement with #3, given the widespread use of NHST, but clever frequentists are as quick as anyone else to point out that NHST doesn't actually do what its users want it to do.

Sure, I would quibble about accents, importances, and priorities, but there's nothing there that would be unacceptable from the mainstream point of view.

Hence the importance of the qualifier 'qualitative'; it seems to me that accents, importances, and priorities are worth discussing, especially if you're interested in changing System 1 thinking instead of System 2 thinking. The mainstream frequentist thinks that base rate neglect is a mistake, but the Bayesian both thinks that base rate neglect is a mistake and has organized his language to make that mistake obvious when it occurs. If you take revealed preferences seriously, it looks like the frequentist says base rate neglect is a mistake but the Bayesian lives that base rate neglect is a mistake.
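To make "organized his language to make that mistake obvious" concrete, here is a minimal sketch of the odds form of Bayes' theorem applied to a screening test. The function name and all numbers are hypothetical. The point is only that the base rate is an explicit argument, so leaving it out is hard to do by accident.

```python
from fractions import Fraction

def posterior_from_odds(base_rate, p_pos_given_h, p_pos_given_not_h):
    """Posterior P(H | positive) via prior odds times likelihood ratio."""
    prior_odds = base_rate / (1 - base_rate)
    likelihood_ratio = p_pos_given_h / p_pos_given_not_h
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Hypothetical screening test: 1% prevalence, 90% hit rate, 9% false positives.
posterior = posterior_from_odds(
    base_rate=Fraction(1, 100),
    p_pos_given_h=Fraction(9, 10),
    p_pos_given_not_h=Fraction(9, 100),
)
print(posterior, float(posterior))
```

With these made-up numbers the likelihood ratio is 10, but the prior odds are 1 to 99, so the posterior is 10/109, a little over 9%. Someone neglecting the base rate would tend to answer something close to 90%.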

Now, why Bayes specifically? I would be happy to point to Laplace instead of Bayes, personally, since Laplace seems to have been way smarter and a superior rationalist. But the trouble with naming methods of "thinking correctly" is that everyone wants to name their method "thinking correctly," and so you rapidly trip over each other. "Rationalism," for example, refers to a particular philosophical position which is very different from the modal position here at LW. Bayes is useful as a marker, but it is not necessary to come to those insights by way of Bayes.

(I will also note that not disagreeing with something and discovering something are very different thresholds. If someone has a perspective which allows them to generate novel, correct insights, that perspective is much more powerful than one which merely serves to verify that insights are correct.)

Comment author: Lumifer 19 September 2013 02:12:33AM *  0 points [-]

...but clever frequentists

Yeah, I said if I were to pretend to be a frequentist -- but that didn't involve suddenly becoming dumb :-)

it seems to me that accents, importances, and priorities are worth discussing

I agree, but at this point context starts to matter a great deal. Are we talking about decision-making in regular life? Like, deciding which major to pick, who to date, what job offer to take? Or are we talking about some explicitly statistical environment where you try to build models, fit them, evaluate them, do out-of-sample forecasting, all that kind of thing?

I think I would argue that recognizing biases (Tversky/Kahneman style) and trying to correct for them -- avoiding them altogether seems too high a threshold -- is different from what people call Bayesian approaches. The Bayesian way of updating on the evidence is part of "thinking correctly", but there is much, much more than just that.

Comment author: Vaniver 19 September 2013 08:10:11AM 2 points [-]

I think I would argue that recognizing biases (Tversky/Kahneman style) and trying to correct for them -- avoiding them altogether seems too high a threshold -- is different from what people call Bayesian approaches.

At least one (and I think several) of the biases identified by Tversky and Kahneman is "people do X, a Bayesian would do Y, thus people are wrong," so I think you're overstating the difference. (I don't know enough historical details to be sure, but I suspect Tversky and Kahneman might be an example of the Bayesian approach allowing someone to discover novel, correct insights.)

The Bayesian way of updating on the evidence is part of "thinking correctly", but there is much, much more than just that.

I agree, but it feels like we're disagreeing. It seems to me that a major Less Wrong project is "thinking correctly," and a major part of that project is "decision-making under uncertainty," and a major part of uncertainty is dealing with probabilities, and the Bayesian way of dealing with probabilities seems to be the best, especially if you want to use those probabilities for decision-making.

So it sounds to me like you're saying "we don't just need stats textbooks, we need Less Wrong." I agree; that's why I'm here as well as reading stats textbooks. But it also sounds to me like you're saying "why are you naming this Less Wrong stuff after a stats textbook?" The easy answer is that it's a historical accident, and it's too late to change it now. Another answer I like better is that much of the Less Wrong stuff comes from thinking about and taking seriously the stuff from the stats textbook, and so it makes sense to keep the name, even if we're moving to realms where the connection to stats isn't obvious.

Comment author: Lumifer 19 September 2013 04:39:03PM *  1 point [-]

Hm... Let me try to unpack my thinking, in particular my terminology which might not match exactly the usual LW conventions. I think of:

Bayes' theorem as a simple, conventional, and entirely uncontroversial statistical procedure. If you ask a dyed-in-the-wool rabid frequentist whether Bayes' theorem is true he'll say "Yes, of course".

Bayesian statistics as an approach to statistics with three main features. First is the philosophical interpretation of (some) probability as subjective belief. Second is the focus on conditional probabilities. Third is the strong preference for full (posterior) distributions as answers instead of point estimates.

Cognitive biases (aka the Kahneman/Tversky stuff) as certain distortions in the way our wetware processes information about reality, as well as certain peculiarities in human decision-making. Yes, a lot of it is concerned with dealing with uncertainty. Yes, there is some synergy with Bayesian statistics. No, I don't think this synergy is the defining factor here.
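The third feature of Bayesian statistics listed above, a full posterior distribution instead of a point estimate, can be sketched with a simple grid approximation. This assumes a coin-style problem with a uniform prior. The function names, the grid size, and the data (7 successes in 10 trials) are all hypothetical.

```python
def grid_posterior(successes, trials, grid_size=1001):
    """Approximate the posterior over a proportion on an evenly spaced grid.

    Uses a uniform prior, so the posterior is proportional to the
    binomial likelihood theta**k * (1 - theta)**(n - k).
    """
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    weights = [t**successes * (1 - t)**(trials - successes) for t in thetas]
    total = sum(weights)
    return thetas, [w / total for w in weights]

def credible_interval(thetas, probs, mass=0.9):
    """Central interval containing roughly `mass` of the posterior probability."""
    tail = (1 - mass) / 2
    cumulative = 0.0
    lower = upper = None
    for t, p in zip(thetas, probs):
        cumulative += p
        if lower is None and cumulative >= tail:
            lower = t
        if upper is None and cumulative >= 1 - tail:
            upper = t
    return lower, upper

thetas, probs = grid_posterior(successes=7, trials=10)
point_estimate = 7 / 10                      # the single-number answer
posterior_mean = sum(t * p for t, p in zip(thetas, probs))
low, high = credible_interval(thetas, probs)
print(point_estimate, posterior_mean, (low, high))
```

The point estimate is one number. The posterior also says how wide the plausible range still is after only ten observations.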

I understand that historically in the LW community Bayesian statistics and cognitive biases were intertwined. But apart from historical reasons, it seems to me these are two different things and the degree of their, um, interpenetration is much overstated on LW.

it sounds to me like you're saying "we don't just need stats textbooks, we need Less Wrong."

Well, we need for which purpose? For real-life decision making? -- sure, but then no one is claiming that stats textbooks are sufficient for that.

much of the Less Wrong stuff comes from thinking about and taking seriously the stuff from the stats textbook

Some, not much. I can argue that much of LW stuff comes from thinking logically and following chains of reasoning to their conclusion -- or actually just comes from thinking at all instead of reacting instinctively / on the basis of a gut feeling or whatever.

I agree that thinking in probabilities is a very big step and it *is* tied to Bayesian statistics. But still it's just one step.

Comment author: Axion 06 July 2013 03:16:16AM *  0 points [-]

I guess the distinction in my mind is that in a Bayesian approach one enumerates the various hypotheses ahead of time. This is in contrast to coming up with a single hypothesis and then adding in more refined versions based on results. There are trade-offs between the two. Once you get going with a Bayesian approach you are much better protected against bias; however, if you are missing some hypothesis from your prior, you won't find it.
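The last point can be shown in a few lines. In this minimal sketch, a hypothesis that is effectively left out (given zero prior probability) can never gain any, no matter how well it fits the data. The coin hypotheses and their numbers are hypothetical.

```python
def update(prior, likelihood):
    """One step of Bayesian updating over a fixed set of hypotheses."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: w / total for h, w in unnormalized.items()}

# Hypothetical setup: the two-headed coin is "missing" from the prior (weight 0).
prior = {"fair": 0.5, "biased to heads": 0.5, "two-headed": 0.0}

# Probability of observing heads under each hypothesis.
p_heads = {"fair": 0.5, "biased to heads": 0.8, "two-headed": 1.0}

belief = dict(prior)
for _ in range(20):  # observe 20 heads in a row
    belief = update(belief, p_heads)

print(belief)
```

Twenty heads in a row is exactly what the two-headed coin predicts, yet its posterior stays at zero and nearly all the probability goes to the best hypothesis that was actually enumerated.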

Here are some specific responses to the 4 answers:

  1. If you have a problem for which it is easy to enumerate the hypotheses, and have statistical data, then Bayes is great. If in addition you have a good prior probability distribution then you have the additional advantage that it is much easier to avoid bias. However if you find you are having to add in new hypotheses as you investigate then I would say you are using a hybrid method.

  2. Even without Bayes one is supposed to specifically look for alternate hypotheses and search for the best answer.
    On the Less Wrong welcome page the link next to the Bayesian link is a reference to the 2-4-6 experiment. I'd say this is an example of a problem poorly suited to Bayesian reasoning. It's not a statistical problem, and it's really hard to enumerate the prior for all rules for a list of 3 numbers ordered by simplicity. There's clearly a problem with confirmation bias, but I would say the thing to do is to step back and do some careful experimentation along traditional lines. Maybe Bayesian reasoning is helpful because it would encourage you to do that?

  3. I would agree that a rationalist needs to be exposed to these concepts.

  4. I wonder about this statement the most. It's hard to judge qualitative statements about probabilities. For example, I can say that I had a low prior belief in cryonics, and since reading articles here I have updated and now have a higher probability. I know I had some biases against the idea. However, I still don't agree and it's difficult to tell how much progress I've made in understanding the arguments.
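The 2-4-6 task mentioned in response 2 can be sketched as follows. As that response notes, the real space of rules cannot be enumerated, so this toy version assumes a tiny, hypothetical set of three candidate rules. It illustrates only why tests that every candidate agrees on carry no information, and it is not a model of the actual experiment.

```python
# Hypothetical candidate rules for a triple (a, b, c).
RULES = {
    "even, ascending by 2": lambda a, b, c: a % 2 == 0 and b - a == 2 and c - b == 2,
    "ascending by 2": lambda a, b, c: b - a == 2 and c - b == 2,
    "any ascending": lambda a, b, c: a < b < c,
}

def is_informative(triple):
    """A test can discriminate only if the candidate rules disagree about it."""
    answers = {rule(*triple) for rule in RULES.values()}
    return len(answers) > 1

def surviving(tests, true_rule="any ascending"):
    """Candidate rules still consistent with the feedback on each tested triple."""
    alive = set(RULES)
    for triple in tests:
        answer = RULES[true_rule](*triple)
        alive = {name for name in alive if RULES[name](*triple) == answer}
    return alive

confirming_tests = [(8, 10, 12), (20, 22, 24)]  # fit the first guess
probing_tests = [(1, 3, 5), (1, 2, 3)]          # chosen so the rules disagree

print(surviving(confirming_tests))
print(surviving(probing_tests))
```

The confirming triples satisfy all three rules, so every candidate survives them. The probing triples are ones where the candidates make different predictions, and with this toy rule set they leave only "any ascending".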