brian_jaress comments on Bayesian Flame - Less Wrong

37 Post author: cousin_it 26 July 2009 04:49PM




Comment author: brian_jaress 26 July 2009 09:42:12PM 0 points

I'd like to take advantage of frequentism's return to respectability to ask if anyone knows where I can get a copy of "An Introduction to the Bootstrap" by Efron and Tibshirani.

It's on Google books, but I don't like reading things through Google books. It's for sale on-line, but it costs a lot and shipping takes a while. My university's library is supposed to have it, but the librarians can't find it. My local library hasn't heard of it.

I hardly know any statistics or probability; I've just been borrowing bits and pieces as I need them without worrying about bayesianism vs. frequentism.

There is a little something that's been bothering me in the back of my mind when I see Eliezer waxing poetic about bayesianism. Maybe this is an ignorant question, but here it is:

If bayesians don't believe in a true probability waiting to be approximated, only in probabilities assigned by a mind, how do they justify seeking additional data? The rules require you to react to new data by moving your assigned probability in a certain way, but, without something desirable that you're moving towards, why is it good to have that new data?

Comment author: Cyan 26 July 2009 10:09:49PM *  1 point

If bayesians don't believe in a true probability waiting to be approximated, only in probabilities assigned by a mind, how do they justify seeking additional data? The rules require you to react to new data by moving your assigned probability in a certain way, but, without something desirable that you're moving towards, why is it good to have that new data?

Collecting new data is not justifiable in general -- the cost of the new data may outweigh the benefit to be gained from it. But let's assume that collecting new data has a negligible cost. As a Bayesian, what you desire is the smallest loss possible. For reasonable loss functions, the smaller the region over which your distribution spreads its uncertainty (that is to say, the smaller its variance) the smaller you expect your loss to be. The law of total variance can be interpreted to say that you expect the variance of the posterior distribution to be no larger than the variance of the prior distribution.* So collect more data!

* law of total variance: prior variance = prior expectation of posterior variance + prior variance of posterior mean. Since the second term cannot be negative, the prior variance is at least as large as the prior expectation of posterior variance.
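For anyone who wants to check the decomposition numerically, here is a minimal sketch. It assumes a coin with a Beta(a, b) prior on its heads frequency and n planned tosses; the model, the function names, and the numbers a=2, b=2, n=5 are illustrative choices of mine, not anything from the discussion above. It uses exact fractions, so the identity can be tested with equality rather than a tolerance.

```python
from fractions import Fraction
from math import comb, factorial


def beta_fn(x, y):
    """Beta function B(x, y) for positive integers, as an exact fraction."""
    return Fraction(factorial(x - 1) * factorial(y - 1), factorial(x + y - 1))


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return Fraction(a, a + b)


def beta_var(a, b):
    """Variance of a Beta(a, b) distribution."""
    return Fraction(a * b, (a + b) ** 2 * (a + b + 1))


def decompose(a, b, n):
    """Return (prior variance, E[posterior variance], Var[posterior mean]).

    Expectations are taken over the prior predictive distribution of the
    number of heads k in n tosses (a beta-binomial distribution).
    """
    pred = [comb(n, k) * beta_fn(a + k, b + n - k) / beta_fn(a, b)
            for k in range(n + 1)]
    post_means = [beta_mean(a + k, b + n - k) for k in range(n + 1)]
    post_vars = [beta_var(a + k, b + n - k) for k in range(n + 1)]

    exp_post_var = sum(p * v for p, v in zip(pred, post_vars))
    exp_post_mean = sum(p * m for p, m in zip(pred, post_means))
    var_post_mean = sum(p * (m - exp_post_mean) ** 2
                        for p, m in zip(pred, post_means))
    return beta_var(a, b), exp_post_var, var_post_mean


# Hypothetical setup: Beta(2, 2) prior, 5 planned tosses.
prior_var, exp_post_var, var_post_mean = decompose(a=2, b=2, n=5)
print("prior variance:             ", prior_var)
print("expected posterior variance:", exp_post_var)
print("variance of posterior mean: ", var_post_mean)
print("identity holds:", prior_var == exp_post_var + var_post_mean)
```

Note that the inequality is about the expectation taken before seeing the data. A particular surprising data set can still leave you with a wider posterior than the prior you started with.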

Comment author: brian_jaress 26 July 2009 10:32:52PM 0 points

So, more data is good because it makes you more confident? I guess that makes sense, but it still seems strange not to care what you're confident in.

Comment author: Cyan 26 July 2009 10:42:28PM 2 points

In any real problem there is a context and some prior information. Bayes doesn't give this to you -- you give it to Bayes along with the data and turn the crank on the machinery to get the posterior. The things you're confident about are in the context.

Comment author: brian_jaress 27 July 2009 12:27:20AM 0 points

What about changing your mind?

Comment author: Cyan 27 July 2009 01:15:23AM 0 points

In theory, if you can change your mind about something, you have uncertainty about it, and your prior distribution should reflect that. In practice, you abstract the uncertainty away by making some simplifying assumptions, do the analysis conditional on your assumptions, and reserve the right to revisit the assumptions if they don't seem adequate.

Comment author: brian_jaress 27 July 2009 02:53:16AM 1 point

I didn't mean to ask how a bayesian changes his or her mind. I meant to ask how the thing you believe in can be in the context in situations where you change your mind based on new evidence.

Comment author: Cyan 27 July 2009 03:06:43AM *  1 point

Let's say I'm weighing some acrylamide powder on an electronic balance. (Gonna make me some polyacrylamide gel!) The balance is so sensitive that small changes in air pressure register in the last two digits. From what I know about air pressure variations from having done this before, I create a model for the data. Also because I've done this before, I can eyeball roughly how much powder I've got on the balance; this determines my prior distribution before reading the balance. Then I observe some data from the balance readout and update my distribution.
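A rough sketch of what that update could look like, assuming the simplest possible model: a normal prior on the true mass and independent normal noise of known size on each reading. The comment above does not specify a model, so the normal-normal setup, the function name, and every number here are hypothetical.

```python
def normal_update(prior_mean, prior_sd, readings, noise_sd):
    """Conjugate normal update with known measurement noise.

    Returns the posterior mean and standard deviation of the true mass.
    """
    n = len(readings)
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / noise_sd ** 2
    post_precision = prior_precision + data_precision

    sample_mean = sum(readings) / n
    # Posterior mean is a precision-weighted average of prior mean and data.
    post_mean = (prior_precision * prior_mean
                 + data_precision * sample_mean) / post_precision
    post_sd = post_precision ** -0.5
    return post_mean, post_sd


# Hypothetical numbers, in grams: an eyeballed prior and three noisy readouts.
readings = [2.913, 2.917, 2.915]
post_mean, post_sd = normal_update(
    prior_mean=3.0, prior_sd=0.5, readings=readings, noise_sd=0.002
)
print(f"posterior mean = {post_mean:.4f} g, posterior sd = {post_sd:.4f} g")
```

The eyeballed prior and the noise model are the "context" supplied to the machinery. With a balance this precise, the readings dominate the eyeballed guess almost immediately.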

Comment author: brian_jaress 27 July 2009 08:05:26AM 0 points

I can't tell without more information whether that's an example of what I mean by "changing your mind." Here's one that I think definitely qualifies:

Let's say you're going to bet on a coin toss. You only have a small amount of information on the coin, and you decide for whatever reason that there's a 51% chance of getting heads. So you're going to bet on heads. But then you realize that there's a way to get more data.

At this point, I'm thinking, "Gee, I hardly know anything about this coin. Maybe I'm better off betting on tails and I just don't know it. I should get that data."

What I think you're saying about bayesians is that a bayesian would say, "Gee, 51% isn't very high. I'd like to be at least 80% sure. Since I don't know very much yet, it wouldn't take much more to get to 80%. I should get that data so I can bet on heads with confidence."

Which sort of makes sense but is also a little strange.

Comment author: Cyan 27 July 2009 03:29:39PM *  3 points

Technical stuff: under the standard assumption of infinite exchangeability of coin tosses, there exists some limiting relative frequency for coin toss results. (This is de Finetti's theorem.)

Key point: I have a probability distribution for this relative frequency (call it f) -- not a probability of a probability.

You only have a small amount of information on the coin, and you decide for whatever reason that there's a 51% chance of getting heads. So you're going to bet on heads. But then you realize that there's a way to get more data.

Here you've said that my probability density for f is dispersed, but slightly asymmetric. I too can say, "Well, I have an awful lot of probability mass on values of f less than 0.5. I should collect more information to tighten this up."

"Gee, 51% isn't very high. I'd like to be at least 80% sure. Since I don't know very much yet, it wouldn't take much more to get to 80%. I should get that data so I can bet on heads with confidence."

This mixes up f on the one hand with my distribution for f on the other. I can certainly collect data until I'm 80% sure that f is bigger than 0.5 (provided that f really is bigger than 0.5). This is distinct from being 80% sure of getting heads on the next toss.
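To make the distinction concrete, here is a small sketch that computes both numbers from the same distribution. It assumes a Beta distribution for f with integer parameters, a uniform Beta(1, 1) starting point, and made-up data of 60 heads in 100 tosses; none of that comes from the discussion above, and the function names are mine.

```python
from math import comb


def prob_f_above_half(a, b):
    """P(f > 0.5) under a Beta(a, b) distribution with positive integer a, b.

    Uses the identity P(f <= x) = P(Binomial(a + b - 1, x) >= a), so at
    x = 0.5, P(f > 0.5) = P(Binomial(a + b - 1, 0.5) <= a - 1).
    """
    n = a + b - 1
    return sum(comb(n, j) for j in range(a)) / 2 ** n


def prob_next_heads(a, b):
    """Predictive probability of heads on the next toss: the mean of f."""
    return a / (a + b)


# Assumed: uniform Beta(1, 1) prior on f, then 60 heads in 100 tosses.
heads, tails = 60, 40
a, b = 1 + heads, 1 + tails

p_above = prob_f_above_half(a, b)
p_next = prob_next_heads(a, b)
print(f"P(f > 0.5 | data)       = {p_above:.3f}")
print(f"P(next toss heads | data) = {p_next:.3f}")
```

With these made-up numbers I can be well past 80% sure that the coin favors heads while my probability of heads on the next toss stays near 0.6. More data tightens the first number toward 1 (if f really is above 0.5) and moves the second toward f itself, not toward 1.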