# Perplexed comments on Taking Ideas Seriously - Less Wrong

51 13 August 2010 04:50PM



Comment author: 29 August 2010 06:55:36PM 2 points [-]

I think that the problem is that EY has introduced non-standard terminology here. Worse, he blames it on Jaynes, who makes no such mistake. I just looked it up.

There are two concepts here which must not be confused.

• a priori information, aka prior information, aka background information
• prior probabilities, aka priors (by everyone except EY. Jaynes dislikes this but acquiesces).

Prior information does indeed constitute a hypothesis in which you have complete confidence. I agree this is something of a weakness - a weakness which is recognized implicitly in such folklore as "Cromwell's rule". Prior information cannot be updated.

Prior probabilities (frequently known simply as priors) can be updated. In a sense, being updated is their whole purpose in life.

Comment author: 29 August 2010 08:40:41PM 0 points [-]

This is exactly what's going on. Thank you.

I apologize for my confused terminology.

Comment author: 29 August 2010 09:05:11PM 2 points [-]

> This is exactly what's going on. Thank you.

You are welcome. Unfortunately, I was wrong. Or at least incomplete.

I misinterpreted what EY was saying in the posting you cited. He was not, as I mistakenly assumed, saying that prior probabilities should not be called priors. He was instead talking about a third kind of entity which should not be confused with either of the other two.

• Prior distributions over hypotheses, which Eliezer wishes to call simply "priors"

But there is no confusion in referring to both prior probabilities and prior distributions simply as priors, because a prior probability is just a special case of a prior distribution: a probability is a distribution over a set of two competing hypotheses, only one of which can be true.

Bayes's theorem in its usual form applies only to simple prior probabilities: it tells you how to update a single probability. In order to update a prior distribution, you effectively need to use Bayes's theorem multiple times - once for each hypothesis in your set of hypotheses - and then renormalize.
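The per-hypothesis update described above can be sketched in a few lines. This is a minimal illustration, not anyone's actual method; the three coin hypotheses and their likelihoods are made up purely to show the mechanics.

```python
def update(prior, likelihoods):
    """Apply Bayes's theorem to every hypothesis, then renormalize.

    prior       -- dict mapping hypothesis -> prior probability
    likelihoods -- dict mapping hypothesis -> P(observed data | hypothesis)
    """
    unnormalized = {h: prior[h] * likelihoods[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

# Hypothetical example: three hypotheses about a coin's bias toward heads.
prior = {"fair": 0.5, "biased_heads": 0.25, "biased_tails": 0.25}

# Likelihood of observing a single head under each hypothesis.
likelihoods = {"fair": 0.5, "biased_heads": 0.75, "biased_tails": 0.25}

posterior = update(prior, likelihoods)
# Each hypothesis got its own Bayes update; the result is again a
# distribution summing to 1.
```

Note that the renormalization step is what ties the individual updates together: the evidence term in Bayes's theorem is the same sum over all hypotheses.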

So what is that 1/2 number which Eliezer says is definitely not a prior? It is none of the above three things. It is something harder to describe: a statistic over a distribution. I am not even going to try to explain what that means. Sorry for any confusion I may have created, and thanks to Sniffnoy and timtyler for calling my attention to my mistake.

Comment author: 29 August 2010 09:26:24PM *  0 points [-]

I'm not convinced that there's a meaningful difference between prior distributions and prior probabilities.

Going back to the beans problem, we have this:

50% mixed bag.

• 50% draw white bean

• 50% draw black bean

50% unmixed bag.

• 100% draw white bean

This can easily be "flattened" into a single, more complex, probability distribution:

25% draw white bean from mixed bag.

25% draw black bean from mixed bag.

50% draw white bean from unmixed bag.

If we wish to consider multiple draws, we can again flatten the total event into a single distribution:

1/8 mixed bag, black and black

1/8 mixed bag, black and white

1/8 mixed bag, white and black

1/8 mixed bag, white and white

1/2 unmixed bag, white and white

Translating the "what is that number" question into this situation, we can ask: what do we mean when we say that we are 5/8 sure that we will draw two white beans? I would say that it is a confidence; the "event" that has 5/8 probability is a partial event, a lossy description of the total event.

Comment author: 29 August 2010 09:47:17PM *  3 points [-]

> I'm not convinced that there's a meaningful difference between prior distributions and prior probabilities.

There isn't when you have only two competing hypotheses. Add a third hypothesis and you really do have to work with distributions. Chapter 4 of Jaynes explains this wonderfully. It is a long chapter, but fully worth the effort.

But the issue is also nicely captured by your own analysis. As you show, any possible linear combination of the two hypotheses can be characterized by a single parameter, which is itself the probability that the next ball will be white. But when you have three hypotheses, you have two degrees of freedom. A single probability number no longer captures all there is to be said about what you know.
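The loss of information with three hypotheses can be demonstrated concretely. A small sketch (the three bag types are hypothetical, chosen to make the point stark): two genuinely different distributions over three hypotheses can assign the same probability to the next ball being white, so that single number cannot recover the full state of knowledge.

```python
from fractions import Fraction

F = Fraction

# P(white on next draw | hypothesis) for three hypothetical bag types.
p_white = {"all_white": F(1), "mixed": F(1, 2), "all_black": F(0)}

def predictive(dist):
    """Probability the next bean is white, given a distribution over bags."""
    return sum(dist[h] * p_white[h] for h in dist)

# Two different states of knowledge with the same predictive probability:
a = {"all_white": F(1, 2), "mixed": F(0), "all_black": F(1, 2)}
b = {"all_white": F(0), "mixed": F(1), "all_black": F(0)}

same_prediction = predictive(a) == predictive(b) == F(1, 2)
# Yet the distributions differ - and they behave differently under
# updating: seeing one white bean collapses `a` onto "all_white",
# while `b` is unchanged.
```

With only two hypotheses this cannot happen: the predictive probability pins down the distribution, which is why the single-number summary suffices there.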

Comment author: 29 August 2010 09:50:27PM 0 points [-]

In retrospect, it's obvious that "probability" should refer to a real scalar on the interval [0,1].

Comment author: 29 August 2010 07:51:34PM *  0 points [-]

Everyone calls prior probabilities "priors" - including: http://yudkowsky.net/rational/bayes