JGWeissman comments on Information theory and the symmetry of updating beliefs - Less Wrong

45 Post author: Academian 20 March 2010 12:34AM

Comments (28)

Comment author: JGWeissman 20 March 2010 04:58:18PM 0 points

Given these assumptions, your theorem holds. But the condition P(BC) = P(B)·P(C) is problematic. An important point you are glossing over is that in epistemological probability theory, all probabilities are conditional probabilities, and the condition is more formally P(BC|X) = P(B|X)·P(C|X), where X represents a state of belief. The problem is that, since B and C both depend on A, the condition is unstable with respect to P(A|X). While it is possible to come up with contrived examples of A, B, C, and X where this works, it is not the sort of thing I would expect to come up naturally and be actually useful in implementing an epistemology in the real world.

By contrast, with likelihood ratios, the example I gave earlier could represent the situation where you know a coin is biased to land on one side 9/10 of the time when flipped, but you don't know which side. A represents that it is biased to show heads, and B1 and B2 represent that it does show heads in two separate trial flips.
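To make the coin scenario concrete, here is a minimal sketch in Python. The comment does not state a prior, so P(A) = 1/2 is an assumption, and the function name `coin_posterior` is hypothetical. It multiplies the per-flip likelihood ratios and, separately, shows that B1 and B2 are not unconditionally independent under this prior.

```python
from fractions import Fraction

P_HEADS_GIVEN_A = Fraction(9, 10)      # coin biased toward heads
P_HEADS_GIVEN_NOT_A = Fraction(1, 10)  # coin biased toward tails


def coin_posterior(prior_a, n_heads):
    """Posterior P(A | n_heads heads in a row), via multiplied likelihood ratios."""
    likelihood_ratio = P_HEADS_GIVEN_A / P_HEADS_GIVEN_NOT_A  # 9 per observed head
    prior_odds = prior_a / (1 - prior_a)
    posterior_odds = prior_odds * likelihood_ratio ** n_heads
    return posterior_odds / (1 + posterior_odds)


prior = Fraction(1, 2)  # ASSUMPTION: not given in the comment

# Marginal probabilities of heads, obtained by summing over A and ~A.
p_b1 = prior * P_HEADS_GIVEN_A + (1 - prior) * P_HEADS_GIVEN_NOT_A
p_b1_b2 = prior * P_HEADS_GIVEN_A ** 2 + (1 - prior) * P_HEADS_GIVEN_NOT_A ** 2

print(coin_posterior(prior, 2))  # 81/82
print(p_b1_b2, p_b1 * p_b1)      # 41/100 versus 1/4
```

Under the assumed prior, the flips are independent given A and given ~A, yet P(B1B2) = 41/100 differs from P(B1)·P(B2) = 1/4, so the unconditional product condition fails while likelihood ratios still multiply.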

Comment author: Academian 20 March 2010 06:53:51PM *  0 points

X represents a state of belief .... The problem is that, since B and C both depend on A, the condition (1) [P(BC|X) = P(B|X)·P(C|X)] is unstable with respect to P(A|X)

Isn't this equally a problem for likelihoods? If I understand you, the "independent tests" assumptions for likelihood-ratio multiplying are also "unstable with respect to P(A|X)". Explicitly, I mean

(2) P(BC|XA) = P(B|XA)·P(C|XA) and P(BC|X~A) = P(B|XA)·P(C|X~A).

Even if P(A|X) stays constant, "changing X", i.e. making a new observation D, can change all of these assumptions, no?

If I've misunderstood, could you please explain in full detail

(0) What you mean by "unstable with respect to P(A|X)",

(1) How the "before and after" assumptions are "unstable", and

(2) How the "independent tests" assumptions are "stable"?

Even if this isn't it, I'm very curious to know if there are epistemological reasons to favor assumptions (1) over assumptions (2)...

Comment author: JGWeissman 20 March 2010 07:43:53PM 0 points

Isn't this equally a problem for likelihoods? If I understand you, the "independent tests" assumptions for likelihood-ratio multiplying are also "unstable with respect to P(A|X)". Explicitly, I mean

(2) P(BC|XA) = P(B|XA)·P(C|XA) and P(BC|X~A) = P(B|X~A)·P(C|X~A). [correction: the quoted text had P(B|XA) in the second product]

It is not a problem for likelihood ratios because the inclusion of A or of ~A in the condition of each probability in (2) screens off P(A|X). When you assume that A is true (or when you assume A is false), you no longer need to consider the probability of A when figuring out the conclusions of that assumption.

Even if P(A|X) stays constant, "changing X", i.e. making a new observation D, can change all of these assumptions, no?

Yes, you can come up with scenarios where (2) does not apply, and you cannot multiply likelihood ratios. My point was that there are plausible scenarios where it does apply, and this sort of thing actually happens in real life.

(0) What you mean by "unstable with respect to P(A|X)"

A condition is unstable with respect to R if it can be true for a certain value of R, but this does not imply that it will be true for other values of R.

(1) How the "before and after" assumptions are "unstable",

In order to simultaneously have B and C be independent given X and given XA, they need to have a complicated dependence given X~A, this dependence itself depending on P(A|X).

(2) How the "independent tests" assumptions are "stable"?

As I stated earlier, the condition A or the condition ~A screens off P(A|X).

I would have expected this to be clear if you had applied your questions to my example, and tried to construct a corresponding example in which Pev's can be multiplied.

Comment author: Academian 20 March 2010 09:07:47PM 0 points

It is not a problem for likelihood ratios because the inclusion of A or of ~A in the condition of each probability in (2) screens off P(A|X)

That your prose is not accompanied by any equations makes it very effortful to filter all its possible meanings to find the intelligent one you no doubt intend, and impractical to respond. I might disagree that

In order to simultaneously have B and C be independent given X and given XA, they need to have a complicated dependence given X~A, this dependence itself depending on P(A|X).

but I can't be sure, and I have nothing precise to refer back to. (You don't have to oblige me, of course.)

Comment author: JGWeissman 20 March 2010 10:03:39PM *  1 point

It is not a problem for likelihood ratios because the inclusion of A or of ~A in the condition of each probability in (2) screens off P(A|X)

That your prose is not accompanied by any equations makes it very effortful to filter all its possible meanings to find the intelligent one you no doubt intend, and impractical to respond.

First, that is not a response to your quote from me. In that quote I was in fact pointing to elements of an equation.

I might disagree that

In order to simultaneously have B and C be independent given X and given XA, they need to have a complicated dependence given X~A, this dependence itself depending on P(A|X).

but I can't be sure, and I have nothing precise to refer back to. (You don't have to oblige me, of course.)

By "before" independence, we have

(1) P(B) = P(B|C).

Decompose B into AB or ~AB

(2) P(B|C) = P(AB|C) + P(~AB|C)
(3) P(B|C) = P(A|C)P(B|AC) + P(~A|C)P(B|~AC)

Apply "after" independence, P(B|AC) = P(B|A), to (3)

(4) P(B|C) = P(A|C)P(B|A) + P(~A|C)P(B|~AC)

Combining (1) and (4)

(5) P(B) = P(A|C)P(B|A) + P(~A|C)P(B|~AC)

Solve for P(B|~AC)

(6) P(B|~AC) = (P(B) - P(A|C)P(B|A))/P(~A|C)

Use Bayes' Law to substitute out the conditional probabilities of A

(7) P(B|~AC) = (P(B) - P(A)(P(C|A)/P(C))P(B|A))/(P(~A)P(C|~A)/P(C))

I believe that (7) qualifies as a complicated dependence given ~A, which itself depends on P(A). This is just the result of my exploration of the issue, but you asked for it.
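As a reading aid rather than part of the original derivation: multiplying the numerator and denominator of (7) by P(C) removes the nested fractions and gives an equivalent form.

```latex
\[
P(B \mid \lnot A, C)
  = \frac{P(B)\,P(C) \;-\; P(A)\,P(B \mid A)\,P(C \mid A)}
         {P(\lnot A)\,P(C \mid \lnot A)}
\]
```

In this form the prior P(A) appears explicitly in the numerator and, through P(~A) = 1 - P(A), in the denominator, in addition to entering through P(B) and P(C).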

Also, I would like to reiterate:

I would have expected this to be clear if you had applied your questions to my example, and tried to construct a corresponding example in which Pev's can be multiplied.

Comment author: Academian 21 March 2010 03:38:56PM *  0 points

Here's a simple example where pev(A,BC)=pev(A,B)pev(A,C):

[EG1] P(A)=1/4, P(B)=P(C)=1/2, P(BC)=1/4, P(AC)=P(BC)=1/16, P(ABC)=1/64.

Then just suppose there are some medical diagnostics with these probabilities, etc. But see below for a comparison of this with your coin scenario.

More generally, let a=P(A)=1/4, b=P(B), c=P(C), x=P(BC), y=P(AC), z=P(AB), t=P(ABC).

The "before and after independence" assumption [BAI] is that x=bc and at=yz

The "independent tests" assumption [IT] is that (1-a)(x-t)=(b-z)(c-y) and at=yz.
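The two sets of assumptions can be checked against [EG1] with exact arithmetic. The sketch below makes two assumptions: it reads the second P(BC) in [EG1] as P(AB), so that y = z = 1/16, and it takes pev(A,E) = P(AE)/(P(A)·P(E)) from the parent post, since pev is not defined in this thread. Independence given ~A is tested directly in conditional form, P(BC|~A) = P(B|~A)·P(C|~A).

```python
from fractions import Fraction as F

# [EG1], with P(AC) = P(AB) = 1/16 (assumed reading of the typo)
a, b, c = F(1, 4), F(1, 2), F(1, 2)               # P(A), P(B), P(C)
x, y, z, t = F(1, 4), F(1, 16), F(1, 16), F(1, 64)  # P(BC), P(AC), P(AB), P(ABC)


def pev(p_joint, p_first, p_second):
    """ASSUMED definition from the parent post: P(AE) / (P(A) * P(E))."""
    return p_joint / (p_first * p_second)


# "Before" independence: P(BC) = P(B) P(C)
before_ok = x == b * c
# "After" independence: P(BC|A) = P(B|A) P(C|A)
after_ok = t / a == (z / a) * (y / a)
# Independence given ~A: P(BC|~A) = P(B|~A) P(C|~A)
given_not_a_ok = (x - t) / (1 - a) == ((b - z) / (1 - a)) * ((c - y) / (1 - a))

# Multiplicativity of pev: pev(A,BC) versus pev(A,B) * pev(A,C)
pev_product_ok = pev(t, a, x) == pev(z, a, b) * pev(y, a, c)

print(before_ok, after_ok, given_not_a_ok, pev_product_ok)
# True True False True
```

So [EG1] satisfies both halves of [BAI] and the pev product rule, but its two tests are not independent given ~A.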

(2) How the "independent tests" assumptions are "stable"?

As I stated earlier, the condition A or the condition ~A screens off P(A|X).

Neither [BAI] nor [IT] is stable with respect to the variable a=P(A), and I see no equation involving anything I'd call "screening off", though there might be one somewhere.

In any case, that does not interest me, because your equation (7) has my attention:

(7) P(B|~AC) = (P(B) - P(A)(P(C|A)/P(C))P(B|A))/(P(~A)P(C|~A)/P(C))

A qualitative distinction between my would-be medical scenario [EG1] and your coin scenario is that medical diagnostics, and particularly their dependence/independence, are not always "causally justified", but two coin flips can be seen as independent because they visibly just don't interact.

I bet your (7) gives a good description of this somehow, or something closely related... But I still have to formalize my thoughts about this.

Let me think about it awhile, and I'll post again if I understand or wonder something more precise.

Comment author: JGWeissman 21 March 2010 09:23:54PM *  1 point

[EG1] P(A)=1/4, P(B)=P(C)=1/2, P(BC)=1/4, P(AC)=P(BC)=1/16, P(ABC)=1/64.

I assume you meant:

[EG1] P(A)=1/4, P(B)=P(C)=1/2, P(BC)=1/4, P(AC)=P(AB)=1/16, P(ABC)=1/64.

Then just suppose there are some medical diagnostics with these probabilities, etc.

You just glossed over the whole point of this exercise. The problem is that values such as P(ABC) are combined facts about the population and both tests. Try defining the scenario using only facts about the population in isolation (P(A)), about the tests in isolation (P(B|A), P(C|A), P(B|~A), P(C|~A)), and the dependence between the tests (P(B|AC), P(B|A~C), P(B|~AC), P(B|~A~C), [EDIT: removed redundant terms, I got carried away when typing out the permutations]). The point is to demonstrate how you have to contrive certain properties of the dependence between the tests, the conditional probabilities given ~A, to make summary properties you care about work out for a specific separate fact about the population, P(A).
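A minimal sketch of that exercise, under stated assumptions: the test characteristics are the ones implied by the corrected [EG1] (P(B|A) = P(C|A) = 1/4 and P(B|~A) = P(C|~A) = 7/12), "after" independence P(B|AC) = P(B|A) is imposed, and the function name is hypothetical. It solves P(BC) = P(B)·P(C) for the value of P(B|~AC) that the "before" independence condition forces, then varies P(A).

```python
from fractions import Fraction as F


def required_p_b_given_not_a_and_c(p_a, p_b_a, p_c_a, p_b_na, p_c_na):
    """P(B|~AC) forced by P(BC) = P(B)P(C) together with P(B|AC) = P(B|A)."""
    p_na = 1 - p_a
    p_b = p_a * p_b_a + p_na * p_b_na  # marginal P(B)
    p_c = p_a * p_c_a + p_na * p_c_na  # marginal P(C)
    # P(BC) = P(A) P(C|A) P(B|A) + P(~A) P(C|~A) P(B|~AC); set equal to P(B) P(C)
    return (p_b * p_c - p_a * p_b_a * p_c_a) / (p_na * p_c_na)


# Test characteristics implied by the corrected [EG1] (assumption)
tests = dict(p_b_a=F(1, 4), p_c_a=F(1, 4), p_b_na=F(7, 12), p_c_na=F(7, 12))

at_quarter = required_p_b_given_not_a_and_c(F(1, 4), **tests)
at_half = required_p_b_given_not_a_and_c(F(1, 2), **tests)

print(at_quarter)  # 15/28
print(at_half)     # 41/84
```

With the tests held fixed, the required P(B|~AC) moves from 15/28 to 41/84 as P(A) goes from 1/4 to 1/2, and neither equals P(B|~A) = 7/12. In this sketch the dependence between the tests given ~A has to be retuned for each value of P(A).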