
alex_zag_al comments on Open thread, September 2-8, 2013 - Less Wrong Discussion

0 Post author: David_Gerard 02 September 2013 02:07PM



Comment author: Kindly 13 September 2013 01:52:45PM

A not-quite-rigorous explanation of the thing in 18.15:

E_aa is, by construction, only relevant to A. A_p was defined (in 18.1) to screen off all previous knowledge about A. So in fact, if we are given evidence E_aa but then given evidence A_p, then E_aa becomes completely irrelevant: it's no longer telling us anything about A, but it never told us anything about anything else. Therefore P(F|A_p E_aa) can be simplified to P(F|A_p).

Comment author: alex_zag_al 13 September 2013 04:04:25PM

E_aa is, by construction, only relevant to A.

That's not true, though. By construction, every part of it is relevant to A.

That doesn't mean it's not relevant to anything else. For example, it could sit in this Bayes net: E_aa --> A --> F. Then it would also be relevant to F.
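To see this concretely, here's a small enumeration over a hypothetical chain E_aa --> A --> F with made-up conditional probability tables (all the numbers are invented; only the structure matters). It checks that, absent any conditioning, E_aa does shift the probability of F:

```python
from itertools import product

# Hypothetical chain Eaa -> A -> F; all CPT numbers are made up.
p_eaa = {1: 0.3, 0: 0.7}            # P(Eaa)
p_a_given_eaa = {1: 0.9, 0: 0.2}    # P(A=1 | Eaa)
p_f_given_a = {1: 0.8, 0: 0.1}      # P(F=1 | A)

def joint(eaa, a, f):
    pe = p_eaa[eaa]
    pa = p_a_given_eaa[eaa] if a else 1 - p_a_given_eaa[eaa]
    pf = p_f_given_a[a] if f else 1 - p_f_given_a[a]
    return pe * pa * pf

def prob(f_val, eaa_val=None):
    """P(F = f_val), optionally conditioned on Eaa = eaa_val."""
    num = sum(joint(e, a, f_val)
              for e, a in product((0, 1), repeat=2)
              if eaa_val is None or e == eaa_val)
    den = 1.0 if eaa_val is None else p_eaa[eaa_val]
    return num / den

print(prob(1))     # P(F=1)          = 0.387
print(prob(1, 1))  # P(F=1 | Eaa=1)  = 0.73, so Eaa is relevant to F
```

The two numbers differ, so in this structure E_aa carries information about F through A.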

Although... thinking about that Bayes net might answer other questions...

Hmm. Remember that A_p screens A off from everything. I think that means A's only connection is to A_p: everything else has to be connected through A_p.

So the above Bayes net is really

E_aa --> A_p --> F, with another arrow from A_p to A.

That would mean A_p screens E_aa off from F, which is what 18.15 says.

The above Bayes net represents an assumption that E_aa and F's only relevance to each other is that they're both evidence about A, which I think is often true.

Hmm. When I have some time I'm going to draw Bayes nets representing all of Jaynes' assumptions in this chapter, and when something looks unjustified, figure out what Bayes-net structure would justify it.

In fact, I skipped over this before, but this is actually recommended in the comments of that errata page I posted:

p. 554, eqn. (18.1): This definition cannot hold true for arbitrary propositions $E$; for example, what if $E$ implies $A$? This kind of problem occurs throughout the chapter. I don't think you can really discuss the $A_p$ distribution properly without explicitly introducing the notion of a sample space and organizing one's information about the sample space as a graphical model in which $A$ has a single parent variable $\theta$, with $A_p$ defined as the proposition $\theta = p$. For those unfamiliar with graphical models / Bayesian networks, I recommend the following book:

Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (J. Pearl).