Book Club Update, Chapter 2 of Probability Theory

Morendil


29th Jun 2010


Previously: Book Club introductory post - First update and Chapter 1 summary

Discussion of Chapter 1 has wound down, so we move on to Chapter 2. (I have updated the previous post with a summary of Chapter 1, with links to the discussion as appropriate.) But first, a few announcements.

How to participate

This is for people who have previously registered interest as well as for newcomers. This spreadsheet is our best attempt at coordinating 80+ Less Wrong readers interested in participating in "earnest study of the great literature in our area of interest".

If you are still participating, please let the group know - all you have to do is fill in the "Active (Chapter)" column. Write in an "X" if you are checked out, or the number of the chapter you are currently reading. This will let us measure attrition, as well as adapt the pace if necessary. If you would like to join, please add yourself to the spreadsheet. If you would like to participate in live chat about the material, please indicate your time zone and preferred meeting time. As always, your feedback on the process itself is more than welcome.

Refer to the previous post for more details on how to participate and meeting schedules.

Chapter 2: The Quantitative Rules

In this chapter Jaynes carefully introduces and justifies the elementary laws of plausibility, from which all later results are derived.

(Disclosure: I wasn't able to follow all the math in this chapter but I didn't let it deter me; the applications in later chapters are more accessible. We'll take things slow, and draw on such expertise as has been offered by more advanced members of the group. At worst this chapter can be enjoyed on a purely literary basis.)

Sections: The Product Rule - The Sum Rule. Exercises: 2.1 and 2.2

Chapter 2 works out the consequences of the qualitative desiderata introduced at the end of Chapter 1.

The first step is to consider how to evaluate the plausibility (AB|C) from the possibly relevant inputs (B|C), (A|C), (A|BC) and (B|AC). Considerations of symmetry and the desideratum of consistency lead to a functional equation known as the "associativity equation", F(F(x,y),z)=F(x,F(y,z)), which characterizes the function F such that (AB|C)=F[(B|C),(A|BC)]. The derivation that follows requires some calculus: by differentiating and then integrating back, it arrives at the form of the product rule:

w(AB|C)=w(A|BC)w(B|C)=w(B|AC)w(A|C)
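As a quick sanity check (not part of Jaynes's derivation), the sketch below picks a hypothetical monotonic function w, defines F through w(F(x,y))=w(x)w(y), and verifies numerically that this F satisfies the associativity equation F(F(x,y),z)=F(x,F(y,z)). The particular w, its inverse, and the grid of test values are assumptions chosen purely for illustration.

```python
import itertools

def w(x):
    """Hypothetical monotonic map from a plausibility scale (0, 1] onto (0, 1]."""
    return x / (2.0 - x)

def w_inv(u):
    """Inverse of w: solves u = x / (2 - x) for x."""
    return 2.0 * u / (1.0 + u)

def F(x, y):
    """Composition rule implied by the product rule: w(F(x, y)) = w(x) * w(y)."""
    return w_inv(w(x) * w(y))

def associativity_gap(x, y, z):
    """Absolute difference between the two sides of F(F(x,y),z) = F(x,F(y,z))."""
    return abs(F(F(x, y), z) - F(x, F(y, z)))

# Illustrative grid of plausibility values; 1.0 plays the role of certainty.
grid = [0.1, 0.3, 0.5, 0.7, 0.9, 1.0]
max_gap = max(
    associativity_gap(x, y, z) for x, y, z in itertools.product(grid, repeat=3)
)
print(f"largest associativity gap on the grid: {max_gap:.2e}")
```

The gap should be zero up to floating-point rounding. This only shows that a product-rule F is one solution of the equation; showing that every acceptable solution has this form is what the calculus in the chapter is for.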

Having obtained this, the next step is to establish how (A|B) is related to (not-A|B). The functional equation in this case is

x*S(S(y)/x)=y*S(S(x)/y)

and the derivation, after some more calculus, leads to S(x)=(1-x^m)^(1/m). The value of m turns out to be irrelevant, since it can be absorbed by defining p(x)=w(x)^m, and so we end up with the following two rules:

p(AB|C)=p(A|BC)p(B|C)=p(B|AC)p(A|C)

p(not-A|B)+p(A|B)=1
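A small numerical check may help here. The sketch below, under the assumption that S(x)=(1-x^m)^(1/m), verifies for a few hand-picked values that x*S(S(y)/x)=y*S(S(x)/y) holds, and that x^m+S(x)^m=1 whatever m is, which is why m drops out once we work with p=w^m. The test points and exponents are illustrative choices, restricted to the region x^m+y^m>1 where both arguments of S stay between 0 and 1.

```python
def S(x, m):
    """Candidate negation function S(x) = (1 - x^m)^(1/m) on the scale 0 <= x <= 1."""
    return (1.0 - x**m) ** (1.0 / m)

def equation_gap(x, y, m):
    """Absolute difference between x*S(S(y)/x) and y*S(S(x)/y)."""
    return abs(x * S(S(y, m) / x, m) - y * S(S(x, m) / y, m))

exponents = [0.5, 1.0, 2.0, 3.0]                  # illustrative values of m
points = [(0.9, 0.8), (0.95, 0.7), (0.99, 0.6)]   # illustrative (x, y) pairs

worst_equation_gap = 0.0
worst_sum_gap = 0.0
for m in exponents:
    for x, y in points:
        # S(y)/x and S(x)/y must both lie in [0, 1], which requires x^m + y^m >= 1.
        assert x**m + y**m > 1.0
        worst_equation_gap = max(worst_equation_gap, equation_gap(x, y, m))
    for x in [0.1, 0.5, 0.9]:
        # With p = w^m, the pair (x, S(x)) gives p(A|B) + p(not-A|B) = 1.
        worst_sum_gap = max(worst_sum_gap, abs(x**m + S(x, m) ** m - 1.0))

print(f"functional equation gap: {worst_equation_gap:.2e}")
print(f"sum rule gap:            {worst_sum_gap:.2e}")
```

Both gaps should vanish up to rounding error for every m tried.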

The exercises provide a first opportunity to explore how these two rules yield a great many other ways of assessing probabilities of more complex propositions, for instance p(C|A+B), based on the elementary probabilities.
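One result of this kind is the rule for a disjunction, p(A+B|C). A sketch of how it follows from the two rules alone, using the identity that A+B is the negation of (not-A)(not-B); the overbar notation for negation is my own choice for compactness:

```latex
\begin{align*}
p(A+B \mid C) &= 1 - p(\bar{A}\,\bar{B} \mid C) \\
&= 1 - p(\bar{A} \mid C)\, p(\bar{B} \mid \bar{A} C) \\
&= 1 - p(\bar{A} \mid C)\,\bigl[1 - p(B \mid \bar{A} C)\bigr] \\
&= p(A \mid C) + p(\bar{A} B \mid C) \\
&= p(A \mid C) + p(B \mid C)\, p(\bar{A} \mid B C) \\
&= p(A \mid C) + p(B \mid C)\,\bigl[1 - p(A \mid B C)\bigr] \\
&= p(A \mid C) + p(B \mid C) - p(AB \mid C)
\end{align*}
```

Each line applies either the sum rule (to swap a proposition for its negation) or the product rule (to split or recombine a conjunction), which is the pattern the exercises ask you to practice.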

Sections: Qualitative Properties - Numerical Values - Notation and Finite Sets Policy - Comments. Exercises: 2.3

Jaynes next turns back to the relation between "plausible reasoning" and deductive logic, showing the latter to be a limiting case of the former. The weaker syllogisms shown in Chapter 1 correspond to inequalities that can be derived from the product rule, and the direction of these inequalities starts to point to likelihood ratios.
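To make the weak syllogisms concrete, here is a minimal sketch with a made-up joint distribution over A and B in which A implies B (the cell "A true, B false" has probability zero). The numbers are hypothetical; the point is only the direction of the inequalities and the appearance of the ratio p(B|A)/p(B).

```python
from fractions import Fraction

# Hypothetical joint probabilities over (A, B), chosen so that A implies B.
joint = {
    (True, True): Fraction(2, 10),
    (True, False): Fraction(0),
    (False, True): Fraction(3, 10),
    (False, False): Fraction(5, 10),
}

def prob(event):
    """Probability of the set of outcomes (a, b) for which event(a, b) is true."""
    return sum(p for outcome, p in joint.items() if event(*outcome))

def cond(event, given):
    """Conditional probability via the product rule: p(E|G) = p(EG) / p(G)."""
    return prob(lambda a, b: event(a, b) and given(a, b)) / prob(given)

p_A = prob(lambda a, b: a)                                  # 1/5
p_B = prob(lambda a, b: b)                                  # 1/2
p_B_given_A = cond(lambda a, b: b, lambda a, b: a)          # 1, since A implies B
p_A_given_B = cond(lambda a, b: a, lambda a, b: b)          # 2/5
p_B_given_notA = cond(lambda a, b: b, lambda a, b: not a)   # 3/8

print("B true makes A more plausible:", p_A_given_B, ">=", p_A)
print("A false makes B less plausible:", p_B_given_notA, "<=", p_B)
print("update factor p(B|A)/p(B):", p_B_given_A / p_B)
```

Learning B multiplies the probability of A by p(B|A)/p(B), which is 2 in this example; in the limit where p(A|B) reaches 1 the strong syllogism of deductive logic is recovered.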

The product and sum rules allow us to consider the particular case when we have a finite set of mutually exclusive and exhaustive propositions, and background information which is symmetrical about each such proposition: it says the same about any one of them that it says about any other. Considering two such situations, where the propositions are the same but the labels we give them are different, Jaynes shows that, given our starting desiderata, we cannot do other than to assign the same probabilities to propositions which we are unable to distinguish otherwise than by their labels.

This is the principle of indifference; its significance is that even though what we have derived so far is an infinity of functions p(x) generated by the parameter m, the desiderata entirely "pin down" the numerical values in this particular situation.
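The relabeling argument can be illustrated with a toy check. The sketch below (function names are hypothetical) tests whether an assignment of probabilities to n mutually exclusive and exhaustive propositions is unchanged by every permutation of the labels; the uniform assignment 1/n passes, while any assignment with unequal values does not.

```python
from fractions import Fraction
from itertools import permutations

def is_label_invariant(probs):
    """True if every relabeling (permutation) of the propositions leaves probs unchanged."""
    return all(tuple(perm) == tuple(probs) for perm in permutations(probs))

def indifferent_assignment(n):
    """Equal probabilities for n mutually exclusive, exhaustive propositions."""
    return [Fraction(1, n)] * n

uniform = indifferent_assignment(4)
skewed = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]

print("uniform invariant under relabeling:", is_label_invariant(uniform))
print("skewed invariant under relabeling: ", is_label_invariant(skewed))
print("uniform sums to:", sum(uniform))
```

This is only a check of the conclusion, not a replacement for the argument in the text, which derives the equal assignment from the consistency desideratum.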

So far in this chapter we have been using p(x) as a function relating the plausibilities of propositions, such that p(x) was an arbitrary monotonic function of the plausibility x. At this point Jaynes suggests that we "turn this around" and say that x is a function of p. These values of p, probabilities, become the primary mathematical objects, while the plausibilities "have faded entirely out of the picture. We will just have no further use for them".

The principle of indifference now allows us to start computing numerical values for "urn probabilities", which will be the main topic of the next chapter.

Exercise 2.3 is notable for providing a formal treatment of the conjunction fallacy.
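The core of that treatment is a bound: a conjunction can never be more probable than either of its conjuncts, and a disjunction can never be less probable than either of its disjuncts. The sketch below is a brute-force illustration rather than a proof; it draws random joint distributions over A and B (the sampling scheme and seed are arbitrary choices) and counts violations of those bounds.

```python
import random

def random_joint(rng):
    """Random joint distribution over the four truth assignments of (A, B)."""
    weights = [rng.random() for _ in range(4)]
    total = sum(weights)
    # Order of cells: (A and B), (A and not-B), (not-A and B), (not-A and not-B).
    return tuple(weight / total for weight in weights)

rng = random.Random(0)
violations = 0
for _trial in range(10_000):
    ab, a_notb, nota_b, _neither = random_joint(rng)
    p_a = ab + a_notb
    p_b = ab + nota_b
    p_a_or_b = p_a + p_b - ab
    # Allow a tiny tolerance for floating-point rounding.
    if ab > min(p_a, p_b) + 1e-12 or p_a_or_b < max(p_a, p_b) - 1e-12:
        violations += 1

print("violations of the conjunction/disjunction bounds:", violations)
```

Judging "A and B" to be more probable than "A" alone, as subjects do in the conjunction fallacy experiments, is therefore inconsistent with the product and sum rules no matter what the underlying probabilities are.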

Chapter 2 ends with a cautionary note: results on infinite sets should be justified only by a "well-behaved" process of passing to the limit of a series of finite cases. The Comments section addresses the "subjective" vs "objective" distinction.




Comments

taiyo, 16y ago

I did not go through the 9 remaining cases, but I did think about one...

Suppose (AB|C) = F[(A|BC) , (B|AC)]. Compare A=B=C with (A = B) AND (C -> ~A).

Re 2-7: Yep, chain rule gets it done. By the way, took me a few minutes to realize that your citation "2-7" refers to a line in the pdf manuscript of the text. The numbering is different in the hardcopy version. In particular, it uses periods (e.g. equation 2.7) instead of dashes (e.g. equation 2-7), so as long as we're all consistent with that, I don't suppose there will be much confusion.

Cyan, 16y ago

Suppose (AB|C) = F[(A|BC) , (B|AC)]. Compare A=B=C with (A = B) AND (C -> ~A).

Not sure what you're getting at. To rule out (AB|C) = F[(A|BC) , (B|AC)], set A = B and let A's plausibility given C be arbitrary. Let T represent the (fixed) plausibility of a tautology. Then we have

(A|BC) = (B|AC) = T (because A = B)
(AB|C) = F(T, T) = constant

But (AB|C) is arbitrary by hypothesis, so (AB|C) = F[(A|BC) , (B|AC)] is not useful.

ETA: Credit where it's due: page 13, point 4 of Kevin S. Van Horn's guide to Cox's theorem (warning: pdf).

Morendil, 16y ago

OK, thanks. I'm able to follow a fair bit of what's going on here; the hard portions for me are when Jaynes gets some result without saying which rule or operation justifies it - I suppose it's obvious to someone familiar with calculus, but when you lack these background assumptions it can be very hard to infer what rules are being used, so I can't even find out how I might plug the gaps in my knowledge. (Definitely "deadly unk-unk" territory for me.)

(Of course "follow" isn't the same thing at all as "would be able to get similar results on a different but related problem". I grok the notion of a functional equation, and I can verify intermediate steps using a symbolic math package, but Jaynes' overall strategy is obscure to me. Is this a common pattern, taking the derivative of a functional equation then integrating back?)

The next bit where I lose track is 2.22. What's going on here, is this a total derivative?

Kazuo_Thow, 16y ago

Could we standardize on using the whole-book-as-one-PDF version, at least for the purposes of referencing equations? ETA: So far I've benefited from checking the relevant parts of Kevin Van Horn's unofficial errata pages before (and often while) reading a particular section.
