Lovely. Thanks.
K. S. Van Horn gives a few lines describing the derivation in his PT:TLoS errata. I don't understand why he does step 4 there -- it seems to me to be irrelevant. The two main facts needed are steps 2-3 and step 5: the sum of a geometric series and the Taylor series expansion around y = S(x). Hopefully that is a good hint.
A nitpick about his errata: the claim that 1/(1-z) = 1 + z + O(z^2) for all z is wrong, since the series on the RHS converges only for z in (-1, 1). This doesn't matter for the problem, since here z = exp(-q), which lies strictly between 0 and 1 because q is positive.
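For anyone who wants to check the geometric series claim numerically, here is a minimal sketch. The value q = 2.0 and the function names are just my own choices for illustration; nothing here comes from the errata itself.

```python
import math

def geometric_partial_sum(z, n_terms):
    """Partial sum 1 + z + z^2 + ... + z^(n_terms - 1)."""
    return sum(z**k for k in range(n_terms))

def closed_form(z):
    """The closed form 1/(1 - z), defined for any z != 1."""
    return 1.0 / (1.0 - z)

# Assumption: any positive q will do; 2.0 is an arbitrary illustrative value.
q = 2.0
z = math.exp(-q)  # lies strictly between 0 and 1 when q > 0

# Inside the interval of convergence the partial sums approach 1/(1 - z).
inside_sum = geometric_partial_sum(z, 60)

# Truncating after the linear term leaves a remainder of exactly z^2/(1 - z).
first_order_error = closed_form(z) - (1 + z)

# Outside the interval (z = 2) the partial sums grow without bound,
# even though 1/(1 - z) itself is perfectly finite there.
outside_sum = geometric_partial_sum(2.0, 10)
outside_closed = closed_form(2.0)

print(inside_sum, closed_form(z))
print(first_order_error, z**2 / (1 - z))
print(outside_sum, outside_closed)
```

The last line is the nitpick in miniature: the closed form makes sense at z = 2, but the series does not sum to it.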
I would like to share some interesting discussion on a hidden assumption used in Cox's Theorem (this is the result which states that what falls out of the desiderata is a probability measure).
First, some criticism of Cox's Theorem -- a paper by Joseph Y. Halpern published in the Journal of Artificial Intelligence Research. In it he points out an assumption which is necessary to arrive at the associative functional equation:
F(x, F(y,z)) = F(F(x,y), z) for all x,y,z
This is (2.13) in PT:TLoS.
Because this equation was derived by using the associativity of the conjunction operation, A(BC) = (AB)C, there are restrictions on what values the plausibilities x, y, and z can take. If these restrictions were stringent enough that x, y, and z could only take on finitely many values, or if they were to miss an entire interval of values, then the proof would fall apart. There needs to be an additional assumption that the values they can take form a dense subset. Halpern argues that this assumption is unnatural and unreasonable since it disallows "notions of belief with only finitely many gradations." For example, many AI projects consider only finitely many propositions.
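To make the associativity equation concrete, here is a small sketch that checks it for one candidate, F(x, y) = x*y (the product rule, up to the reparametrization Jaynes discusses). The grid of 11 plausibility values is my own assumption for illustration. Note that this is only a spot check on finitely many triples; it cannot establish the equation for all x, y, z, which is where the density assumption comes in.

```python
import math
from itertools import product

def F(x, y):
    """Candidate combination rule: ordinary multiplication."""
    return x * y

# Assumption: an arbitrary finite grid of plausibility values in [0, 1].
grid = [i / 10 for i in range(11)]

# Check F(x, F(y, z)) == F(F(x, y), z) on every triple from the grid,
# allowing for floating-point rounding.
is_associative = all(
    math.isclose(F(x, F(y, z)), F(F(x, y), z), abs_tol=1e-12)
    for x, y, z in product(grid, repeat=3)
)

print(is_associative)
```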
K. S. Van Horn's article on Cox's Theorem addresses this criticism directly and powerfully starting on page 9. He argues that the theory that is being proposed should be universal and so having holes in the set of plausibilities should be unacceptable.
Anyhow, I found it interesting if only because it makes explicit a hidden assumption in the proof.
Ah OK. You're right. I guess I was taking the 'extension of logic' thing a little too far there. I had it in my head that ({any prop} | {any contradiction}) = T since contradictions imply anything. Thanks.
Yeah. My solution is basically the same as yours. Setting A=B=C makes F(T,T) = T. But setting A=B AND C -> ~A makes F(T,T) = F (warning: unfortunate notation collision here; the F on the right-hand side means "false", not the function F).
Yeah. A total derivative. The way I think about it is the dv thing there (jargon: a differential 1-form) eats a tangent vector in the y-z plane. It spits out the rate of change of the function in the direction of the vector (scaled appropriately with the magnitude of the vector). It does this by looking at the rate of change in the y-direction (the dy stuff) and in the z-direction (the dz stuff) and adding those together (since after taking derivatives, things get nice and linear).
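Here is a rough numerical sketch of that picture. The function v(y, z) below is a made-up example (not anything from the book), and the point and displacement are arbitrary; the idea is just that the 1-form's output for a small vector (dy, dz) matches the actual change in v to first order.

```python
def v(y, z):
    """Hypothetical smooth function of y and z, chosen only for illustration."""
    return y**2 * z + 3.0 * z

def dv_dy(y, z):
    """Partial derivative of v with respect to y."""
    return 2.0 * y * z

def dv_dz(y, z):
    """Partial derivative of v with respect to z."""
    return y**2 + 3.0

def one_form(y, z, dy, dz):
    """dv at the point (y, z), applied to the tangent vector (dy, dz)."""
    return dv_dy(y, z) * dy + dv_dz(y, z) * dz

# Assumption: an arbitrary base point and a small displacement.
y0, z0 = 1.5, -0.5
dy, dz = 1e-6, 2e-6

actual_change = v(y0 + dy, z0 + dz) - v(y0, z0)
linear_estimate = one_form(y0, z0, dy, dz)

# Linearity: doubling the vector doubles the output.
doubled = one_form(y0, z0, 2 * dy, 2 * dz)

print(actual_change, linear_estimate, doubled)
```

The difference between `actual_change` and `linear_estimate` is second order in the size of the displacement, which is why it vanishes in the limit.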
I'm not too familiar with the functional equation business either. I'm currently trying to figure out what the heck is happening on the bottom half of page 32. Figuring out the top half took me a really long while (esp. 2.50).
I'm convinced that the inequality in eqn 2.52 shouldn't be there. In particular, when you stick in the solution S(x) = 1 - x, it's false. I can't figure out if anything below it depends on that because I don't understand much below it.
I did not go through the 9 remaining cases, but I did think about one...
Suppose (AB|C) = F[(A|BC), (B|AC)]. Compare A=B=C with (A=B) AND (C -> ~A).
Re 2-7: Yep, chain rule gets it done. By the way, took me a few minutes to realize that your citation "2-7" refers to a line in the pdf manuscript of the text. The numbering is different in the hardcopy version. In particular, it uses periods (e.g. equation 2.7) instead of dashes (e.g. equation 2-7), so as long as we're all consistent with that, I don't suppose there will be much confusion.
> Jaynes discusses a "tricky point" with regard to the difference between the everyday meaning of the verb "imply" and its logical meaning; are there other differences between the formal language of logic and everyday language?
In formal logic, the disjunction "or" is inclusive -- "A or B" is true even when A and B are both true. In everyday language, "or" is typically exclusive -- "A or B" is meant to exclude the possibility that A and B are both true.
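A quick truth table makes the difference explicit. This is just a sketch with my own function names; the only row where the two readings disagree is the one where both A and B are true.

```python
from itertools import product

def inclusive_or(a, b):
    """Formal-logic 'or': true when at least one of a, b is true."""
    return a or b

def exclusive_or(a, b):
    """Everyday 'or' (in the exclusive reading): true when exactly one is true."""
    return a != b

# One row per assignment of truth values: (A, B, inclusive, exclusive).
rows = [
    (a, b, inclusive_or(a, b), exclusive_or(a, b))
    for a, b in product([True, False], repeat=2)
]

# Collect the assignments where the two readings disagree.
disagreements = [(a, b) for a, b, inc, exc in rows if inc != exc]

for row in rows:
    print(row)
print(disagreements)
```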
Thank you for this info. I've signed up. I think this flipped my mood from gloomy to happy.
Incidentally, this is the second study I've signed up for via the web. The first is the Good Judgment Project, which has been a fun exercise so far.