This post summarizes response to the Less Wrong Book Club and Study Group proposal, floats a tentative virtual meetup schedule, and offers some mechanisms for keeping up to date with the group's work. We end with summaries of Chapter 1.
Statistics
The proposal for a LW book club and study group, initially focusing on E.T. Jaynes' Probability Theory: The Logic of Science (a.k.a. PT:TLOS), drew an impressive response, with 57 declarations of intent to participate. (I may have missed some, or counted as participants some who were merely interested. This spreadsheet contains participant data and can be edited by anyone, under revision control; please feel free to add, remove or change your information.) The group has people from no fewer than 11 different countries, in time zones ranging from GMT-7 to GMT+10.
Live discussion schedule and venues
Many participants have expressed an interest in having informal or chatty discussions over a less permanent medium than LW itself, which should probably be reserved for more careful observations. The schedule below is offered as a basis for further negotiation. You can edit the spreadsheet linked above with your preferred times; if a different clustering emerges by the next iteration, I will report on it.
- Tuesdays at UTC 18:00 (that is 11am Bay Area, 8pm in Europe, etc. - see linked schedule for more)
- Wednesdays at UTC 11:00 (seems preferred by Australian participants)
- Sundays at UTC 18:00 (some have requested a weekend meeting)
The unofficial Less Wrong IRC channel is the preferred venue. An experimental Google Wave has also been started which may be a useful adjunct, in particular as we come to need mathematical notations in our discussions.
I recommend reading the suggested material before attending live discussion sessions.
Objectives, math prerequisites
The intent of the group is to engage in "earnest study of the great literature in our area of interest" (to paraphrase from the Knowledge Hydrant pattern language, a useful resource for study groups).
Earnest study aims at understanding a work deeply. Probably (particularly so in the case of PT:TLOS) the most useful way to do so is sequentially, in the order the author presented their ideas. Therefore, we aim for a pace that allows participants to extract as much insight as possible from each piece of the work, before moving on to the next, which is assumed to build on it.
Exercises are useful stopping-points to check for understanding. When the text contains equations or proofs, reproducing the derivations or checking the calculations can also be a good way to ensure deep understanding.
PT:TLOS is (from personal experience) relatively accessible on rusty high school math (in particular, it requires little calculus) until at least partway through Chapter 6 (which is where I am at the moment). Just these few chapters contain many key insights about the Bayesian view of probability and are well worth the effort.
Format
My proposal for the format is as follows. I will post one new top-level post per chapter, so as to give people following through RSS a chance to catch updates. Each chapter, however, may require splitting up into more than one chunk to be manageable. I intend to aim for a weekly rhythm: the Monday after the first chunk of a new chapter is posted, I will post the next chunk, and so on. If you're worried about missing an update, check the top-level post for the current chapter weekly on Mondays.
Each update will identify the current chunk, and will link to a comment containing one or more "opening questions" to jump-start discussion.
Updates also briefly summarize the previous chunk and highlights of the discussion arising from it. (Participants in the live chat sessions are encouraged to designate one person to summarize the discussion and post the summary as a comment.) By the time a new chapter is to be opened, the previous post will contain a digest form of the group's collective take on the chapter just worked through. The cumulative effect will be a "Less Wrong's notes on PT:TLOS", useful in itself for newcomers.
Chapter 1: Plausible Reasoning
In this chapter Jaynes fleshes out a theme introduced in the preface: "Probability theory as extended logic".
Sections: Deductive and Plausible Reasoning - Analogies with Physical Theories - The Thinking Computer - Introducing the Robot (week of 14/06)
Classical (Aristotelian) logic - modus ponens, modus tollens - allows deduction (teasing apart the concepts of deduction, induction, abduction isn't trivial). But what if we're interested not just in "definitely true or false" but "is this plausible", as we are in the kind of everyday thinking Jaynes provides examples of? Plausible reasoning is a weaker form of inference than deduction, but one Jaynes argues plays an important role even in (say) mathematics.
Jaynes' aim is to construct a working model of our faculty of "common sense", in the same sense that the Wright brothers could form a working model of the faculty of flight, not by vague resort to analogy as in the Icarus myth, but by producing a machine embodying a precise understanding. (Jaynes, however, speaks favorably of analogical thinking: "Good mathematicians see analogies between theorems; great mathematicians see analogies between analogies". He acknowledges that this line of argument itself stems from analogy with physics.)
Accordingly, Jaynes frames what is to follow as building an "inference robot". Jaynes notes, "the question of the reasoning process used by actual human brains is charged with emotion and grotesque misunderstandings", and so this frame will be helpful in keeping us focused on useful questions with observable consequences. It is tempting to also read a practical intent - just as robots can carry out specialized mechanical tasks on behalf of humans, so could an inference robot keep track of more details than our unaided common senses - we must however be careful not to project onto Jaynes some conception of a "Bayesian AI".
Sections: Boolean Algebra - Adequate Sets of Operations - The Basic Desiderata - Comments - Common Language vs Formal Logic - Nitpicking (week of 21/06)
Jaynes next introduces the familiar formal notation of Boolean algebra to represent truth-values of propositions, their conjunction and disjunction, and denial. (Equality denotes equality of truth-values, rather than equality of propositions.) Some care is required to distinguish common usage of terms such as "or", "implies", "if", etc. from their denotation in the Boolean algebra of truth-values. From the axioms of idempotence, commutativity, associativity, distributivity and duality, we can build up any number of more sophisticated consequences.
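For readers who like to check such things mechanically, here is a minimal sketch (mine, not from the book) that verifies the listed identities by brute force over all truth assignments. The helper name holds and the particular form chosen for each identity are my own assumptions; in particular, "duality" is rendered here as De Morgan's rule, and only the AND form of associativity and distributivity is shown.

```python
from itertools import product

def holds(identity, n_vars):
    """Return True if identity(*values) is True for every truth assignment."""
    return all(identity(*vals) for vals in product([False, True], repeat=n_vars))

# Each entry: (predicate stating the identity, number of variables it uses).
# These renderings are illustrative assumptions, not Jaynes' exact notation.
identities = {
    "idempotence": (lambda a: (a and a) == a and (a or a) == a, 1),
    "commutativity": (lambda a, b: (a and b) == (b and a) and (a or b) == (b or a), 2),
    "associativity": (lambda a, b, c: ((a and b) and c) == (a and (b and c)), 3),
    "distributivity": (lambda a, b, c: (a and (b or c)) == ((a and b) or (a and c)), 3),
    "duality": (lambda a, b: (not (a and b)) == ((not a) or (not b)), 2),
}

results = {name: holds(pred, n) for name, (pred, n) in identities.items()}
for name, ok in results.items():
    print(name, ok)
```

Note that "==" here compares truth-values, in line with the remark that equality denotes equality of truth-values rather than of propositions.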
One such consequence, sketched out next, is that any function of n Boolean variables can be expressed as a sum (logical OR) involving only conjunctions (logical AND) of each variable or its negation. Each of the 2^(2^n) different logic functions can thus be expressed in terms of only 2^n building blocks and only three operations (conjunction, disjunction, negation). In fact an even smaller set of operations is adequate to construct all Boolean functions: it is possible to express all three in terms of the NAND (negation of AND) operation, for instance. (A key argument in Chapter 2 hinges on this reduction of logic functions to an "adequate set".)
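The following is a rough sketch (my own illustration, not code from the book) of both claims for the case n = 2: NOT, AND and OR are each built from NAND alone, and every one of the 16 possible truth tables is then rebuilt as an OR of AND-terms using only those NAND-derived operations. The function names nand, not_, and_, or_ and eval_dnf are hypothetical names chosen for this example.

```python
from itertools import product

def nand(a, b):
    """The single primitive: negation of AND."""
    return not (a and b)

# NOT, AND and OR expressed purely in terms of nand.
def not_(a):
    return nand(a, a)

def and_(a, b):
    return not_(nand(a, b))

def or_(a, b):
    return nand(not_(a), not_(b))

def eval_dnf(terms, vals):
    """Evaluate an OR of AND-terms at the assignment vals.

    Each term is a truth-table row on which the target function is true;
    it stands for the conjunction of each variable or its negation.
    """
    result = False
    for term in terms:
        conj = True
        for want, have in zip(term, vals):
            # Picking the literal (variable or its negation) happens when
            # the expression is built, so it is not an extra logic operation.
            literal = have if want else not_(have)
            conj = and_(conj, literal)
        result = or_(result, conj)
    return result

n = 2
rows = list(product([False, True], repeat=n))            # the 2^n assignments
tables = list(product([False, True], repeat=len(rows)))  # every truth table

all_ok = True
for table in tables:
    terms = [row for row, out in zip(rows, table) if out]
    rebuilt = tuple(eval_dnf(terms, row) for row in rows)
    all_ok = all_ok and (rebuilt == table)

print(len(tables), all_ok)
```

The same loop works for larger n, but the number of truth tables grows very quickly, so exhaustive checking is only practical for small cases.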
The "inference robot", then, is to reason in terms of degrees of plausibility assigned to propositions: plausibility is a generalization of truth-value. We are generally concerned with "conditional probability"; how plausible something is given what else we know. This is represented in the familiar notation A|B ("the plausibility of A given that B is true", or "A given B"). The robot is assumed to be provided sensible, non-contradictory input.
Jaynes next considers the "basic desiderata" for such an extension. First, degrees of plausibility should be represented by real numbers. (This is motivated by an appeal to convenience of implementation; the Comments defend this in greater detail, and a more formal justification can be found in the Appendices.) By convention, greater plausibility will be represented with a greater number, and the robot's "sense of direction", that is, the consequences it draws from increases or decreases in the plausibility of the "givens", must conform to common sense. (This will play a key role in Chapter 2.) Finally, the robot is to be consistent and non-ideological: it must always draw the same conclusions from identical premises, it must not arbitrarily ignore information available to it, and it must represent equivalent states of knowledge by equivalent values of plausibility.
(The Comments section is well worth reading, as it introduces the Mind Projection Fallacy which LW readers who have gone through the Sequences should be familiar with.)
It occurs to me that Jaynes is missing a desideratum that I might have included. I can't decide if it's completely trivial, or if perhaps it's covered implicitly in his consistency rule 3c; I expect it will become clear as the discussion becomes more formal -- and of course, he did promise that the rules given would turn out to be sufficient. To wit:
One more thing. The footnote on page 12 wonders: Does it follow that AND and NOT (or NAND alone) are sufficient to write any computer program?
Isn't this trivial? Since AND and NOT can together be composed to represent any logic function, and a logic function can be interpreted as a function from some number of bits (the truth values of the variable propositions) to one result bit, it follows that we can write programs with AND and NOT that make any bits in our computer an arbitrary function of any of the other bits. Is there some complication I'm missing?
(Edited slightly for clarity.)
I don't understand what you mean by "(B|A) = (B|A')".