This post summarizes the response to the Less Wrong Book Club and Study Group proposal, floats a tentative virtual meetup schedule, and offers some mechanisms for keeping up to date with the group's work. We end with summaries of Chapter 1.
Statistics
The proposal for a LW book club and study group, initially focusing on E.T. Jaynes' Probability Theory: The Logic of Science (a.k.a. PT:TLOS), drew an impressive response, with 57 declarations of intent to participate. (I may have missed some, or counted as participants some who were merely interested. This spreadsheet contains participant data and can be edited by anyone, under revision control. Please feel free to add, remove or change your information.) The group has people from no fewer than 11 different countries, in time zones ranging from GMT-7 to GMT+10.
Live discussion schedule and venues
Many participants have expressed an interest in having informal or chatty discussions over a less permanent medium than LW itself, which should probably be reserved for more careful observations. The schedule below is offered as a basis for further negotiation. You can edit the spreadsheet linked above with your preferred times; if a different clustering emerges by the next iteration, I will report on it.
- Tuesdays at UTC 18:00 (that is 11am Bay Area, 8pm in Europe, etc. - see linked schedule for more)
- Wednesdays at UTC 11:00 (seems preferred by Australian participants)
- Sundays at UTC 18:00 (some have requested a weekend meeting)
The unofficial Less Wrong IRC channel is the preferred venue. An experimental Google Wave has also been started which may be a useful adjunct, in particular as we come to need mathematical notations in our discussions.
I recommend reading the suggested material before attending live discussion sessions.
Objectives, math prerequisites
The intent of the group is to engage in "earnest study of the great literature in our area of interest" (to paraphrase from the Knowledge Hydrant pattern language, a useful resource for study groups).
Earnest study aims at understanding a work deeply. Probably (particularly so in the case of PT:TLOS) the most useful way to do so is sequentially, in the order the author presented their ideas. Therefore, we aim for a pace that allows participants to extract as much insight as possible from each piece of the work, before moving on to the next, which is assumed to build on it.
Exercises are useful stopping-points to check for understanding. When the text contains equations or proofs, reproducing the derivations or checking the calculations can also be a good way to ensure deep understanding.
From personal experience, PT:TLOS is relatively accessible on rusty high school math (in particular, it requires little calculus) until at least partway through Chapter 6, which is where I am at the moment. Just these few chapters contain many key insights about the Bayesian view of probability and are well worth the effort.
Format
My proposal for the format is as follows. I will post one new top-level post per chapter, so as to give people following through RSS a chance to catch updates. Each chapter, however, may require splitting up into more than one chunk to be manageable. I intend to aim for a weekly rhythm: the Monday after the first chunk of a new chapter is posted, I will post the next chunk, and so on. If you're worried about missing an update, check the top-level post for the current chapter weekly on Mondays.
Each update will identify the current chunk, and will link to a comment containing one or more "opening questions" to jump-start discussion.
Updates also briefly summarize the previous chunk and highlights of the discussion arising from it. (Participants in the live chat sessions are encouraged to designate one person to summarize the discussion and post the summary as a comment.) By the time a new chapter is to be opened, the previous post will contain a digest form of the group's collective take on the chapter just worked through. The cumulative effect will be a "Less Wrong's notes on PT:TLOS", useful in itself for newcomers.
Chapter 1: Plausible Reasoning
In this chapter Jaynes fleshes out a theme introduced in the preface: "Probability theory as extended logic".
Sections: Deductive and Plausible Reasoning - Analogies with Physical Theories - The Thinking Computer - Introducing the Robot (week of 14/06)
Classical (Aristotelian) logic - modus ponens, modus tollens - allows deduction (teasing apart the concepts of deduction, induction, abduction isn't trivial). But what if we're interested not just in "definitely true or false" but "is this plausible", as we are in the kind of everyday thinking Jaynes provides examples of? Plausible reasoning is a weaker form of inference than deduction, but one Jaynes argues plays an important role even in (say) mathematics.
Jaynes' aim is to construct a working model of our faculty of "common sense", in the same sense that the Wright brothers could form a working model of the faculty of flight, not by vague resort to analogy as in the Icarus myth, but by producing a machine embodying a precise understanding. (Jaynes, however, speaks favorably of analogical thinking: "Good mathematicians see analogies between theorems; great mathematicians see analogies between analogies". He acknowledges that this line of argument itself stems from analogy with physics.)
Accordingly, Jaynes frames what is to follow as building an "inference robot". Jaynes notes, "the question of the reasoning process used by actual human brains is charged with emotion and grotesque misunderstandings", and so this frame will be helpful in keeping us focused on useful questions with observable consequences. It is tempting to also read a practical intent: just as robots can carry out specialized mechanical tasks on behalf of humans, so could an inference robot keep track of more details than our unaided common sense. We must, however, be careful not to project onto Jaynes some conception of a "Bayesian AI".
Sections: Boolean Algebra - Adequate Sets of Operations - The Basic Desiderata - Comments - Common Language vs Formal Logic - Nitpicking (week of 21/06)
Jaynes next introduces the familiar formal notation of Boolean algebra to represent truth-values of propositions, their conjunction and disjunction, and denial. (Equality denotes equality of truth-values, rather than equality of propositions.) Some care is required to distinguish common usage of terms such as "or", "implies", "if", etc. from their denotation in the Boolean algebra of truth-values. From the axioms of idempotence, commutativity, associativity, distributivity and duality, we can build up any number of more sophisticated consequences.
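Because each proposition takes only two truth-values, identities like these can be checked by brute force over every assignment. The following is a minimal sketch, not anything from the book: the helper `holds` and the dictionary `IDENTITIES` are names made up for illustration, and the last entry encodes my reading of Jaynes' "A implies B" as "A and AB have the same truth-value", compared against the usual material implication.

```python
from itertools import product

def holds(identity, n_vars):
    """True if `identity` evaluates to True for every assignment of truth-values."""
    return all(identity(*vals) for vals in product([False, True], repeat=n_vars))

# Each entry: (identity as a function of truth-values, number of variables).
# `==` compares truth-values, matching the "equality of truth-values" reading.
IDENTITIES = {
    "idempotence": (lambda a: (a and a) == a and (a or a) == a, 1),
    "commutativity": (lambda a, b: (a and b) == (b and a)
                      and (a or b) == (b or a), 2),
    "associativity": (lambda a, b, c: (a and (b and c)) == ((a and b) and c)
                      and (a or (b or c)) == ((a or b) or c), 3),
    "distributivity": (lambda a, b, c: (a and (b or c)) == ((a and b) or (a and c))
                       and (a or (b and c)) == ((a or b) and (a or c)), 3),
    "duality": (lambda a, b: (not (a and b)) == ((not a) or (not b))
                and (not (a or b)) == ((not a) and (not b)), 2),
    # Assumption: "A implies B" read as "A = AB"; this agrees with (not A) or B,
    # so it is true whenever A is false, unlike the everyday sense of "implies".
    "implication as A = AB": (lambda a, b: (a == (a and b)) == ((not a) or b), 2),
}

results = {name: holds(f, n) for name, (f, n) in IDENTITIES.items()}
for name, ok in results.items():
    print(f"{name}: {ok}")
```

Exhaustive checking only works because the number of assignments is finite and small; it demonstrates the identities rather than deriving them from one another.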
One such consequence, sketched out next, is that any function of n boolean variables can be expressed as a sum (logical OR) involving only conjunctions (logical AND) of each variable or its negation. Each of the 2^(2^n) different logic functions can thus be expressed in terms of only 2^n building blocks and only three operations (conjunction, disjunction, negation). In fact an even smaller set of operations is adequate to construct all Boolean functions: it is possible to express all three in terms of the NAND (negation of AND) operation, for instance. (A key argument in Chapter 2 hinges on this reduction of logic functions to an "adequate set".)
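To make the "adequate set" idea concrete, here is a small sketch under my own naming (`nand`, `not_`, `and_`, `or_`, `dnf_terms` and `eval_dnf` are all hypothetical helpers, not notation from the book). It builds negation, conjunction and disjunction out of NAND alone, reads off the OR-of-ANDs form of a function from its truth table, and then re-evaluates that form using only the NAND-derived operations.

```python
from itertools import product

def nand(a, b):
    """The single primitive: negation of AND."""
    return not (a and b)

# The three usual operations, each written using NAND only.
def not_(a):
    return nand(a, a)

def and_(a, b):
    return nand(nand(a, b), nand(a, b))

def or_(a, b):
    return nand(nand(a, a), nand(b, b))

def dnf_terms(f, n):
    """Rows of the truth table where f is true; each row stands for one conjunction."""
    return [row for row in product([False, True], repeat=n) if f(*row)]

def eval_dnf(terms, values):
    """Evaluate the OR of the conjunctions in `terms` at `values`, via NAND-built ops."""
    result = False
    for term in terms:
        conj = True
        for want, have in zip(term, values):
            # Use the variable itself where the row has True, its negation otherwise.
            literal = have if want else not_(have)
            conj = and_(conj, literal)
        result = or_(result, conj)
    return result

# Example: exclusive or of two variables.
xor = lambda a, b: a != b
xor_terms = dnf_terms(xor, 2)
matches = all(eval_dnf(xor_terms, row) == xor(*row)
              for row in product([False, True], repeat=2))
print(xor_terms, matches)
```

For n variables there are 2^n rows in the truth table, and a function is fixed by choosing which rows are true, which is where the count of 2^(2^n) distinct functions comes from (16 when n = 2).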
The "inference robot", then, is to reason in terms of degrees of plausibility assigned to propositions: plausibility is a generalization of truth-value. We are generally concerned with "conditional probability": how plausible something is given what else we know. This is represented in the familiar notation A|B ("the plausibility of A given that B is true", or "A given B"). The robot is assumed to be provided with sensible, non-contradictory input.
Jaynes next considers the "basic desiderata" for such an extension. First, degrees of plausibility should be represented by real numbers. (This is motivated by an appeal to convenience of implementation; the Comments defend this in greater detail, and a more formal justification can be found in the Appendices.) By convention, greater plausibility will be represented with a greater number, and the robot's "sense of direction", that is, the consequences it draws from increases or decreases in the plausibility of the "givens", must conform to common sense. (This will play a key role in Chapter 2.) Finally, the robot is to be consistent and non-ideological: it must always draw the same conclusions from identical premises, it must not arbitrarily ignore information available to it, and it must represent equivalent states of knowledge by equivalent values of plausibility.
(The Comments section is well worth reading, as it introduces the Mind Projection Fallacy which LW readers who have gone through the Sequences should be familiar with.)
I find desideratum 1) to be poorly motivated, and a bit problematic. This is urged upon us in Chapter 1 mainly by considerations of convenience: a reasoning robot can't calculate without numbers. But just because a calculator can't calculate without numbers doesn't seem a sufficient justification to assume those numbers exist, i.e., that a full and coherent mapping from statements to plausibilities exists. This doesn't seem the kind of thing we can assume is possible, it's the kind of thing we need to investigate to see if it's possible.
This of course will depend on what class of statements we'll allow into our language. I can see two ways forward on this: 1) We can assume that we have a language of statements for which desideratum 1) is true. But then we need to understand what restrictions we've placed on the kinds of statements that can have numerical plausibilities. Or 2) We can pick a language that we want to use to talk about the world, and then investigate whether desideratum 1) can be satisfied by that language. I don't see that this issue is touched on in Chapter 1.
There is further discussion of this in Appendix C; will this be discussed in connection with Chapter 1, or at some later time in the sequence? For example, in Appendix C, it turns out that desideratum 1 subdivides into two other axioms: transitivity, and universal comparability. The first one makes sense, but the second one doesn't seem as compelling to me.
It is indeed an extremely interesting question! Perhaps it would be wiser to use complex numbers for instance.
But intuitively it seems very likely that if you tell me two different propositions, that ...