From the last thread:
From Costanza's original thread (entire text):
"This is for anyone in the LessWrong community who has made at least some effort to read the sequences and follow along, but is still confused on some point, and is perhaps feeling a bit embarrassed. Here, newbies and not-so-newbies are free to ask very basic but still relevant questions with the understanding that the answers are probably somewhere in the sequences. Similarly, LessWrong tends to presume a rather high threshold for understanding science and technology. Relevant questions in those areas are welcome as well. Anyone who chooses to respond should respectfully guide the questioner to a helpful resource, and questioners should be appropriately grateful. Good faith should be presumed on both sides, unless and until it is shown to be absent. If a questioner is not sure whether a question is relevant, ask it, and also ask if it's relevant."
Meta:
- How often should these be made? I think one every three months is the correct frequency.
- Costanza made the original thread, but I am OpenThreadGuy. I am therefore not only entitled but required to post this in his stead. But I got his permission anyway.
Meta:
- I still haven't figured out a satisfactory answer to the previous meta question, how often these should be made. It was requested that I make a new one, so I did.
- I promise I won't quote previous threads in their entirety from now on. Blockquoting in articles only goes one level deep, anyway.
Yeah, but if your observation does not have a probability of 1, then Bayesian conditionalization is the wrong update rule. I take it this was Alex's point. If you updated on an observation to which you assign probability 0.7 using Bayesian conditionalization, you would be vulnerable to a Dutch book. The correct update rule in this circumstance is Jeffrey conditionalization. If P1 is your distribution prior to the observation and P2 is the distribution after the observation, the update rule for a hypothesis H given evidence E is:
P2(H) = P1(H | E) P2(E) + P1(H | ~E) P2(~E)
If P2(E) is sufficiently close to 1, the contribution of the second term in the sum is negligible and Bayesian conditionalization is a fine approximation.
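A minimal sketch in Python of what that looks like numerically. The values of P1(H | E), P1(H | ~E), and P2(E) are made up for illustration, not anything from the thread:

```python
# Jeffrey conditionalization with hypothetical numbers.

def jeffrey_update(p1_h_given_e, p1_h_given_not_e, p2_e):
    """P2(H) = P1(H|E) * P2(E) + P1(H|~E) * P2(~E)."""
    return p1_h_given_e * p2_e + p1_h_given_not_e * (1.0 - p2_e)

p1_h_given_e = 0.9      # P1(H | E), assumed
p1_h_given_not_e = 0.2  # P1(H | ~E), assumed

# After the uncertain observation you assign the evidence probability 0.7:
print(jeffrey_update(p1_h_given_e, p1_h_given_not_e, 0.7))    # ~0.69

# As P2(E) approaches 1, the second term vanishes and the result approaches
# plain Bayesian conditionalization, P1(H | E) = 0.9:
print(jeffrey_update(p1_h_given_e, p1_h_given_not_e, 0.999))  # ~0.8993
```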
This is a strange distinction, Jeffrey conditionalization. A little Google searching shows that someone got their name attached to conditioning on E and ~E. To me that's just a straight application of probability theory. It's not like I just fell off the turnip truck, but I've never heard anyone give this a name before.
To get a marginal, you condition on what you know, and sum across the other things you don't. I dislike the endless multiplication of terms for special cases where the general form is clear enough.
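For what it's worth, that "general form" reading can be spelled out in a tiny sketch. The joint distribution below is made up, chosen to match the conditionals in the earlier sketch: form P1(H | E) and P1(H | ~E) from the prior joint in the usual way, then marginalize using the new weights P2(E) and P2(~E). The sum is exactly the Jeffrey formula.

```python
# Hypothetical prior joint distribution over H and E.
p1 = {("H", "E"): 0.45, ("H", "~E"): 0.10,
      ("~H", "E"): 0.05, ("~H", "~E"): 0.40}

p1_e = p1[("H", "E")] + p1[("~H", "E")]          # P1(E)   = 0.5
p1_h_given_e = p1[("H", "E")] / p1_e             # P1(H|E) = 0.9
p1_h_given_not_e = p1[("H", "~E")] / (1 - p1_e)  # P1(H|~E) = 0.2

p2_e = 0.7  # new probability of the evidence after the uncertain observation

# Condition on E and ~E, then sum with the new weights:
p2_h = p1_h_given_e * p2_e + p1_h_given_not_e * (1 - p2_e)
print(p2_h)  # ~0.69, same as the Jeffrey update above
```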