alex_zag_al comments on Stupid Questions Open Thread Round 4 - Less Wrong Discussion
Okay. I'm assuming everyone has the same prior. I'm going to start by comparing the case where C talks to A and learns everything A knows with the case where C talks to B and learns everything B knows; that is, when C ends up conditioning on all the same things. If you already see why those two cases are very different, you can skip down to the second section, where I talk about how C updates on merely hearing that A knows a lot and what Pa(X) is, compared to how he updates when learning what B thinks. It's the same scenario as you described: knowledgeable A, ignorant B, Pa(X) = Pb(X).
What happens when C learns everything B knows depends on what evidence C already has. If C knows nothing, then after talking to B, Pc(X) = Pb(X), because he'll be conditioning on exactly the same things.
In other words, if C knows nothing, then C is even more ignorant than B is. When he talks to B, he becomes exactly as ignorant as B is, and assigns the probability that you have in that state of ignorance.
It's only if C already has some evidence that talking to A and talking to B become different. As Kindly said, Pa(X) is very stable. So once C learns everything A knows, C ends up with the probability Pa(X|whatever C knew), which is probably a lot like Pa(X). To take an extreme case, if A is well-informed enough, she already knows everything C knows, so Pa(X|whatever C knew) equals Pa(X), and C comes out with exactly the same probability as A. But if C's info is new to A, it's probably a lot like telling your biochemistry professor about a study you read weighing in on one side of a debate: she's seen plenty of evidence for both sides, and unless this new study is particularly conclusive, it's not going to change her mind much.
However, B's probability is not stable. That biochemistry study might change B's mind a lot, because for all she knows, there isn't even a debate, and she has this pretty good evidence for one side of it. So, once C talks to B and learns everything B knows, C will be using the probability that incorporates all of B's knowledge, plus his own: Pb(X|whatever C knew). This is probably farther from Pb(X) aka Pa(X) than Pa(X|whatever C knew).
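Here's a minimal numeric sketch of that stability difference, under made-up assumptions: X is "the coin lands heads", A has seen 100 flips, B has seen none, and both use a uniform Beta(1,1) prior (so the posterior mean after h heads in n flips is (h+1)/(n+2)). The specific numbers and the coin-flip setup are illustrative, not from the scenario above:

```python
# Hypothetical illustration: A and B both assign P(heads) = 0.5,
# but A's estimate rests on 100 observed flips while B's rests on none.
# Under a uniform Beta(1,1) prior, the posterior mean after seeing
# `heads` heads in `flips` flips is (heads + 1) / (flips + 2).

def posterior_heads(heads: int, flips: int) -> float:
    """Posterior P(heads) under a uniform Beta(1,1) prior."""
    return (heads + 1) / (flips + 2)

# Before the new evidence: both assign the same probability.
pa = posterior_heads(50, 100)   # A: 50 heads in 100 flips -> 0.5
pb = posterior_heads(0, 0)      # B: no data at all        -> 0.5

# One new "heads" observation (the role of the biochemistry study):
pa_new = posterior_heads(51, 101)   # ~0.505 -- A barely moves
pb_new = posterior_heads(1, 1)      # ~0.667 -- B moves a lot

print(round(pa_new, 3), round(pb_new, 3))
```

Same starting probability, very different sensitivity to one new datum: that is what "A's probability is stable, B's is not" cashes out to.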
This is just how it would typically go. I say A's probability is more "stable", but there's actually some evidence that A would recognize as extremely significant that would mean nothing to B. In that case, once C has learned everything A knows, he would also recognize the significance of the little bit of knowledge he came in with, and end up with a probability far different from Pa(X).
So that's how it would probably go if C actually sits down and learns everything they know. So, what if C just knows that A is knowledgeable, and knows Pa(X)? Well, suppose C is convinced by my reasoning: that if he sat down with A and learned everything she knew, his probability of X would end up pretty close to Pa(X).
Here's the key thing: If C expects that, then his probability is already pretty close to Pa(X). All C knows is that A is knowledgable and has Pa(X), but if he expects to be convinced after learning everything A knows, then he already is convinced.
For any event Q, P(X) is equal to the expected value of P(X|the outcome of Q). That is, you don't know the outcome of Q, but if there's N mutually exclusive possible outcomes O1... ON, then P(X) = P(X|O1)P(O1) + ... + P(X|ON)P(ON). This is one way of stating Conservation of Probability. If the expected value of Pc(X|the outcome of learning everything A knows) is pretty close to Pa(X), then, well, Pc(X) must be pretty close too, because the expected value of Pc(X|the outcome of learning everything A knows) is equal to Pc(X).
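The identity P(X) = P(X|O1)P(O1) + ... + P(X|ON)P(ON) can be checked with any made-up numbers; the three outcomes and probabilities below are purely illustrative:

```python
# Conservation of Probability with made-up numbers: Q has three
# mutually exclusive, exhaustive outcomes O1, O2, O3. The current
# P(X) must equal the expected value of the posterior P(X|Oi).

p_outcome = [0.2, 0.5, 0.3]    # P(O1), P(O2), P(O3); sums to 1
p_x_given = [0.9, 0.6, 0.1]    # P(X|O1), P(X|O2), P(X|O3)

# Expected posterior = sum over outcomes of P(X|Oi) * P(Oi)
p_x = sum(po * px for po, px in zip(p_outcome, p_x_given))
print(round(p_x, 2))  # 0.51
```

So if C expects the posterior Pc(X|learning everything A knows) to land near Pa(X) no matter which way that conversation goes, then his current Pc(X), being the expectation of that posterior, must already be near Pa(X).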
Likewise, if C learns about B's knowledge and Pb(X), and he doesn't think that learning everything B knows would make much of a difference, then he also doesn't end up matching Pb(X) unless he started out matching before he even learned B's testimony.
I've been assuming that A's knowledge makes her probability more "stable": Pa(X|one more piece of evidence) is close to Pa(X). What if A is knowledgeable but unstable? I think it still works out the same way, but I haven't worked it out and I have to go.
PS: This is a first attempt on my part. Hopefully it's overcomplicated and overspecific, so we can work out/receive a more general/simple answer. But I saw that nobody else had replied so here ya go.