
Romashka comments on Open thread, Mar. 9 - Mar. 15, 2015 - Less Wrong Discussion

5 Post author: MrMind 09 March 2015 07:48AM




Comment author: PhilGoetz 11 March 2015 06:17:38PM 3 points

Basic question about bits of evidence vs. bits of information:

I want to know the value of a random bit. I'm collecting evidence about the value of this bit.

First off, it seems weird to say "I have 33 bits of evidence that this bit is a 1." What is a bit of evidence, if it takes an infinite number of bits of evidence to get 1 bit of information?

Second, each bit of evidence gives you a likelihood multiplier of 2. E.g., a piece of evidence that says the likelihood is 4:1 that the bit is a 1 gives you 2 bits of evidence about the value of that bit. Independent evidence that says the likelihood is 2:1 gives you 1 bit of evidence.
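The conversion described above (bits of evidence as the base-2 log of the likelihood ratio) can be sketched in a few lines of Python; the function name here is illustrative, not anything from the thread:

```python
import math

def bits_of_evidence(likelihood_ratio):
    """Bits of evidence carried by a piece of evidence = log2 of its likelihood ratio."""
    return math.log2(likelihood_ratio)

print(bits_of_evidence(4))  # 4:1 likelihood -> 2.0 bits
print(bits_of_evidence(2))  # 2:1 likelihood -> 1.0 bit
```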

But that means a one-bit evidence-giver is someone who is right 2/3 of the time. Why 2/3?
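Where the 2/3 comes from: starting from even prior odds, one bit of evidence (a likelihood ratio of 2) gives posterior odds of 2:1, which is a probability of 2/3. A quick sketch of that update:

```python
# Bayesian odds update: posterior odds = prior odds * likelihood ratio.
prior_odds = 1.0          # 1:1, i.e. P = 0.5 before any evidence
likelihood_ratio = 2.0    # one bit of evidence
posterior_odds = prior_odds * likelihood_ratio
p = posterior_odds / (1.0 + posterior_odds)
print(p)  # 0.666... -> the one-bit evidence-giver is right 2/3 of the time
```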

Finally, if you knew nothing about the bit, and had the probability distribution Q = (P(1)=.5, P(0)=.5), and a one-bit evidence giver gave you 1 bit saying it was a 1, you now have the distribution P = (2/3, 1/3). The KL divergence of P from Q (log base 2) is only 0.0817, so it looks like you've gained .08 bits of information from your 1 bit of evidence. ???
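The 0.0817 figure can be checked directly; this is a minimal sketch of D_KL(P || Q) in bits, with the distributions from the paragraph above:

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) in bits (log base 2), skipping zero-probability terms."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

P = (2/3, 1/3)  # posterior after one bit of evidence for "1"
Q = (0.5, 0.5)  # uniform prior
print(round(kl_divergence(P, Q), 4))  # 0.0817
```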

Comment author: Romashka 11 March 2015 07:03:00PM 1 point

Why does the likelihood grow by exactly a factor of two? (I'm just used to really indirect evidence, which is also seldom binary, in the sense that I only get to see whole suites of traits, which usually go together but in some obscure cases vary in composition. So I guess I have plenty of C-bits that go into B-bits that might go into A-bits, but how do I measure the change in likelihood of A given C? I know it has to do with d-separation, but if C is something directly observable, like biomass, and B is an abstraction, like species, should I not derive A (an even higher abstraction, like 'adaptiveness of spending early years in soil') from C? There are just so many more metrics for C than for B...) Sorry for the ramble, I just felt stupid enough to ask anyway. If you were distracted from answering the parent, please do answer it.

Comment author: PhilGoetz 29 March 2015 02:52:20AM 1 point

I don't understand what you're asking, but I was wrong to say the likelihood grows by 2. See my reply to myself above.