paper-machine comments on Open thread, Mar. 9 - Mar. 15, 2015 - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Basic question about bits of evidence vs. bits of information:
I want to know the value of a random bit. I'm collecting evidence about the value of this bit.
First off, it seems weird to say "I have 33 bits of evidence that this bit is a 1." What is a bit of evidence, if it takes an infinite number of bits of evidence to get 1 bit of information?
Second, each bit of evidence multiplies your odds by 2. E.g., a piece of evidence with a 4:1 likelihood ratio in favor of the bit being a 1 gives you 2 bits of evidence about the value of that bit. Independent evidence with a 2:1 likelihood ratio gives you 1 bit of evidence.
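A quick sketch of that conversion in Python (the helper name is mine, not anything standard): bits of evidence are just the log base 2 of the likelihood ratio.

```python
import math

def bits_of_evidence(likelihood_ratio):
    """Bits of evidence carried by a given likelihood ratio (odds multiplier)."""
    return math.log2(likelihood_ratio)

print(bits_of_evidence(4))  # 4:1 likelihood ratio -> 2.0 bits
print(bits_of_evidence(2))  # 2:1 likelihood ratio -> 1.0 bit
```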
But that means a one-bit evidence-giver is someone who is right 2/3 of the time: their 2:1 likelihood ratio shifts even prior odds to 2:1 posterior odds, i.e. probability 2/3. Why 2/3?
Finally, if you knew nothing about the bit and had the probability distribution Q = (P(1)=.5, P(0)=.5), and a one-bit evidence-giver gave you 1 bit saying it was a 1, you now have the distribution P = (2/3, 1/3). The KL divergence D(P‖Q) (log base 2) is only 0.0817, so it looks like you've gained .08 bits of information from your 1 bit of evidence. ???
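That 0.0817 figure is easy to check numerically. A minimal sketch in Python (my own helper, nothing standard assumed):

```python
import math

def kl_divergence(p, q):
    """D(P||Q) in bits: sum over outcomes of p_i * log2(p_i / q_i)."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))

P = (2/3, 1/3)   # posterior after 1 bit of evidence favoring "1"
Q = (0.5, 0.5)   # uniform prior
print(round(kl_divergence(P, Q), 4))  # -> 0.0817
```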
It seems weird to me because the bits of "33 bits" look like the same units as the bit of "this bit", but they aren't the same. Map/territory. From now on, I'm calling the first A-bits and the second B-bits.
It takes an infinite number of A-bits to know with absolute certainty one B-bit.
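You can see this asymptote numerically. A small sketch (assuming even prior odds and every A-bit favoring "1"): each A-bit doubles the odds, so the posterior probability approaches 1 but never reaches it for any finite number of A-bits.

```python
def posterior_prob(evidence_bits):
    """P(bit = 1) after evidence_bits A-bits favoring 1, starting from even odds."""
    odds = 2 ** evidence_bits   # each A-bit doubles the odds
    return odds / (odds + 1)

for b in (1, 10, 33):
    print(b, posterior_prob(b))  # 1 -> 2/3; 10 -> 1024/1025; 33 -> just shy of 1
```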
What were you expecting?