RobinZ comments on Why (and why not) Bayesian Updating? - Less Wrong

17 Post author: Wei_Dai 16 November 2009 09:27PM


Comments (26)


Comment author: RichardKennaway 19 November 2009 11:34:48PM * 0 points

No. The data in the scatter-plot in that article contain no mutual information between the variables A and B, not merely zero product-moment correlation. I linked there to the data that are plotted; anyone is welcome to have a go at finding mutual information in them.

I challenge anyone to analyse these data and demonstrate substantial mutual information between A and B. If the data are insufficient for your favorite method of analysis, I can generate arbitrarily large quantities of them, and if I were using a quantum RNG instead of a PRNG, there would be absolutely no way to determine any connection between the two variables.

Despite that, there is one. It only shows up when the process from which these data are taken is sampled on a sufficiently short timescale, as in the other data file I linked to in that post.

Comment author: RobinZ 19 November 2009 11:46:38PM 1 point

Correct me if I'm wrong, but would the actual measure of the connection between A and B be more accurately summarized as K(A + B) < K(A) + K(B), then?
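For readers who want to poke at the inequality: Kolmogorov complexity K is uncomputable, so the following is only a rough sketch that swaps in zlib's compressed length as an upper-bound proxy, and reads "A + B" as concatenation. Both choices are assumptions, as are the helper names `compressed_len` and `shared_bytes` and the toy byte strings, which are not the data discussed in this thread. A general-purpose compressor only detects the regularities it was built for (here, repeated substrings), so a small value does not demonstrate independence.

```python
import random
import zlib


def compressed_len(data: bytes) -> int:
    """Length of the zlib-compressed data: a crude upper-bound stand-in for K."""
    return len(zlib.compress(data, 9))


def shared_bytes(a: bytes, b: bytes) -> int:
    """Proxy for K(A) + K(B) - K(A + B), with '+' read as concatenation."""
    return compressed_len(a) + compressed_len(b) - compressed_len(a + b)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
a = bytes(rng.getrandbits(8) for _ in range(4096))

# b_related is a rotation of a, so it repeats long substrings of a.
b_related = a[1000:] + a[:1000]
# b_unrelated is drawn separately and has no designed link to a.
b_unrelated = bytes(rng.getrandbits(8) for _ in range(4096))

print("related:  ", shared_bytes(a, b_related))
print("unrelated:", shared_bytes(a, b_unrelated))
```

The related pair should show a saving of thousands of bytes, while the unrelated pair should show only the few bytes of container overhead that are counted twice.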

Comment author: SilasBarta 20 November 2009 04:06:02PM * 0 points

I believe that's an equivalent way to express "H(X) - H(X|Y) > 0" and "P(A ∩ B) != P(A) * P(B)". Or at least, any one of the three can be derived from any of the others.
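A minimal sketch of the two Shannon-side conditions, assuming X and Y are discrete and their joint distribution is known exactly. The 2x2 tables and the function names below are made up for illustration. It computes H(X) - H(X|Y) from a joint table and shows that the quantity is zero when every cell equals the product of its marginals, and positive otherwise.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a sequence of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(joint):
    """H(X) - H(X|Y) for a table with joint[x][y] = P(X=x, Y=y)."""
    p_x = [sum(row) for row in joint]        # marginal of X
    p_y = [sum(col) for col in zip(*joint)]  # marginal of Y
    h_joint = entropy([p for row in joint for p in row])
    h_x_given_y = h_joint - entropy(p_y)     # chain rule: H(X|Y) = H(X,Y) - H(Y)
    return entropy(p_x) - h_x_given_y


# Every cell equals P(x) * P(y): independent.
independent = [[0.25, 0.25],
               [0.25, 0.25]]
# Cells differ from P(x) * P(y) = 0.25: dependent.
dependent = [[0.4, 0.1],
             [0.1, 0.4]]

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
print(mi_independent, mi_dependent)
```

With data rather than a known distribution, the table has to be estimated from counts, and the estimate is biased upward for small samples, so a small positive value on its own is weak evidence of dependence.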

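Note that the Kullback-Leibler divergence (a generalization of entropy) between the distributions of X and Y equals the expected number of extra bits per symbol required to code data sampled from X when your compression algorithm is optimized for Y. Mutual information is itself the Kullback-Leibler divergence between the joint distribution of X and Y and the product of their marginals, which shows how these all relate.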
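The "extra bits" reading can be checked numerically. The sketch below uses made-up discrete distributions `p` and `q` (assumptions, not anything from the thread) and confirms that cross-entropy minus entropy equals the Kullback-Leibler divergence. It then applies the same function to a joint table and the product of its marginals, which gives the mutual information.

```python
import math


def cross_entropy(p, q):
    """Expected bits per symbol when data from p is coded with a code built for q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)


def entropy(p):
    """Expected bits per symbol with a code built for p itself."""
    return cross_entropy(p, p)


def kl_divergence(p, q):
    """D_KL(p || q) in bits; assumes q is nonzero wherever p is nonzero."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.5, 0.25, 0.25]  # true source distribution (illustrative)
q = [0.25, 0.25, 0.5]  # distribution the code was optimized for (illustrative)

extra_bits = cross_entropy(p, q) - entropy(p)
kl = kl_divergence(p, q)
print(extra_bits, kl)

# Mutual information as a KL divergence: joint versus product of marginals.
joint = [0.4, 0.1, 0.1, 0.4]        # P(x, y) flattened row by row
product = [0.25, 0.25, 0.25, 0.25]  # P(x) * P(y); both marginals are (0.5, 0.5)
mi = kl_divergence(joint, product)
print(mi)
```

Note the asymmetry: `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ, so which distribution is the source and which is the coding model matters.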