PhilGoetz comments on Open Thread: March 2010, part 2 - Less Wrong

Post author: RobinZ 11 March 2010 05:25PM


Comment author: PhilGoetz 21 March 2010 08:29:11PM 0 points

A bin is most informative if the statistics of the bin have the least entropy.

That's a good idea.

A natural measure of the entropy is just -p log p - (1-p) log(1-p), where p is the revealed frequency, but it's not the right one.

I'm glad you said that, since that was what I immediately thought of doing. I'll read up on the beta distribution, thanks!
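The measure quoted above is the binary entropy of the bin's revealed frequency: bins whose frequency is close to 0 or 1 have low entropy and, under this criterion, would be ranked most informative. A minimal sketch (the example frequencies are hypothetical, not from the thread):

```python
import math

def binary_entropy(p):
    """Binary entropy H(p) = -p log p - (1-p) log(1-p), in nats.

    By convention 0 log 0 = 0, so the endpoints return 0.
    """
    if p == 0.0 or p == 1.0:
        return 0.0
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

# Hypothetical revealed frequencies for three bins: the farther p is
# from 1/2, the lower the entropy, so 0.99 would be ranked the most
# informative bin under this measure.
for p in (0.5, 0.9, 0.99):
    print(f"p={p}: H={binary_entropy(p):.4f}")
```

Entropy peaks at p = 1/2 (where it equals log 2) and falls symmetrically toward 0 at either end, which is one way to see why a bin with an extreme revealed frequency is treated as more informative.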

Comment author: wnoise 22 March 2010 03:41:50PM 1 point

I still think it's not a great choice, though clearly my other choices haven't worked well. But please do try it.

Given that the probability follows a continuous distribution, the Fisher information might be a reasonable thing to look at instead. For a single distribution, maximizing it corresponds to minimizing the variance, so my earlier suggestion wasn't as ad hoc as I thought. I'm not sure the equivalence holds for multiple distributions.
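The single-distribution correspondence can be made concrete for the Bernoulli case (my choice of distribution here, not specified in the thread): the Fisher information of one Bernoulli(p) observation is I(p) = 1/(p(1-p)), and the Cramér-Rao bound for n i.i.d. samples is 1/(n·I(p)) = p(1-p)/n, which is exactly the variance of the sample frequency. So maximizing Fisher information and minimizing estimator variance pick out the same bins:

```python
def fisher_information_bernoulli(p):
    """Fisher information of a single Bernoulli(p) observation:
    I(p) = 1 / (p * (1 - p))."""
    return 1.0 / (p * (1.0 - p))

def cramer_rao_bound(p, n):
    """Cramér-Rao lower bound on the variance of an unbiased
    estimator of p from n i.i.d. Bernoulli(p) samples:
    1 / (n * I(p)) = p * (1 - p) / n."""
    return 1.0 / (n * fisher_information_bernoulli(p))

# The bound equals p(1-p)/n, the variance of the sample frequency,
# so for a single Bernoulli distribution the two criteria coincide:
# the p that maximizes I(p) also minimizes the variance bound.
```

Note that I(p) is maximized at the extremes p → 0 or p → 1, matching the entropy-based ranking above for a single bin; whether this agreement survives when comparing across several distributions is the open question in the comment.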