Manfred comments on Putting in the Numbers - Less Wrong

8 Post author: Manfred 30 January 2014 06:41AM


Comments (32)


Comment author: Manfred 31 January 2014 05:04:31AM 0 points

Well, you can still define information entropy for probability density functions, though I suppose if we ignore Jaynes we can probably get paradoxes if we try. In fact, I'm pretty sure just integrating -p*log(p) is right. There's also a problem if you want to have a maxent prior over the integers or over the real numbers; that takes us into the realm of improper priors.

I don't know as much as I should about this topic, so you may have to illustrate using an example before I figure out what you mean.

Comment author: Kurros 31 January 2014 08:24:08AM *  1 point

Yeah, I think integral( -p*log(p) ) is it. The simplest problem is this: suppose I have some parameter x to which I want to assign a prior (perhaps not over the whole real line, so it can be proper as you say -- the boundaries can be part of the maxent condition set). Then via the maxent method I will get a different prior depending on whether I happen to assign the distribution over x, or x^2, or log(x) etc. That is, the prior pdf obtained for one parameterisation is not related to the one obtained for a different parameterisation by the correct transformation rule for probability density functions, so the two contain logically different information. This is upsetting if you have no reason to prefer one parameterisation over another.

In the simplest case where you have no constraints except the boundaries, and maybe expect to get a flat prior (I don't remember if you do when there are boundaries... I think you do in 1D at least) then it is most obvious that a prior flat in x contains very different information to one flat in x^2 or log(x).
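To make the "different information" point concrete, here is a minimal sketch. The interval [1, 2], the cut at 1.5, and the helper name `prob_below` are all made up for illustration; the only assumption is that the transform is monotonically increasing on the interval. It computes P(x < 1.5) under priors that are flat in x, in x^2, and in log(x).

```python
import math

def prob_below(cut, lo, hi, transform):
    """P(x < cut) when the prior is flat in transform(x) for x in [lo, hi].

    Assumes transform is monotonically increasing on [lo, hi] (illustrative helper).
    """
    t_lo, t_hi, t_cut = transform(lo), transform(hi), transform(cut)
    # A flat density in the transformed variable gives probability
    # proportional to length in that variable.
    return (t_cut - t_lo) / (t_hi - t_lo)

LO, HI, CUT = 1.0, 2.0, 1.5  # hypothetical numbers, chosen only for the example

flat_in_x = prob_below(CUT, LO, HI, lambda x: x)
flat_in_x2 = prob_below(CUT, LO, HI, lambda x: x * x)
flat_in_logx = prob_below(CUT, LO, HI, math.log)

print(f"flat in x:      P(x < {CUT}) = {flat_in_x:.4f}")
print(f"flat in x^2:    P(x < {CUT}) = {flat_in_x2:.4f}")
print(f"flat in log(x): P(x < {CUT}) = {flat_in_logx:.4f}")
```

The three priors disagree about the same event, so they cannot all be "the" state of ignorance about x on that interval.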

Comment author: alex_zag_al 02 February 2014 04:25:48AM 0 points

According to Jaynes, just integrating -p*log(p) is actually not right. I don't have the page number on me, unfortunately. But the way he does it is by discretizing the space of possibilities, and taking the limit as the number of discrete possibilities goes to infinity. It's not the limit of the entropy H, since that goes to infinity; it's the limit of H - log(n). It turns out to be a little different from integrating -p*log(p).
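A rough numerical sketch of that limit, under assumptions that are not in the thread: the density p(x) = 2x on [0, 1] is a made-up example, the bins are equal-width (a uniform limiting density of points), and the helper names are hypothetical. The same distribution is also written in the rescaled variable y = 2x, where p(y) = y/2 on [0, 2]. As far as I understand Jaynes's construction, the limit of H - log(n) is -integral( p*log(p/m) ) with m the limiting density of the discrete points, which here is 1/(interval length).

```python
import math

def discrete_entropy(masses):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in masses if p > 0.0)

def bin_masses(cdf, lo, hi, n):
    """Probability mass in each of n equal-width bins on [lo, hi]."""
    width = (hi - lo) / n
    return [cdf(lo + (i + 1) * width) - cdf(lo + i * width) for i in range(n)]

N = 10_000  # number of discrete possibilities

# p(x) = 2x on [0, 1], so the CDF is x^2.
masses_unit = bin_masses(lambda x: x * x, 0.0, 1.0, N)
# Same distribution in y = 2x: p(y) = y/2 on [0, 2], CDF y^2 / 4.
masses_scaled = bin_masses(lambda y: y * y / 4.0, 0.0, 2.0, N)

limit_unit = discrete_entropy(masses_unit) - math.log(N)
limit_scaled = discrete_entropy(masses_scaled) - math.log(N)

# Closed-form values of -integral( p*log(p) ) for the two parameterisations.
h_unit = 0.5 - math.log(2.0)  # p(x) = 2x on [0, 1]
h_scaled = 0.5                # p(y) = y/2 on [0, 2]

print(f"H - log(n), x on [0, 1]: {limit_unit:.6f}")
print(f"H - log(n), y on [0, 2]: {limit_scaled:.6f}")
print(f"-int p log p, x version: {h_unit:.6f}")
print(f"-int p log p, y version: {h_scaled:.6f}")
```

In this sketch H - log(n) comes out the same in both parameterisations, while -integral( p*log(p) ) shifts by log(2) under the rescaling. The two quantities agree only when the interval has length 1, i.e. when m = 1.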