
Oscar_Cunningham comments on The Joys of Conjugate Priors - Less Wrong Discussion

41 Post author: TCB 21 May 2011 02:41AM



Comment author: Oscar_Cunningham 21 May 2011 12:46:11PM *  0 points [-]

Does each likelihood distribution have a unique conjugate prior? It doesn't seem immediately obvious that they do, but people say things like "The conjugate prior for the Bernoulli distribution is the beta distribution".

Comment author: Cyan 21 May 2011 01:17:20PM 0 points [-]

No, in general there are many conjugate priors for a given likelihood, if for no other reason than that any weighted mixture of conjugate priors is also a conjugate prior.
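To make this concrete, here's a sketch (my own illustration, not from the thread) of a two-component beta mixture prior for a Bernoulli likelihood: after conditioning on the data, the posterior is again a beta mixture, with each component updated in the usual conjugate way and the mixture weights reweighted by each component's marginal likelihood.

```python
# Sketch: a mixture of conjugate priors is itself conjugate.
# Beta-mixture prior, Bernoulli likelihood, exact closed-form update.
from math import lgamma, exp

def log_beta(a, b):
    """log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def update_mixture(components, k, n):
    """components: list of (weight, a, b) Beta components.
    Data: k successes in n Bernoulli trials.
    Returns the posterior, again a list of (weight, a, b)."""
    new = []
    for w, a, b in components:
        # marginal likelihood of the data under this component
        log_ml = log_beta(a + k, b + n - k) - log_beta(a, b)
        new.append((w * exp(log_ml), a + k, b + n - k))
    total = sum(w for w, _, _ in new)
    return [(w / total, a, b) for w, a, b in new]

# a bimodal prior over theta, then 9 successes in 10 trials
prior = [(0.5, 2.0, 8.0), (0.5, 8.0, 2.0)]
posterior = update_mixture(prior, k=9, n=10)
```

The posterior stays inside the family of beta mixtures, which is the sense in which the mixture is conjugate; the data here concentrate nearly all the weight on the high-theta component.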

Comment author: Matt_Simpson 21 May 2011 06:25:58PM 0 points [-]

What about the converse: does a conjugate prior exist for each likelihood (assume "nice" families of probability measures, with a Radon-Nikodym derivative w.r.t. counting measure or Lebesgue measure if you like)? I think probably not (with a fairly high degree of certainty), but I don't think I've ever seen a proof of it.

Comment author: Cyan 21 May 2011 10:57:31PM 2 points [-]

The existence of a conjugate prior is not guaranteed. They exist for members of the exponential family, which is a very broad and useful class of distributions. I don't know of a proof, but if a gun were held to my head, I'd assert with reasonable confidence that the Cauchy likelihood doesn't have a conjugate prior.
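For an exponential-family example of the kind of closed-form update being referred to (my illustration, not from the comment): a Gamma(a, b) prior (shape a, rate b) on a Poisson rate updates to Gamma(a + sum of counts, b + number of observations).

```python
# Sketch: conjugate update for the Poisson likelihood, an
# exponential-family member. Prior: Gamma(shape=a, rate=b) on the rate.
def poisson_gamma_update(a, b, data):
    """Return the posterior (shape, rate) after observing Poisson counts."""
    return a + sum(data), b + len(data)

a_post, b_post = poisson_gamma_update(2.0, 1.0, [3, 5, 4])
post_mean = a_post / b_post  # posterior mean of the rate
```

The update just adds the sufficient statistics to the hyperparameters, which is exactly the structure that exponential families guarantee and that a likelihood like the Cauchy lacks.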

Comment author: alex_zag_al 22 April 2014 10:58:06PM *  1 point [-]

I'm pretty sure that the Cauchy likelihood, like the other members of the t family, is a weighted mixture of normal distributions (with a gamma distribution over the inverse of the variance).

EDIT: There's a paper on this, "Scale mixtures of normal distributions" by Andrews and Mallows, if you want the details.
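A quick Monte Carlo check of this claim (a sketch of mine, not from the paper): the standard Cauchy is the t distribution with 1 degree of freedom, so drawing a precision from Gamma(1/2, rate 1/2) and then a normal with that precision should reproduce it. One easily testable Cauchy property is that P(|X| < 1) = 1/2.

```python
# Sketch: standard Cauchy as a scale mixture of normals,
# with precision lam ~ Gamma(shape=1/2, rate=1/2) and X | lam ~ N(0, 1/lam).
import random, math

random.seed(0)
n = 200_000
samples = []
for _ in range(n):
    lam = random.gammavariate(0.5, 2.0)   # shape 1/2, scale 2 (i.e. rate 1/2)
    samples.append(random.gauss(0.0, 1.0 / math.sqrt(lam)))

# For a standard Cauchy, P(|X| < 1) = 1/2, so the median of |X| should be ~1.
med_abs = sorted(abs(x) for x in samples)[n // 2]
```

Note that moments are useless as a check here (the Cauchy has none), which is why the quantile is the thing to test.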

Comment author: Cyan 23 April 2014 03:23:42AM *  2 points [-]

Oh, for sure it is. But that only gives it a conditionally conjugate prior, not a fully (i.e., marginally) conjugate prior. That's great for Gibbs sampling, but not for pen-and-paper computations.

In the three years since I wrote the grandparent, I've found a nice mixture representation for any unimodal symmetric distribution:

Suppose f(x), the pdf of a real-valued X, is unimodal and symmetric around 0. If W is positive-valued with pdf g(w) = -2w f '(w) and U ~ Unif(-W, W), then U's marginal distribution is the same as X's. Proof is by integration by parts. ETA: No, wait, it's direct. Derp.

I don't think it would be too hard to convert this width-weighted-mixture-of-uniforms representation to a precision-weighted-mixture-of-normals representation.
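Here's a numerical check of the representation (my sketch; I normalize the mixing density as g(w) = -2w f '(w) so that it integrates to 1). For the standard normal, f '(w) = -w φ(w), so g(w) = 2w² φ(w), which is the chi distribution with 3 degrees of freedom. Drawing W ~ chi₃ and then U ~ Unif(-W, W) should therefore give back a standard normal.

```python
# Sketch: mixture-of-uniforms representation of the standard normal.
# W ~ chi with 3 df (sqrt of a sum of 3 squared normals), U ~ Unif(-W, W).
import random, math

random.seed(1)
n = 200_000
u = []
for _ in range(n):
    w = math.sqrt(sum(random.gauss(0, 1) ** 2 for _ in range(3)))  # W ~ chi_3
    u.append(random.uniform(-w, w))

mean = sum(u) / n
var = sum(x * x for x in u) / n                     # should be ~1
inside_1sd = sum(1 for x in u if abs(x) < 1) / n    # should be ~0.6827
```

The variance check also follows analytically: Var(U | W) = W²/3 and E[W²] = 3 for a chi₃ variable, so Var(U) = 1 as it should be.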

Comment author: Matt_Simpson 22 May 2011 07:38:37PM 0 points [-]

It turns out that it's not too difficult to construct a counterexample if you restrict the hyperparameter space of the family of prior distributions. For example, let theta take only two values, 0 and 1, so the prior just puts mass p on theta = 0 (i.e., P(theta = 0) = p) and mass 1 - p on theta = 1. If you restrict the family to p < 0.5, then for some likelihoods f(x|theta) and some values of x the posterior puts mass greater than 0.5 on theta = 0, so it falls outside the family.
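With some illustrative numbers of my own choosing (the comment doesn't specify any): take p = 0.4, which is inside the restricted family, and a likelihood that favors theta = 0 by 9 to 1.

```python
# Sketch of the counterexample: theta in {0, 1}, prior family restricted
# to P(theta = 0) = p < 0.5. A likelihood favoring theta = 0 pushes the
# posterior mass on theta = 0 above 0.5, out of the family.
def posterior_p0(p, lik0, lik1):
    """P(theta = 0 | x) from prior p and likelihoods f(x|0), f(x|1)."""
    return p * lik0 / (p * lik0 + (1 - p) * lik1)

p_post = posterior_p0(p=0.4, lik0=0.9, lik1=0.1)  # prior is in the family,
# but p_post = 0.36 / 0.42 ≈ 0.857 > 0.5, so the posterior is not.
```

So the restricted family is closed under neither this likelihood nor this observation, which is all the counterexample needs.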