gwern comments on Probability, knowledge, and meta-probability - Less Wrong

38 Post author: David_Chapman 17 September 2013 12:02AM


Comment author: gwern 14 September 2013 10:21:34PM 3 points [-]

So perhaps this is for the next post, but are these 'metaprobabilities' just regular hyperparameters?

Comment author: lucidian 15 September 2013 11:47:41PM *  2 points [-]

I was wondering this too. I haven't looked at this A_p distribution yet (nor have I read all the comments here), but having distributions over distributions is, like, the core of Bayesian methods in machine learning. You don't just keep a single estimate of the probability; you keep a distribution over possible probabilities, exactly like David is saying. I don't even know how updating your probability distribution in light of new evidence (aka a "Bayesian update") would work without this.
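(To make the "distribution over possible probabilities" idea concrete: a minimal sketch using the standard Beta-Bernoulli conjugate pair. The prior parameters and observation counts here are hypothetical, purely for illustration.)

```python
def beta_update(a, b, successes, failures):
    """Conjugate Bayesian update: a Beta(a, b) distribution over a
    Bernoulli parameter, after observing outcomes, is again a Beta."""
    return a + successes, b + failures

def beta_mean(a, b):
    """Collapsing the distribution-over-probabilities back to a single
    point estimate (the posterior mean)."""
    return a / (a + b)

# Start maximally uncertain about the payout probability: Beta(1, 1), i.e. uniform.
a, b = 1, 1

# Hypothetical data: the box pays out 3 times and fails once.
a, b = beta_update(a, b, successes=3, failures=1)

print(beta_mean(a, b))  # posterior mean of the payout probability
```

The point is that the update acts on the whole Beta distribution (its parameters), not on a single probability estimate; the single number only appears when you deliberately collapse it at the end.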

Am I missing something about David's post? I did go through it rather quickly.

Comment author: David_Chapman 14 September 2013 11:13:07PM *  1 point [-]

I'm sure you know more about this than I do! Based on a quick Wiki check, I suspect that formally the A_p are one type of hyperprior, but not all hyperpriors are A_p (a/k/a metaprobabilities).

Hyperparameters are used in Bayesian sensitivity analysis, a/k/a "Robust Bayesian Analysis", which I recently accidentally reinvented here. I might write more about that later in this sequence.

Comment author: Vaniver 14 September 2013 11:24:33PM *  5 points [-]

When you use an underscore in a name, make sure to escape it first, like so:

I suspect that formally the A\_p are one type of [hyperprior](http://en.wikipedia.org/wiki/Hyperprior), but not all hyperpriors are A\_p (a/k/a metaprobabilities).

(This is necessary because underscores are yet another way to make things italic, and only applies to comments, as posts use different formatting.)

Comment author: David_Chapman 15 September 2013 02:35:23AM 1 point [-]

Thanks! Fixed.

Comment author: alex_zag_al 17 September 2013 12:54:47AM *  0 points [-]

Yeah - from what I've seen, something mathematically equivalent to A_p distributions is commonly used, but that's not what it's called.

Like, I think you might call the case in this problem "a Bernoulli random variable with an unknown parameter". (The Bernoulli random variable being 1 if it gives you $2, 0 if it gives you $0). And then the hyperprior would be the probability distribution of that parameter, I guess? I haven't really heard that word before.

ET Jaynes, of course, would never talk like this because the idea of a random quantity existing in the real world is a mind projection fallacy. Thus, no "random variables". So he uses the A_p distribution as a way of thinking about the same math without the idea of randomness. Jaynes's A_p in this case corresponds exactly to the more traditional "the parameter of the Bernoulli random variable is p".
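(A numerical sketch of that correspondence, using a discrete distribution over a few candidate values of p for simplicity; the weights are hypothetical. By definition P(A | A_p) = p, so P(A) is just the expectation of p, and conditioning on an observation of A reweights each A_p by p, which is exactly the update you'd do on a prior over a Bernoulli parameter.)

```python
# Candidate values of p, and the prior weight assigned to each proposition A_p.
ps      = [0.0, 0.5, 1.0]
weights = [0.3, 0.4, 0.3]

# P(A) = expectation of p under the A_p distribution.
p_A = sum(p * w for p, w in zip(ps, weights))

# Observing A reweights each A_p by its likelihood p (then renormalizes) --
# the same arithmetic as updating a prior over a Bernoulli parameter.
posterior = [p * w / p_A for p, w in zip(ps, weights)]
```

Note that the A_p with p = 0 is killed outright by a single observation of A, while the weight shifts toward p = 1; no notion of a "random quantity in the world" is needed anywhere in the calculation.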

(btw I have a purely mathematical question about the A_p distribution chapter, which I posted to the open thread: http://lesswrong.com/lw/ii6/open_thread_september_28_2013/9pbn if you know the answer I'd really appreciate it if you told me)