You do not need a probability distribution on your probability distribution to represent uncertainty.
I think I do.
The uncertainty is captured by the spread (variance) of your prior.
First, my prior is a probability distribution, isn't it? Second, some but not all uncertainty is captured by the variance of my prior. For example, I could be uncertain about the shape of the distribution -- say, it might be skewed but I'm not sure whether it actually is. Or I don't know whether I'm looking at a Student's t with several degrees of freedom (which has a defined mean) or at a Cauchy (which doesn't). How will I express that uncertainty?
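To make the shape question concrete, here is a minimal sketch of one way to express it: put a weight on each candidate shape and let the data move the weights. The 50/50 starting weights, the choice of 5 degrees of freedom, and the three data points are all made up for illustration, and the function names are hypothetical.

```python
import math

def student_t_pdf(x, nu):
    """Density of a standard Student's t with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(x * x / nu))

def cauchy_pdf(x):
    """Density of a standard Cauchy (the same as a Student's t with nu = 1)."""
    return 1.0 / (math.pi * (1.0 + x * x))

def update_shape_weights(weights, data, nu=5):
    """Bayes update of the weights on the two candidate shapes."""
    log_post = {
        "student_t": math.log(weights["student_t"])
                     + sum(math.log(student_t_pdf(x, nu)) for x in data),
        "cauchy": math.log(weights["cauchy"])
                  + sum(math.log(cauchy_pdf(x)) for x in data),
    }
    # Subtract the largest log value before exponentiating, for stability.
    top = max(log_post.values())
    unnorm = {k: math.exp(v - top) for k, v in log_post.items()}
    total = sum(unnorm.values())
    return {k: v / total for k, v in unnorm.items()}

# Assumed starting point: no idea which shape is right.
prior = {"student_t": 0.5, "cauchy": 0.5}
# Illustrative data with one extreme outlier.
posterior = update_shape_weights(prior, [0.3, -1.2, 50.0])
print(posterior)
```

The weights on the shapes are exactly a probability distribution over probability distributions. A single extreme outlier pushes most of the weight toward the heavier-tailed Cauchy, which is information a variance alone would not carry.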
The only meaningful interpretation of a probability on a probability is that you are unsure about what you actually believe.
So, what's wrong with that? Of course I am unsure of what I actually believe -- say, I have some prior about the future values of X, but my confidence in that prior is not 100%; it's quite possible that my prior is wrong. You basically want to collapse all the meta-levels into a single prior, whereas I think that keeping one or more meta-levels is actually useful for thinking about the situation.
I suggest that:
Example: you are looking at the results of a scientific experiment. You have two rival theories for what's going on. One predicts that the frobulator will show an average reading of 11.3, with variance of 3 units and something very close to a normal distribution. One predicts the ...
I often like to think of my epistemic probability assignments in terms of probabilities-of-probabilities, or meta-probabilities. In other words, what probability would I assign to my probability estimate being accurate? Am I very confident, am I only mildly confident, or do I only have a vague clue?
I often think of it as a sort of bell curve, with the x-axis being possible probability estimates and the y-axis being my confidence in those estimates. So if I have very low confidence in my estimate, the bell will be low and spread out, and if I have high confidence, it will be tall and narrow.
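One way to draw such a curve, as a rough sketch: use a Beta distribution, which lives on the range 0 to 1 like a probability does. The Beta family is my assumption here, not something the picture above requires, and the two parameter pairs are arbitrary examples of a "vague" and a "confident" 50% estimate.

```python
import math

def beta_pdf(p, a, b):
    """Height of the Beta(a, b) curve at probability estimate p (0 < p < 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))

def beta_mean_sd(a, b):
    """Center and spread of the Beta(a, b) curve."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)

# Two curves centered on the same 50% estimate (illustrative parameters).
vague = (2, 2)        # low confidence: wide and low
confident = (50, 50)  # high confidence: narrow and tall

for name, (a, b) in [("vague", vague), ("confident", confident)]:
    mean, sd = beta_mean_sd(a, b)
    height = beta_pdf(0.5, a, b)
    print(f"{name}: center={mean:.2f} spread={sd:.3f} height at center={height:.2f}")
```

Both curves have the same center, so collapsing each one to a single number gives 50% in both cases. The difference in confidence shows up only in the spread and the height.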
Here are a few issues and insights that have come up when discussing or thinking about this:
What would a meta-probability actually mean?
There are two ways I have for thinking about it:
1) The meta-probability is my prediction for how likely I am to change my mind (and to what extent) as I learn more information about the topic.
2) I know that I'm not even close to being an ideal Bayesian agent, and that my best shots at a probability estimate are fuzzy, imprecise, and likely mistaken anyway. The meta-probability is my prediction for what an ideal Bayesian agent would assign as the probability for the question at hand.
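The first interpretation can be sketched with a little arithmetic. Assume (my assumption, for illustration only) that the estimate is about a yes/no event, that confidence is represented by a Beta curve with parameters a and b, and that new evidence arrives as counts of "yes" and "no" outcomes. Then updating is just adding the counts, and we can see how far the estimate moves.

```python
def update(a, b, yes, no):
    """Standard Beta update: add the observed counts to the parameters."""
    return a + yes, b + no

def mean(a, b):
    """The single-number probability estimate implied by Beta(a, b)."""
    return a / (a + b)

def shift_after(a, b, yes, no):
    """How far the estimate moves after seeing the new evidence."""
    new_a, new_b = update(a, b, yes, no)
    return mean(new_a, new_b) - mean(a, b)

# Both start at 50%, then see the same 8 "yes" and 2 "no" outcomes.
vague_shift = shift_after(2, 2, yes=8, no=2)        # low confidence
confident_shift = shift_after(50, 50, yes=8, no=2)  # high confidence

print(f"vague estimate moves by {vague_shift:.3f}")
print(f"confident estimate moves by {confident_shift:.3f}")
```

The low-confidence estimate moves much further on the same evidence, which is what "how likely I am to change my mind, and to what extent" looks like in this toy setup.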
What's the point?
Primarily it's just useful for conveying how sure I am of the probability estimate I'm assigning. It's a way of conveying that a coin flip is 50% heads in a very different sense than me saying "I have not the slightest clue whether it'll rain tomorrow on the other side of the world, and if I need to bet on it I'd give it ~50% odds". I've seen other people convey related sentiments by saying things like, "well 90% is probably too low an estimate, and 99% is probably too high, so somewhere between those". I'd just view the 90% and 99% figures as maybe 95% confidence bounds on a bell curve.
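As a rough illustration of the "somewhere between 90% and 99%" reading, here is a sketch that treats the two figures as the edges of a central 95% interval on a normal bell curve. The normal shape and the 95% level are assumptions taken from the sentence above, not a claim about what those other people meant.

```python
from statistics import NormalDist

low, high = 0.90, 0.99

# z-score that leaves 2.5% in each tail of a standard normal curve.
z = NormalDist().inv_cdf(0.975)

center = (low + high) / 2          # best single estimate
spread = (high - low) / (2 * z)    # standard deviation of the bell

curve = NormalDist(center, spread)
mass_between = curve.cdf(high) - curve.cdf(low)
mass_above_one = 1 - curve.cdf(1.0)

print(f"center={center:.3f} spread={spread:.4f}")
print(f"weight between the bounds={mass_between:.3f}")
print(f"weight above 100%={mass_above_one:.4f}")
```

One caveat the sketch makes visible: a plain bell curve this close to 100% puts a small amount of weight on impossible values above 1, so a curve that is bounded between 0 and 1 may be a better fit near the extremes.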
Why not keep going and say how confident you are about your confidence estimates?
True, I could do this, and I sometimes will if needed, by visualizing a bit of fuzziness in my bell curve. But in any case a single meta-level is usually enough for my purposes.
Is there any use for such a view in terms of instrumental or utilitarian calculations?
Not sure. I've seen some relevant discussion by Scott Alexander and Holden Karnofsky, but I'm not sure I followed everything there. I also suspect that if you view it as a prediction of how your views might change if you learned more about the subject, then this might imply that it's useful in deciding how much time to invest in further research.
Thoughts?
[Note 1: I discussed this topic about a year ago on LessWrong, and got some insightful responses then. Some commenters disagreed with me then, and I predict that they'll do so again here - I'd give it, oh, say an 80% chance, moderate confidence ;).]
[Note 2: If you could try to avoid complicated math in your responses that would be appreciated. I'm still on the precalculus level here.]
[Note 3: As I finished writing this I dug up some interesting LessWrong posts on the subject, with links to yet more relevant posts.]