For questions of a continuous nature, you think that subjective probability is best expressed as a distribution over the continuous support, right? I view these sorts of distributions over distributions the same way: there's some continuous parameter potentially in the world (the proportion of white and black balls in the urn), and that continuous parameter may determine my subjective probability about binary events (whether ball #1001 is white or black).
Now, whether or not this formalism stretches to other ideas might be controversial. I might consider "the strength of the argument for Conclusion X" as having continuous support, possibly from 0 to 1, and so be able to express, through my probability distribution over it, how much more I expect to learn about the issue, but I can see reasons to avoid doing that.
[edit] That is, rather than modifying the likelihood ratios of all of the pieces of evidence for or against the argument being strong, I can modify my distribution on it. I think this runs into trouble with, say, argument screening off authority: that's a case where you really do want to modify the likelihood ratios.
I'm pretty sure nothing I say here will be new to you, so consider this more of an effort to explain to you where I (and I think also Jonah, though I won't categorically speak for him) am coming from.
Jonah was looking at probability distributions over estimates of an unknown probability (such as the probability of a coin coming up heads). Unless you have some objection to probability distributions per se, I don't see anything wrong with taking a probability distribution to describe one's current state of knowledge of a probability.
If your goal is to answer the question "Will this coin come up heads?" for a single coin toss, and you can't run any experiments to augment your knowledge about the model, but only have access to your prior knowledge, then it's true that all your knowledge would be captured in a single probability number. If you have a subjective probability distribution, that single number is simply the expected value of the distribution.
If, however, you are trying to answer a similar question "Will this coin come up heads when I toss it on such-and-such date at such-and-such time?" but you can run experiments before that, it would make sense to use those experiments to try to understand the model that determines how the coin tossing works. Your model may be something like "with fairly extreme probability, I believe that there is a probability p such that the coin toss turns up heads with probability p, and that p is independent of the time and place that it is tossed. I also have a Bayesian prior for the probability distribution of the probability p." You would start with the prior and then run coin-tossing experiments to continue updating that probability distribution of probabilities. The day before your grand toss, you'll need to take the expected value of the probability distribution that you have obtained by then. But at intermediate stages it would make sense to store the entire probability distribution rather than the expected value (the point estimate of the probability). For instance, if you think that the coin is either fair (probability 1/3), or always heads (probability 1/3), or always tails (probability 1/3), then it's worth storing that full prior rather than simply saying that there's a 50% chance of it turning up heads, so that you can appropriately update on new evidence. I could also construct higher-order versions of this hypothetical, but they would be too tedious to describe.
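To make the fair / always-heads / always-tails example concrete, here is a minimal sketch of the update, assuming the tosses are independent given p. The function names `predictive_heads` and `update` are just illustrative choices of mine, not anything standard.

```python
from fractions import Fraction

# Prior from the example: each hypothesis about p (chance of heads)
# gets weight 1/3. Keys are values of p, values are prior weights.
prior = {
    Fraction(1, 2): Fraction(1, 3),  # fair coin
    Fraction(1): Fraction(1, 3),     # always heads
    Fraction(0): Fraction(1, 3),     # always tails
}

def predictive_heads(dist):
    """Probability that the next toss is heads: the expected value of p."""
    return sum(p * w for p, w in dist.items())

def update(dist, heads):
    """Bayes update of the distribution over p after observing one toss."""
    unnorm = {p: w * (p if heads else 1 - p) for p, w in dist.items()}
    total = sum(unnorm.values())  # assumes the observation was possible
    return {p: w / total for p, w in unnorm.items()}

posterior = update(prior, heads=True)

print(predictive_heads(prior))      # 1/2
print(predictive_heads(posterior))  # 5/6
```

Someone who had stored only the point estimate 1/2 would have no principled way to get from 1/2 to 5/6 after seeing one head; the full distribution is what tells you how far to move.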
Secondly, as Jonah said, if you're running the coin-tossing experiment multiple times and asking for the probability of, say, all heads, then the subjective probability distribution for p does matter, and just the point estimate (expected value) of p would give a wrong answer. The probability of n heads in a row is the expected value of p^n, which in general differs from the n-th power of the expected value of p.
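As a quick check of the all-heads point, here is a small sketch using the fair / always-heads / always-tails prior from my earlier example, again assuming tosses are independent given p. The helper names are hypothetical.

```python
from fractions import Fraction

# Same illustrative prior: values of p mapped to their weights.
prior = {
    Fraction(1, 2): Fraction(1, 3),  # fair coin
    Fraction(1): Fraction(1, 3),     # always heads
    Fraction(0): Fraction(1, 3),     # always tails
}

def prob_all_heads(dist, n):
    """P(n heads in a row) using the full distribution: E[p^n]."""
    return sum(w * p**n for p, w in dist.items())

def point_estimate_all_heads(dist, n):
    """What you get if you keep only the point estimate: (E[p])^n."""
    mean = sum(w * p for p, w in dist.items())
    return mean**n

print(prob_all_heads(prior, 2))            # 5/12
print(point_estimate_all_heads(prior, 2))  # 1/4
```

The two agree for a single toss (both give 1/2) and come apart as soon as n is 2 or more, because the always-heads hypothesis contributes its full weight to every run of heads.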
Sorry if this isn't clear -- I can elaborate more later.
"Jonah was looking at probability distributions over estimates of an unknown probability (such as the probability of a coin coming up heads)"
It sounds like you are just confusing epistemic probabilities with propensities, or frequencies. I.e., due to physics, the shape of the coin, and your style of flipping, a particular set of coin flips will have certain frequency properties that you can characterise by a bias parameter p, which you call "the probability of landing on heads". This is just a parameter of a stochastic model, not a degree of belief.
However, you can have a degree of belief about what p is, no problem. So you are talking about your degree of belief that a set of coin flips has certain frequentist properties, i.e. your degree of belief in a particular model for the coin flips.
edit: I could add that GIVEN a stochastic model you then have degrees of belief about whether a given coin flip will result in heads. But this is a conditional probability: see my other comment in reply to Vanvier. This is not, however, "beliefs about beliefs". It is just standard Bayesian modelling.