I've had a bit of success getting people to understand Bayesianism at parties and the like, and I'm posting the thought experiment I came up with to see whether it can be improved, or whether an entirely different thought experiment would be grasped more intuitively in that context:
Say there is a jar filled with dice. There are two types of dice in the jar: one is an 8-sided die with the numbers 1 through 8, and the other is a trick die with a 3 on every face. The jar contains the two types in equal proportion. If a friend of yours grabbed a die from the jar at random, rolled it, and told you that it landed on a 3, is it more likely that they grabbed the 8-sided die or the trick die?
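For anyone who wants to check the arithmetic, here is a minimal sketch of the update in Python (the variable names are mine, not part of the puzzle):

```python
# Priors: the jar holds the two die types in equal proportion.
prior_8sided = 0.5
prior_trick = 0.5  # 3 on every face

# Likelihood of rolling a 3 with each die type.
like_8sided = 1 / 8
like_trick = 1.0

# Bayes' theorem: posterior is proportional to prior times likelihood.
evidence = prior_8sided * like_8sided + prior_trick * like_trick
post_8sided = prior_8sided * like_8sided / evidence
post_trick = prior_trick * like_trick / evidence

print(post_8sided, post_trick)  # ~0.111 vs ~0.889: 8-to-1 odds for the trick die
```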
I originally came up with this idea to explain falsifiability, which is why I didn't go with, say, the example in the better article on Bayesianism (in my version, any number besides a 3 refutes the possibility that the trick die was picked) and the problem of a hypothesis that explains too much contradictory data. So eventually I increase the number of sides the die has (say, a hypothetical 50-sided die), vary the types of dice in the jar (100-sided, 6-sided, trick die), and vary the distribution of dice in the jar (90% of the dice are 200-sided but a 3 is rolled, etc.). Again, I've been discussing this at parties where alcohol is flowing and cognition is impaired, yet people understand it, so I figure that if it works there, it can be understood intuitively by many people.
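The generalizations are the same calculation with different numbers plugged in. A quick sketch, assuming the same single-roll setup (the function name and the example mixtures are my illustration):

```python
def posterior(priors, likelihoods):
    """Posterior probability of each die type after one observed roll.

    priors      -- prior probability of drawing each die type from the jar
    likelihoods -- probability that each die type produces the observed number
    """
    joint = [p * q for p, q in zip(priors, likelihoods)]
    evidence = sum(joint)
    return [j / evidence for j in joint]

# 90% of the dice are 200-sided, 10% are trick dice, and a 3 is rolled:
print(posterior([0.9, 0.1], [1 / 200, 1.0]))  # ~[0.043, 0.957]

# Even jar of 100-sided, 6-sided, and trick dice; a 3 is rolled:
print(posterior([1 / 3, 1 / 3, 1 / 3], [1 / 100, 1 / 6, 1.0]))  # trick die still dominates
```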
Statisticians, by and large, don't lose sleep over this problem. Even in your not-quite-fair die problem, the calculations involved are really hard. It wasn't made explicit in my comment, but I wasn't even assuming that opposite sides have equal probability, because some subtle error in the setup could break the symmetry. In the Bayesian case, I considered mentioning a mixture model that would take advantage of the symmetry if the data supported it. In KDD Cup-type problems, nobody is worried that a domain expert will show up with a winning solution that doesn't even need to see the training data (why would it need to, if it were maximally physically justified?).
Bayesians have made peace with bias. In fact, decision rules that are both Bayes and unbiased have zero risk, which is a nice way of saying that they don't exist in non-trivial situations. Noorbaloochi and Meeden (1983) have to go through definitional contortions to establish a positive connection between being Bayes and unbiased.
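For the curious, the standard argument is short; here it is sketched under squared-error loss (a textbook derivation, not something taken from the paper above):

```latex
\[
r \;=\; \mathbb{E}(\delta-\theta)^2
  \;=\; \mathbb{E}\,\delta^2 \;-\; 2\,\mathbb{E}\,\delta\theta \;+\; \mathbb{E}\,\theta^2 .
\]
Unbiasedness, $\mathbb{E}[\delta \mid \theta] = \theta$, gives
$\mathbb{E}\,\delta\theta = \mathbb{E}\,\theta^2$, hence $r = \mathbb{E}\,\delta^2 - \mathbb{E}\,\theta^2$.
Being Bayes, $\delta = \mathbb{E}[\theta \mid X]$, gives
$\mathbb{E}\,\delta\theta = \mathbb{E}\,\delta^2$, hence $r = \mathbb{E}\,\theta^2 - \mathbb{E}\,\delta^2$.
Both at once force $r = -r$, so $r = 0$.
```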
Bias is what lets you get good inferential performance in small-sample regimes. If I observe side counts (2, 0, 1, 3, 2, 2), I'd be okay with my estimator inferring equal side probabilities, because that will be closer to the truth than the unbiased estimator, which guesses (0.2, 0.0, 0.1, 0.3, 0.2, 0.2); ten rolls is not enough data to tell me that I should never see a "2". On the other hand, with side counts (200, 0, 100, 300, 200, 200), something closer to the unbiased estimator seems like a good idea. As long as the estimator is asymptotically unbiased, you can even keep consistency.
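A hedged sketch of one estimator that behaves this way, using the posterior mean under a symmetric Dirichlet prior (the add-one smoothing is my concrete choice, not something the comment prescribes):

```python
def smoothed_probs(counts, pseudo=1.0):
    """Posterior-mean estimate under a symmetric Dirichlet(pseudo) prior.

    Biased toward the uniform distribution, but the bias shrinks as the
    counts grow, so the estimator remains consistent.
    """
    total = sum(counts) + pseudo * len(counts)
    return [(c + pseudo) / total for c in counts]

def mle_probs(counts):
    """Unbiased estimate: raw relative frequencies."""
    total = sum(counts)
    return [c / total for c in counts]

print(mle_probs([2, 0, 1, 3, 2, 2]))       # (0.2, 0.0, 0.1, 0.3, 0.2, 0.2)
print(smoothed_probs([2, 0, 1, 3, 2, 2]))  # pulled toward uniform at n = 10
print(smoothed_probs([200, 0, 100, 300, 200, 200]))  # nearly matches the MLE
```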
Unlike cognitive bias, we have control over our statistical bias, and we should not be squeamish about using it to learn about the parts of the world that we can't model so accurately that we wouldn't need statistics anyway.
The point of the not-quite-fair die example was to demonstrate where 'probabilities' come from. The fair die, after several bounces, maps the initial state space onto the final side-up states in a particular way, so that 1/6 of even a very tiny part (hypervolume) of the initial state space maps to each side-up final state. The not-totally-fair die deviates somewhat from that. Any problem involving dice can be solved from first principles, all the way from this mapping, through selection of the parts of the initial state space that are compatible with observation, to the answer...
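A toy illustration of that mapping (entirely my own construction: a doubling map stands in for the bounce dynamics, six equal bins stand in for the faces):

```python
import random

def final_face(x0, bounces=30):
    """Follow one initial state through a chaotic stand-in for the bounces.

    The doubling map x -> 2x mod 1 stretches and folds the initial state
    space the way repeated bounces do, so nearby starting points end up
    on different 'faces'. The unit interval is cut into six equal bins,
    one per face.
    """
    x = x0
    for _ in range(bounces):
        x = (2 * x) % 1.0
    return int(x * 6)

# Even a very tiny hypervolume of initial states spreads almost evenly
# over the six faces, which is where the 1/6 'probabilities' come from.
counts = [0] * 6
n = 60_000
for _ in range(n):
    x0 = 0.123456 + random.random() * 1e-6  # tiny region of initial states
    counts[final_face(x0)] += 1
print([c / n for c in counts])  # each close to 1/6
```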