I've had a bit of success with getting people to understand Bayesianism at parties and such, and I'm posting this thought experiment that I came up with to see if it can be improved or if an entirely different thought experiment would be grasped more intuitively in that context:
Say there is a jar filled with dice. There are two types of dice in the jar: one is an 8-sided die with the numbers 1 through 8, and the other is a trick die that has a 3 on every face. The jar holds equal numbers of 8-sided dice and trick dice. If a friend of yours grabbed a die from the jar at random, rolled it, and told you that it landed on a 3, is it more likely that your friend grabbed the 8-sided die or the trick die?
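For anyone who wants to check the answer with arithmetic, here is a minimal sketch of the calculation using Bayes' theorem. The variable names are just illustrative, and it assumes the 8-sided die is fair, which the setup implies but does not state outright.

```python
from fractions import Fraction

# Prior: the jar holds equal numbers of each type of die.
prior_fair = Fraction(1, 2)
prior_trick = Fraction(1, 2)

# Likelihood of rolling a 3 with each die.
p3_fair = Fraction(1, 8)   # assumption: fair die, one face in eight shows a 3
p3_trick = Fraction(1)     # every face shows a 3

# Total probability of seeing a 3, then Bayes' theorem for each hypothesis.
evidence = prior_fair * p3_fair + prior_trick * p3_trick
posterior_trick = prior_trick * p3_trick / evidence
posterior_fair = prior_fair * p3_fair / evidence

print(posterior_trick)  # 8/9
print(posterior_fair)   # 1/9
```

So after hearing "3", the trick die is eight times as likely as the 8-sided die, even though both started out equally likely.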
I originally came up with this idea to explain falsifiability (i.e., rolling any number other than a 3 refutes the possibility that the trick die was picked) and the problem with a hypothesis that explains too much contradictory data, which is why I didn't go with, say, the example in the better article on Bayesianism. Eventually I increase the number of sides on the die (like a hypothetical 50-sided die), the different types of dice in the jar (100-sided, 6-sided, trick die), and the distribution of dice in the jar (90% of the dice are 200-sided but a 3 is rolled, etc.). Again, I've been discussing this at parties where alcohol is flowing and cognition is impaired, yet people understand it, so I figure that if it works there, many people can grasp it intuitively.
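The variations above (more sides, more types of dice, uneven jars) all follow the same recipe, which can be sketched as one small function. This is only an illustration: the `posterior` helper and the jar format are hypothetical, every die is assumed fair over its listed faces, and the trick die is assumed to have 6 faces, which the thought experiment never specifies.

```python
from fractions import Fraction

def posterior(jar, roll):
    """Return P(die type | roll) for each die type in the jar.

    jar: list of (name, share_of_jar, faces) tuples, where faces is the
    list of numbers printed on the die (hypothetical format).
    """
    weights = {}
    for name, share, faces in jar:
        # Likelihood: fraction of faces showing the rolled number.
        likelihood = Fraction(faces.count(roll), len(faces))
        weights[name] = Fraction(share) * likelihood
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no die in the jar can produce this roll")
    return {name: w / total for name, w in weights.items()}

# 90% of the dice are 200-sided, 10% are trick dice (assumed 6 faces, all 3s).
jar = [
    ("200-sided", Fraction(9, 10), list(range(1, 201))),
    ("trick", Fraction(1, 10), [3] * 6),
]

print(posterior(jar, 3)["trick"])  # 200/209
print(posterior(jar, 7)["trick"])  # 0
```

Even when trick dice make up only a tenth of the jar, a rolled 3 makes the trick die the favorite at 200/209, roughly 96%. Any other roll drops its probability to zero, which is the falsifiability point.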
If this is really what you mean, can you clarify it? Are you talking about going from P(data ; parameter) to P(data | parameter) by abuse of notation and then taking the conditioning seriously?
I'm not sure what you mean by "abuse of notation". I don't think P(data ; parameter) and P(data | parameter) are the same thing. The former is a member of a family of distributions indexed by parameter value, the latter is a conditional distribution. I do think that, from a Bayesian point of view, the former determines the latter.
As a Bayesian, you treat the parameter value m as the value of an unobserved random variable M. The observed data y is the value of a random variable Y. Your model,
can be used to straightforwardly derive the conditio...