I've had a bit of success with getting people to understand Bayesianism at parties and such, and I'm posting this thought experiment that I came up with to see if it can be improved or if an entirely different thought experiment would be grasped more intuitively in that context:
Say there is a jar that is filled with dice. There are two types of dice in the jar: One is an 8-sided die with the numbers 1 - 8 and the other is a trick die that has a 3 on all faces. The jar has an even distribution between the 8-sided die and the trick die. If a friend of yours grabbed a die from the jar at random and rolled it and told you that the number that landed was a 3, is it more likely that the person grabbed the 8-sided die or the trick die?
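For reference, the answer can be worked out with Bayes' theorem. The labels T ("the trick die was picked") and E ("the 8-sided die was picked") are introduced here for illustration and are not in the original post. The 8-sided die is assumed to be fair:

```latex
P(T \mid 3)
  = \frac{P(3 \mid T)\,P(T)}{P(3 \mid T)\,P(T) + P(3 \mid E)\,P(E)}
  = \frac{1 \cdot \tfrac{1}{2}}{1 \cdot \tfrac{1}{2} + \tfrac{1}{8} \cdot \tfrac{1}{2}}
  = \frac{8}{9}
```

So under these assumptions the trick die is eight times as likely as the 8-sided die, which matches the intuition most people reach on their own.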
I originally came up with this idea to explain falsifiability (i.e. any number rolled other than a 3 refutes the possibility that the trick die was picked), which is why I didn't go with, say, the example in the better article on Bayesianism. It also illustrates the problem with a hypothesis that explains too much contradictory data. So eventually I increase the number of sides the die has (like a hypothetical 50-sided die), the different types of dice in the jar (100-sided, 6-sided, trick die), and the distribution of dice in the jar (90% of the dice are 200-sided but a 3 is rolled, etc.). Again, I've been discussing this at parties where alcohol is flowing and cognition is impaired, yet people understand it, so I figure if it works there then it can be understood intuitively by many people.
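For anyone who wants to check the variations numerically, here is a minimal Python sketch. The names `posterior`, `fair_die`, `even_jar` and `skewed_jar` are made up for illustration. It assumes every numbered die is fair, and that the remaining 10% of the 90%-are-200-sided jar are trick dice, which the post does not actually specify.

```python
from fractions import Fraction


def fair_die(sides):
    """Face probabilities for a fair die numbered 1..sides (fairness is an assumption)."""
    return {face: Fraction(1, sides) for face in range(1, sides + 1)}


# The trick die shows a 3 on every face.
TRICK = {3: Fraction(1)}


def posterior(jar, roll):
    """Return P(die type | roll) for each die type in the jar.

    jar maps a label to (share of the jar, {face: probability}).
    """
    # Joint probability of drawing each die type and then seeing this roll.
    joint = {
        name: prior * faces.get(roll, Fraction(0))
        for name, (prior, faces) in jar.items()
    }
    total = sum(joint.values())
    if total == 0:
        raise ValueError("no die in the jar can produce this roll")
    # Normalise so the posteriors sum to 1 (Bayes' theorem).
    return {name: p / total for name, p in joint.items()}


# The jar from the post: half 8-sided dice, half trick dice.
even_jar = {
    "8-sided": (Fraction(1, 2), fair_die(8)),
    "trick": (Fraction(1, 2), TRICK),
}

# Hypothetical jar: 90% 200-sided dice, the other 10% assumed to be trick dice.
skewed_jar = {
    "200-sided": (Fraction(9, 10), fair_die(200)),
    "trick": (Fraction(1, 10), TRICK),
}

if __name__ == "__main__":
    print(posterior(even_jar, 3))    # a 3 favours the trick die
    print(posterior(even_jar, 5))    # any other roll rules the trick die out
    print(posterior(skewed_jar, 3))  # a 3 still favours the trick die
```

Rolling anything other than a 3 drives the trick die's posterior to exactly zero, which is the falsifiability point. In the skewed jar a 3 still makes the trick die the more likely pick, even though it is only a tenth of the jar.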
I am an applied mathematician who actually does work on finding the values of probabilistic quantities in better computing time than straightforward numerical experimentation. Probability is not just statistics.
Insofar as what you think Bayesians do deviates from what I know has to be done, you have a wrong idea of what Bayesians do (or, giving you the benefit of the doubt at the expense of others, you are referring to some "Bayesians" who are plain wrong), or something like that, but the discussion is too fuzzy for me to tell which. (Ditto for frequentists.)
The point of frequentism is seeing probability as the frequency in an infinite number of trials. The point of my die example is to demonstrate that physically the probability plainly comes in as a frequency, via a function from initial phase space to final phase space that maps, for a fair die, 1/6 of the initial phase space to each final side-up; this is the objective property of the system that has to be adequately captured by whatever methods you are using. And I do not give the slightest damn if you don't know that in practice (not for dice, but for many other systems) you have to find probabilities bottom-up from, e.g., the laws of physics. If you are given a steel die to physically experiment with, there again are a lot better (faster) ways to find the probabilities than just tossing it (do you even understand that your errors converge as 1/sqrt(N), or how important an issue that is in practice?!). Of course I won't bother making an example for you with an actual die; the point is the principle, and I've done such solutions before with things that unfortunately don't make great examples.
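To make the 1/sqrt(N) remark concrete, here is a rough Python sketch of estimating a face probability by brute-force tossing. It uses a simulated fair six-sided die, not a physical one, and the function names and fixed seed are illustrative assumptions. The standard error of an observed frequency after N independent tosses is sqrt(p(1-p)/N), so each extra decimal digit of accuracy costs about 100 times as many tosses.

```python
import math
import random


def standard_error(p, n):
    """Standard deviation of the observed frequency after n independent tosses."""
    return math.sqrt(p * (1 - p) / n)


def estimate_frequency(n, side=3, sides=6, seed=0):
    """Estimate P(side) for a simulated fair die by tossing it n times."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n) if rng.randint(1, sides) == side)
    return hits / n


if __name__ == "__main__":
    p = 1 / 6
    # Each 100-fold increase in tosses shrinks the expected error only 10-fold.
    for n in (100, 10_000, 1_000_000):
        print(n, estimate_frequency(n), standard_error(p, n))
```

This only illustrates the slow convergence of plain tossing; it is not one of the faster bottom-up methods alluded to above.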
edit: also, on science, the reason we do 'probability of data given model' is that science follows a strategy of committing to only rarely (with a certain probability) throwing out a valid model. 'Probability of model given the data' is not well defined, unless you count stuff like 'Solomonoff induction as a prior', where it is defined but not computable (and is mathematically homologous to assigning probability of 1 to the 'we live inside a Turing machine' model). Experimental physicists publish the probability of data given model; people can then combine that with their priors if they want.
The world often isn't nice enough to give us the steel die. Figuratively, the steel die may be inside someone's skull, thousands of years in the past, millions of light-years away, or you may have five slightly different dice and really want to learn about the properties of all dice.
I do und...