Update: as it turns out, this is a voting system problem, which is a difficult but well-studied topic. Potential solutions include Ranked Pairs (complicated) and BestThing (simpler). Thanks to everyone for helping me think this through out loud, and for reminding me to kill flies with flyswatters instead of bazookas.
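(For anyone who finds this later: here's a minimal Python sketch of Ranked Pairs adapted to pairwise responses rather than full ranked ballots. The `(winner, loser)` response format and the tie-breaking order are my own assumptions, not part of the method's specification.)

```python
from itertools import permutations

def ranked_pairs_winner(candidates, responses):
    """Tideman's Ranked Pairs on pairwise data: tally the margins,
    lock in the strongest majorities first, and skip any majority
    that would create a cycle. `responses` is a list of
    (winner, loser) pairs, one per participant choice."""
    # Pairwise tally: wins[a][b] = times a was chosen over b.
    wins = {a: {b: 0 for b in candidates} for a in candidates}
    for winner, loser in responses:
        wins[winner][loser] += 1

    # Majorities, strongest margin first (ties broken arbitrarily).
    majorities = sorted(
        ((wins[a][b] - wins[b][a], a, b)
         for a, b in permutations(candidates, 2)
         if wins[a][b] > wins[b][a]),
        reverse=True)

    # Lock in each majority unless it would complete a cycle.
    locked = {a: set() for a in candidates}

    def reaches(start, goal):
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(locked[node])
        return False

    for _, a, b in majorities:
        if not reaches(b, a):   # locking a -> b must not close a cycle
            locked[a].add(b)

    # Winner: a candidate with no incoming locked edge (unique if
    # the data is complete; with sparse data this is a best guess).
    beaten = {b for edges in locked.values() for b in edges}
    return next(c for c in candidates if c not in beaten)

# Example: ranked_pairs_winner(list("ABCDEFG"),
#                              [("B", "A"), ("B", "C"), ("A", "C")])  -> "B"
```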
I'm working on a problem that I believe involves Bayes. I'm new to Bayes and a bit rusty on statistics, so I'm having a hard time figuring out where to start. (EDIT: it looks like set theory may also be involved.) Your help would be greatly appreciated.
Here's the problem: assume a set of 7 different objects. Two of these objects are presented at random to a participant, who selects whichever one of the two objects they prefer. (There is no "indifferent" option.) The order of these combinations is not important, and repeated combinations are not allowed.
Basic combinatorics says there are 21 different possible combinations: 7! / (2! × (7−2)!) = 21.
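A quick sanity check of that count, e.g. in Python:

```python
from math import comb, factorial

# Number of unordered pairs from 7 objects: C(7, 2).
assert comb(7, 2) == factorial(7) // (factorial(2) * factorial(5)) == 21
print(comb(7, 2))  # 21
```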
Now, assume the researcher wants to know which single option has the highest probability of being the "most preferred" to a new participant based on the responses of all previous participants. To complicate matters, each participant can leave at any time, without completing the entire set of 21 responses. Their responses should still factor into the final result, even if they only respond to a single combination.
At the beginning of the study, there are no priors. (CORRECTION via dlthomas: "There are necessarily priors... we start with no information about rankings... and so assume a 1:1 chance of either object being preferred.") If a participant selects B from {A,B}, the probability of B being the "most preferred" object should go up, and A's should go down, if I'm understanding correctly.
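One simple way to formalize that 1:1 starting point, per pair, is a uniform Beta(1, 1) prior on the fraction of people who prefer B to A, updated with each response. This is just a sketch of the single-pair update, not the full "most preferred overall" calculation:

```python
# A minimal sketch, assuming we model "fraction of people who prefer
# B over A" with a Beta distribution. Beta(1, 1) is the uniform prior,
# i.e. dlthomas's 1:1 starting point.

def update(alpha, beta, chose_b):
    """One pairwise response: bump alpha if B was chosen, beta if A was."""
    return (alpha + 1, beta) if chose_b else (alpha, beta + 1)

alpha, beta = 1, 1                          # uniform prior over [0, 1]
for chose_b in [True, True, False, True]:   # four hypothetical responses
    alpha, beta = update(alpha, beta, chose_b)

# Posterior mean: expected fraction preferring B over A.
print(alpha / (alpha + beta))               # 4/6 ≈ 0.67
```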
NOTE: Direct ranking of objects 1-7 (instead of pairwise comparison) isn't ideal because it takes longer, which may encourage the participant to rationalize. The "pick-one-of-two" approach is designed to be fast, which is better for gut reactions when comparing simple objects like words, photos, etc.
The ideal output looks like this: "Based on ___ total responses, participants prefer Object A. Object A is preferred __% more than Object B (the second most preferred), and ___% more than Object C (the third most preferred)."
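For illustration only, here's a naive tally-based sketch that produces that sentence from raw win rates. Whether a raw win rate is the right statistic (rather than a proper Bayesian or voting-theoretic aggregate) is exactly the open question, and the response data here is made up:

```python
from collections import Counter

# Hypothetical partial data: one (winner, loser) pair per response.
responses = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("C", "D")]

wins = Counter(w for w, _ in responses)
shown = Counter()                 # times each object appeared at all
for w, l in responses:
    shown[w] += 1
    shown[l] += 1

# Win rate per object, best first. (Rates near zero need more data
# before the relative-difference percentages below mean anything.)
rates = sorted(((wins[o] / shown[o], o) for o in shown), reverse=True)
(r1, top), (r2, second), (r3, third) = rates[:3]
print(f"Based on {len(responses)} total responses, participants prefer "
      f"Object {top}. Object {top} is preferred {100*(r1-r2)/r2:.0f}% more "
      f"than Object {second} (the second most preferred), and "
      f"{100*(r1-r3)/r3:.0f}% more than Object {third} (the third most "
      f"preferred).")
```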
Questions:
1. Is Bayes actually the most straightforward way of calculating the "most preferred"? (If not, what is? I don't want to be Maslow's "man with a hammer" here.)
2. If so, can you please walk me through the beginning of how this calculation is done, assuming 10 participants?
Thanks in advance!
For me, a maximal entropy prior is a probability distribution over some reasonable set of hypotheses, such as H(n) = "n percent of people prefer A to B". In that case, the prior p(H(n)) is uniform over (0, 100). If we know that, say, H(80) is true, we know that a randomly selected person is 80% likely to prefer A over B. A survey enables us to update the prior and eventually locate the correct hypothesis, whatever prior we start from. It doesn't need to explicitly assume any correlation. That 80% of people share an opinion isn't called correlation between their opinions, in the usual sense of what "correlation" means.
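Concretely, that update can be run numerically over a discretized set of hypotheses H(0)..H(100). A sketch, with a made-up survey of 10 respondents:

```python
import numpy as np

# Hypotheses H(n): "n percent of people prefer A to B", n = 0..100.
n = np.arange(101)
prior = np.full(101, 1 / 101)    # uniform (maximum entropy) prior

# Hypothetical survey result: 8 of 10 respondents preferred A to B.
k, trials = 8, 10
p = n / 100
likelihood = p**k * (1 - p)**(trials - k)   # binomial kernel under each H(n)

posterior = prior * likelihood
posterior /= posterior.sum()

print(n[posterior.argmax()])     # MAP hypothesis: H(80)
print(n @ posterior)             # posterior mean, close to 75
```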
You seem to have a somewhat different notion of a maximal entropy prior. Perhaps a maximal entropy distribution over all possible hypotheses? You seem to imply that with maximum entropy, induction is impossible, or something along those lines. I don't think this is the standard meaning of "maximum entropy prior".
As I stated at the beginning, I don't know the standard meaning of maximum entropy prior.
This time when I looked it up, I found a simpler definition for finite cases; I'm not sure why I missed that before. I think I can figure out where the confusion is. I was thinking of every possible combination of opinions as a separate possibility; if that is the case, having them all be independent of each other is the maximum entropy. If, on the other hand, you only look at correlation, and consider H(80) = 50 being one case, then maximum entropy would seem to be...