Update: as it turns out, this is a voting system problem, which is a difficult but well-studied topic. Potential solutions include Ranked Pairs (complicated) and BestThing (simpler). Thanks to everyone for helping me think this through out loud, and for reminding me to kill flies with flyswatters instead of bazookas.
I'm working on a problem that I believe involves Bayes. I'm new to Bayes and a bit rusty on statistics, and I'm having a hard time figuring out where to start. (EDIT: it looks like set theory may also be involved.) Your help would be greatly appreciated.
Here's the problem: assume a set of 7 different objects. Two of these objects are presented at random to a participant, who selects whichever one of the two objects they prefer. (There is no "indifferent" option.) The order of these combinations is not important, and repeated combinations are not allowed.
Basic combination theory says there are 21 different possible combinations: (7!) / ( (2!) * (7-2)! ) = 21.
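As a quick sanity check on that count, here is a minimal Python sketch that enumerates the pairs. The object labels A through G are placeholders of mine, not part of the study.

```python
from itertools import combinations
from math import comb

# Placeholder labels for the 7 objects (assumed for illustration).
objects = ["A", "B", "C", "D", "E", "F", "G"]

# Unordered pairs with no repeats: ("A", "B") appears, ("B", "A") does not.
pairs = list(combinations(objects, 2))

print(len(pairs))   # number of distinct pairs
print(comb(7, 2))   # same count from the formula 7! / (2! * 5!)
```

Both lines should print 21.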
Now, assume the researcher wants to know which single option has the highest probability of being the "most preferred" to a new participant based on the responses of all previous participants. To complicate matters, each participant can leave at any time, without completing the entire set of 21 responses. Their responses should still factor into the final result, even if they only respond to a single combination.
At the beginning of the study, there are no priors. (CORRECTION via dlthomas: "There are necessarily priors... we start with no information about rankings... and so assume a 1:1 chance of either object being preferred.") If a participant selects B from {A,B}, the probability of B being the "most preferred" object should go up, and A should go down, if I'm understanding correctly.
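To make the 1:1 starting point concrete, here is a rough sketch of one way to update a single pair. It assumes a Beta(1, 1) prior on the chance that B is picked over A, which is my reading of "1:1" and not something the correction spells out. The function name `pair_posterior` is made up for this example. Note that this only estimates the pairwise preference for {A,B}; it does not by itself give the probability that B is the "most preferred" of all 7.

```python
def pair_posterior(wins_b, wins_a, prior_b=1.0, prior_a=1.0):
    """Posterior mean of P(B is picked over A).

    Assumes a Beta(prior_b, prior_a) prior; Beta(1, 1) encodes the
    1:1 starting assumption. wins_b and wins_a are observed counts.
    """
    return (wins_b + prior_b) / (wins_b + wins_a + prior_b + prior_a)


print(pair_posterior(0, 0))  # no responses yet: 0.5
print(pair_posterior(1, 0))  # one participant picked B: about 0.667
print(pair_posterior(1, 1))  # one pick each way: back to 0.5
```

A participant who leaves after one response still moves the estimate for that one pair, which matches the requirement that partial responses count.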
NOTE: Direct ranking of objects 1-7 (instead of pairwise comparison) isn't ideal because it takes longer, which may encourage the participant to rationalize. The "pick-one-of-two" approach is designed to be fast, which is better for gut reactions when comparing simple objects like words, photos, etc.
The ideal output looks like this: "Based on ___ total responses, participants prefer Object A. Object A is preferred __% more than Object B (the second most preferred), and ___% more than Object C (the third most preferred)."
Questions:
1. Is Bayes actually the most straightforward way of calculating the "most preferred"? (If not, what is? I don't want to be Maslow's "man with a hammer" here.)
2. If so, can you please walk me through the beginning of how this calculation is done, assuming 10 participants?
Thanks in advance!
Even if every participant answered all 21 questions, and if every participant answered with respect to some total ordering, this would still only reduce to the problem of preferential voting. This is a really hard problem and I'm not aware of any voting system that's provably "correct" -- in fact, there are results showing that for some definitions of "correct", there are no correct voting systems, period. There are various voting systems that satisfy some, but not all, reasonable properties you could name, and you should pick one.
"Ranked pairs" is one natural choice (I don't know how well it performs in practice, though) because it deals in pairs to begin with. Basically, for each pair (A,B) you tally how many participants picked A over B. You go through the pairs in order of how significant a majority they represent. In this case, you wouldn't want to use the usual method for deciding which majority is better, because you have to take into account the number of people who answered the question, too. There was a post on LW once which linked to an article about this (in the context of e.g. ratings of movies on IMDB, where a movie rated 10.0 by 5 users isn't as good as a movie rated 9.0 by 100 users). Anyway, the next thing you do with the pairs is lock them in one by one, except when it would create a cycle together with pairs that have already been locked in. If you do this, the "locked in" pairs will show the most preferred object.
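The procedure described above can be sketched in Python roughly as follows. This is an illustration under assumptions, not a reference implementation: the names `ranked_pairs`, `wins`, and `strength` are mine, the default strength is the plain vote margin, and `wilson_lower_bound` is one sample-size-aware measure I am assuming as a stand-in, since the comment does not name the measure the linked article used.

```python
from itertools import combinations
from math import sqrt


def wilson_lower_bound(winner_votes, total_votes, z=1.96):
    """Lower bound of the Wilson score interval for the winner's share.

    Assumed here as one way to discount majorities built on few answers.
    """
    if total_votes == 0:
        return 0.0
    p = winner_votes / total_votes
    z2 = z * z
    centre = p + z2 / (2 * total_votes)
    spread = z * sqrt(p * (1 - p) / total_votes + z2 / (4 * total_votes ** 2))
    return (centre - spread) / (1 + z2 / total_votes)


def ranked_pairs(objects, wins, strength=lambda w, n: 2 * w - n):
    """wins[(x, y)] = number of participants who picked x over y.

    Returns (top, locked): the objects that no locked-in pair ranks
    below another object, and the locked pairs as winner -> losers.
    """
    majorities = []
    for x, y in combinations(objects, 2):
        xy, yx = wins.get((x, y), 0), wins.get((y, x), 0)
        if xy == yx:
            continue  # tie or no data: nothing to lock in
        winner, loser = (x, y) if xy > yx else (y, x)
        majorities.append((strength(max(xy, yx), xy + yx), winner, loser))
    # Strongest majorities first.
    majorities.sort(key=lambda m: m[0], reverse=True)

    locked = {o: set() for o in objects}

    def reaches(src, dst):
        """True if locked pairs already give a path src -> ... -> dst."""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(locked[node])
        return False

    for _, winner, loser in majorities:
        # Skip a pair that would close a cycle with earlier locked pairs.
        if not reaches(loser, winner):
            locked[winner].add(loser)

    beaten = set().union(*locked.values())
    return [o for o in objects if o not in beaten], locked


# Made-up tallies with a cycle: A beats B, B beats C, C narrowly beats A.
wins = {("A", "B"): 3, ("B", "A"): 1,
        ("B", "C"): 3, ("C", "B"): 1,
        ("C", "A"): 2, ("A", "C"): 1}
top, locked = ranked_pairs(["A", "B", "C"], wins)
print(top)
```

In this toy example the weakest majority (C over A) is the one that gets dropped, so A comes out on top. Passing `strength=wilson_lower_bound` swaps in the sample-size-aware ordering. With sparse data more than one object may end up unbeaten, which is why `top` is a list.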
Upvoted. This problem is the well-studied voting problem.
In particular, the seminal result is Arrow's impossibility theorem, which states that no voting system satisfies all of the following criteria: (from Wikipedia)
...