Manfred comments on Help with a (potentially Bayesian) statistics / set theory problem? - Less Wrong

2 Post author: joshkaufman 10 November 2011 10:30PM

Comments (27)

Comment author: Manfred 10 November 2011 10:53:21PM 2 points

If we allow inconsistency, i.e., if you plan on using this in the real world, then you could get the responses B>A, A>C, and C>B, which form a cycle. That is, there may not be any such thing as a most preferred object, and thus no such thing as the "probability of being the 'most preferred' object."

A workaround would be to ask a related question: "if object 1 is A, and object 2 is unknown, what is the probability that A>2?" The most preferred object would just be the object with the highest answer to this question.
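One way to read the "related question" is as an average pairwise win rate. The following is only a minimal sketch under assumptions not in the thread: the `(winner, loser)` data format, the function name `win_probabilities`, and the Laplace smoothing that treats unseen pairs as 50/50 are all hypothetical choices.

```python
from collections import defaultdict

def win_probabilities(comparisons, objects):
    """Estimate, for each object, the chance it beats a randomly chosen other object.

    comparisons: list of (winner, loser) pairs (hypothetical data format).
    """
    wins = defaultdict(int)  # wins[(a, b)] = number of times a was picked over b
    for winner, loser in comparisons:
        wins[(winner, loser)] += 1

    scores = {}
    for a in objects:
        pair_rates = []
        for b in objects:
            if a == b:
                continue
            n = wins[(a, b)] + wins[(b, a)]
            # Laplace smoothing (an assumption): a pair never shown counts as 50/50
            pair_rates.append((wins[(a, b)] + 1) / (n + 2))
        # Average over all opponents, i.e. "object 2 is unknown"
        scores[a] = sum(pair_rates) / len(pair_rates)
    return scores

# The cyclic responses B>A, A>C, C>B
cyclic = [("B", "A"), ("A", "C"), ("C", "B")]
cyclic_scores = win_probabilities(cyclic, ["A", "B", "C"])

# A case with a clear favorite: A is picked over both B and C
clear = [("A", "B"), ("A", "C")]
clear_scores = win_probabilities(clear, ["A", "B", "C"])
```

With the cyclic responses every object ends up with the same score, so the question still has an answer even though no object beats all the others; in the second data set A gets the highest score.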

The alternate path would be to assume that there is a preference ordering from 1 to 7, with some sort of noise applied. This feels clunky to me, though.

Comment author: dlthomas 10 November 2011 11:14:12PM 1 point

The alternate path would be to assume that there is a preference ordering from 1 to 7, with some sort of noise applied. This feels clunky to me, though.

While clunky, this seems to be a perfectly workable approach for 7 objects. There are 7! = 5040 permutations. For each piece of evidence, add probability mass to the permutations that agree with it and remove it from those that disagree (perhaps weighted by how much they agree or disagree?). The probability of your object being most preferred, then, is the sum of the probabilities of those permutations in which it occupies the top position.

Comment author: joshkaufman 10 November 2011 11:57:38PM * 0 points

Okay, if A is preferred from { A , [B-G] }, that should add probability mass to [A, [B,...,G] ], where [A, [B,...,G] ] is a ranked set of objects where the first slot is most preferred. That would represent 720 (6!) sets out of 5040.

All other sets (7! - 6! = 4,320) should either stay the same probability or have probability mass removed.

Then, the probability of A being "most preferred" = the sum of the probability mass of all 720 sets that have A as the highest ranked member. Likewise for B through G. Highest total probability mass wins.

Am I understanding that correctly?

Comment author: dlthomas 11 November 2011 12:37:00AM 1 point

I don't think I'm following you.

We see a new piece of evidence: one of the people prefers C to E.

C will be preferred to E in half the lists. Those lists become more probable, the other half become less probable. How much more/less probable depends on how much error you expect to see and of what type.

Repeat on all the data.

You only actually look at the first member when asking the odds that a particular object is there - at which point, yes, you sum up the probability of those 720 sets.
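The update described here can be sketched in a few lines. This is a rough illustration, not a definitive implementation: the uniform prior, the single `error_rate` parameter (the chance that a response contradicts the true ordering), and the function names are all assumptions made for the example.

```python
from itertools import permutations

def posterior_over_rankings(objects, comparisons, error_rate=0.2):
    """Bayesian update over every possible ranking (first slot = most preferred).

    comparisons: list of (winner, loser) pairs (hypothetical data format).
    error_rate: assumed probability that a response contradicts the true ranking.
    """
    rankings = list(permutations(objects))  # 7! = 5040 rankings for seven objects
    weights = {r: 1.0 for r in rankings}    # uniform prior (an assumption)
    for winner, loser in comparisons:
        for r in rankings:
            agrees = r.index(winner) < r.index(loser)
            # Rankings that agree with the response gain mass, the rest lose it
            weights[r] *= (1 - error_rate) if agrees else error_rate
    total = sum(weights.values())
    return {r: w / total for r, w in weights.items()}

def prob_most_preferred(posterior, obj):
    """Sum the mass of the rankings that put obj in the first slot."""
    return sum(p for r, p in posterior.items() if r[0] == obj)

objects = list("ABCDEFG")
posterior = posterior_over_rankings(objects, [("C", "E")])  # one person prefers C to E
p_top = {o: prob_most_preferred(posterior, o) for o in objects}
```

After the single observation "C over E", half of the 5040 rankings gain mass and half lose it. Under these assumptions the chance that C is most preferred rises above the prior of 1/7, the chance for E falls below it, and the other five objects stay at 1/7.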

Comment author: joshkaufman 11 November 2011 12:40:53AM 0 points

Ah, I see. Instead of updating half the lists, I was updating the 720 sets where C is the #1 preference. Thanks for the clarification.

Comment author: dlthomas 10 November 2011 11:10:21PM 1 point

I don't know that "most preferred" naturally translates into "preferred to every other option." Your "related question" seems a perfectly appropriate generalization: situations with a single dominant object are a special case of it, and well-ordered sets are a special case of that.

Comment author: joshkaufman 10 November 2011 11:08:01PM * 1 point

Good point about inconsistency... I was thinking that individual responses may be inconsistent, but the aggregated responses of the group might reveal a significant preference.

My first crack at this was to use a simple voting system, where choosing B from {A,B} means +1 vote for B and 0 for A; the object with the largest score once all participant votes are tallied wins. What messes that up is participants leaving without completing the entire set, which introduces selection bias, even if the sets are served at random.
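For concreteness, the tally might look something like the sketch below. The response format `((left, right), chosen)` and the function name `tally` are hypothetical; the `shown` counter is an addition of this sketch, included only to make visible how often each object was actually presented.

```python
from collections import Counter

def tally(responses):
    """Count +1 for each chosen object, and how often each object was shown.

    responses: list of ((left, right), chosen) tuples (hypothetical data format).
    """
    votes = Counter()
    shown = Counter()
    for (left, right), chosen in responses:
        shown[left] += 1
        shown[right] += 1
        votes[chosen] += 1  # +1 for the picked object, 0 for the other
    return votes, shown

responses = [
    (("A", "B"), "B"),
    (("A", "C"), "A"),
    (("B", "C"), "B"),
]
votes, shown = tally(responses)
winner = max(votes, key=votes.get)  # largest raw score wins
```

If participants drop out partway through, the `shown` counts become uneven across objects, and raw vote totals then partly reflect exposure rather than preference.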

Preference ordering / ranking isn't ideal because it takes longer, which may encourage the participant to rationalize. The "pick-one" approach is designed to be fast, which is better for gut reactions when comparing words, photos, etc.

Comment author: VincentYu 10 November 2011 11:48:34PM * 1 point

If the aggregated preferences are transitive (i.e., 'not inconsistent' in your and Manfred's wording) and every pair of objects is strictly ranked, then this preference relation defines a total order on the objects, and there is a unique object that is preferred to every other object (in aggregate). (Furthermore, this is isomorphic to the set {1,2,3,...,7} under the ≤ relation.)

Comment author: dlthomas 11 November 2011 12:46:06AM 0 points

As I understand things, you have no guarantee of transitivity in the aggregated preferences even if you do have transitivity in the individual preferences.
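The standard illustration of this is the Condorcet paradox. The sketch below uses three hypothetical voters, each with a fully transitive ranking, and checks the majority preference for each pair; the voter data and the helper name `majority_prefers` are made up for the example.

```python
def majority_prefers(rankings, a, b):
    """True if a strict majority of the rankings place a ahead of b."""
    a_first = sum(r.index(a) < r.index(b) for r in rankings)
    return a_first > len(rankings) / 2

# Three voters, each individually transitive (first slot = most preferred)
rankings = [
    ("A", "B", "C"),
    ("B", "C", "A"),
    ("C", "A", "B"),
]

a_over_b = majority_prefers(rankings, "A", "B")  # voters 1 and 3 put A ahead of B
b_over_c = majority_prefers(rankings, "B", "C")  # voters 1 and 2 put B ahead of C
c_over_a = majority_prefers(rankings, "C", "A")  # voters 2 and 3 put C ahead of A
```

All three majority preferences hold at once, so the group prefers A to B, B to C, and C to A, even though no individual ranking contains a cycle.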

Comment author: VincentYu 11 November 2011 12:55:41AM 0 points

Yes, of course you are correct.

Comment author: joshkaufman 11 November 2011 12:00:01AM * 0 points

Very helpful - reading about this now. Starting here: http://en.wikipedia.org/wiki/Ranking