RobinZ comments on Why We Can't Take Expected Value Estimates Literally (Even When They're Unbiased) - Less Wrong

75 Post author: HoldenKarnofsky 18 August 2011 11:34PM


Comments (249)


Comment author: handoflixue 19 January 2012 12:20:17AM *  0 points [-]

Item 1 would only seem useful when you have sufficient trusted expert ranking to calibrate, but still need to use the votes to extrapolate elsewhere (and where you expect trusted experts to align with your audience - if experts routinely downvote dark ales, and your audience prefers them, you're going to get a wonky heuristic). Basically, at that point, you're JUST using votes as a method to try predicting and extrapolating expert rankings, and I'd expect there are usually better heuristics for that which don't require user votes.

Item 2 strikes me as clever and ideal, but I'd think you'd need quite a lot of data before you'd be able to actually calibrate that. So you're stuck using 0.05 until you have quite a lot of data.

(Customer satisfaction surveys, etc. also run into the "resource intensive" issue.)

(edit: apparently pound makes the whole row a header or something)

Comment author: RobinZ 19 January 2012 03:38:00AM 0 points [-]

> Item 1 would only seem useful when you have sufficient trusted expert ranking to calibrate, but still need to use the votes to extrapolate elsewhere [...]

Exactly. Remember, the whole point of this procedure is to tweak how much credibility you give to voters as a function of the number of voters you have - the only reason I mention experts is that they bypass the sample size problem.

> (and where you expect trusted experts to align with your audience - if experts routinely downvote dark ales, and your audience prefers them, you're going to get a wonky heuristic)

Okay, that's a problem. I think it is a subset of the earlier problem of finding trusted expert rankings, however.

> Item 2 strikes me as clever and ideal, but I'd think you'd need quite a lot of data before you'd be able to actually calibrate that. So you're stuck using 0.05 until you have quite a lot of data.

If you don't have a lot of data, you're not going to have much to offer your users anyway.
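
To make the "credibility as a function of the number of voters" idea concrete, here is a minimal sketch. It is not the exact procedure discussed upthread. It assumes votes are numbers in [0, 1] and uses a simple shrinkage rule toward a prior; the names `shrunk_score`, `prior_mean`, and `prior_strength` are hypothetical, and `prior_strength` stands in for whatever constant you would calibrate (for example, against trusted expert rankings).

```python
def shrunk_score(votes, prior_mean=0.5, prior_strength=10.0):
    """Blend the observed mean vote with a prior, weighted by sample size.

    votes          -- list of numeric votes, assumed to lie in [0, 1]
    prior_mean     -- score assigned when there are no votes (assumption)
    prior_strength -- hypothetical tuning constant: the number of votes at
                      which voters and the prior get equal weight
    """
    n = len(votes)
    if n == 0:
        # No voters: fall back entirely on the prior.
        return prior_mean

    observed_mean = sum(votes) / n
    # Credibility given to voters grows from 0 toward 1 as n increases.
    weight = n / (n + prior_strength)
    return weight * observed_mean + (1.0 - weight) * prior_mean


if __name__ == "__main__":
    for n in (0, 1, 10, 100, 1000):
        print(n, round(shrunk_score([1.0] * n), 3))
```

With few votes the score stays close to the prior, which is the small-data situation described above; with many votes it converges on the voters' own average.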