Comment author: gerg 04 June 2010 08:35:11AM 2 points [-]

A presentation critique: psychologically, we tend to compare the relative areas of shapes. Your ovals in Figure 1 are scaled so that their linear dimensions (width, for example) are in the ratio 2:5:3; however, what we see are ovals whose areas are in ratio 4:25:9, which isn't what you're trying to convey. I think this happens for later shapes as well, although I didn't check them all.
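The arithmetic behind this can be sketched in a few lines. This is only an illustration: the helper name `linear_scale_for_area` is hypothetical, and the quantities 2, 5, 3 are the ones quoted above. Scaling each width by the square root of the quantity makes the areas, rather than the widths, carry the intended ratio.

```python
import math

def linear_scale_for_area(values):
    """Return linear scale factors whose squares are proportional to values."""
    return [math.sqrt(v) for v in values]

quantities = [2, 5, 3]

# Scaling widths directly by the quantities squares the visual ratio.
naive_areas = [v ** 2 for v in quantities]

# Scaling widths by the square roots keeps areas proportional to the quantities.
widths = linear_scale_for_area(quantities)
corrected_areas = [w ** 2 for w in widths]

print(naive_areas)      # areas the viewer sees with linear scaling
print(corrected_areas)  # areas after square-root scaling
```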

Comment author: RichardKennaway 25 April 2010 10:55:23AM *  1 point [-]

(Written before and edited where marked after reading the comments.)

1. I look at the jar and estimate how many jellybabies wide and tall the filled volume is, do some mental arithmetic, and reduce the answer according to an estimate of the packing fraction, which I would expect to be the greatest source of error.

Anyone else can do the same thing, if they're smart enough to not just pull a figure out of the air. If I think the other contestants are unlikely to be, I ignore them. It's like the thermometer example, except that I have reason to think my thermometer is better than most other people's.

If I think that everyone else is smart enough to make an estimate along those lines, then it is more like the original thermometer problem. But I'm not going to just take an average. If my estimate is 1000, and someone else's is 300, that's too big a discrepancy to explain by minor variations. It casts doubt on the assumption of identical thermometers. Assuming that I only have the other people's estimates, and there's no opportunity for discussion, I'll search for reasons why we might have come up with completely different answers, but if I find no error in my own, I'll discard all such outliers.

If I work in the confectionery trade and know the packing fraction for jellybabies, that elevates my confidence in my own estimate and again I ignore the others...unless this competition is being held at a confectionery trade show.

In general, averaging the estimates is only valid if the estimates are believed to be of similar worth. If you know that the estimates are all unbiased but with differing variances, then you can work out some optimally weighted average that puts more weight on the more accurate estimates but does not discard the less accurate ones. However, if the estimates are wildly different, the assumption that they are all unbiased may be a bad one.
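One standard form of such a weighted average is inverse-variance weighting, sketched below. It assumes the estimates are independent as well as unbiased, and that each variance is known; the function name and the example numbers are made up for illustration.

```python
def inverse_variance_mean(estimates, variances):
    """Combine independent unbiased estimates, weighting each by 1/variance.

    Returns the combined estimate and the variance of that combined estimate.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, estimates)) / total
    return mean, 1.0 / total

# Hypothetical jar estimates with assumed standard deviations 100, 200, 400.
estimates = [1000.0, 900.0, 1100.0]
variances = [100.0 ** 2, 200.0 ** 2, 400.0 ** 2]

combined, combined_variance = inverse_variance_mean(estimates, variances)
print(combined, combined_variance)
```

The noisiest estimate still contributes, just with a small weight. Nothing in this formula detects a biased estimate, which is why wildly different answers call for a different response than reweighting.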

BTW, a real-world example is the assessment of conference papers by the programme committee. Each paper will have been refereed by, say, four members. The typical procedure is that if they all say it's excellent, it's accepted without discussion. Likewise, reject it if they all say it's rubbish. For uniformly middling assessments, the question is where to set the bar for acceptance. The only papers where a real discussion is required are the ones where the referees disagree. The disagreements are resolved by sharing evidence, not by an Aumann-like compromise based on sharing posteriors.

2. Recognise that whoever is at rational fault, agreement is not possible in the current state of things. Start recording who does what housework when, then return to the matter after some suitable time, with evidence to decide the issue.

3. Case 2 was set up to be symmetrical. Case 3 is different: rationality is right and religion is wrong. I continue in that belief. How I conduct my discussions thereafter with my religious friend is a separate matter.

4. I'm not sure how much rational perfection and common knowledge to assume of Alfred and Betty in this problem, but even if I assume that they are perfect reasoners with common priors, I can't see my way to proving anything about the ordering of their second estimates. (Added after reading the comments: Alfred's or Betty's estimate of the probable ordering of their second estimates is a different matter.) I suppose that some version of Aumann's theorem says that on iterating the process they must eventually converge.

Comment author: gerg 26 April 2010 05:52:13AM 3 points [-]

If my estimate is 1000, and someone else's is 300, that's too big a discrepancy to explain by minor variations. It casts doubt on the assumption of identical thermometers. Assuming that I only have the other people's estimates, and there's no opportunity for discussion, I'll search for reasons why we might have come up with completely different answers, but if I find no error in my own, I'll discard all such outliers.

What if everyone else's estimate is between 280 and 320? Do you discard your own estimate if it's an outlier? Does the answer depend on whether you can find an error in your reasoning?

Comment author: jimrandomh 25 April 2010 01:10:10PM *  1 point [-]

Problem 1 is basically a noisy thermometers problem, except that the noise is gaussian in the estimate of the length/density along one dimension, not the number given. So I would take the cube root of each answer (including my own), then average them, then cube the result to make my estimate. If I thought one person was a particularly good or bad estimator, I would apply that as a weighting in the middle step.
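As a rough sketch of that procedure (the function name, the optional weights, and the sample numbers are illustrative assumptions, and estimates are assumed to be positive):

```python
def cube_root_pool(estimates, weights=None):
    """Average positive estimates in cube-root space, then cube the result."""
    if weights is None:
        weights = [1.0] * len(estimates)
    # Move each count to a one-dimensional "length" scale.
    roots = [x ** (1.0 / 3.0) for x in estimates]
    # Weighted average on the length scale (the "middle step").
    mean_root = sum(w * r for w, r in zip(weights, roots)) / sum(weights)
    # Convert back to a count.
    return mean_root ** 3

guesses = [1000.0, 300.0, 650.0]           # made-up jar estimates
print(cube_root_pool(guesses))              # equal weights
print(cube_root_pool(guesses, [2, 1, 1]))   # trust the first estimator more
```

For positive estimates that are not all equal, the pooled value is below the plain arithmetic mean, so high outliers pull the result less than they would in a straight average.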

Comment author: gerg 26 April 2010 05:46:22AM 1 point [-]

I'm mathematically interested in this procedure; can you please provide a reference?

Comment author: Alicorn 10 April 2010 01:49:47AM 3 points [-]

Now I'm going to be arguing about that with myself all day. How do you possessivize a proper noun that ends in the word "it"?

Comment author: gerg 10 April 2010 08:25:30PM *  1 point [-]

Good thing I didn't go with the username "this.is.she"!

Comment author: gerg 10 April 2010 01:38:32AM -4 points [-]

This is a belated reply to cousin_it's 2009 post Bayesian Flame

Not to be a grammar Nazi, but I believe it should be cousin_its....

Comment author: Wei_Dai 03 March 2010 07:05:45AM 3 points [-]

This seems like a good question that's worth thinking about. I wonder if adversarial legal systems (where the job of deciding who is guilty of a crime is divided into the roles of prosecutor, defense attorney, and judge/jury) can be considered an example of this, and if so, why don't scientific institutions do something similar?

Comment author: gerg 04 March 2010 07:32:46PM 3 points [-]

Nominating adversarial legal systems as role models of rational groups, knowing how well they function in practice, seems a bit misplaced.

Comment author: gerg 13 November 2009 03:06:20AM 5 points [-]

Part of the output of your quizzes is a line of the form "Your chance of being well calibrated, relative to the null hypothesis, is 50.445538580926 percent." How is this number computed?

I chose "25% confident" for 25 questions and got 6 of them (24%) right. That seems like a pretty good calibration ... but 50.44% chance of being well calibrated relative to null doesn't seem that good. Does that sentence mean that an observer, given my test results, would assign a 50.44% probability to my being well calibrated and a 49.56% probability to my not being well calibrated? (or to my randomly choosing answers?) Or something else?
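For reference, one simple way to look at these numbers is a plain binomial model, sketched below. This is not claimed to be the quiz's formula, which is exactly what is unknown here; it only assumes each of the 25 answers is independently right with probability 0.25.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p, k = 25, 0.25, 6

# Probability of exactly 6 correct under perfect 25% calibration.
prob_exact = binomial_pmf(k, n, p)

# Probability of 6 or fewer correct under the same assumption.
prob_at_most = sum(binomial_pmf(i, n, p) for i in range(k + 1))

# The single most likely number of correct answers under this model.
most_likely = max(range(n + 1), key=lambda i: binomial_pmf(i, n, p))

print(prob_exact, prob_at_most, most_likely)
```

Under this model, 6 correct is the single most likely outcome for a perfectly calibrated answerer, which matches the intuition that 6 of 25 is about as good as the result can look.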

Comment author: gerg 07 October 2009 05:55:40PM 16 points [-]

Taskifaction doesn't destroy romance any more than it destroys music or dance.

This one sentence alone is worth my upvote for its sheer truth. (Although "Sucking at stuff is not sublime." is a close second.)

Comment author: cousin_it 06 August 2009 08:35:14AM *  6 points [-]

Are you saying that rich white kids adopted ghetto fashion before poor white kids did? Doesn't ring very true to me.

Comment author: gerg 06 August 2009 06:33:28PM 7 points [-]

Poor kids had ghetto clothes first; rich kids had the clothes second, but ghetto fashion first.
