RolfAndreassen comments on A Request for Open Problems - Less Wrong

25 Post author: MrHen 08 May 2009 01:33PM

Comment author: RolfAndreassen 08 May 2009 07:43:54PM 1 point [-]

"The uniform distribution centered at c" does not seem to make sense. Did you perchance mean the Gaussian distribution? Further, 'deviates' looks like jargon to me. Can we use 'samples'? I would therefore rephrase as follows, with a specific example to hang one's visualisation on:

Heights of male humans are known to have a Gaussian distribution of width 10 cm around some central value <h>; unfortunately you have forgotten what the central value is. Joe is 180 cm, Stephen is 170 cm. The probability that <h> is between these two heights is 50%; explain why. Then find a better confidence interval for <h>.

Comment author: Cyan 08 May 2009 08:14:51PM *  1 point [-]

I mean the continuous uniform distribution. "Centered at c" is intended to indicate that the mean of the distribution is c.

ETA: Let me be specific -- I'll use the notation of the linked Wikipedia article.

You know that b - a = 1.

c = (a + b)/2 is unknown, and the confidence interval is supposed to help you infer it.

Comment author: MrHen 08 May 2009 08:26:13PM 0 points [-]

If exactly half of all men have a height less than the central value c, then a randomly picked sample has a 50% chance of being below c. Picking two samples (A and B) results in four possible scenarios:

  1. A is less than c; B is greater than c
  2. A is less than c; B is less than c
  3. A is greater than c; B is greater than c
  4. A is greater than c; B is less than c

The interval created by (A, B) contains c in scenarios (1) and (4) and does not contain c in scenarios (2) and (3). Since each scenario has an equal chance of occurring, c is in (A, B) 50% of the time.
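
The four-scenario argument above can be checked numerically. The following is a minimal sketch, not anything posted in the thread: it assumes the two samples are drawn independently from a distribution whose median is c, and the helper names (`coverage`, `uniform_sample`, `gaussian_sample`), the seed, and the trial count are illustrative choices. The Gaussian width of 10 and the uniform width of 1 are taken from the comments above; the centre of 175 is made up.

```python
import random

def coverage(sample, c, trials=100_000, seed=0):
    """Estimate how often the interval (min(A, B), max(A, B)) contains c."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a, b = sample(rng, c), sample(rng, c)  # two independent draws
        if min(a, b) < c < max(a, b):
            hits += 1
    return hits / trials

def uniform_sample(rng, c):
    # Continuous uniform with b - a = 1, centred at c.
    return rng.uniform(c - 0.5, c + 0.5)

def gaussian_sample(rng, c):
    # Gaussian of width (standard deviation) 10 around c.
    return rng.gauss(c, 10.0)

uniform_cov = coverage(uniform_sample, c=0.0)
gaussian_cov = coverage(gaussian_sample, c=175.0)  # 175 is a hypothetical centre
print(f"uniform:  {uniform_cov:.3f}")
print(f"gaussian: {gaussian_cov:.3f}")
```

Both estimates should land close to 0.5, since the argument uses only the symmetry of the distribution about c and not its shape.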

That is as far as I got just thinking about it. If I am on the right path I can keep plugging away.

Comment author: Cyan 08 May 2009 08:34:06PM 1 point [-]

In the Gaussian case, you can do better than (A, B) but the demonstration of that fact won't smack you in the face the way it does in the case of the uniform distribution.

Comment author: steven0461 08 May 2009 08:43:08PM 0 points [-]

One thing you can do in the uniform case is shorten the interval to at most length 1/2. Not sure if that's face-smacking enough.

Comment author: Psy-Kosh 08 May 2009 10:39:43PM *  0 points [-]

You can do better than that. If the distance between the two data points is 7/4, you can shrink the 100% confidence interval to 1/4, etc. (The extreme case is as the distance between the two data points approaches 2, your 100% confidence interval approaches size zero.)

EDIT: whoops, I was stupid. Corrected 3/4 to 7/4 and 1 to 2. There, now it should be right

Comment author: AllanCrossman 08 May 2009 08:37:28PM 0 points [-]

Do we know the heights of the men A and B? If so, we can get a better estimate of whether c lies between their heights by taking into account the difference between A and B...

Comment author: Cyan 08 May 2009 08:40:28PM 0 points [-]

That's the basic idea. Now apply it in the case of the uniform distribution.

Comment author: AllanCrossman 08 May 2009 08:41:45PM *  1 point [-]

If all men are (say) within 10 cm of each other, and the heights are uniformly distributed...

... if we have two men who are 8 cm apart, then c lies between their heights with 80% probability?

Comment author: Cyan 08 May 2009 08:43:47PM 0 points [-]

Getting there... 80% is too low.

Comment author: AllanCrossman 08 May 2009 08:47:50PM *  1 point [-]

Wait, what? It must be 100%...
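
The reason it must be 100% can be spelled out with a short sketch. It assumes heights are uniform on a range 10 cm wide centred at c, so every sample lies within 5 cm of c. The function name `certain_interval` and the two heights (170 and 178) are hypothetical, chosen only to be 8 cm apart as in the example above.

```python
def certain_interval(x1, x2, width):
    """Interval that must contain the centre c of a uniform distribution
    of the given width, given two observed samples x1 and x2."""
    half = width / 2.0
    lo = max(x1, x2) - half  # c cannot be more than width/2 below the larger sample
    hi = min(x1, x2) + half  # c cannot be more than width/2 above the smaller sample
    return lo, hi

# Hypothetical heights, 8 cm apart, with a 10 cm wide uniform range.
lo, hi = certain_interval(170.0, 178.0, width=10.0)
print(lo, hi)  # 173.0 175.0
```

Because 173 to 175 sits entirely inside 170 to 178, c is certain to lie between the two men's heights. The same calculation shows that the certain interval has length equal to the width minus the spacing, here 10 - 8 = 2.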

Comment author: Cyan 08 May 2009 08:50:57PM *  1 point [-]

That's it. The so-called 50% confidence interval sometimes contains c with certainty. Also, when x_max - x_min is much smaller than 0.5, 50% is a lousy summary of the confidence (ETA: common usage confidence, not frequentist confidence) that c lies between them.
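
A simulation makes both halves of this point visible. This is a hedged sketch under the thread's setup (two independent samples from a uniform distribution with b - a = 1); the function name `coverage_by_spacing`, the bin count, the seed, and the choice of c = 0 are illustrative assumptions. It groups trials by the spacing x_max - x_min and reports how often c fell between the samples within each group.

```python
import random

def coverage_by_spacing(trials=200_000, bins=10, seed=0):
    """Fraction of trials in which c lies between the two samples,
    grouped by the spacing x_max - x_min (which ranges over [0, 1))."""
    rng = random.Random(seed)
    c = 0.0  # true centre; the result does not depend on this choice
    hits = [0] * bins
    counts = [0] * bins
    for _ in range(trials):
        a = rng.uniform(c - 0.5, c + 0.5)
        b = rng.uniform(c - 0.5, c + 0.5)
        k = min(int(abs(a - b) * bins), bins - 1)  # index of the spacing bin
        counts[k] += 1
        if min(a, b) < c < max(a, b):
            hits[k] += 1
    return [h / n if n else float("nan") for h, n in zip(hits, counts)]

rates = coverage_by_spacing()
for k, rate in enumerate(rates):
    print(f"spacing in [{k / 10:.1f}, {(k + 1) / 10:.1f}): coverage ~ {rate:.3f}")
```

Averaged over all trials the interval still covers c about half the time, but the per-bin rates should be far below 50% for small spacings and exactly 1 once the spacing reaches 0.5.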

Comment author: AllanCrossman 08 May 2009 08:54:43PM 0 points [-]

If it's less than 0.5, is the confidence simply that value times 2?