On average, if you eliminate twice as many hypotheses as I do from the same data, how much more data than you do I need to achieve the same results? Does it depend on how close we are to the theoretical maximum?

Comment author:gwern
20 June 2009 06:14:01PM
*
13 points
[-]

Well, think about it. If I can eliminate 1/2 the remaining hypotheses, and you just 1/4, then we're dealing with exponential processes here.

Let's suppose we get 1 bit a day. If we start with 4 hypotheses, then on day 1 I have 2 left, and you have 3; day 2, I have 1 left, and you have 2; on day 3, I blow up your planet just as you finally figure out the right hypothesis. If there are 1 billion hypotheses, then I'll be able to solve it in something like 20 days, and you 49. If there are a trillion, then 30 vs. 73; if a quadrillion, 40 vs. 97...

Yeah, we'll both solve the problem, but the difference can be significant.

## Comments (164)

OldOn average, if you eliminate twice as many hypotheses as I do from the same data, how much more data than you do I need to achieve the same results? Does it depend on how close we are to the theoretical maximum?

*13 points [-]Well, think about it. If I can eliminate 1/2 the remaining hypotheses, and you just 1/4, then we're dealing with exponential processes here.

Let's suppose we get 1 bit a day. If we start with 4 hypotheses, then on day 1 I have 2 left, and you have 3; day 2, I have 1 left, and you have 2; on day 3, I blow up your planet just as you finally figure out the right hypothesis. If there are 1 billion hypotheses, then I'll be able to solve it in something like 20 days, and you 49. If there are a trillion, then 30 vs. 73; if a quadrillion, 40 vs. 97...

Yeah, we'll both solve the problem, but the difference can be significant.