Daniel_Burfoot comments on The Statistician's Fallacy - Less Wrong

38 Post author: ChrisHallquist 09 December 2013 04:48AM


Comment author: Daniel_Burfoot 09 December 2013 08:35:00PM *  23 points [-]

Essentially all scientific fields rely heavily on statistics.

This is true in a technical sense but misses a crucial distinction. Hard sciences (basically physics and its relatives) are far less vulnerable to statistical pitfalls because practitioners in those fields have the ability to generate effectively unlimited quantities of data by simply repeating experiments as many times as necessary. This makes statistical reasoning largely irrelevant: in the limit of infinite data, you don't need to do Bayesian updates because the weight of the prior is insignificant compared to the weight of the observations. Rutherford, for example, did not bother to state a prior probability for the plum pudding model of the atom compared to the planetary model; he just amassed a bunch of experimental data and showed that the plum pudding model could not explain it. This large-data-generation ability of physics is largely why that field has succeeded in spite of continuing debates and confusion about the fundamentals of statistical philosophy. Researchers in fields like economics, nutrition, and medicine simply cannot obtain data on the same scale that physicists can.
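(The prior-washout point can be made concrete with a toy sketch, not from the comment itself: two radically different Beta priors on a coin's bias, updated on the same large dataset, end up with nearly identical posterior means. The numbers here are made up for illustration.)

```python
def posterior_mean(prior_a, prior_b, heads, tails):
    """Posterior mean of the bias under a Beta(prior_a, prior_b) prior
    after observing `heads` and `tails` coin flips (Beta-Binomial conjugacy)."""
    return (prior_a + heads) / (prior_a + prior_b + heads + tails)

heads, tails = 70_000, 30_000  # "effectively unlimited" data, true bias ~0.7

# A skeptic's prior (mean 0.01) and a believer's prior (mean 0.99):
skeptic = posterior_mean(1, 99, heads, tails)
believer = posterior_mean(99, 1, heads, tails)

# Both posteriors land within a fraction of a percent of 0.7:
# the data has swamped the prior entirely.
print(skeptic, believer)
```

With 100,000 observations, the two posterior means differ by about 0.001 despite priors that disagreed by a factor of 99.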

Comment author: satt 11 December 2013 07:41:47AM *  3 points [-]

I agree that hard sciences are far less vulnerable to statistical pitfalls. However, I'd point at three factors other than data generation to explain it:

  1. The hard sciences have theories that define specific, quantitative models, which makes it far easier to test the theories. Fitting a misspecified model is much less of a risk, and a model may make such a specific prediction that fewer data are needed to falsify it.

  2. Signal-to-noise ratios are often much higher in the hard sciences. Where that's the case, you generally don't need such advanced statistics to analyse results, and you're more likely to notice when you do the statistics incorrectly and get a wrong answer. And even if a model doesn't truly fit the data, it may still explain the vast majority of the variation in the data; you can get an R² of 0.999 in physics, while if you get an R² of 0.999 in the social sciences it means you did something stupid in Excel or SPSS and accidentally regressed something against itself.

  3. In the hard sciences, one has a good chance of accounting for all of the important causes of an effect of interest. In the social sciences this is usually impossible; often one doesn't even know the important causes of an effect, making it difficult to rule out confounding (unless one can sever unknown causal links via e.g. randomization).
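(The signal-to-noise point in item 2 can be illustrated with a small simulation, not taken from the thread: fitting the same underlying linear law with physics-scale measurement noise versus noise comparable to the signal itself. All numbers are invented for illustration.)

```python
import random

def r_squared(xs, ys):
    """R^2 of a simple least-squares line fit of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

random.seed(0)
xs = [i / 10 for i in range(100)]
law = [2.0 * x + 1.0 for x in xs]                     # the underlying "law"
physics = [y + random.gauss(0, 0.01) for y in law]    # tiny measurement noise
social = [y + random.gauss(0, 5.0) for y in law]      # noise comparable to the signal

print(r_squared(xs, physics))  # very close to 1
print(r_squared(xs, social))   # far from 1, despite an identical true model
```

The same correct model yields an R² near 0.9999 in the low-noise regime and somewhere around 0.5–0.6 in the high-noise one; in the latter, a statistical mistake is much easier to miss.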

Comment author: [deleted] 10 December 2013 09:53:42AM *  3 points [-]

Hard sciences (basically physics and its relatives) are far less vulnerable to statistical pitfalls because practitioners in those fields have the ability to generate effectively unlimited quantities of data by simply repeating experiments as many times as necessary.

There are exceptions such as ultra-high-energy cosmic ray physics, where it'd take decades to take enough data for naive frequentist statistics to be reliable.

Comment author: Kurros 10 December 2013 10:37:20PM 1 point [-]

Statistics also remains important at the frontier of high energy physics. Trying to reason about which models are likely to replace the Standard Model is plagued by every issue in the philosophy of statistics that you can imagine. And the arguments about this affect where billions of dollars' worth of research funding end up (build bigger colliders? more dark matter detectors? satellites?)

Comment author: [deleted] 15 December 2013 08:48:57AM 0 points [-]

Sure; if we had enough data to conclusively answer a question it would no longer be at the frontier. :-)

(I disagree with several of the claims in the linked post, but that's another story.)

Comment author: Eugine_Nier 11 December 2013 05:02:54AM -1 points [-]

I suspect it's not so much the amount of data as the fact that the underlying causal structure tends to be much simpler.

With, e.g., biology you have the problem of the Harvard law.