dclayh comments on Error detection bias in research - Less Wrong

54 Post author: neq1 22 September 2010 03:00AM


Comment author: Morendil 22 September 2010 08:02:21AM 11 points [-]

I would not be surprised if at least 20% of published studies include results that were affected by at least one coding error.

My intuition is that this underestimates the occurrence, depending on the field. Let us define:

  • CE = study has been affected by at least one coding error
  • SP = study relies on a significant (>500 LOC) amount of custom programming

Then I'd assign over 80% to P(CE|SP).

My mom, a semi-retired neuroscientist, has been telling me recently how appalled she is at how many researchers around her abuse standard stats packages in egregious ways. The trouble is that scientists have access to powerful software packages for data analysis, but they often lack understanding of the concepts those packages deploy, and consequently make absurd mistakes.
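To make the "absurd mistakes" concrete, here is a hypothetical sketch of one classic stats-package misuse: running many uncorrected significance tests and reporting whichever comes out at p < 0.05. (The comment doesn't name a specific error; this example and the function name are illustrative, not drawn from the original.) Under the null hypothesis, each test has a 5% false-positive rate, but across m independent tests the chance of at least one spurious "finding" grows quickly:

```python
# Family-wise error rate: P(at least one false positive) when
# running m independent null-hypothesis tests at threshold alpha,
# with no multiple-comparisons correction applied.

def familywise_error_rate(m, alpha=0.05):
    """Probability of at least one false positive across m tests."""
    return 1 - (1 - alpha) ** m

for m in (1, 5, 20):
    print(f"{m:2d} tests: P(>=1 false positive) = "
          f"{familywise_error_rate(m):.2f}")
# 1 test  -> 0.05
# 20 tests -> 0.64
```

A researcher who slices a dataset twenty ways and reports the one "significant" slice has, under the null, about a 64% chance of finding something, which is exactly the kind of conceptual error a stats package will happily execute without complaint.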

"Shooting yourself in the foot" is the occupational disease of programmers, and this applies even to non-career programmers: people who program as a secondary requirement of their job and may not even be aware that what they're doing is programming.

Comment author: dclayh 22 September 2010 05:59:18PM *  0 points [-]

I strongly agree that you're more likely to get wrong results out of someone else's code than your own: you tend to assume that they did their own error checking, and that the code works the way you think it should (i.e., the way you would have written it yourself). Either or both of those assumptions may be false.

This is what led to my discovering a fairly significant error in my dissertation the day before I had to turn it in :) (Admittedly, self-delusion also played a role.)