The following situation has happened to me several times during my research career: I write code to analyze data; I have some expectation about what the results will be; after running the program, the results are not what I expected; I go back and carefully check the code for errors; sometimes I find one.
No matter how careful you are when writing computer code, I think you are more likely to find a mistake if you believe one is there. Unexpected results lead us to suspect a coding error far more than expected results do.
Researchers usually do have expectations about what they will find (e.g., the drug will not increase the risk of the disease; the toxin will not decrease the risk of cancer).
Consider the following graphic:
Here, the green region is consistent with our expectations. For example, if we expect a relative risk (RR) of about 1.5, we might not be too surprised if the estimated RR is between (e.g.) 0.9 and 2.0. Anything above 2.0 or below 0.9 might make us highly suspicious of an error -- that's the red region. Estimates in the red region are likely to trigger a serious hunt for coding errors. Of course, if no coding error is found, the paper gets submitted with the surprising results.
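To make the idea concrete, here is a minimal sketch of that plausibility check. The cutoffs of 0.9 and 2.0 are just the illustrative values from above, and the function name is my own invention:

```python
# "Green" region: estimates we would not find surprising for an expected RR of ~1.5.
EXPECTED_RR_RANGE = (0.9, 2.0)

def scrutiny_level(estimated_rr, expected_range=EXPECTED_RR_RANGE):
    """Casual review for estimates inside the expected range; aggressive otherwise."""
    lo, hi = expected_range
    return "casual" if lo <= estimated_rr <= hi else "aggressive"

print(scrutiny_level(1.4))  # casual -- green region, probably a quick once-over
print(scrutiny_level(2.6))  # aggressive -- red region, serious error hunting
```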
Error scenarios
Let's assume there is a coding error that causes the estimated effect to differ from the true effect (and assume the sample size is large enough that we can ignore sampling variability).
Consider the following scenario:
Type A. Here, the estimated value is biased, but it's within the expected range. In this scenario, error checking is probably more casual and less likely to be successful.
Next, consider this scenario:
Type B. In this case, the estimated value is in the red zone. This triggers aggressive error checking of the type that has a higher success rate.
Finally:
Type C. In this case it's the true value that differs from our expectations; the coding error happens to pull the estimate back to about what we would expect. This triggers casual error checking of the less-likely-to-be-successful variety.
If this line of reasoning holds, we should expect journal articles to contain errors at a higher rate when the results are consistent with the authors' prior expectations. This could be viewed as a type of confirmation bias.
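A rough Monte Carlo sketch of the argument follows. Every number in it (the fraction of studies with a coding error, the size of the bias, the detection probabilities for casual versus aggressive checking) is a made-up assumption chosen only to illustrate the mechanism, not an estimate of real-world rates:

```python
import random

random.seed(1)

EXPECTED = (0.9, 2.0)        # the "green" region from the figure
P_CODING_ERROR = 0.2         # assumed fraction of studies with a coding error
P_DETECT_CASUAL = 0.3        # assumed chance a casual check catches the error
P_DETECT_AGGRESSIVE = 0.9    # assumed chance an aggressive check catches the error

def in_green(rr):
    return EXPECTED[0] <= rr <= EXPECTED[1]

def run_study():
    """Simulate one study; return (published RR estimate, has undetected error)."""
    # True effect: usually near expectations, occasionally genuinely surprising (Type C).
    true_rr = random.uniform(1.0, 1.8) if random.random() < 0.8 else random.uniform(2.2, 3.0)
    has_error = random.random() < P_CODING_ERROR
    # A coding error multiplies the estimate by a random bias factor.
    estimate = true_rr * random.choice([0.5, 0.8, 1.2, 1.6]) if has_error else true_rr
    # Estimates in the red zone trigger aggressive checking; green gets a casual look.
    p_detect = P_DETECT_CASUAL if in_green(estimate) else P_DETECT_AGGRESSIVE
    if has_error and random.random() < p_detect:
        estimate, has_error = true_rr, False   # error found and fixed before submission
    return estimate, has_error

studies = [run_study() for _ in range(200_000)]
green = [err for rr, err in studies if in_green(rr)]
red = [err for rr, err in studies if not in_green(rr)]

print(f"undetected-error rate, results in expected range:  {sum(green)/len(green):.3f}")
print(f"undetected-error rate, results outside that range: {sum(red)/len(red):.3f}")
```

With these invented parameters, the undetected-error rate comes out higher among results that land in the expected range, which is the confirmation-bias pattern described above.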
How common are programming errors in research?
There are many opportunities for hard-to-detect errors to occur. For large studies, there might be hundreds of lines of code related to database creation, data cleaning, etc., plus many more lines of code for data analysis. Studies also typically involve multiple programmers. I would not be surprised if at least 20% of published studies include results that were affected by at least one coding error. Many of these errors probably had a trivial effect, but I am sure others did not.
Ah, medium to strong disagree. I'm not far into my scientific career in $_DISCIPLINE, but any paper introducing a new "standard code" (i.e., one that you intend to use more than once) has an extensive section explaining how their code has accurately reproduced analytic results or agreed with previous simulations in a simpler case (simpler than the one currently being analysed). Most codes also seem to be open-source, since it's good for your cred if people are writing papers saying "Using x's y code, we analyse..." which means they need to be clearly written and commented - not a guarantee against pernicious bugs, but certainly a help. This error-checking setup is also convenient for those people generating analytic solutions, since they can find something pretty and say "Oh, people can use this to test their code."
Of course, this isn't infallible, but sometimes you have to do 10 bad simulations before you can do 1 good one.
Fluid dynamics seems to be a much more serious field than the one I was doing an REU in. None of the standard papers I read even considered supplying code. Fortunately I have found a different field of study.
Also, you have persuaded me to include code in my senior thesis. Which I admit I've also debugged in a manner similar to the one mentioned in the article... I kept fixing bugs until my polynomials stopped taking up a whole page of Mathematica output and started fitting onto one line. Usually a good sign.