We do ten experiments. A scientist observes the results, constructs a theory consistent with them, and uses it to predict the results of the next ten. We do them and the results fit his predictions. A second scientist now constructs a theory consistent with the results of all twenty experiments.
The two theories give different predictions for the next experiment. Which do we believe? Why?
One of the commenters links to Overcoming Bias, but as of 11 PM on Sep 28th, by the clock on David's blog, no one has given the exact answer that I would have given. It's interesting that a question so basic has received so many answers.
I'd use the only tool we have to sort theories: Occam's razor.
This is what many do by assuming the second is “over-fitted”. I believe a good scientist would search the literature before stating a theory, and would know about the first one. As he would also appreciate elegance, I'd expect him to come up with a simpler theory. But, as you pointed out, some time in an economics lab could easily prove me wrong, although I'm assuming the daunting complexity there comes from patching a theory after an experiment has disproved an earlier one, which is not the case we consider here.
In one word: the second (longer references).
The barrel and box analogy hides that simplicity argument by making all theories a ‘paper’. A stern wag of the finger to anyone who used statistical references, because there aren't enough data to do that.