I ran across Ed Hagen’s article “Academic success is either a crapshoot or a scam”, which pointed out that all the methodological discussion about science’s replication crisis is kinda missing the point: yes, methodological problems like p-hacking would be valuable to fix, but the real problem is in the incentives created by the crazy publish-or-perish culture:
In my field of anthropology, the minimum acceptable number of pubs per year for a researcher with aspirations for tenure and promotion is about three. This means that, each year, I must discover three important new things about the world. […]
Let’s say I choose to run 3 studies that each has a 50% chance of getting a sexy result. If I run 3 great studies, mother nature will reward me with 3 sexy results only 12.5% of the time. I would have to run 9 studies to have about a 90% chance that at least 3 would be sexy enough to publish in a prestigious journal.
I do not have the time or money to run 9 new studies every year.
I could instead choose to investigate phenomena that are more likely to yield strong positive results. If I choose to investigate phenomena that are 75% likely to yield such results, for instance, I would only have to run about 5 studies (still too many) for mother nature to usually grace me with at least 3 positive results. But then I run the risk that these results will seem obvious, and not sexy enough to publish in prestigious journals.
To put things in deliberately provocative terms, empirical social scientists with lots of pubs in prestigious journals are either very lucky, or they are p-hacking.
I don’t really blame the p-hackers. By tying academic success to high-profile publications, which, in turn, require sexy results, we academic researchers have put our fates in the hands of a fickle mother nature. Academic success is therefore either a crapshoot or, since few of us are willing to subject the success or failure of our careers to the roll of the dice, a scam.
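The binomial arithmetic in the quoted passage is easy to check directly; here’s a short sketch (the success probabilities and study counts are Hagen’s numbers, not data):

```python
from math import comb

def p_at_least(k, n, p):
    """Probability of at least k successes in n independent trials,
    each succeeding with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Three studies, each with a 50% chance of a "sexy" result:
print(p_at_least(3, 3, 0.5))   # 0.125, i.e. 12.5%
# Nine studies give roughly 90% odds of at least three sexy results:
print(p_at_least(3, 9, 0.5))   # ≈ 0.91
# With phenomena that are 75% likely to pan out, about five studies suffice:
print(p_at_least(3, 5, 0.75))  # ≈ 0.90
```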
The article then suggests that the solution would be to have better standards for research, and also blames prestigious journal publishers for exploiting their monopoly on the field. I think that looking at the researcher incentives is indeed the correct thing to do here, but I’m not sure the article goes deep enough with it. Mainly, it doesn’t ask the obvious question of why researchers face such crazy pressure to publish: it’s not the journals that set the requirements for promotion or getting onto the tenure track; that’s the universities and research institutions. The journals are just exploiting a lucrative situation that someone else created.
Rather, my understanding is that the real problem is that there are simply too many PhD graduates who want to do research, relative to the number of researcher positions available. It’s a basic fact of skill measurement that if you try to measure skill and then pick people based on how well they performed on your measure, you’re actually selecting for skill + luck rather than pure skill. If the number of people you pick is small enough relative to the number of applicants, anyone you pick has to be both highly skilled and highly lucky; simply being highly skilled isn’t enough to make it to the top. This is the situation we have with current science, and as Hagen points out, it leads to rampant cheating when people realize that they have to cheat in order to make the cut. As long as this is the situation, there will remain an incentive to cheat.
This looks hard to fix; the two obvious solutions would be to reduce the number of graduate students or to massively increase the number of research jobs. The first is politically challenging, especially since it would require international coordination and lots of nations view the number of graduating PhDs as a status symbol. The second would be expensive and thus also politically challenging. One thing that some of my friends suggested was some kind of researchers’ basic income (or just a universal basic income in general); for fields in which doing research isn’t much more expensive than covering the researchers’ cost of living, a lot of folks would probably be happy to do research on the basic income alone.
A specific suggestion that was thrown out was to give some number of post-docs a 10-year grant of 2000 euros/month; depending on the exact number of grants given out, this could fund quite a number of researchers while still being cheap in comparison to any given country’s general research and education expenses. Better-paid and more prestigious formal research positions, such as university professorships, would still exist as an incentive to actually do the research, and historically quite a lot of research has been done by people with no financial incentive for it anyway (Einstein doing his research on the side while working at the patent office maybe being the most famous example); the fact that most researchers are motivated by the pure desire to do science is already shown by the fact that anyone at all decides to go into academia today. A country being generous in handing out these kinds of grants could also be made into an international status symbol, creating an incentive to actually do this. Alternatively, this could just be viewed as yet another reason to push for a universal basic income for everyone.
EDIT: Jouni Sirén made the following interesting comment in response to this article: “I think the root issue goes deeper than that. There are too many PhD graduates who want to do research, because money and prestige are insufficient incentives for a large part of the middle class. Too many people want a job that is interesting or meaningful, and nobody is willing to support all of them financially.” That’s an even deeper reason than the one I was thinking of!
The problem isn't the selecting-by-outcome part; the problem is the part where one tries to discriminate more finely than the measure allows.
Suppose that you are making people take an exam on which they can score between 0 and 100. If you took their exam scores literally as indicative of their skill, you might be tempted to use the scores to choose the people in the top 1% (say). But the exam probably isn't capable of measuring skill that precisely; the error of the measure may be such that realistically, it's only reliably picking up on skill differences on the level of 10-point increments. If you try to use it to pick the top 1% anyway, you are just measuring noise.
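As a rough illustration, here’s a toy simulation of such an exam; the skill distribution and the ±5-point noise are invented for the sketch, not claims about any real exam:

```python
import random

random.seed(0)
N = 100_000  # number of examinees (illustrative)

# Hypothetical model: true skill on a 0-100 scale, plus exam noise of up to
# +-5 points, so the exam only reliably resolves ~10-point skill differences.
skills = [random.gauss(70, 10) for _ in range(N)]
scores = [s + random.uniform(-5, 5) for s in skills]

def top_cutoff(values, pct):
    """Value separating the top `pct` fraction from the rest."""
    return sorted(values)[int(len(values) * (1 - pct))]

results = {}
for pct in (0.01, 0.10):
    score_cut = top_cutoff(scores, pct)
    skill_cut = top_cutoff(skills, pct)
    selected = [i for i in range(N) if scores[i] >= score_cut]
    # How many of those selected by score are genuinely in the true top pct?
    results[pct] = sum(skills[i] >= skill_cut for i in selected) / len(selected)
    print(f"top {pct:.0%} by score: {results[pct]:.0%} are in the true top {pct:.0%}")
```

In runs of this sketch, the top-1% selection is noticeably less accurate than the top-10% selection: the finer the cut relative to the noise, the more of the selection is decided by luck.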
So if you want to make things fair in the sense of rewarding everyone who actually has enough skill, you should set the cutoff so that everyone who scores in the top decile gets rewarded. (This does mean that some people who actually aren't that skilled will also be rewarded by luck, but that's an unavoidable trade-off.)
If people didn't respond to incentives, then you wouldn't need to care about making things fair in this sense. After all, anyone who's in the top 1% is in the top 10% too, and picking the people in the top 1% does mean that you're very unlikely to get someone who's actually below the 90th percentile but made it over the cutoff by luck. But people do respond to incentives; in particular, if they realize that they can't make it to your cutoff just by being talented and hard-working but have to be exceptionally lucky as well, then they will feel the pressure to cheat and deliver results that look good but are actually false.
In fact, if this is common enough, then picking the people that the measure says are in the top 1% may actually get you people who are worse, at least in the sense of being more willing to fake their results, if it's more likely that you'll get into the top 1% by cheating than by being lucky.
I find it useful to think of this literally as an exam situation: you know that you need to score in the top percentile to be rewarded. You also know that skill alone only gets anyone to the 90-point mark, and the rest is luck. Furthermore, you know that a number of people also taking the exam - some as skilled as you, some less so - are going to cheat on it, and that the vast majority of people who do make it to the top percentile only succeed because they cheated. If you don't cheat, it doesn't matter how skilled or hard-working you are: you almost certainly won't make it.
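A toy version of this exam can be simulated too; the cheating rate, the size of the boost cheating gives, and the score model are all made-up assumptions for illustration:

```python
import random

random.seed(1)
N = 100_000
CHEAT_RATE = 0.10   # assumed: 10% of examinees cheat
CHEAT_BOOST = 8     # assumed: how many extra points cheating fakes

def exam_score(skill, cheats):
    # Skill alone only gets you to the 90-point mark; the rest is luck,
    # plus a boost for those who cheat.
    base = min(skill, 90) + random.uniform(0, 10)
    return base + CHEAT_BOOST if cheats else base

people = [(random.gauss(75, 10), random.random() < CHEAT_RATE) for _ in range(N)]
ranked = sorted(people, key=lambda p: exam_score(*p), reverse=True)
top = ranked[:N // 100]  # the top percentile by measured score
cheater_share = sum(cheats for _, cheats in top) / len(top)
print(f"cheaters: {CHEAT_RATE:.0%} of examinees, {cheater_share:.0%} of the top 1%")
```

Under these made-up numbers the top percentile ends up with several times the population's share of cheaters, which is the point: the cutoff selects for cheating, not just for skill.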
Would you cheat?
If no, you'll get eliminated from the system and have to do something else. If yes, you're contributing to the situation and increasing the incentive for others to cheat, and helping establish implicit norms where it's taken for granted that anyone who made it past this round of elimination is a cheater, so cheating can't be that bad.
Whereas if the cutoff was at the 90th percentile level, some people would still get in because they cheated, but the system would no longer be selecting strongly against people with integrity, and the environment would be much more conducive to establishing norms where cheating was looked on unfavorably.