You get what you measure/pay for. I'm actually surprised by how honest academia is given the terrible incentives. When I was on the Stanford Law Review we verified everything, including every single footnote, before publishing an article. While it would be impossible to do this for all scientific articles, how about doing it for the articles considered the best, and not trusting those that didn't receive this level of attention?
You get what you measure/pay for.
Sometimes. Monetary incentives are a good way of promoting behavior for which good metrics exist and which requires little in the way of creativity, innovation, or initiative. If you pay people per peer review, there will be people who just skim 10 papers a day and Google one or two random nitpicks. If you instead mandate a minimum time spent per paper, then papers more complex than the average paper still won't get a thorough review. Beware of Goodhart's Law.
I would lean toward making it standard practice for researchers to be expected to spend a large chunk of their time reading the literature and conducting reviews. Ideally this would be a mix of broad, breadth-first literature reviews and narrow, topic-specific reading.
In my utopian world, I'd like to see things like this (yes, it's probably a little/lot idealistic, but we can dream):
Spending too much or not enough time reading: "I see you've done a lot of good work, but I'm worried that this is coming at the cost of your peer review and other reading duties. It's important to have a strong foundation as well as depth of knowledge. This helps identify knowledge gaps in need of research, and helps prevent unknown-unknowns from jeopardizing projects. How about you spend a couple weeks reading once you've wrapped up the current project?"
Publishing too many, or not enough, papers: "You've put out a lot of papers recently. That's good, but I'm worried that this might indicate a lack of thoroughness. What do you think?" "Well, I think it just looked like more work than it was. I kinda selected them because they had a high perceived value, but didn't require as much effort. I'm aiming at low hanging fruit." "Fair enough. That's a legitimate strategy. Just try and make sure you are picking items with high actual value, rather than simply optimizing for perceived value."
Rushing through work rather than being thorough (or maybe even being too thorough?): "I just read the peer review of your paper. It seems like you made a couple mistakes that you could have caught by slowing down and being more thorough. For instance, he pointed out a flaw in your experimental design which left room for an alternative explanation of the data. Now we'll have to publish an entire separate experiment in order to confirm that the more likely hypothesis is correct. We're in the business of pushing science forward, NOT publishing papers."
I'd like to see things like this
That sounds a lot like a professor talking to a grad student.
If you have an actually innovative researcher (with probably a big ego), telling him to pause his research and read more is not likely to be productive X-)
I'm actually surprised by how honest academia is given the terrible incentives.
The consequences for being caught committing fraud (essentially termination of one's career in most cases) are too high. This probably acts as the main opposing force against fraud. Yet it's still apparently not enough.
The consequences for being caught committing fraud (essentially termination of one's career in most cases) are too high.
Not for "soft fraud" like data mining. And other types of fraud such as fudging the results of an experiment would be really hard to prove given that lots of honestly done experiments don't seem to replicate. Having someone find an error in your analysis certainly isn't cause for firing a tenured professor, and taking this into account I bet some people make deliberate errors that make their analysis more publishable. I've heard it can be sometimes very difficult to get another professor to give up his data, even when the data was used to publish an article in a journal that had a rule saying that you must make your data available upon request.
I'm actually surprised by how honest academia is given the terrible incentives.
How honest is it? Are you sure you're not underestimating its honesty?
The statement from Springer on this is here. I can think of some ways to fix the particular issue that caused the retractions.
I would guess that any solution to the larger issue of scientific misconduct would need to consider Goodhart's Law and work to eliminate opportunities for people to game the system. There is a site called Retraction Watch which has information on retractions that have occurred.
I perceive implementing a solution to be a much bigger barrier than finding one. I suspect the premier publications have a grip on prestige that is very difficult to surmount; without outside pressure, those publications would be taking a huge risk by making any major change to the status quo. But there are several thousand journals out there, and I don't get why more of them aren't attempting major changes. The vast majority of journals can easily go up or down in prestige by quite a lot. The obscure journal that got rid of p-values got far more publicity than a journal of its stature normally would. PLOS ONE rocketed upwards from out of nowhere in a short period of time, even if it isn't considered a premier publication. I read Andrew Gelman semi-regularly, and he occasionally suggests several different solutions.
Some ways to improve the system:
Lower the barrier to entry for critiques of accepted papers. A critique of a paper should not be treated the same as a regular paper: if a flaw is found, the critique should be published in the journal that published the original paper.
Prediction Markets - This isn't practical for the vast majority of journals, which are too small, but Science or JAMA could do it if they chose to. A prediction market could be set up to predict whether a study's major finding will be independently replicated within, say, the next 10 years. If you bet yes, you would win the bet if it was replicated; you would lose the bet if it failed replication or if no attempt at replication was made. This would give researchers an incentive to encourage others to independently replicate their work, since no attempt is treated the same as a failed attempt, and it would give bettors an incentive to check replications for fraud, since it could be worth a lot of money to them. This would only work for new findings, but I'm sure it could be modified for other types of studies. (A rough sketch of the payout rule follows this list.)
Raw data requirements - A study can't be published without its accompanying raw data, and possibly a complete replication package, although the latter isn't practical for every discipline (if the researcher isn't a programmer, they can't just export everything from R). The raw data itself, though, is definitely reasonable to ask for. It wouldn't be included in the paper copy, but who cares? On a side note, why can't we have full-color graphs? Seriously, it's the 21st century. Nobody reads the paper copy.
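Purely to make the proposed payout rule concrete, here is a minimal sketch in Python. The function name, the 10-year window, and resolving still-open contracts at the deadline are my own illustrative assumptions, not the rules of any existing market:

```python
from datetime import date
from typing import Optional

def resolve_replication_bet(published: date, today: date,
                            replication_outcome: Optional[str],
                            window_years: int = 10) -> Optional[str]:
    """Resolve a 'will this finding be independently replicated?' contract.

    replication_outcome: "success", "failure", or None (no attempt yet).
    Returns "YES" (yes-bettors win), "NO" (no-bettors win),
    or None if the contract is still open.
    """
    if replication_outcome == "success":
        return "YES"  # replicated within the window: yes side wins
    # Key feature of the proposal: once the window closes, a failed attempt
    # and *no* attempt are treated identically.
    deadline = published.replace(year=published.year + window_years)
    if replication_outcome == "failure" or today >= deadline:
        return "NO"
    return None  # contract still open

# Hypothetical example: a 2015 paper with no replication attempt by 2025
print(resolve_replication_bet(date(2015, 8, 26), date(2025, 8, 26), None))  # "NO"
```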
Great idea about using prediction markets. I'll think about making predictions on PredictionBook for a few studies, but this is obviously inferior to your suggestion, because only a real market gives the researchers incentives to be honest and to encourage replication.
In my former area (hep-th, gr-qc, quant-ph) a lot of research is open-loop, speculative calculation, with no way to check even the basic assumptions. The first-rate research is reasonably easy to identify, but the rest is something like "we take this totally made-up model from some other paper, make more questionable simplifying or complicating assumptions, and calculate some amplitudes, diagrams, propagators, BPS degeneracies, dual this or that." Unless you work on that particular topic, it is almost impossible to evaluate anything other than the math, if that. Then it gets published, since the general area of research is fashionable and there is nothing obviously wrong with the calculations, unrealistic though they may be. So the "fraud" is basically writing something that could be confused with snarXiv output, rather than falsifying anything outright. The main issue is one of noise, rather than deliberate fraud.
Re payment to reviewers: I cannot see this working in any remotely desirable direction. Instead there would be people making money by writing positive reviews, like the shills on Yelp and such.
I agree there is a problem, but the only suggestion I have is that arXiv should allow (semi-anonymous) comments by experts in the area. Something Reddit-like, maybe.
A comment system is a whole 'nother ball of wax. Who is qualified to comment? Who moderates the comments? What kind of moderation is used? Experience suggests that a comments section - even one thoroughly moderated to get rid of spam and trolls - becomes a place where people insert opinions that are mostly either completely unrelated to the article, the result of not reading the article, or just downright wrong. A possibility would be non-anonymous comments based on some sort of reputation system, like the various Stack Overflow-style sites.
Yes, I mean the latter: reputation-based, but potentially anonymized, so that only serious replies are allowed.
So how to solve the problem of scientific misconduct? I don't have any good answers. I can think of things like "Stop awarding people for mere number of publications" and "Gauge the actual impact of science rather than empty metrics like number of citations or impact factor." But I can't think of any good way to do these things. Some alternatives - like using, for instance, social media to gauge the importance of a scientific discovery - would almost certainly lead to a worse situation than we have now.
If you go up the administrative chain, at some point you reach someone who simply isn't equipped to evaluate a scientist's work. This may just be a department head unfamiliar with some subfield, or it might be the dean trying to weigh the relative merits of a physicist and a chemist; it's the rare person who knows enough about both fields to render good judgment. That's where metrics come in: it's a lot easier if you can point to some number as the basis for a decision. Even if it's agreed that number of publications or impact factor aren't good numbers to use, they're still convenient.
The practice can occur when researchers submitting a paper for publication suggest reviewers, ...
This is obviously corrupt, even with "real" reviewers. I review you, you review me. Our clique reviews each other. I wonder if we'll approve of each other. Duh.
The original idea behind this was to make the editor's job easier by suggesting experts in the field, and also to demonstrate that the authors are familiar with other people's work. The editor is then expected to use this information to aid the actual selection, and there is no guarantee that the editor will pick whom the authors suggested. In reality, what sometimes happens is that lazy editors simply pick whoever the authors suggested.
A crazy idea (for electronic journals): make everybody provide their own reviews of their work, and make the editorial a kind of meta-commentary on the articles and trends seen in the reviews. 'Dr. N doesn't provide any alternative explanation of her results, Dr. M admits she was funded by So-and-so, Dr. L cites mostly her supervisor's work, Dr. K refused to supply his data, and here's hoping we won't have to retract Dr. P's latest... As to the statistics employed, the following methods were used: a, b, c, which require the following basic assumptions: d, e, f, which were met by N, M, and L, although we cannot, of course, say anything about K, and we have suspicions about P because of [reasons]… Here are the Letters from our Readers about last month's batch, with suggestions for better experimental set-ups… And we are proud to announce that the Null Hypothesis Stands received another admission; it seems that the consensus on the matter is as yet unchallenged, making the base of the scientific superstructure that much stronger. Enjoy!' :)
One way to support the integrity of research is to promote Open Science. Making the research process transparent and publishing data openly makes it easier to verify and reproduce research.
Congratulations on your PhD!
Not entirely related, but I think it would be nice if journal articles had little tags that explicitly told you where the presented findings fall on the continuum between "immediately useful" and "this fundamental research might later lead to something useful". Having to waste time reading papers one isn't interested in just to figure this out can be a hassle.
This is a good idea, but it needs to be left up to the reviewers/editor, not the authors themselves.
Creating incentives to catch misconduct seems like the simplest solution. Some percentage of each grant should be set aside, not for those conducting the study, but for those who follow up on it, with a reward reserved for the first invalidation of its results. Set up a percentage penalty system, where grant seekers working for universities with high levels of recent invalidations (expiring after some period of time) get fewer grants going forward, to create incentives at the systems level.
Some percentage of each grant should be set aside, not for those conducting the study, but for those who follow up on it, with a reward reserved for the first invalidation of its results.
I may be missing something, but... if there is a prize for disproving a study that is a percentage of the cost of the original study, doesn't that just lower the payoff for cheating on the original study and raise the payoff for cheating on the disproving study?
That is, if I don't like the result that "Product X is ineffective", and I am willing to fund a study to disprove that claim, isn't it likely that I can more easily find a willing-to-fudge research team (because they stand to collect the invalidation bonus from the original grant)?
My understanding of this would be that the original grant would be split something like 90%/10% (grant/invalidation bonus), and the second grant 90%/10% (grant/invalidation bonus) + 10% of the previous grant (if the original study is invalidated).
Your disproving study can itself be disproved, thus claiming a portion of the funding allocated to you, and reducing your systems-level reputation and hence grant approvals.
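To put rough numbers on that split, here is a quick sketch; the 90/10 ratio comes from the comment above, and the dollar amounts are purely hypothetical:

```python
def split_grant(amount: float, bounty_rate: float = 0.10):
    """Split a grant into the portion paid to the research team and the
    portion held back as an invalidation bounty."""
    bounty = amount * bounty_rate
    return amount - bounty, bounty

# Original study: a hypothetical $500k grant
orig_team, orig_bounty = split_grant(500_000)      # $450k to the team, $50k held back

# Follow-up study: its own hypothetical $200k grant, same 90/10 split
follow_team, follow_bounty = split_grant(200_000)  # $180k to the team, $20k held back

# If the follow-up invalidates the original, its team also collects the
# original grant's bounty; the follow-up's own $20k bounty stays at risk
# in case it is itself later invalidated (the point made above).
follow_total = follow_team + orig_bounty           # 180k + 50k = 230k
print(orig_team, orig_bounty, follow_total)        # 450000.0 50000.0 230000.0
```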
Creating incentives to catch misconduct seems like the simplest solution
The effectiveness of this solution depends, in particular, on what other incentives are in place.
Imagine a poor crime-ridden neighbourhood where police put up "Rat on your neighbours -- we pay for tips!" posters. That's "incentives to catch misconduct", but even if you collect the tip you still have to live in the neighbourhood and I expect that being a known snitch carries a heavy price.
Do you think objectivity and willingness to challenge ideas in science are regarded by those within the fields in such a manner?
I mean, it wouldn't surprise me, but if it's gotten that far, I think the problem may have gotten beyond a simple remedy.
...are regarded by those within the fields in such a manner?
By some, certainly. I expect the prevalence to vary depending on the field. In, say, physics, not so much, but in things like gender studies, close to 100%.
problem may have gotten beyond a simple remedy
Duh... X-/
but even if you collect the tip you still have to live in the neighbourhood and I expect that being a known snitch carries a heavy price.
I don't see how this point carries over to the problem at hand... what's the heavy price for the scientist snitch?
For example, you won't be invited as a co-author on papers. People will exclude you from research groups. Reviewers will be nasty to your submissions.
Imagine a poor crime-ridden neighbourhood where police put up "Rat on your neighbours -- we pay for tips!" posters.
The bigger problem with that is that the police will be flooded with false tips.
I don't usually submit articles to Discussion, but this news upset me so much that I think there is a real need to talk about it.
http://www.nature.com/news/faked-peer-reviews-prompt-64-retractions-1.18202
Types of Misconduct
We all know that academia is a tough place to be in. There is constant pressure to 'publish or perish', and people are given promotions and pay raises directly as a result of how many publications and grants they are awarded. I was awarded a PhD recently, so the subject of scientific honesty is dear to my heart.
I'm of course aware of misconduct in the field of science. 'Softer' forms of misconduct include things like picking only the results that are consistent with your hypothesis or repeating experiments until you get low p-values. This kind of thing might sometimes even happen non-deliberately and subconsciously, which is why it is important to disclose methods and data.
'Harder' forms of misconduct include making up data and fudging numbers in order to get published and cited. This is of course a very deliberate kind of fraud, but it is still easy to see how someone could be led to this kind of behaviour by virtue of the incredible pressures that exist. Here, the goal is not just academic advancement, but also obtaining recognition. The authors in this case are confident that even though their data is falsified, their reasoning (based, of course, on falsified data) is sound and correct and stands up to scrutiny.
What is the problem?
But the kind of misconduct mentioned in the linked article is, to me, even more upsetting than the previous types. It involves a person or (more likely) a group of people who know full well that their publication would not stand up to serious scientific scrutiny. Yet they commit the fraud anyway, guessing that no one will ever actually scrutinize their work seriously and that it will be taken at face value because it appears in a reputable journal. The most upsetting part is that they are probably right in this assessment.
Christie Aschwanden wrote a piece about this recently on FiveThirtyEight. She makes the argument that cases of scientific misconduct are still rare and not important in the grand scheme of things. I only partially agree with this. I agree that science is still mostly trustworthy, but I don't necessarily agree that scientific misconduct is too rare to be worth worrying about. It would be much more honest to say that we simply do not know the extent of scientific misconduct, because there is no comprehensive system in place to detect it. Surveys have indicated that as many as a third of scientists admit to some form of questionable practices, with 2% admitting to outright fabrication or falsification of evidence. These figures could be wildly off the mark. It is, unfortunately, easy to commit fraud without being detected.
Aschwanden's conclusion is that the problem is that science is difficult. With this I agree wholeheartedly. And to this I'd add that science has probably become too big. A few years ago I did some research in the area of nitric oxide (NO) transmission in the brain. I did a search and found 55,000 scientific articles from reputable publications with "nitric oxide" in the title. Today this number is over 62,000. If you expand this to both the title and abstract, you get about 160,000. Keep in mind that these are only the publications that have actually passed the process of peer review.
I have read only about 1,000 articles in total during the entirety of my PhD, and probably <100 at the level of depth required to locate flaws in reasoning. The problem with science becoming too big is that it's easy to hide things. There will always be fewer fact-checkers than authors, and it is much harder to argue logically about things than it is to simply write them. The more the noise, the harder it becomes to listen.
It was not always this way. The rate of publication is increasing rapidly, outstripping even the growth in the number of scientists. Decades ago, publications played only a minor role in the scientific process; their main purpose was to disseminate important information to a large audience. Today, the opposite is true - most articles have a small audience (as in, people with the will and ability to read them), consisting of perhaps only a handful of individuals, often only the people in the same research group or institutional department. This leads to the situation where many publications receive most of their citations from friends or colleagues of the authors.
Some people have suggested that because of the recent high-profile cases of fraud that have been uncovered, there is now increased scrutiny and fraud will be uncovered more rapidly. This may be true for the types of fraud that have already been uncovered, but fraudsters are always going to be able to stay ahead of the scrutinizers; experience with other forms of crime shows this quite clearly. Before the article in Nature, I had never even thought about the possibility of sending reviews of my own work back to myself. It simply never occurred to me. All of these considerations lead me to believe that the problem of scientific fraud may actually get worse, not better, over time, unless the root of the problem is attacked.
How Can it be Solved?
So how to solve the problem of scientific misconduct? I don't have any good answers. I can think of things like "Stop awarding people for mere number of publications" and "Gauge the actual impact of science rather than empty metrics like number of citations or impact factor." But I can't think of any good way to do these things. Some alternatives - like using, for instance, social media to gauge the importance of a scientific discovery - would almost certainly lead to a worse situation than we have now.
A small way to help might be to adopt a payment system for peer review. That is, to get published, you pay a certain amount of money for researchers to review your work. Currently, most reviewers offer their services for free (though they are sometimes allocated a certain amount of time for peer review as part of their academic position). A payment system would at least give people an incentive to review work rigorously, rather than to optimize for the minimum time invested in the review. It would also reduce the practice of parasitic submissions (people submitting to short-turnaround-time, high-profile journals like Nature just to get feedback on their work for free) and decrease the volume of papers submitted for review. However, it would also incentivize a higher rate of rejection to maximize profits, and it would disproportionately impact scientists from places with less scientific funding.
What are the real options we have here to minimize misconduct?