How to fix academia?
I don't usually submit articles to Discussion, but this news upset me so much that I think there is a real need to talk about it.
http://www.nature.com/news/faked-peer-reviews-prompt-64-retractions-1.18202
A leading scientific publisher has retracted 64 articles in 10 journals, after an internal investigation discovered fabricated peer-review reports linked to the articles’ publication.
The cull comes after similar discoveries of ‘fake peer review’ by several other major publishers, including London-based BioMed Central, an arm of Springer, which began retracting 43 articles in March citing "reviews from fabricated reviewers". The practice can occur when researchers submitting a paper for publication suggest reviewers, but supply contact details for them that actually route requests for review back to the researchers themselves.
Types of Misconduct
We all know that academia is a tough place to be in. There is constant pressure to 'publish or perish', and people are given promotions and pay raises directly as a result of how many publications and grants they are awarded. I was awarded a PhD recently so the subject of scientific honesty is dear to my heart.
I'm of course aware of misconduct in the field of science. 'Softer' forms of misconduct include things like picking only results that are consistent with your hypothesis or repeating experiments until you get low p-values. This kind of thing sometimes might even happen non-deliberately and subconsciously, which is why it is important to disclose methods and data.
'Harder' forms of misconduct include making up data and fudging numbers in order to get published and cited. This is of course a very deliberate kind of fraud, but it is still easy to see how someone could be led to this kind of behaviour by virtue of the incredible pressures that exist. Here, the goal is not just academic advancement, but also obtaining recognition. The authors in this case are confident that even though their data is falsified, their reasoning (based, of course, on falsified data) is sound and correct and stands up to scrutiny.
What is the problem?
But the kind of misconduct being mentioned in the linked article is extremely upsetting to me, beyond the previous types of misconduct. It is a person or (more likely) a group of people knowing full well that their publication would not stand up to serious scientific scrutiny. Yet they commit the fraud anyway, guessing that no one will actually ever seriously scrutinize their work and it will take it at face value due to being present in a reputable journal. The most upsetting part is that they are probably right in this assessment.
Christie Aschwanden wrote a piece about this recently on FiveThirtyEight. She makes the argument that cases of scientific misconduct are still rare and not important in the grand scheme of things. I only partially agree with this. I agree that science is still mostly trustworthy, but I don't necessarily agree that scientific misconduct is too rare to be worth worrying about. It would be much more honest to say that we simply do not know the extent of scientific misconduct, because there is no comprehensive system in place to detect it. Surveys on this have indicated that as much as 1/3 of scientists admit to some form of questionable practices, with 2% admitting to downright fabrication or falsification of evidence. These figures could be widely off the mark. It is, unfortunately, easy to commit fraud without being detected.
Aschwanden's conclusion is that the problem is that science is difficult. With this I agree wholeheartedly. And to this I'd add that science has probably become too big. A few years ago I did some research in the area of nitric oxide (NO) transmission in the brain. I did a search and found 55,000 scientific articles from reputable publications with "nitric oxide" in the title. Today this number is over 62,000. If you expand this to both the title and abstract, you get about 160,000. Keep in mind that these are only the publications that have actually passed the process of peer review.
I have read only about 1,000 articles total during the entirety of my PhD, and probably <100 in the actual level of depth required to locate flaws in reasoning. The problem with science becoming too big is that it's easy to hide things. There are always going to be fewer fact-checkers than authors, and it is much harder to argue logically about things than it is to simply write things. The more the noise, the harder it becomes to listen.
It was not always this way. The rate of publication is increasing rapidly, outstripping even the rate of growth in number of scientists. Decades ago publications played only a minor role in the scientific process. Publications mostly had the role of disseminating important information to a large audience. Today, the opposite is true - most articles have a small audience (as, in people with the will and ability to read them), consisting of perhaps only a handful of individuals - often only the people in the same research group of institutional department. This leads to the problem where it is often seen that many publications actually receive most of their citations from people who are friends or colleagues of the authors.
Some people have suggested that because of the recent high-level cases of fraud that have been uncovered, there is now increased scrutiny and fraud is going to be uncovered more rapidly. This may be true for the types of fraud that already have been uncovered, but fraudsters are always going to be able to stay ahead of the scrutinizers. Experience with other forms of crime show this quite clearly. Before the article in nature I had never even thought about the possibility of sending reviews back to myself. It simply never occurred to me. All of these considerations lead me to believe that the problem of scientific fraud may actually get worse, not better, over time. Unless the root of the problem is attacked.
How Can it be Solved?
So how to solve the problem of scientific misconduct? I don't have any good answers. I can think of things like "Stop awarding people for mere number of publications" and "Gauge the actual impact of science rather than empty metrics like number of citations or impact factor." But I can't think of any good way to do these things. Some alternatives - like using, for instance, social media to gauge the importance of a scientific discovery - would almost certainly lead to a worse situation than we have now.
A small way to help might be to adopt a payment system for peer-review. That is, to get published, you pay a certain amount of money for researchers to review your work. Currently, most reviewers offer their services for free (however they are sometimes allocated a certain amount of time for peer-review in their academic salary). A pay system would at least give an incentive for people to rigorously review work rather than simply trying to optimize for minimum amount of time invested in review. It would also reduce the practice of parasitic submissions (people submitting to short-turnaround-time, high-profile journals like Nature just to get feedback on their work for free) and decrease the flow volume of papers submitted for review. However, it would also incentivize a higher rate of rejection to maximize profits. And it would disproportionately impact scientists from places with less scientific funding.
What are the real options we have here to minimize misconduct?
Open Thread, November 23-30, 2013
If it's worth saying, but not worth its own post (even in Discussion), then it goes here.
Research on unconscious visual processing
There is a new paper out by Sanguinetti, Allen, and Peterson, The Ground Side of an Object: Perceived as Shapeless yet Processed for Semantics. In it, the authors conduct a series of experiments to try to answer the question of how the brain separates background from foreground in visual processing. I found it interesting, so I thought I'd share. The human visual system is incredibly complex and we still have no clear idea how it does a lot of the things it does.
The experimental protocol was as follows:
The stimuli were 120 small, mirror-symmetric, enclosed white silhouettes (Trujillo, Allen, Schnyer, & Peterson, 2010). Of these, 40 portrayed meaningful name-able objects (animals, plants, symbols) inside their borders and suggested only meaningless novel objects on the ground side of their borders. The remaining 80 silhouettes depicted meaningless novel objects (objects not encountered previously) inside their borders. Of these, 40 suggested portions of nameable meaningful objects on the ground side of their borders. Note, however, that participants were not aware of the meaningful objects suggested on the ground side of these silhouettes. The remaining 40 novel silhouettes suggested novel shapes on both sides of their borders."
Stimuli were presented on a 20-in. CRT monitor 90 cm from the participants using DMDX software. Participants’ heads were unrestrained. Their task was to classify the silhouettes as depicting real-world or novel objects. Responses were made via button press; assignment of the responses to the two response buttons was random.
They then recorded the EEG signals from the participants and found something surprising: When the background was meaningful, the subject's brain waves produced the same signatures as would be expected when conscious awareness had taken place (called 'N300' and 'N400' signatures because they occur 300 and 400 ms after presentation of the stimulus), even if the subjects did not report percieving anything meaningful in the images. However, if the background really was meaningless, this signature was absent.
This is consistent with the "Brains aren't magic" doctrine. The brain can't magically separate objects from non-objects in the simplest levels of processing. No, just like a computer it has to tediously go through the image, pixel by pixel, identifying patterns and gradually building up into complex representations. It is only after this process is over that it can reliably separate background from foreground. It's just that this process happens with a huge degree of parallelism and we are not consciously aware of what is happening. Of course, none of this should be of any surprise to those here at LessWrong. However, as the authors note, the thinking that there is some mysterious mechanism in the brain that does complex visual processing in the very first visual areas seems to be subconsciously prevalent in psychological studies, and this has also confused many machine learning researchers.
There is a question about the study, though. How far did the processing go? Was it just the brain recognizing some abstract features of the background, not identifying it as a particular object, or did the brain actually subconsciously figure out what the object was? To determine this, they designed another experiment:
To ascertain that semantic access underlies the N300 and N400 effects observed in Experiment 1, we preceded each silhouette with a word in Experiment 2. For a critical novel-object/meaningful-ground silhouette, the word named either the object suggested on the silhouette’s ground side (match condition) or a different object (mis-match condition). If semantic access occurs for grounds, N300 and N400 responses to the novel-object/meaningful-ground silhouettes would be reduced in the match condition compared with the mismatch condition even though the semantic repetition involved stimuli from different modalities.
What they found was that indeed, semantic access underlies the responses - the brain knows what it's seeing, even when the interpretation doesn't rise up to the level of full conscious awareness (the full statistical analysis is in the paper, which is sadly not open-access):
Experiment 2 replicated these results when semantics were accessed cross-modally. Our interpretation of the neurophysiological evidence is buttressed by a recent behavioral experiment showing semantic priming from objects suggested on the ground side of the silhouettes (Peterson et al., 2012). These results are contrary to the traditional serial-processing assumption that semantic representations are accessed only after object segregation and instead support a dynamic view of perception according to which more objects are evaluated by the visual system than are ultimately perceived.
If you regard brains as Bayesian networks and spikes as messages, neurons in the visual cortex send trillions of messages amongst each other before arriving on a 'consensus' probability distribution on what is being seen. Throughout this whole process, many complex hypotheses are generated but discarded before reaching conscious perception. The brain does not magically side-step through the search space. It must trudge through it just like a computer. This is consistent with what we know about existing bayesian vision systems and the limitations of computer hardware.

Subscribe to RSS Feed
= f037147d6e6c911a85753b9abdedda8d)