Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.
For the first time in history, it has become possible for a limited group of a few thousand people to threaten the absolute destruction of millions.
-- Norbert Wiener (1956), Moral Reflections of a Mathematician.
Today, the general attitude towards scientific discovery is that scientists are not themselves responsible for how their work is used. For someone who is interested in science for its own sake, or even for someone who mostly considers research to be a way to pay the bills, this is a tempting attitude. It would be easy to only focus on one’s work, and leave it up to others to decide what to do with it.
But this is not necessarily the attitude that we should encourage. As technology becomes more powerful, it also becomes more dangerous. Throughout history, many scientists and inventors have recognized this, and taken different kinds of action to help ensure that their work will have beneficial consequences. Here are some of them.
This post is not arguing that any specific approach for taking responsibility for one's actions is the correct one. Some researchers hid their work, others refocused on other fields, still others began active campaigns to change the way their work was being used. It is up to the reader to decide which of these approaches were successful and worth emulating, and which ones were not.
… I do not publish nor divulge [methods of building submarines] by reason of the evil nature of men who would use them as means of destruction at the bottom of the sea, by sending ships to the bottom, and sinking them together with the men in them.
People did not always think that the benefits of freely disseminating knowledge outweighed the harms. O.T. Benfey, writing in a 1956 issue of the Bulletin of the Atomic Scientists, cites F.S. Taylor’s book on early alchemists:
Alchemy was certainly intended to be useful .... But [the alchemist] never proposes the public use of such things, the disclosing of his knowledge for the benefit of man. …. Any disclosure of the alchemical secret was felt to be profoundly wrong, and likely to bring immediate punishment from on high. The reason generally given for such secrecy was the probable abuse by wicked men of the power that the alchemical would give …. The alchemists, indeed, felt a strong moral responsibility that is not always acknowledged by the scientists of today.
With the Renaissance, science began to be viewed as public property, but many scientists remained cautious about the way in which their work might be used. Although he held the office of military engineer, Leonardo da Vinci (1452-1519) drew a distinction between offensive and defensive warfare, and emphasized the role of good defenses in protecting people’s liberty from tyrants. He described war as ‘bestialissima pazzia’ (most bestial madness), and wrote that ‘it is an infinitely atrocious thing to take away the life of a man’. One of the clearest examples of his reluctance to unleash dangerous inventions was his refusal to publish the details of his plans for submarines.
Later Renaissance thinkers continued to be concerned with the potential uses of their discoveries. John Napier (1550-1617), the inventor of logarithms, also experimented with a new form of artillery. Upon seeing its destructive power, he decided to keep its details a secret, and even spoke from his deathbed against the creation of new kinds of weapons.
But concealing a single discovery pales in comparison to what Robert Boyle (1627-1691) did. A pioneer of physics and chemistry, possibly most famous for describing and publishing Boyle’s law, he sought to make humanity better off, taking an interest in things such as improved agricultural methods as well as better medicine. In his studies, he also discovered knowledge and made inventions related to a variety of potentially harmful subjects, including poisons, invisible ink, counterfeit money, explosives, and kinetic weaponry. These ‘my love of Mankind has oblig’d me to conceal, even from my nearest Friends’.
Using the same method as in Study 1, we asked 20 University of Pennsylvania undergraduates to listen to either “When I’m Sixty-Four” by The Beatles or “Kalimba.” Then, in an ostensibly unrelated task, they indicated their birth date (mm/dd/yyyy) and their father’s age. We used father’s age to control for variation in baseline age across participants. An ANCOVA revealed the predicted effect: According to their birth dates, people were nearly a year-and-a-half younger after listening to “When I’m Sixty-Four” (adjusted M = 20.1 years) rather than to “Kalimba” (adjusted M = 21.5 years), F(1, 17) = 4.92, p = .040
That's from "False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant," which runs simulations of a version of Shalizi's "neutral model of inquiry," with random (null) experimental results, augmented with a handful of choices in the setup and analysis of an experiment. Even before accounting for publication bias, these few choices produced a desired result "significant at the 5% level" 60.7% of the time, and at the 1% level 21.5% of the time.
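To make the mechanism concrete, here is a minimal sketch of that kind of simulation. It is not the paper's code and will not reproduce its percentages: it assumes only two researcher choices (reporting whichever of two independent outcome measures reaches significance, and adding 10 participants per group if the first look fails), a z-test with known unit variance, and no true effect anywhere. All function names and parameters are illustrative.

```python
import math
import random

def two_sided_p(z):
    # Two-sided p-value for a standard normal test statistic.
    return math.erfc(abs(z) / math.sqrt(2))

def z_stat(a, b):
    # z statistic for a difference in means, assuming unit variance per observation.
    diff = sum(a) / len(a) - sum(b) / len(b)
    return diff / math.sqrt(1 / len(a) + 1 / len(b))

def one_study(rng, flexible, n_start=20, n_extra=10):
    # Each participant yields two outcome measures; the null is true for both.
    def draw(n):
        return [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

    treat, ctrl = draw(n_start), draw(n_start)
    # A strict analyst commits to measure 0; a flexible one reports either.
    measures = (0, 1) if flexible else (0,)

    def significant():
        return any(
            two_sided_p(z_stat([t[m] for t in treat], [c[m] for c in ctrl])) < 0.05
            for m in measures
        )

    if significant():
        return True
    if flexible:
        # Optional stopping: collect more data and test again.
        treat.extend(draw(n_extra))
        ctrl.extend(draw(n_extra))
        return significant()
    return False

def false_positive_rate(flexible, n_sims=20000, seed=0):
    # Fraction of null studies that end with a "significant" result.
    rng = random.Random(seed)
    return sum(one_study(rng, flexible) for _ in range(n_sims)) / n_sims
```

Comparing `false_positive_rate(False)` with `false_positive_rate(True)` should show the strict rate near the nominal 5% and the flexible rate noticeably above it, even with only two of the choices the paper considers.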
I found it because of another paper claiming time-defying effects, during a search through all of the papers on Google Scholar citing Daryl Bem's precognition paper, which I discussed in a past post about the problems of publication bias and selection over the course of a study. For Bem's study, Richard Wiseman established a registry so that the methods and statistical tests of the registered replications could be set before seeing the data (in addition to avoiding the file drawer).
Now a number of purported replications have been completed, with several available as preprints online, including a large "straight replication" carefully following the methods in Bem's paper, with some interesting findings discussed below. The picture does not look good for psi, and is a good reminder of the sheer cumulative power of applying a biased filter to many small choices.
Like The Cognitive Science of Rationality, this is a post for beginners. Send the link to your friends!
Science is broken. We know why, and we know how to fix it. What we lack is the will to change things.
In 2005, several analyses suggested that most published results in medicine are false. A 2008 review showed that perhaps 80% of academic journal articles mistake "statistical significance" for "significance" in the colloquial meaning of the word, an elementary error every introductory statistics textbook warns against. This year, a detailed investigation showed that half of published neuroscience papers contain one particular simple statistical mistake.
Also this year, a respected senior psychologist published in a leading journal a study claiming to show evidence of precognition. The editors explained that the paper was accepted because it was written clearly and followed the usual standards for experimental design and statistical methods.
Science writer Jonah Lehrer asks: "Is there something wrong with the scientific method?"
Yes, there is.
This shouldn't be a surprise. What we currently call "science" isn't the best method for uncovering nature's secrets; it's just the first set of methods we've collected that wasn't totally useless like personal anecdote and authority generally are.
As time passes we learn new things about how to do science better. The Ancient Greeks practiced some science, but few scientists tested hypotheses against mathematical models before Ibn al-Haytham's 11th-century Book of Optics (which also contained hints of Occam's razor and positivism). Around the same time, Al-Biruni emphasized the importance of repeated trials for reducing the effect of accidents and errors. Galileo brought mathematics to greater prominence in scientific method, Bacon described eliminative induction, Newton demonstrated the power of consilience (unification), Peirce clarified the roles of deduction, induction, and abduction, and Popper emphasized the importance of falsification. We've also discovered the usefulness of peer review, control groups, blind and double-blind studies, plus a variety of statistical methods, and added these to "the" scientific method.
In many ways, the best science done today is better than ever — but it still has problems, and most science is done poorly. The good news is that we know what these problems are and we know multiple ways to fix them. What we lack is the will to change things.
This post won't list all the problems with science, nor will it list all the promising solutions for any of these problems. (Here's one I left out.) Below, I only describe a few of the basics.
Recent renewed discussions of the parapsychology literature and Daryl Bem's recent precognition article brought to mind the "market test" of claims of precognition. Bem tells us that random undergraduate students were able to predict with 53% accuracy where an erotic image would appear in the future. If this effect was actually real, I would rerun the experiment before corporate earnings announcements, central bank interest rate changes, etc, and change the images based on the reaction of stocks and bonds to the announcements. In other words, I could easily convert "porn precognition" into "hedge fund trillionaire precognition."
If I was initially lacking in the capital to do trades, I could publish my predictions online using public key cryptography and amass an impressive track record before recruiting investors. If anti-psi prejudice was a problem, no one need know how I was making my predictions. Similar setups could exploit other effects claimed in the parapsychology literature (e.g. the remote viewing of the Scientologist-founded Stargate Project of the U.S. federal government). Those who assign a lot of credence to psi may want to actually try this, but for me this is an invitation to use parapsychology as a control group for science, and to ponder a general heuristic for crudely estimating the soundness of academic fields for outsiders.
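One simple way to build such a verifiable track record is a hash commitment: publish a digest of each prediction now, and reveal the prediction and a random nonce later. Note that this is a swapped-in technique, not public key cryptography as such (a digital signature would additionally prove who made the prediction). The sketch below uses only Python's standard library; the function names and the example prediction are hypothetical.

```python
import hashlib
import hmac
import secrets

def commit(prediction: str) -> tuple[str, bytes]:
    # The random nonce stops anyone from guessing the prediction
    # by hashing likely candidates before the reveal.
    nonce = secrets.token_bytes(32)
    digest = hashlib.sha256(nonce + prediction.encode("utf-8")).hexdigest()
    return digest, nonce  # publish the digest now, keep the nonce private

def verify(digest: str, nonce: bytes, prediction: str) -> bool:
    # After the reveal, anyone can recompute the digest and compare.
    expected = hashlib.sha256(nonce + prediction.encode("utf-8")).hexdigest()
    return hmac.compare_digest(expected, digest)

# Usage: commit before the event, reveal afterwards.
digest, nonce = commit("Hypothetical prediction: index closes higher on announcement day")
```

A record built this way is only convincing if every commitment is eventually revealed; committing to both outcomes and revealing only the winner would defeat the purpose, so the published digests need a public, timestamped, append-only home.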
One reason we trust that physicists and chemists have some understanding of their subjects is that they produce valuable technological spinoffs with concrete and measurable economic benefit. In practice, I often make use of the spinoff heuristic: If an unfamiliar field has the sort of knowledge it claims, what commercial spinoffs and concrete results ought it to be producing? Do such spinoffs exist? What are the explanations for their absence?
For psychology, I might cite systematic desensitization of specific phobias such as fear of spiders, cognitive-behavioral therapy, and military use of IQ tests (with large measurable changes in accident rates, training costs, etc). In financial economics, I would raise the hundreds of billions of dollars invested in index funds, founded in response to academic research, and their outperformance relative to managed funds. Auction theory powers tens of billions of dollars of wireless spectrum auctions, not to mention evil dollar-auction sites.
This seems like a great task for crowdsourcing: the cloud of LessWrongers has broad knowledge, and sorting real science from cargo cult science is core to being Less Wrong. So I ask you, Less Wrongers, for your examples of practical spinoffs (or suspicious absences thereof) of sometimes-denigrated fields in the comments. Macroeconomics, personality psychology, physical anthropology, education research, gene-association studies, nutrition research, wherever you have knowledge to share.
ETA: This academic claims to be trying to use the Bem methods to predict roulette wheels, and to have passed statistical significance tests on his first runs. Such claims have been made for casinos in the past, but always trailed away in failures to replicate, repeat, or make actual money. I expect the same to happen here.
Some of you may remember past Less Wrong discussion of the Daryl Bem study, which claimed to show precognition, and was published with much controversy in a top psychology journal, JPSP. The editors and reviewers explained their decision by saying that the paper was clearly written and used standard experimental and statistical methods so that their disbelief in it (driven by physics, the failure to show psi in the past, etc) was not appropriate grounds for rejection.
Because of all the attention received by the paper (unlike similar claims published in parapsychology journals) it elicited a fair amount of both critical review and attempted replication. Critics pointed out that the hypotheses were selected and switched around 'on the fly' during Bem's experiments, with the effect sizes declining with sample size (a strong signal of data mining). More importantly, Richard Wiseman established a registry for advance announcement of new Bem replication attempts.
A replication registry guards against publication bias, and at least 5 attempts were registered. As far as I can tell, at the time of this post the subsequent replications have, unsurprisingly, failed to replicate Bem's results.1 However, JPSP and the other high-end psychology journals refused to publish the results, citing standing policies of not publishing straight replications.
From the journals' point of view, this (common) policy makes sense: bold new claims will tend to be cited more and raise journal status (which depends on citations per article), even though this means most of the 'discoveries' they publish will be false despite their p-values. However, this means that overall the journals are giving career incentives for scientists to massage and mine their data for bogus results, but not to challenge bogus results by others. Alas.
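The claim that most published "discoveries" can be false despite their p-values follows from Bayes' rule. A minimal sketch, assuming a field where a fraction `prior` of tested hypotheses are true, studies detect true effects with probability `power`, and the significance threshold is `alpha`; the numbers in the usage line are illustrative, not estimates for psychology.

```python
def positive_predictive_value(prior, power, alpha=0.05):
    # Share of significant results that reflect a true effect.
    true_positives = prior * power
    false_positives = (1 - prior) * alpha
    return true_positives / (true_positives + false_positives)

# Hypothetical field: 5% of tested hypotheses true, 50% power.
ppv = positive_predictive_value(prior=0.05, power=0.5)
```

Under these assumed inputs, roughly a third of significant findings are true, and that is before any data mining or selective publication pushes the effective `alpha` above 5%.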
1 A purported "successful replication" by a pro-psi researcher in Vienna turns out to be nothing of the kind. Rather, it is a study conducted in 2006 and retitled to take advantage of the attention on Bem's article, selectively pulled from the file drawer.
ETA: The wikipedia article on Daryl Bem makes an unsourced claim that one of the registered studies has replicated Bem.
ETA2: Samuel Moulton, who formerly worked with Bem, mentions an unpublished (no further details) failed replication of Bem's results conducted before Bem submitted his article (the failed replication was not mentioned in the article).
ETA3: There is mention of a variety of attempted replications at this blog post, with 6 failed replications, and 1 successful replication from a pro-psi researcher (not available online). It is based on this ($) New Scientist article.
ETA4: This large study performs an almost straight replication of Bem (same methods, same statistical tests, etc) and finds the effect vanishes.
ETA5: Apparently, the mentioned replication was again submitted to the British Journal of Psychology:
When we submitted it to the British Journal of Psychology, it was finally sent for peer review. One referee was very positive about it but the second had reservations and the editor rejected the paper. We were pretty sure that the second referee was, in fact, none other than Daryl Bem himself, a suspicion that the good professor kindly confirmed for us. It struck us that he might possibly have a conflict of interest with respect to our submission. Furthermore, we did not agree with the criticisms and suggested that a third referee be brought in to adjudicate. The editor rejected our appeal.
When I found Less Wrong and started reading, when I made my first post, when I went to my first meetup….
It was a little like coming home.
And mostly it wasn’t. Mostly I felt a lot more out of place than I have in, say, church youth groups. It was hard to pinpoint the difference, but as far as I can tell, it comes down to this: a significant proportion of the LW posters are contrarians in some sense. And I’m a conformist, even if I would prefer not to be, even if that’s a part of my personality that I’m working hard to change. I’m much more comfortable as a follower than as a leader. I like pre-existing tradition, the reassuring structure of it. I like situations that allow me to be helpful and generous and hardworking, so that I can feel like a good person. Emotionally, I don’t like disagreeing with others, and the last thing I have to work hard to do is tolerate others' tolerance.
And, as evidenced by the fact that I attend church youth groups, I don’t have the strong allergy that many of the community seem to have against religion. This is possibly because I have easily triggered mystical experiences when, for example, I sing in a group, especially when we are singing traditional ‘sacred’ music. In a previous century, I would probably have been an extremely happy nun.
Someone once expressed surprise that I was able to become a rationalist in spite of this neurological quirk. I’ve asked myself this a few times. My answer is that I don’t think I deserve the credit. If anything, I ended up on the circuitous path towards reading LessWrong because I love science, and I love science because, as a child, reading about something as beautiful as general relativity gave me the same kind of euphoric experience as singing about Jesus does now. My inability to actually believe in any religion comes from a time before I was making my own decisions about that kind of thing.
I was raised by atheist parents, not anti-theist so much as indifferent. We attended a Unitarian Universalist church for a while, which meant I was learning about Jesus and Buddha and Native American spirituality all mixed together, all the memes watered down to the point that they lost their power. I was fourteen when I really encountered Christianity, still in the mild form of the Anglican Church of Canada. I was eighteen when I first encountered the ‘Jesus myth’ in its full, meme-honed-to-maximum-virulence form, and the story arc captivated me for a full six months. I still cry during every Good Friday service. But I must have missed some critical threshold, because I can’t actually believe in that story. I’m not even sure what it would mean to believe in a story. What does that feel like?
I was raised by scientists. My father did his PhD in physical chemistry, my mother in plant biology. I grew up reading SF and pop science, and occasionally my mother’s or my father’s old textbooks. I remember my mother’s awe at the beautiful electron-microscope images in my high school textbooks, and how she sat patiently while I fumblingly talked about quantum mechanics, having read the entire tiny physics section of our high school library. My parents responded to my interest in science with pride and enthusiasm, and to my interest in religion with indulgent condescension. That was my structure, my tradition. And yes, that has everything to do with why I call myself an atheist. I wouldn’t have had the willpower to disagree with my parents in the long run.
Ultimately, I have an awfully long way to go if I want to be rational, as opposed to being someone who’s just interested in reading about math and science. Way too much of my motivation for ‘having true beliefs’ breaks down to ‘maybe then they’ll like me.’ This is one of the annoying things about my personality, just as annoying as my sensitivity to religious memes and my inability to say no to anyone. Luckily, my personality also comes with the ability to get along with just about anyone, and in a forum of mature adults, no one is going to make fun of me because I’m wearing tie-dye overalls. No one here has yet made fun of me for my interest in religion, even though I expect most people disagree with it.
And there’s one last conclusion I can draw, albeit from a sample size of one. Not everyone can be a contrarian rationalist. Not everyone can rebel against their parents’ religion. Not everyone can disagree with their friends and family and not feel guilty. But everyone can be rational if they are raised that way.
My intent in the upcoming posts is to offer a practical overview of biological topics of both broad-scale importance and particular interest to the Less Wrong community. This will by no means be exhaustive (else I’d be writing a textbook instead, or more likely, you’d be reading one); instead I am going to attempt to sketch what amounts to a map of several parts of the discipline – where they stand in relation to other fields, where we are in the progress of their development, and their boundaries and frontiers. I’d like this to be a continually improving project as well, so I would very much welcome input on content relevance and clarity for any and all posts.
I will list relevant/useful references for more in-depth reading at the end of each post. The majority of in-text links will be used to provide a quick explanation of terms that may not be familiar or phenomena that may not be obvious. If the terms are familiar to you, you probably do not need to worry about those links. A significant minority of in-text links may or may not be purely for amusement.
If you want to carry a brimming cup of coffee without spilling it, you may want to "change" your goal to instead primarily concentrate on humming. This is an example of a general pattern. It sometimes helps to focus on a nearby artificial goal rather than your actual goal. Let me call that strategy "gamification". There is a business strategy, also named "gamification", of adding game mechanics to a website in order to achieve various business goals. This is related but different. Here I'm referring to a strategy for problem solvers.
We sometimes fail, and sometimes one failure is very similar to another failure. That is, there are characteristic ways that we fail. One of the primary ways that we can improve is to learn our failure modes and create external structures (pieces of paper, software tools) that check, protect against, or head off those forms of failure.
For example, imagine this plan of checklist improvement:
- Change your normal way of working to include an explicit checklist (that starts empty).
- When you make a mistake:
  - Analyze what went wrong.
  - Try to generalize the particular incident to a category.
  - Add an item to your checklist.
This is very simple and generic, but it is reasonable to believe that if you carefully and diligently followed this plan, your reliability would go up (with diminishing returns because your errors are also your opportunities for improvement). I have not read Mayo, but her "error-theoretic" philosophy of science might be applicable here.
We can try to build a correspondence between failure modes and the game mechanics that attempt to cope with each failure mode.
I think I’ve always had certain stereotypes in my mind about research. I imagine a cutting-edge workplace, maybe not using the newest gadgets because these things cost money, but at least using the newest ideas. I imagine staff of research institutions applying the scientific method to boost their own productivity, instead of taking for granted the way that things have always been done. Maybe those were the naive ideas of someone who had never actually worked in a research field.
At the medical research institute where I work one day a week, I recently spent an entire seven-hour day going down a list of patient names, searching them on the hospital database, deciding whether they met the criteria for a study, and typing them into a colour-coded spreadsheet. The process had maybe six discrete steps, and all of them were purely mechanical. In seven hours, I screened about two hundred and fifty patients. I was paid $12.50 an hour to do this. It cost my employer 35 cents for each patient that I screened, and these patients haven't been visited, consented or included in any study. They're still only names on a spreadsheet. I’ve been told that I learn and work quickly, but I know I do this task inefficiently, because I’m not a simple computer program. I get bored. I make mistakes. Heaven forbid, I get distracted and start reading the nurses’ notes for fun because I find them interesting.
In 7 hours, I imagine that someone slightly above my skill level could write a simple program to do the same task. They wouldn’t screen any patients in those 7 hours, but once the program was finished, they could use it forever, or at least until the task changed and the program had to be modified. I don’t know how much it would cost the organization to employ a programmer; maybe it would cost more than just having me do it. I don’t know whether allowing that program to access the confidential database would be an issue. But it seems inefficient to pay human brains to do work that they’re bad at, that computers would be better at, even if those human brains belong to undergrad students who need the money badly enough not to complain.
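As a rough illustration of what such a program might look like, here is a minimal sketch. Everything in it is hypothetical: the record fields, the two example criteria, and the city name are placeholders, and a real version would need approved access to the hospital database rather than an in-memory list.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class Patient:
    # Hypothetical fields; the real database schema is unknown.
    name: str
    clinic_city: str
    on_dialysis: bool

Criterion = Callable[[Patient], bool]

def screen(patients: Iterable[Patient], criteria: list[Criterion]) -> list[Patient]:
    # Keep only the patients who satisfy every inclusion criterion.
    return [p for p in patients if all(rule(p) for rule in criteria)]

# Example criteria, stated as assumptions for illustration only.
criteria = [
    lambda p: p.on_dialysis,
    lambda p: p.clinic_city == "Hometown",
]

candidates = [
    Patient("A", "Hometown", True),
    Patient("B", "Nearby City", True),
    Patient("C", "Hometown", False),
]
eligible = screen(candidates, criteria)
```

Expressing each criterion as a separate function means that when the study's requirements change, only the list of rules has to be edited, not the screening loop.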
One of the criteria I looked at when screening patients was whether they did their dialysis at a clinic in my hometown. They have to be within driving distance, because my supervisor has to drive around the city and pick up blood samples to bring to our lab. I crossed out 30 names without even looking them up because I could see at a glance that they were in a nearby city an hour’s drive away. How hard would it be to coordinate with the hospital in that city? Have the bloodwork analyzed there and the results emailed over? Maybe it would be non-trivially hard; I don’t know. I didn’t ask my supervisor because it isn’t my job to make management decisions. But medical research benefits everyone. A study with more patients produces data that’s statistically more valid, even if those patients live an hour’s drive away.
The office where I work is filled with paper. Floor-to-ceiling shelves hold endless binders full of source documents. Every email has to be printed and filed in a binder. Even the nurses’ notes and patient charts are printed off the database. It’s a legal requirement. The result is that we have two copies of everything, one online and one on paper, consuming trees. Running a computer consumes fossil fuels, of course. I don’t know for sure which is more efficient, paper or digital, but I do know that keeping both is inefficient. I did ask my supervisor about this, and apparently it’s because digital records could be lost or deleted. How much would it take to make them durable enough?
I guess that more than my supervisor, I see a future where software will do my job, where technology allows a study to be coordinated across the whole world, where digital storage will be reliable enough. But how long will it take for the laws and regulations to change? For people to change? I don’t know how many of my complaints are valid. Maybe this is the optimal way to do research, but it doesn’t feel like it. It feels like a papier-mâché of laws and habits and trial-and-error. It doesn't feel planned.