Informed consent bias in RCTs?
The problem of published research findings not being reliable has been discussed here before.
One problem with RCTs that has received little attention is that, due to informed consent laws and ethical considerations, subjects are aware that they might be receiving sham therapy. This differs from the environment outside of the research setting, where people are confident that whatever their doctor prescribes is what they will get from their pharmacist. I can imagine many ways in which subjects' uncertainty about treatment assignment could affect outcomes (adherence is one possible mechanism). I wrote a short paper about this, focusing on what we would ideally estimate if we could lie to subjects, versus what we actually can estimate in RCTs (link). Here is the abstract:
It is widely recognized that traditional randomized controlled trials (RCTs) have limited generalizability due to the numerous ways in which conditions of RCTs differ from those experienced each day by patients and physicians. As a result, there has been a recent push towards pragmatic trials that better mimic real-world conditions. One way in which RCTs differ from normal everyday experience is that all patients in the trial have uncertainty about what treatment they were assigned. Outside of the RCT setting, if a patient is prescribed a drug then there is no reason for them to wonder if it is a placebo. Uncertainty about treatment assignment could affect both treatment and placebo response. We use a potential outcomes approach to define relevant causal effects based on combinations of treatment assignment and belief about treatment assignment. We show that traditional RCTs are designed to estimate a quantity that is typically not of primary interest. We propose a new study design that has the potential to provide information about a wider range of interesting causal effects.
Any thoughts on this? Is this a trivial technical issue or something worth addressing?
Error detection bias in research
I have had the following situation happen several times during my research career: I write code to analyze data; there is some expectation about what the results will be; after running the program, the results are not what was expected; I go back and carefully check the code to make sure there are no errors; sometimes I find an error.
No matter how careful you are when it comes to writing computer code, I think you are more likely to find a mistake if you think there is one. Unexpected results lead one to suspect a coding error more than expected results do.
Researchers usually do have general expectations about what they will find (e.g., the drug will not increase the risk of the disease; the toxin will not decrease the risk of cancer).
Consider the following graphic:

[Figure: a number line of possible estimates, with a green region where results match expectations and red regions on either side where results would be surprising.]

Here, the green region is consistent with our expectations. For example, if we expect a relative risk (RR) of about 1.5, we might not be too surprised if the estimated RR is between (e.g.) 0.9 and 2.0. Anything above 2.0 or below 0.9 might make us highly suspicious of an error -- that's the red region. Estimates in the red region are likely to trigger a serious investigation for coding errors. Obviously, if no coding error is found, then the paper will get submitted with the surprising results.
Bayes' rule =/= Bayesian inference
Related to: Bayes' Theorem Illustrated, What is Bayesianism?, An Intuitive Explanation of Bayes' Theorem
(Bayes' theorem is something Bayesians need to use more often than Frequentists do, but Bayes' theorem itself isn't Bayesian. This post is meant to be a light introduction to the difference between Bayes' theorem and Bayesian data analysis.)
Bayes' Theorem
Bayes' theorem is just a way to get (e.g.) p(B|A) from p(A|B), p(B), and p(A). The classic example of Bayes' theorem is diagnostic testing. Suppose someone either has the disease (D+) or does not have the disease (D-) and either tests positive (T+) or tests negative (T-). If we knew the sensitivity P(T+|D+), specificity P(T-|D-) and disease prevalence P(D+), then we could get the positive predictive value P(D+|T+) using Bayes' theorem:

P(D+|T+) = P(T+|D+)P(D+) / [P(T+|D+)P(D+) + P(T+|D-)P(D-)]

where P(T+|D-) = 1 - specificity and P(D-) = 1 - P(D+).
For example, suppose we know the sensitivity=0.9, specificity=0.8 and disease prevalence is 0.01. Then,

P(D+|T+) = (0.9)(0.01) / [(0.9)(0.01) + (0.2)(0.99)] = 0.009/0.207 ≈ 0.043
This answer is not Bayesian or frequentist; it's just correct.
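For concreteness, here is that arithmetic as a few lines of Python (a minimal sketch; the variable names are my own):

```python
# Positive predictive value via Bayes' theorem -- pure arithmetic, no inference.
sens = 0.9   # P(T+|D+)
spec = 0.8   # P(T-|D-)
prev = 0.01  # P(D+)

ppv = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
print(round(ppv, 3))  # 0.043
```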
Diagnostic testing study
Typically we will not know P(T+|D+) or P(T-|D-). We would consider these unknown parameters. Let's denote them by Θsens and Θspec. For simplicity, let's assume we know the disease prevalence P(D+) (we often have a lot of data on this).
Suppose 1000 subjects with the disease were tested, and 900 of them tested positive. Suppose 1000 disease-free subjects were tested and 200 of them tested positive. Finally, suppose 1% of the population has the disease.
Frequentist approach
Estimate the 2 parameters (sensitivity and specificity) using their sample values (sample proportions) and plug them into Bayes' formula above. This results in a point estimate for P(D+|T+) of 0.043. A standard error or confidence interval could be obtained using the delta method or bootstrapping.
Even though Bayes' theorem was used, this is not a Bayesian approach.
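A sketch of the plug-in approach in Python, using a parametric bootstrap for the interval (the function name, seed, and use of numpy are my own choices, not anything prescribed by the method):

```python
import numpy as np

rng = np.random.default_rng(0)
prev = 0.01  # assumed known

def ppv(sens, spec):
    """Bayes' formula for P(D+|T+)."""
    return sens * prev / (sens * prev + (1 - spec) * (1 - prev))

# Plug in the sample proportions: 900/1000 and 800/1000.
print(ppv(0.9, 0.8))  # ~0.043

# Parametric bootstrap: resample the two binomial counts, recompute the estimate.
sens_b = rng.binomial(1000, 0.9, size=10_000) / 1000
spec_b = rng.binomial(1000, 0.8, size=10_000) / 1000
boot = ppv(sens_b, spec_b)
print(boot.std(), np.percentile(boot, [2.5, 97.5]))  # SE and a 95% interval
```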
Bayesian approach
The Bayesian approach is to specify prior distributions for all unknowns. For example, we might specify independent uniform(0,1) priors for Θsens and Θspec. However, we should expect the test to do at least as well as guessing (guessing would mean randomly selecting 1% of people and calling them T+). In addition, we expect Θsens>1-Θspec. So, I might go with a Beta(4,2.5) distribution for Θsens and a Beta(2.5,4) distribution for Θspec.
Using these priors plus the data yields a posterior distribution for P(D+|T+) with posterior median 0.043 and 95% credible interval (0.038, 0.049). In this case, the Bayesian and frequentist approaches give essentially the same results (not surprising, since the priors are relatively flat and there are a lot of data). However, the methodology is quite different.
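Because the Beta prior is conjugate to the binomial likelihood, the posteriors here are Beta(4 + 900, 2.5 + 100) for Θsens and Beta(2.5 + 800, 4 + 200) for Θspec, and posterior draws can be pushed through Bayes' formula. A sketch (Monte Carlo sample size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
prev = 0.01  # assumed known

# Conjugate update: Beta(a, b) prior + k successes in n trials -> Beta(a + k, b + n - k).
sens = rng.beta(4 + 900, 2.5 + 100, size=100_000)   # 900/1000 diseased tested T+
spec = rng.beta(2.5 + 800, 4 + 200, size=100_000)   # 800/1000 disease-free tested T-

ppv = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
print(np.median(ppv), np.percentile(ppv, [2.5, 97.5]))  # ~0.043, ~(0.038, 0.049)
```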
Example that illustrates the benefit of Bayesian data analysis
(example edited to focus on credible/confidence intervals)
Suppose someone shows you what looks like a fair coin (you confirm heads on one side, tails on the other) and makes the claim: "This coin will land with heads up 90% of the time."
Suppose the coin is flipped 5 times and lands with heads up 4 times.
Frequentist approach
"A 95% confidence interval for the Binomial parameter is (.38, .99) using the Agresti-Coull method." Because 0.9 is within the confidence limits, the usual conclusion would be that we do not have enough evidence to rule it out.
Bayesian approach
"I don't believe you. Based on experience and what I know about the laws of physics, I think it's very unlikely that your claim is accurate. I feel very confident that the probability is close to 0.5. However, I don't want to rule out something a little bit unusual (like a probability of 0.4). Thus, my prior for the probability of heads is a Beta(30,30) distribution."
After seeing the data, we update our belief about the binomial parameter. The 95% credible interval for it is (0.40, 0.64). Thus, a value of 0.9 is still considered extremely unlikely.
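The conjugate update makes this a one-liner: a Beta(30,30) prior plus 4 heads and 1 tail gives a Beta(34,31) posterior. A sketch using scipy:

```python
from scipy.stats import beta

# Beta(30, 30) prior + 4 heads, 1 tail -> Beta(34, 31) posterior.
print(beta.ppf([0.025, 0.975], 34, 31))  # ~ [0.40, 0.64]
```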
This illustrates the idea that, from a Bayesian perspective, implausible claims require more evidence than plausible claims. Frequentists have no formal way of including that type of prior information.

Beauty quips, "I'd shut up and multiply!"
When it comes to probability, you should trust probability laws over your intuition. Many people got the Monty Hall problem wrong because their intuition was bad. You can get the solution to that problem using probability laws that you learned in Stats 101 -- it's not a hard problem. Similarly, there has been a lot of debate about the Sleeping Beauty problem. Again, though, that's because people are starting with their intuition instead of letting probability laws lead them to understanding.
The Sleeping Beauty Problem
On Sunday she is given a drug that sends her to sleep. A fair coin is then tossed just once in the course of the experiment to determine which experimental procedure is undertaken. If the coin comes up heads, Beauty is awakened and interviewed on Monday, and then the experiment ends. If the coin comes up tails, she is awakened and interviewed on Monday, given a second dose of the sleeping drug, and awakened and interviewed again on Tuesday. The experiment then ends on Tuesday, without flipping the coin again. The sleeping drug induces a mild amnesia, so that she cannot remember any previous awakenings during the course of the experiment (if any). During the experiment, she has no access to anything that would give a clue as to the day of the week. However, she knows all the details of the experiment.
Each interview consists of one question, "What is your credence now for the proposition that our coin landed heads?"
Two popular solutions have been proposed: 1/3 and 1/2.
The 1/3 solution
From wikipedia:
Suppose this experiment were repeated 1,000 times. We would expect to get 500 heads and 500 tails. So Beauty would be awoken 500 times after heads on Monday, 500 times after tails on Monday, and 500 times after tails on Tuesday. In other words, only in a third of the cases would heads precede her awakening. So the right answer for her to give is 1/3.
Yes, it's true that only in a third of cases would heads precede her awakening.
Radford Neal (a statistician!) argues that 1/3 is the correct solution.
This [the 1/3] view can be reinforced by supposing that on each awakening Beauty is offered a bet in which she wins 2 dollars if the coin lands Tails and loses 3 dollars if it lands Heads. (We suppose that Beauty knows such a bet will always be offered.) Beauty would not accept this bet if she assigns probability 1/2 to Heads. If she assigns a probability of 1/3 to Heads, however, her expected gain is 2 × (2/3) − 3 × (1/3) = 1/3, so she will accept, and if the experiment is repeated many times, she will come out ahead.
Neal is correct (about the gambling problem).
These two arguments for the 1/3 solution appeal to intuition and make no obvious mathematical errors. So why are they wrong?
Let's first start with probability laws and show why the 1/2 solution is correct. Just like with the Monty Hall problem, once you understand the solution, the wrong answer will no longer appeal to your intuition.
The 1/2 solution
P(Beauty woken up at least once | heads) = P(Beauty woken up at least once | tails) = 1. Because of the amnesia, all Beauty knows when she is woken up is that she has woken up at least once. That event had the same probability of occurring under either coin outcome. Thus, P(heads | Beauty woken up at least once) = 1/2. You can use Bayes' rule to see this if it's unclear.
Here's another way to look at it:
If it landed heads then Beauty is woken up on Monday with probability 1.
If it landed tails then Beauty is woken up on Monday and Tuesday. From her perspective, these days are indistinguishable. She doesn't know if she was woken up the day before, and she doesn't know if she'll be woken up the next day. Thus, we can view Monday and Tuesday as exchangeable here.
A probability tree can help with the intuition (this is a probability tree corresponding to an arbitrary wake-up day):

[Probability tree: first branch, the coin -- heads 1/2, tails 1/2; from the tails branch, a second split -- Monday 1/2, Tuesday 1/2.]
If Beauty was told the coin came up heads, then she'd know it was Monday. If she was told the coin came up tails, then she'd think there is a 50% chance it's Monday and a 50% chance it's Tuesday. Of course, when Beauty is woken up she is not told the result of the flip, but she can calculate the probability of each.
When she is woken up, she's somewhere on the second set of branches. We have the following joint probabilities: P(heads, Monday)=1/2; P(heads, not Monday)=0; P(tails, Monday)=1/4; P(tails, Tuesday)=1/4; P(tails, not Monday or Tuesday)=0. Thus, P(heads)=1/2.
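The tree can be checked by simulation. The sketch below simply samples an arbitrary wake-up day from the tree, treating Monday and Tuesday as exchangeable under tails (that exchangeability is the modeling assumption above, not something the simulation proves):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

heads = rng.random(n) < 0.5                        # fair coin
# Under tails, Monday and Tuesday are exchangeable from Beauty's perspective.
monday = np.where(heads, True, rng.random(n) < 0.5)

print((heads & monday).mean())    # P(heads, Monday)  ~ 0.50
print((~heads & monday).mean())   # P(tails, Monday)  ~ 0.25
print((~heads & ~monday).mean())  # P(tails, Tuesday) ~ 0.25
print(heads.mean())               # P(heads)          ~ 0.50
```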
Where the 1/3 arguments fail
The 1/3 argument says that with heads there is 1 interview and with tails there are 2 interviews, and therefore the probability of heads is 1/3. However, the argument would only hold if all 3 interview days were equally likely. That's not the case here. (On a wake-up day, heads & Monday is more likely than tails & Monday, for example.)
Neal's argument fails because he changed the problem. "on each awakening Beauty is offered a bet in which she wins 2 dollars if the coin lands Tails and loses 3 dollars if it lands Heads." In this scenario, she would make the bet twice if tails came up and once if heads came up. That has nothing to do with the probability of heads at a particular awakening. The fact that she should take the bet doesn't imply that heads is less likely; Beauty just knows that she'll win the bet twice if tails landed. We double count for tails.
Imagine I said "if you guess heads and you're wrong nothing will happen, but if you guess tails and you're wrong I'll punch you in the stomach." In that case, you will probably guess heads. That doesn't mean your credence for heads is 1 -- it just means I added a greater penalty to the other option.
Consider changing the problem to something more extreme. Here, we start with heads having probability 0.99 and tails having probability 0.01. If heads comes up we wake Beauty up once. If tails, we wake her up 100 times. Thirder logic would go like this: if we repeated the experiment 1000 times, we'd expect her to be woken up 990 times after heads on Monday, 10 times after tails on Monday (day 1), 10 times after tails on Tuesday (day 2), ..., and 10 times after tails on day 100. In other words, in ~50% of the cases (990 of 1,990) heads would precede her awakening. So the right answer for her to give is 1/2.
Of course, this would be absurd reasoning. Beauty knows heads has a 99% chance initially. But when she wakes up (which she was guaranteed to do regardless of whether heads or tails came up), she suddenly thinks they're equally likely? What if we made it even more extreme and woke her up even more times on tails?
Implausible consequence of 1/2 solution?
Nick Bostrom presents the Extreme Sleeping Beauty problem:
This is like the original problem, except that here, if the coin falls tails, Beauty will be awakened on a million subsequent days. As before, she will be given an amnesia drug each time she is put to sleep that makes her forget any previous awakenings. When she awakes on Monday, what should be her credence in HEADS?
He argues:
The adherent of the 1/2 view will maintain that Beauty, upon awakening, should retain her credence of 1/2 in HEADS, but also that, upon being informed that it is Monday, she should become extremely confident in HEADS:
P+(HEADS) = 1,000,001/1,000,002
This consequence is itself quite implausible. It is, after all, rather gutsy to have credence 0.999999 in the proposition that an unobserved fair coin will fall heads.
It's correct that, upon awakening on Monday (and not knowing it's Monday), she should retain her credence of 1/2 in heads.
However, if she is informed it's Monday, it's unclear what she should conclude. Why was she informed it was Monday? Consider two alternatives.
Disclosure process 1: regardless of the result of the coin toss, she will be informed on Monday that it's Monday, with probability 1.
Under disclosure process 1, her credence of heads on Monday is still 1/2.
Disclosure process 2: if heads she'll be woken up and informed that it's Monday. If tails, she'll be woken up on Monday and one million subsequent days, and only be told the specific day on one randomly selected day.
Under disclosure process 2, if she's informed it's Monday, her credence of heads is 1,000,001/1,000,002. However, this is not implausible at all. It's correct. This statement is misleading: "It is, after all, rather gutsy to have credence 0.999999 in the proposition that an unobserved fair coin will fall heads." Beauty isn't predicting what will happen on the flip of a coin; she's predicting what did happen after receiving strong evidence that it was heads.
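To see where 1,000,001/1,000,002 comes from: P(heads | told Monday) = (1/2) / [1/2 + (1/2)(1/1,000,001)] = 1,000,001/1,000,002. A scaled-down simulation sketch (1,001 awakening days instead of 1,000,001 so the conditioning event isn't vanishingly rare; the setup is my own paraphrase of disclosure process 2):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000_000
days_if_tails = 1_001  # scaled down from 1,000,001 for simulation purposes

heads = rng.random(n) < 0.5
# Process 2: heads -> always told "it's Monday"; tails -> told the day on one
# uniformly chosen awakening, which is Monday with probability 1/1001.
told_monday = np.where(heads, True, rng.integers(0, days_if_tails, size=n) == 0)

print(heads[told_monday].mean())  # ~ 1001/1002: overwhelming evidence for heads
```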
ETA (5/9/2010 5:38AM)
If we want to replicate the situation 1000 times, we shouldn't end up with 1500 observations. The correct way to replicate the awakening decision is to use the probability tree I included above. You'd end up with expected cell counts of 500, 250, 250, instead of 500, 500, 500.
Suppose at each awakening, we offer Beauty the following wager: she'd lose $1.50 if heads but win $1 if tails. She is asked for a decision on that wager at every awakening, but we only accept her last decision. Thus, if tails we'll accept her Tuesday decision (but won't tell her it's Tuesday). If her credence of heads is 1/3 at each awakening, then she should take the bet. If her credence of heads is 1/2 at each awakening, she shouldn't take the bet. If we repeat the experiment many times, she'd be expected to lose money if she accepts the bet every time.
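A quick sketch of that wager (the payoffs are the ones above; if Beauty accepts at every awakening, the single decision that counts is always "accept"):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

heads = rng.random(n) < 0.5
# Only the last decision counts: one settled bet per experiment,
# -$1.50 on heads, +$1.00 on tails.
winnings = np.where(heads, -1.50, 1.00)
print(winnings.mean())  # ~ -0.25 per experiment: always accepting loses money
```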
The problem with the logic that leads to the 1/3 solution is that it counts tails twice, but the question was about her credence at an awakening (interview).
ETA (5/10/2010 10:18PM ET)
Recall the thirder argument quoted above:

Suppose this experiment were repeated 1,000 times. We would expect to get 500 heads and 500 tails. So Beauty would be awoken 500 times after heads on Monday, 500 times after tails on Monday, and 500 times after tails on Tuesday. In other words, only in a third of the cases would heads precede her awakening. So the right answer for her to give is 1/3.
Another way to look at it: the denominator is not a sum of mutually exclusive events. Typically we use counts to estimate probabilities as follows: the numerator is the number of times the event of interest occurred, and the denominator is the number of times that event could have occurred.
For example, suppose Y can take values 1, 2 or 3 and follows a multinomial distribution with probabilities p1, p2 and p3=1-p1-p2, respectively. If we generate n values of Y, we could estimate p1 by taking the ratio of #{Y=1}/(#{Y=1}+#{Y=2}+#{Y=3}). As n goes to infinity, the ratio will converge to p1. Notice the events in the denominator are mutually exclusive and exhaustive. The denominator is determined by n.
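A sketch of that estimator (the probabilities are arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.choice([1, 2, 3], size=100_000, p=[0.2, 0.5, 0.3])

# The denominator counts mutually exclusive, exhaustive events, so it equals n.
print(np.mean(y == 1))  # converges to p1 = 0.2 as n grows
```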
The thirder solution to the Sleeping Beauty problem has as its denominator sums of events that are not mutually exclusive. The denominator is not determined by n. For example, if we repeat it 1000 times, and we get 400 heads, our denominator would be 400+600+600=1600 (even though it was not possible to get 1600 heads!). If we instead got 550 heads, our denominator would be 550+450+450=1450. Our denominator is outcome dependent, where here the outcome is the occurrence of heads. What does this ratio converge to as n goes to infinity? I surely don't know. But I do know it's not the posterior probability of heads.
Self-indication assumption is wrong for interesting reasons
The self-indication assumption (SIA) states that
Given the fact that you exist, you should (other things equal) favor hypotheses according to which many observers exist over hypotheses on which few observers exist.
The reason this is a bad assumption might not be obvious at first. In fact, I think it's very easy to miss.
Argument for SIA posted on Less Wrong
First, let's take a look at an argument for SIA that appeared at Less Wrong (link). Two situations are considered.
1. We imagine that there are 99 people in rooms that have a blue door on the outside (1 person per room). One person is in a room with a red door on the outside. It was argued that you are in a blue door room with probability 0.99.
2. Same situation as above, but first a coin is flipped. If heads, the red door person is never created. If tails, the blue door people are never created. You wake up in a room and know these facts. It was argued that you are in a blue door room with probability 0.99.
So why is 1 correct and 2 incorrect? The first thing we have to be careful about is not treating ourselves as special. The fact that you woke up just tells you that at least one conscious observer exists.
In scenario 1 we basically just need to know what proportion of conscious observers are in a blue door room. The answer is 0.99.
In scenario 2 you never would have woken up in a room if you hadn't been created. Thus, the fact that you exist is something we have to take into account. We don't want to estimate P(randomly selected person, regardless of whether they exist or not, is in a blue door room). That would be ignoring the fact that you exist. Instead, the fact that you exist tells us that at least one conscious observer exists. Again, we want to know what proportion of conscious observers are in blue door rooms. Well, there is a 50% chance (if heads landed) that all conscious observers are in blue door rooms, and a 50% chance that all conscious observers are in red door rooms. Thus, the marginal probability of a conscious observer being in a blue door room is 0.5.
The flaw in the more detailed Less Wrong proof (see the post) is when they go from step C to step D. The *you* being referred to in step A might not exist to be asked the question in step D. You have to take that into account.
General argument for SIA and why it's wrong
Let's consider the assumption more formally.
Assume that the number of people to be created, N, is a random draw from a discrete uniform distribution[1] on {1,2,...,Nmax}. Thus, P(N=k)=1/Nmax, for k=1,...,Nmax. Assume Nmax is large enough so that we can effectively ignore finite sample issues (this is just for simplicity).
Assume M = Nmax*(Nmax+1)/2 possible people exist, and we arbitrarily label them 1,...,M. (This choice of M makes P(person x exists) work out to exactly 1/Nmax, which keeps the algebra clean.) After the size of the world, say N=n, is determined, we randomly draw n people from the M possible people.
After the data are collected we find out that person x exists.
We can apply Bayes' theorem to get the posterior probability:
P(N=k|x exists)=k/M, for k=1,...,Nmax.
The prior probability was uniform, but the posterior favors larger worlds. QED.
Well, not really.
The flaw here is that we conditioned on person x existing, but person x only became of interest after we saw that they existed (peeked at the data).
What we really know is that at least one conscious observer exists -- there is nothing special about person x.
So, the correct conditional probability is:
P(N=k|someone exists)=1/Nmax, for k=1,...,Nmax.
Thus, prior=posterior and SIA is wrong.
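Both conditional probabilities can be checked by simulation. The sketch below uses a small Nmax so the Monte Carlo is cheap; note that "person x" is fixed in advance, which is exactly the step the SIA argument only gets to make by peeking at the data:

```python
import numpy as np

rng = np.random.default_rng(0)
n_max = 10
m = n_max * (n_max + 1) // 2  # 55 possible people, labeled 0..54
x = 0                         # a person singled out BEFORE seeing any data

counts_x = np.zeros(n_max + 1)    # worlds in which person x existed, by size
counts_any = np.zeros(n_max + 1)  # worlds in which someone existed (all of them)
for _ in range(100_000):
    n = rng.integers(1, n_max + 1)                 # world size, uniform on 1..n_max
    existing = rng.choice(m, size=n, replace=False)
    counts_any[n] += 1                             # "someone exists" always holds
    if x in existing:
        counts_x[n] += 1

print(counts_x[1:] / counts_x.sum())      # ~ k/M: conditioning on "x exists" favors big worlds
print(counts_any[1:] / counts_any.sum())  # ~ 1/Nmax: conditioning on "someone exists" is uniform
```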
Egotism
The flaw with SIA that I highlighted here is that it treats you as special, as if you were labeled ahead of time. But the reality is, no matter who was selected, they would think they are the special person. "But I exist, I'm not just some arbitrary person. That couldn't happen in a small world. It's too unlikely." In reality, the fact that I exist just means someone exists. I only became special after I already existed (peeked at the data and used it to construct the conditional probability).
Here's another way to look at it. Imagine that a random number between 1 and 1 trillion was drawn. Suppose 34,441 was selected. If someone then asked what the probability of selecting that number was, the correct answer would be 1 in 1 trillion. They could then argue, "that's too unlikely of an event. It couldn't have happened by chance." However, because they didn't identify the number(s) of interest ahead of time, all we really can conclude is that a number was drawn, and drawing a number was a probability 1 event.
I give more examples of this here.
I think Nick Bostrom is getting at the same thing in his book (page 125):
...your own existence is not in general a ground for thinking that hypotheses are more likely to be true just by virtue of implying that there is a greater total number of observers. The datum of your existence tends to disconfirm hypotheses on which it would be unlikely that any observers (in your reference class) should exist; but that's as far as it goes. The reason for this is that the sample at hand—you—should not be thought of as randomly selected from the class of all possible observers but only from a class of observers who will actually have existed. It is, so to speak, not a coincidence that the sample you are considering is one that actually exists. Rather, that's a logical consequence of the fact that only actual observers actually view themselves as samples from anything at all.
Related arguments are made in this LessWrong post.
[1] For simplicity I'm assuming a uniform prior... the prior isn't the issue here.






