Comment author: IlyaShpitser 15 December 2015 01:36:00AM *  2 points [-]

confounding depends on the correlations with both the independent and dependent variables

That's not the correct definition of confounding (standard counterexample: M-bias).

Re: missing controls, can try to find similar people who didn't take the course, and match on something sensible.

the concept of significance can’t be applied to confounding in a straightforward way

Not sure what this means, people have been using bootstrap CIs for the ACE for ages.
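(For illustration, here is a minimal sketch of a percentile-bootstrap confidence interval for an average treatment effect. It is not code from either commenter or from the study; the outcome numbers are entirely hypothetical toy data.)

```python
# Percentile-bootstrap CI for a difference in mean outcomes between
# a treated and a control group. All data below is hypothetical.
import random

random.seed(0)

treated = [1.2, 0.8, 1.5, 0.9, 1.1, 1.3, 0.7, 1.4]   # hypothetical outcomes
control = [0.6, 0.9, 0.5, 0.8, 0.7, 1.0, 0.4, 0.6]

def mean(xs):
    return sum(xs) / len(xs)

def bootstrap_ci(a, b, n_boot=10_000, alpha=0.05):
    """Percentile-bootstrap CI for mean(a) - mean(b)."""
    diffs = []
    for _ in range(n_boot):
        ra = [random.choice(a) for _ in a]   # resample with replacement
        rb = [random.choice(b) for _ in b]
        diffs.append(mean(ra) - mean(rb))
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_boot)]
    hi = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

lo, hi = bootstrap_ci(treated, control)
print(f"ACE estimate: {mean(treated) - mean(control):.3f}, "
      f"95% CI: ({lo:.3f}, {hi:.3f})")
```

The percentile method is the simplest bootstrap interval; fancier variants (BCa, studentized) exist but the idea is the same: resample, recompute the effect, and read off quantiles.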

Comment author: AstraSequi 15 December 2015 04:17:51AM *  0 points [-]

You'll have to clarify those points. For the first part, M-bias is not confounding. It's a kind of selection bias, and it occurs when there is no causal relation (though there may be a correlation) with the independent or dependent variables, specifically when you try to adjust for confounding that doesn't exist. The collider can be a confounder, but it doesn't have to be. From the second link: "some authors refer to this type of [M-bias] as confounding...but this extension has no practical consequences."

I don't think you can get a good control group after the fact, because you need their outcomes at both timepoints, with a year in between. None of the options that come to mind are very good: you could ask them what they would have answered a year ago, you could start collecting data now and ask them in a year's time, or you could throw out the temporal data and use only a single cross-section.

Comment author: Unnamed 14 December 2015 06:49:10AM 0 points [-]

Could you give an example of an additional analysis that you think should be run?

If the study included a comparison group which differed on some demographic variables (like gender), then I understand the value of running analyses that control for those variables (e.g., did the treatment group have a larger increase in conscientiousness than the comparison group while controlling for gender?). But that wasn't the study design, so we can't just run a regression with demographic controls.

Comment author: AstraSequi 15 December 2015 12:48:02AM *  0 points [-]

You want adjusted effect sizes to check for confounding. It's not because the variables differ for the controls, but because you don't know whether they affected your treatment group. You could stratify by group and take a weighted average of the effect sizes ("effect size" defined as change from baseline, as in the writeup). However, you might not have a large enough sample size in every stratum, you can't adjust for many variables at once, and the approach is inferior to regression.
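(A minimal sketch of the stratified approach described above, with entirely hypothetical numbers: compute the mean change from baseline within each stratum, then average the per-stratum effect sizes with weights proportional to stratum size.)

```python
# Stratified adjustment sketch: "effect size" = mean change from baseline
# within each stratum, then a stratum-size-weighted average.
# (baseline, followup) pairs per stratum -- hypothetical toy data.
strata = {
    "stratum_a": [(3.0, 3.8), (2.5, 3.1), (3.2, 3.6)],
    "stratum_b": [(2.8, 3.0), (3.1, 3.2)],
}

def mean_change(pairs):
    """Mean change from baseline within one stratum."""
    return sum(after - before for before, after in pairs) / len(pairs)

total_n = sum(len(pairs) for pairs in strata.values())

# Weighted average of the per-stratum effect sizes
adjusted = sum(len(pairs) * mean_change(pairs)
               for pairs in strata.values()) / total_n
print(f"Adjusted effect size: {adjusted:.3f}")   # 0.420 for this toy data
```

With many strata or continuous covariates this quickly runs out of data, which is the sample-size objection above; regression adjustment handles that more gracefully.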

If correlation was your primary method of checking for confounding, there are two problems: a) confounding depends on the correlations with both the independent and dependent variables, but you only have data for the latter; b) the concept of significance can't be applied to confounding in a straightforward way, because significance is affected by sample size and variance while confounding isn't.

The main complication is the missing control group. I'm undecided on how to interpret this study, because I can't think of any reason to avoid controls, and I'm still trying to figure out the implications. If the RCT was done well, this makes the evidence a little stronger because it's a replication. But by itself, I still haven't thought of any way to draw useful conclusions from these data. There's some good information, but it's like two cross-sections, which are usually used only to generate hypotheses for new research.

Comment author: Unnamed 13 December 2015 11:18:24PM 1 point [-]

We looked into the possibility of including a nonrandomized comparison group. In order to get a large enough sample size, we'd have to be much less selective than your example (people who were accepted to a workshop but weren't able to attend for several months). One option that we considered was surveying Less Wrongers. Another option was to ask for volunteers from the people who had shown an interest in CFAR (e.g., people who have subscribed to the CFAR newsletter, people who have applied to workshops and been turned down). We decided not to use either of those comparison groups in this study, but we might use them in future research.

Would you have much more confidence in these results if we had included one of those groups as a comparison, and found that they showed little or no change on these variables?

(RE terminology: studies with this design are often just called "longitudinal." Hopefully the methodology section clears up any ambiguity, and the opening of the post also points readers' thoughts in the right direction.)

Comment author: AstraSequi 14 December 2015 02:13:28AM *  1 point [-]

People with an interest in CFAR would probably work. It would account for possibilities like the treatment population being drawn from people interested in self-improvement, since such a comparison group could pursue self-improvement in other places.

I can't say how much confidence I'd have without seeing the data. The evidence for whether it's a good control mainly comes from checking the differences between the groups at baseline. A common pitfall is to judge this by whether the controls changed: even if the treatment group changes significantly and the control group doesn't, that doesn't mean the difference between treatment and control is significant.

Also, to clarify, the comparison at baseline isn’t limited to the outcome variables. It should include all the data on potential confounders, including things like age and gender. This is all presented in Table 1 in most studies of cause and effect in populations. A few differences don't invalidate the study, but they should be accounted for in the analysis.

RE terminology: Agreed it works as a shorthand and the methodology has enough detail to tell us what was done. It just seems unusual to use it as a complete formal description.

Another question: could you explain more of what you did about potential confounders? Using age as an example, you only wrote about testing for significant correlations. This doesn't rule out age as a confounder, so did you do anything else that you didn't include?

Comment author: AstraSequi 12 December 2015 09:28:14PM *  7 points [-]

The primary weakness of longitudinal studies, compared with studies that include a control group

Longitudinal studies can and should include control groups. The difference with RCTs is that the control group is not randomized. Instead, you select from a population that is as similar as possible to the treatment group; an example is a group of people who were interested but couldn't attend because of scheduling conflicts. There is also the option of a placebo substitute, like sending them generic self-help tips.

ETA: "Longitudinal" is also ambiguous here. It means that data were collected over time, and could mean one of several study types (RCTs are also longitudinal, by some definitions). I think you want to call this a cohort study, except without controls this is more like two different cross-sectional studies from the same population.

Comment author: VoiceOfRa 30 November 2015 04:00:32AM *  2 points [-]

Probably not. Consider why there was an increasing amount of dissatisfaction among the people. After all, the Tsars had always been brutal; it was only when the Tsar was less brutal that dissatisfaction seemed to manifest.

Comment author: AstraSequi 30 November 2015 07:01:53AM *  1 point [-]

The main problem with that argument is that it assumes dissatisfaction is determined by the amount of repression. It's a factor, but there are others, like food, wars, and technical innovations.

This kind of question needs careful analysis and can't be answered that easily. You could plot a measure of repression against a measure of dissatisfaction (assume the measurements are accurate) and show that they corresponded perfectly from regime to regime, and even if you ignore confounders, it still wouldn't show causality, because you still wouldn't know which one came first.

Comment author: VoiceOfRa 28 November 2015 05:41:24AM 1 point [-]

But anger at perceived injustice is a pretty universal human emotion (based on my general observations).

My reading of history is that this isn't quite correct. It is the most restrained aggressor/tyrant who winds up getting targeted. To use an example I'm familiar with: most of the Russian Tsars were rather despotic; however, two did make major liberal reforms. Alexander II freed the serfs, and Nicholas II made strides towards modernizing the country, including introducing an elected parliament. Not-so-coincidentally, they were also the only tsars to be assassinated by revolutionaries.

Comment author: AstraSequi 30 November 2015 03:07:53AM 1 point [-]

Causality could go the other way here - the reforms might have been (ultimately ineffective) attempts to address dissatisfaction among the people.

Comment author: PipFoweraker 26 November 2015 10:02:43PM 3 points [-]

One handy real-world use for the phrases 'skill' and 'intuition' is in shortcutting communication, either out of reticence or convenience. For example, if I ask a professional poker player why they made a particular play, and they don't want to get into a meaningful discussion about the sorts of math they work through while playing the game (plus attendant behavioural observations, etc.), then essentially either of those two responses is a reasonably polite way of brushing me off.

I'm sure you can think of instances where, regardless of the polite good intentions of a questioner, it's not in at least one party's best interest to go into the minutiae of a process, either because you're talking across a vast knowledge gap or because there are other demands on your time.

I'm reminded of the joke variants that mechanically-inclined people tend to make: $1 for hitting your TV with a hammer, $50 for knowing where to hit it. Complex knowledge is valuable!

Another thing to consider (and something I'm guilty of) is using skill/intuition references to keep people from getting sidetracked at an early stage of their learning process. I'm sure an analogy could be made with programming, as in the responses above. When mastery of a complex field comes through a progression of skills {A, B, C [...] Z}, and you're trying to guide someone from B to C and they spot the shininess of J or K off on the intellectual horizon, as a teacher your pedagogy might lean towards gently nudging them back to the fundamentals at C.

Comment author: AstraSequi 28 November 2015 02:41:56AM *  0 points [-]

I agree with your main point, and I sometimes use the phrases in the same way. But what do you say when they ask you for details anyway? I mostly interact with non-rationalists, and my experience is that after people make a claim about skill or intuition, they're usually unable to explain further (or unwilling to the point of faking ignorance). If I'm talking to someone I trust to be honest with me and I keep trying to pin down an answer, it seems to eventually reduce to the claim that an explanation is impossible. A few people have said exactly that, but a claim like "you'll just know once you have more experience" is more common.

In a situation like this, what approach would get you to give more detail? I'd be happy with "you need to understand skills D through I before I answer you," but I'm rarely able to get that.

Comment author: gjm 25 November 2015 02:03:18PM 6 points [-]

I think this is not quite right, and it's not-quite-right in an important way. It really isn't true in any sense that "it's more likely that you'll alternate between heads and tails". This is a Simpson's-paradox-y thing where "the average of the averages doesn't equal the average".

Suppose you flip a coin four times, and you do this 16 times, and happen to get each possible outcome once: TTTT TTTH TTHT TTHH THTT THTH THHT THHH HTTT HTTH HTHT HTHH HHTT HHTH HHHT HHHH.

  • Question 1: in this whole sequence of events, what fraction of the time was the flip after a head another head? Answer: there were 24 flips after heads, and of these 12 were heads. So: exactly half the time, as it should be. (Clarification: we don't count the first flip of a group of 4 as "after a head" even if the previous group ended with a head.)
  • Question 2: if you answer that same question for each group of four, and ignore cases where the answer is indeterminate because it involves dividing by zero, what's the average of the results? Answer: it goes 0/0 0/0 0/1 1/1 0/1 0/1 1/2 2/2 0/1 0/1 0/2 1/2 1/2 1/2 2/3 3/3. We have to ignore the first two. The average of the rest is 17/42, or just over 0.4.

What's going on here isn't any kind of tendency for heads and tails to alternate. It's that an individual head or tail "counts for more" when the denominator is smaller, i.e., when there are fewer heads in the sample.
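(The two answers above can be checked by brute force; this sketch enumerates all 16 equally likely length-4 sequences of a fair coin.)

```python
# Verify both questions by enumerating every length-4 H/T sequence.
from itertools import product
from fractions import Fraction

sequences = list(product("HT", repeat=4))  # all 16 equally likely outcomes

# Question 1: pool every flip-after-a-head across all sequences.
after_heads = [seq[i + 1] for seq in sequences
               for i in range(3) if seq[i] == "H"]
pooled = Fraction(after_heads.count("H"), len(after_heads))
print("Pooled proportion:", pooled)                  # 1/2

# Question 2: compute the proportion within each sequence, then average,
# skipping sequences where no flip follows a head (the 0/0 cases).
per_seq = []
for seq in sequences:
    follows = [seq[i + 1] for i in range(3) if seq[i] == "H"]
    if follows:
        per_seq.append(Fraction(follows.count("H"), len(follows)))
avg = sum(per_seq) / len(per_seq)
print("Average of per-sequence proportions:", avg)   # 17/42
```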

Comment author: AstraSequi 26 November 2015 01:53:56AM *  0 points [-]

My intuition is from the six points in Kahan's post. If the next flip is heads, then the flip after is more likely to be tails, relative to if the next flip is tails. If we have an equal number of heads and tails left, P(HT) > P(HH) for the next two flips. After the first heads, the probability for the next two might not give P(TH) > P(TT), but relative to independence it will be biased in that direction because the first T gets used up.
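(One way to sketch the "gets used up" intuition above: treat the remaining flips of a fixed finite sequence as draws without replacement from a pool with an equal number of heads and tails, here hypothetically 2 of each.)

```python
# Draws without replacement from an equal pool of Hs and Ts:
# the first two draws are more likely to be H,T than H,H.
from itertools import permutations
from fractions import Fraction

pool = ["H", "H", "T", "T"]  # hypothetical: 2 heads and 2 tails remaining

orders = list(permutations(pool))  # 24 equally likely orderings
p_ht = Fraction(sum(o[:2] == ("H", "T") for o in orders), len(orders))
p_hh = Fraction(sum(o[:2] == ("H", "H") for o in orders), len(orders))
print("P(HT) =", p_ht, " P(HH) =", p_hh)   # 1/3 vs 1/6
```

This only shows the without-replacement picture itself, not whether it is the right model for the conditional-proportion statistic; that is what the discussion above is about.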

Is there a mistake? I haven't done any probability in a while.

Comment author: AstraSequi 25 November 2015 07:27:01AM *  1 point [-]

I treat conversations like this as a communication problem, since the information should be communicable if it exists in Douglas Crockford’s mind. I try to find what the intuition is based on, which helps i) send me in the right direction and ii) avoid double-counting the evidence if I find it independently.

To me, the labels “skill” or “intuition” mean that something is not understood well enough to be communicated objectively. A complete understanding would include the ability to describe it as one or more clear-cut techniques or algorithms.

Comment author: AstraSequi 25 November 2015 02:24:34AM *  2 points [-]

I just found out about the “hot hand fallacy fallacy” (Dan Kahan, Andrew Gelman, the Miller & Sanjurjo paper) as a type of bias that more numerate people are likely more susceptible to, and for whom it's highly counterintuitive. It's described as a specific failure mode of the intuition used to get rid of the gambler's fallacy.

I understand the correct statement like this. Suppose we’re flipping a fair coin.

• If you're predicting future flips of the coin, the next flip is unaffected by the results of your previous flips, because the flips are independent. So far, so good.

• However, if you're predicting the next flip in a finite series of flips that has already occurred, it's actually more likely that you'll alternate between heads and tails.

The discussion is mostly about whether a streak of a given length will end or continue; the statement above is for a streak length of 1 and a probability of 0.5. Another example is:

...we can offer the following lottery at a $5 ticket price: a fair coin will be flipped 4 times. if the relative frequency of heads on flips that immediately follow a heads is greater than 0.5 then the ticket pays $10; if the relative frequency is less than 0.5 then the ticket pays $0; if the relative frequency is exactly equal to 0.5, or if no flip is immediately preceded by a heads, then a new sequence of 4 flips is generated. While, intuitively, it seems like the expected payout of this ticket is $0, it is actually $-0.71 (see Table 1). Curiously, this betting game may be more attractive to someone who believes in the independence of coin flips, rather than someone who holds the Gambler’s fallacy.
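(A brute-force sketch of the quoted lottery under a literal reading of its stated rules, conditioning on the lottery eventually resolving. The figure it gives differs from the quoted $-0.71, which comes from Miller & Sanjurjo's Table 1 and may reflect conventions this sketch doesn't reproduce; the point it illustrates is only that the expected payout comes out below zero.)

```python
# Enumerate the 16 equally likely 4-flip sequences and apply the quoted
# rules literally: >0.5 wins $10, <0.5 wins $0, ties/undefined redraw.
from itertools import product
from fractions import Fraction

wins = losses = 0
for seq in product("HT", repeat=4):
    follows = [seq[i + 1] for i in range(3) if seq[i] == "H"]
    if not follows:
        continue  # no flip immediately follows a head: redraw
    freq = Fraction(follows.count("H"), len(follows))
    if freq > Fraction(1, 2):
        wins += 1
    elif freq < Fraction(1, 2):
        losses += 1
    # freq == 1/2: redraw

# Condition on the lottery eventually resolving (redraws repeat until it does)
p_win = Fraction(wins, wins + losses)
ev = 10 * p_win - 5  # $10 payout minus the $5 ticket price
print("P(win | resolved) =", p_win, " expected net payout = $", float(ev))
# -$1 under this literal reading
```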
