Lumifer comments on Open thread, Dec. 21 - Dec. 27, 2015 - Less Wrong Discussion

2 Post author: MrMind 21 December 2015 07:56AM

Comment author: Lumifer 23 December 2015 06:41:18PM * 2 points

A side question, prompted by an amusing factoid in the Hernan paper: "...we restricted the population to women who had reported plausible energy intakes (2510–14,640 kJ/d)".

In the statistical analysis in this paper, and also as a general practice in medical publications based on questionnaire data, are there adjustments for uncertainty in the questionnaire responses?

When you have a data point that says, for example, that person #12345 reports her caloric intake as 4,000 calories/day, do you take it as a hard precise number, or do you take it as an imprecise estimate with its own error which propagates into the model uncertainty, etc.?

Comment author: IlyaShpitser 23 December 2015 08:02:45PM * 1 point

Keyword is "measurement error." People think hard about this. Anders_H knows this paper in a lot more detail than I do, but I expect these particular authors to be careful.

This issue is also related to "missing data." What you see might be different from the underlying truth in systematic ways, e.g. you get systematic bias in your data, and you need to deal with that. This is also related to that causal inference stuff I keep going on about.

Comment author: Lumifer 23 December 2015 08:19:12PM * 0 points

> Keyword is "measurement error." People think hard about this.

People like engineers and physicists think a lot about this. I am not sure that medical researchers do. The usual (easy) way is to throw out unreasonable-looking responses during data cleaning and then take what remains as rock-solid. Accepting that your independent variables are uncertain leads to a lot of inconvenient problems (starting with OLS regression no longer being theoretically correct, since it assumes the independent variables are measured without error).
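To make the OLS point concrete, here is a minimal simulation sketch of what is usually called attenuation bias. Everything in it is an illustrative assumption, not something taken from the Hernan paper: the true slope of 2.0, the noise levels, the sample size, and the names `ols_slope` and `simulate` are all made up for the example. It fits the same outcome once against the true predictor and once against a noisy "self-reported" version of it.

```python
import random


def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, true_slope=2.0, sd_x=1.0, sd_report_error=1.0, seed=0):
    """Return (slope using true x, slope using noisily reported x).

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # The quantity we would like to have measured.
    true_x = [rng.gauss(0.0, sd_x) for _ in range(n)]
    # The outcome depends on the true value, plus some outcome noise.
    y = [true_slope * x + rng.gauss(0.0, 0.5) for x in true_x]
    # What the questionnaire records: true value plus independent reporting error.
    reported_x = [x + rng.gauss(0.0, sd_report_error) for x in true_x]
    return ols_slope(true_x, y), ols_slope(reported_x, y)


slope_true_x, slope_reported_x = simulate()

# Classical errors-in-variables theory predicts the naive slope shrinks by
# var(x) / (var(x) + var(error)), which is 0.5 for the values assumed here.
expected_shrinkage = 1.0 ** 2 / (1.0 ** 2 + 1.0 ** 2)

print("slope using true x:    ", round(slope_true_x, 3))
print("slope using reported x:", round(slope_reported_x, 3))
print("theoretical shrinkage: ", expected_shrinkage)
```

Under these assumptions the fit on the true predictor should land near 2.0, while the fit on the reported predictor should land near 1.0, i.e. biased toward zero rather than merely noisier. This only covers the simplest case (independent, zero-mean reporting error); systematic misreporting behaves differently.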

> What you see might be different from the underlying truth in systematic ways, e.g. you get systematic bias in your data, and you need to deal with that.

Yes, that's another can of worms. In some areas (e.g. self-reported food intake) the problem is so blatant and overwhelming that you have to deal with it, but if it looks minor not many people want to bother.

Comment author: IlyaShpitser 23 December 2015 08:24:40PM 1 point

Clinicians do not; "methodology people" (who often partner up with "domain experts" to do data analysis) absolutely do.