
Romashka comments on Stupid Questions August 2015 - Less Wrong Discussion

7 Post author: Grothor 01 August 2015 11:08PM


Comment author: gwern 12 August 2015 12:57:15AM 3 points

You can estimate the bias of each measurer much more efficiently if you have them measure the same sample, yes, analogous to crossover: now the differences are due less to the wide diversity of the sampled population and more to the particular measurer.

(To put it a little more mathily: when each measurer measures different samples, the measurements will be spread very widely because their variance is Var(measurer-bias) + Var(population); but if we have the measurers measure the same sample, then Var(population) drops out and there's just Var(measurer-bias) left. If I measure a sample and get 2.9 and you measure it as well and get 3.1, then probably the sample is really ~3.0 and my bias is -0.1 and your bias is +0.1. If I measure one sample and get 2.9 and you measure a different sample and get 3.1, then my bias and your bias are... ???)
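A quick way to see this is to simulate it. The following is only a toy sketch, not anything from the original discussion: the function name `simulate`, the biases of -0.1 and +0.1, the population SD of 1.0, and the small per-measurement noise are all made-up assumptions for illustration. It compares how widely the difference between two measurers' readings is spread when they share a sample versus when each gets their own.

```python
import random
import statistics


def simulate(n_pairs=2000, pop_mean=3.0, pop_sd=1.0,
             bias_a=-0.1, bias_b=0.1, noise_sd=0.05, seed=0):
    """Toy comparison of same-sample vs different-sample designs.

    All parameter values are illustrative assumptions, not real data.
    Returns the mean and SD of (measurer B - measurer A) under each design.
    """
    rng = random.Random(seed)
    same_diffs = []       # both measurers measure the same sample
    different_diffs = []  # each measurer gets their own sample

    for _ in range(n_pairs):
        sample_1 = rng.gauss(pop_mean, pop_sd)
        sample_2 = rng.gauss(pop_mean, pop_sd)

        # Same sample: the true sample value cancels out of the difference.
        a = sample_1 + bias_a + rng.gauss(0, noise_sd)
        b = sample_1 + bias_b + rng.gauss(0, noise_sd)
        same_diffs.append(b - a)

        # Different samples: population variability stays in the difference.
        a = sample_1 + bias_a + rng.gauss(0, noise_sd)
        b = sample_2 + bias_b + rng.gauss(0, noise_sd)
        different_diffs.append(b - a)

    return {
        "same_mean": statistics.mean(same_diffs),
        "same_sd": statistics.stdev(same_diffs),
        "different_mean": statistics.mean(different_diffs),
        "different_sd": statistics.stdev(different_diffs),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions both designs are aiming at the same bias gap of 0.2, but the same-sample differences cluster tightly around it, while the different-sample differences are swamped by population spread, so you would need far more measurements to pin the gap down.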

The classic example for MLMs is that you have test scores from n classrooms and you want to figure out the teachers' effects. It's hard to tell because the classrooms' average scores will differ a lot on their own. This is analogous to your original description: each measurer gets their own batch of samples. But what if you had a crossed design, with test scores from one classroom after it's been taught by each teacher? Then much of the difference in average scores will be due to the particular effect of each teacher, and that will be much easier to estimate.

So if we look at the difference in differences between x1 and x2, and it is greater for some middle latent variables (ways of staining) than for others, can we use it as a measure of 'the overall variability of the measuring method'? Say, if we have ten measurers and four measuring methods...

I guess. From a factor analysis perspective, you just want to pick the one with the highest loading on X, I think.

Comment author: Romashka 17 August 2015 12:12:21PM 1 point

Huh. Your answer was even more useful for me than I expected. My 'secret agenda' is to put forth another mountant medium, which might have advantages over the one in use, but I will have to show that they do not differ in preparation quality. I think I am going to do a 2-by-2 crossover.

So - thank you! Analogies for the win!