gwern comments on 2014 Survey Results - Less Wrong

87 Post author: Yvain 05 January 2015 07:36PM


Comments (279)


Comment author: gwern 04 January 2015 07:42:17PM * 9 points

They're both pretty much the same test, right?

I thought they were partially not the same because they added the writing subtest.

If this is true, I would expect there to be a correlation of around .844 between one test score and a later test score under the same grading system.

The reliability of recent SATs seems to generally be ~0.9 according to one random PDF I found (and has long been high). If I am understanding the formulas in this page correctly, then in this application, the expected Pearson's r between the 2 scores simplifies to the reliability itself*, and that reliability of 0.9 is pretty similar to the LW old/new correlation r of 0.84.

So this may be simply what one would expect from people taking the SAT twice, without having to invoke the lowered correlation caused by the additional sections and any other tweaks they've made.

* Specifically, I'm looking at Artifactual Influences, #3: reliability, where I think we can reuse the example: for test-retest, assume the LWer doesn't get dumber or smarter, so the true correlation would be 1; the reliability of the old SAT should be 0.9 and the reliability of the new one should be 0.9 too, so you get '1 * sqrt(0.9 * 0.9)' = 'sqrt(0.9^2)' = '0.9'. So, the expected correlation of 2 SAT tests simplifies to the original reliability of 0.9.
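
For anyone who wants to check the arithmetic, here is a minimal sketch of the attenuation formula used in the footnote (observed r = true r * sqrt(reliability of test 1 * reliability of test 2)). The function names are made up for illustration; the 0.9 reliabilities and the 0.84 correlation are the figures from this comment, and the true correlation of 1 is the same assumption as above. The second function just runs the formula backwards to show what true correlation an observed r would imply under those assumed reliabilities.

```python
import math


def expected_observed_r(true_r, reliability_x, reliability_y):
    """Attenuation formula: observed r = true r * sqrt(rel_x * rel_y)."""
    for rel in (reliability_x, reliability_y):
        if not 0.0 < rel <= 1.0:
            raise ValueError("reliability must be in (0, 1]")
    return true_r * math.sqrt(reliability_x * reliability_y)


def implied_true_r(observed_r, reliability_x, reliability_y):
    """Same formula solved for the true correlation (disattenuation)."""
    for rel in (reliability_x, reliability_y):
        if not 0.0 < rel <= 1.0:
            raise ValueError("reliability must be in (0, 1]")
    return observed_r / math.sqrt(reliability_x * reliability_y)


# Assumption from the footnote: true correlation 1, both reliabilities 0.9.
expected = expected_observed_r(1.0, 0.9, 0.9)

# Assumption: the same 0.9 reliabilities apply to the LW old/new r of 0.84.
implied = implied_true_r(0.84, 0.9, 0.9)

print(round(expected, 3))  # 0.9
print(round(implied, 3))   # 0.933
```

Under these assumed reliabilities, the observed 0.84 would correspond to a true old/new correlation of roughly 0.93; whether that gap from 1 reflects the test changes or just noise in the reliability estimate is not something this sketch can settle.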