jsteinhardt comments on Against NHST - Less Wrong Discussion

57 Post author: gwern 21 December 2012 04:45AM

Comments (62)

Comment author: jsteinhardt 21 December 2012 05:06:28PM 1 point

Presumably not actually 95%, but, as gwern said, a threshold based on the cost of false positives.

Comment author: gwern 21 December 2012 05:34:11PM * 4 points

Yes, in this case you could keep using p-values (if you really wanted to...), but set the threshold with reference to the value of, say, each customer. (This is what I meant by setting the threshold with respect to decision theory.) If the goal is to use the result on a site making millions of dollars*, 0.01 may be too loose a threshold, but if he's just messing with his personal site to help readers, a threshold like 0.10 may be perfectly acceptable.
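
To make the idea of tying the threshold to the stakes concrete, here is a rough sketch of my own (not something from the original post). It scores the rule "ship the change if the one-sided p-value is below alpha" by its expected payoff and picks the best alpha from a small grid. Everything in it is a hypothetical input: `prior_real` (the chance the change really helps), `benefit` and `cost` (the payoff of a true positive and the loss from a false positive, in whatever units you like), and `z_effect` (the true effect in standard-error units). The candidate grid is arbitrary too.

```python
from statistics import NormalDist


def expected_gain(alpha, prior_real, benefit, cost, z_effect):
    """Expected payoff of the rule 'ship if the one-sided p-value < alpha'.

    All inputs are illustrative assumptions, not values from the post.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha)         # critical value for a one-sided z-test
    power = 1 - norm.cdf(z_crit - z_effect)  # chance of detecting a real effect
    true_positive_gain = prior_real * power * benefit
    false_positive_loss = (1 - prior_real) * alpha * cost
    return true_positive_gain - false_positive_loss


def best_alpha(prior_real, benefit, cost, z_effect,
               candidates=(0.001, 0.005, 0.01, 0.05, 0.10, 0.20)):
    """Pick the candidate threshold with the highest expected payoff."""
    return max(
        candidates,
        key=lambda a: expected_gain(a, prior_real, benefit, cost, z_effect),
    )


# A costly false positive pushes the threshold down; a cheap one loosens it.
strict = best_alpha(prior_real=0.5, benefit=1.0, cost=100.0, z_effect=2.0)
loose = best_alpha(prior_real=0.5, benefit=1.0, cost=1.0, z_effect=2.0)
print(strict, loose)
```

The point is only qualitative: as the cost of a false positive rises relative to the benefit, the payoff-maximizing threshold shrinks, so no single number like 0.05 is right for every site.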

* If the results were that important, I think there'd be better approaches than a one-off A/B test. Adaptive multi-armed bandit algorithms sound really cool from what I've read of them.
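
For a sense of what an adaptive bandit looks like next to a fixed A/B split, here is a minimal sketch using Thompson sampling with Beta posteriors. That is one common bandit strategy and my choice for illustration, not necessarily the one the comment has in mind. The conversion rates in `true_rates` are made up for the simulation; on a real site they would be unknown.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Simulate a Bernoulli bandit and return how often each arm was shown.

    true_rates are hypothetical conversion rates used only to simulate visitors.
    """
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(rounds):
        # Draw a plausible rate for each arm from its Beta(s + 1, f + 1) posterior.
        samples = [
            rng.betavariate(s + 1, f + 1)
            for s, f in zip(successes, failures)
        ]
        arm = samples.index(max(samples))  # show the arm that looks best this round
        if rng.random() < true_rates[arm]:  # simulated visitor converts or not
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


pulls = thompson_sampling(true_rates=[0.05, 0.15], rounds=5000)
print(pulls)  # the second (better) arm should receive most of the traffic
```

Unlike a fixed 50/50 split, traffic drifts toward the better variant as evidence accumulates, so fewer visitors are spent on the worse one while the test is still running.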