cleonid comments on Should you write longer comments? (Statistical analysis of the relationship between comment length and ratings) - Less Wrong Discussion

11 Post author: cleonid 20 July 2015 02:09PM

Comments (47)

Comment author: cleonid 21 July 2015 11:43:00AM 0 points

True, but it is virtually impossible to see a meaningful pattern when you have thousands of data points on the graph and R² < 0.2.

Comment author: Douglas_Knight 22 July 2015 05:28:45AM 0 points

I disagree. I find point clouds useful, as long as they are not pure black. Kernel density plots are better, though.

But Lumifer gave you a concrete suggestion: plot a regression curve, not a bunch of buckets. Bucketing and drawing lines between points are both kinds of smoothing, so you should use a good smoother instead. Say, loess. Or just use ggplot and trust its defaults (which will not be loess with this many points).
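To make the bucketing-versus-smoothing contrast concrete, here is a minimal sketch in plain Python rather than R/ggplot. It is not the analysis from the post: the data are synthetic, and the names `bucket_means`, `local_linear_smooth`, `frac`, and `demo` are made up for illustration. The smoother is a simplified loess-style local linear fit with tricube weights, without the robustness iterations of a full loess.

```python
import math
import random


def tricube(u):
    """Tricube kernel weight for a scaled distance u; zero at and beyond u = 1."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def bucket_means(xs, ys, width):
    """Crude smoothing: average y within fixed-width x buckets.

    Returns a sorted list of (bucket_center, mean_y) pairs.
    """
    buckets = {}
    for x, y in zip(xs, ys):
        buckets.setdefault(int(x // width), []).append(y)
    return [((b + 0.5) * width, sum(v) / len(v)) for b, v in sorted(buckets.items())]


def local_linear_smooth(xs, ys, grid, frac=0.3):
    """Loess-style smoother (simplified, an assumption of this sketch).

    At each grid point x0, fit a weighted straight line to the nearest
    `frac` share of the data and report the fitted value at x0.
    """
    n = len(xs)
    k = max(3, int(math.ceil(frac * n)))
    k = min(k, n)
    fitted = []
    for x0 in grid:
        dists = sorted(abs(x - x0) for x in xs)
        # Bandwidth: distance to the k-th nearest point, nudged up so that
        # point still gets a (tiny) positive weight.
        h = dists[k - 1] * (1.0 + 1e-9) + 1e-12
        sw = swx = swy = swxx = swxy = 0.0
        for x, y in zip(xs, ys):
            w = tricube(abs(x - x0) / h)
            if w == 0.0:
                continue
            dx = x - x0
            sw += w
            swx += w * dx
            swy += w * y
            swxx += w * dx * dx
            swxy += w * dx * y
        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            # Degenerate neighbourhood: fall back to the weighted mean.
            fitted.append(swy / sw)
        else:
            # Intercept of the weighted least-squares line, i.e. its value at x0.
            fitted.append((swxx * swy - swx * swxy) / denom)
    return fitted


def demo(seed=0):
    """Synthetic, noisy data with a weak trend (NOT the data from the post)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 1000.0) for _ in range(2000)]
    ys = [1.0 + 0.002 * x + rng.gauss(0.0, 3.0) for x in xs]
    grid = [50.0 * i for i in range(21)]
    smooth = local_linear_smooth(xs, ys, grid, frac=0.3)
    buckets = bucket_means(xs, ys, width=100.0)
    for g, s in zip(grid, smooth):
        print(f"x={g:7.1f}  smoothed y={s:6.3f}")
    for center, mean in buckets:
        print(f"bucket center={center:7.1f}  mean y={mean:6.3f}")


if __name__ == "__main__":
    demo()
```

Both functions are smoothers; the difference is that the bucket means jump at arbitrary bucket edges, while the local fit moves continuously along x. In R the equivalent would be a smoothed layer in ggplot rather than hand-rolled code like this.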

Comment author: Lumifer 21 July 2015 04:30:10PM 0 points

Well, one question is whether, if it's "impossible to see a meaningful pattern", you should melt-and-recast the data so that the pattern appears X-/

Another observation is that you are constrained by Excel. R can deal with such problems easily -- do you have the raw dataset available somewhere?