gwern comments on 2012 Survey Results - Less Wrong

80 Post author: Yvain 07 December 2012 09:04PM

Comments (640)

Comment author: gwern 30 November 2012 01:30:52AM * 3 points

> Well, possibly. The t-distribution is used for "estimating the mean of a normally distributed population," (yay wikipedia) and you're trying to estimate the mean of a slanted-uniformly-distributed-with-a-spike-at-the-beginning population.

Yeah, it'd have to be some combination of a constant-rate Poisson arrival process (since we don't seem to be growing a lot, per Yvain) and an exponential distribution (constant mortality of users). If we graph histograms, either coarse or fine-grained, it looks like that but also with weird huge spikes besides the original OB->LW spike:

R> hist(as.numeric(as.character(lw$TimeinCommunity)))

R> hist(as.numeric(as.character(lw$TimeinCommunity)), breaks=50)
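
For comparison, here is a rough simulation of what that combination alone would predict. It is only a sketch: the window length, the number of joiners, and the dropout rate are all made-up values, not fitted to the survey, and it ignores the OB->LW spike entirely.

```r
# Sketch: constant-rate arrivals plus constant-hazard dropout.
# All parameter values below are hypothetical, not estimated from the survey.
set.seed(1)
window_months <- 72      # assumed length of the site's history
n_joined      <- 20000   # assumed number of people who ever joined
dropout_rate  <- 1 / 24  # assumed monthly hazard of leaving

# Given the total count, the arrival times of a constant-rate Poisson
# process are uniformly distributed over the window.
joined_at <- runif(n_joined, 0, window_months)
lifetime  <- rexp(n_joined, rate = dropout_rate)

# Only people who are still around at survey time can answer the survey.
still_here <- joined_at + lifetime > window_months
tenure     <- (window_months - joined_at)[still_here]

hist(tenure, breaks = 50, main = "Simulated time in community (months)")
```

Under these assumptions the histogram should decay smoothly from short tenures to long ones, so any sharp spikes in the real data need some other explanation.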

But on the plus side, if we look at the genders as a box plot, we discover why the mean is lower for women but the difference isn't significant:

R> lwm <- subset(lw, as.character(Gender)=="M (cisgender)")
R> lwf <- subset(lw, as.character(Gender)=="F (cisgender)")
R> boxplot(as.numeric(as.character(lwm$TimeinCommunity)), as.numeric(as.character(lwf$TimeinCommunity)), names=c("M", "F"))

There are, after all, many fewer women.
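
To see why the small group matters, a back-of-the-envelope sketch of the standard error of a difference in means helps. The group sizes and standard deviation below are invented for illustration and are not the survey's numbers; `se_diff` is just a helper name for this example.

```r
# Standard error of a difference in means for two independent groups.
se_diff <- function(sd1, n1, sd2, n2) sqrt(sd1^2 / n1 + sd2^2 / n2)

# Hypothetical numbers, NOT taken from the survey data.
sd_months <- 20

# Same total of 1000 respondents, split evenly versus very unevenly.
se_balanced   <- se_diff(sd_months, 500, sd_months, 500)
se_unbalanced <- se_diff(sd_months, 950, sd_months, 50)

c(balanced = se_balanced, unbalanced = se_unbalanced)
```

With this uneven split the standard error more than doubles, because the 1/n term for the small group dominates. On the real data, a Welch test along the lines of `t.test(as.numeric(as.character(lwm$TimeinCommunity)), as.numeric(as.character(lwf$TimeinCommunity)))` would report the corresponding confidence interval.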

Comment author: VincentYu 02 December 2012 11:17:10AM 2 points

> but also with weird huge spikes besides the original OB->LW spike

The spikes are just due to people estimating in half-years: 12, 18, 24, 30, 36.
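
One quick way to check this kind of rounding is to measure what share of answers land exactly on a multiple of six months. This is a minimal sketch: `heaping_share` is a helper name made up for this example, and the sample vector is invented rather than drawn from the survey.

```r
# Share of answers that are exact multiples of `base` months.
heaping_share <- function(months, base = 6) {
  months <- months[!is.na(months)]  # drop missing answers
  mean(months %% base == 0)
}

# Hypothetical answers, NOT survey data.
example_months <- c(12, 18, 7, 24, 5, 36, 30, 13, 24, 12)
heaping_share(example_months)
```

On the real data this would be something like `heaping_share(as.numeric(as.character(lw$TimeinCommunity)))`. If people reported whole months without rounding, roughly one answer in six should be a multiple of six, so a much larger share would support the half-year explanation.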