shminux comments on Best of Rationality Quotes, 2012 Edition - Less Wrong

31 Post author: DanielVarga 26 January 2013 03:03AM

Comment author: shminux 25 January 2013 06:57:50PM 0 points

Neat. Does the quote karma follow something like the exponential distribution?

Comment author: gwern 25 January 2013 07:51:18PM * 2 points

I tried some stuff in R. While it looks exponential, none of the code or fitting functions gave good results on the highest-karma quotes - I guess because all the other thousand quotes look so linear. Of course, I could have just messed up in any of the following:

Open http://people.mokk.bme.hu/~daniel/rationality_quotes_2012/rq.html in Firefox, select the whole page with C-a, then:

$ xclip -o | grep Permalink | grep points | cut -f 1 -d' ' | tr '\n' ','
$ R
R> karma <- sort(c(105,73,66,64,63,62,60,60,58,58,57,57,57,57,57,56,56,55,55,54,53,51,50,50,49,49,
48,48,48,47,47,46,46,45,45,44,44,44,43,43,43,43,43,43,43,43,43,42,42,41,41,41,
41,41,40,40,40,40,39,39,38,38,38,38,38,38,38,38,37,37,37,37,37,37,37,37,36,36,
36,36,36,36,36,35,35,35,35,35,34,34,34,34,34,34,34,34,34,34,34,34,34,34,33,33,
33,33,33,33,33,32,32,32,32,32,32,32,32,32,32,32,32,32,31,31,31,31,31,31,31,31,
31,31,31,31,31,30,30,30,30,30,30,30,30,30,30,30,29,29,29,29,29,29,29,29,29,29,
29,29,29,29,29,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,27,27,
27,27,27,27,27,27,27,27,27,27,27,27,27,27,27,27,27,27,27,27,26,26,26,26,26,26,
26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,25,25,25,25,
25,25,25,25,25,25,25,25,25,25,25,25,25,25,24,24,24,24,24,24,24,24,24,24,24,24,
24,24,24,24,24,24,24,24,24,24,24,23,23,23,23,23,23,23,23,23,23,23,23,23,23,23,
23,23,23,23,23,23,23,23,23,23,23,23,23,23,23,23,23,23,23,23,23,22,22,22,22,22,
22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,
22,22,22,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,21,
21,21,21,21,21,21,21,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,
20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,19,19,19,19,19,19,19,
19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,
19,19,19,19,19,19,19,19,19,19,19,19,18,18,18,18,18,18,18,18,18,18,18,18,18,18,
18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,
18,18,18,18,18,18,18,18,18,18,18,18,18,18,18,17,17,17,17,17,17,17,17,17,17,17,
17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,
17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,
17,17,17,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,
16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,
16,16,16,16,16,16,16,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,
15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,
15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,14,14,14,14,14,
14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,
14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,
14,14,14,14,14,14,14,14,14,14,14,14,14,13,13,13,13,13,13,13,13,13,13,13,13,13,
13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,
13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,
13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,12,12,12,12,12,12,
12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,
12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,
12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,
12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,
11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,
11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,
11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,
11,11,11,11,11,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,
10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,
10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,
10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10))
R> summary(karma)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   10.0    12.0    17.0    19.5    23.0   105.0
R> n <- seq(length(karma))
R> temp <- data.frame(y = karma, x = n)
# first try, fitting a nonlinear model
R> plot(temp$x, temp$y)
R> mod <- nls(y ~ exp(a + b * x), data = temp, start = list(a = 0, b = 0))
R> lines(temp$x, predict(mod, list(x = temp$x))); mod
Nonlinear regression model
model: y ~ exp(a + b * x)
data: temp
     a      b
1.9094 0.0016
residual sum-of-squares: 17684
Number of iterations to convergence: 9
Achieved convergence tolerance: 8.9e-06

[Figure: Fitted exponential]

# second try, fitting a quadratic
R> lm(temp$y ~ temp$x + I(temp$x^2))
Call:
lm(formula = temp$y ~ temp$x + I(temp$x^2))
Coefficients:
(Intercept)       temp$x  I(temp$x^2)
   1.33e+01    -1.91e-02     3.96e-05
# third try, log transform
R> exp(fitted(lm(log(temp$y) ~ temp$x)))
....
  1121   1122   1123   1124   1125   1126   1127   1128   1129   1130   1131   1132   1133   1134
35.080 35.123 35.167 35.211 35.255 35.299 35.343 35.387 35.431 35.475 35.520 35.564 35.608 35.653
  1135   1136   1137   1138   1139   1140
35.697 35.742 35.786 35.831 35.876 35.920
# fourth and final try, log variation
R> cc <- coef(lm(log(temp$y) ~ temp$x)); cc
(Intercept)      temp$x
   2.160310    0.001246
R> with(temp, fitted(nls(y ~ exp(a + b*x), start = list(a = cc[1], b = cc[2]))))
...
[1106] 39.594 39.657 39.721 39.784 39.848 39.912 39.976 40.040 40.104 40.168 40.232 40.297 40.361
[1119] 40.426 40.491 40.556 40.620 40.686 40.751 40.816 40.881 40.947 41.012 41.078 41.144 41.210
[1132] 41.276 41.342 41.408 41.474 41.541 41.607 41.674 41.740 41.807
attr(,"label")
[1] "Fitted values"

Comment author: DanielVarga 25 January 2013 07:59:58PM * 0 points

It is roughly exponential in the range between 3 and 60 karma.

You can find the raw data here.

Edit: I didn't spot gwern's more careful analysis. I am still digesting it. gwern, you should use the above link; it contains the below-10 quotes, too.

Comment author: gwern 25 January 2013 08:49:54PM * 0 points

The extra data doesn't seem to make much difference:

R> karma <- read.table("http://people.mokk.bme.hu/~daniel/rationality_quotes_2012/scores")
R> karma <- sort(karma$V2)
R> summary(karma)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   -8.0     4.0     8.0    10.7    15.0   105.0
...
Nonlinear regression model
model: y ~ exp(a + b * x)
data: temp
       a        b
-0.01088  0.00134
residual sum-of-squares: 22772
Number of iterations to convergence: 7
Achieved convergence tolerance: 3.59e-06

[Figure: With <10 quotes too]

"It is roughly exponential in the range between 3 and 60 karma."

Eyeballing it, it looks like the previous fit crosses around 40.

R> karma <- karma[karma<40]
...
Nonlinear regression model
model: y ~ exp(a + b * x)
data: temp
       a        b
-0.01088  0.00134
residual sum-of-squares: 22772
Number of iterations to convergence: 7
Achieved convergence tolerance: 3.59e-06

The fit looks much better:

[Figure: Quote karma from -8 to 40]

Comment author: DanielVarga 25 January 2013 09:06:02PM 0 points

I am afraid I don't understand your methodology. What is a rank-versus-value function supposed to look like for an exponentially distributed sample?
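
To make the question concrete, here is a minimal sketch using simulated data (not the quote karma). For an exponentially distributed sample, the sorted values plotted against rank should roughly follow the exponential quantile function, -log(1 - p) / rate with p = rank / (n + 1), rather than a curve of the form exp(a + b * rank). The rate of 0.1, the sample size, and all variable names are illustrative assumptions.

```r
# Sketch with simulated data; rate and n are arbitrary assumptions.
set.seed(1)
n    <- 1140
rate <- 0.1
y    <- sort(rexp(n, rate = rate))   # ascending, like sort(karma)
rnk  <- seq_len(n)

# Expected position of the k-th smallest value: the quantile function
# evaluated at p = k / (n + 1). For the exponential this is -log(1 - p) / rate.
p        <- rnk / (n + 1)
theo     <- -log(1 - p)              # unit-rate exponential quantiles
expected <- qexp(p, rate = rate)     # equals theo / rate

# Regressing the sorted values on the unit-rate quantiles (no intercept)
# should give a slope near 1 / rate if the sample is exponential.
slope <- unname(coef(lm(y ~ 0 + theo)))

if (interactive()) {
  plot(rnk, y, xlab = "rank", ylab = "sorted value")
  lines(rnk, expected, col = "red")  # quantile curve, not exp(a + b * rank)
}
```

On this view, fitting exp(a + b * x) to value against rank tests a different curve shape from the one an exponential distribution implies.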

Comment author: gwern 25 January 2013 09:08:38PM 0 points

How else would you do it?

Comment author: DanielVarga 25 January 2013 10:29:57PM * 0 points

When I stated that the middle is roughly exponential, this was the graph that I was looking at:

d <- density(karma)

plot(log(d$y) ~ d$x)
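
Why this plot is informative can be sketched with simulated data: an exponential density is rate * exp(-rate * x), so its logarithm is a straight line in x with slope -rate. The sample below is simulated with an arbitrary rate of 0.1, and the window from 5 to 30 used for the line fit is an assumption chosen to stay away from the noisy tail and from the boundary at zero.

```r
# Sketch with simulated data; rate, sample size and the fitting window
# are illustrative assumptions, not values from the karma data.
set.seed(1)
rate <- 0.1
sim  <- rexp(5000, rate = rate)

d <- density(sim)                    # kernel density estimate, as in the comment

# Fit a straight line to the log-density over the interior of the range.
keep  <- d$x > 5 & d$x < 30
fit   <- lm(log(d$y[keep]) ~ d$x[keep])
slope <- unname(coef(fit)[2])        # should be close to -rate

if (interactive()) {
  plot(log(d$y) ~ d$x)
  abline(a = log(rate), b = -rate, col = "red")  # theoretical log-density
}
```

A roughly straight stretch in the log-density plot is therefore what "roughly exponential in the middle" would look like.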

I don't do this for a living, so I am not sure at all, but if I really, really had to make this formal, I would probably use maximum likelihood to fit an exponential distribution on the relevant interval, and then run a Kolmogorov-Smirnov test. It's what shminux said, except there is probably no closed formula because the cutoffs complicate things. And at least one of the cutoffs is really necessary, because below 3 it is obviously not exponential.
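
One way this procedure might look in R is sketched below, on simulated data rather than the karma scores. The cutoffs 3 and 60 are taken from the range mentioned earlier in the thread; the function names loglik and ptrunc, the true rate of 0.1, and the search interval are hypothetical choices. Two caveats apply: the Kolmogorov-Smirnov p-value is optimistic when the rate is estimated from the same sample, and real karma values are integers with ties, which the test does not handle well.

```r
# Sketch with simulated data; names and constants are assumptions.
set.seed(1)
lo  <- 3
hi  <- 60
raw <- rexp(5000, rate = 0.1)
x   <- raw[raw >= lo & raw <= hi]    # keep only the "relevant interval"

# Log-likelihood of an exponential truncated to [lo, hi]. Its density is
# rate * exp(-rate * (x - lo)) / (1 - exp(-rate * (hi - lo))).
loglik <- function(rate, x, lo, hi) {
  sum(log(rate) - rate * (x - lo) - log1p(-exp(-rate * (hi - lo))))
}

# No closed form because of the upper cutoff, so maximise numerically.
fit <- optimize(loglik, interval = c(1e-4, 5),
                x = x, lo = lo, hi = hi, maximum = TRUE)
rate_hat <- fit$maximum

# CDF of the truncated exponential, used as the reference distribution.
ptrunc <- function(q, rate, lo, hi) {
  (1 - exp(-rate * (q - lo))) / (1 - exp(-rate * (hi - lo)))
}

ks <- ks.test(x, ptrunc, rate = rate_hat, lo = lo, hi = hi)
```

With only a lower cutoff, the estimate reduces to the familiar closed form 1 / mean(x - lo), since the exponential is memoryless.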

Comment author: shminux 25 January 2013 09:19:21PM 0 points

I expected something like this or the section thereafter.