JoshuaZ comments on What are you working on? - Less Wrong

8 Post author: jsalvatier 15 August 2011 02:43PM

Comments (68)

Comment author: JoshuaZ 17 August 2011 08:49:23PM * 1 point

(Also, I'm not sure you're interpreting the graphs right. My understanding is that the graphs show that I am substantially underconfident as compared to PB in general.)

Every 10% range should have an actual accuracy about midway through the range, right? So, for example, for the "50%" range perfect calibration would be 55% (assuming equidistribution over the whole 50-60 range). For PB as a whole, almost every category is at least 10 percentage points off: 55% compared to an abysmal 37%, 65% compared to 58%, 75% compared to 58%, 85% compared to 70%, and 95% compared to 79%. And the real kicker is that the 100% category is wrong one fifth of the time. In contrast, your numbers are 55% going to 41%, 65% going to 51%, 75% going to 60%, 85% going to 86%, and 95% going to 92%. Your 100% goes to 93%. Thus, with the exception of the 60-70 range, every one of yours is better calibrated, and your 80-90 and 90-100 ranges are nearly spot on. Am I misinterpreting the graphs?
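For anyone who wants to check the arithmetic, here is a minimal Python sketch of the comparison. It assumes, as the comment does, that predictions are spread evenly within each 10-point range, so the ideal accuracy is the range midpoint. The names `pb_overall`, `yours`, and `gaps` are made up for illustration, the percentages are just the ones quoted in the comment, and the 100% category is left out because it has no midpoint.

```python
# Lower edge of each 10-point confidence range (50-60, 60-70, ...).
buckets = [50, 60, 70, 80, 90]

# Ideal accuracy per range: the midpoint, assuming an even spread of
# predictions inside the range (the comment's stated assumption).
ideal = {b: b + 5 for b in buckets}

# Observed accuracy in percent, copied from the figures quoted in the comment.
pb_overall = {50: 37, 60: 58, 70: 58, 80: 70, 90: 79}
yours = {50: 41, 60: 51, 70: 60, 80: 86, 90: 92}


def gaps(observed):
    """Absolute distance, in percentage points, from the ideal midpoint."""
    return {b: abs(ideal[b] - observed[b]) for b in buckets}


pb_gaps = gaps(pb_overall)
your_gaps = gaps(yours)

# Ranges where the individual numbers sit closer to the ideal than PB overall.
better = [b for b in buckets if your_gaps[b] < pb_gaps[b]]

for b in buckets:
    print(f"{b}-{b + 10}: ideal {ideal[b]}, PB off by {pb_gaps[b]}, yours off by {your_gaps[b]}")
print("better calibrated in ranges starting at:", better)
```

On these figures the only range where the individual numbers are further from the midpoint than PB overall is 60-70, which matches the exception noted in the comment.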

Comment author: gwern 17 August 2011 09:02:29PM 2 points

You know, I thought that I was supposed to have as flat a line as possible, but now I'm not sure. Re-reading the two axes, I guess the ideal graph is not the green line, but a line at a 45 degree angle going from 50%/50% in the middle-left to 100%/100% in the upper-right.

Have I been misreading the graphs this entire time? How embarrassing! I guess these graphs could be clearer, and explicitly graph the 'ideal' line...

Comment author: JoshuaZ 17 August 2011 09:27:47PM 0 points

You know, I sort of presumed you were one of the people who had been involved in setting up PB because you spend so much time with it and seem to know its ins and outs. But your comment suggests that's not the case. Who does run it?

Comment author: gwern 17 August 2011 10:40:36PM 1 point

Tricycle runs it, like LW (see Eliezer's ANN). Matthew Fallshaw seems to be the one most involved with it - at least, I've always corresponded with him about it.