satt comments on Summary of my Participation in the Good Judgment Project - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (7)
Thanks for posting this. The GJP's sparked only sporadic discussion here, maybe because it focuses so much on world politics as opposed to stereotypically LWesque STEM stuff, and that's a bit of a shame. I'm a STEM nerd myself, but in a way that made the GJP more enticing because I thought participating in it might nudge me to learn a tiny bit about world politics (it did), and because I wanted to see whether I could beat the averages despite having minimal domain-specific knowledge (I could).
IIRC I filled out a pre-registration form that just asked for bare-bones demographic info like occupation and highest-level education qualification. After the GJP let me into the study, but before they assigned me to a group, I think I filled out a longer background survey about myself, and did the political knowledge/calibration test.
I did the short training session after getting the group assignment. Presumably the (sub)group assignments are randomized so the researchers can make causal inferences about which treatments generate better forecasts.
It's actually still running for my group. We have 31 questions still open which don't close until the 8th or 9th.
I wound up putting in more time than I think I anticipated, probably more than half an hour a week most weeks, and so far I've made 335 predictions on 36 questions. Since GJP started displaying my rank in my group, my overall Brier score's consistently been in the lowest 20%.
Maybe we were in the same group. My group also had no prediction markets, but I could read "tips" written by other people who were apparently chatting to each other in a forum to which I didn't have access. I also couldn't/can't see other users' predictions in real time, although I could see the group's median Brier score for each question after it was closed.
Ah, but if you didn't make a prediction on a question, you still got a Brier score for it — GJP gave you the median score of the group members who did make a prediction. (Or that's how it worked for me, anyway.) So the trick is to choose questions where you expect to do better than the median predictor, even if those questions look difficult. (Perhaps especially questions which look difficult to you, because other people might be overconfident about them.) The sample size is small, but on each of the 4 questions where my Brier score was high (≥ 0.5) I scored 0.07-0.15 fewer points than the group score, which really helped drive down my overall score.
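To make the arithmetic behind that trick concrete, here is a rough sketch. It assumes the two-outcome form of the Brier score for yes/no questions (0 is perfect, 2 is worst) and assumes the overall score is a plain average over questions, with skipped questions filled in by the group median as described above. The function names and all the numbers are made up for illustration; none of this is GJP's actual code or data.

```python
from statistics import mean, median

def brier(p_yes, happened):
    """Two-outcome Brier score for a yes/no question (assumed form): 0 best, 2 worst."""
    outcome = 1.0 if happened else 0.0
    return (p_yes - outcome) ** 2 + ((1 - p_yes) - (1 - outcome)) ** 2

def overall_score(my_scores, group_scores):
    """Average per-question score, using the group median where I made no forecast.

    my_scores:    dict question -> my Brier score (questions I skipped are absent)
    group_scores: dict question -> Brier scores of the members who did forecast
    """
    per_question = []
    for question, scores in group_scores.items():
        mine = my_scores.get(question)
        per_question.append(median(scores) if mine is None else mine)
    return mean(per_question)

# Hypothetical group results for one hard and one easy question.
group_scores = {"hard": [0.50, 0.72, 0.98], "easy": [0.02, 0.08, 0.18]}

skip_everything = overall_score({}, group_scores)           # medians only
answer_hard = overall_score({"hard": 0.60}, group_scores)   # my 0.60 beats the 0.72 median

print(skip_everything, answer_hard)
```

In this toy case a Brier score of 0.60 on the hard question looks bad in isolation, but it still pulls the overall average below what skipping the question would have produced, because the alternative was the group's 0.72.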
Mostly true, although election-result questions tended to be nice & easy. A few other political events weren't obvious slam-dunks if I looked at them from a distance, but became very obvious slam-dunks as soon as I investigated them.
Example: "Will a referendum on Quebec's affiliation with Canada be held before 31 December 2014?", which I didn't touch until October. But when I ran Google News searches about it, the lack of positive evidence for expecting a referendum was stark, and I immediately gave it only a 5% probability. During October I monotonically lowered that as tips came in pointing out that the one party pushing for a referendum was unpopular and leader-less, and that a referendum would take time to organize. For all of November & December I had that question at 0%, and my final Brier score for it halved the (already tiny) group score.
I also discovered that the prediction difficulty of the political questions was often time-dependent. IARPA tried to pick relevant & topical questions, which meant that a lot of questions were provoked by news coverage. But because the news prefers dramatic, sudden events, quite a few of the resulting questions were about transient crises or other hot issues that rapidly cooled down and became highly predictable within days or weeks, leaving them easy to predict for most of the (months-long) prediction windows.
A good tactic therefore turned out to be: just wait. It'd be interesting to see how people would do in a GJP re-run where the questions had shorter prediction windows; that tactic would surely be less successful there.
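A toy illustration of why waiting can pay off, under assumptions I can't vouch for: suppose a question's score is the mean of daily Brier scores over the prediction window, and suppose any day on which you have no standing forecast is filled in with the group's median score for that day. Both rules, the function names, and every number below are hypothetical.

```python
from statistics import mean

def brier(p_yes, happened):
    """Two-outcome Brier score for a yes/no question (assumed form): 0 best, 2 worst."""
    outcome = 1.0 if happened else 0.0
    return (p_yes - outcome) ** 2 + ((1 - p_yes) - (1 - outcome)) ** 2

def question_score(daily_forecasts, group_median_daily, happened):
    """Mean daily Brier score; days with no forecast (None) take the group median.

    This daily-averaging rule is an assumption for illustration only.
    """
    daily = [
        group_day if p is None else brier(p, happened)
        for p, group_day in zip(daily_forecasts, group_median_daily)
    ]
    return mean(daily)

# A hypothetical 4-day crisis question that resolves "no" and cools down quickly.
group_median_daily = [0.32, 0.18, 0.08, 0.02]

early_bird = question_score([0.50, 0.30, 0.10, 0.05], group_median_daily, False)
waiter = question_score([None, None, 0.05, 0.00], group_median_daily, False)
group_only = mean(group_median_daily)

print(early_bird, waiter, group_only)
```

Here the forecaster who waits two days matches the group while the situation is murky and then beats it once the answer is obvious, ending up with a lower score than both the group median and the early forecaster. With a shorter window there would be fewer "obvious" days to collect on, which is why the tactic should weaken.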
Yes. Actually it is. Somehow I misinterpreted one of the last mails. At least it's closed on all my forecasts.
Maybe. The best forecaster is grossz18 in my group.
No. 1 in my group is morrell. Our groups are probably different after all...or GJP is feeding us different rankings as part of the experiment!
Thank you for your detailed contribution!
Hm, yes, that makes sense, as these scores are listed in grey in my column too. I just didn't make that connection, and I can't remember it being explicitly explained that way. But maybe I misunderstood which averaging applies to which, especially before actually seeing the UI.
Yes. That seems like another sensible strategy to game it.
I recommend that you send this as a reply to one of the last mails. They seem to really read them.
I don't think "little interest" is a fair description. Searching LW for Good Judgment Project returns 290 hits.
I just did a search for "("Good Judgment Project" OR GJP)" and got only 87 hits, so most of your results might merely have been recent comments/posts in LW's sidebar.
Looking through the first couple of pages of hits I see:
1. a link post for GJP season 3, and the only comments are the ones I linked in the grandparent (I upvoted them anyway because they're interesting feedback)
2. a link post about an earlier GJP round, which does actually have a lot of GJP talk among its 55 comments
3. this Gunnar_Zarncke post
4. part 1 of Morendil's 2012 "Raising the forecasting waterline", about participating in the GJP, which has 108 comments but most aren't about the GJP
5. a short follow-up by gwern to post 2, with 2 comments
6. part 2 of Morendil's "Raising the forecasting waterline" (and 22 comments)
7. your user page, which comes up because of the parent comment
8. a link post to an FT article on forecasting, with comments that don't talk about the GJP
9. the list of recent comments for LW's Discussion section, which comes up because of the parent comment
10. VipulNaik's "Some historical evaluations of forecasting", which discusses the GJP for a paragraph (the only comment doesn't mention the GJP)
11. the list of Discussion posts tagged "tetlock", which matches because post 8 comes up
12. Morendil starting a short subthread about the GJP under "The Martial Art of Rationality"
13. post 5 at a different URL
14. an unrelated post which only comes up because Google indexed it while my GJP-mentioning comment was in the sidebar
15. another VipulNaik post which again discusses the GJP for a paragraph; all 3 comments talk about something else
16. Morendil's "Raising the waterline", which mentions the GJP a few times (none of its comments do)
17. VipulNaik's "An overview of forecasting for politics, conflict, and political violence", which lists various forecasting efforts and discusses the GJP as one of them across several bullet points (0 comments)
18. VipulNaik's list of submitted posts
19. Morendil mentioning the GJP in a one-sentence comment
20. VipulNaik again giving the GJP a paragraph in "Domains of forecasting" (none of the 4 comments mention the GJP)
That is more commentary than I remembered (I'd definitely forgotten about Morendil's 3 top-level posts), and yeah, "little interest" is too strong. I'll change that to "sporadic discussion", which I think is fair. Aside from Morendil's posts and this G_Z post, most of the mentions of GJP on LW seem to be asides or links to external articles, and they're spread out over about 4 years.