Thank you for including Replication Markets! A couple of notes:
*We got to test that a bit in Round 8 when we discovered a coordinated "attack" that accounted for ~1/3 of our surveys. Some forecasts would have changed, prizes would have been won, but neither so much as we feared.
A forecasting digest with a focus on experimental forecasting. The newsletter itself is experimental, but there will be at least four more iterations. Feel free to use this post as a forecasting open thread; feedback is welcome.
Index
Prediction Markets & Forecasting platforms.
Augur: augur.net
Augur is a decentralized prediction market. Here is a fine piece of reporting outlining how it operates and the road ahead.
Coronavirus Information Markets: coronainformationmarkets.com
For those who want to put their money where their mouth is, this is a prediction market for coronavirus related information.
Making forecasts is tricky, so would-be-bettors might be better off pooling their forecasts together with a technical friend. As of the end of this month, the total trading volume of active markets sits at $26k+ (up from $8k last month), and some questions have been resolved already.
Further, according to their FAQ, participation from the US is illegal: "Due to the US position on information markets, US citizens and residents, wherever located, and anyone physically present in the USA may not participate in accordance with our Terms." Nonetheless, one might take the position that the US legal framework on information markets is so dumb as to be illegitimate.
CSET: Foretell
The Center for Security and Emerging Technology is looking for (unpaid, volunteer) forecasters to predict the future to better inform policy decisions. The idea would be that as emerging technologies pose diverse challenges, forecasters and forecasting methodologies with a good track record might be a valuable source of insight and advice to policymakers.
One can sign up on their webpage. CSET was previously funded by the Open Philanthropy Project; the grant writeup contains some more information.
Epidemic Forecasting: epidemicforecasting.org (c.o.i)
As part of their efforts, the Epidemic Forecasting group had a judgemental forecasting team that worked on a variety of projects; it was made up of forecasters who have done well on various platforms, including a few who were official Superforecasters.
They provided analysis and forecasts to countries and regions that needed it, and advised a vaccine company on where to locate trials with as many as 100,000 participants. I worked a fair bit on this; hopefully more will be written publicly later on about these processes.
They've also been working on a mitigation calculator, and on a dataset of COVID-19 containment and mitigation measures.
Now they’re looking for a project manager to take over: see here for the pitch and for some more information.
Foretold: foretold.io (c.o.i)
I personally added a distribution drawer to the Highly Speculative Estimates utility, for use within Epidemic Forecasting's efforts; the tool can be used to draw distributions and send them off to be used in Foretold. Much of the code for this was taken from Evan Ward’s open-sourced probability.dev tool.
/(Good Judgement?[^]*)|(Superforecast(ing|er))/gi
(The title of this section is a regular expression, so as to accept only one meaning, be maximally unambiguous, yet deal with the complicated corporate structure of Good Judgement.)
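As a rough illustration of what the title accepts, here is a small Python sketch. The title is written in JavaScript syntax, where `[^]` means "any character, including newlines"; Python does not accept `[^]`, so the sketch substitutes `[\s\S]`, and `re.IGNORECASE` stands in for the `i` flag. The helper name `mentions_good_judgement` and the sample strings are made up for the example.

```python
import re

# Python port of the section title. `[\s\S]` replaces JavaScript's `[^]`
# (any character, newlines included); this substitution is an assumption
# made for the port, not part of the original title.
SECTION_PATTERN = re.compile(
    r"(Good Judgement?[\s\S]*)|(Superforecast(ing|er))",
    re.IGNORECASE,
)


def mentions_good_judgement(text: str) -> bool:
    """Return True if the pattern matches anywhere in `text`."""
    return SECTION_PATTERN.search(text) is not None


# Hypothetical sample strings, chosen only to exercise both alternatives.
examples = [
    "Good Judgement Inc.",
    "Good Judgement Open",
    "the Superforecasting book",
    "an official superforecaster",
    "Metaculus",
]
for example in examples:
    print(f"{example!r}: {mentions_good_judgement(example)}")
```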
Good Judgement Inc. is the organization which grew out of Tetlock's research on forecasting, and out of the Good Judgement Project, which won the IARPA ACE forecasting competition, and resulted in the research covered in the Superforecasting book.
Good Judgement Inc. also organizes the Good Judgement Open (gjopen.com), a forecasting platform open to all, with a focus on serious geopolitical questions. They structure their questions in challenges. Of the currently active questions, here is a selection of those I found interesting (probabilities below):
Probabilities: 25%, 75%, 40%, 62%, 20%
On the Good Judgement Inc. side, here is a dashboard presenting forecasts related to covid. The ones I found most worthy are:
Otherwise, for a recent interview with Tetlock, see this podcast, by Tyler Cowen.
Metaculus: metaculus.com
Metaculus is a forecasting platform with an active community and lots of interesting questions. In their May pandemic newsletter, they emphasized having "all the benefits of a betting market but without the actual betting", which I found pretty funny.
This month they've organized a flurry of activities, most notably:
Predict It & Election Betting Odds: predictIt.org & electionBettingOdds.com
PredictIt is a prediction platform restricted to US citizens, but also accessible with a VPN. This month, they present a map about the electoral college result in the USA. States are colored according to the market prices:
Some of the predictions I found most interesting follow. The market probabilities can be found below; the engaged reader might want to write down their own probabilities and then compare.
Some of the most questionable markets are:
Market probabilities are: 76%, 9%, 75%, 82%, 8%, 2%, 6%, 11%.
Election Betting Odds aggregates PredictIt with other such services for the US presidential elections, and also shows an election map. The creators of the webpage used its visibility to promote ftx.com, another platform in the area, whose webpage links to effective altruism and mentions:
Replication Markets: replicationmarkets.com
On Replication Markets, volunteer forecasters try to predict whether a given study's results will be replicated with high power. Rewards are monetary, but only given out to the top few forecasters, and markets suffer from sometimes being dull.
The first week of each round is a survey round, which has some aspects of a Keynesian beauty contest, because what is being forecast is the result of the second round, not the ground truth. This second round then tries to predict what would happen if the studies were in fact subject to a replication, which a select number of studies then undergo.
There is a part of me which dislikes this setup: there I was, during the first round, forecasting to the best of my ability, when I realized that in some cases I was going to improve the aggregate and be punished for it, particularly when I had information which I expected other market participants not to have.
At first I thought that, cunningly, the results of the first round would be used as priors for the second round, but a programming mistake by the organizers revealed that they use a simple algorithm: claims with p < .001 start with a prior of 80%, p < .01 starts at 40%, and p < .05 starts at 30%.
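The rule as described fits in a few lines. This is a sketch of the description above, not the organizers' actual code: the function name `starting_prior` is hypothetical, and returning `None` for p ≥ .05 is an assumption, since the description does not say what happens in that case.

```python
from typing import Optional


def starting_prior(p_value: float) -> Optional[float]:
    """Map a claim's original p-value to its starting market probability.

    Thresholds follow the description in the text. The behavior for
    p >= 0.05 is not described there, so None is returned (an assumption).
    """
    if p_value < 0.001:
        return 0.80
    if p_value < 0.01:
        return 0.40
    if p_value < 0.05:
        return 0.30
    return None


for p in (0.0005, 0.005, 0.03):
    print(f"p = {p}: prior = {starting_prior(p):.0%}")
```

Note that the prior depends only on the p-value bucket, so nothing from the survey round feeds into it.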
In The News.
Articles and announcements in more or less traditional news media.
Grab Bag
Podcasts, blogposts, papers, tweets and other recent nontraditional media.
Some interesting discussion about forecasting over at Twitter, in David Manheim's and Philip Tetlock's accounts, some of which have been incorporated into this newsletter. This twitter thread contains some discussion about how Good Judgement Open, Metaculus and expert forecasters fare against each other, but note the caveats by @LinchZhang: "For Survey 10, Metaculus said that question resolution was on 4pm ET Sunday, a lot of predictors (correctly) gauged that the data update on Sunday will be delayed and answered the letter rather than the spirit of the question (Metaculus ended up resolving it ambiguous)." This thread by Marc Lipsitch has become popular, and I personally also enjoyed these two twitter threads by Linchuan Zhang, on forecasting mistakes.
SlateStarCodex brings us a hundred more predictions for 2020. Some analysis by Zvi Mowshowitz here and by user Bucky.
FLI Podcast: On Superforecasting with Robert de Neufville. I would have liked to see a more intense drilling on some of the points. It references The NonProphets Podcast, which looks like it has some more in-depth stuff. Some quotes:
Space Weather Challenge and Forecasting Implications of Rossby Waves. Recent advances may help predict solar flares better. I don't know how bad the worst solar flare could be, and how much a two year warning could buy us, but I tend to view developments like this very positively.
An analogy-based method for strong convection forecasts in China using GFS forecast data. "Times in the past when the forecast parameters are most similar to those forecast at the current time are identified by searching a large historical numerical dataset", and this is used to better predict one particular class of meteorological phenomena. See here for a press release.
The Cato Institute releases 12 New Immigration Ideas for the 21st Century, including two from Robin Hanson: Choosing Immigrants through Prediction Markets & Transferable Citizenship. The first idea is to have prediction markets forecast the monetary value of taking in immigrants, and decide accordingly, then rewarding forecasters according to their accuracy in predicting e.g. how much said immigrants pay in taxes.
A General Approach for Predicting the Behavior of the Supreme Court of the United States. What seems to be a pretty simple algorithm (a random forest!) seems to do pretty well (70% accuracy). Their feature set is rich but doesn't seem to include ideology. It was written in 2017; today, I'd expect that a random bright high schooler might be able to do much better.
From Self-Prediction to Self-Defeat: Behavioral Forecasting, Self-Fulfilling Prophecies, and the Effect of Competitive Expectations. Abstract: Four studies explored behavioral forecasting and the effect of competitive expectations in the context of negotiations. Study 1 examined negotiators' forecasts of how they would behave when faced with a very competitive versus a less competitive opponent and found that negotiators believed they would become more competitive. Studies 2 and 3 examined actual behaviors during negotiation and found that negotiators who expected a very competitive opponent actually became less competitive, as evidenced by setting lower, less aggressive reservation prices, making less demanding counteroffers, and ultimately agreeing to lower negotiated outcomes. Finally, Study 4 provided a direct test of the disconnection between negotiators' forecasts for their behavior and their actual behaviors within the same sample and found systematic errors in behavioral forecasting as well as evidence for the self-fulfilling effects of possessing a competitive expectation.
Neuroimaging results altered by varying analysis pipelines. Relevant paragraph: "the authors ran separate ‘prediction markets’, one for the analysis teams and one for researchers who did not participate in the analysis. In them, researchers attempted to predict the outcomes of the scientific analyses and received monetary payouts on the basis of how well they predicted performance. Participants — even researchers who had direct knowledge of the data set — consistently overestimated the likelihood of significant findings". Those who had more knowledge did slightly better, however.
Forecasting s-curves is hard: Some clear visualizations of what it says in the title.
Forecasting state expenses for budget is always a best guess; exactly what it says on the tin. Problem could be solved with a prediction market or forecasting tournament.
Fashion Trend Forecasting using Instagram and baking preexisting knowledge into NNs.
The advantages and limitations of forecasting. A short and sweet blog post, with a couple of forecasting anecdotes and zingers.
Negative examples.
I have found negative examples to be useful as a mirror with which to reflect on my own mistakes; highlighting them may also be useful for shaping social norms. Andrew Gelman continues to fast-pacedly produce blogposts on this topic. Meanwhile, amongst mortals:
Long content
This section contains items which have recently come to my attention, but which I think might still be relevant not just this month, but throughout the years. Content in this section may not have been published in the last month.
How to evaluate 50% predictions. "I commonly hear (sometimes from very smart people) that 50% predictions are meaningless. I think that this is wrong."
Named Distributions as Artifacts. On how the named distributions we use (the normal distribution, etc.) were selected for being easy to use in pre-computer eras, rather than for being a good ur-prior on distributions for phenomena in this universe.
The fallacy of placing confidence in confidence intervals. On how the folk interpretation of confidence intervals can be misguided, as it conflates: a. the long-run probability, before seeing some data, that a procedure will produce an interval which contains the true value, and b. the probability that a particular interval contains the true value, after seeing the data. This is in contrast to Bayesian theory, which can use the information in the data to determine what is reasonable to believe, in light of the model assumptions and prior information. I found their example where different confidence procedures produce 50% confidence intervals which are nested inside each other particularly funny. Some quotes:
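To make the distinction between (a) and (b) concrete, here is a small simulation sketch. The setup is a toy chosen for illustration and is an assumption, not necessarily the paper's procedure: two observations are drawn uniformly within 5 units of an unknown value `theta`, and the "interval" is simply the span between them. Before seeing data, that span contains `theta` half the time, so it is a 50% confidence procedure. After seeing data, the width of the span tells us a lot: a span wider than 5 must contain `theta`, while a very narrow one rarely does.

```python
import random


def simulate(theta: float = 0.0, half_width: float = 5.0,
             n_trials: int = 100_000, seed: int = 0) -> dict:
    """Estimate coverage of the [min, max] interval of two uniform draws.

    Returns overall coverage, plus coverage restricted to wide intervals
    (width > half_width) and narrow ones (width < 1).
    """
    rng = random.Random(seed)
    hits = wide_hits = wide_total = narrow_hits = narrow_total = 0
    for _ in range(n_trials):
        x1 = rng.uniform(theta - half_width, theta + half_width)
        x2 = rng.uniform(theta - half_width, theta + half_width)
        low, high = min(x1, x2), max(x1, x2)
        covered = low <= theta <= high
        width = high - low
        hits += covered
        if width > half_width:
            wide_total += 1
            wide_hits += covered
        elif width < 1.0:
            narrow_total += 1
            narrow_hits += covered
    return {
        "overall": hits / n_trials,
        "wide": wide_hits / wide_total,
        "narrow": narrow_hits / narrow_total,
    }


result = simulate()
print(f"overall coverage:          {result['overall']:.3f}")
print(f"coverage when width > 5:   {result['wide']:.3f}")
print(f"coverage when width < 1:   {result['narrow']:.3f}")
```

All three numbers come from the same "50%" procedure; only the first is what the confidence level actually describes.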
Psychology of Intelligence Analysis, courtesy of the American Central Intelligence Agency, seemed interesting, and I read chapters 4, 5 and 14. Sometimes forecasting looks like reinventing intelligence analysis; from that perspective, I've found this reference work useful. Thanks to EA Discord user @Willow for bringing this work to my attention.
Chapter 4: Strategies for Analytical Judgement. Discusses and compares the strengths and weaknesses of four tactics: situational analysis (inside view), applying theory, comparison with historical situations, and immersing oneself in the data. It then brings up several suboptimal tactics for choosing among hypotheses.
Chapter 5: When does one need more information, and in what shapes does new information come?
Chapter 14: A Checklist for Analysts. "Traditionally, analysts at all levels devote little attention to improving how they think. To penetrate the heart and soul of the problem of improving analysis, it is necessary to better understand, influence, and guide the mental processes of analysts themselves." The Chapter also contains an Intelligence Analysis reading list.
The Limits of Prediction: An Analyst’s Reflections on Forecasting, also courtesy of the American Central Intelligence Agency. On how intelligence analysts should inform their users of what they are and aren't capable of. It has some interesting tidbits and references on predicting discontinuities. It also suggests some guiding questions that the analyst may try to answer for the policymaker.
How to Measure Anything, a review. "Anything can be measured. If a thing can be observed in any way at all, it lends itself to some type of measurement method. No matter how “fuzzy” the measurement is, it’s still a measurement if it tells you more than you knew before. And those very things most likely to be seen as immeasurable are, virtually always, solved by relatively simple measurement methods."
The World Meteorological Organization, on their mandate to guarantee that no one is surprised by a flood. Browsing the webpage, it seems that the organization is either a Key Organization Safeguarding the Vital Interests of the World or Just Another of the Many Bureaucracies Already in Existence, but it's unclear to me how to differentiate between the two. One clue may be their recent Caribbean workshop on impact-based forecasting and risk scenario planning, with the narratively unexpected and therefore salient presence of Gender Bureaus.
95%-ile isn't that good: "Reaching 95%-ile isn't very impressive because it's not that hard to do."
The Backwards Arrow of Time of the Coherently Bayesian Statistical Mechanic: Identifying thermodynamic entropy with the Bayesian uncertainty of an ideal observer leads to problems, because as the observer observes more about the system, they update on this information, which in expectation reduces uncertainty, and thus entropy. But entropy increases with time.
Behavioral Problems of Adhering to a Decision Policy
Immanuel Kant, on Betting
Vale.
Conflicts of interest: Marked as (c.o.i) throughout the text.
Note to the future: All links are automatically added to the Internet Archive. In case of link rot, go there.