[Suggested] Changes:
SimonM models potential meat demand decreasing [by reusing historical data from] the "decline" in smoking
NPR (a) has gotten some economists [-] who disagree [-] to make quantifiable predictions
Responses:
incentivize people to make forecasts on lots of questions even if they have no particular information advantage,
Might be 'okay' early on to create a base estimate (i.e., trying to make sure questions aren't 'empty' (of responses)).
disincentivize forecasters to forecast even if they know the true word-from-God probability exactly
Variant*: A fair coin is being flipped. The distribution is 50/50, but someone who knows that is disincentivized from saying (or forecasting, in system) this outcome.
*or example?
I remember reading them, and finding them very persuasive, and then realizing that such persuasiveness was probably fairly uncorrelated with the truth.
Did they make testable, or empirical, claims?
The quotes from "The Perils of Forecasting (a)" were great.
Interestingly, at a time when even the finest elite publications do not cover foreign affairs as seriously and as disinterestedly as they once did, corporations have been reaching out to private forecasting companies to get a cold-blooded sense of the middle-term future in many places.
Also, the practice of archiving/finding an archived version of referenced material - is fantastic.
Thanks, added your suggestions.
***
Sure, might be ok early on. But you could require the question maker to provide a probability (and, at least I always predict on the questions I predict), or reward forecasting early directly.
***
Did they make testable, or empirical, claims?
Actually yes, I have a list of 10 predictions here which I extracted from his blog, but I've been procrastinating on evaluating them.
***
Cheers.
Highlights
Index
Sign up here or browse past newsletters here.
Prediction Markets & Forecasting Platforms
CSET-Foretell
The Superforecasting workshop for "Foretell Pros" has ended. A tidbit I learnt from it is that, unlike prediction markets, forecasting platforms can look at the covariance between forecasters (whether two forecasters' predictions are closer or further apart than average) and update on it. That is, if two forecasters who often disagree instead agree on a question, that is evidence that their side is correct (h/t Eva Chen).
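As a rough sketch of how that tidbit could be operationalized: take each forecaster's past deviations from the crowd average, find pairs whose deviations usually point in opposite directions, and flag current questions where such a pair ends up on the same side. The forecaster names, the numbers, and the function names below are made up for illustration, and this is not a description of what CSET-Foretell actually computes.

```python
from itertools import combinations
from statistics import mean


def deviation_comovement(history):
    """history maps forecaster -> probabilities given on the same past questions.

    Returns, for each pair, the mean product of their deviations from the
    crowd average (an uncentered covariance). Negative means they tend to
    sit on opposite sides of the crowd.
    """
    names = sorted(history)
    n_questions = len(history[names[0]])
    crowd = [mean(history[f][q] for f in names) for q in range(n_questions)]
    dev = {f: [history[f][q] - crowd[q] for q in range(n_questions)] for f in names}
    return {
        (a, b): mean(x * y for x, y in zip(dev[a], dev[b]))
        for a, b in combinations(names, 2)
    }


def surprising_agreements(history, current):
    """Pairs who usually disagree but are on the same side of the crowd now."""
    crowd_now = mean(current.values())
    flagged = []
    for (a, b), comovement in deviation_comovement(history).items():
        same_side = (current[a] - crowd_now) * (current[b] - crowd_now) > 0
        if comovement < 0 and same_side:
            flagged.append((a, b))
    return flagged


# Hypothetical data: alice and bob have historically pulled in opposite directions.
history = {
    "alice": [0.9, 0.2, 0.8, 0.3],
    "bob": [0.3, 0.7, 0.2, 0.8],
    "carol": [0.7, 0.3, 0.6, 0.4],
}
current = {"alice": 0.8, "bob": 0.75, "carol": 0.4}
flagged_pairs = surprising_agreements(history, current)
```

The sketch only flags the unusual agreement. How far to move the aggregate forecast in response is a separate modelling choice that it leaves open.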
CSET has also been collaborating with Ought, and has given "Foretell Pros" access to Ought's GPT-3 assistant capabilities. I'm unclear on how often Ought's tools will be used in practice by forecasters.
Good Judgment Open
The World Ahead: What If? (a)—a new Good Judgment Open tournament in collaboration with The Economist—presents five long-term questions. This is something I don't recall seeing before, and I'm glad to see that Good Judgment Open is dipping its toes into the tricky business of long-term predictions. The questions will not be scored, perhaps because Good Judgment Open uses an improper scoring rule which gets worse for longer term questions, see below.
Metaculus
Metaculus has a new redesign (a) in progress, and an accompanying blogpost (a) by Metaculus' CEO. Some discussion can be found on Metaculus itself here (a).
Metaculus also launched the Trade Signal tournament (a), where Metaculus users attempt to predict economic indicators which might be used to make trades. For this, they are looking for a "Community Trader" (a). So far, the one candidate (a) seems very formidable.
Michael Aird, of Rethink Priorities, organized the Nuclear Risk Forecasting Tournament (a). Questions can be found here (a).
The 20/20 Insight Forecasting Contest (a) has concluded. Winners can be seen here (a).
SimonM kindly curated the top comments from Metaculus this past June. They are:
Polymarket
Polymarket has at times been nigh-unusable because of network congestion and dependency failures. Polygon, the second layer solution for Ethereum which Polymarket uses, has been becoming more popular, so costs to make transactions (gas costs) have increased, and the infrastructure needed to process those transactions has at times been taxed beyond capacity. In response, Polymarket has increased the gas prices which its contracts were willing to pay; this doesn't really affect users because even comparatively high gas prices on Polygon are at most cents.
In addition, The Graph (a)—a service which Polymarket was relying on to let its webpage know what its blockchain contracts were doing—has also been suffering from constant failures, presumably also as a result of scaling pains.
Polygon itself has also been accused of being insecure, because 5 out of 8 developers and early community members ("multisignature key holders") could conspire to upgrade Polygon's protocol. Here are two (a) letters (a) from "DeFi Watch".
Polymarket contentiously resolved its "Will NYC fully reopen by July 1?" (a) positively. Here (a) is someone on Twitter making the case for a "No" resolution, here (a) is the case for a "Yes" resolution, and here (a) is Polymarket's rationale for resolving it positively. Polymarket also prematurely resolved "Will Joe Biden be President of the USA on June 30, 2021?" (a) as a "Yes", and they are reimbursing market participants who held "No" positions.
On the positive side, Polymarket passed its one-year anniversary (a) this month, and organized a party. Some community members were invited and reimbursed for their travel expenses.
A Polymarket community member has released a polymarket trading tool (a), which allows users to interact with Polymarket's Polygon contracts directly, without having to use Polymarket's frontend. Polymarket has also added some rudimentary search functionality to their frontpage.
In more detail: Why and how could Polygon multisignature key holders steal users' funds?
A key point of contention is whether upgrading Polygon's protocol could be used to straight-out steal users' assets (or just make the platform unusable). Answering that question would require understanding some of the finer points of cross-chain communication, which are a bit beyond me. In particular, what the multisignature key holders would be stealing wouldn't directly be the valuable USDC or ETH assets, but rather a Doppelgänger (a) of those assets, a clone asset on the Polygon Chain which is guaranteed to be redeemable for original tokens, originals which are safely stashed away in the Ethereum Chain. See: Moving assets to Polygon (a), and wrapped tokens (a).
It's possible that stealing the Doppelgänger tokens would just make them instantly worthless. More specifically, because USDC is controlled by a central authority (a), it could just refuse to honor stolen tokens. However, the malicious multisignature key holders could steal users' assets, and then very quickly swap those assets for decentralized assets (like DAI), using Uniswap (a); they could then disappear using Tornado Cash (a). This would normally not be possible, but in this case, the process to upgrade Polygon's protocol is not under a timelock (a): there is no enforced waiting period between the announcement of an upgrade and when that upgrade takes effect.
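To make the timelock idea concrete, here is a toy model in Python of the rule "an upgrade can only take effect some fixed delay after it was announced". It is only a sketch of the concept: the class and method names are hypothetical, real timelocks live in smart contracts rather than in Python, and nothing here reflects Polygon's actual code.

```python
class TimelockedUpgrades:
    """Toy model: an upgrade may only execute after a fixed delay from its announcement."""

    def __init__(self, delay_seconds):
        self.delay_seconds = delay_seconds
        self._earliest_execution = {}  # upgrade id -> earliest allowed execution time

    def announce(self, upgrade_id, now):
        # Publicly record the upgrade; the delay gives users time to react or exit.
        self._earliest_execution[upgrade_id] = now + self.delay_seconds

    def execute(self, upgrade_id, now):
        if upgrade_id not in self._earliest_execution:
            raise PermissionError("upgrade was never announced")
        if now < self._earliest_execution[upgrade_id]:
            raise PermissionError("timelock has not expired yet")
        del self._earliest_execution[upgrade_id]
        return True


# Hypothetical usage: a two-day delay, with times given in seconds.
two_days = 2 * 24 * 3600
lock = TimelockedUpgrades(delay_seconds=two_days)
lock.announce("new-protocol-version", now=0)
```

With a delay like this in place, key holders who announce a malicious upgrade would have to wait out the window, during which users could withdraw. Without it, announcement and execution can happen back to back.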
In the short term, I'm not actually too worried, and I'm keeping my assets on Polymarket, on Polygon. But in the medium to long term, the probabilities of things like regulatory attacks, plain old human unreliability, or malice start to add up.
Superforecasters
A shrewdness (a) of superforecasters has started a substack (a), so far featuring fortnightly forecasts of in fashion affairs (a).
Others
In the News
NPR (a) has gotten some economists—who disagree with each other—to make quantifiable predictions, and to promise to come back in a couple of months to analyze what they got right or wrong: h/t @CrunchWrapSupreme.
The Rise Fund Announces $100 Million Strategic Investment in Climavision (a). The Rise Fund is one of the largest, if not the largest, impact investment funds. The investment is supposed to improve weather forecasting. Taken directly from the press release:
There has recently been a heat wave in the US. Compare coverage from Fox (a), from the Associated Press (a) and from Reuters (a).
European data monopoly hurt forecasts of deadly eruption, Congolese researchers charge (a).
Papers
In Alignment Problems With Current Forecasting Platforms (a), my coauthor Alex Lawsen and I expand upon our earlier Incentive Problems With Current Forecasting Competitions (a). We classify current problems as, more or less, either reward specification problems or principal-agent problems. Reward specification problems are those which incentivize forecasters to behave in ways which are not useful from the perspective of the accuracy of the broader system.
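For instance, some platforms incentivize people to make forecasts on lots of questions even if they have no particular information advantage, or disincentivize forecasters from forecasting even if they know the true word-from-God probability exactly.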
With regards to principal-agent problems, forecasters also sometimes stop trying to maximize their expected score, and instead start optimizing for other metrics. For example, discrete prizes create incentives to be among the top few people who get prizes, or in the top few spots where people can brag that they won a tournament. We try to analyze this effect quantitatively. We also prove that some platforms, like Good Judgment Open or CSET-Foretell, straight out use an improper scoring rule, where participants can get a better score in expectation by inputting something other than their true probability.
I thought that this was going to be a big deal, because Superforecasters are chosen from Good Judgment Open, but per Good Judgment Inc, the effect probably turns out to be small. As a tidbit from history, IARPA's ACE tournament also used an improper scoring rule, but other groups besides the Good Judgment Project thought that it would be too much of a hassle to change.
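For readers unfamiliar with the term, the difference between a proper and an improper scoring rule can be checked numerically. The sketch below compares the Brier score (proper) with a textbook linear rule (improper). The linear rule is chosen only because it is the simplest improper rule to illustrate; it is an assumption for the example and is not the rule used by Good Judgment Open or CSET-Foretell. All function names are hypothetical.

```python
def brier(report, outcome):
    """Negated Brier score, so that higher is better. Proper."""
    y = 1.0 if outcome else 0.0
    return -((report - y) ** 2)


def linear(report, outcome):
    """Score equals the probability assigned to what happened. Improper."""
    return report if outcome else 1.0 - report


def expected_score(report, true_p, rule):
    """Expected score of reporting `report` when the event has probability `true_p`."""
    return true_p * rule(report, True) + (1.0 - true_p) * rule(report, False)


def best_report(true_p, rule, steps=100):
    """Grid search for the report that maximizes the expected score."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda r: expected_score(r, true_p, rule))


true_p = 0.7
honest_under_brier = best_report(true_p, brier)    # lands on the true probability
extreme_under_linear = best_report(true_p, linear)  # pushed all the way to 1.0
```

Under the Brier score the expected score peaks at the true probability, so honesty is optimal. Under the linear rule the expected score is 0.3 + 0.4 × report when the true probability is 0.7, which keeps increasing up to a report of 1, so a score-maximizing forecaster would overstate their confidence.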
In any case, each of the alignment problems we identify can manifest itself in different ways. Forecasters can consciously follow their flawed incentives. But it is also the case that each alignment failure adds noise to the ranking of forecasters (even if the noise is random). More spookily, forecasters also interpret their scores (or the monetary reward in the case of a tournament) as feedback. So to the extent that this feedback is flawed, forecasters might implicitly learn the wrong lessons. This possibility is particularly worrisome to me because "the feeling of an 80%", or "the feeling of updating from an 80% to a 60%" is for me something fairly intuitive. Thus, it is something which I could imagine being vulnerable to flawed training. See Unconscious Economics (a) for an elaboration of the point that incentives don't have to consciously be followed to affect outcomes.
Many of the problems above are solved by prediction markets. But prediction markets have their own problems (a) and inefficiencies (a). For example, prediction markets also greatly disincentivize collaboration and thus greatly incentivize redundancy in research (a.k.a. "have you ever seen good comments on PredictIt?" h/t Marc Koehler.)
We also propose solutions for these problems. My preferred solution right now is one in which:
However, in the setup I have in mind, the forecasting platform ends up paying money proportionally to the number of forecasters (and is thus easily exploitable), or forecasters are disincentivized to bring other people in even if they would improve probabilities. Additionally, forecasters have an incentive to "slack off": to wait until someone else shares their hard work and reap similar rewards as them.
The conclusion section makes some comparisons between aligning forecasting systems and aligning machine systems. They both have a chain of proxies between the original goal and what ends up being maximized. And even though the human forecasters aren't being trained or optimized, there still seems to be a comparison to be made between the inner alignment (a) problem for reinforcement learners and the principal/agent problem for forecasters. Similarly, reward specification seems fairly equivalent to outer alignment, though I might be missing some nuance. I'm not really sure to what extent I'm shooting from the hip here, but I suggest that alignment proposals which would apply to superhuman systems could be tested on human forecasters with the goal of making them produce useful forecasts.
Blog Posts
Dominic Cummings (a) has started a substack. On the one hand, he appears to have deep insight about the inner workings of Britain's political machinery. On the other hand, it's difficult to say how Machiavellian he is, what proportion of what he communicates is intended to shape public opinion in a certain way, or how distorted his models of the world are by a goal of communicating information to have some effect. One of the things the British leave campaign did under his direction was to run randomized trials/focus groups on what the most persuasive arguments for Brexit were. I remember reading them, and finding them very persuasive, and then realizing that such persuasiveness was probably fairly uncorrelated with the truth. In LessWrong lingo, I'm unsure about which Simulacrum Level (a) Cummings is operating at.
Event-driven mission hedging and the 2020 US election (a) considers a case where it is cheaper to buy some altruistic good if Biden wins, so one could bet on his success and buy it only if he wins. The post makes the mistake of ignoring market dynamics, but this doesn't change the thrust of its argument.
The Ultimate Guide to Decentralized Prediction Markets (a), an old Augur blog post that covers the topic in depth.
What to Expect When You're Expecting Inflation (a):
Jason Crawford on precognition (a):
Taboo "Outside View" (a):
The Generalized Product Rule (a) outlines how a certain step in Cox's theorem (a)—the step which proves that probability updating is multiplicative—can be applied to other problems as well.
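For reference, the step in question is usually presented roughly as follows. The notation (the function F, the rescaling w, and the plausibility brackets) is the standard textbook one and is an assumption here, not necessarily what the post uses.

```latex
% Plausibility of a conjunction is assumed to be some function F
% of two simpler plausibilities:
\[ (A \wedge B \mid C) = F\big( (A \mid C),\; (B \mid A \wedge C) \big) \]

% Because conjunction is associative, evaluating (A \wedge B \wedge D \mid C)
% in two different orders must give the same answer, which forces
\[ F\big( F(x, y),\, z \big) = F\big( x,\, F(y, z) \big) \]

% Under regularity assumptions (F continuous and strictly increasing in
% each argument), solutions take the form
\[ w\big( F(x, y) \big) = w(x)\, w(y) \]

% for some monotonic rescaling w, which is the product rule:
\[ w(A \wedge B \mid C) = w(A \mid C)\; w(B \mid A \wedge C) \]
```

The generalization is that the same associativity argument applies to any quantity that is combined step by step through a function like F, not only to plausibilities.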
The Perils of Forecasting (a):
What if Military AI is a Washout? (a). The author presents his "hunches" on the future of military AI, in which it does improve, but it ends up affecting war not because of its overwhelming dominance, but by changing the tradeoffs and best practices of war. For instance, war might move more and more into cities, because they are an environment in which classifier systems might be more uncertain about whether someone is a civilian or an enemy combatant.
Note to the future: All links are added automatically to the Internet Archive, using this tool (a). "(a)" for archived links was inspired by Milan Griffes (a), Andrew Zuckerman (a), and Alexey Guzey (a).